I've attached 3 patches this time, to try to keep each individual change small(ish).
Ihor Radchenko <[email protected]> writes:

> Steven Allen <[email protected]> writes:
>
>> 1. Inline source code: The original patch broke
>> org-element--parse-paired-brackets because that function relies on
>> syntax tables to find the pair of surrounding curly braces.
>
> In the long term, the parser should move away from relying on in-buffer
> settings, including syntax table. So, I think that making things less
> dependent on local syntax tables is the right direction.
> This is because many people desire Org parser to work outside Org
> buffers. Unexpected tables is one of the reasons why it is not possible
> as of now.

In this case at least, the org parser uses (with-syntax-table ...) to
temporarily bind a new syntax table. The issue only arises when
`parse-sexp-lookup-properties' is non-nil and the text has a
syntax-table property.

I've replaced `with-syntax-table' with an `org-with-syntax-table' macro
that let-binds `parse-sexp-lookup-properties' to nil.

>> 2. Source blocks: The original patch broke some regular expression (I
>> think?) that matches the beginning of source blocks, causing it to
>> include the first word of the source-code in the language name. E.g.,
>> in one test:
>>
>> #+BEGIN_SRC eshell
>> echo 2
>> #+END_SRC
>>
>> Parsed the language as "eshell\necho"
>
> I am wondering if this was inside the parser or in one of the ad-hoc
> regexp searches. If the former, it is a good idea to shield that part of
> the parser.

It's the former:

1. `org-element-src-block-parser' relies on the syntax table treating
   newlines as whitespace [1].
2. eshell-mode treats newlines as "comment enders", not whitespace [2].

[1] https://git.savannah.gnu.org/cgit/org-mode.git/tree/lisp/org-element.el?id=565459cf27470d8830fdf0e54bcf2a3b71ac513d#n3055
[2] https://git.savannah.gnu.org/cgit/emacs.git/tree/lisp/eshell/esh-mode.el?id=ef2584585b095fff1e045e58673d2a7c8f9af799#n261

If the desire is to support parsing org-mode text outside of org-mode,
it probably makes sense to replace all instances of "\\S-" in regular
expressions with something that does not depend on the syntax table
(e.g., "[^ \t\r\n]"). However, that's a very large search and replace.

>>> What about inline src blocks?
>>
>> I restricted the previous version to JUST blocks because I figured it'd
>> be safer (I assume org-mode skips over all the lines between begin/end
>> when fontifying).
>
> Considering the above-state goal, I am willing to sacrifice short-term
> stability in favor of revealing more bottlenecks in the parser where we
> rely on defaults/major mode setup.

SGTM.

>> I've dug into this a bit more now and, I can fix the tests by binding
>> parse-sexp-lookup-properties to nil inside
>> org-element--parse-paired-brackets (patch attached). However, I'm not
>> sure there aren't other cases lurking untested.
>>
>> One option here is to replace calls to with-syntax-table with an
>> org-with-syntax-table that binds parse-sexp-lookup-properties to nil but
>> that may be overkill.
>
> I think that will be exactly right kill of kill :)

Done. I've added an `org-with-syntax-table' macro that does this.

>> (setq pos next)))
>> + (put-text-property start end 'syntax-table (syntax-table)
>> org-buffer)
>> (set-buffer-modified-p nil)))
>
> What if src block's major mode itself sets 'syntax-table property?

Good point. Fixed in the latest patch set.

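To illustrate the source-block issue described above, here is a minimal sketch of how a syntax-class regexp changes meaning with the active syntax table. The function name `my/demo-language-match' and the sample text are hypothetical and are not taken from org-element.el; the sketch only assumes that "\\S-" consults the current syntax table while an explicit character class does not.

```elisp
(defun my/demo-language-match (regexp newline-syntax)
  "Return what REGEXP matches at the start of a sample block header.
NEWLINE-SYNTAX is a syntax descriptor string applied to the newline
character, e.g. \" \" (whitespace) or \">\" (comment ender).
Hypothetical helper, for illustration only."
  (let ((table (make-syntax-table)))
    ;; Give the newline the requested syntax class in a scratch table.
    (modify-syntax-entry ?\n newline-syntax table)
    (with-temp-buffer
      ;; Language name, newline, then the first word of the code.
      (insert "eshell\necho 2\n")
      (goto-char (point-min))
      (with-syntax-table table
        (when (looking-at regexp)
          (match-string 0))))))
```

With the newline classed as a comment ender (">"), "\\S-+" runs across the line break and returns "eshell\necho", which mirrors the misparsed language name. With "[^ \t\r\n]+" the result is "eshell" whichever class the newline has.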
From e6ea4ed4c876ce6d2c6ea291517299db18c00176 Mon Sep 17 00:00:00 2001
From: Steven Allen <[email protected]>
Date: Sat, 8 Nov 2025 13:27:01 -0800
Subject: [PATCH 1/3] Ignore the syntax-table text-prop when binding a temp
 syntax-table

This ensures we can, e.g., scan for matching brackets without having the
syntax-table text property interfere with our search.

* lisp/org-macs.el (org-with-syntax-table): Add an Org-specific
`with-syntax-table' macro that binds `parse-sexp-lookup-properties' to
nil while evaluating the body.
---
 lisp/org-agenda.el  | 2 +-
 lisp/org-capture.el | 2 +-
 lisp/org-element.el | 6 +++---
 lisp/org-macs.el    | 9 +++++++++
 lisp/org.el         | 8 ++++----
 5 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/lisp/org-agenda.el b/lisp/org-agenda.el
index 0444d0d81..8242f8036 100644
--- a/lisp/org-agenda.el
+++ b/lisp/org-agenda.el
@@ -4809,7 +4809,7 @@ defun org-search-view
           (setq rtn (list (format "ORG-AGENDA-ERROR: No such org-file %s"
                                   file))))
         (with-current-buffer buffer
-          (with-syntax-table (org-search-syntax-table)
+          (org-with-syntax-table (org-search-syntax-table)
             (unless (derived-mode-p 'org-mode)
               (error "Agenda file %s is not in Org mode" file))
             (let ((case-fold-search t))
diff --git a/lisp/org-capture.el b/lisp/org-capture.el
index 45348b7e9..3c8040854 100644
--- a/lisp/org-capture.el
+++ b/lisp/org-capture.el
@@ -2091,7 +2091,7 @@ defun org-capture-expand-embedded-elisp
             ;; Only mark valid and non-escaped sexp.
             ((org-capture-escaped-%) nil)
             (t
-             (let ((end (with-syntax-table emacs-lisp-mode-syntax-table
+             (let ((end (org-with-syntax-table emacs-lisp-mode-syntax-table
                           (ignore-errors (scan-sexps (1- (point)) 1)))))
                (when end
                  (put-text-property (- (point) 2) end 'org-embedded-elisp t))))))))
diff --git a/lisp/org-element.el b/lisp/org-element.el
index 22fdec4d2..3dd33b11e 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -507,7 +507,7 @@ defun org-element--parse-paired-brackets
                  (_ nil)))
               (pos (point)))
     (when syntax-table
-      (with-syntax-table syntax-table
+      (org-with-syntax-table syntax-table
         (let ((end (ignore-errors (scan-lists pos 1 0))))
           (when end
             (goto-char end)
@@ -3399,7 +3399,7 @@ defun org-element-citation-parser
                            (match-string-no-properties 1))))
              ;; Ignore blanks between cite type and prefix or key.
              (start (match-end 0))
-             (closing (with-syntax-table org-element--pair-square-table
+             (closing (org-with-syntax-table org-element--pair-square-table
                         (ignore-errors (scan-lists begin 1 0)))))
         (save-excursion
           (when (and closing
@@ -3640,7 +3640,7 @@ defun org-element-footnote-reference-parser
 `:end', `:contents-begin', `:contents-end' and `:post-blank' as
 properties.  Otherwise, return nil."
   (when (looking-at org-footnote-re)
-    (let ((closing (with-syntax-table org-element--pair-square-table
+    (let ((closing (org-with-syntax-table org-element--pair-square-table
                      (ignore-errors (scan-lists (point) 1 0)))))
       (when closing
         (save-excursion
diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index c6f2a9033..6ec3bfcea 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -274,6 +274,15 @@ defmacro org-element-with-disabled-cache
   `(cl-letf (((symbol-function #'org-element--cache-active-p) (lambda (&rest _) nil)))
      ,@body))
 
+(defmacro org-with-syntax-table (table &rest body)
+  "Evaluate BODY with syntax table of current buffer set to TABLE.
+
+This is the same as `with-syntax-table' except that it also binds
+`parse-sexp-lookup-properties' to nil."
+  `(with-syntax-table ,table
+     (let ((parse-sexp-lookup-properties nil))
+       ,@body)))
+
 
 ;;; Buffer and windows
 
diff --git a/lisp/org.el b/lisp/org.el
index dcb1232c0..47d02433e 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -11816,7 +11816,7 @@ defun org-make-tags-matcher
                    (mm
                     (cond
                      (regexp            ; [2]
-                      `(with-syntax-table org-mode-tags-syntax-table
+                      `(org-with-syntax-table org-mode-tags-syntax-table
                          (org-match-any-p ,(substring tag 1 -1) tags-list)))
                      (propp
                       (let* (;; Determine property name.
@@ -11970,7 +11970,7 @@ defun org-tags-expand
             (add-text-properties
              (match-beginning 0) (match-end 0) '(regexp t) return-match)))
         ;; For each tag token found in MATCH, compute a regexp and it
-        (with-syntax-table org-mode-tags-syntax-table
+        (org-with-syntax-table org-mode-tags-syntax-table
           (replace-regexp-in-string
            key-regexp
            (lambda (m)
@@ -16993,7 +16993,7 @@ defun org-transpose-words
 table, which interprets characters in `org-emphasis-alist' as
 word constituents."
   (interactive)
-  (with-syntax-table org-mode-transpose-word-syntax-table
+  (org-with-syntax-table org-mode-transpose-word-syntax-table
     (call-interactively 'transpose-words)))
 
 (defvar org-ctrl-c-ctrl-c-hook nil
@@ -19902,7 +19902,7 @@ defun org-fill-element
 
 For convenience, when point is at a plain list, an item or
 a footnote definition, try to fill the first paragraph within."
-  (with-syntax-table org-mode-transpose-word-syntax-table
+  (org-with-syntax-table org-mode-transpose-word-syntax-table
     ;; Move to end of line in order to get the first paragraph within
     ;; a plain list or a footnote definition.
     (let ((element (save-excursion (end-of-line) (org-element-at-point))))
-- 
2.51.2

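As a rough illustration of why patch 1/3 binds `parse-sexp-lookup-properties' to nil, the sketch below scans for a matching brace with and without property lookup. The function name `my/demo-scan-braces' and the sample buffer contents are hypothetical; the let-binding inside `with-syntax-table' mirrors the shape of the `org-with-syntax-table' macro but is written out so that the example does not depend on Org.

```elisp
(defun my/demo-scan-braces (lookup-properties)
  "Return the position after the brace pair that starts a demo buffer.
LOOKUP-PROPERTIES is the value bound to `parse-sexp-lookup-properties'
during the scan.  Return nil if no matching brace is found.
Hypothetical helper, for illustration only."
  (let ((table (make-syntax-table)))
    ;; Make curly braces a matched pair in the temporary table.
    (modify-syntax-entry ?\{ "(}" table)
    (modify-syntax-entry ?\} "){" table)
    (with-temp-buffer
      (insert "{x}")
      ;; Assumption: some fontifier has tagged the closing brace as
      ;; punctuation through the `syntax-table' text property.
      (put-text-property 3 4 'syntax-table (string-to-syntax "."))
      (with-syntax-table table
        (let ((parse-sexp-lookup-properties lookup-properties))
          (ignore-errors (scan-lists (point-min) 1 0)))))))
```

Called with nil, the scan uses only the temporary table and returns 4, the position after the closing brace. Called with t, the text property overrides the table, the brace is no longer seen as a closer, and the scan returns nil.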
From 83c7b947b618eff400fe84499d1b36e49c40d251 Mon Sep 17 00:00:00 2001
From: Steven Allen <[email protected]>
Date: Sat, 8 Nov 2025 13:29:25 -0800
Subject: [PATCH 2/3] Don't include the leading newline when fontifying
 source-blocks

We still include the trailing newline, but there's no need to include
the leading newline.  Additionally, this patch skips fontification of
empty source-blocks and removes a few redundant variables.

* lisp/org.el (org-fontify-meta-lines-and-blocks-1): Include only the
trailing newline, not the leading newline, when fontifying.  Skip empty
source blocks.
---
 lisp/org.el | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 47d02433e..967cd223f 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5604,10 +5604,6 @@ defun org-fontify-meta-lines-and-blocks-1
            limit t)
       (let ((beg (match-beginning 0))
             (end-of-beginline (match-end 0))
-            ;; Including \n at end of #+begin line will include \n
-            ;; after the end of block content.
-            (block-start (match-end 0))
-            (block-end nil)
             (lang (match-string 7)) ; The language, if it is a source block.
             (bol-after-beginline (line-beginning-position 2))
             (dc1 (downcase (match-string 2)))
@@ -5633,7 +5629,6 @@ defun org-fontify-meta-lines-and-blocks-1
             (setq beg-of-endline (match-beginning 0)
                   end-of-endline (match-end 0)
                   nl-before-endline (1- (match-beginning 0)))
-            (setq block-end (match-beginning 0)) ; Include the final newline.
             (when quoting
               (org-remove-flyspell-overlays-in bol-after-beginline nl-before-endline)
               (remove-text-properties beg end-of-endline
@@ -5644,6 +5639,8 @@ defun org-fontify-meta-lines-and-blocks-1
             (org-remove-flyspell-overlays-in nl-before-endline end-of-endline)
             (cond
              ((and org-src-fontify-natively
+                   ;; Skip fontification of empty source-blocks
+                   (< bol-after-beginline beg-of-endline)
                    ;; Technically, according to the
                    ;; `org-src-fontify-natively' docstring, we should
                    ;; only fontify src blocks.  However, it is common
@@ -5653,8 +5650,8 @@ defun org-fontify-meta-lines-and-blocks-1
                    ;; for user convenience.
                    (member block-type '("src" "export" "example")))
               (save-match-data
-                (org-src-font-lock-fontify-block (or lang "") block-start block-end))
-              (add-text-properties bol-after-beginline block-end '(src-block t)))
+                (org-src-font-lock-fontify-block (or lang "") bol-after-beginline beg-of-endline))
+              (add-text-properties bol-after-beginline beg-of-endline '(src-block t)))
              (quoting
               (add-text-properties
                bol-after-beginline beg-of-endline
-- 
2.51.2

From d1b3ef77324a0a9d388c26998fadf1d7c9943ced Mon Sep 17 00:00:00 2001
From: Steven Allen <[email protected]>
Date: Sat, 8 Nov 2025 13:31:36 -0800
Subject: [PATCH 3/3] Apply the mode's syntax table(s) when fontifying code
 natively

This makes it easier to work with source blocks without editing them in
a new buffer: s-expression and symbol navigation/selection work as
expected, < is not treated as a parenthesis within source blocks unless
appropriate for the source block's mode, etc.

* lisp/org-src.el (org-src-font-lock-fontify-block): Preserve the
fontified source-code's syntax-table text-property where set.
Otherwise apply the syntax-table from the source-code's buffer to the
text (both blocks and inline).
(org-src--edit-element): Clean up any syntax-table properties before
editing.
* lisp/org.el (org-mode): Obey the syntax table text property in
commands.  This ensures that, e.g., "C-h f" correctly suggests the
symbol at point and doesn't, e.g., include any quotes, etc.
(org-unfontify-region): Clean up the syntax-table property.
---
 lisp/org-src.el              | 12 +++++++--
 lisp/org.el                  |  6 ++++-
 testing/lisp/test-org-src.el | 49 ++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/lisp/org-src.el b/lisp/org-src.el
index d36a69f85..763f93eed 100644
--- a/lisp/org-src.el
+++ b/lisp/org-src.el
@@ -608,7 +608,10 @@ defun org-src--edit-element
       ;; Insert contents.
       (insert contents)
       (remove-text-properties (point-min) (point-max)
-                              '(display nil invisible nil intangible nil))
+                              '( display nil
+                                 invisible nil
+                                 intangible nil
+                                 syntax-table nil ))
       (let ((lf (eq type 'latex-fragment)))
         (unless preserve-ind (org-do-remove-indentation (and lf block-ind) lf)))
       (set-buffer-modified-p nil)
@@ -695,7 +698,7 @@ defun org-src-font-lock-fontify-block
                   ;; space and the remapping between 'font-lock-face and 'face
                   ;; text properties may thus not be set.  See commit
                   ;; 453d634bc.
-                  (dolist (prop (append '(font-lock-face face) font-lock-extra-managed-props))
+                  (dolist (prop (append '(font-lock-face face syntax-table) font-lock-extra-managed-props))
                     (let ((new-prop (get-text-property pos prop)))
                       (when new-prop
                         (if (not (eq prop 'invisible))
@@ -736,6 +739,11 @@ defun org-src-font-lock-fontify-block
                              'org-src-invisible new-prop
                              org-buffer)))))))
                 (setq pos next)))
+            (let ((new-table (syntax-table)))
+              (alter-text-property
+               start end 'syntax-table
+               (lambda (old-table) (or old-table new-table))
+               org-buffer))
             (set-buffer-modified-p nil)))
       (error
        (message "Native code fontification error in %S at pos%d\n Error: %S"
diff --git a/lisp/org.el b/lisp/org.el
index 967cd223f..707405ff2 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5155,6 +5155,9 @@ define-derived-mode org-mode
   (org-setup-filling)
   ;; Comments.
   (org-setup-comments-handling)
+  ;; Obey the syntax-table text property when navigating text (used in
+  ;; source blocks).
+  (setq-local parse-sexp-lookup-properties t)
   ;; Beginning/end of defun
   (setq-local beginning-of-defun-function 'org-backward-element)
   (setq-local end-of-defun-function
@@ -6312,7 +6315,8 @@ defun org-unfontify-region
     (remove-text-properties beg end
                             '(mouse-face t keymap t org-linked-text t
                                          invisible t intangible t
-                                         org-emphasis t))
+                                         org-emphasis t
+                                         syntax-table t))
     (org-fold-core-update-optimisation beg end)
     (org-remove-font-lock-display-properties beg end)))
 
diff --git a/testing/lisp/test-org-src.el b/testing/lisp/test-org-src.el
index ebf8d8569..ec7c0601a 100644
--- a/testing/lisp/test-org-src.el
+++ b/testing/lisp/test-org-src.el
@@ -569,6 +569,55 @@
   (should (equal "#" (org-unescape-code-in-string "#")))
   (should (equal "," (org-unescape-code-in-string ","))))
 
+;;; Syntax Table Preservation
+
+(ert-deftest test-org-src/preserve-syntax-table ()
+  "Make sure we preserve the code's syntax-table where appropriate."
+  ;; Source blocks
+  (org-test-with-temp-text
+      "
+#+begin_src nxml
+<root><point>></root>
+#+end_src
+"
+    (should (looking-at-p "></root>"))
+    ;; nXML mode applies a different syntax-table to lone ">"
+    ;; characters, make sure we preserve that.
+    (should (equal (get-text-property (point) 'syntax-table)
+                   (string-to-syntax ".")))
+    ;; Everywhere else should use the mode's syntax table.
+    (dolist (pos (list (1+ (point)) (1- (point)) (pos-bol) (pos-eol)))
+      (should (equal (get-text-property pos 'syntax-table)
+                     nxml-mode-syntax-table)))
+    ;; But not outside the source code.
+    (dolist (pos (list (1- (pos-bol)) (1+ (pos-eol))))
+      (should-not (get-text-property pos 'syntax-table))))
+  ;; Inline source.
+  (org-test-with-temp-text
+      "src_nxml{<root><point>></root>}"
+    (should (looking-at-p "></root>"))
+    (should (equal (get-text-property (point) 'syntax-table)
+                   (string-to-syntax ".")))
+    ;; Everywhere else should use the mode's syntax table.
+    (dolist (pos (list (1+ (point)) (1- (point))))
+      (should (equal (get-text-property pos 'syntax-table)
+                     nxml-mode-syntax-table)))
+    ;; We should correctly parse this as an inline source block.
+    (let ((e (org-element-context)))
+      (should (eq (org-element-type e) 'inline-src-block)))
+    ;; And we should only add the syntax table to the code itself.
+    (save-excursion
+      (should (search-forward "}"))
+      (goto-char (match-beginning 0))
+      (should (eq (char-after) ?}))
+      (should-not (get-text-property (point) 'syntax-table))
+      (should (equal (get-text-property (1- (point)) 'syntax-table)
+                     nxml-mode-syntax-table)))
+    (save-excursion
+      (search-backward "{")
+      (should-not (get-text-property (point) 'syntax-table))
+      (should (equal (get-text-property (1+ (point)) 'syntax-table)
+                     nxml-mode-syntax-table)))))
 
 (provide 'test-org-src)
 ;;; test-org-src.el ends here
-- 
2.51.2
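
The "preserve where set, otherwise apply the mode's table" rule from patch 3/3 can also be sketched with plain property primitives. This is an illustrative alternative, not the code in the patch (which uses `alter-text-property'); the function name `my/fill-syntax-table-property' is hypothetical.

```elisp
(defun my/fill-syntax-table-property (start end table)
  "Set the `syntax-table' property to TABLE between START and END.
Stretches of text that already carry a `syntax-table' property are
left untouched.  Hypothetical helper, for illustration only."
  (let ((pos start))
    (while (< pos end)
      ;; NEXT is where the property value changes, or END if it does not.
      (let ((next (next-single-property-change pos 'syntax-table nil end)))
        (unless (get-text-property pos 'syntax-table)
          (put-text-property pos next 'syntax-table table))
        (setq pos next)))))
```

For a buffer containing "a>b" where only the ">" carries a punctuation entry, calling this over the whole buffer would leave that entry in place and give the two letters TABLE, which is the behaviour the test in the patch checks for nXML's lone ">" characters.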
