I've attached 3 patches this time to try to keep each individual change
small(ish).

Ihor Radchenko <[email protected]> writes:

> Steven Allen <[email protected]> writes:
>
>> 1. Inline source code: The original patch broke
>>    org-element--parse-paired-brackets because that function relies on
>>    syntax tables to find the pair of surrounding curly braces.
>
> In the long term, the parser should move away from relying on in-buffer
> settings, including syntax table. So, I think that making things less
> dependent on local syntax tables is the right direction.
> This is because many people desire Org parser to work outside Org
> buffers. Unexpected tables is one of the reasons why it is not possible
> as of now.

In this case at least, the Org parser uses (with-syntax-table ...) to
temporarily bind a new syntax table. The issue only arises when
`parse-sexp-lookup-properties' is non-nil and the text has a
syntax-table property.

I've replaced `with-syntax-table' with an `org-with-syntax-table' macro that
let-binds `parse-sexp-lookup-properties' to nil.
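
To illustrate the failure mode, here is a minimal sketch (not part of the
patches). `my/scan-closing-brace' is a hypothetical helper, and the table it
builds only approximates the curly-brace table the parser uses; the point is
just that a `syntax-table' text property can override a temporarily bound
syntax table:

```elisp
(defun my/scan-closing-brace (lookup-properties)
  "Return the position after the brace closing the \"{\" at buffer start.
Return nil if no closing brace is found.  LOOKUP-PROPERTIES is the value
bound to `parse-sexp-lookup-properties' during the scan."
  (let ((table (make-syntax-table)))
    ;; Treat curly braces as a matching pair.
    (modify-syntax-entry ?\{ "(}" table)
    (modify-syntax-entry ?\} "){" table)
    (with-temp-buffer
      (insert "{x}")
      ;; Simulate a fontifier that marked the closing brace as punctuation.
      (put-text-property 3 4 'syntax-table (string-to-syntax "."))
      (with-syntax-table table
        (let ((parse-sexp-lookup-properties lookup-properties))
          ;; `scan-lists' signals an error on unbalanced input.
          (ignore-errors (scan-lists (point-min) 1 0)))))))
```

With (my/scan-closing-brace nil) the scan should find the closing brace
(position 4); with (my/scan-closing-brace t) the text property hides it and
the result is nil.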

>> 2. Source blocks: The original patch broke some regular expression (I
>>    think?) that matches the beginning of source blocks, causing it to
>>    include the first word of the source-code in the language name. E.g.,
>>    in one test:
>>
>>    #+BEGIN_SRC eshell
>>    echo 2
>>    #+END_SRC
>>
>>    Parsed the language as "eshell\necho"
>
> I am wondering if this was inside the parser or in one of the ad-hoc
> regexp searches. If the former, it is a good idea to shield that part of
> the parser.

It's the former:

1. `org-element-src-block-parser' relies on the syntax table treating
newlines as whitespace [1].
2. eshell-mode treats newlines as "comment enders", not whitespace [2].

[1] https://git.savannah.gnu.org/cgit/org-mode.git/tree/lisp/org-element.el?id=565459cf27470d8830fdf0e54bcf2a3b71ac513d#n3055
[2] https://git.savannah.gnu.org/cgit/emacs.git/tree/lisp/eshell/esh-mode.el?id=ef2584585b095fff1e045e58673d2a7c8f9af799#n261

If the desire is to support parsing org-mode text outside of org-mode,
it probably makes sense to replace all instances of "\\S-" in regular
expressions with an explicit character class that doesn't depend on the
syntax table (e.g., "[^ \t\r\n]"). However, that's a very large search
and replace.
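
A small sketch of the difference, assuming a simplified regexp rather than
the real one from `org-element-src-block-parser' (`my/match-language' is a
hypothetical helper, and the table is built by hand instead of taken from
eshell-mode):

```elisp
(defun my/match-language (regexp newline-syntax)
  "Return what group 1 of REGEXP matches at the start of a sample block.
NEWLINE-SYNTAX is the syntax descriptor assigned to the newline
character in a temporary syntax table."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?\n newline-syntax table)
    (with-temp-buffer
      (insert "#+BEGIN_SRC eshell\necho 2\n#+END_SRC\n")
      (goto-char (point-min))
      (with-syntax-table table
        (when (looking-at regexp)
          (match-string 1))))))

;; Syntax-class version: the result depends on the active syntax table.
(my/match-language "#\\+BEGIN_SRC +\\(\\S-+\\)" " ")
(my/match-language "#\\+BEGIN_SRC +\\(\\S-+\\)" ">")

;; Explicit character class: the result is the same under both tables.
(my/match-language "#\\+BEGIN_SRC +\\([^ \t\r\n]+\\)" ">")
```

When newline is a comment ender (">"), "\\S-+" runs past the end of the
line and group 1 becomes "eshell\necho", which is what the failing test
showed. The explicit class stops at the newline either way.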

>>> What about inline src blocks?
>>
>> I restricted the previous version to JUST blocks because I figured it'd
>> be safer (I assume org-mode skips over all the lines between begin/end
>> when fontifying).
>
> Considering the above-state goal, I am willing to sacrifice short-term
> stability in favor of revealing more bottlenecks in the parser where we
> rely on defaults/major mode setup.

SGTM.

>> I've dug into this a bit more now and, I can fix the tests by binding
>> parse-sexp-lookup-properties to nil inside
>> org-element--parse-paired-brackets (patch attached). However, I'm not
>> sure there aren't other cases lurking untested.
>>
>> One option here is to replace calls to with-syntax-table with an
>> org-with-syntax-table that binds parse-sexp-lookup-properties to nil but
>> that may be overkill.
>
> I think that will be exactly right kill of kill :)

Done. I've added an `org-with-syntax-table' macro that does this.

>>                (setq pos next)))
>> +              (put-text-property start end 'syntax-table (syntax-table) org-buffer)
>>                (set-buffer-modified-p nil)))
>
> What if src block's major mode itself sets 'syntax-table property?

Good point. Fixed in the latest patch set.
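
The idea is to keep any `syntax-table' property the mode already set and
only fill in the gaps with the mode's syntax table. Roughly, as a standalone
sketch (`my/fill-missing-syntax-table' is a hypothetical name, and this
operates on the current buffer rather than copying between buffers like the
actual patch does):

```elisp
(defun my/fill-missing-syntax-table (start end table)
  "Set the `syntax-table' property to TABLE between START and END.
Runs of text that already carry a `syntax-table' property keep it."
  (let ((pos start))
    (while (< pos end)
      ;; Find the end of the run sharing one `syntax-table' value.
      (let ((next (next-single-property-change pos 'syntax-table nil end)))
        (unless (get-text-property pos 'syntax-table)
          (put-text-property pos next 'syntax-table table))
        (setq pos next)))))

(defun my/fill-missing-syntax-table-demo ()
  "Return the `syntax-table' properties of \"a>b\" after filling gaps.
The \">\" is pre-marked as punctuation, as a major mode might do."
  (let ((table (make-syntax-table)))
    (with-temp-buffer
      (insert "a>b")
      (put-text-property 2 3 'syntax-table (string-to-syntax "."))
      (my/fill-missing-syntax-table (point-min) (point-max) table)
      (list (eq (get-text-property 1 'syntax-table) table)
            (equal (get-text-property 2 'syntax-table)
                   (string-to-syntax "."))
            (eq (get-text-property 3 'syntax-table) table)))))
```

In the demo, the pre-existing punctuation entry on ">" survives while the
surrounding characters pick up the table.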

From e6ea4ed4c876ce6d2c6ea291517299db18c00176 Mon Sep 17 00:00:00 2001
From: Steven Allen <[email protected]>
Date: Sat, 8 Nov 2025 13:27:01 -0800
Subject: [PATCH 1/3] Ignore the syntax-table text-prop when binding a temp
 syntax-table

This ensures we can, e.g., scan for matching brackets without having the
syntax-table text property interfere with our search.

* lisp/org-macs.el (org-with-syntax-table): Add an Org-specific
`with-syntax-table' macro that binds `parse-sexp-lookup-properties' to
nil while evaluating the body.
---
 lisp/org-agenda.el  | 2 +-
 lisp/org-capture.el | 2 +-
 lisp/org-element.el | 6 +++---
 lisp/org-macs.el    | 9 +++++++++
 lisp/org.el         | 8 ++++----
 5 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/lisp/org-agenda.el b/lisp/org-agenda.el
index 0444d0d81..8242f8036 100644
--- a/lisp/org-agenda.el
+++ b/lisp/org-agenda.el
@@ -4809,7 +4809,7 @@ defun org-search-view
 	    (setq rtn (list (format "ORG-AGENDA-ERROR: No such org-file %s"
 				    file))))
 	  (with-current-buffer buffer
-	    (with-syntax-table (org-search-syntax-table)
+	    (org-with-syntax-table (org-search-syntax-table)
 	      (unless (derived-mode-p 'org-mode)
 		(error "Agenda file %s is not in Org mode" file))
 	      (let ((case-fold-search t))
diff --git a/lisp/org-capture.el b/lisp/org-capture.el
index 45348b7e9..3c8040854 100644
--- a/lisp/org-capture.el
+++ b/lisp/org-capture.el
@@ -2091,7 +2091,7 @@ defun org-capture-expand-embedded-elisp
        ;; Only mark valid and non-escaped sexp.
        ((org-capture-escaped-%) nil)
        (t
-	(let ((end (with-syntax-table emacs-lisp-mode-syntax-table
+	(let ((end (org-with-syntax-table emacs-lisp-mode-syntax-table
 		     (ignore-errors (scan-sexps (1- (point)) 1)))))
 	  (when end
 	    (put-text-property (- (point) 2) end 'org-embedded-elisp t))))))))
diff --git a/lisp/org-element.el b/lisp/org-element.el
index 22fdec4d2..3dd33b11e 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -507,7 +507,7 @@ defun org-element--parse-paired-brackets
 			  (_ nil)))
 	  (pos (point)))
       (when syntax-table
-	(with-syntax-table syntax-table
+	(org-with-syntax-table syntax-table
 	  (let ((end (ignore-errors (scan-lists pos 1 0))))
 	    (when end
 	      (goto-char end)
@@ -3399,7 +3399,7 @@ defun org-element-citation-parser
                         (match-string-no-properties 1))))
 	   ;; Ignore blanks between cite type and prefix or key.
 	   (start (match-end 0))
-	   (closing (with-syntax-table org-element--pair-square-table
+	   (closing (org-with-syntax-table org-element--pair-square-table
 		      (ignore-errors (scan-lists begin 1 0)))))
       (save-excursion
 	(when (and closing
@@ -3640,7 +3640,7 @@ defun org-element-footnote-reference-parser
 `:end', `:contents-begin', `:contents-end' and `:post-blank' as
 properties.  Otherwise, return nil."
   (when (looking-at org-footnote-re)
-    (let ((closing (with-syntax-table org-element--pair-square-table
+    (let ((closing (org-with-syntax-table org-element--pair-square-table
 		     (ignore-errors (scan-lists (point) 1 0)))))
       (when closing
 	(save-excursion
diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index c6f2a9033..6ec3bfcea 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -274,6 +274,15 @@ defmacro org-element-with-disabled-cache
   `(cl-letf (((symbol-function #'org-element--cache-active-p) (lambda (&rest _) nil)))
      ,@body))
 
+(defmacro org-with-syntax-table (table &rest body)
+  "Evaluate BODY with syntax table of current buffer set to TABLE.
+
+This is the same as `with-syntax-table' except that it also binds
+`parse-sexp-lookup-properties' to nil."
+  `(with-syntax-table ,table
+     (let ((parse-sexp-lookup-properties nil))
+       ,@body)))
+
 
 ;;; Buffer and windows
 
diff --git a/lisp/org.el b/lisp/org.el
index dcb1232c0..47d02433e 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -11816,7 +11816,7 @@ defun org-make-tags-matcher
 		   (mm
 		    (cond
 		     (regexp			; [2]
-                      `(with-syntax-table org-mode-tags-syntax-table
+                      `(org-with-syntax-table org-mode-tags-syntax-table
                          (org-match-any-p ,(substring tag 1 -1) tags-list)))
 		     (propp
 		      (let* (;; Determine property name.
@@ -11970,7 +11970,7 @@ defun org-tags-expand
 	    (add-text-properties
 	     (match-beginning 0) (match-end 0) '(regexp t) return-match)))
 	;; For each tag token found in MATCH, compute a regexp and  it
-	(with-syntax-table org-mode-tags-syntax-table
+	(org-with-syntax-table org-mode-tags-syntax-table
 	  (replace-regexp-in-string
 	   key-regexp
 	   (lambda (m)
@@ -16993,7 +16993,7 @@ defun org-transpose-words
 table, which interprets characters in `org-emphasis-alist' as
 word constituents."
   (interactive)
-  (with-syntax-table org-mode-transpose-word-syntax-table
+  (org-with-syntax-table org-mode-transpose-word-syntax-table
     (call-interactively 'transpose-words)))
 
 (defvar org-ctrl-c-ctrl-c-hook nil
@@ -19902,7 +19902,7 @@ defun org-fill-element
 
 For convenience, when point is at a plain list, an item or
 a footnote definition, try to fill the first paragraph within."
-  (with-syntax-table org-mode-transpose-word-syntax-table
+  (org-with-syntax-table org-mode-transpose-word-syntax-table
     ;; Move to end of line in order to get the first paragraph within
     ;; a plain list or a footnote definition.
     (let ((element (save-excursion (end-of-line) (org-element-at-point))))
-- 
2.51.2

From 83c7b947b618eff400fe84499d1b36e49c40d251 Mon Sep 17 00:00:00 2001
From: Steven Allen <[email protected]>
Date: Sat, 8 Nov 2025 13:29:25 -0800
Subject: [PATCH 2/3] Don't include the leading newline when fontifying
 source-blocks

We still include the trailing newline, but there's no need to include
the leading newline. Additionally, this patch skips fontification of
empty source-blocks and removes a few redundant variables.

* lisp/org.el (org-fontify-meta-lines-and-blocks-1): Include only the
trailing newline, not the leading newline, when fontifying. Skip empty
source blocks.
---
 lisp/org.el | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 47d02433e..967cd223f 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5604,10 +5604,6 @@ defun org-fontify-meta-lines-and-blocks-1
 	   limit t)
       (let ((beg (match-beginning 0))
 	    (end-of-beginline (match-end 0))
-	    ;; Including \n at end of #+begin line will include \n
-	    ;; after the end of block content.
-	    (block-start (match-end 0))
-	    (block-end nil)
 	    (lang (match-string 7)) ; The language, if it is a source block.
 	    (bol-after-beginline (line-beginning-position 2))
 	    (dc1 (downcase (match-string 2)))
@@ -5633,7 +5629,6 @@ defun org-fontify-meta-lines-and-blocks-1
 	    (setq beg-of-endline (match-beginning 0)
 		  end-of-endline (match-end 0)
 		  nl-before-endline (1- (match-beginning 0)))
-	    (setq block-end (match-beginning 0)) ; Include the final newline.
 	    (when quoting
 	      (org-remove-flyspell-overlays-in bol-after-beginline nl-before-endline)
 	      (remove-text-properties beg end-of-endline
@@ -5644,6 +5639,8 @@ defun org-fontify-meta-lines-and-blocks-1
 	    (org-remove-flyspell-overlays-in nl-before-endline end-of-endline)
             (cond
 	     ((and org-src-fontify-natively
+                   ;; Skip fontification of empty source-blocks
+                   (< bol-after-beginline beg-of-endline)
                    ;; Technically, according to the
                    ;; `org-src-fontify-natively' docstring, we should
                    ;; only fontify src blocks.  However, it is common
@@ -5653,8 +5650,8 @@ defun org-fontify-meta-lines-and-blocks-1
                    ;; for user convenience.
                    (member block-type '("src" "export" "example")))
 	      (save-match-data
-                (org-src-font-lock-fontify-block (or lang "") block-start block-end))
-	      (add-text-properties bol-after-beginline block-end '(src-block t)))
+                (org-src-font-lock-fontify-block (or lang "") bol-after-beginline beg-of-endline))
+	      (add-text-properties bol-after-beginline beg-of-endline '(src-block t)))
 	     (quoting
 	      (add-text-properties
 	       bol-after-beginline beg-of-endline
-- 
2.51.2

From d1b3ef77324a0a9d388c26998fadf1d7c9943ced Mon Sep 17 00:00:00 2001
From: Steven Allen <[email protected]>
Date: Sat, 8 Nov 2025 13:31:36 -0800
Subject: [PATCH 3/3] Apply the mode's syntax table(s) when fontifying code
 natively

This makes it easier to work with source blocks without editing them in
a new buffer: s-expression and symbol navigation/selection work as
expected, < is not treated as a parenthesis within source blocks unless
appropriate for the source block's mode, etc.

* lisp/org-src.el (org-src-font-lock-fontify-block): Preserve the
fontified source-code's syntax-table text-property where set.  Otherwise
apply the syntax-table from the source-code's buffer to the text (both
blocks and inline).
(org-src--edit-element): Clean up any syntax-table properties before
editing.
* lisp/org.el (org-mode): Obey the syntax table text property in
commands.  This ensures that, e.g., "C-h f" correctly suggests the
symbol at point and doesn't, e.g., include any quotes, etc.
(org-unfontify-region): Clean up the syntax-table property.
---
 lisp/org-src.el              | 12 +++++++--
 lisp/org.el                  |  6 ++++-
 testing/lisp/test-org-src.el | 49 ++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/lisp/org-src.el b/lisp/org-src.el
index d36a69f85..763f93eed 100644
--- a/lisp/org-src.el
+++ b/lisp/org-src.el
@@ -608,7 +608,10 @@ defun org-src--edit-element
 	;; Insert contents.
 	(insert contents)
 	(remove-text-properties (point-min) (point-max)
-				'(display nil invisible nil intangible nil))
+				'( display nil
+                                   invisible nil
+                                   intangible nil
+                                   syntax-table nil ))
 	(let ((lf (eq type 'latex-fragment)))
           (unless preserve-ind (org-do-remove-indentation (and lf block-ind) lf)))
 	(set-buffer-modified-p nil)
@@ -695,7 +698,7 @@ defun org-src-font-lock-fontify-block
                   ;; space and the remapping between 'font-lock-face and 'face
                   ;; text properties may thus not be set.  See commit
                   ;; 453d634bc.
-	          (dolist (prop (append '(font-lock-face face) font-lock-extra-managed-props))
+	          (dolist (prop (append '(font-lock-face face syntax-table) font-lock-extra-managed-props))
 		    (let ((new-prop (get-text-property pos prop)))
                       (when new-prop
                         (if (not (eq prop 'invisible))
@@ -736,6 +739,11 @@ defun org-src-font-lock-fontify-block
                                'org-src-invisible new-prop
 		               org-buffer)))))))
 	          (setq pos next)))
+              (let ((new-table (syntax-table)))
+                (alter-text-property
+                 start end 'syntax-table
+                 (lambda (old-table) (or old-table new-table))
+                 org-buffer))
               (set-buffer-modified-p nil)))
         (error
          (message "Native code fontification error in %S at pos%d\n Error: %S"
diff --git a/lisp/org.el b/lisp/org.el
index 967cd223f..707405ff2 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5155,6 +5155,9 @@ define-derived-mode org-mode
   (org-setup-filling)
   ;; Comments.
   (org-setup-comments-handling)
+  ;; Obey the syntax-table text property when navigating text (used in
+  ;; source blocks).
+  (setq-local parse-sexp-lookup-properties t)
   ;; Beginning/end of defun
   (setq-local beginning-of-defun-function 'org-backward-element)
   (setq-local end-of-defun-function
@@ -6312,7 +6315,8 @@ defun org-unfontify-region
     (remove-text-properties beg end
 			    '(mouse-face t keymap t org-linked-text t
 					 invisible t intangible t
-					 org-emphasis t))
+					 org-emphasis t
+                                         syntax-table t))
     (org-fold-core-update-optimisation beg end)
     (org-remove-font-lock-display-properties beg end)))
 
diff --git a/testing/lisp/test-org-src.el b/testing/lisp/test-org-src.el
index ebf8d8569..ec7c0601a 100644
--- a/testing/lisp/test-org-src.el
+++ b/testing/lisp/test-org-src.el
@@ -569,6 +569,55 @@
   (should (equal "#" (org-unescape-code-in-string "#")))
   (should (equal "," (org-unescape-code-in-string ","))))
 
+;;; Syntax Table Preservation
+
+(ert-deftest test-org-src/preserve-syntax-table ()
+  "Make sure we preserve the code's syntax-table where appropriate."
+  ;; Source blocks
+  (org-test-with-temp-text
+   "
+#+begin_src nxml
+<root><point>></root>
+#+end_src
+"
+   (should (looking-at-p "></root>"))
+   ;; nXML mode applies a different syntax-table to lone ">"
+   ;; characters, make sure we preserve that.
+   (should (equal (get-text-property (point) 'syntax-table)
+                  (string-to-syntax ".")))
+   ;; Everywhere else should use the mode's syntax table.
+   (dolist (pos (list (1+ (point)) (1- (point)) (pos-bol) (pos-eol)))
+     (should (equal (get-text-property pos 'syntax-table)
+                    nxml-mode-syntax-table)))
+   ;; But not outside the source code.
+   (dolist (pos (list (1- (pos-bol)) (1+ (pos-eol))))
+     (should-not (get-text-property pos 'syntax-table))))
+  ;; Inline source.
+  (org-test-with-temp-text
+   "src_nxml{<root><point>></root>}"
+   (should (looking-at-p "></root>"))
+   (should (equal (get-text-property (point) 'syntax-table)
+                  (string-to-syntax ".")))
+   ;; Everywhere else should use the mode's syntax table.
+   (dolist (pos (list (1+ (point)) (1- (point))))
+     (should (equal (get-text-property pos 'syntax-table)
+                    nxml-mode-syntax-table)))
+   ;; We should correctly parse this as an inline source block.
+   (let ((e (org-element-context)))
+     (should (eq (org-element-type e) 'inline-src-block)))
+   ;; And we should only add the syntax table to the code itself.
+   (save-excursion
+     (should (search-forward "}"))
+     (goto-char (match-beginning 0))
+     (should (eq (char-after) ?}))
+     (should-not (get-text-property (point) 'syntax-table))
+     (should (equal (get-text-property (1- (point)) 'syntax-table)
+                    nxml-mode-syntax-table)))
+   (save-excursion
+     (search-backward "{")
+     (should-not (get-text-property (point) 'syntax-table))
+     (should (equal (get-text-property (1+ (point)) 'syntax-table)
+                    nxml-mode-syntax-table)))))
 
 (provide 'test-org-src)
 ;;; test-org-src.el ends here
-- 
2.51.2
