Richard Stallman wrote: > believe that I found a solution that does the right thing in most cases > and will send it to you in the next days. > > Could you dscribe in words what it does?
Attached find a file called `lisp-font-lock-regexp.el' which contains all changes I propose. You may try to load it, make the face definitions meet your requirements, and look whether it works. Syntax-highlighting and decoration for lisp-font-lock-keywords-2 must be activated. Eventually someone would have to decide on appropriate names and defaults for faces. I have set regexp highlighting to the minimum level 1. If this were incorporated in font-lock.el, the standard level should be 0 - which means no regexp highlighting and thus no obtrusiveness. Emacs would behave as before the introduction of regexp highlighting a couple of weeks ago. Level 1 does regexp highlighting as introduced recently with some minor bug fixes. Levels 2 and 3 should do something that was proposed in font-lock.el but commented out due to problems with an "unbreakable endless loop". Level 2 does this for regexp groups on a single line only. Level 3 should handle regexp groups spanning several lines as well. By no means the default level should equal 3 as will become evident from remarks below. The variable `lisp-font-lock-regexp' can be used to set the default level. Individual buffer settings can be achieved by using the command `lisp-font-lock-regexp'. Levels 2 and 3 use the syntax-table property to remove parenthesis syntax from unescaped parentheses and escaped brackets within regexp groups. I added syntax-table to `font-lock-extra-managed-props' since I don't want font-lock to perform the extra syntactic fontification pass. This idea is non-standard and could be defeated by anyone who removed syntax-table from that list - so far no one seems to use syntax-table properties in elisp-mode. With that property paren-matching/blinking and forward/backward-sexp should work "as intended" within parenthetical groups. You may have noticed my simple-minded posting on emacs-pretest-bug about forward-sexp not being able to handle unescaped semicolons within strings. I resolved the problem by setting the syntax-table property of `;' to punctuation within regexp groups. For a similar reason I reset the escape syntax property of single backslashes preceding parentheses and brackets. I do not treat special characters "as ordinary ones if they are in contexts where their special meanings make no sense". Hence, subexpressions like \\(\\[[^]]*]\\)* in `reftex-extract-bib-entries-from-thebibliography' \\(\\[[^\\]]*\\]\\)? in `reftex-all-used-citation-keys' \\`\\(\\\\[sS]?.\\|\\[\\^?]?[^]]*]\\|[^\\]\\) in `gnus-score-regexp-bad-p' \\(\[[0-9]+\] \\)* in `gud-jdb-marker-filter' do contain mismatches. With level 3 highlighting I'm using the font-lock-multiline property. Apparently this property is used by `smerge.el' too. Consequently, I cannot simply reset the variable `font-lock-multiline' to nil when I switch to a lower level. I believe that this variable - and the variable `parse-sexp-lookup-properties' as well - should be handled in a way similar to hooks or `buffer-invisibility-spec'. Anyone who wants to set these variables should create or append its name to a corresponding list and remove its name to eventually reset the variable. Routines checking the value of the variable would not be affected by this convention. Likely font-lock-multiline, syntax-table and `lisp-font-lock-regexp' prefixed properties should be added to `yank-excluded-properties' too. I've been experimenting a bit with level 3 highlighting. With a 200MHz PC the results are negative: Fontifiying a buffer is moderatly slow, modifying text is hardly supportable. With a 1GHz PC I did not encounter substantial difficulties with one exception - fontifying `cperl-init-faces' took a couple of seconds. I tried to look at bit closer what's going on. When I scrolled down through `cperl.el' and looked at what font-lock is doing I found out that the range from position 168761 to 172839 gets fontified no less than _seven_ times in sequence: Apparently `xdisp.c' - encountering an unfontified object at a position START - asks `jit-lock-function' to fontify from position START. jit-lock-function now calls `jit-lock-fontify-now' to fontify from START to (+ START jit-lock-chunk-size). The latter sets the fontified property for this region to t. `font-lock-default-fontify-region' detects that there is a font-lock-multiline pattern, fontifies the entire region from beginning to end of the pattern - the 168761 to 172839 region above - but does not set the fontified property for this region. I simply inserted `(put-text-property beg end 'fontified t)' in the text of `font-lock-default-fontify-region' right before it calls `font-lock-unfontify-region' and the problem disappeared. When I change some text within a font-lock-multiline pattern of `cperl-init-faces' font-lock refontifies the entire area twice which can take a couple of seconds. What happens here? The first refontification is triggered by redisplay which encounters an unfontified thing it should display (the thing was unfontified by `jit-lock-after-change' previously). The second refontification is eventually triggered by `jit-lock-context-fontify' which unfontifies everything from `jit-lock-context-unfontify-pos' until point-max. However, the second refontification is useless because font-lock-default-fontify-region already took care of the font-lock multiline pattern. Moreover, the second fontification usualy occurs right after the first has finished _before_ I am able to enter the next character. I could resolve this by having font-lock-default-fontify-region fontify a region iff it has not fontified exactly that region already since the last modification of the buffer. But font-lock-multiline patterns do not seem suited for handling this problem anyway. Patterns spanning more than a couple of lines - your mileage may vary - will delay redisplay because inserting one single character triggers refontification of the _entire_ pattern. It should be possible to resolve this problem by using the `jit-lock-defer-multiline' property. However, the latter is broken. Suppose I used jit-lock-defer-multiline instead of font-lock-multiline for my pattern. Inserting a character now will not delay redisplay anymore since font-lock-default-fontify-region does not cater for jit-lock-defer-multiline. Eventually, jit-lock-context-fontify will unfontify the relevant parts of my buffer from the start of the pattern to point-max, and everything should get fontified correctly. It does not, however, when the jit-lock-defer-multiline pattern starts _before_ `window-start': After jit-lock-context-fontify has unfontified the buffer, redisplay - for some reason I did not investigate - intercepts this by fontifying the _visible_ part of the buffer without caring about my pattern. Eventually, the invisible parts get refontified but the already fontified part doesn't because, as mentioned before, font-lock-default-fontify-region does not know jit-lock-defer-multiline patterns. Hence, fontification appears incorrect. I'm afraid there are no simple patches for this. Hence I provided the appropriate warnings that level 3 highlighting should be used with sufficient care. The feature I propose could be quite useful for people who write regular expressions only occasionally and I don't want to compromise it on behalf of the recent controversies on font-lock-comment-delimiter and font-lock-negation-char-face faces. On the other hand, I don't want to give pretext to anyone who plans to introduce yet another feature in the pre-release phase. Hence if you think that this should be delayed or cancelled please tell me so. I've also experimented with a patch of `show-paren-function' where I overlay the backslashes in `\\(...\\)' groups with the respective count of that group. Hence I don't have to literally step through such pairs when searching for the subexpressions referenced by match-string, match-beginning, ...
(defface lisp-font-lock-regexp-delimiter '((t (:bold t))) "Face for highlighting regexp group delimiters and brackets." :group 'font-lock-highlighting-faces) (defface lisp-font-lock-regexp-backslash '((t (:foreground "PaleGreen3"))) "Face for highlighting the backslash part of regexp group delimiters." :group 'font-lock-highlighting-faces) (defface lisp-font-lock-regexp-group '((t (:background "Grey86"))) "Face for highlighting inner regexp groups." :group 'font-lock-highlighting-faces) (defun lisp-font-lock-regexp-hook () "Automatically turn on regexp highlighting in `emacs-lisp-mode'." (setq lisp-font-lock-regexp lisp-font-lock-regexp) ; set buffer-local value (when (> lisp-font-lock-regexp 1) (set (make-local-variable 'parse-sexp-lookup-properties) t) (when (> lisp-font-lock-regexp 2) (set (make-local-variable 'font-lock-multiline) t)) (set (make-local-variable 'font-lock-extra-managed-props) (append font-lock-extra-managed-props (list 'syntax-table 'lisp-font-lock-regexp 'lisp-font-lock-regexp-group 'lisp-font-lock-regexp-alt))))) (defcustom lisp-font-lock-regexp 1 "*Highlight regular expression in `emacs-lisp-mode'. The following levels are available: 0 (off) do no highlight regular expressions specially. 1 (minimum) highlight the non-backslash parts of regexp group delimiters with `lisp-font-lock-regexp-delimiter' face and delimiter backslashes with `lisp-font-lock-regexp-backslash' face. Group delimiters are the backslash-sequences `\\(' `\\(?:' `\\|' and `\\)'. Delimiters appearing in documentation strings or non-string text are not highlighted. Within proper strings, however, *every* instance of such a delimiter will be highlighted regardless of its actual or intended semantics. Hence, you should use these backslash-sequences *exclusively* for parenthetical grouping of regexps. For other purposes try something like `(concat \"\\\\\" \"(\")' instead. Within character alternatives write `)\\\\' instead of `\\\\('. 2 (medium) as 1 but also highlight brackets delimiting character alternatives within single-line regexp groups with `lisp-font-lock-regexp-delimiter' face. Moreover, highlight inner regexp groups with `lisp-font-lock-regexp-group' face. Inner regexp groups are character sequences within `\\(...\\)' and `\\(?:...\\)' that appear on a single line and do not contain one of the backslash-sequences `\\(' `\\(?:' or `\\)'. Inner regexp groups may contain non-string text provided the respective delimiters appear within a string. In addition, 2 will try to set the syntax-table properties of parentheses, brackets and semicolons within single-line regexp groups appropriately. More precisely, brackets that do not delimit a character alternative or class, parentheses that do not delimit a group, semicolons, and single backslashes preceding a parenthesis or bracket, are classified as punctuation characters. Note that you can always create a surrounding group with the shy group delimiters `\\(?:...\\)' without modifying the semantics of enclosed regexps. 3 (maximum) as 2 but permit operations on regexp groups spanning several lines. This option exploits the `font-lock-multiline' text-property which is not guaranteed to work reliably and is notorious for delaying redisplay considerably. Hence use this option with *extreme* care! Setting the default value of this variable does not affect highlighting of live buffers. Use the command `lisp-font-lock-regexp' to change highlighting for the current buffer only." :type '(choice (const :tag "off" 0) (const :tag "minimum" 1) (const :tag "medium" 2) (const :tag "maximum" 3)) :set (lambda (symbol value) (set-default symbol value) (remove-hook 'emacs-lisp-mode-hook 'lisp-font-lock-regexp-hook) (when (and (boundp 'font-lock-mode) ; silly if this is part of font-lock (> value 0)) (custom-add-option 'emacs-lisp-mode-hook 'lisp-font-lock-regexp-hook) (add-hook 'emacs-lisp-mode-hook 'lisp-font-lock-regexp-hook))) :version "22.1" :group 'font-lock) (make-variable-buffer-local 'lisp-font-lock-regexp) (defconst lisp-font-lock-keywords-2 (append lisp-font-lock-keywords-1 (eval-when-compile `( ;; Control structures. Emacs Lisp forms. (,(concat "(" (regexp-opt '("cond" "if" "while" "let" "let*" "prog" "progn" "progv" "prog1" "prog2" "prog*" "inline" "lambda" "save-restriction" "save-excursion" "save-window-excursion" "save-selected-window" "save-match-data" "save-current-buffer" "unwind-protect" "condition-case" "track-mouse" "eval-after-load" "eval-and-compile" "eval-when-compile" "eval-when" "with-category-table" "with-current-buffer" "with-electric-help" "with-local-quit" "with-no-warnings" "with-output-to-string" "with-output-to-temp-buffer" "with-selected-window" "with-syntax-table" "with-temp-buffer" "with-temp-file" "with-temp-message" "with-timeout" "with-timeout-handler") t) "\\>") . 1) ;; Control structures. Common Lisp forms. (,(concat "(" (regexp-opt '("when" "unless" "case" "ecase" "typecase" "etypecase" "ccase" "ctypecase" "handler-case" "handler-bind" "restart-bind" "restart-case" "in-package" "break" "ignore-errors" "loop" "do" "do*" "dotimes" "dolist" "the" "locally" "proclaim" "declaim" "declare" "symbol-macrolet" "lexical-let" "lexical-let*" "flet" "labels" "compiler-let" "destructuring-bind" "macrolet" "tagbody" "block" "go" "multiple-value-bind" "multiple-value-prog1" "return" "return-from" "with-accessors" "with-compilation-unit" "with-condition-restarts" "with-hash-table-iterator" "with-input-from-string" "with-open-file" "with-open-stream" "with-output-to-string" "with-package-iterator" "with-simple-restart" "with-slots" "with-standard-io-syntax") t) "\\>") . 1) ;; Exit/Feature symbols as constants. (,(concat "(\\(catch\\|throw\\|featurep\\|provide\\|require\\)\\>" "[ \t']*\\(\\sw+\\)?") (1 font-lock-keyword-face) (2 font-lock-constant-face nil t)) ;; Erroneous structures. ("(\\(abort\\|assert\\|warn\\|check-type\\|cerror\\|error\\|signal\\)\\>" 1 font-lock-warning-face) ;; Words inside \\[] tend to be for `substitute-command-keys'. ("\\\\\\\\\\[\\(\\sw+\\)\\]" 1 font-lock-constant-face prepend) ;; Words inside `' tend to be symbol names. ("`\\(\\sw\\sw+\\)'" 1 font-lock-constant-face prepend) ;; Constant values. ("\\<:\\sw+\\>" 0 font-lock-builtin-face) ;; ELisp and CLisp `&' keywords as types. ("\\&\\sw+\\>" . font-lock-type-face) ((lambda (bound) (when (and (local-variable-p 'lisp-font-lock-regexp) (not (zerop lisp-font-lock-regexp))) (while (cond ((and (= lisp-font-lock-regexp 3) (get-text-property (point) 'lisp-font-lock-regexp)) (re-search-forward "\\(?:\\(\\\\\\\\\\)\\(?:\\(?:\\\\\\\\\\)\\|\\((\\(?:\\?:\\)?\\)\\|\\(|\\)\\|\\()\\)\\|\\([][]\\)\\)\ \\|\\(\\\\[][()]\\)\\|\\(\\[:[a-zA-Z]+:\\]\\)\\|\\(\\[\\)\\|\\(\\]\\)\\|\\([(;)]\\)\\)" bound 'bound)) ((and (= lisp-font-lock-regexp 2) (or (and (get-text-property (point) 'lisp-font-lock-regexp) (re-search-forward "\\(?:\\(\\\\\\\\\\)\\(?:\\(?:\\\\\\\\\\)\\|\\((\\(?:\\?:\\)?\\)\\|\\(|\\)\\|\\()\\)\\|\\([][]\\)\\)\ \\|\\(\\\\[][()]\\)\\|\\(\\[:[a-zA-Z]+:\\]\\)\\|\\(\\[\\)\\|\\(\\]\\)\\|\\([(;)]\\)\\|\\($\\)\\)" bound 'eol) (not (match-beginning 11))) (re-search-forward "\\(?:\\(\\\\\\\\\\)\\(?:\\(?:\\\\\\\\\\)\\|\\((\\(?:\\?:\\)?\\)\\|\\(|\\)\\|\\()\\)\\)\\)" bound 'bound)))) (t (re-search-forward "\\(?:\\(\\\\\\\\\\)\\(?:\\(?:\\\\\\\\\\)\\|\\((\\(?:\\?:\\)?\\)\\|\\(|\\)\\|\\()\\)\\)\\)" bound 'bound))) (let ((face (get-text-property (1- (point)) 'face))) (when (or (and (listp face) (memq 'font-lock-string-face face)) (eq 'font-lock-string-face face)) (cond ((match-beginning 2) ; \\( (font-lock-prepend-text-property (match-beginning 1) (match-end 1) 'face 'lisp-font-lock-regexp-backslash) (font-lock-prepend-text-property (match-beginning 2) (match-end 2) 'face 'lisp-font-lock-regexp-delimiter) (when (> lisp-font-lock-regexp 1) (let* ((level (or (get-text-property (point) 'lisp-font-lock-regexp) 0)) (from (match-beginning 1)) (mid (match-end 2)) (to (or (and (= lisp-font-lock-regexp 2) (line-end-position)) (and (> level 1) (next-single-property-change (point) 'lisp-font-lock-regexp)) (save-excursion ;; beginning of next defun (if (re-search-forward "^(" nil t) (1- (point)) (point-max)))))) (put-text-property from to 'lisp-font-lock-regexp (1+ level)) (put-text-property ; `\\(' is not part of inner group from mid 'lisp-font-lock-regexp-group nil) (put-text-property mid to 'lisp-font-lock-regexp-group t) (when (> lisp-font-lock-regexp 2) (put-text-property from to 'font-lock-multiline t))))) ((match-beginning 3) ; \\| (font-lock-prepend-text-property (match-beginning 1) (match-end 1) 'face 'lisp-font-lock-regexp-backslash) (font-lock-prepend-text-property (match-beginning 3) (match-end 3) 'face 'lisp-font-lock-regexp-delimiter)) ((match-beginning 4) ; \\) (let ((level (when (> lisp-font-lock-regexp 1) (get-text-property (point) 'lisp-font-lock-regexp)))) (cond ((and level (not (get-text-property (point) 'lisp-font-lock-regexp-alt))) (let* ((from (when (get-text-property (point) 'lisp-font-lock-regexp-group) (previous-single-property-change (point) 'lisp-font-lock-regexp-group))) (to (or (and (= lisp-font-lock-regexp 2) (line-end-position)) (next-single-property-change (point) 'lisp-font-lock-regexp) (point-max)))) (font-lock-prepend-text-property (match-beginning 1) (match-end 1) 'face 'lisp-font-lock-regexp-backslash) (font-lock-prepend-text-property (match-beginning 4) (match-end 4) 'face 'lisp-font-lock-regexp-delimiter) (if (> level 1) (put-text-property (match-end 4) to 'lisp-font-lock-regexp (1- level)) (remove-text-properties (match-end 4) to '(lisp-font-lock-regexp nil)) (when (> lisp-font-lock-regexp 2) (remove-text-properties (match-end 4) to '(font-lock-multiline nil)))) (when from (remove-text-properties (match-end 4) to '(lisp-font-lock-regexp-group nil)) (font-lock-prepend-text-property from (match-beginning 1) 'face 'lisp-font-lock-regexp-group)))) ((not level) ;; no open \\( or lisp-font-lock-regexp equals 1 (font-lock-prepend-text-property (match-beginning 1) (match-end 1) 'face 'lisp-font-lock-regexp-backslash) (font-lock-prepend-text-property (match-beginning 4) (match-end 4) 'face 'lisp-font-lock-regexp-delimiter)) ((get-text-property (point) 'lisp-font-lock-regexp) (put-text-property (1- (point)) (point) 'syntax-table '(3)))))) ;; matches below should occur within parenthetical groups ((match-beginning 5) ; \\[ or \\] (if (get-text-property (point) 'lisp-font-lock-regexp-alt) ;; within alternative (goto-char (1- (point))) (put-text-property (1- (point)) (point) 'syntax-table '(3)))) ((match-beginning 6) ; escaped parenthesis or bracket (put-text-property (match-beginning 0) (1+ (match-beginning 0)) 'syntax-table '(3)) ;; reread paren (goto-char (1+ (match-beginning 0)))) ((match-beginning 7)) ;; POSIX character class, skip to preserve paren syntax ((match-beginning 8) ; [ (if (get-text-property (point) 'lisp-font-lock-regexp-alt) ;; already within alternative (put-text-property (1- (point)) (point) 'syntax-table '(3)) ;; starting new alternative (let ((to (or (and (= lisp-font-lock-regexp 2) (line-end-position)) (next-single-property-change (point) 'lisp-font-lock-regexp) (point-max)))) (font-lock-prepend-text-property (match-beginning 8) (match-end 8) 'face 'lisp-font-lock-regexp-delimiter) (put-text-property ;; the following should be reset at \\) (point) to 'lisp-font-lock-regexp-alt t)))) ((match-beginning 9) ; ] (let* ((from (when (get-text-property (point) 'lisp-font-lock-regexp-alt) (previous-single-property-change (point) 'lisp-font-lock-regexp-alt)))) (cond ((not from)) ;; here likely some cases are missing ((or (and (char-equal (char-before (1- (point))) ?\[ ) ;; [] (= from (1- (point)))) (and (char-equal (char-before (1- (point))) ?^ ) (or (and (char-equal (char-before (- (point) 2)) ?\[ ) ;; [^] (= from (- (point) 2))) (and (char-equal (char-before (- (point) 2)) ?\\ ) (char-equal (char-equal (- (point) 3)) ?\[ ) ;; [\^] (= from (- (point) 3))))) (and (char-equal (char-before (1- (point))) ?\\ ) (or (and (char-equal (char-before (- (point) 2)) ?\[ ) ;; [\] (= from (- (point) 2))) (and (char-equal (char-before (- (point) 2)) ?^ ) (char-equal (char-before (- (point) 3)) ?\[ ) ;; [^\] (= from (- (point) 3))) (and (char-equal (char-before (- (point) 2)) ?\^ ) (char-equal (char-equal (- (point) 3)) ?\\ ) (char-equal (char-equal (- (point) 4)) ?\[ ) ;; [\^\] (= from (- (point) 4)))))) (put-text-property (1- (point)) (point) 'syntax-table '(3))) ((< from (match-beginning 9)) (let ((to (or (and (= lisp-font-lock-regexp 2) (line-end-position)) (next-single-property-change (point) 'lisp-font-lock-regexp-alt) (point-max)))) (font-lock-prepend-text-property (match-beginning 9) (match-end 9) 'face 'lisp-font-lock-regexp-delimiter) (remove-text-properties (point) to '(lisp-font-lock-regexp-alt nil))))))) ((match-beginning 10) ; (;) (put-text-property (1- (point)) (point) 'syntax-table '(3))))))))) nil)))) "Gaudy level highlighting for Lisp modes.") (defun lisp-font-lock-regexp (arg) "Set regular expression highlighting for current buffer. ARG may be one of 0, 1, 2 or 3. See the documentation of the variable `lisp-font-lock-regexp' for the meaning of these values. Regular expression highlighting works in Emacs-Lisp mode only." (interactive (if (eq major-mode 'emacs-lisp-mode) (list (read-number "Locally highlight regexps 0 (off), 1 (min), 2 (med), 3 (max): ")) (error "This option can be used in `emacs-lisp-mode' only"))) (when (and (boundp 'font-lock-mode) font-lock-mode (>= arg 0) (<= arg 3) (/= arg lisp-font-lock-regexp)) (save-excursion (setq lisp-font-lock-regexp arg) (font-lock-unfontify-buffer) ;; can't reset parse-sexp-lookup-properties and font-lock-multiline since ;; they might have been set by other programs or the user, this could be ;; resolved by handling them like buffer-invisibility-spec (setq font-lock-extra-managed-props ;; removing syntax-table from font-lock-extra-managed-props is dirty ;; provided someone else did put it there; eventually this should be ;; handled by font-lock-syntactic-keywords in an appropriate fashion (delq 'syntax-table (delq 'lisp-font-lock-regexp (delq 'lisp-font-lock-regexp-group (delq 'lisp-font-lock-regexp-alt font-lock-extra-managed-props))))) ;; the following is needed to have jit-lock refontify the buffer (when jit-lock-mode (setq jit-lock-context-unfontify-pos (point-min))) (when (> arg 1) (set (make-local-variable 'parse-sexp-lookup-properties) t) (when (> arg 2) (set (make-local-variable 'font-lock-multiline) t)) (set (make-local-variable 'font-lock-extra-managed-props) (append font-lock-extra-managed-props (list 'syntax-table 'lisp-font-lock-regexp 'lisp-font-lock-regexp-group 'lisp-font-lock-regexp-alt)))) (dolist (window (get-buffer-window-list (current-buffer) nil t)) ;; refontify contents of any window showing current buffer, avoids ;; displaying unfontified buffers (with-selected-window window (font-lock-fontify-region (window-start) (window-end)))))))
_______________________________________________ Emacs-devel mailing list Emacs-devel@gnu.org http://lists.gnu.org/mailman/listinfo/emacs-devel