On 19/04/2021 15:33, Nicolas Goaziou wrote:
Could you try the following instead?
--8<---------------cut here---------------start------------->8---
(defun org-sort-remove-invisible (s)
"Remove invisible part of links and emphasis markers from string S."
(let ((remove-markers
(lambda (m)
Just a curiosity, what is style guide recommendation: let - lambda or
cl-labels?
(concat (match-string 1 m)
(match-string 4 m)
(match-string 5 m)))))
(remove-text-properties 0 (length s) org-rm-props s)
(replace-regexp-in-string
org-verbatim-re remove-markers
(replace-regexp-in-string org-emph-re remove-markers
(org-link-display-format s) t tt)
Single "t" is at the end, isn't it?
t t)))
I think, the patch is an improvement.
There is still a minor shortcoming that ordering of emphasized and
non-emphasized words is undefined
- /a/
- *a*
- a
Let's leave it aside since it requires multilevel comparison similar to
collation in locales and more strict definition which part of string is
compared at each level:
- /a/ :: A
- /a/ :: Z
- a :: A
- a :: Z
In my opinion, a more severe limitation comes from sequential
regexp-based approach. Consider stripping markers from
1. "a =b *c* d= e"
2. "*b* /i/"
First case could be solved by splitting the input string by verbatim
regexp at first and by applying emphasis substitution only to
non-verbatim parts. However I am rather shy to experiment with regexps
definition to avoid including chars before and after emphasis markers.
It would be great to get role of particular characters from more
reliable parser (or at least from text properties affected by
fontification).
I have some tests for `org-sort-remove-invisible', see the attachment.
Actually I do not like such style of tests, I strongly prefer to see all
failures at once, but I have not found if such technique is used
anywhere in org tests.
>From 6719386aa5a95394a4cb98ce28b889f545620bd9 Mon Sep 17 00:00:00 2001
From: Max Nikulin <maniku...@gmail.com>
Date: Mon, 19 Apr 2021 19:06:36 +0700
Subject: [PATCH] More tests for `org-sort-list'
* lisp/org.el (org-verbatim-re): Add to the docstring a statement
concerning subgroups meaning.
* testing/lisp/test-org-list.el (test-org-list/sort): Add cases with
text emphasis. Stress in comments that it is significant whether
locale-specific collation rules ignores spaces.
* testing/lisp/test-org.el (test-org/org-sort-remove-invisible): Add
tests for `org-sort-list' helper.
Add a test with expected failures to make apparent limitation of simple
regexp-based approach in `org-sort-remove-invisible' function.
---
lisp/org.el | 3 ++-
testing/lisp/test-org-list.el | 34 +++++++++++++++++++++++++++++++++-
testing/lisp/test-org.el | 27 +++++++++++++++++++++++++++
3 files changed, 62 insertions(+), 2 deletions(-)
diff --git a/lisp/org.el b/lisp/org.el
index c2738100f..3d4f5553a 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3685,7 +3685,8 @@ After a match, the match groups contain these elements:
5 The character after the match, empty at the end of a line")
(defvar org-verbatim-re nil
- "Regular expression for matching verbatim text.")
+ "Regular expression for matching verbatim text.
+Groups 1-5 have the same meaning as in `org-emph-re' pattern.")
(defvar org-emphasis-regexp-components) ; defined just below
(defvar org-emphasis-alist) ; defined just below
diff --git a/testing/lisp/test-org-list.el b/testing/lisp/test-org-list.el
index 3689a172f..cc7914c8d 100644
--- a/testing/lisp/test-org-list.el
+++ b/testing/lisp/test-org-list.el
@@ -1225,7 +1225,39 @@ b. Item 2<point>"
(equal "- b\n- a\n- C\n"
(org-test-with-temp-text "- b\n- C\n- a\n"
(org-sort-list t ?A)
- (buffer-string))))))
+ (buffer-string))))
+ ;; Emphasis in the beginning.
+ (should
+ (equal "- a\n- /z/\n"
+ (org-test-with-temp-text "- /z/\n- a\n"
+ (org-sort-list t ?a)
+ (buffer-string))))
+ (should
+ (equal "- *a*\n- z\n"
+ (org-test-with-temp-text "- z\n- *a*\n"
+ (org-sort-list t ?a)
+ (buffer-string))))
+ ;; Emphasis of second word.
+ ;; Locales with significant spaces (C, es_ES, pl_PL)
+ ;; are more sensitive to problems with sort.
+ (should
+ (equal "- a b\n- a /z/\n"
+ (org-test-with-temp-text "- a /z/\n- a b\n"
+ (org-sort-list t ?a)
+ (buffer-string))))
+ (should
+ (equal "- a *b*\n- a z\n"
+ (org-test-with-temp-text "- a z\n- a *b*\n"
+ (org-sort-list t ?a)
+ (buffer-string))))
+ ;; Space role in sorting.
+ ;; Test would fail for locales with ignored space, e.g. en_US, it works
+ ;; in C and currently rare locales having significant space (es_ES, pl_PL)
+ (should
+ (equal "- Time stamp\n- Timer\n"
+ (org-test-with-temp-text "- Timer\n- Time stamp\n"
+ (org-sort-list t ?a)
+ (buffer-string))))))
;; Sort numerically.
(should
(equal "- 1\n- 2\n- 10\n"
diff --git a/testing/lisp/test-org.el b/testing/lisp/test-org.el
index f6fb4b3ca..307b30265 100644
--- a/testing/lisp/test-org.el
+++ b/testing/lisp/test-org.el
@@ -8262,4 +8262,31 @@ two
(provide 'test-org)
+(ert-deftest test-org/org-sort-remove-invisible ()
+ "Empasis markers are stripped by `org-sort-remove-invisible'."
+ (dolist (case '(("Test *bold* text" . "Test bold text")
+ ("Test /italic/ text" . "Test italic text")
+ ("Test _underlined_ text" . "Test underlined text")
+ ("Test +strike through+ text" . "Test strike through text")
+ ("Test =verbatim= text" . "Test verbatim text")
+ ("Test ~code~ text" . "Test code text")
+ ("*Beginning* of a line" . "Beginning of a line")
+ ("End =of line=" . "End of line")
+ ("Braces (*around*)" . "Braces (around)")
+ ("Multiple *emphasis* options /in the/ same *line*" .
+ "Multiple emphasis options in the same line")))
+ (let ((test-string (car case))
+ (expectation (cdr case)))
+ (should (equal expectation (org-sort-remove-invisible test-string))))))
+
+(ert-deftest test-org/org-sort-remove-invisible-failures ()
+ "Examples of `org-sort-remove-invisible` failures."
+ :expected-result :failed
+ (dolist (case '(("Lost =in *verbatim* text= fragment" .
+ "Lost in *verbatim* text fragment")
+ ("Adjucent emphasis: *Overlapped* /regexps/" .
+ "Adjucent emphasis: Ovelapped regexps")))
+ (let ((test-string (car case))
+ (expectation (cdr case)))
+ (should (equal expectation (org-sort-remove-invisible test-string))))))
;;; test-org.el ends here
--
2.25.1