Re: org-pop-mode
On 3/18/20 3:00 AM, Ihor Radchenko wrote:
>> Any feedback?
> From the first glance it does not look too different from inline headings. Could you highlight the difference?
> Best, Ihor

Oh! And I forgot a crucial feature that org-pop has over inline tasks: you can put any amount of org-mode tree-structure inside an org-pop digression. Inline tasks seem to be limited to just body text; you can't put sub-headings inside them, etc. Org-pop digressions can contain sub-headings and whatever other structure, even further digressions and "pop"s. So that's something significant that inline headings lack.

~mark
Re: org-pop-mode
On 3/18/20 4:24 PM, Adam Porter wrote:
> BTW, in the body of your email, the text you write has these two characters between sentences: " ". The second is a plain space, but the first is a Unicode non-breaking space, or "C-x 8 RET a0". I noticed because it's displayed in Emacs as an underline character next to the plain space.

Huh! Interesting. Because I know for a fact that I'm hitting the space bar twice and not a NBSP. (I use and maintain https://github.com/kragen/xcompose so I can type all kinds of things; NBSP is Multi-Key Space Space). I guess it's because my mailer has me composing things in HTML mode, and I double-space my periods, and it's thinking "strings of whitespace are collapsed into a single space in HTML! I'd better do something to make sure those extra spaces, which he took so much care to type, aren't lost!" and tacks on a NBSP to take up the extra space. (There was recently a whole discussion on the Unicode mailing list about how 0x00A0 is almost universally used as a fixed-width space when the specs say it should be flex-width, sigh.)

~mark
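For anyone who wants to check a message body for the stray character described above, here is a minimal sketch. The function names are hypothetical (they come from no package); the code only assumes that the offending character is U+00A0.

```elisp
;; Hypothetical helpers for illustration; the names are not from any package.
(defun my-count-nbsp (string)
  "Return the number of U+00A0 (no-break space) characters in STRING."
  (let ((count 0)
        (start 0))
    (while (string-match "\u00A0" string start)
      (setq count (1+ count)
            start (match-end 0)))
    count))

(defun my-normalize-nbsp (string)
  "Return a copy of STRING with every U+00A0 replaced by a plain space."
  (replace-regexp-in-string "\u00A0" " " string t t))

;; Two sentence breaks, each written as NBSP followed by a plain space.
(my-count-nbsp "One.\u00A0 Two.\u00A0 Three.")   ; => 2
(my-normalize-nbsp "One.\u00A0 Two.")            ; => "One.  Two."
```

Interactively, `C-x 8 RET a0 RET` inserts the same character, so it can also be searched for with plain isearch.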
Hiding emphasis markers
On 3/18/20 4:58 AM, Norman Tovey-Walsh wrote:
> Mark E. Shoulson writes:
>> On 2/19/20 2:39 AM, Bastien wrote:
>>> - org-hide-emphasis-markers => t
>> Just to note: I've been working on a minor-mode in which the emphasis markers are "invisible" but not hidden (i.e. they still take up space, […] size, so the extra space is not quite as obvious. Does this sound interesting to anyone? Right now the code is kind of a mess, but it could be refined.
> Sounds interesting to me.

All right, then, you asked for it. It's really very sloppy code right now; I'm just playing around to see what works. Comments are kind of stream-of-consciousness, they may be out of date wrt what works and what doesn't etc. But hey, have fun.

https://gist.github.com/clsn/819a6463b1741eb465b310c39b4902a1

~mark
Re: Spaces in bare URLs?
On 3/18/20 5:43 AM, Nicolas Goaziou wrote:
> Hello,
> "Mark E. Shoulson" writes:
>> So... what is one supposed to do about spaces in URLs? When they're in [[link format]], with or without a description, it's no problem, but org-mode has a long tradition of support for "bare" URLs too. We're used to being able to type a URL or other link format and have it work, right? And that doesn't seem (to me) to be a thing that we'd want to abandon. In org-mode 9.1.9, I can type "info:elisp#Syntactic%20Font%20Lock" and it'd work. (Maybe not the greatest example, since %-encoding is seen more with http-based URIs, but still). The percent-encoding is well-established and reliable
> Unfortunately, that wasn't reliable. As it is not idempotent, you can never know how many times you need to decode an URL before sending it.

Well, any form of escaping is pretty much by definition not idempotent. That's the whole point of escaping: you have something you can't say, so you make some magical character that changes the meaning of nearby characters so you can describe it in characters you can't say. And the price you pay is that now you can no longer say your magical character plain, you have to use another form of escaping to express it (usually the same form as the others). It's like how it's impossible to compress *every* file to make it smaller and some even have to get bigger. The pigeonhole principle shows _why_ it isn't possible, and escaping shows (one way) _how_ it isn't: say you use high-ascii bytes to represent common strings or something. How do you represent them when they're really in the text? You have to escape them... which makes your file *larger*.

> The thing is URL encoding is not for human consumption, i.e., we shouldn't have to deal with it.

This is a good point. While on one hand it makes sense to be able to type URLs that have spaces in them without delimiters, it is sort of ridiculous to expect users to feel "natural" about typing "%20" instead. (I think this is why the specs say that you can also escape a space by using the "+" character, in order to make it easier for this most-common of characters... but that weird exception has caused all kinds of hassles in code from that day to this; I know from my own experience.)

>> and you can *count* on it when nothing else works, because you can always fall back on plain ascii.
> Current backslash escaping is also well established, and as much ASCII-like as anyone would expect.

Really? As ASCII-like as I could expect? What if my URL is https://he.wikipedia.com/שלום_עליכם ? If I am in some backward environment (still all too common) where all I can rely on is ASCII, I can percent-encode the UTF-8 representation and it will work. Can we count on being able to backslash-quote things clear down to ASCII? I don't see a way in the docs I've seen.

>> But that won't work in org-mode 9.3.6. Nor will "info:elisp#Syntactic Font Lock" or "info:elisp#Syntactic\ Font\ Lock" or any other variant I've tried, short of putting it inside [[]]s or <>s (in other words, no longer using a bare URL).
> True, but that's a minor annoyance. You apparently prefer to encode a URL manually, replacing each space with %20 (and other characters with more baroque escape sequences), rather than adding <...> (or [[...]]) around it and be done with it. Perhaps this one was the bad idea, after all?

Yes, using <>s works, as does [[]]. And yes, I do have to concede that claiming it should be "natural" for a user to hand-escape things with %20s is sort of ridiculous. Having to reprocess all old org-files for such a common notation still seems like more trouble than it was worth, but then you didn't ask me (and you were QUITE RIGHT not to do so!) I guess a converter-script should also enclose bare URLs in <>, at least if they have spaces or other whitespace. Still don't know about org-protocol and store-link, because I'm lazy.

Right now, at least some of the emacsen I'm working with still use org-9.1.9, so I haven't converted anything.

~mark
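The idempotency point argued over in this exchange can be seen with the `url-util` library that ships with Emacs. This is only a sketch of percent-encoding in general; it says nothing about how any particular Org version treats links.

```elisp
(require 'url-util)

;; Encoding is not idempotent: a second pass escapes the "%" that the
;; first pass introduced.
(url-hexify-string "a b")                        ; => "a%20b"
(url-hexify-string (url-hexify-string "a b"))    ; => "a%2520b"

;; Consequently a decoder cannot tell from the text alone how many
;; rounds of decoding are wanted.
(url-unhex-string "a%2520b")                     ; => "a%20b"
(url-unhex-string (url-unhex-string "a%2520b"))  ; => "a b"
```

The same holds for backslash escaping, where escaping twice doubles every backslash, which is the sense of "any form of escaping is pretty much by definition not idempotent".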
Re: org-pop-mode
On 3/18/20 3:15 PM, Adam Porter wrote:
> "Mark E. Shoulson" writes:
>> This is something I've wanted for years in org-mode, but which in some ways could actually be _offensive_ to its ideals. If you're an outline purist, look away.
>> ...
>> So, I present a pre-alpha version, https://gist.github.com/clsn/09ac4b098b6ad7366bb5e0bc2d5f of org-pop-mode. To "pop" back up, create a headline at the level you're popping back to, and give it a tag of "contd", and the headline text should not be something important. Instructions and explanations are in the comments of the file (the part about installing from MELPA is a lie, though). Any feedback?
> Hi Mark,
> Indeed, this is something that is frequently asked about. I probably wouldn't use it myself, but it looks like you've done a good job on it. Here is some feedback:
> 1. I'd suggest a more descriptive name, especially if you plan to publish it to MELPA. org-pop doesn't seem to convey anything about what it does. :)

Heh; fair enough. The filename originally was "org-level-end.el", I think; I started using the catchier "org-pop" because... well, it was catchier. It made sense in my mind, in the "push"/"pop" sense used with stacks in programming, that you "push" to a deeper level and this library would allow you to "pop" back up to a higher one. I'll see if I can think of something better, thanks.

> 2. In the code, I saw you comment about cl-flet, and I see you using fset and unwind-protect in the org-pop-with-continuations macro. Instead, use cl-letf with symbol-function, like:
>
>   (cl-letf* (((symbol-function 'foo) #'my-foo)
>              ((symbol-function 'bar) (lambda () ...)))
>     BODY)
>
> See also Nic Ferrier's package, noflet.

I'll take a look, thanks. It's questionable whether I really should even be messing about with that macro anyway. I must have removed the comments, but I had a whole thing there about how I had been trying with cl-letf and/or cl-flet and it didn't work.

Thing is, cl-flet, according to the docs, (info:cl#Function Bindings) is strictly *lexical* binding, which is not going to cut it. cl-letf might be different; the docs are different about it, but I am pretty sure I tried it and it didn't work, or didn't work "enough of the time." But maybe I had it wrong, and maybe noflet will succeed. Thanks!

~mark
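The lexical-versus-dynamic distinction mentioned here can be checked with a small experiment. The `my-demo-*` functions below are hypothetical and exist only for illustration; they are not taken from org-pop.

```elisp
(require 'cl-lib)

;; Hypothetical functions, for illustration only.
(defun my-demo-helper () "original")
(defun my-demo-caller () (my-demo-helper))

(defun my-demo-with-flet ()
  "Rebind `my-demo-helper' lexically, then call through `my-demo-caller'."
  (cl-flet ((my-demo-helper () "flet"))
    ;; The override is visible only to calls written inside this form;
    ;; `my-demo-caller' still reaches the global definition.
    (my-demo-caller)))

(defun my-demo-with-letf ()
  "Rebind the function cell of `my-demo-helper' for the dynamic extent."
  (cl-letf (((symbol-function 'my-demo-helper) (lambda () "letf")))
    ;; The function cell itself is replaced, so every caller sees it
    ;; until the form exits, at which point it is restored.
    (my-demo-caller)))

(my-demo-with-flet)   ; => "original"
(my-demo-with-letf)   ; => "letf"
(my-demo-caller)      ; => "original" (binding restored)
```

One caveat worth keeping in mind: `cl-letf` on `symbol-function` cannot affect call sites where the byte compiler has inlined or otherwise bypassed the function cell, which may be one way an override appears to work only some of the time.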
Re: org-pop-mode
On 3/18/20 3:00 AM, Ihor Radchenko wrote:
>> Any feedback?
> From the first glance it does not look too different from inline headings. Could you highlight the difference?
> Best, Ihor

Well, it's true there is similarity. I even found in my notes where I noticed inline tasks and their similarity, but all I wrote there was "but mine is different," so maybe that isn't so helpful. I have not used inline tasks, and was only barely familiar with them (I did know they existed, though), so that is my excuse for having invented my own wheel.

In terms of differences, let's see:

- Inline tasks are, from a strict outline point-of-view, a zillion levels down (approximately), which makes them indent way over if you're using org-indent-mode. My org-pop is the "normal," expected single level down.
- Inline tasks mark the end of the task with a special header at the same level as the task. Org-pop marks the end of the digression with a special header at the same level as the "base" (the surrounding text). Your call as to what makes better sense.
- Inline tasks are well-integrated and worked deep into the innards of org-mode, to the point that it seems from looking at code that they cause something of a headache to developers with their exceptional behavior. On the plus side, that means that many/most packages will Do the Right Thing in the face of inline tasks. My org-pop is new and non-standard, with hacks to make a few key things work right with it, but doesn't have the support of... well, anything else. I'm pretty sure exporting works well with inline tasks, but currently org-pop has no special tweaks for it (I'm not even sure what they should be). This is a reason to stick with inline tasks.
- Both approaches sinfully break the underlying outline-mode structure, which explicitly forbids exactly what we're trying to accomplish with them. Inline tasks have (way) more seniority and support and indulgence for doing so, though.

I haven't experimented much with inline tasks as regards the two or three behaviors that I actually cared enough about to write org-pop; have to see if they do something like I would have wanted.

Thanks!

~mark
org-pop-mode
This is something I've wanted for years in org-mode, but which in some ways could actually be _offensive_ to its ideals. If you're an outline purist, look away.

It's something we can do with plain lists: work on a list item at level X, then make a sublist at level X+1, and then "pop" back up to the same list item you had been working on at level X, without needing a new header. You just adjust the indentation.

+ Some stuff at this level.
+ More stuff at this level. Might even have multiple paragraphs.
  - a sublevel, for a digression
  And back to the same higher level, even without a new bullet.

I use org-mode to keep daily notes at work, sometimes almost stream-of-consciousness, and often wished I could digress and then pop back.

So, I present a pre-alpha version, https://gist.github.com/clsn/09ac4b098b6ad7366bb5e0bc2d5f of org-pop-mode. To "pop" back up, create a headline at the level you're popping back to, and give it a tag of "contd", and the headline text should not be something important. Instructions and explanations are in the comments of the file (the part about installing from MELPA is a lie, though).

Any feedback?

~mark
Re: Survey: changing a few default settings for Org 9.4
On 2/19/20 2:39 AM, Bastien wrote:
> - org-hide-emphasis-markers => t

Just to note: I've been working on a minor-mode in which the emphasis markers are "invisible" but not hidden (i.e. they still take up space, they're just in 'org-hide face or something similar), except when the point is close by, at which point they become visible. The extra space is pretty ugly, I'll grant, but this does avoid the sudden jerks as text shifts when characters become visible. Also, in org-variable-pitch-mode, the emphasis markers are also reduced in size, so the extra space is not quite as obvious.

Does this sound interesting to anyone? Right now the code is kind of a mess, but it could be refined.

~mark
Spaces in bare URLs?
So, in the "new" org-mode, we've done away with standard percent-encoding of URLs, in favor of a more... idiosyncratic method using backslashes.

So... what is one supposed to do about spaces in URLs? When they're in [[link format]], with or without a description, it's no problem, but org-mode has a long tradition of support for "bare" URLs too. We're used to being able to type a URL or other link format and have it work, right? And that doesn't seem (to me) to be a thing that we'd want to abandon.

In org-mode 9.1.9, I can type "info:elisp#Syntactic%20Font%20Lock" and it'd work. (Maybe not the greatest example, since %-encoding is seen more with http-based URIs, but still). The percent-encoding is well-established and reliable, and you can *count* on it when nothing else works, because you can always fall back on plain ascii.

But that won't work in org-mode 9.3.6. Nor will "info:elisp#Syntactic Font Lock" or "info:elisp#Syntactic\ Font\ Lock" or any other variant I've tried, short of putting it inside [[]]s or <>s (in other words, no longer using a bare URL).

I think dropping percent-escaping of URLs was a bad idea, in terms of breaking past usage and lack of consistency with the standard used for URLs everywhere else. But I don't know what impelled the decision to drop it, so I might well be missing something important. At any rate, it does leave a hole in what org-mode can do, a thing it used to be able to do and can't anymore. Is there a right way to do this? (without using delimiters.)

I haven't yet looked at how this interacts with org-protocol's store-link transaction.

~mark
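For reference, the percent-encoded form used in the example above can be produced with `url-hexify-string` from Emacs's bundled `url-util`. The helper name `my-info-link` is hypothetical, and, as the message itself reports, whether Org follows the result as a bare link depends on the Org version.

```elisp
(require 'url-util)

(defun my-info-link (manual node)
  "Return a bare info: link to NODE in MANUAL, percent-encoding NODE.
Hypothetical helper for illustration; not part of Org."
  (concat "info:" manual "#" (url-hexify-string node)))

(my-info-link "elisp" "Syntactic Font Lock")
;; => "info:elisp#Syntactic%20Font%20Lock"
```

The output is pure ASCII with no whitespace, which is the property the message relies on when it says one can "always fall back on plain ascii".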
[PATCH] strike-through text in tables
I didn't see a response to this, and I hope it's just because I sent it wrongly or something. If not, is there something amiss with this?

Make an org-mode table. In one of the cells of the table, have some text that is +struck out+. Note that the struck-out text is default text color (black, for me), and not org-table text color (blue, for me). It's even worse if you're running org-variable-pitch-mode, because the text also won't be set in a fixed-pitch face, and so will screw up the alignment of table text.

I found out why. When org-do-emphasis-faces constructs the new face that it applies to the text, it passes the lookup value from the org-emphasis-alist to font-lock-prepend-text-property, which makes a list, composing it with the existing face. This would fail for strike-through mode in a table, since the org-emphasis-alist lookup would return (:strike-through t), resulting in a face of (:strike-through t org-table), which is an invalid face, and then emacs has no choice but to render it unfaced.

Attaching a patch for the issue. Rather than try to figure out how to make org-do-emphasis-faces somehow smart enough to deal with this situation (I'm not sure it's possible, in general), I took the easy way out and defined an org-strike-through face which can be used in org-emphasis-alist.

Humbly submitted for your approval...

~mark

From 9a489ddf9d411bfc907a5b765d015e757b0b6903 Mon Sep 17 00:00:00 2001
From: "Mark E. Shoulson"
Date: Thu, 5 Mar 2020 10:03:37 -0500
Subject: [PATCH] org-faces.el: Add org-strike-through face

org-faces.el: Create org-strike-through face.
org.el: Use org-strike-through-face in org-emphasis-alist.
---
 lisp/org-faces.el | 4 ++++
 lisp/org.el       | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/lisp/org-faces.el b/lisp/org-faces.el
index d78b606ec..107ea9763 100644
--- a/lisp/org-faces.el
+++ b/lisp/org-faces.el
@@ -427,6 +427,10 @@ For source-blocks `org-src-block-faces' takes precedence."
   :group 'org-faces
   :version "22.1")
 
+(defface org-strike-through '((t (:strike-through t)))
+  "Face for struck-through text."
+  :group 'org-faces)
+
 (defface org-quote '((t (:inherit org-block)))
   "Face for #+BEGIN_QUOTE ... #+END_QUOTE blocks.
 Active when `org-fontify-quote-and-verse-blocks' is set."
diff --git a/lisp/org.el b/lisp/org.el
index 31133c554..8b27e4708 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3677,7 +3677,7 @@ You need to reload Org or to restart Emacs after setting this.")
     ("_" underline)
     ("=" org-verbatim verbatim)
     ("~" org-code verbatim)
-    ("+" (:strike-through t)))
+    ("+" org-strike-through))
   "Alist of characters and faces to emphasize text.
 Text starting and ending with a special character will be emphasized,
 for example *bold*, _underlined_ and /italic/.  This variable sets the
-- 
2.24.1
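The failure mode described in this message can be modelled with ordinary list operations. The function below is a hypothetical stand-in written for this illustration; it is not the actual font-lock or Org code, and it assumes (as the message states) that a list-valued face is spliced in front of the faces already on the text.

```elisp
(defun my-prepend-face (new existing)
  "Model prepending face NEW to the face list EXISTING.
Assumption for illustration: a list-valued NEW is spliced in, while a
symbol is consed on.  This is not the real font-lock implementation."
  (if (listp new)
      (append new existing)
    (cons new existing)))

;; An anonymous face plist gets flattened into the list, so the result is
;; neither a valid plist nor a valid list of faces.
(my-prepend-face '(:strike-through t) '(org-table))
;; => (:strike-through t org-table)

;; A named face stays a single element, giving a valid list of faces.
(my-prepend-face 'org-strike-through '(org-table))
;; => (org-strike-through org-table)
```

This is why swapping the plist for a named face in org-emphasis-alist sidesteps the problem without touching the composing code.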
[PATCH] strike-through text in tables
Make an org-mode table. In one of the cells of the table, have some text that is +struck out+. Note that the struck-out text is default text color (black, for me), and not org-table text color (blue, for me). It's even worse if you're running org-variable-pitch-mode, because the text also won't be set in a fixed-pitch face, and so will screw up the alignment of table text.

I found out why. When org-do-emphasis-faces constructs the new face that it applies to the text, it passes the lookup value from the org-emphasis-alist to font-lock-prepend-text-property, which makes a list, composing it with the existing face. This would fail for strike-through mode in a table, since the org-emphasis-alist lookup would return (:strike-through t), resulting in a face of (:strike-through t org-table), which is an invalid face, and then emacs has no choice but to render it unfaced.

Attaching a patch for the issue. Rather than try to figure out how to make org-do-emphasis-faces somehow smart enough to deal with this situation (I'm not sure it's possible, in general), I took the easy way out and defined an org-strike-through face which can be used in org-emphasis-alist.

Humbly submitted for your approval...

~mark

From 9a489ddf9d411bfc907a5b765d015e757b0b6903 Mon Sep 17 00:00:00 2001
From: "Mark E. Shoulson"
Date: Thu, 5 Mar 2020 10:03:37 -0500
Subject: [PATCH] org-faces.el: Add org-strike-through face

org-faces.el: Create org-strike-through face.
org.el: Use org-strike-through-face in org-emphasis-alist.
---
 lisp/org-faces.el | 4 ++++
 lisp/org.el       | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/lisp/org-faces.el b/lisp/org-faces.el
index d78b606ec..107ea9763 100644
--- a/lisp/org-faces.el
+++ b/lisp/org-faces.el
@@ -427,6 +427,10 @@ For source-blocks `org-src-block-faces' takes precedence."
   :group 'org-faces
   :version "22.1")
 
+(defface org-strike-through '((t (:strike-through t)))
+  "Face for struck-through text."
+  :group 'org-faces)
+
 (defface org-quote '((t (:inherit org-block)))
   "Face for #+BEGIN_QUOTE ... #+END_QUOTE blocks.
 Active when `org-fontify-quote-and-verse-blocks' is set."
diff --git a/lisp/org.el b/lisp/org.el
index 31133c554..8b27e4708 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3677,7 +3677,7 @@ You need to reload Org or to restart Emacs after setting this.")
     ("_" underline)
     ("=" org-verbatim verbatim)
     ("~" org-code verbatim)
-    ("+" (:strike-through t)))
+    ("+" org-strike-through))
   "Alist of characters and faces to emphasize text.
 Text starting and ending with a special character will be emphasized,
 for example *bold*, _underlined_ and /italic/.  This variable sets the
-- 
2.24.1
Bug: org-ellipsis does not work as a local variable [9.3.1 (release_9.3.1-95-gf93020 @ /home/mark/git-repos/org-mode/lisp/)]
The "org-ellipsis" variable is specifically marked as (potentially) a safe local variable, so obviously someone intended for it possibly to be used that way, and believed that it might be useful to someone as a local variable. However, there is no setup that I can find which makes this work. Placing

: # Local Variables:
: # org-ellipsis: "XXX"
: # End:

at the end of the file, or setting it in the top line, makes no change in the ellipsis, even if you do M-x org-mode again or reload the file (with find-alternate-file). Indeed, M-x org-mode clears the local value assignment altogether. Setting it by hand with setq-local doesn't work (it is cleared when you do M-x org-mode again anyway, as mentioned.)

So, is org-ellipsis really not meant ever to be a local variable? If so, that likely should be documented, and certainly the :safe annotation on it should be removed, as it strongly implies that using it as a local variable is acceptable and useful.

I ran this with emacs -Q, using the code from the git repository, commit f93020d5e6d7594c335cc129ad02c21ac26ed58a (as you can see by the local filepath below.) I hope I have explained the bug clearly enough.
Thanks ~mark Emacs : GNU Emacs 26.3 (build 1, x86_64-redhat-linux-gnu, GTK+ Version 3.24.13) of 2019-12-10 Package: Org mode version 9.3.1 (release_9.3.1-95-gf93020 @ /home/mark/git-repos/org-mode/lisp/) current state: == (setq org-src-mode-hook '(org-src-babel-configure-edit-buffer org-src-mode-configure-edit-buffer) org-link-shell-confirm-function 'yes-or-no-p org-metadown-hook '(org-babel-pop-to-session-maybe) org-clock-out-hook '(org-clock-remove-empty-clock-drawer) org-reveal-start-hook '(org-decrypt-entry) org-mode-hook '((closure (org--rds reftex-docstruct-symbol org-element-greater-elements org-clock-history org-agenda-current-date org-with-time org-defdecode org-def org-read-date-inactive org-ans2 org-ans1 org-columns-current-fmt-compiled org-clock-current-task org-clock-effort org-agenda-skip-function org-agenda-skip-comment-trees org-agenda-archives-mode org-end-time-was-given org-time-was-given org-log-note-extra org-log-note-purpose org-log-post-message org-last-inserted-timestamp org-last-changed-timestamp org-entry-property-inherited-from org-blocked-by-checkboxes org-state org-agenda-headline-snapshot-before-repeat org-capture-last-stored-marker org-agenda-start-on-weekday org-agenda-buffer-tmp-name org-priority-regexp org-mode-syntax-table buffer-face-mode-face org-tbl-menu org-org-menu org-struct-menu org-entities org-last-state org-id-track-globally org-clock-start-time texmathp-why remember-data-file org-agenda-tags-todo-honor-ignore-options iswitchb-temp-buflist calc-embedded-open-mode calc-embedded-open-formula calc-embedded-close-formula align-mode-rules-list org-emphasis-alist org-emphasis-regexp-components org-export-registered-backends org-modules org-babel-load-languages org-indent-indentation-per-level org-element-paragraph-separate ffap-url-regexp org-inlinetask-min-level t) nil (add-hook (quote change-major-mode-hook) (quote org-show-all) (quote append) (quote local)) ) (closure (org-src-window-setup *this* 
org-babel-confirm-evaluate-answer-no org-src-preserve-indentation org-src-lang-modes org-link-file-path-type org-edit-src-content-indentation org-babel-library-of-babel t) nil (add-hook (quote change-major-mode-hook) (quote org-babel-show-result-all) (quote append) (quote local)) ) org-babel-result-hide-spec org-babel-hide-all-hashes) org-archive-hook '(org-attach-archive-delete-maybe) org-confirm-elisp-link-function 'yes-or-no-p org-agenda-before-write-hook '(org-agenda-add-entry-text) org-metaup-hook '(org-babel-load-in-session-maybe) org-bibtex-headline-format-function #[257 "\300\236A\207" [:title] 3 "\n\n(fn ENTRY)"] org-babel-pre-tangle-hook '(save-buffer) org-tab-first-hook '(org-babel-hide-result-toggle-maybe org-babel-header-arg-expand) org-occur-hook '(org-first-headline-recenter) org-cycle-hook '(org-cycle-hide
[O] Org-file "path"-type links
Hey. New to this list this time around...

I've been tinkering with a custom link-type and I'm curious if anyone else is at all interested in it.

I use org-mode (among other things) to keep a sort of log or daily journal at work. Things are entered in by date: I have "* 2019" as a top-level head, then "** 2019-07 July" and "*** 2019-07-25 Þursday" (actually not exactly that since I use odd-only mode, but not the point), and so forth. (Note that the headlines are *not* date-links; the docs say that's a Bad Thing.) My formatting is pretty free-form beyond that, and I don't always remember to sub-head subjects below that, or I use plain lists or whatever... also not the point.

What matters is that sometimes I *do* make proper sub-headings, and sometimes I may want to link to them. I know I can make a link to [[*Best Firing Ever]] to link to a particular headline, but often there are many headlines with the same text, like "Weekly Shouting at Boss" or something, and [[*Weekly Shouting at Boss]] won't necessarily link to the one I want. I want the one that was under "2019-07-24 Wednesday" for this link, and not any other.

I don't know of any way that org-mode has to distinguish such things (apart from "name the headlines uniquely, moron," which to be sure is one way to do it), so I have been making a "path"-type or "tree"-type link. So I can have a hyperlink that links to [[tree:2019*2019-07 July*2019-07-24 Wednesday*Weekly Shouting at Boss][really told him off this time]] and that links to *this* particular headline, following the tree down link by link, separated by *'s. And you can also have a filename, of course, like say [[tree:/home/me/bestofjournal.org::Successes*Solitaire Games*Best One Ever]] or something.

The code at this point works, including the store-link code (though adding it in overshadows the default store-link code for org-mode, which might be a problem), but it's still just a PoC, needs documentation, etc. And probably a better name; I have "tree:" as the keyword but it's probably awful... maybe something like "orglink:"? I know there's already a library by that name.

I'm just wondering if anyone else thinks this is a good idea. Maybe it's just for my unusual way of using org. There are other ways to do this, probably (maybe a "link-store" function that stores an ordinary org-mode link *and* also creates a unique target at the point?); this is what I did. If I put this up on github or elpa, would anyone else use it?

~mark
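The message does not include the code, so the following is only a guess at how such a link type could be wired up, not the author's implementation. It leans on two functions that Org does provide, `org-link-set-parameters` (Org 9.0 and later) and `org-find-olp`; every `my-tree-link-*` name is hypothetical.

```elisp
(require 'org)

(defun my-tree-link-parse (path)
  "Split PATH into (FILE . HEADINGS).
PATH looks like \"FILE::H1*H2*H3\" or just \"H1*H2*H3\".
FILE is nil when PATH has no \"::\" part."
  (let (file)
    (when (string-match "\\`\\(.+?\\)::" path)
      (setq file (match-string 1 path)
            path (substring path (match-end 0))))
    (cons file (split-string path "\\*" t))))

(defun my-tree-link-follow (path &optional _arg)
  "Visit the heading named by the outline PATH.
`org-find-olp' signals an error if a heading is missing or is not
unique among its siblings."
  (let* ((parsed (my-tree-link-parse path))
         (file (car parsed))
         (marker (if file
                     (org-find-olp (cons (expand-file-name file)
                                         (cdr parsed)))
                   ;; No file part: search the current buffer.
                   (org-find-olp (cdr parsed) t))))
    (switch-to-buffer (marker-buffer marker))
    (goto-char marker)
    (org-reveal)))

;; Register the hypothetical "tree:" link type.
(org-link-set-parameters "tree" :follow #'my-tree-link-follow)

(my-tree-link-parse "2019*2019-07 July*Weekly Shouting at Boss")
;; => (nil "2019" "2019-07 July" "Weekly Shouting at Boss")
```

A sketch like this cannot handle headings whose text contains a literal "*", and it leaves out the store-link side entirely; both would need a decision in a real package.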
Re: [O] Smart Quotes Exporting
Update on the smart-quotes patch. Supports the odt exporter now too, which I think covers all the current major "new" exporters for which it is relevant (adding smart quotes to ASCII export is a contradiction in terms; should it be in the "publish" exporter? It didn't look like it to me).

Added an options keyword, '"' (that is, the double-quote mark) to select smart quotes on/off, and a defcustom for customizing your default. Set the default default [sic] to nil, though actually it might be reasonable to set it to t.

Slight touch-up to the regexps since last time, but they will definitely be subject to a lot of fine-tuning as more special cases are found that break them and ways to fix it are found (the close-quote still breaks on one of "/a/." or "/a./"). It's pretty good on the whole, though, usually guesses right.

I know there's some work being done on the odt exporter; hope this fits in well with it. How does it look to you?

~mark

From e6df2efd1a9ce36964a20fc06aa2a688acd87efb Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Add `smart' quotes for onscreen display and for latex and html export

* lisp/org.el: Add `smart' quotes: custom variables to define regexps to recognize quotes, to define how and whether to display them, and org-fontify-quotes to display `smart-quote' characters when activated.
* contrib/lisp/org-export.el: Add function org-export-quotation-marks as a utility function usable by individual exporters to apply `smart' quotes. Also add keyword '"' for customizing smart quotes, and custom default for it.
* contrib/lisp/org-e-latex.el: Replace org-e-latex-quotes custom with org-e-latex-quotes-replacements and make org-e-latex--quotation-marks use the org-export-quotation-marks function in org-export.el.
* contrib/lisp/org-e-html.el: Replace org-e-html-quotes custom with org-e-html-quotes-replacements and enable org-e-html--quotation-marks, using org-export-quotation-marks function in org-export.el.
* contrib/lisp/org-e-odt.el: Replace org-e-odt-quotes custom with org-e-odt-quotes-replacements and make org-e-odt--quotation-marks use org-export-quotations-marks function in org-export.el.
---
 contrib/lisp/org-e-html.el  |  57
 contrib/lisp/org-e-latex.el |  67 ++---
 contrib/lisp/org-e-odt.el   |  68 ++---
 contrib/lisp/org-export.el  |  38
 lisp/org.el                 | 101 +++
 5 files changed, 203 insertions(+), 128 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index 4287a59..c49608d 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1043,37 +1043,24 @@ in order to mimic default behaviour:
 Plain text
-(defcustom org-e-html-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-html-smart-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’"))
+  "What to export for `smart-quotes'.
+A list of five strings:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-html
   :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))))
+          (string :tag "Open double-quotes") ; "“"
+          (string :tag "Close double-quotes") ; "”"
+          (string :tag "Open single-quote") ; "‘"
+          (string :tag "Close single-quote") ; "’"
+          (string :tag "Mid-word apostrophe"))) ; "’"
 Compilation
@@ -1459,15 +1446,7 @@ This is used to choose a separator for constructs like \\verb."
   "Export quotation marks depending on language conventions.
 TEXT is a string containing quotation marks to be replaced.  INFO
 is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text st
Re: [O] Smart Quotes Exporting
All right, preliminary patch is attached, *maybe* good enough for more serious consideration now, but might need some fixes. Still only uses ordinary regexps and plain-text strings, but can now handle the example with formatting-breaks next to quotes. Things have been moved into more appropriate locations, made customs, docstrings and types fixed, etc, etc.

It supports onscreen display of "smart" quotes (when enabled); I have the quotes displayed in org-document-info face so they are slightly distinct, to make it clearer that they are "altered" from what they are in the plain text. This may or may not be a popular (or good) idea.

I have also built it into the new export engine in org-e-latex and org-e-html as proofs of concept. I'm not positive the latex one will work properly for German, though; there might need to be something enabled in LaTeX for it to format ,, into „.

It should probably be set not to smartify quotes onscreen in comments; I haven't done that yet.

Comments welcome; I hope I didn't complicate matters in the export engines too much.

~mark

From 1bc507cf69c94d5645436abc6e28e7d96999083e Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 29 May 2012 23:01:12 -0400
Subject: [PATCH] Add `smart' quotes for onscreen display and for latex and html export

* lisp/org.el: Add `smart' quotes: custom variables to define regexps to recognize quotes, to define how and whether to display them, and org-fontify-quotes to display `smart-quote' characters when activated.

* contrib/lisp/org-export.el: Add function org-export-quotation-marks as a utility function usable by individual exporters to apply `smart' quotes.

* contrib/lisp/org-e-latex.el: Replace org-e-latex-quotes custom with org-e-latex-quotes-replacements and make org-e-latex--quotation-marks use the org-export-quotation-marks function in org-export.el.

* contrib/lisp/org-e-html.el: Replace org-e-html-quotes custom with org-e-html-quotes-replacements and enable org-e-html--quotation-marks, using org-export-quotation-marks function in org-export.el.
---
 contrib/lisp/org-e-html.el  |  57
 contrib/lisp/org-e-latex.el |  67 ++---
 contrib/lisp/org-export.el  |  26 +++
 lisp/org.el                 | 101 +++
 4 files changed, 168 insertions(+), 83 deletions(-)

diff --git a/contrib/lisp/org-e-html.el b/contrib/lisp/org-e-html.el
index 53547a0..d4a505e 100644
--- a/contrib/lisp/org-e-html.el
+++ b/contrib/lisp/org-e-html.el
@@ -1077,37 +1077,24 @@ in order to mimic default behaviour:

 ;;;; Plain text

-(defcustom org-e-html-quotes
-  '(("fr"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "«~")
-     ("\\(\\S-\\)\"" . "~»")
-     ("\\(\\s-\\|(\\|^\\)'" . "'"))
-    ("en"
-     ("\\(\\s-\\|[[(]\\|^\\)\"" . "``")
-     ("\\(\\S-\\)\"" . "''")
-     ("\\(\\s-\\|(\\|^\\)'" . "`")))
-  "Alist for quotes to use when converting english double-quotes.
-
-The CAR of each item in this alist is the language code.
-The CDR of each item in this alist is a list of three CONS:
-- the first CONS defines the opening quote;
-- the second CONS defines the closing quote;
-- the last CONS defines single quotes.
-
-For each item in a CONS, the first string is a regexp
-for allowed characters before/after the quote, the second
-string defines the replacement string for this quote."
+(defcustom org-e-html-smart-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’"))
+  "What to export for `smart-quotes'.
+A list of five strings:
+  1. Open double-quotes
+  2. Close double-quotes
+  3. Open single-quote
+  4. Close single-quote
+  5. Mid-word apostrophe"
   :group 'org-export-e-html
   :type '(list
-          (cons :tag "Opening quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))
-          (cons :tag "Closing quote"
-                (string :tag "Regexp for char after ")
-                (string :tag "Replacement quote "))
-          (cons :tag "Single quote"
-                (string :tag "Regexp for char before")
-                (string :tag "Replacement quote "))))
+          (string :tag "Open double-quotes")   ; "“"
+          (string :tag "Close double-quotes")  ; "”"
+          (string :tag "Open single-quote")    ; "‘"
+          (string :tag "Close single-quote")   ; "’"
+          (string :tag "Mid-word apostrophe"))) ; "’"

 ;;;; Compilation
@@ -1497,15 +1484,7 @@ This is used to choose a separator for constructs like \\verb."
   "Export quotation marks depending on language conventions.
 TEXT is a string containing quotation marks to be replaced. INFO
 is a plist used as a communication channel."
-  (mapc (lambda(l)
-          (let ((start 0))
-            (while (setq start (string-match (car l) text start))
-              (let ((new-quote (concat (match-string 1 text) (cdr l))))
-                (setq text (replace-match new-quote t t text))))))
-        (cdr (or (assoc (plist-get info :language) org-e-html-quotes)
-                 ;; Falls back on English.
-                 (as
Re: [O] Smart Quotes Exporting
On 06/01/2012 01:11 PM, Nicolas Goaziou wrote:

Hello, "Mark E. Shoulson" writes:

Oh, certainly; they're all a disaster. I think I said that in the writeup at the top. This is just proof of concept, nothing is in the right place, nothing is properly documented. They have to be defcustoms, there needs to be a good :type in the defcustom as well as a proper docstring. You'll get no argument from me about the lack (or inaccuracy) of docstrings and such. I hadn't gotten that far yet. I said the patch was only if you wanted to tinker with the development as this progresses.

No worries, I was just making some comments before forgetting about them.

Ah, ok. Good! Thanks.

+(defun org-e-latex--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
+  ;; (mapc (lambda(l)
+  ;;         (let ((start 0))
+  ;;           (while (setq start (string-match (car l) text start))
+  ;;             (let ((new-quote (concat (match-string 1 text) (cdr l))))
+  ;;               (setq text (replace-match new-quote t t text))))))
+  ;;       (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
+  ;;                ;; Falls back on English.
+  ;;                (assoc "en" org-e-latex-quotes))))
+  ;; text)

Use directly `org-e-latex-quote-replacements' in code then.

Not sure I understand this comment.

Since `org-e-latex--quotation-marks' just calls `org-export-quotation-marks', you can remove completely the former from "org-export.el" and use the latter instead.

Well, that was done on purpose, and maybe the reason will make sense. As I see it, each exporter should be able to have its own smartifier function, and the export engine should make no assumptions about that: just call the individual exporter's function. On the other hand, many (but perhaps not all!) of the exporters may find themselves using essentially the same code just with different replacement strings. So I thought that the "general-purpose" function should be in org-export.el, just for the convenience of exporters should they choose to make use of it. So, many of the exporters' smartifier functions will really just be calls to the more general-purpose function. Does that make sense?

So... the filter-parse-tree-functions hook gets applied within the parse tree... so a back-end can add a function to that list which looks over the parse-tree and watches for these border cases (and also the ones within ordinary strings). Looks like it's going to be tough to work in any flexibility to define further per-language or per-backend cleverness to handle anything beyond the "canonical set" of open-double, close-double, open-single, close-single, and mid-word. To be sure, anything we do will most assuredly fail even on some fairly reasonable input, in which case the users are pretty much on their own and will have to do things the hard way. And I could use that as the answer here, that, "well, it'll work only within plain-text strings" (and I might possibly still have to use that answer), but I would rather include the situations you bring up in the supported set and not throw up my hands at it. So, yes, will look at that.

Actually it isn't very hard to handle this problem. But it will be different than the fontification used in an Org buffer.

Yes, the fontification on-screen is different, and uses a rather different function--but if I can help it, the same regexps! So things work the same everywhere. I also started thinking a little about what you write below, how we can inspect the characters just after or before quotes at the very beginning or end of each chunk. It would be nice if it could all be encapsulated neatly in the regexp(s).

As a first approximation, I can imagine a function accepting an element, an object or a secondary string and returning an equivalent element, object or secondary string, with its quotes "smartified". The algorithm could go like this: Walk element/object/secondary-string's contents.

Need it be element/object/secondary-string? At the bottom level it's always about strings; the higher levels don't affect the processing of each string in isolation. Do we need to intercept it at the element level or just wait to grab things in the plain-text filter, since we have access at that point too? (Might also be that my understanding of the process and the nature of elements is faulty or limited. Will have to see what works.)

1. When a string is encountered: 1. If it has a quote as its first or last position, check for objects before or after the string to guess its status. An object never starts with a white space, but you may have to check :post-blank property in order to know if previous object had white spaces at its end.

Hmm, this may in fact answer my question above: y
[O] Smart Quotes Exporting (Was: Re: (no subject))
Sorry for messing up the thread subject header; I think I misused gmane's posting.

On 05/31/2012 09:38 AM, Nicolas Goaziou wrote:

Hello, Mark Shoulson writes:

+(defvar org-e-html-quote-replacements
+  '(("fr" "« " " »" "‘" "’" "’")
+    ("en" "“" "”" "‘" "’" "’")
+    ("de" "„" "“" "‚" "‘" "’"))

A docstring will be required for this variable. It should be a defcustom.

Oh, certainly; they're all a disaster. I think I said that in the writeup at the top. This is just proof of concept, nothing is in the right place, nothing is properly documented. They have to be defcustoms, there needs to be a good :type in the defcustom as well as a proper docstring. You'll get no argument from me about the lack (or inaccuracy) of docstrings and such. I hadn't gotten that far yet. I said the patch was only if you wanted to tinker with the development as this progresses.

+(defun org-e-latex--quotation-marks (text info)
+  (org-export-quotation-marks text info org-e-latex-quote-replacements))
+  ;; (mapc (lambda(l)
+  ;;         (let ((start 0))
+  ;;           (while (setq start (string-match (car l) text start))
+  ;;             (let ((new-quote (concat (match-string 1 text) (cdr l))))
+  ;;               (setq text (replace-match new-quote t t text))))))
+  ;;       (cdr (or (assoc (plist-get info :language) org-e-latex-quotes)
+  ;;                ;; Falls back on English.
+  ;;                (assoc "en" org-e-latex-quotes))))
+  ;; text)

Use directly `org-e-latex-quote-replacements' in code then.

Not sure I understand this comment.

+;
+;; Probably a defcustom eventually.
+
+;; Each element of this consists of: car=language code, cdr=list of
+;; double-quote-open-regexp, double-quote-close-regexp,
+;; single-quote-open-regexp, single-quote-close-regexp, &optional
+;; single-apostrophe regexp?
+;; Just about all will be the same anyway, so mostly language DEFAULT.
+
+;; For testing purposes, poorly-designed at first.
+(defvar org-export-quotes-regexps
+  '((DEFAULT
+     "\\(?:\\s-\\|[[(]\\|^\\)\\(\"\\)\\w"
+     "\\(?:\\S-\\)\\(\"\\)\\s-"
+     "\\(?:\\s-\\|(\\|^\\)\\('\\)\\w"
+     "\\w\\('\\)\\(?:\\s-\\|\\s.\\|$\\)"
+     "\\w\\('\\)\\w")))

I'm not sure this variable can be used for both the buffer and the export engine. Export back-ends will only see chunks of the paragraph. For example, in the following text,

  He crossed the Rubicon and said: "/Alea jacta est./"

Plain text translators will see three strings:

  1. "He crossed the Rubicon and said: \""
  2. "Alea jacta est."
  3. "\""

In case 1, you have an opening quote with nothing after it. In case 3, you have a closing quote with nothing before or after it. Plain regexps can't help here.

The only solution I can think of is to do quote substitutions in paragraphs within the parse tree before they reach the translators (i.e. with `org-export-filter-parse-tree-functions'). That's the only way to know if "\"" is an opening or a closing quote, for example. The current approach won't work.

Hm. OK, this may indeed be (a) a problem and (b) an indication that I really don't understand the process as I thought I did... ... ... Ah. So when the "plain" text is being exported, the exporter passes along the text in chunks as divided up by the formatting. So string #2 is broken out from the others due to its being in italics. That is indeed an issue. Moreover, I never even properly considered the effects of formatting characters (as opposed to punctuation) right next to the quote-marks, even if this weren't a problem.

So... the filter-parse-tree-functions hook gets applied within the parse tree... so a back-end can add a function to that list which looks over the parse-tree and watches for these border cases (and also the ones within ordinary strings). Looks like it's going to be tough to work in any flexibility to define further per-language or per-backend cleverness to handle anything beyond the "canonical set" of open-double, close-double, open-single, close-single, and mid-word. To be sure, anything we do will most assuredly fail even on some fairly reasonable input, in which case the users are pretty much on their own and will have to do things the hard way. And I could use that as the answer here, that, "well, it'll work only within plain-text strings" (and I might possibly still have to use that answer), but I would rather include the situations you bring up in the supported set and not throw up my hands at it. So, yes, will look at that.

+  (let* ((start 0)
+         (regexps
+          (cdr
+           (or
+            (assoc (plist-get info :language)
+                   org-export-quotes-regexps)
+            (assoc 'DEFAULT org-export-quotes-regexps))))

Use `assq' instead of `assoc' in the second case.

Good call.

+         (subs (cdr (or (assoc (plist-get info :language)
+                               replacements)
+                        (assoc "en" replace
Re: [O] "Smart" quotes
On 05/29/2012 01:57 PM, Nicolas Goaziou wrote:

Hello, "Mark E. Shoulson" writes:

I guess it doesn't actually matter, but it starts to get weird if you find yourself looking arbitrarily far back, and then you start building in exceptions for crossing paragraph boundaries...

True. I had the exporter in mind, where you always start at the beginning of the paragraph. It would be more difficult with search starting in the middle of the paragraph.

Maybe the on-screen stuff is no harder; will just have to see.

And then there's the fact that multi-paragraph quotes usually have an open-quote for each paragraph but only one close-quote at the end...

Some French typographers suggest to use a close-quote at the beginning of the paragraph to avoid that confusion, or to simply drop them (since they are a pain to maintain anyway). I don't know about other languages but, if that's the same, is it a good idea to bother implementing it?

I've never heard of it. But I think we may be overthinking this; we can drive ourselves crazy trying to compress a dozen different typographical traditions (and informal customs) into a few Elisp rules. On the other hand, I don't think we need to throw up our hands and give up either! :)

Actually keeping count of what level you're at, accurately, is a classic example of a non-regular language; you need a push-down automaton to keep count, and regular expressions don't cut it.

This is limited to 2 levels.

True.

I'm rambling. In sum, I'm going to start off /not/ trying to solve that problem, and assume the writer is going to use alternating " and ' as typography requires and not try to second-guess what level we're at.

You are right, the problem will be easier to solve with both " and '. Though, "as typography requires" is not true. In France, the /Imprimerie Nationale/ suggests to use guillemets at both levels. Remember that typography is localized, which is the main difficulty of the implementation.

Also a good point.

All right, bottom line, this is sort of what I'm seeing. I'm not 100% sure which files should house these things, but something like this:

1) a variable containing for each language a regexp for each of: open double-quote, close double-quote, open single-quote, close single-quote, and maybe mid-word apostrophe. Odds are these regexps are going to be the same for just about all languages (the regexps detecting them, mind you), so probably should have some sort of default that the alist can just reference. A language should also be allowed to define other quote regexps in its list too. We need these to be ordered, with a standard set, so that we can have...

2) for each *exporter* (including on-screen display), a variable that defines, for each language, what the *substitution* will be for open-double-quote, close-double-quote, etc. Other extras can be defined too. That way we can have an exporter-independent way to detect quotes to be smartified, but each exporter has its own way to smartify them.

3) Since most exporters are probably going to be handling the process approximately the same way (match the regexp, stick in the associated substitution), org-export.el should have a generic function that does this which each exporter *may* call in (or as) its quote-smartifier in its text translator, unless it needs something more specific which it can provide itself.

In terms of what is handled, the idea in my head is that we would expect the writer to be using " or ' to surround their quotes, regardless of what their native custom is (if they're doing it using their language-specific quote-marks, we don't need to bother with all this anyway). Goal is to handle either "quotes" or 'quotes' in either nesting (or no nesting, if someone does "quote' for some reason), and with any luck not get too confused with other uses of apostrophe.

It makes sense to me, but I bet I explained it badly and people are going to have all kinds of issues with it. :) No telling when (if?) I'll be able to produce something along these lines, but it's something to start thinking about anyway.

~mark
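The three-part design above (ordered detection regexps, per-exporter replacement strings, one generic function) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `my/smart-quote-regexps', `my/html-quote-replacements' and `my/smartify-quotes' are hypothetical and are not the names used in the actual patch, and the regexps use explicit character sets rather than syntax classes so that the result does not depend on the current syntax table.

```elisp
(require 'cl-lib)

;; Hypothetical names throughout; a sketch, not the patch's code.
;; Ordered detection regexps; group 1 is always the quote character.
(defvar my/smart-quote-regexps
  '("\\(?:^\\|[ \t\n([]\\)\\(\"\\)[[:alnum:]]"      ; 1. open double
    "[^ \t\n]\\(\"\\)\\(?:[] \t\n.,;:!?)]\\|$\\)"   ; 2. close double
    "\\(?:^\\|[ \t\n([]\\)\\('\\)[[:alnum:]]"       ; 3. open single
    "[^ \t\n]\\('\\)\\(?:[] \t\n.,;:!?)]\\|$\\)"    ; 4. close single
    "[[:alnum:]]\\('\\)[[:alnum:]]"))               ; 5. mid-word apostrophe

;; One replacement list per language, in the same order as the regexps.
(defvar my/html-quote-replacements
  '(("fr" "« " " »" "‘" "’" "’")
    ("en" "“" "”" "‘" "’" "’")
    ("de" "„" "“" "‚" "‘" "’")))

(defun my/smartify-quotes (text language replacements)
  "Return TEXT with straight quotes replaced per LANGUAGE.
REPLACEMENTS is an alist like `my/html-quote-replacements'.
Unknown languages fall back on \"en\"."
  (let ((subs (cdr (or (assoc language replacements)
                       (assoc "en" replacements)))))
    (cl-loop for regexp in my/smart-quote-regexps
             for sub in subs
             do (let ((start 0))
                  (while (string-match regexp text start)
                    ;; Remember where the quote was before replacing it.
                    (let ((pos (match-beginning 1)))
                      ;; Replace only group 1, keeping the context chars.
                      (setq text (replace-match sub t t text 1))
                      (setq start (+ pos (length sub)))))))
    text))
```

With these definitions, (my/smartify-quotes "He said \"it's 'fine' now\"." "en" my/html-quote-replacements) returns the string with “ ” around the sentence, ‘ ’ around fine, and ’ in it’s. As discussed elsewhere in the thread, a string-level function like this cannot handle a quote that sits directly next to a formatting boundary.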
[O] [PATCH] Add \asciicirc entity
Per prior emails, added \asciicirc entity in org-entities.el, to expand to ascii "^" character, or \textasciicirc in LaTeX. Also fixed bug a few lines earlier, wherein \circ (org entity) would expand to \circ (latex entity) in LaTeX export, even though the former indicates a circumflex accent and the latter a small *circle*. (The expansion to &circ; (html entity) is correct.)

~mark

From deebb683e8d46a87247aa17c0fad8bef7b7b14a3 Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Mon, 28 May 2012 22:48:07 -0400
Subject: [PATCH] Add \asciicirc entity

* org-entities.el (org-entities): Added \asciicirc entity for ^; also fixed \circ expansion in latex.

TINYCHANGE
---
 lisp/org-entities.el |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index fce3b68..b1c8ad4 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -252,7 +252,7 @@ loaded, add these packages to `org-export-latex-packages-alist'."
     "* Other"
     "** Misc. (often used)"
-    ("circ" "\\circ" t "&circ;" "^" "^" "ˆ")
+    ("circ" "\\^{}" nil "&circ;" "^" "^" "ˆ")
     ("vert" "\\vert{}" t "&#124;" "|" "|" "|")
     ("brvbar" "\\textbrokenbar{}" nil "&brvbar;" "|" "¦" "¦")
     ("sect" "\\S" nil "&sect;" "paragraph" "§" "§")
@@ -264,6 +264,7 @@ loaded, add these packages to `org-export-latex-packages-alist'."
     ("plus" "+" nil "+" "+" "+" "+")
     ("under" "\\_" nil "_" "_" "_" "_")
     ("equal" "=" nil "=" "=" "=" "=")
+    ("asciicirc" "\\textasciicircum{}" nil "^" "^" "^" "^")
     ("dagger" "\\textdagger{}" nil "&dagger;" "[dagger]" "[dagger]" "†")
     ("Dagger" "\\textdaggerdbl{}" nil "&Dagger;" "[doubledagger]" "[doubledagger]" "‡")
--
1.7.7.6
Re: [O] "Smart" quotes
On 05/26/2012 02:48 AM, Nicolas Goaziou wrote:

Hello, "Mark E. Shoulson" writes:

The regexp may be able to tell level 1 from level 2 quotes.

Do you mean that the author would use the same characters for both first and second level quotes, and the regexp would be smart enough to distinguish which level each was at? I don't think that's possible, and you probably don't either.

Actually, I do. Since you can tell an opening quote from a closing one by the position of the white space (or parenthesis, beginning/end of line) near it, I think you can deduce the quote level. I may be wrong, though.

Maybe, if it's all on one line. But if the quote is several lines long, can you sensibly count the levels? I guess it doesn't actually matter, but it starts to get weird if you find yourself looking arbitrarily far back, and then you start building in exceptions for crossing paragraph boundaries... And then there's the fact that multi-paragraph quotes usually have an open-quote for each paragraph but only one close-quote at the end...

Actually keeping count of what level you're at, accurately, is a classic example of a non-regular language; you need a push-down automaton to keep count, and regular expressions don't cut it. Then again, Emacs regexps are more powerful than simple regular expressions, and we only would want to keep track of even vs odd level anyway.

I'm rambling. In sum, I'm going to start off /not/ trying to solve that problem, and assume the writer is going to use alternating " and ' as typography requires and not try to second-guess what level we're at. As that progresses, maybe I'll come to understand better what can and can't (and should and shouldn't) be deduced by the regexps.

  "this is a 'quote', and that's all you need to know."

becoming, for instance

  «this is a ‹quote›, and that’s all you need to know.»

  "this is a "quote", and that's all you need to know"

is as parsable to me. As a side note, at least in French, many typographers would recommend

  "this is a /quote/, and that's all you need to know"

here. Oh, and I know that was just an example.

I see; because I can tell that the second " must be an open-quote and not closing the first, due to its position relative to the spaces. It does seem possible, but I think I'm going to try not solving that problem first. (And French typography raises other problems, since French puts lots of space around the quote-marks, to the extent that French typists typing plain-text will often put a space on both sides of a quote-mark, making it hard to see whether it opens or closes... another issue, not necessarily solvable, to watch for.)

~mark
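The point about deducing the level from the neighbouring white space can be illustrated with a small counter-based sketch. This is not code from the thread: `my/classify-double-quotes' is a hypothetical name, and the rule it applies (a " is an opener when it starts the string or follows white space or an opening bracket, otherwise a closer) is an assumption that breaks on exactly the French habit of spacing both sides of a quote-mark mentioned above.

```elisp
;; Hypothetical helper; a sketch of the idea, not a proposed API.
(defun my/classify-double-quotes (text)
  "Return a list of (POSITION KIND LEVEL) for each \" in TEXT.
KIND is `open' or `close'; LEVEL is the nesting depth, starting at 1."
  (let ((depth 0)
        (start 0)
        result)
    (while (string-match "\"" text start)
      (let* ((pos (match-beginning 0))
             (before (and (> pos 0) (aref text (1- pos))))
             ;; Opening quote: start of string, or preceded by
             ;; white space or an opening bracket.
             (opening (or (null before)
                          (memq before '(?\s ?\t ?\n ?\( ?\[)))))
        (if opening
            (progn
              (setq depth (1+ depth))
              (push (list pos 'open depth) result))
          (push (list pos 'close depth) result)
          ;; Never let an unbalanced closer drive the depth negative.
          (setq depth (max 0 (1- depth))))
        (setq start (1+ pos))))
    (nreverse result)))
```

For the string "this is a "quote", and so on." (with the outer quotes included), this returns ((0 open 1) (11 open 2) (17 close 2) (30 close 1)), so an exporter could pick first-level or second-level marks from LEVEL. The plain counter stands in for the push-down automaton mentioned above; a regexp alone does not carry that state.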
Re: [O] "Smart" quotes
On 05/25/2012 01:14 PM, Nicolas Goaziou wrote:

Hello, "Mark E. Shoulson" writes:

Hm. I like the idea, but it raises some questions for me. It would be particularly good if this could share code/custom variables with the pieces of the (new) exporter that make smart quotes on export. That way we could be sure that what it looks like onscreen would also be what it looked like when exported.

It could be interesting, but keep in mind that no matter how "smart" your quotes are, they will fail in some situations. So, it will have to be optional for export, independently of their in-buffer status. The OPTIONS keyword may be used, with q:t and q:nil items.

"Smart" quotes absolutely have to be optional, and probably disabled by default. They're going to fail sometimes, so they should only be there when you ask for them. Smart-quotes-for-export and smart-quotes-onscreen need to be settable independently, yes. Smart-quotes-for-export needs to be settable per-file/per-buffer, with OPTIONS or something. Smart-quotes-onscreen doesn't have to be buffer-local, though it might be a good idea. Using q:t or maybe ":t in options seems perfectly good for setting exporting smart quotes. It still would be good if onscreen and export could share code.

Looking at contrib/lisp/org-e-latex.el as an upcoming exporter for such things, I see a variable org-e-latex-quotes, which has nice language-aware parts... but misses an important point. Each language gets to define one regexp for opening quotes, one for closing quotes, and one for single quotes. But don't we want to talk about (at least) two levels of quotes, see your own reference[fn:1]?

Probably. But that's going to be somewhat harder.

Single quotes would be for inner, second-level quotes (if we're using double straight quotes according to (American) English usage, I would guess we'd be using single straight quotes the same way). That works okay for English, where a single apostrophe not part of a grouping construct is going to be interpreted as a "close" single quote and look right for an apostrophe.

The regexp may be able to tell level 1 from level 2 quotes.

Do you mean that the author would use the same characters for both first and second level quotes, and the regexp would be smart enough to distinguish which level each was at? I don't think that's possible, and you probably don't either. What I meant, and you probably did as well, was that if we use apostrophes for second-level quotes, a regexp can be smart enough to tell the difference between a second-level quote and a non-quote apostrophe.

It might not work so good in French where apostrophes are also used,

There are no spaces around apostrophes, so they shouldn't be caught by the regexp.

which is what you say here. They *should* be caught by a regexp, but not the same one; they need to be smartified also, just not necessarily treated the same as second-level quotes.

but also single guillemets for inner-level quotes.

What are single guillemets? I don't think there is such thing in French.

You're right; the Wikipedia page says that French uses quote-marks or the same double-chevrons for inner quotes. I thought it used \lsaquo and \rsaquo, « like ‹ this › ». Looks like it does in Swiss typography for various languages, according to the page. Danish also uses the single-chevrons (pointing the other direction), and Azerbaijani and Basque, etc... Whatever. What I meant was, if people are going to be writing using straight ascii quotes and expect them to be changed into language-appropriate quotes, they're going to want something like

  "this is a 'quote', and that's all you need to know."

becoming, for instance

  «this is a ‹quote›, and that’s all you need to know.»

that is, it should be possible to use the single quotes for inner quotes, which would mean more than just opening/closing/single in the org-e-latex-quotes (and analogous variables in other exporters). Being able to determine when you need ‹› and when ’ might be a little uncertain, but it isn't hard to make a regexp that can make a decent guess at it.

Should/can we consider extending this for the new exporters?

I think it would be a good addition to the export mechanism, if you want to give it a try.

I'd love to get org more export-friendly. I'll see what I can understand of the (new) export code.

(I'm looking forward to HTML and ODT exporters that can do smart quotes; the straight quotes are really the main jarring things about using Org as a lightweight markup and exporting into something fancier)

A function, provided in org-export, could help changing dumb quotes into smart quotes in plain text. Then, it would be easier for back-ends to provide the feature, if they wanted to.

That sounds like a possibility, might mak
Re: [O] #+STARTUP: showstars
Enda writes:

Otherwise is like in vi, the additional stars (like **, opposed to *) are too noisy, and since I do not want to see them whether it is in org-mode or in vi, etc, so I wondered was there a way to have the file like

* first level heading
 * second level heading
  * third level heading

There seems to be some confusion getting across. As I understand it, this is what the OP is after:

Problem: doesn't like the look of multiple *s for headline; prefers the look of hidestars-mode. Furthermore, hidestars isn't good enough, because Enda wants the file *really to be like that*, without the stars there at all, so that it still looks good when being edited in vi or something. It needs to look like hidestars *all the time*, even outside of org, to avoid seeing the *s.

OP's requested fix: change the syntax so that a line which begins with N spaces followed by a star and then more spaces after that acts as if the N spaces at the beginning were stars. That is, "  * " (two spaces, then a star) at the start of the line should be a third-level headline, as if it started with ***. OP, please let me know if I'm stating your position correctly.

Problem with requested fix: In addition to being a pretty major change to start with, it also already conflicts with established usage. A line that starts with spaces and then a star-space is considered an item in a plain list. Using *s for plain lists is discouraged, because of the potential confusion with headlines if there are no spaces before it, but it is permitted (and if it could have been dropped it would have been already). So, no, this isn't going to happen.

I _guess_ one could try defining something like a NO-BREAK-SPACE character to behave like a star at the beginning of the line when it comes to determining headlines, so you basically have a character that *looks* blank you can use for the non-final stars, but that would also be a pretty enormous change and I don't see it looking like a good move either.

I hope that accurately portrays the request (and responses thereto).

~mark
Re: [O] [PATCH] Add entities for /, +, _, =
On 05/25/2012 11:04 AM, Nicolas Goaziou wrote: Hello, You're right, there could be another entity for ^. asciicirc is good enough as a name. Would you want to make a patch for it? Also, you may want to consider signing FSF papers for more important contributions. Yes, I'll do both those things, so don't worry about the patch. Might have to wait until next week though. ~mark
Re: [O] "Smart" quotes
On 05/23/2012 06:17 PM, Nicolas Goaziou wrote:

Hello, "Mark E. Shoulson" writes:

"Smart" quotes can be annoying when they aren't smart enough. But when they work you can miss them. I'm attaching a patch that defines a custom variable org-smart-quotes (nil by default), which when non-nil causes the " and ' characters to display as “smart” quotes, hopefully the right ones. They're still ' and " in the underlying text, just overlaid with “”.

This is not related to entities, so code shouldn't be in org-entities.el.

Agreed.

Also, quotes are dependent on locale[fn:1]. English/US only quotes look like a niche to me. Would it be possible to modify the patch and have this feature handle LANGUAGE keyword, or at least have a support for it?

Hm. I like the idea, but it raises some questions for me. It would be particularly good if this could share code/custom variables with the pieces of the (new) exporter that make smart quotes on export. That way we could be sure that what it looks like onscreen would also be what it looked like when exported.

Looking at contrib/lisp/org-e-latex.el as an upcoming exporter for such things, I see a variable org-e-latex-quotes, which has nice language-aware parts... but misses an important point. Each language gets to define one regexp for opening quotes, one for closing quotes, and one for single quotes. But don't we want to talk about (at least) two levels of quotes, see your own reference[fn:1]? Single quotes would be for inner, second-level quotes (if we're using double straight quotes according to (American) English usage, I would guess we'd be using single straight quotes the same way). That works okay for English, where a single apostrophe not part of a grouping construct is going to be interpreted as a "close" single quote and look right for an apostrophe. It might not work so good in French where apostrophes are also used, but also single guillemets for inner-level quotes.

Does the setup there need to be smarter, or at least more extensible, to allow for more than exactly three entries? Clever enough regexps could distinguish inner quotes from apostrophes, etc. Should/can we consider extending this for the new exporters?

(I'm looking forward to HTML and ODT exporters that can do smart quotes; the straight quotes are really the main jarring things about using Org as a lightweight markup and exporting into something fancier)

~mark
[O] [PATCH] Add entities for /, +, _, =
On 05/23/2012 05:53 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson" writes:
>
>> Also attached is another patch that might or might not be useful.
>> Sometimes it can be a problem when you can't type, say, asterisks
>> around a word when you NEED asterisks around the word, not a boldface
>> word (I'd been getting around it by using Unicode characters that look
>> like asterisks, like ∗). The way to do it right is to use the \ast
>> entity, which expands to the right thing but doesn't affect formatting.
>> There's also already a \tilde entity, to allow putting in tildes
>> without accidentally setting something verbatim. I added entities for
>> the remaining markup characters: \plus, \under, \equal, and \slash.
>> \under might be particularly handy when avoiding subscripting (which
>> raises the question of if there should be an \asciicirc (or something)
>> entity for ^ also).
>
> I think they are all useful. Though, asciicirc already exists as circ.

I hadn't counted \circ because it expands under Unicode to ˆ (U+02C6) and not to the true ASCII circumflex ^ (U+005E); the point of these entities is to represent ASCII characters that would otherwise confuse things. Maybe \circ should expand to ^; maybe there should be another entity for it (maybe neither).

Anyway, attaching the relevant patch (changelog tweaked), once again hoping I got the formatting and everything right.

~mark

From 4d6c4ccc90fd181f446ff4c7d56f5c980ec9d940 Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Wed, 23 May 2012 21:53:35 -0400
Subject: [PATCH] Add entities for /, +, _, =

* org-entities.el (org-entities): Add new entities for characters which
could cause formatting changes if typed directly.
---
 lisp/org-entities.el |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 8b5b3f3..fce3b68 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -260,6 +260,10 @@ loaded, add these packages to `org-export-latex-packages-alist'."
     ("lt" "\\textless{}" nil "&lt;" "<" "<" "<")
     ("gt" "\\textgreater{}" nil "&gt;" ">" ">" ">")
     ("tilde" "\\~{}" nil "&tilde;" "~" "~" "~")
+    ("slash" "/" nil "/" "/" "/" "/")
+    ("plus" "+" nil "+" "+" "+" "+")
+    ("under" "\\_" nil "_" "_" "_" "_")
+    ("equal" "=" nil "=" "=" "=" "=")
     ("dagger" "\\textdagger{}" nil "&dagger;" "[dagger]" "[dagger]" "†")
     ("Dagger" "\\textdaggerdbl{}" nil "&Dagger;" "[doubledagger]" "[doubledagger]" "‡")
-- 
1.7.7.6
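For readers unfamiliar with the entry format: each entity is a list of name, LaTeX replacement, a LaTeX math-mode flag, HTML, ASCII, Latin-1, and UTF-8 replacements. The following is only a minimal sketch of looking up fields in such entries; `my/sample-entities` and `my/entity-field` are hypothetical names made up for illustration and are not part of Org.

```elisp
;; Hypothetical sample list holding only the four entries the patch adds.
(defconst my/sample-entities
  '(("slash" "/" nil "/" "/" "/" "/")
    ("plus" "+" nil "+" "+" "+" "+")
    ("under" "\\_" nil "_" "_" "_" "_")
    ("equal" "=" nil "=" "=" "=" "=")))

(defun my/entity-field (name index)
  "Return field INDEX of the sample entry called NAME, or nil if unknown."
  ;; `assoc' compares with `equal', so string keys work.
  (nth index (assoc name my/sample-entities)))

(my/entity-field "under" 1)  ; LaTeX replacement, the two characters \_
(my/entity-field "under" 6)  ; UTF-8 replacement, "_"
(my/entity-field "caret" 6)  ; nil, no such entry in the sample list
```

Only \under needs an escaped LaTeX form here; the other three characters are passed through unchanged in every backend.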
[O] [PATCH] Fix for displaying entities ending in a number
On 05/23/2012 05:53 PM, Nicolas Goaziou wrote:
> Hello,
>
> "Mark E. Shoulson" writes:
>
>> There's a small bug in rendering the entities when org-pretty-entities
>> is on (I get the feeling that org-pretty-entities is not a very
>> commonly-used feature). The entities \sup1 \sup2 \sup3 and \there4 are
>> not rendered properly. The regex detecting entities apparently doesn't
>> catch numbers at the end, except for the special case of fractions.
>> I've added the others to the special-casing and attach a patch for it;
>> I hope I managed to include the changelog properly (is git
>> format-patch --attach the way to go?).
>
> This looks good. You should add a title to your patch, like "Fix
> detection of entities ending with a number" or "org-entities: Add some
> entities". Also, please capitalize the word after the colons.

I was trying to copy the format seen in other patches on the list; I guess I missed some details. I've set the subject of this thread as I've seen done with other patches, and I attach only a single patch, as requested by the website, and created the changelog with C-x 4 a and everything. I hope I got it right. Other patch follows under separate cover.

> Could you modify slightly your changelogs before I apply the patches?
>
> Thank you.
>
> Regards,

From 9b8e1b56c5c60720f985ea2b26952702c6c730a6 Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Wed, 23 May 2012 20:17:40 -0400
Subject: [PATCH] Fix for displaying entities ending in a number

* lisp/org.el (org-fontify-entities): Fix bug: The entities \sup[123]
and \there4 were not "prettified" when org-pretty-entities was enabled.

TINYCHANGE
---
 lisp/org.el |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 0b00851..c44c7ab 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5966,7 +5966,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
       (when org-pretty-entities
         (catch 'match
           (while (re-search-forward
-                  "\\\\\\(frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
+                  "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
                   limit t)
             (if (and (not (org-in-indented-comment-line))
                      (setq ee (org-entity-get (match-string 1)))
-- 
1.7.7.6
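To make the effect of the one-line change concrete, here is a small sketch that runs both regexps over sample strings outside of font-lock. The names `my/old-entity-re`, `my/new-entity-re` and `my/entity-name` are hypothetical helpers for illustration; the regexps assume the leading literal backslash that entity matching requires.

```elisp
;; Regexp before the patch: only fractions may end in digits.
(defconst my/old-entity-re
  "\\\\\\(frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)")

;; Regexp after the patch: there4 and sup1..sup3 are special-cased too.
(defconst my/new-entity-re
  "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)")

(defun my/entity-name (regexp text)
  "Return the entity name REGEXP captures in TEXT, or nil if none."
  (when (string-match regexp text)
    (match-string 1 text)))

(my/entity-name my/old-entity-re "x\\sup2 y")    ; "sup", the digit is cut off
(my/entity-name my/new-entity-re "x\\sup2 y")    ; "sup2"
(my/entity-name my/old-entity-re "a \\there4.")  ; "there"
(my/entity-name my/new-entity-re "a \\there4.")  ; "there4"
```

With the old regexp the digit is consumed by the terminator group, so the name handed to the entity lookup is the truncated one. Listing the digit-bearing names before the generic `[a-zA-Z]+` alternative lets them win.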
[O] "Smart" quotes
(I seem to be winding up fixating on non-asciisms for org-mode; strange)

"Smart" quotes can be annoying when they aren't smart enough. But when they work you can miss them. I'm attaching a patch that defines a custom variable org-smart-quotes (nil by default), which when non-nil causes the " and ' characters to display as “smart” quotes, hopefully the right ones. They're still ' and " in the underlying text, just overlaid with “”.

I started working on parallel patches on the export end (in the -e- files in contrib); it would really help org-mode's standing as a lightweight markup language, imo, but then I saw that's already underway; I had just looked in the wrong exports.

I also note that my post here dated May 9 (http://article.gmane.org/gmane.emacs.orgmode/55756 ) has gone completely without comment, good or bad. Did I miss something? Did it not get through (it shows up on gmane)? At least one of the changes submitted there I would think is pretty uncontroversial.

~mark

diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 8b5b3f3..ee54abc 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -47,6 +47,14 @@ in backends where the corresponding character is not available."
   :version "24.1"
   :type 'boolean)
 
+(defcustom org-smart-quotes nil
+  "Non-nil means display ' and \" characters as Unicode \"smart\" quotes.
+Org-mode will try to figure out if a quote character is opening or closing.
+
+Note: this does not affect export, only on-screen appearance."
+  :group 'org-entities
+  :type 'boolean)
+
 (defcustom org-entities-user nil
   "User-defined entities used in Org-mode to produce special characters.
 Each entry in this list is a list of strings.  It associates the name
diff --git a/lisp/org.el b/lisp/org.el
index 05f5375..213490e 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5926,6 +5926,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
            '(1 'org-archived prepend))
           ;; Specials
           '(org-do-latex-and-special-faces)
+          '(org-smartify-quotes)
           '(org-fontify-entities)
           '(org-raise-scripts)
           ;; Code
@@ -5948,6 +5949,33 @@ needs to be inserted at a specific position in the font-lock sequence.")
          '(org-font-lock-keywords t nil nil backward-paragraph))
     (kill-local-variable 'font-lock-keywords) nil))
 
+(defconst org-smart-quotes-regex
+  ;; ' is a word character, " is punctuation.
+  "\\(\"\\<\\)\\|\\>\\s.*\\(\"\\)\\|\\(?:\\W\\|^\\)\\('\\)\\|\\w\\s.*\\('\\)")
+
+
+(defun org-smartify-quotes (limit)
+  "Make 'smart quotes' out of straight quotes."
+  (let* (start end subst k)
+    (when org-smart-quotes
+      (catch 'match
+        (while (re-search-forward org-smart-quotes-regex
+                                  limit t)
+          (cond ((match-string 1)
+                 (setq k 1 subst "“"))
+                ((match-string 2)
+                 (setq k 2 subst "”"))
+                ((match-string 3)
+                 (setq k 3 subst "‘"))
+                ((match-string 4)
+                 (setq k 4 subst "’")))
+          (add-text-properties (match-beginning k) (match-end k)
+                               (list 'font-lock-fontified t))
+          (compose-region (match-beginning k) (match-end k) subst nil)
+          (backward-char 1)
+          (throw 'match t))
+        nil))))
+
 (defun org-toggle-pretty-entities ()
   "Toggle the composition display of entities as UTF8 characters."
   (interactive)

2012-05-21  Mark Shoulson

* lisp/org.el, lisp/org-entities.el: added org-smart-quotes for
displaying ' and " characters as "smart quotes."
[O] Entities
There's a small bug in rendering the entities when org-pretty-entities is on (I get the feeling that org-pretty-entities is not a very commonly-used feature). The entities \sup1 \sup2 \sup3 and \there4 are not rendered properly. The regex detecting entities apparently doesn't catch numbers at the end, except for the special case of fractions. I've added the others to the special-casing and attach a patch for it; I hope I managed to include the changelog properly (is git format-patch --attach the way to go?).

Also attached is another patch that might or might not be useful. Sometimes it can be a problem when you can't type, say, asterisks around a word when you NEED asterisks around the word, not a boldface word (I'd been getting around it by using Unicode characters that look like asterisks, like ∗). The way to do it right is to use the \ast entity, which expands to the right thing but doesn't affect formatting. There's also already a \tilde entity, to allow putting in tildes without accidentally setting something verbatim. I added entities for the remaining markup characters: \plus, \under, \equal, and \slash. \under might be particularly handy when avoiding subscripting (which raises the question of whether there should be an \asciicirc (or something) entity for ^ also).

~mark

From 5070e37aaae6f952bab022c71212fabb7549105e Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 8 May 2012 15:15:10 -0400
Subject: [PATCH] Fix for displaying certain "pretty" entities
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="1.7.7.6"

This is a multi-part message in MIME format.
--1.7.7.6
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit

* org.el (org-fontify-entities): fix bug: The entities \sup[123] and
\there4 were not "prettified" when org-pretty-entities was enabled.

TINYCHANGE
---
 lisp/org.el |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

--1.7.7.6
Content-Type: text/x-patch; name="0001-Fix-for-displaying-certain-pretty-entities.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-Fix-for-displaying-certain-pretty-entities.patch"

diff --git a/lisp/org.el b/lisp/org.el
index 66f9c3e..1d2955f 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5954,7 +5954,7 @@ needs to be inserted at a specific position in the font-lock sequence.")
       (when org-pretty-entities
         (catch 'match
           (while (re-search-forward
-                  "\\\\\\(frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
+                  "\\\\\\(there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
                   limit t)
             (if (and (not (org-in-indented-comment-line))
                      (setq ee (org-entity-get (match-string 1)))

--1.7.7.6--

From 58d18562f39ed64a547fa2d60510cae5983bcbef Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Tue, 8 May 2012 15:22:48 -0400
Subject: [PATCH] Add entities for /, +, _, =
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="1.7.7.6"

This is a multi-part message in MIME format.
--1.7.7.6
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit

* org-entities.el (org-entities): add new entities for characters which
could cause formatting changes if typed directly.
---
 lisp/org-entities.el |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

--1.7.7.6
Content-Type: text/x-patch; name="0001-Add-entities-for-_.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="0001-Add-entities-for-_.patch"

diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 8b5b3f3..fce3b68 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -260,6 +260,10 @@ loaded, add these packages to `org-export-latex-packages-alist'."
     ("lt" "\\textless{}" nil "&lt;" "<" "<" "<")
     ("gt" "\\textgreater{}" nil "&gt;" ">" ">" ">")
     ("tilde" "\\~{}" nil "&tilde;" "~" "~" "~")
+    ("slash" "/" nil "/" "/" "/" "/")
+    ("plus" "+" nil "+" "+" "+" "+")
+    ("under" "\\_" nil "_" "_" "_" "_")
+    ("equal" "=" nil "=" "=" "=" "=")
     ("dagger" "\\textdagger{}" nil "&dagger;" "[dagger]" "[dagger]" "†")
     ("Dagger" "\\textdaggerdbl{}" nil "&Dagger;" "[doubledagger]" "[doubledagger]" "‡")

--1.7.7.6--
[O] Hiding the braces when org-pretty-entities is enabled
It's a very tiny patch, but one that probably should have happened before. When org-pretty-entities is enabled, entities are displayed as Unicode characters, which is nice, but an entity in the middle of a word has to be terminated with {}, and those braces stay visible. So you have to write something like M\ouml{}bius, and with org-pretty-entities on that is displayed as Mö{}bius, which really isn't what you want. This patch special-cases "{}" following an entity, so the braces get hidden as well.

Hope it helps.

~mark

From 819e66254de18ae78e96b52dee1b5098920c3e31 Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Fri, 4 May 2012 18:19:06 -0400
Subject: [PATCH] Entities: when org-pretty-entities is on, the {} terminating entities is also hidden

---
 lisp/org.el |   11 +++++++----
 1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 8776664..65dc1ce 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5938,16 +5938,19 @@ needs to be inserted at a specific position in the font-lock sequence.")
       (when org-pretty-entities
         (catch 'match
           (while (re-search-forward
-                  "\\\\\\(frac[13][24]\\|[a-zA-Z]+\\)\\($\\|[^[:alpha:]\n]\\)"
+                  "\\\\\\(frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
                   limit t)
             (if (and (not (org-in-indented-comment-line))
                      (setq ee (org-entity-get (match-string 1)))
                      (= (length (nth 6 ee)) 1))
-                (progn
+                (let*
+                    ((end (if (equal (match-string 2) "{}")
+                              (match-end 2)
+                            (match-end 1))))
                   (add-text-properties
-                   (match-beginning 0) (match-end 1)
+                   (match-beginning 0) end
                    (list 'font-lock-fontified t))
-                  (compose-region (match-beginning 0) (match-end 1)
+                  (compose-region (match-beginning 0) end
                                   (nth 6 ee) nil)
                   (backward-char 1)
                   (throw 'match t))))
-- 
1.7.7.6
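The core of the change is the choice of where the composed region ends. As a rough sketch of that decision on plain strings (no font-lock involved), assuming the regexp begins with the literal backslash that introduces an entity; `my/entity-display-end` is a hypothetical helper, not a function in Org:

```elisp
(defun my/entity-display-end (text)
  "Return the index where the hidden span ends for the first entity in TEXT.
The span covers the entity name and, when present, a terminating \"{}\"."
  (when (string-match
         "\\\\\\(frac[13][24]\\|[a-zA-Z]+\\)\\($\\|{}\\|[^[:alpha:]\n]\\)"
         text)
    ;; Same test as in the patch: extend over group 2 only if it is "{}".
    (if (equal (match-string 2 text) "{}")
        (match-end 2)
      (match-end 1))))

(my/entity-display-end "M\\ouml{}bius")  ; 8, the braces are included
(my/entity-display-end "\\alpha beta")   ; 6, the space is left alone
```

Trying `{}` before the generic non-letter alternative matters: otherwise the second group would match the `{` alone and the closing brace would stay visible.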
Re: [O] Flexible plain list bullets
On 04/20/2012 09:38 AM, Bastien wrote:
> Hi Mark,
>
> I agree with Nicolas that a solution based on overlays would be better.

Probably, though very possibly not worth it.

> I also agree with you that there are many areas where we let the users
> modify the content of Org files in a way that makes them unparsable in
> a systematic manner. This is always a trade-off: user flexibility vs a
> rigid syntax.

Yes. And as I noted in my examples, some/many of the cases involved were dealing with things that *really are worth* bending parsability for (as was pointed out to me on IRC). TODO keywords, for example, *really do* have to be customizable, and the system has to deal with it. Even with global configuration variables, which in no way follow the files around. I agree that those were worth doing it for.

> On top of this trade-off there are many other ones to consider, based on
> the available human (benevolent) resources to maintain Org.
>
> I try to prioritize things this way:
>
> 1. fix current bugs
> 2. improve syntax stability (against current state of flexibility)
> 3. extend current features
> 4. integrate new features
>
> Sometimes, a request raises a discussion somewhere between (3) and (2)
> -- which is precisely the discussion we have now. But pointing at
> failure in the current syntactic ground does not help promoting a
> feature request at (3) :)

Sure. I'm just noting these because probably nobody would have seen them as issues if the importance of hard-coding the syntax hadn't come up as a point in answering me. They (probably) aren't doing much harm anyway. I'd offer to write a patch for some of the more obvious ones, to free up that much time from others, but it would be so small that it would probably take as long for someone to look over my patch as to write it themselves, so it wouldn't really save anyone any time.

"Pointing out a failure in the current syntactic ground does not help promoting a feature request at (3)"... It might have, had the failures been so widespread as to show that this was the intended mode all along (that was not the case, so no, it doesn't). But getting the feature request in was already mostly off the table, I thought. I found some stuff that might be useful to know about and thought I should say so. (I'm talkative, yes, but not necessarily a jerk.)

> Also, Nicolas is working on a generic parser which will be the base for
> future decision on (2), and (consequently) on (3).

That was mentioned, especially with respect to the source-blocks.

> So yes, this is complicated. We have many users. And an overlay based
> solution will *not* be a temporary fix, it will be something that any
> user can enjoy.

I played with making the customization only able to ADD new characters, which I think would not be so harmful, but really only for my own edification; I can see that there really is no desire (outside me) to add this feature anyway.

~mark
Re: [O] Flexible plain list bullets
I guess. I spoke with someone on the IRC channel about this too, the basic idea being that the Org format should be stable, so the same file won't parse or behave differently on different installations. There's something to be said for that, but there are a fair number of customizable options that conflict with that ideal already. Some maybe should be there anyway, some might be better being made constants (or else reconsider my patch :) ). Examples:

+ org-emphasis-regexp-components and org-emphasis-alist are probably top candidates. These affect the parsing of Org in a pretty basic way: if you can change what characters to use for emphasis, and worse, exactly how they extend (what characters can interfere, etc), it's probably at least as potentially disruptive as alternate bullets. You might consider making these defconst instead of defcustom, if at all possible.

+ org-edit-src-region-extra is also a good example of exactly what you're saying shouldn't be there. First code blocks came in different ad-hoc flavors like #+ascii or or whatever. Then the #+begin_src format came in order to unify them all and keep them from proliferating as new languages come up. And so all of those are quite appropriately hardcoded, just as you say they should be, in the org-edit-src-find-region-and-lang function. But that function also looks at org-edit-src-region-extra, which throws open exactly the same kind of problem you're objecting to.

+ org-drawers is a customization that affects structure and parsing. Notably it is also settable in-file, which anything like this really needs to be, so a file can carry its special needs with it. This is actually probably a deeper structural change than bullets, but drawers can do great things, and so may be powerful enough to be worth it.

+ TODO keywords and the like also affect parsing and export and cursor-movement (about the same stuff bullets would) and are settable, but again are really important and useful. The COMMENT keyword is less critical, but since it's a word, it's only reasonable that people should be able to have it in the appropriate language for their file.

Which does bring up one point: it isn't fair to imply that customizable bullets would not be "pure plain text." Apart from the fact that they might well be used to make pure ASCII bullets (characters like @ or ! seem like possibilities), the fact is that Unicode *IS* plain text, that's what it's for. TODO keywords and such can and should be able to take on values that use non-ASCII letters for users of other languages, and Org files written in Hindi or Hebrew remain "pure plain text".

(I wonder if it would matter if the customization could only ADD possibilities, like the org-edit-src-region-extra variable does, and not replace or take away the basic ones.)

~mark

On 04/19/2012 06:01 AM, Carsten Dominik wrote:
> On Apr 19, 2012, at 11:40 AM, suvayu ali wrote:
>
> I think this is very well put. Org must remain parsable, and all basic
> syntactic elements should be pure plain text and not configurable.
>
> - Carsten
>
> However, Nicolas' suggestion about a minor mode to add overlays sounds
> like a great idea to me.
>
> -- 
> Suvayu
>
> Open source is the future. It sets us free.
>
> - Carsten
[O] Flexible plain list bullets
Attached is a patch that adds a customization variable for setting which characters you can use as bullets in plain lists. Unicode has all kinds of pretty characters like ❧ or ☞ that would be good for bullets, so why limit ourselves to just [-+*]?

The variable's "set" function also sets the related variables and the regular expressions built from it, treating "*" specially since it isn't a plain-list bullet at the beginning of a line. Care is taken that the character "-" does not wind up in the middle of the character range (but there isn't special processing for "]", and maybe there should be).

I put in some example sets to choose from in the customization menu, though they probably will not be usable since I understand Org-mode still has to support Emacs versions that don't support Unicode.

Please take a look, see if it's worth adding, tell me what else I need to do if necessary. Thanks!

~mark

From 5db3081b9487c09b17c7accfcf1b25f45002aa13 Mon Sep 17 00:00:00 2001
From: Mark Shoulson
Date: Wed, 18 Apr 2012 20:55:41 -0400
Subject: [PATCH] Lists: enable customization for arbitrary characters for plain list bullets

* lisp/org-list.el (org-list-bulletcharlist): new custom variable to
set a list of characters for use as the bullets in plain lists.
Entails a few other variables set along with it.
---
 lisp/org-list.el |   67 +
 1 files changed, 57 insertions(+), 10 deletions(-)

diff --git a/lisp/org-list.el b/lisp/org-list.el
index 882ce3d..c751d1f 100644
--- a/lisp/org-list.el
+++ b/lisp/org-list.el
@@ -360,17 +360,60 @@ specifically, type `block' is determined by the variable
   "Regex corresponding to the end of a list.
 It depends on `org-empty-line-terminates-plain-lists'.")
 
-(defconst org-list-full-item-re
-  (concat "^[ \t]*\\(\\(?:[-+*]\\|\\(?:[0-9]+\\|[A-Za-z]\\)[.)]\\)\\(?:[ \t]+\\|$\\)\\)"
-          "\\(?:\\[@\\(?:start:\\)?\\([0-9]+\\|[A-Za-z]\\)\\][ \t]*\\)?"
-          "\\(?:\\(\\[[ X-]\\]\\)\\(?:[ \t]+\\|$\\)\\)?"
-          "\\(?:\\(.*\\)[ \t]+::\\(?:[ \t]+\\|$\\)\\)?")
+;; There shouldn't really have to be two different values here, since
+;; they need to be changed in sync...
+
+(defvar org-list-bullet-re)
+(defvar org-list-bullet-chars)
+(defvar org-list-full-item-re nil
   "Matches a list item and puts everything into groups:
 group 1: bullet
 group 2: counter
 group 3: checkbox
 group 4: description tag")
 
+(defcustom org-list-bulletcharlist '(?+ ?- ?*)
+  "Characters used as unordered plain list bullets.
+If nil, defaults to (?- ?+ ?*), i.e. hyphen, plus, and *.  If * is present,
+it only matches when not at the beginning of the line (it must be preceded
+by whitespace).
+
+Using letters as bullet characters is not recommended, as they also get
+interpreted as ordered lists."
+  :group 'org-plain-lists
+  :type '(choice (const nil)
+                 (const :tag "(+ - *)" '(?+ ?- ?*))
+                 (const :tag "(+ - * â£)" '(?+ ?- ?* ?â£))
+                 (const :tag "(â ⦠⧠â¥)" '(?â ?⦠?⧠?â¥))
+                 (repeat character))
+  :set (lambda (name val)
+         (let* ((val (or val '(?- ?+ ?*)))
+                ;; - mustn't be in the middle!  Place it in front.
+                (val (if (member ?- val)
+                         (cons ?- (remove ?- val))
+                       val))
+                (star-p (member ?* val))
+                (val (remove ?* val)))
+           (setq org-list-bullet-chars
+                 (concat (eval `(string ,@val))
+                         (when star-p "*")))
+           (setq org-list-full-item-re
+                 (concat "^[ \t]*\\(\\(?:[" org-list-bullet-chars "]"
+                         "\\|\\(?:[0-9]+\\|[A-Za-z]\\)[.)]\\)\\(?:[ \t]+\\|$\\)\\)"
+                         "\\(?:\\[@\\(?:start:\\)?\\([0-9]+\\|[A-Za-z]\\)\\][ \t]*\\)?"
+                         "\\(?:\\(\\[[ X-]\\]\\)\\(?:[ \t]+\\|$\\)\\)?"
+                         "\\(?:\\(.*\\)[ \t]+::\\(?:[ \t]+\\|$\\)\\)?"))
+           ;; * is a special case
+           (setq org-list-bullet-re
+                 (concat
+                  "\\(?:"
+                  (when star-p "[[:blank:]]+\\*")
+                  (when (and star-p val) "\\|")
+                  "[[:blank:]]*["
+                  (when val (eval `(string ,@val)))
+                  "]\\)")))
+         (set name val)))
+
 (defun org-item-re ()
   "Return the correct regular expression for plain lists."
   (let ((term (cond
@@ -379,8 +422,8 @@ group 4: description tag")
                ((= org-plain-list-ordered-item-terminator ?.) "\\.")
                (t "[.)]")))
         (alpha (if org-alphabetical-lists "\\|[A-Za-z]" "")))
-    (concat "\\([ \t]*\\([-+]\\|\\(\\([0-9]+" alpha "\\)" term
-            "\\)\\)\\|[ \t]+\\*\\)\\([ \t]+\\|$\\)")))
+    (concat "\\(" org-list-bullet-re "\\|[ \t]*\\(\\(\\([0-9]+" alpha "\\)" term
+            "\\)\\)\\)\\([ \t]+\\|$\\)")))
 
 (defsubst org-item-beginning-re ()
   "Regexp matching the beginning of a plain list item."
@@ -2229,7 +2272,7 @@ is an integer, 0 means `-', 1 means `+' etc.  If WHICH is
                     (t (org-trim bullet))))
          ;; Compute list of possible bullets, depending on context.
          (bullet-list
-          (append '("-" "+" )
+          (append (mapcar 'char-to-string (string-to-list org-list-bullet-chars))