[PATCH] lisp/org-expiry.el: Account for org-time-stamp-formats

2022-12-04 Thread Tom Gillespie
Hi,
   Here is a patch for org-contrib/lisp/org-expiry.el to account for
recent changes to org-time-stamp-formats. Best,
Tom

PS is this list still the best place to send org-contrib patches?
From 2408e92a9c5e155b55a374462d1314aabbe50fe0 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Sun, 4 Dec 2022 01:02:35 -0800
Subject: [PATCH] lisp/org-expiry.el: Account for org-time-stamp-formats
 refactor

* lisp/org-expiry.el (org-expiry-insert-created)
(org-expiry-insert-expiry): timestamp formats dropped delimiters so a
slight modification is required following org commit
e3a7c01874c9bb80e04ffa58c578619faf09e7f0, also bump version to 0.3 and
add a dependency on org 9.6 to ensure that the new version of
org-time-stamp-formats is present and users of older versions of org
will not accidentally load the new version
---
 lisp/org-expiry.el | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/lisp/org-expiry.el b/lisp/org-expiry.el
index 98ad58a..d8d604b 100644
--- a/lisp/org-expiry.el
+++ b/lisp/org-expiry.el
@@ -3,9 +3,10 @@
 ;; Copyright 2007-2021 Free Software Foundation, Inc.
 ;;
 ;; Author: Bastien Guerry 
-;; Version: 0.2
+;; Version: 0.3
 ;; Keywords: org, expiry
 ;; Homepage: https://git.sr.ht/~bzg/org-contrib
+;; Package-Requires: ((org "9.6"))
 
 ;; This file is not part of GNU Emacs.
 
@@ -299,10 +300,11 @@ update the date."
   (setq d-hour (format-time-string "%H:%M" d-time))
   (setq timestr
 	;; two C-u prefixes will call org-read-date
-	(if (equal arg '(16))
-		(concat "<" (org-read-date
-			 nil nil nil nil d-time d-hour) ">")
-	  (format-time-string (cdr org-time-stamp-formats))))
+        (concat "<"
+                (if (equal arg '(16))
+                    (org-read-date nil nil nil nil d-time d-hour)
+                  (format-time-string (cdr org-time-stamp-formats)))
+                ">"))
   ;; maybe transform to inactive timestamp
   (if org-expiry-inactive-timestamps
 	  (setq timestr (concat "[" (substring timestr 1 -1) "]")))
@@ -320,10 +322,11 @@ and insert today's date."
 (setq d-time (if d (org-time-string-to-time d)
 		   (current-time)))
 (setq d-hour (format-time-string "%H:%M" d-time))
-(setq timestr (if today
-		  (format-time-string (cdr org-time-stamp-formats))
-		(concat "<" (org-read-date
- nil nil nil nil d-time d-hour) ">")))
+    (setq timestr (concat "<"
+                          (if today
+                              (format-time-string (cdr org-time-stamp-formats))
+                            (org-read-date nil nil nil nil d-time d-hour))
+                          ">"))
 ;; maybe transform to inactive timestamp
 (if org-expiry-inactive-timestamps
 	(setq timestr (concat "[" (substring timestr 1 -1) "]")))
-- 
2.37.4
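
The patch above handles the delimiter change by requiring Org 9.6. As an alternative illustration only (not what the patch does), a minimal sketch of code that tolerates both styles of format string could check whether the formatted result already carries its delimiters. The function name `my/active-timestamp` and its `fmt` parameter are hypothetical and exist only for this example.

```elisp
(defun my/active-timestamp (fmt &optional time)
  "Return TIME formatted with FMT as an active Org timestamp.
FMT may or may not already include the <...> delimiters.  TIME
defaults to the current time.  Hypothetical helper for illustration."
  (let ((stamp (format-time-string fmt time)))
    (if (string-prefix-p "<" stamp)
        stamp                       ; old-style format, already delimited
      (concat "<" stamp ">"))))     ; new-style format, add the delimiters
```

With Org loaded, `(my/active-timestamp (cdr org-time-stamp-formats))` would produce a delimited timestamp regardless of which Org version supplied the format.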



Re: org-assert-version considered harmful

2022-12-03 Thread Tom Gillespie
Hi Ihor,
I have been able to use `unload-feature' (with some additional
hackery) to get things working. It is not a pretty sight (there are a
bunch of pitfalls and unexpected side-effects that are likely bugs
when using unload-feature), so yes, ideally Emacs would make it
possible to have multiple versions of the same package. Many thanks!
Tom



Re: Feedback on Org syntax document

2022-12-02 Thread Tom Gillespie
Hi Ihor,
   Thank you for resurfacing this. I'll keep an eye out for the other
threads. I've been preoccupied with trying to wrap up my PhD work so
have been (and will continue to be for a bit longer) focused on other
things. I'll be back to focus on many of these questions some time in
the late winter/early spring if I had to guess. Best!
Tom



Re: org-assert-version considered harmful

2022-12-02 Thread Tom Gillespie
Sorry to be so late chiming in here. I've only now encountered this
due to the 9.6 release. In short, org-assert-version is an absolute
disaster for me.

At the very least org-assert-version should be non-fatal by default.

Without going into too much detail, an orgstrap shebang block is
forced to use the system installed version of org because it is
intended to work in the absence of an init.el file, or before an
init.el file can ever be loaded.

This means that if a newer version of org is installed then no code
can ever run again after that package is visible on the load path
because loading the newer version of org will immediately cause an
error when something (e.g. ob-python) tries to require org-macs,
terminating the execution of the orgstrap block prematurely. There is
no simple workaround, and there is no guaranteed workaround aside from
going to great lengths to only ever use the builtin version of org.
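
To make the "non-fatal by default" suggestion concrete, here is a rough sketch of a version check that warns unless the user opts in to a hard error. This is not how org-assert-version is implemented; the names `my/version-mismatch-fatal` and `my/check-version` are hypothetical and chosen only for this example.

```elisp
(defvar my/version-mismatch-fatal nil
  "Hypothetical switch: when non-nil, a version mismatch signals an error.")

(defun my/check-version (expected actual)
  "Return t if EXPECTED and ACTUAL version strings match.
On a mismatch, warn and return nil, or signal an error when
`my/version-mismatch-fatal' is non-nil."
  (if (equal expected actual)
      t
    (let ((msg (format "Version mismatch: expected %s, found %s"
                       expected actual)))
      (if my/version-mismatch-fatal
          (error "%s" msg)
        ;; Non-fatal path: report the problem and let the caller continue.
        (display-warning 'my-version msg :warning)
        nil))))
```

With a check of this shape, code that requires a library after a mixed-version load would see a warning and keep running instead of being terminated.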

I'm not going to write anything else at the moment because I've just
spent the last 3+ hours trying to deal with this and am in an
extremely uncharitable mood.

Tom



Re: [PATCH] Delete some Emacs 24 compat code

2022-08-09 Thread Tom Gillespie
> The manual actually says
>
>   "If this exists, it names packages on which the current package
>   depends for proper operation."
>
> so I think it is reasonable to only list the minimum supported Emacs
> version, not the minimum version where it partially or fully works, but
> is not supported.

The weasel words here are "proper operation" because it covers everything
from "will byte compile without errors (but maybe with warnings)" to "has
zero bugs and will never fail under any circumstances." My interpretation
is that Package-Requires means "will byte compile without errors" because
all software has bugs. Unfortunately package metadata doesn't seem to
have another field for something like Package-Supported-Version.

> Problem I see with your approach is there will be an expectation that if
> it lists Emacs 25.x that it works under that version and anything which
> doesn't work is a bug. People will not check this list, README or NEWS
> files to verify what version of Emacs is compatible with - if they can
> use package.el to install it, they will expect that it works without any
> issues and any encountered are either a configuration error or a bug.

I agree that this is an issue. I think the easiest solution would be to add
something to org-submit-bug-report which would inform the user that they
are running a version of org that is too new for their version of emacs and
is thus unsupported.

Another solution would be for package metadata to deconflate "will
immediately fail if you try to run this on old versions" from "only make
this update available to users running emacs at or above this version."
I think this is a variant of your suggestion to make upgrading to unsupported
versions more difficult but not impossible.

> Even worse, once a problem with (for example) Emacs 25.x is found, what
> do we do? Would we have to push out a new version just to now update the
> requires line and forcing an update for all users? Which commit do we
> use to push out that update (given there will have been changes since
> the last release and we may not be ready to push them out in a new
> version yet).

I don't see how a bug that is only encountered on 25 is different from
any other bug in this case. We aren't going to continue to support
25 by continuing to maintain a 9.5.x branch when we go on to 9.6,
but as long as we don't start using e.g. functions that are not
present in 25 that cause immediate runtime failures or byte compile
failures, then hard blocking users on 25 from installing from elpa at
all seems like artificially depriving users of the ability to choose at
their own risk (albeit to make the maintainers' lives easier).

> An alternative approach is to deliberately make it harder to upgrade org
> if you're running an unsupported version of Emacs. This would prevent
> automatic updates to a version which is not supported and (possibly) does
> not work, either partially or fully.  Manage user expectations by making
> it very explicit to the end user they are running an older version of
> emacs which may not be compatible with latest version of org. They can
> either decide to continue with the existing version they have installed
> or they can upgrade to a more recent Emacs or they can install org
> manually if they really want to accept the risk and run in an
> unsupported configuration.

As mentioned above, I also like this approach. We could create a hack
to work around the missing package metadata field, which would cause
a failure when trying to build on emacs < 26 unless org-i-know-what-i-am-doing
or some such is non-nil. The error message would say something along
the lines of "this version of org {org-version} will run on {emacs-version}
but it is not supported. If you still want to install it, please run
(setq org-i-know-what-i-am-doing t) and then install the package again"
or something like that.
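
A minimal sketch of such a guard, assuming the opt-in variable is literally called `org-i-know-what-i-am-doing` as floated above. The function `my/org-check-emacs-support` and its arguments are hypothetical, and nothing like this exists in Org itself.

```elisp
(defvar org-i-know-what-i-am-doing nil
  "Hypothetical opt-in flag for installing Org on an unsupported Emacs.")

(defun my/org-check-emacs-support (minimum &optional current)
  "Signal an error if CURRENT is older than the MINIMUM Emacs version.
CURRENT defaults to `emacs-version'.  The error is skipped when
`org-i-know-what-i-am-doing' is non-nil.  Return t otherwise."
  (let ((current (or current emacs-version)))
    (when (and (version< current minimum)
               (not org-i-know-what-i-am-doing))
      (error "Org is not supported on Emacs %s (need %s or newer); %s"
             current minimum
             "set `org-i-know-what-i-am-doing' to t to install anyway"))
    t))
```

Calling `(my/org-check-emacs-support "26.1")` near the top of the build would block unsupported installs by default while still leaving an explicit escape hatch.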

Best!

Tom



Re: [PATCH] Delete some Emacs 24 compat code

2022-08-09 Thread Tom Gillespie
> Please, keep ";; Package-Requires: " version in org.el consistent with
> such statement (Should it be updated for the bugfix branch as well?).

Unfortunately it is not clear that this is the right thing to do because
nearly every feature of org may work on old versions. Should we put
users through the pain of having to fight the metadata saying that they
can't run org on an old version of emacs when only a tiny subfeature
may or may not be broken? For example, I can load the current
version of org and go through most of my normal workflows without
issue on 25.

Package-Requires does not mean what it says; what it actually means
is "actively does not work on any versions not specified", which is not
true if we were to say >=26 and would make life harder for users of
older versions of emacs. What this means is that we could say >=25
(which is what org.el currently has by listing 25.1) because it is possible
to load current versions of org-mode on 25 but not on 24 (which works
only at 9.4.6 at 652430128896e690dc6ef2a83891a1209094b3da).



Re: [PATCH] ol-man.el (org-man-open): Set window point not buffer point

2022-08-09 Thread Tom Gillespie
> (while (process-live-p process)
>   (accept-process-output process)))

When I tried this before it didn't work, but now it does, I
must have missed something. Patch updated accordingly.

The order in which the man.el code does things is supremely
confusing, but I think when accept-process-output returns that
means the process sentinel has finished its final run and the
man buffer is fully populated so it is safe to search.
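
For reference, a small self-contained sketch of blocking on an asynchronous process until its output has been read. It does not reproduce what man.el does with its sentinel; it only shows the waiting idiom. The Elisp manual notes that looping on `process-live-p` can miss buffered output and suggests looping on the return value of `accept-process-output` instead, which is the form used here. The function name `my/process-output-to-string` is hypothetical, and the usage example assumes an `echo` executable is available.

```elisp
(defun my/process-output-to-string (program &rest args)
  "Run PROGRAM with ARGS asynchronously and return all of its output.
Block until the process has exited and its output has been read."
  (with-temp-buffer
    (let* ((process-connection-type nil) ; use a pipe rather than a pty
           (process (apply #'start-process "my-example" (current-buffer)
                           program args)))
      ;; Replace the default sentinel, which would otherwise insert a
      ;; status line into the buffer when the process finishes.
      (set-process-sentinel process #'ignore)
      ;; Returns nil only once the process is gone and no buffered
      ;; output remains to be read.
      (while (accept-process-output process))
      (buffer-string))))
```

For example, `(my/process-output-to-string "echo" "hello")` returns whatever `echo` wrote, after the process has finished.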

> Also, compiling the patch yields

No byte compiler errors now, and I think I got all the formatting issues.
From 848d6fc9bd395d7d45f14af71c4df8ea44ed7b4c Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Thu, 28 Jul 2022 23:33:22 -0700
Subject: [PATCH] ol-man: Set window point not buffer point and wait before
 search

* lisp/ol-man.el (org-man-open): Set window point not buffer point and
wait before search.  When passed man:path::SEARCH `org-man-open' uses
`search-forward' to jump to the location of e.g. a heading.  Prior to
this fix it only used `search-forward', which will not change the
point of the cursor in the window, meaning that even if there is a
match it will not appear.  Use `accept-process-output' to block until
the manpage finishes rendering before searching the buffer so that
there will be something to find.
---
 lisp/ol-man.el | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lisp/ol-man.el b/lisp/ol-man.el
index aa22964c5..24e896f30 100644
--- a/lisp/ol-man.el
+++ b/lisp/ol-man.el
@@ -43,12 +43,22 @@ If PATH contains extra ::STRING which will use `occur' to search
 matched strings in man buffer."
   (string-match "\\(.*?\\)\\(?:::\\(.*\\)\\)?$" path)
   (let* ((command (match-string 1 path))
-	 (search (match-string 2 path)))
-    (funcall org-man-command command)
+         (search (match-string 2 path))
+         (buffer (funcall org-man-command command)))
     (when search
-      (with-current-buffer (concat "*Man " command "*")
-	(goto-char (point-min))
-	(search-forward search)))))
+      (with-current-buffer buffer
+        (goto-char (point-min))
+        (unless (search-forward search nil t)
+          (let ((process (get-buffer-process buffer)))
+            (while (process-live-p process)
+              (accept-process-output process)))
+          (goto-char (point-min))
+          (search-forward search))
+        (forward-line -1)
+        (let ((point (point)))
+          (let ((window (get-buffer-window buffer)))
+            (set-window-point window point)
+            (set-window-start window point)))))))
 
 (defun org-man-store-link ()
   "Store a link to a README file."
-- 
2.35.1



Re: [PATCH] ol-man.el (org-man-open): Set window point not buffer point

2022-08-08 Thread Tom Gillespie
Hi Ihor,
   Here is an updated patch. We can't use accept-process-output
because it doesn't seem to block in the way we need, or it blocks
exactly long enough for the process to finish but then continues
immediately to search instead of allowing the function that fills
the buffer to complete. Instead I use sleep-for a shorter time and
process-live-p which gives better results. I think I got the commit
message formats right this time. Best!
Tom
From 2db2ce6d83b27fcf6366183cbd8b5fa79fcbc4a7 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Thu, 28 Jul 2022 23:33:22 -0700
Subject: [PATCH] ol-man: Set window point not buffer point and wait before
 search

* lisp/ol-man.el (org-man-open): Set window point not buffer point
When passed man:path::SEARCH org-man-open tries to use search-forward
to jump to the location of e.g. a heading. Prior to this fix it only
used search-forward, which will not change the point of the cursor in
the window, meaning that even if there is a match it will not appear.
Uses process-live-p and sleep-for to wait until the manpage finishes
rendering before searching the buffer so that there will be something
to find.
---
 lisp/ol-man.el | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lisp/ol-man.el b/lisp/ol-man.el
index aa22964c5..8633fe5cb 100644
--- a/lisp/ol-man.el
+++ b/lisp/ol-man.el
@@ -43,12 +43,22 @@ If PATH contains extra ::STRING which will use `occur' to search
 matched strings in man buffer."
   (string-match "\\(.*?\\)\\(?:::\\(.*\\)\\)?$" path)
   (let* ((command (match-string 1 path))
-	 (search (match-string 2 path)))
-    (funcall org-man-command command)
+         (search (match-string 2 path))
+         (buffer (funcall org-man-command command)))
     (when search
-      (with-current-buffer (concat "*Man " command "*")
-	(goto-char (point-min))
-	(search-forward search)))))
+      (with-current-buffer buffer
+        (goto-char (point-min))
+        (unless (search-forward search nil t)
+          (let ((process (get-buffer-process buffer)))
+            (while (process-live-p process)
+              (sleep-for 0.01)))
+          (goto-char (point-min))
+          (search-forward search))
+        (previous-line)
+        (let ((point (point)))
+          (let ((window (get-buffer-window buffer)))
+            (set-window-point window point)
+            (set-window-start window point)))))))
 
 (defun org-man-store-link ()
   "Store a link to a README file."
-- 
2.35.1



[PATCH] ol-man.el (org-man-open): Set window point not buffer point

2022-07-29 Thread Tom Gillespie
Here's a patch to fix the follow behavior for ol-man links so
that the ::SEARCH functionality will actually work. Best!
Tom
From 2c3e3b994fd7b47a6e91d147d2b1f08cd97a1908 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Thu, 28 Jul 2022 23:33:22 -0700
Subject: [PATCH] * lisp/ol-man.el (org-man-open): Set window point not buffer
 point

When passed man:path::SEARCH org-man-open tries to use search-forward
to jump to the location of e.g. a heading. Prior to this fix it only
used search-forward, which will not change the point of the cursor in
the window, meaning that even if there is a match it will not appear.

Use sleep-for as a horrible hack to work around the fact that the man
command runs in the background with no way to synchronize back.
---
 lisp/ol-man.el | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/lisp/ol-man.el b/lisp/ol-man.el
index aa22964c5..5843cd5f6 100644
--- a/lisp/ol-man.el
+++ b/lisp/ol-man.el
@@ -43,12 +43,20 @@ If PATH contains extra ::STRING which will use `occur' to search
 matched strings in man buffer."
   (string-match "\\(.*?\\)\\(?:::\\(.*\\)\\)?$" path)
   (let* ((command (match-string 1 path))
-	 (search (match-string 2 path)))
-    (funcall org-man-command command)
+         (search (match-string 2 path))
+         (buffer (funcall org-man-command command)))
     (when search
-      (with-current-buffer (concat "*Man " command "*")
-	(goto-char (point-min))
-	(search-forward search)))))
+      (with-current-buffer buffer
+        (goto-char (point-min))
+        (unless (search-forward search nil t)
+          (sleep-for 0.75) ; async, can't block, no callback
+          (goto-char (point-min))
+          (search-forward search))
+        (previous-line)
+        (let ((point (point)))
+          (let ((window (get-buffer-window buffer)))
+            (set-window-point window point)
+            (set-window-start window point)))))))
 
 (defun org-man-store-link ()
   "Store a link to a README file."
-- 
2.35.1



Re: Links to javascript-based websites from orgmode.org: Paypal and Github

2022-06-27 Thread Tom Gillespie
> GNU packages should not steer people towards running nonfree software.
> As a consequence, they should not suggest people donate using payment services
> that _require_ the donor to run a nonfree program.

A slight variant of Ihor's question.

While GNU packages should not steer people toward nonfree software,
I assume that there is nothing that prohibits GNU contributors from
accepting donations via non-free systems.

This thread suggests that there is no other option if devs also do not want
to steer people toward cryptocurrencies (which some consider to be as
ethically important as not steering people toward nonfree software).

My question is whether the website for a GNU package can include links
to the websites of individual developers with a note that you can provide
financial support to the project by supporting individuals. In the end the
user still winds up using nonfree JS, but is GNU living up to its principles
by virtue of the extra layer of indirection?

Given that https://www.fsf.org/about/ways-to-donate/ does include paypal
as an option, with a disclaimer, is a disclaimer not a sufficient solution for
GNU packages as well?



Re: We have asynchronous sessions, why have anything else?

2022-06-27 Thread Tom Gillespie
> I am not even sure if all the babel backends support try-except.
> Think about ob-gnuplot or, say, ob-latex.

Indeed many do not. Defining some standard "features"
for org babel language implementations is something that
is definitely of interest so that we can provide clear interfaces
for things like stdio, error handling, return values, async,
file output, remote execution, sessions, return value caching,
module discovery/tangling, execution from file vs stdin, execution
without a file system path, runtime environment specification,
and much more. However, at the moment there is only a preliminary
survey of a subset of these that was put together by Ian Martins.

https://orgmode.org/worg/org-contrib/babel/languages/lang-compat.html

> the two could be unified if we expand the functionality of the async filter

While this might be possible, I would definitely hold off on this because
the changes in semantics absolutely will break many users' blocks. We
barely knew what the impact of changing the default return value for shell
blocks would be.

I absolutely look forward to the day when this can be done safely and
with confidence, but I think we need a much stronger handle on babel
interfaces in general before such a change could even be considered.

At the moment each ob lang impl pretty much has to be considered
to be completely unique, even if the text looks like bash for e.g.
shell, comint, and screen. Users are known to rely on undocumented
quirks of the ob lang impls that can differ wildly in their semantics.

Best!
Tom



Re: Org mode and Emacs (was: Convert README.org to plain text README while installing package)

2022-06-16 Thread Tom Gillespie
Having read the whole thread now: oof. Thank you Ihor for shepherding
that and for the performance improvements!

With regard to the key-bindings straw man. I guess I'm a bit of an
outsider on this one, because I started writing org documents by just
typing them in and only over time learning some of the bindings. Maybe
having an org-markup-mode or something like that would be a way to
provide a sandbox for the +recalcitrants+ newcomers? It might also be
a nice way to a/b test them on whether the Emacs editing commands
really are as good as they think they are (said the evil-mode user).

With regard to ... everything else. I guess at this point it is
unsurprising that (for lack of a better term) the uninitiated in the
dark corners of org syntax frequently think that syntactic extensions
are advisable, skipping over the consideration of possible.

Given the opportunities that seem to be lurking in the thread, it
seems like it would be good to have some examples of how the
e.g. texinfo semantic markup could (or could not) be implemented using
existing org syntax. The suggestion to use custom link types seems
very practical. It requires no new syntax, and is basically fully
extensible for semantic markup needs.
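
As a rough illustration of the custom link type suggestion, the sketch below registers a hypothetical "var" link type whose export function emits semantic markup per backend. `org-link-set-parameters` and its `:export` property are part of Org's link library; the link type name, the function `my/org-var-link-export`, and the choice of output markup are assumptions made up for this example.

```elisp
(require 'ol) ; Org's link library, shipped with Org 9.3 and later

(defun my/org-var-link-export (path desc backend &optional _info)
  "Export a hypothetical \"var:\" link as semantic variable markup.
PATH is the variable name, DESC the optional description, and
BACKEND the symbol naming the export backend."
  (let ((text (or desc path)))
    (pcase backend
      ('texinfo (format "@var{%s}" text))
      ('html (format "<var>%s</var>" text))
      ('latex (format "\\textit{%s}" text))
      ;; Any other backend: fall back to the plain text.
      (_ text))))

(org-link-set-parameters "var" :export #'my/org-var-link-export)
```

In a document this would be written as `[[var:count]]`. The `pcase` only matches exact backend symbols, so backends derived from these would fall through to plain text in this sketch.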

I say this having recently spent time reworking the paragraph grammar
and the lexer needed to enable it in laundry for the 3rd (or is it
4th?) time. Say it with me: No new syntactic forms! We have more than
enough syntax to enable all the extensibility that pretty much anyone
will ever need (we just have to document how to use it).

In-document extensibility of link types might be possible if we get my
regularized keyword syntax implemented. If that were done then all the
configuration could in principle live in a setup file (I have a
response on the syntax thread drafted, will try to get back to it).

Nesting markup inside code or verbatim seems more difficult because
they are intentionally terminal. I am also unfamiliar with texinfo so
will be of no help with the examples, but I do look forward to them.

Best!
Tom



Re: [DISCUSSION] Refactoring fontification system

2022-06-07 Thread Tom Gillespie
> As for lang parameter support in example blocks, would you mind creating
> a separate feature request thread? Extending export blocks export will
> require changing in parser syntax and thus should be discussed carefully
> in a separate thread.

I would strongly caution against allowing an optional #+begin_example lang
syntax. It will lead to extreme confusion, even when users know to use org-lint.
The reason for this is that example blocks do not have (and frankly should not
have) full org-babel support. Babel is already complex enough as is without
having to explain to a user that yes they can noweb an example block into
a src block, but that they cannot noweb a source block into an example block.

One of the most powerful features of src blocks is that they can go from being
dumb examples all the way up to fully executable programs. Example blocks
cannot do that, and adding features that overlap with code blocks is inviting
duplicated effort and will confuse and frustrate users if they have the
misfortune to start with an example block and then have to change midway
through to a code block.

I also think that adding a parameter #+begin_example :lang bash to example
blocks will lead to confusion because now there are two different ways
to specify what lang a block is. To me the answer should be to just use source
blocks if you need highlighting, example blocks should not highlight at all in
order to make the distinction clear.

Best,
Tom



Re: [PATCH] Re: tangle option to not write a file with same contents?

2022-06-06 Thread Tom Gillespie
I can report that with the current changes in the tree
I see some nice performance improvements in files
where I have large numbers of blocks where I modify
a subset of them (beyond a single case where C-u
C-c C-v C-t works) and then retangle the whole file.
Best,
Tom



Re: [BUG] 67275f4 broke evil-search Re: [PATCH 10/35] Implement link folding

2022-05-27 Thread Tom Gillespie
The workaround from the other thread to
(setq org-fold-core-style 'overlays) is perfect.

> The whole point of the patch is _not_ using overlays. For performance
> reasons.

Yep,  the workaround is sufficient for now, and the note on
performance for large files in the docstring makes it clear
what the tradeoffs are, and why we want the text properties
to be the default. No need to "restore" the old behavior since
it is just a setq away.

> Note that if evil were to comply with the canonical isearch
> implementation and respect isearch-mode-end-hook, there would be no
> issue.

I think we might want to update the documentation to mention
the issue with evil for now, and alert the evil devs about this change.
Then we can approach them about implementing support for
searching inside invisible regions marked via text properties
since that is essentially a new feature that is being added to
org for 9.6, though one that will be on by default. The evil-search
module doesn't seem to support _any_ of the isearch hooks needed
but while looking into this I think I know generally where it might be
possible to add them.

Thanks!
Tom



Re: org-persist-gc and tramp

2022-05-27 Thread Tom Gillespie
> Off topic: Did you report the issue to evil devs?

Not yet. Needed to understand what is going on.

> alternative workaround could be setting org-fold-core-style to
> 'overlays.

Yes! This fixes the issue, and is consistent with my observations
in the other thread (I will respond with more there).



Re: [BUG] markdown blocks remain visible when they should be folded

2022-05-27 Thread Tom Gillespie
Confirming fixed. Many thanks!
Tom



Re: [BUG] 67275f4 broke evil-search Re: [PATCH 10/35] Implement link folding

2022-05-27 Thread Tom Gillespie
> It appears to respect isearch-filter-predicate, but not
> isearch-mode-end-hook.

This is true only when isearch is used as the module via
(evil-select-search-module 'evil-search-module 'isearch),
and indeed, when using evil search in that way headings
no longer refold. When using evil-search, things won't
even unfold.

I think that I have tracked the issue down to
evil-ex-find-next in a call to isearch-range-invisible
which returns nil for commits < 67275f4, and t for >=.
When isearch-range-invisible returns nil the invisible
overlay is made visible, when it returns t it stays closed.

Might restoring the invisible overlay text property restore
the old behavior? Is there a reason it was removed?

Best,
Tom



[BUG] markdown blocks remain visible when they should be folded

2022-05-27 Thread Tom Gillespie
In one of the commits between
ffdc508429c58716272743c0e0650bb721fd906a (good) and
67275f4664ce00b5263c75398d78816e7dc2ffa6 (bad)
a change was introduced that broke folding for markdown
blocks. I'm not sure of the exact commit because folding
is completely broken for all the commits in between.

Looking at the diff between the two there are so many
changes involving the invisible text property that I have no
idea which one is the culprit.

The markdown blocks remain visible and have the invisible
text property set to nil, when folded #+end_src will turn into
... but the markdown remains visible. The whole block will
be left open if any of the markdown list or heading chars are
present in the block.

Here are some examples that trigger the issue:

* a
#+begin_src markdown
1.
#+end_src
* b
#+begin_src markdown
-
#+end_src
* c
#+begin_src markdown
# hello
#+end_src



Re: org-persist-gc and tramp

2022-05-27 Thread Tom Gillespie
> Can you confirm that you are using the latest version of Org?

I was running a version from back in December due to the evil
search issues I was having. Updating to the latest version
does indeed resolve the org-persist issue.

Thanks!
Tom



org-persist-gc and tramp

2022-05-27 Thread Tom Gillespie
While debugging an unrelated issue I noticed that
tramp was initiating a new connection every time
save-buffers-kill-emacs ran. I eventually tracked it
down to org-persist-gc running on kill-emacs-hook.

It turns out that org-persist--index held a record
for a remote file, and the transparency of tramp
means that emacs will happily open any number
of remote connections to check whether such files
exist on exit. This could produce unexpected and
quite bad behavior for users that edit many remote
org files. It also leaks information that a user has
closed emacs over the network even if they have
not intentionally made any remote connections.

I see a couple potential fixes.
- skip gc for remote files
- never persist remote files
- +only gc remote files when their cache is expired+

I think the last option might be the most reasonable
compromise? Er, nope, I was looking at an old/variant
cache location that still had :expiry values in the plist,
but it looks like the most recent version does not, so
we are left with option 1 or option 2.

Given that the network latency is likely to dominate
accessing such files the time to reparse if we don't
persist (option 2) seems like it will be small?

I think for option 1 to work safely there would need
to be a way to periodically gc remote files, maybe
when another file on that remote was accessed so
an existing tramp connection would be used?
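
A minimal sketch of the "skip gc for remote files" idea (option 1), assuming the garbage collector has a plain list of file names to check. `file-remote-p` only inspects the file name and never opens a new connection, so filtering with it should not trigger tramp. The function name `my/persist-local-files` is hypothetical and not part of org-persist.

```elisp
(require 'seq)

(defun my/persist-local-files (files)
  "Return the members of FILES that are not remote.
Only the file names are inspected, so no remote connection is opened."
  (seq-remove #'file-remote-p files))
```

A gc pass could then call `file-exists-p` only on the returned local names and leave records for remote files untouched until a connection to that host already exists.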

Thoughts?
Tom



[BUG] 67275f4 broke evil-search Re: [PATCH 10/35] Implement link folding

2022-05-04 Thread Tom Gillespie
Hi Ihor,
It seems that this patch (as commit
67275f4664ce00b5263c75398d78816e7dc2ffa6, found using git bisect to
hunt down the issue) breaks search in evil mode when
(evil-select-search-module 'evil-search-module 'evil-search) is set.
The broken behavior is that evil-search no longer searches inside
folded headings. I had a quick look at the changes but couldn't figure
out why these changes might cause the issue. Best,
Tom



Re: Suggestion: convert dispatchers to use transient

2022-02-03 Thread Tom Gillespie
The backward compatibility requirements for org mean that it won't be
possible to replace the existing implementation
for quite a while. That said, I imagine that having optional transient
dispatchers for users on newer versions of emacs would be appreciated.
Best,
Tom



Re: [BUG] Make SVG + LaTeX work by default [9.5.2 (release_9.5.2-9-g7ba24c @ /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/org/)]

2022-01-30 Thread Tom Gillespie
I do not think we can add -shell-escape by default because it
is an arbitrary code execution vector. It might be good to add
a setting in org that would do the right thing without requiring
a user to understand the arcana of latex cli options though.
Best,
Tom



Re: [PATCH] Add support for $…$ latex fragments followed by a dash

2022-01-26 Thread Tom Gillespie
> The change is local and minor.
We can't know that. Consider for example someone that has
the following line somewhere in their files.
#+begin_src org
I spent $20 on food and was paid$-10 dollars by friends so
I am down $10.
#+end_src
Yes =paid$-10= is probably a typo that should have a space
in between, but it could still be in a file and cause an issue.
The more likely case would be of someone that has $ in the
name of a variable that also uses dashes. For example if
I have a list of variable names such as
#+begin_src org
Text a $A_BASH_VAR
Text b some-$-lisp-var
#+end_src

The proposed change would break any file with a pattern like
this.

We have no way of seeing every org file that users have
written so we don't know the extent of the impact, and thus
have to assume that there would be some impact. Making
such a change with an unknown blast radius in the midst of
considering removing support for that syntax altogether is
inviting disaster.

Best,
Tom



Re: [PATCH] Add support for $…$ latex fragments followed by a dash

2022-01-25 Thread Tom Gillespie
> The attached patch adds support for $…$ latex fragments followed by a
> dash, such as $n$-th.

Unfortunately this falls into the realm of changes to syntax. The current
behavior is not a bug and is working as specified because hyphen minus
(U+002D) does not count as punctuation for the purposes of org syntax.
We should specify which chars count as punctuation in the syntax doc.
As noted by Eric \(\) has no such restrictions.

From https://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
> POST is any punctuation (including parentheses and quotes) or space
> character, or the end of line.

Best,
Tom



Re: call blocks as a function from inside elisp code

2022-01-19 Thread Tom Gillespie
Hi George,
Here is an example of how I call nested elisp and python. The
python block is an input argument to the elisp block in this case, but
the python block could be called directly as well. I'm not sure how to
pass arguments to the block from inside elisp via org-babel-eval
though, that seems like it would require some deeper
tampering/advising of functions. Best,
Tom

https://github.com/SciCrunch/sparc-curation/blame/master/docs/queries.org#L1704-L1707
#+begin_src elisp :results none :exports none
(ow-babel-eval "neru-simplified")
#+end_src

The implementation I use is included below and is sourced from
https://github.com/tgbugs/orgstrap/blob/bc981b957967be8d872c08be9ba7f2dbde5caf1d/ow.el#L786-L803

(defun ow-babel-eval (block-name &optional universal-argument)
  "Use to confirm running a chain of dependent blocks starting with BLOCK-NAME.
This retains single confirmation at the entry point for the block."
  ;; TODO consider a header arg for a variant of this in org babel proper
  (interactive "P")
  ;; FIXME TODO set messages buffer size to nil
  (let ((org-confirm-babel-evaluate (lambda (_l _b) nil)))
    (save-excursion
      (when (org-babel-find-named-block block-name)
        ;; goto won't raise an error which results in the block where
        ;; `ow-confirm-once' is being used being called an infinite
        ;; number of times and blowing the stack
        (org-babel-goto-named-src-block block-name)
        (unwind-protect
            (progn
              ;; FIXME optionally raise errors on failure here !?
              (advice-add #'org-babel-insert-result :around
                          #'ow--results-silent)
              (org-babel-execute-src-block))
          (advice-remove #'org-babel-insert-result #'ow--results-silent))))))

(defun ow--results-silent (fun &rest args)
  "Whoever named the original version of this has a strange sense of humor."
  ;; so :results silent, which is what org babel calls between vars
  ;; set automatically is completely broken when one block calls another
  ;; there likely needs to be an internal gensymed value that babel blocks
  ;; can pass to each other so that a malicious user cannot actually silence
  ;; values, along with an option to still print, but until then we have this
  (let ((result (car args))
        (result-params (cadr args)))
    (if (member "silent" result-params)
        result
      (apply fun args))))



Re: Problem when tangling source blocks with custom coderefs

2022-01-18 Thread Tom Gillespie
Hi Luis,
   I don't think you are doing anything wrong. IIRC the portion of the
patch that allowed the customization to propagate to the tangled code
was not included. Given that I am no longer the only one who is
looking for/expecting this behavior, maybe it is worth revisiting the
decision. The simplest fix right now would be to prepend your coderef
with the python comment symbols # |hello| so that at the very least it
won't break your tangled files. I would like to see this implemented,
so let's see what Nicolas has to say. Best!
Tom



Re: Org Syntax Specification

2022-01-18 Thread Tom Gillespie
Hi Ihor,
  Thank you very much for the detailed responses. Let me start with
some context.

1. A number of the comments that I made fall into the brainstorming
   category, so they don't need to make their way into the document at
   this time. I agree that it is critical for this document to capture
   how org is parsed right now and that we should not put the
   pie-in-the-sky changes in until the behavior of org-element matches
   (if such a change is made at all).
2. Though I haven't been hacking on it, I fully intend to contribute
   test cases and exploratory work on org-element in the future, so
   please don't interpret some of what I am writing as requests for
   other people to write code (unless they want to :)
3. When I say grammar in this context I mean specifically an eBNF that
   generates a LALR(1) or LR(1) parser. This is narrower than the
   definition used in the document, which includes things that have to
   be implemented in the tokenizer, or in a pass after the grammar has
   been applied, or are related to some other aspect beyond the pure
   surface syntax.
4. A number of my comments are about the structure of the document
   more than the structure of the syntax or the implementation. I
   think that most of them are trying to ask whether we want to
   clearly delineate pure surface syntax from semantics to make the
   document easier to understand.

More replies in line.
Best!
Tom

> As for your other comments, you seem to be suggesting a number of
> changes to the existing Org syntax. Some of them looks fine, some are
> not. However, please keep in mind that we have to deal with back
> compatibility, third party compatibility, and not breaking existing Org
> documents unless we have a very strong justification. I suggest to
> branch a number of new threads from here for each concrete suggestion
> where you want to make changes to Org syntax, as opposed to just
> document wording. Otherwise, this discussion will become a total mess.

Agreed. I put many of these in here as notes from my experiences, I
will branch those off into separate discussions so that we don't
pollute this thread.

> Nope. Sections are actually elements. See =org-element-all-elements=.

I realized this at a slightly later date but missed cleaning up this
comment.  See my response on section vs segment below.

> I disagree. Nesting rules are the important part of syntax. We have
> restrictions on what elements can be inside other element. The same
> patterns are not recognised in Org depending on their nesting. For
> example, links that you put into property drawers are not considered
> link objects.

When I wrote this comment I was still confused about sections. I think
discussion of nesting in most contexts is ok, but there are some cases
where nesting cannot be determined from the grammar, and there I think
we need to make a distinction.

In my thinking I separate the context sensitive nature of parsing from
the nesting structure of the resulting s-expressions, org elements,
etc. The most obvious example of this is that the s-expression
representation for headings nests based on the level of the heading,
but heading level cannot be determined by the grammar so it must be
reconstructed from a flat sequence of headings that have varying level.
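
As a rough illustration of that two-step view, here is a Python sketch
(not how org-element does it). The function names and the dict-based
node shape are made up for the example. The first pass only recognizes
heading lines and counts asterisks, the second pass rebuilds the
nesting from the resulting flat sequence.

```python
import re

# A heading line: one or more asterisks, a space, then the title.
HEADING_RE = re.compile(r"^(\*+) (.*)$")


def scan_headings(text):
    """First pass: produce a flat list of (level, title) pairs."""
    flat = []
    for line in text.splitlines():
        m = HEADING_RE.match(line)
        if m:
            flat.append((len(m.group(1)), m.group(2)))
    return flat


def nest_headings(flat):
    """Second pass: rebuild the tree from the flat (level, title) list."""
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for level, title in flat:
        node = {"level": level, "title": title, "children": []}
        # Pop until the top of the stack is a strictly shallower heading.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


doc = "* A\n** B\n*** C\n** D\n* E\n*** F"
tree = nest_headings(scan_headings(doc))
print([node["title"] for node in tree])
```

Note that F, a level 3 heading directly under the level 1 heading E,
simply becomes a child of E. Nothing in the first pass knows or cares
about that relationship.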

> Again I disagree. While your idea about table cells is reasonable
> (similar for citation-references inside citations), I am against
> decoupling Org syntax from org-element implementation. In
> org-element.el, table-cells are just yet another object. If we make
> things in org-element and syntax document out of sync, confusion and
> errors will follow during future maintenance.

Org element treats all elements and objects as a single homogeneous
type.  This is fine. However, to help people understand the syntax it
seems easier to define things in a positive way so that we don't say
"all except these two."  Therefore, despite the fact that the
implementation of org-element treats table rows and cells no differently
from any other node in the parse tree, we don't need to burden the
reader with that information at this point in time, and could provide
that information as an implementation note for cells.  I think the
other issue I was having here is that the spec for tables is spread
all over the place, and it would be much easier to understand and
implement if it were all in one place.

> This actually reads slightly confusing. "Blank lines separate paragraphs
> and other elements" sounds like blank lines are only relevant
> before/after paragraphs. However, there are also footnote references and
> lists. Maybe we can try something like:
>
> Blank lines can be used to indicate end of some elements.
>
> "can" because a single blank line usually does not separate anything.

I think your version is quite a bit more readable.  Can we list the
set of all the elements that can be ended by a new line as well as
those that cannot (iirc they are elements such as footnotes that can
only 

Re: Org Syntax Specification

2022-01-17 Thread Tom Gillespie
Hi Timothy,
I have attached a patch with some modifications and a bunch of
comments (as footnotes). More replies in line. Thank you for all your
work on this!
Tom

> Marking this as depreciated would have no effect on Org’s current behaviour, 
> but we could:
>
> Mark as depreciated now-ish
> Add a utility to convert from TeX-style to LaTeX-style
> Add org lint/fortification warnings
> A while later (half a decade? more?) actually remove support

In favor of this. There are good alternatives for this now.

> The other component of the syntax which feels particularly awkward to me is 
> source block switches. They seem a bit odd, and since arguments exist, 
> completely redundant.

Extremely in favor of removing switches. There are so many better ways
to do this now that aren't like some eldritch unix horror crawling up
out of the abyss and into the eBNF :)
From 3527331f02e593ec6ba6cb4c8bde3f64de3ad216 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Mon, 17 Jan 2022 19:34:21 -0500
Subject: [PATCH] Tom's comments and modifications to org syntax edited

I removed any mention of markdown because it is a distraction in this
document and is not something we want anyone attending to here.

I changed "top level section" to "zeroth section" which I think is more
consistent terminology because level is often used to refer to the
depth of parsing at any given point in the file and the top level
refers to anything that can be parsed without context. Zeroth makes it
clear that we are talking about the actual zeroth occurrence of a
section in a file/buffer/stream.
---
 dev/org-syntax-edited.org | 399 +++---
 1 file changed, 331 insertions(+), 68 deletions(-)

diff --git a/dev/org-syntax-edited.org b/dev/org-syntax-edited.org
index c3259473..2e99070d 100644
--- a/dev/org-syntax-edited.org
+++ b/dev/org-syntax-edited.org
@@ -19,9 +19,7 @@ under the GNU General Public License v3 or later.
 Org is a plaintext format composed of simple, yet versatile, forms
 which represent formatting and structural information.  It is designed
 to be both intuitive to use, and capable of representing complex
-documents.  Like [[https://datatracker.ietf.org/doc/html/rfc7763][Markdown]], Org may be considered a lightweight markup
-language.  However, while Markdown refers to a collection of similar
-syntaxes, Org is a single syntax.
+documents.
 
 This document describes and comments on Org syntax as it is currently
 read by its parser (=org-element.el=) and, therefore, by the export
@@ -32,14 +30,13 @@ framework.
 ** Objects and Elements
 
 The components of this syntax can be divided into two classes:
-"[[#Objects][objects]]" and "[[#Elements][elements]]".  To better understand these classes,
-consider the paragraph as a unit of measurement.  /Elements/ are
-syntactic components that exist at the same or greater scope than a
-paragraph, i.e. which could not be contained by a paragraph.
-Conversely, /objects/ are syntactic components that exist with a smaller
-scope than a paragraph, and so can be contained within a paragraph.
-
-Elements can be stratified into "[[#Headings][headings]]", "[[#Sections][sections]]", "[[#Greater_Elements][greater
+"[[#Elements][elements]]" and "[[#Objects][objects]]".  Elements are
+syntactic components that have the same priority as or greater
+priority than a paragraph. Objects are syntactic components that are
+only recognized inside a paragraph or other paragraph-like elements
+such as heading titles.
+
+Elements are further divided into "[[#Headings][headings]]", "[[#Sections][sections]]"[fn::sections are not elements], "[[#Greater_Elements][greater
 elements]]", and "[[#Lesser_Elements][lesser elements]]", from broadest scope to
 narrowest.  Along with objects, these sub-classes define categories of
 syntactic environments.  Only [[#Headings][headings]], [[#Sections][sections]], [[#Property_Drawers][property drawers]], and
@@ -52,7 +49,12 @@ elements that cannot contain any other elements.  As such, a paragraph
 is considered a lesser element.  Greater elements can themselves
 contain greater elements or lesser elements. Sections contain both
 greater and lesser elements, and headings can contain a section and
-other headings.
+other headings. [fn:tom2:I would not discuss strata here because it is
+not related to the syntax of the document. It is related to how that
+syntax is interpreted by org mode. The strata are nesting rules that
+are independent of the syntax, and discussing that here in the syntax
+document is confusing, because the nesting is not something that can be
+parsed directly because it depends on the number of asterisks.]
 
 ** The minimal and standard sets of objects
 
@@ -60,25 +62,33 @@ To simplify references to common collections of objects, we define two
 useful sets.  The

Re: Parens matching errors in org-babel code blocks

2021-12-21 Thread Tom Gillespie
Definitely a known issue. No easy way to fix it without someone doing
a deep pass on syntax propertization I think. I have a version of
rainbow delimiters mode that tries to work around this at least for
font locking, but it is severely broken and has some nasty quadratic
performance issues in large files. I'll have to look into the proposed
solution that Tim mentions, I may have missed it (unless it was the
solution for <> that John mentions in the linked thread, in which case
that one is not sufficient). Here is a discussion from back in April.
https://lists.gnu.org/archive/html/emacs-orgmode/2021-04/msg00031.html

Best,
Tom



Re: Concrete suggestions to improve Org mode third-party integration :: an afterthought following Karl Voit's Orgdown proposal

2021-12-06 Thread Tom Gillespie
Hi all,
I have a much longer mail in the works, a quick one for now.

I think it is a major strategic mistake to exclude discussions
about interoperability from this list. As Bastien pointed out in
his talk at Emacsconf there is only a single list for both users
and developers. Discussion about interoperability with tools for
working with Org are entirely valid subjects for the user
list. Obviously help and support for other tools is not valid for
the list, but questions about interoperability or incorrectness
of some external tool should always be valid.

We must provide strong technical leadership for all tools that
want to work with Org syntax otherwise we risk it spiraling out
of control. Forcing discussions off list will split the community
and I think the fact that Karl's work made it to this list so
late in the process shows the danger of trying to exclude certain
discussions.

I follow this list, I keep the community up to date with my work,
I have no idea where to look for other Org related discussions,
nor frankly do I have time to look for them. I suspect I am not
alone in this.

Whether a certain portion of the Org community likes it or not,
there is another portion for whom Org syntax already has a life
beyond Org mode (e.g. academic papers and computation notebook
style workflows). For some workflows documents written in Org
syntax are a primary exchange format and format of record, not
just an internal format from which documents for sharing are
generated. The plain text nature of Org syntax and the freedom
that it enables also means freedom from Emacs. Empowering users
to own and control their own data to use with their own tools is
the whole point. The fact that this means that it works outside
Emacs is a critical feature for many data preservation use cases.

Enough for now. Best!
Tom



Re: Org-syntax: Intra-word markup

2021-12-04 Thread Tom Gillespie
> Since org is a valid export backend though, perhaps this behaviour should be
> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
> fairly nicely.

This ends up being even more convenient than I initially realized.
The current spec for export snippets is ambiguous when it says
"NAME can contain any alpha-numeric character and hyphens"
but the implementation behavior requires that "any" means "at
least one" and is implemented using the + regex operator.

What this means is that @@:...@@ syntax is not actually used
in Org at all at the moment and renders as plain text. I agree that
we need to avoid @@org:..@@ because it has legitimate uses.
Making the empty string a valid back-end for the "parse me
separately" syntax thus makes @@ syntax more regular overall, and
allows @@:...@@ to be processed separately because it currently
never enters the export snippet processing.
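
A small Python sketch of the difference between the two readings of
"any". The patterns are my approximation of the export snippet rule
quoted above (NAME made of alphanumerics and hyphens), not a copy of
the org-element code, and the names are made up for the example.

```python
import re

# NAME must have at least one character ("+"), the behavior described above.
SNIPPET_RE = re.compile(r"@@([-A-Za-z0-9]+):(.*?)@@", re.DOTALL)
# Hypothetical variant where the empty NAME is accepted as well ("*").
SNIPPET_EMPTY_OK_RE = re.compile(r"@@([-A-Za-z0-9]*):(.*?)@@", re.DOTALL)


def snippets(text, pattern=SNIPPET_RE):
    """Return (name, value) pairs for every snippet PATTERN finds in TEXT."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]


print(snippets("@@html:<b>@@bold@@html:</b>@@"))
print(snippets("@@:/hello/@@world"))
print(snippets("@@:/hello/@@world", SNIPPET_EMPTY_OK_RE))
```

With the "+" version the @@:/hello/@@ form yields no snippet at all, so
it stays plain text. Only the "*" variant sees it, with an empty name.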

This is important because export snippets do not seem to be easily
accessible to earlier phases of the org-export machinery, i.e. there
isn't a nice centralized place to preprocess @@org:...@@ even
if we wanted to. On the other hand @@:...@@ isn't processed
at all. I could be missing something in the org export code though.

It will take a bit of work to get this behavior implemented I think,
but it doesn't seem to have any conflicts. Some users may have
set the empty backend to expand manually via
org-export-snippet-translation-alist, but as long as we give
org-export-snippet-translation-alist priority and warn people
that setting "" manually will disable the new functionality
then there shouldn't be any disruption. The behavior also sort
of matches what we would want the empty string to be in this
case, which is "all backends" and of course the only markup
that makes sense for "all backends" is org itself!

Best,
Tom



Re: Org-syntax: Intra-word markup

2021-12-04 Thread Tom Gillespie
Hi all,
After a bunch of rambling (see below if interested), I think I have
a solution that should work for everyone. The key realization is that
what we really want is the ability to have a "parse me separately"
type of syntax. This meets the intra-word syntax needs and might
meet some other needs as well.

The solution is to make @@org:...@@ a "parse me separately"
block! It nearly works that way already too! To minimize typing
we could have @@:...@@, the empty type, default to org.

This seems like a winner to me. The syntax for it already exists
and won't conflict. It requires relatively minimal additional typing,
the implication is clear, and there are other places where such
behavior could be useful.

This syntax seems like a winner to me
@@org:/hello/@@world
@@:/hello/@@world

You can also do things like
#+begin_src org
I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
#+end_src

Which would render to
#+begin_src org
I want a number in this number3word!
#+end_src
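
Here is a rough Python sketch of the splitting step only. It does not
parse or evaluate anything, the function name and segment labels are
made up for the example, and none of this exists in Org today. It just
shows that the "parse me separately" regions can be cut out before the
surrounding text is looked at.

```python
import re

# @@org:...@@ or the proposed short form @@:...@@
SEPARATE_RE = re.compile(r"@@(org)?:(.*?)@@")


def split_segments(text):
    """Split TEXT into ("text", ...) and ("separate", ...) segments."""
    segments = []
    pos = 0
    for m in SEPARATE_RE.finditer(text):
        if m.start() > pos:
            segments.append(("text", text[pos:m.start()]))
        # The contents would be handed to the parser on their own.
        segments.append(("separate", m.group(2)))
        pos = m.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


print(split_segments("@@:/hello/@@world"))
print(split_segments("number@@org:src_elisp{(+ 1 2)}@@word!"))
```

Each "separate" segment would then be parsed as Org by itself and the
results joined back together with no space in between.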

Thoughts?

Best!
Tom

--- rambling below -


> This idea reminds me a bit of Scribble/Racket where every document is
> just inverted code, which makes it possible to insert arbitrary Racket
> code in your prose...

I will say, despite some of my comments elsewhere, that I think
exploring certain features of Scribble syntax for use in Org mode
would simplify certain parts of the syntax immensely.

For example
various inline blocks are an absolute pain to parse because they
allow nested delimiters /if they are matched/. The implementation
of the /if they are matched/ clause is currently a nasty hack which
generates a regular expression that can only actually handle nesting
to depth 3. Actually implementing the recursive grammar adds a lot
of complexity to the syntax and is hard to get right.
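
To illustrate the general problem (this is a Python sketch, not the
regexp Org actually generates, and the function names are made up): a
regular expression for matched braces has to be unrolled to some fixed
depth, while a counting scanner handles any depth.

```python
import re


def balanced_re(max_nested):
    """Regex for {...} whose body nests braces at most MAX_NESTED deep."""
    inner = r"[^{}]*"
    for _ in range(max_nested):
        # Wrap the previous level in one more optional layer of braces.
        inner = r"[^{}]*(?:\{" + inner + r"\}[^{}]*)*"
    return re.compile(r"\{(" + inner + r")\}")


def match_balanced(text, start=0):
    """Return the index just past the "}" matching the "{" at START."""
    if start >= len(text) or text[start] != "{":
        return None
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i + 1
    return None  # unbalanced


three = "{a{b{c{d}}}}"
four = "{a{b{c{d{e}}}}}"
print(bool(balanced_re(3).fullmatch(three)))
print(bool(balanced_re(3).fullmatch(four)))
print(match_balanced(four) == len(four))
```

The unrolled regex accepts the first string and rejects the second,
which is one level deeper. The counter accepts both, but it is no
longer something a regular expression can express.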

It would be vastly simpler to use Scribble's |<{hello }} world}>|
style syntax and always terminate at the first matching delimiter.
I'm sure that this would break some Org files, but it would make
dealing with latex fragments and inline source blocks and inline
footnotes SO much simpler. Matching an arbitrary number of
angle brackets does add some complexity, but it is tiny compared
to the complexity of enforcing matched parens and their failure cases
especially because many of the places where nesting is required
probably only see use of the nesting feature in a tiny fraction of
all cases.
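
A minimal Python sketch of the "terminate at the first matching
delimiter" rule, assuming an opener of the form |<<{ and a mirrored
closer }>>| with the same number of angle brackets. The exact delimiter
shape and the function name are assumptions for the example, nothing
like this exists in Org.

```python
import re

# Opener: a pipe, any number of "<", then "{".
OPEN_RE = re.compile(r"\|(<*)\{")


def read_delimited(text, start=0):
    """Return (contents, end_index) for the block opening at START."""
    m = OPEN_RE.match(text, start)
    if m is None:
        return None
    # The closer mirrors the opener with the same number of brackets.
    closer = "}" + ">" * len(m.group(1)) + "|"
    end = text.find(closer, m.end())
    if end == -1:
        return None
    return text[m.end():end], end + len(closer)


print(read_delimited("|<{hello }} world}>| rest"))
print(read_delimited("|{a}| b"))
```

There is no nesting state to track. The unmatched }} inside the first
example is just text because it is not followed by >|.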

One other reason why this is attractive is that all the instances
where nested delimiters can appear on a line are preceded by
some non-whitespace character. This means that using the
pipe syntax does not conflict with table syntax!

Now the question comes. If we could implement this for
delimiters, could we also implement something similar
for markup? The issue with the proposed markup outside
delimiter inside approach is that it will change existing
behavior for files that want the delimiters to be included
in the markup, i.e. /{oops}/ becoming /oops/ is bad. A
second issue is that putting the delimiter inside the markup
cannot work for verbatim and code ={oops}= is ={oops}= no
matter what. Therefore the solution is not uniform across all
types of markup. We need another solution that works for
all types of markup.

What if we put the "start arbitrary markup" char outside
the markup? Say something like |/ital/|icks? Or what if
we went whole hog and used |{/ital/}|ics and made the
|{...}| syntax trigger a generalized feature where the
contents of the |{...}| block are parsed by themselves
and can abut any other text? This would be generally
useful in a variety of situations beyond just intra-word
markup.

What are the issues with this approach? The first issue
is that there is a conflict with table syntax if we were to
use the pipe character because markup can appear at
the start of a line. The second issue is that it might be
confusing for users if |{}| also worked like {} when in the
context of latex elements or inline src blocks, or maybe
that is ok because |{}| never renders as text. Hrm. Ok.
Second issue resolved, but what to do about the first?

If we want generalized "parse this by itself" syntax so
that we can write hello|{/world/}|ok, then we need a
solution that can appear at the start of a line. So we
can't use pipe because that is always a table line even
if a zero width space is put before it ;). What other
options do we have? How about #+|{/hello/}|world for
the start of a line? As long as there is no trailing colon
it isn't a keyword, so it could work ... except that if
someone reflows the text and it is no longer at the
start of a line then the syntax breaks. That is to say
using #+| at the start of a line is not uniform, so we
can't take that approach.

What other chars do we have at our disposal? Hrm.
How about @@? Could we use that? What happens
if we use @@org:/hello/@@world? Or maybe if we
want to minimize the number of chars we could do
@@:/hello/@@world and have the empty prefix in
@@ blocks mean org?



Re: Some commentary on the Org Syntax document

2021-12-03 Thread Tom Gillespie
Hi Timothy,
   Replies in line. Some things might seem a bit out of order
because I responded from bottom to top. Best,
Tom

> from heading to bed, so to quote Pascal "I have only made this letter
> longer because I have not had the time to make it shorter".

Likewise, and I've heard it as Mark Twain :D

> I think a a big problem is the mix of implicit and explicit information.
> Some components are rigorously specified in terms of the characters they
> may contain, elements and objects that are recognised inside them, and
> even the order in which different parts of the pattern are parsed.

I agree completely.

> As mentioned originally, the current Dynamic Blocks description doesn't
> even mention the CONTENTS part of the pattern, and relies on the reader
> inferring that it operates similarly to the CONTENTS part of Drawers.

Indeed this should be fixed.

> Forcing the reader to start making inferences like this is a treacherous
> path, and I think I can blame for some of the other issues I've
> experienced. Take for instance the "surely X can't contain a newline?"
> comments I've made. In the Node Properties and Entities descriptions you
> have statements along the lines of "X can contain any character [...]
> except a newline". In my mind this then sets up the reader to interpret
> a similar statement without the "except a newline" clause to mean that
> newlines are permitted.

I agree completely and had almost the exact same experience as you
when I was working on it. As I mention below, my responses were to
illustrate why the explicit information is missing, not to suggest that it
should be left out. We should definitely work to make everything more
explicit so that future readers don't have to go through the same issues
we have.

> I'm also thinking that the term "element" is overworked in the document.
> It's basically pulling tripple duty: you have Elements, Greater
> Elements, and elements which are Elements and/or Greater Elements .

In extreme agreement.

> 3. Section

Technically this isn't part of the syntax, rather it is part of
elisp Org mode's internal representation. I'm not sure I would
even mention sections at all, because they have to do with
the interpretation of the syntax. In a section on the internal
representation for Org, sections definitely belong, but they
are incidental. That said, I suspect we will find that they are
useful for talking about the behavior of the file under transformation,
e.g. "headings are not reordered when pressing M-up or M-down,
sections are reordered". This allows us to
talk about an Org implementation that has commands that allow
one to switch the headings without moving their associated
sections.

> 5. (Greater Element / Element)

There are issues here with forms that are part of the syntax vs
forms that are part of the intermediate representation. A line
based parser for Org syntax that assembles greater blocks
after the fact and a parser that uses arbitrary lookahead to
truncate on headings won't have the exact same surface
syntax, however they will both have an equivalent in their
intermediate representation that corresponds to a greater
block. Again, very deep in implementation details here,
but trying to force things like sections into the syntax
hierarchy seems confusing to me.

> 7. Object

Paragraph element maybe? Might seem odd for heading titles
to have paragraph scope, but on the other hand it certainly
simplifies the explanation of the grammar. And you can put
an inline footnote in a heading title.

> 8. Pattern / Form

Don't know what to make of this one. Like "Term" these are
incredibly generic.

> 9. Term

Use of "Term" is super confusing to me.

> We could say call (1) Components, (7) Units, (6) Objects, (5) Element or
> Object (why not spell it out to avoid telling people to remember
> something).

I'm not sure we are ready to specify this. One way that we
might try to manage this would be to create a taxonomy of
element types, e.g. top-level elements, paragraph elements,
etc. This would be consistent with the fact that the elisp
implementation of org-element has all of these as an instance
of element.

> I could have put more thought into this, but it should do for
> illustrating my line of thinking. Let me know if you have any good
> ideas.

Let's leave the terminology as is right now. I'm expecting that there
will be quite a few new terms that we will want to introduce and we
will want to separate syntax and intermediate representation.

With progress on using org-element for fontification and on laundry
we should be able to come up with language that can be used to
distinguish between concepts that are needed for syntax (tokens,
parser) and for intermediate representations. Things like basic syntax
highlighting need only the language for syntax to be specified, but more
complex syntax such as babel font-locking either requires a more
advanced tokenizer or it requires that we talk about it at the level
of the 

Re: On zero width spaces and Org syntax

2021-12-03 Thread Tom Gillespie
An important note: for intra-word markup you probably want to
use word joiner U+2060 and not zero width space, because a
zero width space allows layout to break the word, whereas a
word joiner does not. We may need to check to make sure that
U+2060 counts as whitespace for the purposes of markup.
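
As a quick sanity check on the whitespace question, here is a Python
sketch that asks the Unicode database how the two characters are
classified. This only reflects Python's and Unicode's view. Whether
Emacs and the Org markup regexps treat either character as whitespace
is a separate question that still needs checking.

```python
import re
import unicodedata


def describe(ch):
    """Summarize how Unicode and Python's regex engine classify CH."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "is_space": ch.isspace(),
        "matches_backslash_s": re.fullmatch(r"\s", ch) is not None,
    }


for ch in ("\u200b", "\u2060", " "):
    print(describe(ch))
```

Both characters are in the format category (Cf) rather than a space
category, so a generic "is this whitespace" test does not cover them.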

> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'.

That is not relevant in this case. However, Org export should not be
emitting byte-literal zero width spaces either, that causes a NASTY
surprise for the user. All that Org does in this pass is pass something
along for the user. The kludge is a kludge because it just happens to
be compatible with Org syntax, that is all. I agree that significant
whitespace is decidedly undesirable, unfortunately Org already
has some, though it is nowhere near as bad as markdown with
the trailing whitespace. There also happen to be ways to mitigate
issues with non-printing chars via font-locking etc. to make them
print/visible when authoring. This is another good reason to use
macros as well --- they can be documented.

> As for the matter of emphasis marks between words. I believe that this
> is not the underlying problem, but rather the (little) inconsistency of
> the markup on certain contexts. Think, for example, of a text where you
> have to put many words in italics, enclosed between brackets. I don't
> care if that type of text is 'typical' or 'non-typical', 'majority' or
> 'non-majority'. It is simply a kind of scenario absolutely legitimate
> and feasible, and right now I could quote you more than a type of text
> in that direction.

The problem here is that there is an unbalanced design tradeoff.
Supporting intra-word markup using Org's simple markup syntax
actually introduces more inconsistencies elsewhere (see my
note at the end about where the burden of proof lies with
regard to statements like this).

Further, we also have to consider the impact of such a change
across the whole population of Emacs users and use cases.
Adding complexity to support a very narrow use case, and one
that will produce inconsistencies elsewhere means that the
whole community is forced to bear the burden of that complexity.

This is the principle that I think Tim touches on in terms of keeping
simple things simple. Complexity in pursuit of niche use cases is
never worth the cost when it has to be borne by 99% of users that
will never need such things.

Further, Org provides not only a single solution to these cases, but
multiple solutions. Worst case it is also possible to fail over to
text macros, which are an absurdly powerful escape hatch for users
that have advanced (read niche) needs.

> My proposal here also does not arise from an irrepressible desire to add
> more complexity to the syntax. If it's recommended that the user, in
> certain contexts, enter implicitly a zero-width space (which, I insist,
> is a practice that should be avoided as much as possible in a plain text
> document), why not at least offer a graphical alternative, a *real* mark
> whose role is *exactly* the same as that of the zero-with space? Is that
> adding more complexity??? Honestly I think that's exactly the opposite.

This has the same problems as other proposals about this, whether
they are escape chars, or other syntactic additions. It complicates
the syntax for the community as a whole. It may simplify it for your
particular use case, but not when averaged out with everyone else.

I think one approach is to encourage the use of \emph{a}b and friends.
They are printable and hide nothing. I would also suggest that we work
to update other export backends to support \emph where possible.

> In any case, I have suggested that new mark as a possibility, in case it
> is interesting to implement it, since a thread has emerged these days
> about the topic of the intra-words syntax. Discussions and threads
> that arise about these questions and any others are perfectly legitimate and
> natural and welcome. Please: there are no issues more 'important' than
> others; no two users are the same in Org. What you do not find useful,
> another user may perhaps find indispensable. And vice versa. And I
> think no one is in a position to state what the average Org user does
> or does not want, given that we do not know even 1% of Org users.

I think we have a fairly good idea in this particular case. If someone
wanted to do a more thorough study of existing org files in the wild
to see whether they are using a workaround it would certainly be
interesting, if unlikely to reject the null hypothesis. Take a survey
of all the HTML in the world and see how many of the documents that
use any markup at all make use of intra-word markup. I'm guessing
it is a vanishingly small percentage.

If we could figure out how to implement intra-word markup in a way
that didn't induce complexity it would be done, and probably
would already have been done, and I suspect people might use it.


Re: Some commentary on the Org Syntax document

2021-12-02 Thread Tom Gillespie
Hi Timothy,
Replies in line. Best!
Tom

On Thu, Dec 2, 2021 at 1:32 AM Timothy  wrote:
>
> Hi All (& Nicolas in particular again),
>
> With my recent efforts to write a parser based on
> , I’ve developed a few thoughts on
> that document. Hopefully, they can lead to some improvements and
> clarifications.
>
> 
>
> As a general comment, in many places the Org Syntax document states what
> characters a component can contain, but not what objects/elements. This feels
> like a bit of a hole in the current specifications.

This is indeed confusing because there are some implicit constraints
that are not listed because they never come up. For example, you cannot
have two newlines inside an inline footnote because the two newlines
break the paragraph, and the thing that appears to be an inline footnote
is just plain text that is never terminated.

Ensuring that font locking is in sync with org-element and org-export is
critical to ensure that users know what will actually happen.

>
>
> Sections
> 
>
> Heading
> ───
>
> ⁃ Ok, so `TITLE' can have any character but a newline, but what Org 
> components can it contain?
>   I’m going to assume any object?

Via org-element-object-restrictions it is standard-set-no-line-break, which is
all objects except citation-reference, table-cell, and line-break.

>
>
> Affiliated Keywords
> ═══
>
>
> Greater Elements
> 
>
> Greater blocks
> ──
>
> ⁃ It is not explained what is meant by a “special block”
> ⁃ Aren’t lines starting with `#+' also quoted by a comma?
>
>
> Drawers and Property Drawers
> 
>
> ⁃ “Contents can contain any element but another drawer”
>   • Does “any element” mean “any Element or Greater Element”

Any element that does not have greater precedence; the only element
excluded by that rule is a heading.

>
> Dynamic Blocks
> ──
>
> ⁃ It is not specified what `CONTENTS' may be

Implicitly follows the same rules as drawers, no headings
and no nesting of dynamic blocks. Text should be added
that states this explicitly.

> ⁃ Surely `PARAMETERS' cannot contain a newline?

Termination by newline is implicit in the example, but the text is confusing.

> Plain Lists and Items
> ─
>
> ⁃ It is not completely clear what content an item may have.
>   I assume any Object?

By my reading it may contain anything, objects and elements,
except for a heading, but that is already implied by the de-indent.

To quote from the docs:

An item ends before the next item, the first line less or equally
indented than its starting line, or two consecutive empty lines.
Indentation of lines within other greater elements do not count,
neither do inlinetasks boundaries.

This makes plain lists one of the most complex elements to parse.

>
> Tables
> ──
>
> ⁃ Surely newlines are not allowed in `FORMULAS'

The absence of newlines is implicit in the use of "lines", but it is still confusing.

>
> Elements
> 
>
> Clocks
> ──
>
> Two allowed forms are listed, but are all four of the below allowed or only 
> two?
> ┌
> │ CLOCK: INACTIVE-TIMESTAMP
> │ CLOCK: INACTIVE-TIMESTAMP DURATION
> │ CLOCK: INACTIVE-TIMESTAMP-RANGE
> │ CLOCK: INACTIVE-TIMESTAMP-RANGE DURATION
> └

No. Only the two are allowed. An inactive timestamp alone is a
starting point, adding a duration without the end point means
that there is no way to check that the range and duration match.
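
The range-versus-duration check mentioned above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the regular expression, the helper name check_clock, and the choice to
ignore the day name are all hypothetical and are not taken from
org-element.

```python
import re
from datetime import datetime

# Hypothetical pattern for one inactive timestamp; the optional day name
# is skipped and never validated.
TS = r"\[(\d{4}-\d{2}-\d{2})(?: [^\]\s]+)? (\d{2}:\d{2})\]"
# Range form with an optional "=> H:MM" duration.
CLOCK_RANGE = re.compile(rf"^\s*CLOCK: {TS}--{TS}(?:\s+=>\s+(\d+):(\d{{2}}))?\s*$")


def check_clock(line):
    """Return True/False if a range carries a duration, else None."""
    m = CLOCK_RANGE.match(line)
    if m is None or m.group(5) is None:
        # Not a range, or a range without a duration: nothing to compare.
        return None
    start = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M")
    end = datetime.strptime(f"{m.group(3)} {m.group(4)}", "%Y-%m-%d %H:%M")
    minutes = int((end - start).total_seconds() // 60)
    return minutes == int(m.group(5)) * 60 + int(m.group(6))
```

A lone timestamp followed by a duration does not match the pattern at
all, which mirrors the point that there is no end point to check the
duration against.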

> All the best,
> Timothy



Re: Org-syntax: Intra-word markup

2021-12-02 Thread Tom Gillespie
I don't mean to be a wet blanket, but the edge cases for
the current markup syntax are already hard enough to
implement correctly, to the point where different parts of
Org mode are inconsistent. Intra-word markup isn't viable
because there simply isn't any sane way to parse something
like *hello world*/hrm/oh no*. The other issue is that this will
degrade parsing performance because almost every
character could precede the start of a markup section.

I recommend anyone suggesting solutions try to implement
something that can parse the markup unambiguously with
lots of nasty test cases. You will likely find that it is impossible
to consistently tokenize markup, and that you have to hand
write a whole bunch of heuristics, making Org syntax even
harder to implement correctly.

Any solution that suggests extending how =/*~+_ can be
used gets a hard no from me. I could see teaching other
exporters how to interpret \emph{hello}world, but trying
to get any sane behavior for something like
why *hello*world oh no a wild asterisk*
is not worth it.
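
To make the ambiguity concrete, here is a rough Python sketch that
enumerates candidate bold spans in a string modelled on the example
above. The border rules are a deliberate simplification (whitespace
only, no punctuation handling) and the function name candidate_spans is
hypothetical; this is not how org-element decides emphasis.

```python
def candidate_spans(text, marker="*", intra_word=False):
    """List (open, close) index pairs that could delimit a marked-up span."""
    positions = [i for i, ch in enumerate(text) if ch == marker]

    def neighbours(i):
        # Treat the beginning and end of the line as whitespace.
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        return before, after

    def can_open(i):
        before, after = neighbours(i)
        if after.isspace():
            return False  # the body may not start with whitespace
        return intra_word or before.isspace()

    def can_close(i):
        before, after = neighbours(i)
        if before.isspace():
            return False  # the body may not end with whitespace
        return intra_word or after.isspace()

    return [
        (o, c)
        for o in positions if can_open(o)
        for c in positions if c > o + 1 and can_close(c)
    ]


text = "why *hello*world oh no a wild asterisk*"
strict = candidate_spans(text)
relaxed = candidate_spans(text, intra_word=True)
```

With the simplified border rule there is a single candidate pair; once
markers may sit inside words, the same three asterisks yield three
overlapping candidates and the parser has to pick one by heuristic.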

Best,
Tom



Re: Orgdown: negative feedback & attempt of a root-cause analysis (was: "Orgdown", the new name for the syntax of Org-mode)

2021-11-30 Thread Tom Gillespie
Karl,
   The exact naming of a thing is nearly always the most contentious
step in trying to promulgate it. In my own field we can easily get all
parties to agree on a definition, but they refuse to budge on a name.
As others have said, I wouldn't worry about kibitzing over the name.

I would however worry about the larger negative reaction. From my
perspective I think the issue is that there are many efforts working
toward a formalized specification for Org syntax and Org mode
functionality, and some of those stakeholders who have invested
significant effort may feel blindsided by a public declaration
announcing Orgdown because they were not consulted and not
made aware that you were working on it.

I appreciate the amount of work that you have put in. I have devoted
hundreds of hours to working on an alternate implementation of Org
in Racket that uses a formal EBNF in hopes that others will be able
to use it as a guide and as a way to talk formally about how Org
parsers and implementations should behave.

It would thus be easy for me to say that your approach has put the
cart before the horse, because there are countless nuances in the
specification for Org syntax which must be addressed before any
levels of org compliance can be specified, otherwise the behavior
between levels will be inconsistent.

If I were to say this, it would not be fair to you at all. The ideas
and motivation for Orgdown are vital and important. You have put
in enormous thought and effort, all because you care about Org
and want to see it succeed.

The issue is that any shared specification for Org syntax is
fundamentally about how to coordinate as a community.
The way that Orgdown was presented to the community feels
(to me) like it is being imposed top down or coming from an
individual source, not from an open and visible community
process (the subject of your original email reads as a declaration
in English, and thus can be quite off-putting, though I know that
was not the intention).

I personally haven't bothered with promulgation because I think
that we are not technically ready as a community to approach
outreach to other developers in a way that we can succeed.

The good news is that all of this can co-exist if we want it to,
but we need to be clear about our objectives as a community.

To me these objectives are as follows (and I would love
to hear from others about additional or alternate objectives).

1. To never fracture Org syntax so as to avoid the nightmare
of markdown flavors. (This means being able to say clearly
as a community that a parser is out of compliance and that
it is up to the user to fix their files. The ruby org parser used
by Github is a major issue here.)
2. To provide a clear specification for what graceful degradation
looks like when parsing Org syntax if a parser does not support
some portion of that syntax (e.g. should property drawer lines
be excluded or rendered as plain text?).
3. Provide a solid basis on which further formal specification
can be built. (My interests in particular are around providing
consistent semantics for org-babel blocks across languages
so that babel implementations can clearly communicate what
runtime features they support.)

The approach for Orgdown can absolutely meet all three of
these objectives, however in its current form Orgdown1 is not
sufficiently well specified to avoid fracturing the syntax.
This is because Org syntax is extremely complex (even the
elisp implementation of Org mode is internally inconsistent)
and there are edge cases where behavior will diverge if parsing
of even the simplest elements is not fully specified.

There are many ways to remedy this, however they require
a more formal approach. A number of us are working to build
technical foundations for such a formal approach, but I do not
think that any of those projects are ready to be used to
specify discrete levels of Org syntax parsing compliance.

If I may, I would suggest that an Orgdown0 is something that
could be well specified, but it would avoid parsing of markup
altogether and only deal with the major element types. Parsing
paragraphs and all the org objects is not something that can
be done piecemeal. There are too many interactions between
different parts of the syntax, and in some cases the existing
specification desperately needs to be revisited due to the
complexity that it induces or because it is underspecified.
Of course this would make Orgdown0 fairly useless as a
replacement for markdown, but at least it would be a start.

Best,
Tom



Re: noweb and shell heredocs

2021-11-30 Thread Tom Gillespie
Hi Łukasz,
One workaround that is fairly reliable is to prefix the names
of the blocks to be nowebbed with an &. So #+name: block-name
becomes #+name: &block-name. Then you reference it as
<<&block-name>> and the heredoc syntax is broken. Best,
Tom



Re: Formal syntax for org-cite

2021-11-30 Thread Tom Gillespie
Hi Timothy,
Thanks for putting this together. Comments in line. Best!
Tom

For reference here is the tokenizer pattern I use in laundry at the moment.
There are a number of issues with it ...
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L896-L913

> Citation syntax is currently not documented, but from the implementation
> it looks something like this:
> #+begin_example
> [cite CITESTYLE: GLOBALPREFIX KEYCITES GLOBALSUFFIX]
> #+end_example

There is potential confusion here because =[cite= does not have to be
followed by a space (rather, cannot be).

The top level syntax is =[cite= terminating at the first occurrence of =]=.
I think we may also need to include a note that no whitespace is allowed either?
It will only be recognized within paragraph context (e.g. headings, paragraphs,
and other places where org objects can appear). Stating that up front would
clarify that the rest of the syntax described here is how to determine whether
the citation is well formed/how to parse it.

> =KEY= can be made of any word-constituent character, =-=, =.=, =:=, =?=,
> =!=, =`=, ='=, =/=, =*=, =@=, =+=, =|=, =(=, =)=, ={=, =}=, =<=, =>=,
> =&=, =_=, =^=, =$=, =#=, =%=, =%=, or =~=.

You have a duplicated =%= here.

> I have not yet confirmed what =KEYPREFIX= and =KEYSUFFIX= may contain,
> but as a starting point, any of the characters allowed in =KEY= except
> =@= plus whitespace would seem fairly safe. =KEYSUFFIX= must start with
> a whitespace character to be able to be differentiated from =KEY=.

I don't think we can allow whitespace here?

> =CITESTYLE= consists of a main =STYLE= and any number of =VARIANT=s
> (including zero), prefixed by forwards slashes in the following pattern
> #+begin_example
> /STYLE/VARIANT/VARIANT/VARIANT
> #+end_example

Need clarification on empty styles, e.g. [cite//:]

> "cite" and =CITESTYLE=, =KEYCITES= and =GLOBALSUFFIX= are /not/
> separated by whitespace. Neither are =KEYPREFIX=, =@KEY=, or =KEYSUFFIX=
> separated by whitespace.

I may be missing something, but this is confusing with respect to the
statement about =KEYSUFFIX= and whitespace made above.
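
As a way to probe the open questions above, here is a small Python
sketch that splits the CITESTYLE part of a citation. The regular
expression and the name split_citation are assumptions made for
illustration; they are not the org-cite implementation and not the
laundry tokenizer.

```python
import re

# Hypothetical top-level recognizer: "[cite", optional /STYLE/VARIANT...
# segments, a colon, then everything up to the first "]".
CITE = re.compile(r"\[cite(?P<style>(?:/[^\s/:\]]*)*):(?P<body>[^\]]*)\]")


def split_citation(text):
    """Return (style, variants, body) or None if no citation is found."""
    m = CITE.search(text)
    if m is None:
        return None
    parts = m.group("style").split("/")[1:]  # drop the piece before the first "/"
    style = parts[0] if parts else ""
    return style, parts[1:], m.group("body")
```

Under these assumptions "[cite/t/c:@key]" gives style "t" with variant
"c", "[cite :@key]" is rejected because a space follows "cite", and
"[cite//:]" yields an empty style plus one empty variant, which is
exactly the kind of case the specification would need to rule on.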



Re: "Orgdown", the new name for the syntax of Org-mode

2021-11-28 Thread Tom Gillespie
> I believe (IMHO) that it does not make much sense to separately name the
> Org Mode syntax (as a markup language). That would only generate
> confusion among users.

This is unfortunately not the case. Conflating Org mode which is an Emacs
major mode with Org syntax is a major communication barrier that leads to
confusion for anyone trying to implement a tool based on Org syntax. For
example I couldn't just call my implementation of an org-mode-like package
for Racket "Org mode" because it is not an Emacs major mode. The absence
of a name for Org syntax hampers search and discovery. I'm happy to keep
using the multi-word term Org syntax, but I have found a practical need to
distinguish the surface syntax from the Emacs major mode to reduce
confusion for technical users. Best,
Tom

PS Another brainstormed name: Orgsyn?



Re: "Orgdown", the new name for the syntax of Org-mode

2021-11-28 Thread Tom Gillespie
I had jokingly suggested "orgup" to have a more positive feeling (up
instead of down) than markdown. I'm not sure orgdown will be any more
confusing than some other name. It could imply a version of the org
syntax that uses markdown surface syntax, but it seems that that would
probably be called org flavored markdown by the existing conventions
in the markdown community. Best,
Tom



Re: [PATCH] Accept more :tangle-mode specification forms

2021-11-18 Thread Tom Gillespie
Hi Timothy,
The confusion with 755 and "755" could lead to security issues in
cases like 600 vs "600" vs #o600. The need to protect against the 600
case is fairly important, however I don't think there is anything we
can do about it, because someone might want to enter their modes as
base 10 integers.

If we were to prepend #o to every integer (or set the radix to 8
when reading this particular field) before passing it to
org-babel-parse-header-arguments, then it would be impossible to use
base 10 integers unless they were provided in the #10r600 form (Emacs
doesn't support #d600 notation).

I think the best bet is to change the radix for bare integers to 8
when reading that particular header, however I don't know how complex
that would be to implement.

If we don't want to change the radix to 8 then here are some suggestions.

If #o0600 already parses correctly, then I suggest we leave things as
is. Adding complexity just to drop the leading # seems wasteful.

We may want to warn or raise an error if someone uses a value such as
the base 10 integer 600, which does not map to the usual expected octal
codes, so that they don't silently get bad file modes that could leave
files readable to the world.
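
The 600 versus #o600 problem is easy to see outside Emacs as well. The
Python sketch below only illustrates the arithmetic and one possible
warning heuristic; the function names and the heuristic itself (flag any
bits above 0o777 or a mode where the owner cannot read) are assumptions,
not anything Org implements, and the heuristic is not exhaustive.

```python
import stat


def describe_mode(mode):
    """Render an integer file mode the way ls would for a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


def is_suspicious(mode):
    """Heuristic only: special bits set, or the owner cannot read the file."""
    has_special_bits = bool(mode & ~0o777)
    owner_can_read = bool(mode & stat.S_IRUSR)
    return has_special_bits or not owner_can_read


# The decimal integer 600 is 0o1130, not 0o600.
decimal_600 = oct(600)
# The decimal integer 755 is 0o1363, not 0o755.
decimal_755 = oct(755)
```

A check like is_suspicious(600) returns True while is_suspicious(0o600)
returns False, but a heuristic of this kind cannot tell whether the user
really meant the decimal value, which is why a warning seems more
appropriate than silently rewriting the number.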

Best,
Tom



Re: how to org-babel-detangle with nested noweb?

2021-10-18 Thread Tom Gillespie
Hi Edgar,
Detangling of nested noweb blocks tangled using
:comments noweb is broken at the moment. There are
some deep bugs that need to be worked out, and last
time I looked at the code I think my conclusion was that it
would be better to do a complete rewrite starting from a new
specification of the behavior along with some gnarly test
cases to ensure that everything works as expected.
Best!
Tom



Re: Org lint and named source blocks

2021-10-04 Thread Tom Gillespie
Thanks for the pointer! The actual point of contact seems to be
https://github.com/milisims/tree-sitter-org. Good to find another
group that is working on this. Best,
Tom



Re: Org lint and named source blocks

2021-10-04 Thread Tom Gillespie
> By the way, wouldn't it be better to use tree-sitter rather than
> something else for the format grammar?

Not really since we are going to need more than one implementation
using a parser generator to avoid baking implementation specific
details into the spec by accident. This is true for more than just
the grammar as well. The complexity of tokenization, parsing,
expanding, etc, for Org means that we are going to need multiple
implementations to nail the behavior for any formal spec.

That said, we definitely want a TS implementation at some point.
See https://github.com/tgbugs/laundry/issues/1 for a recent
discussion about ways forward.

The implementation I'm working on should translate to TS without
too much work since both brag and tree sitter describe LR variants.
There may be some subtle differences, but nothing fundamental.

The issue for me is that I don't have the bandwidth to get started
with a full tree sitter implementation, especially because it is going
to need a custom scanner, and because you're effectively on your
own when it comes to reconstructing the output of the AST into the
actual internal representation of an Org file. I also have no idea how
to deal with nested parsers in tree sitter. I have some ideas about
how it might be done, but nothing concrete (see the linked issue
for more on that).

Best,
Tom



Re: [PATCH] Don't fill displayed equations

2021-10-04 Thread Tom Gillespie
> Does anybody have any other thoughts?

From time to time I encounter random patterns that I don't want to be
reformatted during a fill operation. Maybe a custom variable like
org-fill-paragraph-skip-regexp or similar that could be set by the user?
For Timothy's use case he would set it to the regexp provided in the
original patch? Not sure how much of the implementation in the patch
is dependent on that particular regexp, but a general solution that
could even be set per org file might be a very useful new feature.

Best!
Tom



Re: [PATCH] Don't fill displayed equations

2021-10-03 Thread Tom Gillespie
Some thoughts.

> Maybe you are right and Tom was actually assuming \begin{equation*}, not
> #+begin_export latex.

Correct. My bad on that one.

> Just as Timothy, I believe that \begin{equation*} is unnecessarily verbose
> when \[ works *mostly* in a similar way.

\begin{equation*} is absolutely required if you want to be able to include
newlines because \[ and \begin are not similar at all as far as parsing
is concerned.

From the spec: https://orgmode.org/worg/dev/org-syntax.html#LaTeX_Environments
> CONTENTS can contain anything but the “\end{NAME}” string.
The spec is not completely accurate since latex environments can't
contain a new heading, but the point is that latex environments are
elements, whereas \[ \] is an object.

> If I understand correctly, making \[ \] available outside paragraph
> would mean that it becomes a new element (currently \[\] is a
> latex-fragment object).

Correct. Promoting \[ to an element would mean every \ in an org file
becomes a stop word. Also, since full-fledged latex environments
already exist to serve this purpose, I find it hard to justify, especially
given that Org tries to give clear indication of when a block structure
is starting and ending.

> Isn't the whole point of the \[ ... \], \( ... \), $ ... $, $$ ... $$,
> and \begin{env} ... \end{env} and constructs in Org to be consistent
> with LaTeX?

For \begin and \end yes. For the others no. In general it would be to
make it possible to express things using latex-like syntax that would
otherwise require Org to come up with some new and different syntax.
These are values that may be translated to latex, but they exist inside
a larger syntax that is decidedly not latex, and thus they only have
meaningful translation to latex if they exist as well formed Org.

As a side note, the $ syntax is slated to be deprecated and removed.
https://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
> It would introduce incompatibilities with previous Org versions, but
> support for $...$ (and for symmetry, $$...$$) constructs ought to be removed.

> Indeed, it will be a breaking change.

I'm actually fairly certain that such a change should never be made
due to the recent changes in org link syntax. Specifically given how
\[ is used for escapes in links. https://orgmode.org/manual/Link-Format.html
This means that the only place you could reliably use \[ is at the start of a
new line preceded only by whitespace. However, if this were to happen then
pretty much every org document that uses \[ \] is at risk for being broken
because something that was once a single paragraph will now be multiple
paragraphs.

If you need multiline use \begin \end, that is what they are there for, and they
fit better with org's general extensible approach to blocks. I would dearly love
to be able to have a single shorthand for src blocks that worked inline and
standalone, but the complexity that it would induce is just not worth it. Same
thing for \[ \]. It seems simple until you get down to accounting for all the edge
cases that it would induce in the grammar.

Consider the case where you have something like

\[ something something

more content
more content [[www.example.com/\]oops][evil link]] \]

I've seen enough cases that are similar to this in the existing implementation
that have inconsistent behavior that I can safely say that this one would too.
Not to mention that I can think of at least 3 different cases that will all have
slightly different behavior that is inexplicable to users at best and
infuriating at worst.

\[ a

b \]

\[
a
b
\]

a \[ b

c \] d

etc. There are plenty more variants that would all be subtly different depending
on the exact way such a thing were implemented.

In short. Just not worth it.



Re: [PATCH] Don't fill displayed equations

2021-10-02 Thread Tom Gillespie
Hi Timothy,

> │ \[
> │   not part of a paragraph
> │ \]

My point is that that parses first as a paragraph (check org-element-at-point).
\[ and \] would be meaningless if it did not first parse as a paragraph.

> I also don’t see how footnotes are analogous, as footnotes are placed in the
> middle of a line of text.

Inline footnotes [fn::
can span
multiple lines] but can't contain empty lines because the empty line ends the
paragraph that they are contained in.

> org-latex-preview :)

But surely #+begin_export latex works with org-latex-preview? If not then
that would be a feature request to org-latex-preview yes?

Best!
Tom



Re: Comments break up a paragraph when writing one-setence-per-line

2021-10-02 Thread Tom Gillespie
A general comment (heh) here. This is not a bug and not easily fixed.
Line comments are their own top level element distinct from
paragraphs. If you need something that fits in a paragraph you can use
@@comment:@@ at the start of a line.

I agree that it is annoying, but Org line comment syntax also only
works if it starts the line, so the behavior diverges from traditional
code comments. It may make sense to update the docs to call them "line
comments" instead of just comments.

One area where we could almost certainly do better is in how line
comments break up the flow of text. I'm not sure there will ultimately
be much we can do about it, but it is worth investigating.

Best,
Tom



Re: [PATCH] Don't fill displayed equations

2021-10-02 Thread Tom Gillespie
> do not see a reason for idiosyncrasy that markup intended to add LaTeX
> snippet that looks like exactly as LaTeX commands for this purpose and
> even actually preserved during export to LaTeX should have different
> semantics for Org parser.

The answer is that \[ \] can only occur inside paragraphs. The issues
here are exactly the same as the issues for inline footnotes. Org gives
us a bit more power, but not the full power because Org is Org, not
Latex. Making \[ \] available outside of a paragraph would be a massive
breaking change.

In Timothy's original example he is narrowly skirting the syntax to
allow that all to remain a single paragraph, but stick in a newline
anywhere and boom, no more paragraph, no more equation.

I guess one thing I'm missing/not understanding is when/why people
want to use \[ \] instead of a full #+begin_export latex block?

Best,
Tom



Re: [PATCH] Accept more :tangle-mode specification forms

2021-10-01 Thread Tom Gillespie
> I'd like to understand these objections better. Aren't you overstating
> what is at issue?

Yes, after hitting send I realized I overstated my position a bit.
In the meantime the comments in this thread are encouraging,
however I have finally figured out what I was really trying to say.

tl;dr file permission modes are not universal and should thus not
be part of the Org implementation, Org itself knows nothing about
files or permissions, it is the system that Org is running in/on.
Therefore, so long as we make it abundantly clear that the
value for :tangle-mode is not expected to be portable and that
it is always up to the user to ensure correct behavior, then we
are ok. I'm not happy about this conclusion from a security
perspective, but it isn't really worse than the situation we have
right now.


As many have pointed out, the grammar itself will not be affected.
However, other parts of the spec will. In general my objective is to
try to reduce the number of special cases that an org implementation
has to know about and delegate them to something else.

However in this case it is a bit tricky because of the security implications
and due to the fact that octal modes for file permissions are NOT universal
and should not be expected to be universal!

I actually think that my gut reaction was correct, but was expressed
in the wrong way.

Unix file modes are not universal and should thus not be encoded as
part of a portable document format. This means that it is up to the
user to know what representation is suitable.

Right now that representation is delegated to Emacs, because Emacs
handles file permissions for Org, and Emacs' language for modes is
octal.

There are some octal modes that do not translate on Windows, and cannot
be correctly set. There will (hopefully) be some happy day in the future
where there is an operating system that will run Org babel where octal
file modes do not exist at all!

Therefore I suggest that we do not enshrine a particularly obscure way
of expressing file modes into Org itself. Right now Org is confined to
Emacs' representations, which in a sense protects Org from becoming
too ossified by bad designs of the past --- Emacs can keep all that
for us!

If we want a more user friendly syntax for this I would suggest that we do
something like what has been done for Org babel :results, i.e. like
:tangle-mode read write execute, unfortunately that does not compose
well at all with user, group, and other and becomes exceedingly verbose.


Final conclusion, after all that rambling, is that I'd actually be ok with
any of the solutions proposed, so long as it is clear that :tangle-mode
will always be implementation dependent, and may or may not be
meaningful depending on which operating system you are using.
Unfortunate for security, but I don't see any way around that. The
best we could do for security would be for implementations to
test the file modes after tangling to ensure that they match,
which is more important I think.

That said, reducing the number of forms as Eric suggests would
be a happy medium.

Best!
Tom



Re: [PATCH] Accept more :tangle-mode specification forms

2021-09-30 Thread Tom Gillespie
I strongly oppose this patch. It adds far too much complexity to the
org grammar. Representation of numbers is an extremely nasty part of
nearly every language, and I suggest that org steer well clear of
trying to formalize this. With an eye to future portability I suggest
that no special cases be given to something as important for security
as tangle mode without very careful consideration. Emacs lisp closures
have clear semantics in Org and the number syntax is clear. If users
are concerned about the verbosity of (identity #o0600) they could go
with the shorter (or #o0600). Best,
Tom



Re: Empty headline titles unsupported: Bug?

2021-09-26 Thread Tom Gillespie
Hi Bastien,
I am strongly in favor of this change. It simplifies the grammar
significantly, and from my work on the laundry lexer and parser, I'm
99% certain that the current behavior is a bug that is the result of
gobbling the space after the stars in the headline. The correct
implementation peeks 1 char ahead for the space, and then starts
parsing again starting with the space. This is because tags MUST be
preceded by a space, so if you incorrectly gobble the space after the
stars then that space cannot be used as the start for tags. Best,
Tom
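
The peek-versus-gobble difference described in this message can be
sketched in Python. The regular expressions and the gobble_space switch
are hypothetical simplifications for illustration (no TODO keywords,
priorities, or COMMENT handling); they are not the laundry lexer or
org-element.

```python
import re

# Stars must be followed by a space, but the lookahead does not consume it.
STARS = re.compile(r"\*+(?= )")
# Tags must be preceded by at least one blank.
TAGS = re.compile(r"[ \t]+(:(?:[\w@#%]+:)+)[ \t]*$")


def parse_headline(line, gobble_space=False):
    """Return (level, title, tags), or None if the line is not a headline."""
    m = STARS.match(line)
    if m is None:
        return None
    rest = line[m.end():]
    if gobble_space:
        # Emulates the suspected bug: the space after the stars is consumed.
        rest = rest[1:]
    tags = []
    t = TAGS.search(rest)
    if t is not None:
        tags = t.group(1).strip(":").split(":")
        rest = rest[:t.start()]
    return m.end(), rest.strip(), tags
```

With the peeking variant, "* :tag:" is a headline with an empty title
and one tag. With gobble_space=True the same line comes back with the
title ":tag:" and no tags, because the only space that could introduce
the tags has already been eaten.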



Re: [PATCH] lisp/ox-html.el: Restore org-svg class

2021-09-21 Thread Tom Gillespie
Bumping this patch for 9.5.

On Fri, Jul 30, 2021 at 8:59 PM Tom Gillespie  wrote:
>
> Hi,
>This patch restores the addition of class="org-svg" to svg images
> during html export. Best!
> Tom



Re: Org lint and named source blocks

2021-09-21 Thread Tom Gillespie
> Should we allow syntax like #+KEYWORD:value to be correct or do we
> require a whitespace/space after colon all the time?

The spec as written is ambiguous/silent on this issue. In my work on
the laundry tokenizer and grammar I have found keyword syntax to be a
thorny issue, and I strongly suggest that for the time being we either
make no ruling on this or we state that the colon that ends the
keyword should be followed by a space as a precautionary measure.
The safe thing to do is to always require whitespace after the colon
because it guarantees correct interpretation.

Requiring whitespace after the colon simplifies the grammar, however
it means that you can't compact keyword lines, and it induces an
annoying failure mode where missing spaces are no longer keywords.

However, it is technically possible to make keywords work without the
whitespace, so long as there is at least one whitespace prior to the
next colon (but not contained in square brackets, e.g. #+key:lol[ a b
c ]:value is a well formed keyword under a slightly generalized
grammar). The problem is that we would like to make keyword syntax
fully closed, and I need a bit more time to get that worked out before
any definitive conclusions are drawn.
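
For comparison, the conservative rule (whitespace is required after the
colon) fits in one regular expression. The Python sketch below is an
assumption-laden illustration, with a hypothetical pattern and function
name; it is not the laundry grammar and it ignores affiliated-keyword
options in square brackets.

```python
import re

# Strict form: "#+KEY:" followed by either end of line or blank(s) and a value.
STRICT_KEYWORD = re.compile(r"^[ \t]*#\+([^ \t\n:]+):(?:[ \t]+(.*))?$")


def parse_keyword(line):
    """Return (key, value) under the strict rule, or None."""
    m = STRICT_KEYWORD.match(line)
    if m is None:
        return None
    return m.group(1), (m.group(2) or "").strip()
```

Under this rule "#+TITLE: hello" parses, while "#+TITLE:hello" and
"#+key:lol[ a b c ]:value" do not, which is the annoying failure mode
mentioned above: a missing space silently turns a keyword into
something else.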

The complexity of the generalized keyword syntax can be seen here
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L107-L249

Best,
Tom



Re: [org-cite] citations in property drawers?

2021-09-16 Thread Tom Gillespie
> I understand the problem, but the solution should not be: "let's pretend
> export does not exist".

From my perspective any org object that is not in a section that
allows org objects could in principle be parsed as such, but it would
not be in the core of the grammar, and it also would have to parse to
something that did not trigger side effects related to export.

Allowing org objects to appear at arbitrary places in the grammar is
definitely not a good idea because in many senses they cannot actually
be those objects. Maybe the syntax could be the same, but they would
have to be "shadow objects" or something like that?

Best,
Tom



Re: [org-cite] citations in property drawers?

2021-09-15 Thread Tom Gillespie
> That would be a terrible idea. Exporters are not required to handle all
> data contained in properties drawers, so this may introduce errors,
> e.g., when trying to number citations.

I agree completely. You can't export something that has no anchor in
text that would be rendered. Maybe I misunderstood the original
question, because there is no way that a citation or footnote could be
exported from there, so I think in your conception text that follows
the format of the citations or footnotes isn't actually a citation or
footnote unless it exports as such.

Best,
Tom



Re: [org-cite] citations in property drawers?

2021-09-14 Thread Tom Gillespie
Hi Bruce,
I could certainly imagine using it, and I don't see any issue with
doing it from the point of view of the grammar. Footnotes can appear
in a property drawer without issue, though obviously they don't
export. One question though, since I may have missed this in the other
threads: is cite: allowed without the square brackets? Either way, org
element just parses the value to a string and it is up to any
consuming application to parse the node property further. Best!
Tom
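
A minimal sketch of the "consuming application parses the node
property further" step, assuming the property value is a plain string
such as the ROAM_REFS example quoted below. The function name
my/property-cite-keys and the simplified key pattern are assumptions
for illustration; neither comes from org-roam or org-cite, and only
the single-key forms cite:KEY and [cite:@KEY] are handled.

```elisp
(defun my/property-cite-keys (value)
  "Return the citation keys found in the property string VALUE.
Hypothetical helper: recognizes org-ref style \"cite:KEY\" and the
single-key org-cite style \"[cite:@KEY]\" using a simplified key
pattern (alphanumerics, underscore and hyphen)."
  (let ((start 0)
        keys)
    ;; Scan left to right, collecting every key that follows "cite:".
    (while (string-match "cite:@?\\([[:alnum:]_-]+\\)" value start)
      (push (match-string 1 value) keys)
      (setq start (match-end 0)))
    (nreverse keys)))
```

In an org buffer the string would typically come from something like
(org-entry-get (point) "ROAM_REFS").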

On Thu, Sep 9, 2021 at 11:45 AM Bruce D'Arcus  wrote:
>
> Just bumping this.
>
> Another question about where to allow cite elements.
>
> On Fri, Aug 20, 2021 at 4:18 PM Bruce D'Arcus  wrote:
> >
> > So this is a tentative request/question; I'm not really sure the best
> > approach here.
> >
> > This is based on discussion with one of the org-roam-bibtex developers
> > about what the proper way to indicate an org-roam note is a
> > bibliographic note; e.g. a note about a bibliographic source.
> >
> > Traditionally in org-roam, that is in a property drawer; like:
> >
> > :ROAM_REFS: cite:wallace-wells2019
> >
> > That is using org-ref syntax there.
> >
> > So the obvious question is should one just put an org-cite citation
> > there to do the same thing?
> >
> > Right now, the answer is clearly no, since they aren't allowed in
> > property drawers.
> >
> > But perhaps they should be, just as any link can be?
> >
> > Except if they are, I recognize, they need to be treated as special
> > cases; e.g ignored for the purposes of export and such.
> >
> > WDYT?
> >
> > Bruce
>



Re: Expanding how the new cite syntax is used to include cross-references - thoughts?

2021-08-10 Thread Tom Gillespie
In general I like John's suggestion. Org link syntax can be made to do
nearly anything because it is possible to bind link actions to
arbitrary elisp functions (I have used them to create buttons that run
source blocks for some of my non-technical colleagues). The grouping
of cross references under org-cite seems reasonable to me, and I would
love it if they could handle arbitrary references, e.g. to hypothesis
web annotation links or org-capture links.
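
For reference, a rough sketch of binding a link type to an elisp
function so that following the link runs a named source block. The
link type "run-block" and the function name are made up for this
example; org-link-set-parameters and org-babel-find-named-block are
the existing Org functions it relies on.

```elisp
(require 'org)
(require 'ob-core)

(defun my/run-block-follow (name &optional _arg)
  "Execute the source block named NAME in the current buffer.
Hypothetical :follow handler for the made-up \"run-block\" link type."
  (let ((pos (org-babel-find-named-block name)))
    ;; Refuse to run anything if the name does not resolve to a block.
    (unless pos
      (user-error "No source block named %s" name))
    (save-excursion
      (goto-char pos)
      (org-babel-execute-src-block))))

;; Register the link type so that following such a link calls the handler.
(org-link-set-parameters "run-block" :follow #'my/run-block-follow)
```

With this loaded, a link such as [[run-block:some-block][Run it]]
would execute the block named some-block when followed, subject to the
usual org-confirm-babel-evaluate prompt.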

Actually, having written this now, I think that both solutions have
their own use cases. Org cite is clearly about providing evidence for,
or a scholarly reference for something, and critically it can embed
some metadata about that reference in the document as a citation or
perhaps as an excerpt (an extension of what org-ref does now when the
cursor is over a reference?). Regular links do not provide any way to
embed metadata within the document, they are purely pointers.

That being said, it seems that there are a number of use cases where
org-ref links are simply internal document links that can point to an
element with a specific #+name: and no embedded information about the
target is needed. However, I think it would be a mistake to use up
equation/eq and table/tbl or figure/fig prefixes for references that
are internal to org, because it implicitly limits/collides with the
#+link: keyword.

Best,
Tom



Re: [Concept talk] Org-connector

2021-08-10 Thread Tom Gillespie
Hi Sébastien,
I think you are probably looking for org-sync which implements
exactly this functionality. You would need to write a new backend for
your particular ticketing system, but github, bit bucket, and redmine
backends already exist and can serve as an example. Best,
Tom

https://orgmode.org/worg/org-contrib/gsoc2012/student-projects/org-sync/tutorial/



Re: bug: Error handling in source blocks.

2021-08-10 Thread Tom Gillespie
I will also chime in here to say that managing output streams and
errors for babel is a major new feature that I am interested in. The
issue, as Tim points out, is that there is a lot of complexity lurking
here due to the fact that certain languages have fundamentally
different capabilities and ways of handling or not handling errors,
and of running code (on arbitrary hosts) in the first place.

What works for one will almost certainly not work for another. Take
for example ob-lisp where there is already built in error handling in
emacs itself. Compare that with python where someone would likely need
to implement a special PYTHONBREAKPOINT entrypoint or something like
that, if it were possible at all.
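
To illustrate the elisp side of that comparison, here is a minimal
sketch of catching an error with condition-case and turning it into a
value that a block could return. The function name and the
(ok . VALUE) / (error . MESSAGE) convention are assumptions for
illustration and are not taken from the babel sources.

```elisp
(defun my/eval-capturing-error (form)
  "Evaluate FORM and return (ok . VALUE) or (error . MESSAGE).
Hypothetical helper showing Emacs' built-in error handling."
  (condition-case err
      (cons 'ok (eval form t))
    ;; Any signal derived from `error' is converted into data
    ;; instead of aborting the evaluation.
    (error (cons 'error (error-message-string err)))))
```

For example, (my/eval-capturing-error '(+ 1 2)) gives (ok . 3), while
a form that signals, such as (/ 1 0), gives a pair whose car is the
symbol error and whose cdr is the message string.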

I have had a draft of a document on what I called "babel
regularization" for well over a year now, but it is not in a state
that would be productive to share due to the sheer number of ob-langs
and systems affected and the need to be able to clearly catalog and
articulate the diversity of existing behaviors.

If you dig through old conversations on this list you will find a
discussion of the default behavior for ob-shell (:results value vs
output as the default); we were barely able to agree on which
principles should be followed to make the decision. In that case we
were lucky that there was already a way for users to set their desired
behavior in their init file or in a setup file or in the file itself.
How to handle errors will be much more complex, in part because it
will touch on what ob-lang implementations are able to overwrite
and/or must provide in order to actually function. At the moment there
are practically no constraints.
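
As a concrete illustration of the init file route mentioned above, a
minimal sketch, assuming the choice being made is between
:results value and :results output for shell blocks.
org-babel-default-header-args:shell is the per-language defaults
variable from ob-shell; picking "output" here is only an example
preference.

```elisp
(require 'ob-shell)

;; Make "output" the default result type for every shell block.
;; Individual blocks can still override this with ":results value".
(setq org-babel-default-header-args:shell
      '((:results . "output")))
```

The per-file equivalent would be a line such as
#+property: header-args:shell :results output
at the top of the org file or in a setup file.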

Lots of work to do here, so grateful for a report on the variability
in the behavior of the existing system.

Best!
Tom



Re: [PATCH] Rename headline to heading

2021-08-08 Thread Tom Gillespie
Hi André,
Thanks for taking a first pass at this. I think that this patch is
difficult to review. Could you break it into two separate patches, one
for documentation (non-code, e.g. docstring and comment) changes and
one for code changes?  That way we could more easily see where we may
need to mitigate the kind of issues Maxim noticed. Best!
Tom



Re: Help requested: Support for basic Org mode support in tools outside of Emacs

2021-08-03 Thread Tom Gillespie
Hi Karl,
   Great initiative. For many of the things in the table you will
probably want to link to the underlying library. For example, for github
and gitlab there is https://github.com/wallyqs/org-ruby (which I have
been trying to find time to submit fixes to). I've linked a couple
relevant threads and repos. Best!
Tom

python https://github.com/novoid/Memacs
python https://github.com/karlicoss/orgparse
python https://github.com/bjonnh/PyOrgMode
racket https://github.com/tgbugs/laundry/tree/next
racket https://github.com/jeapostrophe/org-mode
racket https://github.com/antoineB/org-mode
See https://github.com/tgbugs/laundry/blob/next/laundry/cursed.org for
an org file that github fails to render
clojure https://github.com/200ok-ch/org-parser/blob/master/resources/org.ebnf

https://orgmode.org/list/ca+g3_pobab1qx1zv8q9sjfh4khuhvmanxp3xo7__6eosdxk...@mail.gmail.com/
https://orgmode.org/list/ca+g3_pnj6pekqv+twfkwbd778xhw9wsfx+kjjhjsoreplhu...@mail.gmail.com/

On Tue, Aug 3, 2021 at 11:46 AM Greg Minshall  wrote:
>
> Karl,
>
> orgtbl-query is a script for querying tables in .org files.  it doesn't
> do any special text formatting.
>
> https://gitlab.com/minshall/orqtbl-query
>
> cheers, Greg
>



[PATCH] lisp/ox-html.el: Restore org-svg class

2021-07-30 Thread Tom Gillespie
Hi,
   This patch restores the addition of class="org-svg" to svg images
during html export. Best!
Tom
From 4363eec0913ccd0d05ecf3d6346208c62d3597f8 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Fri, 30 Jul 2021 20:53:07 -0700
Subject: [PATCH] lisp/ox-html.el: Restore org-svg class.

* lisp/ox-html.el (org-html--format-image): Restore org-svg class.
d96e8975791bd3b1a5f8fdb75609d73f134dc831 removed the org-svg class
which is necessary even when using <object> tags otherwise svg images
will render at absurdly large sizes.
---
 lisp/ox-html.el | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index bd6771a76..f25a9731e 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -1707,7 +1707,9 @@ a communication channel."
 (org-html-encode-plain-text
  (org-find-text-property-in-string 'org-latex-src source))
   (file-name-nondirectory source)))
- attributes))
+ (if (string= "svg" (file-name-extension source))
+ (org-combine-plists '(:class "org-svg") attributes '(:fallback nil))
+   attributes)))
info))
 
 (defun org-html--textarea-block (element)
-- 
2.31.1



Re: Headings and Headlines

2021-07-23 Thread Tom Gillespie
I enthusiastically support changing the documentation to use heading.
I use heading in my formal grammar because I have found there are more
ways that it can be modified and remain grammatically correct when
used in english sentences. The internal implementation in elisp still
refers to headlines, but changing the docs would be a good first step.
Best!
Tom



Re: [PATCH] ob-core: tangle check library of babel after current buffer

2021-07-17 Thread Tom Gillespie
Pinging on this to see if anyone can test it so that it can be merged.
Tom

On Wed, Jun 16, 2021 at 4:29 PM Tom Gillespie  wrote:
>
> Hi,
>This is a patch that fixes tangling behavior when a block has been
> ingested into the library of babel and then modified. Best!
> Tom



Re: A requires/provides approach to linking source code blocks

2021-07-13 Thread Tom Gillespie
We have been receiving many new feature suggestions and requests
coming in for org babel. I think that Tim's suggestion is the right
one. Nearly all of these need to be implemented as an extension first
and tested independently. Further, even if this is done, it should be
clear that there is zero expectation that such extensions will be
incorporated.

Once I wrap up the formal grammar for org, one of the next things I
plan to work on is a clear specification for org babel. This is
critical because so many of the suggestions that come in deal with
individuals' specific problems and thus fail to account for how such
features interact with existing features and how the newly proposed
feature would block some other features in the future, confuse users,
etc. Such suggestions also often fail to account for increased
complexity, nor have they been exposed to a sufficient number of
examples to reveal fundamental ambiguities in how they could be
interpreted. The issues with variable behavior between ob langs for
:pre :post :prologue :epilogue etc. are already enough to keep us busy
for quite some time.

With regard to this thread in particular, it is of some interest, but
there are fundamental issues, including the fact that certain
languages (e.g. racket) expect module code to exist somewhere on the
file system. There are ways around many of these issues, in fact there
are likely many ways around any individual issue, so org babel needs
to systematically consider the issues and provide a clear
specification, or at least a guide for how such cases should be
handled.

To give an example from one of my continual pain points: I start
writing python or racket in an org src block and then I want it to be
a library so that it can be reused by other code both inside and
outside the org file without having to resort to noweb.

What is the best way to handle this? I don't know. Right now I tangle
things and resort to awful hacks for the reuse-in-this-org-file case, but
I'm guessing there is a better generic solution which would allow _any_
org block to be exported as a library instead of nowebbed in.

Before jumping for any particular suggestion for how to handle this
we need to explore the diversity of cases that various ob langs
present, so that we can find a solution that will work for all of
them. After all, packaging code to a library for reuse is an
extremely common pattern that org babel should be able to
abstract, but it is a major undertaking, not just the addition of a
keyword here and there.

In short I suggest that we issue a general moratorium on new org babel
feature suggestions until we can stabilize what we already have and
provide a clear specification for correct behavior. Until we have that spec
we could encourage users to create extensions that implement those
features.

Best,
Tom


PS The other next thing that I am working on might be another way out
for this particular feature request. Namely, it is simplifying and
extending org keyword syntax so that new keywords (with options) and
associated keywords can be specified using keyword syntax within a
single org file. This would make it possible to get useful high level
keyword behavior in a single file without burdening the core
implementation with more special cases for associated keywords, and it
would allow users to write small elisp functions that could do some of
what is suggested here, all without need to add anything to the core
org implementation.



Re: [PATCH] Allow tangling to a list of files

2021-07-07 Thread Tom Gillespie
Reading over this with the new information about the use case, it
seems that using noweb to manage the many-to-many nature of a mapping
between blocks and files is a much better way to achieve the desired
result. In addition it is already supported and does not add more
complexity to an already complex part of org.

The one area that a noweb approach does not support is dynamic
construction of files at runtime on the basis of some information that
is only available at runtime, however that does not seem to be
important for this use case.

Therefore I suggest that the tangling behavior be left 1:1 block:file,
and if there is some desire to tangle to multiple files then noweb
should be used with multiple blocks. Obviously there is some
performance penalty here. Also this doesn't help with cases where we
want to tangle to hundreds of servers using tramp, but if that is the
use case then I would suggest that that operation not be hidden behind
:tangle. Instead tangle once and then use another elisp block to write
all the files to their final destinations using tramp, ssh, or some
other means.
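
A rough sketch of that "tangle once, then distribute" step, assuming
the tangled file already exists on disk. The function name
my/copy-tangled-file is hypothetical; copy-file and make-directory are
standard Emacs functions, and remote destinations would rely on tramp
handling the paths.

```elisp
(defun my/copy-tangled-file (source destinations)
  "Copy SOURCE to each path in DESTINATIONS.
Missing parent directories are created.  Return the list of
destination paths that were written.  Hypothetical helper."
  (mapcar
   (lambda (dest)
     ;; Create the parent directory first so copy-file cannot fail on it.
     (make-directory (file-name-directory (expand-file-name dest)) t)
     ;; The third argument t overwrites an existing destination file.
     (copy-file source dest t)
     dest)
   destinations))
```

This could be called from an elisp block placed after the tangle step,
e.g. with a list of tramp paths as DESTINATIONS.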

I personally have use cases for things like this, but even so I don't
think we want the :tangle parameter to be the way to do it. I would
suggest instead that if we want to enable a tangled file to be
automatically distributed to a number of different locations that we
provide a separate header argument so that the functionality is not
conflated with the tangle functionality. I don't have a good name for
it, but the objective seems to be something like :tangle-copy-to that
accepts a function returning zero or more paths, or a list of multiple
paths (I don't recall how/whether any of the babel args deal with
accepting multiple values).

Best,
Tom



Re: Large source block causes org-mode to be unusable

2021-06-21 Thread Tom Gillespie
> That said, I think keeping 2000 lines of source code inside an
> org src block is neither a standard use case nor a reasonable idea.

I would say that it certainly is a standard use case for people who
want to keep everything in a single file (e.g. to simplify
reproducibility and avoid the mess of trying to distribute multiple
files to non-technical users). #+INCLUDE is not a substitute if you
are going to be tangling files, breaks many workflows, and as a result
should rarely be recommended as a solution when src blocks are
involved. Org should definitely be able to handle this case because
there is no reason why performance should be any worse than having a
2000 line file in another buffer.

Org babel has many basic interactivity performance pitfalls that need
to be investigated. I personally have many workarounds for bad emacs
performance degradations related to code executing in the event loop
because I need to get on with the task at hand, but they need to be
fixed, not dismissed.



[PATCH] ob-core: tangle check library of babel after current buffer

2021-06-16 Thread Tom Gillespie
Hi,
   This is a patch that fixes tangling behavior when a block has been
ingested into the library of babel and then modified. Best!
Tom
From 22d0689257f977d09b013a143e899f788b45a039 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Mon, 14 Jun 2021 19:18:28 -0700
Subject: [PATCH] ob-core: tangle check library of babel after current buffer

* lisp/ob-core.el (org-babel-expand-noweb-references): Fix order when
searching for named babel blocks so that blocks in the current buffer
are always found first. This fixes a bug where stale versions of
blocks that have been ingested into the library of babel were being
preferentially tangled instead of newly modified versions from the
current buffer.
---
 lisp/ob-core.el | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/ob-core.el b/lisp/ob-core.el
index 857e03e55..384c06c9a 100644
--- a/lisp/ob-core.el
+++ b/lisp/ob-core.el
@@ -2828,8 +2828,6 @@ block but are passed literally to the \"example-block\"."
 		 (setq cache nil)
 		 (let ((raw (org-babel-ref-resolve id)))
 		   (if (stringp raw) raw (format "%S" raw
-		;; Retrieve from the Library of Babel.
-		((nth 2 (assoc-string id org-babel-library-of-babel)))
 		;; Return the contents of headlines literally.
 		((org-babel-ref-goto-headline-id id)
 		 (org-babel-ref-headline-body))
@@ -2842,6 +2840,8 @@ block but are passed literally to the \"example-block\"."
 			  (not (org-in-commented-heading-p))
 			  (funcall expand-body
    (org-babel-get-src-block-info t))
+		;; Retrieve from the Library of Babel.
+		((nth 2 (assoc-string id org-babel-library-of-babel)))
 		;; All Noweb references were cached in a previous
 		;; run.  Extract the information from the cache.
 		((hash-table-p cache)
-- 
2.31.1



Re: colored src blocks question

2021-06-01 Thread Tom Gillespie
Hi John,
Are you perhaps missing the :extend t directive in the font spec? Best,
Tom
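
In case it helps, a minimal sketch of what I mean, assuming the
colors are being set through org-src-block-faces (if they are set on a
named face instead, the same :extend t attribute would go there). The
color values are placeholders, and :extend needs Emacs 27 or newer.

```elisp
(require 'org-src)

;; Hypothetical colors; the point is the :extend t entry in each spec,
;; which asks Emacs to extend the background past the end of the line.
(setq org-src-block-faces
      '(("emacs-lisp" (:background "#EEE2FF" :extend t))
        ("python" (:background "#E5FFB8" :extend t))))
```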

>


Re: A formal grammar for Org

2021-06-01 Thread Tom Gillespie
Hi Jakob,
Thank you for getting in touch. I had been meaning to after
someone pointed me to your repo in a reddit thread, but you beat me to
it. Replies in line. Best!
Tom

PS ccing this back to the list for the record.

On Tue, Jun 1, 2021 at 1:56 AM Jakob Schöttl  wrote:
>
> Hi Tom,
>
> I came to your post at the mailing list from here:
> https://github.com/gagbo/LuaOrgParser/issues/1
> Sorry, I don't know, how I can answer on the mailing list when I don't have 
> received the original mail.

No worries, I never managed to figure that out either so I just
subscribed. Maybe by matching the subject as you do here and ccing the
list (attempting it in this email to see what happens)?

> We have a pretty similar project, org-parser[1]. It's also written in a Lisp 
> dialect, Clojure, but it uses instaparse instead of brag as parser library.

https://github.com/tgbugs/laundry/tree/next#similar-projects I managed
to get it into my README as a reminder to myself to have a thorough
look at it, but have been occupied with other work since then.

> My idea was, to transform the formal grammar to a grammar.js for tree-sitter. 
> It would be so cool, if it could be generated from one formal specification.

Yes, that would be great. It would be a major step to have a couple of
grammars for org that can be used for stuff like this and compared to
each other, along with test cases that we can use to define correct
behavior.

One issue that I don't have a full understanding of at the
moment is how certain ambiguous forms will impact the ability to
transform directly into the tree sitter grammar.

The reason I mention
this is because I have had to move to a two phase parser in order to
deal with ambiguous parses.

Having not looked carefully at your
approach I don't know whether you have encountered similar issues. For
the tree sitter use case in particular I'm not entirely sure that the
ambiguity matters, but I haven't had a chance to look at it yet.

> Do you plan, in your parser, to do a transformation step from the raw parser 
> AST to a higher-level AST? E.g. the raw parser AST would parse a (:date  
> "2021-06-01") and the transformed AST would transform this to a higher-level 
> timestamp object.

Yes. I already do that to a certain extent in the expander
https://github.com/tgbugs/laundry/blob/next/laundry/expander.rkt (the
raw AST is hard to work with directly), but there will be more. I also
expect that I will add an intermediate step where the AST is
rearranged to account for aspects of org semantics that cannot be
captured by the context free part of the grammar.

After that step there are a number of potential conversions, one of which will
transform the AST into Racket structs, but I haven't made it quite
that far yet. That said, I think that in terms of defining a canonical
parse, I am aiming to do that in the transformed intermediate
s-expression representation because I think it will be easier to
define the correctness of certain user interactions on that form rather than
on the higher level object representation, even if the higher level
objects are ultimately used to actually implement that behavior.
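
To make the raw AST to higher-level AST step concrete, here is a small
sketch written in elisp rather than Racket. It is not how laundry's
expander works; the function name, the timestamp node shape, and the
assumption that the tree is made of proper lists are all made up for
illustration, using the (:date "2021-06-01") example from the question.

```elisp
(defun my/lift-node (node)
  "Return NODE with raw (:date \"YYYY-MM-DD\") nodes lifted.
Each such node becomes (timestamp :year Y :month M :day D).
Hypothetical transform over a tree built from proper lists."
  (cond
   ;; A raw date node: decode the string into numeric fields.
   ((and (consp node)
         (eq (car node) :date)
         (stringp (cadr node))
         (string-match
          "\\`\\([0-9]\\{4\\}\\)-\\([0-9]\\{2\\}\\)-\\([0-9]\\{2\\}\\)\\'"
          (cadr node)))
    (let ((s (cadr node)))
      (list 'timestamp
            :year (string-to-number (match-string 1 s))
            :month (string-to-number (match-string 2 s))
            :day (string-to-number (match-string 3 s)))))
   ;; Any other list: transform its children recursively.
   ((consp node) (mapcar #'my/lift-node node))
   ;; Atoms (symbols, strings, numbers) pass through unchanged.
   (t node)))
```

For example, (my/lift-node '(heading (:date "2021-06-01") "text"))
returns (heading (timestamp :year 2021 :month 6 :day 1) "text").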

> Do you have any automated tests for your parser?

Yes. See https://github.com/tgbugs/laundry/blob/next/laundry/test.rkt
you can run them from the working directory via =raco test laundry=.
I haven't fully specified the expected AST (and transforms) in most
cases because I'm still hammering out details. In some cases I do
specify the parse that I expect, e.g. for headings I specify when
tags are expected in cases where there might be some ambiguity. If you
are looking for edge cases there are a number that are not yet in the
automated tests but that are in
https://github.com/tgbugs/laundry/blob/next/laundry/cursed.org because
they hit on some cases of extreme ambiguity and internal inconsistency
in the elisp implementation or on weird behavior under user
interaction (I also have some other test cases that haven't been
committed to the repo yet).

It would be great to align the grammars and the behavior using a set
of common test cases.



Re: Empty headline titles unsupported: Bug?

2021-05-29 Thread Tom Gillespie
Hi Ihor,
Yes, happy to put my test cases into the org element cases and
vice versa. My long term plan is to come up with a set of test cases
that are unambiguous and potentially ambiguous so that we can
determine the expected behavior in those cases, so this is a great
first step. Best,
Tom



Re: Empty headline titles unsupported: Bug?

2021-05-29 Thread Tom Gillespie
Hi David,
Laundry produces a full s-expression representation of the org
parse tree (though it is still evolving). I haven't added a pass that
converts it to some Racket internal representation (probably will be
structs). If you get it installed and put #lang org at the top of an
org file you can use racket-mode to parse arbitrary org files, though
you may hit an error and will definitely encounter an
incomplete/incorrect parse since it is still a work in progress. Best,
Tom



Re: Empty headline titles unsupported: Bug?

2021-05-27 Thread Tom Gillespie
Hi all,
 Here is the 4th (or so) iteration of the grammar for titles that
I think deals with most of the issues in this thread along with a
bunch of nasty test cases. The previous attempts can be inspected in
the git history, but long story short, it is extremely hard to find a
grammar that follows the principle of least surprise and you have to
use the tokenizer to ensure that the tags pattern always parses as
such so that tags don't magically switch to being the title when you
remove the rest of the contents of the title. The final example
L1648-L1665 shows many of the things that should parse as tags and do
with this tokenizer/grammar combination. The key to dealing with the
ambiguity of empty title and tags vs something that looks like tags
but parses as a title (which is surprising) is to use the tokenizer to
greedily recognize tags at the end of the line. This ensures that the
tags pattern at the end of the line always parses as tags and doesn't
switch just because the title is empty. Happy to elaborate. Best,
Tom

https://github.com/tgbugs/laundry/blob/next/laundry/heading.rkt
https://github.com/tgbugs/laundry/blob/971cf35683cd60156868c12b070c2dd9e19d8d06/laundry/tokenizer.rkt#L98-L140

https://github.com/tgbugs/laundry/blob/971cf35683cd60156868c12b070c2dd9e19d8d06/laundry/test.rkt#L326-L367
https://github.com/tgbugs/laundry/blob/971cf35683cd60156868c12b070c2dd9e19d8d06/laundry/test.rkt#L400-L558
https://github.com/tgbugs/laundry/blob/971cf35683cd60156868c12b070c2dd9e19d8d06/laundry/test.rkt#L1298-L1369
https://github.com/tgbugs/laundry/blob/971cf35683cd60156868c12b070c2dd9e19d8d06/laundry/test.rkt#L1371-L1419
https://github.com/tgbugs/laundry/blob/971cf35683cd60156868c12b070c2dd9e19d8d06/laundry/test.rkt#L1648-L1665
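
For readers who do not want to read the Racket, the idea of
recognizing tags first can be sketched in a few lines of elisp. This
is only an approximation under stated assumptions: it ignores TODO
keywords, priorities and comments, the function name is hypothetical,
and the tag character set is a simplification rather than the laundry
tokenizer's definition.

```elisp
(require 'subr-x)

(defun my/split-heading (line)
  "Split heading LINE into a list (STARS TITLE TAGS).
Tags at the end of the line are recognized before the title, so a
heading such as \"* :tag:\" has an empty title and one tag.
Return nil if LINE is not a heading.  Hypothetical sketch."
  (when (string-match "\\`\\(\\*+\\) \\(.*\\)\\'" line)
    (let* ((stars (match-string 1 line))
           ;; Keep one leading space so the tag pattern below can
           ;; match even when nothing precedes the tags.
           (rest (concat " " (match-string 2 line)))
           tags)
      ;; Greedily claim a trailing :tag1:tag2: group as tags.
      (when (string-match "[ \t]+:\\([[:alnum:]_@#%:]+\\):[ \t]*\\'" rest)
        (setq tags (split-string (match-string 1 rest) ":" t))
        (setq rest (substring rest 0 (match-beginning 0))))
      ;; Whatever is left over is the title, possibly empty.
      (list stars (string-trim rest) tags))))
```

With this, (my/split-heading "* :tag:") returns ("*" "" ("tag")), so
removing the title text never turns the tags into a title.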



bug#48676: Arbitrary code execution in Org export macros

2021-05-26 Thread Tom Gillespie
Hi Glenn,
 The definition for local variables doesn't cover things like org
macros, though the spirit of the policy is something worth keeping in
mind. Running M-x org-export-dispatch and hitting two keys means that
the user has to do something to trigger code execution, much like they
would have to intentionally accept certain risky local variables.

That said, the fact that many org operations can run arbitrary code is
definitely something that needs clearer documentation. It might make
sense to add a setting to detect closures that appear in org files to
ask for permission before running, but it likely should not be on by
default.

For a fairly extensive discussion of code execution in org see this
thread from Nov 2020.
https://orgmode.org/list/robi94$ma$1...@ciao.gmane.io/#t
Best,
Tom





Re: execute elisp link without prompt

2021-05-21 Thread Tom Gillespie
> In the end I've set as to nil as a local variable

If you want something a bit more secure you could use a function that
checks the block name ("some-block" in this example). Best!
Tom

#+begin_src elisp
(setq-local
 org-confirm-babel-evaluate
 (lambda (_lang _body)
   (not
    (string= "some-block"
             (plist-get (cadr (org-element-at-point)) :name)))))
#+end_src

#+name: some-block
#+begin_src elisp
(message "yay!")
#+end_src

#+RESULTS: some-block
: yay!

#+name: some-other-block
#+begin_src elisp
(message "I ask to run")
#+end_src

#+RESULTS: some-other-block
: I ask to run



Re: URLs with brackets not recognised

2021-05-12 Thread Tom Gillespie
A quick fix is to percent encode the troublesome characters, but the
underlying issue is in org-link-any-re which is defined in
org-link-make-regexps which is what org uses to find the next link.
Some improvements might be possible for some of the edge cases there,
but a complete solution for bare urls is not possible due to conflicts
with native org syntax.
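
The quick fix can be sketched as follows, assuming square brackets
are the only troublesome characters in the url. The function name is
hypothetical; replace-regexp-in-string is the standard Emacs function.

```elisp
(defun my/encode-url-brackets (url)
  "Return URL with square brackets percent-encoded.
Hypothetical helper: \"[\" becomes \"%5B\" and \"]\" becomes \"%5D\"."
  (replace-regexp-in-string
   "[][]"                               ; matches either bracket
   (lambda (ch) (if (string= ch "[") "%5B" "%5D"))
   url t t))
```

For example, (my/encode-url-brackets "https://example.org/a_[b]")
returns "https://example.org/a_%5Bb%5D".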

Org doesn't handle these cases well because in some cases org's own
syntax takes priority over url syntax at the moment. Adding bare url
syntax as part of org syntax is something that could be considered.
However, I would suggest against that because it will taint any org
parser in the future by forcing it to implement full url parsing at
arbitrary positions in paragraphs, which adds a lot of complexity. I
also suggest against it because org already has clear ways to
demarcate links using <> and [[]] which are guaranteed to behave
correctly even in cases where org syntax will always take priority.
For example with
https://en.wikipedia.org/wiki/Cathedral_Basilica_of_St._John_the_Baptist_[[Savannah,_Georgia]].

> It might be worthwhile to issue an warning each time a url is written in
> an org file without enclosing brackets < > or [[ ]].

Unfortunately warning on links without < > or [[ ]] will generate
countless annoying false positives for anyone who doesn't hit this
edge case. Maybe a separate function could be added to org lint that
would not run all the time?



Re: Multiple calc commands with orgbabel

2021-05-07 Thread Tom Gillespie
Hi Bastien,
Here's a patch to make it official. :)
Tom
From 3a61289e8fa4442f6d340138dcb67b950e980212 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Thu, 6 May 2021 23:52:21 -0700
Subject: [PATCH] lisp/ob-calc.el: Add Tom Gillespie as the maintainer

* lisp/ob-calc.el: Add Tom Gillespie as the maintainer.
---
 lisp/ob-calc.el | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lisp/ob-calc.el b/lisp/ob-calc.el
index 39ebce100..520f39145 100644
--- a/lisp/ob-calc.el
+++ b/lisp/ob-calc.el
@@ -3,6 +3,7 @@
 ;; Copyright (C) 2010-2021 Free Software Foundation, Inc.
 
 ;; Author: Eric Schulte
+;; Maintainer: Tom Gillespie 
 ;; Keywords: literate programming, reproducible research
 ;; Homepage: https://orgmode.org
 
-- 
2.26.3



Re: Multiple calc commands with orgbabel

2021-05-06 Thread Tom Gillespie
Hi Bastien,
Given the short length of the file, the fact that I now have a
fairly good idea of how it works, and the fact that I share a last
name with the original author of calc, I would be happy to. I'll hunt
down the steps you mentioned for becoming an ob- maintainer and ping
back when they are done. Best!
Tom



Re: Multiple calc commands with orgbabel

2021-05-05 Thread Tom Gillespie
Here is a quick and dirty implementation that more or less does what
you want (I think). The if t would probably need to be replaced by a
value that corresponded to an option that indicated that ob-calc
should resolve all expressions on the stack. This isn't really an
issue of return value, it is due to the fact that ob-calc makes
stateful modifications to calc. If you want a stateless (idempotent?)
ob-calc block you would need to do something like this as well, and
then you would need an option to discard the additional values instead
of returning them as I do here. Best!
Tom
diff --git a/lisp/ob-calc.el b/lisp/ob-calc.el
index 39ebce100..e2102feca 100644
--- a/lisp/ob-calc.el
+++ b/lisp/ob-calc.el
@@ -48,6 +48,7 @@
   "Execute a block of calc code with Babel."
   (unless (get-buffer "*Calculator*")
 (save-window-excursion (calc) (calc-quit)))
+  (let ((unpopped 0))
   (let* ((vars (org-babel--get-vars params))
 	 (org--var-syms (mapcar #'car vars))
 	 (var-names (mapcar #'symbol-name org--var-syms)))
@@ -58,12 +59,14 @@
  vars)
 (mapc
  (lambda (line)
+   (setq unpopped (1+ unpopped)) ; ICK
(when (> (length line) 0)
 	 (cond
 	  ;; simple variable name
 	  ((member line var-names) (calc-recall (intern line)))
 	  ;; stack operation
 	  ((string= "'" (substring line 0 1))
+   (setq unpopped (- unpopped 2))
 	   (funcall (lookup-key calc-mode-map (substring line 1)) nil))
 	  ;; complex expression
 	  (t
@@ -89,9 +92,17 @@
 	 (split-string (org-babel-expand-body:calc body params) "[\n\r]"
   (save-excursion
 (with-current-buffer (get-buffer "*Calculator*")
-  (prog1
-(calc-eval (calc-top 1))
-(calc-pop 1)
+  (if t
+  (let ((out
+ (cl-loop for i from 1 to unpopped
+  do (message "%S %S" unpopped calc-stack)
+  collect (calc-eval (calc-top 1))
+  do (calc-pop 1
+(message "%S" out)
+(mapcar #'list (reverse out)))
+(prog1
+(calc-eval (calc-top 1))
+(calc-pop 1)))
 
 (defun org-babel-calc-maybe-resolve-var (el)
   (if (consp el)


Re: Multiple calc commands with orgbabel

2021-05-05 Thread Tom Gillespie
Looking at ob-calc there is a call to calc-push-list. Knowing the
length of that list (i.e. the number of arguments) it should be
possible to inspect calc-stack to retrieve the other values on the
stack from the current block. You can see this if you run M-:
calc-stack. This would probably need a specialized result type if it
were implemented. Best,
Tom

On Wed, May 5, 2021 at 8:33 AM  wrote:
>
>
> Example
>
> (require 'ob-calc)
>  (org-babel-do-load-languages
>  'org-babel-load-languages
>  '( (calc . t) )
>
>  calc.org 
>
> # To execute, place cursor point on a line, then hit "C-c * u" hard with no 
> harm.
>
> #+name: Simplifying Formulas
> #+begin_src calc
>
> simplify((x + y) (x + y)) =>
>
> simplify(a x^2 b / (c x^3 d)) =>
>
> simplify((4 x + 6) / (8 x)) =>
>
> simplify((1 + 2 i) (3 + 4 i)) =>
>
> simplify(5 + i^2 + i - 8 i) =>
>
> simplify((1, 2) + (3, 4)) =>
>
> simplify((1, 2) (3, 4)) =>
>
> #+end_src
>
>
>
> Sent: Thursday, May 06, 2021 at 3:11 AM
> From: "Matt Price" 
> To: "Org Mode List" 
> Cc: pie...@caramail.com
> Subject: Re: Multiple calc commands with orgbabel
> Can you explain how you get calc embedded mode working in org? I have never
> used it and it sounds interesting, but I don't understand what the delimiters
> are.
>
> On Wed, May 5, 2021 at 2:35 AM Eric S Fraga  wrote:
>>
>> On Wednesday,  5 May 2021 at 07:46, pie...@caramail.com wrote:
>> > Have been trying to execute multiple calc commands, but when I
>> > evaluate the calc expressions, I get just one result.
>>
>> ob-calc returns the top element of the stack when finished and this will
>> be the result of the last operation in the src block.  I don't think
>> there's any way around this.
>>
>> I use embedded Calc for this reason.  You could rewrite your equations
>> as simple lines (separated by empty lines from the surroundings) and
>> evaluate each in turn with "C-x * u":
>>
>> fsolve(x 2 + x = 4, x) => x = 1.333
>>
>> fsolve([x + y = a, x - y = b], [x, y]) => [x = a + (b - a) / 2, y = (a - b) 
>> / 2]
>>
>> I added the "=>" at the end of each expression so that the result is
>> shown to the right instead of replacing the expression itself (default
>> embedded Calc behaviour).
>>
>> --
>> : Eric S Fraga via Emacs 28.0.50, Org release_9.4.5-395-g82fbdd
>>



Re: About multilingual documents

2021-05-03 Thread Tom Gillespie
I like Aleksandar's solution quite a bit because it also works inline
e.g. as src_org[:lang de]{Meine deutsch ist zher schlect!}. In
principle this means that you could leverage the org-babel and org-src
buffer system to get flyspell results in that language in line as well
(though I don't think transporting overlays into the original buffer
has been implemented). Best!
Tom



Re: <> and ?font-lock? fly-check, ...

2021-05-03 Thread Tom Gillespie
Hi Greg,
   I just checked and it induces a syntax error, which I did not know,
but turns out to be quite useful because it means that an untangled or
incorrectly tangled file will fail to run beyond that point. Best!
Tom

On Sun, May 2, 2021 at 9:11 PM Greg Minshall  wrote:
>
> Tom, that is quite devious, actually.  thank you very much!  do you
> know, by the way, what flycheck and/or the shell make the "<<&"
> construct out to be?  cheers, Greg
>



Re: [PATCH] Fontification for inline src blocks

2021-05-02 Thread Tom Gillespie
Hi Timothy,
Another thought about this. In some languages (e.g. python) blocks
require an explicit return by default. It would be nice to be able to
set header arguments in the property drawer separately for inline
source blocks in such cases.

src_python[:prologue "x = (" :epilogue ")\nreturn x"]{1 + 2} {{{results(=3=)}}}

A quick review of ob-core and a check of the behavior suggests that
there is a concept of inline-header-args, but only for default
arguments, and that :inline-header-args:python: does not work.

Extending the concept so that inline blocks can have headers set via
property drawers separate from regular blocks seems important.
Especially because inline blocks can accidentally inherit header-args
that are incompatible (e.g. :results list). I don't think these
patches depend on that though, so probably better to deal with that
separately.

Best,
Tom



Re: [PATCH] Fontification for inline src blocks

2021-05-02 Thread Tom Gillespie
> I see. I imagine the expected behaviour of such a function would be to
> toggle org-inline-src-prettify-results and redisplay?

Yeah, see org-toggle-link-display for inspiration I think.

;;;###autoload
(defun org-toggle-link-display ()
  "Toggle the literal or descriptive display of links."
  (interactive)
  (if org-link-descriptive (remove-from-invisibility-spec '(org-link))
(add-to-invisibility-spec '(org-link)))
  (org-restart-font-lock)
  (setq org-link-descriptive (not org-link-descriptive)))
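
Following the same pattern, a toggle for the inline results could look
roughly like the sketch below. The command name is hypothetical, and I
am assuming that org-inline-src-prettify-results is the variable added
by the patch and that restarting font-lock is enough to refresh the
display.

```elisp
(require 'org)

;; Assumption: this variable comes from the patch under review.  The
;; defvar only provides a fallback and is a no-op if it already exists.
(defvar org-inline-src-prettify-results t)

(defun my/org-toggle-inline-src-prettify ()
  "Toggle prettified display of inline src block results.
Hypothetical command modelled on `org-toggle-link-display'."
  (interactive)
  ;; Note: a customized cons of delimiters is replaced by nil/t here.
  (setq org-inline-src-prettify-results
        (not org-inline-src-prettify-results))
  ;; Refontify so that the change becomes visible.
  (org-restart-font-lock))
```

A real implementation would probably want to remember a customized
delimiter pair instead of collapsing it to t.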



Re: [PATCH] Fontification for inline src blocks

2021-05-02 Thread Tom Gillespie
Hi Timothy,
   It seems to work more or less as expected. A few comments below. Best,
Tom

1. I think there needs to be a function to toggle
org-inline-src-prettify-results as there is e.g. for hyperlinks. I was
quite confused by the prettified results.

2. I'm also not sure that this approach to prettify is a good idea.
There are issues with unexpected killing/yanking and basic navigation
behavior of the prettified text which seem worse than the already
troublesome issues with hyperlinks. I'm not sure we can do anything
about this though?

3. I'm not sure about the default choice for prettified delimiters. I
see there is already a way to customize the delimiters by providing a
cons. I think a default value of '("" . "") might be a better choice
since ⟨ and ⟩ being hardcoded seems like it introduces completely
alien characters. Going with empty strings also seems consistent with
the behavior for hyperlinks.

4. There is an interaction with rainbow delimiters that there isn't an
easy solution for. I wish there was a syntax type that was "this is a
paren for electric pair mode but not for font locking."

5. I'm not sure that the faces selected for src_ and lang are the
right ones. Is there any issue with adding new faces specifically for
those rather than reusing existing faces? I thought that matching the
font locking of #+begin_src lines might make sense, but then I
realized that that doesn't make sense because that is for blocks more
generally.



Re: [POLL] Setting `org-adapt-indentation' to nil by default?

2021-05-02 Thread Tom Gillespie
Hi Nicolas,
   Sorry, I did not mean to imply that such things were not possible
currently. I was writing in the context of how to specify the current
behavior formally. As you point out they absolutely are possible. More
replies in line. Best,
Tom

> This is inaccurate.
>
> The following is a perfectly valid list.
>
> --8<---cut here---start->8---
>   1. foo
>
>  #+begin_src emacs-lisp
> (+ 1 1)
>  #+end_src
>
>   2. bar
> --8<---cut here---end--->8---

Yes. My question is about how to deal with cases like

--8<---cut here---start->8---
 1. foo

#+begin_src emacs-lisp
  (+ 1 1)
#+end_src

 2. bar
--8<---cut here---end--->8---


> Source blocks for languages that have significant whitespace should use
> the -i flag.

My known issues with switches aside, the misaligned cases are the ones
that I worry about, and I don't think being able to flag a block as
being indentation sensitive helps resolve the potential ambiguity
there.

> What makes you think this is not the case?

Sorry, my wording is unclear here. I was not talking about the current
implementation which can and does do this, but instead about how to
formally specify what should be done in such cases.



Re: [Feature request] String escaped noweb expansion

2021-05-02 Thread Tom Gillespie
Hi Sebastien,
I have encountered issues with this before when trying to noweb code
into a string that was code to be sent via ssh. I ended up switching
to use typeset -f in bash in most cases now, but that is not possible
for other languages. Some languages also have enough different types
of syntax for strings that they can work around such cases, but again,
not all do.

One potential issue with this suggestion is how it would interact with
multi-line blocks, because you can't have anything on the same
starting line as the noweb expressions since it will be repeated in
front of every subsequent line.

This would also require each org-babel lang implementation to provide
a method for correctly string-escaping the nowebbed values (in some
cases e.g. shell this is decidedly non-trivial).
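
To make the per-language requirement concrete, here is a minimal sketch
of what such an escaping method might look like. The function name and
the dispatch on a language string are hypothetical, only two languages
are covered, and the shell case assumes the value ends up inside single
quotes.

```elisp
(defun my/noweb-escape-for-string (lang body)
  "Return BODY escaped for use inside a LANG string literal.
Hypothetical helper; only two languages are sketched."
  (pcase lang
    ;; Emacs Lisp: prin1 escapes quotes and backslashes and adds the
    ;; surrounding double quotes, which are stripped again here.
    ("emacs-lisp" (substring (prin1-to-string body) 1 -1))
    ;; POSIX shell inside single quotes: close the quote, emit an
    ;; escaped quote, reopen the quote.
    ("sh" (replace-regexp-in-string "'" "'\\''" body t t))
    (_ (error "No string escaper defined for %s" lang))))
```

Even this toy version shows the problem: the correct escaping depends
not just on the language but on which kind of string the reference is
expanded into.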

With all of these things in mind, I would thus suggest not trying to
overload the noweb operator for this purpose. Having a string escaped
equivalent would be nice, but because it requires more than just a
simple copy/paste into the buffer, it seems like it probably needs
separate notation. Best,
Tom



Re: <> and ?font-lock? fly-check, ...

2021-05-02 Thread Tom Gillespie
Hi Greg,
A slightly different suggestion that doesn't break other org
processors (which might not allow users to change
org-babel-noweb-wrap- values) is to prefix the names of the blocks
with & (e.g. <<&name>>) as I do in multiple places in
https://github.com/tgbugs/pyontutils/blob/master/docs/release.org#build-release.
I found this solution while fighting with the font-locking behavior in
shell blocks. Best!
Tom



Re: [POLL] Setting `org-adapt-indentation' to nil by default?

2021-05-02 Thread Tom Gillespie
Hi Bastien,
Strong +1 here. Users can get the same visual effect without
materializing the whitespace into the file.
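
For reference, a minimal configuration sketch of that setup. I am
assuming here that the visual effect comes from org-indent-mode, which
indents in the display only.

```elisp
;; Do not insert leading whitespace under headings into the file.
(setq org-adapt-indentation nil)

;; Assumption: org-indent-mode provides the indentation visually,
;; without changing the buffer text.
(add-hook 'org-mode-hook #'org-indent-mode)
```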

Materializing the whitespace causes many potential issues with source
blocks for languages that have significant whitespace, issues with
#+begin_src and #+end_src having different levels of indentation
(still an issue if you want a block in a plain list), weirdness with
noweb, obligatory two pass parsing to get the spacing correct in
paragraphs, etc.

There are many cases where adapting indentation requires the
specification of extremely detailed heuristics that must be followed
exactly in order to get at least a consistent parse of a source block.
The only sane way forward for a language specification would be to
avoid that leading whitespace or avoid trying to specify the
interpretation of source blocks in contexts with leading whitespace
(src blocks in plain lists may come back to haunt us here).

Setting org-adapt-indentation to nil by default would be a major step
toward resolving these issues and frankly I couldn't ask for more.

Best!
Tom

PS I have included some notes on the worg/dev/org-syntax.org
file that I wrote while working on the formal grammar. I would
qualify what I wrote slightly to state that users could in principle
have leading whitespace before source blocks but that the behavior of
org in such cases would be left unspecified in the not quite nasal
demons sense, but that it might be better to have the behavior
described below with a note that no attempt to deal with correctly
preserving leading whitespace is required, user beware. A final
aside: maybe plain lists could have the #+begin_ and #+end_
lines indented to the level of the plain list but maybe not the body?
-

Eliminate leading whitespace in the canonical representation.

There are other ways that make it possible to have the indentation
visibility without adding massive complexity to the implementation.

The existing implementation can continue to support it, but any other
implementation SHALL convert indented sections to the canonical form
where there is NO leading whitespace. This eliminates the problem of
significant whitespace for everything except plain lists.

Users will need a migration path and this will require extensive
testing to make sure that the tooling catches as many of the issues as
possible. However, the benefits in the long run are vastly reduced
complexity without all the risks of accidentally botching an indent
somewhere.



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-20 Thread Tom Gillespie
Hi Sébastien,
The temp -> rename approach is good, but you should probably use
make-temp-file to create the file to reduce the risk of
collisions/race conditions. For example as (make-temp-file (concat
file-name ".tangling")).
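
A rough sketch of what I mean, not the code from the patch. The helper
name is hypothetical, and it assumes file-name is absolute so that the
temporary file is created next to the target rather than in
temporary-file-directory.

```elisp
(defun my/write-file-atomically (file-name contents)
  "Write CONTENTS to FILE-NAME via a uniquely named temporary file.
Hypothetical helper; assumes FILE-NAME is an absolute file name."
  ;; With an absolute prefix the temporary file lands in the same
  ;; directory as the target, so the rename stays on one filesystem.
  (let ((tmp (make-temp-file (concat file-name ".tangling"))))
    (condition-case err
        (progn
          (with-temp-file tmp
            (insert contents))
          ;; Replace the target only after the write has succeeded.
          (rename-file tmp file-name t))
      (error
       ;; Clean up the temporary file, then re-signal the error.
       (when (file-exists-p tmp)
         (delete-file tmp))
       (signal (car err) (cdr err))))))
```

Note that make-temp-file creates the file with restrictive permissions,
so the tangle mode still has to be set explicitly afterwards.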

I think that the location of condition-case is ok, but I wonder what
would happen if something were to fail before entering that? I think
that only a subset of the files would be tangled, but they would all
have their correct modes, so I think that that is ok.

I also think that the message to the user should probably not be
changed right now. While it might be useful for debugging, if someone
is tangling to a large number of files then the filenames/paths are
going to flood messages, so I would leave it out of this patch, and
possibly submit it as another patch for a separate discussion.

Best!
Tom



Re: Concerns about community contributor support

2021-04-20 Thread Tom Gillespie
Hi Tim, David, and Gustav,
I am fairly certain that with only a few exceptions it is possible
to specify a context free grammar for org syntax, followed by a second
pass that deals specifically with markup and a few other forms,
notably the reassembly of things like plain lists. This is possible
because most org constructs are line oriented.

Just a note that the linked parser.rkt [0] is indeed a BNF describing org
syntax in the same style as a bison/yacc grammar. One of the reasons
why I set out to work on this was precisely so that there could be a
reference that could be consulted by the community when questions
about extended org come up.

There are proposals for new syntax that appear on this list with
terrifying frequency, and they are routinely shot down or simply
ignored for good reason, however it is hard to communicate that to
enthusiastic contributors who have an immediate use case that they
want to solve and share and are unlikely to be aware of side effects.
Having a grammar where such issues can be tested empirically should
provide a significant safeguard while also making it easier for
contributors to play with the grammar and see the issues.

In all my work on the grammar I have found maybe 2 or 3 places where
the grammar could be "extended" but it isn't so much extended as it is
regularized, where some parts of org already parse a more complex
grammar while other very similar parts choose not to. Overall the cost
of not parsing certain forms in certain situations adds complexity
rather than reducing it.

The situation for contribution is also further complicated by the fact
that the elisp implementation of org mode is internally inconsistent
in its behavior with regard to the syntax, so great care has to be
taken if someone tries to make an argument based on the behavior of
one component.

All this to say that the need for a conservative approach to changes
and extensions combined with the internally inconsistent behavior of
different parts of the elisp implementation means that the
introduction of new features is extremely difficult because it is hard
to predict the consequences on other parts of org.

Overcoming this is why I started working on the grammar, because
in the absence of a formal spec for what org should do, it is very hard
to make changes to what it is currently doing without having nasty
side effects.

Best!
Tom

0. https://github.com/tgbugs/laundry/blob/next/laundry/parser.rkt note
the upcoming path change (which I will note in the original thread when
it happens).

PS I'm planning to reply to the main thread as well. My short take is that
finding a dedicated and responsive maintainer that can take over from
Bastien is a high priority. The only other thing that might help is to
have some way to track outstanding and closed patches, issues, etc.
that is more accessible than trolling through years worth of posts on
this mailing list, but that is a can of worms that has already been
shot down multiple times.



Re: [Patch] to correctly sort the items with emphasis marks in a list

2021-04-19 Thread Tom Gillespie
Hi Greg,
seq cannot be used because it is not available in older versions
of emacs that org still supports. When support for those older
versions is dropped then seq could be used. Best,
Tom



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-18 Thread Tom Gillespie
Hi Sébastien,
   Some comments while looking over this (will report back when I have
tested it out as well). This is a section of the ob export
functionality that I have been looking for on and off for quite a
while because it is responsible for some bad and insecure behavior. I
think that some of your changes may have fixed/improved this as a side
effect. I don't know whether it is worth doing anything about the
issues in this patch, but since we are here, I think they are worth
mentioning. All of the issues that I'm aware of are related to what
happens if tangling fails part way through the process. First, your
patch already fixes a major issue which is that the modes of all files
would not be set if any one of them failed to tangle. Next, during the
process the existing file is deleted prior to tangling, which means
that it cannot be restored if tangling fails, it would be better if
the old file was moved to a temporary location and then deleted on
success or replaced on failure. This likely requires wrapping the bits
that can fail in unwind-protect and restoring on failure or fully
deleting at the end of success. The next issue is that setting the
tangle mode should happen before the file is written, an empty file
should be created, the mode should then be set, the contents of the
file should be written only after the mode has been set. This involves
a bit of reordering of operations in lines 124-126 of your patch. This
ordering of operations prevents security issues related to race
conditions and potential errors being evoked during write-region
(though again, your changes already make the tangling code much more
secure by setting the modes on each file immediately after writing
instead of how it works currently where if any other block encounters
an error then no modes were set). Best!
Tom
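
To illustrate the ordering described above, here is a minimal sketch
under stated assumptions: the function name is hypothetical, it is not
taken from ob-tangle.el, and it handles a single file only.

```elisp
(defun my/tangle-write-safely (file-name contents mode)
  "Write CONTENTS to FILE-NAME with MODE, restoring the old file on failure.
Hypothetical sketch, not the ob-tangle implementation."
  (let ((backup (and (file-exists-p file-name)
                     (make-temp-file (concat file-name ".bak"))))
        (done nil))
    ;; Move the old file out of the way instead of deleting it.
    (when backup
      (rename-file file-name backup t))
    (unwind-protect
        (progn
          ;; 1. create an empty file, 2. set its mode, 3. write contents.
          (write-region "" nil file-name nil 'silent)
          (set-file-modes file-name mode)
          (write-region contents nil file-name nil 'silent)
          (setq done t))
      (if done
          (when backup
            (delete-file backup))
        ;; Failure: drop the partial file and put the old one back.
        (when (file-exists-p file-name)
          (delete-file file-name))
        (when backup
          (rename-file backup file-name t))))))
```

One wrinkle: if MODE does not include write permission for the owner,
the final write will fail and the old file is restored, so a real
implementation would need to handle read-only tangle modes separately.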

On Sun, Apr 18, 2021 at 12:23 AM Sébastien Miquel
 wrote:
>
> Hi,
>
> The attached patch modifies the ~org-babel-tangle~ function to avoid a
> quadratic behavior in the number of blocks tangled to a single file.
>
> Tangling an org buffer with 200 blocks to 5 different files yields a
> 25 % speedup.
>
>
> * lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
> collected blocks by tangled file name.
> (org-babel-tangle): Avoid quadratic behavior in number of blocks.
>
> --
> Sébastien Miquel



Re: Bug: inconsistent escaping of coderef regexp

2021-04-07 Thread Tom Gillespie
Hi Nicolas,
I've included the simplest patch I could come up with for the
divergence in behavior between org-babel-tangle-single-block and
org-link-search. I think there are two new threads that I need to
create. One is related to how to make it possible to specify what
should be removed along with the coderef (i.e. coderef prefix), the
other is the addition of header arguments that provide the same
functionality as switches. Best,
Tom

> This is already conflating the two. I'd like to solve the issue at hand
> without having header args interfere at all.
>
> This can happen later, after a discussion on the ML.

Ok. I've included the simplest version of the fix, which is to use
org-src-coderef-regexp in org-babel-tangle-single-block.
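
A small self-contained illustration of the difference. The demo
function is hypothetical, and the second regexp is only a simplified
stand-in for what org-src-coderef-regexp builds (optional whitespace,
the quoted format with a label pattern, optional whitespace, end of
line).

```elisp
(defun my/coderef-strip-demo (line cref-fmt)
  "Return LINE with its coderef removed by the old and the new regexp.
Hypothetical demo function; the result is (OLD-RESULT NEW-RESULT)."
  (let (;; Old tangling code: the format is used unquoted and only the
        ;; label itself is matched.
        (old-rx (replace-regexp-in-string "%s" ".+" cref-fmt))
        ;; Simplified stand-in for `org-src-coderef-regexp'.
        (new-rx (concat "[ \t]*"
                        (replace-regexp-in-string
                         "%s" "[-a-zA-Z0-9_ ]+" (regexp-quote cref-fmt) nil t)
                        "[ \t]*$")))
    (list (replace-regexp-in-string old-rx "" line)
          (replace-regexp-in-string new-rx "" line))))

(my/coderef-strip-demo "(+ 1 1) (ref:sum)" "(ref:%s)")
;; => ("(+ 1 1) " "(+ 1 1)")
```

The first result keeps the trailing space in front of the removed
label, the second does not.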

> Would you mind answering my questions first? I still don't follow you
> about the coderef prefix/regexp.

https://code.orgmode.org/bzg/org-mode/src/2d78ea57cfad1ddc3e993c949daf117b76315170/lisp/org-src.el#L882

That line defines a hardcoded regular expression for matching
coderefs. The coderef prefix is the first =[ \t]*= and the coderef
regexp is the equivalent to the fully formatted version of that format
string. Neither of those can currently be specified by the user. The
user should not be able to specify the coderef regexp due to the fact
that it is too easy to specify a regexp that will not work correctly
and because the format string is needed to make org-link-search work
for named coderefs (otherwise you wind up trying to replace .+ in the
coderef regexp which is a nightmare). The coderef prefix is something
that should probably be configurable by the user so that empty
comments are not left in the file. I also looked into detecting the
comment character for the language in question, but that is
significantly more difficult even using (with-temp-buffer (funcall
lang-mode) comment-start) because not all languages have sane comment
start values and comment-start is not complete, so we would need a way
to manually specify what to exclude anyway.
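
For the record, the comment-start lookup mentioned above can be sketched
as follows. The helper name is hypothetical, and the sketch only shows
the lookup, not a solution for the languages where the value is missing
or unhelpful.

```elisp
(defun my/lang-comment-start (lang-mode)
  "Return `comment-start' as set by the major mode LANG-MODE, or nil."
  (with-temp-buffer
    (funcall lang-mode)
    comment-start))

;; Typically a semicolon for Emacs Lisp.
(my/lang-comment-start #'emacs-lisp-mode)
;; Modes that define no comment syntax return nil.
(my/lang-comment-start #'fundamental-mode)
```

The nil case is the reason a manual override would still be needed.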
From c30913da6b1c8d6be3670a59ae867df019505af3 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Wed, 7 Apr 2021 12:29:01 -0700
Subject: [PATCH] lisp/ob-tangle.el: Fix coderef removal during tangling

* lisp/ob-tangle.el (org-babel-tangle-single-block): Regularize
behavior when removing coderefs during tangling. This fixes an issue
where trailing whitespace would be retained when coderefs were removed
for tangling.
---
 lisp/ob-tangle.el | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index aa0373ab8..4c0c3132d 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -414,9 +414,8 @@ non-nil, return the full association list to be used by
 	 (src-lang (nth 0 info))
 	 (params (nth 2 info))
 	 (extra (nth 3 info))
-	 (cref-fmt (or (and (string-match "-l \"\\(.+\\)\"" extra)
-			(match-string 1 extra))
-		   org-coderef-label-format))
+ (coderef (nth 6 info))
+	 (cref-regexp (org-src-coderef-regexp coderef))
 	 (link (let ((l (org-no-properties (org-store-link nil
  (and (string-match org-link-bracket-re l)
   (match-string 1 l
@@ -445,8 +444,7 @@ non-nil, return the full association list to be used by
 	(funcall assignments-cmd params))
 	  (when (string-match "-r" extra)
 		(goto-char (point-min))
-		(while (re-search-forward
-			(replace-regexp-in-string "%s" ".+" cref-fmt) nil t)
+		(while (re-search-forward cref-regexp nil t)
 		  (replace-match "")))
 	  (run-hooks 'org-babel-tangle-body-hook)
 	  (buffer-string
-- 
2.26.3



Re: Bug: inconsistent escaping of coderef regexp

2021-04-05 Thread Tom Gillespie
Missed removing a debug message. Here is the correct patch. Best,
Tom

On Sun, Apr 4, 2021 at 10:22 PM Tom Gillespie  wrote:
>
> Hi Nicolas,
>I've attached a patch with a first pass implementation that I think
> resolves most of the issues. It probably needs a few tests to go along
> with it, but I think it is the simplest way forward. I tried to make the
> changes without disrupting the org-babel info structure, but it comes
> with the cost of having to pull out :coderef-prefix in a number of separate
> contexts. Best,
> Tom
>
> > If possible, I'd like not to conflate current issue with switches
> > deprecation, which needs to be discussed separately.
>
> We can decouple them, so not an issue. The attached patch implements
> the header arg equivalents of -r and -l without making any changes to the
> existing switch behavior.
>
> > What do you mean by "it is impossible for the user to specify their own
> > coderef regexp that can be used in both cases"? In particular, what is
> > a coderef regexp in this context? I know about coderef format, but
> > I don't think users are supposed to provide a regexp here.
>
> I did a first pass implementation and realized that allowing users to
> specify coderef-regexp is a bad idea. The attached patch fixes the
> divergent behavior of org-babel-tangle-single-block and provides a
> standard way to specify a :coderef-prefix regexp so that empty
> comments can be stripped.
From 91aa10a5a14737b770e58b1a7f9f0e0b563dae62 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Sun, 4 Apr 2021 21:40:32 -0700
Subject: [PATCH] improve org-src-coderef-regexp and regularize usage

* lisp/ob-core.el
org-babel-common-header-args-w-values: new :coderef- header args
org-babel-safe-header-args: include the new :coderef- header args
(org-babel-get-src-block-info): calculate params before info in let* so
that they can be used to set the coderef-format field (nth 6 info)
(org-babel--expand-body): use coderef-prefix to correctly strip
coderefs when expanding

* lisp/ob-tangle.el (org-babel-tangle-single-block): Regularize
behavior when removing coderefs during tangling. This fixes an issue
where trailing whitespace would be retained when coderefs were removed
for tangling. Make the header argument :coderef-tangle no work the
same way that the -r switch currently works

* lisp/ol.el (org-link-search): use org babel info to match the
coderef format for each block

* lisp/org-src.el (org-src-coderef-regexp): now takes an additional
argument rx-prefix that can be used to customize the text preceding
the coderef that should be removed during tangling, this is most
useful for removing comments and trailing whitespace.

* lisp/ox.el (org-export-resolve-coderef)
and (org-export-unravel-code): use org babel info to
correctly match the coderef format for each block.

This commit adds support for three new src block header arguments,
:coderef-format :coderef-prefix and :coderef-tangle. :coderef-format
has the same behavior as the org src switch -l and :coderef-tangle
has the same behavior as org src switch -r. :coderef-prefix provides
new functionality and makes it possible to set the regexp for text
leading up to the coderef. In particular this can be used to strip
comments, which are required if authoring an org file that works with
older versions of org.
---
 lisp/ob-core.el   | 43 +--
 lisp/ob-tangle.el | 17 ++---
 lisp/ol.el| 17 +++--
 lisp/org-src.el   |  5 +++--
 lisp/ox.el| 15 +++
 5 files changed, 60 insertions(+), 37 deletions(-)

diff --git a/lisp/ob-core.el b/lisp/ob-core.el
index 2e78ac3e6..feb6f2235 100644
--- a/lisp/ob-core.el
+++ b/lisp/ob-core.el
@@ -76,7 +76,7 @@
 (declare-function org-previous-block "org" (arg  block-regexp))
 (declare-function org-show-context "org" ( key))
 (declare-function org-src-coderef-format "org-src" ( element))
-(declare-function org-src-coderef-regexp "org-src" (fmt  label))
+(declare-function org-src-coderef-regexp "org-src" (fmt  label rx-prefix))
 (declare-function org-src-get-lang-mode "org-src" (lang))
 (declare-function org-table-align "org-table" ())
 (declare-function org-table-convert-region "org-table" (beg0 end0  separator))
@@ -392,6 +392,9 @@ then run `org-babel-switch-to-session'."
 (defconst org-babel-common-header-args-w-values
   '((cache	. ((no yes)))
 (cmdline	. :any)
+(coderef-format . :any)
+(coderef-prefix . :any)
+(coderef-tangle . ((nil yes no)))
 (colnames	. ((nil no yes)))
 (comments	. ((no link yes org both noweb)))
 (dir	. :any)
@@ -434,7 +437,8 @@ Note that individual languages may define their own language
 specific header arguments as well.")
 
 (defconst org-babel-safe-header-args
-  '(:cache :colnames :comme

Re: Bug: inconsistent escaping of coderef regexp

2021-04-04 Thread Tom Gillespie
Hi Nicolas,
   I've attached a patch with a first pass implementation that I think
resolves most of the issues. It probably needs a few tests to go along
with it, but I think it is the simplest way forward. I tried to make the
changes without disrupting the org-babel info structure, but it comes
with the cost of having to pull out :coderef-prefix in a number of separate
contexts. Best,
Tom

> If possible, I'd like not to conflate current issue with switches
> deprecation, which needs to be discussed separately.

We can decouple them, so not an issue. The attached patch implements
the header arg equivalents of -r and -l without making any changes to the
existing switch behavior.

> What do you mean by "it is impossible for the user to specify their own
> coderef regexp that can be used in both cases"? In particular, what is
> a coderef regexp in this context? I know about coderef format, but
> I don't think users are supposed to provide a regexp here.

I did a first pass implementation and realized that allowing users to
specify coderef-regexp is a bad idea. The attached patch fixes the
divergent behavior of org-babel-tangle-single-block and provides a
standard way to specify a :coderef-prefix regexp so that empty
comments can be stripped.
From e017fe3f4fb36da2c8560ae526b8bdfd42dc Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Sun, 4 Apr 2021 21:40:32 -0700
Subject: [PATCH] improve org-src-coderef-regexp and regularize usage

* lisp/ob-core.el
org-babel-common-header-args-w-values: new :coderef- header args
org-babel-safe-header-args: include the new :coderef- header args
(org-babel-get-src-block-info): calculate params before info in let* so
that they can be used to set the coderef-format field (nth 6 info)
(org-babel--expand-body): use coderef-prefix to correctly strip
coderefs when expanding

* lisp/ob-tangle.el (org-babel-tangle-single-block): Regularize
behavior when removing coderefs during tangling. This fixes an issue
where trailing whitespace would be retained when coderefs were removed
for tangling. Make the header argument :coderef-tangle no work the
same way that the -r switch currently works

* lisp/ol.el (org-link-search): use org babel info to match the
coderef format for each block

* lisp/org-src.el (org-src-coderef-regexp): now takes an additional
argument rx-prefix that can be used to customize the text preceding
the coderef that should be removed during tangling, this is most
useful for removing comments and trailing whitespace.

* lisp/ox.el (org-export-resolve-coderef)
and (org-export-unravel-code): use org babel info to
correctly match the coderef format for each block.

This commit adds support for three new src block header arguments,
:coderef-format :coderef-prefix and :coderef-tangle. :coderef-format
has the same behavior as the org src switch -l and :coderef-tangle
has the same behavior as org src switch -r. :coderef-prefix provides
new functionality and makes it possible to set the regexp for text
leading up to the coderef. In particular this can be used to strip
comments, which are required if authoring an org file that works with
older versions of org.
---
 lisp/ob-core.el   | 43 +--
 lisp/ob-tangle.el | 18 +++---
 lisp/ol.el| 17 +++--
 lisp/org-src.el   |  5 +++--
 lisp/ox.el| 15 +++
 5 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/lisp/ob-core.el b/lisp/ob-core.el
index 2e78ac3e6..feb6f2235 100644
--- a/lisp/ob-core.el
+++ b/lisp/ob-core.el
@@ -76,7 +76,7 @@
 (declare-function org-previous-block "org" (arg  block-regexp))
 (declare-function org-show-context "org" ( key))
 (declare-function org-src-coderef-format "org-src" ( element))
-(declare-function org-src-coderef-regexp "org-src" (fmt  label))
+(declare-function org-src-coderef-regexp "org-src" (fmt  label rx-prefix))
 (declare-function org-src-get-lang-mode "org-src" (lang))
 (declare-function org-table-align "org-table" ())
 (declare-function org-table-convert-region "org-table" (beg0 end0  separator))
@@ -392,6 +392,9 @@ then run `org-babel-switch-to-session'."
 (defconst org-babel-common-header-args-w-values
   '((cache	. ((no yes)))
 (cmdline	. :any)
+(coderef-format . :any)
+(coderef-prefix . :any)
+(coderef-tangle . ((nil yes no)))
 (colnames	. ((nil no yes)))
 (comments	. ((no link yes org both noweb)))
 (dir	. :any)
@@ -434,7 +437,8 @@ Note that individual languages may define their own language
 specific header arguments as well.")
 
 (defconst org-babel-safe-header-args
-  '(:cache :colnames :comments :exports :epilogue :hlines :noeval
+  '(:cache :coderef-format :coderef-prefix :coderef-tangle
+   :colnames :comments :exports :epilogue :hlines :noeval
 	   :noweb :noweb-ref :noweb-sep :padline :prologue :rownames
 	   :sep :session :tangl
