Nicolas Goaziou <m...@nicolasgoaziou.fr> writes:
> Please note that those short answers did not help me much. So I did my
> homework and looked at your code. I didn't test it thoroughly, so I may
> be missing something.

It's a pity to hear that I wasn't able to suitably clarify things in my
reply. Thank you for being willing to investigate my implementation.

> Now, here's the elephant in the room: "puny.el" was included in Emacs
> 26.1. Org cannot make use of it yet.

Gah.

> Also, the bootstring algorithm, and yours, are very much
> English-centered, as can attest
> `org-reference-contraction-stripped-words'. I insisted on non-latin
> languages for a reason:
>
>   (org-reference-contraction "こんにちは") => "28j2a3ar1p-"
>
> or, for a not so long title
>
>   (org-reference-contraction "こんにちは コンニチハ") => "v8ttbvbva7si998jvba0bzb0m-"
>
> which is arguably worse than "org1234567".

Mmmm. This isn't great. I preferred the output of Unidecode (ASCII
transliteration) mentioned previously, but that doesn't look like it
could easily be used.

>>> references are guaranteed to be unique in the document;
>>
>> The suffixed number I mentioned ensures this.
>
> Unfortunately, because of them, you cannot guarantee stable links during
> export, much like random references.
>
> For example, if you first export
>
>   * Foo
>     bar
>
> and if you later modify your document like this
>
>   * Foo
>     baz
>   * Foo
>     bar
>
> your link will now point to the "baz" contents instead of "bar".
>
> As a side note, this the reason why I introduced randomness in
> references in the first place. We cannot reference first headline as
> "headline-1", second one as "headline-2", i.e., in a monotonic way,
> because we cannot assume their order is fixed.

From this I take it you'd rather a broken reference than an incorrect
one? I don't think there's any "good" solution here, just pick your
poison (and, no surprise, I prefer my way).

> More importantly, the above is not limited to headlines with the exact
> same title.
> Since your algorithm truncates output, this will happen in
> various, less obvious, situations.

While this is technically possible, I think it's worth noting that I
have never seen this in practice, and for reference I have documents
with hundreds of headings (250 in my config, for example).

>>> Also, header content is not stable enough: when you're linking to the
>>> custom ID, you may be able to change the title and yet preserve the
>>> link.
>>
>> Custom IDs still work, so I don't quite see the point here.
>
> How can you be sure?
>
> The point is that in some export back-ends, e.g., ASCII, you will only
> provide a single reference for a headline, i.e., not one for the title
> and another one for the custom ID. If your reference is based solely on
> the title, the reference will break whenever you modify the title
> without touching custom ID. I gave an example in an earlier post
> already. This is a regression wrt the current system.

I remain rather confused on this point. Say I have a document with the
following content:

  * Some heading
  :PROPERTIES:
  :CUSTOM_ID: hey
  :END:

  See [[#hey]] or [[Some heading]]

In an HTML export I see:

  <li><a href="#hey">1. Some heading</a></li>
  [...]
  See <a href="#hey">1</a> or <a href="#hey">1</a></p>

In an ASCII export:

  1 Some heading
  ══════════════

  See 1 or 1

In a LaTeX export:

  \section{Some heading}
  \label{hey}
  See \ref{hey} or \ref{hey}

etc.

I don't see how my code affects custom IDs.

> In a nutshell:
>
> - there are very interesting points in your proposal;

Glad you've found some things of interest.

> - it is not applicable at the moment;

I'm guessing this is solely due to punycode?

> - it greatly improves references for English language, it is slightly
>   better for latin languages, and worse for non-latin ones;
>
> - it does not guarantee link stability during export;

Indeed. However no approach that doesn't cache every heading with every
export does, and I find this /significantly/ improves stability.
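
To make the stability concern concrete, here is a minimal Python sketch
of the "Foo" example quoted earlier. It is not the Emacs Lisp
implementation under discussion: `suffixed_refs` and its slug rule are
hypothetical stand-ins, and it assumes duplicates are numbered in
document order with the first occurrence left unsuffixed.

```python
from collections import Counter


def suffixed_refs(titles):
    """Return one reference per title, in document order.

    Hypothetical scheme: the first occurrence of a title gets a bare
    slug, later duplicates get "-2", "-3", ... appended.
    """
    seen = Counter()
    refs = []
    for title in titles:
        slug = title.lower().replace(" ", "-")
        seen[slug] += 1
        count = seen[slug]
        refs.append(slug if count == 1 else f"{slug}-{count}")
    return refs


# First export: a single "Foo" headline whose body is "bar".
before = dict(zip(suffixed_refs(["Foo"]), ["bar"]))

# Later export: a new "Foo" headline (body "baz") is inserted above it.
after = dict(zip(suffixed_refs(["Foo", "Foo"]), ["baz", "bar"]))

print(before["foo"])   # bar
print(after["foo"])    # baz
print(after["foo-2"])  # bar
```

An old link to "foo" still resolves after the edit, but to the wrong
headline; that is the "incorrect rather than broken" trade-off. In the
sketch it only arises when two headings contract to the same reference.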
> - it introduces a regression wrt custom ID.

See my confusion above.

> Link stability is still an issue, even if the proposal gives a false
> sense of security in that area. I don't think we can solve it without
> creating a cache for export, where you store all previous references for
> a given file. Even this is not sufficient, because you can export
> buffers not attached to files.

To me this is a case of "don't let the perfect be the enemy of the
good". While I do see that a false sense of security may be
problematic, I consider the benefits to outweigh this.

I hope you've found this reply more useful than my last,

Timothy.