branch: externals/truename-cache
commit 36ac252e5ded1a75044f112f6945e0bbba6ee6ab
Author: Martin Edström <[email protected]>
Commit: Martin Edström <[email protected]>
Doc
---
README.org | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/README.org b/README.org
index 9bad59b77b..91e7e2d794 100644
--- a/README.org
+++ b/README.org
@@ -23,6 +23,8 @@ That is unacceptably slow, at least in the use-case where you often scan a list
That's the sort of thing that might be done as part of a user command. If the command is to be pleasant to use, it must take less than 100 milliseconds so it feels "instant". And you may be dealing with not 1,000 but 10,000 or even 100,000 files.
+Sidenote for Elisp devs: It might occur to you that you can also de-dup by filesystem inodes. See [[README.org#appendix-on-referring-to-inodes-instead-of-truenames][Appendix: On referring to inodes instead of truenames]].
+
** Bonus: Merging lists
The routine =truename-cache-collect-files-and-attributes= can be used to merge multiple file lists and return de-duplicated truenames.
@@ -48,6 +50,25 @@ While you could simply let =truename-cache-collect-files-and-attributes= return
It can easily be the difference between a runtime of 2.00 seconds and 0.02 seconds!
-2. If you wanted to apply your filters to relative file names rather than absolute names ([[https://github.com/org-roam/org-roam/pull/2178][example use-case]]), you'd ordinarily have to use =(relative-file-name FILE DIR)= on every file, and that procedure isn't cheap either.
+2. If you wanted to apply your filters to relative file names rather than absolute names ([[https://github.com/org-roam/org-roam/pull/2178][example use-case]]), you'd ordinarily have to use =(file-relative-name FILE DIR)= on every file, and that isn't completely free either, keeping in mind our aforementioned 100 millisecond budget.
That's why it provides =:relative-file-deny=, =:relative-dir-deny=.
Another bottleneck dodged.
+
+** Bonus: Abbreviation
+
+Sometimes you do not want a true name but a name abbreviated with =abbreviate-file-name=. Even that can blow our aforementioned 100 millisecond budget, all by itself.
+
+So =truename-cache-collect-files-and-attributes= can pre-abbreviate names for you with the argument =:abbrev 'full=. This does it slightly more efficiently (informal benchmark: 50-75% of normal runtime), and much more efficiently if you also pass the argument =:local-name-handlers nil= (informal benchmark: 20% of normal runtime).
+
+** Appendix: On referring to inodes instead of truenames
+:PROPERTIES:
+:CUSTOM_ID: inodes
+:END:
+
+I have a theory that if de-dup is all you want, it could be possible with some loop that makes use of the function =file-attribute-file-identifier=.
+
+I've not tried that, but there are other upsides to true names.
+
+- It's a more human-friendly UI: when something needs debugging, better to see a file name than some meaningless inode number.
+
+- Once you have a list of true names, it is very friendly to further manipulation. You can use trivial string comparisons like =string-prefix-p= in place of =file-in-directory-p=.