Max Nikulin <maniku...@gmail.com> writes:

> On 10/08/2022 18:43, Ihor Radchenko wrote:
>> Ihor Radchenko writes:
>>
>> I have updated the patch to use "__/id" when id is too short.
>> Any objections?
>>
>> +When ID is too short (less than 3 chars), use its md5 hash to create
>
> Misleading docs, you do not use md5.

Oops. Will fix.

>> +the path."
>> +  (if (< (length id) 3)
>> +      (format "--/%s" id)
>
> Please, do not use path components starting with dash, it is terrible
> for CLI tools. By the way, you promised underscores, not dashes.

Why? I have no opinion about the possible dummy folder name, except that
it should fit the general pattern we already have: "xx/..." or
"YYYYMM/...".

>> +    (format "%s/%s"
>> +            (substring id 0 2)
>> +            (substring id 2))))
>
> Ihor, I have not look into the code around, so my suggestions may have
> no sense.
>
> Is it possible to pass empty string as ID to these functions? Should it
> be explicitly checked?

It should not be possible under normal operation.

> What if ID contains "/" that can not be used in file name? Windows has
> more forbidden characters.

Then, creating the attachment dir will fail in front of the user.
I am not sure if we really need to do much about this. At least, not
until we get a bug report.

If we really need to solve the edge cases with attach dir generation, we
may need to change the overall design in org-attach/org-id to something
more restrictive. I am not sure if it is worth the effort compared to
other directions for improving Org.

> Do you expect any problem if here (and for timestamp-based ids)
> directory component is just padded with some character (unsure what can
> better than underscore) to required length?
>
> "x" -> "_x/x" or "______x/x"
> "xy" -> "xy/xy" or "_____xy/xy"
> etc.
>
> From my point of view it might be a bit better than "__" and "unknown".

Does it make sense?

Timestamp-based IDs are like 20220810T210121.478800 by default. Then,
the top-level folders will be like 202208/... If the actual format is
different, we cannot possibly expect the timestamp to follow the same
pattern, which is why I chose "unknown", referring to an unknown date.

On the other hand, if an ID has more than 6 characters, it will generate
a nonsense top-level folder with or without the patch.
We could go further and match the ID against [1-9][0-9]\{5\} and put the
folder into "unknown/ID" instead, but that would be a breaking change.

In summary, I am more or less neutral towards this fallback format,
except that it would be nice to be able to recover the ID from the
directory path. Padding should be recoverable. I slightly dislike the
"___xx" compared to "______" because it will create a proliferation of
top-level folders as opposed to cramming the non-standard IDs into a
single "______" folder.

Attaching the updated patch.

From 50d0f9de0acdf5d67b797476816cbeb40b19f554 Mon Sep 17 00:00:00 2001
Message-Id: <50d0f9de0acdf5d67b797476816cbeb40b19f554.1660137585.git.yanta...@gmail.com>
From: Ihor Radchenko <yanta...@gmail.com>
Date: Sat, 23 Jul 2022 13:13:24 +0800
Subject: [PATCH v3] org-attach-dir-from-id: Do not rely on ID being over 6 chars long

* lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to
"__/ID" when the ID contains 2 chars or less and cannot be split into
the "xy/z...." path.
(org-attach-id-ts-folder-format): Fall back to "______/ID" path format
when the ID contains less than 7 chars and cannot be split into the
"YYYYMM/rest" path.

Fixes
https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me
---
 lisp/org-attach.el | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/lisp/org-attach.el b/lisp/org-attach.el
index fe49af6f3..0f5d5af82 100644
--- a/lisp/org-attach.el
+++ b/lisp/org-attach.el
@@ -159,19 +159,27 @@ (defcustom org-attach-archive-delete nil
 (defun org-attach-id-uuid-folder-format (id)
   "Translate an UUID ID into a folder-path.
 Default format for how Org translates ID properties to a path for
-attachments. Useful if ID is generated with UUID."
-  (format "%s/%s"
-          (substring id 0 2)
-          (substring id 2)))
+attachments. Useful if ID is generated with UUID.
+
+When ID is too short (less than 3 chars), return \"__/ID\"."
+  (if (< (length id) 3)
+      (format "__/%s" id)
+    (format "%s/%s"
+            (substring id 0 2)
+            (substring id 2))))
 
 (defun org-attach-id-ts-folder-format (id)
   "Translate an ID based on a timestamp to a folder-path.
 Useful way of translation if ID is generated based on ISO8601
 timestamp. Splits the attachment folder hierarchy into
-year-month, the rest."
-  (format "%s/%s"
-          (substring id 0 6)
-          (substring id 6)))
+year-month, the rest.
+
+When ID is too short (less than 7 chars), return \"______/ID\"."
+  (if (< (length id) 7)
+      (format "______/%s" id)
+    (format "%s/%s"
+            (substring id 0 6)
+            (substring id 6))))
 
 (defcustom org-attach-id-to-path-function-list
   '(org-attach-id-uuid-folder-format org-attach-id-ts-folder-format)
-- 
2.35.1

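To illustrate what the fallback produces and why the ID stays recoverable from the directory path, here is a minimal sketch that can be evaluated in a scratch buffer. All `my/` names are hypothetical and not part of Org: the two formatters only copy the logic of the patch above, and `my/id-from-folder` is an illustrative helper that assumes the ID contains no "/" and that no ID of regular length begins with the underscore placeholder itself.

```elisp
;;; Sketch only.  The `my/' names are hypothetical, not Org API.

(defun my/uuid-folder-format (id)
  "Return \"xy/rest\" for ID, or \"__/ID\" when ID has fewer than 3 chars.
Same logic as the patched `org-attach-id-uuid-folder-format'."
  (if (< (length id) 3)
      (format "__/%s" id)
    (format "%s/%s" (substring id 0 2) (substring id 2))))

(defun my/ts-folder-format (id)
  "Return \"YYYYMM/rest\" for ID, or \"______/ID\" when ID has fewer than 7 chars.
Same logic as the patched `org-attach-id-ts-folder-format'."
  (if (< (length id) 7)
      (format "______/%s" id)
    (format "%s/%s" (substring id 0 6) (substring id 6))))

(defun my/id-from-folder (path)
  "Recover the ID from PATH as produced by the two functions above.
Assumption: PATH contains a \"/\", the ID itself does not, and an
all-underscore first component always means the fallback was used."
  (let* ((sep (string-match "/" path))
         (prefix (substring path 0 sep))
         (rest (substring path (1+ sep))))
    (if (string-match-p "\\`_+\\'" prefix)
        ;; Fallback case: the whole ID is stored after the placeholder.
        rest
      ;; Regular case: the ID was split in two; join the halves again.
      (concat prefix rest))))

(my/uuid-folder-format "xy")                    ; "__/xy"
(my/uuid-folder-format "xyz")                   ; "xy/z"
(my/ts-folder-format "abcdef")                  ; "______/abcdef"
(my/ts-folder-format "20220810T210121.478800")  ; "202208/10T210121.478800"
(my/id-from-folder "__/xy")                     ; "xy"
```

With a single placeholder folder, all non-standard IDs end up under "__" or "______", and the round trip back to the ID is a matter of checking whether the first path component consists of underscores only.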
-- 
Ihor Radchenko,
Org mode contributor,
Learn more about Org mode at https://orgmode.org/.
Support Org development at https://liberapay.com/org-mode,
or support my work at https://liberapay.com/yantar92