Ihor Radchenko <[email protected]> writes:
> Rens Oliemans <[email protected]> writes:
>
>>>> There are a couple of known limitations:
>>>> - Some github gists automatically redirect appropriately (f.e.,
>>>> https://gist.github.com/4343164 redirects to
>>>> https://gist.github.com/tonyday567/4343164). gothub only supports the
>>>> "complete" url (/user/gistid). Those gist urls aren't supported yet.
>>>> - github supports two way of forcing the "raw" file, via
>>>> raw.github.com/... and via github.com/...?raw=true, I currently only
>>>> convert the first one.
>>>> - github issues pages are not rewritten, see above.
>>>
>>> I am wondering whether such unconvertable links might be captured when
>>> publishing with --debug flag.
>>
>> I am not sure what you mean with captured, do you mean "following" the
>> redirection?
>
> I mean - throw an error if we have links that cannot be converted.
> That error will be visible during the build and can be quickly fixed by
> manually altering the problematic links.
Clear. My goal is to send a message to stderr when --debug is passed and a
problematic link is found, so you can only look at stderr and easily view all
problematic links. However, just (message "...") already goes to stderr, so
the entire publish.sh output gets written to stderr, making it difficult to
see the relevant output. Do you know why?
Alternatively I can do (error "..."), which will stop the export process
immediately when a problematic link is seen (and --debug is given).
Also, is --debug already a special flag for Emacs? I do not see it in man, but
if I do
./publish.sh --debug
, an (error "...") call in my export hook properly errors and exits the
script. If I do
./publish.sh
, my (error "...") call just outputs to stderr and publish.sh continues with
the next org file. I do not do anything with command-line arg parsing at the
moment.
> A canonical extension of Elisp data is .eld
Thanks.
>> +(push "./" load-path)
>> +(require 'rewrite-urls)
>
> We can simply use (load "rewrite-urls")
> Also, it is a good practice to use a non-generic prefix. Something like
> worg-urls.
I can't get (load "...") to work, unless I do (push "./" load-path)
beforehand. From the docs of `load':
...
This function searches the directories in ‘load-path’.
Am I missing something?
>> +(add-hook 'org-export-before-parsing-functions #'add-alternative-links)
>
> Similarly, it is a good practice to prefix library function names with
> library name: worg-urls-add-alternative-links
OK.
>> +(defcustom free-alternatives '((youtube . invidious)
>> + (github . gothub))
>> + "Alist of free frontends to use. The CDR corresponds to a symbol that is
>> +known by libredirect, see https://codeberg.org/LibRedirect/instances.")
>
> First line of the docstring, by convention, should be a short, full
> sentence, briefly describing the purpose of a given variable. This is
> for the purposes of eldoc displaying nice description.
OK.
See attached patch for incorporated changes. I don't have a lot of free time
to work on this so it's a bit slow and still WIP, I'm sending it now also to
show some progress. Since it's WIP you can ignore reviewing it if you want,
I'll improve on it myself. Current limitations:
- I still don't fully understand the (error) behaviour, but I guess it is kind
of useful since you can do --debug and the program exits on an invalid gist
link (I removed these in the worg pages).
- I do (load "worg-urls-rewrite"), even though this does not work for me
without adding "./" to load-path.
- I haven't yet fixed the links that refer to the linked domain.
- The program creates .eld files in subdirectories as well, I'll look into
this.
Best,
Rens
>From 913a32de5e9f6b7e23d1f3c48c4e76b013439de0 Mon Sep 17 00:00:00 2001
From: Rens Oliemans <[email protected]>
Date: Tue, 25 Nov 2025 11:10:46 +0100
Subject: [PATCH] Add alternative links for YouTube, GitHub and Reddit
This commit finds alternative links for the hosts mentioned above
since they rely on non-free JS. See
https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/ for
context.
---
.gitignore | 1 +
org-contrib/index.org | 2 +-
org-hacks.org | 2 +-
org-tutorials/non-beamer-presentations.org | 3 +-
publish.sh | 59 ++++---
worg-urls-rewrite.el | 196 +++++++++++++++++++++
6 files changed, 233 insertions(+), 30 deletions(-)
create mode 100644 worg-urls-rewrite.el
diff --git a/.gitignore b/.gitignore
index c9dc5b62..c572b901 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,3 +4,4 @@
patches/
/.project
*patch
+*.eld
diff --git a/org-contrib/index.org b/org-contrib/index.org
index 16b8c95d..89051927 100644
--- a/org-contrib/index.org
+++ b/org-contrib/index.org
@@ -430,7 +430,7 @@ See [[file:../exporters/index.org][Exporters]].
- [[https://github.com/eschulte/org-S5][org-S5]] from Eric Schulte
- [[https://github.com/relevance/org-html-slideshow.git][org-html-slideshow]] from Stuart Sierra
- - [[https://gist.github.com/509761][org-html5presentation.el]] from kinjo
+ - [[https://gist.github.com/takumikinjo/509761][org-html5presentation.el]] from kinjo
For use with the new exporter, /ox-s5.el/ ([[contribfile:lisp/ox-s5.el][link to raw file)]] and
/ox-deck.el/ ([[contribfile:lisp/ox-deck.el][link to raw file]]), both by Rick Frankel, are
diff --git a/org-hacks.org b/org-hacks.org
index 63b36d7e..62aa6094 100644
--- a/org-hacks.org
+++ b/org-hacks.org
@@ -67,7 +67,7 @@ Here is a hook function to use archive this effect:
*** Picking up a random task in the global TODO list
-Tony day [[http://mid.gmane.org/[email protected]][shared]] [[https://gist.github.com/4343164][this gist]] to pick up a
+Tony day [[http://mid.gmane.org/[email protected]][shared]] [[https://gist.github.com/tonyday567/4343164][this gist]] to pick up a
random task.
** Building and Managing Org
diff --git a/org-tutorials/non-beamer-presentations.org b/org-tutorials/non-beamer-presentations.org
index 33ea005e..e57e4a92 100644
--- a/org-tutorials/non-beamer-presentations.org
+++ b/org-tutorials/non-beamer-presentations.org
@@ -35,7 +35,8 @@ documents.
- org-html5presentation :: Is an Exporter of Org-mode documents to
HTML5 slide show presentations.
- (see [[https://gist.github.com/509761][gist.github.com/509761]] for code and usage information)
+ (see [[https://gist.github.com/takumikinjo/509761][gist.github.com/takumikinjo/509761]] for code and usage
+ information)
- [[org-tree-slide]] :: is a simple tool to treat a tree of an org buffer as
a single slide. Since each slide is displayed by simple narrowing,
diff --git a/publish.sh b/publish.sh
index b499de3d..6a439655 100755
--- a/publish.sh
+++ b/publish.sh
@@ -14,6 +14,8 @@ This variable can be set when running publish.sh script:
(require 'htmlize)
(require 'org-inlinetask)
+(load "worg-urls-rewrite")
+
(setq make-backup-files nil
debug-on-error t)
@@ -58,30 +60,33 @@ This variable can be set when running publish.sh script:
(R . t)
(gnuplot . t)))
-(dolist (org-file (cl-remove-if
- (lambda (n) (string-match-p "worg/archive/" n))
- (directory-files-recursively default-directory "\\.org$")))
- (let ((html-file (concat (file-name-directory org-file)
- (file-name-base org-file) ".html")))
- (if (and (file-exists-p html-file)
- (file-newer-than-file-p html-file org-file)
- ;; If there are include files or code, we need to
- ;; re-generate the HTML just in case if the included
- ;; files are changed.
- (with-temp-buffer
- (insert-file-contents org-file)
- (and
- (save-excursion
- (goto-char (point-min))
- (not (re-search-forward "#\\+include:" nil t)))
- (save-excursion
- (goto-char (point-min))
- (not (re-search-forward "#\\+begin_src" nil t))))))
- (message " [skipping] unchanged %s" org-file)
- (message "[exporting] %s" (file-relative-name org-file default-directory))
- (with-current-buffer (find-file org-file)
- (if worg-publish-stop-on-error
- (org-html-export-to-html)
- (condition-case err
- (org-html-export-to-html)
- (error (message (error-message-string err)))))))))
+(add-hook 'org-export-before-parsing-functions #'worg-urls-add-alternative-links)
+(let ((export-all (worg-urls-libredirect-data-changed-p worg-urls-libredirect-data)))
+ (dolist (org-file (cl-remove-if
+ (lambda (n) (string-match-p "worg/archive/" n))
+ (directory-files-recursively default-directory "\\.org$")))
+ (let ((html-file (concat (file-name-directory org-file)
+ (file-name-base org-file) ".html")))
+ (if (and (not export-all)
+ (file-exists-p html-file)
+ (file-newer-than-file-p html-file org-file)
+ ;; If there are include files or code, we need to
+ ;; re-generate the HTML just in case if the included
+ ;; files are changed.
+ (with-temp-buffer
+ (insert-file-contents org-file)
+ (and
+ (save-excursion
+ (goto-char (point-min))
+ (not (re-search-forward "#\\+include:" nil t)))
+ (save-excursion
+ (goto-char (point-min))
+ (not (re-search-forward "#\\+begin_src" nil t))))))
+ (message " [skipping] unchanged %s" org-file)
+ (message "[exporting] %s" (file-relative-name org-file default-directory))
+ (with-current-buffer (find-file org-file)
+ (if worg-publish-stop-on-error
+ (org-html-export-to-html)
+ (condition-case err
+ (org-html-export-to-html)
+ (error (message (error-message-string err))))))))))
diff --git a/worg-urls-rewrite.el b/worg-urls-rewrite.el
new file mode 100644
index 00000000..95c903e0
--- /dev/null
+++ b/worg-urls-rewrite.el
@@ -0,0 +1,196 @@
+(defvar worg-urls-libredirect-url "https://raw.githubusercontent.com/libredirect/instances/main/data.json"
+ "Location where up-to-date libredirect data is found")
+(defvar worg-urls-libredirect-data nil
+ "Contents of libredirect data, obtained from `worg-urls-libredirect-url'.")
+
+(defcustom worg-urls-free-alternatives '((youtube . invidious)
+ (github . gothub))
+ "Alist of alternative frontends to websites containing non-free JS.
+The CDR corresponds to a symbol that is known by libredirect, see
+https://codeberg.org/LibRedirect/instances.")
+(defcustom worg-urls-redirect-save-filename "redirect-local-data.eld"
+ "Filename used to save the relevant output of libredirect's json")
+
+
+(defun worg-urls-libredirect-data-changed-p (&optional data)
+ "Whether or not libredirect data changed since our last export."
+ (setq worg-urls-libredirect-data (or data (-get-libredirect-data)))
+ (if (not (file-exists-p worg-urls-redirect-save-filename))
+ ;; We hadn't saved stuff from our previous export, do so now.
+ (progn
+ (-save-redirect-data worg-urls-libredirect-data)
+ t)
+ ;; Compare results from stored file with data, overwrite stored data if newer
+ (let* ((stored-data (with-temp-buffer
+ (insert-file-contents worg-urls-redirect-save-filename)
+ (read (buffer-string))))
+ (data-equal (equal (-relevant-redirect-data worg-urls-libredirect-data)
+ stored-data)))
+ (unless data-equal
+ (-save-redirect-data worg-urls-libredirect-data))
+ (not data-equal))))
+
+
+(defun -get-libredirect-data ()
+ "Returns libredirect data from github"
+ (condition-case _
+ (let ((response (with-current-buffer
+ (url-retrieve-synchronously worg-urls-libredirect-url nil nil 5)
+ (prog2
+ (re-search-forward "\n\n" nil t) ; skip HTTP headers
+ (buffer-substring-no-properties (point) (point-max))
+ (kill-buffer)))))
+ (json-parse-string response :object-type 'alist :array-type 'list))
+ (t
+ (message "Worg couldn't access libredirect data, using original URLs.")
+ nil)))
+
+
+(defun -save-redirect-data (&optional data)
+ "Saves redirect data in a file with filename `worg-urls-redirect-save-filename'."
+ (let ((tosave (or data worg-urls-libredirect-data)))
+ (with-temp-file worg-urls-redirect-save-filename
+ (prin1
+ (-relevant-redirect-data data)
+ (current-buffer)))))
+
+
+(defun -relevant-redirect-data (data)
+ "Extracts relevant redirection data, defined in `worg-urls-free-alternatives'."
+ (seq-map (lambda (association)
+ (let ((from (car association)))
+ `(,from . ,(-first-link-of-json data from))))
+ worg-urls-free-alternatives))
+
+(defun -is-short-gist-path-p (link)
+ "Tests whether a link links to a gist without a user id.
+These redirect to a \"full\" gist.github.com/<user>/<id> link and at the
+moment have to be fixed manually - gothub does not support these short
+links."
+ (string-match (rx "//gist.github.com/" (1+ (in digit))) link))
+
+(defun worg-urls-add-alternative-links (&optional _)
+ "Add alternative links for websites containing non-free JS.
+For each link that has an alternative (currently YouTube and GitHub), we
+insert a link to the free alternative, and change the link text of the
+original link to to =(original URL)=. We also redirect reddit links to
+old.reddit.com.
+
+See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/ "
+ (unless worg-urls-libredirect-data
+ (setq worg-urls-libredirect-data (-get-libredirect-data)))
+
+ (let ((links (org-element-map (org-element-parse-buffer) 'link #'identity)))
+ (dolist (link (nreverse links))
+ (when (string-prefix-p "http" (org-element-property :type link))
+ (let ((path (org-element-property :path link)))
+ (if-let* ((new-url (-find-replacement path)))
+ (let* ((begin (org-element-property :begin link))
+ (contents-begin (org-element-property :contents-begin link))
+ (contents-end (org-element-property :contents-end link)))
+ (save-excursion
+ (if (and contents-begin contents-end)
+ ;; Link has description
+ (let ((description (buffer-substring contents-begin contents-end)))
+ ;; Change text of original link to (original URL)
+ (goto-char contents-begin)
+ (delete-region contents-begin contents-end)
+ (insert "(original URL)")
+ ;; Insert new link with free alternative
+ (goto-char begin)
+ (insert (format "[[%s][%s]] " new-url description)))
+ ;; else, "bare" link without description. We insert
+ ;; the original link as bare, and add the old one
+ ;; with (original URL) as description.
+ (let ((end (org-element-property :end link)))
+ (goto-char end)
+ (insert (format "][%s]]" "(original URL)"))
+ (goto-char begin)
+ (insert (format "[[%s]] [[" new-url))))))
+ ;; no replacement found. check if the link is a short
+ ;; github gist, this can be fixed manually
+ (when (-is-short-gist-path-p path)
+ (error (format "Please replace URL %s in %s with its redirect target"
+ path (buffer-file-name))))))))))
+
+
+(defun -find-replacement (path)
+ "Takes PATH and returns an alternative link if known and nil otherwise."
+ (and worg-urls-libredirect-data
+ (pcase path
+ ;; See https://docs.invidious.io/redirector/
+ ;; This matches both
+ ;; - youtube.com/watch?v=<video-id>
+ ;; - youtube.com/playlist?list=<playlist-id>
+ ;; with optionally ? and = escaped
+ ((rx "//" (? "www\.")
+ "youtube.com"
+ (group
+ (or (and "/watch"
+ (or "?" "%3F")
+ "v"
+ (or "=" "%3D"))
+ (and "/playlist"
+ (or "?" "%3F")
+ "list"
+ (or "=" "%3D"))))
+ (group (+ not-newline)))
+ (let* ((route (match-string 1 path))
+ (id (match-string 2 path))
+ (host (-first-link-of-json worg-urls-libredirect-data 'youtube)))
+ (concat host route id)))
+ ;; youtu.be/<video-id>
+ ((rx "//" (? "www\.")
+ "youtu.be"
+ "/"
+ (group (+ not-newline)))
+ (let* ((video-id (match-string 1 path))
+ (host (-first-link-of-json worg-urls-libredirect-data 'youtube)))
+ (concat host "/watch?v=" video-id)))
+ ;; gothub supports all links currently used in Worg, except for
+ ;; /<user>/<repo>/<issues>, so we exclude those.
+ ((and
+ (rx "//" (? "www\.")
+ "github.com"
+ "/"
+ (group (+ not-newline)))
+ ;; Exclude issues path. This approach currently also would
+ ;; exclude a supported link if author or repo starts with
+ ;; "issues"
+ (guard (not (string-match-p "/issues" path))))
+ (let* ((url (match-string 1 path))
+ (host (-first-link-of-json worg-urls-libredirect-data 'github)))
+ (concat host "/" url)))
+ ;; gist.github.com links can have the format
+ ;; gist.github.com/<user>/<id>, but also obsolte
+ ;; gist.github.com/<id>. These last links are automatically
+ ;; redirected on github.com, but we do not know where to
+ ;; redirect them to at this moment, so we cannot convert
+ ;; these.
+ ((rx "//" "gist.github.com/"
+ ;; user
+ (group (+ "[^/$]"))
+ ;; id
+ (group (+ not-newline)))
+ (let* ((url (match-string 1 path))
+ (host (-first-link-of-json worg-urls-libredirect-data 'github)))
+ (concat host "/gist/" url)))
+ ;; redirect reddit.com to old.reddit.com
+ ((rx "//" (? "www\.")
+ "reddit.com"
+ "/"
+ (group (+ not-newline)))
+ (let* ((url (match-string 1 path))
+ (host "https://old.reddit.com"))
+ (concat host "/" url))))))
+
+(defun -first-link-of-json (data host)
+ "Gets the first available link of the alternative links for HOST.
+This is based on the json from
+https://codeberg.org/LibRedirect/instances/src/branch/main/data.json."
+ (let* ((alternative (cdr (assoc host worg-urls-free-alternatives)))
+ (object (cdr (assoc alternative worg-urls-libredirect-data)))
+ (links (cdr (assoc 'clearnet object))))
+ (car links)))
+
+(provide 'worg-urls-rewrite)
--
2.50.1