Ihor Radchenko <[email protected]> writes:

> Then, what about presenting two links in the static html file:
> alternative and original? JS may hide either for most users, while users
> with JS disabled will see both and can choose whichever they prefer to
> (or can) open.

Consider the attached patches for Worg. I'm still learning (e)lisp, please
don't hesitate with feedback, either for details or higher level.

I took a slightly different approach than suggested elsewhere in the thread
(replacing [[https://youtube.com/...]] links with [[youtube:...]]). Instead I
add an export hook that edits relevant links in the buffer before export based
on Rx. It seems a bit more hacky, but perhaps it only seems so since I'm not
that familiar with the Emacs way (editing buffers). It has the benefit of not
having to modify worg source, but doesn't overcome the same limitation
(publish.sh only looks at modified files).

I don't have a JS program that shows only one of the two options yet.

You can try this version of worg on https://files.rensoliemans.nl/worg/, but I
promise nothing about keeping it up-to-date or reliably online, it's just to
see how these alternative links look.

The links that I convert are:
    - youtube.com / youtu.be
    - [gist.]github.com. Excluding github.com/u/r/issues, since gothub does
    not support issues. gothub also doesn't support pull requests, but Worg
    does not link to a pull request page, so I ignored that.
    - [old.]reddit.com

There are a couple of known limitations:
    -  Some github gists automatically redirect appropriately (f.e.,
    https://gist.github.com/4343164 redirects to
    https://gist.github.com/tonyday567/4343164). gothub only supports the
    "complete" url (/user/gistid). Those gist urls aren't supported yet.
    - github supports two way of forcing the "raw" file, via
    raw.github.com/... and via github.com/...?raw=true, I currently only
    convert the first one.
    - github issues pages are not rewritten, see above.

Note that there are some invalid github links on worg unrelated to this
rewriting, I will clean those up whenever I find them.

>From abddfa1d2aa5f1f72ea0b56ec9b14281de170888 Mon Sep 17 00:00:00 2001
From: Rens Oliemans <[email protected]>
Date: Tue, 25 Nov 2025 11:10:46 +0100
Subject: [PATCH 1/3] Find alternative links for YouTube, GitHub and Reddit

This commit finds alternative links for the hosts mentioned above
since they rely on non-free JS. See
https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/ for
context.
---
 publish.sh | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/publish.sh b/publish.sh
index b499de3d..642f03b3 100755
--- a/publish.sh
+++ b/publish.sh
@@ -20,6 +20,95 @@ This variable can be set when running publish.sh script:
 (push '(:eval . "no-export") org-babel-default-header-args)
 (push '(:eval . "no-export") org-babel-default-inline-header-args)
 
+(setq libredirect-url "https://raw.githubusercontent.com/libredirect/instances/main/data.json";
+      libredirect-data nil)
+
+(defun add-alternative-links (backend)
+  "Add alternative links for websites containing non-free JS.
+See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/";
+  (unless libredirect-data
+    (let ((response (with-current-buffer
+		      (url-retrieve-synchronously libredirect-url)
+		      (prog2
+			  (re-search-forward "\n\n" nil t) ; skip HTTP headers
+			  (buffer-substring-no-properties (point) (point-max))
+			(kill-buffer)))))
+      (setq libredirect-data (json-parse-string response :object-type 'alist :array-type 'list))))
+  
+  (org-element-map (org-element-parse-buffer) 'link
+    (lambda (link)
+      (if (string-prefix-p "http" (org-element-property :type link))
+	  (let ((replacement-string (find-replacement (org-element-property :path link))))
+	    replacement-string)
+	nil))))
+
+(defun first-link-of-json (data host)
+  (let* ((object (cdr (assoc host libredirect-data)))
+	 (links (cdr (assoc 'clearnet object))))
+    (car links)))
+
+(defun find-replacement (path)
+  (pcase path
+    ;; See https://docs.invidious.io/redirector/
+    ;; This matches both
+    ;; - youtube.com/watch?v=<video-id>
+    ;; - youtube.com/playlist?list=<playlist-id>
+    ;; with optionally ? and = escaped
+    ((rx "//" (? "www\.")
+	 "youtube.com"
+	 (group
+	  (or (and "/watch"
+		   (or "?" "%3F")
+		   "v"
+		   (or "=" "%3D"))
+	      (and "/playlist"
+		   (or "?" "%3F")
+		   "list"
+		   (or "=" "%3D"))))
+	 (group (+ not-newline)))
+     (let* ((route (match-string 1 path))
+	    (id (match-string 2 path))
+	    (host (first-link-of-json libredirect-data 'invidious)))
+       (concat host route id)))
+    ;; youtu.be/<video-id>
+    ((rx  "//" (? "www\.")
+	  "youtu.be"
+	  "/"
+	  (group (+ not-newline)))
+     (let* ((video-id (match-string 1 path))
+	    (host (first-link-of-json libredirect-data 'invidious)))
+       (concat host "/watch?v=" video-id)))
+    ;; gothub supports all links currently used in Worg, except for
+    ;; /<user>/<repo>/<issues>
+    ((and (rx "//" (? "www\.")
+	      "github.com"
+	      "/"
+	      (group (+ not-newline)))
+	  ;; Exclude issues path. This approach currently also would
+	  ;; exclude a supported link if author or repo starts with
+	  ;; "issues"
+	  (guard (not (string-match-p "/issues" path))))
+     (let* ((url (match-string 1 path))
+	    (host (first-link-of-json libredirect-data 'gothub)))
+       (concat host "/" url)))
+    ((rx "//" "gist.github.com/"
+	 (group (+ not-newline)))
+     (let* ((url (match-string 1 path))
+	    (host (first-link-of-json libredirect-data 'gothub)))
+       (concat host "/gist/" url)))
+    ;; www.reddit.com; old.reddit.com; and reddit.com are all redirected
+    ((rx "//"
+	 (? (or "www\."
+		"old\."))
+	 "reddit.com"
+	 "/"
+	 (group (+ not-newline)))
+     (let* ((url (match-string 1 path))
+	    (host (first-link-of-json libredirect-data 'redlib)))
+       (concat host "/" url)))
+    ;; Leave unsupported link unchanged
+    (_ path)))
+
 ;; FIXME: Working around ESS bug.  `font-lock-reference-face' has been removed in Emacs 29.
 (define-obsolete-variable-alias
   'font-lock-reference-face 'font-lock-constant-face "20.3")
-- 
2.50.1

>From be27e46140dd565a6a44010529ce4259a80e9d45 Mon Sep 17 00:00:00 2001
From: Rens Oliemans <[email protected]>
Date: Wed, 26 Nov 2025 10:44:18 +0100
Subject: [PATCH 2/3] Insert alternative links in buffer

---
 publish.sh | 68 ++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 45 insertions(+), 23 deletions(-)

diff --git a/publish.sh b/publish.sh
index 642f03b3..2426a52e 100755
--- a/publish.sh
+++ b/publish.sh
@@ -23,31 +23,53 @@ This variable can be set when running publish.sh script:
 (setq libredirect-url "https://raw.githubusercontent.com/libredirect/instances/main/data.json";
       libredirect-data nil)
 
-(defun add-alternative-links (backend)
+(defun add-alternative-links (&optional _)
   "Add alternative links for websites containing non-free JS.
-See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/";
+For each link that has an alternative (currently YouTube, GitHub and
+Reddit), we insert a link to the free alternative, and change the link
+text of the original link to to =(original URL)=.
+
+See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/ "
   (unless libredirect-data
     (let ((response (with-current-buffer
-		      (url-retrieve-synchronously libredirect-url)
+			(url-retrieve-synchronously libredirect-url)
 		      (prog2
 			  (re-search-forward "\n\n" nil t) ; skip HTTP headers
 			  (buffer-substring-no-properties (point) (point-max))
 			(kill-buffer)))))
-      (setq libredirect-data (json-parse-string response :object-type 'alist :array-type 'list))))
-  
-  (org-element-map (org-element-parse-buffer) 'link
-    (lambda (link)
-      (if (string-prefix-p "http" (org-element-property :type link))
-	  (let ((replacement-string (find-replacement (org-element-property :path link))))
-	    replacement-string)
-	nil))))
+      (setq libredirect-data (json-parse-string response :object-type 'alist
+						:array-type 'list))))
+
+  (let ((links (org-element-map (org-element-parse-buffer) 'link #'identity)))
+    (dolist (link (nreverse links))
+      (when (string-prefix-p "http" (org-element-property :type link))
+	(let ((new-url (find-replacement (org-element-property :path link))))
+	  (when new-url
+	    (let* ((begin (org-element-property :begin link))
+		   (contents-begin (org-element-property :contents-begin link))
+		   (contents-end (org-element-property :contents-end link))
+		   (original-text (buffer-substring contents-begin contents-end)))
+	      (save-excursion
+		;; Change text of original link to (original URL)
+		(goto-char contents-begin)
+		(delete-region contents-begin contents-end)
+		(insert "(original URL)")
+		;; Insert new link with free alternative
+		(goto-char begin)
+		(insert (format "[[%s][%s]] " new-url original-text))
+		))
+	    link))))))
 
 (defun first-link-of-json (data host)
+  "Gets the first available link of the alternative links for HOST.
+This is based on the json from
+https://codeberg.org/LibRedirect/instances/src/branch/main/data.json.";
   (let* ((object (cdr (assoc host libredirect-data)))
 	 (links (cdr (assoc 'clearnet object))))
     (car links)))
 
 (defun find-replacement (path)
+  "Takes PATH and returns an alternative link if known and nil otherwise."
   (pcase path
     ;; See https://docs.invidious.io/redirector/
     ;; This matches both
@@ -79,15 +101,16 @@ See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/";
 	    (host (first-link-of-json libredirect-data 'invidious)))
        (concat host "/watch?v=" video-id)))
     ;; gothub supports all links currently used in Worg, except for
-    ;; /<user>/<repo>/<issues>
-    ((and (rx "//" (? "www\.")
-	      "github.com"
-	      "/"
-	      (group (+ not-newline)))
-	  ;; Exclude issues path. This approach currently also would
-	  ;; exclude a supported link if author or repo starts with
-	  ;; "issues"
-	  (guard (not (string-match-p "/issues" path))))
+    ;; /<user>/<repo>/<issues>, so we exclude those.
+    ((and
+      (rx "//" (? "www\.")
+	  "github.com"
+	  "/"
+	  (group (+ not-newline)))
+      ;; Exclude issues path. This approach currently also would
+      ;; exclude a supported link if author or repo starts with
+      ;; "issues"
+      (guard (not (string-match-p "/issues" path))))
      (let* ((url (match-string 1 path))
 	    (host (first-link-of-json libredirect-data 'gothub)))
        (concat host "/" url)))
@@ -105,9 +128,7 @@ See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/";
 	 (group (+ not-newline)))
      (let* ((url (match-string 1 path))
 	    (host (first-link-of-json libredirect-data 'redlib)))
-       (concat host "/" url)))
-    ;; Leave unsupported link unchanged
-    (_ path)))
+       (concat host "/" url)))))
 
 ;; FIXME: Working around ESS bug.  `font-lock-reference-face' has been removed in Emacs 29.
 (define-obsolete-variable-alias
@@ -147,6 +168,7 @@ See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/";
    (R . t)
    (gnuplot . t)))
 
+(add-hook 'org-export-before-parsing-functions #'add-alternative-links)
 (dolist (org-file (cl-remove-if
 		   (lambda (n) (string-match-p "worg/archive/" n))
 		   (directory-files-recursively default-directory "\\.org$")))
-- 
2.50.1

>From 9b7ebe2ea1f9f6faf6feea01052cf001aef675b4 Mon Sep 17 00:00:00 2001
From: Rens Oliemans <[email protected]>
Date: Wed, 26 Nov 2025 11:27:13 +0100
Subject: [PATCH 3/3] Support links without description

---
 publish.sh | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/publish.sh b/publish.sh
index 2426a52e..7861a3fc 100755
--- a/publish.sh
+++ b/publish.sh
@@ -47,26 +47,27 @@ See https://list.orgmode.org/orgmode/87pl9szmy6.fsf@localhost/ "
 	  (when new-url
 	    (let* ((begin (org-element-property :begin link))
 		   (contents-begin (org-element-property :contents-begin link))
-		   (contents-end (org-element-property :contents-end link))
-		   (original-text (buffer-substring contents-begin contents-end)))
+		   (contents-end (org-element-property :contents-end link)))
 	      (save-excursion
-		;; Change text of original link to (original URL)
-		(goto-char contents-begin)
-		(delete-region contents-begin contents-end)
-		(insert "(original URL)")
-		;; Insert new link with free alternative
-		(goto-char begin)
-		(insert (format "[[%s][%s]] " new-url original-text))
-		))
-	    link))))))
-
-(defun first-link-of-json (data host)
-  "Gets the first available link of the alternative links for HOST.
-This is based on the json from
-https://codeberg.org/LibRedirect/instances/src/branch/main/data.json.";
-  (let* ((object (cdr (assoc host libredirect-data)))
-	 (links (cdr (assoc 'clearnet object))))
-    (car links)))
+		(if (and contents-begin contents-end)
+		    ;; Link has description
+		    (let ((description (buffer-substring contents-begin contents-end)))
+		      ;; Change text of original link to (original URL)
+		      (goto-char contents-begin)
+		      (delete-region contents-begin contents-end)
+		      (insert "(original URL)")
+		      ;; Insert new link with free alternative
+		      (goto-char begin)
+		      (insert (format "[[%s][%s]] " new-url description)))
+		  ;; else, "bare" link without description.  We insert
+		  ;; the original link as bare, and add the old one
+		  ;; with (original URL) as description.
+		  (let ((end (org-element-property :end link)))
+		    (goto-char end)
+		    (insert (format "][%s]]" "(original URL)"))
+		    (goto-char begin)
+		    (insert (format "[[%s]] [[" new-url)))
+		  )))))))))
 
 (defun find-replacement (path)
   "Takes PATH and returns an alternative link if known and nil otherwise."
@@ -130,6 +131,14 @@ https://codeberg.org/LibRedirect/instances/src/branch/main/data.json.";
 	    (host (first-link-of-json libredirect-data 'redlib)))
        (concat host "/" url)))))
 
+(defun first-link-of-json (data host)
+  "Gets the first available link of the alternative links for HOST.
+This is based on the json from
+https://codeberg.org/LibRedirect/instances/src/branch/main/data.json.";
+  (let* ((object (cdr (assoc host libredirect-data)))
+	 (links (cdr (assoc 'clearnet object))))
+    (car links)))
+
 ;; FIXME: Working around ESS bug.  `font-lock-reference-face' has been removed in Emacs 29.
 (define-obsolete-variable-alias
   'font-lock-reference-face 'font-lock-constant-face "20.3")
-- 
2.50.1

Reply via email to