Okay, back to link escaping.
What this is about:
The current implementation of percent escaping for URIs uses a whitelist
approach, i.e. it only percent-escapes characters that are in
`org-link-escape-chars' or in a user-supplied list. This is a problem
because using this function requires knowledge about
Hi David,
I have not had time to follow this in detail, but if you feel
confident that this is
doing the right thing, please go ahead and apply the necessary
patches. I am an encoding moron, so I am easily convinced that you
and Sebastian together can cook up something useful. :-)
- Carsten
David Maus dm...@ictsoc.de writes:
Also I guess the decoding is secure. Means we could change the
comment of this function:
(defun org-protocol-unhex-compound (hex)
Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ö'.
Note: this function falls back on single byte decoding
Sebastian Rose wrote:
David Maus dm...@ictsoc.de writes:
sh$ man utf-8
Thanks! I finally get a grip on one of my personal nightmares.
It's not that bad, is it? :D
Even better: It makes sense ;)
The attached patch is the first step in this direction: It modifies
the algorithm of
The binary representation of 127 is 0111 1111, which is a valid ASCII
char. DEL, actually (sh$ man ascii)
Right, and that's why it is encoded: No control characters in a URI.
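
To make the point about control characters concrete, here is a minimal sketch. The name `my/escape-control-chars' is hypothetical and this is not the code from the patch; it only assumes that "control character" means code points below 32 plus DEL (127).

```elisp
(defun my/escape-control-chars (str)
  "Percent-escape ASCII control characters in STR.
Hypothetical helper for illustration, not part of Org."
  (mapconcat (lambda (c)
               ;; Code points 0..31 and 127 (DEL) are ASCII control characters.
               (if (or (< c 32) (= c 127))
                   (format "%%%02X" c)
                 (char-to-string c)))
             str ""))

;; DEL between two letters becomes %7F; the letters are left alone.
(my/escape-control-chars (string ?a 127 ?b))
```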
Great ! :)
The final algorithm for the shiny new unicode aware percent encoding
function would be:
- percent encode all
David Maus dm...@ictsoc.de writes:
Sebastian Rose wrote:
David Maus dm...@ictsoc.de writes:
sh$ man utf-8
Thanks! I finally get a grip on one of my personal nightmares.
It's not that bad, is it? :D
Even better: It makes sense ;)
The attached patch is the first step in this direction:
rrrggrgrggrgr
premature and wrong patch, sorry. Again against master:
diff --git a/lisp/org-protocol.el b/lisp/org-protocol.el
index 21f28e7..d69d584 100644
--- a/lisp/org-protocol.el
+++ b/lisp/org-protocol.el
@@ -305,7 +305,7 @@ part.
(defun org-protocol-unhex-string(str)
Unhex
Also I guess the decoding is secure. Means we could change the comment
of this function:
(defun org-protocol-unhex-compound (hex)
Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ö'.
Note: this function falls back on single byte decoding if a
character sequence is not valid
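
For readers who want to see the happy path of that docstring, a small sketch follows. `my/unhex-utf8' is a hypothetical name, not the org-protocol function, and it deliberately leaves out the single-byte fallback mentioned above: it simply assumes the escaped bytes are valid UTF-8.

```elisp
(defun my/unhex-utf8 (hex)
  "Decode HEX, a run of percent escapes such as \"%C3%B6\", as UTF-8.
Hypothetical helper for illustration; assumes the bytes are valid UTF-8."
  (let* ((bytes (mapcar (lambda (h) (string-to-number h 16))
                        (split-string hex "%" t)))
         ;; Build a unibyte string from the raw byte values.
         (raw (apply #'unibyte-string bytes)))
    (decode-coding-string raw 'utf-8)))

;; The two bytes C3 B6 decode to the single character `ö'.
(my/unhex-utf8 "%C3%B6")
```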
Sebastian Rose wrote:
David Maus dm...@ictsoc.de writes:
Sebastian Rose wrote:
Is there a reason for this distinction between multibyte and unibyte?
I favour the shotgun-approach if not. It's bullet-proof.
The JavaScript function `encodeURIComponent()' encodes the German Umlaut
`ü' as `%C3%BC'
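
The same UTF-8 based encoding can be sketched in Emacs Lisp. `my/percent-encode-utf8' is a hypothetical name and not the patched `org-link-escape'; as a simplification it escapes every byte, including plain ASCII, which a real implementation would probably not want.

```elisp
(defun my/percent-encode-utf8 (str)
  "Percent-encode every UTF-8 byte of STR.
Hypothetical helper for illustration, not part of Org."
  (mapconcat (lambda (byte) (format "%%%02X" byte))
             ;; `encode-coding-string' returns a unibyte string, so
             ;; iterating over it yields the individual UTF-8 bytes.
             (encode-coding-string str 'utf-8)
             ""))

;; `ö' is U+00F6, whose UTF-8 bytes are C3 B6.
(my/percent-encode-utf8 "ö")
```

Because the string is converted to UTF-8 first, the result does not depend on how the text was stored before it was read into Emacs.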
David Maus dm...@ictsoc.de writes:
sh$ man utf-8
Thanks! I finally get a grip on one of my personal nightmares.
It's not that bad, is it? :D
The
attached patch is the first step in this direction: It modifies the
algorithm of `org-link-escape', now iterating over the input string
Sebastian Rose wrote:
Is there a reason for this distinction between multibyte and unibyte?
I favour the shotgun-approach if not. It's bullet-proof.
The JavaScript function `encodeURIComponent()' encodes the German Umlaut
`ü' as `%C3%BC' regardless of the source's encoding actually. That's why
I
David Maus dm...@ictsoc.de writes:
Sebastian Rose wrote:
Is there a reason for this distinction between multibyte and unibyte?
I favour the shotgun-approach if not. It's bullet-proof.
The JavaScript function `encodeURIComponent()' encodes the German Umlaut
`ü' as `%C3%BC' regardless of the
Sébastien Vauban wrote:
Hello,
With current git pull, and such an Org file (in UTF-8 encoding):
...
I get the following error when trying to export it via PDFLaTeX:
The problem is that the 'É' character is not in Org's default list
for link escapes but `string-match' matches for the lower
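
The snippet is cut off here, but the effect it seems to describe can be reproduced in isolation. The sketch below assumes the cause is case folding in `string-match', controlled by `case-fold-search'; `my/match-pos' is a hypothetical helper and not the code from `org-link-escape'.

```elisp
(defun my/match-pos (needle haystack fold)
  "Return the position where NEEDLE matches in HAYSTACK, or nil.
FOLD is the value bound to `case-fold-search' during the match.
Hypothetical helper for illustration, not part of Org."
  (let ((case-fold-search fold))
    (string-match (regexp-quote needle) haystack)))

;; With case folding on, lower-case `é' also matches upper-case `É'.
(my/match-pos "é" "É" t)
;; With case folding off, only the exact character matches.
(my/match-pos "é" "É" nil)
```

Binding `case-fold-search' to nil around the lookup is one way to make such a match exact.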
David Maus dm...@ictsoc.de writes:
Sébastien Vauban wrote:
Hello,
With current git pull, and such an Org file (in UTF-8 encoding):
...
I get the following error when trying to export it via PDFLaTeX:
The problem is that the 'É' character is not in Org's default list
for link escapes but