Re: [elinks-users] Support for unicode characters in URL
On Wed, Jun 26, 2013 at 08:44:11AM EDT, Lars Bjørndal wrote:

> Hi
>
> I wrote:
>
> > On intranet at work, there sometimes happens to be unicode (UTF-8)
> > characters such as a Norwegian ø in the filename. With lynx I can
> > retrieve these files, but not with elinks. Is there something I can
> > do to get elinks work also with these URLs?
>
> Let me describe the problem some more:
>
> - I'm using text console only from a Fedora system, with charset
>   iso-8859-1. (Don't think that that matters.)

You could switch your locale to UTF-8 and see if it makes any difference.

> - I use Elinks 0.13.GIT with ECMAScript (SpiderMonkey) built in.

The elinks.or.cz page does not mention anything higher than 0.12pre6.

> - The intranet solution is based on Microsoft SharePoint 2010.

Ouch.. there we have it.. the culprit, I mean :-)

> - Some files that I want to download, such as pdf or docx, have UTF-8
>   characters in their file names, e.g. \303\270 as Will mentioned
>   (thank you). When selecting such a link, Elinks asks me if I want to
>   save the file, and I save it. The file content, however, is only
>   this line: 404 NOT FOUND

Sounds like, for some reason, Elinks's download routine is confused and is sending a mangled URL to the server. That suggests a problem related to converting between the one-byte 0xf8 Latin-1 encoding and the two-byte 0xc3 0xb8 UTF-8 encoding of the "latin small letter o with stroke".. maybe..?

One heavy-handed way of finding out whether this assumption is correct and, if so, what actually gets sent out would be to use tcpdump or similar to capture and filter the actual dialog between Elinks and your server, since I'm not aware of a debug option in Elinks that would help here. You could obviously do the same with lynx and see where the two differ.

I ran another test with Elinks 0.12pre5 and tried to access a site whose URL is http://snl.no/Sønner_af_Norge.. then downloaded the web page via "Save As" and "Save formatted document" under Elinks' File menu.. no problem.
I switched my locale to en_ISO8859-1 and logged into a Linux console (assuming that's what you mean by text console).. no problem either. I went as far as generating the nn_NO.ISO-8859-1 locale.. logged in again and was still able to download. I also ran command-line Elinks with the "-dump" option, copy-pasting the URL and redirecting the output to a file.. and no problem either. But then I run Debian rather than Fedora, and the two distributions' versions of Elinks may have different patches applied.

> - The file content is preserved if I use Lynx to download and save the
>   file.

So maybe this is a bug that occurs only in a very specific context, possibly one specific to the Fedora version. Perhaps you could build an Elinks executable in your $HOME from the 0.12pre6 tarball to clarify.

As a workaround, you could try replacing each "ø" in the URLs with HTML's &Oslash; entity.. see if that helps.

Oh, and please note that I am not an "encodings expert" or an Elinks developer..

CJ

--
ALL YOUR BASE ARE BELONG TO US!
___
elinks-users mailing list
elinks-users@linuxfromscratch.org
http://linuxfromscratch.org/mailman/listinfo/elinks-users
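The 0xf8 versus 0xc3 0xb8 hypothesis in the message above can be illustrated with a short Python sketch. It only shows how the same path percent-encodes under the two charsets; it says nothing about what Elinks actually sends, which would still need a packet capture to confirm. The path "/docs/møte.pdf" and the helper name percent_encode_path are made up for illustration.

```python
from urllib.parse import quote, unquote


def percent_encode_path(path: str, encoding: str) -> str:
    """Percent-encode a URL path, converting non-ASCII characters
    to bytes with the given charset first."""
    return quote(path, safe="/", encoding=encoding)


# Hypothetical file name containing "ø" (U+00F8).
path = "/docs/møte.pdf"

utf8_form = percent_encode_path(path, "utf-8")
latin1_form = percent_encode_path(path, "iso-8859-1")

print(utf8_form)    # /docs/m%C3%B8te.pdf
print(latin1_form)  # /docs/m%F8te.pdf

# A server that decodes request paths as UTF-8 cannot recover "ø"
# from the Latin-1 form: the lone byte 0xF8 is not valid UTF-8,
# so it comes out as the replacement character U+FFFD.
seen_by_utf8_server = unquote(latin1_form, encoding="utf-8", errors="replace")
print(seen_by_utf8_server == path)  # False
```

If the server stores the name in UTF-8 and the client sends the Latin-1 form, a lookup failure such as the reported 404 would be one plausible outcome, but that remains an assumption until the two requests are compared on the wire.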
Re: [elinks-users] Support for unicode characters in URL
Hi

I wrote:

> On intranet at work, there sometimes happens to be unicode (UTF-8)
> characters such as a Norwegian ø in the filename. With lynx I can
> retrieve these files, but not with elinks. Is there something I can do
> to get elinks work also with these URLs?

Let me describe the problem some more:

- I'm using text console only from a Fedora system, with charset
  iso-8859-1. (Don't think that that matters.)
- I use Elinks 0.13.GIT with ECMAScript (SpiderMonkey) built in.
- The intranet solution is based on Microsoft SharePoint 2010.
- Some files that I want to download, such as pdf or docx, have UTF-8
  characters in their file names, e.g. \303\270 as Will mentioned
  (thank you). When selecting such a link, Elinks asks me if I want to
  save the file, and I save it. The file content, however, is only
  this line: 404 NOT FOUND
- The file content is preserved if I use Lynx to download and save the
  file.

I hope this clarifies things.

Thanks and regards,
Lars
Re: [elinks-users] Support for unicode characters in URL
> On Tue, Jun 25, 2013 at 03:48:37AM EDT, Lars Bjørndal wrote:
>> On intranet at work, there sometimes happens to be unicode (UTF-8)
>> characters such as a Norwegian ø in the filename. With lynx I can
>> retrieve these files, but not with elinks. [...]

* Chris Jones [13-06/25=Tu 18:37 -0400]:
> I did the following to create a test file:
> % echo 'øø' > /tmp/file-ø
> Pointed elinks to /tmp/file-ø
> and was able to display the file's content successfully.
> Vim tells me that the characters in the file are U+00F8.
[...]

Lars's email, including the From header, was encoded in ISO-8859-1 (aka Latin-1), not UTF-8. \370 is the Latin-1 encoding of small letter o with stroke; the UTF-8 encoding of the same character would be \303\270. Lars also says "unicode (UTF-8)", which suggests some confusion: the two are not synonymous.

Chris reports that, according to Vim, the file was encoded in Latin-1.

Perhaps Lars is using multiple encodings without realizing it. It's particularly easy for an X-based desktop to use encodings different from those selected by environment variables in terminal sessions, and those encodings might in turn differ from that of the filesystem.
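To make the \370 versus \303\270 distinction concrete, here is a small Python sketch that prints the octal escapes for "ø" under each encoding and makes a rough guess at which one a byte string uses. The function names are invented for this example, and the guess is only a heuristic: pure ASCII input is valid in both encodings and is reported as UTF-8.

```python
def octal_escapes(data: bytes) -> str:
    """Render each byte as a backslash followed by three octal digits."""
    return "".join(f"\\{byte:03o}" for byte in data)


def guess_encoding(data: bytes) -> str:
    """Heuristic: bytes that decode cleanly as UTF-8 are called UTF-8;
    anything else is assumed to be an 8-bit charset such as Latin-1."""
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "8-bit (possibly iso-8859-1)"


latin1_bytes = "ø".encode("iso-8859-1")  # one byte, 0xF8
utf8_bytes = "ø".encode("utf-8")         # two bytes, 0xC3 0xB8

print(octal_escapes(latin1_bytes))  # \370
print(octal_escapes(utf8_bytes))    # \303\270
print(guess_encoding(latin1_bytes))
print(guess_encoding(utf8_bytes))
```

Running something like guess_encoding over a file name or a captured request path is one way to check whether several encodings are in play at once.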
Re: [elinks-users] Support for unicode characters in URL
On Tue, Jun 25, 2013 at 03:48:37AM EDT, Lars Bjørndal wrote:

> On intranet at work, there sometimes happens to be unicode (UTF-8)
> characters such as a Norwegian ø in the filename. With lynx I can
> retrieve these files, but not with elinks. Is there something I can do
> to get elinks work also with these URLs?

I did the following to create a test file:

% echo 'øø' > /tmp/file-ø

Pointed elinks to /tmp/file-ø and was able to display the file's content successfully.

Vim tells me that the characters in the file are U+00F8.

% elinks --version
ELinks 0.12pre5

I mention the latter because I remember that maybe 2-3 years ago I did have some trouble getting the version of Elinks that came with Debian to work with my en_UTF8 locale. I had to build my own from a more recent tarball AFAICR..

With the current version that comes with Debian stable, I don't remember doing any kind of customization, so my guess is that whatever problem I had with UTF-8 in the past was fixed and that this should work out of the box with any recent version of Elinks..

CJ

--
HOW ARE YOU GENTLEMEN?
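A rough Python equivalent of the test above may help when repeating it under different locales. It writes "øø" to a temporary file in a chosen encoding, then returns both the raw bytes on disk and the code points read back. The helper name write_and_inspect is invented here, the temporary file uses an ASCII name to keep the sketch portable, and the trailing newline that echo would add is left out.

```python
import tempfile
from pathlib import Path


def write_and_inspect(text, encoding):
    """Write text to a temporary file in the given encoding; return
    the raw bytes on disk and the code points decoded from them."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "file-test"
        path.write_text(text, encoding=encoding)
        raw = path.read_bytes()
        decoded = path.read_text(encoding=encoding)
    return raw, [f"U+{ord(ch):04X}" for ch in decoded]


latin1_raw, latin1_points = write_and_inspect("øø", "iso-8859-1")
utf8_raw, utf8_points = write_and_inspect("øø", "utf-8")

print(latin1_raw, latin1_points)  # b'\xf8\xf8' ['U+00F8', 'U+00F8']
print(utf8_raw, utf8_points)      # b'\xc3\xb8\xc3\xb8' ['U+00F8', 'U+00F8']
```

In this sketch the code point comes out as U+00F8 for both files once each is decoded with the matching encoding, so a code point report alone may not show which bytes are on disk. Looking at the raw bytes (for instance with a hex dump) is the more direct check.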