On Wed, Jun 26, 2013 at 08:44:11AM EDT, Lars Bjørndal wrote:
> Hi

> I wrote:

> > On the intranet at work, there are sometimes Unicode (UTF-8)
> > characters such as a Norwegian ø in file names. With lynx I can
> > retrieve these files, but not with elinks. Is there something I can
> > do to get elinks to work with these URLs as well?

> Let me describe the problem some more:
> 
> - I'm using the text console only, on a Fedora system, with charset
>   iso-8859-1. (Don't think that matters.)

You could switch your locale to a UTF-8 one and see if it makes any
difference.

> - I use Elinks 0.13.GIT with ECMAScript (SpiderMonkey) built in.

The elinks.or.cz page does not mention anything newer than 0.12pre6.

> - The intranet solution is based on Microsoft SharePoint 2010.

Ouch.. there we have it.. the culprit, I mean :-)

> - Some files that I want to download, such as pdf or docx, have UTF-8
>   characters in their file names, e.g. \303\270 as Will mentioned
>   (thank you). When selecting such a link, Elinks asks me if I want to
>   save the file, and I save it. The file content, however, is only
>   this line: 404 NOT FOUND

Sounds like, for some reason, Elinks's download routine is confused
and is sending out a mangled URL to the server.. which suggests some
problem related to converting between the one-byte 0xf8 latin1 and the
two-byte 0xc3 0xb8 UTF-8 encodings of the "latin small letter o with
stroke".. maybe..?
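
To make the two encodings concrete, here is a minimal Python sketch (nothing
Elinks itself runs; the variable names are only for illustration) that prints
the bytes for "ø" in each charset and how they would look percent-encoded in
a request path:

```python
from urllib.parse import quote

ch = "\u00f8"  # LATIN SMALL LETTER O WITH STROKE

latin1_bytes = ch.encode("iso-8859-1")  # a single byte
utf8_bytes = ch.encode("utf-8")         # two bytes

print(latin1_bytes.hex())  # f8
print(utf8_bytes.hex())    # c3b8

# Percent-encoded forms, as they could appear in a URL on the wire
latin1_quoted = quote(latin1_bytes)
utf8_quoted = quote(utf8_bytes)
print(latin1_quoted)  # %F8
print(utf8_quoted)    # %C3%B8
```

If the server only knows the file under the UTF-8 name, a request carrying the
latin1 form would plausibly come back as a 404.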

One heavy-handed way of finding out whether this assumption is correct
and, if so, what actually gets sent out, would be to use tcpdump or
similar to capture and filter the actual dialog between Elinks and your
server.. since I'm not aware of a debug option in Elinks that would help
here.

You could obviously do the same against lynx and see where they differ..
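
Once you have the request lines from such a capture, a small helper along
these lines could tell the two cases apart. This is only a sketch: the
function name and the sample request lines are made up for illustration, and
the "possibly latin1" verdict is a guess, since all it really checks is
whether the path bytes are valid UTF-8.

```python
from urllib.parse import unquote_to_bytes


def describe_path_encoding(request_line: bytes) -> str:
    """Guess how non-ASCII characters in an HTTP request line were encoded."""
    # A request line looks like: b"GET /some/path HTTP/1.1"
    parts = request_line.split(b" ")
    if len(parts) < 2:
        return "malformed"
    # Undo any %XX escapes so raw and percent-encoded bytes are treated alike
    raw = unquote_to_bytes(parts[1])
    if all(byte < 0x80 for byte in raw):
        return "ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8 (possibly latin1)"


# Hypothetical captured lines, one per browser
print(describe_path_encoding(b"GET /S%C3%B8nner_af_Norge HTTP/1.1"))
print(describe_path_encoding(b"GET /S%F8nner_af_Norge HTTP/1.1"))
```

If lynx's request comes out as "utf-8" and Elinks's does not, that would
support the charset-conversion theory.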

I ran another test with Elinks 0.12pre5 and tried to access a site whose
url is http://snl.no/Sønner_af_Norge.. then downloaded the web page via
the "Save As"..  and "Save formatted document" under Elinks' File menu..
no problem.

I switched my locale to en_ISO8859-1 and logged into a Linux console
(assuming that's what you mean by text console).. no problem either.

I went as far as generating the nn_NO.ISO-8859-1 locale.. logged in
again and was still able to download. 

I also ran command-line Elinks with the "-dump" option, copy-pasting the
url and redirecting the output to a file.. and no problem either.

But then I run Debian rather than Fedora, and the two distributions'
builds of Elinks may have different patches applied..

> - The file content is preserved if I use Lynx to download and save the
>   file.

So maybe this is a bug that occurs only in a very specific context,
perhaps one specific to the Fedora build.. You could build an Elinks
executable in your $HOME from the 0.12pre6 tarball to clarify.

As a workaround, you could try replacing the "ø"s in the URLs with the
percent-encoded UTF-8 form %C3%B8 (an HTML entity such as &oslash; is
not understood inside a URL).. see if that helps.
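
A percent-encoded URL for this kind of workaround could be produced along
these lines. It is a sketch that assumes the server expects UTF-8 bytes in
the path (which the \303\270 in the file names suggests); the helper name is
made up, and the URL is the one from my test above.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_url(url: str) -> str:
    """Percent-encode non-ASCII characters in the path of a URL as UTF-8."""
    parts = urlsplit(url)
    # Keep "/" as a separator and "%" so existing escapes are not encoded twice
    path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


encoded = percent_encode_url("http://snl.no/S\u00f8nner_af_Norge")
print(encoded)  # http://snl.no/S%C3%B8nner_af_Norge
```

The resulting all-ASCII URL can then be pasted into Elinks, which takes any
charset conversion on the browser side out of the picture.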

Oh, and please note that I am not an "encodings expert" or an Elinks
developer.. 

CJ

-- 
ALL YOUR BASE ARE BELONG TO US!
_______________________________________________
elinks-users mailing list
elinks-users@linuxfromscratch.org
http://linuxfromscratch.org/mailman/listinfo/elinks-users
