On 2017-10-23, Helge Hafting wrote:
> On 17 Oct 2017 19:50, Guenter Milde wrote:

Hi Helge,

thank you for looking into this.

>> TODO: find out which encoding CMake uses for the arguments
>> (maybe we need the locale encoding) and, if necessary, adapt the
>> argument parsing:

>>        arg = arg.decode('UTF-8') # support non-ASCII characters in arguments

> Is that sort of thing necessary?
> Arguments are often file/pathnames, right? 

Often, yes. But here the to-be-replaced variables and their replacement
values are also handed to the script as command-line arguments:

# Syntax: ReplaceValues.py [<var1>=<Subst1> [<var2>=<Subst> ...]] <Inputfile> [<Inputfile> ...]
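
To make the syntax line concrete, here is a minimal sketch of how such an
argument list could be split. The function name split_arguments and the rule
"the first argument without '=' starts the input files" are my assumptions
for illustration, not necessarily what ReplaceValues.py does:

```python
def split_arguments(args):
    """Split args into ({var: subst}, [input files]).

    Assumption: substitutions come first, and the first argument
    without '=' starts the list of input files.
    """
    substitutions = {}
    for index, arg in enumerate(args):
        # partition() splits at the first '=' only, so a substitution
        # value may itself contain '='.
        var, sep, subst = arg.partition("=")
        if not sep:
            # No '=' found: this and all following arguments are files.
            return substitutions, list(args[index:])
        substitutions[var] = subst
    return substitutions, []
```

Note that with this rule an input file whose name contains '=' is only
handled correctly if it is not the first file on the command line.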

> Anyway, anything that *is* a 
> pathname, should not be 'decoded' or otherwise altered. 

Then, we need a way to get the raw byte-string of the command line in
Python 3. (The above line is only used to get the same result with Py2 and
Py3.)
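
One possible way to get the bytes back in Python 3, as a sketch: on POSIX,
Python 3 decodes sys.argv with the filesystem encoding and the
'surrogateescape' error handler, and os.fsencode() reverses that step. The
helper name raw_args is made up for this example:

```python
import os
import sys


def raw_args(argv=None):
    """Return the command-line arguments as byte strings.

    os.fsencode() undoes the decoding that Python 3 applied to
    sys.argv (filesystem encoding with 'surrogateescape' on POSIX).
    """
    if argv is None:
        argv = sys.argv[1:]  # skip the script name
    return [os.fsencode(arg) for arg in argv]
```

On Windows the command line is Unicode to begin with, so there the result is
an encoding of the text rather than the "original" bytes.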

However, we have to think of the use case: ReplaceValues.py is an auxiliary
script for the package generation with cmake. It is developers-only.

> While it may be necessary to encode/decode the contents of files, this 
> should not be done with pathnames. A pathname is the real name used by 
> the underlying filesystem. Any change, and it won't match reality 
> anymore.

This is a bit tricky: 

* if I get the pathname from the system, it is best to keep it undecoded,
* if the pathname is stored in a script or data file, I need to change the
  encodings from the data-file's encoding to the filesystem encoding.
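
The two cases above might look like this in Python 3. This is only a sketch;
the function names and the "latin-1" default are assumptions chosen for the
example:

```python
import os


def names_from_system(directory=b"."):
    """Case 1: names obtained from the system stay undecoded.

    Passing a bytes path to os.listdir() returns bytes names.
    """
    return os.listdir(directory)


def path_from_data_file(raw_line, file_encoding="latin-1"):
    """Case 2: a name stored in a data file.

    Decode with the data file's encoding, then re-encode with the
    filesystem encoding via os.fsencode().
    """
    text = raw_line.decode(file_encoding).strip()
    return os.fsencode(text)
```

The second function raises UnicodeEncodeError if the filesystem encoding
cannot represent a character from the data file.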

Here it is even trickier, because the pathname is stored in a CMake config
file and passed to the script as a command-line argument via make.
Fortunately, the filenames of the LyX documentation are all ASCII.

Thanks,
Günter
