José Matos wrote:
> It would be nice to have this ready before the beta to improve the test,
> but this should not delay the 2.2 release. The solution will always be
> better than what we have now.
>
> In order to improve the reading of the different lyx files and to make the
> process more robust I was thinking about falling back to chardet[1] when a
> file fails to load in python. The idea is to make the process more robust
> than what we have now.
>
> What do others, namely Georg and Enrico, think about this choice?

Some general remarks: Chardet works fine in some cases, but not at all in
others (for example, if the file contains only very few non-ASCII
characters). Therefore I would always try to determine the encoding from
other sources first, and only use chardet as a last fallback.

Unfortunately I am not sure which files we are talking about. Do you mean
lyx2lyx and .lyx files? Or other files?

In the case of lyx2lyx, very old files can use several encodings in one
file, so these would need to be opened in binary mode. Maybe we should split
lyx2lyx into two parts: one which works only with python2 and converts up to
format 246 (LyX 1.4), and another one which converts from 246 on? I fear
that a rewrite of this old code would need far too much testing.

Georg
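
For illustration, the order described above (other sources first, chardet
only as a last fallback) could be sketched roughly as follows. This is not
lyx2lyx code: the function name decode_with_fallback, the declared
parameter, the 0.5 confidence threshold and the final latin-1 fallback are
all assumptions made for the example. The input is raw bytes, i.e. the file
is assumed to have been opened in binary mode.

```python
def decode_with_fallback(raw, declared=None, min_confidence=0.5):
    """Decode bytes, trusting explicit information before any guessing.

    Hypothetical helper, not part of lyx2lyx. Returns (text, encoding).
    """
    # 1. Encodings known from other sources: one declared by the caller
    #    (e.g. read from the file itself), then plain UTF-8.
    candidates = []
    if declared:
        candidates.append(declared)
    candidates.append("utf-8")
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding name: try the next candidate.
            pass

    # 2. Last resort before giving up: ask chardet, if it is installed.
    try:
        import chardet
    except ImportError:
        chardet = None
    if chardet is not None:
        guess = chardet.detect(raw)
        enc = guess.get("encoding")
        confidence = guess.get("confidence") or 0.0
        if enc and confidence >= min_confidence:
            try:
                return raw.decode(enc), enc
            except (UnicodeDecodeError, LookupError):
                pass

    # 3. Assumed final fallback: latin-1 maps every byte, so it never fails,
    #    although the resulting text may be wrong.
    return raw.decode("latin-1"), "latin-1"
```

Note that this only covers files with a single encoding. Very old files that
mix several encodings would still have to be handled piecewise on the binary
data.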
