On 2014-03-09 14:02:48 +0100, Christoph Biedl wrote:
> Vincent Lefevre wrote...
>
> > On a LaTeX file, one currently gets:
> >
> >   LaTeX 2e document text
> >
> > It would be useful to have the encoding too, e.g.
> >
> >   ISO-8859-1 LaTeX 2e document text
> >   UTF-8 LaTeX 2e document text
> (...)
>
> From wheezy (5.11) on, file also prints a file encoding, like
>
> | LaTeX 2e document, UTF-8 Unicode text
>
> That one is guessed from the file content, not by examination of
> statements like 'inputenc'. Is that sufficient for you?

Yes, more or less. The problem is with ISO-8859 files: one doesn't know
which version of ISO-8859 it is. I only use ISO-8859-1, so this is
unambiguous for me, but it can be a problem for widely distributed
filters based on "file" output.

> > On LaTeX files, the encoding can be obtained unambiguously (well,
> > in practice) by looking at \usepackage[...]{inputenc} commands,
> > e.g.
> >
> >   \usepackage[latin1]{inputenc}
> >   \usepackage[utf8]{inputenc}
>
> Seems feasible but still requires some hackery using regular
> expressions.

I think that in most cases, these commands occur at the beginning
of a line (looking for such a command would be useful only in the
ISO-8859 case, to differentiate the various versions).

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
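
To illustrate the kind of regular-expression lookup discussed above, here is a minimal sketch in Python. It is not how "file" works or would implement this. The names declared_encoding, INPUTENC_RE and INPUTENC_TO_ENCODING are hypothetical, and the option table is a small illustrative subset of inputenc options. It assumes the command starts a line (optionally after blanks) and carries a single option.

```python
import re

# Illustrative subset of inputenc option names mapped to encoding labels
# (an assumption for this sketch, not a complete list).
INPUTENC_TO_ENCODING = {
    b"latin1": "ISO-8859-1",
    b"latin2": "ISO-8859-2",
    b"latin9": "ISO-8859-15",
    b"utf8": "UTF-8",
}

# Match \usepackage[<option>]{inputenc} at the beginning of a line.
# The pattern works on bytes because the file encoding is not yet known.
INPUTENC_RE = re.compile(
    rb"^[ \t]*\\usepackage[ \t]*\[[ \t]*([A-Za-z0-9]+)[ \t]*\][ \t]*\{inputenc\}",
    re.MULTILINE,
)


def declared_encoding(data: bytes):
    """Return the encoding label declared via inputenc, or None.

    None is returned when no such command starts a line, or when the
    option is not in the illustrative table above.
    """
    match = INPUTENC_RE.search(data)
    if match is None:
        return None
    return INPUTENC_TO_ENCODING.get(match.group(1))
```

Since the pattern is anchored at the beginning of a line, a commented-out command such as "% \usepackage[latin1]{inputenc}" is not matched. A caller would presumably consult this only when the content-based guess reports ISO-8859 text.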