Georg Baum wrote:
On Friday 29 December 2006 00:32, Dov Feldstern wrote:
So basically, in pre-1.5, the solution is just to use the "default"
encoding, rather than "auto".
Now we move to 1.5.0: when you try to use "default", you get the
following message in the stderr:
Unknown inputenc value `default'. Using `auto' instead.
So now there's no way to generate the latex file without the explicit
encodings, which means that we're stuck with the problem I originally
described, because of inputenc's limitations.
That is a bug and not intentional. The attached patch fixes the problem (at
least the LaTeX generation side). With this patch it is possible again to use
the "default" encoding. Unfortunately the display in LyX is wrong: Everything
is treated as latin1. I don't know how that works in LyX 1.4.x (the Hebrew
words are displayed with Hebrew characters, but not RTL). I'll have a look
and see whether this can be fixed in 1.5. Meanwhile I am going to put in the
attached patch.
Thanks, the patch works in the sense that it doesn't complain now about
not finding the "default" encoding. And the display in the GUI is
actually okay (and there's no reason why it should be affected by the
encoding --- it depends only on the language, I think). However, the
latex file is still not fully generated, because of the problem with iconv.
One solution would be to see if we could fix this using a newer version
of inputenc, as Jean-Marc suggested. But perhaps we could solve this by
again using the "default" encoding option? I realize that in 1.5.0 it's
harder than in previous versions: now LyX itself has to know what the
encoding should be, so that it can generate the latex file correctly.
You are right, that is exactly the problem.
OTOH, it *should* already know that --- it's explicitly writing that
information to the generated latex file! So all that really needs to be
done is to *not* write the explicit encoding commands to the generated
latex file, if the "default" encoding option is chosen.
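A minimal sketch of that idea, not LyX's actual code: the two helpers below (inputencPreamble and encodingSwitch are hypothetical names) emit the inputenc package line and the \inputencoding switch only when the setting is something other than "default". The cp1255 name in the usage note is just an illustrative Hebrew encoding.

```cpp
#include <string>

// Hypothetical helper: returns the preamble line that loads inputenc,
// or an empty string when nothing should be written.
std::string inputencPreamble(std::string const & inputenc,
                             std::string const & languageEncoding)
{
    // "default": write nothing and leave the interpretation of the
    // 8-bit input to the LaTeX compiler.
    if (inputenc == "default")
        return std::string();
    // "auto": use the encoding implied by the document language;
    // anything else is taken as an explicit inputenc encoding name.
    std::string const & name =
        (inputenc == "auto") ? languageEncoding : inputenc;
    return "\\usepackage[" + name + "]{inputenc}\n";
}

// Hypothetical helper: returns the command that switches the encoding
// in the middle of the document, or an empty string for "default".
std::string encodingSwitch(std::string const & inputenc,
                           std::string const & newEncoding)
{
    if (inputenc == "default")
        return std::string();
    return "\\inputencoding{" + newEncoding + "}";
}
```

With these, inputencPreamble("default", "cp1255") yields an empty string, while inputencPreamble("auto", "cp1255") yields the usual \usepackage[cp1255]{inputenc} line.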
I'll have a look, see above. The problem is that this "default" encoding does
not fit very well into the new unicode world, and I am not yet sure how to
integrate it better.
Yeah, I agree that I'm not totally clear about what exactly we want, either.
But here's where the second problem arises, and this time it's LyX's
problem, not latex's (though I'm less sure about this part): it seems to
me like LyX itself --- not only latex --- is also determining the
encoding based on the paragraph, rather than based on the individual
characters' language.
Yes. It is implemented like that because of the limitation of older inputenc
packages.
There's no real reason why LyX should limit itself just because latex
does. Here exactly is an example where latex will manage, if only LyX would.
If LyX performed the conversions on a per-character basis (or rather,
for runs of consecutive characters with the same encoding), then it
would at least be able to generate the latex file, and then we'd only be
left with the first problem.
Yes, we should require a current inputenc version and output each character in
the encoding that its language demands.
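To make the per-run conversion concrete, here is a rough sketch under stated assumptions. TaggedChar, EncodingRun and splitIntoRuns are hypothetical names, not LyX classes, and the encoding attached to each character is assumed to have been looked up from that character's language beforehand.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one document character together with the
// encoding that its language demands.
struct TaggedChar {
    char32_t ch;           // UCS4 code point
    std::string encoding;  // encoding name of the character's language
};

// One maximal run of consecutive characters sharing an encoding.
struct EncodingRun {
    std::string encoding;
    std::u32string text;
};

// Groups consecutive characters with the same encoding, so that each
// run can be converted (e.g. via iconv) with a single target encoding.
std::vector<EncodingRun> splitIntoRuns(std::vector<TaggedChar> const & chars)
{
    std::vector<EncodingRun> runs;
    for (TaggedChar const & c : chars) {
        // Start a new run whenever the encoding changes.
        if (runs.empty() || runs.back().encoding != c.encoding)
            runs.push_back(EncodingRun{c.encoding, std::u32string()});
        runs.back().text.push_back(c.ch);
    }
    return runs;
}
```

Each run could then be converted on its own, so a Hebrew word inside an English paragraph would no longer force the whole paragraph through one encoding.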
Again, I think that perhaps we could do the second half ("output each
character in the encoding that its language demands") regardless of the
first half ("a current inputenc version").
But again, I agree that I'm not totally clear how this will fit in with
"real unicode".
Georg
------------------------------------------------------------------------
Index: src/bufferparams.C
===================================================================
--- src/bufferparams.C (Revision 16420)
+++ src/bufferparams.C (Arbeitskopie)
@@ -1468,11 +1468,18 @@
{
if (inputenc == "auto")
return *(language->encoding());
- Encoding const * const enc = encodings.getFromLaTeXName(inputenc);
+ Encoding const * const enc = (inputenc == "default") ?
+ encodings.getFromLyXName("iso8859-1") :
+ encodings.getFromLaTeXName(inputenc);
if (enc)
return *enc;
- lyxerr << "Unknown inputenc value `" << inputenc
- << "'. Using `auto' instead." << endl;
+ if (inputenc == "default")
+ lyxerr << "Could not find iso8859-1 encoding for inputenc "
+ "value `default'. Using inputenc `auto' instead."
+ << endl;
+ else
+ lyxerr << "Unknown inputenc value `" << inputenc
+ << "'. Using `auto' instead." << endl;
return *(language->encoding());
}
Index: src/bufferparams.h
===================================================================
--- src/bufferparams.h (Revision 16420)
+++ src/bufferparams.h (Arbeitskopie)
@@ -180,7 +180,10 @@
* The input encoding for LaTeX. This can be one of
* - auto: find out the input encoding from the used languages
* - default: Don't load the inputenc package and hope that it will
- * work (unlikely)
+ * work (unlikely). The encoding is an unspecified 8bit encoding,
+ * the interpretation is up to the LaTeX compiler. Because we need
+ * a rule how to create this from our internal UCS4 encoded
+ * document contents we treat this as latin1 internally.
* - any encoding supported by the inputenc package
* The encoding of the LyX file is always utf8 and has nothing to
* do with this setting.
Index: development/FORMAT
===================================================================
--- development/FORMAT (Revision 16420)
+++ development/FORMAT (Arbeitskopie)
@@ -78,8 +78,9 @@
encoding of the LyX file:
\inputencoding LyX file encoding
- auto as determined by the document language
- default latin1
+ auto as determined by the document language(s)
+ default unspecified 8bit (treated as latin1 internally,
+ see comment in bufferparams.h)
everything else as determined by \inputencoding
2006-07-03 Georg Baum <[EMAIL PROTECTED]>