Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-11 Thread Jean-Marc Lasgouttes
Guenter Milde  writes:
> Ah, so you probably wanted to say: add encoding to the text properties
> dialogue?

Yes.

JMarc


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-11 Thread Jean-Marc Lasgouttes
Uwe Stöhr  writes:
> We must do this:
>
> - if there is only one encoding, use this
> - if there are several encodings, set h_inputencoding to "auto".
>
> So yes, this is independent of the languages. But what if there is
> only one encoding specified in the package loading command of inputenc
> but another one in the options of \documentclass. We already had some
> strange TeX files in the past.
> I therefore check for the languages in the \documentclass and the
> babel call. When this should be made independent of the language, we
> also need to check the encoding twice: in \documentclass and the
> inputenc call.

This should be made more general: all our handling of \usepackage should
look at these two lists of options. In the long run, it will be better
to implement this globally.

JMarc
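
A minimal sketch of the "two lists of options" idea, assuming the hypothetical helpers split_options() and effective_options(); neither name is taken from the tex2lyx sources. It merges the global \documentclass options with the options given to \usepackage, so that a package handler sees both:

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a comma-separated option list such as
// "a4paper, latin1" into trimmed, non-empty entries.
std::vector<std::string> split_options(std::string const & opts)
{
	std::vector<std::string> result;
	std::istringstream is(opts);
	std::string item;
	while (std::getline(is, item, ',')) {
		std::string::size_type const first = item.find_first_not_of(" \t");
		if (first == std::string::npos)
			continue; // skip empty entries
		std::string::size_type const last = item.find_last_not_of(" \t");
		result.push_back(item.substr(first, last - first + 1));
	}
	return result;
}

// Append every entry of source that is not yet in target.
static void append_unique(std::vector<std::string> & target,
                          std::vector<std::string> const & source)
{
	for (std::string const & opt : source)
		if (std::find(target.begin(), target.end(), opt) == target.end())
			target.push_back(opt);
}

// Hypothetical helper: the options a package effectively sees, i.e. the
// global \documentclass options followed by its own \usepackage options.
std::vector<std::string> effective_options(std::string const & class_opts,
                                           std::string const & package_opts)
{
	std::vector<std::string> merged;
	append_unique(merged, split_options(class_opts));
	append_unique(merged, split_options(package_opts));
	return merged;
}
```

With this, effective_options("a4paper, latin1", "utf8,latin1") yields a4paper, latin1, utf8; a handler for inputenc or babel would then pick the entries it understands from that single list.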


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Jean-Marc Lasgouttes

On 10 Dec 2009, at 03:46, Uwe Stöhr wrote:
> LyX doesn't know "ascii" as encoding; we therefore need to translate
> it to an encoding that LyX understands. LyX understands latin5.

Hmmm...

2007-02-16  Georg Baum

* format incremented to 262: Allow ascii \inputencoding

What else do we need to make it work?

> When having only one encoding, this encoding can be set as
> h_inputencoding, otherwise h_inputencoding must be set to auto (the
> one of the document language). This is done in the routine for
> babel, but there are TeX files where the input encoding is specified
> before document or babel languages. Therefore do nothing when there
> is more than one encoding in the routine for inputenc.
>
> I now updated the comments a bit to make this more clear.

OK. But if there is one encoding and two languages, why do we need to
set encoding to auto instead of using this fixed encoding? Why not just
use "auto" when there are several encodings?

In other words, why does the language matter?

> Well you wrote: "I'll just remove the code I do not understand and
> look at what happens"
>
> I could therefore not resist to drop a funny statement.

My original sentence started with "otherwise". And yes, it was
designed as a way to get a quick reaction :)

JMarc

Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Uwe Stöhr

Jean-Marc Lasgouttes wrote:

>> LyX doesn't know "ascii" as encoding; we therefore need to translate
>> it to an encoding that LyX understands. LyX understands latin5.
>
> 2007-02-16  Georg Baum
>
> * format incremented to 262: Allow ascii \inputencoding
>
> What else do we need to make it work?

tex2lyx currently produces file format 264, so the special ascii
handling can indeed go. I'll have a closer look later.

>> When having only one encoding, this encoding can be set as
>> h_inputencoding, otherwise h_inputencoding must be set to auto (the
>> one of the document language). This is done in the routine for babel,
>> but there are TeX files where the input encoding is specified before
>> document or babel languages. Therefore do nothing when there is more
>> than one encoding in the routine for inputenc.
>>
>> I now updated the comments a bit to make this more clear.
>
> OK. But if there is one encoding and two languages,

In this case we could set the encoding as it is, but when the
inputencoding is specified after babel, we don't know whether there is
only one or several encodings when we detect several languages. I
therefore set the encoding to auto in the babel routine. This does no
harm and is safe.

> My original sentence started with "otherwise". And yes, it was designed
> as a way to get a quick reaction :)

Heh, that's extortion! ;-)

regards Uwe


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Jean-Marc Lasgouttes
"G. Milde"  writes:
>> OK. Then the document encoding unconditionally.
>
> Only if it differs from the language's default.

Maybe. Note that tex2lyx does not read Languages file currently (but I
think it should). We just have to be careful about what language is
handled at what revision of the file format.

>> We could add encoding to fonts, like we do for languages. But I do not
>> know how useful this would be.
>
> This would not solve the problem, nor would it be sensible. There are
> several possible encodings for German language (utf8, utf8x,
> latin1, latin12, ansinew, ...) and even more for "Latin Modern" fonts
> (utf8, latin1...15, koi8, ...).
> Are you sure you do not confuse inputenc with fontenc?

No, I meant have the user be able to say "this part of my text should be
output in cp1252". But I fail to see who would like that :)

I'll have a look at your example file.

JMarc


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Jean-Marc Lasgouttes
Uwe Stöhr  writes:
>> OK. But if there is one encoding and two languages,
>
> In this case we could set the encoding as it is but when the
> inputencoding is specified after babel, we don't know if there is only
> one or several encodings when we detect several languages. I therefore
> set the encoding to auto in the babel routine. This doesn't harm and
> is safe.

But what about ignoring the number of languages? It is about encodings,
after all.

>> My original sentence started with "otherwise". And yes, it was
>> designed as a way to get a quick reaction :)
>
> Heh, that's extortion! ;-)

:)

JMarc


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Uwe Stöhr

Jean-Marc Lasgouttes wrote:

>>> OK. But if there is one encoding and two languages,
>>
>> In this case we could set the encoding as it is but when the
>> inputencoding is specified after babel, we don't know if there is only
>> one or several encodings when we detect several languages. I therefore
>> set the encoding to auto in the babel routine. This doesn't harm and
>> is safe.
>
> But what about ignoring the number of languages? It is about encodings,
> after all.

We must do this:

- if there is only one encoding, use this
- if there are several encodings, set h_inputencoding to "auto".

So yes, this is independent of the languages. But what if there is only
one encoding specified in the package loading command of inputenc but
another one in the options of \documentclass. We already had some
strange TeX files in the past.
I therefore check for the languages in the \documentclass and the babel
call. When this should be made independent of the language, we also need
to check the encoding twice: in \documentclass and the inputenc call.


regards Uwe
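
The rule above can be sketched as follows. This is only an illustration under stated assumptions: choose_inputencoding() and the known_encodings table are hypothetical and not part of preamble.cpp, the table is a small made-up subset, and returning "auto" when no encoding is declared at all is a choice made for this sketch. The point is that the decision is taken once, after both the \documentclass options and the inputenc options are known, so neither the number of languages nor the order of the declarations matters.

```cpp
#include <set>
#include <string>
#include <vector>

// Illustrative subset of inputenc encoding names, used to recognise
// encodings among the \documentclass options (assumption, not a list
// copied from LyX).
static std::set<std::string> const known_encodings = {
	"ascii", "latin1", "latin2", "latin5", "latin9",
	"utf8", "utf8x", "cp1252", "ansinew", "koi8-r"
};

// Hypothetical helper: decide the value of h_inputencoding once the whole
// preamble has been read. Both arguments are already split option lists.
std::string choose_inputencoding(std::vector<std::string> const & class_opts,
                                 std::vector<std::string> const & inputenc_opts)
{
	std::set<std::string> found;
	// class options are mixed (paper size, languages, ...): keep only
	// the entries that name an encoding
	for (std::string const & opt : class_opts)
		if (known_encodings.count(opt))
			found.insert(opt);
	// every option of inputenc is an encoding
	for (std::string const & opt : inputenc_opts)
		found.insert(opt);
	// exactly one distinct encoding: use it; none or several: "auto"
	if (found.size() == 1)
		return *found.begin();
	return "auto";
}
```

For example, choose_inputencoding({"a4paper"}, {"latin1"}) gives "latin1", while a "strange" file with utf8 in \documentclass and latin1 in the inputenc call gives "auto".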


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Uwe Stöhr

Uwe Stöhr wrote:

>> 2007-02-16  Georg Baum
>>
>> * format incremented to 262: Allow ascii \inputencoding
>>
>> What else do we need to make it work?
>
> tex2lyx currently produces fileformat 264, so the special ascii handling
> can indeed go. I'll have a closer look later.


I implemented this now:
http://www.lyx.org/trac/changeset/32472

regards Uwe


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Guenter Milde
On 2009-12-10, Jean-Marc Lasgouttes wrote:
> "G. Milde"  writes:
>>> OK. Then the document encoding unconditionally.

>> Only if it differs from the language's default.

> Maybe. Note that tex2lyx does not read Languages file currently (but I
> think it should). We just have to be careful about what language is
> handled at what revision of the file format.

>>> We could add encoding to fonts, like we do for languages. But I do not
>>> know how useful this would be.

>> This would not solve the problem, nor would it be sensible.
...
>> Are you sure you do not confuse inputenc with fontenc?

> No, I meant have the user be able to say "this part of my text should be
> output in cp1252". But I fail to see who would like that :)

Ah, so you probably wanted to say: add encoding to the text properties
dialogue?

I agree with you that this is not a useful/needed feature.

Günter



Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-10 Thread Guenter Milde
On 2009-12-10, Uwe Stöhr wrote:
> Jean-Marc Lasgouttes schrieb:

>> But what about ignoring the number of languages? It is about encodings,
>> after all.

> - if there is only one encoding, use this
> - if there are several encodings, set h_inputencoding to "auto".

> ... But what if there is only one encoding specified in the package
> loading command of inputenc but another one in the options of
> \documentclass. 
...
> When this should be made independent of the language, we also need to
> check the encoding twice: in \documentclass and the inputenc call.

I favour this clean approach. Checking the language is just guessing.

Alternatively, drop the language check and allow the "strange" files to
get a different encoding in LyX. Since the switch to Unicode in LyX,
there should no longer be any error, just some more replacements from the
unicodesymbols file.

Günter




Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-09 Thread Guenter Milde
On 2009-12-09, Jean-Marc Lasgouttes wrote:
> Uwe,

> I have been looking at this code from preamble.cpp

>   // only set when there is not more than one inputenc
>   // option therefore check for the "," character also
>   // only set when there is not more than one babel
>   // language option
>   if (opts.find(",") == string::npos && one_language == true) {
>       if (opts == "ascii")
>           // change ascii to auto to be in the unicode range, see
>           // http://www.lyx.org/trac/ticket/4719
>           h_inputencoding = "auto";
>       else if (!opts.empty())
>           h_inputencoding = opts;
>   }

> ... but I cannot understand what is so special about having several
> encodings, or even several languages defined.

In LyX, you can set Document>Settings>Language to either

  (*) Language default
  ( ) Other
  
With "Other", you set *one* encoding for the whole document,
with "Language default" the encoding is taken from the 'languages'
config file and there can be more than one encoding in the document.

To get multiple defined encodings from a latex file into a lyx
document, the best try is using "auto" encoding (i.e. "Language
default").

Please do not remove this code.

Günter



Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-09 Thread Jean-Marc Lasgouttes

> In LyX, you can set Document>Settings>Language to either
>
>  (*) Language default
>  ( ) Other
>
> With "Other", you set *one* encoding for the whole document,
> with "Language default" the encoding is taken from the 'languages'
> config file and there can be more than one encoding in the document.
>
> To get multiple defined encodings from a latex file into a lyx
> document, the best try is using "auto" encoding (i.e. "Language
> default").

OK, so what about just keeping the last encoding and, after finishing
the parsing of preamble, resetting this encoding to "auto" if it is
the default encoding for the language?

Or just set the encoding to "auto" unconditionally?

The current scheme is arbitrary and not predictable IMO. I do not see
why counting encodings and language means something. Moreover, it seems
that it fails if the babel statement is after the inputenc package loading.

JMarc
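
A rough sketch of the first proposal (keep the last encoding, then reset it to "auto" after the preamble if it is the language default). Everything here is an assumption made for illustration: finalize_inputencoding() is a hypothetical name, and the default_encoding table is invented for the example; in LyX that information comes from the 'languages' file, which tex2lyx does not read at this point.

```cpp
#include <map>
#include <string>

// Invented language -> default encoding entries, for illustration only.
static std::map<std::string, std::string> const default_encoding = {
	{"french", "latin9"},
	{"german", "latin9"},
	{"russian", "koi8-r"}
};

// Hypothetical post-processing step, run once after the preamble has
// been parsed. last_encoding is the last encoding seen (may be empty).
std::string finalize_inputencoding(std::string const & language,
                                   std::string const & last_encoding)
{
	// nothing declared: let LyX use the language default
	if (last_encoding.empty())
		return "auto";
	// the declared encoding is the language default anyway: "auto"
	auto const it = default_encoding.find(language);
	if (it != default_encoding.end() && it->second == last_encoding)
		return "auto";
	// otherwise keep the explicit encoding
	return last_encoding;
}
```

Because the check runs after parsing, it does not depend on whether babel or inputenc is loaded first, and an explicit non-default encoding such as utf8x is preserved.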


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-09 Thread Uwe Stöhr

> I have been looking at this code from preamble.cpp
>
>   // only set when there is not more than one inputenc
>   // option therefore check for the "," character also
>   // only set when there is not more than one babel
>   // language option
>   if (opts.find(",") == string::npos && one_language == true) {
>       if (opts == "ascii")
>           // change ascii to auto to be in the unicode range, see
>           // http://www.lyx.org/trac/ticket/4719
>           h_inputencoding = "auto";
>       else if (!opts.empty())
>           h_inputencoding = opts;
>   }
>
> and I do not understand its meaning. I can see that the special
> handling of "ascii" is not needed anymore, but I cannot understand
> what is so special about having several encodings, or even several
> languages defined.

"auto" means in this case that the default encoding is used. So ascii is
converted to the default encoding of the document language.

TeX documents can have different encodings. The last listed encoding is
the document-wide encoding.

> Could you enlighten me? Otherwise, I'll just remove the code I do not
> understand and look at what happens :)

Removing something that one doesn't understand is a bad idea. Imagine
you stop the water cooling of an atomic power plant because you think
it's unnecessary ;-).


regards Uwe


Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-09 Thread Jean-Marc Lasgouttes

On 9 Dec 2009, at 22:48, G. Milde wrote:

>> OK, so what about just keeping the last encoding and, after finishing
>> the parsing of preamble, resetting this encoding to "auto" if it is
>> the default encoding for the language?
>
> But what to do if not? We should rather expect the other encoding to
> be used (why would it be defined otherwise) or parse the whole
> document first to be sure.

We should not parse the document in advance IMO. This is too much
effort for a case that does not seem so useful.

BTW what would be your real-world examples of multi-encoding documents?
I am interested because I suspect that we do not have the same thing
in mind.

>> Or just set the encoding to "auto" unconditionally?
>
> No. I want to keep e.g. utf8x also with a round-trip.

OK. Then the document encoding unconditionally.

A question that I have is: do we have to fear that documents will be
wrongly output because of encoding, or are our unicode tables good
enough (in practice) to work around problems?

I guess my question is: is the main problem correctness of the printed
result or keeping the same encodings to make our tex-based coauthor happy?

> The problem is that we have a lossy transformation. There are less
> degrees of freedom in LyX than in LaTeX.

We could add encoding to fonts, like we do for languages. But I do not
know how useful this would be.

> The current scheme is trying its best to get working results and
> round-trips without encoding change.

I failed to find the discussion and the use cases that led to this...
That's why I do not grasp it yet.

> To find out which encoding setting to use:

[snip reasonable proposals]

>   b3) use a preamble command and ERT instead of or in addition to the
>   LyX encoding mechanism.

I'd really like to avoid that.

JMarc




Re: (to Uwe) Special tex2lyx handling for multiple encodings declarations

2009-12-09 Thread Uwe Stöhr

Jean-Marc Lasgouttes wrote:

> For the case of "ascii" I already know that I am going to remove it :)
> If you read the linked bug, you will notice that we do create
> ascii-friendly accents as needed since we switched to unicode. There
> is zero need to handle ascii differently from, say, latin5.

LyX doesn't know "ascii" as encoding; we therefore need to translate it
to an encoding that LyX understands. LyX understands latin5.

>> TeX documents can have different encodings. The last listed encoding is
>> the document-wide encoding.
>
> Please. Imagine at least that I know that and that I may have a reason
> to ask :) My question is why we do this weird "only do this if we have 1
> language and 1 encoding" dance?
>
> What is the difference between a latin1 document in french (or english)
> only and a latin1 document in french and english?

Your question sounded more general, therefore my general reply.
To answer your question precisely:

When having only one encoding, this encoding can be set as
h_inputencoding, otherwise h_inputencoding must be set to auto (the one
of the document language). This is done in the routine for babel, but
there are TeX files where the input encoding is specified before
document or babel languages. Therefore do nothing when there is more
than one encoding in the routine for inputenc.

I now updated the comments a bit to make this more clear.

>>> Could you enlighten me? Otherwise, I'll just remove the code I do not
>>> understand and look at what happens :)
>>
>> Removing something that one doesn't understand is a bad idea. Imagine
>> you stop the water cooling of an atomic power plant because you think
>> it's unnecessary ;-).
>
> Read again. I did not remove any code, I asked. However, you will have
> to try harder in order to convince me.

Well you wrote: "I'll just remove the code I do not understand and look
at what happens"
I could therefore not resist to drop a funny statement.

regards Uwe