Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-31 Thread Guenter Milde
Dear Enrico, dear LyX developers,

the patch for em- and en-dashes [72a488d7] tries to restore (as far as
possible/sensible) the pre-2.2 behaviour regarding dashes.
It also keeps the behaviour of 2.2 documents :-).

Nevertheless, there are some shortcomings and problems:

1. Pre-2.2 documents using literal dashes were not affected by the change in
   2.2, but now they require user interaction (unchecking
   use-ligature-dashes) to work as before.

2. Back-conversion destroys the information on whether dashes shall be
   exported as literal characters or ligatures.

3. Different behaviour for documents with non-TeX fonts when compiled with
   LuaTeX.


Fixes
=====

lyx2lyx converts dash ligatures to the interim representations \twohyphens
and \threehyphens and back:

2.1  <->  2.2
--   <->  \twohyphens
---  <->  \threehyphens

literal EM DASH and EN DASH characters are kept as-is

-> the information about the dash-type is only lost if a document is *opened*
in 2.2. :-)
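
A minimal sketch of the mapping above, in Python (the language of lyx2lyx).
The function names are hypothetical, and the single space used as separator
after each macro is an assumption of this sketch; the real lyx2lyx code works
on the document body and has to skip ERT, insets and similar contexts.

```python
# Hypothetical helpers, not the actual lyx2lyx functions.
# Assumption: each interim macro is followed by one space as separator.

def ligatures_to_macros(text):
    """2.1 -> 2.2: replace dash ligatures by the interim macros."""
    # Handle '---' first so that it is not read as '--' followed by '-'.
    text = text.replace("---", "\\threehyphens ")
    return text.replace("--", "\\twohyphens ")


def macros_to_ligatures(text):
    """2.2 -> 2.1: restore the dash ligatures from the interim macros."""
    text = text.replace("\\threehyphens ", "---")
    return text.replace("\\twohyphens ", "--")


sample = "pages 2--12 --- see \u2013 and \u2014"
converted = ligatures_to_macros(sample)
# Literal EN DASH / EM DASH characters are not touched by either direction.
print(converted)
print(macros_to_ligatures(converted) == sample)
```

Because the literal characters pass through unchanged, the two dash
representations stay distinguishable as long as the file is only converted,
not opened and saved.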

This allows solving problems 1 and 2 with the following steps:

a) back-convert "ligature dashes" to \twohyphens resp. \threehyphens

   +1 solves problem 2
      (keeps info also in previous versions (unless opened in 2.2))
   +1 works also in LyX versions not supporting ZWSP.

b) move the ligature-dash -> literal-dash+ZWSP conversion from
   lyx2lyx/lyx_2_3.py to Text.cpp

   +1 backport to 2.2 fixes unwanted overfull lines in 2.2

c) When converting older documents to 2.3 with lyx2lyx, set
   \use-ligature-dashes TRUE if the document contains \threehyphens or
   \twohyphens instead of depending on the document's fileformat version.

   +1 respect pre-2.2 documents with literal dashes (solves problem 1)
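
Step c) could look roughly like the following sketch. The helper names are
hypothetical; the header name \use_dash_ligatures is the one used elsewhere
in this thread, and the "true"/"false" spelling of the value is an
assumption.

```python
# Hypothetical sketch of step c): choose the header value from the body
# content instead of from the file format version.

def wants_dash_ligatures(body_lines):
    """True if the body contains one of the interim ligature macros."""
    return any("\\twohyphens" in line or "\\threehyphens" in line
               for line in body_lines)


def set_dash_header(header_lines, body_lines, name="\\use_dash_ligatures"):
    """Return a new header list with the dash setting written explicitly."""
    value = "true" if wants_dash_ligatures(body_lines) else "false"
    # Drop any existing setting so the header appears exactly once.
    kept = [line for line in header_lines if not line.startswith(name + " ")]
    return kept + ["%s %s" % (name, value)]


header = ["\\use_non_tex_fonts false"]
print(set_dash_header(header, ["years 1987 \\twohyphens 1990"]))
print(set_dash_header(header, ["a literal \u2014 dash"]))
```

A document that mixes both representations still ends up with a single
value, which is the limitation described in the next paragraph.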

Old documents that contain a *mix* of literal and ligature dashes will
still show changed behaviour (regardless of the value of
"use_ligature_dashes"). This may be tolerable (assuming that only a
small fraction of existing LyX documents mix the dash representations and
only a small fraction of them will experience changed line breaks).


Alternatives
============

Support for parallel use of ligature and literal dashes can be realized with
a "special character" for ligature dashes instead of the buffer setting.

-1 em- and en-dashes are common printable characters (except for the line
   break details). Keeping two alternative representations may be overkill.
   (OTOH, we have a similar case with quotes.)



Different line breaking behaviour with LuaTeX (non-TeX fonts) compared to
the other exports (problem 3) can be solved with the alternatives

a) use literal dashes exclusively, set \XeTeXdashbreakstate=0

   +1 simple
   -1 no line break after dashes (can be fixed with "allowbreak" (see below))

b) export as ligature also with non-TeX fonts, except for teletype.

c) Preamble code making the literal dashes active and binding them to
   ligatures for LuaTeX


Last but not least:

The expansion of the literal dashes in lib/unicodesymbols changed to macros
instead of ligatures 10 years ago in
https://www.lyx.org/trac/changeset/18802/lyxsvn with the explanation:

"unicodesymbols: use commands for the dashes for consistency reasons and
to avoid potential problems with some LaTeX-packages"


Before going back to ligatures, we should explore the reasons/side-effects:

@Uwe: Can you give an example of "potential problems with some LaTeX-packages"?

"Consistency" is clear: "inputenc" uses \textendash and \textemdash for
encodings supporting the literal dashes.

The LaTeX team introduced \text* commands replacing the font ligatures
already 23 years ago and gave the reasons in "LaTeX2ε for authors"
(usrguide.pdf):

\textemdash \textendash \textexclamdown \textquestiondown \textquotedblleft
\textquotedblright \textquoteleft \textquoteright

New feature 1994/12/01

These commands produce characters which would otherwise be accessed via
ligatures
...
The reason for making these characters directly accessible is so that they
will work in encodings which do not have these characters.


My preference:

The "allowbreak" special character (ticket #10585) allows an easily
configurable line break option after the em-dash. The combination literal
em-dash + allowbreak is a good default for most use cases, cf.
https://www.lyx.org/trac/raw-attachment/ticket/10543/emdash-line-breaks.pdf
and https://www.lyx.org/trac/raw-attachment/ticket/10543/dash-problems.lyx

   +1 correct feedback
   +1 optional line break after the dash
   +1 allows hyphenation of preceding/following word
   +1 configurable
   +1 simple local override by deleting ZWSP

Suggestion: add an "allowbreak" when converting en-dash + hyphen to em-dash
on input.

For the en-dash, I suggest leaving it as-is so that range specifications
(pages 2--12, years 1987--1990, ...) don't wrap.

Problem #10490 (sorting indexes) can also be solved without reverting to
ligature input.

Günter



Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-21 Thread Jürgen Spitzmüller
2017-03-21 11:06 GMT+01:00 José Abílio Matos :

> On Tuesday, 21 March 2017 09.26.39 WET Jürgen Spitzmüller wrote:
> >
> > I am not sure I understand. The default setting of \dynamic_quotes is
> > "false", which is equal to the case when the param is missing.
> >
> > Jürgen
>
> That was the subject of my rambling on Sunday and something that I have
> been trying to enforce in 2.3, but that I have clearly not communicated
> properly. :-)
>
> You are right if you compare the output for lyx-2.3.
>
> My problem is a different one. Suppose that a document that you created
> with lyx 2.2 is not touched with lyx 2.3 but later opened with a later
> version. Suppose also that after 2.3 we decide to change the default
> setting of \dynamic_quotes to "true".
>

In that case, I would ensure that old documents (with no \dynamic_quotes
header) get the correct value, i.e. "false".

I mean, if we decided that the default value is true, we would have to
ensure that the header is always output for new documents (no header would
still mean "false").
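
A minimal sketch of that rule, with hypothetical function names. The
"true"/"false" spelling follows the wording used in this thread; the value
may be encoded differently in the actual file.

```python
# Hypothetical sketch: a missing \dynamic_quotes header always means "false",
# whatever default the program uses for newly created documents.

def read_dynamic_quotes(header_lines):
    """Return the setting stored in the header, or False if it is absent."""
    for line in header_lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "\\dynamic_quotes":
            return parts[1] == "true"
    return False  # no header: old document, keep the old behaviour


def new_document_header(default):
    """New documents always write the header, so the default may change."""
    return ["\\dynamic_quotes %s" % ("true" if default else "false")]


print(read_dynamic_quotes([]))
print(read_dynamic_quotes(new_document_header(True)))
```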

But I see what you mean in the sense of "more transparent file format".

Jürgen


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-21 Thread José Abílio Matos
On Tuesday, 21 March 2017 09.26.39 WET Jürgen Spitzmüller wrote:
> 
> I am not sure I understand. The default setting of \dynamic_quotes is
> "false", which is equal to the case when the param is missing.
> 
> Jürgen

That was the subject of my rambling on Sunday and something that I have been
trying to enforce in 2.3, but that I have clearly not communicated properly.
:-)

You are right if you compare the output for lyx-2.3.

My problem is a different one. Suppose that a document that you created with
lyx 2.2 is not touched with lyx 2.3 but later opened with a later version.
Suppose also that after 2.3 we decide to change the default setting of
\dynamic_quotes to "true".

So now depending on the conversion path we will get two different documents 
(with a possibly different output):

Case 1:
 * open the original document in lyx 2.3, do not change any of its content but 
make it dirty in order to be able to save it;
 * open that saved document with a later lyx version.

Case 2:
 * open the original document directly with the later version.


Honestly most of the time there will not be any difference between the two 
scenarios. My problem is that when there is a difference it could be very 
difficult to catch.


So what I am proposing is a stricter implementation of a new file format. In
the case where a new header property is added, the associated lyx2lyx changes
already have, in the reversion step, a moment where the property is removed
from the header. I propose that, as much as possible, the conversion step
should likewise add the property to the header with its default value.

The ideal case would be a new test where, for each new file format, we have a
set of tests in which we take a document (with the User's Guide being the best
candidate), convert it with lyx2lyx to the new file format, load it into lyx,
force a save and compare the two files. Ideally there should not be any
difference between the two versions.
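
The comparison step of such a test could be sketched as follows. The header
lists are toy data standing in for real files, and the function name is
hypothetical; running lyx2lyx and LyX themselves is left out.

```python
import difflib

# Hypothetical sketch of the comparison step only.

def header_diff(converted_lines, saved_lines):
    """Unified diff between lyx2lyx output and the file LyX saved from it."""
    return list(difflib.unified_diff(converted_lines, saved_lines,
                                     fromfile="lyx2lyx", tofile="lyx",
                                     lineterm=""))


# What LyX would write on save (toy data, an assumption of this sketch).
saved = ["\\use_non_tex_fonts false", "\\dynamic_quotes false"]
# A conversion that omits the new property and one that adds its default.
lazy_conversion = ["\\use_non_tex_fonts false"]
strict_conversion = lazy_conversion + ["\\dynamic_quotes false"]

print(header_diff(lazy_conversion, saved))    # non-empty: property missing
print(header_diff(strict_conversion, saved))  # empty: files agree
```

An empty diff is the outcome the stricter conversion aims for.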

I hope that this makes sense. :-)

Regards,
-- 
José Abílio


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread Enrico Forestieri
On Mon, Mar 20, 2017 at 07:05:27PM +0100, Guillaume MM wrote:

> On 20/03/2017 at 11:00, Enrico Forestieri wrote:
> > On Mon, Mar 20, 2017 at 07:25:38AM +0100, Guillaume MM wrote:
> > 
> > > On 19/03/2017 at 20:58, Enrico Forestieri wrote:
> > > > +  Don't use ligatures for en- and em-dashes
> > > 
> > > Can I suggest phrasing it positively? "Output en- and em-dashes as
> > > ligatures"
> > 
> > I thought about it, but this is not possible. As it is phrased now, you
> > can be sure that, if it is checked, dash ligatures will not be used.
> > If you phrase it positively, then you have to always use dash ligatures,
> > also when using non-TeX fonts, in which case it will not work.
> > 
> 
> Even clearer, then, would be to phrase positively and grey out the box
> on use-non-tex-fonts. Just a suggestion.

Good idea. I did that.

-- 
Enrico


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread Guillaume MM

On 20/03/2017 at 11:00, Enrico Forestieri wrote:
> On Mon, Mar 20, 2017 at 07:25:38AM +0100, Guillaume MM wrote:
>
> > On 19/03/2017 at 20:58, Enrico Forestieri wrote:
> > > +  Don't use ligatures for en- and em-dashes
> >
> > Can I suggest phrasing it positively? "Output en- and em-dashes as
> > ligatures"
>
> I thought about it, but this is not possible. As it is phrased now, you
> can be sure that, if it is checked, dash ligatures will not be used.
> If you phrase it positively, then you have to always use dash ligatures,
> also when using non-TeX fonts, in which case it will not work.

Even clearer, then, would be to phrase positively and grey out the box
on use-non-tex-fonts. Just a suggestion.



Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread José Abílio Matos
On Monday, 20 March 2017 13.28.36 WET Enrico Forestieri wrote:
> I think that info was already in document.start

You are right and initially I started to search for it but for some reason got 
sidetracked.

I intend to commit a change to remove document.start and to change all 
references to document.initial_format for symmetry with document.final_format.

Those names are more descriptive.
-- 
José Abílio


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread Enrico Forestieri
On Mon, Mar 20, 2017 at 11:56:02AM +0000, José Abílio Matos wrote:
> On Monday, 20 March 2017 10.52.26 WET Enrico Forestieri wrote:
> > 
> > How do you know what is the initial starting file format?
> 
> I am glad you ask. :-) After my last commit:
> 
> document.initial_format

I think that info was already in document.start

-- 
Enrico


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread José Abílio Matos
On Monday, 20 March 2017 10.52.26 WET Enrico Forestieri wrote:
> > For these changes I would expect \use_dash_ligatures to be mentioned in
> > FORMATS, since this is a new header that did not exist before.
> 
> I think you mean development/FORMAT. I will add the reference. Note that
> also \dynamic_quotes needs to be mentioned there.

Yes, that is what I meant. :-)

> > At the same time I would expect this header to be set in every new
> > document, to either true or false depending on the file content or, as
> > you describe, depending on the initial file format.
> 
> How do you know what is the initial starting file format?

I am glad you ask. :-) After my last commit:

document.initial_format

> > We want, as much as possible, the result of a lyx2lyx file conversion
> > to be equal to the same content read and exported by lyx without any
> > further changes.
> 
> I see. Will do that. Note that this has also to be done for \dynamic_quotes.

I agree.

Thank you. :-)
-- 
José Abílio


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread Enrico Forestieri
On Mon, Mar 20, 2017 at 12:35:35AM +0000, José Abílio Matos wrote:

> On Sunday, 19 March 2017 19.58.08 WET Enrico Forestieri wrote:
> > commit 72a488d7e6b56432263c80dd92cd6acc565e03a7
> > Author: Enrico Forestieri 
> > Date:   Sun Mar 19 20:50:34 2017 +0100
> > 
> > Fix output of en- and em-dashes with TeX fonts
> > 
> > This commit fixes the regression introduced in 2.2 about the
> > output of en- and em-dashes. In 2.2 en- and em-dashes are output as
> > the \textendash and \textemdash macros when using TeX fonts, causing
> > changed output in old documents and also bugs (for example, #10490).
> > 
> > Now documents produced with older versions work again as intended,
> > while documents produced with 2.2 can be made to produce the exact
> > same output by simply checking "Don't use ligatures for en- and
> > em-dashes" in Document->Settings->Fonts.
> > 
> > When exporting documents using TeX fonts to earlier versions, in order
> > to avoid changed output, a zero-width space character is inserted after
> > each en/em-dash if dash ligatures are allowed. These characters are
> > removed when reloading documents with 2.3, so that they don't
> > accumulate.
> 
> For these changes I would expect \use_dash_ligatures to be mentioned in
> FORMATS, since this is a new header that did not exist before.

I think you mean development/FORMAT. I will add the reference. Note that
also \dynamic_quotes needs to be mentioned there.

> At the same time I would expect this header to be set in every new
> document, to either true or false depending on the file content or, as
> you describe, depending on the initial file format.

How do you know what is the initial starting file format?

> We want, as much as possible, the result of a lyx2lyx file conversion to
> be equal to the same content read and exported by lyx without any further
> changes.

I see. Will do that. Note that this has also to be done for \dynamic_quotes.

-- 
Enrico


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread Enrico Forestieri
On Mon, Mar 20, 2017 at 07:25:38AM +0100, Guillaume MM wrote:

> On 19/03/2017 at 20:58, Enrico Forestieri wrote:
> > +  Don't use ligatures for en- and em-dashes
> 
> Can I suggest phrasing it positively? "Output en- and em-dashes as
> ligatures"

I thought about it, but this is not possible. As it is phrased now, you
can be sure that, if it is checked, dash ligatures will not be used.
If you phrase it positively, then you have to always use dash ligatures,
also when using non-TeX fonts, in which case it will not work.

-- 
Enrico


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread José Abílio Matos
On Sunday, 19 March 2017 19.58.08 WET Enrico Forestieri wrote:
> commit 72a488d7e6b56432263c80dd92cd6acc565e03a7
> Author: Enrico Forestieri 
> Date:   Sun Mar 19 20:50:34 2017 +0100
> 
> Fix output of en- and em-dashes with TeX fonts
> 
> This commit fixes the regression introduced in 2.2 about the
> output of en- and em-dashes. In 2.2 en- and em-dashes are output as
> the \textendash and \textemdash macros when using TeX fonts, causing
> changed output in old documents and also bugs (for example, #10490).
> 
> Now documents produced with older versions work again as intended,
> while documents produced with 2.2 can be made to produce the exact
> same output by simply checking "Don't use ligatures for en- and
> em-dashes" in Document->Settings->Fonts.
> 
> When exporting documents using TeX fonts to earlier versions, in order
> to avoid changed output, a zero-width space character is inserted after
> each en/em-dash if dash ligatures are allowed. These characters are
> removed when reloading documents with 2.3, so that they don't
> accumulate.

For these changes I would expect \use_dash_ligatures to be mentioned in
FORMATS, since this is a new header that did not exist before.

At the same time I would expect this header to be set in every new document,
to either true or false depending on the file content or, as you describe,
depending on the initial file format.

We want, as much as possible, the result of a lyx2lyx file conversion to be
equal to the same content read and exported by lyx without any further changes.

Regards,
-- 
José Abílio


Re: [LyX/master] Fix output of en- and em-dashes with TeX fonts

2017-03-20 Thread Guillaume MM

On 19/03/2017 at 20:58, Enrico Forestieri wrote:
> +  Don't use ligatures for en- and em-dashes

Can I suggest phrasing it positively? "Output en- and em-dashes as
ligatures"