Re: ctests for input encoding

2016-11-07 Thread Guenter Milde
On 2016-11-07, Kornel Benko wrote:
> On Monday, 7 November 2016 at 15:47:28, Guenter Milde wrote:
> 

Dear Kornel,

...

>> One export route suffices. This is why I set the "default output format"
>> to pdf2 in the sample files.

> Do you (we) want all encoding tests for 084-misc-symbols_pdf2* to be
> contained in one test, or do you prefer them to be separate?

Actually, 084-misc-symbols was only an example.

A simple way to test the data files "lib/unicodesymbols" and
"lib/encodings" properly would be to create 9*51 separate tests: the
9 sample files¹ matching "autotests/export/latex/Unicode-characters/*.lyx",
each in all 51 input encodings² defined in lib/encodings.

¹ After removing the 9 duplicates "Unicode-characters/*utf8.lyx".

² Some encodings may only work with special export routes; these
  encodings could be excluded with patterns in "ignoredTests".
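
A rough sketch of that 9*51 expansion, written in Python rather than in the
CMake/Perl the test machinery actually uses. The function name
`expand_tests`, its parameters and the ignore pattern are hypothetical; only
the test-name prefix and the encoding names are taken from this thread.

```python
import re
from itertools import product


def expand_tests(samples, encodings, ignored_patterns=(), fmt="pdf2"):
    """Return one test name per (sample, encoding) pair, minus ignored ones.

    Hypothetical helper: the real naming is decided by the ctest machinery.
    """
    ignored = [re.compile(p) for p in ignored_patterns]
    names = []
    for sample, enc in product(samples, encodings):
        name = f"export/export/latex/Unicode-characters/{sample}_{fmt}_{enc}"
        # Skip combinations matched by an "ignoredTests"-style pattern.
        if any(rx.search(name) for rx in ignored):
            continue
        names.append(name)
    return names


# One sample file stands in for the 9; the encodings are those named above.
samples = ["084-misc-symbols"]
encodings = ["utf8", "armscii", "latin1", "tis620-0", "utf8-plain", "ascii"]
tests = expand_tests(samples, encodings, ignored_patterns=[r"_tis620-0$"])
for name in tests:
    print(name)
```

With 9 samples and 51 encodings and no ignore patterns, the same function
yields 459 names.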

Thanks a lot,

Günter 



Re: ctests for input encoding

2016-11-07 Thread Kornel Benko
On Monday, 7 November 2016 at 15:47:28, Guenter Milde wrote:

> On 2016-11-07, Kornel Benko wrote:
> > On Monday, 7 November 2016 at 15:10:09, Kornel Benko wrote:
> > 
> >> On Monday, 7 November 2016 at 13:44:17, Guenter Milde wrote:
> >> 
> 
> Dear Kornel,
> 
> ...
> 
> >> > I'd like an "expansion" of
> >> > export/latex/Unicode-characters/084-misc-symbols.lyx into
> >> > 
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_armscii
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_latin1
> >> >  ...
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_tis620-0
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8-plain
> >> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_ascii
> >> >  
> >> > Would this be feasible?
> 
> >> Yes, it is. New parameter for useSystemFonts.pl. But I'd prefer to use
> >> it only for files in a special directory.
> 
> So do I: please limit the "expansion" with *all* encodings to the
> directory "autotests/export/latex/Unicode-characters" (maybe renaming it
> to "autotests/export/latex/test_lib_unicodesymbols/" first). With the
> "inputenc expansion", I would remove the "*utf8.lyx" samples, of course.
> 
> (OTOH, I still suggest testing "utf8" and "ascii" input encodings with all
> samples and pdf2.)
> 
> >> Do you want to test also combinations like  084-misc-symbols_pdf4_utf8 ?
> > Ouch .. I see you already want it ...
> >> Looks like overkill ...
> 
> One export route suffices. This is why I set the "default output format"
> to pdf2 in the sample files.

Do you (we) want all encoding tests for 084-misc-symbols_pdf2* to be
contained in one test, or do you prefer them to be separate?

> Thanks,
> 
> Günter 

Kornel



Re: [patch] remove unsupported encoding l7xenc

2016-11-07 Thread Uwe Stöhr

On 07.11.2016 at 08:21, Guenter Milde wrote:


Great. IMO, this should also go to stable.


I'll ask Richard. For master it is in.


Do you also update the wiki?


Could you please do this:
http://wiki.lyx.org/Windows/Lithuanian
since you apparently know more about this topic.

thanks and regards
Uwe


Re: [LyX/master] Get rid of ParagraphMetrics::insetDimension

2016-11-07 Thread Scott Kostyshak
On Mon, Oct 19, 2015 at 01:18:03PM +0200, Jean-Marc Lasgouttes wrote:
> commit 26eb5092fb69464d181caaf212d6a4d9c9cff2f0
> Author: Jean-Marc Lasgouttes 
> Date:   Mon Oct 12 16:11:58 2015 +0200
> 
> Get rid of ParagraphMetrics::insetDimension
> 
> We already have a CoordCache of insets dimensions. It is not necessary
> to store the same information in two places.
> 
> Give a name to CoordCache tables types to improve code readability.
> 
> Remove ParagraphMetrics::singleWidth, which is not used anymore.

This commit leads to an assertion for me at line 31 (of current master)
of CoordCache.cpp.

To reproduce:

1. open the attached file, undefined_branch.21.lyx.
2. put the cursor at the beginning, to the left of the branch.
3. press shift+ to highlight the branch.
4. go to Insert > Branch > Insert New Branch, put "newBranch" and press
"OK".

I do not get an assertion if the branch being highlighted is a defined
branch.

Scott
#LyX 2.2 created this file. For more info see http://www.lyx.org/
\lyxformat 474
\begin_document
\begin_header
\textclass article
\use_default_options true
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding auto
\fontencoding global
\font_roman default
\font_sans default
\font_typewriter default
\font_math auto
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\default_output_format default
\output_sync 1
\bibtex_command default
\index_command default
\paperfontsize default
\spacing single
\use_hyperref false
\papersize default
\use_geometry false
\use_package amsmath 1
\use_package amssymb 1
\use_package cancel 1
\use_package esint 1
\use_package mathdots 1
\use_package mathtools 1
\use_package mhchem 1
\use_package stackrel 1
\use_package stmaryrd 1
\use_package undertilde 1
\cite_engine basic
\cite_engine_type default
\biblio_style plain
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\justification true
\use_refstyle 1
\index Index
\shortcut idx
\color #008000
\end_index
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header

\begin_body

\begin_layout Standard
\begin_inset Branch notDefined
status open

\begin_layout Standard
hello
\end_layout

\end_inset


\end_layout

\end_body
\end_document




Re: Master is slow

2016-11-07 Thread racoon

On 01.11.2016 19:14, racoon wrote:

On 22.10.2016 19:39, Guillaume Munch wrote:

On 18/10/2016 at 21:44, Guillaume Munch wrote:


Profiling shows that calls to BufferParams::isExportableFormat
are numerous and expensive when doing char-forward (33% of the total
amount of CPU). This is called from GuiView::updateToolbars ->
GuiView::getStatus. There is room for improvement, but this is not new
behaviour apparently.

I also found that calls to TabWorkArea::updateTabTexts are
expensive and repeated. This amounts to 31% of the total amount of CPU,
shared between QTabWidget::setTabText and QTabWidget::setTabIcon.
TabWorkArea::updateTabTexts is connected to the signal
GuiWorkArea::titleChanged.




After improvements by Richard and Jean-Marc, GuiView::getStatus is down
to 10% (mostly lyx::to_utf8) and there is no trace of tabs updating.

New bottlenecks are Buffer::updateMacros (25%, of course depends on
the document) and nothing else looks really out of place. You can
celebrate.

Daniel, is it still slow for you?


Sorry for the late reply. I still have a hard time finding out when
someone sends a message to the group without also answering to my email
address.

Yes, it is still slow. But it still seems to be in some kind of debug
mode: there is a console in the background, Visual Studio says

Build started: Project: lyx_version, Configuration: Debug Win32
Build started: Project: frontend_qt, Configuration: Debug Win32
etc.
Install configuration: "Debug"

I tried to uncheck/check the flags that have been suggested but it did
not change those messages.


Update on slowness.

After installing Qt 5.6.2 and using the latest master, LyX now seems fine,
even though I still get the 'Install configuration: "Debug"' message and the
like from Visual Studio.


Daniel




Re: pasted non-acceptable symbol

2016-11-07 Thread Guenter Milde
On 2016-11-07, Jean-Marc Lasgouttes wrote:
> On 07/11/2016 at 14:31, Guenter Milde wrote:

...

> Converting them is possible in insertStringAsXXX and in tex2lyx, as I 
> wrote. But we cannot forbid the characters to be in our documents (or if 
> we can, it will be a waste of energy to cater for all cases). This is 
> why I was wondering what would be the last-chance solution.

> We have a patch for not crashing. Now an approach for exporting some 
> working LaTeX would be a good complement.

I see.
As we have a precedent with NBSP and similar characters, I recommend using
replacements in lib/unicodesymbols for the fallback export solution
(as written in my last posting).

Günter



Re: pasted non-acceptable symbol

2016-11-07 Thread Jean-Marc Lasgouttes

On 07/11/2016 at 14:31, Guenter Milde wrote:

Finally, I convinced myself that your approach is correct if we want to
keep the breaks. In the following patch I add some on-screen hints of
what is going on. I could use a color of the characters, but I am not
sure what to do, these are actual characters, not insets. A solution
could be to add a frame around the characters.


I'd rather convert them to the "usual LyX representations"

  \begin_inset Newline newline
  \end_inset

and

  \begin_layout 


Converting them is possible in insertStringAsXXX and in tex2lyx, as I 
wrote. But we cannot forbid the characters to be in our documents (or if 
we can, it will be a waste of energy to cater for all cases). This is 
why I was wondering what would be the last-chance solution.


We have a patch for not crashing. Now an approach for exporting some 
working LaTeX would be a good complement.


JMarc




Re: ctests for input encoding

2016-11-07 Thread Guenter Milde
On 2016-11-07, Kornel Benko wrote:
> On Monday, 7 November 2016 at 15:10:09, Kornel Benko wrote:
>> On Monday, 7 November 2016 at 13:44:17, Guenter Milde wrote:
>> 

Dear Kornel,

...

>> > I'd like an "expansion" of
>> > export/latex/Unicode-characters/084-misc-symbols.lyx into
>> > 
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_armscii
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_latin1
>> >  ...
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_tis620-0
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8-plain
>> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_ascii
>> >  
>> > Would this be feasible?

>> Yes, it is. New parameter for useSystemFonts.pl. But I'd prefer to use
>> it only for files in a special directory.

So do I: please limit the "expansion" with *all* encodings to the
directory "autotests/export/latex/Unicode-characters" (maybe renaming it
to "autotests/export/latex/test_lib_unicodesymbols/" first). With the
"inputenc expansion", I would remove the "*utf8.lyx" samples, of course.

(OTOH, I still suggest testing "utf8" and "ascii" input encodings with all
samples and pdf2.)

>> Do you want to test also combinations like  084-misc-symbols_pdf4_utf8 ?
> Ouch .. I see you already want it ...
>> Looks like overkill ...

One export route suffices. This is why I set the "default output format"
to pdf2 in the sample files.

Thanks,

Günter 



Re: pasted non-acceptable symbol

2016-11-07 Thread Enrico Forestieri
On Mon, Nov 07, 2016 at 01:31:31PM +, Guenter Milde wrote:
> 
> As the meaning of LINE SEPARATOR and PARAGRAPH SEPARATOR is clear from
> http://unicode.org/versions/Unicode5.2.0/ch05.pdf
> we can transform them to the corresponding LaTeX representation:
> 
> 0x2028 ""  "" "" "" "" # LINE SEPARATOR
> 0x2029 "\\par" "" "" "" "" # PARAGRAPH SEPARATOR

We have insets for both:

\begin_inset Newline newline
\end_inset

\begin_inset Separator latexpar
\end_inset

-- 
Enrico


Re: ctests for input encoding

2016-11-07 Thread Kornel Benko
On Monday, 7 November 2016 at 15:10:09, Kornel Benko wrote:
> On Monday, 7 November 2016 at 13:44:17, Guenter Milde wrote:
> 
> > Dear Kornel,
> > 
> > On 2016-11-07, Kornel Benko wrote:
> > > On Monday, 7 November 2016 at 07:29:39, Guenter Milde wrote:
> > > 
> > 
> > >> the recent discussion about http://www.lyx.org/trac/ticket/10474
> > >> showed, that input encoding support was never fully tested.
> > 
> > >> I suggest to extend the tests in 
> > >> autotests/export/latex/Unicode-characters
> > >> to all encodings defined in lib/encodings.
> > 
> > >> Could you write a rule testing 
> > >> "autotests/export/latex/Unicode-characters/*.lyx" with all encodings
> > 
> > > I don't understand. This directory already exists ...
> > 
> > Yes, but currently, it only tests 2 out of 51 LyX-supported input encodings.
> > And, it does so with 2*9 separate files for 9 Unicode blocks in 2 fixed
> > encodings (ascii and utf8).
> > 
> > Instead of adding 9*49 new files (and changing 51 files with every
> > correction/addition), I'd prefer a ctest rule that allows to test all 51
> > LyX-supported input encodings based on 9 sample files.
> > 
> > The test machinery does a similar action with the test of various export
> > routes for one sample file.
> > 
> > Just like doc/de/Intro.lyx is tested as
> > 
> >   Test #923: export/doc/de/Intro_xhtml
> >   Test #924: export/doc/de/Intro_dvi
> >   Test #925: export/doc/de/Intro_dvi3_texF
> >   Test #926: export/doc/de/Intro_dvi3_systemF
> >   Test #927: export/doc/de/Intro_pdf
> >   Test #928: export/doc/de/Intro_pdf2
> >   Test #929: export/doc/de/Intro_pdf3
> >   Test #930: export/doc/de/Intro_pdf4_texF
> >   Test #931: export/doc/de/Intro_pdf4_systemF
> >   Test #932: export/doc/de/Intro_pdf5_texF
> >   Test #933: export/doc/de/Intro_pdf5_systemF
> 
> Yes, but this is derived from possible lyx export types.
> 
> > I'd like an "expansion" of
> > export/latex/Unicode-characters/084-misc-symbols.lyx into
> > 
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_armscii
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_latin1
> >  ...
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_tis620-0
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8-plain
> >  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_ascii
> >  
> > Would this be feasible?
> 
> Yes, it is. New parameter for useSystemFonts.pl. But I'd prefer to use it
> only for files in a special directory.
> Do you want to test also combinations like  084-misc-symbols_pdf4_utf8 ?

Ouch .. I see you already want it ...

> Looks like overkill ...
> 
> > Thanks,
> > 
> > Günter 
> >  
>   Kornel



Re: ctests for input encoding

2016-11-07 Thread Kornel Benko
On Monday, 7 November 2016 at 13:44:17, Guenter Milde wrote:

> Dear Kornel,
> 
> On 2016-11-07, Kornel Benko wrote:
> > On Monday, 7 November 2016 at 07:29:39, Guenter Milde wrote:
> > 
> 
> >> the recent discussion about http://www.lyx.org/trac/ticket/10474
> >> showed, that input encoding support was never fully tested.
> 
> >> I suggest to extend the tests in autotests/export/latex/Unicode-characters
> >> to all encodings defined in lib/encodings.
> 
> >> Could you write a rule testing 
> >> "autotests/export/latex/Unicode-characters/*.lyx" with all encodings
> 
> > I don't understand. This directory already exists ...
> 
> Yes, but currently, it only tests 2 out of 51 LyX-supported input encodings.
> And, it does so with 2*9 separate files for 9 Unicode blocks in 2 fixed
> encodings (ascii and utf8).
> 
> Instead of adding 9*49 new files (and changing 51 files with every
> correction/addition), I'd prefer a ctest rule that allows to test all 51
> LyX-supported input encodings based on 9 sample files.
> 
> The test machinery does a similar action with the test of various export
> routes for one sample file.
> 
> Just like doc/de/Intro.lyx is tested as
> 
>   Test #923: export/doc/de/Intro_xhtml
>   Test #924: export/doc/de/Intro_dvi
>   Test #925: export/doc/de/Intro_dvi3_texF
>   Test #926: export/doc/de/Intro_dvi3_systemF
>   Test #927: export/doc/de/Intro_pdf
>   Test #928: export/doc/de/Intro_pdf2
>   Test #929: export/doc/de/Intro_pdf3
>   Test #930: export/doc/de/Intro_pdf4_texF
>   Test #931: export/doc/de/Intro_pdf4_systemF
>   Test #932: export/doc/de/Intro_pdf5_texF
>   Test #933: export/doc/de/Intro_pdf5_systemF

Yes, but this is derived from possible lyx export types.

> I'd like an "expansion" of
> export/latex/Unicode-characters/084-misc-symbols.lyx into
> 
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_armscii
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_latin1
>  ...
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_tis620-0
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8-plain
>  export/export/latex/Unicode-characters/084-misc-symbols_pdf2_ascii
>  
> Would this be feasible?

Yes, it is. New parameter for useSystemFonts.pl. But I'd prefer to use it
only for files in a special directory.
Do you want to test also combinations like  084-misc-symbols_pdf4_utf8 ?
Looks like overkill ...

> Thanks,
> 
> Günter 
>  
Kornel



Re: ctests for input encoding

2016-11-07 Thread Guenter Milde
Dear Kornel,

On 2016-11-07, Kornel Benko wrote:
> On Monday, 7 November 2016 at 07:29:39, Guenter Milde wrote:
> 

>> the recent discussion about http://www.lyx.org/trac/ticket/10474
>> showed, that input encoding support was never fully tested.

>> I suggest to extend the tests in autotests/export/latex/Unicode-characters
>> to all encodings defined in lib/encodings.

>> Could you write a rule testing 
>> "autotests/export/latex/Unicode-characters/*.lyx" with all encodings

> I don't understand. This directory already exists ...

Yes, but currently, it only tests 2 out of 51 LyX-supported input encodings.
And, it does so with 2*9 separate files for 9 Unicode blocks in 2 fixed
encodings (ascii and utf8).

Instead of adding 9*49 new files (and changing 51 files with every
correction/addition), I'd prefer a ctest rule that allows to test all 51
LyX-supported input encodings based on 9 sample files.

The test machinery does a similar action with the test of various export
routes for one sample file.

Just like doc/de/Intro.lyx is tested as

  Test #923: export/doc/de/Intro_xhtml
  Test #924: export/doc/de/Intro_dvi
  Test #925: export/doc/de/Intro_dvi3_texF
  Test #926: export/doc/de/Intro_dvi3_systemF
  Test #927: export/doc/de/Intro_pdf
  Test #928: export/doc/de/Intro_pdf2
  Test #929: export/doc/de/Intro_pdf3
  Test #930: export/doc/de/Intro_pdf4_texF
  Test #931: export/doc/de/Intro_pdf4_systemF
  Test #932: export/doc/de/Intro_pdf5_texF
  Test #933: export/doc/de/Intro_pdf5_systemF

I'd like an "expansion" of
export/latex/Unicode-characters/084-misc-symbols.lyx into

 export/export/latex/Unicode-characters/084-misc-symbols_pdf2
 export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8
 export/export/latex/Unicode-characters/084-misc-symbols_pdf2_armscii
 export/export/latex/Unicode-characters/084-misc-symbols_pdf2_latin1
 ...
 export/export/latex/Unicode-characters/084-misc-symbols_pdf2_tis620-0
 export/export/latex/Unicode-characters/084-misc-symbols_pdf2_utf8-plain
 export/export/latex/Unicode-characters/084-misc-symbols_pdf2_ascii
 
Would this be feasible?
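
One way such an expansion might be driven is to rewrite the `\inputencoding`
header line of a sample file once per encoding before exporting it. This is
an assumption for illustration, not something settled in this thread, and the
helper name `with_input_encoding` is hypothetical. A minimal Python sketch:

```python
import re


def with_input_encoding(lyx_source: str, encoding: str) -> str:
    """Return lyx_source with its \\inputencoding header line set to encoding."""
    new_text, count = re.subn(
        r"^\\inputencoding .*$",
        # A callable replacement avoids backslash-escape processing.
        lambda match: f"\\inputencoding {encoding}",
        lyx_source,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no \\inputencoding line found")
    return new_text


# Header fragment in the style of a LyX 2.2 file.
header = "\\language english\n\\inputencoding auto\n\\fontencoding global\n"
variant = with_input_encoding(header, "latin1")
print(variant)
```

The 9 sample files would then stay untouched in the repository, and each
per-encoding variant would exist only for the duration of its test.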

Thanks,

Günter 
 



Re: pasted non-acceptable symbol

2016-11-07 Thread Guenter Milde
On 2016-11-07, Jean-Marc Lasgouttes wrote:
> On 06/11/2016 at 14:30, Jean-Marc Lasgouttes wrote:
>> This is a more radical approach than what I have in mind, and I do not
>> know whether it is safe. My idea was to modify the Row building code and
>> replace the character with some visual cue (in addition to the row
>> breaking), because I am not confident in sending this character to Qt
>> string rendering functions.

>> I'll propose something shortly.

> Finally, I convinced myself that your approach is correct if we want to 
> keep the breaks. In the following patch I add some on-screen hints of 
> what is going on. I could use a color of the characters, but I am not 
> sure what to do, these are actual characters, not insets. A solution 
> could be to add a frame around the characters.

I'd rather convert them to the "usual LyX representations" 

  \begin_inset Newline newline
  \end_inset

and

  \begin_layout 
  
whenever possible. In my understanding,
http://unicode.org/versions/Unicode5.2.0/ch05.pdf recommends just this:
interpret these characters as unambiguous representations of a line break
and paragraph break.
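
As a minimal sketch of that conversion (in Python for brevity, not the C++ of
insertStringAsXXX or tex2lyx): split pasted text on U+2029 into layouts and
replace U+2028 with the Newline inset. The function name `to_lyx_paragraphs`
and the default "Standard" layout are assumptions for illustration.

```python
LINE_SEP = "\u2028"  # U+2028 LINE SEPARATOR
PARA_SEP = "\u2029"  # U+2029 PARAGRAPH SEPARATOR

NEWLINE_INSET = "\\begin_inset Newline newline\n\\end_inset\n"


def to_lyx_paragraphs(text: str, layout: str = "Standard") -> str:
    """Turn LS/PS characters into LyX newline insets and paragraph layouts."""
    paragraphs = []
    for para in text.split(PARA_SEP):
        # Each LINE SEPARATOR becomes a Newline inset inside the paragraph.
        body = ("\n" + NEWLINE_INSET).join(para.split(LINE_SEP))
        paragraphs.append(f"\\begin_layout {layout}\n{body}\n\\end_layout\n")
    return "\n".join(paragraphs)


converted = to_lyx_paragraphs("first line\u2028second line\u2029next paragraph")
print(converted)
```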

> The next problem is running LaTeX. By default, these characters are not 
> accepted. Could our local latex+unicode experts tell us whether it makes 
> any sense to handle these characters in LaTeX or whether nobody cares 
> and they should be ignored on output?

> I suspect that adding them to lib/unicodesymbols would do more harm than 
> good.

We could handle them similar to other characters that have a corresponding
LyX inset, e.g. spaces:

\begin_inset space ~
\end_inset

corresponds to

0x00a0 "~" "" "notermination=both" "~" "" # NO-BREAK SPACE

or to other special characters like
\SpecialChar nobreakdash or \SpecialChar softhyphen

0x2011 "\\nobreakdash-"   "amsmath" "notermination=text" "" "" # NON-BREAKING HYPHEN

0x00ad "\\-"  "" "notermination=text" "" "" # SOFT HYPHEN

In both cases, lib/unicodesymbols has "fallback definitions" in case the
literal Unicode character is still in the document.

As the meaning of LINE SEPARATOR and PARAGRAPH SEPARATOR is clear from
http://unicode.org/versions/Unicode5.2.0/ch05.pdf
we can transform them to the corresponding LaTeX representation:

0x2028 ""  "" "" "" "" # LINE SEPARATOR
0x2029 "\\par" "" "" "" "" # PARAGRAPH SEPARATOR
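
For readers unfamiliar with the file, a small Python sketch of how such an
entry splits into fields. The field names (code point, text command, text
preamble, flags, math command, math preamble) are my reading of the entries
quoted above and should be treated as an assumption; `parse_entry` is a
hypothetical helper, not part of LyX.

```python
import shlex
from typing import NamedTuple


class SymbolEntry(NamedTuple):
    codepoint: int
    text_command: str
    text_preamble: str
    flags: str
    math_command: str
    math_preamble: str


def parse_entry(line: str) -> SymbolEntry:
    """Split one unicodesymbols-style line into its six fields."""
    # shlex handles the double quotes, the "\\" escapes and the "#" comment.
    fields = shlex.split(line, comments=True)
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return SymbolEntry(int(fields[0], 16), *fields[1:])


entry = parse_entry(r'0x2029 "\\par" "" "" "" "" # PARAGRAPH SEPARATOR')
print(hex(entry.codepoint), entry.text_command)
```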


Günter



Re: ctests for input encoding

2016-11-07 Thread Kornel Benko
On Monday, 7 November 2016 at 07:29:39, Guenter Milde wrote:

> Dear Kornel,
> 
> the recent discussion about http://www.lyx.org/trac/ticket/10474
> showed, that input encoding support was never fully tested.
> 
> I suggest to extend the tests in autotests/export/latex/Unicode-characters
> to all encodings defined in lib/encodings.
> 
> Could you write a rule testing 
> "autotests/export/latex/Unicode-characters/*.lyx" with all encodings

I don't understand. This directory already exists ...

> (this would also allow to remove the ascii vs. utf8 doubling) or
> should I place individual files into
> autotests/export/latex/Unicode-characters/ ?

If you want them in an extra dir, simply create one and move the related
files there.
(e.g. #mkdir autotests/export/latex/Unicode-encodings)
The test machinery is IMHO already prepared for such a case.

> Günter

Kornel



Re: pasted non-acceptable symbol

2016-11-07 Thread Jean-Marc Lasgouttes

On 06/11/2016 at 14:30, Jean-Marc Lasgouttes wrote:

This is a more radical approach than what I have in mind, and I do not
know whether it is safe. My idea was to modify the Row building code and
replace the character with some visual cue (in addition to the row
breaking), because I am not confident in sending this character to Qt
string rendering functions.

I'll propose something shortly.


Finally, I convinced myself that your approach is correct if we want to 
keep the breaks. In the following patch I add some on-screen hints of 
what is going on. I could use a color of the characters, but I am not 
sure what to do, these are actual characters, not insets. A solution 
could be to add a frame around the characters.


The next problem is running LaTeX. By default, these characters are not 
accepted. Could our local latex+unicode experts tell us whether it makes 
any sense to handle these characters in LaTeX or whether nobody cares 
and they should be ignored on output?


I suspect that adding them to lib/unicodesymbols would do more harm than 
good.


I am not sure that the approach of removing them when converting from 
plain text (paste or insert) is worth it, since we have to handle the 
characters anyway. But again, at some moments it seems right to me to 
handle them there.


For example, this hints that we should handle them like (CR)LF:
http://stackoverflow.com/questions/3072152/what-is-unicode-character-2028-ls-line-separator-used-for

JMarc

From 1d5ae75919e70c2b93a471bde9024c3738a9b13f Mon Sep 17 00:00:00 2001
From: Jean-Marc Lasgouttes 
Date: Mon, 7 Nov 2016 10:14:39 +0100
Subject: [PATCH] Handle properly unicode paragraph/line break

They are shown on screen by arrow or pilcrow symbol and cause a line break.

They are still not handled in LaTeX output, though.
---
 src/Paragraph.cpp   |5 +
 src/TextMetrics.cpp |   19 ++-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/Paragraph.cpp b/src/Paragraph.cpp
index 8afa475..05c10b5 100644
--- a/src/Paragraph.cpp
+++ b/src/Paragraph.cpp
@@ -3147,6 +3147,11 @@ bool Paragraph::isHfill(pos_type pos) const
 
 bool Paragraph::isNewline(pos_type pos) const
 {
+	// U+2028 LINE SEPARATOR
+	// U+2029 PARAGRAPH SEPARATOR
+	char_type const c = d->text_[pos];
+	if (c == 0x2028 || c == 0x2029)
+		return true;
 	Inset const * inset = getInset(pos);
 	return inset && inset->lyxCode() == NEWLINE_CODE;
 }
diff --git a/src/TextMetrics.cpp b/src/TextMetrics.cpp
index 8f7ac82..17ee2e4 100644
--- a/src/TextMetrics.cpp
+++ b/src/TextMetrics.cpp
@@ -864,7 +864,23 @@ bool TextMetrics::breakRow(Row & row, int const right_margin) const
 		} else if (c == '\t')
 			row.addSpace(i, theFontMetrics(*fi).width(from_ascii("")),
  *fi, par.lookupChange(i));
-		else {
+		else if (c == 0x2028 || c == 0x2029) {
+			/**
+			 * U+2028 LINE SEPARATOR
+			 * U+2029 PARAGRAPH SEPARATOR
+
+			 * These are special unicode characters that break
+			 * lines/pragraphs. Not handling them lead to trouble wrt
+			 * Qt QTextLayout formatting. We add a visible character
+			 * on screen so that the user can see that something is
+			 * happening.
+			*/
+			row.finalizeLast();
+			// ⤶ U+2936 ARROW POINTING DOWNWARDS THEN CURVING LEFTWARDS
+			// ¶ U+00B6 PILCROW SIGN
+			char_type const screen_char = (c == 0x2028) ? 0x2936 : 0x00B6;
+			row.add(i, screen_char, *fi, par.lookupChange(i));
+		} else {
 			// FIXME: please someone fix the Hebrew/Arabic parenthesis mess!
 			// see also Paragraph::getUChar.
 			if (fi->language()->lang() == "hebrew") {
@@ -925,6 +941,7 @@ bool TextMetrics::breakRow(Row & row, int const right_margin) const
 		BufferParams const & bparams
 			= text_->inset().buffer().params();
 		f.setLanguage(par.getParLanguage(bparams));
+		// ¶ U+00B6 PILCROW SIGN
 		row.addVirtual(end, docstring(1, char_type(0x00B6)), f, Change());
 	}
 
-- 
1.7.9.5
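
The on-screen behaviour the patch aims for can be modelled outside LyX. The
following Python sketch is only an illustration of the idea, not the C++ in
TextMetrics::breakRow: each U+2028 or U+2029 ends the current row and is
shown as the marker character named in the patch. The function name
`break_rows` is hypothetical.

```python
# Marker characters taken from the patch comments:
# U+2028 -> U+2936 (arrow pointing downwards then curving leftwards)
# U+2029 -> U+00B6 (pilcrow sign)
SCREEN_MARKERS = {"\u2028": "\u2936", "\u2029": "\u00b6"}


def break_rows(text: str) -> list[str]:
    """Split text into display rows, ending a row at each LS/PS character."""
    rows = []
    current = []
    for ch in text:
        if ch in SCREEN_MARKERS:
            # Show a visible marker, then start a new row.
            current.append(SCREEN_MARKERS[ch])
            rows.append("".join(current))
            current = []
        else:
            current.append(ch)
    rows.append("".join(current))
    return rows


rows = break_rows("ab\u2028cd\u2029e")
for row in rows:
    print(row)
```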