Re: Finding "Bad" characters

2008-08-18 Thread G. Milde
On 30.07.08, Andre Poenitz wrote:
> On Fri, Jul 25, 2008 at 12:45:38PM +0200, G. Milde wrote:
> > On 25.07.08, Sebastian Rohrer wrote:
> > 
> > > I have a relatively large lyx document that does not compile. The error  
> > > message is:
> > 
> > > "Some characters of your document are probably not representable in the  
> > > chosen encoding. Changing the document encoding to utf8 could help."
> > 
> > Actually, changing the encoding to utf8 will most probably *not* help 
> > (as utf8 is very limited in the range of supported characters).

> Hm?

The encoding setting "utf8" (i.e. utf8.def of the "inputenc" package)
supports only a small subset of Unicode characters (mainly Latin).
It works well with, e.g., German umlauts and other Latin extensions.

However, these are also well supported by LyX internal conversion rules
as defined in the "unicodesymbols" definition file.

"utf8x" (the interface to the "ucs" package) might be a solution for some
characters not "known" by LyX (e.g. for some accented Greek characters used
in polytonic Greek).

Günter


Re: Finding "Bad" characters

2008-07-30 Thread Andre Poenitz
On Fri, Jul 25, 2008 at 12:45:38PM +0200, G. Milde wrote:
> On 25.07.08, Sebastian Rohrer wrote:
> > Hi,
> 
> > I have a relatively large lyx document that does not compile. The error  
> > message is:
> 
> > "Some characters of your document are probably not representable in the  
> > chosen encoding. Changing the document encoding to utf8 could help."
> 
> Actually, changing the encoding to utf8 will most probably *not* help 
> (as utf8 is very limited in the range of supported characters).

Hm?

Andre'


Re: Finding "Bad" characters

2008-07-28 Thread G. Milde
On 25.07.08, Steve Litt wrote:
...

> I know nothing about unicode, so I'd need to be brought up to speed on that 
> before writing the program. 

References:

* The Absolute Minimum Every Software Developer Absolutely,
  Positively Must Know About Unicode and Character Sets (No Excuses!)
  http://www.joelonsoftware.com/articles/Unicode.html

* On the Goodness of Unicode
  http://www.tbray.org/ongoing/When/200x/2003/04/06/Unicode

* Wikipedia article on Unicode
  http://en.wikipedia.org/wiki/Unicode

> I assume unicode is a 16 bit representation of characters.

Actually, Unicode is a character <--> number mapping without an upper
limit to the numbers.

Once upon a time, 16 bits were enough to represent all defined Unicode
characters, but even then several different encodings into a
computer-readable format existed. Programs that relied on Unicode == 16 bit
(including LaTeX) have problems now with higher Unicode numbers.
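
To make the difference between a code point and its encodings concrete, here is a small Python sketch. The sample characters and the helper name `describe` are chosen only for illustration; they are not taken from LyX or LaTeX.

```python
# Unicode assigns each character a number (its code point); an encoding
# turns that number into bytes. Code points above U+FFFF need more than 16 bits.
samples = ["A", "\u00fc", "\u1f04", "\U0001D504"]  # A, u-umlaut, polytonic Greek alpha, Fraktur A

def describe(ch):
    """Return the code point and the encoded byte lengths of one character."""
    return {
        "char": ch,
        "codepoint": f"U+{ord(ch):04X}",
        "fits_16_bit": ord(ch) <= 0xFFFF,
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_bytes": len(ch.encode("utf-16-le")),
        "utf32_bytes": len(ch.encode("utf-32-le")),
    }

for ch in samples:
    print(describe(ch))
```

The last sample lies outside the 16-bit range, so UTF-16 has to spend two 16-bit units on it, while UTF-32 always uses four bytes per character.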

Günter


Re: Finding "Bad" characters

2008-07-28 Thread Abdelrazak Younes

G. Milde wrote:

> On 25.07.08, Steve Litt wrote:
>
> > I assume unicode is a 16 bit representation of characters.
>
> Actually, Unicode is a character <--> number mapping without an upper
> limit to the numbers.
>
> Once upon a time, 16 bits were enough to represent all defined Unicode
> characters, but even then several different encodings into a
> computer-readable format existed. Programs that relied on Unicode == 16 bit
> (including LaTeX) have problems now with higher Unicode numbers.


FYI, LyX uses UTF-32 internally, that is the 32-bit encoding, and UTF-8 as
its file format.


Abdel.



Re: Finding "Bad" characters

2008-07-28 Thread Christian Ridderström

On Fri, 25 Jul 2008, Sebastian Rohrer wrote:

> after pasting from OOwriter. Is there any way to find those characters?
> It is very tedious by eye...


You can try a divide-and-conquer approach. Erase the last half and see if
the problem persists. If it does, erase the last half of what remains, and so
on; if it does not, the problem is in the half you just erased, so continue
there instead. Pretty soon you should have only a small portion left that you
can inspect manually. Five halvings would reduce 32 pages to just one page.
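
The halving procedure can be sketched in Python as follows. This is only an illustration: `fails` is a hypothetical callback that stands in for "export this partial document and see whether the error still appears", and the sketch assumes that the full document fails and an empty one does not.

```python
def first_failing_chunk(chunks, fails):
    """Return the index of the chunk whose inclusion first triggers the error.

    chunks: list of document pieces (e.g. pages or paragraphs).
    fails:  hypothetical callback; takes a list of chunks and returns True
            if that partial document still shows the error.
    """
    lo, hi = 0, len(chunks)      # invariant: chunks[:lo] is fine, chunks[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(chunks[:mid]):  # error persists with the last half erased
            hi = mid
        else:                    # error vanished, so it sits in the erased part
            lo = mid
    return lo

# Toy example: 32 "pages", one of which contains an fi ligature (U+FB01).
pages = ["plain text"] * 32
pages[19] = "of\ufb01ce"
print(first_failing_chunk(pages, lambda part: any("\ufb01" in p for p in part)))
```

This finds only the first offending chunk; if several places are affected, fix that one and run the search again.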


regards,
Christian

--
Christian Ridderström, +46-8-768 39 44    http://www.md.kth.se/~chr

Re: Finding "Bad" characters

2008-07-27 Thread Jürgen Spitzmüller
Sebastian Rohrer wrote:
> The problem is most likely caused by some characters that I forgot to
> change after pasting from OOwriter. Is there any way to find those
> characters? It is very tedious by eye...

In recent versions (1.5.5, if not earlier), you can find them by
opening "View->Source Code". The problematic characters are highlighted in
red.

Jürgen


Re: Finding "Bad" characters

2008-07-27 Thread G. Milde
On 25.07.08, Sebastian Rohrer wrote:
> Hi,

> I have a relatively large lyx document that does not compile. The error  
> message is:

> "Some characters of your document are probably not representable in the  
> chosen encoding. Changing the document encoding to utf8 could help."

Actually, changing the encoding to utf8 will most probably *not* help 
(as utf8 is very limited in the range of supported characters).

You could try changing to utf8x. This might help - maybe just by a more
specific error message.

Günter


Re: Finding "Bad" characters

2008-07-27 Thread Steve Litt
On Friday 25 July 2008 06:28, Sebastian Rohrer wrote:
> Hi,
>
> I have a relatively large lyx document that does not compile. The error
> message is:
>
> "Some characters of your document are probably not representable in the
> chosen encoding. Changing the document encoding to utf8 could help."
>
> The problem is most likely caused by some characters that I forgot to
> change after pasting from OOwriter. Is there any way to find those
> characters? It is very tedious by eye...
>
> Thanks,
>
> Sebastian

If you can list all the allowable characters including carriage return and/or 
linefeed, then you (or a friend like me) could write a C program to read each 
character, test it for inclusion in that set, and report on any characters 
(character ascii/unicode value and position in the file) that aren't in the 
set.
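
A rough sketch of such a checker, in Python rather than C for brevity. It makes one simplifying assumption: the "allowed set" is taken to be whatever a Python codec (here `latin-1`) can encode, which is not necessarily the same set that LaTeX's inputenc accepts. The function name `find_unencodable` is made up for this example.

```python
import unicodedata

def find_unencodable(text, encoding="latin-1"):
    """Yield (line, column, code point, name) for characters the encoding cannot represent."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            try:
                ch.encode(encoding)
            except UnicodeEncodeError:
                name = unicodedata.name(ch, "<unnamed>")
                yield line_no, col, f"U+{ord(ch):04X}", name

# Typographic quotes and an fi ligature, as a word processor might paste them.
sample = "plain text\nna\u00efve caf\u00e9 \u201cquoted\u201d\nof\ufb01ce"
for hit in find_unencodable(sample):
    print(hit)
```

For a real document one would read the file first, e.g. `open(path, encoding="utf-8").read()`, and pass the result in; the reported positions then refer to the raw file, markup included.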

I know nothing about unicode, so I'd need to be brought up to speed on that 
before writing the program. I assume unicode is a 16 bit representation of 
characters.

SteveT

Steve Litt
Recession Relief Package
http://www.recession-relief.US



Finding "Bad" characters

2008-07-25 Thread Sebastian Rohrer

Hi,

I have a relatively large lyx document that does not compile. The error 
message is:


"Some characters of your document are probably not representable in the 
chosen encoding. Changing the document encoding to utf8 could help."


The problem is most likely caused by some characters that I forgot to 
change after pasting from OOwriter. Is there any way to find those 
characters? It is very tedious by eye...


Thanks,

Sebastian

--
Sebastian Rohrer
AK Baumann - Molecular Modelling Group
Institute of Pharmaceutical Chemistry
Braunschweig University of Technology
Beethovenstr. 55

38106 Braunschweig
Germany

Phone: +49-531-3912797



Finding "Bad" characters

2008-07-25 Thread Anders Ekberg
If all goes well the cursor should be at the bad characters when the  
error pops up. Do you have the latest 1.5-version? Ligatures (which  
are very hard to see) are fixed there IIRC. It is also a good idea to
check your formulas.


Anders