Re: [NTG-context] decomposed u umlaut

2018-03-25 Thread Henning Hraban Ramm
On 2018-03-25 at 22:36, Arthur Reutenauer wrote:

> On Thu, Mar 22, 2018 at 10:08:44AM +0100, Mojca Miklavec wrote:
>> On 20 March 2018 at 08:42, Henning Hraban Ramm wrote:
>>> I’ve one annoying problem with ConTeXt: all üs (small u umlauts) seem to be 
>>> encoded as decomposed unicode or something like that, at least every ü 
>>> breaks into u + garbage if I copy some text from a ConTeXt PDF to an app 
>>> that doesn’t really support Unicode.
>> 
>> You are on macOS, right?
>> In my experience it was usually Apple's technology to blame.
> 
>  I agree with you that Apple’s software has a tendency to decompose
> characters, but I wouldn’t blame them for that: it’s perfectly
> Unicode-compliant to do so, and by now software should support
> combining characters in at least a basic way.  It’s a real problem that
> the software from the Deutsche Post isn’t able to handle them correctly.

While the DP shop should be able to handle more than Latin-1, the problem seems 
to be in the viewer or in a combination of viewer and OS:
- It doesn’t depend on the font: I tried Computer Modern and Alegreya (which is 
known to have some OpenType ligature issues).
- I checked with several viewers: the Adobe apps (Acrobat Pro 9 and Reader 
DC) decompose just the ü, while my other viewers, including Apple’s Preview, 
decompose all the umlauts. (I just copied and pasted into a hex editor.)
- It also happens with PDFs from other sources.
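
For readers who want to repeat the hex-editor check without a hex editor, here is a minimal Python sketch. The helper names `inspect_text` and `changes_under_nfc` are made up for this illustration; only the standard `unicodedata` module is used. It lists the UTF-8 bytes and Unicode name of every code point in a pasted string, which makes a decomposed ü visible as `u` followed by a combining diaeresis.

```python
import unicodedata

def inspect_text(text):
    """Return (UTF-8 bytes as hex, Unicode name) for each code point."""
    return [
        (ch.encode("utf-8").hex(), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

def changes_under_nfc(text):
    """True if the text is not already in composed (NFC) form."""
    return unicodedata.normalize("NFC", text) != text

composed = "\u00fc"      # ü as a single code point
decomposed = "u\u0308"   # u followed by COMBINING DIAERESIS

print(inspect_text(composed))
# [('c3bc', 'LATIN SMALL LETTER U WITH DIAERESIS')]
print(inspect_text(decomposed))
# [('75', 'LATIN SMALL LETTER U'), ('cc88', 'COMBINING DIAERESIS')]
print(changes_under_nfc(decomposed))  # True
```

An application that reads the bytes `75 cc 88` as Latin-1 or a similar 8-bit encoding would show a plain `u` followed by two stray characters, which matches the "u + garbage" symptom.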

So it’s not a ConTeXt bug. Sorry for the noise.

Greetlings, Hraban
---
http://www.fiee.net
http://wiki.contextgarden.net
GPG Key ID 1C9B22FD

___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki : http://contextgarden.net
___

Re: [NTG-context] decomposed u umlaut

2018-03-25 Thread Arthur Reutenauer
On Thu, Mar 22, 2018 at 10:08:44AM +0100, Mojca Miklavec wrote:
> On 20 March 2018 at 08:42, Henning Hraban Ramm wrote:
>> I’ve one annoying problem with ConTeXt: all üs (small u umlauts) seem to be 
>> encoded as decomposed unicode or something like that, at least every ü 
>> breaks into u + garbage if I copy some text from a ConTeXt PDF to an app 
>> that doesn’t really support Unicode.
> 
> You are on macOS, right?
> 
> In my experience it was usually Apple's technology to blame.

  I agree with you that Apple’s software has a tendency to decompose
characters, but I wouldn’t blame them for that: it’s perfectly
Unicode-compliant to do so, and by now software should support
combining characters in at least a basic way.  It’s a real problem that
the software from the Deutsche Post isn’t able to handle them correctly.
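
To illustrate why decomposition is Unicode-compliant, the following Python sketch (standard library only; the helper `codepoints` is a name chosen for this example) shows that the composed and decomposed forms of ü are canonically equivalent and convert into each other through normalization.

```python
import unicodedata

def codepoints(text):
    """Format each code point of the text as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text]

composed = "\u00fc"                                  # ü, NFC form
decomposed = unicodedata.normalize("NFD", composed)  # u + combining diaeresis

print(codepoints(composed))    # ['U+00FC']
print(codepoints(decomposed))  # ['U+0075', 'U+0308']

# Normalizing back to NFC yields the single precomposed code point again.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == composed)  # True
```

Software that compares or stores text without normalizing first treats the two forms as different strings, which is where receiving applications tend to go wrong.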

Best,

Arthur

Re: [NTG-context] decomposed u umlaut

2018-03-22 Thread Hans Hagen

On 3/22/2018 10:34 AM, Ulrike Fischer wrote:
> On Tue, 20 Mar 2018 08:42:08 +0100, Henning Hraban Ramm wrote:
>
>> I’ve one annoying problem with ConTeXt: all üs (small u umlauts)
>> seem to be encoded as decomposed unicode or something like that,
>> at least every ü breaks into u + garbage if I copy some text from
>> a ConTeXt PDF to an app that doesn’t really support Unicode.
>
> This can depend on the font. I just looked at Cambria for another
> question, and it uses e.g. char + accent for some of the umlauts. So
> concrete code is needed to test this.

btw, the same is true for ligature building (but i already explained 
that many times so i won't repeat myself)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-

Re: [NTG-context] decomposed u umlaut

2018-03-22 Thread Ulrike Fischer
On Tue, 20 Mar 2018 08:42:08 +0100, Henning Hraban Ramm wrote:

> I’ve one annoying problem with ConTeXt: all üs (small u umlauts)
> seem to be encoded as decomposed unicode or something like that,
> at least every ü breaks into u + garbage if I copy some text from
> a ConTeXt PDF to an app that doesn’t really support Unicode.

This can depend on the font. I just looked at Cambria for another
question, and it uses e.g. char + accent for some of the umlauts. So
concrete code is needed to test this.


-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/


Re: [NTG-context] decomposed u umlaut

2018-03-22 Thread Mojca Miklavec
On 20 March 2018 at 08:42, Henning Hraban Ramm wrote:
> Ahoi,
>
> I’ve one annoying problem with ConTeXt: all üs (small u umlauts) seem to be 
> encoded as decomposed unicode or something like that, at least every ü breaks 
> into u + garbage if I copy some text from a ConTeXt PDF to an app that 
> doesn’t really support Unicode.

You are on macOS, right?

In my experience it was usually Apple's technology to blame. Perfectly
valid PDFs with proper accented characters would always end up with
decomposed characters when copy-pasting. Even pdftotext did a better
job.
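
As a possible workaround on the receiving side, copied text can be recomposed before it is pasted into an application that only handles precomposed characters. This is a minimal sketch using Python's standard `unicodedata` module; the function names and the sample string are hypothetical and not taken from this thread.

```python
import unicodedata

def recompose(text):
    """Normalize to NFC so base letter + combining mark become one code point."""
    return unicodedata.normalize("NFC", text)

def fits_latin1(text):
    """True if every character can be encoded in Latin-1 (ISO 8859-1)."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True

pasted = "Mu\u0308ller"   # hypothetical sample: "Müller" with a decomposed ü
fixed = recompose(pasted)

print(len(pasted), fits_latin1(pasted))  # 7 False
print(len(fixed), fits_latin1(fixed))    # 6 True
```

The Latin-1 check is only a rough stand-in for "an app that doesn't really support Unicode"; which characters a particular application accepts is an assumption that has to be verified per case.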

Mojca

[NTG-context] decomposed u umlaut

2018-03-20 Thread Henning Hraban Ramm
Ahoi,

I’ve one annoying problem with ConTeXt: all üs (small u umlauts) seem to be 
encoded as decomposed unicode or something like that, at least every ü breaks 
into u + garbage if I copy some text from a ConTeXt PDF to an app that doesn’t 
really support Unicode.
All other characters within Latin-1, including the other umlauts, are no 
problem; that’s why I think the problem might be in ConTeXt’s font handling.

(Actually, this is about my invoice addresses that I copy from PDF to the 
German Post postage webshop. The site is quite new and I can’t understand how a 
big company can buy such crappy software. I already complained, there are more 
problems, but of course got only a template answer.)

Greetlings, Hraban
---
http://www.fiee.net
http://wiki.contextgarden.net
GPG Key ID 1C9B22FD
