Capitalization in German (was: s-j combination in Unicode?)

2013-02-19 Thread Otto Stolz
Hello, on 17.02.2013 at 06:55, Stephan Stiller wrote: As far as real ambiguities are introduced, the loss of capitalization on the first letter introduces far more, impressionistically speaking, and they might be legally subtle; Here is a minimal pair to illustrate that point: Er hat in

Re: s-j combination in Unicode?

2013-02-17 Thread Stephan Stiller
As far as real ambiguities are introduced, the loss of capitalization on the first letter introduces far more, impressionistically speaking, and they might be legally subtle Though, to partially correct myself, /this/ is an issue for English, but not really for German. But I have to ask one

Re: s-j combination in Unicode?

2013-02-17 Thread Asmus Freytag
On 2/17/2013 12:30 AM, Stephan Stiller wrote: But I have to ask one more thing: Since the latter is expected to be rare, I personally would be comfortable with making a code point for it, so that fonts like this, which are actually used, can be mapped to Unicode w/o forcing people into

Re: s-j combination in Unicode?

2013-02-17 Thread Stephan Stiller
I think it's a waste of everybody's time to even contemplate forcing fallback transformations (which are a pain to program) when perfectly straightforward capital form can be deduced, and has been deduced (at least by font creators - we don't know what user requests they based their work

Re: s-j combination in Unicode?

2013-02-17 Thread Richard Wordingham
On Sat, 16 Feb 2013 10:08:24 -0800 Asmus Freytag asm...@ix.netcom.com wrote: On 2/16/2013 7:04 AM, Andries Brouwer wrote: I found Diauni.ttf at http://www.thesauruslex.com/typo/dialekt.htm (Swedish) http://www.thesauruslex.com/typo/engdial.htm (English) It has landmålsalfabetet at

Re: s-j combination in Unicode?

2013-02-16 Thread Andries Brouwer
On Fri, Feb 15, 2013 at 10:56:17PM -0600, Ben Scarborough wrote: On Feb 16, 2013 02:13, Andries Brouwer wrote: The fragment of text I showed was not from dialectology, but just from a novel written in Elfdalian. The symbols are meant to be those of ordinary orthography. Does that mean

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/15/2013 11:59 PM, Andries Brouwer wrote: On Fri, Feb 15, 2013 at 10:56:17PM -0600, Ben Scarborough wrote: On Feb 16, 2013 02:13, Andries Brouwer wrote: The fragment of text I showed was not from dialectology, but just from a novel written in Elfdalian. The symbols are meant to be those of

Re: s-j combination in Unicode?

2013-02-16 Thread Stephan Stiller
That would make it analogous in a way to German ß. The minute things show up in real orthographies the pressure to handle ALL CAPS exists. The question then is whether you'll find SJ or overlaid S/J. Or how a Swede would instinctively handle this, in the absence of an example of a

Re: s-j combination in Unicode?

2013-02-16 Thread Jukka K. Korpela
2013-02-16 11:38, Stephan Stiller wrote: (By the way, for those finding the German rule to write SS unsatisfactory: It's hard to come by an actual minimal pair. Example: Strauss vs. Strauß. Originally the same name, but two spellings make them two names that may need to be distinguished from

Re: s-j combination in Unicode?

2013-02-16 Thread Stephan Stiller
[...] an actual minimal pair. Example: Strauss vs. Strauß. Originally the same name, but two spellings make them two names that may need to be distinguished from each other. True for Wei{ß/ss} as well. Or a non-name example: Buße (repentance) vs Busse (buses). But then, non-name examples

German »ß« (was: s-j combination in Unicode?)

2013-02-16 Thread Otto Stolz
Hello, on 16.02.2013 at 11:48, Stephan Stiller wrote: Or a non-name example: Buße (repentance) vs Busse (buses). But then, non-name examples are far less likely to remain ambiguous in context. Years ago, I saw with my own eyes, in a Swiss magazine (where they consistently replace “ß”

Re: s-j combination in Unicode?

2013-02-16 Thread Andries Brouwer
On Sat, Feb 16, 2013 at 12:22:08AM -0800, Asmus Freytag wrote: On 2/15/2013 11:59 PM, Andries Brouwer wrote: On Fri, Feb 15, 2013 at 10:56:17PM -0600, Ben Scarborough wrote: Does that mean there's also a capital S-J? Probably, in entirely capitalized text. At sentence start I see

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/16/2013 1:38 AM, Stephan Stiller wrote: That would make it analogous in a way to German ß. The minute things show up in real orthographies the pressure to handle ALL CAPS exists. The question then is whether you'll find SJ or overlaid S/J. Or how a Swede would instinctively handle

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/16/2013 7:04 AM, Andries Brouwer wrote: [BTW Is the fact that o-slash is not decomposed not entirely analogous to the fact that i is not decomposed? I would say that neither gives an indication of how symbols involving a combining dot or combining slash are handled in general.] Why don't

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/16/2013 7:04 AM, Andries Brouwer wrote: I found Diauni.ttf at http://www.thesauruslex.com/typo/dialekt.htm (Swedish) http://www.thesauruslex.com/typo/engdial.htm (English) It has landmålsalfabetet at E100-E197 (lower case only) and s-j at E19F, S-J at E1A5, with Y-ogonek, Å-ogonek,
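The code points quoted above for Diauni.ttf fall in the Private Use Area, which is why text using that font is not interchangeable without it. A minimal sketch of how one might check this with Python's standard `unicodedata` module follows; the variable names are illustrative only, and the two code points are simply the values reported in the message, not verified against the font.

```python
import unicodedata

# Code points as reported in the message for Diauni.ttf (not verified here).
S_J_SMALL = chr(0xE19F)
S_J_CAPITAL = chr(0xE1A5)

# General category "Co" means "private use": Unicode assigns no meaning.
categories = [unicodedata.category(c) for c in (S_J_SMALL, S_J_CAPITAL)]

# Private-use characters have no standard name and no case mappings,
# so generic upper/lower operations cannot relate the two glyphs.
has_name = unicodedata.name(S_J_SMALL, None) is not None
upper_is_noop = S_J_SMALL.upper() == S_J_SMALL

print(categories, has_name, upper_is_noop)
```

Because no case pairing is defined for private-use code points, any link between the small and capital forms would have to live in application-specific tables.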

Re: s-j combination in Unicode?

2013-02-16 Thread Stephan Stiller
It's hard to come by an actual minimal pair. MASSE - mass or measurements? See, not hard at all. [and] With the new orthography, ss vs. ß affects the pronunciation of the preceding vowel. It's irritating to see SS because you have to override that rule when you know that the word in
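The MASSE collision can be reproduced with the default case mappings. The sketch below uses Python's built-in `str.upper()`, which applies the full Unicode mapping of ß to SS; the variable names are made up for illustration, and the naive "lowercase all but the first letter" step is an assumption about how an all-caps reversal might be attempted.

```python
mass = "Masse"          # "mass"
measurements = "Maße"   # "measurements"

# Both words become the same all-caps string, because ß uppercases to SS.
upper_mass = mass.upper()
upper_measurements = measurements.upper()
collide = upper_mass == upper_measurements

# Naively undoing all-caps (keep the first letter, lowercase the rest)
# cannot tell which word was meant: it always yields "Masse".
undone = upper_measurements.capitalize()

# The capital sharp s, U+1E9E, lowercases to ß, so text written with it
# survives the round trip; the default uppercase of ß is still "SS".
capital_sharp_s = "\u1e9e"
lossless = ("MA" + capital_sharp_s + "E").capitalize()

print(upper_mass, upper_measurements, collide, undone, lossless)
```

The last step shows why a distinct capital form matters for round-tripping: the information is kept only if the writer used it in the first place.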

Re: s-j combination in Unicode?

2013-02-16 Thread Stephan Stiller
the issue is a bit different, as not focused on one letter While we're splitting hairs: Word- or larger-level all-caps /does/ normally make a one-letter difference. When we undo all-caps, one can /normally/ lowercase everything of the word except the first letter. The capitalization bit of

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/16/2013 10:48 AM, Stephan Stiller wrote: the issue is a bit different, as not focused on one letter While we're splitting hairs: Word- or larger-level all-caps /does/ normally make a one-letter difference. When we undo all-caps, one can /normally/ lowercase everything of the word except

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/16/2013 10:48 AM, Stephan Stiller wrote: the issue is a bit different, as not focused on one letter While we're splitting hairs: Word- or larger-level all-caps /does/ normally make a one-letter difference. When we undo all-caps, one can /normally/ lowercase everything of the word except

Re: s-j combination in Unicode?

2013-02-16 Thread Stephan Stiller
from earlier: Otto Scholz Oops, sorry. Otto Stolz. And usually not totally sense-destroying to a human reader with context available. But these fallbacks allow clear misspelled words to appear, not just miscapitalized ones. That's huge. I'm all for a capital version of ß and other such

Re: s-j combination in Unicode?

2013-02-16 Thread Asmus Freytag
On 2/16/2013 9:55 PM, Stephan Stiller wrote: from earlier: Otto Scholz Oops, sorry. Otto Stolz. And usually not totally sense-destroying to a human reader with context available. But these fallbacks allow clear misspelled words to appear, not just miscapitalized ones. That's huge. I'm all

Re: s-j combination in Unicode?

2013-02-15 Thread Karl Pentzlin
On Thursday, 14 February 2013 at 14:38, Andries Brouwer wrote: AB and learn from Karl Pentzlin about n3555.pdf where Michael Everson AB proposes U+1E0A2 LATIN SMALL LETTER ESJ (and many other characters). AB This document is from 2008. What is the status? In fact, the workgroup on the German

Re: s-j combination in Unicode?

2013-02-15 Thread Andries Brouwer
On Fri, Feb 15, 2013 at 10:06:22AM +0100, Karl Pentzlin wrote: On Thursday, 14 February 2013 at 14:38, Andries Brouwer wrote: AB and learn from Karl Pentzlin about n3555.pdf where Michael Everson AB proposes U+1E0A2 LATIN SMALL LETTER ESJ (and many other characters). AB This document is

Re: s-j combination in Unicode?

2013-02-15 Thread Ben Scarborough
On Feb 16, 2013 02:13, Andries Brouwer wrote: The fragment of text I showed was not from dialectology, but just from a novel written in Elfdalian. The symbols are meant to be those of ordinary orthography. Does that mean there's also a capital S-J? —Ben Scarborough

Re: s-j combination in Unicode?

2013-02-14 Thread Andries Brouwer
I asked: : wondered how to code an s-j overstrike combination and learn from Karl Pentzlin about n3555.pdf where Michael Everson proposes U+1E0A2 LATIN SMALL LETTER ESJ (and many other characters). This document is from 2008. What is the status? On Wed, Feb 13, 2013 at 02:24:12PM -0800, Asmus

Re: s-j combination in Unicode?

2013-02-14 Thread Asmus Freytag
On 2/14/2013 5:38 AM, Andries Brouwer wrote: I asked: : wondered how to code an s-j overstrike combination and learn from Karl Pentzlin about n3555.pdf where Michael Everson proposes U+1E0A2 LATIN SMALL LETTER ESJ (and many other characters). This document is from 2008. What is the status?

s-j combination in Unicode?

2013-02-13 Thread Andries Brouwer
I wondered how to code an s-j overstrike combination in Unicode. Attached a photograph of some text containing this combination. Andries attachment: js.jpg

Re: s-j combination in Unicode?

2013-02-13 Thread Jukka K. Korpela
2013-02-13 21:31, Andries Brouwer wrote: I wondered how to code an s-j overstrike combination in Unicode. Attached a photograph of some text containing this combination. It looks like something that has not been encoded. The same applies to what seems to be an eth (ð) with a stroke

Re: s-j combination in Unicode?

2013-02-13 Thread Karl Pentzlin
On Wednesday, 13 February 2013 at 21:13, Jukka K. Korpela wrote: JKK 2013-02-13 21:31, Andries Brouwer wrote: I wondered how to code an s-j overstrike combination in Unicode. Attached a photograph of some text containing this combination. JKK It looks like something that has not been encoded

Re: s-j combination in Unicode?

2013-02-13 Thread Stephan Stiller
It looks like something that has not been encoded. What is the reason for not having a true combining grapheme joiner, one that overlays graphemes? Or a code point that instructs that the preceding (or following, I guess) code point should be printed at this position but otherwise be

Re: s-j combination in Unicode?

2013-02-13 Thread Andries Brouwer
On Wed, Feb 13, 2013 at 10:13:43PM +0200, Jukka K. Korpela wrote: 2013-02-13 21:31, Andries Brouwer wrote: I wondered how to code an s-j overstrike combination in Unicode. Attached a photograph of some text containing this combination. It looks like something that has not been encoded

Re: s-j combination in Unicode?

2013-02-13 Thread Leo Broukhis
On Wed, Feb 13, 2013 at 11:31 AM, Andries Brouwer a...@win.tue.nl wrote: I wondered how to code an s-j overstrike combination in Unicode. I'd write s ZWJ j and use a font that has the appropriate ligature. Leo
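For readers who want to see what the suggested sequence looks like at the code point level, here is a small sketch using Python's standard `unicodedata` module. It only builds and inspects the string; whether anything is drawn as an overstruck s-j depends entirely on a font providing such a ligature, which is an assumption and not something this code can check.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# The sequence proposed in the message: s, ZWJ, j.
sequence = "s" + ZWJ + "j"

# List each code point with its Unicode name.
described = [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in sequence]

# ZWJ is a format character (category Cf): invisible by itself,
# it only requests a joined rendering from the font.
zwj_category = unicodedata.category(ZWJ)

for code_point, name in described:
    print(code_point, name)
```

Without a cooperating font the text degrades to a plain "sj", which is part of the objection raised later in the thread.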

Re: s-j combination in Unicode?

2013-02-13 Thread Buck Golemon
On Wed, Feb 13, 2013 at 2:30 PM, Asmus Freytag asm...@ix.netcom.com wrote: On 2/13/2013 1:24 PM, Stephan Stiller wrote: It looks like something that has not been encoded. What is the reason for not having a true combining grapheme joiner, one that overlays graphemes? Or a code point that

Re: s-j combination in Unicode?

2013-02-13 Thread Asmus Freytag
On 2/13/2013 1:59 PM, Andries Brouwer wrote: [Concerning the g-slash, r-slash, eth-slash symbols, they can be coded using U+0337 as g̷ r̷ ð̷. Unicode generally does not decompose slashed symbols - so for example, o-slash does not have a decomposition using U+0337. The UTC may not feel bound
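Both points in the quoted passage can be checked with Python's standard `unicodedata` module: the slashed letters can be built with U+0337, and o-slash has no decomposition involving it. The sketch below is illustrative only, and the variable names are made up here.

```python
import unicodedata

SLASH = "\u0337"  # COMBINING SHORT SOLIDUS OVERLAY

# g-slash, r-slash and eth-slash as base letter + combining overlay.
slashed = [base + SLASH for base in "grð"]

# U+00F8 (ø) carries no decomposition mapping, so NFD leaves it unchanged.
o_slash = "\u00f8"
decomposition = unicodedata.decomposition(o_slash)
nfd_unchanged = unicodedata.normalize("NFD", o_slash) == o_slash

# Conversely, o + U+0337 does not compose into ø under NFC, so the two
# spellings are not canonically equivalent.
composed = unicodedata.normalize("NFC", "o" + SLASH)
equivalent = composed == o_slash

print(slashed, repr(decomposition), nfd_unchanged, equivalent)
```

One consequence is that searching or comparing text will treat ø and o followed by U+0337 as different strings, even after normalization.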

Re: s-j combination in Unicode?

2013-02-13 Thread Asmus Freytag
On 2/13/2013 1:24 PM, Stephan Stiller wrote: It looks like something that has not been encoded. What is the reason for not having a true combining grapheme joiner, one that overlays graphemes? Or a code point that instructs that the preceding (or following, I guess) code point should be

Re: s-j combination in Unicode?

2013-02-13 Thread Asmus Freytag
On 2/13/2013 2:58 PM, Buck Golemon wrote: On Wed, Feb 13, 2013 at 2:30 PM, Asmus Freytag asm...@ix.netcom.com wrote: On 2/13/2013 1:24 PM, Stephan Stiller wrote: It looks like something that has not been encoded. What is the reason for not having a true combining grapheme joiner, one that

Re: s-j combination in Unicode?

2013-02-13 Thread Asmus Freytag
On 2/13/2013 2:56 PM, Leo Broukhis wrote: On Wed, Feb 13, 2013 at 11:31 AM, Andries Brouwer a...@win.tue.nl wrote: I wondered how to code an s-j overstrike combination in Unicode. I'd write s ZWJ j and use a font that has the appropriate ligature. These features in Unicode aren't intended

Re: s-j combination in Unicode?

2013-02-13 Thread Leo Broukhis
wondered how to code an s-j overstrike combination in Unicode. I'd write s ZWJ j and use a font that has the appropriate ligature. These features in Unicode aren't intended as just hacks to get the right appearance. The idea is that you can encode the intention of the author more directly. Unless

Re: s-j combination in Unicode?

2013-02-13 Thread Asmus Freytag
, Feb 13, 2013 at 11:31 AM, Andries Brouwer a...@win.tue.nl wrote: I wondered how to code an s-j overstrike combination in Unicode. I'd write s ZWJ j and use a font that has the appropriate ligature. These features in Unicode aren't intended as just hacks to get the right appearance. The idea