Meteorological symbols for cloud conditions (on maps or elsewhere)

2016-03-19 Thread Philippe Verdy
See https://fr.wikipedia.org/wiki/Carte_m%C3%A9t%C3%A9orologique#/media/File:Station_model_fr.svg I see these symbols for noting cloud types (here cirrus and altocumulus, one drawn diagonally for middle altitudes, another drawn horizontally for high altitudes). Note that the symbols may vary:

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Philippe Verdy
2016-03-17 19:02 GMT+01:00 Pierpaolo Bernardi : > On Thu, Mar 17, 2016 at 6:37 PM, Leonardo Boiko > wrote: > > The PDF *displays* correctly. But try copying the string 'ti' from > > the text to another application outside of your PDF viewer, and you'll

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Leonardo Boiko
Yeah, I've stumbled upon this a lot in academic Japanese/Chinese texts. I try to copy some Chinese character, only to find out that it's really a string of random ASCII characters. Is there only one of those crap PDF pseudo-encodings? If so, I'll use a converter next time... 2016-03-17 14:57

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Philippe Verdy
2016-03-18 19:11 GMT+01:00 Garth Wallace : > > The issues with line breaking (if you can use these combining around all > > characters, including spaces) can be solved using unbreakable characters. > > Line breaking isn't really a problem that I can see with the Quivira > model.

Re: Swapcase for Titlecase characters

2016-03-19 Thread Marcel Schneider
On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote: > I'm working on extending the case conversion methods for the programming > language Ruby from the current ASCII only to cover all of Unicode. > > Ruby comes with four methods for case conversion. Three of them, upcase, > downcase, and

Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Don Osborn
Odd result when copy/pasting text from a PDF: For some reason "ti" in the (English) text of the document at http://web.isanet.org/Web/Conferences/Atlanta%202016/Atlanta%202016%20-%20Full%20Program.pdf is coded as "Ɵ". Looking more closely at the original text, it does appear that the glyph is

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Garth Wallace
On Thu, Mar 17, 2016 at 11:28 PM, J Decker wrote: > On Thu, Mar 17, 2016 at 9:18 PM, Garth Wallace wrote: >> There's another strategy for dealing with enclosed numbers, which is >> taken by the font Quivira in its PUA: encoding separate >>

Proposal for *U+2427 NARROW SHOULDERED OPEN BOX (was: Re: Proposal for *U+23FF SHOULDERED NARROW OPEN BOX?)

2016-03-19 Thread Marcel Schneider
On Mon, 14 Mar 2016 09:19:35 -0700, Ken Whistler wrote: > U+23FF is already assigned to OBSERVER EYE SYMBOL, which is > already under ballot for 10646 (and approved by the UTC). > > http://www.unicode.org/alloc/Pipeline.html > > Please always first check that page before suggesting code points

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Philippe Verdy
That's a smart idea... Note that you could encode the middle digits so that their enclosure at top and bottom are by default only horizontal (no arcs of circle) when shown in isolation, and the left and right parts are just connecting by default horizontally to the top and bottom position of the

Re: Variations and Unifications ?

2016-03-19 Thread Philippe Verdy
One problem caused by disunification is the complexification of algorithms handling text. I forgot an important case where disunification also occurred: combining sequences are the "normal" encoding, but legacy charsets encoded the precomposed character separately and Unicode had to map them for
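
The point about precomposed characters versus combining sequences can be illustrated with a minimal Python sketch. It uses only the standard `unicodedata` module; the choice of "é" as the sample character and the helper name `canonically_equal` are assumptions made for illustration, not something taken from the message.

```python
import unicodedata

# Two encodings of the same abstract character (sample chosen for illustration).
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, a single code point
combining = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFD (hypothetical helper)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# A plain code point comparison sees two different strings...
print(precomposed == combining)
# ...while a normalization-aware comparison treats them as the same text.
print(canonically_equal(precomposed, combining))
```

Any algorithm that compares or searches text has to take this extra normalization step, which is one concrete form of the added complexity mentioned above.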

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Andrew Cunningham
Hi Don, Latin is fine if you keep to simple well made fonts and avoid using more sophisticated typographic features available in some fonts. Dumb it down typographically and it works fine. PDF, despite all the current rhetoric coming from PDF software developers, is a preprint format. Not an

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Julian Bradfield
On 2016-03-19, Don Osborn wrote: > The details may or may not be relevant to the list topic, but as a user > of documents in PDF format, I fail to see the benefit of such obscure > mappings. And as a creator of PDFs ("save as") looking at others' PDFs Aren't you just being

Re: Meteorological symbols for cloud conditions (on maps or elsewhere)

2016-03-19 Thread Philippe Verdy
Some other resources (outside Wikipedia): - Kean University: http://www.kean.edu/~fosborne/resources/ex10g.htm - Documented by the NOAA in the US (but I can't find the complete reference) - These symbols seem to be supported by an "international standard", but I don't know which one exactly. -

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Asmus Freytag (t)
On 3/18/2016 11:48 AM, Philippe Verdy wrote: East Asian vertical presentation does not just stack the elements on top of each other, very frequently they rotate them (including Latin/Greek/Cyrillic letters) So this is not really a new complication.

Re: Variations and Unifications ?

2016-03-19 Thread Asmus Freytag (t)
On 3/16/2016 11:11 PM, Philippe Verdy wrote: "Disunification may be an answer?" We should avoid it as well. Disunification is only acceptable when - there's a complete disunification of concepts I

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Garth Wallace
On Fri, Mar 18, 2016 at 11:48 AM, Philippe Verdy wrote: > 2016-03-18 19:11 GMT+01:00 Garth Wallace : >> >> > The issues with line breaking (if you can use these combining around all >> > characters, including spaces) can be solved using unbreakable >> >

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Don Osborn
Thanks Andrew, Looking at the issue of ToUnicode mapping you mention, why in the 1-many mapping of ligatures (for fonts that have them) do the "many" not simply consist of the characters ligated? Maybe that's too simple (my understanding of the process is clearly inadequate). The "string of

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Philippe Verdy
Sequences were introduced long before. I know that they add their own complications everywhere, but they are already part of existing algorithms. If sequences (not just combining sequences) were not there, there would be many more characters encoded in the database and everything would be encoded

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Don Osborn
Thanks all for the feedback. Doug, It may well be my clipboard (running Windows 7 on this particular laptop). Get same results pasting into Word and EmEditor. So, when I did a web search on "internaƟonal," as previously mentioned, and came up with a lot of results (mostly PDFs), were those

Re: Swapcase for Titlecase characters

2016-03-19 Thread Marcel Schneider
On Sat Mar 19, 2016 12:54:51, Martin J. Dürst wrote: > On 2016/03/19 04:33, Marcel Schneider wrote: > > On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote: > > >> b) Convert to upper (or lower), which may simplify implementation. > > >> For example, 'Džinsi' (jeans) would become 'DžINSI' with

Re: Swapcase for Titlecase characters

2016-03-19 Thread Doug Ewell
Martin J. Dürst wrote: Now the question I have is: What to do for titlecase characters? [ ... ] For example, 'Džinsi' (jeans) would become 'DžINSI' with a), 'DŽINSI' (or 'džinsi') with b), and 'dŽINSI' with c). For the Latin letters at least, my 0.02 cents' worth (you read that right) is that
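
The three options quoted above can be sketched in Python for comparison. This is only an illustration under stated assumptions: the function names are hypothetical, the sample word is spelled with the titlecase digraph U+01C5, option b) is shown in its "convert to upper" variant, and option c) is implemented here by compatibility decomposition followed by recomposition, which is one possible reading of "decompose". None of this is Ruby's actual implementation.

```python
import unicodedata

def swapcase_keep(s: str) -> str:
    """Option a): swap Lu/Ll characters, leave titlecase (Lt) characters unchanged."""
    out = []
    for ch in s:
        cat = unicodedata.category(ch)
        if cat == "Lu":
            out.append(ch.lower())
        elif cat == "Ll":
            out.append(ch.upper())
        else:
            out.append(ch)
    return "".join(out)

def swapcase_to_upper(s: str) -> str:
    """Option b): convert titlecase characters to uppercase, swap the rest."""
    return "".join(
        ch.upper() if unicodedata.category(ch) == "Lt" else swapcase_keep(ch)
        for ch in s
    )

def swapcase_decompose(s: str) -> str:
    """Option c): split titlecase characters into components and swap each one."""
    out = []
    for ch in s:
        if unicodedata.category(ch) == "Lt":
            parts = unicodedata.normalize("NFKD", ch)  # U+01C5 -> "D", "z", U+030C
            out.append(unicodedata.normalize("NFC", swapcase_keep(parts)))
        else:
            out.append(swapcase_keep(ch))
    return "".join(out)

word = "\u01c5insi"  # the sample word, written with the titlecase digraph
for f in (swapcase_keep, swapcase_to_upper, swapcase_decompose):
    print(f.__name__, f(word))
```

Note that option c) no longer returns a single digraph character: the result starts with "d" followed by U+017D, so the string grows by one code point.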

Re: Variations and Unifications ?

2016-03-19 Thread Philippe Verdy
"Disunification may be an answer?" We should avoid it as well. We have other solutions in Unicode - variation selectors (often used for sinograms when their unified shapes must be distinguished in some contexts such as people names or toponyms or trademark names or in other specific contexts), -

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Steve Swales
Yes, it seems like your mileage varies with the PDF viewer/interpreter/converter. Text copied from Preview on the Mac replaces the ti ligature with a space. Certainly not a Unicode problem, per se, but an interesting problem nevertheless. -steve > On Mar 17, 2016, at 11:11 AM, Doug Ewell

Re: Swapcase for Titlecase characters

2016-03-19 Thread Martin J. Dürst
Thanks everybody for the feedback. On 2016/03/19 04:33, Marcel Schneider wrote: On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote: b) Convert to upper (or lower), which may simplify implementation. For example, 'Džinsi' (jeans) would become 'DžINSI' with a), 'DŽINSI' (or 'džinsi') with

Re: Variations and Unifications ?

2016-03-19 Thread Asmus Freytag (t)
On 3/15/2016 8:14 PM, David Faulks wrote: As part of my investigations into astrological symbols, I'm beginning to wonder if glyph variations are justifications for separate encoding of symbols I would have previously considered the same or unifiable with symbols

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Andrew Cunningham
There are a few things going on. In the first instance, it may be the font itself that is the source of the problem. My understanding is that PDF files contain a sequence of glyphs. A PDF file will contain a ToUnicode mapping between glyphs and codepoints. This is either a 1-1 mapping or a 1-many
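
The glyph-to-codepoint mapping described above can be modelled with a small Python sketch. Everything here is an assumption for illustration: the glyph IDs, the two dictionaries, and the function `extract_text` are made up, and a real PDF stores this information in a ToUnicode CMap stream, not a Python dict. The sketch only shows how a 1-many entry for a ligature glyph yields the expected text, while an entry pointing at a single unrelated code point yields the "Ɵ" result reported in this thread.

```python
import unicodedata

# Hypothetical glyph IDs for a made-up font; glyph 9 is the "ti" ligature.
GOOD_TO_UNICODE = {
    1: "i", 2: "n", 3: "t", 4: "e", 5: "r", 6: "a", 7: "o", 8: "l",
    9: "ti",  # 1-many: one glyph maps to the two characters it ligates
}

# Same table, except the ligature glyph maps to one unrelated code point.
BAD_TO_UNICODE = dict(GOOD_TO_UNICODE)
BAD_TO_UNICODE[9] = "\u019f"

def extract_text(glyph_ids, to_unicode):
    """Turn a glyph sequence back into text; unmapped glyphs become U+FFFD."""
    return "".join(to_unicode.get(g, "\ufffd") for g in glyph_ids)

# The page content is a glyph sequence: i n t e r n a (ti) o n a l
glyphs = [1, 2, 3, 4, 5, 2, 6, 9, 7, 2, 6, 8]

print(extract_text(glyphs, GOOD_TO_UNICODE))
print(extract_text(glyphs, BAD_TO_UNICODE))
print(unicodedata.name("\u019f"))
```

Both tables render identically on screen, since rendering uses only the glyphs; the difference shows up only when text is copied or searched.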

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Andrew West
On 18 March 2016 at 23:49, Garth Wallace wrote: > > Correction: the 2-digit pairs would require 19 characters. There would > be no need for a left half circle enclosed digit one, since the > enclosed numbers 10–19 are already encoded. This would only leave > enclosed 20 as a

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Marcel Schneider
On Thu, Mar 17, 2016 at 19:02:19, Pierpaolo Bernardi wrote: > unicode says nothing about font technologies It mentions them a little bit however in the core specifications: http://www.unicode.org/versions/Unicode8.0.0/ch23.pdf#G23126 > unicode does not mandate how to encode ligatures

Re: Swapcase for Titlecase characters

2016-03-19 Thread Mark Davis ☕️
The 'swapcase' just sounds bizarre. What on earth is it for? My inclination would be to just do the simplest possible implementation that has the expected results for the 1:1 case pairs, and whatever falls out from the algorithm for the others. Mark On Sat, Mar 19, 2016 at 4:11 AM, Asmus

Re: Joined "ti" coded as "Ɵ" in PDF

2016-03-19 Thread Leonardo Boiko
The PDF *displays* correctly. But try copying the string 'ti' from the text to another application outside of your PDF viewer, and you'll see that the thing that *displays* as 'ti' is *coded* as Ɵ, as Don Osborn said. 2016-03-17 14:26 GMT-03:00 Pierpaolo Bernardi : > That

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Garth Wallace
On Thu, Mar 17, 2016 at 9:18 PM, Garth Wallace wrote: > There's another strategy for dealing with enclosed numbers, which is > taken by the font Quivira in its PUA: encoding separate > left-half-circle-enclosed and right-half-circle-enclosed digits. This > would require 20

Re: Swapcase for Titlecase characters

2016-03-19 Thread Asmus Freytag (t)
On 3/18/2016 12:33 PM, Marcel Schneider wrote: As for decomposing digraphs and ypogegrammeni to apply swapcase: That probably would be doing no good, as itʼs unnecessary and users wonʼt expect it. That was my intuition as well, but based on a

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Andrew West
Hi Frédéric, The historic use of ideographic numbers for marking Go moves is discussed in the latest draft of my document: http://www.babelstone.co.uk/Unicode/GoNotation.pdf Andrew On 16 March 2016 at 13:35, Frédéric Grosshans wrote: > Le 15/03/2016 22:21,

Re: Purpose of and rationale behind Go Markers U+2686 to U+2689

2016-03-19 Thread Martin J. Dürst
On 2016/03/19 04:55, Garth Wallace wrote: On Fri, Mar 18, 2016 at 11:48 AM, Philippe Verdy wrote: 2016-03-18 19:11 GMT+01:00 Garth Wallace : Rotation is definitely not salient in standard go kifu like it is in fairy chess notation. Go variants for more