See
https://fr.wikipedia.org/wiki/Carte_m%C3%A9t%C3%A9orologique#/media/File:Station_model_fr.svg
I see these symbols for noting cloud types (here cirrus and altocumulus,
one drawn diagonally for middle altitudes, the other drawn horizontally for
high altitudes).
Note that the symbols may vary:
2016-03-17 19:02 GMT+01:00 Pierpaolo Bernardi :
> On Thu, Mar 17, 2016 at 6:37 PM, Leonardo Boiko
> wrote:
> > The PDF *displays* correctly. But try copying the string 'ti' from
> > the text into another application outside of your PDF viewer, and you'll
Yeah, I've stumbled upon this a lot in academic Japanese/Chinese
texts. I try to copy some Chinese character, only to find out that
it's really a string of random ASCII characters.
Is there only one of those crap PDF pseudo-encodings? If so, I'll use
a converter next time...
2016-03-17 14:57
2016-03-18 19:11 GMT+01:00 Garth Wallace :
> > The issues with line breaking (if you can use these combining around all
> > characters, including spaces) can be solved using unbreakable characters.
>
> Line breaking isn't really a problem that I can see with the Quivira
> model.
On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote:
> I'm working on extending the case conversion methods for the programming
> language Ruby from the current ASCII only to cover all of Unicode.
>
> Ruby comes with four methods for case conversion. Three of them, upcase,
> downcase, and
Odd result when copy/pasting text from a PDF: For some reason "ti" in
the (English) text of the document at
http://web.isanet.org/Web/Conferences/Atlanta%202016/Atlanta%202016%20-%20Full%20Program.pdf
is coded as "Ɵ". Looking more closely at the original text, it does
appear that the glyph is
On Thu, Mar 17, 2016 at 11:28 PM, J Decker wrote:
> On Thu, Mar 17, 2016 at 9:18 PM, Garth Wallace wrote:
>> There's another strategy for dealing with enclosed numbers, which is
>> taken by the font Quivira in its PUA: encoding separate
>>
On Mon, 14 Mar 2016 09:19:35 -0700, Ken Whistler wrote:
> U+23FF is already assigned to OBSERVER EYE SYMBOL, which is
> already under ballot for 10646 (and approved by the UTC).
>
> http://www.unicode.org/alloc/Pipeline.html
>
> Please always first check that page before suggesting code points
That's a smart idea... Note that you could encode the middle digits so that
their enclosure at top and bottom is by default only horizontal (no arcs
of circle) when shown in isolation, and the left and right parts are just
connecting by default horizontally to the top and bottom position of the
One problem caused by disunification is the complexification of algorithms
handling text.
I forgot an important case where disunification also occurred: combining
sequences are the "normal" encoding, but legacy charsets encoded the
precomposed character separately and Unicode had to map them for
Hi Don,
Latin is fine if you keep to simple well made fonts and avoid using more
sophisticated typographic features available in some fonts.
Dumb it down typographically and it works fine. PDF, despite all the
current rhetoric coming from PDF software developers, is a preprint format.
Not an
On 2016-03-19, Don Osborn wrote:
> The details may or may not be relevant to the list topic, but as a user
> of documents in PDF format, I fail to see the benefit of such obscure
> mappings. And as a creator of PDFs ("save as") looking at others' PDFs
Aren't you just being
Some other resources (outside Wikipedia):
- Kean University:
http://www.kean.edu/~fosborne/resources/ex10g.htm
- Documented by NOAA in the US (but I can't find the complete reference)
- These symbols seem to be supported by an "international standard", but I
don't know which one exactly.
-
On 3/18/2016 11:48 AM, Philippe Verdy
wrote:
East Asian vertical presentation does not just stack
the elements on top of each other, very frequently they rotate
them (including Latin/Greek/Cyrillic letters). So this is not
really a new complication.
On 3/16/2016 11:11 PM, Philippe Verdy
wrote:
"Disunification may be an answer?" We should avoid
it as well.
Disunification is only acceptable when
- there's a complete disunification of concepts
I
On Fri, Mar 18, 2016 at 11:48 AM, Philippe Verdy wrote:
> 2016-03-18 19:11 GMT+01:00 Garth Wallace :
>>
>> > The issues with line breaking (if you can use these combining around all
>> > characters, including spaces) can be solved using unbreakable
>> >
Thanks Andrew, Looking at the issue of ToUnicode mapping you mention,
why in the 1-many mapping of ligatures (for fonts that have them) do the
"many" not simply consist of the characters ligated? Maybe that's too
simple (my understanding of the process is clearly inadequate).
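
As a rough illustration of the mapping question above, here is a minimal Python sketch. The glyph IDs, the two tables, and `extract_text` are all hypothetical; a real PDF keeps this information in a ToUnicode CMap, which this sketch does not parse. It only shows how a 1-many entry for a ti-ligature glyph would copy out cleanly, and how a wrong entry could produce "internaƟonal".

```python
# Hypothetical glyph IDs for a font that has a "ti" ligature at glyph 9.
# These numbers are made up for illustration; they are not from any real font.
GOOD_TO_UNICODE = {
    1: "i", 2: "n", 3: "t", 4: "e", 5: "r",
    6: "a", 7: "o", 8: "l",
    9: "ti",  # 1-many mapping: one ligature glyph, two characters
}

# Same table, except the ligature glyph is mapped to U+019F
# (LATIN CAPITAL LETTER O WITH MIDDLE TILDE) instead of "ti".
BAD_TO_UNICODE = dict(GOOD_TO_UNICODE)
BAD_TO_UNICODE[9] = "\u019F"


def extract_text(glyph_ids, to_unicode):
    """Turn a glyph sequence back into text, as a copy/paste would.

    Glyphs with no mapping become U+FFFD (REPLACEMENT CHARACTER).
    """
    return "".join(to_unicode.get(g, "\uFFFD") for g in glyph_ids)


# "international" as drawn on the page: i n t e r n a [ti] o n a l
WORD_GLYPHS = [1, 2, 3, 4, 5, 2, 6, 9, 7, 2, 6, 8]

good = extract_text(WORD_GLYPHS, GOOD_TO_UNICODE)
bad = extract_text(WORD_GLYPHS, BAD_TO_UNICODE)
print(good)
print(bad)
```

Both versions display identically, because the page only references glyph 9; the difference shows up only when text is extracted.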
The "string of
Sequences were introduced long before. I know that they add their own
complications everywhere, but they are already part of existing algorithms.
If sequences (not just combining sequences) were not there, there would be
many more characters encoded in the database and everything would be encoded
Thanks all for the feedback.
Doug, It may well be my clipboard (running Windows 7 on this particular
laptop). Get same results pasting into Word and EmEditor.
So, when I did a web search on "internaƟonal," as previously mentioned,
and came up with a lot of results (mostly PDFs), were those
On Sat Mar 19, 2016 12:54:51, Martin J. Dürst wrote:
> On 2016/03/19 04:33, Marcel Schneider wrote:
> > On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote:
>
> >> b) Convert to upper (or lower), which may simplify implementation.
>
> >> For example, 'Džinsi' (jeans) would become 'DžINSI' with
Martin J. Dürst wrote:
Now the question I have is: What to do for titlecase characters?
[ ... ]
For example, 'Džinsi' (jeans) would become 'DžINSI' with a), 'DŽINSI' (or
'džinsi') with b), and 'dŽINSI' with c).
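
The three outcomes quoted above can be reproduced with a small Python sketch. This is not Ruby's implementation; the function names and the policy labels ("keep", "upper", "decompose") are invented here, and reading c) as "decompose, swap each part, recompose" is an assumption based only on the quoted result. The lowercase variant of b) is left out.

```python
import unicodedata


def swap_simple(ch):
    """Swap case for plain uppercase/lowercase letters; leave the rest alone."""
    cat = unicodedata.category(ch)
    if cat == "Lu":
        return ch.lower()
    if cat == "Ll":
        return ch.upper()
    return ch


def swapcase_sketch(text, titlecase_policy="keep"):
    """Swap case, handling titlecase letters (category Lt) per the given policy.

    "keep"      -> leave the titlecase character unchanged   (option a)
    "upper"     -> convert it to uppercase                   (option b)
    "decompose" -> split it, swap each part, then recompose  (option c, assumed)
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) != "Lt":
            out.append(swap_simple(ch))
        elif titlecase_policy == "keep":
            out.append(ch)
        elif titlecase_policy == "upper":
            out.append(ch.upper())
        elif titlecase_policy == "decompose":
            parts = unicodedata.normalize("NFKD", ch)
            swapped = "".join(swap_simple(p) for p in parts)
            out.append(unicodedata.normalize("NFC", swapped))
        else:
            raise ValueError("unknown policy: " + titlecase_policy)
    return "".join(out)


word = "\u01C5insi"  # begins with U+01C5, the titlecase digraph
for policy in ("keep", "upper", "decompose"):
    print(policy, swapcase_sketch(word, policy))
```

With "decompose", the result no longer contains a digraph character at all: it starts with a plain "d" followed by U+017D.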
For the Latin letters at least, my 0.02 cents' worth (you read that
right) is that
"Disunification may be an answer?" We should avoid it as well.
We have other solutions in Unicode
- variation selectors (often used for sinograms when their unified shapes
must be distinguished in some contexts such as people names or toponyms or
trademark names or in other specific contexts),
-
Yes, it seems like your mileage varies with the PDF
viewer/interpreter/converter. Text copied from Preview on the Mac replaces the
ti ligature with a space. Certainly not a Unicode problem, per se, but an
interesting problem nevertheless.
-steve
> On Mar 17, 2016, at 11:11 AM, Doug Ewell
Thanks everybody for the feedback.
On 2016/03/19 04:33, Marcel Schneider wrote:
On Fri, Mar 18, 2016, 08:43:56, Martin J. Dürst wrote:
b) Convert to upper (or lower), which may simplify implementation.
For example, 'Džinsi' (jeans) would become 'DžINSI' with a), 'DŽINSI' (or
'džinsi') with
On 3/15/2016 8:14 PM, David Faulks
wrote:
As part of my investigations into astrological symbols, I'm beginning to wonder if glyph variations are justifications for separate encoding of symbols I would have previously considered the same or unifiable with symbols
There are a few things going on.
In the first instance, it may be the font itself that is the source of the
problem.
My understanding is that PDF files contain a sequence of glyphs. A PDF file
will contain a ToUnicode mapping between glyphs and codepoints. This
is either a 1-1 mapping or a 1-many
On 18 March 2016 at 23:49, Garth Wallace wrote:
>
> Correction: the 2-digit pairs would require 19 characters. There would
> be no need for a left half circle enclosed digit one, since the
> enclosed numbers 10–19 are already encoded. This would only leave
> enclosed 20 as a
On Thu, Mar 17, 2016 at 19:02:19, Pierpaolo Bernardi wrote:
> unicode says nothing about font technologies
It does mention them briefly, however, in the core specification:
http://www.unicode.org/versions/Unicode8.0.0/ch23.pdf#G23126
> unicode does not mandate how to encode ligatures
The 'swapcase' just sounds bizarre. What on earth is it for? My inclination
would be to just do the simplest possible implementation that has the
expected results for the 1:1 case pairs, and whatever falls out from the
algorithm for the others.
Mark
On Sat, Mar 19, 2016 at 4:11 AM, Asmus
The PDF *displays* correctly. But try copying the string 'ti' from
the text into another application outside of your PDF viewer, and you'll
see that the thing that *displays* as 'ti' is *coded* as Ɵ, as Don
Osborn said.
2016-03-17 14:26 GMT-03:00 Pierpaolo Bernardi :
> That
On Thu, Mar 17, 2016 at 9:18 PM, Garth Wallace wrote:
> There's another strategy for dealing with enclosed numbers, which is
> taken by the font Quivira in its PUA: encoding separate
> left-half-circle-enclosed and right-half-circle-enclosed digits. This
> would require 20
On 3/18/2016 12:33 PM, Marcel Schneider
wrote:
As for decomposing digraphs and ypogegrammeni to apply swapcase: That probably would be doing no good, as itʼs unnecessary and users wonʼt expect it.
That was my intuition as well, but based on a
Hi Frédéric,
The historic use of ideographic numbers for marking Go moves is
discussed in the latest draft of my document:
http://www.babelstone.co.uk/Unicode/GoNotation.pdf
Andrew
On 16 March 2016 at 13:35, Frédéric Grosshans
wrote:
> Le 15/03/2016 22:21,
On 2016/03/19 04:55, Garth Wallace wrote:
On Fri, Mar 18, 2016 at 11:48 AM, Philippe Verdy wrote:
2016-03-18 19:11 GMT+01:00 Garth Wallace :
Rotation is definitely not salient in standard go kifu like it is in
fairy chess notation. Go variants for more
34 matches