Don Osborn wrote:

> Odd result when copy/pasting text from a PDF: For some reason "ti" in
> the (English) text of the document at
> http://web.isanet.org/Web/Conferences/Atlanta%202016/Atlanta%202016%20-%20Full%20Program.pdf
> is coded as "Ɵ". Looking more closely at the original text, it does
> appear that the glyph is a "ti" ligature (which afaik is not coded as
> such in Unicode).

When I copy and paste the PDF text in question into BabelPad, I get:

> Internaonal Order and the Distribuon of Identy in 1950 (By
> invitaon only)

The "ti" ligatures are implemented as U+10019F, a Plane 16 private-use
character. Truncating this character to 16 bits, which is a Bad Thing™,
yields U+019F LATIN CAPITAL LETTER O WITH MIDDLE TILDE. So it looks like
either Don's clipboard or the editor he pasted it into is not fully
Unicode-compliant.

Don's point about using alternative characters to implement ligatures,
thereby messing up web searches, remains valid.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸
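
The truncation described above can be checked with a few lines of Python. This is only a minimal sketch: it assumes the faulty tool simply keeps the low 16 bits of the code point, and the names PUA_TI and truncate_to_16_bits are made up for illustration. What the clipboard or editor actually did is not known.

```python
import unicodedata

# Code point found for the "ti" ligature in the PDF text (Plane 16 private use).
PUA_TI = 0x10019F


def truncate_to_16_bits(code_point: int) -> int:
    """Keep only the low 16 bits (assumed behaviour of the faulty tool)."""
    return code_point & 0xFFFF


truncated = truncate_to_16_bits(PUA_TI)

# General category "Co" means private use.
print(f"U+{PUA_TI:06X} category: {unicodedata.category(chr(PUA_TI))}")
# The truncated value lands on an unrelated, assigned Latin letter.
print(f"U+{truncated:04X} name: {unicodedata.name(chr(truncated))}")
```

Masking with 0xFFFF discards the plane number (0x10) and leaves 0x019F, which is why an unrelated letter shows up in place of the ligature.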

