Yes, it seems like your mileage varies with the PDF viewer/interpreter/converter. Text copied from Preview on the Mac replaces the ti ligature with a space. Certainly not a Unicode problem, per se, but an interesting problem nevertheless.
-steve

> On Mar 17, 2016, at 11:11 AM, Doug Ewell <[email protected]> wrote:
>
> Don Osborn wrote:
>
>> Odd result when copy/pasting text from a PDF: For some reason "ti" in
>> the (English) text of the document at
>> http://web.isanet.org/Web/Conferences/Atlanta%202016/Atlanta%202016%20-%20Full%20Program.pdf
>> is coded as "Ɵ". Looking more closely at the original text, it does
>> appear that the glyph is a "ti" ligature (which afaik is not coded as
>> such in Unicode).
>
> When I copy and paste the PDF text in question into BabelPad, I get:
>
>> Internaonal Order and the Distribuon of Identy in 1950 (By
>> invitaon only)
>
> The "ti" ligatures are implemented as U+10019F, a Plane 16 private-use
> character.
>
> Truncating this character to 16 bits, which is a Bad Thing™, yields
> U+019F LATIN CAPITAL LETTER O WITH MIDDLE TILDE. So it looks like either
> Don's clipboard or the editor he pasted it into is not fully
> Unicode-compliant.
>
> Don's point about using alternative characters to implement ligatures,
> thereby messing up web searches, remains valid.
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸
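
For anyone who wants to check the arithmetic in Doug's explanation, here is a minimal Python sketch. It assumes the truncation is a simple mask to the low 16 bits; the helper name `truncate_to_16_bits` is made up for illustration, and nothing here claims to reproduce what Don's clipboard or editor actually does internally.

```python
import unicodedata

# Code point reported in the quoted message for the "ti" ligature.
PUA_TI = 0x10019F


def truncate_to_16_bits(code_point: int) -> int:
    """Hypothetical helper: keep only the low 16 bits of a code point."""
    return code_point & 0xFFFF


truncated = truncate_to_16_bits(PUA_TI)

# U+10019F is in the Plane 16 private-use area (general category "Co").
print(f"U+{PUA_TI:06X} category: {unicodedata.category(chr(PUA_TI))}")

# After truncation, the value lands on an ordinary BMP letter.
print(f"U+{truncated:04X} name: {unicodedata.name(chr(truncated))}")

# A compliant UTF-16 encoder represents the original as a surrogate pair
# (two 16-bit code units) instead of dropping the high bits.
print(chr(PUA_TI).encode("utf-16-be").hex())
```

If the masking assumption holds, this would account for the "Ɵ" Don saw: the private-use code point loses its plane bits and becomes U+019F.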

