Hi,

Thanks for your message. Let me try to ask this question a little differently.

Regarding Unicode, it appears that Skim behaves unlike every other PDF reader I've got, e.g., Acrobat, PDF Expert, Foxit Reader, and Preview. If I copy some text from a PDF file that uses Unicode, all of these PDF readers perform Unicode normalization to the precomposed form (NFC), while Skim does not.

For example, if I copy the string "shū" from a PDF using any of the other reader apps, the clipboard contains the UTF-8 bytes 7368c5ab (a precomposed "ū", U+016B). If I try this with Skim, though, the clipboard contains 736875cc84 (a plain "u" followed by the combining macron U+0304). You can verify this using the command line, e.g., *echo -n "shū" | xxd -p*

Similarly, the output via skimnotes is not being normalized.

So, it seems that other PDF reader apps are doing normalization, but Skim does not. When you say "any normalization should happen before the data was created", in fact this is not how the other PDF reader apps that I've got seem to work.

Would you consider adding an option for *skimnotes* to behave like other PDF apps w.r.t. Unicode?

Thanks,
M.

On Fri, Apr 12, 2024 at 6:21 PM Christiaan Hofman <cmhof...@gmail.com> wrote:

> That would be useless. SkimNotes does not process the data, it just copies
> it between different locations. Also passing the file through conversion
> does not work, as none of the formats involved are unicode text files. The
> point is that the strings included as part of the data may represent
> strings in some encoding in some form. So any normalization should happen
> before the data was created (which is not an option), or by parsing the
> plist, normalizing all strings in it, and reassembling the plist.
>
> Christiaan
>
> On 12 Apr 2024, at 04:48, Mark Roberts <mroberts1...@gmail.com> wrote:
>
> Hi,
>
> Thanks for clarifying.
>
> I looked for a tool to do this, but I haven't found anything.
>
> Some people suggest running a text file through *oconv*, but that seems
> to be just a brute force approach to patch specific characters.
>
> What would you think about an option to *skimnotes* that invokes
> precomposedStringWithCanonicalMapping() or whatever the appropriate
> function is?
>
> Thanks !
>
> On Thu, Apr 11, 2024 at 11:18 PM Christiaan Hofman <cmhof...@gmail.com>
> wrote:
>
>>
>>
>> On 11 Apr 2024, at 13:31, Mark Roberts <mroberts1...@gmail.com> wrote:
>>
>> I've been using the *skimnotes* command line app to "get" skim notes as
>> a plist, and to then convert them to XML with plutil.
>>
>> One thing I've discovered is that notes in Unicode may not be normalized.
>>
>> Some apps can handle this, but some cannot.
>>
>> Question: is there a way to get skimnotes to normalize the Unicode, or
>> could you suggest an app I can use in a pipe with plutil... ?
>>
>> Thanks !
>>
>>
>> I don’t think there exists a tool to normalize unicode strings in a
>> binary plist. I am pretty sure that any tool that may exist to convert a
>> binary plist to an XML plist goes through the same system code from Apple,
>> so it will all do the same thing. Perhaps there is a tool to post-process
>> the XML to normalize any strings as Unicode, but I could not help you there
>> either.
>>
>> Christiaan
>>
>
> _______________________________________________
> Skim-app-users mailing list
> Skim-app-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/skim-app-users
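
For reference, here is a minimal Python sketch of the approach suggested in the thread above: parse the plist, normalize every string in it, and reassemble the plist. It is only an illustration, not something Skim or skimnotes provides. The helper names normalize_plist and normalize_plist_bytes are hypothetical, and the sketch assumes the notes are a plain property list that Python's plistlib can read. Binary data values (which may themselves hold text, e.g. RTF) are left untouched. The last lines reproduce the two byte sequences quoted in the message.

```python
import plistlib
import unicodedata


def normalize_plist(obj, form="NFC"):
    """Return a copy of a parsed plist object with every string normalized.

    Hypothetical helper: walks lists and dicts recursively and leaves
    numbers, booleans, dates, and binary data as they are.
    """
    if isinstance(obj, str):
        return unicodedata.normalize(form, obj)
    if isinstance(obj, list):
        return [normalize_plist(item, form) for item in obj]
    if isinstance(obj, dict):
        return {
            normalize_plist(key, form): normalize_plist(value, form)
            for key, value in obj.items()
        }
    return obj


def normalize_plist_bytes(data, form="NFC"):
    """Parse a binary or XML plist, normalize its strings, return XML bytes."""
    parsed = plistlib.loads(data)  # format is detected automatically
    return plistlib.dumps(normalize_plist(parsed, form), fmt=plistlib.FMT_XML)


# The two forms of "shū" discussed in the message.
decomposed = "shu\u0304"  # "u" followed by COMBINING MACRON (what Skim yields)
composed = unicodedata.normalize("NFC", decomposed)  # precomposed U+016B

print(decomposed.encode("utf-8").hex())  # 736875cc84
print(composed.encode("utf-8").hex())    # 7368c5ab
```

A script built around normalize_plist_bytes could read the plist from standard input and write the normalized XML to standard output, so it could sit in a pipe after skimnotes. Whether NFC is the right target form depends on what the receiving apps expect.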
_______________________________________________
Skim-app-users mailing list
Skim-app-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/skim-app-users