Hi,

Thanks for your message. Let me try to ask this question a little
differently.

Regarding Unicode, it appears that Skim behaves unlike every other PDF
reader I've got, e.g., Acrobat, PDF Expert, Foxit Reader, and Preview.

If I copy some text from a PDF file that uses Unicode, all of these PDF
readers will perform Unicode normalization, while Skim does not.

For example, if I copy the string "shū" from a PDF using any of the other
reader apps, the clipboard contains the UTF-8 bytes 7368c5ab.

If I try this with Skim, though, the clipboard contains 736875cc84. You can
verify this on the command line by pasting the copied text into, e.g.,
*echo -n "shū" | xxd -p*
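
To make the difference concrete, here is a minimal Python sketch using only
the standard unicodedata module. It assumes the two byte sequences above are
the NFC (precomposed) and NFD (decomposed) forms of the same string, which is
what the hex suggests.

```python
import unicodedata

# Start from each form and convert to the other, so the example does not
# depend on how the literal was typed or pasted.
composed = unicodedata.normalize("NFC", "shu\u0304")   # u + combining macron -> U+016B
decomposed = unicodedata.normalize("NFD", "sh\u016b")  # U+016B -> u + combining macron

print(composed.encode("utf-8").hex())    # 7368c5ab   (what the other readers give)
print(decomposed.encode("utf-8").hex())  # 736875cc84 (what Skim gives)
```

Both strings render identically; only the underlying code points differ.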

The output of *skimnotes* is likewise not normalized. So the other PDF
readers appear to normalize the text they hand out, while Skim does not.

You say that "any normalization should happen before the data was created",
but that is not how the other PDF readers I have seem to work.

Would you consider adding an option for *skimnotes* to behave like other
PDF apps w.r.t. Unicode?
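
For what it's worth, the parse, normalize, and reassemble route described in
the quoted reply below could be sketched in a few lines of Python. This is
only a sketch using the standard plistlib and unicodedata modules, not
anything skimnotes does today. The function names normalize_strings and
normalize_plist are my own, and it assumes the notes have already been
exported as a plist.

```python
import plistlib
import unicodedata


def normalize_strings(obj, form="NFC"):
    """Recursively normalize every str found in a parsed plist structure."""
    if isinstance(obj, str):
        return unicodedata.normalize(form, obj)
    if isinstance(obj, list):
        return [normalize_strings(item, form) for item in obj]
    if isinstance(obj, dict):
        return {
            normalize_strings(key, form): normalize_strings(value, form)
            for key, value in obj.items()
        }
    # Numbers, booleans, dates and raw bytes are returned unchanged.
    return obj


def normalize_plist(data: bytes, form="NFC") -> bytes:
    """Parse a binary or XML plist, normalize its strings, re-emit as XML."""
    parsed = plistlib.loads(data)
    return plistlib.dumps(normalize_strings(parsed, form), fmt=plistlib.FMT_XML)
```

Since it accepts a binary or XML plist and writes XML, it would take the place
of the plutil step. Strings stored inside binary data values are not touched.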

Thanks,

M.

On Fri, Apr 12, 2024 at 6:21 PM Christiaan Hofman <cmhof...@gmail.com>
wrote:

> That would be useless. SkimNotes does not process the data, it just copies
> it between different locations. Also passing the file through conversion
> does not work, as none of the formats involved are Unicode text files. The
> point is that the strings included as part of the data may represent
> strings in some encoding in some form. So any normalization should happen
> before the data was created (which is not an option), or by parsing the
> plist, normalizing all strings in it, and reassembling the plist.
>
> Christiaan
>
> On 12 Apr 2024, at 04:48, Mark Roberts <mroberts1...@gmail.com> wrote:
>
> Hi,
>
> Thanks for clarifying.
>
> I looked for a tool to do this, but I haven't found anything.
>
> Some people suggest running a text file through *oconv*, but that seems
> to be just a brute force approach to patch specific characters.
>
> What would you think about an option to *skimnotes* that invokes
> precomposedStringWithCanonicalMapping() or whatever the appropriate
> function is?
>
> Thanks !
>
> On Thu, Apr 11, 2024 at 11:18 PM Christiaan Hofman <cmhof...@gmail.com>
> wrote:
>
>>
>>
>> On 11 Apr 2024, at 13:31, Mark Roberts <mroberts1...@gmail.com> wrote:
>>
>> I've been using the *skimnotes* command line app to "get" skim notes as
>> a plist, and to then convert them to XML with plutil.
>>
>> One thing I've discovered is that notes in Unicode may not be normalized.
>>
>> Some apps can handle this, but some cannot.
>>
>> Question: is there a way to get skimnotes to normalize the Unicode, or
>> could you suggest an app I can use in a pipe with plutil... ?
>>
>> Thanks !
>>
>>
>> I don’t think there exists a tool to normalize Unicode strings in a
>> binary plist. I am pretty sure that any tool that may exist to convert a
>> binary plist to an XML plist goes through the same system code from Apple,
>> so they will all do the same thing. Perhaps there is a tool to post-process
>> the XML and normalize any Unicode strings in it, but I could not help you
>> there either.
>>
>> Christiaan
>>
>
> _______________________________________________
> Skim-app-users mailing list
> Skim-app-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/skim-app-users
>
