Hi, this is very helpful. Thanks!
I just tried your suggestion and got an XML file as expected. I more or
less understand all the elements of the XML, but it seems the entire note
is in a single <string> element, while the quadrilateralPoints for the
highlighting boxes are stored separately.

What I was hoping to do is get each line of my note, look for a hyphen at
the end of each line, and trim that hyphen where necessary. The objective
is to clean up the Skim note by eliminating the line-break hyphens that
come from the source text.

Any ideas about how I could do this?

Thanks again,

M.

On Mon, Mar 14, 2022 at 7:27 PM Christiaan Hofman <cmhof...@gmail.com> wrote:

> On 14 Mar 2022, at 11:13, Christiaan Hofman <cmhof...@gmail.com> wrote:
>
> On 14 Mar 2022, at 10:56, Christiaan Hofman <cmhof...@gmail.com> wrote:
>
> On 14 Mar 2022, at 10:50, Christiaan Hofman <cmhof...@gmail.com> wrote:
>
> On 14 Mar 2022, at 04:49, Mark Roberts <mroberts1...@gmail.com> wrote:
>
> Is there some way to get more detailed information about skim notes, i.e.,
> other than the code framework?
>
> I have tried the skimnotes command line tool (e.g., the 'get' and 'format'
> commands), but it seems to only output the basic information about notes,
> such as the note type, page number, and note text.
>
> Perhaps(?) there's another mode for the skimnotes tool, but I couldn't
> find it from reading the documentation.
>
> I'd like to get more complete data on each note, such as a timestamp, the
> coordinates of the boxes that are highlighted in the PDF file, the
> highlight color, and the text contained in each box.
>
> I assume(?) this data is in the notes file, but the skimnotes app ignores
> it for now.
>
> I'm wondering about this because if possible I'd like to make a script
> that gathers my notes for a PDF file, and tries to fix words that were
> broken by hyphenation in the original PDF. If I can get the highlight boxes
> in the notes file, and the text in each box, then it should be possible to
> check for a hyphen character at the end of each line, and then stitch
> together the words that were split across lines.
>
> Any suggestions?
>
> Thanks in advance,
>
> M.
>
>
> The skimnotes tool is not a tool that can interpret the data. It only
> copies the data around to various locations that are supported (such as
> between extended attributes, .skim files, or within a .pdfd bundle). There
> is no tool to interpret the data. The Wiki has information about how the
> data is formatted. You could try to build your own tool to unarchive the
> data from that, but that would be quite a bit of work.
>
> Christiaan
>
>
> I can also note that in the near future the skim notes will be saved in a
> plist format, which can be read by various tools and apps, including
> AppleScript. You can already have Skim do that by activating a hidden
> preference, see the Wiki for details.
>
> Christiaan
>
>
> I just remembered that the skimnotes tool *can* convert to the plist
> format, which you may be able to read, using the 'skimnotes format'
> command. 'skimnotes format plist SKIM_FILE' can do that. The help for
> skimnotes does not say so, but you can also get the skim notes plist
> format directly from the skimnotes tool as follows:
>
>     skimnotes get plist PDF_FILE SKIM_FILE
>
> This will get you a plist file in SKIM_FILE. Perhaps for other tools to
> read it you have to change the extension to .plist. You could also then
> pass it through plutil to convert the binary plist to xml plist (plutil
> -convert xml1 PLIST_FILE), which would even be human readable. You could
> combine that to get the skimnotes in xml format as follows:
>
>     skimnotes get plist PDF_FILE - | plutil -format xml1 -o PLIST_FILE -
>
> Christiaan
>
>
> Small correction, I messed up the '-format' arguments to the commands. It
> should be added in skimnotes, and in plutil it is -convert:
>
>     skimnotes get -format plist PDF_FILE SKIM_FILE
>
>     plutil -convert xml1 PLIST_FILE
>
>     skimnotes get -format plist PDF_FILE - | plutil -convert xml1 -o PLIST_FILE -
>
> If you want to go the reverse way, and write the xml plist data as skim
> notes, you could do:
>
>     plutil -convert binary1 -o - PLIST_FILE | skimnotes set PDF_FILE -
>
> Christiaan
>
> _______________________________________________
> Skim-app-users mailing list
> Skim-app-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/skim-app-users
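
For the hyphen clean-up asked about at the top of this message, a minimal Python sketch might look like the following. It is only an illustration under several assumptions that the thread does not confirm: that the plist produced by the skimnotes command quoted above is a list of dictionaries, that the note text lives under a key called "contents" (a guess, passed in as the hypothetical text_key parameter so it can be changed), and that line breaks in the note text are stored as newline characters. The function names dehyphenate and clean_notes are made up for this example.

```python
import plistlib
import re

# A hyphen that directly follows a word character, sits at the end of a line,
# and is followed by a word character at the start of the next line.
_LINE_BREAK_HYPHEN = re.compile(r"(?<=\w)-[ \t]*\r?\n[ \t]*(?=\w)")


def dehyphenate(text):
    """Join words that were split by a hyphen at a line break."""
    return _LINE_BREAK_HYPHEN.sub("", text)


def clean_notes(plist_bytes, text_key="contents"):
    """Parse plist data and de-hyphenate the text of each note.

    ASSUMPTIONS (not confirmed in this thread): the top-level object is a
    list of dictionaries, and text_key names the entry holding the note text.
    """
    # plistlib.loads detects binary and XML plists automatically.
    notes = plistlib.loads(plist_bytes)
    for note in notes:
        if isinstance(note, dict) and isinstance(note.get(text_key), str):
            note[text_key] = dehyphenate(note[text_key])
    return notes
```

The cleaned list could be written back out with plistlib.dumps and, if desired, fed to the 'skimnotes set' command quoted above. One caveat of this approach: a genuine compound that happens to break at its hyphen (for example "well-" followed by "known" on the next line) would also be joined, so the result may need a manual check.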