On Thu, Dec 25, 2008 at 7:30 AM, Christiaan Hofman <[email protected]> wrote:

>
> On 25 Dec 2008, at 3:39 PM, Hydro Meteor wrote:
>
> [ SNIP ]

>
> From the perspective of long-term archival of PDFs and their annotations,
> from what I've read about Apple's implementation of Extended Attributes on
> the file system, Apple supposedly conforms to *POSIX.1e* ACL, per
> http://developer.apple.com/documentation/Darwin/Reference/Manpages/man3/acl.3.html
> but if one reads the Description section of this page carefully, you'll see
> that there are differences from pure POSIX.1e:
>
> *This implementation of the POSIX.1e library differs from the standard in
> a number of non-portable ways in order to support the MacOS/Darwin ACL
> semantic.* Where possible, these differences are implemented using the
> mechanisms provided in the standard for such extensions.  Where routines are
> non-standard, they are suffixed with _np to indicate that they are not
> portable.
>
> POSIX.1e describes a set of ACL manipulation routines to manage the
> contents of ACLs, as well as their relationships with files; *almost all* of
> these support routines are implemented.
>
>
> The "almost all" and "differs from the standard in a number of non-portable
> ways ..." are concerning to me from the perspective of long-term *archiving*
> of PDF documents with annotations, which need to be preserved in
> perpetuity. One reason for concern is that open source network backup
> solutions such as Bacula < http://www.bacula.org/en/ > do not yet fully
> handle ACLs in Mac OS X (I have tried, and it's just not there yet, although
> it may be in the future). So there is the possibility of losing annotations
> when backing up and restoring.
>
>
> EAs and ACLs are not the same thing. E.g. they are often handled
> differently by copy/backup tools (some preserve both, some none, some one
> but not the other).
>

Thanks. I dug a little deeper for clarification and found some of my sys
admin notes. ACLs are a form of EAs but not all EAs are ACLs, according to
Amit Singh's Mac OS X Internals: A Systems Approach < http://osxbook.com/ >
(page 134) where Singh writes:

*ACLs – File system ACLs are supported for finer-grained and flexible
admission control when using on-disk information. Per-file ACLs are
implemented as extended attributes in the file system.*

> For Skim notes, the real question is just how EAs are handled. What's
> relevant is not the API that Apple provides but whether the backup tool
> preserves the data, that's all. It very much depends on the tool.
>
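
One way to check what a given tool preserves is to record a file's EAs before the backup and compare them after a restore. Below is a minimal Python sketch of that comparison. It rests on assumptions that are not from this thread: Python's os.listxattr and os.getxattr are only available on Linux, so on Mac OS X something like the third-party xattr module would be needed to read the attributes, and the attribute names in the sample data are made up for illustration.

```python
import os


def read_xattrs(path):
    """Return {name: bytes} for every extended attribute on `path`.

    Assumption: os.listxattr/os.getxattr exist only in Linux builds of
    Python. On Mac OS X a third-party module such as `xattr` would be
    needed instead.
    """
    if not hasattr(os, "listxattr"):
        raise NotImplementedError("os.listxattr is not available on this platform")
    return {name: os.getxattr(path, name) for name in os.listxattr(path)}


def lost_xattrs(before, after):
    """Names of attributes that are missing or changed after a restore."""
    return sorted(name for name, value in before.items()
                  if after.get(name) != value)


# Hypothetical snapshots; the attribute names are made up for illustration.
before = {"example.notes": b"note data", "example.text": b"plain text"}
after = {"example.text": b"plain text"}  # the restore dropped one attribute
print(lost_xattrs(before, after))  # prints ['example.notes']
```

An empty list from lost_xattrs would suggest that the tool kept the EAs intact for that file, at least for the attributes that were recorded beforehand.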

Great point. My Bacula notes also suggest that some EAs for files can be
backed up and restored. When it comes to quiescently creating HFS+ or HFSX
disk snapshots (using Apple's command-line tools such as hdiutil and asr),
you'll be able to capture everything on the filesystem because it's possible
to use these tools at the device (block) level.


> The FAQ on the Skim Wiki has a discussion about skim notes and backup
> tools, including a link to a test page of various tools and what they
> preserve.
>

Appreciate the links to the pages. Can I add a link on the Wiki page to the
open source Bacula project? I am probably a wee bit biased because Bacula is
a very solid industrial-strength backup and recovery tool that happens to be
just about the only one out there that is totally FOSS and also
multi-platform (runs on OS X as well as Linux). Bacula is known, however, for
not being able to back up and recover ACLs on OS X (so it may not fit
everyone's needs). I'll for sure have to test it out with Skim-generated EA
metadata for PDFs.


> If you want to be sure you won't lose the notes you can save them in the
> data of a separate .skim file (you can do this manually, or choose to always
> do this automatically). You can also convert to a PDF bundle, which is just
> a file package containing the (original) PDF and the notes in separate
> files.
>

I had not checked out Skim's Preferences previously -- but I see the ability
to automatically save Skim notes backups. That is a great feature, thanks
for including it!


>
> I think one can be pretty sure that Apple's implementation of EAs will be
> compatible with any future changes.
>

That's probably a reasonable assumption considering that starting with
Leopard, Mac OS X is "an Open Brand UNIX 03 Registered Product, conforming
to the SUSv3 and POSIX 1003.1 specifications for the C API, Shell Utilities,
and Threads" < http://www.apple.com/macosx/technology/unix.html >.

It will be interesting to see what if anything changes on this score once
Snow Leopard is released :-)

> Skim's ability to separate PDF from annotations is *excellent*. This allows
> for archival preservation of the original PDF document in a library system,
> with the annotations recombined with the original as a separate process
> at a later time, for example. While the export options for annotations in
> the form of text, RTF, and RTFD are greatly appreciated, the only format for
> round-tripping annotations in and out of a PDF document is FDF (i.e., Skim
> only parses FDF (Forms Data Format)). I can't quite discern if FDF is an
> open format / ISO standard.
>
>
> FDF is actually just a simplified form of PDF (in fact, Skim reads FDF by
> replacing the first "F" in "FDF" by "P" and reading it as PDF), so it's just
> as open as PDF (though I'm not 100% sure if it's an ISO standard).
>
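
The header swap described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Skim's actual code: the function name is hypothetical, and the sample bytes are a hand-written minimal FDF file rather than real Skim output.

```python
def fdf_as_pdf(data: bytes) -> bytes:
    """Return FDF bytes with the "%FDF-" header rewritten to "%PDF-"."""
    marker = b"%FDF-"
    if not data.startswith(marker):
        raise ValueError("data does not start with an FDF header")
    # Only the header is touched; the /FDF key in the body stays as it is.
    return b"%PDF-" + data[len(marker):]


# Hand-written minimal FDF file, used only as sample input.
sample = (b"%FDF-1.2\n1 0 obj\n<< /FDF << >> >>\nendobj\n"
          b"trailer\n<< /Root 1 0 R >>\n%%EOF\n")
converted = fdf_as_pdf(sample)
print(converted[:8])  # prints b'%PDF-1.2'
```

Whether a given PDF parser then accepts the result is a separate question; the sketch only shows how small the difference at the start of the file is.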

I'll see if I can find out (not sure how to get a hold of the ISO 32000
document but I'll try). My guess is that FDF is part of the standard because
the Adobe 1.7 Reference document includes an entire section (8.6.6) on Forms
Data Format. Here's a copy of the brief introduction about FDF:

*8.6.6 Forms Data Format*

This section describes Forms Data Format (FDF), the file format used for
interactive form data (PDF 1.2). FDF is used when submitting form data to a
server, receiving the response, and incorporating it into the interactive
form. It can also be used to export form data to stand-alone files that can
be stored, transmitted electronically, and imported back into the
corresponding PDF interactive form. In addition, *beginning in PDF 1.3, FDF
can be used to define a container for annotations that are separate from the
PDF document to which they apply*.

FDF is based on PDF; it uses the same syntax (see Section 3.1, "Lexical
Conventions") and basic object types (Section 3.2, "Objects"), and has
essentially the same file structure (Section 3.4, "File Structure"). However,
it differs from PDF in the following ways:

• The cross-reference table (Section 3.4.3, "Cross-Reference Table") is
optional.

• FDF files cannot be updated (see Section 3.4.5, "Incremental Updates").
Objects can only be of generation 0, and no two objects can have the same
object number.

• The document structure is much simpler than PDF, since the body of an FDF
document consists of only one required object.

• The length of a stream may not be specified by an indirect object.

FDF uses the MIME content type application/vnd.fdf. On the Windows and
UNIX platforms, FDF files have the extension .fdf; on Mac OS, they have
file type 'FDF '.
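
To make the differences listed above concrete, here is a minimal Python sketch that writes an FDF file with a single required object, generation 0, and no cross-reference table. The function names and the sample field are hypothetical. For simplicity it carries form field values rather than annotations, and it assumes the text can be encoded as Latin-1.

```python
def pdf_string(text: str) -> bytes:
    """Encode text as a PDF literal string, escaping backslashes and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Assumption: the text is ASCII/Latin-1; other text needs a different encoding.
    return b"(" + escaped.encode("latin-1") + b")"


def minimal_fdf(fields: dict) -> bytes:
    """Build an FDF file with one catalog object and no xref table."""
    entries = b" ".join(
        b"<< /T " + pdf_string(name) + b" /V " + pdf_string(value) + b" >>"
        for name, value in fields.items()
    )
    return b"\n".join([
        b"%FDF-1.2",
        b"1 0 obj",            # the one required object, generation 0
        b"<< /FDF << /Fields [ " + entries + b" ] >> >>",
        b"endobj",
        b"trailer",            # no cross-reference table precedes the trailer
        b"<< /Root 1 0 R >>",
        b"%%EOF",
        b"",                   # ends the file with a newline
    ])


# Hypothetical field name and value, for illustration only.
print(minimal_fdf({"title": "Notes (draft)"}).decode("latin-1"))
```

The output is a handful of lines that can be read in a text editor, which is part of what makes the format attractive for exchange, whatever its limits for Skim's own note types.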



>
> Be careful though to use FDF for backup, because it may lead to data loss,
> as there's not a complete 1-to-1 mapping between Skim notes and PDF/FDF
> annotations (especially anchored notes do not exist in PDF). Only the Skim
> Notes export type is completely data preserving (as it's the same data as
> what's saved in the EAs).
>

Thanks for the heads up. Apparently FDF is not capable of handling anchored
notes. I noticed that Apple's Preview app (on Leopard) provides the ability
to add anchored notes to PDFs. Does PDFKit provide API hooks for adding
anchored notes, which Skim uses and which Preview probably uses as well? In
other words, are anchored notes pretty much specific to PDFKit? I wonder how
Adobe Acrobat (assuming Acrobat also provides anchored notes?) structures
and saves them (if not as FDF)?


>
> The Skim notes format is a proprietary format from Skim. But it is
> completely open, Skim is OSS, and the format for Skim notes is completely
> described on the Wiki. Moreover, a library to read and write them including
> the source code is available from the site. It uses only standard Cocoa and
> the BSD library for EAs. So it would always be possible, at a minimum, to
> convert Skim notes to whatever you want (including PDF annotations).
>

[SNIP]

> Using the normal Save (or Export as PDF), Skim does not touch the original
> PDF data at all. The same is true for export without notes. For export with
> embedded notes, Skim fully relies on Apple's PDFKit, so there's absolutely
> no control over it.
>
> Actually, PDFKit does a very bad job exporting PDF annotations (I'm talking
> about Leopard, on Tiger it's not even possible). The saved notes are
> actually changed. This is one more reason why Skim doesn't use it.
>

That seems quite reasonable. So "standard" Cocoa excludes PDFKit? Given what
you've written about PDFKit, I can understand the reasons for the Skim
notes format all the more! My curiosity is probably unusual in that I'm
looking at preservation of annotations from the perspective of an archivist
(such that these notes could be looked at, potentially, 100 years from now).
Most people have shorter-term horizons!


> You may see this in Preview, though it may not be immediately obvious
> because Preview does not support the types of annotations for which this is
> the worst (such as lines and freehand notes), while it uses workarounds for
> other types of notes (like highlights). To see the problems, try exporting a
> file containing various types of notes as PDF With Embedded Notes, then
> reopen that file in Skim and choose File > Convert Notes. Try editing the
> notes afterwards (especially line notes and multi-line underlines). (Convert
> Notes will fix this in the next release.)
>

I tried what you suggested. Yuck regarding PDFKit. Skim notes all the way.
When you came up with the Skim format, had you looked at XML-based SVG by
any chance? SVG hasn't really taken off (several years ago I thought it
would), and I'm not sure why. Any thoughts as to why? Ironically, Adobe was
behind SVG as a member of the W3C SVG committee years ago if I'm not
mistaken.

> I think PDFKit currently supports PDF 1.3 features only. The fact that this
> is lower than 1.7 is not a problem though, quite to the contrary. PDF 1.3 is
> a strict subset of PDF 1.7; PDF 1.7 just adds new features (such as
> interactive features). The essence is that it ALLOWS more, it does not
> REQUIRE more. In other words, PDF 1.3 is always valid as PDF 1.7.
>
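
The "allows more, does not require more" point can be illustrated with a small version check. This Python sketch reads the version from the file header and tests whether it is at or below what a reader supports. The function names are hypothetical, and it looks only at the header; from PDF 1.4 on, a Version entry in the document catalog can override the header, which this sketch ignores.

```python
import re


def pdf_header_version(data: bytes):
    """Return (major, minor) from a "%PDF-M.N" header, or None if absent."""
    match = re.search(rb"%PDF-(\d+)\.(\d+)", data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def within_supported(data: bytes, supported=(1, 7)) -> bool:
    """True if the header version is at or below the supported version."""
    version = pdf_header_version(data)
    return version is not None and version <= supported


print(within_supported(b"%PDF-1.3\n..."))          # prints True
print(within_supported(b"%PDF-1.7\n...", (1, 3)))  # prints False
```

So a file written as PDF 1.3 passes for a reader that supports 1.7, while the reverse does not hold in general.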

Great! Unusually minded archivists and curators like me can now rest more
easily at night :-)

> When it comes to document preservation and annotations of those documents,
> this is the type of stuff that archivists worry about, and correctly so
> (decades may seem far away, but they will be here sooner rather than later)!
>
>
> If you hold on to a version of Skim, you will always be able to read Skim
> notes. Even if Skim in the future uses PDF rather than .skim notes
> (which I think won't happen), there will be some old version available that
> can convert the notes; in fact, there should then be a simple conversion tool
> available, perhaps embedded in the skimnotes tool.
>

Even if Apple were to disappear decades from now, there will always be a
museum somewhere that will have a Mac with Tiger or Leopard on it along with
the Xcode tools with the standard Cocoa libraries to link to and compile
with gcc ;-)

By the way, thank you (and the other contributors) for such a great app in
Skim. From a reading standpoint, 1.) the Automatic Resize is fantastic, such
that the content of a PDF page is resized to the viewing pane that it's in,
and 2.) Skim is the best I've ever seen when it comes to Full Screen mode --
I can take a page in Skim, go Full Screen, and then Command + will literally
scale it up to the full physical display size (sans vertical scroll bar on
the right hand side). And then on top of it all, I can still annotate this
fully scaled page and I can even use the Magnify bubble. That is just sweet!
My Grandfather passed away at age 90 several years ago and in his older age
we got him a large CRT (an old TV set) which was equipped with a video input
from a camera that would aim down and a light shining on it (so he could
take a sheet of paper and have it magnified on the CRT to help him with his
vision). If Skim was around then, we could have gotten him a 30 inch Cinema
Display and he would have loved it!

It's a joy to have discovered Skim a few days ago (I can only imagine things
will get more interesting as Apple brings multi-touch to more of the
Mac world and perhaps they'll finally bring a tablet out as light as the Air
for way cool annotation capability?).

Cheers,

Hydro


[SNIP]
------------------------------------------------------------------------------
_______________________________________________
Skim-app-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/skim-app-users
