On Fri, Dec 26, 2008 at 1:21 PM, Christiaan Hofman <[email protected]> wrote:

>
> On 26 Dec 2008, at 7:59 PM, Hydro Meteor wrote:
>
>
>
> On Thu, Dec 25, 2008 at 7:30 AM, Christiaan Hofman <[email protected]> wrote:
>
>>
>> On 25 Dec 2008, at 3:39 PM, Hydro Meteor wrote:
>>
>> [ SNIP ]
>
>>
>> From the perspective of long-term archival of PDFs and their annotations,
>> from what I've read about Apple's implementation of Extended Attributes on
>> the file system, Apple supposedly conforms to *POSIX.1e* ACL, per
>> http://developer.apple.com/documentation/Darwin/Reference/Manpages/man3/acl.3.html but
>> if one reads the Description section of this page carefully, you'll see
>> that there are differences from pure POSIX.1e:
>>
>> *This implementation of the POSIX.1e library differs from the standard in
>> a number of non-portable ways in order to support the MacOS/Darwin ACL
>> semantic. * Where possible, these differences are implemented using the
>> mechanisms provided in the standard for such extensions.  Where routines are
>> non-standard, they are suffixed with _np to indicate that they are not
>> portable.
>>
>> POSIX.1e describes a set of ACL manipulation routines to manage the
>> contents of ACLs, as well as their relationships with files; *almost all* of
>> these support routines are implemented.
>>
>>
>> The "almost all" and "differs from the standard in a number of
>> non-portable ways ..." is concerning to me from the perspective of long-term
>> *archiving* of PDF documents with annotations, which need to be preserved
>> in perpetuity. One reason for concern is
>> that open source network backup solutions such as Bacula <
>> http://www.bacula.org/en/ > do not yet fully handle ACLs in Mac OS X (I
>> have tried and it's just not there yet, although it may be in the future). So
>> there is the possibility of losing annotations when backing up and
>> restoring.
>>
>>
>> EAs and ACLs are not the same thing. E.g. they are often handled
>> differently by copy/backup tools (some preserve both, some none, some one
>> but not the other).
>>
>
> Thanks. I dug a little deeper for clarification and found some of my sys
> admin notes. ACLs are a form of EAs but not all EAs are ACLs, according to
> Amit Singh's Mac OS X Internals A Systems Approach < http://osxbook.com/ >
> (page 134) where Singh writes:
>
> *ACLs – File system ACLs are supported for finer-grained and flexible
> admission control when using on-disk information. Per-file ACLs are
> implemented as extended attributes in the file system.*
>
> For Skim notes, the real question is just how EAs are handled. What's
>> relevant is not the API that Apple provides but whether the backup tool
>> preserves the data, that's all. It very much depends on the tool.
>>
>
> Great point. My Bacula notes also suggest that some EAs for files can be
> backed up and restored. When it comes to quiescently creating HFS+ or HFSX
> disk snapshots (using Apple's command-line tools such as hdiutil and asr),
> you'll be able to capture everything on the filesystem because it's possible
> to use these tools at the device (block) level.
>
>
>
> ACLs seem to be implemented as some kind of special EAs, that are somewhat
> hidden from the user (at least by the xattr tool, perhaps even by the BSD
> library). Maybe that's why bacula doesn't copy them. However Skim uses
> ordinary EAs.
>

At some point sooner rather than later I will get around to testing Skim EAs with
Bacula. I will confirm my observations to this mailing list and perhaps
cross post to the Bacula mailing list as it might be useful for people to
learn some heuristics on the Bacula mailing list about various EA flavors.


>
> The FAQ on the Skim Wiki has a discussion about skim notes and backup
>> tools, including a link to a test page of various tools and what they
>> preserve.
>>
>
> Appreciate the links to the pages. Can I add a link to the Wiki page to the
> open source Bacula project?
>
>
>
> Not ATM, as the wiki is currently uneditable.
>

Would you please add it? I think Bacula deserves to be listed among the
resources (I've worked with it quite extensively on Mac OS X although I'm
still learning about Bacula). I am not a Bacula cult member, it's just that I
find it to be quite amazing and I'd like for more people in the Mac OS X
community (especially those who appreciate FOSS <
http://en.wikipedia.org/wiki/FOSS >) to be aware of it.

[SNIP]


> Thanks for the heads up. Apparently FDF is not capable of handling anchored
> notes.
>
>
> Anchored notes are our invention, and they predate our use of PDF
> annotations.
>

> I noticed that Apple's Preview app (on Leopard) provides the ability to add
> anchored notes to PDFs.
>
>
> Those are not anchored notes but notes of type Text. Anchored notes are
> implemented as a subclass of Text notes, and the difference is that anchored
> notes can have an image and an additional rich text property (apart from the
> plain text string).
>
> Does PDFKit provide some API hooks into adding anchored notes, which Skim
> is availing of, which is probably what Preview is also availing of? In other
> words, are anchored notes pretty much specific to PDFKit? I wonder how Adobe
> Acrobat (assuming Acrobat also provides anchored notes?) structures and
> saves them (if not as FDF)?
>
>
>
> As we invented anchored notes, neither Adobe nor PDFKit knows about them.
> When Skim saves with embedded notes or exports to FDF, the anchored notes
> are saved as Text annotations, which are the closest PDF/FDF have to offer.
> That's why it loses information. Another thing that's lost is transparency
> in colors, because PDF/FDF doesn't support that. Moreover, the font of
> (Skim's) text notes may get lost.
>

I appreciate learning that anchored notes were invented by the Skim team! This is quite
educational and a fresh reminder that big corporations don't always invent
great things.

> The Skim notes format is a proprietary format from Skim. But it is
>> completely open, Skim is OSS, and the format for Skim notes is completely
>> described on the Wiki. Moreover, a library to read and write them including
>> the source code is available from the site. It uses only standard Cocoa and
>> the BSD library for EAs. So it would always be possible to as a minimum be
>> able to convert Skim notes to whatever you want (including PDF annotations).
>>
>
> [SNIP]
>
> Using the normal Save (or Export as PDF), Skim does not touch the original
>> PDF data at all. The same is true for export without notes. For export with
>> embedded notes, Skim fully relies on Apple's PDFKit, so there's absolutely
>> no control over it.
>>
>> Actually, PDFKit does a very bad job exporting PDF annotations (I'm
>> talking about Leopard, on Tiger it's not even possible). The saved notes are
>> actually changed. This is one more reason why Skim doesn't use it.
>>
>
> That seems quite reasonable. So "standard" Cocoa excludes PDFKit?
>
>
> Yes, it's just Foundation and AppKit. Note that that's also just what's
> available as the cross-platform GNUstep.
>

Great. So is it possible, in theory and in practice, to compile and run Skim
on GNUstep? Has anyone tried? With today's rich world of virtual machine
technology, it might be worth giving it a shot in a VMware or Parallels VM?

Given what you've written about PDFKit, I can understand all the more reason
for Skim notes format! My curiosity is probably unusual in that I'm looking
at preservation of annotations from the perspective of an archivist (such
that these notes could be looked at, potentially, 100 years from now). Most
people have shorter term horizons!


100 years is extremely long for computing, and I don't think anything on my
> computer will last that long. In fact, I wouldn't even know how to recover
> my files on floppy disks from a mere 15 years ago!
>

Physical media preservation is one thing, but file formats and software
executables are another. I agree with you about floppies, but think about
how there are still to this day programs running in Cobol and Fortran (and
operating on data that may be very old historically). In fact, I would argue
that data archival and recovery is going to become ever more
important given the latest "financial crisis" the world is in (which has
been attributed in part to using 200 year old statistical models in
economics -- the Alan Greenspans and the Milton Friedmans of the world were
trained in the 60s and 70s to think that financial markets were mostly
Gaussian and didn't experience kurtosis à la "fat tails", etc.). This may
seem odd to you given your mathematics background (as it does to me given my
background in atmospheric science), but economists don't generally take into
account chaos theory / perturbations / probabilistic outcomes. For a great
article see:

*"Economics needs a scientific revolution" *

http://www.nature.com/nature/journal/v455/n7217/full/4551181a.html

*Financial engineers have put too much faith in untested axioms and faulty
> models, says Jean-Philippe Bouchaud. To prevent economic havoc, that needs
> to change.
>
> Compared with physics, it seems fair to say that the quantitative success
> of the economic sciences has been disappointing. Rockets fly to the Moon;
> energy is extracted from minute changes of atomic mass.
> *


Besides economics likely changing (and thus the importance of
archiving), don't forget that 100 years is child's play in terms of the
scale of climatology (as our friends up north at UND in Grand Forks at the
Center for Aerospace Sciences can attest to).

Thus, if an organization (such as a university or a research lab) has built
up years and years worth of PDFs and important annotations of those PDFs,
archiving both the PDFs and the annotations well into the future (beyond a
human lifetime or two) is really important. Imagine a university professor
today who makes an incredibly insightful annotation on a PDF
document about, say, the financial crisis or about global
climate change, and what if that professor passes away unexpectedly -- a
student years from now should be able to discover that professor's
annotations for use in his or her research which might lead to further
revelations and insights!

This is why I approached Skim and PDF from the ISO standard viewpoint -- as
crazy as it may have seemed at first. Also this is why I happen to be
adamant about open source (such that the source code can always be compiled
independent of platform, independent of corporation) and why I love
OpenDocument (to be free of the shackles of Microsoft with regard to Word
processing once and for all; see also <
http://www.boston.com/business/technology/articles/2005/09/02/state_may_drop_office_software/
>)!


>
> You may see this in Preview, though it may not be immediately obvious
>> because Preview does not support the types of annotations for which this is
>> the worst (such as lines and freehand notes), while it uses workarounds for
>> other types of notes (like highlights). To see the problems, try exporting a
>> file containing various types of notes as PDF With Embedded Notes, then
>> reopen that file in Skim and choose File > Convert Notes. Try editing the
>> notes afterwards (especially line notes and multi-line underlines). (Convert
>> Notes will fix this in the next release.)
>>
>
> I tried what you suggested. Yuck regarding PDFKit.
>
>
> Development of PDFKit is pretty slow. I was very much disappointed by
> Leopard's improvements, I'd expected a lot more. I got the impression
> they've got just one guy working on it, at least he's complaining about
> Apple giving too few man hours.
>

That's very interesting to know. Thanks for sharing. Perhaps not too
surprising. I recall Leopard was delayed from its original intended release
date because Apple needed to borrow some engineers and human power for the
iPhone. This is another good example of why it's tricky to count on
corporations -- they will ebb and flow to do what's in their best interest,
which is fine (not a judgment), but it's also why it's great if we can compile against
FOSS (e.g., GNUstep).

Skim notes all the way. When you came up with the Skim format, had you
looked at XML-based SVG by any chance? SVG hasn't really taken off (several
years ago I thought it would), and I'm not sure why. Any thoughts as to why?
Ironically, Adobe was behind SVG as a member of the W3C SVG committee years
ago if I'm not mistaken.


I don't really think SVG is appropriate for these notes, it certainly cannot
> capture everything. E.g. anchored notes, and colors.
>

Perhaps XML (the "XML for everything" mantra) pushed by the W3C got carried away (SOAP,
SVG, XUL, SMIL, OWL, RDF, etc, etc.) ... committees are not known to always
chalk up efficiency.

> I think PDFKit currently supports PDF 1.3 features only. The fact that this
>> is lower than 1.7 is not a problem though, quite to the contrary. PDF 1.3 is
>> a strict subset of PDF 1.7; PDF 1.7 just adds new features (such as
>> interactive features). The essence is that it ALLOWS more, it does not
>> REQUIRE more. In other words, PDF 1.3 is always valid as PDF 1.7.
>>
>
> Great! Unusually minded archivists and curators like me can now rest more
> easily at night :-)
>
>>  When it comes to document preservation and annotations of those
>> documents, this is the type of stuff that archivists worry about, and
>> correctly so (decades may seem far away, but they will be here sooner than
>> later)!
>>
>>
>> If you hold on to a version of Skim, you will always be able to read Skim
>> notes. Even if Skim in the future will use PDF rather than .skim notes
>> (which I think won't happen), there will be some old version available that
>> can convert the notes, in fact there should then be a simple conversion tool
>> available, perhaps embedded in the skimnotes tool.
>>
>
> Even if Apple were to disappear decades from now, there will always be a
> museum somewhere that will have a Mac with Tiger or Leopard on it along with
> the Xcode tools with the standard Cocoa libraries to link to and compile
> with gcc ;-)
>
>
> And moreover, Apple's implementation is not the only one, there's also
> GNUStep's.
>

As aforementioned, Skim should be able to compile under GNUStep?

[SNIP]

You're welcome.
>
> Christiaan
>

Cheers,

Hydro
------------------------------------------------------------------------------
_______________________________________________
Skim-app-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/skim-app-users