On Mon, 2007-06-18 at 13:20 -0700, Jason Kivlighn wrote:
> Whoops, I forgot the intro on this.
> 
> This is my progress so far on extracting licenses from various
> formats.  Jamie, I'd like your thoughts on adding new extractors
> (besides the ones mentioned below, GIF is another I have in mind,
> though I'm not sure whether it's worthwhile).  I don't want to add
> bloat...
> 
> Cheers,
> Jason

Jamie, please drop us a line to discuss this project. Did you get the
chat time invite?

jon

> Jason Kivlighn wrote:
> > Hi,
> >
> > imagemagick: Uses 'convert filename xmp:-' to output an image's embedded
> > XMP.  This works for at least JPEG and TIFF files.  For JPEGs, however,
> > ImageMagick outputs the namespace and the XMP, separated by \0.  I'm not
> > sure how to handle this without simply assuming that 'convert'
> > returned two null-terminated strings.  Nevertheless, this extracts the
> > XMP from TIFF files.
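
A minimal sketch of one way to handle the JPEG case without assuming exactly
two strings: strip a known namespace prefix only when it is actually present.
It is written in Python for brevity rather than as extractor code.  The
prefix value is the identifier that JPEG APP1 XMP segments start with; that
'convert' emits exactly this prefix is an assumption, and the function names
are hypothetical.

```python
import subprocess

# Identifier that precedes the XMP packet inside a JPEG APP1 segment.
# Assumption: this is the "namespace" that 'convert' prints before the XMP.
XMP_JPEG_NAMESPACE = b"http://ns.adobe.com/xap/1.0/"


def strip_xmp_namespace(raw: bytes) -> bytes:
    """Return the XMP packet, dropping a leading namespace + NUL if present."""
    prefix = XMP_JPEG_NAMESPACE + b"\0"
    if raw.startswith(prefix):
        raw = raw[len(prefix):]
    # Drop any trailing NUL terminator(s) left over from the raw profile.
    return raw.rstrip(b"\0")


def read_xmp(filename: str) -> bytes:
    """Run 'convert filename xmp:-' (assumed to be on PATH) and clean the output."""
    result = subprocess.run(
        ["convert", filename, "xmp:-"], capture_output=True, check=True
    )
    return strip_xmp_namespace(result.stdout)
```

With this shape, TIFF output (no prefix) passes through unchanged, so both
formats can share the same code path.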
> >
> > msoffice: Extends the msoffice extractor to also parse the
> > DocumentSummaryInformation infile, which contains user-defined metadata,
> > along with license metadata embedded by the MSOffice Creative Commons Add-in.
> >
> > pdf: Extends the pdf extractor to read a PDF's metadata stream and parse
> > it as XMP.  I'm still waiting for poppler to extend its glib bindings to
> > allow reading the metadata stream.  Until then, the extractor will simply
> > never find the metadata stream and will carry on without error.
> >
> > png: Adds a check for the XML:com.adobe.xmp iTXt field, and parses it as
> > XMP.
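
A rough sketch of the iTXt lookup described above, in Python rather than as
extractor code.  It walks the PNG chunk list and returns the text of the
first iTXt chunk whose keyword is XML:com.adobe.xmp (the spelling used by the
XMP specification).  The function name is hypothetical, and CRCs are not
verified here.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
XMP_KEYWORD = b"XML:com.adobe.xmp"


def find_png_xmp(data: bytes):
    """Return the XMP packet stored in an iTXt chunk, or None if absent."""
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length
        if chunk_type == b"IEND":
            break
        if chunk_type != b"iTXt":
            continue
        # iTXt: keyword NUL, compression flag, compression method,
        # language tag NUL, translated keyword NUL, then the text.
        keyword, sep, rest = body.partition(b"\0")
        if not sep or keyword != XMP_KEYWORD or len(rest) < 2:
            continue
        compressed = rest[0] == 1
        fields = rest[2:].split(b"\0", 2)
        if len(fields) != 3:
            continue
        text = fields[2]
        if compressed:
            text = zlib.decompress(text)
        return text.decode("utf-8")
    return None
```

The returned string can then be handed to whatever XMP parser the other
extractors already use.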
> >
> > html: Adds a new html parser using libxml2.  Parses the document,
> > checking for RDFa licenses.  It also checks for other basic HTML
> > properties like title and author.
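
For illustration, a small sketch of the simplest RDFa license check: collect
the href of any <a> or <link> element carrying rel="license", plus the
document title.  It uses Python's standard html.parser instead of libxml2,
the class and function names are hypothetical, and it ignores the rest of
RDFa (about, property, and so on).

```python
from html.parser import HTMLParser


class LicenseScanner(HTMLParser):
    """Collect rel="license" links and the document title."""

    def __init__(self):
        super().__init__()
        self.licenses = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. "license nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag in ("a", "link") and "license" in rel_values:
            href = attributes.get("href")
            if href:
                self.licenses.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan_html(text: str):
    """Return (title, list of license URLs) for an HTML document."""
    scanner = LicenseScanner()
    scanner.feed(text)
    scanner.close()
    return scanner.title.strip(), scanner.licenses
```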
> >
> > There are also several XML formats I'd like to parse for license data,
> > particularly SVG and SMIL.  Would this be doable, and if so, how should
> > I go about it?  Should I write new extractors for each format, or is this
> > too much overhead?  These could use GMarkupParser, rather than bringing
> > in libxml2 like the HTML parser.
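
One possible shape for a shared XML check, sketched in Python with
xml.etree.ElementTree rather than GMarkupParser: scan any XML document for
cc:license elements and read their rdf:resource attribute.  It assumes the
license is embedded as RDF in the way Creative Commons metadata usually
appears in SVG; whether SMIL files carry it the same way is an assumption,
and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"
# Both the current and the older Creative Commons RDF namespaces.
CC_NAMESPACES = (
    "http://creativecommons.org/ns#",
    "http://web.resource.org/cc/",
)


def find_xml_licenses(xml_text: str) -> list:
    """Return the license URIs named by cc:license elements in an XML document."""
    root = ET.fromstring(xml_text)
    wanted = {"{%s}license" % namespace for namespace in CC_NAMESPACES}
    found = []
    for element in root.iter():
        if element.tag in wanted:
            uri = element.get(RDF_RESOURCE)
            if uri:
                found.append(uri)
    return found
```

Because the check keys on the namespaced element rather than the document
type, a single extractor along these lines might cover several XML formats.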
> >
> > Cheers,
> > Jason
> >
> >   
> 
> _______________________________________________
> tracker-list mailing list
> [EMAIL PROTECTED]
> http://mail.gnome.org/mailman/listinfo/tracker-list
> 
-- 
Jon Phillips

San Francisco, CA
USA PH 510.499.0894
[EMAIL PROTECTED]
http://www.rejon.org

MSN, AIM, Yahoo Chat: kidproto
Jabber Chat: [EMAIL PROTECTED]
IRC: [EMAIL PROTECTED]

_______________________________________________
cc-devel mailing list
[email protected]
http://lists.ibiblio.org/mailman/listinfo/cc-devel
