That sounds like a good plan. Calling external libraries will definitely make programming faster (which right now is more important than execution speed).

Luke

On 3/21/07, Jason Kivlighn <[EMAIL PROTECTED]> wrote:
> I think I've settled on Tracker. I got an okay from them as well as
> someone who volunteered to mentor me with Tracker code while working
> under Creative Commons.
>
> I like the idea of separating it into two parts. Since there are so many
> indexers out there, separating the parser means we have an
> application/library that any indexer can use. Looking at Tracker's
> infrastructure it should work nicely. Even using Tracker, cc-sharp may
> come in handy, since Tracker can call external processes to extract the
> search data. Here's the list of formats I was hoping to support: MP3,
> OGG, RSS, SVG, HTML, XML, JPEG, PDF, SMIL. The big problem I see with
> cc-sharp is working with C#. I'd consider myself fairly fluent in
> C, C++, Java, and Python.
>
> I notice that ccPublisher already attaches licenses, and ccLookup reads
> licenses in anything with RDF metadata as well as in mp3s. In response
> to your second email, Luke, it might work to extend ccLookup to support
> more formats and then have the Tracker extractor call this program.
> Then I'm sticking with a high-level language I'm familiar with.
> However, I'm not sure if that will bode well for performance.
> The extraction process needs to be fast, so a C library might be a
> better option. Given the scope of formats, our extractor would be run
> quite often for the typical desktop.
>
> The Tracker code base from what I've seen looks very manageable, but I
> hope to get more feedback from the Tracker folks soon.
>
> Cheers,
> Jason
>
> > Jason,
> >
> > I did something similar to this last year for SoC and it resulted in a
> > new CC library called cc-sharp:
> > http://code.google.com/p/cc-sharp/
> >
> > So your project could have two parts: 1) the license handling and then
> > 2) integrating that data with the desktop search application. If you
> > wanted to use C# (Beagle), I'd help flesh out cc-sharp with you and
> > you could work on the integration.
> >
> > The other C# CC lib around is CCLicenseLib which hasn't been developed
> > in four years.
> > http://workspaces.gotdotnet.com/cclib
> >
> > It contains object representations of the older CC licenses. It would
> > be nice to make one condensed lib for CC stuff in C# so developers for
> > other projects could easily integrate with their software. I see it
> > being laid out as such:
> >
> > - Attaching licenses to media
> > - Reading licenses from media
> > - Verifying licenses
> >
> > This desktop search idea would primarily use reading and verifying.
> > Right now all cc-sharp does is verify because I was originally working
> > on Banshee. Banshee already read the metadata from the MP3 via my
> > patch, so all my lib really was, was an abstraction of the
> > verification. Since verification is done over the Internet, that's not
> > really something you want to include by default in core application
> > code.
> >
> > I'd like to abstract license reading so we can just "plug" support for
> > different file types to be read whether they are images, audio, etc.
> > Kind of like vfs.
> >
> > What are your thoughts?
> >
> > -Luke
> >
> > On 3/21/07, Jason K <[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> I'm looking into adding support for searching/indexing licenses for a
> >> service such as Tracker, Beagle, or Strigi for a Google SoC project. My
> >> first hurdle, though, is picking which indexer. The ideal service would
> >> be cross-desktop, to avoid implementing extraction filters over and over
> >> again for different indexers. It also needs to be widely adopted.
> >>
> >> Tracker is looking like a good candidate, given that it is a
> >> Freedesktop.org project, is desktop-neutral, and appears to have the
> >> intention of following standards as well as creating standards for other
> >> search services to use. I get the impression GNOME will be including
> >> this soon.
> >>
> >> Strigi is also desktop-neutral, though favored by KDE and is going to be
> >> used by KDE 4. It doesn't rely on KDE, though. In fact, Strigi's only
> >> requirements are the stdc++ libraries, while Tracker is glib-based.
> >>
> >> And for Beagle, Mono is one significant reason I'm shying away from it.
> >> Tracker or Strigi appear more interoperable and look to be getting wider
> >> adoption.
> >>
> >> Formats I plan to include are:
> >> HTML, SVG, SMIL, XML in general (RDF)
> >> PDF, JPEG, other images (XMP)
> >> MP3, OGG, other audio/video
> >> RSS
> >>
> >> From what I've seen, most license data is either in RDF or XMP form.
> >> MP3, OGG, and RSS are exceptions. For all these formats, I would follow
> >> the embedding specification on the Creative Commons website, at
> >> http://creativecommons.org/technology/usingmarkup
> >>
> >> Since most licenses are placed in RDF or XMP, that code can be separated
> >> and reused from various extraction modules.
> >>
> >> So enough rambling... thoughts?
> >>
> >> -Jason Kivlighn
> >> _______________________________________________
> >> cc-devel mailing list
> >> [email protected]
> >> http://lists.ibiblio.org/mailman/listinfo/cc-devel

--
Luke Hoersten
http://www.cs.purdue.edu/homes/lhoerste/
http://openradix.org/
