Hey Gang,

I was wondering if you had a todo list or something somewhere? I have been loosely following the discussions on this list and have seen the general outline of the goals here: http://www.mail-archive.com/tika- [EMAIL PROTECTED]/msg00024.html (Tika discussions in Amsterdam)

Here's where I am at: I am considering extracting the Nutch parsing plugins for a project I am undertaking and wrapping them for my own purposes, but knowing Tika is around, I would just as soon do this in the context of Tika, or at least try to help out that way and have it become a part of Tika. I have not looked at Lius yet. I guess I am wondering if you have some interfaces in mind that you want to hook into, or is the Nutch model (or Lius model) already going to serve as the main model? I pretty much think the Nutch model has everything I need at the moment, but I don't want to carry around the whole set of Nutch dependencies. I am not worried about content detection at this point so much as extraction.

Is the plan to adopt a plugin approach similar to Nutch's?

So, I guess the question is: what can I do at this point to help? Should I just go ahead with my needs and then give it back as a patch, so you can decide what to do with it from there? I am in somewhat of a hurry to get the basics working in the next week or so.

Also, does anyone have any recommendations for parsing various mail repositories like Outlook, Mac Mail (which I think is mbox), etc.?

Cheers,
Grant

