On 7/8/07, Chris Mattmann <[EMAIL PROTECTED]> wrote:
Hi Robert,
hi chris
Yes, in fact it is. I am currently working on porting an implementation similar to that of freedesktop.org, originally intended for the Nutch project. It was written by Jerome Charron. I've created an issue in TIKA's JIRA issue tracker to discuss this: http://issues.apache.org/jira/browse/TIKA- I've been sitting on the code for a few weeks now, as I just haven't had the time to make much progress porting it. It shouldn't take too much effort though. Please let me know if you'd like to help port it.
quite possibly :-)

seems like we might have a good match but let me explain why i think there might be some synergy...

RAT is a project comprehension tool built to help me review releases in the incubator. it works by guessing meta-data from documents and then analysing or reporting on it. checking headers and licenses, that sort of thing.

it started out as a hacked together tool for myself but it's started to become reasonably well used within apache. so, it's about time that it moved on. i'd like to start running RAT against all the incubator source. this should make it easier for incubating projects to cut releases and allow easier oversight of the code base.

RAT has a very basic set of heuristics for determining broad MIME type. this really needs to be replaced by something better.

- robert
