Hello,

I'm very disappointed by this mail. Two weeks ago I announced that I was working on an Extension/MimeType mapper and a MagicNumber/MimeType mapper. I also created an issue on this point, in which I mentioned that I was working on it! My implementation will be ready to deliver to the Nutch community in a few days (I just need to write some additional unit tests).

I really hate wasting my time and working for nothing, especially when it is due to a lack of communication. So please, in the future, announce that you are working on a topic! (It would certainly be better to combine our efforts instead of doing the same work twice.)

So, I will send my contribution in a few days.
Jerome

On Apr 3, 2005 10:59 AM, Hari Kodungallur <[EMAIL PROTECTED]> wrote:
>
> Oops.. forgot to do "reply all". Sending again to the group.
>
> John, please ignore the attachment in my previous email, use this one
> instead (an extra backup file got into the other one)
>
> ------------------------------------------------
>
> Here's a first version of the mime-mapper and magic-mime-mapper. I
> have packaged it at org.apache.nutch.util.mime. None of the files
> that depend on mime mapping is changed. This is just a set of utility
> classes for magic/mime mapping. We can make those changes after this
> code is verified and checked in.
>
> There are some TODOs that I have documented in the MagicMimeEntry
> file. Also I need to write a proper test case. If I need to parse some
> sample files (pdf, doc, mp3 etc) for use in the test cases, what is
> the procedure to do it? Just copy some samples and check them in and
> then use them in the test case? (I used hard-coded paths to my local
> disk for the tests; since that won't work anywhere else, I commented
> out that test case).
> Also, now the mime mapper (mapping based on file extension) and magic
> mime mapper (mapping from file contents based on magic.mime file) are
> two different utility classes. They can be combined, if needed, to
> provide a single interface.
>
> Please take a look at the files and let me know if there are any
> issues. After your review, if it's okay, can you check them in? I will
> work on modifying some plugins that use the mime mapper and also work
> on the TODOs, test cases, code documentation etc
>
> thx
> -Hari
>
>
> On Mar 23, 2005 10:22 AM, John X <[EMAIL PROTECTED]> wrote:
>
> > > On Wed, Mar 23, 2005 at 12:35:30AM -0800, Hari Kodungallur wrote:
> > > > On Wed, 23 Mar 2005 00:51:53 -0800, John X <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > It will be great if you can help on that. Plugin index-more also uses it.
> > > > > I know there are two opensource efforts:
> > > > > http://jafi.sourceforge.net/maven-reports.html
> > > > > http://www.gnu.org/software/classpathx/jaf/jaf.html
> > > > > The gnu one is out of the question now.
> > > > > I am currently short of time, so any help will be greatly appreciated.
> > > > >
> > > > > One interesting observation: there is an activation.jar
> > > > > (under ./common/lib/) in jakarta-tomcat-4.1.31.tar.gz
> > > > > We need to find out which one this is?
> > > > >
> > > >
> > > > Hm.. it is interesting to note that the activation.jar file in tomcat
> > > > is Sun's. Is that because some of the code to tomcat was donated by
> > > > Sun?
> > > >
> > > > In any case, I will start some work on the MimeTypes implementation as
> > > > needed by the file, ftp and index-more plugins and provide you with
> > > > patches. We do not need as exhaustive an implementation as Sun's, I
> > > > presume.
> > > >
> > > Ideally we need something that provides mapping among:
> > > (1) filename extension
> > > (2) mime type
> > > (3) file magic number (unix 'file' command)
> > > jaf only gives (1) -> (2) one way.
> > > It will be really great if yours can do (3) too.
> > > Thanks,
> > >
> > > John
> > >
>

--
http://motrech.free.fr/
http://frutch.free.fr/
