RE: JMimeMagic (was [fileUpload] file content-type)

Jörg Schaible Mon, 29 May 2006 09:41:21 -0700

Hi David,

sorry for the delay, but I had to do some research again to give some more 
substantial answers.


David Castro wrote on Thursday, May 25, 2006 10:38 AM:

>> Hi Brian and Mark,
>> 
>> Brian K. Wallace wrote on Tuesday, April 18, 2006 9:18 PM:
>> 
>> 
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>> 
>>>> Just be conscious of the fact that, with all open source projects,
>>>> time is usually volunteer/as available/as the urge strikes. I
>>>> wouldn't start to get anxious for a couple of weeks. (some take
>>>> longer, but I'm anxious by then)
>>>> 
>>>> As for forking -> commons, remember licensing issues. GPL/LGPL !=
>>>> ASL. In order for ASL to come into the picture you'd have to not
>>>> fork but start from scratch. IANAL, but that's how it's been
>>>> presented before. 
>>> 
>> 
>> Starting from scratch would be possibly the best anyway. I
>> had it also on my todo list on a very low priority ... but
>> just, because I found that jMimeMagic has a really bad
>> implementation - extremely slow and not working correctly. I
>> have a good pile of image files it does not detect. Main
>> reason is, that the implementation is simply
> What exactly is extremely slow and not working correctly?

After a quick look over the package you get the impression that you imported 
the magic codes of file magic into the project. And then you're quite 
astonished if the library does not detect simple formats (e.g. TIFF, Windows 
BMP) that are no problem for the C counterpart. This is IMHO a problem, because 
there's simply no documentation that states otherwise. When I discovered 
jMimeMagic I just thought I could use it as a black box.

> There are lots of things that don't detect out of the box right now,
> since only a subset of magic rules are defined in the magic.xml file.
>> wrong. The original magic files have a clear idea of
>> precedence of patterns - this has been lost completely in the
>> conversion/implementation of jMimeMagic.
>> 
> What is simply wrong about the implementation?  Precedence of matchers
> is a part of the current implementation, so I'm not sure what
> you mean.
> jMimeMagic wasn't a conversion, it was an implementation written from
> scratch. 

But you could not decide what you wanted to implement. See, file magic has two 
magic files, one to produce a format description and one for the mime type. 
Your implementation mixes the two approaches. Mime type detection is normally 
an action that should happen *fast*, but if I request the mime type for an MP3 
you evaluate all the nested matchers, which are totally irrelevant for the mime 
type.
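
To illustrate the kind of fast lookup meant here, a minimal sketch follows. It 
is not jMimeMagic code: MimeRule and FastMimeDetector are hypothetical names, 
and the rule list is just a handful of well-known signatures. The list order 
is the precedence, the first match wins, and no nested description matchers 
are evaluated.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical rule type for illustration only; not part of jMimeMagic.
final class MimeRule {
    final int offset;       // where the signature starts in the data
    final byte[] magic;     // the signature bytes
    final String mimeType;  // result if the signature matches

    MimeRule(int offset, byte[] magic, String mimeType) {
        this.offset = offset;
        this.magic = magic;
        this.mimeType = mimeType;
    }

    boolean matches(byte[] data) {
        if (data.length < offset + magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[offset + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}

// Hypothetical detector: a flat, ordered rule list and nothing else.
final class FastMimeDetector {
    private final List<MimeRule> rules = new ArrayList<>();

    FastMimeDetector() {
        // List order is the precedence: the first matching rule wins.
        rules.add(new MimeRule(0, new byte[] {'I', 'I', 42, 0}, "image/tiff")); // little-endian TIFF
        rules.add(new MimeRule(0, new byte[] {'M', 'M', 0, 42}, "image/tiff")); // big-endian TIFF
        rules.add(new MimeRule(0, ascii("BM"), "image/bmp"));                   // Windows BMP
        rules.add(new MimeRule(0, ascii("ID3"), "audio/mpeg"));                 // MP3 starting with an ID3v2 tag
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Returns the mime type of the first matching rule, or null if none matches.
    String detect(byte[] data) {
        for (MimeRule rule : rules) {
            if (rule.matches(data)) {
                return rule.mimeType;
            }
        }
        return null;
    }
}
```

A caller that also wants a format description could run the detailed matchers 
in a second, separate step, only when asked for.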

Looking at the code:

- what's the real difference between MagicMatch and MagicMatcher? Even the 
javadoc is the same ...
- your code is linked to Log4J. This is not good for libraries. See, some of 
our customers use completely own logging implementations, but with 
commons-logging you can at least write an easy bridge
- you never guard log.debug with log.isDebugEnabled - and you create *a lot* of 
debug output
- file magic also has its limits, as already explained in this thread. You 
already introduced regexp support, but so far you don't use it properly, e.g. 
for the HTML types
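
For the logging point, a minimal sketch of why the guard matters. DebugLog is 
a stand-in interface defined here so the example compiles on its own; 
commons-logging's Log interface offers isDebugEnabled() and debug(Object) in 
the same shape. CountingLog and GuardedLoggingDemo are hypothetical names used 
only for the demonstration.

```java
// Stand-in for the two logging methods used below (assumption: mirrors
// the shape of commons-logging's Log.isDebugEnabled() and Log.debug(Object)).
interface DebugLog {
    boolean isDebugEnabled();
    void debug(Object message);
}

// Hypothetical log that only counts the messages it would have written.
final class CountingLog implements DebugLog {
    private final boolean enabled;
    int written = 0;

    CountingLog(boolean enabled) {
        this.enabled = enabled;
    }

    public boolean isDebugEnabled() {
        return enabled;
    }

    public void debug(Object message) {
        if (enabled) {
            written++;
        }
    }
}

final class GuardedLoggingDemo {
    static int dumpsBuilt = 0; // how often the expensive message was built

    // Deliberately costly message construction.
    static String hexDump(byte[] data) {
        dumpsBuilt++;
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    // The message is built even when debug output is switched off.
    static void unguarded(DebugLog log, byte[] data) {
        log.debug("header bytes: " + hexDump(data));
    }

    // The message is built only when somebody will actually read it.
    static void guarded(DebugLog log, byte[] data) {
        if (log.isDebugEnabled()) {
            log.debug("header bytes: " + hexDump(data));
        }
    }
}
```

With debug switched off, unguarded() still pays for the hex dump on every 
call, while guarded() costs one boolean check.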

OK, some of the problems would have been solved by providing a custom magic.xml 
file. E.g. one of my mistakes with the library was that I assumed that the 
magic file was read every time you create a Magic instance and that you would 
have to synchronize the initialization of the instance if you want to share it. 
This assumption was wrong, but I found that out only by looking at the code - 
not by reading the javadocs.
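
For readers wondering how a rule set can be loaded once and shared without 
explicit synchronization in the caller, here is a sketch of one common Java 
idiom (the initialization-on-demand holder). I am not claiming this is how 
jMimeMagic does it; SharedRules and its contents are hypothetical, and the 
hard-coded list stands in for parsing a file such as magic.xml.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; not part of jMimeMagic.
final class SharedRules {
    // Counts how often the rules were loaded, for demonstration only.
    static final AtomicInteger LOADS = new AtomicInteger();

    // The JVM initializes Holder on first access, exactly once and
    // thread-safely, so callers need no locking of their own.
    private static final class Holder {
        static final List<String> RULES = load();
    }

    // Placeholder for the expensive step (e.g. parsing the magic file).
    private static List<String> load() {
        LOADS.incrementAndGet();
        return Collections.unmodifiableList(
                Arrays.asList("image/tiff", "image/bmp", "audio/mpeg"));
    }

    // Every caller gets the same immutable list.
    static List<String> rules() {
        return Holder.RULES;
    }

    private SharedRules() {
    }
}
```

Whatever mechanism a library uses, stating in the javadoc that initialization 
happens once and that the instance is safe to share would have avoided my 
wrong assumption.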

- Jörg

