Great feedback, guys. I think I will start writing a new project from
scratch. How do I go about getting set up with Jakarta Commons? Do
I need to 'apply' to start a new project?

Thanks again!!

On 4/19/06, Jörg Schaible <[EMAIL PROTECTED]> wrote:
> Hi Markus,
>
> Jörg Schaible wrote on Wednesday, April 19, 2006 8:46 AM:
>
> > Hi Markus,
> >
> > Markus Härnvi wrote on Wednesday, April 19, 2006 8:47 AM:
> >
> >> Hi!
> >>
> >>> Starting from scratch would possibly be the best anyway. I
> >>> also had it on my todo list at a very low priority ... but
> >>> only because I found that jMimeMagic has a really bad
> >>> implementation - extremely slow and not working correctly. I
> >>> have a good pile of image files it does not detect. The main
> >>> reason is that the implementation is simply wrong. The
> >>> original magic files have a clear idea of precedence of
> >>> patterns - this has been lost completely in the
> >>> conversion/implementation of jMimeMagic.
> >>>
> >>> - Jörg
> >>>
> >>
> >> Using the original magic file and parsing it in Java also makes it
> >> easier to keep it updated. Just add the newest magic file to the jar
> >> file and we are done.
> >
> > That would have been my approach also. I was just not sure
> > whether we should bundle the magic file or try to locate it
> > (this is the interesting part and highly system dependent).
> > And a user might have an additional magic file in their home
> > directory - at least this can be located.
>
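For the bundle-or-locate question, something like the following might do as a
starting point. It is only a sketch: the class name MagicSources, the resource
name /magic and the per-user file name .magic are my assumptions, nothing in
this thread settles them.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; all names and file locations are assumptions.
final class MagicSources {

    // Assumed name of the magic file bundled at the root of the jar.
    static final String BUNDLED_RESOURCE = "/magic";

    // Assumed name of an optional per-user magic file in the home directory.
    static final String USER_FILE = ".magic";

    // Opens the bundled magic file; returns null if the jar does not contain one.
    static InputStream openBundled() {
        return MagicSources.class.getResourceAsStream(BUNDLED_RESOURCE);
    }

    // Returns the per-user magic file under the given home directory, if it exists.
    static List<Path> userFiles(Path home) {
        List<Path> found = new ArrayList<>();
        Path candidate = home.resolve(USER_FILE);
        if (Files.isRegularFile(candidate)) {
            found.add(candidate);
        }
        return found;
    }
}
```

A caller would pass Paths.get(System.getProperty("user.home")) and read the
bundled file first, then any user file, so user rules can add to the defaults.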
> After looking into the magic files (magic and magic.mime) I am somewhat
> disappointed. While file magic is good at binary formats with fixed headers,
> its definition language is poor for string-based formats, e.g. the rules for
> detecting XML & XSL:
>
> ===== %< =====
> 0       string/cb       \<?xml                  XML document text
> 0       string          \<?xml\ version "       XML
> 0       string          \<?xml\ version="       XML
> >15     string          >\0                     %.3s document text
> >>23    string          \<xsl:stylesheet        (XSL stylesheet)
> >>24    string          \<xsl:stylesheet        (XSL stylesheet)
> 0       string/b        \<?xml                  XML document text
> 0       string/cb       \<?xml                  broken XML document text
> ===== %< =====
>
> This is quite poor. The pattern in the second line is not valid XML. It looks
> at offset 23 or 24 for "<xsl:stylesheet", totally ignoring the fact that the
> offset might be quite different if the XML declaration contains an encoding
> attribute, or depending on the whitespace and line endings. See the detection
> of XML MIME formats:
>
> ===== %< =====
> 0       string          \<?xml
> >38     string          \<\!DOCTYPE\040svg      image/svg+xml
> 0       string          \<?xml                  text/xml
> ===== %< =====
>
> Again, I am quite sure that a lot of SVG documents are not recognized.
>
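To see how fragile the fixed offset is, here is a small sketch that mimics a
rule of the form "<offset> string <literal>". The class FixedOffsetRule and its
method are made up for illustration; this is not how file(1) is implemented,
it only reproduces the comparison at a fixed byte position.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a fixed-offset string test.
final class FixedOffsetRule {

    // Returns true only if the ASCII literal starts exactly at the given byte offset.
    static boolean matches(byte[] data, int offset, String literal) {
        byte[] expected = literal.getBytes(StandardCharsets.US_ASCII);
        if (offset < 0 || data.length - offset < expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }
}
```

matches(head, 38, "<!DOCTYPE svg") is true for a file that starts with
<?xml version="1.0" standalone="no"?> followed by a single LF. With a CRLF line
ending the DOCTYPE moves to offset 39, and with an added encoding="UTF-8"
attribute it moves to 55, so the same rule no longer matches.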
> The main problem is that the format specification cannot deal with variable
> lengths. See "man magic" for the format definition. You cannot express that a
> file with an XML declaration followed by a non-empty line with a DOCTYPE
> declaration for SVG is "image/svg+xml".
>
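One possible way around the variable-length problem would be to skip the prolog
with a regular expression instead of a fixed offset. A minimal sketch follows,
assuming the first bytes of the file decode as UTF-8 or ASCII (UTF-16 is not
handled). The class XmlSniffer is hypothetical, and this is a different
technique from the magic format, not an extension of it.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical sniffer; not part of jMimeMagic or any existing library.
final class XmlSniffer {

    // Optional BOM, then any run of whitespace, comments and processing
    // instructions (including the XML declaration), then an SVG DOCTYPE
    // or an <svg> root element.
    private static final Pattern SVG = Pattern.compile(
            "\\A\\uFEFF?(?>\\s|<!--.*?-->|<\\?.*?\\?>)*(?:<!DOCTYPE\\s+svg\\b|<svg\\b)",
            Pattern.DOTALL);

    // Optional BOM and whitespace, then the start of an XML declaration.
    private static final Pattern XML = Pattern.compile("\\A\\uFEFF?\\s*<\\?xml\\s");

    // Returns a MIME type for the given leading bytes, or null if it is not XML.
    static String detect(byte[] head) {
        String text = new String(head, StandardCharsets.UTF_8);
        if (SVG.matcher(text).find()) {
            return "image/svg+xml";
        }
        if (XML.matcher(text).find()) {
            return "text/xml";
        }
        return null;
    }
}
```

The caller would pass only the first few KB of the file, which also keeps the
regular expression cheap. The length of the declaration, the line endings and
any comments before the DOCTYPE then no longer matter.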
> Bottom line: I am no longer sure whether MIME detection based on the
> definitions of file magic is really a good idea :-/
>
> - Jörg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
