Jorg,

Thanks and no problem. And yeah, I think the library can do both, just not at the same time as you pointed out. Send me some suggestions and we'll see what we can get fixed/cleaned up. I have a few things I need to get ready for a new release anyways.
BTW, if LGPL is a problem for folks, just let me know. I would consider re-licensing it. I'd also still consider moving the project to Jakarta Commons, so that I can utilize the Apache services.

Cheers,
David

Jörg Schaible wrote:
> Hi David,
>
> I realize that we're getting off-topic for jakarta commons here ... ;-)
>
> David Castro wrote on Tuesday, May 30, 2006 7:36 AM:
>
>> Jörg Schaible wrote:
>>
> [snip]
>
>>> After a quick look over the package you get the impression
>>> that you imported the magic codes of file magic into the
>>> project. And then you're quite astonished if the library
>>> does not detect simple formats (e.g. TIFF, Windows BMP) that
>>> are no problem for the C counterpart. This is IMHO a problem,
>>>
>> I did use the "magic" file to assist in generating the magic.xml file
>> bundled with the project. You'll note that I have some, but not all,
>> of the matches cleaned up and working. Actually, the file command will
>> sometimes have incorrect matches itself, which I didn't want to
>> inherit. So, I started with a small set of documents that I generated
>> and ran them through unit tests to verify them.
>>
>> I would never be astonished that an alpha piece of open source
>> software doesn't work exactly as expected or is limited in its
>> out-of-box state. I only moonlight as an open source developer, as
>> much as I'd like it to be my full-time job ;)
>>
>
> I assume most people here are in the same boat including myself.
>
>>> because there's simply no documentation that states
>>> otherwise. When I discovered jMimeMagic I just thought to
>>> use it as a black box.
>>>
>> Yeah, if you are looking for something that doesn't require a bit of
>> elbow grease, jMimeMagic wouldn't be an optimal solution since it is
>> early alpha open source software. That's pretty normal I think.
>>
>
> And it's pretty normal for users to expect the opposite :D
>
>> Nothing else existed out there when I started this project and I only
>> had so many hours to devote to it. But let's get the engine revved up
>> and make it more out-of-the-box-friendly.
>>
>
> :)
>
> [snip]
>
>>> But you could not decide what you wanted to implement.
>>> See, file magic has two magic files, one to produce a format
>>> description and one for the mime type. Your implementation
>>> mixes the two approaches.
>>>
>> I decided exactly what I wanted to implement and what I wanted to
>> prepare for (at least at the time). You're assuming that my intention
>> was to simply duplicate the "file" utility, which isn't the case.
>> Determining mime type was really only one of my intentions.
>>
>
> Well, by naming your project j*Mime*Magic, you imply something ;-)
>
>> More important to me was actually determining the specific type and
>> state of content in a stream of data. It was initially built as a
>> helper library for a malware detection project.
>>
>>> Mime type detection is normally an action that should
>>> happen *fast*, but if I request the mime type for an MP3 you
>>> evaluate all the nested matchers that are totally moot for
>>> the mime type.
>>>
>> Now you are talking about optimization based on one of the specific
>> uses of the library.
>>
>
> No, I am talking about your attempt to target two different things at
> the same time, and you cannot do both of them efficiently.
>
>> I agree with you that there are some things that can certainly be done
>> better/more efficiently. Those need to be identified and patched, but
>> let's try not to throw the baby out with the bath water.
>>
>
> Split the result of the parser, create specialized matchers for mime
> type detection and descriptive format detection. If you have the need
> to detect a mime type it is typically something you wanna do on the
> fly - and fast.
>
>>> Looking at the code:
>>>
> [snip]
>
>>> - your code is linked to Log4J. This is not good for
>>> libraries. See, some of our customers use their own
>>> logging implementations, but with commons-logging you can at
>>> least write an easy bridge
>>>
>> Yup, I agree with you. Nobody has been pounding on the door asking for
>> it and I had enough work on other projects to not concern myself too
>> deeply with it.
>>
>
> Demand, demand :)
>
>>> - you never guard log.debug with log.isDebug - and you create *a
>>> lot* of debug output
>>>
>> Yup, certainly an area for making the library more efficient. Again,
>> completely aware of the issue...just haven't fixed it yet.
>>
>>> - file magic also has its limits, as already explained in
>>> this thread. You already introduced regexp support, but you
>>> don't use it properly e.g. for the HTML types so far
>>>
>> Definitely limits, and as I mentioned I was already moving and have
>> already coded adjustments to support more of a pluggable matcher
>> architecture.
>>
>
> This is the functionality that *I* am not that interested in ... the
> mime type can normally be detected quite easily with the standard
> patterns.
>
>> And if my HTML regex matcher is broken...
>>
>
> Well, you have some of those non-regex, fixed position HTML matching
> definitions in your magic.xml that are also present in file magic's
> definitions and that don't work too well.
>
>> please send me a patch =) I've been calling for folks to help build
>> out a complete set of matchers for more content types, but with
>> limited responses.
>>
>
> Just to clarify, when I first looked at jMimeMagic, it was just some
> days before you posted your call for help. So the project looked to me
> like a lot of other abandoned projects on SF with a single one-time
> dump of some experimental code. Therefore I wanna apologize for the
> overall bad reputation I gave your project in one of my first postings
> in this thread.
>
>> I usually just scratch my own itches. I've also determined that I am a
>> pretty lousy mind reader ;)
>>
>
> :)
>
>>> OK, some of the problems would have been solved by
>>> providing my own magic.xml file. E.g. one of my mistakes with
>>> the library was that I assumed that the magic file was read
>>> every time you create a Magic instance and you would have to
>>> synchronize the initialization of the instance if you want
>>> to share it. This assumption was wrong, but I only found that
>>> out after looking at the code - not by reading the javadocs.
>>>
>> Yeah...documentation is the first to go =( I try to keep my projects
>> clean, organized, and as simple as possible though. So if you browse,
>> you should get a good feel for what is going on. It's not always
>> beautiful or elegant, but you shouldn't find any obfuscated code...heh
>>
>> Thanks for the feedback. I understand it is always frustrating working
>> with somebody else's code, so I'm sure it was less fun for you to deal
>> with jMimeMagic than it typically is for myself. But let's make it
>> better. I'd love to have other folks to collaborate with on this.
>>
>
> As you have seen from all the folks responding to this thread, there
> is a need for it and people are willing to do something. There's no
> need to bring it here to Jakarta Commons though, SF is totally fine.
>
> - Jörg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
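
For illustration, Jörg's suggestion above (specialized matchers for mime type detection, kept separate from descriptive format detection) could look roughly like the sketch below. This is not jMimeMagic code: the class MimeSniffer and its Signature type are hypothetical names, and the table only covers the two formats mentioned in the thread (TIFF and Windows BMP), using their well-known leading bytes. It is a flat, fixed-offset lookup with no nested matchers, which is the property that makes it fast.

```java
import java.util.Optional;

// Hypothetical helper, not part of jMimeMagic: a flat, MIME-only signature table.
final class MimeSniffer {

    // One fixed-offset byte signature mapped to a MIME type.
    private static final class Signature {
        final int offset;
        final byte[] magic;
        final String mimeType;

        Signature(int offset, int[] magic, String mimeType) {
            this.offset = offset;
            this.magic = new byte[magic.length];
            for (int i = 0; i < magic.length; i++) {
                this.magic[i] = (byte) magic[i];
            }
            this.mimeType = mimeType;
        }

        // True if the data contains the magic bytes at the expected offset.
        boolean matches(byte[] data) {
            if (data.length < offset + magic.length) {
                return false;
            }
            for (int i = 0; i < magic.length; i++) {
                if (data[offset + i] != magic[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    private static final Signature[] SIGNATURES = {
        new Signature(0, new int[] {0x49, 0x49, 0x2A, 0x00}, "image/tiff"), // "II*\0", little-endian TIFF
        new Signature(0, new int[] {0x4D, 0x4D, 0x00, 0x2A}, "image/tiff"), // "MM\0*", big-endian TIFF
        new Signature(0, new int[] {0x42, 0x4D}, "image/bmp"),              // "BM", Windows BMP
    };

    // Returns the MIME type of the first matching signature, if any.
    static Optional<String> detect(byte[] data) {
        for (Signature s : SIGNATURES) {
            if (s.matches(data)) {
                return Optional.of(s.mimeType);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        byte[] bmpHeader = {0x42, 0x4D, 0x36, 0x00};
        System.out.println(detect(bmpHeader).orElse("unknown"));
    }
}
```

A descriptive matcher (dimensions, compression, and so on) would live in a separate table or class, so a caller that only needs the mime type never pays for it.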

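
The logging point in the thread (guarding debug calls so that expensive messages are not built when debug output is off) can be sketched in a self-contained way as follows. To keep the example free of external jars it uses java.util.logging rather than Log4J, so Logger.isLoggable(Level.FINE) stands in for the Log4J-style guard; the class GuardedLogging, the hexDump helper, and the dumpsBuilt counter are all hypothetical and exist only to make the cost visible.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo class, not part of jMimeMagic.
final class GuardedLogging {
    static final Logger LOG = Logger.getLogger("GuardedLogging");

    // Counts how often the expensive message was actually built.
    static int dumpsBuilt = 0;

    // Stand-in for costly debug output, e.g. a hex dump of the bytes being matched.
    static String hexDump(byte[] data) {
        dumpsBuilt++;
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    static void unguarded(byte[] data) {
        // The message string is built even if FINE is disabled and the record is dropped.
        LOG.fine("matching against: " + hexDump(data));
    }

    static void guarded(byte[] data) {
        // The message string is only built when FINE output is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("matching against: " + hexDump(data));
        }
    }
}
```

With the logger level at INFO, a call to unguarded still runs hexDump while guarded skips it entirely; in a library that emits a lot of debug output per match, that difference adds up.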