Well, almost: it's about 95% done. I just need to rewrite one more method, which currently behaves exactly as if my heuristic code hadn't been committed.
I'll clean up my code shortly to use a Confidence datatype instead of a UT_uint8. Basically, every importer returns a normalized number in the range [0,255], with 0 being "I'm not at all confident", 127 being "I'm so-so" and 255 being "I can totally handle this file type". This applies to both the recognizeContents and recognizeSuffix methods.

What I'm going to do is heavily weight the recognizeContents method (maybe 85-15) and apply the following heuristic:

    my_match = heuristic(contentsConfidence, suffixConfidence);
    if ( my_match > best_match ) {
        best_match = my_match;
        best_filetype = my_match_filetype;
    }

This will fix a few bugs in bugzilla.

Dom
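For illustration only, here is a rough sketch of how the weighting described above might look. It is not the committed code: the Confidence typedef, the Candidate struct, and the names combineConfidence and pickBestFiletype are all hypothetical, and the 85-15 split is just the tentative figure from the message.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the planned Confidence datatype:
// 0 = not confident at all, 255 = fully confident.
typedef std::uint8_t Confidence;

// Assumed weighting: 85% contents, 15% suffix (the "maybe 85-15" above).
// Integer math keeps the result inside [0,255].
static Confidence combineConfidence(Confidence contents, Confidence suffix)
{
    return static_cast<Confidence>((contents * 85 + suffix * 15) / 100);
}

// Hypothetical record of what one importer reported for a file.
struct Candidate {
    std::string filetype;
    Confidence contents; // result of a recognizeContents-style check
    Confidence suffix;   // result of a recognizeSuffix-style check
};

// Returns the filetype with the highest combined confidence,
// or an empty string if no importer scored above zero.
static std::string pickBestFiletype(const std::vector<Candidate>& candidates)
{
    std::string best_filetype;
    Confidence best_match = 0;

    for (const Candidate& c : candidates) {
        Confidence my_match = combineConfidence(c.contents, c.suffix);
        if (my_match > best_match) {
            best_match = my_match;
            best_filetype = c.filetype;
        }
    }
    return best_filetype;
}
```

With this weighting, an importer that is sure about the contents but sees an unfamiliar suffix still beats one that only recognizes the suffix, which is the point of favoring recognizeContents.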
