Well, almost - it's about 95% done. I just need to rewrite one more
method, which right now still behaves as if my heuristic code hadn't
been committed.

I'll clean up my code shortly to use a Confidence datatype instead of a
UT_uint8.

Basically, every importer returns a normalized number in the range
[0,255], with 0 being "I'm not at all confident", 127 being "I'm so-so"
and 255 being "I can totally handle this file type". This applies to
both the recognizeContents and recognizeSuffix methods.

What I'm going to do is weight the recognizeContents result heavily
against the recognizeSuffix result (maybe 85-15) and apply the
following heuristic:

my_match = heuristic(contentsConfidence, suffixConfidence);
if ( my_match > best_match ) {
  best_match = my_match;
  best_filetype = my_match_filetype;
}
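
To make the idea concrete, here is a rough sketch of how the weighting and the best-match loop could fit together. It is only an illustration under assumptions: the Confidence alias stands in for the planned datatype, the Candidate struct and pickBestFiletype are hypothetical names, and the 85/15 split is the tentative weighting mentioned above, not a final value.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the planned Confidence datatype (0..255).
using Confidence = std::uint8_t;

// Hypothetical record: one importer's filetype and its two scores.
struct Candidate {
    std::string filetype;
    Confidence contents;  // score from recognizeContents
    Confidence suffix;    // score from recognizeSuffix
};

// Weighted blend: 85% contents, 15% suffix (assumed split).
// The weights sum to 100, so the result stays within 0..255.
Confidence heuristic(Confidence contents, Confidence suffix) {
    return static_cast<Confidence>((85u * contents + 15u * suffix) / 100u);
}

// Returns the filetype with the highest blended score,
// or an empty string when there are no candidates.
std::string pickBestFiletype(const std::vector<Candidate>& candidates) {
    std::string best_filetype;
    int best_match = -1;  // below any real score, so the first candidate wins
    for (const Candidate& c : candidates) {
        const int my_match = heuristic(c.contents, c.suffix);
        if (my_match > best_match) {
            best_match = my_match;      // remember the score to beat
            best_filetype = c.filetype;
        }
    }
    return best_filetype;
}
```

With this split, an importer that is confident about the contents but indifferent to the suffix still beats one that only recognizes the suffix, which is the intent of weighting recognizeContents so heavily.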

This will fix a few bugs in Bugzilla.

Dom

