Jukka Zitting wrote:
Hi,

On Mon, Jan 19, 2009 at 7:25 AM, Sami Siren <ssi...@gmail.com> wrote:
I like the idea; it allows us to use different strategies for detecting the
type for individual formats, or to change the whole strategy used. The only
thing I am wondering is whether we should introduce some kind of confidence
level for the guesses, perhaps as part of the metadata?

Good question.

I'm personally not that big a fan of confidence levels, as there's no
clear definition of how they should be set and interpreted. I also
haven't seen any real world cases where confidence levels really would
have been needed to accurately determine the type of a document.
Yes, the only special case I had in mind was the various "text" formats out there, but as you say there is no real use case at the moment, so let's keep it out.

--
Sami Siren
