Hi all,

Over at https://issues.apache.org/jira/browse/TIKA-1723, Tim & I have been discussing whether to focus these pending changes on the 2.0 branch, and leave 1.x as-is.
As part of that, we could do a cut-and-run in 2.0, and not spend the time to port the current (Tika 1.x) language detector code. I'm in favor of that approach, as I think leveraging the new detector project(s) gives us faster & more accurate results over more languages.

But we're posting to the more general audience here, to gather input on things that we might not be considering.

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr