TIKA-490

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 21. aug. 2010, at 03.46, Mattmann, Chris A (388J) wrote:

> Hi Jan,
>
> +1, this approach seems sound. Feel free to file an issue and submit a patch,
> otherwise if I get some time next week I can take a look at it.
>
> Cheers,
> Chris
>
> On 8/20/10 1:56 PM, "Jan Høydahl / Cominvent" <[email protected]> wrote:
>
> Hi,
>
> Currently the Tika LanguageIdentifier loads language profiles through a
> hardcoded list in the Java code.
>
> It would be better to make this configurable somehow, so you could add your
> own languages without recompiling.
>
> Suggestion:
> Remove the static code block loading all languages. Instead look for a
> tika.languages.properties file on classpath.
> Now the user can simply make his/her own (additional) language profile files,
> put them on the classpath together with a properties file and off you go!
>
> Also, once you make it configurable, there might be an issue of having the
> profiles as static members, as you will force the same behaviour for the
> whole VM. A static Map of Maps could solve this.
>
> Comments?
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: [email protected]
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
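
For illustration, the suggestion quoted above could be sketched roughly like this. It is not the Tika code and not the patch for TIKA-490. Only the file name tika.languages.properties comes from the thread; the class name LanguageProfileRegistry, the "languages" key, the "<code>.ngp" resource naming at the classpath root, and the "ngram count" line format are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of configurable profile loading; not a Tika class. */
final class LanguageProfileRegistry {

    /** Resource name from the suggestion in this thread. */
    static final String CONFIG = "tika.languages.properties";

    /** Static map of maps: registry name -> (language code -> n-gram counts). */
    private static final Map<String, Map<String, Map<String, Integer>>> REGISTRIES =
            new ConcurrentHashMap<>();

    private LanguageProfileRegistry() {}

    /** Returns the profiles held under the given registry name, creating it if absent. */
    static Map<String, Map<String, Integer>> profiles(String registry) {
        return REGISTRIES.computeIfAbsent(registry, k -> new ConcurrentHashMap<>());
    }

    /** Reads every CONFIG file visible to the loader and registers the listed profiles. */
    static void loadFromClasspath(String registry, ClassLoader loader) throws IOException {
        Enumeration<URL> configs = loader.getResources(CONFIG);
        while (configs.hasMoreElements()) {
            Properties props = new Properties();
            try (InputStream in = configs.nextElement().openStream()) {
                props.load(in);
            }
            for (String code : languageCodes(props)) {
                // Assumption: each profile is a classpath resource named "<code>.ngp".
                try (InputStream in = loader.getResourceAsStream(code + ".ngp")) {
                    if (in == null) {
                        throw new IOException("No profile found for language: " + code);
                    }
                    register(registry, code, new InputStreamReader(in, StandardCharsets.UTF_8));
                }
            }
        }
    }

    /** Assumption: the config holds a single comma-separated key, e.g. "languages=en,no". */
    static List<String> languageCodes(Properties props) {
        List<String> codes = new ArrayList<>();
        for (String code : props.getProperty("languages", "").split(",")) {
            if (!code.trim().isEmpty()) {
                codes.add(code.trim());
            }
        }
        return codes;
    }

    /** Parses one profile and stores it under the given registry and language code. */
    static void register(String registry, String code, Reader source) throws IOException {
        profiles(registry).put(code, parseProfile(source));
    }

    /** Assumption: one "ngram count" pair per line; blank lines and '#' comments are skipped. */
    static Map<String, Integer> parseProfile(Reader source) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 2) {
                throw new IOException("Malformed profile line: " + line);
            }
            counts.put(parts[0], Integer.parseInt(parts[1]));
        }
        return Collections.unmodifiableMap(counts);
    }
}
```

With something like this, a user would drop a tika.languages.properties file and the profile files on the classpath and call loadFromClasspath("default", Thread.currentThread().getContextClassLoader()). Keying the outer map by a registry name is one way to read the "static Map of Maps" idea: different parts of the same VM can hold different language sets without forcing one behaviour on everyone.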
