Hi Jan,

Interesting. Why couldn't we just name the file the same thing, then?
Would that be leaving it as a gamble to the ClassLoader?
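
To make the question concrete, here is a minimal sketch of the two lookups as I understand them. It is not the code in the patch: the override file name, the class and method names, and the temp-directory setup are all assumptions for illustration, and only tika.languageidentification.properties comes from the issue description.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Illustrative sketch only; not the TIKA-490 patch.
class ClasspathOrderDemo {

    // Name taken from the issue description.
    static final String DEFAULT_NAME = "tika.languageidentification.properties";
    // Hypothetical override name, assumed for this example.
    static final String OVERRIDE_NAME = "tika.languageidentification.override.properties";

    /** Same-name approach: the first classpath entry containing the file wins. */
    static Properties loadFirstMatch(ClassLoader loader, String name) throws IOException {
        Properties props = new Properties();
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    /** Override approach: load the defaults, then layer a differently named file on top. */
    static Properties loadWithOverride(ClassLoader loader) throws IOException {
        Properties props = loadFirstMatch(loader, DEFAULT_NAME);
        try (InputStream in = loader.getResourceAsStream(OVERRIDE_NAME)) {
            if (in != null) {
                props.load(in); // keys in the override replace the defaults
            }
        }
        return props;
    }

    /** Creates a temp directory holding one file; stands in for a classpath entry. */
    static Path dirWith(String fileName, String content) throws IOException {
        Path dir = Files.createTempDirectory("cp");
        Files.write(dir.resolve(fileName), content.getBytes(StandardCharsets.ISO_8859_1));
        return dir;
    }

    /** Builds a loader that searches the given directories in order, with no parent lookup. */
    static URLClassLoader loaderOver(Path... dirs) throws IOException {
        URL[] urls = new URL[dirs.length];
        for (int i = 0; i < dirs.length; i++) {
            urls[i] = dirs[i].toUri().toURL();
        }
        return new URLClassLoader(urls, null);
    }

    public static void main(String[] args) throws IOException {
        Path jar = dirWith(DEFAULT_NAME, "languages=en,de\n");       // stands in for tika-xx.jar
        Path sameName = dirWith(DEFAULT_NAME, "languages=en,de,xx\n");
        Path override = dirWith(OVERRIDE_NAME, "languages=en,de,xx\n");

        try (URLClassLoader jarFirst = loaderOver(jar, sameName);
             URLClassLoader userFirst = loaderOver(sameName, jar);
             URLClassLoader withOverride = loaderOver(jar, override)) {
            System.out.println("same name, JAR first:  "
                    + loadFirstMatch(jarFirst, DEFAULT_NAME).getProperty("languages"));
            System.out.println("same name, user first: "
                    + loadFirstMatch(userFirst, DEFAULT_NAME).getProperty("languages"));
            System.out.println("override file:         "
                    + loadWithOverride(withOverride).getProperty("languages"));
        }
    }
}
```

With the same name, the result depends on which entry the loader searches first, which is the gamble I was asking about. With a separate override name the order of the two entries should not matter.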

Cheers,
Chris


On 8/22/10 8:40 AM, "Jan Høydahl / Cominvent" <[email protected]> wrote:

> Hi,
> 
> My rationale for the override part is as follows:
> 
> The default properties file will be embedded within tika-xx.jar.
> I assume most people are not keen to unpack and repack JARs to make a config
> change.
> We COULD put a similarly named properties file at another location, but then
> the user needs to make sure that location is EARLIER in the classpath than the
> JAR file. In the case of e.g. Solr (Tomcat, Jetty..) it is not obvious how to
> ensure this, and to avoid any confusion about class-loader peculiarities, it's
> more straightforward to look for an override file.
> 
> Take the Solr example. The user would then put the properties file along with
> his new language profiles in a folder $SOLR_HOME/lib/org/apache/tika/language/
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Training in Europe - www.solrtraining.com
> 
> On 22. aug. 2010, at 16.40, Chris A. Mattmann (JIRA) wrote:
> 
>> 
>>    [ https://issues.apache.org/jira/browse/TIKA-490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901170#action_12901170 ]
>> 
>> Chris A. Mattmann commented on TIKA-490:
>> ----------------------------------------
>> 
>> Hi Jan,
>> 
>> I don't get the point of the override properties part. What does it buy us?
>> The way you've set it up, it is also loaded from the classpath, just like the
>> main language identifier properties file, so it shouldn't be any more
>> arduous to just modify that file if necessary (since both are loaded from
>> the classpath).
>> 
>> Let me know what you think. I've reviewed the rest of the patch; it looks good
>> and I'm ready to commit it, sans the override part.
>> 
>> Cheers,
>> Chris
>> 
>> 
>>> Support for adding language profiles dynamically
>>> ------------------------------------------------
>>> 
>>>                Key: TIKA-490
>>>                URL: https://issues.apache.org/jira/browse/TIKA-490
>>>            Project: Tika
>>>         Issue Type: Improvement
>>>         Components: languageidentifier
>>>   Affects Versions: 0.7
>>>           Reporter: Jan Høydahl
>>>           Assignee: Chris A. Mattmann
>>>            Fix For: 0.8
>>> 
>>>        Attachments: TIKA-490.patch, TIKA-490.patch
>>> 
>>>  Original Estimate: 24h
>>> Remaining Estimate: 24h
>>> 
>>> Currently the Tika LanguageIdentifier loads language profiles through a
>>> hardcoded static block in the Java code.
>>> It would be better to make this configurable, so you could add your own
>>> languages without recompiling.
>>> Suggested approach:
>>> Remove the static code block loading all languages. Instead look for a
>>> tika.languageidentification.properties file on classpath.
>>> Now the user can simply make his/her own (additional) language profile
>>> files, put them on the classpath together with a properties file and off you
>>> go!
>>> Also, once you make it configurable, there might be an issue of having the
>>> profiles as static members, as you will force the same behaviour for the
>>> whole VM. A static Map of Maps could solve this.
>> 
>> --
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>> 
> 
> 


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

