[ https://issues.apache.org/jira/browse/TIKA-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated TIKA-490:
-----------------------------------
Attachment: TIKA-490.Mattmann.082210.2.patch.txt
- Updated patch. I decided against throwing TikaExceptions and now simply throw
an IOException if there is a problem loading the props file containing the
profile default information. This way the *real* exception is the one exposed
to the consumer of this API, which makes more sense. I don't like returning
null in ProfilingWriter#getLanguage, but I couldn't think of anything more
elegant at this time, so I documented that the method may now return null and
that callers should plan accordingly.
If there are no objections, I'll commit this in the next 24 hrs.
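To make the error-handling choice above concrete, here is a minimal sketch. It is not the code from the attached patch: the class name ProfileDefaults, the "language" key, and the method signatures are assumptions made up for illustration. It only shows the two behaviours described: the original IOException reaches the caller unwrapped, and a lookup may return null.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

/** Illustrative only; names and keys are hypothetical, not taken from the patch. */
class ProfileDefaults {

    // Resource name borrowed from the issue description; the patch may use another.
    static final String RESOURCE = "tika.languageidentification.properties";

    /**
     * Loads the defaults from the given stream. Any I/O failure is propagated
     * as the original IOException instead of being wrapped in another type.
     */
    static Properties load(InputStream in) throws IOException {
        if (in == null) {
            // A missing resource is reported as an I/O problem as well.
            throw new IOException("Could not find " + RESOURCE);
        }
        Properties props = new Properties();
        try (InputStream stream = in) {
            props.load(stream);
        }
        return props;
    }

    /**
     * Returns the configured language, or null if none is set.
     * Callers have to check for null before using the result.
     */
    static String getLanguage(Properties props) {
        return props.getProperty("language"); // hypothetical key
    }
}
```

A caller would typically pass in something like getResourceAsStream(ProfileDefaults.RESOURCE) from a ClassLoader, catch or declare IOException, and null-check the result of getLanguage before using it.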
> Support for adding language profiles dynamically
> ------------------------------------------------
>
> Key: TIKA-490
> URL: https://issues.apache.org/jira/browse/TIKA-490
> Project: Tika
> Issue Type: Improvement
> Components: languageidentifier
> Affects Versions: 0.7
> Reporter: Jan Høydahl
> Assignee: Chris A. Mattmann
> Fix For: 0.8
>
> Attachments: TIKA-490.Mattmann.082210.2.patch.txt,
> TIKA-490.Mattmann.082210.patch.txt, TIKA-490.patch, TIKA-490.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Currently the Tika LanguageIdentifier loads language profiles through a
> hardcoded static block in the Java code.
> It would be better to make this configurable, so you could add your own
> languages without recompiling.
> Suggested approach:
> Remove the static code block loading all languages. Instead, look for a
> tika.languageidentification.properties file on the classpath.
> Now the user can simply make his/her own (additional) language profile files,
> put them on the classpath together with a properties file and off you go!
> Also, once you make it configurable, there might be an issue of having the
> profiles as static members, as you will force the same behaviour for the
> whole VM. A static Map of Maps could solve this.
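The approach suggested in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not Tika's implementation: the LanguageProfileRegistry class, the ProfileSource hook, the "languages=en,fr" property, and the "ngram count" line format are all hypothetical. The map of maps is kept per instance here, which is one way to avoid forcing a single profile set on the whole VM.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

/** Illustrative registry; every name and file format here is an assumption. */
class LanguageProfileRegistry {

    /** Hypothetical hook that opens the profile data for one language code. */
    interface ProfileSource {
        InputStream open(String language) throws IOException;
    }

    // language code -> (n-gram -> count); an instance field, not a static one
    private final Map<String, Map<String, Integer>> profiles = new HashMap<>();

    /** Builds a registry from a "languages=en,fr" style property. */
    static LanguageProfileRegistry fromProperties(Properties props, ProfileSource source)
            throws IOException {
        LanguageProfileRegistry registry = new LanguageProfileRegistry();
        for (String code : props.getProperty("languages", "").split(",")) {
            String language = code.trim();
            if (language.isEmpty()) {
                continue; // tolerate an empty list or stray commas
            }
            InputStream in = source.open(language);
            if (in == null) {
                throw new IOException("No profile found for language: " + language);
            }
            registry.profiles.put(language, readProfile(in));
        }
        return registry;
    }

    /** Reads "ngram count" lines, skipping blank lines and '#' comments. */
    private static Map<String, Integer> readProfile(InputStream in) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                String[] parts = line.split("\\s+");
                if (parts.length != 2) {
                    throw new IOException("Malformed profile line: " + line);
                }
                try {
                    counts.put(parts[0], Integer.parseInt(parts[1]));
                } catch (NumberFormatException e) {
                    throw new IOException("Bad count in profile line: " + line, e);
                }
            }
        }
        return counts;
    }

    /** Language codes that were loaded. */
    Set<String> languages() {
        return Collections.unmodifiableSet(profiles.keySet());
    }

    /** N-gram counts for a language, or null if it was not loaded. */
    Map<String, Integer> profile(String language) {
        return profiles.get(language);
    }
}
```

To read from the classpath, the ProfileSource could be a lambda such as lang -> loader.getResourceAsStream(lang + ".ngp"), where loader is a ClassLoader and the ".ngp" suffix is again an assumed naming convention. Users would then add a language by dropping a profile file and an updated properties file on the classpath, without recompiling.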