[ 
https://issues.apache.org/jira/browse/TIKA-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907217#comment-14907217
 ] 

Nick Burch commented on TIKA-1657:
----------------------------------

A couple of things to say up front:
 * I agree with Tim's patch that most of the logic for this should really be in 
core, in the config package
 * I agree with Tim that the Tika App would be a better place to expose this

That said... As of r1705191, DumpTikaConfigExample should now support 3 modes 
for most kinds of things:
 * Minimal, which outputs only the things that aren't a default
 * Current, which shows what the defaults are
 * Static, which turns Defaults into full lists

It handles some (but not all) kinds of decorations. It needs more unit 
tests... It also includes some bits of Tim's patch.
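
To make the difference between the three modes concrete, here is a toy sketch. It is not the code in DumpTikaConfigExample and uses no Tika classes: the class name, the Mode enum, the example.* parser names and the single "example.Defaults" placeholder entry are all made up for illustration, and treating Current as "one entry that stands for the defaults" is my assumption about what that mode would print.

```java
import java.util.Arrays;
import java.util.List;

public class DumpModesSketch {
    /** Hypothetical mode names mirroring the three modes described above. */
    enum Mode { MINIMAL, CURRENT, STATIC }

    /**
     * Toy serializer. 'defaults' stands in for whatever the defaults would
     * load, 'extras' for anything that was explicitly configured.
     */
    static String dump(Mode mode, List<String> defaults, List<String> extras) {
        StringBuilder xml = new StringBuilder("<parsers>\n");
        switch (mode) {
            case MINIMAL:
                // Defaults are implied, so nothing is written for them.
                break;
            case CURRENT:
                // Assumption: defaults appear as one placeholder entry.
                xml.append("  <parser class=\"example.Defaults\"/>\n");
                break;
            case STATIC:
                // Defaults are expanded into a full, explicit list.
                for (String name : defaults) {
                    xml.append("  <parser class=\"").append(name).append("\"/>\n");
                }
                break;
        }
        // Non-default entries are written in every mode.
        for (String name : extras) {
            xml.append("  <parser class=\"").append(name).append("\"/>\n");
        }
        return xml.append("</parsers>\n").toString();
    }

    public static void main(String[] args) {
        List<String> defaults = Arrays.asList("example.PdfParser", "example.HtmlParser");
        List<String> extras = Arrays.asList("example.MyCustomParser");
        for (Mode mode : Mode.values()) {
            System.out.println("== " + mode + " ==");
            System.out.print(dump(mode, defaults, extras));
        }
    }
}
```

The point of the sketch is only that the non-default entries are written the same way in every mode, and the modes differ in how much of the default set gets spelled out.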

One thing it doesn't have is the parameter parsing / writing out, as the 
current Tika config core lacks that too. Once we work out how to add general 
parameters to the Tika Config without re-inventing Spring badly (see another 
jira for that effort!), we'll need to include that part of Tim's patch too.

Assuming everyone likes + understands + agrees with the approach I've taken in 
the expanded example, I'd suggest we look at pulling most of that logic out (as 
Tim did in his patch), pushing it to core, adding it to the app, and replacing 
the example with a thin shim. Then add more tests, then look at the other 
decorations etc. needed.

> Allow easier XML serialization of TikaConfig
> --------------------------------------------
>
>                 Key: TIKA-1657
>                 URL: https://issues.apache.org/jira/browse/TIKA-1657
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 1.11
>
>         Attachments: TIKA-1558-blacklist-effective.xml, TIKA-1657v1.patch
>
>
> In TIKA-1418, we added an example for how to dump the config file so that 
> users could easily modify it.  I think we should go further and make this an 
> option at the tika-core level with hooks for tika-app and tika-server.  I 
> propose adding a main() to TikaConfig that will print the xml config file 
> that Tika is currently using to stdout.
> I'd like to put this into core so that e.g. Solr's DIH users can get by 
> without having to download tika-app separately.  
> There's every chance that I've not accounted for issues with dynamic loading 
> etc.  Also, I'd be ok with only having this available in tika-app and 
> tika-server if there are good reasons.
> Feedback?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
