Hi Guys,

I could probably use this in the latter part of my university work, and
we have nearly launched Any23 0.7.0-incubating, so I suppose now is as
good a time as any to formally set about preparing a strategy for
what an Any23 parser (and indexing filter?) implementation would look like.

A couple of questions first:

parse-tika shadows the parse-html implementation in more or less every
way, apart from the following: (i) TikaParser methods are not declared
public (I think this is consistent throughout the plugin); (ii)
TikaParser#ParseResult.getParse() embeds the Tika parser logic; and (iii)
TikaParser doesn't have a main method.

1) Why are the TikaParser methods not declared public?
2) Why doesn't TikaParser ship with a main method?
3) We previously discussed implementing the Any23 parser plugin as a Tika
wrapper. Would it therefore look very similar to parse-tika?
4) I like the look of the feed plugin, which also ships with a custom
indexing filter implementation. Would it make sense to provide a custom
Any23IndexingFilter implementation as well?

Any comments would be great before I begin coding this up. I'm keen to get
it going, but not before it has the support from you guys as well.

Thanks in advance for any direction.

Lewis


