Hi guys,

I will probably be using this in the latter part of my university work, and we have nearly released Any23 0.7.0-incubating, so now seems as good a time as any to formally start preparing a strategy for what an Any23 parser (and indexing filter?) implementation would look like.
A couple of questions first. parse-tika shadows the parse-html implementation in more or less every way, apart from the following:

(i) TikaParser methods are not declared public (I think this is consistent throughout the plugin).
(ii) TikaParser#ParseResult.getParse() embeds the Tika parser logic.

Also, TikaParser doesn't have a main method.

1) Why are the TikaParser methods not declared public?
2) Why doesn't TikaParser ship with a main method?
3) We previously discussed implementing the Any23 parser plugin as a Tika wrapper. Would it therefore look very similar to parse-tika?
4) I like the look of the feed plugin, which also ships with a custom indexing filter implementation. Should I also provide a custom Any23IndexingFilter implementation?

Any comments would be great before I begin coding this up. I'm keen to get it going, but not before it has your support as well.

Thanks in advance for any direction.

Lewis

