Hi guys,

On Tuesday 17 April 2012 13:03:16 Julien Nioche wrote:
> Hi Lewis
>
> > 1) Why are Tika methods not declared as public?
>
> why should they? If they are not used outside the class then I think it is
> good practice to keep them private. Since TikaParser implements Parser the
> only method that we can expect to be called is getParse() and it is public
>
> > 2) Why doesn't TikaParser ship with a main method?
>
> it could have one for testing but IMHO using the ParserChecker is a better
> way of testing as it is closer to real use
>
> > 3) We previously discussed implementing the Any23 parser plugin as a tika
> > wrapper, therefore it would look very similar to parse-tika?
>
> I don't remember this but I remember suggesting that the Any23 parser
> should be a tika parser which is not the same as a Tika wrapper. I expect
> other people in Tika-land to have a use for it, and we'd get the benefit of
> it automatically with parse-tika

You did indeed suggest that. However, if building a wrapper is fairly
straightforward then it may not be a bad idea. I haven't seen any hint of
Tika having Any23 on-board any time soon so we might have to wait a very
long time if we want to rely on Tika.

> > 4) I like the look of the feed plugin where it also ships with a custom
> > indexingfilter implementation, my thoughts were also to provide a custom
> > Any23IndexingFilter implementation?
>
> Depends on what you want to do? What would we get out of Any23? How would
> that be used on the search side?

Well, we could easily use certain microdata key/value pairs in our results
to greatly improve search and navigation.

Thanks
Markus

> > Thanks
>
> Julien

-- 
Markus Jelsma - CTO - Openindex

