I’d have to respectfully disagree with most of those points, but if there’s that
much resistance to the idea, I’ll drop it.

Cheers,

Ray


On June 19, 2014 at 3:22:14 PM, Nick Burch ([email protected]) wrote:
> On Thu, 19 Jun 2014, Ray Gauss wrote:
> > The point of a tika-parsers-all artifact would be a single dependency
> > that re-aggregates everything so that downstream projects could work the
> > same way they do now and not worry about missing dependencies.
> >
> > What’s the disadvantage of splitting things up (in a 2.0 timeframe)?
>  
> We already have users confused by the current split between tika-core and
> tika-parsers - see users list for example. We already have users confused
> by what dependencies they need with the current poms setup. Splitting is
> going to make that a lot worse. (POI, as a related example, sees plenty of
> confused users who've got mis-matched jars and problems. Splitting is
> going to make that a lot worse.)
>  
> We have previously tried pushing parsers out of the tika parser jar and
> into other jars, eg ones maintained by external groups, but on the whole
> it hasn't been a great success. Keeping them in sync, dealing with
> different cycles, applying updates, keeping them consistent, building in a
> sensible length of time, all of that would be harder with a pile of
> modules.
>  
> If we were to split out to the level needed by some of the use cases
> mentioned, we'd have so many parser modules it'd be a nightmare to
> maintain, and would cause the problems mentioned above. (People in other
> threads have cautioned on these problems). If we split into just a handful
> of sub modules, then many of the use cases mentioned still have to do
> work to pick out the bits they need.
>  
> I still believe that the main use case of tika is "everything included",
> and especially that's the beginner's use case, so I think we should focus
> on keeping that easy. Peeling out just some bits feels like an advanced
> use case to me, so I'd rather we put the requirement for effort onto those
> folks, rather than onto newbies and people with the typical use cases. I'd
> therefore much rather we provide advanced docs/help on excluding some
> bits, rather than pull it out into a pile of different modules.
>  
> Nick
