Hi Nick, I think I've actually learned a new urban dictionary word mentioned in this thread, 'faff' :-).

On 16/12/14 03:34, Nick Burch wrote:
> On Mon, 15 Dec 2014, Sergey Beryozkin wrote:
>> I'm not proposing to split tika-parsers in a way that would affect the
>> users, tika-parsers would still be there, except that it would
>> strongly depend on tika-pdf and perhaps, when it is being built, it
>> can have its dependencies like tika-pdf shaded in/merged in to ensure
>> a complete backward-compatibility as far as the user expectations of
>> tika-parsers is concerned.

> We still have the additional faff of multiple "core" modules, which
> someone warned about in an earlier thread, and additional work for
> developers, and we did try pulling out the pdf parser which didn't work,
> and I'm finding having the Vorbis parsers in a different module + repo
> to be a faff.

I was thinking of introducing a minimal number of extra modules (at the 'expense' of tika-parsers), covering the mainstream parsers: the ones you mentioned earlier, pdf, plus a few others. tika-parsers would still be effectively the same after the build, with no side-effects for tika-parsers users. Perhaps it is difficult to achieve in practice...


> My plan doesn't involve any of those problems in phase 1 - core +
> parsers don't change at all, so if it doesn't work we haven't got to
> work hard to undo it, and people not interested aren't affected.
>
> Alternately, if you head back to some of the earlier threads on this,
> and can come up with reasons why the objections raised there can be
> overruled, we could hack up tika-parsers. (I'm trying to come up with a
> plan that respects previously raised issues)

Sounds good, thanks.

I might experiment a bit later on and create a patch for review, but I'll take a pause for now.
Cheers, Sergey
> Nick
