Hi Christian,

Thanks for bringing this up! Would you be able to share the PDF that
causes this, or one with a similar structure?

Thanks,
Tyler


On Mon, Jun 16, 2014 at 6:46 AM, Christian Reuschling <
[email protected]> wrote:

> I am currently migrating to Tika 1.5 and have run into this behaviour:
> handler.endDocument() is called twice, which leads to duplicate entries in
> my database for a single PDF file, since I work directly with the handler.
>
> Here are the two calls:
>
> The first call is in PDF2XHTML, line 197: handler.endDocument();
> This is part of the PDF2XHTML.process(pdfDocument, handler, context,
> metadata, localConfig);
> invocation from PDFParser, line 143.
>
>
> The second call is then directly in PDFParser, line 151:
> handler.endDocument();
>
>
> I will stay on Tika 1.4 for now. Still, thanks for the good work!
>
>
> Christian

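For anyone hitting the same duplicate endDocument() call while working directly with the handler, one possible stopgap is to wrap the handler so that a repeated endDocument() is swallowed. The following is only a minimal sketch, not a fix in Tika itself: the class name EndDocumentOnceHandler is hypothetical, and it relies solely on the JDK's org.xml.sax classes rather than on any Tika API.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical wrapper (not part of Tika): forwards all SAX events to the
// wrapped handler, but passes endDocument() through only once per document.
class EndDocumentOnceHandler extends XMLFilterImpl {
    private boolean ended = false;

    EndDocumentOnceHandler(ContentHandler delegate) {
        // XMLFilterImpl forwards content events to this handler.
        setContentHandler(delegate);
    }

    @Override
    public void startDocument() throws SAXException {
        // Reset the guard so the wrapper can be reused for another document.
        ended = false;
        super.startDocument();
    }

    @Override
    public void endDocument() throws SAXException {
        if (ended) {
            return; // ignore the duplicate call
        }
        ended = true;
        super.endDocument();
    }
}
```

Assuming the usual parse call, the wrapper would be handed to the parser in place of the original handler, for example parser.parse(stream, new EndDocumentOnceHandler(myHandler), metadata, context), where myHandler stands for whatever handler writes to the database.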