Yes. Will fix shortly. Thank you!

On Tue, Mar 25, 2025 at 4:50 PM David Pilato <da...@pilato.fr> wrote:

> Hey team
>
> The page
> https://tika.apache.org/3.1.0/formats.html#HyperText_Markup_Language
> mentions:
>
>
> The output from the HtmlParser class is guaranteed to be well-formed and
> valid XHTML, and various heuristics are used to prevent things like inline
> scripts from cluttering the extracted text content.
>
>
> But HtmlParser links to a nonexistent class:
> https://tika.apache.org/3.1.0/api/org/apache/tika/parser/html/HtmlParser.html
> Should it be
> https://tika.apache.org/3.1.0/api/org/apache/tika/parser/html/JSoupParser.html
> instead?
>
>
>
> David Pilato
> da...@pilato.fr
> 06 13 03 08 41
>
>
