Found it: the detection happens in HTMLParser, in the private static
method sniffCharacterEncoding.

I'm still wondering where TikaParser gets the character encoding from,
though. Also, this doesn't look like something we check in our JUnit
classes. If we don't, I would like to write some tests for it.
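
As a rough illustration of the kind of behaviour such a test could
exercise, here is a minimal, self-contained sketch of sniffing a charset
declaration from the start of an HTML document. The class and method names
(CharsetSniffSketch, sniff) and the chunk size are hypothetical. This is
not the actual Nutch or Tika code; it only assumes the encoding is declared
in a <meta> tag near the top of the document.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the Nutch implementation.
public class CharsetSniffSketch {

    // Assumption: only the first couple of thousand bytes are inspected,
    // since a charset declaration is expected near the top of the page.
    private static final int CHUNK_SIZE = 2000;

    // Matches charset=... inside a <meta ...> tag, for example
    //   <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    //   <meta charset="utf-8">
    private static final Pattern META_CHARSET = Pattern.compile(
        "<meta\\s[^>]*?charset\\s*=\\s*[\"']?([a-z][_\\-0-9a-z]*)",
        Pattern.CASE_INSENSITIVE);

    // Returns the declared charset name, or null if none is found.
    static String sniff(byte[] content) {
        int length = Math.min(content.length, CHUNK_SIZE);
        // Decode as ASCII: the tag and the charset name are ASCII-only,
        // so the real encoding does not need to be known at this point.
        String head = new String(content, 0, length, StandardCharsets.US_ASCII);
        Matcher m = META_CHARSET.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] page = ("<html><head><meta http-equiv=\"Content-Type\" "
            + "content=\"text/html; charset=ISO-8859-1\"></head></html>")
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(sniff(page));
    }
}
```

A JUnit test along these lines would feed in small documents with and
without a declaration and compare the returned name with the expected one.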

I am working on the Any23 tests first, which is what prompted my
question.

Thanks

Lewis

On Tue, Feb 14, 2012 at 10:00 PM, Lewis John Mcgibbney <
[email protected]> wrote:

> Hi,
>
> I can't see anywhere within our parser plugins where we detect encoding of
> documents. I've also begun looking through the o.a.n.p package but again I
> can't see anything.
>
> Can anyone provide some detail on this please?
>
> Thank you
>
> Lewis
>
>
>
> --
> *Lewis*
>
>


-- 
*Lewis*
