Hi.
I'd like to use SAXParserImpl.JAXPSAXParser from xerces2-j to parse and
correct HTML/XHTML user input.
- Is there a way to know at which line/column (or, since there's probably
no line counter, at which character index) an error happened?

I didn't see where to get that info in the API docs, and my attempt to
hijack the input stream to count the position myself also failed: after
reading the first couple of chars individually, Xerces reads 2045 at once
(which makes sense, performance-wise), so my count no longer says where
the parser actually is.
So how do I know at what char a parse error happens?
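
For reference, the stream-hijacking attempt looks roughly like the sketch
below. The names CountingAttempt, CountingInputStream and
bytesConsumedAtFailure are made up for illustration, and it goes through the
generic JAXP SAXParserFactory rather than SAXParserImpl.JAXPSAXParser
directly, so treat it as an approximation of the setup, not the exact code.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CountingAttempt {

    /** Hypothetical wrapper: counts the bytes the parser pulls from the stream. */
    static class CountingInputStream extends FilterInputStream {
        long count = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) {
                count++;          // single-byte read
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n;       // bulk read: the parser buffers ahead here
            }
            return n;
        }
    }

    /** Parses the text and returns how many bytes had been read when parsing stopped. */
    static long bytesConsumedAtFailure(String xml) throws Exception {
        CountingInputStream in = new CountingInputStream(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler());
        } catch (SAXException e) {
            // A fatal parse error ends up here; the counter only says how far
            // the parser had read ahead, not where the error is.
        }
        return in.count;
    }

    public static void main(String[] args) throws Exception {
        // The mismatched </a> is within the first ten characters; the padding
        // makes the input a few KB long so read-ahead becomes visible.
        StringBuilder doc = new StringBuilder("<a><b></a>");
        for (int i = 0; i < 5000; i++) {
            doc.append(' ');
        }
        long consumed = bytesConsumedAtFailure(doc.toString());
        System.out.println("bytes consumed when parsing stopped: " + consumed);
    }
}
```

Because the parser fills its buffer ahead of what it has actually scanned,
the counter is only an upper bound on the error position, which is why this
approach doesn't give me a usable character index.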

Also, as a side question, not necessarily Xerces-related, but you guys
might know and spare me some wheel-reinventing:
I'm looking for a solution in Java to sanitize untrusted user-input
HTML/XHTML, removing not just malFORMED stuff (TagSoup or a custom
Xerces-based parser solution would do that) but specifically MALICIOUS
input, e.g. XSS attempts. For PHP, there's HTML Purifier
( http://hp.jpsband.org/ ), but for Java I found nothing equivalent.
Do you know of an existing solution for that?


