All such handlers are implementations of the org.xml.sax.ContentHandler interface, so their methods throw SAXException. But in the code above none of the content handler methods are invoked directly; they are only called inside parser.parse, where the content handler is passed.
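To illustrate where the exception surfaces, here is a minimal sketch that uses only the SAX classes shipped with the JDK. LimitedTextHandler, WriteLimitReachedException and extract are hypothetical names made up for this example; they stand in for a size-limited handler and are not Tika's actual classes. The handler throws from characters(), the exception comes out of parse(), and the caller keeps the text collected so far.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WriteLimitSketch {

    /** Hypothetical marker type: lets the caller tell "limit reached" from a real failure. */
    static class WriteLimitReachedException extends SAXException {
        WriteLimitReachedException(String message) {
            super(message);
        }
    }

    /** Hypothetical size-limited handler (a stand-in, not Tika's WriteOutContentHandler). */
    static class LimitedTextHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private final int limit;

        LimitedTextHandler(int limit) {
            this.limit = limit;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            int room = limit - text.length();
            if (length > room) {
                // Keep what still fits, then abort parsing via the checked exception.
                text.append(ch, start, room);
                throw new WriteLimitReachedException("write limit of " + limit + " reached");
            }
            text.append(ch, start, length);
        }

        @Override
        public String toString() {
            return text.toString();
        }
    }

    /** Walks the cause chain looking for the marker exception. */
    static boolean isWriteLimitReached(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof WriteLimitReachedException) {
                return true;
            }
        }
        return false;
    }

    /** Returns at most {@code limit} characters of text content from the given XML. */
    static String extract(String xml, int limit)
            throws SAXException, IOException, ParserConfigurationException {
        LimitedTextHandler handler = new LimitedTextHandler(limit);
        try {
            // The handler methods are only invoked from inside parse().
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            if (!isWriteLimitReached(e)) {
                throw e; // a genuine parse failure, do not hide it
            }
            // Limit reached: fall through and return the partial text.
        }
        return handler.toString();
    }
}
```

With this sketch, calling WriteLimitSketch.extract with a limit smaller than the document text returns the truncated text instead of failing, while malformed XML still raises SAXException.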
You can take a look at org.apache.tika.Tika.parseToString(InputStream, Metadata, int) as a reference. It has code similar to Jukka's code above.

--
Best regards,
Konstantin Gribov.

2014-03-28 15:47 GMT+04:00 Stefano Fornari <[email protected]>:

> well, I should look at the code, I can't do it now, but I guess my point is
> that BodyContentHandler should not throw the exception (and most probably
> not a SAXException in any case) in the case the limit is reached. This
> means that the limit should not put on the WriteOutContentHandler, but on
> BodyContentHandler.
>
> Ste
>
>
> On Fri, Mar 28, 2014 at 11:52 AM, Konstantin Gribov <[email protected]> wrote:
>
> > SAXException is checked, so you have to catch it or add to method throws
> > list (or javac wouldn't compile it). Tika usually rethrows exceptions
> > enveloping them into TikaException. In case of code above method throws
> > SAXException.
> >
> > Suppressing the exception is done to avoid parser fail after parsing
> > valuable amount of data.
> >
> > --
> > Best regards,
> > Konstantin Gribov.
> >
> > On 28.03.2014 14:27, "Stefano Fornari" <[email protected]> wrote:
> >
> > > On Fri, Mar 28, 2014 at 11:26 AM, Stefano Fornari <[email protected]> wrote:
> > >
> > > > I understood the trick, but I am trying to understand this is done in
> > > > this way (that at a first glance does not seem clean).
> > > >
> > >
> > > ... trying to understand why this is done in this way...
> > >
