Tim Allison created TIKA-2432:
---------------------------------

             Summary: Convert RTFParser's asserts to TikaExceptions
                 Key: TIKA-2432
                 URL: https://issues.apache.org/jira/browse/TIKA-2432
             Project: Tika
          Issue Type: Improvement
    Affects Versions: 1.16
            Reporter: Tim Allison
            Priority: Minor


The RTFParser relies on asserts in numerous places.  With a fuzzed/corrupted 
file, the user will get an AssertionError rather than a TikaException.

It looks like the idea in several places is to let the user disable assertion 
checking and thereby get a lenient parser.  

{noformat}
            // In document
            if (equals("b")) {
                // b0
                assert param == 0;
                if (groupState.bold) {
{noformat}

In other places, though, the assert checks for a showstopper.

{noformat}
    private void addOutputByte(int b) throws IOException, SAXException, TikaException {
        assert b >= 0 && b < 256 : "byte value out of range: " + b;
{noformat} 

It would be useful to distinguish between these.  I propose adding a 
"beLenient" parameter (or something similar) to the RTFParser, with 
default=true.  When lenient, the parser would ignore violations of the first 
kind; when lenient=false, it would throw a TikaException.  However, we'd always 
want to throw a TikaException for the second case.
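
A minimal sketch of how the two kinds of check might be separated.  This is 
illustrative only and is not RTFParser code: the class name, the "lenient" 
field, and the checkRecoverable/checkFatal helpers are all hypothetical, and 
the nested TikaException is a stand-in for org.apache.tika.exception.TikaException 
so that the example compiles on its own.

```java
// Hypothetical sketch: none of these names exist in Tika's RTFParser.
class LenientCheckSketch {

    /** Stand-in for org.apache.tika.exception.TikaException. */
    static class TikaException extends Exception {
        TikaException(String msg) {
            super(msg);
        }
    }

    // Assumed flag corresponding to the proposed "beLenient" parameter.
    private final boolean lenient;

    LenientCheckSketch(boolean lenient) {
        this.lenient = lenient;
    }

    /**
     * First case: a recoverable oddity in the input.
     * Throws only in strict mode; in lenient mode the violation is tolerated
     * and reported to the caller through the return value.
     */
    boolean checkRecoverable(boolean condition, String message) throws TikaException {
        if (!condition && !lenient) {
            throw new TikaException(message);
        }
        return condition;
    }

    /** Second case: a showstopper. Always throws, regardless of the flag. */
    static void checkFatal(boolean condition, String message) throws TikaException {
        if (!condition) {
            throw new TikaException(message);
        }
    }

    // Mirrors "assert param == 0" from the first snippet.
    void handleBoldOff(int param) throws TikaException {
        checkRecoverable(param == 0, "expected b0 but got b" + param);
    }

    // Mirrors the range assert in addOutputByte from the second snippet.
    void addOutputByte(int b) throws TikaException {
        checkFatal(b >= 0 && b < 256, "byte value out of range: " + b);
    }
}
```

With this shape, new LenientCheckSketch(true).handleBoldOff(1) returns 
normally, the same call on a strict instance throws, and addOutputByte(256) 
throws in both modes.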



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
