On Sun, Jan 11, 2009 at 2:37 AM, Matthew Toseland
<[email protected]> wrote:
> On Monday 05 January 2009 12:03, [email protected] wrote:
>> Author: j16sdiz
>> Date: 2009-01-05 12:03:27 +0000 (Mon, 05 Jan 2009)
>> New Revision: 24911
>>
>> Modified:
>>    trunk/plugins/XMLSpider/XMLSpider.java
>> Log:
>> mark as failed when content filter throws
>>
>> Modified: trunk/plugins/XMLSpider/XMLSpider.java
>> ===================================================================
>> --- trunk/plugins/XMLSpider/XMLSpider.java    2009-01-05 12:03:17 UTC (rev
> 24910)
>> +++ trunk/plugins/XMLSpider/XMLSpider.java    2009-01-05 12:03:27 UTC (rev
> 24911)
>> @@ -410,25 +410,34 @@
>>                       PageCallBack pageCallBack = new PageCallBack(page);
>>                       Logger.minor(this, "Successful: " + uri + " : " + 
>> page.getId());
>>
>> +                     try {
>>                       ContentFilter.filter(data, new NullBucketFactory(), 
>> mimeType,
> uri.toURI("http://127.0.0.1:8888/"),
>>                                       pageCallBack);
>> +                     } catch (UnsafeContentTypeException e) {
>> +                             // wrong mime type
>> +                             page.setStatus(Status.SUCCEEDED);
>> +                             db.endThreadTransaction();
>> +                             dbTransactionEnded = true;
>> +
>> +                             Logger.minor(this, "UnsafeContentTypeException 
>> " + uri + " : " +
> page.getId(), e);
>> +                             return; // Ignore
>> +                     } catch (IOException e) {
>> +                             // ugh?
>> +                             Logger.error(this, "Bucket error?: " + e, e);
>> +                             return;
>> +                     } catch (Exception e) {
>> +                             // we have lots of invalid html on net - just 
>> normal, not error
>
> In which case it should throw a content filter exception, not just any
> exception... anything unexpected is bad, and may indicate a bug in the
> filter...

The CSS tokenizer throws IllegalStateException on some invalid CSS.
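
One way to reconcile the two points above might be to translate the
tokenizer's IllegalStateException into a dedicated checked exception at the
filter boundary, so the caller can tell "bad content" from "bug in the
filter". The following is only a sketch under stated assumptions: FilterSketch,
InvalidContentException, tokenize, filter and classify are all hypothetical
names, and the toy tokenizer merely stands in for the real CSS tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

class FilterSketch {
    /** Hypothetical checked exception: the content is invalid, the filter itself is fine. */
    static class InvalidContentException extends Exception {
        InvalidContentException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Toy stand-in for a tokenizer that throws IllegalStateException on malformed input. */
    static List<String> tokenize(String css) {
        int depth = 0;
        List<String> tokens = new ArrayList<>();
        for (String token : css.trim().split("\\s+")) {
            if (token.equals("{")) depth++;
            if (token.equals("}")) depth--;
            if (depth < 0) throw new IllegalStateException("unexpected '}'");
            tokens.add(token);
        }
        if (depth != 0) throw new IllegalStateException("unclosed '{'");
        return tokens;
    }

    /** Filter boundary: only the known "invalid input" failure is translated. */
    static List<String> filter(String css) throws InvalidContentException {
        try {
            return tokenize(css);
        } catch (IllegalStateException e) {
            throw new InvalidContentException("invalid CSS: " + e.getMessage(), e);
        }
    }

    /** Caller: invalid content is routine; any other exception still propagates as a bug. */
    static String classify(String css) {
        try {
            filter(css);
            return "SUCCEEDED";
        } catch (InvalidContentException e) {
            return "FAILED";
        }
    }
}
```

With this shape, classify("a { b }") is treated as a success and
classify("a { b") as a routine failure, while something unexpected such as a
NullPointerException is not swallowed and would still surface as an error.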

>> +                             Logger.normal(this, "exception on content 
>> filter for " + page, e);
>> +                             return;
>> +                     }
>> +
>>                       page.setStatus(Status.SUCCEEDED);
>>                       db.endThreadTransaction();
>>                       dbTransactionEnded  = true;
>>
>>                       Logger.minor(this, "Filtered " + uri + " : " + 
>> page.getId());
>> -             } catch (UnsafeContentTypeException e) {
>> -                     page.setStatus(Status.SUCCEEDED);
>> -                     db.endThreadTransaction();
>> -                     dbTransactionEnded = true;
>> -
>> -                     Logger.minor(this, "UnsafeContentTypeException " + uri 
>> + " : " +
> page.getId(), e);
>> -                     return; // Ignore
>> -             } catch (IOException e) {
>> -                     Logger.error(this, "Bucket error?: " + e, e);
>> -             } catch (URISyntaxException e) {
>> -                     Logger.error(this, "Internal error: " + e, e);
>>               } catch (RuntimeException e) {
>> +                     // other runtime exceptions
>>                       Logger.error(this, "Runtime Exception: " + e, e);
>>                       throw e;
>>               } finally {
>> @@ -444,6 +453,9 @@
>>                               if (!dbTransactionEnded) {
>>                                       Logger.minor(this, "rollback 
>> transaction", new Exception("debug"));
>>                                       db.rollbackThreadTransaction();
>> +                                     
>> db.beginThreadTransaction(Storage.EXCLUSIVE_TRANSACTION);
>> +                                     page.setStatus(Status.FAILED);
>> +                                     db.endThreadTransaction();
>>                               }
>>                       }
>>               }
>>
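
The finally block in the hunk above rolls back the unfinished transaction and
then records the failure in a fresh one, so the FAILED status survives the
rollback. A minimal sketch of that pattern, assuming a hypothetical in-memory
Store in place of the plugin's real database API (Store, Work and process are
illustrative names only):

```java
class RollbackSketch {
    enum Status { QUEUED, SUCCEEDED, FAILED }

    /** Hypothetical in-memory stand-in for a transactional store; not the plugin's db API. */
    static class Store {
        Status committed = Status.QUEUED;
        private Status pending = Status.QUEUED;

        void begin() { pending = committed; }
        void setStatus(Status s) { pending = s; }
        void commit() { committed = pending; }
        void rollback() { pending = committed; }
    }

    /** Hypothetical unit of work that may fail part-way through. */
    interface Work {
        void run(Store store) throws Exception;
    }

    static Status process(Store store, Work work) {
        boolean transactionEnded = false;
        store.begin();
        try {
            work.run(store);
            store.setStatus(Status.SUCCEEDED);
            store.commit();
            transactionEnded = true;
        } catch (Exception e) {
            // Swallowed only to keep the sketch short; real code would log or rethrow.
        } finally {
            if (!transactionEnded) {
                // Discard the partial changes first...
                store.rollback();
                // ...then record the failure in its own transaction so it is kept.
                store.begin();
                store.setStatus(Status.FAILED);
                store.commit();
            }
        }
        return store.committed;
    }
}
```

Setting the status before the rollback would not work here, because the
rollback would discard it along with the rest of the partial changes.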
>> _______________________________________________
>> cvs mailing list
>> [email protected]
>> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/cvs
>>
>>
>
> _______________________________________________
> Devl mailing list
> [email protected]
> http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
>