Crawler returns an error if the mimetype name contains capital letters or
does not start with "product/".
----------------------------------------------------------------------------------------------------------------------------------------
Key: OODT-148
URL: https://issues.apache.org/jira/browse/OODT-148
Project: OODT
Issue Type: Bug
Components: crawler
Affects Versions: 0.3
Environment: unix
Reporter: faranak davoodi
Fix For: 0.2
Naming mimetypes for the crawler requires a specific format that is not
documented anywhere. Apparently the name has to start with "product/" and has to
be all lower case, or the crawler returns an error. And the error is so general
that you can't tell what the problem is.
I used "product/dadsL0" as the name for a mimetype and got the errors below:
Feb 24, 2011 9:20:49 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
INFO: Handling file
/usr/local/carve/support/filemgr_lucene/carveFiles/20110209183453.dadsL0
Feb 24, 2011 9:20:49 PM org.apache.oodt.cas.crawl.AutoDetectProductCrawler
passesPreconditions
WARNING: No extractor specs specified for
/usr/local/carve/support/filemgr_lucene/carveFiles/20110209183453.dadsL0
Feb 24, 2011 9:20:49 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
WARNING: Failed to pass preconditions for ingest of product:
[/usr/local/carve/support/filemgr_lucene/carveFiles/20110209183453.dadsL0]
After changing the mimetype name to "product/dadsl0", I got:
Feb 24, 2011 9:25:46 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
INFO: Successfully ingested product:
[/usr/local/carve/support/filemgr_lucene/carveFiles/20110209183453.dadsL0]:
product id: b2c6deec-409f-11e0-9885-3f3332df0e68
Feb 24, 2011 9:25:46 PM org.apache.oodt.cas.crawl.ProductCrawler handleFile
INFO: Successful ingest of product:
[/usr/local/carve/support/filemgr_lucene/carveFiles/20110209183453.dadsL0]
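Until the format is documented, a small pre-flight check could catch a bad name
before the crawler runs. The sketch below only encodes the two rules inferred
from the runs above (a "product/" prefix and no upper-case characters); it is
not taken from the crawler's code, and the function name check_mimetype_name is
made up for illustration.

```python
from typing import List


def check_mimetype_name(name: str) -> List[str]:
    """Return the problems found in a mimetype name; an empty list means none.

    Assumption: the rules below are only what this report observed,
    not a rule set confirmed by the crawler's source or documentation.
    """
    problems = []
    if not name.startswith("product/"):
        problems.append('name does not start with "product/"')
    if name != name.lower():
        problems.append("name contains upper-case characters")
    return problems


if __name__ == "__main__":
    # The two names tried in this report.
    for candidate in ("product/dadsL0", "product/dadsl0"):
        issues = check_mimetype_name(candidate)
        print(candidate, "->", "; ".join(issues) if issues else "ok")
```

Run on the two names above, it flags "product/dadsL0" for its capital letter and
accepts "product/dadsl0".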
I wish the mimetype names weren't this sensitive to format. If such a format
really is necessary, we should document it in the crawler's user guide to save
people hours of confusion.
Thanks.