Hi Folks,
I asked a similar question a while back, but I don't think I
communicated it clearly enough.
I'm running the crawler_launcher as follows:

./crawler_launcher --filemgrUrl http://localhost:9000 --operation \
  --launchMetCrawler --clientTransferer \
  org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory \
  --productPath /usr/local/coal-sds-deploy/data/staging --metExtractor \
  org.apache.oodt.cas.metadata.extractors.TikaCmdLineMetExtractor \
  --metExtractorConfig \
  /usr/local/coal-sds-deploy/crawler/etc/tika_aviris_hdr.properties

The product is parsed and ingested into File Manager; however, Tika only
uses the org.apache.tika.parser.DefaultParser, which is not sufficient
because I am working with application/envi.hdr files, which are rich in
metadata.

The --metExtractorConfig file contains the following primitive metadata:

ProductType=GenericFile
Content-type=application/envi.hdr
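
To be clear about what that file amounts to: it is just two key=value
pairs. A minimal sketch of how they read as standard Java properties is
below. This assumes the file is treated as an ordinary java.util.Properties
file; I have not checked how TikaCmdLineMetExtractor actually loads it, and
the class name ExtractorConfigSketch is made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of OODT or Tika.
public class ExtractorConfigSketch {

    // Parse key=value text using the standard java.util.Properties rules.
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Same two entries as in tika_aviris_hdr.properties above.
        String text = "ProductType=GenericFile\n"
                    + "Content-type=application/envi.hdr\n";
        Properties props = parse(text);
        System.out.println(props.getProperty("ProductType"));
        System.out.println(props.getProperty("Content-type"));
    }
}
```

Nothing in these two entries names a parser class, so as far as I can
tell the config only supplies static metadata values.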

And yes, the 'Content-type=application/envi.hdr' entry is successfully added
to the metadata record in File Manager. I am just not sure how to force Tika
to invoke a specific parser for that content type.

Thanks for any help,
Lewis



-- 
http://home.apache.org/~lewismc/
http://people.apache.org/keys/committer/lewismc
