Got it. Thanks Chris.

On 2018/10/19 15:31:22, Chris Mattmann <mattm...@apache.org> wrote: 
> Hmm if your mime-types.xml in Tika has that MIME format it should call the 
> right parser.
> 
>  
> 
> Alternatives:
> 
>  
> 
> Try the AutoDetectCrawler where you can feed it your own MIME repo mapping.
> 
> Use the ExternMetExtractor and wire it up to call Tika from the command line
> or tika-python, and then customize as needed.
> 
>  
> 
> Cheers,
> 
> Chris
> 
> From: lewis john mcgibbney <lewi...@apache.org>
> Reply-To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> Date: Thursday, October 18, 2018 at 9:41 PM
> To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> Subject: Forcing invocation of specific Tika parser when running 
> TikaCmdLineMetExtractor
> 
>  
> 
> Hi Folks,
> 
> I asked a similar question a while back, but I don't think I communicated it
> clearly enough.
> 
> I'm running the crawler_launcher as follows
> 
>  
> 
> ./crawler_launcher --filemgrUrl http://localhost:9000 --operation \
>   --launchMetCrawler --clientTransferer \
>   org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory \
>   --productPath /usr/local/coal-sds-deploy/data/staging --metExtractor \
>   org.apache.oodt.cas.metadata.extractors.TikaCmdLineMetExtractor \
>   --metExtractorConfig \
>   /usr/local/coal-sds-deploy/crawler/etc/tika_aviris_hdr.properties
> 
>  
> 
> The product is parsed and ingested into File Manager; however, Tika only
> uses the org.apache.tika.parser.DefaultParser, which is not sufficient
> because I am working with application/envi.hdr files, which are rich in
> metadata.
> 
>  
> 
> The --metExtractorConfig file contains the following primitive metadata:
> 
> ProductType=GenericFile
> Content-type=application/envi.hdr
> 
>  
> 
> And yes, the 'Content-type=application/envi.hdr' is successfully added to
> the metadata record in File Manager. I am just not sure how to force Tika
> to invoke a specific parser.
> 
>  
> 
> Thanks for any help,
> 
> Lewis
> 
> -- 
> 
> http://home.apache.org/~lewismc/
> 
> http://people.apache.org/keys/committer/lewismc
> 
>  
> 
> 
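
As a rough illustration of the second alternative above (calling Tika from the
command line), the Python sketch below generates a tika-config.xml that routes
application/envi.hdr to one explicit parser and builds the matching tika-app
command. Several things here are assumptions and do not come from the thread:
the parser class org.apache.tika.parser.envi.EnviHeaderParser, the
tika-app.jar, tika-config.xml and scene.hdr paths, and the helper names
build_tika_config and build_tika_command. The sketch only prints the command;
it does not run Java and does not show the ExternMetExtractor configuration
that would wrap it.

```python
import shlex
import tempfile
from pathlib import Path
from xml.etree import ElementTree as ET

# MIME type taken from the thread.
ENVI_MIME = "application/envi.hdr"
# Assumption: the parser that should handle ENVI headers. Swap in your own class.
ENVI_PARSER = "org.apache.tika.parser.envi.EnviHeaderParser"
DEFAULT_PARSER = "org.apache.tika.parser.DefaultParser"


def build_tika_config(mime=ENVI_MIME, parser_class=ENVI_PARSER):
    """Return tika-config XML that sends `mime` to `parser_class` only."""
    properties = ET.Element("properties")
    parsers = ET.SubElement(properties, "parsers")

    # Keep the default parser for everything except the MIME type of interest.
    default = ET.SubElement(parsers, "parser", {"class": DEFAULT_PARSER})
    ET.SubElement(default, "mime-exclude").text = mime

    # Bind the MIME type explicitly to the chosen parser.
    explicit = ET.SubElement(parsers, "parser", {"class": parser_class})
    ET.SubElement(explicit, "mime").text = mime

    return ET.tostring(properties, encoding="unicode")


def build_tika_command(tika_jar, config_path, data_file):
    """Return the argv list for a tika-app call that emits metadata as JSON."""
    return [
        "java", "-jar", str(tika_jar),
        f"--config={config_path}",
        "--json",
        str(data_file),
    ]


if __name__ == "__main__":
    # Hypothetical paths, used for illustration only.
    workdir = Path(tempfile.mkdtemp())
    config_path = workdir / "tika-config.xml"
    config_path.write_text(build_tika_config(), encoding="utf-8")

    command = build_tika_command("tika-app.jar", config_path, "scene.hdr")
    print(shlex.join(command))
```

The printed command is the kind of call an external extractor script could
make, with its JSON output then mapped onto the metadata keys File Manager
expects. Whether the excluded MIME type is honoured depends on the Tika version
in use, so it is worth checking the output against a known .hdr file first.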
