./crawler/bin/crawler_launcher \
  --filemgrUrl http://localhost:9000 \
  --operation --launchMetCrawler \
  --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory \
  --productPath $OODT_HOME/data/staging \
  --metExtractor org.apache.oodt.cas.metadata.extractors.TikaCmdLineMetExtractor \
  --metExtractorConfig /home/bugg/Projects/surrey100/oodt/data/met/tika.conf

That's what I'm running. It runs fine with the default Lucene setup and also runs fine with a txt file, but it fails on a random picture I took and on an mp3 I tested it on.

On Mon, Nov 23, 2015 at 3:12 PM, Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:

> Encoding issues with the extracted metadata? What are you getting
> just running Tika on the files?
>
> The actual data shouldn’t matter since it’s not being ingested
> (are you doing it in place, or what data transferer are you using)?
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Tom Barber <tom.bar...@meteorite.bi>
> Reply-To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> Date: Monday, November 23, 2015 at 6:36 AM
> To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> Subject: Crawling / Archiving binary data with Solr backend
>
> >Hello,
> >
> >Looks like I've never tried it before with binary data. If I swap the
> >filemgr defaults to use solr then try and crawl my staging directory using
> >the Tika extractor I get a lot of
> >
> >org.apache.xmlrpc.XmlRpcException: java.lang.Exception:
> >org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error
> >ingesting product [org.apache.oodt.cas.filemgr.structs.Product@62b19476]: null
> >at
> >org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
> >at
> >org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
> >at
> >org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
> >
> >Type things.
> >
> >Any ideas?
> >
> >Tom