Author: Alun Davis - Loudmouth
Content-Length: 3273160
Content-Type: audio/mpeg
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-TIKA:digest:MD5: 5f374012180e94778346619515152f74
X-TIKA:digest:SHA256: 34d8bf9da8feb848922138eb7807c0d71ed92376422fb28c8cbbffe788574ab0
channels: 2
creator: Alun Davis - Loudmouth
dc:creator: Alun Davis - Loudmouth
dc:title: Teenage Baghead
meta:author: Alun Davis - Loudmouth
resourceName: Teenage Baghead.mp3
samplerate: 44100
title: Teenage Baghead
version: MPEG 3 Layer III Version 1
xmpDM:album:
xmpDM:artist: Alun Davis - Loudmouth
xmpDM:audioChannelType: Stereo
xmpDM:audioCompressor: MP3
xmpDM:audioSampleRate: 44100
xmpDM:duration: 204577.046875
xmpDM:genre: Pop
xmpDM:logComment: www.maimthattune.com for more!
xmpDM:releaseDate: 2001


Nothing in the mp3's metadata that should scare a parser, at least.
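
For anyone who wants to check output like this mechanically rather than by eye, here is a rough sketch. It is not part of OODT or Tika; the function name `flag_suspicious_metadata` is made up, and it assumes one `key: value` pair per line, as in the dump above. It only flags fields whose value is empty or contains non-ASCII characters. Whether either of those has anything to do with the ingest error is not established here.

```python
def flag_suspicious_metadata(tika_output):
    """Return {field: reason} for fields with empty or non-ASCII values.

    Hypothetical helper: assumes one "key: value" pair per line.
    """
    flagged = {}
    for raw in tika_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ": " in line:
            # Keys such as "dc:title" contain colons, so split on the
            # first colon that is followed by a space.
            key, _, value = line.partition(": ")
        elif line.endswith(":"):
            # A key with nothing after it, e.g. "xmpDM:album:".
            key, value = line[:-1], ""
        else:
            # Not a key/value line (e.g. a wrapped continuation); skip it.
            continue
        value = value.strip()
        if not value:
            flagged[key] = "empty value"
        elif not value.isascii():
            flagged[key] = "non-ASCII characters"
    return flagged


sample = """\
dc:title: Teenage Baghead
xmpDM:album:
xmpDM:genre: Pop
"""
print(flag_suspicious_metadata(sample))
```

On the three sample lines, the only field reported is the empty `xmpDM:album`.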

On Mon, Nov 23, 2015 at 3:33 PM, Chris Mattmann <chris.mattm...@gmail.com>
wrote:

> yeah check the metadata. Any weird UTF-8 encoding?
>
> (aka run tika on the file outside of OODT what do you see?)
>
> —
> Chris Mattmann
> chris.mattm...@gmail.com
>
>
>
>
>
>
> -----Original Message-----
> From: Tom Barber <tom.bar...@meteorite.bi>
> Reply-To: <dev@oodt.apache.org>
> Date: Monday, November 23, 2015 at 7:23 AM
> To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> Subject: Re: Crawling / Archiving binary data with Solr backend
>
> >./crawler/bin/crawler_launcher --filemgrUrl http://localhost:9000 \
> >  --operation --launchMetCrawler \
> >  --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory \
> >  --productPath $OODT_HOME/data/staging \
> >  --metExtractor org.apache.oodt.cas.metadata.extractors.TikaCmdLineMetExtractor \
> >  --metExtractorConfig /home/bugg/Projects/surrey100/oodt/data/met/tika.conf
> >
> >I'm running that. It runs fine with the default Lucene setup, and it also
> >runs fine with a txt file, but it fails on a random picture I took and on
> >an mp3 I tested it on.
> >
> >
> >On Mon, Nov 23, 2015 at 3:12 PM, Mattmann, Chris A (3980) <
> >chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> >> Encoding issues with the extracted metadata? What are you getting
> >> just running Tika on the files?
> >>
> >> The actual data shouldn’t matter since it’s not being ingested
> >> (are you doing it in place, or what data transferer are you using)?
> >>
> >> Cheers,
> >> Chris
> >>
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Chris Mattmann, Ph.D.
> >> Chief Architect
> >> Instrument Software and Science Data Systems Section (398)
> >> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >> Office: 168-519, Mailstop: 168-527
> >> Email: chris.a.mattm...@nasa.gov
> >> WWW:  http://sunset.usc.edu/~mattmann/
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Adjunct Associate Professor, Computer Science Department
> >> University of Southern California, Los Angeles, CA 90089 USA
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>
> >>
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Tom Barber <tom.bar...@meteorite.bi>
> >> Reply-To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> >> Date: Monday, November 23, 2015 at 6:36 AM
> >> To: "dev@oodt.apache.org" <dev@oodt.apache.org>
> >> Subject: Crawling / Archiving binary data with Solr backend
> >>
> >> >Hello,
> >> >
> >> >Looks like I've never tried this with binary data before. If I switch the
> >> >filemgr defaults to use Solr and then try to crawl my staging directory
> >> >using the Tika extractor, I get a lot of
> >> >
> >> >org.apache.xmlrpc.XmlRpcException: java.lang.Exception:
> >> >org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Error
> >> >ingesting product [org.apache.oodt.cas.filemgr.structs.Product@62b19476] : null
> >> >at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
> >> >at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
> >> >at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
> >> >
> >> >
> >> >and errors of that type.
> >> >
> >> >Any ideas?
> >> >
> >> >Tom
> >>
> >>
>
>
>
