Hi Jorge,

It doesn't look like you're actually using Tika as a wrapper for your custom parser at all...
You would need to specify the correct Tika config by calling tikaConfig.getParser.

hth

On Wed, Jun 27, 2012 at 7:46 PM, Jorge Luis Betancourt Gonzalez <[email protected]> wrote:
> Hi all:
>
> I'm working on a custom parser plugin to generate thumbnails from images
> fetched with Nutch 1.4. I'm doing this because the thumbnails will be
> converted into a base64-encoded string and stored in a Solr backend.
>
> So I basically wrote a custom parser (to which I send all png images, for
> example). I enabled the plugin (image-thumbnail) in nutch-site.xml and set
> some custom properties to load the width and height of the thumbnail. I also
> set the alias in parse-plugins.xml and set the plugin to handle
> image/png files, also in that file.
>
> The plugin is being loaded, but every time I get a png image to parse I get
> this:
>
> Error parsing:
> http://localhost/sites/all/themes/octavitos/images/iconos/audiointernet.png:
> java.lang.NullPointerException
>     at org.apache.nutch.parse.ParserFactory.match(ParserFactory.java:388)
>     at org.apache.nutch.parse.ParserFactory.getExtension(ParserFactory.java:397)
>     at org.apache.nutch.parse.ParserFactory.matchExtensions(ParserFactory.java:296)
>     at org.apache.nutch.parse.ParserFactory.findExtensions(ParserFactory.java:262)
>     at org.apache.nutch.parse.ParserFactory.getExtensions(ParserFactory.java:234)
>     at org.apache.nutch.parse.ParserFactory.getParsers(ParserFactory.java:119)
>     at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:71)
>     at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:86)
>     at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:42)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>
> The thing is that I have put some log messages inside the getParse() method,
> but none of these messages are being logged in the hadoop.log file, so as far
> as I can tell the method is not being executed.
>
> Does anyone have any idea what I'm doing wrong?
>
> P.S.: I've attached the source of the ImageThumbnailParser.
>
> Greetings!

-- 
Lewis
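
For reference, the thumbnail step described in the quoted message (scale an image, then store it as a base64-encoded string) could look roughly like the sketch below. It is only an illustration under stated assumptions: it is not the attached ImageThumbnailParser, it does not implement the Nutch Parser interface or touch the plugin wiring that triggers the NullPointerException, and it assumes Java 8 or later for java.util.Base64. The names ThumbnailSketch, toBase64Thumbnail and samplePng are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper class; not part of Nutch or Tika.
public class ThumbnailSketch {

    static {
        // Keep ImageIO in memory so the sketch does not depend on a writable temp dir.
        ImageIO.setUseCache(false);
    }

    /** Scales the given image bytes to width x height and returns the PNG thumbnail as base64. */
    static String toBase64Thumbnail(byte[] imageBytes, int width, int height) throws IOException {
        BufferedImage source = ImageIO.read(new ByteArrayInputStream(imageBytes));
        if (source == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("Unsupported or corrupt image data");
        }
        BufferedImage thumb = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = thumb.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // Draw the source stretched (or shrunk) to the thumbnail size.
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(thumb, "png", out);
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    /** Builds a solid-colour PNG in memory so the sketch can run without any input file. */
    static byte[] samplePng(int width, int height) throws IOException {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            g.setColor(Color.ORANGE);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose();
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(img, "png", out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String encoded = toBase64Thumbnail(samplePng(64, 48), 16, 12);
        System.out.println(encoded.length() + " base64 characters");
    }
}
```

In a real plugin the width and height would presumably come from the custom properties mentioned in the quoted message, and the resulting string would be attached to the parse metadata before indexing; both of those steps are left out here.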

