Hi,

I tried *bin/nutch org.apache.nutch.parse.ParserChecker
http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg*
using the latest trunk from SVN and I am getting

---------
> Version: 5
> Status: success(1,0)
> Title:
> Outlinks: 0
> Content Metadata: ETag="15dab-8280a1c0" Date=Mon, 17 May 2010 13:55:16 GMT
> Content-Length=89515 Expires=Mon, 26 Jul 2010 13:55:16 GMT
> Last-Modified=Mon, 26 Jan 2009 13:13:51 GMT Content-Type=image/jpeg
> Connection=close Accept-Ranges=bytes Server=Apache/2.2.3 (Debian)
> PHP/5.2.0-8+etch16 Cache-Control=max-age=6048000
> Parse Metadata: Software=Adobe Photoshop CS2 Windows Number of Components=3
> Orientation=Top, left side (Horizontal / normal) Color Space=sRGB Image
> Height=156 pixels Data Precision=8 bits Exif Image Width=992 pixels
> Component 1=Y component: Quantization table 0, Sampling factors 1 horiz/1
> vert Component 2=Cb component: Quantization table 1, Sampling factors 1
> horiz/1 vert Compression=JPEG (old-style) Component 3=Cr component:
> Quantization table 1, Sampling factors 1 horiz/1 vert Date/Time=2009:01:26
> 14:05:22 X Resolution=72 dots per inch Thumbnail Offset=302 bytes Exif Image
> Height=156 pixels Thumbnail Length=3259 bytes Resolution Unit=Inch Image
> Width=992 pixels Thumbnail Data=[3259 bytes of thumbnail data] Y
> Resolution=72 dots per inch
>

Could you try the command above?
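
If the check above succeeds on your side but the parse job still fails, it might help to tally which content types the log reports as having no parser. Here is a rough sketch in Python, assuming the log keeps the `parser not found for contentType=...` message format shown in your excerpt. The function name `count_unparsed_types` is purely illustrative and not part of Nutch.

```python
import re
from collections import Counter

# Matches the ParseException message from the quoted log excerpt, e.g.
# "parser not found for contentType=image/jpeg url=http://..."
# \s+ tolerates the message being wrapped across lines, as in the email.
PATTERN = re.compile(r"parser not found for\s+contentType=(\S+)")


def count_unparsed_types(log_text):
    """Return a Counter mapping each content type to the number of
    'parser not found' messages seen for it in log_text."""
    return Counter(PATTERN.findall(log_text))
```

You could feed it the contents of your Hadoop log file (the location depends on your setup), for example `count_unparsed_types(open(path).read())`, and print the result with `.most_common()`.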

J.
-- 
DigitalPebble Ltd
http://www.digitalpebble.com


On 17 May 2010 14:26, Markus Jelsma <[email protected]> wrote:

> Hi,
>
>
> It seems it still doesn't work after all. I updated all config files and the
> JPEG (and more new as it looks like). But the log still tells me it cannot
> find a suitable parser.
>
> ---------------
> 2010-05-17 15:20:06,636 WARN  parse.ParseUtil - No suitable parser found
> when trying to parse content
> http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg of type
> image/jpeg
> 2010-05-17 15:20:06,637 WARN  parse.Parser - Error parsing:
> http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg:
> org.apache.nutch.parse.ParseException: parser not found for
> contentType=image/jpeg
> url=http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg
>        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:74)
>        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:85)
>        at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:41)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> ---------------
>
>
> Cheers,
>
> On Monday 17 May 2010 14:37:54 Markus Jelsma wrote:
> > Hi,
> >
> >
> > I've got a copy of the nutch-2010-05-11_04-34-41 nightly build because I
> >  need Tika to parse JPEG images, and that should be in 1.1, as I read
> >  somewhere [1].
> >
> > ---------------
> > 2010-05-17 14:36:13,074 WARN  parse.ParseUtil - No suitable parser found
> >  when trying to parse content
> > http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg of type
> > image/jpeg
> > 2010-05-17 14:36:13,075 WARN  parse.Parser - Error parsing:
> > http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg:
> > org.apache.nutch.parse.ParseException: parser not found for
> > contentType=image/jpeg
> > url=http://www.fcgroningen.nl/uploads/media/hollabovenplaat_01.jpg
> > ---------------
> >
> >
> > [1]: http://lucene.472066.n3.nabble.com/Adding-jpeg-parser-to-nutch-td710135.html
> >
> > Cheers,
> >
> > Markus Jelsma - Technisch Architect - Buyways BV
> > http://www.linkedin.com/in/markus17
> > 050-8536620 / 06-50258350
> >
>
> Markus Jelsma - Technisch Architect - Buyways BV
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
>
>
