Hi,

Below I have included both the Content Metadata and the Parse Metadata.

Can you suggest the rule for "contentType" as well as
metatag.content-Type? Are these taken from the header of the file? My HTML
file has only a description field.

__DUMP__

parsing: http://localhost/def.html
contentType: text/html
signature: d677f21eaccf7cc5ff4cb8484d9a8965
---------
Url
---------------
http://localhost/def.html
---------
ParseData
---------
Version: 5
Status: success(1,0)
Title:
Outlinks: 0
Content Metadata: ETag="40c43-2f-4d68fe096f698" Date=Mon, 25 Feb 2013
17:40:04 GMT Content-Length=47 Last-Modified=Mon, 25 Feb 2013 17:29:03 GMT
Content-Type=text/html; charset=UTF-8 Connection=close Accept-Ranges=bytes
Server=Apache/2.2.22 (Fedora)
Parse Metadata: CharEncodingForConversion=utf-8 OriginalCharEncoding=utf-8
metatag.description=¨text/html¨
---------
ParseText
---------

__END_DUMP
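
To make the question concrete, here is a minimal sketch in plain Python (not
Nutch code) of the distinction I am asking about: a value read from the HTTP
response header versus a value read from a <meta> tag inside the HTML. The
names MetaTagCollector and describe are hypothetical, the HTML string is my
guess at what def.html contains, and lowercasing the metatag key is my
assumption, not confirmed Nutch behaviour.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collects <meta> tags under keys shaped like metatag.<name> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.metatags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # A meta tag is identified either by name= or by http-equiv=.
        key = attr.get("name") or attr.get("http-equiv")
        content = attr.get("content")
        if key and content is not None:
            # Lowercasing the key is an assumption made for this sketch.
            self.metatags["metatag." + key.lower()] = content


def describe(headers, html):
    """Return the header-derived content type plus any meta tags found in the HTML."""
    collector = MetaTagCollector()
    collector.feed(html)
    # Keep only the media type, dropping parameters such as charset.
    content_type = headers.get("Content-Type", "").split(";")[0].strip()
    return {"contentType": content_type, **collector.metatags}


# Header value copied from the Content Metadata in the dump above.
headers = {"Content-Type": "text/html; charset=UTF-8"}

# Assumed page: only a description meta tag, no http-equiv Content-Type tag.
html_description_only = (
    '<html><head><meta name="description" content="text/html"></head></html>'
)
result = describe(headers, html_description_only)

# Same page with an explicit http-equiv tag added.
html_with_http_equiv = (
    '<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
    '<meta name="description" content="text/html"></head></html>'
)
result_with_tag = describe(headers, html_with_http_equiv)
```

In this sketch the first page yields a contentType (from the header) and a
metatag.description, but no metatag.content-type, because the page has no such
tag. Only the second page produces one. Is that also how Nutch behaves?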


On Mon, Feb 25, 2013 at 8:29 PM, kiran chitturi <[email protected]> wrote:

> Hi Raja,
>
> Which Nutch version are you using? Can you check again with the parseChecker
> [1] tool?
>
> [1] - http://wiki.apache.org/nutch/bin/nutch%20parsechecker
>
>
>
> On Mon, Feb 25, 2013 at 9:32 AM, Raja Kulasekaran <[email protected]>
> wrote:
>
> > Hi,
> >
> > I am unable to get the value of ContentType as well as
> > metatag.Content-Type.
> >
> > Can you please suggest the correct way to get this value?
> >
> > Raja
> >
>
>
>
> --
> Kiran Chitturi
>
