Hi,

Below I have included both the Content Metadata and the Parse Metadata.
Can you suggest the rule for "contentType" as well as metatag.Content-Type? Is this value taken from the header of the file? My HTML file has only a description field.

__DUMP__
parsing: http://localhost/def.html
contentType: text/html
signature: d677f21eaccf7cc5ff4cb8484d9a8965
---------
Url
---------------
http://localhost/def.html
---------
ParseData
---------
Version: 5
Status: success(1,0)
Title:
Outlinks: 0
Content Metadata: ETag="40c43-2f-4d68fe096f698" Date=Mon, 25 Feb 2013 17:40:04 GMT Content-Length=47 Last-Modified=Mon, 25 Feb 2013 17:29:03 GMT Content-Type=text/html; charset=UTF-8 Connection=close Accept-Ranges=bytes Server=Apache/2.2.22 (Fedora)
Parse Metadata: CharEncodingForConversion=utf-8 OriginalCharEncoding=utf-8 metatag.description=¨text/html¨
---------
ParseText
---------
__END_DUMP

On Mon, Feb 25, 2013 at 8:29 PM, kiran chitturi <[email protected]> wrote:

> Hi Raja,
>
> Which Nutch version are you using ? Can you check again with parseChecker
> [1] tool ?
>
> [1] - http://wiki.apache.org/nutch/bin/nutch%20parsechecker
>
> On Mon, Feb 25, 2013 at 9:32 AM, Raja Kulasekaran <[email protected]>
> wrote:
>
> > Hi,
> >
> > I am unable to get the value of ContentType as well as
> > metatag.Content-Type.
> >
> > Can you please suggest me the correct way to get this value ?
> >
> > Raja
>
> --
> Kiran Chitturi

