Hi Everyone,

I found my error: I was calling toString on fields that hold an int, float,
or long. But this leads me to another question.
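
For anyone who hits the same blank output, here is a minimal sketch of the mismatch. It assumes the numeric fields are stored as fixed-width big-endian bytes, which is the layout HBase's Bytes.toBytes(int) and Bytes.toBytes(long) produce; whether every field in 'f' is stored that way is an assumption on my part. It uses only java.nio so it runs without HBase on the classpath, and the FieldDecoder class and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Nutch, Gora, or HBase.
final class FieldDecoder {

    // Decode a 4-byte big-endian int (ByteBuffer is big-endian by default).
    public static int decodeInt(byte[] raw) {
        if (raw.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getInt();
    }

    // Decode an 8-byte big-endian long.
    public static long decodeLong(byte[] raw) {
        if (raw.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    // Decode a 4-byte big-endian IEEE 754 float.
    public static float decodeFloat(byte[] raw) {
        if (raw.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getFloat();
    }

    // Roughly what printing every value as a string does:
    // the raw bytes are interpreted as UTF-8 text.
    public static String decodeAsText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The int 2 stored as four big-endian bytes: 00 00 00 02.
        byte[] stored = ByteBuffer.allocate(4).putInt(2).array();

        // As text these are control characters, so the value looks blank.
        System.out.println("as text: [" + decodeAsText(stored) + "]");

        // Decoded with the matching type, the number comes back.
        System.out.println("as int:  " + decodeInt(stored));
    }
}
```

In the scan loop this would mean choosing the decoder per column qualifier instead of calling Bytes.toString on every value.
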
The protocol status is a nested structure, similar to parseStatus. How can we
parse these to get the individual majorcode, minorcode, and args?
Also, how can we detect whether a URL returned a 404, a 200, or some other
status code?
Thanks.

-----Original Message-----
From: Shah, Nishant 
Sent: Wednesday, May 29, 2013 1:51 PM
To: [email protected]
Subject: Extracting status code from hbase

Hi Everyone,

I have my Nutch 2.1 setup with HBase. Once I am done with the crawl, I want to
extract all the information from the column family 'f'.
For this I do:

Scan s = new Scan();
ResultScanner scanner = table.getScanner(s);
try {
  // Scanners return Result instances.
  // Now, for the actual iteration. One way is to use a while loop
  // like so:
  for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
    // print out the row we found and the columns we were looking
    // for
    System.out.println("Found row: " + rr);
    String[] rrs = getColumnsInColumnFamily(rr, "f");
    NavigableMap familyMap = rr.getFamilyMap(Bytes.toBytes("f"));
    Iterator entries = familyMap.entrySet().iterator();
    while (entries.hasNext()) {
      Entry thisEntry = (Entry) entries.next();
      Object key = thisEntry.getKey();
      Object val = thisEntry.getValue();
      System.out.println(Bytes.toString((byte[]) key) + "="
          + Bytes.toString((byte[]) val));
    }
  }
} finally {
  // Release the scanner's server-side resources.
  scanner.close();
}

The value for status is blank. It's not null, but blank. The same is true for
headers. The 'mtdt' family and the rest of the 'f' family are fine.
Can anyone suggest why this is happening?
Thanks,
Nishant
