Hello,
We are trying to use Nutch in our project. This is my first
project with Nutch and HBase.
I was able to make Nutch write to HBase. When I go into the hbase shell and
run the scan command, I see data.
I started writing a MapReduce job to get the data out of HBase. Our intention
is to do some massaging and write the cleaned data into an RDBMS.
In my map function I am not able to see the data that the scan
command shows.
Question: How do I read Nutch crawl data from HBase?
The map function is:
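// Imports used by this method:
import java.util.Map.Entry;
import java.util.NavigableMap;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;

protected void map(ImmutableBytesWritable rowkey, Result result, Context context) {
    // Nested layout: column family -> column qualifier -> timestamp -> value
    NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> map =
            result.getMap();
    for (Entry<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> columnFamilyEntry
            : map.entrySet()) {
        NavigableMap<byte[], NavigableMap<Long, byte[]>> columnMap =
                columnFamilyEntry.getValue();
        for (Entry<byte[], NavigableMap<Long, byte[]>> columnEntry : columnMap.entrySet()) {
            NavigableMap<Long, byte[]> cellMap = columnEntry.getValue();
            for (Entry<Long, byte[]> cellEntry : cellMap.entrySet()) {
                // Print the column qualifier and the cell value decoded as a string
                System.out.println(String.format("Key : %s, Value : %s",
                        Bytes.toString(columnEntry.getKey()),
                        Bytes.toString(cellEntry.getValue())));
            }
        }
    }
}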
I see the following in the console:
Key : st, Value :
Any help would be appreciated.
Thanks,
Murali