RE: TableRecordReaderImpl is not able to get the rows

Anoop Sam John Thu, 14 Jun 2012 01:19:46 -0700

Hi Subroto,

Use Scan.addColumn when you want only specific columns to be retrieved in the 
scan. The javadoc for the addColumn() and addFamily() methods makes this clear.
scan.addColumn(Bytes.toBytes(column), Bytes.toBytes(""))
When you do this, an empty qualifier name is added to the list of columns to 
be retrieved for that family, so only cells with an empty qualifier are 
returned.


So if you want to retrieve the whole data, avoid both addColumn and addFamily.
If you want only specific CFs to be retrieved, go with addFamily.
Only if you want specific qualifiers within a CF, go for addColumn.
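
To make the three rules above concrete, here is a minimal sketch in plain Java. 
It is a toy model only: ToyScan is a hypothetical stand-in written for this 
illustration, not the HBase Scan class, and it uses Strings where HBase uses 
byte[]. It assumes a scan keeps a map from family to requested qualifiers, 
where an empty map means "everything" and a null qualifier set means "the whole 
family". The sample cells are the three rows from the table in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for a scan's column selection; NOT the HBase Scan class. */
class ToyScan {
    // family -> requested qualifiers; a null value means "every qualifier in this family"
    private final Map<String, Set<String>> familyMap = new HashMap<>();

    /** Models addFamily: request every column of the given family. */
    ToyScan addFamily(String family) {
        familyMap.put(family, null);
        return this;
    }

    /** Models addColumn: request exactly family:qualifier (an empty qualifier is still a qualifier). */
    ToyScan addColumn(String family, String qualifier) {
        Set<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            qualifiers = new TreeSet<>();
            familyMap.put(family, qualifiers);
        }
        qualifiers.add(qualifier);
        return this;
    }

    /** True if a cell stored under family:qualifier would be returned. */
    boolean selects(String family, String qualifier) {
        if (familyMap.isEmpty()) {
            return true; // nothing was added: the whole table is returned
        }
        if (!familyMap.containsKey(family)) {
            return false; // family was never requested
        }
        Set<String> qualifiers = familyMap.get(family);
        return qualifiers == null || qualifiers.contains(qualifier);
    }

    /** Keeps only the "family:qualifier" cell names this selection would return. */
    List<String> filter(List<String> cells) {
        List<String> out = new ArrayList<>();
        for (String cell : cells) {
            String[] parts = cell.split(":", 2);
            if (selects(parts[0], parts[1])) {
                out.add(cell);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("cf:a", "cf:b", "cf:c");
        System.out.println(new ToyScan().filter(table));                      // [cf:a, cf:b, cf:c]
        System.out.println(new ToyScan().addFamily("cf").filter(table));      // [cf:a, cf:b, cf:c]
        System.out.println(new ToyScan().addColumn("cf", "a").filter(table)); // [cf:a]
        System.out.println(new ToyScan().addColumn("cf", "").filter(table));  // []
    }
}
```

The last line mirrors the loop from the original code: asking for the empty 
qualifier matches none of cf:a, cf:b or cf:c, which is consistent with the MR 
job seeing no records.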

-Anoop-
________________________________________
From: Subroto [[email protected]]
Sent: Thursday, June 14, 2012 1:35 PM
To: [email protected]
Subject: Re: TableRecordReaderImpl is not able to get the rows

Hi Sonal,

Thanks for your suggestion… I checked my code and found out that I was doing 
something like this:
for (String column : columns) {
        scan.addColumn(Bytes.toBytes(column), Bytes.toBytes(""));
}

The same scan object was being set in the configuration. After removing this 
piece of code, the MR Job worked fine.

One more question:
Scan.addColumn takes two arguments:
public Scan addColumn(byte[] family,
                      byte[] qualifier)
The first one is the column family name in bytes. What should be passed as the 
second argument if I want the qualifier to be ignored (I mean, to include any 
qualifier)?
My HBase table looks like this:
ROW     COLUMN+CELL
 row1   column=cf:a, timestamp=1339581548508, value=value1
 row2   column=cf:b, timestamp=1339581557585, value=value2
 row3   column=cf:c, timestamp=1339581566227, value=value3

Thanks again for the correct pointers…. :-)

Cheers,
Subroto Sanyal
On Jun 14, 2012, at 9:07 AM, Sonal Goyal wrote:

> Are you doing something specific with the RecordReader? Maybe you can post
> more of your code as it is difficult to tell anything right now.
>
> Best Regards,
> Sonal
> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
>
>
>
>
> On Wed, Jun 13, 2012 at 7:28 PM, Subroto <[email protected]> wrote:
>
>> Hi Sonal,
>>
>> The Scan is being created by:
>> void
>> org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(Configuration
>> configuration)
>> I am not providing any other scan options…. :-(
>>
>> Cheers,
>> Subroto Sanyal
>>
>> On Jun 13, 2012, at 1:30 PM, Sonal Goyal wrote:
>>
>>> Hi Subroto,
>>>
>>> How are you configuring your job? Are you providing any Scan options?
>>> Check Chapter 7 of the ref guide at
>>>
>>> http://hbase.apache.org/book/mapreduce.example.html
>>>
>>> Best Regards,
>>> Sonal
>>> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
>>> Nube Technologies <http://www.nubetech.co>
>>>
>>> <http://in.linkedin.com/in/sonalgoyal>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jun 13, 2012 at 4:47 PM, Subroto <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a table with details:
>>>> hbase(main):024:0> scan 'test'
>>>> ROW     COLUMN+CELL
>>>>  row1   column=cf:a, timestamp=1339581548508, value=value1
>>>>  row2   column=cf:b, timestamp=1339581557585, value=value2
>>>>  row3   column=cf:c, timestamp=1339581566227, value=value3
>>>> 3 row(s) in 0.0200 seconds
>>>>
>>>> When my MR job tries to access TableRecordReader,
>>>> org.apache.hadoop.hbase.client.HTable.ClientScanner.nextScanner sets the
>>>> "value" to null on every call, which makes my MR application think
>>>> that there are no records in the table.
>>>>
>>>> I am using: 0.92.1 HBase and 0.23.1 Hadoop…..
>>>>
>>>> What can be the possible reason(s) behind it??
>>>>
>>>> Cheers,
>>>> Subroto Sanyal
>>
>>
