Hi Subroto
Use Scan.addColumn only when you want specific columns to be retrieved in
the scan. The javadoc for the addColumn() and addFamily() methods explains this.
scan.addColumn(Bytes.toBytes(column), Bytes.toBytes(""));
When you do this, the empty qualifier name is added to the list of columns to
be retrieved for that family, so the scan returns only cells whose qualifier
is empty. Your table has no such cells (only cf:a, cf:b and cf:c), so no rows
come back.
If you want to retrieve all the data, avoid both addColumn and addFamily.
If you want specific CFs to be retrieved, use addFamily.
If you want only specific qualifiers within a CF, use addColumn.
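Here is a minimal sketch of that selection rule in plain Java, so it runs without the HBase client or a cluster. The class ScanSelectionSketch and its methods are hypothetical stand-ins that only mimic how addFamily and addColumn narrow what a scan returns; this is not HBase code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical, simplified model of a scan's column selection (not HBase code). */
class ScanSelectionSketch {
    // family -> requested qualifiers; a null value means "every qualifier in this family"
    private final Map<String, Set<String>> familyMap = new TreeMap<>();

    /** Select a whole column family. */
    ScanSelectionSketch addFamily(String family) {
        familyMap.put(family, null);
        return this;
    }

    /** Select one qualifier within a family. */
    ScanSelectionSketch addColumn(String family, String qualifier) {
        Set<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            qualifiers = new TreeSet<>();
            familyMap.put(family, qualifiers);
        }
        qualifiers.add(qualifier);
        return this;
    }

    /** True if a cell with this family and qualifier would be returned. */
    boolean matches(String family, String qualifier) {
        if (familyMap.isEmpty()) {
            return true; // nothing selected: everything is returned
        }
        if (!familyMap.containsKey(family)) {
            return false;
        }
        Set<String> qualifiers = familyMap.get(family);
        return qualifiers == null || qualifiers.contains(qualifier);
    }

    /** Count how many of the given {family, qualifier} cells would be returned. */
    int countMatches(List<String[]> cells) {
        int count = 0;
        for (String[] cell : cells) {
            if (matches(cell[0], cell[1])) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The three cells of the 'test' table from this thread.
        List<String[]> cells = Arrays.asList(
            new String[] {"cf", "a"},
            new String[] {"cf", "b"},
            new String[] {"cf", "c"});

        System.out.println("no selection:        "
            + new ScanSelectionSketch().countMatches(cells));
        System.out.println("addFamily(cf):       "
            + new ScanSelectionSketch().addFamily("cf").countMatches(cells));
        System.out.println("addColumn(cf, a):    "
            + new ScanSelectionSketch().addColumn("cf", "a").countMatches(cells));
        System.out.println("addColumn(cf, \"\"):   "
            + new ScanSelectionSketch().addColumn("cf", "").countMatches(cells));
    }
}
```

With the real client, the first three cases correspond to making no addFamily/addColumn calls, calling scan.addFamily(Bytes.toBytes("cf")), and calling scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("a")). The last case is the one from your code, which selects nothing from this table.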
-Anoop-
________________________________________
From: Subroto [[email protected]]
Sent: Thursday, June 14, 2012 1:35 PM
To: [email protected]
Subject: Re: TableRecordReaderImpl is not able to get the rows
Hi Sonal,
Thanks for your suggestion… I checked my code and found out that I was doing
something like this:
for (String column : columns) {
    scan.addColumn(Bytes.toBytes(column), Bytes.toBytes(""));
}
The same scan object was being set in the configuration. After removing this
piece of code, the MR Job worked fine.
One more doubt:
Scan.addColumn takes two arguments:
public Scan addColumn(byte[] family, byte[] qualifier)
The first one is the column name in bytes. What should be passed as the second
argument if I want the qualifier to be ignored (I mean, to include any
qualifier)?
My HBase table looks like this:
ROW     COLUMN+CELL
row1    column=cf:a, timestamp=1339581548508, value=value1
row2    column=cf:b, timestamp=1339581557585, value=value2
row3    column=cf:c, timestamp=1339581566227, value=value3
Thanks again for the correct pointers…. :-)
Cheers,
Subroto Sanyal
On Jun 14, 2012, at 9:07 AM, Sonal Goyal wrote:
> Are you doing something specific with the RecordReader? Maybe you can post
> more of your code as it is difficult to tell anything right now.
>
> Best Regards,
> Sonal
> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
> On Wed, Jun 13, 2012 at 7:28 PM, Subroto <[email protected]> wrote:
>
>> Hi Sonal,
>>
>> The Scan is being created by:
>> void org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(Configuration configuration)
>> I am not providing any other scan options…. :-(
>>
>> Cheers,
>> Subroto Sanyal
>>
>> On Jun 13, 2012, at 1:30 PM, Sonal Goyal wrote:
>>
>>> Hi Subroto,
>>>
>>> How are you configuring your job? Are you providing any Scan options?
>>> Check Chapter 7 of the ref guide at
>>>
>>> http://hbase.apache.org/book/mapreduce.example.html
>>>
>>> Best Regards,
>>> Sonal
>>>
>>> On Wed, Jun 13, 2012 at 4:47 PM, Subroto <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a table with details:
>>>> hbase(main):024:0> scan 'test'
>>>> ROW     COLUMN+CELL
>>>> row1    column=cf:a, timestamp=1339581548508, value=value1
>>>> row2    column=cf:b, timestamp=1339581557585, value=value2
>>>> row3    column=cf:c, timestamp=1339581566227, value=value3
>>>> 3 row(s) in 0.0200 seconds
>>>>
>>>> When my MR job uses TableRecordReader,
>>>> org.apache.hadoop.hbase.client.HTable.ClientScanner.nextScanner sets the
>>>> "value" to null on every call, which makes my MR application think
>>>> that there are no records in the table.
>>>>
>>>> I am using HBase 0.92.1 and Hadoop 0.23.1.
>>>>
>>>> What can be the possible reason(s) behind it??
>>>>
>>>> Cheers,
>>>> Subroto Sanyal
>>
>>