Hi Kathleen,

I think my previous messages were misinterpreted. In my previous message I was
talking about generating a separate put statement for each column family.
I am having a hard time understanding how this would violate the HBase
atomicity rule.

For instance, in the HBase shell, my put statements for two column families
would look like this:
hbase shell> put 'merchant_data', '1', 'info:name', 'starbucks'
hbase shell> put 'merchant_data', '1', 'user_reviews:id', '4545'

Similarly, this can be achieved with the HBase Java API, which Sqoop uses.
Is the above scenario not possible in the HBase Java API?
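
To make the question concrete, here is a minimal sketch of what I have in mind. It assumes the HBase 1.0+ client API (Connection, Table, Put.addColumn); older clients used HTable and Put.add instead. The class name MultiFamilyPutSketch and the helper buildPut are just names I made up for illustration, and this is not how Sqoop itself builds its puts. It reuses the table, row key, and columns from the shell example above, and a single Put for one row carries a cell in each column family:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical class name, used only for this illustration.
class MultiFamilyPutSketch {

    // Builds ONE Put for one row key that carries a cell in each of the
    // two column families from the shell example ('info' and 'user_reviews').
    static Put buildPut(String rowKey, String merchantName, String reviewId) {
        Put put = new Put(Bytes.toBytes(rowKey));
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"),
                Bytes.toBytes(merchantName));
        put.addColumn(Bytes.toBytes("user_reviews"), Bytes.toBytes("id"),
                Bytes.toBytes(reviewId));
        return put;
    }

    public static void main(String[] args) throws IOException {
        // Assumes hbase-site.xml is on the classpath and that the table
        // 'merchant_data' already exists with both column families.
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("merchant_data"))) {
            // One call sends the whole row mutation, covering both families.
            table.put(buildPut("1", "starbucks", "4545"));
        }
    }
}
```

My reading of the acid-semantics page is that a mutation within a single row is atomic even when it spans several column families, so I would expect a Put like this to be applied as a whole.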

Thanks,
Anil


On Wed, Feb 22, 2012 at 2:02 PM, Kathleen Ting <kathl...@cloudera.com> wrote:

> Hi Anil -
>
> Good question and sorry for any confusion earlier. To be sure, because
> HBase permits atomic operations across a single column family only, Sqoop
> can not support multiple column families.
>
> Regards, Kathleen
>
> On Wed, Feb 22, 2012 at 12:43 PM, anil gupta <anilg...@buffalo.edu> wrote:
>
>> Hi Kathleen,
>>
>> Yes, that is always an option. Thanks for the suggestion.
>>
>> I am a beginner at HBase. However, I was thinking of cutting down the
>> time it takes to dump the data from the database. If I do it twice
>> (assuming I have 2 column families), it increases the time to load the
>> entire HBase table. AFAIK, Sqoop generates put statements to import data
>> into HBase. If we generate put statements for more than one column family,
>> would it violate the atomicity principle of HBase? I went through the
>> atomicity section of http://hbase.apache.org/acid-semantics.html and I
>> can't find anything that would stop Sqoop from loading more than one
>> column family. HBase bulk load also allows more than one column family,
>> although the approach of HBase bulk loading might be different from
>> Sqoop's. Could you provide more insight? Sorry if my question is dumb.
>>
>> Thanks,
>> Anil Gupta
>>
>>
>> On Wed, Feb 22, 2012 at 11:51 AM, Kathleen Ting <kathl...@cloudera.com> wrote:
>>
>>> Hi Anil,
>>>
>>> Sqoop does not support multiple column families because HBase only
>>> permits atomic operations.
>>>
>>> One workaround is to run two imports, specifying a different column
>>> family each time.
>>>
>>> Regards,
>>> Kathleen
>>>
>>> On Wed, Feb 22, 2012 at 11:31 AM, anil gupta <anilgupt...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I went through the Sqoop User Guide but I could not find anything
>>>> about importing into more than one column family in HBase. Am I missing
>>>> something? Is it planned for a future release?
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Anil Gupta
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
>
>


-- 
Thanks & Regards,
Anil Gupta
