Hi Kathleen,

Sorry for the delayed reply; I have been working on HBase rather than Sqoop. Here is some example code from the book "HBase: The Definitive Guide" showing that it is possible to load data into more than one column family through the Java API, which was exactly the point I was trying to make.
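To make the point concrete, here is a minimal, self-contained sketch. It is not the book's code and it does not call the HBase client. It only models how one source record could be turned into the cells of a single row mutation that spans two column families, then prints them as the equivalent shell puts. The table, row key, and families come from the shell example quoted further down; the class name, the source column names ("name", "review_id"), and the "family:qualifier" mapping format are assumptions made for illustration. With the real client, each cell would presumably be added to one Put for the row key and sent in a single call. As I read the ACID semantics page linked in the thread, a mutation on one row is atomic even when it crosses column families.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: models a multi-family row mutation without the HBase client.
public class MultiFamilyPutSketch {

    /** One cell of a row mutation: column family, qualifier, value. */
    static final class Cell {
        final String family;
        final String qualifier;
        final String value;

        Cell(String family, String qualifier, String value) {
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /**
     * Turns one source record into the cells of a single row mutation.
     * columnMapping maps a source column name to "family:qualifier"
     * (an assumed format, not a Sqoop option).
     */
    static List<Cell> buildRowCells(Map<String, String> record,
                                    Map<String, String> columnMapping) {
        List<Cell> cells = new ArrayList<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            String target = columnMapping.get(field.getKey());
            if (target == null) {
                continue; // source column has no HBase target: skip it
            }
            int sep = target.indexOf(':');
            if (sep <= 0 || sep == target.length() - 1) {
                throw new IllegalArgumentException(
                        "expected family:qualifier, got " + target);
            }
            cells.add(new Cell(target.substring(0, sep),
                               target.substring(sep + 1),
                               field.getValue()));
        }
        return cells;
    }

    /** Renders the cells as hbase shell put commands, one per cell. */
    static List<String> toShellPuts(String table, String rowKey, List<Cell> cells) {
        List<String> puts = new ArrayList<>();
        for (Cell c : cells) {
            puts.add(String.format("put '%s', '%s', '%s:%s', '%s'",
                    table, rowKey, c.family, c.qualifier, c.value));
        }
        return puts;
    }

    public static void main(String[] args) {
        // Source column -> family:qualifier (illustrative mapping).
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("name", "info:name");
        mapping.put("review_id", "user_reviews:id");

        // One record as it might come out of the database.
        Map<String, String> record = new LinkedHashMap<>();
        record.put("name", "starbucks");
        record.put("review_id", "4545");

        List<Cell> cells = buildRowCells(record, mapping);
        for (String put : toShellPuts("merchant_data", "1", cells)) {
            System.out.println(put);
        }
    }
}
```

Running main prints the same two puts as the shell example in the thread, one per column family, built from a single pass over the record. That is the reason a single import would not need to read the database twice.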
Have a look at these two classes:
https://github.com/larsgeorge/hbase-book/blob/master/ch04/src/main/java/util/HBaseHelper.java
https://github.com/larsgeorge/hbase-book/blob/master/ch04/src/main/java/filters/PrefixFilterExample.java

Please let me know if you have further questions.

Thanks,
Anil

On Fri, Feb 24, 2012 at 9:36 PM, Kathleen Ting <kathl...@cloudera.com> wrote:

> Hi Anil,
>
> re: Is the above scenario not possible in HBase Java API?
> I would suggest asking that on u...@hbase.apache.org.
>
> Thanks,
> Kathleen
>
> On Wed, Feb 22, 2012 at 2:26 PM, anil gupta <anilg...@buffalo.edu> wrote:
>
>> Hi Kathleen,
>>
>> I think my previous messages were misinterpreted. In my previous message I
>> was talking about generating a separate put statement for each column
>> family. I am having a hard time understanding how this would violate
>> the HBase atomicity rule.
>>
>> For instance, in the hbase shell my put statements for two column
>> families would look like this:
>> hbase shell> put 'merchant_data', '1', 'info:name', 'starbucks'
>> hbase shell> put 'merchant_data', '1', 'user_reviews:id', '4545'
>>
>> Similarly, this can be achieved by using the Java API of HBase, which
>> Sqoop is using. Is the above scenario not possible in the HBase Java API?
>>
>> Thanks,
>> Anil
>>
>> On Wed, Feb 22, 2012 at 2:02 PM, Kathleen Ting <kathl...@cloudera.com> wrote:
>>
>>> Hi Anil -
>>>
>>> Good question and sorry for any confusion earlier. To be sure, because
>>> HBase permits atomic operations across a single column family only, Sqoop
>>> cannot support multiple column families.
>>>
>>> Regards, Kathleen
>>>
>>> On Wed, Feb 22, 2012 at 12:43 PM, anil gupta <anilg...@buffalo.edu> wrote:
>>>
>>>> Hi Kathleen,
>>>>
>>>> Yes, that is always an option. Thanks for the suggestion.
>>>>
>>>> I am a beginner at HBase. However, I was thinking of cutting down the
>>>> time to dump the data from the database. If I do it twice (assuming I
>>>> have 2 column families), then it increases the time to load the entire
>>>> HBase table. AFAIK, Sqoop generates put statements to import data into
>>>> HBase. If we can generate put statements for more than one column
>>>> family, would it violate the atomicity principle of HBase? I went
>>>> through the atomicity section of
>>>> http://hbase.apache.org/acid-semantics.html and I can't find anything
>>>> which would stop Sqoop from loading more than one column family. HBase
>>>> bulk load also allows more than one column family, although the
>>>> approach of HBase bulk loading might be different from Sqoop's. Could
>>>> you provide me more insight? Sorry if my question is dumb.
>>>>
>>>> Thanks,
>>>> Anil Gupta
>>>>
>>>> On Wed, Feb 22, 2012 at 11:51 AM, Kathleen Ting <kathl...@cloudera.com> wrote:
>>>>
>>>>> Hi Anil,
>>>>>
>>>>> Sqoop does not support multiple column families because HBase only
>>>>> permits atomic operations.
>>>>>
>>>>> One workaround is to run two imports, specifying a different column
>>>>> family each time.
>>>>>
>>>>> Regards,
>>>>> Kathleen
>>>>>
>>>>> On Wed, Feb 22, 2012 at 11:31 AM, anil gupta <anilgupt...@gmail.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I went through the user guide of Sqoop but I could not find anything
>>>>>> about importing more than one column family into HBase. Am I missing
>>>>>> something?
>>>>>> Is it planned for a future release?
>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Anil Gupta
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Anil Gupta
>>
>> --
>> Thanks & Regards,
>> Anil Gupta

--
Thanks & Regards,
Anil Gupta