Re: HBase - stable versions
BTW, can somebody explain the function/purpose of 0.95.2? Does the community expect 0.95.2 to be used in a prod env, or does it have to be 0.96.0 for that? Also, I have some development hiccups with it (like cannot find the jar on the maven repo etc.; if somebody can provide pointers that would be great). Regards, - kiru
From: Ameya Kanitkar am...@groupon.com To: user@hbase.apache.org; Kiru Pakkirisamy kirupakkiris...@yahoo.com Cc: d...@hbase.apache.org Sent: Monday, September 9, 2013 5:02 PM Subject: Re: HBase - stable versions
We (Groupon) will also stick to 0.94 for the near future. Ameya
On Mon, Sep 9, 2013 at 4:03 PM, Kiru Pakkirisamy kirupakkiris...@yahoo.com wrote: When is the 0.96 release being planned? Right now we are testing against 0.95.2, as this does not seem to have the HBASE-9410 bug. Regards, - kiru
From: Enis Söztutar e...@apache.org To: hbase-user user@hbase.apache.org Cc: d...@hbase.apache.org Sent: Wednesday, September 4, 2013 6:20 PM Subject: Re: HBase - stable versions
As long as there is interest in 0.94, we will care for 0.94. However, when 0.96.0 comes out, it will be marked as the next stable release, so I expect that we would point newcomers to that branch. Any committer can propose any branch and release candidate any time, so if there are roadblocks for the 0.94.x mainline, you might as well propose a new branch. Enis
On Wed, Sep 4, 2013 at 4:29 PM, Varun Sharma va...@pinterest.com wrote: We, at Pinterest, are also going to stay on 0.94 for a while, since it has worked well for us and we don't have the resources to test 0.96 in the EC2 environment. That may change in the future but we don't know when...
On Wed, Sep 4, 2013 at 1:53 PM, Andrew Purtell apurt...@apache.org wrote: If LarsH is willing to stay on as RM for 0.94, then IMHO we should proceed as today, with the exception that 0.96 is what the stable symlink points to. As long as 0.94 has someone willing to RM and users such as Salesforce, there will be individuals there and in the community motivated to keep it in good working order with occasional point releases. We should not throw up roadblocks or adopt an arbitrary policy, as long as new features arrive in the branch as backports and the changes maintain our point release compatibility criteria (rolling restarts possible, no API regressions).
On Tue, Sep 3, 2013 at 5:30 PM, lars hofhansl la...@apache.org wrote: With 0.96 being imminent, we should start a discussion about continuing support for 0.94. 0.92 became stale pretty soon after 0.94 was released. The relationship between 0.94 and 0.96 is slightly different, though:
1. 0.92.x could be upgraded to 0.94.x without downtime
2. 0.92 clients and servers are mutually compatible with 0.94 clients and servers
3. the user-facing API stayed backward compatible
None of the above is true when moving from 0.94 to 0.96+. Upgrading from 0.94 to 0.96 will require a one-way upgrade process including downtime, and client and server need to be upgraded in lockstep. I would like to have an informal poll about who's using 0.94 and is planning to continue to use it, and who is planning to upgrade from 0.94 to 0.96. Should we officially continue support for 0.94? How long? Thanks. -- Lars
-- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Re: HBase - stable versions
That's Linux terminology: 0.95 is a developer release and should not go into production. When it is ready for production, it will be released as 0.96. 0.96 should be ready soon; tests (and fixes) are in progress. There is already a release candidate available, 0.96.0 RC0, and there should be a new release candidate soon as well :-). For details about the 0.96 RC0 see this thread: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.devel/39592 Cheers, Nicolas
On Tue, Sep 10, 2013 at 8:22 AM, Kiru Pakkirisamy kirupakkiris...@yahoo.com wrote: BTW, can somebody explain the function/purpose of 0.95.2? [...]
Command to delete based on column Family + rowkey
Dear All, The requirement is to delete all columns belonging to a particular column family for a particular rowkey. I have tried the command below, but the records are not getting deleted.

hbase> deleteall 't1', 'r1', 'c1'

Test result:

3) Scan the table 't'
hbase(main):025:0> scan 't'
ROW COLUMN+CELL
111 column=ONE:ename, timestamp=1378459582478, value=
111 column=ONE:eno, timestamp=1378459582335, value=1000
111 column=ONE:sal, timestamp=1378459582515, value=1500
111 column=TWO:ename, timestamp=1378459582655, value=
111 column=TWO:eno, timestamp=1378459582631, value=4000
222 column=ONE:ename, timestamp=1378459582702, value=
222 column=ONE:eno, timestamp=1378459582683, value=2000
222 column=ONE:sal, timestamp=1378459582723, value=2500
222 column=TWO:ename, timestamp=1378459582779, value=
222 column=TWO:eno, timestamp=1378459582754, value=4000
222 column=TWO:sal, timestamp=1378459582798, value=7500
333 column=ONE:ename, timestamp=1378459582880, value=sss
333 column=ONE:eno, timestamp=1378459582845, value=9000
333 column=ONE:sal, timestamp=1378459582907, value=6500
333 column=TWO:ename, timestamp=1378459582950, value=zzz
333 column=TWO:eno, timestamp=1378459582931, value=
333 column=TWO:sal, timestamp=1378459582968, value=6500
3 row(s) in 0.0440 seconds

4) Delete the records from the table 't' for the rowkey '333' in the column family 'TWO'
hbase(main):027:0> deleteall 't','333','TWO'
0 row(s) in 0.0060 seconds

5) After deleting, scan the table
hbase(main):028:0> scan 't'
ROW COLUMN+CELL
111 column=ONE:ename, timestamp=1378459582478, value=
111 column=ONE:eno, timestamp=1378459582335, value=1000
111 column=ONE:sal, timestamp=1378459582515, value=1500
111 column=TWO:ename, timestamp=1378459582655, value=
111 column=TWO:eno, timestamp=1378459582631, value=4000
222 column=ONE:ename, timestamp=1378459582702, value=
222 column=ONE:eno, timestamp=1378459582683, value=2000
222 column=ONE:sal, timestamp=1378459582723, value=2500
222 column=TWO:ename, timestamp=1378459582779, value=
222 column=TWO:eno, timestamp=1378459582754, value=4000
222 column=TWO:sal, timestamp=1378459582798, value=7500
333 column=ONE:ename, timestamp=1378459582880, value=sss
333 column=ONE:eno, timestamp=1378459582845, value=9000
333 column=ONE:sal, timestamp=1378459582907, value=6500
333 column=TWO:ename, timestamp=1378459582950, value=zzz
333 column=TWO:eno, timestamp=1378459582931, value=
333 column=TWO:sal, timestamp=1378459582968, value=6500
3 row(s) in 0.0310 seconds

Observation: no records got deleted. regards, Rams
Fastest way to get count of records in huge hbase table?
Dear All, Is there any fast way to get the count of records in a huge HBASE table with billions of records? The normal count command runs for an hour with this huge volume of data. regards, Rams
Re: Command to delete based on column Family + rowkey
Hey Rama, try this:

deleteall 't','333'

I hope it works for you! On Tue, Sep 10, 2013 at 1:31 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Dear All, The requirement is to delete all columns belonging to a particular column family for a particular rowkey. [...] -- Regards, Manish Dunani Contact No: +91 9408329137 Skype id: manish.dunani
Re: Fastest way to get count of records in huge hbase table?
Try the RowCounter MR job that comes with HBase. [1] http://hbase.apache.org/book/ops_mgt.html#rowcounter On Tue, Sep 10, 2013 at 1:37 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Dear All, Is there any fast way to get the count of records in a huge HBASE table with billions of records? [...] -- Ashwanth Kumar / ashwanthkumar.in
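For reference, RowCounter runs as a MapReduce job, so the scan is parallelized across the cluster; 'mytable' below is a placeholder table name. Raising the scanner cache also speeds up the plain shell count considerably:

  hbase org.apache.hadoop.hbase.mapreduce.RowCounter mytable

  hbase(main):001:0> count 'mytable', CACHE => 10000

Both still scan every row; they are just parallel (RowCounter) or better batched (CACHE) than the default shell count.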
Re: Command to delete based on column Family + rowkey
If you want to delete from a rowkey for a particular column family, then you need to mention each column individually:

delete 't','333','TWO:qualifier_name'

This will delete the records you are looking for. Please reply back if it does not work. On Tue, Sep 10, 2013 at 1:40 PM, manish dunani manishd...@gmail.com wrote: Hey Rama, try this: deleteall 't','333' [...]
Re: 0.95 Error in Connecting
(Redirected to the user mailing list; dev mailing list in bcc.) Various comments:
- You should not need to add the hadoop jars in your client application pom; they will come with hbase. But this should not be the cause of your issue.
- What does the server say in its logs?
- I'm surprised by this: Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT => HBase is built with version 3.4.5.
Maybe you can share the code and the logs on pastebin? Nicolas
On Tue, Sep 10, 2013 at 3:19 AM, David Williams dwilli...@truecar.com wrote: Hi all, I am working on an API demo that talks to hbase. Today I upgraded to 0.95 to get access to the hbase-client 0.95 libraries. I unpacked the 0.95 binaries on my system and started hbase. I logged into the Hbase shell, checked status, etc. Then I added the client libs for hadoop 1.2.1 and hbase 0.95 to my pom.xml and ran a unit test which checks if I can read and write a simple test value to a table which I created beforehand. The output is a stack trace and some timeouts; the IP addresses correspond to my machine on the local network. It then repeats this on the command line. What should I try next? My goal is simply to programmatically read and write to a local hbase on Mac OS X running in pseudo-distributed mode.
--- T E S T S ---
Running com.example.hbase.HConnectionTest
13/09/09 18:06:32 INFO annotation.ClassPathBeanDefinitionScanner: JSR-330 'javax.inject.Named' annotation found and supported for component scanning
13/09/09 18:06:32 INFO annotation.AnnotationConfigApplicationContext: Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@4b40de18: startup date [Mon Sep 09 18:06:32 PDT 2013]; root of context hierarchy
13/09/09 18:06:32 INFO annotation.ClassPathBeanDefinitionScanner: JSR-330 'javax.inject.Named' annotation found and supported for component scanning
13/09/09 18:06:32 INFO annotation.ClassPathBeanDefinitionScanner: JSR-330 'javax.inject.Named' annotation found and supported for component scanning
13/09/09 18:06:32 INFO annotation.AnnotationConfigApplicationContext: Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@27c549b9: startup date [Mon Sep 09 18:06:32 PDT 2013]; root of context hierarchy
13/09/09 18:06:32 INFO annotation.ClassPathBeanDefinitionScanner: JSR-330 'javax.inject.Named' annotation found and supported for component scanning
13/09/09 18:06:32 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
13/09/09 18:06:32 INFO support.DefaultListableBeanFactory: Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@594f8a87: defining beans [org.springframework.context.annotation.internalConfigurationAnnotationProcessor,org.springframework.context.annotation.internalAutowiredAnnotationProcessor,org.springframework.context.annotation.internalRequiredAnnotationProcessor,org.springframework.context.annotation.internalCommonAnnotationProcessor,config,org.springframework.context.annotation.ConfigurationClassPostProcessor.importAwareProcessor,properties,hTablePool,appHealthCheck,healthCheck,validate,jsonProvider,submit,hConnection,jaxRsServer,cxf,hadoopConfiguration,jaxRsApiApplication,decode]; root of factory hierarchy
13/09/09 18:06:32 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
13/09/09 18:06:32 INFO zookeeper.ZooKeeper: Client environment:host.name=10.14.49.129
13/09/09 18:06:32 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_25
13/09/09 18:06:32 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
13/09/09 18:06:32 INFO zookeeper.ZooKeeper: Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre
13/09/09 18:06:32 INFO zookeeper.ZooKeeper: Client
Re: Command to delete based on column Family + rowkey
Manish, I need to delete all the columns for a particular column family of a given rowkey... I don't want to specify the column names (qualifier names) one by one to delete. Please let me know if there is any way to delete like that... regards, Rams On Tue, Sep 10, 2013 at 2:06 PM, manish dunani manishd...@gmail.com wrote: If you want to delete from a rowkey for a particular column family, then you need to mention each column individually: delete 't','333','TWO:qualifier_name' [...]
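The Java client API can do this in a single call, without naming qualifiers: Delete.deleteFamily marks every cell of the family for that row. A minimal sketch against the 0.94-era client, reusing table 't', row '333', and family 'TWO' from the thread:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Delete;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.util.Bytes;

  public class DeleteFamilyForRow {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "t");
      try {
        Delete d = new Delete(Bytes.toBytes("333"));
        // Marks every cell of family TWO in row 333 as deleted; no qualifiers needed.
        d.deleteFamily(Bytes.toBytes("TWO"));
        table.delete(d);
      } finally {
        table.close();
      }
    }
  }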
How to convert a text or csv file into HFile format and load into HBASE
Dear All, Can you please share sample code (Java) to convert a text/CSV file into HFile format and load it into HBase using the HBase API? regards, Rams
Re: Command to delete based on column Family + rowkey
This?
hbase(main):002:0> help 'alter'
Alter column family schema; pass table name and a dictionary specifying new column family schema. Dictionaries are described on the main help command output. Dictionary must include name of column family to alter. For example, to change or add the 'f1' column family in table 't1' from defaults to instead keep a maximum of 5 cell VERSIONS, do:
hbase> alter 't1', NAME => 'f1', VERSIONS => 5
To delete the 'f1' column family in table 't1', do:
hbase> alter 't1', NAME => 'f1', METHOD => 'delete'
or a shorter version:
hbase> alter 't1', 'delete' => 'f1'
2013/9/10 Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com: Manish, I need to delete all the columns for a particular column family of a given rowkey... [...]
Re: How to convert a text or csv file into HFile format and load into HBASE
Hi Rams, Just to make sure, have you looked at importtsv? http://hbase.apache.org/book/ops_mgt.html#importtsv It expects tabs, but you can easily configure it to take commas or anything else. Also, any specific reason you want to go from CSV to HFiles to HBase, rather than straight from CSV to HBase? JM 2013/9/10 Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com: Dear All, Can you please share sample code (Java) to convert a text/CSV file into HFile format and load it into HBase using the HBase API? regards, Rams
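The usual non-programmatic path is the bundled ImportTsv tool plus the bulk loader. A sketch with placeholder table, columns, and paths; the separator override and the bulk-output option are standard ImportTsv flags:

  hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    '-Dimporttsv.separator=,' \
    -Dimporttsv.columns=HBASE_ROW_KEY,cf1:col1,cf1:col2 \
    -Dimporttsv.bulk.output=hdfs:///tmp/bulkout \
    mytable hdfs:///tmp/input.csv

  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs:///tmp/bulkout mytable

Without -Dimporttsv.bulk.output, ImportTsv writes directly to the table via Puts instead of producing HFiles.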
Re: HBase - stable versions
Nicolas, makes sense. Thanks for the explanation. Regards, - kiru From: Nicolas Liochon nkey...@gmail.com To: user user@hbase.apache.org; Kiru Pakkirisamy kirupakkiris...@yahoo.com Cc: d...@hbase.apache.org Sent: Tuesday, September 10, 2013 12:31 AM Subject: Re: HBase - stable versions That's Linux terminology: 0.95 is a developer release and should not go into production. [...]
Re: HBase - stable versions
We too will use 0.94 for the foreseeable future. On Tue, Sep 10, 2013 at 9:29 PM, Kiru Pakkirisamy kirupakkiris...@yahoo.com wrote: Nicolas, makes sense. Thanks for the explanation. Regards, - kiru [...]
Re: Tables gets Major Compacted even if they haven't changed
Major compactions can still be useful to improve locality - could we add a condition to check for that too? On Mon, Sep 9, 2013 at 10:41 PM, lars hofhansl la...@apache.org wrote: Interesting. I guess we could add a check to avoid major compactions if (1) no TTL is set or we can show that all data is newer, (2) there's only one file, and (3) there are no delete markers. All of these can be cheaply checked with some HFile metadata (we might have all the data needed already). That would take care of both of your scenarios. -- Lars From: Premal Shah premal.j.s...@gmail.com To: user user@hbase.apache.org Sent: Monday, September 9, 2013 9:02 PM Subject: Tables gets Major Compacted even if they haven't changed Hi, We have a bunch of tables in our HBase cluster. We have a script which makes sure all of them get major compacted once every 2 days. There are 2 things I'm observing: 1) Table X has not been updated in a month. We have not inserted, updated or deleted data. However, it still major compacts every 2 days. All the regions in this table have only 1 store file. 2) Table Y has a few regions where the rowkey is essentially a timestamp, so we only write to 1 region at a time. Over time, the region splits, and then we write to one of the split regions. Now, whenever we major compact the table, all regions get major compacted. Only 1 region has more than 1 store file; every other region has exactly one. Is there a way to avoid compaction of regions that have not changed? We are using HBase 0.94.11 -- Regards, Premal Shah.
HBASE and Zookeeper in parallel
Hi, I am writing a program that makes use of a zookeeper server (I used the queue implementation of Curator). In addition, the program has access to an HBASE database via Gora, and Hbase uses Zookeeper. My question is: does HBASE use the same zookeeper server that I am using for my queue implementation, or does it have a zookeeper server of its own? For example, can I stop my zk server without hurting hbase processing? Thanks Benjamin
Re: Zookeeper state for failed region servers
You won't have this directly. /hbase/rs contains the regionservers that are online. When a regionserver dies, hbase (or zookeeper if it's a silent failure) will remove it from this list. (And obviously this is internal to hbase and could change at any time :-) ). But technically you can do as hbase does and set a zookeeper watcher on this znode. What do you want to achieve exactly? On Tue, Sep 10, 2013 at 8:46 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net wrote: Could someone tell me what Zookeeper node to watch to know if any region servers are down currently and what the affected region list is? Thank you! -sudarshan
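A minimal sketch of such a watch with the plain ZooKeeper client, assuming the default /hbase root znode and a local quorum (both may differ in your setup). Diffing successive child lists tells you which server disappeared; mapping that back to an affected-region list would still require reading the META table:

  import java.util.List;
  import org.apache.zookeeper.WatchedEvent;
  import org.apache.zookeeper.Watcher;
  import org.apache.zookeeper.ZooKeeper;

  public class RegionServerWatcher {
    public static void main(String[] args) throws Exception {
      final ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, null);
      Watcher watcher = new Watcher() {
        public void process(WatchedEvent event) {
          try {
            // A child changed: some region server joined or left. Re-arm the watch.
            List<String> servers = zk.getChildren("/hbase/rs", this);
            System.out.println("Live region servers: " + servers);
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      };
      System.out.println("Live region servers: " + zk.getChildren("/hbase/rs", watcher));
      Thread.sleep(Long.MAX_VALUE); // keep the process alive to receive watch events
    }
  }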
Two concurrent programs using the same hbase
Hi, I installed hbase on a gpfs directory and launched it using bin/start-hbase.sh. Two servers on this gpfs filesystem run a similar program. This program accesses hbase via a GORA call: this.dataStore = DataStoreFactory.getDataStore(Long.class, Pageview.class, new Configuration()); However, when I launch the second program on the second server, I get the following exception: [java] Exception in thread main java.lang.RuntimeException: org.apache.gora.util.GoraException: java.lang.RuntimeException: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information. I do not know how to solve this. It is strange, since only *two* programs use this HBASE connection. Thanks a lot Benjamin
Re: Performance analysis in Hbase
Yeah, there isn't a whole lot of documentation about metrics. Could it be that you are still running on a default 1GB heap and you are pounding it with multiple clients? Try raising the heap size? FWIW I gave a presentation at HBaseCon with Kevin O'Dell about HBase operations which could shed some light: Video: http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/hbasecon-2013--apache-hbase-meet-ops-ops-meet-apache-hbase-video.html Slides: http://www.slideshare.net/cloudera/operations-session-6 J-D On Tue, Sep 10, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Can someone please throw some light on this aspect of Hbase? On Thu, Sep 5, 2013 at 11:04 AM, Vimal Jain vkj...@gmail.com wrote: Just to add more information, I found the following link which explains metrics related to the RS: http://hbase.apache.org/book.html#rs_metrics Is there any resource which explains these metrics in detail? (In the official guide, there is just one line for each metric.) On Thu, Sep 5, 2013 at 10:06 AM, Vimal Jain vkj...@gmail.com wrote: Hi, I am running Hbase in pseudo-distributed mode on top of HDFS. So far, it's been running fine. In the past I had some memory-related issues (long GC pauses). So I wanted to know if there is a way through the GUI (web UI on 60010/60030) or CLI (shell) to get the health of Hbase (with reference to its memory consumption, CPU starvation if any). Please provide some resources where I can look for this information. -- Thanks and Regards, Vimal Jain
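For reference, the heap is raised in hbase-env.sh; the value below is an arbitrary example, and the GC-logging line is optional but makes long pauses visible in a log you can correlate with the metrics (the log path is a placeholder):

  # hbase-env.sh
  export HBASE_HEAPSIZE=4000   # in MB; size to your workload
  export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/gc-hbase.log"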
Does hbase run with hadoop 2.1.0 beta?
Hi, I'm trying to run HBase with hadoop 2.1.0 beta. Which HBase version should I use? I made some tests with 0.95.2, compiling with the 2.0 profile, but I faced protobuf issues. Thanks, -- Marcos Sousa
Re: Tables gets Major Compacted even if they haven't changed
Thanks for the discussion, guys. @Anil, we have turned off major compaction in the settings. This is a script which is run manually to make sure all tables get major compacted every so often to increase data locality. In our case, there is some collateral damage of compacting unchanged regions. I was planning to rework the script to compact regions and not tables, by querying how many store files a region has and compacting if num_store_files > 1. Is that a good solution in the interim? On Tue, Sep 10, 2013 at 11:11 AM, Dave Latham lat...@davelink.net wrote: Major compactions can still be useful to improve locality - could we add a condition to check for that too? [...] -- Regards, Premal Shah.
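A region-level version of that script can read store-file counts from the cluster status instead of scanning table by table. A rough sketch against the 0.94 client API (majorCompact is asynchronous, and this deliberately ignores the locality point raised above):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.ClusterStatus;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HServerLoad;
  import org.apache.hadoop.hbase.ServerName;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class CompactChangedRegions {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);
      try {
        ClusterStatus status = admin.getClusterStatus();
        for (ServerName server : status.getServers()) {
          HServerLoad load = status.getLoad(server);
          for (HServerLoad.RegionLoad region : load.getRegionsLoad().values()) {
            if (region.getStorefiles() > 1) {
              // Queue a major compaction for just this region (fire-and-forget).
              admin.majorCompact(region.getName());
            }
          }
        }
      } finally {
        admin.close();
      }
    }
  }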
Re: Does hbase run with hadoop 2.1.0 beta?
0.96.0 should work. See this thread: http://search-hadoop.com/m/7W1PfyzHy51 On Tue, Sep 10, 2013 at 1:25 PM, Marcos Sousa marcoscaixetaso...@gmail.com wrote: Hi, I'm trying to run HBase with hadoop 2.1.0 beta. Which HBase version should I use? [...]
Re: HBASE and Zookeeper in parallel
Take a look at http://hbase.apache.org/book.html#zookeeper Cheers On Tue, Sep 10, 2013 at 12:11 PM, Sznajder ForMailingList bs4mailingl...@gmail.com wrote: Hi, I am writing a program that makes use of a zookeeper server (I used the queue implementation of Curator). [...]
Strange behavior of blockCacheSize metric with LruBlockCache
When enabling the direct memory allocation [HBASE-4027] I'm observing strange values for the blockCacheSize metric. With the usual cache, values grow slowly to around 1 GB. When I enable direct allocation (~4 GB off-heap), it looks like this: 1 GB, 4 GB, 1, 4, 1, 4... Looking at the code, the double cache just adds the values of both caches. I don't think my cache is evicting/filling that quickly, and looking at the debug logs nothing is happening (very low cache activity). Can anyone explain this metric under such circumstances? Perhaps this should be CC'd to the dev mailing list? -- Adrien Mogenet http://www.borntosegfault.com
Re: Does hbase run with hadoop 2.1.0 beta?
HBase 0.96 will work with Hadoop 2.1.0. Hadoop 2.1.0 changed protobuf versions. 0.95.X had the older version of protobuf that's incompatible with the one used in hadoop 2.1.0. On Tue, Sep 10, 2013 at 1:25 PM, Marcos Sousa marcoscaixetaso...@gmail.com wrote: Hi, I'm trying to run Hbase with hadoop 2.1.0 beta. Witch hbase version should I use? I made some tests with 0.95.2 compiling with 2.0 profile but I faced protobuf issues. Thanks, -- Marcos Sousa
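For reference, the Hadoop 2 build is selected with a Maven profile flag; the profile id below is the one used by the 0.95/0.96-era pom, so verify it against your checkout:

  mvn clean install -DskipTests -Dhadoop.profile=2.0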
Re: Does hbase run with hadoop 2.1.0 beta?
Yes, I noticed that... Thanks, I'm compiling right now. On Tue, Sep 10, 2013 at 5:35 PM, Elliott Clark ecl...@apache.org wrote: HBase 0.96 will work with Hadoop 2.1.0. Hadoop 2.1.0 changed protobuf versions. [...] -- Marcos Sousa www.marcossousa.com
Re: HBASE and Zookeeper in parallel
Hi Benjamin, It depends on whether you set HBASE_MANAGES_ZK in your hbase-env.sh: https://hbase.apache.org/book/zookeeper.html -Ivan On Tue, Sep 10, 2013 at 10:11:40PM +0300, Sznajder ForMailingList wrote: Hi, I am writing a program that makes use of a zookeeper server (I used the queue implementation of Curator). [...]
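The setting looks like this; with false, you also point HBase at your external quorum via hbase.zookeeper.quorum in hbase-site.xml:

  # hbase-env.sh
  # true  -> HBase starts and stops its own bundled ZooKeeper
  # false -> HBase uses an external ZooKeeper ensemble that you manage
  export HBASE_MANAGES_ZK=false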
Please welcome our newest committer, Nick Dimiduk
Hi, Please join me in welcoming Nick as our new addition to the list of committers. Nick is exceptionally good with user-facing issues, and has made major contributions in mapreduce-related areas, hive support, as well as 0.96 issues and the new and shiny data types API. Nick, as is tradition, feel free to do your first commit to add yourself to the pom.xml. Cheers, Enis
deploy salesforce phoenix coprocessor to hbase/lib??
Hi, Since this is not an hbase system-level jar (it is more like user code), should we deploy it under hbase/lib? It seems we can use alter to add the coprocessor for a particular user table, so I could put the jar file in any place that is accessible, e.g. hdfs:/myPath? My customer said there is no need to run the 'alter' command; instead, as long as I put the jar into hbase/lib, then when the phoenix client makes a read call, it will add the coprocessor attr to the table being read. That seems suspicious to me. Does the phoenix client call an alter under the covers for the client already? Anyone know about this? Thanks Tian-Ying
Re: Please welcome our newest committer, Nick Dimiduk
On Tue, Sep 10, 2013 at 3:54 PM, Enis Söztutar e...@apache.org wrote: Hi, Please join me in welcoming Nick as our new addition to the list of committers. [...] One of us [1] St.Ack 1. http://www.youtube.com/watch?v=bBXyB7niEc0
Re: deploy salesforce phoenix coprocessor to hbase/lib??
When a table is created with Phoenix, its HBase table is configured with the Phoenix coprocessors. We do not specify a jar path, so the Phoenix jar that contains the coprocessor implementation classes must be on the classpath of the region server. In addition to coprocessors, Phoenix relies on custom filters which are also in the Phoenix jar. In theory you could put the jar in HDFS, use the relatively new HBase feature to load custom filters from HDFS, and issue alter table calls for existing Phoenix HBase tables to reconfigure the coprocessors. When new Phoenix tables are created, though, they wouldn't have this jar path. FYI, we're looking into modifying our install procedure to do the above (see https://github.com/forcedotcom/phoenix/issues/216), if folks are interested in contributing. Thanks, James On Sep 10, 2013, at 2:41 PM, Tianying Chang tich...@ebaysf.com wrote: Hi, Since this is not an hbase system-level jar (it is more like user code), should we deploy it under hbase/lib? [...]
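For existing tables, such a reconfiguration would be a shell alter of the table's coprocessor attribute, whose value has the form 'jar-path|class|priority|args'. A sketch; the table name is a placeholder and the class name is illustrative only (check the Phoenix jar for the exact coprocessor classes it configures):

  hbase> alter 'MY_TABLE', METHOD => 'table_att',
    'coprocessor' => 'hdfs:///user/hbase/phoenix.jar|com.salesforce.phoenix.coprocessor.ScanRegionObserver|1|'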
Zookeeper state for failed region servers
Could someone tell me what Zookeeper node to watch to know if any region servers are down currently and what the affected region list is? Thank you! -sudarshan
Re: Please welcome our newest committer, Nick Dimiduk
Thank you everyone! On Tue, Sep 10, 2013 at 3:54 PM, Enis Söztutar e...@apache.org wrote: Hi, Please join me in welcoming Nick as our new addition to the list of committers. [...]
Re: Two concurrent programs using the same hbase
Hi Benjamin, Are you able to insert data through the hbase shell? How big is your zookeeper quorum? Are you running zookeeper as a separate process? Maybe you should just allow more connections to your zookeeper process through its configuration file. Renato M. 2013/9/10 Sznajder ForMailingList bs4mailingl...@gmail.com: Hi, I installed hbase on a gpfs directory and launched it using bin/start-hbase.sh. [...]
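If HBase manages ZooKeeper, the connection limit is raised via hbase-site.xml; a sketch, with the value to be tuned to your client count. For an externally managed ZooKeeper, set maxClientCnxns in zoo.cfg instead:

  <property>
    <name>hbase.zookeeper.property.maxClientCnxns</name>
    <value>300</value>
  </property>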
Re: Please welcome our newest committer, Nick Dimiduk
Congrats Nick, great to have you on board! - Original Message - From: Enis Söztutar e...@apache.org To: d...@hbase.apache.org; hbase-user user@hbase.apache.org Sent: Tuesday, September 10, 2013 3:54 PM Subject: Please welcome our newest committer, Nick Dimiduk [...]
Re: Fastest way to get count of records in huge hbase table?
There is no fast way to get the count of records of a table without scanning and counting, especially when you want an accurate count. By design, the data/cells of a given record/row can be scattered across many different HFiles and the memstore, so even if we recorded the record count of each HFile as meta in its FileInfo, we would still need to de-dup to get an accurate total count, which can only be achieved by scanning. From: Ramasubramanian Narayanan [ramasubramanian.naraya...@gmail.com] Sent: September 10, 2013, 16:07 To: user@hbase.apache.org Subject: Fastest way to get count of records in huge hbase table? Dear All, Is there any fast way to get the count of records in a huge HBASE table with billions of records? [...]
Re: Tables gets Major Compacted even if they haven't changed
That. And other parameters (like compression) might have been changed, too. We would need to check for that as well. From: Dave Latham lat...@davelink.net To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Tuesday, September 10, 2013 11:11 AM Subject: Re: Tables gets Major Compacted even if they haven't changed Major compactions can still be useful to improve locality - could we add a condition to check for that too? [...]
hbase table design
Hi all, Can anyone provide some HBase rowkey/table design examples or design-related information? I have referred to the official documentation and HBase: The Definitive Guide, but is there a more specific and detailed case study? Thank you -- In the Hadoop world, I am just a novice exploring the entire Hadoop ecosystem; I hope one day I can contribute my own code. YanBit yankunhad...@gmail.com
Re: hbase table design
Have you looked at http://hbase.apache.org/book.html#schema.casestudies ? On Tue, Sep 10, 2013 at 7:57 PM, kun yan yankunhad...@gmail.com wrote: Hi all, Can anyone provide some HBase rowkey/table design examples or design-related information? [...]
Re: hbase table design
Thank you. Before, I had only read the 6. HBase and Schema Design sections; I did not notice 6.11. Schema Design Case Studies. I should read more carefully. 2013/9/11 Ted Yu yuzhih...@gmail.com: Have you looked at http://hbase.apache.org/book.html#schema.casestudies ? [...] -- In the Hadoop world, I am just a novice exploring the entire Hadoop ecosystem; I hope one day I can contribute my own code. YanBit yankunhad...@gmail.com
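One classic case from those studies deserves a sketch here, since it also came up earlier in this digest: salting a monotonically increasing key (such as a timestamp) so writes spread across regions instead of hammering one. The bucket count and key layout below are illustrative, not prescriptive:

  import org.apache.hadoop.hbase.util.Bytes;

  public class SaltedKey {
    private static final int BUCKETS = 16; // illustrative bucket count

    // Prefix the key with a one-byte salt derived from the key itself, so
    // sequential timestamps land in different regions. Scans must then fan
    // out over all BUCKETS prefixes.
    static byte[] saltedRowKey(long timestamp) {
      byte salt = (byte) (timestamp % BUCKETS);
      return Bytes.add(new byte[] { salt }, Bytes.toBytes(timestamp));
    }
  }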
Import data from MySql to HBase using Sqoop2
Hi Guys, How do I import a MySQL table into an HBase table? I am using Sqoop2; when I try to import a table, it doesn't show HBase as a storage type.

Schema name: sqoop:000> create job --xid 12 --type import . . . . Boundary query: Output configuration Storage type: 0 : HDFS Choose:

Please guide me. How do I do this? -Dhanasekaran. Did I learn something today? If not, I wasted it.
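The transcript above matches what Sqoop2 offered at the time: HDFS is the only storage type listed. The classic Sqoop 1 client can write directly to HBase; a sketch with placeholder connection details, table, and column names:

  sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username myuser -P \
    --table employees \
    --hbase-table employees \
    --column-family cf \
    --hbase-row-key id \
    --hbase-create-table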
RE: Please welcome our newest committer, Nick Dimiduk
Congratulations Nick. From: lars hofhansl [la...@apache.org] Sent: Wednesday, September 11, 2013 7:30 AM To: d...@hbase.apache.org; hbase-user Subject: Re: Please welcome our newest committer, Nick Dimiduk Congrats Nick, great to have you on board! [...]
Re: Please welcome our newest committer, Nick Dimiduk
Congratulations Nick!!! On Wed, Sep 11, 2013 at 9:15 AM, rajeshbabu chintaguntla rajeshbabu.chintagun...@huawei.com wrote: Congratulations Nick. [...]
Re: Fastest way to get count of records in huge hbase table?
Use Phoenix (https://github.com/forcedotcom/phoenix) by doing the following: CREATE VIEW myHTableName (key VARBINARY NOT NULL PRIMARY KEY); SELECT COUNT(*) FROM myHTableName; As fenghong...@xiaomi.com said, you still need to scan the table, but Phoenix will do it in parallel and use a coprocessor and an internal scanner API to speed things up. Thanks, James @JamesPlusPlus On Tue, Sep 10, 2013 at 7:01 PM, 冯宏华 fenghong...@xiaomi.com wrote: There is no fast way to get the count of records of a table without scanning and counting. [...]
Re: Please welcome our newest committer, Nick Dimiduk
Congratulations, Nick!!! Keep up the great work. 2013/9/10 ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com: Congratulations Nick!!! [...] -- Marcos Ortiz Valmaseda Product Manager at PDVSA http://about.me/marcosortiz
How much HDFS storage space does an HBase table use?
Hi all, How can I find out how much HDFS storage space an HBase table is using? Is there a command, or can I see it somewhere in the HBase web page? (HBase version 0.94) -- In the Hadoop world, I am just a novice exploring the entire Hadoop ecosystem; I hope one day I can contribute my own code. YanBit yankunhad...@gmail.com
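Since table data lives under the HBase root directory on HDFS (/hbase by default in 0.94), a plain HDFS du answers this; 'mytable' is a placeholder name:

  hadoop fs -du /hbase/mytable    # per-region breakdown
  hadoop fs -dus /hbase/mytable   # total for the table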