CfP 2013 Workshop on Middleware for HPC and Big Data Systems (MHPC'13)
we apologize if you receive multiple copies of this message

=== CALL FOR PAPERS
2013 Workshop on Middleware for HPC and Big Data Systems (MHPC '13)
as part of Euro-Par 2013, Aachen, Germany ===

Date: August 27, 2013
Workshop URL: http://m-hpc.org
Springer LNCS

SUBMISSION DEADLINE:
May 31, 2013 - LNCS full paper submission (rolling abstract submission)
June 28, 2013 - Lightning talk abstracts

SCOPE
Extremely large, diverse, and complex data sets are generated by scientific applications, the Internet, social media and other sources. Data may be physically distributed and shared by an ever larger community. Collecting, aggregating, storing and analyzing large data volumes presents major challenges, and processing such amounts of data efficiently has become a bottleneck for scientific discovery and technological advancement. In addition, making the data accessible, understandable and interoperable poses problems that remain unsolved. Novel middleware architectures, algorithms, and application development frameworks are required.

In this workshop we are particularly interested in original work at the intersection of HPC and Big Data with regard to middleware handling and optimizations. In scope are existing and proposed middleware for HPC and big data, including analytics libraries and frameworks.

The goal of this workshop is to bring together software architects, middleware and framework developers, and data-intensive application developers, as well as users from the scientific and engineering community, to exchange their experience in processing large datasets and to report their scientific achievements and innovative ideas. The workshop also offers a dedicated forum for these researchers to access the state of the art, to discuss problems and requirements, to identify gaps in current and planned designs, and to collaborate on strategies for scalable data-intensive computing.

The workshop will be one day in length, composed of 20 min paper presentations, each followed by a 10 min discussion session. Presentations may be accompanied by interactive demonstrations.
TOPICS
Topics of interest include, but are not limited to:
- Middleware including: Hadoop, Apache Drill, YARN, Spark/Shark, Hive, Pig, Sqoop, HBase, HDFS, S4, CIEL, Oozie, Impala, Storm and Hyracks
- Data-intensive middleware architecture
- Libraries/frameworks including: Apache Mahout, Giraph, UIMA and GraphLab
- NG databases including Apache Cassandra, MongoDB and CouchDB/Couchbase
- Schedulers including Cascading
- Middleware for optimized data locality/in-place data processing
- Data-handling middleware for deployment in virtualized HPC environments
- Parallelization and distributed processing architectures at the middleware level
- Integration with cloud middleware and application servers
- Runtime environments and system-level support for data-intensive computing
- Skeletons and patterns
- Checkpointing
- Programming models and languages
- Big Data ETL
- Stream processing middleware
- In-memory databases for HPC
- Scalability and interoperability
- Large-scale data storage and distributed file systems
- Content-centric addressing and networking
- Execution engines, languages and environments including CIEL/Skywriting
- Performance analysis and evaluation of data-intensive middleware
- In-depth analysis and performance optimizations in existing data-handling middleware, focusing on indexing and fast storage or retrieval between compute and storage nodes
- Highly scalable middleware optimized for minimal communication
- Use cases and experience with popular Big Data middleware
- Middleware security, privacy and trust architectures

DATES
Papers:
Rolling abstract submission
May 31, 2013 - Full paper submission
July 8, 2013 - Acceptance notification
October 3, 2013 - Camera-ready version due

Lightning Talks:
June 28, 2013 - Deadline for lightning talk abstracts
July 15, 2013 - Lightning talk notification

August 27, 2013 - Workshop date

TPC CHAIR
Michael Alexander (chair), TU Wien, Austria
Anastassios Nanos (co-chair), NTUA, Greece
Jie Tao (co-chair), Karlsruhe Institute of Technology, Germany
Lizhe Wang (co-chair), Chinese Academy of Sciences, China
Gianluigi Zanetti (co-chair), CRS4, Italy

PROGRAM COMMITTEE
Amitanand Aiyer, Facebook, USA
Costas Bekas, IBM, Switzerland
Jakob Blomer, CERN, Switzerland
William Gardner, University of Guelph, Canada
José Gracia, HPC Center of the University of Stuttgart, Germany
Zhenghua Guo, Indiana University, USA
Marcus Hardt, Karlsruhe Institute of Technology, Germany
Sverre Jarp, CERN, Switzerland
Christopher Jung, Karlsruhe Institute of Technology, Germany
Andreas Knüpfer, Technische Universität Dresden, Germany
Nectarios Koziris, National Technical University of Athens, Greece
Yan Ma, Chinese Academy of Sciences, China
Martin Schulz, Lawrence Livermore National Laboratory, USA
Viral Shah,
Re: While starting 3-node HBase cluster: WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null
Hi John,

bin/start-dfs is for starting Hadoop, right? Why are you trying to start that from the HBase master? You said that your Hadoop is already configured and working fine. You should start HBase with bin/start-hbase.sh, which will start the master and the regionservers.

Also, in your hosts file, please replace the 127.0.0.1 debian01 entry with the real IP. It will help.

Keep us posted.

JM

2013/4/24 John Foxinhead john.foxinh...@gmail.com:
I tried this way: I started Hadoop, and it's all OK, so I went on. I set HBASE_MANAGES_ZK=false so I could start ZooKeeper myself, try it, and check the problem. I changed the zookeeper.quorum to jobtracker,datanode1 instead of zookeeper1,zookeeper2 because the IPs are the same, but the hostnames reported in the log file were jobtracker and datanode1, while in zookeeper.quorum I had set zookeeper1,zookeeper2. I changed this property because the HBase documentation recommends setting the same hostnames that appear in the log file; otherwise the ZooKeeper nodes might not recognize themselves as quorum members and the master could have problems connecting. Anyway, I made this change on all the nodes. I started ZooKeeper on the ZooKeeper nodes, and it works now: when I log in with bin/hbase zkcli from the master node (which is not running ZooKeeper), it connects to datanode1:2181 (where ZooKeeper is running) and commands like ls / work. When I run bin/start-dfs from the master, the only output is "starting master". Looking at the log file, I noticed that the master sends a ZooKeeper request to 0.0.0.0:2181 and accepts a response from 127.0.0.1:2181, while bin/hbase zkcli connects to datanode1:2181. This is very strange, I think. What could be the reason?
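For reference, a minimal sketch of the two settings in question; the hostnames come from this thread and the IP is a placeholder, not a verified value. The quorum goes in conf/hbase-site.xml:

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>jobtracker,datanode1</value>
  </property>

and the loopback alias in /etc/hosts should give way to the machine's real address, e.g.:

  192.168.1.10   debian01

so that the master advertises a reachable IP instead of 127.0.0.1.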
Re: undefined method `internal_command' for Shell::Formatter::Console
Hi Robin,

Were you finally able to find the issue?

JM

2013/4/18 Robin Gowin landr...@gmail.com:
Same results with @null (I had earlier tried nil, same thing).

hbase(main):045:0> uu = @hbase.table('robin1', @null)
=> Hbase::Table - robin1
hbase(main):046:0> uu.scan(ss)
NoMethodError: undefined method `internal_command' for nil:NilClass

One thing I'm curious about - might not matter - the output of my @hbase.table command looks like this:
=> Hbase::Table - robin1
but the output of yours (and what is in the book) looks like this:
=> #<Hbase::Table:0x3a8cbb70 ...>

On Thu, Apr 18, 2013 at 12:17 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Interesting... I tried the same locally and it's working fine for me.

hbase(main):010:0> uu = @hbase.table('TestAcidGuarantees', @formatter)
=> #<Hbase::Table:0x3a8cbb70 @table=#<Java::OrgApacheHadoopHbaseClient::HTable:0x6d65d417 ...>>
hbase(main):011:0> ss = {COLUMNS => ['A']}
=> {"COLUMNS"=>["A"]}
hbase(main):012:0> uu.scan(ss)
=> {"test_row_0"=>{"A:col0"=>"timestamp=1366299718358, value=\\x14\\xC2\\xF0\\x0

I did a cut-and-paste from what you sent and only changed the table name. Can you try with @null instead of @formatter?

JM

2013/4/18 Robin Gowin landr...@gmail.com:
Hi Jean-Marc,

Thanks for your quick reply. Yes, I am trying to do something like that. For brevity I combined everything into one JRuby command. My command can be split into two and I get the same error. For example, this shows a similar problem using the scan method:

hbase(main):041:0> uu = @hbase.table('robin1', @formatter)
=> Hbase::Table - robin1
hbase(main):042:0> ss = {COLUMNS => ['cf1']}
=> {"COLUMNS"=>["cf1"]}
hbase(main):043:0> uu.scan(ss)
NoMethodError: undefined method `internal_command' for #<Shell::Formatter::Console:0x15f6ae4d>
hbase(main):044:0> scan 'robin1', ss
ROW        COLUMN+CELL
 myrow1    column=cf1:q1, timestamp=1366046037514, value=value2
 myrow1    column=cf1:q2, timestamp=1366046489446, value=value2b
 myrow1    column=cf1:q2b, timestamp=1366046497799, value=value2bb
 myrow2    column=cf1:q2b, timestamp=1366046731281, value=value2bbce
 myrow2    column=cf1:q2be, timestamp=1366046748001, value=value2bbce
2 row(s) in 0.0460 seconds

On Thu, Apr 18, 2013 at 11:54 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Robin,

I'm not sure about your command line (@hbase.table('robin1', @formatter).scan({'COLUMNS' => ['cf1']})). Are you trying to do something like this?

scan 'robin1', {COLUMNS => ['cf1']}

JM

2013/4/18 Robin Gowin landr...@gmail.com:
This feels like a stupid mistake I'm making somewhere, but I searched for quite a while and did not find any evidence that anybody else reported this problem. I'm trying to use the HBase shell to call the scan() method and I keep getting the same error message. A regular scan of the table works fine. I'd appreciate any assistance.
hbase(main):005:0> scan 'robin1'
ROW        COLUMN+CELL
 myrow1    column=cf1:q1, timestamp=1366046037514, value=value2
 myrow1    column=cf1:q2, timestamp=1366046489446, value=value2b
 myrow1    column=cf1:q2b, timestamp=1366046497799, value=value2bb
 myrow2    column=cf1:q2b, timestamp=1366046731281, value=value2bbce
 myrow2    column=cf1:q2be, timestamp=1366046748001, value=value2bbce
2 row(s) in 0.1290 seconds

hbase(main):007:0> @hbase.table('robin1', @formatter).scan({'COLUMNS' => ['cf1']})
NoMethodError: undefined method `internal_command' for #<Shell::Formatter::Console:0x15f6ae4d>

This method appears to exist:

[cloudera@localhost test]$ grep internal_command /usr/lib/hbase/lib/ruby/shell.rb
        internal_command(command, :command, *args)
      def internal_command(command, method_name = :command, *args)

Info about my environment:

[cloudera@localhost test]$ hbase version
13/04/18 11:17:33 INFO util.VersionInfo: HBase 0.94.2-cdh4.2.0
13/04/18 11:17:33 INFO util.VersionInfo: Subversion file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hbase-0.94.2-cdh4.2.0 -r Unknown
13/04/18 11:17:33 INFO util.VersionInfo: Compiled by jenkins on Fri Feb 15 11:51:18 PST 2013
[cloudera@localhost test]$ java -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01,
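If the goal is just the scan itself rather than poking at the shell internals, the equivalent operation through the Java client sidesteps the formatter issue entirely. A minimal sketch against the 0.94-era API; the table and column family names are taken from the thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
        HTable table = new HTable(conf, "robin1");
        Scan scan = new Scan();
        scan.addFamily(Bytes.toBytes("cf1")); // restrict the scan to one column family
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                System.out.println(r); // one Result per row
            }
        } finally {
            scanner.close();
            table.close();
        }
    }
}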
Re: How to remove all traces of a dropped table.
Hi David,

After you dropped your table, did you look into the ZK server to see if all the znodes related to this table were removed too? Also, have you tried running HBCK after the drop to see if your system is fine?

JM

2013/4/16 David Koch ogd...@googlemail.com:
Hello,

We had problems with not being able to scan over a large (~8k regions) table, so we disabled and dropped it and decided to re-import data from scratch into a table with the SAME name. This never worked, and I list some log extracts below. The only way to make the import go through was to import into a table with a different name. Hence my question: how do I remove all traces of a table which was dropped?

Our cluster consists of 30 machines, running CDH4.0.1 with HBase 0.92.1.

Thank you,

/David

Log stuff: the mapper job reads text and the output are Puts. A couple of minutes into the job it fails with the following message in the task log:

2013-04-16 17:11:16,918 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table: java.io.IOException: HRegionInfo was null or empty in Meta for my_table, row=my_table,\xC1\xE7T\x01a8OM\xB0\xCE/\x97\x88\xB7y,99 [repeated 9 times]
2013-04-16 17:11:16,924 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-16 17:11:16,926 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:jenkins (auth:SIMPLE) cause:java.io.IOException: HRegionInfo was null or empty in .META., row=keyvalues={my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:server/1366035344492/Put/vlen=22, my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:serverstartcode/1366035344492/Put/vlen=8}
2013-04-16 17:11:16,926 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: HRegionInfo was null or empty in .META., row=keyvalues={my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:server/1366035344492/Put/vlen=22, my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:serverstartcode/1366035344492/Put/vlen=8}
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:957)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:818)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1524)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:943)
        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:820)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:795)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:121)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:533)
        at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:88)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
        at com.mycompany.data.tools.export.Export2HBase$JsonImporterMapper.map(Export2HBase.java:81)
        at com.mycompany.data.tools.export.Export2HBase$JsonImporterMapper.map(Export2HBase.java:50)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
2013-04-16 17:11:16,929 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

The master server contains stuff like this:

WARN org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is empty in keyvalues={my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:server/1366035344492/Put/vlen=22, my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:serverstartcode/1366035344492/Put/vlen=8}

We tried pre-splitting the table,
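As a concrete checklist for JM's two suggestions above, roughly (znode names below are the typical 0.92-era defaults and may differ on your deployment):

[user@master hbase]$ bin/hbase zkcli
[zk: localhost:2181(CONNECTED) 0] ls /hbase
[zk: localhost:2181(CONNECTED) 1] ls /hbase/table    # a leftover znode named after the dropped table is suspicious

[user@master hbase]$ bin/hbase hbck                  # reports inconsistencies such as orphan .META. rows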
Re: undefined method `internal_command' for Shell::Formatter::Console
Hi JM,

Thank you for following up! No, the issue still exists. I have temporarily abandoned JRuby for this project and am using curl and REST for the time being.

Since it's working properly for you and others, I suspect that it's either a version mismatch, an installation problem, or some configuration issue. If you have time, I'm willing to continue debugging.

Robin

On Thu, Apr 25, 2013 at 9:24 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Robin,

Were you finally able to find the issue?

JM
...
Re: undefined method `internal_command' for Shell::Formatter::Console
Something I thought about: you might have a Ruby lib installed somewhere else that the shell is using. Someone faced something similar recently. Take a look at this thread:

http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3CEE737D80-45B4-4A33-817D-28ED9C1CB0AE%40gmail.com%3E

Can you see if you have something like that on your system?

JM

2013/4/25 Robin Gowin landr...@gmail.com:
Hi JM,

Thank you for following up! No, the issue still exists. I have temporarily abandoned JRuby for this project and am using curl and REST for the time being. Since it's working properly for you and others, I suspect that it's either a version mismatch, an installation problem, or some configuration issue. If you have time, I'm willing to continue debugging.

Robin
...
Re: How to remove all traces of a dropped table.
David,

I have only seen this once before, and I actually had to drop the META table and rebuild it with HBCK. After that the import worked. I am pretty sure I cleaned up ZK as well. It was very strange indeed. If you can reproduce this, can you open a JIRA, as this is no longer a one-off scenario?

On Apr 25, 2013 9:28 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi David,

After you dropped your table, did you look into the ZK server to see if all the znodes related to this table were removed too? Also, have you tried running HBCK after the drop to see if your system is fine?

JM
...
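For completeness, a sketch of what "rebuild META with HBCK" can look like on 0.92/0.94-era releases. The exact options vary by version, so treat these as assumptions to check against your own `bin/hbase hbck -h` output:

# online repair (needs a recent enough hbck build):
[user@master hbase]$ bin/hbase hbck -fixMeta -fixAssignments

# offline rebuild of .META. from the region directories in HDFS:
[user@master hbase]$ bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair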
What are the appropriate steps before performing hardware maintenance?
We have to perform maintenance on one of our HDFS DataNode/HBase RegionServer machines for a few hours. What are the right steps to take before doing the maintenance in order to ensure limited impact to the cluster and to (Thrift) clients of the cluster, both for HDFS and HBase? After the maintenance, are there any special steps required to add the node back to the cluster, or can we simply restart the services and let HDFS/HBase take care of the rest? Thanks, - Dan
Re: What are the appropriate steps before performing hardware maintenance?
Hi Dan,

You might want to take a look at bin/graceful_stop.sh. It will move all the regions hosted by your RS to other RSs before stopping it gracefully. After the maintenance, simply start the RS/DN back up and it will be added back to the cluster. The load balancer will then assign some regions back to it. You will lose some data locality for the regions which are moved.

JM

2013/4/25 Dan Crosta d...@magnetic.com:
We have to perform maintenance on one of our HDFS DataNode/HBase RegionServer machines for a few hours. What are the right steps to take before doing the maintenance in order to ensure limited impact to the cluster?
...
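For reference, a sketch of the sequence JM describes; the hostname is a placeholder, and graceful_stop.sh is run from the HBase install directory on a node that can SSH to the target:

# before maintenance: drain the regions off and stop the RegionServer
$ bin/graceful_stop.sh rs-node1.example.com

# after maintenance: bring the daemons back; the balancer reassigns regions over time
$ bin/hadoop-daemon.sh start datanode     # from the Hadoop install
$ bin/hbase-daemon.sh start regionserver  # from the HBase install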
Re: undefined method `internal_command' for Shell::Formatter::Console
I looked at that thread and I do have Ruby installed, but I don't think that is the problem, unless maybe there is a version mismatch? I wasn't sure if JRuby needs to be installed, and if so, what its command line is.

Here are the relevant versions as far as I can tell. The problem still exists.

[cloudera@localhost ~]$ ruby -v
ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
[cloudera@localhost ~]$ irb -v
irb 0.9.5(05/04/13)
[cloudera@localhost ~]$ hbase -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
[cloudera@localhost ~]$ which rvm
/usr/bin/which: no rvm in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/java/jdk1.6.0_31/bin:/home/cloudera/bin:/sbin:/usr/java/jdk1.6.0_31/bin)
[cloudera@localhost ~]$ which jruby
/usr/bin/which: no jruby in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/java/jdk1.6.0_31/bin:/home/cloudera/bin:/sbin:/usr/java/jdk1.6.0_31/bin)

Robin

On Thu, Apr 25, 2013 at 9:40 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Something I thought about: you might have a Ruby lib installed somewhere else that the shell is using. Someone faced something similar recently. Take a look at this thread:
http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3CEE737D80-45B4-4A33-817D-28ED9C1CB0AE%40gmail.com%3E
Can you see if you have something like that on your system?
JM
Re: undefined method `internal_command' for Shell::Formatter::Console
Is it easy for you to uninstall it and reinstall it? If so, would you mind giving it a try?

2013/4/25 Robin Gowin landr...@gmail.com:
I looked at that thread and I do have Ruby installed, but I don't think that is the problem, unless maybe there is a version mismatch? I wasn't sure if JRuby needs to be installed, and if so, what its command line is. Here are the relevant versions as far as I can tell. The problem still exists.
...
Re: What are the appropriate steps before performing hardware maintenance?
Sorry, I should have mentioned before -- we are using CDH 4.2, which does not package the graceful_stop script. Do you happen to know if there's a way to do this through Cloudera Manager? Perhaps the decommission action does something similar? My impression is that decommission is more heavy-handed, but if that's the most convenient route, that'll work for us.

Thanks,
- Dan

On Apr 25, 2013, at 11:30 AM, Jean-Marc Spaggiari wrote:
Hi Dan,

You might want to take a look at bin/graceful_stop.sh. It will move all the regions hosted by your RS to other RSs before stopping it gracefully. After the maintenance, simply start the RS/DN back up and it will be added back to the cluster.
...
Re: What are the appropriate steps before performing hardware maintenance?
Moving to scm-us...@cloudera.org then (hbase in BCC).

Hi Dan,

The best way to find out how to achieve this with Cloudera Manager is to ask on the scm-users list. I'm not yet familiar enough with CM to answer your question, so I will let someone else confirm.

JMS

2013/4/25 Dan Crosta d...@magnetic.com:
Sorry, I should have mentioned before -- we are using CDH 4.2, which does not package the graceful_stop script. Do you happen to know if there's a way to do this through Cloudera Manager? Perhaps the decommission action does something similar?
...
HBase is not running.
Hi all, I'm having an issue with getting HBase to run. I'm following this tutorial: http://hbase.apache.org/book.html#start_hbase When I run the command [ bin/start-hbase.sh start ], nothing happens. At all. My question is: why? I have Java 1.7 on this machine; do I _need_ to get 1.6?
Re: HBase is not running.
Hi Yves,

Which version of HBase are you trying? It should be working with Java 1.7. To start HBase, are you running bin/start-hbase.sh start, as you said below? Or only bin/start-hbase.sh? The latter is the correct one; the former has an extra 'start' argument at the end that is not required.

JM

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
Hi all, I'm having an issue with getting HBase to run. I'm following this tutorial: http://hbase.apache.org/book.html#start_hbase When I run the command [ bin/start-hbase.sh start ], nothing happens. At all. My question is: why? I have Java 1.7 on this machine; do I _need_ to get 1.6?
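For a quick sanity check, this is roughly what a successful standalone start looks like (the paths below are placeholders). HBase needs JAVA_HOME set, typically in conf/hbase-env.sh:

$ grep JAVA_HOME conf/hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0       # assumption: adjust to your own JDK path
$ bin/start-hbase.sh
starting master, logging to /path/to/hbase/logs/hbase-user-master-hostname.out

If nothing at all is printed, the script may be exiting early; running it as 'sh -x bin/start-hbase.sh' shows where it stops.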
Re: HBase - prioritizing writes over reads?
Short answer is no, there's no knob or configuration to do that.

Longer answer is: it depends. Are the reads and writes going to different regions/tables? If so, disable the balancer and take charge of it yourself by segregating the offending regions on their own RS.

I also see you have the requirement to take incoming data no matter what. Well, this currently cannot be guaranteed in HBase, since a RS failure will incur some limited unavailability while the ZK session times out, the logs are replayed, and the regions are reassigned. I don't know what kind of SLA you have, but it sounds like even without your read problem you need to do something client-side to take care of this. Local buffers, maybe? That would work as long as you don't need to serve the new data right away (unless you also start serving from the local buffer, but then it's getting complicated).

Hope this helps,

J-D

On Wed, Apr 24, 2013 at 3:25 AM, kzurek kzu...@proximetry.pl wrote:
Is it possible to prioritize writes over reads in HBase? I'm facing some I/O read-related issues that affect my write clients and the cluster in general (constantly growing store files on some RS). Due to the fact that I cannot afford to lose or skip incoming data, I would like to guarantee that in case of extensive reads I will be able to limit incoming read requests so that write requests won't be affected. Is it possible? If so, what would be the best way to do that, and where should it be placed - on the client or the cluster side?

--
View this message in context: http://apache-hbase.679495.n3.nabble.com/HBase-prioritizing-writes-over-reads-tp4042838.html
Sent from the HBase User mailing list archive at Nabble.com.
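To make the local-buffer idea concrete, here is a minimal sketch, not a production implementation: a bounded queue absorbs incoming Puts, and a single background thread retries delivery until the region comes back. The class name, table name and tuning values are hypothetical; it assumes the 0.94-era HTable client API.

import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

public class BufferedHBaseWriter implements Runnable {
    // Bounded local buffer: producers block (rather than drop data) if it fills up.
    private final BlockingQueue<Put> queue = new LinkedBlockingQueue<Put>(100000);
    private final Configuration conf = HBaseConfiguration.create();
    private final String tableName;

    public BufferedHBaseWriter(String tableName) {
        this.tableName = tableName;
    }

    // Called by write clients; never loses a Put, but may block under pressure.
    public void write(Put put) throws InterruptedException {
        queue.put(put);
    }

    // Single consumer thread: drains the queue and retries while the RS is unavailable.
    public void run() {
        try {
            HTable table = new HTable(conf, tableName);
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Put put = queue.take();
                    boolean delivered = false;
                    while (!delivered) {
                        try {
                            table.put(put);
                            delivered = true;
                        } catch (IOException e) {
                            // Region likely in transition (RS failure, reassignment):
                            // keep the Put and retry after a pause.
                            Thread.sleep(1000);
                        }
                    }
                }
            } finally {
                table.close();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            throw new RuntimeException("could not open table " + tableName, e);
        }
    }
}

Note the trade-off J-D mentions: data sitting in the queue has been accepted but is not yet readable from HBase, so this only fits workloads that tolerate a short ingest delay.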
Re: HBase is not running.
Hi, I'm trying to run 0.95.0. I've tried both and nothing worked.

I do have another question. When I go to download HBase, I get the following 3 choices: http://www.bizdirusa.com/mirrors/apache/hbase/hbase-0.95.0/

The 3 choices:
- hbase-0.95.0-hadoop1-bin.tar.gz (what I'm using)
- hbase-0.95.0-hadoop2-bin.tar.gz
- hbase-0.95.0-src.tar.gz

Which of those should I download and work with? The instructions were somewhat vague on that, and I think this might be causing me some headaches in this process.

By the way, thank you for your answer, very appreciated!

On Thu, Apr 25, 2013 at 1:00 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Yves,

Which version of HBase are you trying? It should be working with Java 1.7. To start HBase, are you running bin/start-hbase.sh start, as you said below? Or only bin/start-hbase.sh? The latter is the correct one; the former has an extra 'start' argument at the end that is not required.

JM
...
Re: HBase is not running.
Hi Yves,

0.95.0 is a developer version. If you are starting with HBase, I recommend you choose a more stable version, like 0.94.6.1.

Regarding the 3 choices you list below:
1) This one is HBase 0.95 running over Hadoop 1.0.
2) This one is HBase 0.95 running over Hadoop 2.0.
3) This one is the HBase source code.

Again, I think you are better off going with a stable version for the first steps: http://www.bizdirusa.com/mirrors/apache/hbase/stable/

Would you mind retrying your tests with this version and letting me know if it's working better?

JM

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
Hi, I'm trying to run 0.95.0. I've tried both and nothing worked. I do have another question. When I go to download HBase, I get the following 3 choices: http://www.bizdirusa.com/mirrors/apache/hbase/hbase-0.95.0/ Which of those should I download and work with?
...
Re: HBase is not running.
Ah, my mistake. I'll retry with the more stable version.

On Thu, Apr 25, 2013 at 1:23 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Yves,

0.95.0 is a developer version. If you are starting with HBase, I recommend you choose a more stable version, like 0.94.6.1.
...
Re: undefined method `internal_command' for Shell::Formatter::Console
I removed Ruby and reinstalled it; same results.

On Thu, Apr 25, 2013 at 11:59 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Is it easy for you to uninstall it and reinstall it? If so, would you mind giving it a try?
...
Re: HBase is not running.
Hi, I have a small update. The stable build seems to be working. Thanks again for your help.

On Thu, Apr 25, 2013 at 1:23 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Yves,

0.95.0 is a developer version. If you are starting with HBase, I recommend you choose a more stable version, like 0.94.6.1.
...
Re: HBase is not running.
Ok, spoke too soon :) . I ran this command [ create 'test', 'cf' ] and this is the result that I got: http://bin.cakephp.org/view/168926019

This is after running help<enter> and having that run just fine.

On Thu, Apr 25, 2013 at 1:23 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Yves,

0.95.0 is a developer version. If you are starting with HBase, I recommend you choose a more stable version, like 0.94.6.1.
...
Re: HBase is not running.
Before trying the shell, can you look at the server logs and see if everything is fine? Also, is the web UI working fine?

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
Ok, spoke too soon :) . I ran this command [ create 'test', 'cf' ] and this is the result that I got: http://bin.cakephp.org/view/168926019 This is after running help<enter> and having that run just fine.
...
Re: undefined method `internal_command' for Shell::Formatter::Console
No, don't re-install it ;) Remove it and retry, to make sure the shell is not using any lib anywhere else...

JM

2013/4/25 Robin Gowin landr...@gmail.com:
I removed Ruby and reinstalled it; same results.
...
writing and reading from a region at once
Hi,

If a region is being written to, and a scanner takes a lease out on the region, what will happen to the writes? Is there a concept of transaction isolation levels? I don't see errors from Puts while the tables are being scanned, but it seems that I'm losing writes somewhere. Is it possible the writes could fail silently?

thanks,

Aaron Zimmerman
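One client-side possibility worth ruling out first (an assumption about the setup, not a diagnosis from the thread): with the 0.94-era client, if auto-flush is disabled, Puts sit in a client-side write buffer and any failure only surfaces on a later flush, so writes can seem to vanish silently if flushCommits() is never called or its exceptions are swallowed. A minimal sketch; the table, row and column names are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWrites {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table name
        table.setAutoFlush(false);                  // puts are now buffered client-side
        try {
            Put put = new Put(Bytes.toBytes("row1"));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
            table.put(put); // may not have reached the server yet
        } finally {
            table.flushCommits(); // buffered writes (and their errors) surface here
            table.close();
        }
    }
}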
Re: HBase is not running.
Here are the logs; what should I be looking for? Seems like everything is fine for the moment, no? http://bin.cakephp.org/view/2144893539

The web UI? What do you mean? Sorry if this is a stupid question; I'm a Hadoop newb.

On Thu, Apr 25, 2013 at 3:19 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Before trying the shell, can you look at the server logs and see if everything is fine? Also, is the web UI working fine?
...
Re: HBase is not running.
There is no stupid question ;) Are the logs truncated? Is there anything else after that, or is that all you have? For the UI, you can access it with http://192.168.X.X:60010/master-status (replace the X with your own IP). You should see some information about your HBase cluster (even in standalone mode). JMS

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com: Here are the logs, what should I be looking for? Seems like everything is fine for the moment, no? [...]
Re: HBase is not running.
Hi again. I have 3 log files and only one of them had anything in it. I'm assuming that you're talking about the directory ${APACHE_HBASE_HOME}/logs, yes? Here are the file names:

    -rw-rw-r--. 1 user user 12465 Apr 25 14:54 hbase-ysg-master-ysg.connect.log
    -rw-rw-r--. 1 user user     0 Apr 25 14:54 hbase-ysg-master-ysg.connect.out
    -rw-rw-r--. 1 user user     0 Apr 25 14:54 SecurityAuth.audit

Also, to answer your question about the UI, I tried that URL (I'm doing all of this on my laptop just to learn at the moment) and neither the URL nor localhost:60010 worked. So, the answer to your question is that the UI is not showing up. This could be due to not being far along in the tutorial, perhaps? Thanks again!

On Thu, Apr 25, 2013 at 4:22 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: There is no stupid question ;) Are the logs truncated? [...]
Re: undefined method `internal_command' for Shell::Formatter::Console
To be more explicit: I'm running CentOS release 6.4 in a VM on Mac OS X 10.6. I ran yum remove ruby and then yum install ruby (inside the VM). Is that what you meant? Also, I put some simple print statements in several of the ruby scripts called by the hbase shell, and they are getting executed (for example: admin.rb, hbase.rb, and table.rb). (I wasn't sure what "it" referred to in your email.) Robin

On Thu, Apr 25, 2013 at 3:58 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: No, don't re-install it ;) Remove it and retry. [...]
Re: HBase is not running.
Hello Yves, The log seems to be incomplete. Could you please share the complete logs? Have you set the hbase.zookeeper.quorum property properly? Is your Hadoop running fine? Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com

On Fri, Apr 26, 2013 at 2:00 AM, Yves S. Garret yoursurrogate...@gmail.com wrote: Hi again. I have 3 log files and only one of them had anything in it. [...]
Re: HBase is not running.
Hi Mohammad, He is running standalone, so no need to update the zookeeper quorum yet. Yes, can you share the entire hbase-ysg-master-ysg.connect.log file? Not just the first lines. Or is what you sent already everything? Also, what have you done so far? Downloaded 0.94, extracted it, set up JAVA_HOME and ran bin/start-hbase.sh? JMS

2013/4/25 Mohammad Tariq donta...@gmail.com: Hello Yves, The log seems to be incomplete. [...]
Re: undefined method `internal_command' for Shell::Formatter::Console
Hi Robin, No, the idea is to run yum remove, and then test the HBase shell. Don't run yum install ruby until we get that fixed. I want to see if your installed version of Ruby could be causing the issue. The "it" was referring to the Ruby package. JM

2013/4/25 Robin Gowin landr...@gmail.com: To be more explicit: I'm running CentOS release 6.4 in a VM on Mac OS X 10.6. [...]
Re: HBase is not running.
My mistake. I thought I had all of those logs. This is what I currently have: http://bin.cakephp.org/view/2112130549 I have $JAVA_HOME set to this: /usr/java/jdk1.7.0_17 I have extracted 0.94 and ran bin/start-hbase.sh. Thanks for your help!

On Thu, Apr 25, 2013 at 4:42 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Mohammad, He is running standalone, so no need to update the zookeeper quorum yet. [...]
Re: HBase is not running.
Hi Yves, You seem to have a network configuration issue with your installation:

    java.net.BindException: Cannot assign requested address
    ip72-215-225-9.at.at.cox.net/72.215.225.9:0

How is your hosts file configured? You need to have your host name pointing to your local IP (and not 127.0.0.1).

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com: My mistake. I thought I had all of those logs. [...]
Re: writing and reading from a region at once
Inline. J-D

On Thu, Apr 25, 2013 at 1:09 PM, Aaron Zimmerman azimmer...@sproutsocial.com wrote: Hi, If a region is being written to, and a scanner takes a lease out on the region, what will happen to the writes? Is there a concept of Transaction Isolation Levels?

There's MVCC, so reads can happen while someone else is writing. What you should expect from HBase is read committed.

I don't see errors in Puts while the tables are being scanned, but it seems that I'm losing writes somewhere. Is it possible the writes could fail silently?

Is it temporary while you're scanning, or is there really data missing at the end of the day? The former might happen on some older HBase versions, while the latter should never happen unless you lower the durability level yourself and have machine failures. J-D
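To make the read-committed point concrete, here is a minimal sketch against the 0.94-era Java client API (the table "t" and family "cf" are placeholders, not from the thread). A scanner only returns cells whose MVCC write has committed, so a concurrent Put is either fully visible or not visible at all; the durability opt-out J-D warns about would be an explicit call on the Put.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ReadCommittedSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "t"); // placeholder table with family "cf"

        Put put = new Put(Bytes.toBytes("row-1"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v1"));
        // put.setWriteToWAL(false); // the durability opt-out that can lose
        //                           // acknowledged writes on machine failure
        table.put(put); // WAL-backed (durable) by default

        // A scanner opened here sees only committed cells; writes still in
        // flight when it was opened are not returned partially.
        ResultScanner scanner = table.getScanner(new Scan());
        for (Result r : scanner) {
          System.out.println(Bytes.toString(r.getRow()));
        }
        scanner.close();
        table.close();
      }
    }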
Re: HBase is not running.
Hi JM :) Sorry about the previous mail, I didn't notice that. @Yves: I agree with Jean. Please make sure you have proper name resolution, which is vital for a proper HBase setup. To make things simple you could just make use of 127.0.0.1, since you are running in standalone mode. Comment out other bindings. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com

On Fri, Apr 26, 2013 at 2:52 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Yves, You seem to have a network configuration issue with your installation. [...]
Re: Coprocessors
You might want to have a look at Phoenix (https://github.com/forcedotcom/phoenix), which does that and more, and gives a SQL/JDBC interface. -- Lars

From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net To: user@hbase.apache.org Sent: Thursday, April 25, 2013 2:44 PM Subject: Coprocessors

Folks: This is my first post on the HBase user mailing list. I have the following scenario: I have an HBase table of up to a billion keys. I'm looking to support an application where, on some user action, I'd need to fetch multiple columns for up to 250K keys and do some sort of aggregation on them. Fetching all that data and doing the aggregation in my application takes about a minute. I'm looking to co-locate the aggregation logic with the region servers to a. distribute the aggregation, and b. avoid having to fetch large amounts of data over the network (this could potentially be cross-datacenter). Neither observers nor aggregation endpoints work for this use case: observers don't return data back to the client, while aggregation endpoints work in the context of scans, not a multi-get (are these correct assumptions?). I'm looking to write a service that runs alongside the region servers and acts as a proxy between my application and the region servers. I plan to use the logic in the HBase client's HConnectionManager to segment my request of 1M rowkeys into sub-requests per region server. These are sent over to the proxy, which fetches the data from the region server, aggregates locally and sends data back. Does this sound reasonable or even a useful thing to pursue? Regards, -sudarshan
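As a rough illustration of the segmentation step proposed above, here is a hedged sketch against the 0.92/0.94 client API (class name hypothetical, error handling elided) that groups row keys by the region server currently hosting them; this is essentially the region lookup HConnectionManager performs internally for batched operations.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.hadoop.hbase.HRegionLocation;
    import org.apache.hadoop.hbase.client.HTable;

    public class KeySegmenter {
      // Group row keys by the host:port of the region server hosting them.
      public static Map<String, List<byte[]>> groupByServer(HTable table,
          List<byte[]> keys) throws IOException {
        Map<String, List<byte[]>> byServer = new HashMap<String, List<byte[]>>();
        for (byte[] key : keys) {
          HRegionLocation loc = table.getRegionLocation(key);
          String server = loc.getHostname() + ":" + loc.getPort();
          List<byte[]> bucket = byServer.get(server);
          if (bucket == null) {
            bucket = new ArrayList<byte[]>();
            byServer.put(server, bucket);
          }
          bucket.add(key);
        }
        return byServer;
      }
    }

Each bucket could then be shipped to the proxy (or endpoint) co-located with that server, with the partial aggregates merged back on the client.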
Re: Coprocessors
Thanks Lars. I briefly looked into Phoenix but it appeared to do full-table scans to perform the aggregation. The same goes for Impala. If you think otherwise, I'll look into it again.

- Original Message - From: user@hbase.apache.org To: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN), user@hbase.apache.org At: Apr 25 2013 17:54:48 You might want to have a look at Phoenix (https://github.com/forcedotcom/phoenix), which does that and more, and gives a SQL/JDBC interface. -- Lars [...]
Re: Coprocessors
It doesn't. Based on your query and key layout it only scans subranges of the keyspace, and these scans are parallelized across region servers. I'll let James explain more (if he's listening) :) -- Lars

From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net To: user@hbase.apache.org Sent: Thursday, April 25, 2013 2:57 PM Subject: Re: Coprocessors Thanks Lars. I briefly looked into Phoenix but it appeared to do full-table scans to perform the aggregation. [...]
Re: Coprocessors
I don't think Phoenix will solve his problem. He also needs to explain more about his problem before we can start to think about the problem.

On Apr 25, 2013, at 4:54 PM, lars hofhansl la...@apache.org wrote: You might want to have a look at Phoenix (https://github.com/forcedotcom/phoenix), which does that and more, and gives a SQL/JDBC interface. [...]
Re: Coprocessors
Phoenix might be able to solve the problem if the keys are structured in the binary format that it understands; otherwise you are better off reloading that data into a table created via Phoenix. But I will let James tackle this question. Regarding your use case, why can't you do the aggregation using observers? You should be able to do the aggregation and return a new Scanner to your client. And Lars is right about the range scans that Phoenix does. It does restrict things, and will also do parallel scans for you based on what you select/filter. -Viral

On Thu, Apr 25, 2013 at 3:12 PM, Michael Segel michael_se...@hotmail.com wrote: I don't think Phoenix will solve his problem. [...]
Re: Coprocessors
Michael: Fair enough. Let me see what relevant information I can add to what I've already said:

1. To Lars' point, my 250K keys are unlikely to fall into fewer than 250K sub-ranges.
2. Here's a bit more about my schema:
2.1 My rowkeys are composed of 2 entities - let's call them object-id and field-type. An object (O1) has 100s of field types (F1, F2, F3...). Each object-id - field-type pair has 100s of attributes (A1, A2, A3).
2.2 My rowkeys are O1-F1, O1-F2, O1-F3, etc.
2.3 My primary application (not the one my original post was about) accesses by these rowkeys.
2.4 My application that does aggregation is given a bunch of objects O1, O2, O3, a field-type F1, a bunch of attributes A1, A2 and some computation to perform.
2.5 As you can see, scans are unlikely to be useful when fetching O1-F1, O2-F1, O3-F1 etc.

Viral: How do I tackle aggregation using observers? Let's say I override the postGet method. I do a multi-get from my client and my method gets called on each region server for each row. What is the next step with this approach?

- Original Message - From: user@hbase.apache.org To: la...@apache.org, user@hbase.apache.org Cc: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) At: Apr 25 2013 18:12:46 I don't think Phoenix will solve his problem. [...]
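For reference, the postGet hook mentioned above has roughly this shape in the 0.92/0.94 coprocessor API (the class name is hypothetical). Note that it fires once per row on the hosting region server and can only inspect or rewrite that single row's result list; there is no way to hand a cross-row aggregate back to the client from here, which is why observers alone don't cover this use case.

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;

    public class PerRowObserver extends BaseRegionObserver {
      @Override
      public void postGet(ObserverContext<RegionCoprocessorEnvironment> ctx,
                          Get get, List<KeyValue> results) throws IOException {
        // Invoked after each Get completes, once per row of a multi-get.
        // 'results' holds only this row's KeyValues; it can be filtered or
        // augmented here, but nothing can be accumulated across rows and
        // returned to the client as a single aggregate.
      }
    }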
Re: Coprocessors
On 04/25/2013 03:35 PM, Gary Helmling wrote:

I'm looking to write a service that runs alongside the region servers and acts as a proxy between my application and the region servers. [...]

This is essentially what coprocessor endpoints (called through HTable.coprocessorExec()) basically do. (One difference is that there is a parallel request per region, not per region server, though that is a potential optimization that could be made as well.) The tricky part I see for the case you describe is splitting your full set of row keys up correctly per region. You could send the full set of row keys to each endpoint invocation, and have the endpoint implementation filter down to only those keys present in the current region. But that would be a lot of overhead on the request side. You could split the row keys into per-region sets on the client side, but I'm not sure we provide sufficient context for the Batch.Call instance you provide to coprocessorExec() to determine which region it is being invoked against.

Sudarshan, In our head branch of Phoenix (we're targeting this for a 1.2 release in two weeks), we've implemented a skip scan filter that functions similarly to a batched get, except:

1) it's more flexible, in that it can jump not only from a single key to another single key, but also from range to range;
2) it's faster, about 3-4x;
3) you can use it in combination with aggregation, since it's a filter.

The scan is chunked up by region and only the keys in each region are sent, along the lines you and Gary have described. Then the results are merged together by the client automatically. How would you decompose your row key into columns? Is there a time component? Let me walk you through an example where you might have a LONG id value plus perhaps a timestamp (it works equally well if you only had a single column in your PK). If you provide a bit more info on your use case, I can tailor it more exactly. Create a schema:

    CREATE TABLE t (key BIGINT NOT NULL, ts DATE NOT NULL, data VARCHAR
        CONSTRAINT pk PRIMARY KEY (key, ts));

Populate your data using our UPSERT statement. Aggregate over a set of keys like this:

    SELECT count(*) FROM t WHERE key IN (?,?,?) AND ts > ? AND ts < ?

where you bind the ? at runtime (probably building the statement programmatically based on how many keys you're binding). Then Phoenix would jump around the key space of your table using the "skip next hint" feature provided by filters. You'd just use the regular JDBC ResultSet to get your count back. If you want more info and/or a benchmark of seeking over 250K keys in a billion row table, let me know. Thanks, James
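For context, a call through HTable.coprocessorExec() has roughly this shape in 0.92/0.94. The AggProtocol interface and its aggregate() method are hypothetical stand-ins, not a real HBase or Phoenix API (a real endpoint would also need a server-side implementation deployed on the region servers); the sketch just shows the per-region fan-out and client-side merge Gary describes.

    import java.io.IOException;
    import java.util.List;
    import java.util.Map;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.coprocessor.Batch;
    import org.apache.hadoop.hbase.ipc.CoprocessorProtocol;

    // Hypothetical endpoint protocol implemented on the region servers.
    interface AggProtocol extends CoprocessorProtocol {
      long aggregate(List<byte[]> rowKeys) throws IOException;
    }

    public class EndpointCaller {
      // Invokes the endpoint once per region overlapping [startRow, stopRow];
      // partial results come back keyed by region start key.
      static long sumAcrossRegions(HTable table, byte[] startRow, byte[] stopRow,
                                   final List<byte[]> rowKeys) throws Throwable {
        Map<byte[], Long> partials = table.coprocessorExec(
            AggProtocol.class, startRow, stopRow,
            new Batch.Call<AggProtocol, Long>() {
              public Long call(AggProtocol instance) throws IOException {
                // As discussed above, each invocation receives the full key
                // set; filtering to the region's keys happens server-side.
                return instance.aggregate(rowKeys);
              }
            });
        long total = 0;
        for (Long p : partials.values()) {
          total += p; // merge the per-region partial aggregates client-side
        }
        return total;
      }
    }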
Re: Coprocessors
Thanks for the additional info, Sudarshan. This would fit well with the implementation of Phoenix's skip scan:

    CREATE TABLE t (
        object_id INTEGER NOT NULL,
        field_type INTEGER NOT NULL,
        attrib_id INTEGER NOT NULL,
        value BIGINT
        CONSTRAINT pk PRIMARY KEY (object_id, field_type, attrib_id));

    SELECT count(value), sum(value), avg(value) FROM t
    WHERE object_id IN (?,?,?) AND field_type IN (?,?,?) AND attrib_id IN (?,?,?)

and then your client would do whatever additional computation it needed on the results it got back. Would that fit with what you're trying to do? James

On 04/25/2013 03:36 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) wrote: Michael: Fair enough. Let me see what relevant information I can add to what I've already said. [...]
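A hedged sketch of what the client side of such a query could look like through the Phoenix JDBC driver, building the IN list programmatically as suggested earlier in the thread. The quorum host in the connection URL and the bound values are placeholders, and driver registration details are elided (the Phoenix client jar must be on the classpath).

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.Arrays;
    import java.util.List;

    public class PhoenixAggClient {
      public static void main(String[] args) throws Exception {
        List<Integer> objectIds = Arrays.asList(1, 2, 3); // placeholder keys
        int fieldType = 42;                               // placeholder field type

        // Build the IN list to match however many keys we're binding.
        StringBuilder sql = new StringBuilder(
            "SELECT count(value), sum(value), avg(value) FROM t WHERE object_id IN (");
        for (int i = 0; i < objectIds.size(); i++) {
          sql.append(i == 0 ? "?" : ",?");
        }
        sql.append(") AND field_type = ?");

        // "localhost" stands in for the ZooKeeper quorum of the cluster.
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        PreparedStatement stmt = conn.prepareStatement(sql.toString());
        int param = 1;
        for (Integer id : objectIds) {
          stmt.setInt(param++, id);
        }
        stmt.setInt(param, fieldType);

        ResultSet rs = stmt.executeQuery();
        if (rs.next()) {
          System.out.println("count=" + rs.getLong(1)
              + " sum=" + rs.getLong(2) + " avg=" + rs.getDouble(3));
        }
        rs.close();
        stmt.close();
        conn.close();
      }
    }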
Re: Coprocessors
James: First of all, this looks quite promising. The table schema outlined in your other message is correct, except that attrib_id will not be in the primary key. Will that be a problem with respect to the skip-scan filter's performance? (It doesn't seem like it...) Could you share any sort of benchmark numbers? I want to try this out right away, but I have to wait for my cluster administrator to upgrade us from HBase 0.92 first!

- Original Message - From: user@hbase.apache.org To: user@hbase.apache.org At: Apr 25 2013 18:45:14 On 04/25/2013 03:35 PM, Gary Helmling wrote: I'm looking to write a service that runs alongside the region servers and acts as a proxy between my application and the region servers. [...]
Re: Coprocessors
Our performance engineer, Mujtaba Chohan, has agreed to put together a benchmark for you. We only have a four-node cluster of pretty average boxes, but it should give you an idea. There is no performance impact from attrib_id not being part of the PK, since you're not filtering on it (if I understand things correctly). A few more questions for you:
- How many rows should we use? 1B?
- How many rows would be filtered by object_id and field_type?
- Any particular key distribution, or is random fine?
- What's the minimum key size we should use for object_id and field_type? 2 bytes each?
- Any particular kind of aggregation? count(attrib1)? sum(attrib1)? A sample query would be helpful.
Since you're upgrading, use the latest on the 0.94 branch, 0.94.7.

Thanks,
James

On 04/25/2013 04:19 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) wrote:
[quoted text trimmed - Sudarshan's full message appears above]
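For concreteness, a hypothetical rendering of the schema and sample query under discussion (the attrib column names and types here are guesses based on the thread, not Sudarshan's actual schema):

    CREATE TABLE obj (
        object_id  BIGINT NOT NULL,
        field_type VARCHAR NOT NULL,
        attrib1    DECIMAL,
        attrib2    DECIMAL
        CONSTRAINT pk PRIMARY KEY (object_id, field_type));

    SELECT sum(attrib1), count(attrib2)
    FROM obj
    WHERE object_id IN (?,?,?) AND field_type = ?;

With one ? bound per object id, the skip scan can jump directly between the O1-F1, O2-F1, O3-F1 key combinations instead of scanning the ranges in between.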
[ANNOUNCE] HBase 0.94.7 is available for download
The HBase Team is pleased to announce the immediate release of HBase 0.94.7. Download it from your favorite Apache mirror [1]. HBase 0.94.7 is a bug fix release with a few performance improvements as well; it has 73 issues resolved against it. 0.94.7 is the current stable release of HBase. As usual, all previous 0.92.x and 0.94.x releases can be upgraded to 0.94.7 via a rolling upgrade without downtime; intermediate versions can be skipped. For a complete list of changes, see the release notes [2].

Yours,
The HBase Team

P.S. Thank you to the 27 individuals who contributed to this release!

1. http://www.apache.org/dyn/closer.cgi/hbase/
2. https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12324039
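For anyone planning the rolling upgrade mentioned above, the usual per-node procedure looks roughly like the following. This is a sketch assuming the stock scripts shipped in the 0.94 tarball; verify the script names and flags against your own distribution before relying on them.

    # On each region server, one node at a time, after installing the 0.94.7 binaries:
    bin/graceful_stop.sh --restart --reload <regionserver-hostname>
    # Masters hold no regions, so they can simply be restarted:
    bin/hbase-daemon.sh stop master
    bin/hbase-daemon.sh start master

graceful_stop.sh moves regions off the node before stopping it, which is what keeps the upgrade free of downtime.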
Re: HBase - prioritizing writes over reads?
I would also add that if you need an always-available store (as in, you want the A and P of CAP [1] and can sacrifice C), you might be better served by one of the Dynamo-inspired architectures such as Riak or Cassandra. HBase chooses the C and P of CAP. It might seem strange that as an HBase committer I would advise looking at some non-HBase technology, but I am a big fan of using the right tool for the right job.
-- Lars
1. See also http://en.wikipedia.org/wiki/CAP_theorem

From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Thursday, April 25, 2013 10:17 AM
Subject: Re: HBase - prioritizing writes over reads?

Short answer: no, there's no knob or configuration to do that. Longer answer: it depends. Are the reads and writes going to different regions/tables? If so, disable the balancer and take charge of it yourself by segregating the offending regions on their own RS.

I also see you have the requirement to take incoming data no matter what. This currently cannot be guaranteed in HBase, since an RS failure will incur some limited unavailability while the ZK session times out, the logs are replayed, and the regions are reassigned. I don't know what kind of SLA you have, but it sounds like even without your read problem you need to do something client-side to take care of this. Local buffers, maybe? That would work as long as you don't need to serve the new data right away (unless you also start serving from the local buffer, but then it's getting complicated).

Hope this helps,
J-D

On Wed, Apr 24, 2013 at 3:25 AM, kzurek kzu...@proximetry.pl wrote:

Is it possible to prioritize writes over reads in HBase? I'm facing some read-I/O-related issues that influence my write clients and the cluster in general (constantly growing store files on some RS). Given that I cannot afford to lose/skip incoming data, I would like to guarantee that in case of extensive reads I can limit incoming read requests so that write requests won't be affected. Is that possible? If so, what would be the best way to do it, and where should it be placed (on the client or the cluster side)?

--
View this message in context: http://apache-hbase.679495.n3.nabble.com/HBase-prioritizing-writes-over-reads-tp4042838.html
Sent from the HBase User mailing list archive at Nabble.com.
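As a rough illustration of the client-side local buffer J-D suggests (the class below is a made-up sketch against the 0.94-era HTable/Put API; the queue size, batch size, and retry pause are placeholders, not tuned values): incoming writes go into an in-memory queue, and a background thread drains them to HBase, keeping and retrying any batch that fails so data is not dropped during a brief region outage.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;

    public class BufferedWriter implements Runnable {
        private final BlockingQueue<Put> queue = new LinkedBlockingQueue<Put>(100000);
        private final HTable table;

        public BufferedWriter(Configuration conf, String tableName) throws Exception {
            this.table = new HTable(conf, tableName);
        }

        // Called by write clients; blocks only if the local buffer is full.
        public void write(Put put) throws InterruptedException {
            queue.put(put);
        }

        public void run() {
            List<Put> batch = new ArrayList<Put>();
            while (true) {
                try {
                    batch.add(queue.take());      // wait for at least one Put
                    queue.drainTo(batch, 999);    // then batch up to 1000
                    table.put(batch);
                    batch.clear();
                } catch (InterruptedException ie) {
                    return;                       // shutting down
                } catch (Exception e) {
                    // Region server briefly unavailable: keep the batch and
                    // retry after a pause instead of dropping the writes.
                    try { Thread.sleep(1000); } catch (InterruptedException ie) { return; }
                }
            }
        }
    }

Re-sending a partially applied batch is safe because Puts are idempotent. As J-D notes, though, this only works if newly written data does not have to be readable immediately; anything still sitting in the queue is invisible to readers.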
Re: HBase is not running.
Hi Jean, this is my /etc/hosts:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
127.0.0.1 localhost
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

On Thu, Apr 25, 2013 at 5:22 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Hi Yves, you seem to have a network configuration issue with your installation: java.net.BindException: Cannot assign requested address and ip72-215-225-9.at.at.cox.net/72.215.225.9:0. How is your hosts file configured? You need to have your host name pointing to your local IP (and not 127.0.0.1).

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:

My mistake, I thought I had all of those logs. This is what I currently have: http://bin.cakephp.org/view/2112130549 I have $JAVA_HOME set to /usr/java/jdk1.7.0_17. Thanks for your help!

On Thu, Apr 25, 2013 at 4:42 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Hi Mohammad, he is running standalone, so there is no need to update the ZooKeeper quorum yet. Yves, can you share the entire hbase-ysg-master-ysg.connect.log file? Not just the first lines. Or is what you sent already all of it? Also, what have you done so far? Downloaded 0.94, extracted it, set up JAVA_HOME, and ran bin/start-hbase.sh? JMS

2013/4/25 Mohammad Tariq donta...@gmail.com:

Hello Yves, the log seems to be incomplete. Could you please share the complete logs? Have you set the hbase.zookeeper.quorum property properly? Is your Hadoop running fine?
Warm Regards, Tariq
https://mtariq.jux.com/ cloudfront.blogspot.com

On Fri, Apr 26, 2013 at 2:00 AM, Yves S. Garret yoursurrogate...@gmail.com wrote:

Hi again. I have 3 log files and only one of them has anything in it. I'm assuming you're talking about the directory ${APACHE_HBASE_HOME}/logs, yes? Here are the file names:
-rw-rw-r--. 1 user user 12465 Apr 25 14:54 hbase-ysg-master-ysg.connect.log
-rw-rw-r--. 1 user user 0 Apr 25 14:54 hbase-ysg-master-ysg.connect.out
-rw-rw-r--. 1 user user 0 Apr 25 14:54 SecurityAuth.audit
Also, to answer your question about the UI: I tried that URL (I'm doing all of this on my laptop, just to learn, at the moment) and neither it nor localhost:60010 worked. So the answer to your question is that the UI is not showing up. Could this be because I'm not far enough along in the tutorial, perhaps? Thanks again!

On Thu, Apr 25, 2013 at 4:22 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

There is no stupid question ;) Are the logs truncated? Is there anything else after that, or is that all you have? For the UI, you can access it at http://192.168.X.X:60010/master-status (replace the Xs with your own IP). You should see some information about your HBase cluster (even in standalone mode). JMS

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:

Here are the logs; what should I be looking for? It seems like everything is fine for the moment, no? http://bin.cakephp.org/view/2144893539 The web UI? What do you mean? Sorry if this is a stupid question, I'm a Hadoop newb.

On Thu, Apr 25, 2013 at 3:19 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Before trying the shell, can you look at the server logs and see if everything is fine? Also, is the web UI working fine?

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:

Ok, spoke too soon :) . I ran this command [ create 'test', 'cf' ] and this is the result that I got: http://bin.cakephp.org/view/168926019 This is after running help (followed by enter), which ran just fine.
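To make JMS's advice concrete: judging by the log file name hbase-ysg-master-ysg.connect.log, the machine's hostname is presumably ysg.connect, so a working hosts file might look like this (192.168.1.10 stands in for whatever the machine's real LAN IP is):

    127.0.0.1    localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1          localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.1.10 ysg.connect

The point is that the hostname resolves to a real interface address rather than to loopback, so HBase does not end up trying to bind its services to an address the machine does not own.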
On Thu, Apr 25, 2013 at 1:23 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Hi Yves, 0.95.0 is a developer version. If you are starting with HBase, I would recommend choosing a more stable version, like 0.94.6.1. Regarding the 3 choices you list below:
1) This one is HBase 0.95 running over Hadoop 1.0.
2) This one is HBase 0.95 running over Hadoop 2.0.
3) This one is the HBase source classes.
Again, I think you are better off going with a stable version for your first steps: http://www.bizdirusa.com/mirrors/apache/hbase/stable/ Would you mind retrying your tests with this version and letting me know if it works better? JM

2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:

Hi, I'm trying to run 0.95.0. I've tried both and nothing worked. I do have another question. When I go to download HBase, I get the following 3 choices:
Re: Coprocessors
Hi, let's reiterate what you've said. You have a set of objects O1, O2, ... On, and you have some field type F1 which is part of your composite key. You want to fetch back a set of rows and then do some aggregation on the attributes. There was a similar discussion on this where someone had a random set of values and was having performance issues. If your set of objects is in sort order and you have only one field type F1, you should be able to do the multi-gets. Are you currently using multi-gets?

On Apr 25, 2013, at 5:36 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net wrote:

Michael: Fair enough. Let me see what relevant information I can add to what I've already said:
1. To Lars' point, my 250K keys are unlikely to fall into fewer than 250K sub-ranges.
2. Here's a bit more about my schema:
2.1 My rowkeys are composed of 2 entities - let's call them object-id and field-type. An object (O1) has 100s of field types (F1, F2, F3, ...). Each object-id/field-type pair has 100s of attributes (A1, A2, A3, ...).
2.2 My rowkeys are O1-F1, O1-F2, O1-F3, etc.
2.3 My primary application (not the one my original post was about) accesses by these rowkeys.
2.4 My application that does aggregation is given a bunch of objects O1, O2, O3, a field-type F1, a bunch of attributes A1, A2, and some computation to perform.
2.5 As you can see, scans are unlikely to be useful when fetching O1-F1, O2-F1, O3-F1, etc.

Viral: How do I tackle aggregation using observers? Let's say I override the postGet method. I do a multi-get from my client and my method gets called on each region server for each row. What is the next step with this approach?

- Original Message -
From: user@hbase.apache.org
To: la...@apache.org, user@hbase.apache.org
Cc: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
At: Apr 25 2013 18:12:46

I don't think Phoenix will solve his problem. He also needs to explain more about his problem before we can start to think about it.

On Apr 25, 2013, at 4:54 PM, lars hofhansl la...@apache.org wrote:

You might want to have a look at Phoenix (https://github.com/forcedotcom/phoenix), which does that and more, and gives you a SQL/JDBC interface.
-- Lars

From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net
To: user@hbase.apache.org
Sent: Thursday, April 25, 2013 2:44 PM
Subject: Coprocessors

Folks: This is my first post on the HBase user mailing list. I have the following scenario:

I have an HBase table of up to a billion keys. I'm looking to support an application where, on some user action, I'd need to fetch multiple columns for up to 250K keys and do some sort of aggregation on them. Fetching all that data and doing the aggregation in my application takes about a minute. I'm looking to co-locate the aggregation logic with the region servers to:
a. distribute the aggregation, and
b. avoid having to fetch large amounts of data over the network (this could potentially be cross-datacenter).

Neither observers nor aggregation endpoints seem to work for this use case. Observers don't return data back to the client, while aggregation endpoints work in the context of scans, not a multi-get (are these correct assumptions?).

I'm looking to write a service that runs alongside the region servers and acts as a proxy between my application and the region servers. I plan to use the logic in the HBase client's HConnectionManager to segment my request of 1M rowkeys into sub-requests per region server. These are sent over to the proxy, which fetches the data from the region server, aggregates locally, and sends data back.
Does this sound reasonable or even a useful thing to pursue? Regards, -sudarshan
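To sketch what Gary's endpoint suggestion could look like on a 0.94-era cluster (an illustrative outline, not tested code: the AggProtocol/SumEndpoint/AggClient names are made up, and the exact server-side API varies by version), the client sends the full key set to every region's endpoint, and each endpoint filters down to the keys its region actually hosts, exactly the variant Gary describes, request-size overhead included:

    import java.io.IOException;
    import java.util.List;
    import java.util.Map;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.coprocessor.Batch;
    import org.apache.hadoop.hbase.coprocessor.BaseEndpointCoprocessor;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.ipc.CoprocessorProtocol;
    import org.apache.hadoop.hbase.regionserver.HRegion;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical endpoint interface: sums one column over the given rows.
    interface AggProtocol extends CoprocessorProtocol {
        long sum(List<byte[]> rows, byte[] family, byte[] qualifier) throws IOException;
    }

    // Server side: loaded on each region server; each region aggregates locally.
    class SumEndpoint extends BaseEndpointCoprocessor implements AggProtocol {
        public long sum(List<byte[]> rows, byte[] family, byte[] qualifier) throws IOException {
            HRegion region = ((RegionCoprocessorEnvironment) getEnvironment()).getRegion();
            long total = 0;
            for (byte[] row : rows) {
                // Skip keys this region does not host.
                if (!HRegion.rowIsInRange(region.getRegionInfo(), row)) continue;
                Result r = region.get(new Get(row), null); // (0.94-style signature)
                byte[] v = r.getValue(family, qualifier);
                if (v != null) total += Bytes.toLong(v);
            }
            return total;
        }
    }

    // Client side: one parallel call per region; partial sums are merged here.
    class AggClient {
        static long aggregate(HTable table, final List<byte[]> rows,
                              final byte[] fam, final byte[] qual) throws Throwable {
            Map<byte[], Long> partials = table.coprocessorExec(
                AggProtocol.class, null, null,  // null start/end keys = all regions
                new Batch.Call<AggProtocol, Long>() {
                    public Long call(AggProtocol instance) throws IOException {
                        return instance.sum(rows, fam, qual);
                    }
                });
            long total = 0;
            for (Long p : partials.values()) total += p;
            return total;
        }
    }

This mirrors Gary's caveat: the whole row-key list travels to every region, so for 250K keys the per-region request overhead is substantial unless the client first splits the keys by region.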
Re: writing and reading from a region at once
"But it seems that I'm losing writes somewhere; is it possible the writes could fail silently?"

Which version are you using? Why do you say writes are missed silently? Has the current read, the one that was going on, not returned the row you just wrote? Or did you create a new scan afterwards and find the written data missing there as well?
-Anoop-

On Fri, Apr 26, 2013 at 3:04 AM, Jean-Daniel Cryans jdcry...@apache.org wrote:

Inline. J-D

On Thu, Apr 25, 2013 at 1:09 PM, Aaron Zimmerman azimmer...@sproutsocial.com wrote:

Hi, if a region is being written to and a scanner takes a lease out on the region, what will happen to the writes? Is there a concept of transaction isolation levels?

There's MVCC, so reads can happen while someone else is writing. What you should expect from HBase is read committed.

I don't see errors in Puts while the tables are being scanned, but it seems that I'm losing writes somewhere. Is it possible the writes could fail silently?

Is it temporary while you're scanning, or is there really data missing at the end of the day? The former might happen on some older HBase versions, while the latter should never happen unless you lower the durability level yourself and have machine failures.
J-D
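One client-side cause of apparently silent write loss worth ruling out (not necessarily Aaron's issue, just a common gotcha with the 0.94-era client): when autoflush is disabled, Puts sit in the client-side write buffer and are lost without any error if the process exits before flushing. A minimal sketch:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WriteBufferGotcha {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "t");
            table.setAutoFlush(false);               // batches Puts client-side
            try {
                Put p = new Put(Bytes.toBytes("row1"));
                p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
                table.put(p);                        // buffered, NOT yet on the server
            } finally {
                // Without flushCommits()/close(), buffered Puts vanish silently
                // if the client exits - no error is ever raised.
                table.flushCommits();
                table.close();
            }
        }
    }

If writes really are reaching the server, then as J-D says the data should be there at the end of the day, and only transient invisibility during a concurrent scan is expected under read committed.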