CfP 2013 Workshop on Middleware for HPC and Big Data Systems (MHPC'13)

2013-04-25 Thread MHPC 2013
We apologize if you receive multiple copies of this message.
===

CALL FOR PAPERS

2013 Workshop on

Middleware for HPC and Big Data Systems

MHPC '13

as part of Euro-Par 2013, Aachen, Germany

===

Date: August 27, 2013

Workshop URL: http://m-hpc.org

Springer LNCS

SUBMISSION DEADLINE:

May 31, 2013 - LNCS Full paper submission (rolling abstract submission)
June 28, 2013 - Lightning Talk abstracts


SCOPE

Extremely large, diverse, and complex data sets are generated from
scientific applications, the Internet, social media and other applications.
Data may be physically distributed and shared by an ever larger community.
Collecting, aggregating, storing and analyzing large data volumes
presents major challenges. Processing such amounts of data efficiently
has become a bottleneck for scientific discovery and technological
advancement. In addition, making the data accessible, understandable and
interoperable poses unsolved problems. Novel middleware architectures,
algorithms, and application development frameworks are required.

In this workshop we are particularly interested in original work at the
intersection of HPC and Big Data with regard to middleware handling
and optimization. In scope are existing and proposed middleware for HPC
and big data, including analytics libraries and frameworks.

The goal of this workshop is to bring together software architects,
middleware and framework developers, and data-intensive application
developers, as well as users from the scientific and engineering community,
to exchange their experience in processing large datasets and to report
their scientific achievements and innovative ideas. The workshop also offers
a dedicated forum for these researchers to assess the state of the art, to
discuss problems and requirements, to identify gaps in current and planned
designs, and to collaborate on strategies for scalable data-intensive
computing.

The workshop will be one day in length, composed of 20-minute paper
presentations, each followed by a 10-minute discussion session.
Presentations may be accompanied by interactive demonstrations.


TOPICS

Topics of interest include, but are not limited to:

- Middleware including: Hadoop, Apache Drill, YARN, Spark/Shark, Hive,
Pig, Sqoop, HBase, HDFS, S4, CIEL, Oozie, Impala, Storm and Hyracks
- Data-intensive middleware architecture
- Libraries/Frameworks including: Apache Mahout, Giraph, UIMA and GraphLab
- Next-generation databases including Apache Cassandra, MongoDB and
CouchDB/Couchbase
- Schedulers including Cascading
- Middleware for optimized data locality/in-place data processing
- Data handling middleware for deployment in virtualized HPC environments
- Parallelization and distributed processing architectures at the
middleware level
- Integration with cloud middleware and application servers
- Runtime environments and system level support for data-intensive computing
- Skeletons and patterns
- Checkpointing
- Programming models and languages
- Big Data ETL
- Stream processing middleware
- In-memory databases for HPC
- Scalability and interoperability
- Large-scale data storage and distributed file systems
- Content-centric addressing and networking
- Execution engines, languages and environments including CIEL/Skywriting
- Performance analysis, evaluation of data-intensive middleware
- In-depth analysis and performance optimizations in existing data-handling
middleware, focusing on indexing/fast storing or retrieval between compute
and storage nodes
- Highly scalable middleware optimized for minimum communication
- Use cases and experience for popular Big Data middleware
- Middleware security, privacy and trust architectures

DATES

Papers:
Rolling abstract submission
May 31, 2013 - Full paper submission
July 8, 2013 - Acceptance notification
October 3, 2013 - Camera-ready version due

Lightning Talks:
June 28, 2013 - Deadline for lightning talk abstracts
July 15, 2013 - Lightning talk notification

August 27, 2013 - Workshop Date


TPC

CHAIR

Michael Alexander (chair), TU Wien, Austria
Anastassios Nanos (co-chair), NTUA, Greece
Jie Tao (co-chair), Karlsruhe Institute of Technology, Germany
Lizhe Wang (co-chair), Chinese Academy of Sciences, China
Gianluigi Zanetti (co-chair), CRS4, Italy

PROGRAM COMMITTEE

Amitanand Aiyer, Facebook, USA
Costas Bekas, IBM, Switzerland
Jakob Blomer, CERN, Switzerland
William Gardner, University of Guelph, Canada
José Gracia, HPC Center of the University of Stuttgart, Germany
Zhenghua Guo, Indiana University, USA
Marcus Hardt, Karlsruhe Institute of Technology, Germany
Sverre Jarp, CERN, Switzerland
Christopher Jung, Karlsruhe Institute of Technology, Germany
Andreas Knüpfer, Technische Universität Dresden, Germany
Nectarios Koziris, National Technical University of Athens, Greece
Yan Ma, Chinese Academy of Sciences, China
Martin Schulz, Lawrence Livermore National Laboratory, USA
Viral Shah, 

Re: Re: While starting 3-nodes cluster hbase: WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null

2013-04-25 Thread Jean-Marc Spaggiari
Hi John,

bin/start-dfs.sh is to start Hadoop, right? Why are you trying to start
that from the HBase master? You said that your Hadoop is already configured
and working fine. You should start HBase with bin/start-hbase.sh, which
will start the master and the regionservers.

Also, in your hosts file, please replace "127.0.0.1 debian01" with the
real IP. It will help.
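
For example, assuming debian01's real address is 192.168.1.10 (adjust to
your network), the entry in /etc/hosts would look like:

192.168.1.10    debian01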

Keep us posted.

JM

2013/4/24 John Foxinhead john.foxinh...@gmail.com:
 I tried this way: I started Hadoop, and it's all OK, so I went on.
 I set HBASE_MANAGES_ZK=false so I could start ZooKeeper myself, try it, and
 check the problem.
 I changed zookeeper.quorum to jobtracker,datanode1 instead of
 zookeeper1,zookeeper2 because the IPs are the same, but the hostnames
 reported in the log file were jobtracker and datanode1, while in
 zookeeper.quorum I had set zookeeper1,zookeeper2. I changed this property
 because the HBase documentation recommends setting the same hostnames that
 appear in the log file; otherwise the ZooKeeper nodes may not recognise
 themselves as quorum members and the master may have problems logging in.
 Anyway, I made this change on all the nodes.
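
 For reference, the property in hbase-site.xml now looks like this (the
 full property name is hbase.zookeeper.quorum; the hostnames are the ones
 from my setup):

 <property>
   <name>hbase.zookeeper.quorum</name>
   <value>jobtracker,datanode1</value>
 </property>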
 I started ZooKeeper on the ZooKeeper nodes, and it works now: when I log
 in with bin/hbase zkcli from the master node (which is not running
 ZooKeeper), it connects to datanode1:2181 (where ZooKeeper is running) and
 commands like "ls /" work.
 When I use bin/start-dfs from the master, the only output is "starting
 master ...". Looking at the log file, I noticed that the master sends a
 ZooKeeper request to 0.0.0.0:2181 and accepts a response from
 127.0.0.1:2181, while with bin/hbase zkcli it connects to datanode1:2181.
 This is very strange, I think. What could be the reason?


Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Jean-Marc Spaggiari
Hi Robin,

Were you finally able to find the issue?

JM

2013/4/18 Robin Gowin landr...@gmail.com:
 same results with @null (I had earlier tried nil, same thing)

 hbase(main):045:0> uu = @hbase.table('robin1', @null)
 => Hbase::Table - robin1
 hbase(main):046:0> uu.scan(ss)
 NoMethodError: undefined method `internal_command' for nil:NilClass

 One thing I'm curious about - might not matter - the output of my
 @hbase.table command looks like this

 => Hbase::Table - robin1

 but the output of yours (and what is in the book) looks like this

 => #<Hbase::Table:0x3a8cbb70>




 On Thu, Apr 18, 2013 at 12:17 PM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org wrote:

 Interesting...

 I tried the same locally and it's working fine for me.

  hbase(main):010:0> uu = @hbase.table('TestAcidGuarantees', @formatter)
  => #<Hbase::Table:0x3a8cbb70
  @table=#<Java::OrgApacheHadoopHbaseClient::HTable:0x6d65d417
  hbase(main):011:0> ss = {COLUMNS => ['A']}
  => {"COLUMNS"=>["A"]}
  hbase(main):012:0> uu.scan(ss)
  => {"test_row_0"=>{"A:col0"=>"timestamp=1366299718358,
  value=\\x14\\xC2\\xF0\\x0

 I did a cut-and-paste from what you sent and only changed the table name.

 Can you try with @null instead of @formatter?

 JM

 2013/4/18 Robin Gowin landr...@gmail.com

  Hi Jean-Marc,
 
   Thanks for your quick reply. Yes I am trying to do something like that.
   For brevity I combined everything into one jruby command.

   My command can be split into two and I get the same error. For example,
   this shows a similar problem using the scan method:
 
   hbase(main):041:0> uu = @hbase.table('robin1', @formatter)
   => Hbase::Table - robin1
   hbase(main):042:0> ss = {COLUMNS => ['cf1']}
   => {"COLUMNS"=>["cf1"]}
   hbase(main):043:0> uu.scan(ss)
   NoMethodError: undefined method `internal_command' for
   #<Shell::Formatter::Console:0x15f6ae4d>
 
   hbase(main):044:0> scan 'robin1', ss
   ROW                COLUMN+CELL
    myrow1            column=cf1:q1, timestamp=1366046037514, value=value2
    myrow1            column=cf1:q2, timestamp=1366046489446, value=value2b
    myrow1            column=cf1:q2b, timestamp=1366046497799, value=value2bb
    myrow2            column=cf1:q2b, timestamp=1366046731281, value=value2bbce
    myrow2            column=cf1:q2be, timestamp=1366046748001, value=value2bbce
   2 row(s) in 0.0460 seconds
 
 
 
 
 
  On Thu, Apr 18, 2013 at 11:54 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
   Hi Robin,
  
    I'm not sure about your command line
    (@hbase.table('robin1',@formatter).scan({'COLUMNS' => ['cf1']}))

    Are you trying to do something like this? scan 'robin1', {COLUMNS
    => ['cf1']}
  
   JM
  
   2013/4/18 Robin Gowin landr...@gmail.com
  
     This feels like a stupid mistake I'm making somewhere, but I searched
     for quite a while and did not find any evidence that anybody else
     reported this problem.

     I'm trying to use the hbase shell to call the 'scan()' method and I
     keep getting the same error message. A regular scan of the table works
     fine.

     I'd appreciate any assistance.
   
     hbase(main):005:0> scan 'robin1'
     ROW                COLUMN+CELL
      myrow1            column=cf1:q1, timestamp=1366046037514, value=value2
      myrow1            column=cf1:q2, timestamp=1366046489446, value=value2b
      myrow1            column=cf1:q2b, timestamp=1366046497799, value=value2bb
      myrow2            column=cf1:q2b, timestamp=1366046731281, value=value2bbce
      myrow2            column=cf1:q2be, timestamp=1366046748001, value=value2bbce
     2 row(s) in 0.1290 seconds
   
     hbase(main):007:0> @hbase.table('robin1', @formatter).scan({'COLUMNS'
     => ['cf1']})
     NoMethodError: undefined method `internal_command' for
     #<Shell::Formatter::Console:0x15f6ae4d>
   
this method appears to exist
   
[cloudera@localhost test]$ grep internal_command
/usr/lib/hbase/lib/ruby/shell.rb
  internal_command(command, :command, *args)
def internal_command(command, method_name= :command, *args)
   
   
     info about my environment:

     [cloudera@localhost test]$ hbase version
     13/04/18 11:17:33 INFO util.VersionInfo: HBase 0.94.2-cdh4.2.0
     13/04/18 11:17:33 INFO util.VersionInfo: Subversion
     file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hbase-0.94.2-cdh4.2.0
     -r Unknown
     13/04/18 11:17:33 INFO util.VersionInfo: Compiled by jenkins on Fri Feb
     15 11:51:18 PST 2013
     [cloudera@localhost test]$ java -version
     java version "1.6.0_31"
     Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
     Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)

Re: How to remove all traces of a dropped table.

2013-04-25 Thread Jean-Marc Spaggiari
Hi David,

After you dropped your table, did you look into the ZK server to see
if all znodes related to this table got removed too?

Also, have you tried to run HBCK after the drop to see if your system is
fine?
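
A minimal check, assuming the default /hbase parent znode (the exact znode
layout varies by version), would be something like:

$ bin/hbase hbck     # report any inconsistencies left behind by the drop
$ bin/hbase zkcli
ls /hbase            # look for znodes still referring to the dropped table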

JM

2013/4/16 David Koch ogd...@googlemail.com:
 Hello,

 We had problems with not being able to scan over a large (~8k regions)
 table so we disabled and dropped it and decided to re-import data from
 scratch into a table with the SAME name. This never worked and I list some
 log extracts below.

 The only way to make the import go through was to import into a table with
 a different name. Hence my question:

 How do I remove all traces of a table which was dropped? Our cluster
 consists of 30 machines, running CDH4.0.1 with HBase 0.92.1.

 Thank you,

 /David

 Log stuff:

 The Mapper job reads text and its output is Puts. A couple of minutes into
 the job it fails with the following message in the task log:

 2013-04-16 17:11:16,918 WARN
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
 Encountered problems when prefetch META table:
 java.io.IOException: HRegionInfo was null or empty in Meta for my_table,
 row=my_table,\xC1\xE7T\x01a8OM\xB0\xCE/\x97\x88\xB7y,99

 <repeat 9 times>

 2013-04-16 17:11:16,924 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
 Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
 2013-04-16 17:11:16,926 ERROR
 org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
 as:jenkins (auth:SIMPLE) cause:java.io.IOException: HRegionInfo was null or
 empty in .META.,
 row=keyvalues={my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:server/1366035344492/Put/vlen=22,
 my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:serverstartcode/1366035344492/Put/vlen=8}
 2013-04-16 17:11:16,926 WARN org.apache.hadoop.mapred.Child: Error running
 child
 java.io.IOException: HRegionInfo was null or empty in .META.,
 row=keyvalues={my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:server/1366035344492/Put/vlen=22,
 my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:serverstartcode/1366035344492/Put/vlen=8}
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:957)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:818)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1524)
 at
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
 at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:943)
 at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:820)
 at org.apache.hadoop.hbase.client.HTable.put(HTable.java:795)
 at
 org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:121)
 at
 org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
 at
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:533)
 at
 org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:88)
 at
 org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
 at
 com.mycompany.data.tools.export.Export2HBase$JsonImporterMapper.map(Export2HBase.java:81)
 at
 com.mycompany.data.tools.export.Export2HBase$JsonImporterMapper.map(Export2HBase.java:50)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
 at org.apache.hadoop.mapred.Child.main(Child.java:264)
 2013-04-16 17:11:16,929 INFO org.apache.hadoop.mapred.Task: Runnning
 cleanup for the task

 The master server contains stuff like this:

 WARN org.apache.hadoop.hbase.master.CatalogJanitor: REGIONINFO_QUALIFIER is
 empty in
 keyvalues={my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:server/1366035344492/Put/vlen=22,
 my_table,\xA4\xDC\x82\x84OAB\xC1\xBA\xE9\xE7\xA9\xE8\x81\x16\x09,1365996567593.50bb0cbde855cbdc4006051531dba162./info:serverstartcode/1366035344492/Put/vlen=8}


 We tried pre-splitting the table, 

Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Robin Gowin
Hi JM,

Thank you for following up!

No, the issue still exists. I have temporarily abandoned jruby for this
project, and am using curl and REST for the time being.

Since it's working properly for you and others, I suspect that it's either
a version mismatch or an installation
problem or some configuration issue. If you have time, I'm willing to
continue debugging.

Robin


On Thu, Apr 25, 2013 at 9:24 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 Hi Robin,

 Were you finally able to find the issue?

 JM


Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Jean-Marc Spaggiari
Something I thought about is that you might have a Ruby lib installed
somewhere else that the shell is using. Someone faced something
similar recently

Take a look at this thread:
http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3CEE737D80-45B4-4A33-817D-28ED9C1CB0AE%40gmail.com%3E

Can you see if you have something like that in your system?

JM


Re: How to remove all traces of a dropped table.

2013-04-25 Thread Kevin O'dell
David,

  I have only seen this once before, and I actually had to drop the META
table and rebuild it with HBCK. After that the import worked. I am pretty
sure I cleaned up the ZK as well. It was very strange indeed. If you can
reproduce this, can you open a JIRA, as this is no longer a one-off
scenario?

What are the appropriate steps before performing hardware maintenance?

2013-04-25 Thread Dan Crosta
We have to perform maintenance on one of our HDFS DataNode / HBase
RegionServer machines for a few hours. What are the right steps to take
before doing the maintenance in order to ensure limited impact to the
cluster and (thrift) clients of the cluster, both for HDFS and HBase?

After the maintenance, are there any special steps required to add the node 
back to the cluster, or can we simply restart the services and HDFS/HBase take 
care of the rest?

Thanks,
- Dan

Re: What are the appropriate steps before performing hardware maintenance?

2013-04-25 Thread Jean-Marc Spaggiari
Hi Dan,

You might want to take a look at bin/graceful_stop.sh. It will move
all the regions hosted by your RS to other RSs before stopping it
gracefully. After the maintenance, simply start the RS/DN back up and it
will be added back to the cluster. The load balancer will then assign some
regions back to it. You will lose some data locality for the regions
which are going to be moved.
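
For example, assuming the region server to service runs on a host named
rs-node-1 (substitute your own hostname), a rough sequence would be:

$ bin/graceful_stop.sh rs-node-1          # unload regions, then stop the RS
# ... perform the maintenance ...
$ bin/hbase-daemon.sh start regionserver  # run on rs-node-1 to rejoin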

JM



Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Robin Gowin
I looked at that thread and I do have ruby installed, but I don't think
that is the problem, unless maybe there is a version mismatch. I wasn't
sure if jruby needs to be installed, and if so, what its command line is.

Here are the relevant versions as far as I can tell. The problem still
exists.

[cloudera@localhost ~]$ ruby -v
ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
[cloudera@localhost ~]$ irb -v
irb 0.9.5(05/04/13)
[cloudera@localhost ~]$ hbase -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
[cloudera@localhost ~]$ which rvm
/usr/bin/which: no rvm in
(/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/java/jdk1.6.0_31/bin:/home/cloudera/bin:/sbin:/usr/java/jdk1.6.0_31/bin)
[cloudera@localhost ~]$ which jruby
/usr/bin/which: no jruby in
(/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/java/jdk1.6.0_31/bin:/home/cloudera/bin:/sbin:/usr/java/jdk1.6.0_31/bin)


Robin


On Thu, Apr 25, 2013 at 9:40 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 Something I thought about is that you might have a Ruby lib installed
 somewhere else that the shell is using. Someone faced something
 similar recently

 Take a look at this thread:


http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3CEE737D80-45B4-4A33-817D-28ED9C1CB0AE%40gmail.com%3E

 Can you see if you have something like that in your system?

 JM



Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Jean-Marc Spaggiari
Is it easy for you to de-install it and re-install it? If so, would
you mind giving it a try?
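
Note that the HBase shell normally runs on the JRuby bundled with HBase
(the jruby jar under the HBase lib directory), so a stray Ruby library on
the load path, rather than the ruby binary itself, would be the thing to
look for. Assuming a standard layout, you can check which JRuby jar is in
use with something like:

$ ls $HBASE_HOME/lib | grep -i jruby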




Re: What are the appropriate steps before performing hardware maintenance?

2013-04-25 Thread Dan Crosta
Sorry, I should have mentioned before -- we are using CDH 4.2, which does not 
package the graceful_stop script. Do you happen to know if there's a way to do 
this through the CDH manager? Perhaps the decommission action does something 
similar? My impression is that decommission is more heavy-handed, but if 
that's the most convenient route, that'll work for us.

Thanks,
- Dan




Re: What are the appropriate steps before performing hardware maintenance?

2013-04-25 Thread Jean-Marc Spaggiari
Moving to scm-us...@cloudera.org then (hbase in BCC).

Hi Dan,

The best way to know how to achieve this with Cloudera Manager is to
ask on the scm-users list.

I'm not yet familiar enough with CM to answer your question, so I will
let someone else confirm.

JMS






HBase is not running.

2013-04-25 Thread Yves S. Garret
Hi all,

I'm having an issue with getting HBase to run. I'm following this tutorial:
http://hbase.apache.org/book.html#start_hbase

When I run the command [ bin/start-hbase.sh start ], nothing happens at all.
My question is why. I have Java 1.7 on this machine; do I _need_ to get 1.6?


Re: HBase is not running.

2013-04-25 Thread Jean-Marc Spaggiari
Hi Yves,

Which version of HBase are you trying with? It should be working with Java
1.7.

To start HBase, are you running "bin/start-hbase.sh start" as you said
below, or only "bin/start-hbase.sh"? The latter is the correct one, while
the former has an extra "start" argument that is not required at the end.
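
In other words (a minimal sketch, assuming you run it from the HBase
install directory):

$ bin/start-hbase.sh   # note: no extra "start" argument
$ jps                  # once started, an HMaster process should be listed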

JM




Re: HBase - prioritizing writes over reads?

2013-04-25 Thread Jean-Daniel Cryans
Short answer is no, there's no knob or configuration to do that.

Longer answer is it depends. Are the reads and writes going to different
regions/tables? If so, disable the balancer and take it in charge
yourself by segregating the offending regions on their own RSs.

I also see you have the requirement to take incoming data no matter what.
Well, this currently cannot be guaranteed in HBase, since a RS failure will
incur some limited unavailability while the ZK session times out, the logs
are replayed and the regions are reassigned. I don't know what kind of SLA
you have, but it sounds like even without your reads problem you need to do
something client-side to take care of this. Local buffers maybe? It would
work as long as you don't need to serve that new data right away (unless
you also start serving from the local buffer, but it's getting complicated).
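
A very rough sketch of what I mean by a local buffer (the table name is a
placeholder; this ignores ordering, sizing and durability of the buffer
itself, and HTable is not thread-safe, so use one instance per thread):

import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

public class BufferedWriter {
  private final HTable table;
  // Puts that failed (e.g. during a RS failover) wait here for a retry,
  // so incoming data is not dropped on the floor.
  private final Queue<Put> localBuffer = new ConcurrentLinkedQueue<Put>();

  public BufferedWriter() throws IOException {
    Configuration conf = HBaseConfiguration.create();
    table = new HTable(conf, "my_table");  // placeholder table name
  }

  public void write(Put put) {
    try {
      table.put(put);
    } catch (IOException e) {
      localBuffer.add(put);  // keep the data, retry later
    }
  }

  // Call periodically, or once the cluster has recovered.
  public void retryBuffered() {
    Put put;
    while ((put = localBuffer.poll()) != null) {
      write(put);  // goes back on the buffer if it fails again
    }
  }
}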

Hope this helps,

J-D


On Wed, Apr 24, 2013 at 3:25 AM, kzurek kzu...@proximetry.pl wrote:

 Is it possible to prioritize writes over reads in HBase? I'm facing some
 I/O read related issues that influence my write clients and the cluster in
 general (constantly growing store files on some RS). Due to the fact that
 I cannot let myself lose/skip incoming data, I would like to guarantee
 that in case of extensive reads I will be able to limit incoming read
 requests, so that write requests won't be influenced. Is it possible? If
 so, what would be the best way to do that, and where should it be placed
 (on the client or the cluster side)?







 --
 View this message in context:
 http://apache-hbase.679495.n3.nabble.com/HBase-prioritizing-writes-over-reads-tp4042838.html
 Sent from the HBase User mailing list archive at Nabble.com.



Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Hi, I'm trying to run 0.95.0.

I've tried both and nothing worked.

I do have another question.  When I go to download hbase, I get the
following 3 choices:
http://www.bizdirusa.com/mirrors/apache/hbase/hbase-0.95.0/

The 3 choices:
- hbase-0.95.0-hadoop1-bin.tar.gz (what I'm using)
- hbase-0.95.0-hadoop2-bin.tar.gz
- hbase-0.95.0-src.tar.gz

Which of those should I download and work with?  The instructions
were somewhat vague on that and I think this might be causing me
some headaches in this process.

By the way, thank you for your answer, very appreciated!





Re: HBase is not running.

2013-04-25 Thread Jean-Marc Spaggiari
Hi Yves,

0.95.0 is a developer version. If you are starting with HBase, I would
recommend you choose a more stable version like 0.94.6.1.

Regarding the 3 choices you listed below:
1) This one is HBase 0.95 running over Hadoop 1.0
2) This one is HBase 0.95 running over Hadoop 2.0
3) This one is the HBase source code.

Again, I think you are better off going with a stable version for the
first steps: http://www.bizdirusa.com/mirrors/apache/hbase/stable/

Would you mind retrying your tests with this version and letting me know
if it works better?

JM




Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Ah, my mistake.  I'll re-try with the more stable version.





Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Robin Gowin
I removed ruby and reinstalled it; same results.



Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Hi, I have a small update.  The stable build seems to be working.

Thanks again for your help.




Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Ok, spoke too soon :) .

I ran this command [ create 'test', 'cf' ] and this is the result that I
got:
http://bin.cakephp.org/view/168926019

This is after running help<enter> and having it run just fine.





Re: HBase is not running.

2013-04-25 Thread Jean-Marc Spaggiari
Before trying the shell, can you look at the server logs and see if
everything is fine?

Also, is the web UI working fine?
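
By default in 0.94-era releases the master web UI listens on port 60010,
so on a local standalone setup it should be reachable at
http://localhost:60010/ (adjust the host if HBase runs elsewhere).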



Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Jean-Marc Spaggiari
No, don't re-install it ;)

Remove it and retry. To make sure it's not using any lib anywhere else...

JM

2013/4/25 Robin Gowin landr...@gmail.com:
 I removed ruby and reinstalled it; same results.

 On Thu, Apr 25, 2013 at 11:59 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org wrote:

 Is it easy for you to de-install it and re-install it? If so, would
 you mind giving it a try?

 2013/4/25 Robin Gowin landr...@gmail.com:
  I looked at that thread and I do have ruby installed, but I don't think
  that is the problem, unless maybe there is a version mismatch? I wasn't
  sure if jruby needs to be installed and, if so, what its command line is.

  Here are the relevant versions as far as I can tell. The problem still
  exists.
 
  [cloudera@localhost ~]$ ruby -v
  ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
  [cloudera@localhost ~]$ irb -v
  irb 0.9.5(05/04/13)
  [cloudera@localhost ~]$ hbase -version
  java version "1.6.0_31"
  Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
  Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
  [cloudera@localhost ~]$ which rvm
  /usr/bin/which: no rvm in
 
 (/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/java/jdk1.6.0_31/bin:/home/cloudera/bin:/sbin:/usr/java/jdk1.6.0_31/bin)
  [cloudera@localhost ~]$ which jruby
  /usr/bin/which: no jruby in
 
 (/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/sbin:/usr/java/jdk1.6.0_31/bin:/home/cloudera/bin:/sbin:/usr/java/jdk1.6.0_31/bin)
 
 
  Robin
 
 
  On Thu, Apr 25, 2013 at 9:40 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
  Something I thought about is that you might have a Ruby lib installed
  somewhere else that the shell is using. Someone faced something
  similar recently.
 
  Take a look at this thread:
 
 
 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3CEE737D80-45B4-4A33-817D-28ED9C1CB0AE%40gmail.com%3E
 
  Can you see if you have something like that in your system?
 
  JM
 



writing and reading from a region at once

2013-04-25 Thread Aaron Zimmerman
Hi,

  If a region is being written to and a scanner takes a lease out on the
region, what will happen to the writes?  Is there a concept of Transaction
Isolation Levels?

  I don't see errors in Puts while the tables are being scanned.  But it seems
that I'm losing writes somewhere; is it possible the writes could fail silently?
  
thanks,

Aaron Zimmerman



Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Here are the logs, what should I be looking for?  Seems like everything
is fine for the moment, no?

http://bin.cakephp.org/view/2144893539

The web UI?  What do you mean?  Sorry if this is a stupid question, I'm
a Hadoop newb.




Re: HBase is not running.

2013-04-25 Thread Jean-Marc Spaggiari
There is no stupid question ;)

Are the logs truncated? Is there anything else after that, or is that all you have?

For the UI, you can access it at http://192.168.X.X:60010/master-status

Replace the Xs with your own IP. You should see some information about
your HBase cluster (even in standalone mode).

JMS




Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Hi again.  I have 3 log files and only one of them has anything in it.
I'm assuming that you're talking about the directory
${APACHE_HBASE_HOME}/logs, yes?

Here are the file names:
-rw-rw-r--. 1 user user 12465 Apr 25 14:54 hbase-ysg-master-ysg.connect.log
-rw-rw-r--. 1 user user 0 Apr 25 14:54 hbase-ysg-master-ysg.connect.out
-rw-rw-r--. 1 user user 0 Apr 25 14:54 SecurityAuth.audit

Also, to answer your question about the UI, I tried that URL (I'm doing all
of this on my laptop just to learn at the moment) and neither the URL nor
localhost:60010 worked.  So, the answer to your question is that the UI is
not showing up.  Could this be because I'm not far enough along in the
tutorial, perhaps?

Thanks again!


 



Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Robin Gowin
To be more explicit:

I'm running CentOS release 6.4 in a VM on Mac OS X 10.6.
I ran yum remove ruby and then yum install ruby (inside the VM). Is that
what you meant?

Also I put in some simple print statements in several of the ruby
scripts called by the hbase shell, and they are getting executed.
(for example: admin.rb, hbase.rb, and table.rb)

(I wasn't sure what "it" referred to in your email.)

Robin




Re: HBase is not running.

2013-04-25 Thread Mohammad Tariq
Hello Yves,

   The log seems to be incomplete. Could you please send the complete
logs? Have you set the hbase.zookeeper.quorum property properly? Is your
Hadoop running fine?

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com





Re: HBase is not running.

2013-04-25 Thread Jean-Marc Spaggiari
Hi Mohammad,

He is running standalone, so no need to update the zookeeper quorum yet.

Yes, can you share the entire hbase-ysg-master-ysg.connect.log file,
not just the first lines? Or is what you sent already everything?

So what have you done so far? Downloaded 0.94, extracted it, set up
JAVA_HOME and ran bin/start-hbase.sh?

JMS


Re: undefined method `internal_command' for Shell::Formatter::Console

2013-04-25 Thread Jean-Marc Spaggiari
Hi Robin,

No, the idea is to run yum remove, and then test the HBase shell.
Don't run yum install ruby until we get that fixed. I want to see if
your installed version of Ruby could be causing the issue.

The "it" was referring to the Ruby package.

JM



Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
My mistake.  I thought I had all of those logs.  This is what I currently
have:
http://bin.cakephp.org/view/2112130549

I have $JAVA_HOME set to this:
/usr/java/jdk1.7.0_17
I have extracted 0.94 and ran bin/start-hbase.sh

Thanks for your help!




Re: HBase is not running.

2013-04-25 Thread Jean-Marc Spaggiari
Hi Yves,

You seem to have a network configuration issue with your installation.

java.net.BindException: Cannot assign requested address and
ip72-215-225-9.at.at.cox.net/72.215.225.9:0

How is your hosts file configured? You need to have your host name
pointing to your local IP (and not 127.0.0.1).
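
To make that concrete, a minimal sketch of the hosts file JM describes;
the hostname ysg.connect and the address 192.168.1.10 are hypothetical,
so substitute your machine's own values. The machine's hostname should
map to the routable LAN address, not to the loopback:

  127.0.0.1      localhost
  192.168.1.10   ysg.connect

With that mapping in place, the master can bind its ports to an address
it actually owns instead of failing with the BindException above.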


Re: writing and reading from a region at once

2013-04-25 Thread Jean-Daniel Cryans
Inline.

J-D


On Thu, Apr 25, 2013 at 1:09 PM, Aaron Zimmerman 
azimmer...@sproutsocial.com wrote:

 Hi,

   If a region is being written to, and a scanner takes a lease out on the
 region, what will happen to the writes?  Is there a concept of Transaction
 Isolation Levels?


There's MVCC, so reads can happen while someone else is writing. What you
should expect from HBase is read committed.



   I don't see errors in Puts while the tables are being scanned.  But it
 seems that I'm losing writes somewhere; is it possible the writes could
 fail silently?


 Is it temporary while you're scanning, or is data really missing at the
end of the day? The former might happen on some older HBase versions,
while the latter should never happen unless you lower the durability
level yourself and have machine failures.

J-D
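
For illustration, a minimal sketch of the durability knob J-D mentions,
written against the 0.94-era Java client API (the table name, family and
values are hypothetical). A Put is recorded in the write-ahead log by
default; setWriteToWAL(false) trades that safety for speed, and such
edits can disappear silently if the region server fails before a
memstore flush:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class DurabilityExample {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "test");

      // Default durability: the edit goes to the WAL first, so it
      // survives a region server crash.
      Put safe = new Put(Bytes.toBytes("row1"));
      safe.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v1"));
      table.put(safe);

      // Lowered durability: skip the WAL. Faster, but this edit is lost
      // if the server dies before the memstore is flushed, which is the
      // "silent" loss scenario described above.
      Put risky = new Put(Bytes.toBytes("row2"));
      risky.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v2"));
      risky.setWriteToWAL(false);
      table.put(risky);

      table.flushCommits();
      table.close();
    }
  }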


Re: HBase is not running.

2013-04-25 Thread Mohammad Tariq
Hi JM :)

  Sorry about the previous mail. I didn't notice that.

@Yves : I agree with Jean-Marc. Please make sure you have proper name
resolution, which is vital for a proper HBase setup. To make things
simple you could just make use of 127.0.0.1, since you are
running in standalone mode. Comment out the other bindings.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com



Re: Coprocessors

2013-04-25 Thread lars hofhansl
You might want to have a look at Phoenix 
(https://github.com/forcedotcom/phoenix), which does that and more, and gives a 
SQL/JDBC interface.

-- Lars




 From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net
To: user@hbase.apache.org 
Sent: Thursday, April 25, 2013 2:44 PM
Subject: Coprocessors
 

Folks:

This is my first post on the HBase user mailing list. 

I have the following scenario:
I have an HBase table of up to a billion keys. I'm looking to support an
application where, on some user action, I'd need to fetch multiple columns for
up to 250K keys and do some sort of aggregation on it. Fetching all that data
and doing the aggregation in my application takes about a minute.

I'm looking to co-locate the aggregation logic with the region servers to
a. Distribute the aggregation
b. Avoid having to fetch large amounts of data over the network (this could 
potentially be cross-datacenter)

Neither observers nor aggregation endpoints work for this use case. Observers
don't return data back to the client, while aggregation endpoints work in the
context of scans, not a multi-get (are these correct assumptions?).

I'm looking to write a service that runs alongside the region servers and acts
as a proxy between my application and the region servers.

I plan to use the logic in the HBase client's HConnectionManager to segment my
request of 1M rowkeys into sub-requests per region server. These are sent over
to the proxy, which fetches the data from the region server, aggregates locally
and sends data back. Does this sound reasonable, or even a useful thing to
pursue?

Regards,
-sudarshan
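
To make the segmenting step concrete, here is a rough sketch of bucketing
row keys by their hosting region server, written against the 0.94-era
client API (the table name and row keys are hypothetical).
HTable.getRegionLocation() performs the same region lookup that the
HConnectionManager logic uses internally:

  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HRegionLocation;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.util.Bytes;

  public class KeysByServer {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "t");
      List<byte[]> rowKeys = Arrays.asList(
          Bytes.toBytes("O1-F1"), Bytes.toBytes("O2-F1"), Bytes.toBytes("O3-F1"));

      // Bucket each requested row key by the region server currently
      // hosting the region that contains it.
      Map<String, List<byte[]>> keysByServer = new HashMap<String, List<byte[]>>();
      for (byte[] key : rowKeys) {
        HRegionLocation loc = table.getRegionLocation(key);
        String server = loc.getHostname() + ":" + loc.getPort();
        List<byte[]> bucket = keysByServer.get(server);
        if (bucket == null) {
          bucket = new ArrayList<byte[]>();
          keysByServer.put(server, bucket);
        }
        bucket.add(key);
      }

      // Each bucket would then become one sub-request to the proxy
      // co-located with that region server.
      System.out.println(keysByServer.keySet());
      table.close();
    }
  }

Note that region locations can go stale as regions split or move, so a
real implementation would need to refresh the locations and retry.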

Re: Coprocessors

2013-04-25 Thread Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
Thanks Lars. I briefly looked into Phoenix but it appeared to do full-table 
scans to perform the aggregation. The same goes with Impala. If you think 
otherwise, I'll look into it again.


Re: Coprocessors

2013-04-25 Thread lars hofhansl
It doesn't. Based on your query and key layout, it only scans subranges of the
keyspace, and these scans are parallelized across region servers.
I'll let James explain more (if he's listening) :)


-- Lars





Re: Coprocessors

2013-04-25 Thread Michael Segel
I don't think Phoenix will solve his problem. 

He also needs to explain more about his problem before we can start to
think about it.





Re: Coprocessors

2013-04-25 Thread Viral Bajaria
Phoenix might be able to solve the problem if the keys are structured in
the binary format that it understands; otherwise you are better off reloading
that data into a table created via Phoenix. But I will let James tackle this
question.

Regarding your use-case, why can't you do the aggregation using observers?
You should be able to do the aggregation and return a new Scanner to your
client.

And Lars is right about the range scans that Phoenix does. It does restrict
things and also will do parallel scans for you based on what you
select/filter.

-Viral






Re: Coprocessors

2013-04-25 Thread Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
Michael: Fair enough. Let me see what relevant information I can add to what 
I've already said:

1. To Lars' point, my 250K keys are unlikely to fall into fewer than 250K 
sub-ranges.
2. Here's a bit more about my schema:
 2.1 My rowkeys are composed of 2 entities - let's call them object-id and
field-type. An object (O1) has 100s of field types (F1,F2,F3...). Each
object-id + field-type pair has 100s of attributes (A1,A2,A3).
 2.2 My rowkeys are O1-F1, O1-F2, O1-F3, etc.
 2.3 My primary application (not the one my original post was about) accesses
data by these rowkeys.
 2.4 My application that does aggregation is given a bunch of objects O1, O2, 
O3, a field-type F1, a bunch of attributes A1,A2 and some computation to 
perform.
 2.5 As you can see, scans are unlikely to be useful when fetching O1-F1, 
O2-F1, O3-F1 etc.

Viral: How do I tackle aggregation using observers? Let's say I override the 
postGet method. I do a multi-get from my client and my method gets called on 
each region server for each row. What is the next step with this approach?



Re: Coprocessors

2013-04-25 Thread James Taylor

On 04/25/2013 03:35 PM, Gary Helmling wrote:

I'm looking to write a service that runs alongside the region servers and
acts a proxy b/w my application and the region servers.

I plan to use the logic in HBase client's HConnectionManager, to segment
my request of 1M rowkeys into sub-requests per region-server. These are
sent over to the proxy which fetches the data from the region server,
aggregates locally and sends data back. Does this sound reasonable or even
a useful thing to pursue?



This is essentially what coprocessor endpoints (called through
HTable.coprocessorExec()) do.  (One difference is that there is a
parallel request per-region, not per-region server, though that is a
potential optimization that could be made as well).

The tricky part I see for the case you describe is splitting your full set
of row keys up correctly per region.  You could send the full set of row
keys to each endpoint invocation, and have the endpoint implementation
filter down to only those keys present in the current region.  But that
would be a lot of overhead on the request side.  You could split the row
keys into per-region sets on the client side, but I'm not sure we provide
sufficient context for the Batch.Callable instance you provide to
coprocessorExec() to determine which region it is being invoked against.


Sudarshan,
In our head branch of Phoenix (we're targeting this for a 1.2 release in 
two weeks), we've implemented a skip scan filter that functions similarly 
to a batched get, except:
1) it's more flexible in that it can jump not only from a single key to 
another single key, but also from range to range

2) it's faster, about 3-4x.
3) you can use it in combination with aggregation, since it's a filter

The scan is chunked up by region and only the keys in each region are 
sent, along the lines you and Gary have described. Then the results 
are merged together by the client automatically.


How would you decompose your row key into columns? Is there a time 
component? Let me walk you through an example where you might have a 
LONG id value plus perhaps a timestamp (it works equally well if you only 
had a single column in your PK). If you provide a bit more info on your 
use case, I can tailor it more exactly.


Create a schema:
CREATE TABLE t (key BIGINT NOT NULL, ts DATE NOT NULL, data VARCHAR 
CONSTRAINT pk PRIMARY KEY (key, ts));


Populate your data using our UPSERT statement.

Aggregate over a set of keys like this:

SELECT count(*) FROM t WHERE key IN (?,?,?) AND ts > ? AND ts < ?

where you bind the ? at runtime (probably building the statement 
programmatically based on how many keys you're binding).


Then Phoenix would jump around the key space of your table using the 
skip next hint feature provided by filters. You'd just use the regular 
JDBC ResultSet to get your count back.


If you want more info and/or a benchmark of seeking over 250K keys in a 
billion row table, let me know.


Thanks,

James
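
To make the binding step concrete, a minimal JDBC sketch of the query
above with the IN list built programmatically; the connection URL
(jdbc:phoenix:localhost) and the bind values are made-up placeholders:

  import java.sql.Connection;
  import java.sql.Date;
  import java.sql.DriverManager;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;

  public class SkipScanQuery {
    public static void main(String[] args) throws Exception {
      Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
      long[] keys = {101L, 202L, 303L};

      // Build "key IN (?,?,?)" with one placeholder per key.
      StringBuilder sql =
          new StringBuilder("SELECT count(*) FROM t WHERE key IN (");
      for (int i = 0; i < keys.length; i++) {
        sql.append(i == 0 ? "?" : ",?");
      }
      sql.append(") AND ts > ? AND ts < ?");

      PreparedStatement stmt = conn.prepareStatement(sql.toString());
      int p = 1;
      for (long k : keys) {
        stmt.setLong(p++, k);
      }
      stmt.setDate(p++, Date.valueOf("2013-01-01"));
      stmt.setDate(p, Date.valueOf("2013-04-25"));

      ResultSet rs = stmt.executeQuery();
      if (rs.next()) {
        System.out.println("count = " + rs.getLong(1));
      }
      conn.close();
    }
  }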


Re: Coprocessors

2013-04-25 Thread James Taylor
Thanks for the additional info, Sudarshan. This would fit well with the 
implementation of Phoenix's skip scan.


CREATE TABLE t (
object_id INTEGER NOT NULL,
field_type INTEGER NOT NULL,
attrib_id INTEGER NOT NULL,
value BIGINT
CONSTRAINT pk PRIMARY KEY (object_id, field_type, attrib_id));

SELECT count(value), sum(value), avg(value) FROM t
WHERE object_id IN (?,?,?) AND field_type IN (?,?,?) AND attrib_id
IN (?,?,?)


and then your client would do whatever additional computation it needed 
on the results it got back.


Would that fit with what you're trying to do?

James

On 04/25/2013 03:36 PM, Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) wrote:

Michael: Fair enough. Let me see what relevant information I can add to what 
I've already said:

1. To Lars' point, my 250K keys are unlikely to fall into fewer than 250K 
sub-ranges.
2. Here's a bit more about my schema:
  2.1 My rowkeys are composed of 2 entities - let's call it object-id and 
field-type. An object (O1) has 100s of field types (F1,F2,F3...). Each 
object-id - field-type pair has 100s of attributes (A1,A2,A3).
  2.2 My rowkeys are O1-F1, O1-F2, O1-F3, etc.
  2.3 My primary application (not the one my original post was about) accesses 
by these rowkeys.
  2.4 My application that does aggregation is given a bunch of objects O1, O2, O3, a 
field-type F1, a bunch of attributes A1,A2 and some computation to perform.
  2.5 As you can see, scans are unlikely to be useful when fetching O1-F1, 
O2-F1, O3-F1 etc.

Viral: How do I tackle aggregation using observers? Let's say I override the 
postGet method. I do a multi-get from my client and my method gets called on 
each region server for each row. What is the next step with this approach?


- Original Message -
From: user@hbase.apache.org
To: la...@apache.org, user@hbase.apache.org
Cc: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
At: Apr 25 2013 18:12:46

I don't think Phoenix will solve his problem.

He also needs to explain more about his problem before we can start to 
think about solutions.


On Apr 25, 2013, at 4:54 PM, lars hofhansl la...@apache.org wrote:


You might want to have a look at Phoenix 
(https://github.com/forcedotcom/phoenix), which does that and more, and gives a 
SQL/JDBC interface.

-- Lars




From: Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN) skada...@bloomberg.net
To: user@hbase.apache.org
Sent: Thursday, April 25, 2013 2:44 PM
Subject: Coprocessors


Folks:

This is my first post on the HBase user mailing list.

I have the following scenario:
I have an HBase table of up to a billion keys. I'm looking to support an 
application where on some user action, I'd need to fetch multiple columns for 
up to 250K keys and do some sort of aggregation on it. Fetching all that data 
and doing the aggregation in my application takes about a minute.

I'm looking to co-locate the aggregation logic with the region servers to
a. Distribute the aggregation
b. Avoid having to fetch large amounts of data over the network (this could 
potentially be cross-datacenter)

Neither observers nor aggregation endpoints work for this use case. Observers 
don't return data back to the client, while aggregation endpoints work in the 
context of scans, not a multi-get (Are these correct assumptions?).

I'm looking to write a service that runs alongside the region servers and acts 
as a proxy between my application and the region servers.

I plan to use the logic in HBase client's HConnectionManager to segment my 
request of 1M rowkeys into sub-requests per region-server. These are sent over 
to the proxy which fetches the data from the region server, aggregates locally 
and sends data back. Does this sound reasonable or even a useful thing to 
pursue?

Regards,
-sudarshan




Re: Coprocessors

2013-04-25 Thread Sudarshan Kadambi (BLOOMBERG/ 731 LEXIN)
James: First of all, this looks quite promising. 

The table schema outlined in your other message is correct except that 
attrib_id will not be in the primary key. Will that be a problem with respect 
to the skip-scan filter's performance? (it doesn't seem like it...)

Could you share any sort of benchmark numbers? I want to try this out right 
away, but I have to wait for my cluster administrator to upgrade us from HBase 
0.92 first!


Re: Coprocessors

2013-04-25 Thread James Taylor
Our performance engineer, Mujtaba Chohan, has agreed to put together a 
benchmark for you. We only have a four node cluster of pretty average 
boxes, but it should give you an idea.


No performance impact from attrib_id not being part of the PK, since 
you're not filtering on it (if I understand things correctly).


A few more questions for you:
- How many rows should we use? 1B?
- How many rows would be filtered by object_id and field_type?
- Any particular key distribution or is random fine?
- What's the minimum key size we should use for object_id and 
field_type? 2 bytes each?
- Any particular kind of aggregation? count(attrib1)? sum(attrib1)? A 
sample query would be helpful.


Since you're upgrading, use the latest on the 0.94 branch, 0.94.7.

Thanks,

James





[ANNOUNCE] HBase 0.94.7 is available for download

2013-04-25 Thread lars hofhansl
The HBase Team is pleased to announce the immediate release of HBase 0.94.7.
Download it from your favorite Apache mirror [1].

HBase 0.94.7 is a bug fix release with a few performance improvements as well. 
It has 73 issues resolved against it.

0.94.7 is the current stable release of HBase.

As usual, all previous 0.92.x and 0.94.x releases can be upgraded to 0.94.7 via a 
rolling upgrade without downtime; intermediate versions can be skipped.
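
For reference, one way to run such a rolling upgrade, assuming the helper
script that ships in bin/ (check the reference guide for the variant your
deployment needs): install the new version on every node, then restart the
daemons one at a time, e.g.:

  ./bin/rolling-restart.sh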

For a complete list of changes, see release notes [2].

Yours,
The HBase Team

P.S. Thank you to the 27 individuals who contributed to this release!

1. http://www.apache.org/dyn/closer.cgi/hbase/
2. 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12324039

Re: HBase - prioritizing writes over reads?

2013-04-25 Thread lars hofhansl
I would also add that if you need an always-available store (as in you want A 
and P of CAP [1], and can sacrifice C),
you might be better served with one of the Dynamo-inspired architectures such 
as Riak or Cassandra. HBase chooses C and P of CAP.


It might seem strange that as an HBase committer I would advise looking at 
non-HBase technology, but I am a big fan of using the right tool for the right 
job.

-- Lars


1. See also http://en.wikipedia.org/wiki/CAP_theorem




 From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org user@hbase.apache.org 
Sent: Thursday, April 25, 2013 10:17 AM
Subject: Re: HBase - prioritizing writes over reads?
 

Short answer is no, there's no knob or configuration to do that.

Longer answer is it depends. Are the reads and writes going to different
regions/tables? If so, disable the balancer and take region placement into
your own hands by segregating the offending regions on their own RS.
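
A rough sketch of that segregation with the 0.94 Java admin API (the encoded
region name and server name below are placeholders; the HBase shell's
balance_switch and move commands do the same thing):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class SegregationSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // Turn the balancer off so manual placement is not undone.
    admin.balanceSwitch(false);
    // Pin a region to a chosen server; the server name has the form
    // "host,port,startcode" of a live region server.
    admin.move(Bytes.toBytes("ENCODED_REGION_NAME"),
               Bytes.toBytes("rs-host.example.com,60020,1366900000000"));
    admin.close();
  }
}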

I also see you have the requirement to take incoming data no matter what.
Well, this currently cannot be guaranteed in HBase since an RS failure will
incur some limited unavailability while the ZK session times out, the logs
are replayed and the regions are reassigned. I don't know what kind of SLA
you have but it sounds like even without your reads problem you need to do
something client-side to take care of this. Local buffers maybe? It would
work as long as you don't need to serve that new data right away (unless
you also start serving from the local buffer, but it's getting complicated).
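
To make the local-buffer idea concrete, a minimal sketch; the queue capacity
and retry delay are arbitrary choices, and a real version would want bounded
memory, batching, and spill-to-disk:

import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;

public class BufferedWriterSketch implements Runnable {
  private final BlockingQueue<Put> queue = new LinkedBlockingQueue<Put>(100000);
  private final HTable table;

  public BufferedWriterSketch(HTable table) {
    this.table = table;
  }

  // Producers call this; it blocks only when the local buffer is full.
  public void write(Put put) throws InterruptedException {
    queue.put(put);
  }

  public void run() {
    try {
      while (true) {
        Put next = queue.take();
        boolean done = false;
        while (!done) {
          try {
            table.put(next);
            done = true;
          } catch (IOException e) {
            Thread.sleep(1000); // region likely in transition; retry shortly
          }
        }
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // drain thread shut down
    }
  }
}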

Hope this helps,

J-D


On Wed, Apr 24, 2013 at 3:25 AM, kzurek kzu...@proximetry.pl wrote:

 Is it possible to prioritize writes over reads in HBase? I'm facing some
 I/O read-related issues that influence my write clients and the cluster in
 general (constantly growing store files on some RS). Due to the fact that I
 cannot let myself lose/skip incoming data, I would like to guarantee that in
 case of extensive reads I will be able to limit incoming read requests, so
 that write requests won't be influenced. Is it possible? If so, what would be
 the best way to do that, and where should it be placed - on the client or
 the cluster side?









Re: HBase is not running.

2013-04-25 Thread Yves S. Garret
Hi Jean, this is my /etc/hosts.

127.0.0.1   localhost localhost.localdomain localhost4
localhost4.localdomain4
127.0.0.1   localhost
::1 localhost localhost.localdomain localhost6
localhost6.localdomain6


On Thu, Apr 25, 2013 at 5:22 PM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 Hi Yves,

 You seem to have some network configuration issue with your installation.

 java.net.BindException: Cannot assign requested address and
 ip72-215-225-9.at.at.cox.net/72.215.225.9:0

 How is your host file configured? You need to have your host name
 pointing to your local IP (and not 127.0.0.1).
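
 For example (the hostname and IP here are placeholders for your own values):

 127.0.0.1      localhost localhost.localdomain
 192.168.1.10   myhost.example.com myhost

 where 192.168.1.10 is the machine's LAN address, so the hostname resolves to
 an address that other machines can actually reach.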

 2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
  My mistake.  I thought I had all of those logs.  This is what I currently
  have:
  http://bin.cakephp.org/view/2112130549
 
  I have $JAVA_HOME set to this:
  /usr/java/jdk1.7.0_17
  I have extracted 0.94 and ran bin/start-hbase.sh
 
  Thanks for your help!
 
 
 
  On Thu, Apr 25, 2013 at 4:42 PM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
  Hi Mohammad,
 
  He is running standalone, so no need to update the zookeeper quorum yet.
 
  Yes, can you share the entire hbase-ysg-master-ysg.connect.log file?
  Not just the first lines. Or is what you sent already everything?
 
  So what have you done so far? Downloaded 0.94, extracted it, set up
  JAVA_HOME and ran bin/start-hbase.sh?
 
  JMS
 
  2013/4/25 Mohammad Tariq donta...@gmail.com:
   Hello Yves,
  
   The log seems to be incomplete. Could you please share the complete
    logs? Have you set the hbase.zookeeper.quorum property properly? Is
    your Hadoop running fine?
  
   Warm Regards,
   Tariq
   https://mtariq.jux.com/
   cloudfront.blogspot.com
  
  
   On Fri, Apr 26, 2013 at 2:00 AM, Yves S. Garret
   yoursurrogate...@gmail.comwrote:
  
   Hi again.  I have 3 log files and only one of them had anything in
 them,
   here are the file names.  I'm assuming that you're talking about the
   directory ${APACHE_HBASE_HOME}/logs, yes?
  
   Here are the file names:
   -rw-rw-r--. 1 user user 12465 Apr 25 14:54
  hbase-ysg-master-ysg.connect.log
   -rw-rw-r--. 1 user user 0 Apr 25 14:54
  hbase-ysg-master-ysg.connect.out
   -rw-rw-r--. 1 user user 0 Apr 25 14:54 SecurityAuth.audit
  
   Also, to answer your question about the UI, I tried that URL (I'm
 doing
  all
   of this on my laptop just to learn at the moment) and neither the URL
  nor
   localhost:60010 worked.  So, the answer to your question is that the
 UI
  is
   not showing up.  This could be due to not being far along in the
  tutorial,
   perhaps?
  
   Thanks again!
  
  
   On Thu, Apr 25, 2013 at 4:22 PM, Jean-Marc Spaggiari 
   jean-m...@spaggiari.org wrote:
  
There is no stupid question ;)
   
 Are the logs truncated? Anything else after that? Or is that all you
 have?
   
For the UI, you can access it with
   http://192.168.X.X:60010/master-status
   
Replace the X with your own IP. You should see some information
 about
your HBase cluster (even in Standalone mode).
   
JMS
   
2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
 Here are the logs, what should I be looking for?  Seems like
  everything
 is fine for the moment, no?

 http://bin.cakephp.org/view/2144893539

 The web UI?  What do you mean?  Sorry if this is a stupid
 question,
  I'm
 a Hadoop newb.

 On Thu, Apr 25, 2013 at 3:19 PM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org wrote:

 Before trying the shell, can you look at the server logs and
 see if
 everything is fine?

 Also, is the web UI working fine?

 2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
  Ok, spoke too soon :) .
 
  I ran this command [ create 'test', 'cf' ] and this is the
 result
that I
  got:
  http://bin.cakephp.org/view/168926019
 
   This is after running help<enter> and having this run just
  fine.
 
 
  On Thu, Apr 25, 2013 at 1:23 PM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org wrote:
 
  Hi Yves,
 
  0.95.0 is a developer version. If you are starting with HBase, I
  would recommend choosing a more stable version like 0.94.6.1.
 
  Regarding the 3 choices you are listing below:
  1) This one is HBase 0.95 running over Hadoop 1.0
  2) This one is HBase 0.95 running over Hadoop 2.0
  3) This one is the HBase source code.
 
  Again, I think you are better off going with a stable version for
  the first steps:
  http://www.bizdirusa.com/mirrors/apache/hbase/stable/
 
  Would you mind retrying your tests with this version and letting me
  know if it's working better?
 
  JM
 
  2013/4/25 Yves S. Garret yoursurrogate...@gmail.com:
   Hi, I'm trying to run 0.95.0.
  
   I've tried both and nothing worked.
  
   I do have another question.  When I go to download hbase, I
  get
   the
   following 3 choices:
  

Re: Coprocessors

2013-04-25 Thread Michael Segel
Hi,

Let's reiterate what you've said:

You have a set of objects O1, O2, ... On, and you have some field type F1 
that is part of your composite key. You want to fetch back a set of 
rows and then do some aggregation on the attributes.


There was a similar discussion on this where someone had a random set of values 
and was having performance issues. 

If your set of objects is in sort order and you have only one field type F1, 
you should be able to do the multi-gets.

Are you currently using multi-gets?





Re: writing and reading from a region at once

2013-04-25 Thread Anoop John
But it seems that I'm losing writes somewhere, is it possible the writes
could fail silently

Which version are you using? How do you know writes were missed silently? Did
the read that was in progress fail to return the row that you just wrote? Or
did you create a new scan afterwards and find the written data missing there
as well?

-Anoop-

On Fri, Apr 26, 2013 at 3:04 AM, Jean-Daniel Cryans jdcry...@apache.orgwrote:

 Inline.

 J-D


 On Thu, Apr 25, 2013 at 1:09 PM, Aaron Zimmerman 
 azimmer...@sproutsocial.com wrote:

  Hi,
 
If a region is being written to, and a scanner takes a lease out on the
  region, what will happen to the writes?  Is there a concept of
 Transaction
  Isolation Levels?
 

 There's MVCC, so reads can happen while someone else is writing. What you
 should expect from HBase is read committed.
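
 To illustrate read committed in this context, a rough sketch (the table and
 column family names are placeholders):

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.client.HTable;
 import org.apache.hadoop.hbase.client.Put;
 import org.apache.hadoop.hbase.client.Result;
 import org.apache.hadoop.hbase.client.ResultScanner;
 import org.apache.hadoop.hbase.client.Scan;
 import org.apache.hadoop.hbase.util.Bytes;

 public class ReadCommittedSketch {
   public static void main(String[] args) throws Exception {
     Configuration conf = HBaseConfiguration.create();
     HTable table = new HTable(conf, "t");
     ResultScanner scanner = table.getScanner(new Scan());
     Put put = new Put(Bytes.toBytes("row-zzz"));
     put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
     table.put(put); // commits while the scan is already open
     // The open scanner may or may not return "row-zzz", but it will never
     // see a partially written row; a scanner opened after the put
     // completes will always see it.
     for (Result r : scanner.next(100)) {
       System.out.println(Bytes.toString(r.getRow()));
     }
     scanner.close();
     table.close();
   }
 }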


 
   I don't see errors in Puts while the tables are being scanned, but it
  seems that I'm losing writes somewhere. Is it possible the writes could
  fail silently?
 

   Is it temporary while you're scanning, or is data really missing at
  the end of the day? The former might happen on some older HBase versions,
  while the latter should never happen unless you lower the durability level
  yourself and have machine failures.

 J-D