Re: HBase manager GUI

2012-11-27 Thread Mohammad Tariq
Hello Alok,

   You are always welcome :). Everybody starts new at some point. Go
ahead. Lots of good people are here to help you out.

Regards,
Mohammad Tariq



On Tue, Nov 27, 2012 at 10:16 AM, Alok Singh Mahor alokma...@gmail.com wrote:

 thanks a lot Mohammad for this very complete and mature reply :)
 I am very new and just started playing with HBase for my college project
 work. I will try to play with the APIs.
 thanks :)

 On Tue, Nov 27, 2012 at 2:31 AM, Mohammad Tariq donta...@gmail.com
 wrote:

  Hello Alok,
 
  I have seen this project. Good work. But let me tell you one thing: the
  way HBase is used is slightly different from the way you use traditional
  relational databases. People working on real clusters rarely face a
  situation where they need to query HBase directly, and when they do it is
  just for a few minor tasks like small gets, scans, and puts. For those,
  the HBase shell is more than sufficient.
 
  People either use HBase API features like filters and coprocessors, write
  MapReduce jobs to query their HBase tables, or map their tables to Hive
  warehouse tables. Having said that, I would suggest you get familiar with
  the HBase API rather than relying on anything else if you are planning to
  adopt HBase as your primary datastore.
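
  For instance, here is a bare-bones sketch against the 0.92-era client
  API (the table, family, and qualifier names below are made up for
  illustration):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.filter.PrefixFilter;
  import org.apache.hadoop.hbase.util.Bytes;

  public class ApiSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
      HTable table = new HTable(conf, "mytable");       // hypothetical table
      // a small put
      Put put = new Put(Bytes.toBytes("row1"));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
      table.put(put);
      // a small get
      Result result = table.get(new Get(Bytes.toBytes("row1")));
      System.out.println(result);
      // a scan restricted by a filter
      Scan scan = new Scan();
      scan.setFilter(new PrefixFilter(Bytes.toBytes("row")));
      ResultScanner scanner = table.getScanner(scan);
      for (Result r : scanner) {
        System.out.println(r);
      }
      scanner.close();
      table.close();
    }
  }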
 
  The web interface provided by HBase is mainly for visualization and
  monitoring, not for performing table operations. That doesn't mean it is
  useless, though; the HBase developers have done really great work, and
  you can even perform some operations from the web UI.
 
  HTH
 
  Regards,
  Mohammad Tariq
 
 
 
  On Tue, Nov 27, 2012 at 12:55 AM, Alok Singh Mahor alokma...@gmail.com
  wrote:
 
   I need a frontend for the HBase shell, like phpMyAdmin for MySQL.
  
   I tried 127.0.0.1:60010 and 127.0.0.1:60030; these just give information
   about the master node and the region server respectively. So I tried to
   use hbasemanagergui, but I am unable to connect to it.
  
   Does the HBase web UI have a feature for using it as an HBase shell GUI
   alternative? If yes, how do I run it?
  
   On Tue, Nov 27, 2012 at 12:16 AM, Harsh J ha...@cloudera.com wrote:
  
What are your exact 'manager GUI' needs though? I mean, what are you
envisioning it will help you perform (over the functionality already
offered by the HBase Web UI)?
   
On Mon, Nov 26, 2012 at 9:59 PM, Alok Singh Mahor 
 alokma...@gmail.com
  
wrote:
 Hi all,
 I have set up standalone Hbase on my laptop. HBase shell is working
   fine.
 and I am not using hadoop and zookeeper
 I found one frontend for HBase
 https://sourceforge.net/projects/hbasemanagergui/
 but i am not able to use this

 to set up connection i have to give information
 hbase.zookeeper.quorum:
 hbase.zookeeper.property.clientport:
 hbase.master:

 I values I have to set in these fields and I am not using
 zookeeper.
 did anyone try this GUI?
 thanks in advance :)


 --
 Alok Singh Mahor
 http://alokmahor.co.cc
 Join the next generation of computing, Open Source and Linux/GNU!!
   
   
   
--
Harsh J
   
  
  
  
   --
   Alok Singh Mahor
   http://alokmahor.co.cc
   Join the next generation of computing, Open Source and Linux/GNU!!
  
 



 --
 Alok Singh Mahor
 http://alokmahor.co.cc
 Join the next generation of computing, Open Source and Linux/GNU!!



Re: Unable to Create Table in Hbase

2012-11-27 Thread Jean-Marc Spaggiari
Hi Shyam,

Are you sure your table was created? If you do a 'list' in the shell,
can you see it? Can you see it in the HTML GUI?
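
A quick programmatic check with the 0.92-era client, in case the shell is
ambiguous (the table name is just the one from your log):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CheckTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // asks the master whether the table exists and is enabled
    System.out.println("exists:  " + admin.tableExists("Posts"));
    System.out.println("enabled: " + admin.isTableEnabled("Posts"));
  }
}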

JM

2012/11/27, shyam kumar lakshyam.sh...@gmail.com:
 There are no exceptions or warnings in the log, and the console prints the
 following:

 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:host.name=localhost
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.version=1.7.0_09
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.vendor=Oracle Corporation
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.home=/home/shyam/jdk1.7.0_09/jre
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.class.path=lib/setooz-ir-core.jar:lib/guava-12.0.jar:lib/carrot2-core-3.7.0-SNAPSHOT.jar:lib/commons-codec-1.4.jar:lib/commons-configuration-1.7.jar:lib/hadoop-core-1.0.2.jar:lib/tika-app-1.0.jar:lib/httpclient-4.0.3.jar:lib/ezmorph.jar:lib/geoip.jar:lib/xercesImpl.jar:lib/attributes-binder-1.0.1.jar:lib/jackson-core-asl-1.7.4.jar:lib/veooz-analysis.jar:lib/log4j-1.2.17.jar:lib/maxent-3.0.0.jar:lib/liblinear-1.7.jar:lib/semantifire-1.0.jar:lib/ritaWN.jar:lib/slf4j-log4j12-1.6.1.jar:lib/commons-logging-1.1.1.jar:lib/slf4j-api-1.6.1.jar:lib/bzip2.jar:lib/langdetect.jar:lib/mahout-math-0.6.jar:lib/zookeeper-3.4.3.jar:lib/commons-lang-2.5.jar:lib/wikixmlj-r43.jar:lib/commons-collections-3.1.jar:lib/hppc-0.4.1.jar:lib/mahout-collections-1.0.jar:lib/jackson-mapper-asl-1.7.4.jar:lib/supportWN.jar:lib/simple-xml-2.6.4.jar:lib/commons-beanutils-1.7.jar:lib/opennlp-tools-1.5.0.jar:lib/setooz-core-3.5-SNAPSHOT.jar:lib/json-lib-2.4-jdk15.jar:lib/gson-2.2.2.jar:lib/jsoup-1.6.0.jar:lib/jsonic-1.2.4.jar:lib/lucene-analyzers-3.6.0.jar:lib/hbase-0.92.1.jar:lib/xml-apis.jar:conf/:dist/Veooz-Core.jar:.
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.io.tmpdir=/tmp
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:java.compiler=<NA>
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:os.name=Linux
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:os.version=3.2.0-33-generic-pae
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:user.name=shyam
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:user.home=/home/shyam
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Client
 environment:user.dir=/home/shyam/workspace/Veooz/Veooz-Core
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Initiating client connection,
 connectString=localhost:2181 sessionTimeout=18 watcher=hconnection
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: Opening socket connection to
 server /127.0.0.1:2181
 12/11/27 11:03:42 INFO client.ZooKeeperSaslClient: Client will not
 SASL-authenticate because the default JAAS configuration section 'Client'
 could not be found. If you are not using SASL, you may ignore this. On the
 other hand, if you expected SASL to work, please fix your JAAS
 configuration.
 12/11/27 11:03:42 INFO zookeeper.RecoverableZooKeeper: The identifier of
 this process is 6296@setu-M68MT-S2
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: Socket connection established
 to localhost/127.0.0.1:2181, initiating session
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: Session establishment complete
 on server localhost/127.0.0.1:2181, sessionid = 0x13b405aac590004,
 negotiated timeout = 4
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Initiating client connection,
 connectString=localhost:2181 sessionTimeout=18
 watcher=catalogtracker-on-org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@16f77b6
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: Opening socket connection to
 server /127.0.0.1:2181
 12/11/27 11:03:42 INFO client.ZooKeeperSaslClient: Client will not
 SASL-authenticate because the default JAAS configuration section 'Client'
 could not be found. If you are not using SASL, you may ignore this. On the
 other hand, if you expected SASL to work, please fix your JAAS
 configuration.
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: Socket connection established
 to localhost/127.0.0.1:2181, initiating session
 12/11/27 11:03:42 INFO zookeeper.RecoverableZooKeeper: The identifier of
 this process is 6296@setu-M68MT-S2
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: Session establishment complete
 on server localhost/127.0.0.1:2181, sessionid = 0x13b405aac590005,
 negotiated timeout = 4
 12/11/27 11:03:42 INFO zookeeper.ClientCnxn: EventThread shut down
 12/11/27 11:03:42 INFO zookeeper.ZooKeeper: Session: 0x13b405aac590005
 closed
 Creating HBase Table: Posts



 and finally the process is not terminating ... it is 
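
 A common cause of a client JVM that never exits with the 0.9x API is that
 the shared HBase/ZooKeeper connection threads are still alive. A hedged
 sketch of the usual cleanup, to be called once all tables are done:

 import org.apache.hadoop.hbase.client.HConnectionManager;

 public class Cleanup {
   public static void main(String[] args) {
     // ... create tables, write data ...
     // release cached connections so their non-daemon threads stop
     // and the JVM can exit
     HConnectionManager.deleteAllConnections(true);
   }
 }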

Re: Connecting to standalone HBase from a remote client

2012-11-27 Thread matan
Thanks guys,

Excuse my ignorance, but having sort of agreed that the configuration that
determines which-server-should-be-contacted-for-what is on the HBase
server, I am not sure how any of the practical suggestions made should
solve the issue, and enable connecting from a remote client.

Let me delineate - setting /etc/hosts on my client side seems in this
regard not relevant in that view. And the other suggestion for
hbase-site.xml configuration I have already got covered, as my client code
successfully connects to zookeeper (the configuration properties mentioned
on this thread are zookeeper specific according to my interpretation of
documentation, I don't directly see how they should solve the problem).
Perhaps Mohammad you can explain why those zookeeper properties relate to
how the master references itself towards zookeeper?

Should I take it from St.Ack that there is currently no way to specify the
master's remotely accessible server/ip in the HBase configuration?

Anyway, my HBase server's /etc/hosts has just one line now, in case it got
lost on the thread -
127.0.0.1 localhost 'server-name'. Everything works fine on the HBase
server itself, the same client code runs perfectly there.

Thanks again,
Matan

On Mon, Nov 26, 2012 at 10:15 PM, Tariq [via Apache HBase] 
ml-node+s679495n4034419...@n3.nabble.com wrote:

 Hello Nicolas,

   You are right. It has been deprecated. Thank you for updating my
 knowledge base..:)

 Regards,
 Mohammad Tariq



  On Tue, Nov 27, 2012 at 12:17 AM, Nicolas Liochon [hidden email] wrote:

  Hi Mohammad,
 
  Your answer was right, just that specifying the master address is not
  necessary (anymore I think). But it does no harm.
  Changing the /etc/hosts (as you did) is right too.
  Lastly, if the cluster is standalone and accessed locally, having
 localhost
  in ZK will not be an issue. However, it's perfectly possible to have a
  standalone cluster accessed remotely, so you don't want to have the
 master
   to write "I'm on the server named localhost" in this case. I expect it
  won't be an issue for communications between the region servers or hdfs
 as
  they would be all on the same localhost...
 
  Cheers,
 
  Nicolas
 
   On Mon, Nov 26, 2012 at 7:16 PM, Mohammad Tariq [hidden email] wrote:
 
   what
 








Re: standalone HBase instance fails to start

2012-11-27 Thread matan
Thanks again, seems helpful for (Ubuntu) quick starting.


On Mon, Nov 26, 2012 at 7:44 PM, stack-3 [via Apache HBase] 
ml-node+s679495n4034405...@n3.nabble.com wrote:

 On Sun, Nov 25, 2012 at 8:28 AM, matan [hidden email] wrote:
  Nothing. Maybe just link to it from
  http://hbase.apache.org/book/quickstart.html such that people for whom
 the
  quick start doesn't work, will have a direct route to this and other
  prerequisites.
 

  I just added a note on loopback to the getting started:
  http://hbase.apache.org/book.html#quickstart

  I don't want to clutter the getting started w/ a long list of prereqs
  that actually are not needed for putting up hbase in standalone mode; e.g.
  you don't need to make sure ssh to localhost is working when doing
  standalone.

 Thanks.  Any other suggestions on how to improve the doc. are most
 welcome.
 St.Ack








Re: Connecting to standalone HBase from a remote client

2012-11-27 Thread Doug Meil

Hi there-

re: "From what I have understood, these properties are not for Hbase but
for the Hbase client which we write. They tell the client where to look for
ZK."

Yep.  That's how it works.  Then the client looks up ROOT/META and then
the client talks directly to the RegionServers.

http://hbase.apache.org/book.html#client
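
In code, that means the client config really only needs to point at ZK; a
minimal sketch (the quorum host and table name are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class RemoteClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // the client only needs to find ZK; master, ROOT/META and region
    // locations are all discovered from there
    conf.set("hbase.zookeeper.quorum", "zkhost.example.com");
    conf.set("hbase.zookeeper.property.clientPort", "2181");
    HTable table = new HTable(conf, "mytable"); // hypothetical table
    table.close();
  }
}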





On 11/27/12 8:52 AM, Mohammad Tariq donta...@gmail.com wrote:

Hello Matan,

  From what I have understood, these properties are not for HBase but
for the HBase client which we write. They tell the client where to look
for ZK.

HMaster registers its address with ZK, and from there the client comes to
know where to look for the HMaster. If the HMaster registers its address as
'localhost', the client will take it as 'localhost' too, i.e. the client's
'localhost' and not the 'localhost' where the HMaster is running. So, if you
have the IP and hostname of the HMaster in your /etc/hosts file, the client
can reach that machine without any problem, as proper DNS resolution is
available.

But this is just what I think. I need approval from the heavyweights.

Stack sir??



Regards,
Mohammad Tariq



On Tue, Nov 27, 2012 at 5:57 PM, matan ma...@cloudaloe.org wrote:

 Thanks guys,

 Excuse my ignorance, but having sort of agreed that the configuration
that
 determines which-server-should-be-contacted-for-what is on the HBase
 server, I am not sure how any of the practical suggestions made should
 solve the issue, and enable connecting from a remote client.

 Let me delineate - setting /etc/hosts on my client side seems in this
 regard not relevant in that view. And the other suggestion for
 hbase-site.xml configuration I have already got covered, as my client
code
 successfully connects to zookeeper (the configuration properties
mentioned
 on this thread are zookeeper specific according to my interpretation of
 documentation, I don't directly see how they should solve the problem).
 Perhaps Mohammad you can explain why those zookeeper properties relate
to
 how the master references itself towards zookeeper?

 Should I take it from St.Ack that there is currently no way to specify
the
 master's remotely accessible server/ip in the HBase configuration?

 Anyway, my HBase server's /etc/hosts has just one line now, in case it
got
 lost on the thread -
 127.0.0.1 localhost 'server-name'. Everything works fine on the HBase
 server itself, the same client code runs perfectly there.

 Thanks again,
 Matan

 On Mon, Nov 26, 2012 at 10:15 PM, Tariq [via Apache HBase] 
 ml-node+s679495n4034419...@n3.nabble.com wrote:

  Hello Nicolas,
 
You are right. It has been deprecated. Thank you for updating my
  knowledge base..:)
 
  Regards,
  Mohammad Tariq
 
 
 
  On Tue, Nov 27, 2012 at 12:17 AM, Nicolas Liochon [hidden email] wrote:
 
   Hi Mohammad,
  
   Your answer was right, just that specifying the master address is
not
   necessary (anymore I think). But it does no harm.
   Changing the /etc/hosts (as you did) is right too.
   Lastly, if the cluster is standalone and accessed locally, having
  localhost
   in ZK will not be an issue. However, it's perfectly possible to
have a
   standalone cluster accessed remotely, so you don't want to have the
  master
    to write "I'm on the server named localhost" in this case. I expect
it
   won't be an issue for communications between the region servers or
hdfs
  as
   they would be all on the same localhost...
  
   Cheers,
  
   Nicolas
  
    On Mon, Nov 26, 2012 at 7:16 PM, Mohammad Tariq [hidden email] wrote:
  
what
  
 
 
 
 









Re: Connecting to standalone HBase from a remote client

2012-11-27 Thread Mohammad Tariq
Thank you both for the comments :)

Regards,
Mohammad Tariq



On Tue, Nov 27, 2012 at 8:56 PM, ramkrishna vasudevan 
ramkrishna.s.vasude...@gmail.com wrote:

 You are right Mohammad,

 Regards
 Ram

 On Tue, Nov 27, 2012 at 8:53 PM, Doug Meil doug.m...@explorysmedical.com
 wrote:

 
  Hi there-
 
  re:   From what I have understood, these properties are not for Hbase
 but
  for the Hbase client which we write. They tell the client where to look
 for
  ZK.
 
  Yep.  That's how it works.  Then the client looks up ROOT/META and then
  the client talks directly to the RegionServers.
 
  http://hbase.apache.org/book.html#client
 
 
 
 
 
  On 11/27/12 8:52 AM, Mohammad Tariq donta...@gmail.com wrote:
 
  Hello Matan,
  
From what I have understood, these properties are not for Hbase
 but
  for the Hbase client which we write. They tell the client where to look
  for
  ZK.
  
  Hmaster registers its address with ZK. And from there client will come
 to
  know where to look for Hmaster. And if the Hmaster registers its address
  as
  'localhost', the client will take it as the 'localhost', which is
 client's
  'localhost' and not the 'localhost' where Hmaster is running. So, if you
  have the IP and hostname of the Hmaster in your /etc/hosts file the
 client
  can reach that machine without any problem as there is proper DNS
  resolution available.
  
  But this just is what I think. I need approval from the heavyweights.
  
  Stack sir??
  
  
  
  Regards,
  Mohammad Tariq
  
  
  
  On Tue, Nov 27, 2012 at 5:57 PM, matan ma...@cloudaloe.org wrote:
  
   Thanks guys,
  
   Excuse my ignorance, but having sort of agreed that the configuration
  that
   determines which-server-should-be-contacted-for-what is on the HBase
   server, I am not sure how any of the practical suggestions made should
   solve the issue, and enable connecting from a remote client.
  
   Let me delineate - setting /etc/hosts on my client side seems in this
   regard not relevant in that view. And the other suggestion for
   hbase-site.xml configuration I have already got covered, as my client
  code
   successfully connects to zookeeper (the configuration properties
  mentioned
   on this thread are zookeeper specific according to my interpretation
 of
   documentation, I don't directly see how they should solve the
 problem).
   Perhaps Mohammad you can explain why those zookeeper properties relate
  to
   how the master references itself towards zookeeper?
  
   Should I take it from St.Ack that there is currently no way to specify
  the
   master's remotely accessible server/ip in the HBase configuration?
  
   Anyway, my HBase server's /etc/hosts has just one line now, in case it
  got
   lost on the thread -
   127.0.0.1 localhost 'server-name'. Everything works fine on the HBase
   server itself, the same client code runs perfectly there.
  
   Thanks again,
   Matan
  
   On Mon, Nov 26, 2012 at 10:15 PM, Tariq [via Apache HBase] 
   ml-node+s679495n4034419...@n3.nabble.com wrote:
  
Hello Nicolas,
   
  You are right. It has been deprecated. Thank you for updating
 my
knowledge base..:)
   
Regards,
Mohammad Tariq
   
   
   
    On Tue, Nov 27, 2012 at 12:17 AM, Nicolas Liochon [hidden email] wrote:
   
 Hi Mohammad,

 Your answer was right, just that specifying the master address is
  not
 necessary (anymore I think). But it does no harm.
 Changing the /etc/hosts (as you did) is right too.
 Lastly, if the cluster is standalone and accessed locally, having
localhost
 in ZK will not be an issue. However, it's perfectly possible to
  have a
 standalone cluster accessed remotely, so you don't want to have
 the
master
  to write "I'm on the server named localhost" in this case. I
 expect
  it
 won't be an issue for communications between the region servers or
  hdfs
as
 they would be all on the same localhost...

 Cheers,

 Nicolas

  On Mon, Nov 26, 2012 at 7:16 PM, Mohammad Tariq [hidden email] wrote:

  what

   
   
 
 

Re: Unable to Create Table in Hbase

2012-11-27 Thread shyam kumar
Hi,

Yes, I am able to see the table and the table description in the hbase shell
(list 'table_name' and describe 'table_name'),

but I am unable to perform scan 'table_name', as I said earlier.





Question About Starting and Stopping HADOOP HBASE Cluster in Secure Mode

2012-11-27 Thread a...@hsk.hk
Hi,

I currently use the following steps to start and stop the HADOOP & HBASE cluster:

1) Without Kerberos Security
   (start zookeepers)
   start the cluster from Master:
  {$HADOOP_HOME}/bin/start-dfs.sh   // one command starts all servers
  {$HADOOP_HOME}/bin/start-mapred.sh
  {$HBASE_HOME}/bin/start-hbase.sh 

   stop the cluster from Master:
  {$HBASE_HOME}/bin/stop-hbase.sh 
  {$HADOOP_HOME}/bin/stop-mapred.sh
  {$HADOOP_HOME}/bin/stop-dfs.sh
   (stop zookeepers)


2) With Kerberos and in Secure Mode
   start the HADOOP Namenode 
  {$HADOOP_HOME}/bin/hadoop-daemon.sh start namenode 
   then for each datanode:
  {$HADOOP_HOME}/bin/hadoop-daemon.sh start datanode  
   then HBASE Master: 
  {$HBASE_HOME}/bin/hbase-daemon.sh start master (as root)
   then for each HBASE regionserver:
  {$HBASE_HOME}/bin/hbase-daemon.sh start regionserver 


QUESTION: As I can see from 2), there are more steps to start the entire cluster 
in secure mode. Are there any existing commands which simplify starting/stopping 
HADOOP & HBASE in secure mode? 

Thanks
ac

Re: Connecting to standalone HBase from a remote client

2012-11-27 Thread matan
Hi Mohammad,

I'm losing track... I came to understand that ZK tells the client where
the ROOT/META is, and from there the client gets the region server it
should contact. And yet I take it that you are saying that the
configuration for the location of the ROOT/META or region server should be
done on the client side. These two ideas seem to present a contradiction,
and I probably don't have a good grasp of what is going on, or what should
be done. Can you or anyone try to clarify?

Thanks,
matan

On Tue, Nov 27, 2012 at 5:33 PM, Tariq [via Apache HBase] 
ml-node+s679495n4034446...@n3.nabble.com wrote:

 Thank you both for the comments :)

 Regards,
 Mohammad Tariq



  On Tue, Nov 27, 2012 at 8:56 PM, ramkrishna vasudevan [hidden email] wrote:

  You are right Mohammad,
 
  Regards
  Ram
 
   On Tue, Nov 27, 2012 at 8:53 PM, Doug Meil [hidden email] wrote:
 
  
   Hi there-
  
   re:   From what I have understood, these properties are not for Hbase
  but
   for the Hbase client which we write. They tell the client where to
 look
  for
   ZK.
  
   Yep.  That's how it works.  Then the client looks up ROOT/META and
 then
   the client talks directly to the RegionServers.
  
   http://hbase.apache.org/book.html#client
  
  
  
  
  
    On 11/27/12 8:52 AM, Mohammad Tariq [hidden email] wrote:
  
   Hello Matan,
   
 From what I have understood, these properties are not for Hbase
  but
   for the Hbase client which we write. They tell the client where to
 look
   for
   ZK.
   
   Hmaster registers its address with ZK. And from there client will
 come
  to
   know where to look for Hmaster. And if the Hmaster registers its
 address
   as
   'localhost', the client will take it as the 'localhost', which is
  client's
   'localhost' and not the 'localhost' where Hmaster is running. So, if
 you
   have the IP and hostname of the Hmaster in your /etc/hosts file the
  client
   can reach that machine without any problem as there is proper DNS
   resolution available.
   
   But this just is what I think. I need approval from the heavyweights.
   
   Stack sir??
   
   
   
   Regards,
   Mohammad Tariq
   
   
   
    On Tue, Nov 27, 2012 at 5:57 PM, matan [hidden email] wrote:
   
Thanks guys,
   
Excuse my ignorance, but having sort of agreed that the
 configuration
   that
determines which-server-should-be-contacted-for-what is on the
 HBase
server, I am not sure how any of the practical suggestions made
 should
solve the issue, and enable connecting from a remote client.
   
Let me delineate - setting /etc/hosts on my client side seems in
 this
regard not relevant in that view. And the other suggestion for
hbase-site.xml configuration I have already got covered, as my
 client
   code
successfully connects to zookeeper (the configuration properties
   mentioned
on this thread are zookeeper specific according to my
 interpretation
  of
documentation, I don't directly see how they should solve the
  problem).
Perhaps Mohammad you can explain why those zookeeper properties
 relate
   to
how the master references itself towards zookeeper?
   
Should I take it from St.Ack that there is currently no way to
 specify
   the
master's remotely accessible server/ip in the HBase configuration?
   
Anyway, my HBase server's /etc/hosts has just one line now, in case
 it
   got
lost on the thread -
127.0.0.1 localhost 'server-name'. Everything works fine on the
 HBase
server itself, the same client code runs perfectly there.
   
Thanks again,
Matan
   
 On Mon, Nov 26, 2012 at 10:15 PM, Tariq [via Apache HBase] [hidden email] wrote:
   
 Hello Nicolas,

   You are right. It has been deprecated. Thank you for
 updating
  my
 knowledge base..:)

 Regards,
 Mohammad Tariq



  On Tue, Nov 27, 2012 at 12:17 AM, Nicolas Liochon [hidden email] wrote:

  Hi Mohammad,
 
  Your answer was right, just that specifying the master address
 is
   not
  necessary (anymore I think). But it does no harm.
  Changing the /etc/hosts (as you did) is right too.
  Lastly, if the cluster is standalone and accessed locally,
 having
 localhost
  in ZK will not be an issue. However, it's perfectly possible to
   have a
  standalone cluster accessed remotely, so you don't want to have
  the
 master
   to write "I'm on the server named localhost" in this case. I
  expect
   it
  won't be an issue for communications between the region servers
 or
   hdfs
 as
  they would be all on the same localhost...
 
  

Re: Unable to Create Table in Hbase

2012-11-27 Thread ramkrishna vasudevan
Can you paste the master logs and RS logs? I am sure there should have
been some errors in them; that is why it is not able to locate the META.

Regards
Ram

On Tue, Nov 27, 2012 at 7:51 PM, shyam kumar lakshyam.sh...@gmail.com wrote:

 HI,

 Ya am able to see the table and table description in hbase shell
 (list 'table_name' and describe 'table_name')

 but am unable to perform scan 'table_name' as i told earlier






Re: Question About Starting and Stopping HADOOP HBASE Cluster in Secure Mode

2012-11-27 Thread Leonid Fedotov
AC,
The start-dfs.sh and start-mapred.sh scripts are just wrappers around the
hadoop-daemon.sh commands.
All the security settings are in the configuration files, so the same start 
procedure should work for both secure and unsecured modes.
Just make sure you have the correct configuration files.

Thank you!

Sincerely,
Leonid Fedotov


On Nov 27, 2012, at 3:28 AM, a...@hsk.hk wrote:

 Hi,
 
 I currently use the following steps to start and stop the HADOOP & HBASE cluster:
 
 1) Without Kerberos Security
   (start zookeepers)
   start the cluster from Master:
  {$HADOOP_HOME}/bin/start-dfs.sh  // one command starts all servers
  {$HADOOP_HOME}/bin/start-mapred.sh   
  {$HBASE_HOME}/bin/start-hbase.sh 
 
   stop the cluster from Master:
  {$HBASE_HOME}/bin/stop-hbase.sh 
  {$HADOOP_HOME}/bin/stop-mapred.sh
  {$HADOOP_HOME}/bin/stop-dfs.sh
   (stop zookeepers)
 
 
 2) With Kerberos and in Secure Mode
   start the HADOOP Namenode 
  {$HADOOP_HOME}/bin/hadoop-daemon.sh start namenode 
   then for each datanode:
  {$HADOOP_HOME}/bin/hadoop-daemon.sh start datanode  
   then HBASE Master: 
  {$HBASE_HOME}/bin/hbase-daemon.sh start master (as root)
   then for each HBASE regionserver:
  {$HBASE_HOME}/bin/hbase-daemon.sh start regionserver 
 
 
  QUESTION: As I can see from 2), there are more steps to start the entire 
  cluster in secure mode. Are there any existing commands which simplify 
  starting/stopping HADOOP & HBASE in secure mode? 
 
 Thanks
 ac



Re: Connecting to standalone HBase from a remote client

2012-11-27 Thread Leonid Fedotov
Matan,
in short, your client should be able to resolve the names of the HBMaster, 
all HBRegionServers, and all ZK nodes.
Whether via DNS or a local /etc/hosts file does not matter, but the names 
must resolve correctly on the client machine.
Then it will be able to connect to ZK and get the HBMaster and ROOT/META
locations.
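
A quick way to sanity-check that from the client machine (the hostnames
below are illustrative; use the names the servers actually registered):

import java.net.InetAddress;

public class ResolveCheck {
  public static void main(String[] args) throws Exception {
    String[] hosts = {"master.example.com", "rs1.example.com", "zk1.example.com"};
    for (String h : hosts) {
      // resolves each name the same way the HBase client will
      System.out.println(h + " -> " + InetAddress.getByName(h).getHostAddress());
    }
  }
}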

Thank you!

Sincerely,
Leonid Fedotov


On Nov 27, 2012, at 8:10 AM, matan wrote:

 Hi Mohammad,
 
 I'm loosing track... I came to understand that ZK tells the client where
 the ROOT/META is, and from there the client gets the region server it
 should contact. And yet I take it that you are saying that the
 configuration for the location of the ROOT/META or region server should be
 done on the client side. These two ideas seem to present a contradiction,
 and I probably don't have a good grasp of what is going on, or what should
 be done. Can you or anyone try to clarify?
 
 Thanks,
 matan
 
 On Tue, Nov 27, 2012 at 5:33 PM, Tariq [via Apache HBase] 
 ml-node+s679495n4034446...@n3.nabble.com wrote:
 
 Thank you both for the comments :)
 
 Regards,
Mohammad Tariq
 
 
 
 On Tue, Nov 27, 2012 at 8:56 PM, ramkrishna vasudevan 
  [hidden email] wrote:
 
 You are right Mohammad,
 
 Regards
 Ram
 
  On Tue, Nov 27, 2012 at 8:53 PM, Doug Meil [hidden email] wrote:
 
 
 Hi there-
 
 re:   From what I have understood, these properties are not for Hbase
 but
 for the Hbase client which we write. They tell the client where to
 look
 for
 ZK.
 
 Yep.  That's how it works.  Then the client looks up ROOT/META and
 then
 the client talks directly to the RegionServers.
 
 http://hbase.apache.org/book.html#client
 
 
 
 
 
  On 11/27/12 8:52 AM, Mohammad Tariq [hidden email] wrote:
 
 Hello Matan,
 
 From what I have understood, these properties are not for Hbase
 but
 for the Hbase client which we write. They tell the client where to
 look
 for
 ZK.
 
 Hmaster registers its address with ZK. And from there client will
 come
 to
 know where to look for Hmaster. And if the Hmaster registers its
 address
 as
 'localhost', the client will take it as the 'localhost', which is
 client's
 'localhost' and not the 'localhost' where Hmaster is running. So, if
 you
 have the IP and hostname of the Hmaster in your /etc/hosts file the
 client
 can reach that machine without any problem as there is proper DNS
 resolution available.
 
 But this just is what I think. I need approval from the heavyweights.
 
 Stack sir??
 
 
 
 Regards,
   Mohammad Tariq
 
 
 
  On Tue, Nov 27, 2012 at 5:57 PM, matan [hidden email] wrote:
 
 Thanks guys,
 
 Excuse my ignorance, but having sort of agreed that the
 configuration
 that
 determines which-server-should-be-contacted-for-what is on the
 HBase
 server, I am not sure how any of the practical suggestions made
 should
 solve the issue, and enable connecting from a remote client.
 
 Let me delineate - setting /etc/hosts on my client side seems in
 this
 regard not relevant in that view. And the other suggestion for
 hbase-site.xml configuration I have already got covered, as my
 client
 code
 successfully connects to zookeeper (the configuration properties
 mentioned
 on this thread are zookeeper specific according to my
 interpretation
 of
 documentation, I don't directly see how they should solve the
 problem).
 Perhaps Mohammad you can explain why those zookeeper properties
 relate
 to
 how the master references itself towards zookeeper?
 
 Should I take it from St.Ack that there is currently no way to
 specify
 the
 master's remotely accessible server/ip in the HBase configuration?
 
 Anyway, my HBase server's /etc/hosts has just one line now, in case
 it
 got
 lost on the thread -
 127.0.0.1 localhost 'server-name'. Everything works fine on the
 HBase
 server itself, the same client code runs perfectly there.
 
 Thanks again,
 Matan
 
  On Mon, Nov 26, 2012 at 10:15 PM, Tariq [via Apache HBase] [hidden email] wrote:
 
 Hello Nicolas,
 
  You are right. It has been deprecated. Thank you for
 updating
 my
 knowledge base..:)
 
 Regards,
Mohammad Tariq
 
 
 
  On Tue, Nov 27, 2012 at 12:17 AM, Nicolas Liochon [hidden email] wrote:
 
 Hi Mohammad,
 
 Your answer was right, just that specifying the master address
 is
 not
 necessary (anymore I think). But it does no harm.
 Changing the /etc/hosts (as you did) is right too.
 Lastly, if the cluster is standalone and accessed locally,
 having
 localhost
 in ZK will not be an issue. However, it's perfectly possible to
 have a
 standalone cluster accessed remotely, so you don't want to have
 the
 master
  to write "I'm on the server named localhost" in this case. I
 expect
 it
 won't be an issue for communications between the region 

Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-27 Thread Suraj Varma
Ian Varley's excellent HBaseCon presentation is another great resource.
http://ianvarley.com/coding/HBaseSchema_HBaseCon2012.pdf

On Mon, Nov 26, 2012 at 5:43 AM, Doug Meil
doug.m...@explorysmedical.com wrote:

 Hi there, somebody already wisely mentioned the link to the # of CF's
 entry, but here are a few other entries that can save you some heartburn
 if you read them ahead of time.

 http://hbase.apache.org/book.html#datamodel

 http://hbase.apache.org/book.html#schema

 http://hbase.apache.org/book.html#architecture





 On 11/26/12 5:28 AM, Mohammad Tariq donta...@gmail.com wrote:

Hello sir,

You might become a victim of RS hotspotting, since the customerIDs will
be sequential (I assume). To keep things simple, HBase puts all the rows
with similar keys on the same RS. But that becomes a bottleneck in the long
run, as all the data keeps going to the same region.
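
One common mitigation is to prefix the sequential id with a hash, so that
consecutive customers land in different regions. A hedged sketch (class and
field names are made up; the trade-off is that plain range scans over raw
ids are lost):

import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.MD5Hash;

public class SaltedKey {
  // derive a row key whose leading bytes are a hash of the customer id,
  // so sequential ids spread across regions instead of hitting one RS
  public static byte[] rowKey(String customerId) {
    String prefix = MD5Hash.getMD5AsHex(Bytes.toBytes(customerId)).substring(0, 8);
    return Bytes.toBytes(prefix + "_" + customerId);
  }

  public static void main(String[] args) {
    System.out.println(Bytes.toString(rowKey("CUST0001")));
  }
}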

HTH

Regards,
Mohammad Tariq



On Mon, Nov 26, 2012 at 3:53 PM, Ramasubramanian Narayanan 
ramasubramanian.naraya...@gmail.com wrote:

 Hi,
 Thanks! Can we have the customer number as the RowKey for the customer
 (client) master table? Please help in educating me on the advantages and
 disadvantages of having the customer number as the row key...

 Also, we may need to implement SCD2 in that table.. will it work if I have
 it like that?

 Or

 is SCD2 not needed, and can we instead achieve the same by increasing the
 version number that it will hold?

 pls suggest...

 regards,
 Rams

 On Mon, Nov 26, 2012 at 1:10 PM, Li, Min m...@microstrategy.com wrote:

  When 1 cf needs to split, the other 599 cfs will split at the same time,
  so many fragments will be produced when you use that many column families.
  Actually, many cfs can be merged into only one cf with specific tags in the
  rowkey. For example, the rowkey of a customer address can be uid+'AD', and
  a customer profile can be uid+'PR'.
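
  As a quick sketch of that single-family layout (the table name, the family
  name 'cf', and the uid are all illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class TaggedKeys {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "customer"); // hypothetical single-cf table
    String uid = "00042";                        // illustrative customer id
    Put address = new Put(Bytes.toBytes(uid + "AD"));
    address.add(Bytes.toBytes("cf"), Bytes.toBytes("street"), Bytes.toBytes("..."));
    Put profile = new Put(Bytes.toBytes(uid + "PR"));
    profile.add(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("..."));
    table.put(address);
    table.put(profile);
    table.close();
  }
}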
 
  Min
  -Original Message-
  From: Ramasubramanian Narayanan [mailto:
  ramasubramanian.naraya...@gmail.com]
  Sent: Monday, November 26, 2012 3:05 PM
  To: user@hbase.apache.org
  Subject: Expert suggestion needed to create table in Hbase - Banking
 
  Hi,
 
I have a requirement of physicalising the logical model... I have a
  client model which has 600+ entities...
 
    I need suggestions on how to go about physicalising it...
  
    I have a few other doubts:
    1) Is it good to create a single table for all the 600+ columns?
    2) Should we have different column families for different groups, or can
   everything be under a single column family? For example, can we have the
   customer address as a different column family?
 
Please help on this..
 
 
  regards,
  Rams
 





RE: Backup strategy

2012-11-27 Thread Pablo Musa
Lars,
thanks for the great post. However I am using HBase 0.90.6 :(

What is the best approach in my case?
My data is not very big: 100GB divided into 4 tables. I don't need a daily
backup; weekly maybe. But I need to be able to fully restore the state (all
data in a consistent state) if my migration goes wrong.

Thanks,
Pablo

-Original Message-
From: lars hofhansl [mailto:lhofha...@yahoo.com] 
Sent: quinta-feira, 15 de novembro de 2012 15:46
To: user@hbase.apache.org
Subject: Re: Backup strategy

Here's one way: 
http://hadoop-hbase.blogspot.com/2012/04/timestamp-consistent-backups-in-hbase.html





 From: David Charle dbchar2...@gmail.com
To: user@hbase.apache.org
Sent: Thursday, November 15, 2012 7:41 AM
Subject: Backup strategy
 
Hi

Anyone using any backup strategy (other than replication) for any point-in-time 
restore ? Any recommendations on best practices ?

--
David


Re: recommended nodes

2012-11-27 Thread Jean-Marc Spaggiari
Hi Michael,

so are you recommending 32GB per node?

What about the disks? Are SATA drives too slow?

JM

2012/11/26, Michael Segel michael_se...@hotmail.com:
 Uhm, those specs are actually now out of date.

 If you're running HBase, or want to also run R on top of Hadoop, you will
 need to add more memory.
 Also, forget 1GbE and go 10GbE; and with 2 SATA drives, you will be disk i/o
 bound way too quickly.


 On Nov 26, 2012, at 8:05 AM, Marcos Ortiz mlor...@uci.cu wrote:

 Are you asking about hardware recommendations?
 Eric Sammer, in his Hadoop Operations book, did a great job on this.
 For mid-size clusters (up to 300 nodes):
 Processor: A dual quad-core 2.6 Ghz
 RAM: 24 GB DDR3
 Dual 1 Gb Ethernet NICs
 a SAS drive controller
 at least two SATA II drives in a JBOD configuration

 The replication factor depends heavily on the primary use of your
 cluster.

 On 11/26/2012 08:53 AM, David Charle wrote:
 hi

 what's the recommended number of nodes for the NN, HMaster and ZK for a
 larger cluster, let's say 50-100+?

 also, what would be the ideal replication factor for larger clusters when
 you have 3-4 racks?

 --
 David

 --

 Marcos Luis Ortíz Valmaseda
 about.me/marcosortiz http://about.me/marcosortiz
 @marcosluis2186 http://twitter.com/marcosluis2186







Re: recommended nodes

2012-11-27 Thread Michael Segel

OK... I don't know why Cloudera is so hung up on 32GB. ;-) [It's an inside joke 
...]

So here's the problem... 

By default, your child processes in a map/reduce job get a default 512MB. The 
majority of the time, this gets raised to 1GB.

8 cores (dual quad cores) show up as 16 virtual processors in Linux. (Note: 
this is why, when people talk about the number of cores, you have to specify 
physical cores or logical cores.) 

So if you were to oversubscribe and have, let's say, 12 mappers and 12 reducers, 
that's 24 slots, which means that you would need 24GB of memory reserved just 
for the child processes. This would leave 8GB for the DN, TT and the rest of the 
Linux OS processes. 
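
To make that arithmetic explicit, here is a sketch; the Hadoop 1.x property 
names in the comments normally live in mapred-site.xml on each node and are 
shown here only to name the knobs involved:

public class SlotMath {
  public static void main(String[] args) {
    // mapred.tasktracker.map.tasks.maximum    = 12
    // mapred.tasktracker.reduce.tasks.maximum = 12
    // mapred.child.java.opts                  = -Xmx1g
    int slots = 12 + 12;     // map + reduce slots per node
    int heapPerChildGb = 1;  // from mapred.child.java.opts
    System.out.println("reserved for child JVMs: " + slots * heapPerChildGb + " GB");
    // at -Xmx2g this doubles to 48 GB, which is why 32 GB nodes start to swap
  }
}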

Can you live with that? Sure. 
Now add in R, HBase, Impala, or some other set of tools on top of the cluster. 

Ooops! Now you are in trouble because you will swap. 
Also, adding in R, you may want to bump up those child procs from 1GB to 2GB. 
That means the 24 slots would now require 48GB. Now you will swap, and if that 
happens you will see HBase go into a cascading failure. 

So while you can do a rolling restart with the changed configuration (reducing 
the number of mappers and reducers), you end up with fewer slots, which will mean 
longer run times for your jobs. (Fewer slots == less parallelism.) 

Looking at the price of memory... you can get 48GB or even 64GB  for around the 
same price point. (8GB chips) 

And I didn't even talk about adding Solr, which again is a memory hog... ;-) 

Note that I matched the number of mappers w/ reducers. You could go with fewer 
reducers if you want. I tend to recommend a ratio of 2:1 mappers to reducers, 
depending on the workflow. 

As to the disks... no, 7200 SATA III drives are fine. The SATA III interface is 
pretty much available in the new kit being shipped. 
It's just that you don't have enough drives. 8 cores should mean 8 spindles if 
available. 
Otherwise you end up seeing your CPU load climb on wait states as the processes 
wait for the disk i/o to catch up. 

I mean, you could build out a cluster w/ 4 x 3.5" 2TB drives in a 1U chassis 
based on price. You're making a trade-off and you should be aware of the 
performance hit you will take. 

HTH 

-Mike

On Nov 27, 2012, at 1:52 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org 
wrote:

 Hi Michael,
 
 so are you recommanding 32Gb per node?
 
 What about the disks? SATA drives are to slow?
 
 JM
 
 2012/11/26, Michael Segel michael_se...@hotmail.com:
 Uhm, those specs are actually now out of date.
 
 If you're running HBase, or want to also run R on top of Hadoop, you will
 need to add more memory.
 Also forget 1GBe got 10GBe,  and w 2 SATA drives, you will be disk i/o bound
 way too quickly.
 
 
 On Nov 26, 2012, at 8:05 AM, Marcos Ortiz mlor...@uci.cu wrote:
 
 Are you asking about hardware recommendations?
 Eric Sammer on his Hadoop Operations book, did a great job about this:
 For middle size clusters (until 300 nodes):
 Processor: A dual quad-core 2.6 Ghz
 RAM: 24 GB DDR3
 Dual 1 Gb Ethernet NICs
 a SAS drive controller
 at least two SATA II drives in a JBOD configuration
 
 The replication factor depends heavily of the primary use of your
 cluster.
 
 On 11/26/2012 08:53 AM, David Charle wrote:
 hi
 
 what's the recommended nodes for NN, hmaster and zk nodes for a larger
 cluster, lets say 50-100+
 
 also, what would be the ideal replication factor for larger clusters when
 u have 3-4 racks ?
 
 --
 David
 
 --
 
 Marcos Luis Ortíz Valmaseda
 about.me/marcosortiz http://about.me/marcosortiz
 @marcosluis2186 http://twitter.com/marcosluis2186
 
 
 
 
 
 



Re: Do you know what's the release time of hbase 0.96.0 and 0.94.3?

2012-11-27 Thread Stack
Please don't send the same question to three different mailing lists.

See below for answers.

On Tue, Nov 27, 2012 at 6:59 PM, 张莉苹 zlpmiche...@gmail.com wrote:
 *Do you know what's the release time of apache hbase 0.96.0 and hbase
 0.94.3?*


0.94.3 should be out in a week or two.

0.96.0 start of next year hopefully.


 I just saw there was a piece of news, *December 4th, 2012 0.96 Bug
 Squashing and Testing Hackathon at Cloudera, SF*, at
 http://www.meetup.com/hackathon/events/90536432/.

 Does that mean apache hbase 0.96.0 will finally be released *before Dec.
 4th*?


No.  Devs are going to hang around for a day working on 0.96 issues
and talking about how to get 0.96 out the door.



 BTW, I think hbase 0.94.2 is the latest stable and released version in the
 community, right?


That's right!

Yours,
St.Ack


Regarding rework in changing column family

2012-11-27 Thread Ramasubramanian
Hi,

I have created a table in hbase with one column family and planned to release it 
for development (in pentaho). 

Suppose that later, after doing data profiling in production, I feel that out 
of 600 columns, 200 are not going to be used frequently, and I plan to group 
those into another column family. 

If I change the column family at a later point in time, I expect there will be 
a lot of rework that has to be done (whether we use java or pentaho). Is my 
understanding correct? Is there any alternative available to overcome this?

Regards,
Rams

Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-27 Thread Ramasubramanian
Hi,

Thanks!!

Can someone help in suggesting the best rowkey that we can use in this 
scenario?

Regards,
Rams

On 27-Nov-2012, at 10:37 PM, Suraj Varma svarma...@gmail.com wrote:

 Ian Varley's excellent HBaseCon presentation is another great resource.
 http://ianvarley.com/coding/HBaseSchema_HBaseCon2012.pdf
 
 On Mon, Nov 26, 2012 at 5:43 AM, Doug Meil
 doug.m...@explorysmedical.com wrote:
 
 Hi there, somebody already wisely mentioned the link to the # of CF's
 entry, but here are a few other entries that can save you some heartburn
 if you read them ahead of time.
 
 http://hbase.apache.org/book.html#datamodel
 
 http://hbase.apache.org/book.html#schema
 
 http://hbase.apache.org/book.html#architecture
 
 
 
 
 
 On 11/26/12 5:28 AM, Mohammad Tariq donta...@gmail.com wrote:
 
 Hello sir,
 
   You might become a victim of RS hotspotting, since the cutomerIDs will
 be sequential(I assume). To keep things simple Hbase puts all the rows
 with
 similar keys to the same RS. But, it becomes a bottleneck in the long run
 as all the data keeps on going to the same region.
 
 HTH
 
 Regards,
   Mohammad Tariq
 
 
 
 On Mon, Nov 26, 2012 at 3:53 PM, Ramasubramanian Narayanan 
 ramasubramanian.naraya...@gmail.com wrote:
 
 Hi,
 Thanks! Can we have the customer number as the RowKey for the customer
 (client) master table? Please help in educating me on the advantage and
 disadvantage of having customer number as the Row key...
 
 Also SCD2 we may need to implement in that table.. will it work if I
 have
 like that?
 
 Or
 
 SCD2 is not needed instead we can achieve the same by increasing the
 version number that it will hold?
 
 pls suggest...
 
 regards,
 Rams
 
 On Mon, Nov 26, 2012 at 1:10 PM, Li, Min m...@microstrategy.com wrote:
 
 When 1 cf need to do split, other 599 cfs will split at the same
 time. So
 many fragments will be produced when you use so many column families.
 Actually, many cfs can be merge to only one cf with specific tags in
 rowkey. For example, rowkey of customer address can be uid+'AD', and
 customer profile can be uid+'PR'.
 
 Min
 -Original Message-
 From: Ramasubramanian Narayanan [mailto:
 ramasubramanian.naraya...@gmail.com]
 Sent: Monday, November 26, 2012 3:05 PM
 To: user@hbase.apache.org
 Subject: Expert suggestion needed to create table in Hbase - Banking
 
 Hi,
 
  I have a requirement of physicalising the logical model... I have a
 client model which has 600+ entities...
 
  Need suggestion how to go about physicalising it...
 
  I have few other doubts :
  1) Whether is it good to create a single table for all the 600+
 columns?
  2) To have different column families for different groups or can it
 be
 under a single column family? For example, customer address can we
 have
 as
 a different column family?
 
  Please help on this..
 
 
 regards,
 Rams
 
 


Re: Regarding rework in changing column family

2012-11-27 Thread ramkrishna vasudevan
As far as I see, altering the table with the new column family should be
easy:
- disable the table
- issue the modify-table command with the new column family
- run a compaction
After this, when you start doing your puts, they should be in alignment
with the new schema defined for the table. One thing you may have to watch
is how much your rate of puts is affected, because now both of your
CFs will start flushing whenever a memstore flush happens.
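
In the Java admin API (0.92-era), that sequence is roughly as follows (the
table and family names are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class AddFamily {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    admin.disableTable("mytable");                             // hypothetical table
    admin.addColumn("mytable", new HColumnDescriptor("cf2"));  // new family
    admin.enableTable("mytable");
    admin.majorCompact("mytable");                             // rewrite store files
  }
}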

Hope this helps.

Regards
Ram

On Wed, Nov 28, 2012 at 10:10 AM, Ramasubramanian 
ramasubramanian.naraya...@gmail.com wrote:

 Hi,

 I have created table in hbase with one column family and planned to
 release for development (in pentaho).

 Suppose later after doing the data profiling in production if I feel that
 out of 600 columns 200 is not going to get used frequently I am planning to
 group those into another column family.

 If I change the column family at later point of time I hope there will a
 lots of rework that has to be done (either if we use java or pentaho). Is
 my understanding is correct? Is there any other alternative available to
 overcome?

 Regards,
 Rams


Re: Expert suggestion needed to create table in Hbase - Banking

2012-11-27 Thread anil gupta
Hi Rams,

IMHO, you need to go through http://hbase.apache.org/book.html and the book
"HBase: The Definitive Guide" to get a deeper understanding of HBase. It
will help you in designing your system.

There is no magical trick to design the most efficient/best RowKey without
knowing the detailed requirements and constraints and carrying out a couple
of experiments.

HTH,
Anil


On Tue, Nov 27, 2012 at 8:44 PM, Ramasubramanian 
ramasubramanian.naraya...@gmail.com wrote:

 Hi,

 Thanks!!

 Can someone help in suggesting what is the best rowkey that we can use in
 this scenario.

 Regards,
 Rams

 On 27-Nov-2012, at 10:37 PM, Suraj Varma svarma...@gmail.com wrote:

  Ian Varley's excellent HBaseCon presentation is another great resource.
  http://ianvarley.com/coding/HBaseSchema_HBaseCon2012.pdf
 
  On Mon, Nov 26, 2012 at 5:43 AM, Doug Meil
  doug.m...@explorysmedical.com wrote:
 
  Hi there, somebody already wisely mentioned the link to the # of CF's
  entry, but here are a few other entries that can save you some heartburn
  if you read them ahead of time.
 
  http://hbase.apache.org/book.html#datamodel
 
  http://hbase.apache.org/book.html#schema
 
  http://hbase.apache.org/book.html#architecture
 
 
 
 
 
  On 11/26/12 5:28 AM, Mohammad Tariq donta...@gmail.com wrote:
 
  Hello sir,
 
    You might become a victim of RS hotspotting, since the customerIDs
  will be sequential (I assume). To keep things simple, HBase puts all the
  rows with similar keys on the same RS. But this becomes a bottleneck in
  the long run, as all the data keeps going to the same region.
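  One common way around this (a sketch of the usual salting technique, not
  something prescribed in this thread; the bucket count of 16 is arbitrary)
  is to prefix the sequential id with a hash-derived byte, so writes spread
  across regions while gets can still recompute the salt from the id:

      import org.apache.hadoop.hbase.util.Bytes;

      public class KeySalter {
        static final int NUM_BUCKETS = 16;

        // Mask the sign bit so the bucket index is always non-negative.
        static byte[] saltedKey(String customerId) {
          byte salt = (byte) ((customerId.hashCode() & 0x7fffffff) % NUM_BUCKETS);
          return Bytes.add(new byte[] { salt }, Bytes.toBytes(customerId));
        }
      }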
 
  HTH
 
  Regards,
Mohammad Tariq
 
 
 
  On Mon, Nov 26, 2012 at 3:53 PM, Ramasubramanian Narayanan 
  ramasubramanian.naraya...@gmail.com wrote:
 
  Hi,
   Thanks! Can we have the customer number as the rowkey for the customer
   (client) master table? Please help educate me on the advantages and
   disadvantages of having the customer number as the rowkey...
 
   Also, we may need to implement SCD2 on that table.. will it work if I
   model it like that?

   Or

   is SCD2 not needed, since we can achieve the same thing by increasing
   the number of versions that the table will hold?
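   For the versioning alternative, a minimal sketch with the 0.94-era API
   (the table, family, and column names are illustrative, and 'table' is
   assumed to be an already-open HTable):

       import java.util.List;
       import org.apache.hadoop.hbase.HColumnDescriptor;
       import org.apache.hadoop.hbase.KeyValue;
       import org.apache.hadoop.hbase.client.Get;
       import org.apache.hadoop.hbase.client.Result;
       import org.apache.hadoop.hbase.util.Bytes;

       // At table-creation time: keep up to 5 timestamped versions per cell.
       HColumnDescriptor cf = new HColumnDescriptor("cf1");
       cf.setMaxVersions(5);

       // At read time: fetch the full history of one column.
       Get get = new Get(Bytes.toBytes("CUST00042"));
       get.setMaxVersions(5);
       Result result = table.get(get);
       List<KeyValue> history =
           result.getColumn(Bytes.toBytes("cf1"), Bytes.toBytes("address"));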
 
  pls suggest...
 
  regards,
  Rams
 
  On Mon, Nov 26, 2012 at 1:10 PM, Li, Min m...@microstrategy.com
 wrote:
 
   When one CF needs to split, the other 599 CFs will split at the same
   time, so many fragments will be produced when you use so many column
   families. Actually, many CFs can be merged into a single CF by putting
   specific tags in the rowkey. For example, the rowkey for a customer
   address can be uid+'AD', and for a customer profile uid+'PR'.
 
  Min
  -Original Message-
  From: Ramasubramanian Narayanan [mailto:
  ramasubramanian.naraya...@gmail.com]
  Sent: Monday, November 26, 2012 3:05 PM
  To: user@hbase.apache.org
  Subject: Expert suggestion needed to create table in Hbase - Banking
 
  Hi,
 
    I have a requirement to physicalise a logical model... I have a
   client model which has 600+ entities...

    Need suggestions on how to go about physicalising it...

    I have a few other doubts:
    1) Is it good to create a single table for all 600+ columns?
    2) Should different groups go into different column families, or can
   everything sit under a single column family? For example, can customer
   address be a separate column family?

    Please help with this..
 
 
  regards,
  Rams
 
 




-- 
Thanks & Regards,
Anil Gupta


Re: Regarding rework in changing column family

2012-11-27 Thread Ramasubramanian Narayanan
Thanks Ram!!!

My question is like this...

Suppose I have created a table with 100 columns under a single column
family 'cf1'.

Now, in production, there are billions of records in that table, and
multiple programs are feeding into it (let us say some 50 programs)...

In this scenario, if I split the column family so that the first 40 columns
stay in 'cf1' and the last 60 columns move to a new column family 'cf2',
*do we need to change all 50 programs which are inserting into that table
with 'cf1' for all columns?*
regards,
Rams

On Wed, Nov 28, 2012 at 10:24 AM, ramkrishna vasudevan 
ramkrishna.s.vasude...@gmail.com wrote:

 As far as I can see, altering the table to add the new column family
 should be straightforward:
 - disable the table
 - issue a modify table command with the new column family
 - run a compaction
 After this, when you start doing your puts, they should be in alignment
 with the new schema defined for the table.  One thing you may have to
 watch is how much your put rate is affected, because both of your CFs
 will now flush whenever a memstore flush happens.

 Hope this helps.

 Regards
 Ram

 On Wed, Nov 28, 2012 at 10:10 AM, Ramasubramanian 
 ramasubramanian.naraya...@gmail.com wrote:

  Hi,
 
  I have created a table in HBase with one column family and plan to
  release it for development (in Pentaho).

  Suppose that later, after data profiling in production, I find that 200
  of the 600 columns are not going to be used frequently, and I plan to
  group those into another column family.

  If I change the column family at a later point in time, I expect there
  will be a lot of rework to be done (whether we use Java or Pentaho). Is
  my understanding correct? Is there any alternative available to overcome
  this?
 
  Regards,
  Rams



Re: Best practice for the naming convention of column family

2012-11-27 Thread 冯宏华
According to http://hbase.apache.org/book.html#number.of.cfs - 6.3.2.1.
Column Families: try to keep the ColumnFamily names as small as possible,
preferably one character (e.g. "d" for data/default).
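A tiny sketch of what that looks like at table-creation time (0.94-era
API; the table name 'customer' is illustrative):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateWithShortFamily {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        HTableDescriptor desc = new HTableDescriptor("customer");
        // The family name is stored in every KeyValue on disk, so a
        // one-character name saves space in every single cell.
        desc.addFamily(new HColumnDescriptor("d"));
        admin.createTable(desc);
        admin.close();
      }
    }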

From: Ramasubramanian Narayanan [ramasubramanian.naraya...@gmail.com]
Sent: Wednesday, November 28, 2012 1:51 PM
To: user@hbase.apache.org
Subject: Best practice for the naming convention of column family

Hi,

Can anyone please suggest the best practice for the naming convention for
column families?

regards,

Rams


Re: Regarding rework in changing column family

2012-11-27 Thread ramkrishna vasudevan
I am afraid they have to be changed... For your puts to land in the
specified column family, the column family name must appear in the Puts
created by the client.
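That is, the family name is baked into every Put a client builds, so a
hypothetical writer fragment like this (row, column, and value are made
up) would have to switch its family literal for each moved column:

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    Put put = new Put(Bytes.toBytes("CUST00042"));
    // Each cell names its column family explicitly; after the split this
    // literal must change from "cf1" to "cf2" in all 50 programs.
    put.add(Bytes.toBytes("cf2"), Bytes.toBytes("col41"), Bytes.toBytes("value"));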

Regards
Ram

On Wed, Nov 28, 2012 at 11:18 AM, Ramasubramanian Narayanan 
ramasubramanian.naraya...@gmail.com wrote:

 Thanks Ram!!!

 My question is like this...

 Suppose I have created a table with 100 columns under a single column
 family 'cf1'.

 Now, in production, there are billions of records in that table, and
 multiple programs are feeding into it (let us say some 50 programs)...

 In this scenario, if I split the column family so that the first 40
 columns stay in 'cf1' and the last 60 columns move to a new column family
 'cf2', *do we need to change all 50 programs which are inserting into
 that table with 'cf1' for all columns?*
 regards,
 Rams

 On Wed, Nov 28, 2012 at 10:24 AM, ramkrishna vasudevan 
 ramkrishna.s.vasude...@gmail.com wrote:

  As far as I can see, altering the table to add the new column family
  should be straightforward:
  - disable the table
  - issue a modify table command with the new column family
  - run a compaction
  After this, when you start doing your puts, they should be in alignment
  with the new schema defined for the table.  One thing you may have to
  watch is how much your put rate is affected, because both of your CFs
  will now flush whenever a memstore flush happens.
 
  Hope this helps.
 
  Regards
  Ram
 
  On Wed, Nov 28, 2012 at 10:10 AM, Ramasubramanian 
  ramasubramanian.naraya...@gmail.com wrote:
 
   Hi,
  
   I have created a table in HBase with one column family and plan to
   release it for development (in Pentaho).

   Suppose that later, after data profiling in production, I find that 200
   of the 600 columns are not going to be used frequently, and I plan to
   group those into another column family.

   If I change the column family at a later point in time, I expect there
   will be a lot of rework to be done (whether we use Java or Pentaho). Is
   my understanding correct? Is there any alternative available to
   overcome this?
  
   Regards,
   Rams
 



Parallel reading advice

2012-11-27 Thread Sean McNamara
I have a table whose keys are prefixed with a byte to help distribute the
keys so scans don't hotspot.

I also have a bunch of slave processes that work to scan the prefix
partitions in parallel.  Currently each slave sets up its own HBase
connection, scanner, etc.  Most of the slave processes finish their scan
and return within 2-3 seconds, and it tends to take the same amount of
time whether there's lots of data or very little.  So I think that ~2 s of
overhead is there because each slave sets up a new connection on each
request (I am unable to reuse connections in the slaves).

I'm wondering if I could remove some of that overhead by using the master
(which can reuse its HBase connection) to determine the splits, and then
delegating that information out to each slave.  I think I could possibly
use TableInputFormat/TableRecordReader to accomplish this?  Would this
route make sense?
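A rough sketch of the master-side half of that idea (0.94-era API; the
table name 'mytable' is a placeholder, and shipping the ranges to the
slaves is left out). HTable can report per-region start/end keys, which
map one-to-one onto the scan ranges a slave would run:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.hbase.util.Pair;

    public class RegionSplits {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // reuses one connection
        // One (start, end) pair per region; each pair is a natural unit
        // of work for a slave.
        Pair<byte[][], byte[][]> bounds = table.getStartEndKeys();
        for (int i = 0; i < bounds.getFirst().length; i++) {
          byte[] start = bounds.getFirst()[i];
          byte[] stop = bounds.getSecond()[i];
          // Ship (start, stop) to a slave, which runs: new Scan(start, stop)
          System.out.println(Bytes.toStringBinary(start) + " .. "
              + Bytes.toStringBinary(stop));
        }
        table.close();
      }
    }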


RE: Regarding rework in changing column family

2012-11-27 Thread Anoop Sam John

Also, what about the current data in the table? Right now it is all under
the single CF, and modifying the table to add a new CF will not move any
data into the new family!
Remember that HBase only deals with CFs at the table schema level; there
are no qualifiers in the schema as such. A qualifier is specified only when
data is inserted/retrieved.
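So existing cells would also need a one-off copy into the new family. A
minimal sketch of such a migration under the 0.94-era API (table, family,
and column names are placeholders; no batching, deletes, or error
handling):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CopyColumnToNewFamily {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");
        byte[] cf1 = Bytes.toBytes("cf1");
        byte[] cf2 = Bytes.toBytes("cf2");
        byte[] col = Bytes.toBytes("col41");

        // Re-write one column from the old family into the new one,
        // row by row.
        Scan scan = new Scan();
        scan.addColumn(cf1, col);
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
          Put put = new Put(r.getRow());
          put.add(cf2, col, r.getValue(cf1, col));
          table.put(put);
        }
        scanner.close();
        table.close();
      }
    }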

-Anoop-

From: ramkrishna vasudevan [ramkrishna.s.vasude...@gmail.com]
Sent: Wednesday, November 28, 2012 11:41 AM
To: user@hbase.apache.org
Subject: Re: Regarding rework in changing column family

I am afraid they have to be changed... For your puts to land in the
specified column family, the column family name must appear in the Puts
created by the client.

Regards
Ram

On Wed, Nov 28, 2012 at 11:18 AM, Ramasubramanian Narayanan 
ramasubramanian.naraya...@gmail.com wrote:

 Thanks Ram!!!

 My question is like this...

 Suppose I have created a table with 100 columns under a single column
 family 'cf1'.

 Now, in production, there are billions of records in that table, and
 multiple programs are feeding into it (let us say some 50 programs)...

 In this scenario, if I split the column family so that the first 40
 columns stay in 'cf1' and the last 60 columns move to a new column family
 'cf2', *do we need to change all 50 programs which are inserting into
 that table with 'cf1' for all columns?*
 regards,
 Rams

 On Wed, Nov 28, 2012 at 10:24 AM, ramkrishna vasudevan 
 ramkrishna.s.vasude...@gmail.com wrote:

  As far as I can see, altering the table to add the new column family
  should be straightforward:
  - disable the table
  - issue a modify table command with the new column family
  - run a compaction
  After this, when you start doing your puts, they should be in alignment
  with the new schema defined for the table.  One thing you may have to
  watch is how much your put rate is affected, because both of your CFs
  will now flush whenever a memstore flush happens.
 
  Hope this helps.
 
  Regards
  Ram
 
  On Wed, Nov 28, 2012 at 10:10 AM, Ramasubramanian 
  ramasubramanian.naraya...@gmail.com wrote:
 
   Hi,
  
    I have created a table in HBase with one column family and plan to
    release it for development (in Pentaho).

    Suppose that later, after data profiling in production, I find that
    200 of the 600 columns are not going to be used frequently, and I plan
    to group those into another column family.

    If I change the column family at a later point in time, I expect there
    will be a lot of rework to be done (whether we use Java or Pentaho).
    Is my understanding correct? Is there any alternative available to
    overcome this?
  
   Regards,
   Rams