HBase BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?

2012-03-05 Thread D S
Hi, I'm learning more about HBase and I'm curious how much of HBase is actually based on Google's original DB. In Google's origin stories, they are well known for using low-cost commodity hardware at scale to store their web database. Almost every blog I read about HBase tells me it's

Wrong Path look up in HBase Bulk Upload

2012-03-05 Thread Garg, Rinku
Hi All, I am trying to upload a CSV file using the HBase bulk upload feature. Below is the URL I referred to: http://hbase.apache.org/bulk-loads.html I have the following problem that maybe you or someone can help me out with. I am new to Hadoop and the MapReduce feature. I tried to run my

a strange situation of HBase when I issue scan '.META.' command

2012-03-05 Thread yonghu
Hello, My HBase version is 0.90.2 and it is installed in pseudo-distributed mode. I have successfully inserted two tuples in the 'test' table. hbase(main):005:0> scan 'test' ROW COLUMN+CELL jim column=course:english, timestamp=1330949116240, value=1.3 tom

Re: a strange situation of HBase when I issue scan '.META.' command

2012-03-05 Thread Doug Meil
Hi there- You might want to see this in the Ref Guide. http://hbase.apache.org/book.html#arch.catalog A region with an empty start key is the first region in a table. If a region has both an empty start and an empty end key, it's the only region in the table. On 3/5/12 7:27 AM, yonghu
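The boundary-key rule quoted from the Ref Guide can be sketched in a few lines of Python. This is only an illustration, not HBase code: `region_position` is a hypothetical helper, and the "last region" case (empty end key only) is an addition that the quoted text does not spell out.

```python
def region_position(start_key: bytes, end_key: bytes) -> str:
    """Classify a region by its boundary keys (hypothetical helper, not an HBase API)."""
    if start_key == b"" and end_key == b"":
        # Both keys empty: the table has exactly one region.
        return "only region"
    if start_key == b"":
        # Empty start key: first region in the table.
        return "first region"
    if end_key == b"":
        # Empty end key: last region (assumption added for completeness).
        return "last region"
    return "middle region"


# A freshly created, unsplit table such as 'test' shows up in .META.
# as a single region whose start and end keys are both empty.
print(region_position(b"", b""))
```

That is why a scan of '.META.' for a small table shows one row with empty STARTKEY and ENDKEY values.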

Re: a strange situation of HBase when I issue scan '.META.' command

2012-03-05 Thread yonghu
Thanks for your reply. I didn't read it carefully. Yong On Mon, Mar 5, 2012 at 2:14 PM, Doug Meil doug.m...@explorysmedical.com wrote: Hi there- You might want to see this in the Ref Guide. http://hbase.apache.org/book.html#arch.catalog A region with an empty start key is the

Re: HBase BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?

2012-03-05 Thread Doug Meil
re: Almost every blog I read about HBase tells me it's a clone of BigTable. The HBase website says that too http://hbase.apache.org/ re: Almost every blog I've read about HBase also tells me to use a lot of RAM So does the Hbase Reference Guide...

Re: HBase BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?

2012-03-05 Thread Michael Drzal
You really need to consider the entire historical context here. A lot of the memory used in hbase is buffering writes to disk and for the block cache. These days, it isn't unreasonable to get 12 2-3TB disks in a commodity server. Back in 2003, you would not get as many disks, and they would be

Re: What about storing binary data(e.g. images) in HBase?

2012-03-05 Thread Michael Segel
Just a couple of things... MapR doesn't have the NN limitations. So if your design requires lots of small files, look at MapR... You could store your large blobs in a sequence file or series of sequence files using HBase to store the index. Sort of a hybrid approach. Sent from my iPhone On Mar
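The hybrid approach mentioned above (large blobs packed into container files, with HBase holding only the index) might look roughly like this. It is a minimal sketch under loud assumptions: an in-memory `io.BytesIO` stands in for a sequence file, a plain dict stands in for the HBase index table, and `BlobStore` is a hypothetical name, not part of any library.

```python
import io


class BlobStore:
    """Toy model: blobs appended to one container, small index kept separately."""

    def __init__(self):
        self.container = io.BytesIO()  # stand-in for a sequence file on HDFS
        self.index = {}                # stand-in for an HBase table: key -> (offset, length)

    def put(self, key: str, blob: bytes) -> None:
        # Append the blob at the end of the container and record where it landed.
        offset = self.container.seek(0, io.SEEK_END)
        self.container.write(blob)
        self.index[key] = (offset, len(blob))

    def get(self, key: str) -> bytes:
        # Look up the location in the index, then read just that slice.
        offset, length = self.index[key]
        self.container.seek(offset)
        return self.container.read(length)


store = BlobStore()
store.put("img-001", b"\x89PNG...first image bytes")
store.put("img-002", b"\xff\xd8...second image bytes")
print(store.get("img-002"))
```

The point of the design is that many blobs share a few large files, so the file count stays small while lookups still go through a keyed index.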

Install hbase on Ubuntu 11.10

2012-03-05 Thread Mahdi Negahi
Dear All Friends, I'm new to Linux and HBase. At first, I installed HBase on Windows with Cygwin successfully, but after installing Thrift everything changed. So I decided to change my OS and try to install HBase on Ubuntu 11.10. I have tried for 2 weeks without any progress. Please please somebody

HBase Region move() and Data Locality

2012-03-05 Thread Bryan Beaudreault
Hey all, We are running on cdh3u2 (soon to upgrade to 3u3), and we notice that regions are balanced solely based on the number of regions per region server, with no regard for horizontal scaling of tables. This was mostly fine with a small number of regions, but as our cluster reaches thousands

Re: HBase Region move() and Data Locality

2012-03-05 Thread Doug Meil
This doesn't address your question on move(), but regarding locality, see 8.7.3 in here... http://hbase.apache.org/book.html#regions.arch .. it's not just major compactions, but any write of a storefile that affects locality (flush, minor, major). On 3/5/12 11:02 AM, Bryan Beaudreault

Re: Install hbase on Ubuntu 11.10

2012-03-05 Thread Peter Vandenabeele
On Mon, Mar 5, 2012 at 10:14 AM, Mahdi Negahi negahi.ma...@hotmail.com wrote: Dear All Friends, I'm new to Linux and HBase. At first, I installed HBase on Windows with Cygwin successfully, but after installing Thrift everything changed. So I decided to change my OS and try to install HBase on

Re: What about storing binary data(e.g. images) in HBase?

2012-03-05 Thread Jacques
The NameNode is limited by the number of blocks. Whether you changed the block size or not would not have much impact on the problem. I think that the limit is something like 150 million blocks. (Someone else can feel free to correct this.) (It isn't exactly that simple because it also has to do

Re: HBase Region move() and Data Locality

2012-03-05 Thread Bryan Beaudreault
Thanks for the response! We are currently migrating analytics data from an old mysql setup to a new hbase-backed architecture. We have a bunch of versions of the data running at once, for testing, beta, live, etc, so we have 63 tables right now and 6451 regions hosted on 12 EC2 m1.xlarge

Re: HBase BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?

2012-03-05 Thread D S
On 3/5/12, Michael Drzal mdr...@gmail.com wrote: You really need to consider the entire historical context here. A lot of the memory used in hbase is buffering writes to disk and for the block cache. These days, it isn't unreasonable to get 12 2-3TB disks in a commodity server. Back in

Re: HBase BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?

2012-03-05 Thread Alan Chaney
On 3/5/2012 11:39 AM, D S wrote: On 3/5/12, Michael Drzal mdr...@gmail.com wrote: Y Are HBase's configuration options robust enough that it could go back and run well on those 2003 specs with a bit of tweaking, if that was what was desired? What do you mean by "run well"? Run as well as Big Table would

RE: gc pause killing regionserver

2012-03-05 Thread Sandy Pratt
What was the actual process size of the JVM as reported by top? Why use the following in your config? -XX:NewRatio=16 -XX:MaxGCPauseMillis=100 Do you really have a stringent latency target, or are you just being aggressive? If I'm reading your log correctly, you have about 2.5 GB of heap,
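To see what -XX:NewRatio=16 implies, a rough calculation helps. In HotSpot, NewRatio is the ratio of old generation to young generation, so the young generation gets about 1/(NewRatio + 1) of the heap. The sketch below assumes a heap of exactly 2.5 GB (2560 MB) and no explicit young-generation size flag; `young_gen_mb` is a hypothetical helper, and the real figure depends on the rest of the JVM settings.

```python
def young_gen_mb(heap_mb: float, new_ratio: int) -> float:
    """Approximate young-generation size: heap / (NewRatio + 1)."""
    return heap_mb / (new_ratio + 1)


# Assumed heap of 2560 MB with the NewRatio from the quoted config.
print(round(young_gen_mb(2560, 16), 1))
```

With these assumptions the young generation is only about 150 MB, which is small for a write-heavy region server.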

Re: gc pause killing regionserver

2012-03-05 Thread Mikael Sitruk
Try to set the initialOccupancy to a lower value and try to allocate more space to the new generation. It seems that you don't have enough room in the survivor space, and therefore you have a promotion failure. You also mention that the same server is frequently having this problem, is it

RE: HBase BigTable + History: Can it run decently on a 512MB machine? What's the difference between the two?

2012-03-05 Thread Sandy Pratt
I have HBase instances with 2GB heap that perform ok. I'm sure they would perform better with more RAM, but they are definitely good enough to test queries and so forth. I bet you could probably get down to 1.5 or 1 GB and be stable if you wanted to. -Original Message- From: D S

Re: Bulk loading a CSV file into HBase

2012-03-05 Thread Shrijeet Paliwal
Anil, Stack meant adding debug statements yourself in tool. -Shrijeet On Mon, Mar 5, 2012 at 4:54 PM, anil gupta anilg...@buffalo.edu wrote: Hi St.Ack, Thanks for the response. Both the tsv and csv are UTF-8 file. Could you please let me know how to run bulk loading in Debug mode? I dont

Re: Install hbase on Ubuntu 11.10

2012-03-05 Thread shashwat shriparv
Just download HBase from Apache, extract it, go to the hbase folder, and run the command bin/start-hbase.sh. It will start; for standalone HBase you don't need to make any change to the configuration file. On Tue, Mar 6, 2012 at 9:44 AM, Gopal absoft...@gmail.com wrote: On 3/5/2012 4:14 AM,

some questions about hbase in production environment

2012-03-05 Thread Qian Ye
Hi all: I'm a newbie to HBase. Here are two questions about HBase in a production environment. I would very much appreciate it if anyone could help. 1. Which HBase configuration settings are recommended to be set, rather than left at the default, when running in a production environment? So far, I knew that these