Out of memory error in Hbase

2012-06-28 Thread Prakrati Agrawal
Hi, I am getting a heap space error while running HBase on certain nodes of my cluster. I can't increase the heap space allocated to HBase; is there some way I can prevent this from happening? The following is the log: serverName=server1,60020,1340693606454, load=(requests=4,

RE: Region splitting problem

2012-06-28 Thread Ramkrishna.S.Vasudevan
Hi, what type of rowkeys are you specifying? HBase does byte comparison, so the split that happens is correct. 1012 and 10112 will fall in the same region, whereas 201 will come in the next region. It depends on how you form the row key. Regards Ram -Original Message- From: Ben
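
A minimal sketch of the byte-wise ordering described above, in plain Java with no HBase dependency. The class and method names (RowKeyOrder, compareKeys, key) are hypothetical, and the comparison is a stand-in for unsigned lexicographic byte comparison, not HBase's own code. It shows why 10112 and 1012 sort next to each other while 201 comes later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RowKeyOrder {
    // Compare two keys byte by byte, treating each byte as unsigned (0..255).
    // A shorter key that is a prefix of a longer one sorts first.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Hypothetical helper: encode a string row key as bytes.
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] keys = {"201", "1012", "10112"};
        Arrays.sort(keys, (p, q) -> compareKeys(key(p), key(q)));
        System.out.println(Arrays.toString(keys)); // [10112, 1012, 201]
    }
}
```

If numeric order is wanted, one common option is to zero-pad numeric keys to a fixed width so that byte order and numeric order agree.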

Re: direct Hfile Read and Writes

2012-06-28 Thread samar kumar
Thanks for the replies. I am aware of the APIs, but can anyone give me a little more insight into the details? After creating the HFiles and calling IncrementalLoadHFile, how does it internally change the RS, catalog tables, etc.? Can anyone explain the flow? Thanks, Samar On Thu, Jun 28, 2012 at

RE: Scan vs Put vs Get

2012-06-28 Thread Ramkrishna.S.Vasudevan
Hi, you can also check the cache hit and cache miss statistics that appear on the UI. In your random scan, how many regions are scanned? With gets, there may be many due to randomness. Regards Ram -Original Message- From: N Keywal [mailto:nkey...@gmail.com] Sent: Thursday, June 28,

Re: Stargate: ScannerModel

2012-06-28 Thread N Keywal
(moving this to the user mailing list, with the dev one in bcc) From what you said, it should be customerid_MIN_TX_ID to customerid_MAX_TX_ID, but only if the customerid size is constant. Note that with this rowkey design there will be very few regions involved, so it's unlikely to be parallelized.

Fwd: What is the real semantic of the parameter hbase.client.retries.number

2012-06-28 Thread shixing
send to user@hbase.apache.org -- Forwarded message -- From: shixing paradise...@gmail.com Date: Thu, Jun 28, 2012 at 12:25 PM Subject: What is the real semantic of the parameter hbase.client.retries.number To: hbase-u...@hadoop.apache.org, d...@hbase.apache.org I have seen there

Fwd: Can I use non kerberos HDFS for AccessControl HBase base on kerberos?

2012-06-28 Thread shixing
send to user@hbase.apache.org -- Forwarded message -- From: shixing paradise...@gmail.com Date: Thu, Jun 28, 2012 at 12:22 PM Subject: Can I use non kerberos HDFS for AccessControl HBase base on kerberos? To: hbase-u...@hadoop.apache.org, apurt...@apache.org,

Re: Coprocessors on specific servers

2012-06-28 Thread Mohammad Tariq
Thank you guys for the valuable inputs. Regards, Mohammad Tariq On Thu, Jun 28, 2012 at 10:27 AM, Lars George lars.geo...@gmail.com wrote: Yes exactly, this plus what Mohammad says, use the internal scanner to get just the data from the region once you are in the coprocessor code. There

Re: Region splitting problem

2012-06-28 Thread Ben Kim
You are so right. Why didn't I think about that :'( I appreciate your comment a lot. Ben On Thu, Jun 28, 2012 at 5:33 PM, Ramkrishna.S.Vasudevan ramkrishna.vasude...@huawei.com wrote: Hi, what type of rowkeys are you specifying? HBase does byte comparison, so the split that happens is

Re: Problem starting up HBase in pseudo distributed mode

2012-06-28 Thread Gargi Nagar
Hi Hari, For the error: ERROR: org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Timed out (1ms) This is not very technical, but I was facing the same issue and it got fixed on its own when I restarted the system.

RE: secondary indexing of tables

2012-06-28 Thread Ramkrishna.S.Vasudevan
I don't have sample code, but it can be done using coprocessors because they provide a lot of hooks. HBASE-2038 will give you pointers towards that; before that, please read about coprocessors as well. This is just one way of doing it. Regards Ram -Original Message- From: Harsh Gupta

Re: secondary indexing of tables

2012-06-28 Thread Mohammad Tariq
Hi Harsh, You can visit this page - http://wiki.apache.org/hadoop/Hbase/SecondaryIndexing Regards, Mohammad Tariq On Thu, Jun 28, 2012 at 5:21 PM, Ramkrishna.S.Vasudevan ramkrishna.vasude...@huawei.com wrote: I don't have sample code, but it can be done using coprocessors

RE: Scan vs Put vs Get

2012-06-28 Thread Ramkrishna.S.Vasudevan
In 0.94, the UI of the RS has a metrics table. There you can see blockCacheHitCount, blockCacheMissCount, etc. Maybe there is a variation when you do scan() and get() here. Regards Ram -Original Message- From: Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org] Sent:

Re: Scan vs Put vs Get

2012-06-28 Thread Jean-Marc Spaggiari
Oh! I never looked at this part ;) Ok. I have it. Here are the numbers for one server before the read: blockCacheSizeMB=186.28 blockCacheFreeMB=55.4 blockCacheCount=2923 blockCacheHitCount=195999 blockCacheMissCount=89297 blockCacheEvictedCount=69858 blockCacheHitRatio=68%
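
As a rough check on the numbers above, the sketch below computes a hit ratio from the hit and miss counters. It assumes the ratio is hits / (hits + misses) truncated to a whole percent; the thread does not confirm how the region server derives the figure, and the class and method names are hypothetical.

```java
class BlockCacheRatio {
    // Assumed formula: hits / (hits + misses), truncated to a whole percent.
    static long hitRatioPercent(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0 : (hits * 100) / total;
    }

    public static void main(String[] args) {
        // Counters taken from the message above.
        System.out.println(hitRatioPercent(195999L, 89297L) + "%"); // 68%
    }
}
```

Under that assumption the counters reproduce the reported blockCacheHitRatio of 68%.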

Re: Stargate: ScannerModel

2012-06-28 Thread Doug Meil
One other thing… re: I tried using rowFilter but it is quite slow. If you didn't use startRow/stopRow for the Scan you will be filtering all the rows in the table (albeit on the RS, but still… all the rows) On 6/28/12 4:56 AM, N Keywal nkey...@gmail.com wrote: (moving this to the user

RE: Scan vs Put vs Get

2012-06-28 Thread Anoop Sam John
blockCacheHitRatio=69% It seems you are getting the blocks from the cache. You can also check with Bloom filters. You can enable them using the config param io.storefile.bloom.enabled set to true. This will enable the usage of Bloom filters globally. Now you need to set the bloom type for your CF

Re: Scan vs Put vs Get

2012-06-28 Thread N Keywal
Time to read 1 lines: 108.0 mseconds (92593 lines/seconds) This part is unclear to me. How did you get these results? It's not with the list of gets, if I understood well? On Thu, Jun 28, 2012 at 1:13 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Wow. First, thanks a lot all for

Re: Scan vs Put vs Get

2012-06-28 Thread Jean-Marc Spaggiari
Hi Anoop, Are Bloom filters for columns? If I add g.setFilter(new KeyOnlyFilter()); does that mean I can't use Bloom filters? Basically, what I'm doing here is something like existKey(byte[]):boolean where I try to see if a key exists in the database without taking into consideration if there is

Re: Scan vs Put vs Get

2012-06-28 Thread Jean-Marc Spaggiari
Hi N Keywal, This result: Time to read 1 lines : 122.0 mseconds (81967 lines/seconds) is obtained with this code: HTable table = new HTable(config, "test3"); final int linesToRead = 1; System.out.println(new java.util.Date() + " Processing iteration " + iteration + "... "); RandomRowFilter rrf

Re: In memory table after using 'alter'

2012-06-28 Thread Sever Fundatureanu
Thanks, Minh. This blog also mentions that bloom filters check if a given column exists in a given row. However in HBase: The definitive guide it is mentioned that enabling the Bloom filter does give you the immediate advantage of knowing if a file contains a particular row key or not. Is the blog

Re: In memory table after using 'alter'

2012-06-28 Thread Minh Duc Nguyen
Sever, I'd say that the wiki page is incomplete when discussing bloom filters. The HBase Reference Guide http://hbase.apache.org/book.html#schema.bloom provides a much better explanation of the different bloom filter options. ~ Minh On Thu, Jun 28, 2012 at 9:54 AM, Sever Fundatureanu

Re: Scan vs Put vs Get

2012-06-28 Thread N Keywal
Thank you. It's clearer now. From the code you sent, RandomRowFilter is not used. You're only using the KeyOnlyFilter (the second setFilter replaces the first one; you need to use something like FilterList to combine filters). (Note as well that you would need to initialize RandomRowFilter#chance, if not

Re: Slow row deletion performance in comparison to insertion

2012-06-28 Thread Jeff Whiting
0.90.4-cdh3u3 is the version I'm running. ~Jeff On 6/27/2012 5:50 PM, Ted Yu wrote: I created HBASE-6287 https://issues.apache.org/jira/browse/HBASE-6287 for porting HBASE-5941 to trunk. Jeff: What version of HBase are you using ? Since HBASE-5941 is an improvement, a vote may be raised for

Re: secondary indexing of tables

2012-06-28 Thread Ted Yu
The wiki is quite old. See Todd's comment on a newer JIRA: https://issues.apache.org/jira/browse/HBASE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249927#comment-13249927 On Thu, Jun 28, 2012 at 4:58 AM, Mohammad Tariq donta...@gmail.com wrote:

Re: Region splitting problem

2012-06-28 Thread Jonathan Bishop
Ram, Your key splitting is incorrect - I had the same problem. Give this a try... notice that you need to insert a zero before the first byte to prevent BigInteger from interpreting this as a negative number (it uses the first bit as a sign bit), and that you need to strip off the leading zero when
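
The code from the original message is cut off in this archive, so the following is only a sketch of the zero-byte trick it describes, not the poster's implementation. The names (SplitPoint, toUnsigned, toKey, midpoint) are hypothetical, and the sketch assumes both keys have the same length. A 0x00 byte is prepended so that BigInteger reads the key as non-negative, and the sign byte is dropped again when converting back.

```java
import java.math.BigInteger;

class SplitPoint {
    // Prepend a 0x00 byte so BigInteger does not treat a high first bit as a sign bit.
    static BigInteger toUnsigned(byte[] key) {
        byte[] padded = new byte[key.length + 1];
        System.arraycopy(key, 0, padded, 1, key.length);
        return new BigInteger(padded);
    }

    // Convert back to exactly `width` bytes: drop any leading sign byte,
    // and left-pad with zeros if the value needs fewer bytes.
    static byte[] toKey(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    // Midpoint of two equal-length keys, interpreted as unsigned big-endian numbers.
    static byte[] midpoint(byte[] start, byte[] end) {
        BigInteger mid = toUnsigned(start).add(toUnsigned(end)).shiftRight(1);
        return toKey(mid, start.length);
    }

    public static void main(String[] args) {
        byte[] start = {(byte) 0x80, 0x00};
        byte[] end = {(byte) 0xff, (byte) 0xfe};
        byte[] mid = midpoint(start, end);
        System.out.println(String.format("%02x%02x", mid[0] & 0xff, mid[1] & 0xff)); // bfff
    }
}
```

Without the padding byte, new BigInteger(start) would read 0x8000 as -32768 and the midpoint would be wrong. The standard constructor new BigInteger(1, key) achieves the same non-negative interpretation.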

Re: Region splitting problem

2012-06-28 Thread Jonathan Bishop
Sorry, that was meant for Benjamin. Also, you can test your key order using Bytes.compareTo(a,b). I believe that is what is used internally (correct me if I am wrong). On Thu, Jun 28, 2012 at 12:07 AM, Ben Kim benkimkim...@gmail.com wrote: Hi :) I have a hbase table with rowkeys 1 ~ 1000

Re: Scan vs Put vs Get

2012-06-28 Thread Jean-Marc Spaggiari
Oh! I see! KeyOnlyFilter is overwriting the RandomRowFilter! Bad. I mean, too bad I did not figure that out. Thanks for pointing that out. That definitely explains the difference in performance. I have activated the Bloom filters with this code: HBaseAdmin admin = new HBaseAdmin(config); HTable table =

Re: Scan vs Put vs Get

2012-06-28 Thread N Keywal
For the filter list my guess is that you're filtering out all rows because RandomRowFilter#chance is not initialized (it should be something like RandomRowFilter rrf = new RandomRowFilter(0.5);) But note that this test will never be comparable to the test with a list of gets. You can make it as

Re: Scan vs Put vs Get

2012-06-28 Thread Jean-Marc Spaggiari
Oh, sorry. You're right. You already said that and I forgot to update it. It's working fine when I add this parameter. And as you are saying, I can get the response time I want by playing with the chance... I get (34758 lines/seconds) with 0.99 as the chance, and only (7564 lines/seconds) with

Re: Out of memory error in Hbase

2012-06-28 Thread Asaf Mesika
How many Put objects are you sending to the server? Maybe you should decrease this amount? Sent from my iPad On 28 Jun 2012, at 11:13, Prakrati Agrawal prakrati.agra...@mu-sigma.com wrote: Hi, I am getting a heap space error while running HBase on certain nodes of my cluster. I can't increase

Understanding HBase log output regarding memstore flush

2012-06-28 Thread Asaf Mesika
Hi, I'm trying to figure out some discrepancies I'm witnessing in the HBase Region Server log file. It states that a flush was requested, and then a memstore flush is started. It says the flush size, after snapshotting is 139105600 (~132.7m). In the log message below, the file size of the file
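
The conversion behind the "~132.7m" figure can be checked with a short sketch. It assumes the log reports sizes in units of 1024 * 1024 bytes, which is consistent with the number quoted; the class and method names are hypothetical.

```java
import java.util.Locale;

class FlushSize {
    // Convert a byte count to mebibytes (1024 * 1024 bytes), rounded to one decimal place.
    static String toMiB(long bytes) {
        return String.format(Locale.ROOT, "%.1f", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        // Flush size taken from the message above.
        System.out.println(toMiB(139105600L)); // 132.7
    }
}
```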

Re: Understanding HBase log output regarding memstore flush

2012-06-28 Thread Matt Corgan
Hi Asaf, I believe HDFS will see the 35.5MB worth of data. The 132.7MB is the size of the data in the memstore with the overhead of the ConcurrentSkipListMap which is a pointer-heavy data structure. Are you using compression? If so then the 35.5 is the compressed size, and you should see a

HBase master not starting up

2012-06-28 Thread Kasturi
Hi, I have HBase master running on 3 nodes and region server on 4 other nodes on a Mapr hadoop cluster. We have been using it for a while, and it was working fine. Yesterday, we had a disk crash on one of the masters. There were some further issues and we had to delete the Hbase logs. Now, I am

Re: HBase master not starting up

2012-06-28 Thread Michael Segel
Sounds like your .Meta. table is corrupted. Thought that was fixed in 90.4... On Jun 28, 2012, at 1:26 PM, Kasturi wrote: Hi, I have HBase master running on 3 nodes and region server on 4 other nodes on a Mapr hadoop cluster. We have been using it for a while, and it was working fine.

RDBMS to HBASE schema migration

2012-06-28 Thread grashmi13
Hi, I want to convert my RDBMS schema to an HBase schema, to be used with the Hadoop platform. I have changed two RDBMS tables into HBase tables. I have ignored constraints, indexes and foreign key relationships because I don't know how to convert these relationships into an HBase schema. Please confirm if the

Re: RDBMS to HBASE schema migration

2012-06-28 Thread grashmi13
The Assets table has a numeric sequential ID and one number out of (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) for AssetName. This is a master table with, say, 10 rows only. Hmm... after some more searching, I learned that we have to manually denormalize a relational DB. There are no preset rules for

Re: RDBMS to HBASE schema migration

2012-06-28 Thread Doug Meil
Hi there- I commend your enthusiasm for the HBase project. For the ground rules of HBase you probably want to read this closely… http://hbase.apache.org/book.html#datamodel … as it covers things like having one PK per table, no secondary indexes, etc. With a solid understanding of these

Re: HBase master not starting up

2012-06-28 Thread Stack
On Thu, Jun 28, 2012 at 11:26 AM, Kasturi kchatter...@technoratimedia.com wrote: We are using Hbase version 0.90.4. Try a later hbase? A 0.90.6? You could upgrade master only. Does that start? St.Ack

Re: Out of memory error in Hbase

2012-06-28 Thread Stack
On Thu, Jun 28, 2012 at 1:10 AM, Prakrati Agrawal prakrati.agra...@mu-sigma.com wrote: serverName=server1,60020,1340693606454, load=(requests=4, regions=18, usedHeap=1972, maxHeap=1993): OutOfMemoryError, aborting You've given it 2Gs of heap and you only have 18 regions? It should work.

Re: Out of memory error in Hbase

2012-06-28 Thread Stack
On Thu, Jun 28, 2012 at 1:10 AM, Prakrati Agrawal prakrati.agra...@mu-sigma.com wrote: Please help me Please help us by telling us what version of hbase? St.Ack

Re: direct Hfile Read and Writes

2012-06-28 Thread Stack
On Thu, Jun 28, 2012 at 1:40 AM, samar kumar samar.opensou...@gmail.com wrote: Thanks for the replies. I am aware of the APIs, but can anyone give me a little more insight into the details? After creating the HFiles and calling IncrementalLoadHFile, how does it internally change the RS, catalog

Re: RDBMS to HBASE schema migration

2012-06-28 Thread grashmi13
Thanks, Doug Meil, for your valuable comments. Actually, I need to provide some output to my manager by today, so I asked for help from an expert's point of view. I will go through the suggested material, but right now I don't have much time. Also, this is my understanding of HBase design so far