Re: Region hot spotting

2012-11-22 Thread Mohammad Tariq
Good point Mike. Regards, Mohammad Tariq On Thu, Nov 22, 2012 at 2:51 AM, Michael Segel michael_se...@hotmail.comwrote: Salting is not a good idea and I don't know why people suggest it. Case in point: you want to fetch a single row/record back. Because the salt is arbitrary, you
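Segel's objection can be made concrete: when the salt prefix is arbitrary (e.g. a random bucket number rather than a deterministic function of the key), a point lookup no longer knows which prefixed key to ask for and must probe every bucket. A minimal sketch, assuming a hypothetical bucket count of 8 and a simple `"bucket|key"` key layout:

```java
import java.util.ArrayList;
import java.util.List;

public class SaltFanOut {
    static final int BUCKETS = 8; // hypothetical salt range for illustration

    // With a random salt, the reader cannot know which bucket holds the row,
    // so one logical get expands into one candidate key per bucket.
    static List<String> candidateKeys(String logicalKey) {
        List<String> keys = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            keys.add(b + "|" + logicalKey);
        }
        return keys;
    }

    public static void main(String[] args) {
        // A single-row fetch of "user123" turns into 8 gets.
        System.out.println(candidateKeys("user123").size());
    }
}
```

A deterministic hash prefix (discussed later in the "Region hot spotting" thread) avoids this fan-out, because the client can recompute the prefix from the key itself.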

RE: Paging On HBASE like solr

2012-11-22 Thread Vajrakumar
Hello Doug, First of all thanks for taking time to reply. As far as my knowledge goes, the below two lines take the rowkey as a parameter representing the start and end. scan.setStartRow( Bytes.toBytes(row)); // start key is inclusive scan.setStopRow( Bytes.toBytes(row + (char)0));
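The `row + (char)0` trick quoted above works because the start row is inclusive and the stop row is exclusive: appending a zero byte produces the smallest key that sorts strictly after `row`, so the range covers exactly that one row. A minimal sketch of the range logic with plain `String` comparison standing in for HBase's byte-wise key comparator (key names are made up):

```java
public class StopRowTrick {
    // Start row is inclusive, stop row is exclusive. Appending a 0x00 byte
    // yields the smallest key sorting strictly after `row`, so the range
    // [row, row + "\0") matches exactly the single row `row`.
    static boolean inRange(String key, String startRow, String stopRow) {
        return key.compareTo(startRow) >= 0 && key.compareTo(stopRow) < 0;
    }

    public static void main(String[] args) {
        String row = "user123";
        String stop = row + (char) 0;
        System.out.println(inRange("user123", row, stop));  // the row itself matches
        System.out.println(inRange("user1234", row, stop)); // any longer key falls outside
    }
}
```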

Re: Paging On HBASE like solr

2012-11-22 Thread Doug Meil
Hi there- Then don't use an end-row and break out of the loop when you hit 100 rows. On 11/22/12 5:16 AM, Vajrakumar vajra.ku...@pointcross.com wrote: Hello Doug, First of all thanks for taking time to reply. As per my knowledge goes below two lines take the rowkey as a parameter for
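Doug's suggestion, sketched below with a `TreeMap` standing in for a table scan (row names and the limit of 100 are illustrative, not HBase API): scan from the start row with no stop row and break out of the loop once enough rows have been collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class LimitScan {
    // Open-ended scan from startRow (inclusive), stopped client-side after
    // `limit` rows instead of via a stop row.
    static List<String> firstN(TreeMap<String, String> table, String startRow, int limit) {
        List<String> page = new ArrayList<>();
        for (String row : table.tailMap(startRow).keySet()) {
            page.add(row);
            if (page.size() >= limit) break; // break out of the loop at the limit
        }
        return page;
    }
}
```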

Re: Paging On HBASE like solr

2012-11-22 Thread Ryan Smith
But then the range might not be respected. I think another way to ask is, is it possible to iterate over the rowkeys in an hbase table sequentially? On Thu, Nov 22, 2012 at 7:17 AM, Doug Meil doug.m...@explorysmedical.comwrote: Hi there- Then don't use an end-row and break out of the

RE: HBase NonBlocking and Async Thrift

2012-11-22 Thread Pankaj Misra
Thank you so much for your responses Michael and JM. Surprisingly, the HBase Definitive Guide does recommend using a server-based client mechanism to access HBase, and mentions Avro/Thrift for best performance. I had initially tried to use Avro, which I was told would not be supported in future

Custom versioning best practices

2012-11-22 Thread David Koch
Hello, I was thinking of using versions with custom timestamps to store the evolution of a column value - as opposed to creating several (time_t, value_at_time_t) qualifier-value pairs. The value to be stored is a single integer. Fast ad-hoc retrieval of multiple versions based on a row key +

Re: Paging On HBASE like solr

2012-11-22 Thread Harsh J
Ryan, Not sure I understood what you meant. As I see it, there are two things when you have a start key and need a limited scan: - Stop at 100th consecutive row, even if there are holes in the actual consecutive key range. -- This is possible with a start key plus boundary stop key that is 100 +

Re: Paging On HBASE like solr

2012-11-22 Thread Mohammad Tariq
Hello Vajra, Give HBase's PageFilter a shot and see if it works for you. You need to specify a pageSize parameter, which controls how many rows per page should be returned. One thing which you have to keep in mind is to remember the last row that was returned. HTH Regards, Mohammad Tariq
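Tariq's advice combines a page-size limit with remembering the last row returned, so the next page can resume just past it. A sketch of that resume pattern, again using a `TreeMap` in place of a real scan with PageFilter (row names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class Paging {
    // Fetch up to pageSize rows starting at startRow (inclusive).
    static List<String> page(TreeMap<String, String> table, String startRow, int pageSize) {
        List<String> rows = new ArrayList<>();
        for (String row : table.tailMap(startRow).keySet()) {
            rows.add(row);
            if (rows.size() >= pageSize) break;
        }
        return rows;
    }

    // Remember the last row returned; appending a zero byte gives the
    // smallest key after it, so the next page resumes strictly past it.
    static String nextStart(List<String> page) {
        return page.get(page.size() - 1) + (char) 0;
    }
}
```

Each page issues a fresh scan starting at `nextStart(previousPage)`, which is the same bookkeeping a client must do around PageFilter, since the filter alone does not track position across scans.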

Re: Paging On HBASE like solr

2012-11-22 Thread Ryan Smith
Thanks Harsh, That's how I understood HBase table scans to work currently. However, I need true sequential row scans. I currently run an MR job to create Solr indexes, which gives me sequential access to the rowkey data. Just wondering if HBase offered a native solution yet. Thanks

Re: Custom versioning best practices

2012-11-22 Thread Michael Segel
IMHO, the best practice is not to do this. It's an abuse of versioning and if you really want to store temporal data, make it part of the column name. On Nov 22, 2012, at 7:55 AM, David Koch ogd...@googlemail.com wrote: Hello, I was thinking of using versions with custom timestamps to
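Segel's "make it part of the column name" suggestion can be sketched as encoding the timestamp into the column qualifier instead of into cell versions. One common encoding (an illustration, not prescribed by the thread) is `Long.MAX_VALUE - timestamp`, zero-padded, so qualifiers sort newest-first under lexicographic byte order:

```java
public class TemporalQualifier {
    // Build a qualifier like "value:9223372036854773807" where the numeric
    // part is (Long.MAX_VALUE - ts), zero-padded to a fixed 19 digits so
    // lexicographic order equals reverse chronological order.
    static String qualifier(String name, long ts) {
        return name + ":" + String.format("%019d", Long.MAX_VALUE - ts);
    }
}
```

A scan over the qualifiers of one row then returns the newest value first, and arbitrary time ranges become plain qualifier ranges rather than version queries.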

HBase 0.94.x vs Hadoop 2.0.x?

2012-11-22 Thread Jean-Marc Spaggiari
Hi, Is HBase 0.94.x running with Hadoop 2.0.x? I saw that 0.96.0 will be, but I was wondering for 0.94.x... Thanks, JM

Re: HBase 0.94.x vs Hadoop 2.0.x?

2012-11-22 Thread Harsh J
HBase 0.92 onwards has had support for Apache Hadoop 2.x+ along with Apache Hadoop 1.x. The note on the web manual says that: As of Apache HBase 0.96.x, Apache Hadoop 1.0.x at least is required. We will no longer run properly on older Hadoops such as 0.20.205 or branch-0.20-append. Do not move

Re: HBase 0.94.x vs Hadoop 2.0.x?

2012-11-22 Thread Jean-Marc Spaggiari
Thanks for the clarification. Should I take a specific 0.94.x distribution to work with Hadoop 2.x+ ? Because there is only one version on http://apache.parentingamerica.com/hbase/hbase-0.94.2/ ... Or can I just keep my existing HBase configuration and simply upgrade my Hadoop to 2.x? JM

Re: HBase 0.94.x vs Hadoop 2.0.x?

2012-11-22 Thread Marcos Ortiz
On 11/22/2012 01:49 PM, Harsh J wrote: HBase 0.92+ onwards has had support for Apache Hadoop 2.x+ along with Apache Hadoop 1.x. The note on the web manual says that: As of Apache HBase 0.96.x, Apache Hadoop 1.0.x at least is required. We will no longer run properly on older Hadoops such as

Re: HBase 0.94.x vs Hadoop 2.0.x?

2012-11-22 Thread Harsh J
You'd have to recompile against a proper target and produce your own tarball. I guess binary artifacts would begin shipping with 0.96 onwards. From the root of a 0.94.x release tarball, try Elliot's command from http://permalink.gmane.org/gmane.comp.java.hadoop.hbase.user/30163: $ mvn clean

Re: HBase 0.94.x vs Hadoop 2.0.x?

2012-11-22 Thread Marcos Ortiz
On 11/22/2012 02:35 PM, Harsh J wrote: Marcos makes a good point btw - Move from 1.x to 2.x also involves recompiling your existing MR apps (if they are already built with maven, thats not too much work to do). But other than a few obvious issues caught by the java compiler, the APIs remain the

Re: HBase scanner LeaseException

2012-11-22 Thread Vincent Barat
Apparently, my problem seems more related to the one exposed here: http://www.nosql.se/tags/hbase-rpc-timeout/ I don't really understand the reason why next() on our scanners is called less than once per 60s, and actually I suspect this is NOT the case, since we never had any scanner timeout

RE: Region hot spotting

2012-11-22 Thread Ajay Bhosle
We are not fetching single row back. Hashing really helped, the data is now almost equally split between the servers. Thanks a lot. -Ajay -Original Message- From: Michael Segel [mailto:michael_se...@hotmail.com] Sent: Thursday, November 22, 2012 2:52 AM To: user@hbase.apache.org
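The hashing Ajay describes can be sketched as prefixing each rowkey with a short deterministic hash of the key itself: writes spread across regions, yet unlike a random salt, a point get can recompute the prefix from the key. A minimal illustration (the one-byte MD5 prefix and `"hash|key"` layout are assumptions, not from the thread):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class HashPrefix {
    // Prefix the rowkey with two hex digits derived from the key's MD5
    // hash. The prefix is deterministic, so readers can rebuild the full
    // key, while writes distribute across up to 256 key ranges.
    static String prefixed(String key) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            return String.format("%02x", d[0]) + "|" + key;
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present in the JDK
        }
    }
}
```

Note that any such prefix trades away cheap range scans over the logical key order, which is why it suits write-heavy, point-read workloads like the one in this thread.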

HBASE Benchmarking

2012-11-22 Thread a...@hsk.hk
Hi, I tried to write one million records into the HBASE cluster with 5 nodes (HBase 0.94.2 on Hadoop 1.0.4) 1. Method: sequentialWrite 2. From the log, I found that the process had to sleep 3 times (total 4012ms) 3. It scanned .META. for max=10 rows Any idea why it got max=10 rows, will this

Re: Custom versioning best practices

2012-11-22 Thread anil gupta
Hi David, As per my knowledge, HBase currently doesn't support specifying a separate setMaxVersions for different column families in a single Scan object. HTH, Anil On Thu, Nov 22, 2012 at 12:47 PM, David Koch ogd...@googlemail.com wrote: Hello Michael, Thank you for your response. By the

Re: scan is slower after bulk load

2012-11-22 Thread Asaf Mesika
Did you end up finding the answer? How fast is this method of insertion relative to a simple insert of a List&lt;Put&gt;? On Nov 13, 2012, at 02:29, Bijieshan bijies...@huawei.com wrote: I think one possible reason is block caching. Have you turned the block caching off during scanning? Regards,

Re: Issues on disabling compaction in HBase 0.94.2

2012-11-22 Thread ramkrishna vasudevan
Hi Yun, Are you trying to disable Minor compactions? Regards Ram On Fri, Nov 23, 2012 at 5:20 AM, yun peng pengyunm...@gmail.com wrote: Hi, I want to disable automatic compaction in HBase. Currently I used following configurations in conf/hbase-site.xml The problem is compaction does not

Re: HBASE Benchmarking

2012-11-22 Thread lars hofhansl
Making some wild guesses here. If your IO system cannot keep up with the write load, eventually it has to block the writers. For a while your writes are buffered in the memstore(s), but at some point they need to be flushed to disk. Many small files will lead to bad read performance, so these