Yeah, it worked fine. But as I understand it: if I prefix my row key with
something like md5-hash + timestamp, then the row keys are probably evenly
distributed, but how would I then perform a scan restricted to a specific
time range?

2013/2/19 Mohammad Tariq <[email protected]>:
> No, before the timestamp. All the row keys which are identical go to the
> same region. This is the default HBase behavior and is meant to make the
> performance better. But sometimes the machine gets overloaded with reads
> and writes because we get concentrated on that particular machine. For
> example timeseries data. So it's better to hash the keys in order to make
> them go to all the machines equally. HTH
>
> BTW, did that range query work??
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Tue, Feb 19, 2013 at 9:54 PM, Paul van Hoven <
> [email protected]> wrote:
>
>> Hey Tariq,
>>
>> thanks for your quick answer. I'm not sure if I got the idea in the
>> second part of your answer. You mean if I use a timestamp as a rowkey I
>> should append a hash like this:
>>
>> 1357279200000+MD5HASH
>>
>> and then the data would be distributed more equally?
>>
>>
>> 2013/2/19 Mohammad Tariq <[email protected]>:
>> > Hello Paul,
>> >
>> > Try this and see if it works :
>> > scan.setStartRow(Bytes.toBytes(startDate.getTime() + ""));
>> > scan.setStopRow(Bytes.toBytes(endDate.getTime() + 1 + ""));
>> >
>> > Also try not to use TS as the rowkey, as it may lead to RS hotspotting.
>> > Just add a hash to your rowkeys so that data is distributed evenly on all
>> > the RSs.
>> >
>> > Warm Regards,
>> > Tariq
>> > https://mtariq.jux.com/
>> > cloudfront.blogspot.com
>> >
>> >
>> > On Tue, Feb 19, 2013 at 9:41 PM, Paul van Hoven <
>> > [email protected]> wrote:
>> >
>> >> Hi,
>> >>
>> >> I'm currently playing with hbase. The design of the rowkey seems to be
>> >> critical.
>> >>
>> >> The rowkey for a certain database table of mine is:
>> >>
>> >> timestamp+ipaddress
>> >>
>> >> It looks something like this when performing a scan on the table in the
>> >> shell:
>> >> hbase(main):012:0> scan 'ToyDataTable'
>> >> ROW                          COLUMN+CELL
>> >> 1357020000000+192.168.178.9  column=CF:SampleCol,
>> >> timestamp=1361288601717, value=Entry_1 = 2013-01-01 07:00:00
>> >>
>> >> Since I have several rows for different timestamps, I'd like to restrict
>> >> a scan to just a region of the table, for example from 2013-01-07 to
>> >> 2013-01-09. Previously I only had a timestamp as the rowkey and I
>> >> could restrict the rowkey like this:
>> >>
>> >> SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
>> >> Date startDate = formatter.parse("2013-01-07 07:00:00");
>> >> Date endDate = formatter.parse("2013-01-10 07:00:00");
>> >>
>> >> HTableInterface toyDataTable = pool.getTable("ToyDataTable");
>> >> Scan scan = new Scan(Bytes.toBytes(startDate.getTime()),
>> >>                      Bytes.toBytes(endDate.getTime()));
>> >>
>> >> But this no longer works with my new design.
>> >>
>> >> Is there a way to tell the scan object to filter the rows with respect
>> >> to the timestamp, or do I have to use a filter object?
>> >>
>>
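On the time-range question with a hashed prefix: a full MD5 in front of the timestamp scatters neighbouring timestamps across the key space, so a single start/stop range no longer covers them. One common workaround, sketched below and not taken from this thread, is to reduce the hash to a small bucket number ("salting") and run one range scan per bucket. The class name SaltedKeys, the bucket count of 4 and the key layout "<bucket>+<timestamp>+<ip>" are all assumptions made for illustration. The sketch only builds the key strings and has no HBase dependency.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: builds salted row keys and the matching scan ranges.
class SaltedKeys {
    // Assumed bucket count; in practice it would be chosen to suit the cluster.
    static final int BUCKETS = 4;

    // Derive a small, stable bucket number from the MD5 of the IP address.
    static int bucket(String ip) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(ip.getBytes(StandardCharsets.UTF_8));
            return (digest[0] & 0xff) % BUCKETS;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed key layout: "<2-digit bucket>+<13-digit timestamp>+<ip>".
    // Zero-padding keeps the string order equal to the numeric order.
    static String rowKey(long timestamp, String ip) {
        return String.format(Locale.ROOT, "%02d+%013d+%s", bucket(ip), timestamp, ip);
    }

    // One {startRow, stopRow} pair per bucket, covering startTs <= ts < endTs.
    static List<String[]> scanRanges(long startTs, long endTs) {
        List<String[]> ranges = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            ranges.add(new String[] {
                String.format(Locale.ROOT, "%02d+%013d", b, startTs),
                String.format(Locale.ROOT, "%02d+%013d", b, endTs)
            });
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(rowKey(1357020000000L, "192.168.178.9"));
        for (String[] r : scanRanges(1357000000000L, 1357279200000L)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

Each pair would then go through Bytes.toBytes(...) into scan.setStartRow(...) and scan.setStopRow(...) as in the snippet quoted above, with the per-bucket results merged on the client. The trade-off is that a time-range read costs one scan per bucket, in exchange for writes being spread over all buckets.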
