We do numerical sorting within some of our tables. We put the numerical
values as fixed-length byte arrays within the keys (and flipped the sign
bit so negative values are lexicographically lower than positive values).
Of course, it's still part of the key so that technique doesn't work for
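
A minimal sketch of the sign-bit technique described above, assuming the values are 64-bit signed longs stored big-endian. The class and method names (SortableLongKey, encode, decode, compareUnsigned) are hypothetical and not taken from the thread; only the JDK is used. XOR-ing with Long.MIN_VALUE flips the sign bit, so an unsigned byte-wise comparison of the encoded keys matches the numeric order of the original values.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not from the original thread: encodes signed longs
// so that unsigned lexicographic byte order equals numeric order.
public final class SortableLongKey {
    private SortableLongKey() {}

    /** Encode a signed long as 8 big-endian bytes with the sign bit flipped. */
    public static byte[] encode(long value) {
        // XOR with Long.MIN_VALUE toggles only the top (sign) bit.
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(value ^ Long.MIN_VALUE)
                .array();
    }

    /** Reverse of encode: flip the sign bit back. */
    public static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    /** Unsigned byte-wise comparison, i.e. plain lexicographic byte order. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

With this encoding, encode(-5L) compares lower than encode(3L), whereas the raw two's-complement bytes would sort the negative value after the positive one.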
On Thu, Aug 30, 2012 at 5:04 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
In general isn't it better to split the regions so that the load can be
spread across the cluster to avoid hotspots?
Time series data is a particular case [1] and the sematextians have
tools to help w/ that
Thanks St.Ack and Tom. Yes I too kinda came up with a similar scheme -- To
store the rank as part of the key. Where it broke down for me was for say,
k-dimensional data where ranks are stored for dimension A but the query
requires sorting by dimension B.
For now I have to settle with
Btw, liked the bit flipping for negative values. It didn't occur to me right
off, it would be a problem
i Sent from my iPad with iMstakes
On Aug 30, 2012, at 23:14, Tom Brown tombrow...@gmail.com wrote:
We do numerical sorting within some of our tables. We put the numerical
values as
Hi Sonal, Stack and Ulrich!
Yes, I should provide more details :$
I reached the links you provided when I was searching for a way to start HBase
with JUnit. Compared with the defaults, the only params I have changed are the
ZooKeeper port and the number of nodes, which is 1 in my case. Based on logs I suspect
Hi Cristopher,
HBase starts a minicluster for many of its tests because we have a lot of
destructive tests; otherwise the non-destructive tests would be impacted by the
destructive tests. When writing a client application, you usually don't
need to do that: you can rely on the same instance for all your
Hi Cristofer,
At least 15 seconds are spent on starting the mini cluster for each test
case.
and you are sure that you are reusing your mini cluster across unit tests?
HTH2,
Ulrich
On Fri, Aug 31, 2012 at 12:28 PM, Cristofer Weber
cristofer.we...@neogrid.com wrote:
Hi Sonal, Stack and
Hi Ulrich,
Yes, I'm starting mini cluster inside @BeforeClass. There are 3 different test
cases, and between 2 and 15 tests per test case.
Thanks!
Best regards,
Cristofer
-Original Message-
From: ustaudin...@gmail.com [mailto:ustaudin...@gmail.com] On Behalf Of Ulrich
Staudinger
Sent
On Fri, Aug 31, 2012 at 2:33 PM, Cristofer Weber
cristofer.we...@neogrid.com wrote:
For the other adapters (Cassandra, Cassandra + Thrift, Cassandra +
Astyanax, etc) they managed to run tests as Internal and External for unit
tests and also have a profile for Performance and Concurrent tests,
Stack, re: Where did you read that?, I think he might also be referring
to this...
http://hbase.apache.org/book.html#important_configurations
On 8/30/12 8:04 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
In general isn't it better to split the regions so that the load can be
spread
Asynchbase redone with PB and attention to security would be a good place
to start. I can't commit resources in the immediate term, so that's easy
for me to say I know. Anyway seems we're on the same page wrt client.
On Friday, August 31, 2012, lars hofhansl wrote:
Many of us have been saying
On Thu, Aug 30, 2012 at 11:52 PM, Stack st...@duboce.net wrote:
On Thu, Aug 30, 2012 at 5:04 PM, Mohit Anchlia mohitanch...@gmail.com
wrote:
In general isn't it better to split the regions so that the load can be
spread across the cluster to avoid hotspots?
Time series data is a
On Fri, Aug 31, 2012 at 6:09 AM, Doug Meil
doug.m...@explorysmedical.com wrote:
Stack, re: Where did you read that?, I think he might also be referring
to this...
http://hbase.apache.org/book.html#important_configurations
I'd say we need to revisit that paragraph. It gives a 'wrong'
On Fri, Aug 31, 2012 at 7:55 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
My data is timeseries and to get random distribution and still have the
keys in the same region for a user I am thinking of using
md5(userid)+reversetimestamp as a row key. But with this type of key how
can one do
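
A minimal sketch of how the md5(userid)+reversetimestamp row key mentioned above could be built, assuming a millisecond timestamp and a reverse timestamp computed as Long.MAX_VALUE minus the timestamp. The class and method names (UserTimeRowKey, rowKey) are hypothetical; only the JDK is used and no HBase API is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not from the original thread.
public final class UserTimeRowKey {
    /** An MD5 digest is always 16 bytes long. */
    public static final int HASH_LEN = 16;

    private UserTimeRowKey() {}

    /**
     * Row key layout: 16-byte MD5 of the user id, then 8 big-endian bytes of
     * (Long.MAX_VALUE - timestampMillis). For non-negative timestamps, newer
     * events get a smaller suffix and so sort first within a user's prefix.
     */
    public static byte[] rowKey(String userId, long timestampMillis) {
        return ByteBuffer.allocate(HASH_LEN + Long.BYTES)
                .put(md5(userId))
                .putLong(Long.MAX_VALUE - timestampMillis)
                .array();
    }

    private static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

All keys for one user share the same 16-byte prefix, so they stay contiguous in the key space while different users are spread by the hash.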
Maybe we need to add a coprocessors section to the ref guide. I think
all the current documentation is in javadoc. And if all the
potentially destabilizing issues of in-process coprocessor usage are
not yet called out (memory usage, cpu, etc), we could more explicitly
detail that.
In we want to
Yes (in 0.94.2+). But it would be quite tricky.
You'd have to hook into the compaction. There are new hooks now in
RegionObserver (preCompactScannerOpen and preFlushScannerOpen).
See HBASE-6427.
These two hooks are passed the scanners that provide the set of KVs to be
compacted. You could