deploying hbase 0.90 to internal maven repository

2010-12-29 Thread Ted Yu
Hi, I used the following script to deploy hbase 0.90 jar to internal maven repository but was not successful:

#!/usr/bin/env bash
set -x
mvn deploy:deploy-file -Dfile=target/hbase-0.90.0.jar -Dpackaging=jar -DgroupId=org.apache.hbase -DartifactId=hbase -Dversion=0.90.0

Re: Good VLDB paper on WALs

2010-12-29 Thread Stack
Nice list of things we need to do to make logging faster (with useful citations on current state of art). This notion of early lock release (ELR) is worth looking into (Jon, for high rates of counter transactions, you've been talking about aggregating counts in front of the WAL lock... maybe an

Re: Good VLDB paper on WALs

2010-12-29 Thread Ryan Rawson
Oh no, let's be wary of those server rewrites. My micro profiling is showing about 30 usec for a lock handoff in the HBase client... I think we should be able to get big wins with minimal things. A big rewrite has its major costs, not to mention to effectively be async we'd have to rewrite

Re: deploying hbase 0.90 to internal maven repository

2010-12-29 Thread Ryan Rawson
just run 'mvn install' in our directory and that should do the trick. everything else is implied by pom.xml. well except the repository stuff. -ryan On Wed, Dec 29, 2010 at 10:29 AM, Ted Yu yuzhih...@gmail.com wrote: Hi, I used the following script to deploy hbase 0.90 jar to internal maven

bloom filter types

2010-12-29 Thread Ted Yu
In 0.90,

/**
 * Bloom enabled with Table row as Key
 */
ROW,
/**
 * Bloom enabled with Table row column (family+qualifier) as Key
 */
ROWCOL

Is there wiki / doc on which type to use in various scenarios ? Thanks

Re: bloom filter types

2010-12-29 Thread Nicolas Spiegelberg
I don't think there's an explicit wiki. Which option depends on whether your use case is calling get() for entire rows or for specific columns in a row. It also depends on analyzing your workload to determine how likely a row will be in every store file vs. a specific column. Also, since a row
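To make the ROW versus ROWCOL distinction concrete, here is a minimal, self-contained sketch of which key each type would test for a column-specific get(). Everything in it is an illustrative assumption: the `BloomType` enum is a local stand-in for the two constants quoted above (not the HBase class), the key layout in `bloomKey` is made up, and a `HashSet` stands in for a real Bloom filter, so it never reports the false positives a real filter can.

```java
import java.util.HashSet;
import java.util.Set;

public class BloomKeySketch {
    // Local stand-in for the two constants quoted in the thread; not the HBase enum.
    enum BloomType { ROW, ROWCOL }

    // Hypothetical key layout: the row alone, or row plus family and qualifier.
    static String bloomKey(BloomType type, String row, String family, String qualifier) {
        return type == BloomType.ROW ? row : row + "/" + family + ":" + qualifier;
    }

    // Collect the keys for every cell {row, family, qualifier} in one store file.
    static Set<String> index(BloomType type, String[][] cells) {
        Set<String> keys = new HashSet<>();
        for (String[] c : cells) {
            keys.add(bloomKey(type, c[0], c[1], c[2]));
        }
        return keys;
    }

    // True means the store file still has to be read for this get().
    static boolean mightContain(Set<String> keys, BloomType type,
                                String row, String family, String qualifier) {
        return keys.contains(bloomKey(type, row, family, qualifier));
    }

    public static void main(String[] args) {
        String[][] storeFile = { {"row1", "cf", "a"}, {"row2", "cf", "b"} };
        Set<String> rowBloom = index(BloomType.ROW, storeFile);
        Set<String> rowColBloom = index(BloomType.ROWCOL, storeFile);

        // get(row1, cf:b): the row is in this file, the requested column is not.
        System.out.println(mightContain(rowBloom, BloomType.ROW, "row1", "cf", "b"));       // true
        System.out.println(mightContain(rowColBloom, BloomType.ROWCOL, "row1", "cf", "b")); // false
    }
}
```

In this sketch the row-keyed filter cannot rule the file out for a single-column read, while the row+column filter can. The trade-off is that the row+column filter holds one key per cell rather than one per row.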

Re: Good VLDB paper on WALs

2010-12-29 Thread Nicolas Spiegelberg
+1 for ELR. I think having some data structure where we prepare the next stage of sync() operations instead of holding the row lock over the sync would be a big win for hot regions without a huge refactor. I think the other two optimizations are useful to think about, but wouldn't have the same
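The idea of not holding the row lock over the sync can be sketched roughly as follows. This is a toy model under stated assumptions, not HBase's WAL code: the class, its fields, and the in-memory `durable` list (a stand-in for a synced log) are all hypothetical. The row lock covers only the counter update and the append to an in-memory buffer; the caller then waits for a sync without the row lock, and the result is returned only after the edit is in the "durable" list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class EarlyLockReleaseSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final Object bufferMonitor = new Object();
    private final Object syncLock = new Object();
    private final List<String> buffered = new ArrayList<>(); // appended, not yet synced
    private final List<String> durable = new ArrayList<>();  // stand-in for the synced log
    private long appendedSeq = 0; // guarded by bufferMonitor
    private long syncedSeq = 0;   // guarded by syncLock
    private long counter = 0;     // guarded by rowLock

    public long increment() {
        long value;
        long mySeq;
        rowLock.lock();
        try {
            value = ++counter;
            synchronized (bufferMonitor) {
                buffered.add("counter=" + value);
                mySeq = ++appendedSeq;
            }
        } finally {
            rowLock.unlock(); // early release: the sync below runs without the row lock
        }
        syncUpTo(mySeq);      // the caller is acknowledged only after its edit is durable
        return value;
    }

    private void syncUpTo(long seq) {
        synchronized (syncLock) {
            if (syncedSeq >= seq) {
                return; // an earlier caller's sync already covered this edit
            }
            List<String> batch;
            long upTo;
            synchronized (bufferMonitor) {
                batch = new ArrayList<>(buffered);
                buffered.clear();
                upTo = appendedSeq;
            }
            durable.addAll(batch); // a real log would do the slow I/O here
            syncedSeq = upTo;
        }
    }

    public int durableEdits() {
        synchronized (syncLock) {
            return durable.size();
        }
    }

    public static void main(String[] args) {
        EarlyLockReleaseSketch s = new EarlyLockReleaseSketch();
        System.out.println(s.increment());    // 1
        System.out.println(s.increment());    // 2
        System.out.println(s.durableEdits()); // 2
    }
}
```

Because appends happen under the row lock, edits to the same row enter the buffer in lock order, and one sync can carry edits from several callers in a single batch.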

Re: bloom filter types

2010-12-29 Thread Stack
Here is the link to the 0.90.0 BF doc, Ted: http://people.apache.org/~stack/hbase-0.90.0-candidate-2/docs/blooms.html It's from some doc Nicolas wrote way back. @N Yeah, if you want to add a bit to the book or elsewhere (We can link to the latter). St.Ack On Wed, Dec 29, 2010 at 2:06 PM, Nicolas

How about to give more flexibility to RowKey (customized comparator and serializer)

2010-12-29 Thread Schubert Zhang
In our application, we want to build an index htable for a core htable, and the key of the index includes multiple columns in the core htable. For example: The core table: RowKey - column1, column2, column3, column4 Note: The length of column1 and column2 is irregular. The index table: RowKey -
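For context on why variable-length columns are awkward in a composite key, here is a small sketch of one common workaround. It is not the customized comparator and serializer proposed in this message, but a serializer that keeps plain unsigned byte-wise comparison working. The class and method names are hypothetical. Each field has its 0x00 bytes escaped as 0x00 0xFF and is terminated with 0x00 0x00, so field boundaries stay unambiguous and keys sort field by field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class CompositeKeySketch {
    // Encode fields so that unsigned lexicographic order matches field-by-field order.
    static byte[] encode(byte[]... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] field : fields) {
            for (byte b : field) {
                out.write(b);
                if (b == 0) {
                    out.write(0xFF); // escape embedded 0x00 as 0x00 0xFF
                }
            }
            out.write(0); // two-byte terminator 0x00 0x00 marks the end of the field
            out.write(0);
        }
        return out.toByteArray();
    }

    // Plain concatenation, shown only to demonstrate the problem.
    static byte[] concat(byte[]... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] field : fields) {
            out.write(field, 0, field.length);
        }
        return out.toByteArray();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Field-wise, ("a", "z") sorts before ("ab", "a") because "a" < "ab".
        byte[] k1 = encode(utf8("a"), utf8("z"));
        byte[] k2 = encode(utf8("ab"), utf8("a"));
        System.out.println(compareUnsigned(k1, k2) < 0); // true

        // Plain concatenation gives "az" and "aba", which sort the other way round.
        byte[] c1 = concat(utf8("a"), utf8("z"));
        byte[] c2 = concat(utf8("ab"), utf8("a"));
        System.out.println(compareUnsigned(c1, c2) < 0); // false
    }
}
```

The cost of this encoding is two extra bytes per field plus one per embedded zero byte. Fixed-width fields would need no terminator at all.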