Andy hi
Not sure what you mean by "Does something like the below help?" The current
code running is pasted below; the line numbers are slightly different from yours.
It seems very close to the first file (revision "a") in your extract.
Mikael.S
public Result[] next(final long scannerId, int nbRows)
I am really intrigued to know why you are thinking of NoSQL for this use
case.
thanks
On Wed, Feb 15, 2012 at 10:39 PM, Raj N wrote:
> Thanks Mikael. I will try the first solution.
>
> To answer your question, I am evaluating both RDBMS and NoSQL and trying to
> find the best solution.
>
>
> On Tue
Hi, all,
I have two region servers set up and each machine has around 32GB of
memory. I started each region server with a 12GB JVM limit. Recently
I have one map-reduce job which will write a big chunk of data into an HBase
table. The job will run around 10 hours and the final HBase table will b
Thanks Mikael. I will try the first solution.
To answer your question, I am evaluating both RDBMS and NoSQL and trying to
find the best solution.
On Tue, Feb 14, 2012 at 8:03 PM, Mikael Sitruk wrote:
> Why don't you prefix the columns with an execution date (reverse order so
> the last execution is
Hello,
We are looking at Bloom Filters and wondering if they are helpful when
doing a sequential read (multi-row scan) or only when doing a Get for a
single row. It logically makes sense that it would only help (or help to a
greater extent) when getting a single row, since it is a way of determining if
you
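The intuition in the question can be quantified with the standard Bloom filter false-positive estimate. The numbers below are illustrative, not HBase's actual defaults; the point is that a filter answers "might this file contain the key?" cheaply, which pays off for point Gets (skipping StoreFiles) but not for a sequential scan that reads the files anyway:

```python
import math

# Standard Bloom filter false-positive estimate: p ~ (1 - e^(-k*n/m))^k,
# where m = bits in the filter, n = keys inserted, k = hash functions.
def false_positive_rate(m_bits, n_keys, k_hashes):
    return (1 - math.exp(-k_hashes * n_keys / m_bits)) ** k_hashes

# With ~10 bits per key and 7 hash functions, the false-positive rate is
# under 1%, so a Get can skip almost every StoreFile that lacks the row.
p = false_positive_rate(m_bits=10_000_000, n_keys=1_000_000, k_hashes=7)
```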
Hmm...
Does something like the below help?
diff --git
a/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
index f9627ed..0cee8e3 100644
--- a/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
+++ b/src/main/java/org/apache/hadoop/hbase/regionserver/HRegio
Hi,
I did look more into this and have a better idea of how it could be
implemented.
As values are looked up by dates (and sometimes additionally by source ID),
it would make sense to store each value in a separate row.
rowkey would be some kind of timeseries, like:
timestamp_sourceID
However, docs s
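A minimal sketch of composing such a `timestamp_sourceID` rowkey is below. The fixed 13-digit zero-padded millisecond timestamp is my assumption; it keeps lexicographic key order equal to chronological order, which is what makes time-range scans work:

```python
def make_rowkey(timestamp_ms, source_id):
    # Zero-padding the timestamp to a fixed width (13 digits covers epoch
    # millis well past the year 2200) makes string order match time order;
    # the width is an assumption for illustration.
    return f"{timestamp_ms:013d}_{source_id}"

k1 = make_rowkey(1_329_300_000_000, "sensor01")
k2 = make_rowkey(1_329_300_000_001, "sensor01")
assert k1 < k2  # earlier timestamp sorts first
```

One caveat with a leading timestamp is that writes hotspot on a single region; salting or reversing the key order are the usual mitigations.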
Hi,
Was just reading about SSTable and LevelDB
(http://www.igvita.com/2012/02/06/sstable-and-log-structured-storage-leveldb/),
which has some HBase references. Somebody pointed out in comments Riak
supports LevelDB as a storage engine option, which made me wonder whether
pluggable backend sto
Ok, I don't have this log anymore, but since the problem was reproduced in
another log (which I kept), here is the grep:
2012-02-08 14:13:02,970 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.regionserver.LeaseException: lease
'-6992210222685255354' does not exist
Hey guys, I'm an HBase and Python newbie, and I'm stuck with the mutateRow()
command.
I'm using CentOS 5.5, Python 2.6 & HBase 0.90.4-cdh3u3. This is running in
a virtualbox, the original image file for the VM is the one provided by
Cloudera.
I've downloaded the hbase-0.90.4-cdh3u3.tar.gz file from cl
You would have to grep the lease's id, in your first email it was
"-7220618182832784549".
About the time it takes to process each row, I meant client (pig) side
not in the RS.
J-D
On Tue, Feb 14, 2012 at 1:33 PM, Mikael Sitruk wrote:
> Please see answer inline
> Thanks
> Mikael.S
>
> On Tue, Fe
I deployed it pretty easily on our internal repo by checking out the tag
0.92.0 (I assume this is the release) and running *mvn deploy -DskipTests=true*.
Or you can move the tests to a separate module, e.g. hbase-test, and add a
dependency on hbase. If all tests in hbase-test pass then you can
release the hbase
On Wed, Feb 15, 2012 at 8:43 AM, N Keywal wrote:
> You cannot use the option -D*skipTests* ?
>
Not on the release plugin apparently (it's ignored -- I should fix it).
St.Ack
You cannot use the option -D*skipTests* ?
On Wed, Feb 15, 2012 at 5:27 PM, Stack wrote:
> On Tue, Feb 14, 2012 at 11:18 PM, Ulrich Staudinger
> wrote:
> > Hi St.Ack,
> >
> > I don't wanna be a pain in the back, but any progress on this?
> >
>
> You are not being a pain.
>
> I'm fumbling the mvn
On Wed, Feb 15, 2012 at 1:53 AM, Oliver Meyn (GBIF) wrote:
> So hacking around reveals that key collision is indeed the problem. I
> thought the modulo part of the getRandomRow method was suspect but while
> removing it improved the behaviour (I got ~8M rows instead of ~6.6M) it
> didn't fix i
On Tue, Feb 14, 2012 at 11:18 PM, Ulrich Staudinger
wrote:
> Hi St.Ack,
>
> I don't wanna be a pain in the back, but any progress on this?
>
You are not being a pain.
I'm fumbling the mvn publishing, repeatedly. It's a little
embarrassing, which is why I'm not talking too much about it (smile).
T
What version of Hadoop are you running? There are many erroneous
instructions all over the internet for how to get this up and running.
You do not need to rebuild hive in order to get it to work. You only
need to do the following:
1. It will only work if HBase is running in distributed or
pseu
There is something amiss. The client has seen a ZK transaction ID far ahead of
what the ZK server thinks is the current epoch. Usually this happens if you
blow away ZK storage and restart it, ie you are creating transient ZK quorums.
A client that continues to run will remember the zxid of the o
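The epoch mechanics behind this can be made concrete. A zxid is a 64-bit value with the leader epoch in the high 32 bits and a per-epoch counter in the low 32 bits, so a client that has seen a high zxid will reject a freshly wiped quorum whose epoch restarted from scratch:

```python
# A ZooKeeper zxid packs the leader epoch into the high 32 bits and a
# per-epoch counter into the low 32 bits.
def split_zxid(zxid):
    return zxid >> 32, zxid & 0xFFFFFFFF  # (epoch, counter)

def make_zxid(epoch, counter):
    return (epoch << 32) | counter

# A client holding a zxid from epoch 5 will see epoch 1 of a rebuilt
# quorum as "in the past", hence the connection errors described above.
epoch, counter = split_zxid(make_zxid(5, 1234))
assert (epoch, counter) == (5, 1234)
```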
Thank you for your reply, Doug. That is what I wanted to know.
On Tue, Feb 14, 2012 at 9:39 PM, Doug Meil wrote:
>
> I say "basically" because inside a Region there are Stores, and for each
> Store there are StoreFiles. For more info see:
>
> http://hbase.apache.org/book.html#regions.arch
>
>
>
Okie:
10x # of mappers: https://issues.apache.org/jira/browse/HBASE-5401
wrong row count: https://issues.apache.org/jira/browse/HBASE-5402
Oliver
On 2012-02-15, at 11:50 AM, yuzhih...@gmail.com wrote:
> Oliver:
> Thanks for digging.
>
> Please file JIRAs for these issues.
>
>
>
> On Feb
Hi James, I'm new to HBase too.
How about this:
With "a range of orderIds", select the first id.
Step 1: set this ID as startRow, then scan for the closest id (only fetch
one).
Step 2: with this fetched ID, setStartRow(fetchedID-startTimestamp) and
setEndRow(fetchedID-endTimestamp).
Step 3:
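The scan boundaries in the steps above can be sketched as follows. The composite `fetchedID-timestamp` rowkey layout and the fixed field widths are my assumptions for illustration:

```python
# Assuming rowkeys of the form "<orderId>-<timestamp>" with fixed-width,
# zero-padded fields (an assumption; the real schema may differ), the
# second scan's start and stop rows from step 2 would be built like this:
def scan_bounds(order_id, start_ts, end_ts):
    start_row = f"{order_id:010d}-{start_ts:013d}"
    stop_row = f"{order_id:010d}-{end_ts:013d}"
    return start_row, stop_row

start, stop = scan_bounds(42, 1_329_300_000_000, 1_329_400_000_000)
assert start < stop  # lexicographic order matches time order
```

These strings would then be passed to the Scan's setStartRow/setStopRow (as bytes) to restrict the scan to one order's time window.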
I am new to HBase, and I can't get the Hive handler working. I downloaded
the latest Hive (0.8.1) which has a handler for 0.89, and based on the
instructions on
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration I
recompiled hive after updating the hbase, zookeeper and guava versions
in
Thanks a lot for the help Todd!
On 14 February 2012 22:39, Todd Lipcon wrote:
> Yep, definitely bound on seeks - see the 100% util, and the r/s >100.
> The bandwidth provided by random IO from a disk is going to be much
> smaller than the sequential IO you see from hdparm
>
> -Todd
>
> On Tue, F
Oliver:
Thanks for digging.
Please file JIRAs for these issues.
On Feb 15, 2012, at 1:53 AM, "Oliver Meyn (GBIF)" wrote:
> On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote:
>
>> On 2012-02-15, at 7:32 AM, Stack wrote:
>>
>>> On Tue, Feb 14, 2012 at 8:14 AM, Stack wrote:
> 2) With
On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote:
> On 2012-02-15, at 7:32 AM, Stack wrote:
>
>> On Tue, Feb 14, 2012 at 8:14 AM, Stack wrote:
2) With that same randomWrite command line above, I would expect a
resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly 10M
On 2012-02-15, at 7:32 AM, Stack wrote:
> On Tue, Feb 14, 2012 at 8:14 AM, Stack wrote:
>>> 2) With that same randomWrite command line above, I would expect a
>>> resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly 10M
>>> rows). Instead what I'm seeing is that the randomWrite