Thank you so much Serega.
Regards,
Krishna
On Sun, Sep 28, 2014 at 11:01 PM, Serega Sheypak serega.shey...@gmail.com
wrote:
https://pig.apache.org/docs/r0.11.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html
I'm not sure how Pig HBaseStorage works. I suppose it would read all
Thanks Serega,
Our use case details:
We have a location table which will be stored in HBase with locationID as
the rowkey / Joinkey.
We intend to join this table with a transactional WebLog file in HDFS
(Expected size can be around 2TB).
Joining query will be passed from Pig.
Can we expect a
store location to hdfs
store weblog to hdfs
join them
use HBase bulk load tool to load join result to hbase.
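The "join them" step above can be sketched as a plain hash join on locationId. This is only an illustration of the idea, not how Pig executes a join: the function name, the field names other than locationId, and the sample rows are all made up for the example.

```python
def join_on_location(locations, weblogs, key="locationId"):
    """Hash join: index the smaller location set, then stream the weblog rows."""
    by_id = {row[key]: row for row in locations}
    for log in weblogs:
        loc = by_id.get(log[key])
        if loc is not None:  # inner join: skip weblog rows with no matching location
            yield {**loc, **log}


# Hypothetical sample rows, invented purely for illustration.
locations = [
    {"locationId": "L1", "city": "Pune"},
    {"locationId": "L2", "city": "Moscow"},
]
weblogs = [
    {"locationId": "L2", "url": "/home"},
    {"locationId": "L9", "url": "/cart"},
    {"locationId": "L1", "url": "/search"},
]
joined = list(join_on_location(locations, weblogs))
```

The smaller side is held in memory and the larger side is streamed once, which is why the relative sizes of the two data sets matter when choosing a join strategy.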
What's the reason to keep location dataset in hbase and weblogs in hdfs?
You can expect a data load performance improvement. For me it takes a few
minutes to bulk load 500.000.000 records to
We actually have 2 data sets in HDFS, location (3-5 GB, approx 10 columns
in each record) and weblog (2-3 TB, approx 50 columns in each record). We
need to join the data sets using the locationId, which is in both the
data-sets.
We have 2 options:
1. Have both the data-sets in HDFS only and JOIN
https://pig.apache.org/docs/r0.11.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html
I'm not sure how Pig HBaseStorage works. I suppose it would read all
data and then join it as a usual dataset. So you should get serious HBase
performance degradation during read, you would get
Depends on the dataset sizes and the HBase workload. The best way is to do the
join in Pig, store it, and then use the HBase bulk load tool.
It's a general recommendation. I have no idea about your task details
2014-09-27 7:32 GMT+04:00 Krishna Kalyan krishnakaly...@gmail.com:
Hi,
We have a use case that involves ETL on data coming from several different
sources using pig.
We plan to store the final output table in HBase.
What will be the performance impact if we do a join with an external CSV
table using Pig?
Regards,
Krishna
On Thu, Oct 25, 2012 at 7:44 AM, Manu S manupk...@gmail.com wrote:
Hi,
I am using Pig-0.10.0 and hbase-0.94.2.
I am trying to store the processed output to an HBase cluster using a Pig
script.
I registered the required .jar and set the mapreduce and zookeeper
parameters within the script itself.
- Original Message -
From: Mikael Sitruk mikael.sit...@gmail.com
To: user@hbase.apache.org; Andrew Purtell apurt...@apache.org
Cc:
Sent: Wednesday, February 15, 2012 11:32 PM
Subject: Re: LeaseException while extracting data via pig/hbase integration
Andy hi
Not sure what you mean by Does something like the below help
You would have to grep the lease's id, in your first email it was
-7220618182832784549.
About the time it takes to process each row, I meant client (pig) side
not in the RS.
J-D
On Tue, Feb 14, 2012 at 1:33 PM, Mikael Sitruk mikael.sit...@gmail.com wrote:
Please see answer inline
Thanks
OK, I don't have this log anymore, but since the problem was reproduced in
another log (which I kept), here is the grep:
2012-02-08 14:13:02,970 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.regionserver.LeaseException: lease
'-6992210222685255354' does not exist
- Original Message -
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Cc:
Sent: Wednesday, February 15, 2012 10:17 AM
Subject: Re: LeaseException while extracting data via pig/hbase integration
You would have to grep the lease's id, in your first email it was
-7220618182832784549.
About the time it takes to process each row, I meant client
hi,
Well no, I can't figure out what the problem is, but I saw that someone
else had the same problem (see email: LeaseException despite high
hbase.regionserver.lease.period)
What I can tell is the following:
Last week the problem was consistent
1. I updated hbase.regionserver.lease.period=30
On Tue, Feb 14, 2012 at 2:01 AM, Mikael Sitruk mikael.sit...@gmail.com wrote:
hi,
Well no, I can't figure out what the problem is, but I saw that someone
else had the same problem (see email: LeaseException despite high
hbase.regionserver.lease.period)
What I can tell is the following:
Last
Please see answer inline
Thanks
Mikael.S
On Tue, Feb 14, 2012 at 8:30 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
On Tue, Feb 14, 2012 at 2:01 AM, Mikael Sitruk mikael.sit...@gmail.com
wrote:
hi,
Well no, i can't figure out what is the problem, but i saw that someone
else had the
Late answer, did you figure it out?
This exception happens when you don't use your scanner lease for more
than the lease time (default one minute). AFAIK that didn't change, so
maybe something else got slow? Or maybe some special configurations
you had didn't make it during the upgrade?
J-D
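The lease behaviour described above can be pictured with a toy model: if the client spends longer than the lease period between two scanner calls, the lease is gone by the time it comes back. This is not HBase code, only a sketch under that assumption. The function name and the timings are hypothetical, and the 60-second default simply mirrors the "default one minute" mentioned above.

```python
def first_expired_batch(processing_seconds, lease_period_s=60.0):
    """Return the index of the first batch whose client-side processing time
    exceeds the lease period, or None if every batch finishes in time."""
    for i, elapsed in enumerate(processing_seconds):
        if elapsed > lease_period_s:
            return i
    return None


# Hypothetical per-batch client processing times, in seconds.
timings = [5.0, 20.0, 75.0, 10.0]
expired_at = first_expired_batch(timings)
```

In this model a slow client (for example, heavy per-row work on the Pig side) is enough to lose the lease even when the region server itself is healthy.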
On
Hi all
Recently I upgraded my cluster from HBase 0.90.1 to 0.90.4 (using
Cloudera, from cdh3u0 to cdh3u2).
Everything was OK until I ran a Pig extract on the new cluster; on the old
cluster everything worked well.
Now each time I run the extract in conjunction with other work performed on
the