Re: Extremely slow when loading small amount of data from HBase

2012-09-05 Thread n keywal
Hi,

With 8 regionservers, yes, you can. Target a few hundred by default, IMHO.

N.

On Wed, Sep 5, 2012 at 4:55 AM, 某因幡 tewil...@gmail.com wrote:

 +HBase users.


 -- Forwarded message --
 From: Dmitriy Ryaboy dvrya...@gmail.com
 Date: 2012/9/4
 Subject: Re: Extremely slow when loading small amount of data from HBase
 To: u...@pig.apache.org u...@pig.apache.org


 I think the hbase folks recommend something like 40 regions per node
 per table, but I might be misremembering something. Have you tried
 emailing the hbase users list?

 On Sep 4, 2012, at 3:39 AM, 某因幡 tewil...@gmail.com wrote:

  After merging ~8000 regions down to ~4000 on an 8-node cluster, things
  are getting better.
  Should I continue merging?
 
 
  2012/8/29 Dmitriy Ryaboy dvrya...@gmail.com:
  Can you try the same scans with a regular hbase mapreduce job? If you
  see the same problem, it's an hbase issue. Otherwise, we need to see the
  script and some facts about your table (how many regions, how many rows,
  how big a cluster, is the small range all on one region server, etc)
 
  On Aug 27, 2012, at 11:49 PM, 某因幡 tewil...@gmail.com wrote:
 
  When I load a range of data from HBase simply by using a row key range in
  HBaseStorageHandler, the speed is acceptable when I load tens of
  millions of rows or more, but the single map task ends in a timeout
  when the range is only a few thousand rows.
  What is going wrong here? I tried both Pig-0.9.2 and Pig-0.10.0.
 
 
  --
  language: Chinese, Japanese, English
 
 
 



Re: Extremely slow when loading small amount of data from HBase

2012-09-05 Thread Jean-Marc Spaggiari
But I think you should also look at why we have so many regions,
because even if you merge them manually now, you might face the same
issue again soon.

2012/9/5, n keywal nkey...@gmail.com:
 Hi,

 With 8 regionservers, yes, you can. Target a few hundred by default, IMHO.

 N.





Re: Extremely slow when loading small amount of data from HBase

2012-09-05 Thread Doug Meil

You have 4000 regions on an 8-node cluster?  I think you need to bring
that *way* down…

re:  something like 40 regions


Yep… around there.  See…


http://hbase.apache.org/book.html#bigger.regions
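
For a rough sense of scale, the arithmetic behind these numbers can be sketched in Python. This is only an illustration: it assumes all the regions belong to one table, that they are spread evenly over the 8 regionservers, and that each round of pairwise merges roughly halves the region count. The helper names are made up for the example, and the 40-per-node figure is the ballpark quoted in this thread, not a hard limit.

```python
import math

SERVERS = 8                 # regionservers in the cluster (from the thread)
GUIDELINE_PER_SERVER = 40   # ballpark quoted in the thread, per node per table


def regions_per_server(total_regions: int, servers: int) -> float:
    """Average regions per regionserver, assuming an even spread."""
    return total_regions / servers


def halvings_needed(current: int, target: int) -> int:
    """Rounds of pairwise merges needed to get from current to at most target.

    Assumption: each round merges regions two by two, halving the count.
    """
    if current <= target:
        return 0
    return math.ceil(math.log2(current / target))


# Target for a single table under the assumed guideline.
target_total = SERVERS * GUIDELINE_PER_SERVER

for total in (8000, 4000):
    per_server = regions_per_server(total, SERVERS)
    rounds = halvings_needed(total, target_total)
    print(f"{total} regions: {per_server:.0f} per server, "
          f"about {rounds} halving rounds to reach {target_total}")
```

Under those assumptions, 4000 regions works out to about 500 per regionserver against a target of about 320 in total, so roughly four more halving rounds would be needed.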



On 9/5/12 8:06 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

But I think you should also look at why we have so many regions,
because even if you merge them manually now, you might face the same
issue again soon.








Fwd: Extremely slow when loading small amount of data from HBase

2012-09-04 Thread 某因幡
+HBase users.


-- Forwarded message --
From: Dmitriy Ryaboy dvrya...@gmail.com
Date: 2012/9/4
Subject: Re: Extremely slow when loading small amount of data from HBase
To: u...@pig.apache.org u...@pig.apache.org


I think the hbase folks recommend something like 40 regions per node
per table, but I might be misremembering something. Have you tried
emailing the hbase users list?

On Sep 4, 2012, at 3:39 AM, 某因幡 tewil...@gmail.com wrote:

 After merging ~8000 regions down to ~4000 on an 8-node cluster, things
 are getting better.
 Should I continue merging?


 2012/8/29 Dmitriy Ryaboy dvrya...@gmail.com:
 Can you try the same scans with a regular hbase mapreduce job? If you see 
 the same problem, it's an hbase issue. Otherwise, we need to see the script 
 and some facts about your table (how many regions, how many rows, how big a 
 cluster, is the small range all on one region server, etc)

 On Aug 27, 2012, at 11:49 PM, 某因幡 tewil...@gmail.com wrote:

 When I load a range of data from HBase simply by using a row key range in
 HBaseStorageHandler, the speed is acceptable when I load tens of
 millions of rows or more, but the single map task ends in a timeout
 when the range is only a few thousand rows.
 What is going wrong here? I tried both Pig-0.9.2 and Pig-0.10.0.



-- 
language: Chinese, Japanese, English