Hi,
Our HBase-backed app does about 95% reads and 5% writes on average, but once
per hour we do a bulk update of several million rows (through HTable w/
large write buffer, not MR). Most of the time the bulk update has no impact
on overall HBase performance. A few times during the day, usually d
eleted. All of the logs below are too late
> (the
> block's already gone, we need to figure out why).
>
> Can you look backwards through the past several days of the NN logs? Have
> you disabled the NN clienttrace log in log4j.properties?
>
> -Todd
>
> On Fri, May 7, 201
in the RS logs would also be
> helpful.
>
> Thanks
> -Todd
>
> On Fri, May 7, 2010 at 9:16 PM, James Baldassari wrote:
>
> > On Sat, May 8, 2010 at 12:02 AM, Stack wrote:
> >
> > > On Fri, May 7, 2010 at 8:27 PM, James Baldassari <
> jbal
pcon wrote:
> This could very well be HBASE-2231.
>
> Do you find that region servers occasionally crash after going into GC
> pauses?
>
> -Todd
>
> On Fri, May 7, 2010 at 9:02 PM, Stack wrote:
>
> > On Fri, May 7, 2010 at 8:27 PM, James Baldassari
> >
On Sat, May 8, 2010 at 12:02 AM, Stack wrote:
> On Fri, May 7, 2010 at 8:27 PM, James Baldassari
> wrote:
> > java.io.IOException: Cannot open filename
> > /hbase/users/73382377/data/312780071564432169
> >
> This is the regionserver log? Is this deploying the regio
Hi,
First of all, thanks to all the HBase contributors for getting 0.20.4 out.
We're planning on upgrading soon, and we're also looking forward to 0.20.5.
Recently we've had a couple of problems where HBase (0.20.3) can't seem to
read a file, and the client spews errors like this:
java.io.IOExcep
Hi Michal,
I'm not an HBase committer, but my organization uses HBase heavily in our
production environment, which processes requests in real-time. I can share
with you a couple of our strategies and best practices for using HBase in
this type of environment:
1. We use an intermediate caching la
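For what it's worth, here is a minimal sketch of what an intermediate caching layer in front of HBase can look like. It only illustrates the pattern, not the actual implementation from that environment; the class, table, and method names are made up, and a production version would use memcached or an evicting LRU cache rather than an unbounded map.

import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

/** Hypothetical read-through cache: check local memory first, go to HBase on a miss. */
public class CachingUserStore {

  private final HTable table;
  // Unbounded map to keep the sketch short; use an evicting cache in practice.
  private final ConcurrentMap<String, byte[]> cache =
      new ConcurrentHashMap<String, byte[]>();

  public CachingUserStore(HTable table) {
    this.table = table;
  }

  public byte[] get(String rowKey) throws IOException {
    byte[] cached = cache.get(rowKey);
    if (cached != null) {
      return cached; // served from memory, no RPC to a region server
    }
    Result result = table.get(new Get(Bytes.toBytes(rowKey)));
    // Assumes a single-column row for brevity; value() returns the first column's value.
    byte[] value = result.isEmpty() ? null : result.value();
    if (value != null) {
      cache.put(rowKey, value);
    }
    return value;
  }

  public void invalidate(String rowKey) {
    cache.remove(rowKey); // call whenever the row is updated, to avoid stale reads
  }
}

The point is simply that the hottest keys never touch HBase at all, which matters a lot in a read-heavy real-time path.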
The hbase-*.xml files should probably go in WEB-INF/classes for Tomcat to
find them on the classpath. Better yet, put them wherever you want and just
programmatically build up an HBaseConfiguration instance based on the known
locations of these files. You can put the file locations in init-params
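To make that concrete, here is a rough sketch of the programmatic approach with the 0.20-era HBaseConfiguration. The init-param name and file path are made up for illustration, and "users" is just an example table name:

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class HBaseBackedServlet extends HttpServlet {

  private HTable usersTable;

  @Override
  public void init() throws ServletException {
    // "hbaseSiteXml" is a hypothetical init-param defined in web.xml,
    // pointing at the known location of hbase-site.xml on the server.
    String hbaseSitePath = getInitParameter("hbaseSiteXml");
    HBaseConfiguration conf = new HBaseConfiguration();
    conf.addResource(new Path(hbaseSitePath));
    try {
      usersTable = new HTable(conf, "users");
    } catch (Exception e) {
      throw new ServletException("Could not connect to HBase", e);
    }
  }
}

Either way works; the classpath approach is less code, while the programmatic approach keeps the HBase config out of the webapp's packaging.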
Hi,
I work at a small start-up that relies heavily on Hadoop for map-reduce as
well as HBase for real-time gets/puts, and I wanted to share some of my
experiences related to running these clusters on a budget. Like many
start-ups our initial platform was EC2. We would spin up a cluster, run
some
M/R stuff down.
>
> Anyway, thanks for sharing!
>
> Cheers,
> Dan
>
> On 18 February 2010 06:31, Stack wrote:
>
> > On Wed, Feb 17, 2010 at 11:04 AM, James Baldassari
> > wrote:
> > > OK, I'll do my best to capture our changes here. Ideally
On Wed, 2010-02-17 at 13:31 -0600, Stack wrote:
> On Wed, Feb 17, 2010 at 11:04 AM, James Baldassari wrote:
> > OK, I'll do my best to capture our changes here. Ideally we would have
> > changed one variable at a time, but since these performance problems
> > were h
k through these issues. I
really appreciate it.
-James
On Wed, 2010-02-17 at 02:18 -0600, Daniel Washusen wrote:
> Glad you sorted it out! Please do tell...
>
> On 17/02/2010, at 4:59 PM, James Baldassari wrote:
>
> > Hi,
> >
> > I think we managed to solve our
> When you look at the master ui, you can see that the request rate over
> time is about the same for all regionservers? (refresh the master ui
> every so often to take a new sampling).
>
> St.Ack
>
>
>
>
> On Tue, Feb 16, 2010 at 3:59 PM, James Baldassari wrot
ote:
> You mentioned in a previous email that you have a Task Tracker process
> running on each of the nodes. Is there any chance there is a map reduce job
> running?
>
> On 17 February 2010 10:31, James Baldassari wrote:
>
> > On Tue, 2010-02-16 at 16:45 -0600, Stack wr
On Tue, 2010-02-16 at 16:45 -0600, Stack wrote:
> On Tue, Feb 16, 2010 at 2:25 PM, James Baldassari wrote:
> > On Tue, 2010-02-16 at 14:05 -0600, Stack wrote:
> >> On Tue, Feb 16, 2010 at 10:50 AM, James Baldassari
> >> wrote:
> >
> > Whether the ke
On Tue, 2010-02-16 at 14:05 -0600, Stack wrote:
> On Tue, Feb 16, 2010 at 10:50 AM, James Baldassari wrote:
> > Today we added a fourth region server and forced the data to be
> > redistributed evenly by exporting /hbase and then importing it back in
> > (the Hadoop redis
in this page
> http://wiki.apache.org/hadoop/PerformanceTuning.
>
> St.Ack
>
>
> On Mon, Feb 15, 2010 at 11:21 PM, James Baldassari wrote:
> > No, we don't have LZO on the table right now. I guess that's something
> > else that we can try. I'll a
ot;O" and "E". They show the percentage of Old and
> Eden used. If old gen is staying up in the high 90's then there are more
> long-lived objects than available memory...
>
> Cheers,
> Dan
>
> On 16 February 2010 17:54, James Baldassari wrote:
>
>
ing the
> block cache size to 40% you have now given the block cache 1600mb compared
> to the previous 800mb...
>
> Can you give the region servers more memory?
>
> Cheers,
> Dan
>
> On 16 February 2010 17:42, James Baldassari wrote:
>
> > On Tue, 2010-02-16 at 0
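(To spell out the arithmetic behind Dan's numbers: 800 MB at the default 0.2 implies roughly a 4 GB region server heap, so raising hfile.block.cache.size to 0.4 gives about 0.4 x 4000 MB = 1600 MB of block cache, and that extra 800 MB comes out of the memory left for memstores and everything else in the heap.)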
On Tue, 2010-02-16 at 00:14 -0600, Stack wrote:
> On Mon, Feb 15, 2010 at 10:05 PM, James Baldassari wrote:
> > Applying HBASE-2180 isn't really an option at this
> > time because we've been told to stick with the Cloudera distro.
>
> I'm sure they wouldn
increasing the memory allocated to one of the
> > regions and also increasing the "hfile.block.cache.size" to say '0.4' on the
> > same region?
> >
> > On 16 February 2010 11:54, James Baldassari wrote:
> >
> >> Hi Dan. Thanks for your suggesti
ze" property (default is 0.2
> (20%)).
>
> Cheers,
> Dan
>
> On 16 February 2010 10:45, James Baldassari wrote:
>
> > Hi,
> >
> > Does anyone have any tips to share regarding optimization for random
> > read performance? For writes I've found
Hi,
Does anyone have any tips to share regarding optimization for random
read performance? For writes I've found that setting a large write
buffer and setting auto-flush to false on the client side significantly
improved put performance. Are there any similar easy tweaks to improve
random read p
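For reference, the write-side setup described there looks roughly like this with the HTable client (a sketch only; the table and column names and the 12 MB buffer size are made-up examples, not values from this cluster):

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedPutExample {
  public static void main(String[] args) throws IOException {
    HTable table = new HTable(new HBaseConfiguration(), "users");

    // Buffer puts on the client instead of sending one RPC per row.
    table.setAutoFlush(false);
    table.setWriteBufferSize(12 * 1024 * 1024); // 12 MB; tune to your row size

    for (int i = 0; i < 1000000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("data"), Bytes.toBytes("value"), Bytes.toBytes("v" + i));
      table.put(put); // queued locally, flushed automatically when the buffer fills
    }

    table.flushCommits(); // push whatever is still sitting in the buffer
  }
}

Random reads don't benefit from the write buffer, of course; that is where the block cache discussion elsewhere in this thread comes in.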
On 10 February 2010 13:50, Ryan Rawson wrote:
> >
> >> If you stop the source cluster then you can distcp the /hbase to the
> >> other cluster. Done. A perfect copy.
> >>
> >> That is probably the most efficient/highest performing way.
> >>
> >>
Hi,
I'm wondering if it's possible to export all data from one HBase cluster
and import it into another. We have a lot of data that we've imported
into our staging HBase environment, and rather than repeating the
lengthy import process in our production environment we would prefer to
just copy al
I recently had to do something similar both programmatically in a unit
test and from the hbase shell. I'm not using an index, but this is
probably similar to what you're doing:
HTableDescriptor table = new HTableDescriptor("tableName");
HColumnDescriptor column = new HColumnDescriptor("columnFamily");
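The snippet is cut off there, but presumably it goes on to add the family to the table descriptor and create the table through HBaseAdmin. A reconstruction along those lines (names are placeholders, not from the original message):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTableExample {
  public static void main(String[] args) throws Exception {
    HTableDescriptor table = new HTableDescriptor("tableName");
    table.addFamily(new HColumnDescriptor("columnFamily"));

    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
    if (!admin.tableExists("tableName")) {
      admin.createTable(table); // created enabled and immediately usable
    }
  }
}

From the hbase shell the equivalent is just: create 'tableName', 'columnFamily'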
k wrote:
> On Tue, Jan 26, 2010 at 9:03 PM, James Baldassari wrote:
> >
> > After running a map/reduce job which inserted around 180,000 rows into
> > HBase, HBase appeared to be fine. We could do a count on our table, and
> > no errors were reported. We then tried to tr
Hi,
I'm using the Cloudera distribution of HBase, version
0.20.0~1-1.cloudera, in a fully-distributed cluster of 10 nodes. I'm
using all default config options except for hbase.zookeeper.quorum,
hbase.rootdir, hbase.cluster.distributed, and an updated regionservers
file containing all our region