(This could be a known issue. Please let me know if it is).
We had a set of uncompacted store files in a region. One of the column families
had a store file of 5 Gigs. The other column families were pretty small (a few
megabytes at most).
It so turned out that all these files had rows whose
Have you read the following thread?
ScannerTimeoutException when a scan enables caching, no exception when it
doesn't
Did you enable caching? If not, it is a different issue.
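For reference, "enabling caching" on the 0.90-era client looks roughly like this (a sketch only; the table setup is omitted and the caching value is arbitrary):

import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

// Sketch only: with caching > 1 the region server gathers that many rows per
// next() RPC, so a slow gather can outlive the scanner lease and surface as a
// ScannerTimeoutException on the client.
public class CachingScanExample {
  public static void scanAll(HTable table) throws IOException {
    Scan scan = new Scan();
    scan.setCaching(500); // rows fetched per RPC; the 0.90 default is 1
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // process r ...
      }
    } finally {
      scanner.close();
    }
  }
}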
On Wed, Apr 13, 2011 at 12:40 AM, Vidhyashankar Venkataraman
vidhy...@yahoo-inc.com wrote:
(This could be a known issue.
Thanks J-D
I made sure to pass conf objects around in places where I wasn't..
will give it a try
-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Tue, Apr 12, 2011 6:40 pm
Subject: Re: hbase -0.90.x upgrade - zookeeper exception
Hi there-
For what it's worth, although we haven't had this particular issue we've
certainly had other bumps and bruises (GC of death, and other metadata issues
caused when a split dies during a GC of death, etc.). But there are a few
general items that helped with stability and performance I
Greetings,
I'm trying to backport CopyTable to HBase 0.20.6.
In other words, the challenge is to write a job that would copy data from
one HTable on cluster A to another HTable on cluster B.
I'm able to copy HTable to another HTable on the same cluster, but I can not
find a way to point to the
Hi
We had enabled scanner caching, but I don't think it is the same issue,
because scanner.next in this case is blocking: the scanner is busy in the
region server but hasn't returned anything, since no row to return has been
found yet (all rows have expired but are still there since
Hi Jean-Daniel,
thx for your reply.
What I assume is that the total network load during reduce is O(n) with n the
number of nodes in the cluster. We saw a major performance loss in the reduce
step when our network degraded to 100Mbit by accident (1h vs. 13 minutes).
With more nodes I see 2
Hi Vidhya,
So it sounds like the timeout thread is timing out the scanner when it takes
more than 60 seconds reading through the large column family store file
without returning anything to the client? Even without the TTL expiration
being applied, I think I've heard of this in other cases where
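For context, the 60-second window here is the region server's scanner lease; in 0.90 it is governed by hbase.regionserver.lease.period (default 60000 ms). As a hedged sketch only, raising it in hbase-site.xml papers over the symptom while the scanner behaviour itself is addressed:

<property>
  <name>hbase.regionserver.lease.period</name>
  <!-- default is 60000 (60s); raising it is only a stopgap for scans that
       return nothing to the client for long stretches -->
  <value>300000</value>
</property>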
This could be HBASE-2077
J-D
On Wed, Apr 13, 2011 at 9:15 AM, Gary Helmling ghelml...@gmail.com wrote:
Hi Vidhya,
So it sounds like the timeout thread is timing out the scanner when it takes
more than 60 seconds reading through the large column family store file
without returning anything
Looks like the most recent patch for HBASE-2077 does try to address this
with the usage counter. That may be the more correct approach, but I was
wondering if we could do something simpler by periodically renewing the
lease down in the RegionScanner iteration? Sort of like calling progress()
On Wed, Apr 13, 2011 at 10:03 AM, Vidhyashankar Venkataraman
vidhy...@yahoo-inc.com wrote:
Even without the TTL expiration being applied, I think I've heard of
this in other cases where a very
restrictive filter was used on a large table scan.
Thanks, I was about to say that in a follow-up
Vidhya, the patch in that jira is stale, needs some love.
Gary, the AtomicInteger is just there to permit multiple users of a
single Lease, not very common so can be changed.
The issue with setting some sort of progress is that the Lease is
sleeping so you cannot change its sleeping time. You
Vidhya,
Did you try setting the scanner time range? It takes min and max timestamps, and
when instantiating the scanner at the RS, time-based filtering is done to
include only the selected store files. Have a look at StoreFile.shouldSeek(Scan,
SortedSet<byte[]>). I think it should improve the response time.
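For readers hitting the same thing, setting the time range looks roughly like this on the 0.90 client (the timestamps are placeholders):

import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

// Sketch only: store files whose time range falls entirely outside [min, max)
// can be skipped when the scanner is instantiated on the region server.
public class TimeRangeScanExample {
  public static ResultScanner openScanner(HTable table) throws IOException {
    Scan scan = new Scan();
    scan.setTimeRange(1302000000000L, 1302700000000L); // placeholder epoch millis
    return table.getScanner(scan);
  }
}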
Himanshu,
Thanks, this will resolve the particular case we ran into. But what if the
files are huge and have a wide range of timestamps and only some of the records
in the file are valid? And for the other application that we have: scans with
filters that return a sparse set, the solution
Hi Doug,
3) Cluster restart
We schedule a full shutdown and restart of our cluster each
week. It's pretty quick, and HBase just seems happier
when we do this.
Can you say a bit more about how HBase is happier versus not?
I can speculate on a number of reasons why this may be the case,
This sounds like HBASE-2014: https://issues.apache.org/jira/browse/HBASE-2014
BTW apologies for the weird English in that issue, it appears I cut and pasted
a request from our China development center without sufficient editing.
- Andy
Vidhya, so yes, in the case of huge files with valid rows the timerange approach
will not be effective, and neither will it help when a scanner hangs in its
next calls because of a GC pause or some exhaustive computation. I voted for
this answer after reading your initial mail (but it got posted after a
The problem I'm having is in getting the conf that is used to init the table
within TableInputFormat. That's the one that's leaving open ZK connections for
me.
Following the code through, TableInputFormat initializes its HTable with new
Configuration(new JobConf(conf)), where conf is the
Reuben:
Yes, I have the exact same issue now. I'm also kicking off from another JVM
that runs forever.
I don't have an alternate solution: either modify the HBase code, or modify my
code to kick off
as a standalone JVM, or hopefully 0.90.3 is released soon :)
J-D/St.Ack may have some suggestions
V
HConnectionManager needed some modifications in order to make it work,
it's not just about backporting that job.
J-D
On Wed, Apr 13, 2011 at 7:27 AM, Manuel de Ferran
manuel.defer...@gmail.com wrote:
Greetings,
I'm trying to backport CopyTable to HBase 0.20.6.
In other words, the challenge
Context: we're still on .89 - so we can't take advantage of the MemStore
allocation buffers yet. One of the most important metrics for us was GC-stuck
region servers, and more nodes + more memory + scheduling periodic cluster
restarts helped in our situation. I wholeheartedly agree with the
Like I said, it's a zookeeper configuration that you can change. If
hbase is managing your zookeeper then set
hbase.zookeeper.property.maxClientCnxns to something higher than 30
and restart the zk server (can be done while hbase is running).
J-D
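For the archive, the change J-D describes amounts to something like the following in hbase-site.xml on the node(s) running the HBase-managed ZooKeeper, followed by a restart of the ZK server; the value 2000 is the one Venkatesh mentions below, so treat this as a sketch rather than a recommendation:

<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <!-- per-client-IP connection cap enforced by ZooKeeper; the old default was 30 -->
  <value>2000</value>
</property>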
On Wed, Apr 13, 2011 at 12:04 PM, Venkatesh
Will do, I'll set it to 2000 as per the JIRA.
Do we need a periodic bounce? Because if this error comes up, the only way I can
get the mapreduce to work
is to bounce.
-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Wed, Apr 13, 2011 3:22
Periodic bounce of what? Your client program or the ZK server?
J-D
On Wed, Apr 13, 2011 at 12:31 PM, Venkatesh vramanatha...@aol.com wrote:
Will do, I'll set it to 2000 as per the JIRA.
Do we need a periodic bounce? Because if this error comes up, the only way I
can get the mapreduce to work
is
To bring it back to the original point and a high-level view, the fact
is that HBase is not Oracle, nor MySQL. It doesn't have multiple
decades of development behind it, and furthermore distributed systems are
inherently more difficult (more failure cases) than single-node DBs. Having said
that, the grass is certainly not
The problem is the connections are never closed... so they just keep piling up
until it hits the max. My max is at 400 right now, so after 14-15 hours of
running, it gets stuck in an endless connection retry.
I saw that the HConnectionManager will kick older HConnections out, but the
problem
Yeah for a JVM running forever it won't work.
If you know for a fact that the configuration passed to TIF won't be
changed then you can subclass it and override setConf to not clone the
conf.
J-D
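Roughly, and untested against 0.90 (the class name is illustrative, and a real version would also build the Scan the way TableInputFormat.setConf does), the subclass idea looks like this:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;

// Sketch of the suggestion: keep the caller's Configuration instead of letting
// TableInputFormat clone it, so the HTable's ZooKeeper connection is keyed on a
// conf object the long-running JVM still holds and can clean up later.
public class SharedConfTableInputFormat extends TableInputFormat {
  private Configuration conf;

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
    try {
      setHTable(new HTable(conf, conf.get(INPUT_TABLE)));
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    // A full version would also parse the scan attributes here and call
    // setScan(...) the way TableInputFormat.setConf does.
    setScan(new Scan());
  }
}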
On Wed, Apr 13, 2011 at 12:45 PM, Ruben Quintero rfq_...@yahoo.com wrote:
The problem is the
Venkatesh, I guess the two quick and dirty solutions are:
- Call deleteAllConnections(bool) at the end of your MapReduce jobs, or
periodically. If you have no other tables or pools, etc. open, then no problem.
If you do, they'll start throwing IOExceptions, but you can re-instantiate them
with
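A rough sketch of the first option, assuming a long-running driver that submits jobs (the job name is a placeholder and job setup is elided):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.mapreduce.Job;

// Sketch only: after each MapReduce run, drop every cached HConnection so the
// long-running JVM stops accumulating ZooKeeper connections. Note this also
// kills connections that any other live HTable/HTablePool in the JVM is using.
public class ScanJobRunner {
  public static void runOnce(Configuration conf) throws Exception {
    Job job = new Job(conf, "hbase-scan-job"); // placeholder job name
    // ... TableMapReduceUtil.initTableMapperJob(...) etc. elided ...
    try {
      job.waitForCompletion(true);
    } finally {
      HConnectionManager.deleteAllConnections(true); // true also stops the RPC proxies
    }
  }

  public static void main(String[] args) throws Exception {
    runOnce(HBaseConfiguration.create());
  }
}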
deleteAllConnections works well for my case. I can live with this, but not with
connection leaks.
Thanks for the idea.
Venkatesh
-Original Message-
From: Ruben Quintero rfq_...@yahoo.com
To: user@hbase.apache.org
Sent: Wed, Apr 13, 2011 4:25 pm
Subject: Re: hbase -0.90.x upgrade
Hi all,
I'm with a startup, GotoMetrics, doing things with Hadoop and I've gotten
permission to open source Orderly -- our row key schema system for use in
projects like HBase. Orderly allows you to serialize common data types
(long, double, bigdecimal, etc) or structs/records of these types to
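For anyone unfamiliar with the idea, here is a generic illustration of order-preserving key encoding (this is not Orderly's actual API): the serialized bytes are meant to compare the same way the original values do, e.g. for a signed long you flip the sign bit and write big-endian.

import java.util.Arrays;

// Generic illustration only (not Orderly's API): flipping the sign bit makes
// unsigned byte-wise comparison of the encoding match signed long order.
public class OrderPreservingLong {
  public static byte[] encode(long v) {
    long flipped = v ^ Long.MIN_VALUE; // map signed order onto unsigned order
    byte[] b = new byte[8];
    for (int i = 7; i >= 0; i--) {
      b[i] = (byte) (flipped & 0xff);
      flipped >>>= 8;
    }
    return b;
  }

  private static int compareUnsigned(byte[] x, byte[] y) {
    for (int i = 0; i < x.length; i++) {
      int d = (x[i] & 0xff) - (y[i] & 0xff);
      if (d != 0) return d;
    }
    return 0;
  }

  public static void main(String[] args) {
    // -5 sorts before 3 both as longs and as encoded byte arrays.
    byte[] a = encode(-5L);
    byte[] c = encode(3L);
    System.out.println(Arrays.toString(a));
    System.out.println(compareUnsigned(a, c) < 0); // prints true
  }
}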
Michael (and GotoMetrics),
Thank you for opening this up!
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via
Tom White)
--- On Wed, 4/13/11, Michael Dalton mwdal...@gmail.com wrote:
Hi all,
I'm with a startup, GotoMetrics, doing
2011-04-13 20:27:08,620 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: Error closing cjjHTML,
http://www.csh.gov.cn/article_346937.html,1299079217805
java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:234)
at
This is on HBase version 0.90.1.
Has anyone run into this?
HMaster logs:
2011-04-08 16:33:09,384 ERROR org.apache.hadoop.hbase.master.AssignmentManager:
Region has been OPEN for too long, we don't know where region was opened so
can't do anything
2011-04-08 16:33:09,384 ERROR
Michael,
Interesting contribution to the open source community. Sounds like nice
work.
Can you say how this relates to Avro with regard to collating of binary
data?
See, for instance, here: http://avro.apache.org/docs/current/spec.html#order
On Wed, Apr 13, 2011 at 5:55 PM, Michael Dalton
How do I persist data from my Spring/Java application to HBase? Currently I
am trying to use a datanucleus plugin and connect JPA with HBase. Is this
the best way or is there some other method I could use?
--
With Regards,
Jr.