First, I would recommend you try upgrading to HBase 0.20.0. There are a
number of significant improvements to performance and stability. Also,
you have plenty of memory, so give more of it to the HBase Regionserver
(especially if you upgrade to 0.20, give HBase 4GB or more) and you will
see significant performance improvements.
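For example (just a sketch; adjust the number for your deployment), in conf/hbase-env.sh on the regionserver nodes you can raise the heap from the 1GB default and restart them:

    # Heap for the HBase daemons, in MB; the regionservers benefit most from this.
    export HBASE_HEAPSIZE=4000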
ISSUE1: I don't quite understand what you mean about IndexedTable being
faster or slower depending on how you pass it configuration? That doesn't
make much sense to me.
What exactly does your data look like / what are you trying to index?
IndexedTable is NOT known to be very performant. If speed is of utmost
concern, I would recommend you manage secondary indexing yourself, or
look for other solutions like denormalization. If you help us
understand what that query is actually doing we might be able to help
you optimize your schema for it.
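As a rough sketch of what I mean by managing the index yourself (the table
names, column names and keys below are all made up, and this uses the 0.19
BatchUpdate API since that's what you're running): keep a second table keyed
by the value you want to query, and write to both tables on every update.

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ManualIndexSketch {
      public static void main(String[] args) throws IOException {
        HBaseConfiguration conf = new HBaseConfiguration();
        // 'urlsdata' holds the data; 'urlsdata-byhost' is a second, hand-managed
        // table keyed by the value you want to query quickly (both names made up).
        HTable dataTable = new HTable(conf, "urlsdata");
        HTable indexTable = new HTable(conf, "urlsdata-byhost");

        String rowKey = "www.example.com/some/page";  // made-up primary row key
        String host = "www.example.com";              // made-up indexed value

        // Write the data row.
        BatchUpdate dataUpdate = new BatchUpdate(rowKey);
        dataUpdate.put("info:url", Bytes.toBytes("http://" + rowKey));
        dataTable.commit(dataUpdate);

        // Write the index row: key = indexed value + primary key, value = primary
        // key, so a scan over the "www.example.com" prefix of the index table
        // gives back the matching data-table row keys.
        BatchUpdate indexUpdate = new BatchUpdate(host + "|" + rowKey);
        indexUpdate.put("ref:row", Bytes.toBytes(rowKey));
        indexTable.commit(indexUpdate);
      }
    }

A UI lookup is then just a short scan over the index-table prefix for the
value you are searching on, and you control exactly when and how the index
gets written.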
ISSUE2: You're overrunning compactions. Is this during a high-load
import? You also seem to be having HDFS problems underneath, if the
regionserver is actually exiting on you. What do the datanode logs
show? There are lots of ways to deal with issues like that... explain
more about what exactly you are doing during this time and perhaps there
is a better strategy.
ISSUE3: This is an unfortunate and still-open issue in the multi-put
facilities of HBase (at least as far as I know). Work is being done right
now as part of a rework of the batching facilities, and hopefully it will be
fixed in time for 0.20.1. Until then, you may have to be more intelligent in
your client: figure out when you have failures and retry them yourself.
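Something along these lines, purely as a sketch on the 0.19 client API (the
class and all names are made up): keep your own buffer of BatchUpdates so a
failed commit doesn't lose them, and resubmit the whole batch.

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;

    public class RetryingCommitter {
      private final HTable table;
      private final List<BatchUpdate> pending = new ArrayList<BatchUpdate>();

      public RetryingCommitter(HTable table) {
        this.table = table;
      }

      public void add(BatchUpdate update) {
        pending.add(update);
      }

      // Try to commit everything buffered so far; on failure keep the buffer
      // so the same updates can be resubmitted on the next attempt.
      public void flush(int attempts, long sleepMs) throws IOException {
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
          try {
            table.commit(pending);   // multi-row commit(List<BatchUpdate>)
            pending.clear();
            return;
          } catch (IOException e) {  // e.g. RetriesExhaustedException
            last = e;
            try {
              Thread.sleep(sleepMs);
            } catch (InterruptedException ie) {
              Thread.currentThread().interrupt();
            }
          }
        }
        throw last;
      }
    }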
Maybe someone else on the list has a good solution for dealing with this
problem?
JG
[email protected] wrote:
Hi all,
We are in the process of evaluating HBase for managing a "bigtable" (to give an
idea, ~1G entries of 500 bytes each). We are now facing some issues and I would like to
have comments on what I have noticed.
Our configuration is Hadoop 0.19.1 and HBase 0.19.3 (both hadoop-default/site.xml and
hbase-default/site.xml are attached), with 15 nodes (16 or 8 GB RAM and 1.3 TB disk, Linux
kernel 2.6.24-standard, Java version "1.6.0_12").
For now the test case runs against one IndexedTable (without, for the moment, using the
index column) with 25M entries/rows:
The map formats the data and 15 reducers BatchUpdate the textual data (URLs and
simple text fields < 500 bytes).
All processes (Hadoop/HBase) are started with -Xmx1000m and the IndexedTable is
configured with AutoCommit set to false.
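Simplified, what each reduce does with one record looks roughly like this
(the column names here are made up, and our real code goes through the
IndexedTable wrapper):

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ReduceWriteSketch {
      // One BatchUpdate per record, keyed by the url.
      static void writeRecord(HTable table, String url, String text)
          throws IOException {
        BatchUpdate update = new BatchUpdate(url);
        update.put("content:url", Bytes.toBytes(url));
        update.put("content:text", Bytes.toBytes(text));
        // With AutoCommit off these commits are buffered until actually flushed.
        table.commit(update);
      }
    }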
ISSUE1: We need one indexed column to support "fast" UI queries (for instance, in
response to a Web form we could expect to wait at most 30 seconds). The only documentation
I found concerning indexed columns comes from
http://rajeev1982.blogspot.com/2009/06/secondary-indexes-in-hbase.html
Instead of putting the indextable properties in hbase-site.xml (which I have
tested, but which gives very poor performance and also loses entries...), I pass
the properties to the job through '-conf indextable_properties.xml' (file
attached). I suppose that putting the indextable properties into
hbase-site.xml applies them to the whole HBase cluster, which makes overall
performance decrease significantly?
The best performance was reached by passing them through the -conf option when
the job is run through Tool/ToolRunner.
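Concretely we launch the job like this (the jar and class names below are just
placeholders):

    hadoop jar ourjob.jar com.example.UrlsDataLoad \
        -conf indextable_properties.xml \
        <input> <output>

The job class implements Tool and is started through ToolRunner.run(), so the
GenericOptionsParser merges indextable_properties.xml into that job's
configuration only, instead of changing hbase-site.xml for the whole cluster.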
ISSUE2: We are facing serious regionserver problems, often leading to
regionserver shutdowns, such as:
2009-09-16 10:21:15,887 INFO org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Too many
store files for region
urlsdata-validation,forum.telecharger.01net.com/index.php?page=01net_voter&forum=microhebdo&category=5&topic=344142&post=5653085,1253089082422:
23, waiting
or
2009-09-14 16:39:24,611 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking
updates for 'IPC Server handler 1 on 60020' on region
urlsdata-validation,www.abovetopsecret.com/forum/thread119/pg1&title=Underground+Communities,1252939031807:
Memcache size 128.0m is >= than blocking 128.0m size
2009-09-14 16:39:24,942 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Could not read from stream
2009-09-14 16:39:24,942 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block
blk_-873614322830930554_111500
2009-09-14 16:39:31,180 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery
for block blk_-873614322830930554_111500 bad datanode[0] nodes == null
2009-09-14 16:39:31,181 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block
locations. Source file
"/hbase/urlsdata-validation/1733902030/info/mapfiles/2690714750206504745/data"
- Aborting...
2009-09-14 16:39:31,241 FATAL
org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog required.
Forcing server shutdown
I have read some HBase JIRA issues (HBASE-1415, HBASE-1058, HBASE-1084...)
concerning similar problems,
but I cannot get a clear idea of what kind of fix is proposed.
ISSUE3: These problems cause table.commit() to throw an IOException, losing all
the entries:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server 192.168.255.8:60020 for region
urlsdata-validation,twitter.com/statuses/434272962,1253089707924, row
'www.harmonicasurcher.com', but failed after 10 attempts.
Exceptions:
java.io.IOException: Call to /192.168.255.8:60020 failed on local exception:
java.io.EOFException
java.net.ConnectException: Call to /192.168.255.8:60020 failed on connection exception: java.net.ConnectException: Connection refused
Is there a way to get back the uncommitted entries (there are many of them
because AutoCommit is false) so that we can resubmit them later?
To give an idea, we sometimes lose about 170,000 entries out of 25M due to this
commit exception.
Guillaume Viland ([email protected])
FT/TGPF/OPF/PORTAIL/DOP Sophia Antipolis