Hi, I know regions are reassigned when an HBase cluster restarts. My
regionserver and my datanode sit on the same physical node. In my tests,
performance drops after I restart the HBase cluster; I guess this is due to
a data locality problem. But in a further experiment, I increase the
For example, I have a fixed total amount of data and I can tune
hbase.hregion.max.filesize to increase/decrease the total region number, right?
I want to know if the region number has a performance impact on random read
tests. I observed that in my YCSB test, with a larger hfile size, I got
better throughput and smaller
to increase hbase.hregion.memstore.flush.size to keep
the number of HFile generations smaller.
Thanks,
--
Tatsuya Kawano (Mr.)
Tokyo, Japan
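
As a rough illustration of how hbase.hregion.max.filesize relates to the region count discussed in this thread, here is a minimal Java sketch. It assumes a single column family and that a region splits once it grows past the configured size, so the result is only a lower bound (freshly split regions are about half the maximum size). The class name, method name, and the 100 GB data size are hypothetical.

```java
class RegionCountSketch {
    // Lower-bound estimate: total data divided by the max region file size,
    // rounded up. Real clusters usually hold more regions than this because
    // a split produces two regions of roughly half the maximum size.
    static long estimateRegions(long totalBytes, long maxFileSizeBytes) {
        if (totalBytes < 0 || maxFileSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long regions = (totalBytes + maxFileSizeBytes - 1) / maxFileSizeBytes;
        return Math.max(1, regions);
    }

    public static void main(String[] args) {
        long totalBytes = 100L * 1024 * 1024 * 1024; // hypothetical 100 GB data set
        long mb256 = 256L * 1024 * 1024;
        long gb1 = 1024L * 1024 * 1024;
        System.out.println(estimateRegions(totalBytes, mb256)); // 400
        System.out.println(estimateRegions(totalBytes, gb1));   // 100
    }
}
```

Raising the maximum file size by 4x cuts the estimated region count by 4x, which is the knob the question refers to. Whether fewer, larger regions help random reads is something to measure rather than assume.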
On Jan 18, 2011, at 11:20 AM, Tao Xie xietao.mail...@gmail.com wrote:
For example, I have total some data and I can tune
hbase.hregion.max.filesize
retrieving data from disk is the most dominant element, until you are
fully cached in which case other factors inside the regionserver
become dominant. At this point copying memory, GC, algorithmic
complexity, etc. become important.
On Wed, Jan 12, 2011 at 10:54 PM, Tao Xie xietao.mail...@gmail.com
Hi, I know that generally a regionserver manages HRegions and that in the HDFS
layer the data in an HRegion is stored in HFile format. I want to know whether
HFiles are all open and things like the block index are all loaded first to
improve lookup performance? If so, what will happen if the memory limit is exceeded?
Thanks.
includes loading up of the file index and metadata.
In our experience, this overhead has been small. It's currently not
accounted for in our general memory-counting. We should for sure add
it.
St.Ack
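
To make the role of an in-memory block index concrete, here is a minimal Java sketch. It is not HBase's HFile code: it only assumes that the index maps the first key of each block to the block's file offset, so a lookup finds the one block that could hold a key without touching disk. The class name, method names, and sample keys are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

class BlockIndexSketch {
    // First key of each block -> offset of that block in the file.
    private final TreeMap<String, Long> firstKeyToOffset = new TreeMap<>();

    void addBlock(String firstKey, long offset) {
        firstKeyToOffset.put(firstKey, offset);
    }

    // Returns the offset of the only block that could contain the key,
    // or -1 if the key sorts before the first block in the file.
    long locateBlock(String key) {
        Map.Entry<String, Long> entry = firstKeyToOffset.floorEntry(key);
        return entry == null ? -1L : entry.getValue();
    }

    public static void main(String[] args) {
        BlockIndexSketch index = new BlockIndexSketch();
        index.addBlock("row0000", 0L);
        index.addBlock("row1000", 65536L);
        index.addBlock("row2000", 131072L);
        System.out.println(index.locateBlock("row1500")); // 65536
        System.out.println(index.locateBlock("a"));       // -1
    }
}
```

The index holds one entry per block rather than one per key, which fits the observation above that its memory overhead has been small.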
On Wed, Jan 12, 2011 at 7:51 PM, Tao Xie xietao.mail...@gmail.com wrote:
hi, I know generally
I see there is a block cache percentage configuration in hbase-site.xml. I
wonder if there is a row cache that stores k,v pairs.
Thanks.
, it can take a while
for regions to re-online. There could be another issue in the way of
the region re-onlining. Grepping around in the logs as per above
should give a clue.
St.Ack
On Thu, Dec 9, 2010 at 10:00 PM, Tao Xie xietao.mail...@gmail.com wrote:
hi, all
I met this exception when I
hi, all
I met this exception when doing intensive insertions using YCSB. Can anybody
give me some clues on this? I use HBase 0.20.6.
com.yahoo.ycsb.DBException:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server -- nothing found, no 'location' returned,
I read the code and my understanding is that when a RS starts, the StoreFiles
of each Region will be instantiated. Then HFile.Reader.loadFileInfo() will read
the index and file info.
So each StoreFile is opened only once and its block index is cached. The cache
misses are for blocks. I mean for random Get
I once had the same problem. In the end I found the RSs had not been started.
2010/10/26 Bradford Stephens bradfordsteph...@gmail.com
Hey datamigos,
I'm having trouble getting a finicky .20.6 cluster to behave.
The Master, Zookeeper, and RegionServers all seem to be happy --
except the Master doesn't see
I also got a similar result with YCSB. I disabled the block cache (set to 0)
and got better throughput than with the default.
In my case the dataset is 160M records and the block cache hit ratio is very
low, so frequent cache evictions cause long pauses.
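
To illustrate why the hit ratio can be very low when the data set is far larger than the cache, here is a minimal Java sketch of an LRU cache under uniformly random reads. It is not HBase's block cache; the class name, method name, and sizes are hypothetical, and real workloads with skewed key popularity will behave differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

class LruHitRatioSketch {
    // Simulates uniformly random reads over `distinctKeys` keys against an
    // LRU cache holding at most `capacity` entries; returns hits / accesses.
    static double hitRatio(int capacity, int distinctKeys, int accesses, long seed) {
        // accessOrder = true makes LinkedHashMap evict the least recently used entry.
        Map<Integer, Boolean> cache = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > capacity;
            }
        };
        Random random = new Random(seed);
        int hits = 0;
        for (int i = 0; i < accesses; i++) {
            int key = random.nextInt(distinctKeys);
            if (cache.get(key) != null) {
                hits++;
            } else {
                cache.put(key, Boolean.TRUE); // miss: load and possibly evict
            }
        }
        return (double) hits / accesses;
    }

    public static void main(String[] args) {
        // Cache holds 1% of the keys: almost every read misses and evicts.
        System.out.println(hitRatio(100, 10_000, 100_000, 42L));
        // Cache holds every key: only first-touch reads miss.
        System.out.println(hitRatio(10_000, 10_000, 100_000, 42L));
    }
}
```

With uniform access the hit ratio is roughly the cache size divided by the data set size, so nearly every read pays for both a miss and an eviction. That is consistent with the report above, though it does not by itself show that evictions cause the pauses.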
2010/10/21 Ryan Rawson ryano...@gmail.com
Our own
I applied the patch for HBASE-2939. (The patch is for 0.89 but my code is
0.20.6. I checked the patch and found it only changes one connection thread
on the client side to a pool strategy.)
But when I rebuild the source and start the HBase cluster, the master cannot
recognize the regionservers though they are
hi, all
I set hdfs replica=1 when running HBase, and a DN and an RS co-exist on each
slave node. So the data in the regions managed by an RS will be stored on its
local data node, right?
But when I restart HBase and the HBase client does gets on an RS, the datanode
will read data from remote data nodes. Does that
Maybe a stupid question. I have set export HBASE_MANAGES_ZK=true and provided
one ZK in hbase-site.xml. In my example, I only set the server sr114 as ZK.
But I still find ZooKeeper will check other quorum servers. I wonder where
it reads the server list from. Confused about this. Anybody can give me a
Resolved. A stupid error I made. Sorry for this.
2010/9/28 Tao Xie xietao.mail...@gmail.com
Maybe a stupid question. I have set export HBASE_MANAGES_ZK=true and
provide one ZK in hbase-site.xml. In my example, I only set the server sr114
as zk. But I still find zookeeper will check other
I want to reproduce the results in the ycsb paper. I run hbase 0.20.6 and
hadoop 0.20.2. My cluster is like this:
1 Node as HMaster + ZK
6 Nodes as DN, RS
1 Node as Hbase client.
I think this environment is something like the one used by the paper.
When I run tests like workloadb with 100
My scenario now is running YCSB with a heavy read load. I compared the results
of setting hfile.block.cache.size to 0.2 and to 0. I found that with the factor
0 the HBase metric 'get_avg_time' is even smaller. Maybe I should turn off the
block cache in such a scenario. I wonder if there are performance tests show
to take
as long as 500 ms. I will attach a snippet of that if necessary.
Thanks.
2010/9/19 Ryan Rawson ryano...@gmail.com
What does your GC situation look like?
On Sun, Sep 19, 2010 at 1:05 AM, Tao Xie xietao.mail...@gmail.com wrote:
Now my scenario is running ycsb doing heavy read. I
Here is the gc log: http://pastebin.com/1bGZvMri
2010/9/19 Ryan Rawson ryano...@gmail.com
I'd love to see a GC log, and yes it can be possible for ParNew to
take a long long time.
Thanks,
-ryan
On Sun, Sep 19, 2010 at 1:20 AM, Tao Xie xietao.mail...@gmail.com wrote:
At first when I
I see the following recommendation in
http://hbase.apache.org/docs/r0.20.6/api/overview-summary.html#requirements
It is recommended to run a ZooKeeper quorum of 3, 5 or 7 machines, and give
each ZooKeeper server around 1GB of RAM, and if possible, its own dedicated
disk. For very heavily loaded
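
The preference for odd quorum sizes follows from majority voting: an ensemble of n servers stays available only while a majority of them is up. A minimal Java sketch of that arithmetic (the class and method names are hypothetical):

```java
class QuorumSketch {
    // Smallest number of servers that forms a majority of the ensemble.
    static int majority(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must have at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can fail while a majority is still up.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5, 6, 7}) {
            System.out.println(n + " servers tolerate " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

An ensemble of 4 tolerates one failure, the same as an ensemble of 3, and 6 tolerates two, the same as 5, so the even sizes add a machine without adding fault tolerance.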
hi, all
I use YCSB to measure the insert/read latency of hbase.
I found stretches of up to 10 seconds during the insertion procedure in which
0 records are inserted.
See the following result at second 1514. I want to know why this occurs. Is
this due to compaction?
And I also want to know why the ops/sec
probably be
smoother, but do you really have a use case that requires it or just
poking?
J-D
On Thu, Sep 9, 2010 at 7:32 PM, Tao Xie xietao.mail...@gmail.com wrote:
hi, all
I use YCSB to measure the insert/read latency of hbase.
I found there will be 0 records inserted in up to 10 seconds
change what is in
HDFS.
There are some bugs in HDFS in 0.20 which can create this out-of-balance
scenario.
If you use CDH3b2 you should have a few patches which help to rectify the
situation, in particular HDFS-611.
Thanks
-Todd
JG
-Original Message-
From: Tao Xie
I had a look at the following method in 0.89. Is the following line
correct?
nRegions *= e.getValue().size();
private int regionsToGiveOtherServers(final int numUnassignedRegions,
final HServerLoad thisServersLoad) {
SortedMap<HServerLoad, Set<String>> lightServers =
new
280G 14G 252G 6% /mnt/DP_disk1
10.1.0.126: /dev/sdc1 280G 14G 252G 6% /mnt/DP_disk2
10.1.0.126: /dev/sdd1 280G 13G 253G 5% /mnt/DP_disk3
2010/9/7 Tao Xie xietao.mail...@gmail.com
I have a look at the following method in 0.89