Hi, Ron.
Indeed, I have rewritten the implementation of
org.apache.hadoop.hbase.regionserver.metrics.RegionServerMetrics
org.apache.hadoop.hbase.regionserver.metrics.RegionServerDynamicMetrics
The major changes are as follows:
1) changed the context name for HBase so that each has a separate name;
Hi,
I am using HBase 0.94.11 and I feel a bit confused when looking at the log
file below:
13/09/24 13:11:00 INFO regionserver.Store: Flushed , sequenceid=687077, memsize=122.1m, into tmp file hdfs://192.168.123.123:54310/hbase/usertable/b19289cf9b1400
We are planning to migrate from CDH3 to CDH4. As part of this migration we
are also planning to bring HBase into our system, because it also supports
updates to the data; in CDH3 we are using Hive as a warehouse.
Here we are having a major problem in the migration: Hive supports partitions
on tables. And our
Hi,
I am very new to HBase. Could you please let me know how to insert spatial
data (latitude/longitude) into HBase using Java.
There're plenty of examples in unit tests.
e.g.:
Put put = new Put(Bytes.toBytes(row + String.format("%1$04d", i)));
put.add(family, null, value);
table.put(put);
value can be obtained through Bytes.toBytes().
table is an HTable.
Cheers
On Tue, Sep 24, 2013 at 4:15 AM, cto
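To make that concrete for the latitude/longitude question, here is a minimal
self-contained sketch; the table name "geo", the family "loc", and the row key
are my own assumptions for illustration, not anything the thread prescribes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class GeoInsert {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "geo");       // hypothetical table
    Put put = new Put(Bytes.toBytes("poi-0001")); // hypothetical row key
    // Store latitude and longitude as 8-byte doubles under family "loc"
    put.add(Bytes.toBytes("loc"), Bytes.toBytes("lat"), Bytes.toBytes(40.7128d));
    put.add(Bytes.toBytes("loc"), Bytes.toBytes("lon"), Bytes.toBytes(-74.0060d));
    table.put(put);
    table.close();
  }
}

If you later need spatial range scans, the usual trick is to encode the
coordinates into the row key itself (e.g. a geohash), since HBase only sorts
and scans by key.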
Inline
On Mon, Sep 23, 2013 at 6:15 PM, Shahab Yunus shahab.yu...@gmail.com wrote:
Yeah, I saw that. In fact, that is why I recommended it to you, as I
couldn't infer from your email whether you had already gone through
that source or not.
Yes, I was aware of that article. But my read
I only know of the links already embedded in the blog page that I sent
you. Otherwise, there is this:
https://groups.google.com/forum/#!forum/hbasewd
Regards,
Shahab
On Tue, Sep 24, 2013 at 11:12 AM, anil gupta anilgupt...@gmail.com wrote:
Inline
On Mon, Sep 23, 2013 at 6:15 PM, Shahab Yunus
Hi Tom,
What is your table schema for this region? How many CFs? Also, what do you
have on the logs for this table?
Thanks,
JM
2013/9/24 Tom Brown tombrow...@gmail.com
I have a region that is very small, only 5MB. Despite its size, it has 24
store files. The logs show that it's compacting
It would help if you can show your RS log (via pastebin?). Are there
frequent flushes for this region too?
On Tue, Sep 24, 2013 at 9:20 PM, Tom Brown tombrow...@gmail.com wrote:
I have a region that is very small, only 5MB. Despite its size, it has 24
store files. The logs show that it's
There is one column family, d. Each row has about 10 columns, and each
row's total data size is less than 2K.
Here is a small snippet of logs from the region server:
http://pastebin.com/S2jE4ZAx
--Tom
On Tue, Sep 24, 2013 at 9:59 AM, Bharath Vissapragada bhara...@cloudera.com
wrote:
It
Can you paste logs from a bit before that? To see if anything triggered the
compaction?
Before the 1M compaction entries.
Also, what is your setup? Are you running in Standalone? Pseudo-Dist?
Fully-Dist?
Thanks,
JM
2013/9/24 Tom Brown tombrow...@gmail.com
There is one column family, d. Each row
Just wanted to follow up here with a little update. We enabled the Aggregation
coprocessor on our dev cluster. Here are the quick timing stats.
Tables: 565
Total Rows: 2,749,015,957
Total Time (to count): 52m:33s
Will be interesting to see how this fares against our production clusters with
My cluster is fully distributed (2 regionserver nodes).
Here is a snippet of log entries that may explain why it started:
http://pastebin.com/wQECif8k. I had to go back 2 days to find when it
started for this region.
This is not the only region experiencing this issue (but this is the
smallest
Hey Anil,
The solution you've described is the best we've found for Phoenix (inspired
by the work of Alex at Sematext).
You can do all of this in a few lines of SQL:
CREATE TABLE event_data(
who VARCHAR, type SMALLINT, id BIGINT, when DATE, payload VARBINARY
CONSTRAINT pk PRIMARY KEY
Strange.
Few questions then.
1) What is your hadoop version?
2) Is the clock on all your servers synched with NTP?
3) What is your table definition? Bloom filters, etc.?
This is the reason why it keeps compacting:
2013-09-24 10:04:00,548 INFO
Another important piece of information that might be the root cause of this issue...
Do you have any TTL defined for this table?
JM
2013/9/24 Jean-Marc Spaggiari jean-m...@spaggiari.org
Strange.
Few questions then.
1) What is your hadoop version?
2) Is the clock on all your servers synched with
1. Hadoop version is 1.1.2.
2. All servers are synched with NTP.
3. Table definition is: 'compound0', {
  NAME => 'd',
  DATA_BLOCK_ENCODING => 'NONE',
  BLOOMFILTER => 'ROW',
  REPLICATION_SCOPE => '0',
  VERSIONS => '1',
  COMPRESSION => 'SNAPPY',
  MIN_VERSIONS => '0',
  TTL => '864',
  KEEP_DELETED_CELLS => 'false',
memstoreSize of 128.2m was recorded at the beginning
of HRegion#internalFlushcache().
After the flush, memstoreSize became 48.0m.
Cheers
On Tue, Sep 24, 2013 at 3:50 AM, aiyoh79 tcheng...@gmail.com wrote:
Hi,
I am using HBase 0.94.11 and I feel a bit confused when looking at the log
file
TTL seems to be fine.
-1 is the default value for TimeRangeTracker.maximumTimestamp.
Can you run:
hadoop fs -lsr hdfs://hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/
Thanks,
JM
2013/9/24 Tom Brown tombrow...@gmail.com
1. Hadoop version is 1.1.2.
-rw--- 1 hadoop supergroup 2194 2013-09-21 14:32
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1
-rw--- 1 hadoop supergroup 31321 2013-09-24 05:49
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1305d625bd4a4be39a98ae4d91a66140
On flushing we do some cleanup, like removing deleted data that was already
in the MemStore or extra versions. Could it be that you are overwriting
recently written data?
48MB is the size of the Memstore that accumulated while the flushing
happened.
J-D
On Tue, Sep 24, 2013 at 3:50 AM, aiyoh79
Same thing in pastebin: http://pastebin.com/tApr5CDX
On Tue, Sep 24, 2013 at 11:18 AM, Tom Brown tombrow...@gmail.com wrote:
-rw--- 1 hadoop supergroup 2194 2013-09-21 14:32
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1
-rw--- 1 hadoop
So. Looking at the code, this, for me, sounds like a bug.
I will try to reproduce it locally. Seems to be related to the combination
of TTL + BLOOM.
Creating a table for that right now, will keep you posted very shortly.
JM
2013/9/24 Tom Brown tombrow...@gmail.com
-rw--- 1 hadoop
To mitigate, you can change hbase.store.delete.expired.storefile to false
on one region server, or for the entire table, and restart this RS.
This will trigger a different compaction, hopefully.
We'd need to find what the bug is. My gut feeling (which is known to be
wrong often) is that it has to do
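For anyone following along, my reading of that workaround as an hbase-site.xml
override on the affected region server (restart the RS afterwards) would be
something like:

<property>
  <name>hbase.store.delete.expired.storefile</name>
  <value>false</value>
</property>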
Yeah, I think c3580bdb62d64e42a9eeac50f1c582d2 store file is a good example.
Can you grep for c3580bdb62d64e42a9eeac50f1c582d2 and post the log just to
be sure? Thanks.
It looks like an interaction of deleting expired files and
// Create the writer even if no kv(Empty store file is also
We get -1 because of this:
byte[] timerangeBytes = metadataMap.get(TIMERANGE_KEY);
if (timerangeBytes != null) {
  this.reader.timeRangeTracker = new TimeRangeTracker();
  Writables.copyWritable(timerangeBytes, this.reader.timeRangeTracker);
}
// If timerangeBytes is null the tracker is never set, so
// TimeRangeTracker.maximumTimestamp stays at its default of -1.
One more thing, Tom.
When you have been able to capture the HFile locally, please run the
HFile class on it to see the number of keys (is it empty?) and the other
specific information.
bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f HFILENAME
Thanks,
JM
2013/9/24 Jean-Marc
/usr/lib/hbase/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff
13/09/24 12:33:40 INFO util.ChecksumType: Checksum using
org.apache.hadoop.util.PureJavaCrc32
Scanning -
Can you try with fewer parameters and see if you are able to get something
from it? This exception is caused by printMeta, so if you remove -m
it should be OK. However, printMeta was what I was looking for ;)
getFirstKey for this file seems to return null. So it might simply be an
empty file,
Yes, it is empty.
13/09/24 13:03:03 INFO hfile.CacheConfig: Allocating LruBlockCache with
maximum size 2.9g
13/09/24 13:03:03 ERROR metrics.SchemaMetrics: Inconsistent configuration.
Previous configuration for using table name in metrics: true, new
configuration: false
13/09/24 13:03:03 WARN
Hi Tom,
Thanks for this information and the offer. I think we have enough to start
looking at this issue. I'm still trying to reproduce it locally. In the
meantime, I sent a patch to fix the NullPointerException you faced before.
I will post back here if I'm able to reproduce. Have you tried
I tried the workaround, and it is working very well. The number of store
files for all regions is now sane (went from about 8000 total store files
to 1000), and scans are now much more efficient.
Thanks for all your help, Jean-Marc and Sergey!
--Tom
On Tue, Sep 24, 2013 at 2:11 PM, Jean-Marc
The HBase Team is pleased to announce the immediate release of HBase 0.94.12.
Download it from your favorite Apache mirror [1]. This release has also been
pushed to Apache's maven repository.
All previous 0.92.x and 0.94.x releases can be upgraded to 0.94.12 via a rolling
upgrade without downtime,
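For reference (this is my gloss, not part of the announcement): with a tarball
deploy, the usual mechanics of such a rolling upgrade are to swap in the new
jars and then restart the daemons one at a time, e.g. with the script shipped
in bin/:

bin/rolling-restart.sh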
Hi Jeremy,
I don't see any issue with HBase handling 4000 tables. However, I don't
think it's the best solution for your use case.
JM
2013/9/24 jeremy p athomewithagroove...@gmail.com
Short description: I'd like to have 4000 tables in my HBase cluster. Will
this be a problem? In general,
It's better to do some salting of your keys for the reduce phase.
Basically, make your key something like KeyHash + Key and then decode it
in your reducer and write to HBase. This way you avoid the hotspotting
problem in HBase due to MapReduce sorting.
On Tue, Sep 24, 2013 at 2:50 PM, Jean-Marc
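A rough sketch of Varun's hash-prefix idea (the prefix length and separator
are arbitrary choices on my part):

import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.MD5Hash;

// Prefix the key with a short, deterministic hash of itself so that
// sorted MapReduce output spreads across regions instead of hotspotting.
public static byte[] saltedKey(String key) {
  String prefix = MD5Hash.getMD5AsHex(Bytes.toBytes(key)).substring(0, 2);
  return Bytes.toBytes(prefix + "_" + key);
}

The reducer can then split on the first "_" to recover the original key
before writing to HBase.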
Hi hbase!
I'm getting mixed messages in the errors when creating a table in a simple
HBase two-node cluster.
1) My HMaster is clearly running:
21519 Manager
14748 HMaster
25110 Jps
9887 QuorumPeerMain
15473 HRegionServer
14062 ServiceMain
5702 Bootstrap
2) But when I try to create a table:
Are you able to view the Master Web UI?
What exceptions do you see in the master log?
Cheers
On Tue, Sep 24, 2013 at 3:26 PM, Jay Vyas jayunit...@gmail.com wrote:
Hi hbase!
I'm getting mixed messages in the errors when creating a table in a simple
HBase two-node cluster.
1) My HMaster is
Varun: I'm familiar with that method of salting. However, in this case, I
need to do filtered range scans. When I do a lookup for a given WORD at a
given POSITION, I'll actually be doing a regex on a range of WORDs at that
POSITION. If I salt the keys with a hash, the WORDs will no longer be
Hi Tom,
Thanks for reporting this and for providing all this information.
I have attached a patch on the JIRA that Sergey's opened.
This will need to be reviewed and we will need a committer to push it if
it's accepted.
JM
2013/9/24 Tom Brown tombrow...@gmail.com
I tried the workaround, and
If you have a fixed-length key like:
NNNN_AAA
where NNNN is a number from 0 to 4000 and AAA is your word, then simply
split by the number?
Then when you insert each line, it will write to 4000 different
regions, which can be hosted on 4000 different servers if you have that many.
And there
So you should salt the keys in the reduce phase, but you do not salt the keys
in HBase. That basically means that reducers do not see the keys in sorted
order, but they do see all the values for a specific key together.
So the hash essentially is a trick that stays within the MapReduce and does not
make
Who talked about salting in the reducer? Why do you want to do that? The
use case did not even talk about any reduce phase.
Seems we need more details on what Jeremy wants to achieve.
JM
2013/9/24 Varun Sharma va...@pinterest.com
So you should salt the keys in the reduce phase but u donot salt
Hi Dolan,
2013/9/23 Dolan Antenucci antenucc...@gmail.com
Hi Renato,
Can you clarify your recommendation?
Sorry about this. I will try to be more helpful (:
Currently I've added the directory
where my hbase-site.xml file lives (/etc/hbase/conf/) to my Hadoop
classpath (as described
Perhaps there has been some confusion. I'm concerned about hotspotting on
read, not on write.
So, for example, let's say it's time for me to process a 'document'. For
the sake of this example, let's say the words are all 10 characters long.
I spin up 200 mapreduce jobs, each one takes a 'line'
Hi Experts,
I would like to fetch data from an HBase table using the MapReduce export API. I
see that I can fetch data using start and stop time, but I don't see any
information regarding start and stop row keys. Can any expert guide me or
give me an example in order to fetch the first 1000 rows (or start and
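As far as I know, the stock Export tool exposes versions, a time range, and
(in some versions) a row-prefix regex, but not start/stop row keys; if you
write your own scan-based job, setting them explicitly is straightforward.
A sketch (the row keys are placeholders):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Build a scan restricted to an explicit row range plus a time range,
// suitable for feeding to TableMapReduceUtil.initTableMapperJob().
public static Scan boundedScan(long startTime, long endTime) throws IOException {
  Scan scan = new Scan();
  scan.setStartRow(Bytes.toBytes("row-0000")); // inclusive, placeholder
  scan.setStopRow(Bytes.toBytes("row-1000"));  // exclusive, placeholder
  scan.setTimeRange(startTime, endTime);
  return scan;
}

For "the first 1000 rows" specifically, a PageFilter is the usual suggestion,
with the caveat that its limit applies per region server rather than globally.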
Since different people use different terms... Salting is BAD. (You need to
understand what is implied by the term salt.)
What you really want to do is take the hash of the key, and then truncate the
hash. Use that instead of a salt.
Much better than a salt.
I now have string keys padded with spaces to a fixed size (40).
The FuzzyRowFilter is missing some keys. Any ideas why this would happen?
If I do a 'get' in the HBase shell, I can see the row.
Regards,
- kiru
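For comparison, here is how I'd expect a minimal FuzzyRowFilter setup to look
in 0.94 (the key layout is invented; the fuzzy key must be the same length as
your fixed-size row keys, and a mask byte of 0 pins a position while 1 leaves
it fuzzy):

import java.util.Arrays;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;

// Match rows shaped like "????_WORD": first 4 bytes vary, rest fixed.
byte[] key  = Bytes.toBytes("0000_WORD");
byte[] mask = {1, 1, 1, 1, 0, 0, 0, 0, 0}; // 1 = any byte, 0 = must match
Scan scan = new Scan();
scan.setFilter(new FuzzyRowFilter(
    Arrays.asList(new Pair<byte[], byte[]>(key, mask))));

One thing worth double-checking with padded keys is that the trailing-space
positions in the fuzzy key carry the right mask bytes.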
Hi Ted,
Thanks for the reply; now I understand the 3rd entry of that log file.
Aiyoh79
Ted Yu-3 wrote
memstoreSize of 128.2m was recorded at the beginning
of HRegion#internalFlushcache().
After the flush, memstoreSize became 48.0m.
Cheers
On Tue, Sep 24, 2013 at 3:50 AM, aiyoh79 <
Hi J-D
I am doing some benchmarking using YCSB, and the log entries retrieved were
during the data loading stage. So I don't think there is any data being
deleted or overwritten.
Aiyoh79
Jean-Daniel Cryans wrote
On flushing we do some cleanup, like removing deleted data that was
already
in the
Can you provide a bit more detail?
If you can reproduce this in a unit test, that would be easier to troubleshoot.
Thanks
On Sep 24, 2013, at 6:25 PM, Kiru Pakkirisamy kirupakkiris...@yahoo.com wrote:
I now have string keys padded with spaces to a fixed size (40).
The FuzzyRowFilter is
Meanwhile you can mitigate as specified above, by temporarily disabling
expired file deletion.
Please report if it doesn't work...
On Tue, Sep 24, 2013 at 4:08 PM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
Hi Tom,
Thanks for reporting this and for providing all this information.
Hi all,
Version: HBase 0.94.11.
Can I use the importtsv tool to import a data file (the data file is LZO
compressed, file.txt.lzo) from HDFS into HBase? I have enabled the LZO
compression algorithm in HBase
--
In the Hadoop world I am just a novice, exploring the entire Hadoop
ecosystem; I hope one day I can
Hi,
I am seeing some warning messages in the HRegion log file for quite a few days.
The message is:
*WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow):
{processingtimems:12994,call:multi(org.apache.hadoop.hbase.client.MultiAction@602cdaf7),
rpc version=1, client version=29,
See http://hbase.apache.org/book.html#ops.slow.query
On Tue, Sep 24, 2013 at 9:36 PM, Vimal Jain vkj...@gmail.com wrote:
Hi,
I am seeing some warning messages in the HRegion log file for quite a few days.
The message is:
*WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow):
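If the threshold itself is in question: as I recall, 0.94 reads it from
hbase-site.xml, so something like the following raises it from the default
(please verify the property name against your version):

<property>
  <name>hbase.ipc.warn.response.time</name>
  <value>10000</value> <!-- milliseconds before an RPC is logged as (responseTooSlow) -->
</property>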
Are you running many concurrent clients ? I had a similar problem when running
on 0.94.x and I moved to 0.95.2 for this reason (see HBASE-9410)
Regards,
- kiru
From: Vimal Jain vkj...@gmail.com
To: user@hbase.apache.org
Sent: Tuesday, September 24, 2013 9:36
Thanks Ted and Kiru.
Ted,
Is there any place where I can debug that JSON in detail?
Kiru,
I have one multi-threaded client which reads/writes to HBase.
On Wed, Sep 25, 2013 at 10:16 AM, Kiru Pakkirisamy
kirupakkiris...@yahoo.com wrote:
Are you running many concurrent clients ? I had a similar
Okay, thanks for the explanation. You can hash or salt (as many people say)
the keys to avoid the hotspotting problem. What this means is that you
push the part that issues filtered range queries to HBase into the reduce
phase.
The idea is this:
1) You get your query 'Pos_WORD' in the mapper and
The slow response was from this method in HRegionServer:
public <R> MultiResponse multi(MultiAction<R> multi) throws IOException {
You can capture a few jstacks of the HRegionServer process and see what
could be the cause.
You can pastebin the stack traces.
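Concretely, that capture could be as simple as (the pid lookup varies by setup):

jps | grep HRegionServer         # find the region server pid
jstack <pid> > rs-stack-1.txt    # repeat a few times, a few seconds apart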
What HBase version are you using?
The biggest issue I see with so many tables is the region counts could get
quite large. With 4000 tables, you will need at least that many regions,
not even accounting for splitting the regions/growth.
Forgive the speculation, but it almost sounds like you want an inverted
index. Could you not