Vimal,
In my case, for a single client my queries take less than a second (sub-second
performance is what we were shooting for).
But the same queries, when run concurrently, give completely degraded
performance. That is the reason I wrote the no-op test case which I have
attached in the
Hi, I have a relatively simple situation.
As an example I have a table of Users, with first and last name.
I set a FilterList on a scan and add a SingleColumnValueFilter, with
column qualifier=firstName, CompareOp.EQUAL, and value=bob.
The problem is, I'm getting bob as well as anyone without a
Thanks, that fixed the problem for me!
-Original Message-
From: Tianying Chang [mailto:tich...@ebaysf.com]
Sent: Wednesday, October 23, 2013 8:35 PM
To: user@hbase.apache.org
Subject: RE: Hbase 0.96 and Hadoop 2.2
What is your HBase version and Hadoop version? There is an RPC break
Ok. I will give it a try.
regards!
Yong
On Wed, Oct 23, 2013 at 11:53 PM, Ted Yu yuzhih...@gmail.com wrote:
Yong:
I have attached the backport to HBASE-9819.
If you can patch your build and see if it fixes the problem, that would be
great.
On Tue, Oct 22, 2013 at 2:58 PM, Ted Yu
Interesting topic about constructing a file/block system on top of HBase. Similar to Facebook Haystack or Taobao TFS, targeting small-file management? It would be great if you could open-source it on GitHub with some benchmarks, Roman~
Sounds like a question of how to well utilize and configure an HBase data server's
Hi Ted,
I'm not sure I get you. HBase will not store any cell in the system if
there is no content. So if you do a scan, you will get only the cells where
the content is not null. There is no need for any filter here. Can you
please detail what you put in the table and what cells you are
I think Ted is looking for the SingleColumnValueFilter.setFilterIfMissing()
method. Try setting that to true.
Toby
***
Toby Lazar
Capital Technology Group
Email: tla...@capitaltg.com
Mobile: 646-469-5865
***
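A minimal sketch of what Toby suggests, against the 0.94-era client API (the table name, column family, and value below are made up to match the earlier example, not taken from the thread):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterIfMissingExample {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "Users");
        SingleColumnValueFilter filter = new SingleColumnValueFilter(
                Bytes.toBytes("cf"), Bytes.toBytes("firstName"),
                CompareOp.EQUAL, Bytes.toBytes("bob"));
        // Without this, rows that lack the firstName column pass the filter,
        // which is why "bob as well as anyone without a firstName" comes back.
        filter.setFilterIfMissing(true);
        Scan scan = new Scan();
        scan.setFilter(filter);
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
            System.out.println(Bytes.toString(r.getRow()));
        }
        scanner.close();
        table.close();
    }
}
```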
On Thu, Oct 24, 2013
Got it! I looked at it the opposite way: how to get only the rows where it's
null. Toby is correct.
JM
2013/10/24 Toby Lazar tla...@capitaltg.com
I think Ted is looking for the SingleColumnValueFilter.setFilterIfMissing()
method. Try setting that to true.
Toby
Can you stop HBase and run fsck on Hadoop to check your HDFS health?
2013/10/24 Vimal Jain vkj...@gmail.com
Hi Ted/Jean,
Can you please help here ?
On Tue, Oct 22, 2013 at 10:29 PM, Vimal Jain vkj...@gmail.com wrote:
Hi Ted,
Yes, I checked the namenode and datanode logs and I found
Hi Harry,
Do you have more details on the exact load? Can you run vmstat and see
what kind of load it is? Is it user? cpu? wio?
I suspect your disks to be the issue. There are two things here.
First, we don't recommend RAID for the HDFS/HBase disks. The best is to
simply mount the disks on 2
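For reference, the JBOD layout JM is describing (no RAID, each disk mounted separately) typically ends up in hdfs-site.xml along these lines (mount points below are made-up examples; in Hadoop 1.x the property was named dfs.data.dir):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- one entry per physical disk; the datanode round-robins blocks across them -->
  <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data</value>
</property>
```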
sure, I tried it again with hbase.client.retries.number=100 instead of 10
and it worked for me. But I'm not sure if this really solved the problem or
if it was just luck.
regards
2013/10/21 Ted Yu yuzhih...@gmail.com
John:
Can you let us know whether the Import succeeded this time ?
If
Hi,
I'm currently writing an HBase Java program which iterates over every row in
a table. I have to modify some rows if the column size (the number of
columns in the row) is bigger than 25000.
Here is my source code: http://pastebin.com/njqG6ry6
Is there any way to add a Filter to the scan operation
Hi John,
Sorry, this isn't going to answer your question, but if you do a full table
scan, you might want to do it with a MapReduce job so it will be way faster.
For the filter, you might have to implement your own. I'm not sure there is
any filter based on cell size today :(
JM
2013/10/24
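Since no size-based filter ships with HBase here, one client-side approach is to batch the scan so wide rows come back in chunks and count KeyValues per row key. This is only a sketch (table name and threshold handling are illustrative, 0.94-era API); with setBatch set, consecutive Results for the same row share a row key:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class WideRowCounter {
    public static void main(String[] args) throws Exception {
        HTable table = new HTable(HBaseConfiguration.create(), "myTable");
        Scan scan = new Scan();
        scan.setBatch(1000);  // at most 1000 KeyValues per Result, so a
                              // multi-million-column row never arrives at once
        ResultScanner scanner = table.getScanner(scan);
        byte[] currentRow = null;
        long columns = 0;
        for (Result partial : scanner) {
            if (currentRow == null || !Bytes.equals(currentRow, partial.getRow())) {
                if (currentRow != null && columns > 25000) {
                    System.out.println(Bytes.toString(currentRow) + ": " + columns);
                }
                currentRow = partial.getRow();
                columns = 0;
            }
            columns += partial.size();  // number of KeyValues in this chunk
        }
        if (currentRow != null && columns > 25000) {  // don't forget the last row
            System.out.println(Bytes.toString(currentRow) + ": " + columns);
        }
        scanner.close();
        table.close();
    }
}
```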
I am aware of the history of HBase and HDFS append and have read the various
blog posts and JIRAs. In particular:
https://issues.apache.org/jira/browse/HBASE-5676
https://issues.apache.org/jira/browse/HDFS-3120
My understanding is that HBase requires durable sync capabilities of
Hi JM
I took a snapshot on the initial run, before the changes:
https://www.evernote.com/shard/s95/sh/b8e1516d-7c49-43f0-8b5f-d16bbdd3fe13/00d7c6cd6dd9fba92d6f00f90fb54fc1/res/4f0e20a2-1ecb-4085-8bc8-b3263c23afb5/screenshot.png
Good timing, disks appear to be exploding (ATA errors) atm thus I'm
I have a query on hbase.zookeeper.quorum.
I have 2 nodes in my Hadoop cluster and installed HBase on them.
Following is the configuration:
file: hbase-site.xml
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop-master</value>
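For what it's worth, with more than one ZooKeeper host the value is a comma-separated list of hostnames (no ports), and it must be identical on every node. The hostnames below are examples, not from the thread:

```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hadoop-master,hadoop-slave1</value>
</property>
```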
My understanding is that HBase requires durable sync capabilities of
HDFS (i.e. hflush() hsync()), but does *not* require file append
capabilities.
99.99% true. The remaining 0.01% is an exceptional code path during
data recovery (a fallback mechanism to ensure that we can start the
Hi all,
Recently, I read the source of HBase's HLog, and I have some questions
that puzzle me a lot. Here they are:
1. Why use reflection to init a SequenceFile.Writer
in SequenceFileLogWriter? I read HBASE-2312 but still can't catch the point.
2. It seems that HLog uses
bq. Opening socket connection to server localhost/127.0.0.1:2181
Is localhost the same machine as hadoop-master ?
Can you tell us more about your cluster config ?
What version of HBase / zookeeper are you using ?
Cheers
On Thu, Oct 24, 2013 at 12:11 AM, Rajesh rajeshni...@gmail.com wrote:
@Jean-Marc: Sure, I can do that, but that's a little bit complicated because
the rows sometimes have millions of columns and I have to handle them
in different batches because otherwise HBase crashes. Maybe I will try it
later, but first I want to try the API version. It works okay so far, but
Hi Ted,
Thank you for your response. For #1, I have tried to understand that
comment in SequenceFileLogWriter, but I can't figure out why, instead of
reflection, we don't use this version of SF.createWriter directly:
SequenceFile.Writer createWriter(FileSystem fs,
This was due to the fact that when HBASE-2312 was integrated, there were
many flavors of hadoop running in production.
So the code had to support all the flavors.
Cheers
On Thu, Oct 24, 2013 at 9:27 AM, Wukang Lin vboylin1...@gmail.com wrote:
Hi Ted,
Thank you for your response. for #1,
This will work ONLY if you add a single column to the scan. If you scan multiple
columns you will
need an additional filter (a reverse SkipFilter) which filters out all rows (outputs
of SingleColumnValueFilter) which
do not have the 'firstName' column. I do not think HBase provides a similar filter but
you can
If the MR job crashes because of the number of columns, then we have an issue
that we need to fix ;) Please open a JIRA and provide details if you are facing
that.
Thanks,
JM
2013/10/24 John johnnyenglish...@gmail.com
@Jean-Marc: Sure, I can do that, but that's a little bit complicated because
the
I would have said:
scan.addColumn(YOUR_CF, Bytes.toBytes("firstName"));
But not sure if it really makes a difference...
2013/10/24 Vladimir Rodionov vrodio...@carrieriq.com
If 'firstName' is NULL, it is missing completely from the row. Add
this column explicitly to the Scan you create:
Jean, if we don't add setBatch to the scan, MR job does cause HBase to crash
due to OOME. We have run into this in the past as well. Basically the problem
is - Say I have a region server with 12GB of RAM and a row of size 20GB (an
extreme example, in practice, HBase runs out of memory way
I already mentioned that here:
https://groups.google.com/forum/#!topic/nosql-databases/ZWyc4zDursg ... .
I'm not sure if it is an issue. After setting the batch size everything
worked nicely for me.
Anyway, that was another problem :) If there were a Filter, my current
code would work fine with
Solved by adding hadoop-common-2.2.0 to $HBASE_DIR/lib and removing the
version that shouldn't be included in the first place. I guess programmers
can't be expected to document a working configuration on production
releases.
For streaming responses, there is this JIRA:
HBASE-8691 High-Throughput Streaming Scan API
On Thu, Oct 24, 2013 at 9:53 AM, Dhaval Shah prince_mithi...@yahoo.co.inwrote:
Jean, if we don't add setBatch to the scan, MR job does cause HBase to
crash due to OOME. We have run into this in the
Hi All,
I've followed the hbase install guide as closely as I possibly can and I
have tried a few different variants in my hbase-site.xml file. I'm still
having problems getting HMaster to stay up though. Could anyone tell me more
about the issue that I'm seeing here:
[hadoop@hadoop1 hadoop]$
Interesting!! Can't wait to see this in action. I am already imagining huge
performance gains
Regards,
Dhaval
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org user@hbase.apache.org; Dhaval Shah
prince_mithi...@yahoo.co.in
Sent: Thursday, 24
Thank you, Ted. I got it :-).
I read the comments on HDFS-744 and HBASE-5954. It seems HDFS supports true
fsync, enabled by default since Hadoop 2.0.0-alpha, but HBase won't have the
ability to configure how the WAL and HFile use HDFS's fsync until 0.98. So, for
our version of HBase (0.94.6 + Hadoop 2.0.0), we
Ok, I'm running a load job atm. I've added some possibly incomprehensible
coloured lines to the graph: http://goo.gl/cUGCGG
This is actually with one fewer node due to decommissioning to replace a
disk, hence I guess the reason for one squiggly line showing no disk
activity. I've included only the
Can you try vmstat 2? 2 is the interval in seconds at which it will display the
stats. In the extract here, nothing is running; only 8% is used (1% disk
IO, 6% user, 1% sys).
Run it on 2 or 3 different nodes while you are putting the load on the
cluster, and take a look at the last 4 numbers and see
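Concretely, something like this (standard procps vmstat; the column meanings are from its man page):

```shell
# 2-second interval, 5 samples; the first line is averages since boot.
vmstat 2 5
# The CPU columns on the right are what JM is pointing at:
#   us = user time, sy = system/kernel, id = idle, wa = I/O wait
# High wa with low us/sy usually points at the disks.
```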
If you encounter the following error, we can address it in another issue:
public HTableInterface getTable(byte[] tableName, ExecutorService pool)
    throws IOException {
  if (managed) {
    throw new IOException("The connection has to be unmanaged.");
  }
On Thu, Oct 24, 2013 at 3:25
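Assuming the snippet above is the managed-connection guard inside the client library, the "unmanaged" connection it expects is one you create yourself rather than obtain from the shared cache. A sketch against the 0.96-era client API (table name made up):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.util.Bytes;

public class UnmanagedConnectionExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // createConnection returns an unmanaged connection the caller owns,
        // unlike the cached one behind the no-arg HTable constructors.
        HConnection connection = HConnectionManager.createConnection(conf);
        HTableInterface table = connection.getTable(Bytes.toBytes("myTable"));
        try {
            // ... use table ...
        } finally {
            table.close();
            connection.close();
        }
    }
}
```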
So just a short update, I'll read into it a little more tomorrow. This is
from three of the nodes:
https://gist.github.com/hazzadous/1264af7c674e1b3cf867
The first is the grey guy. Just glancing at it, it looks to fluctuate more
than the others. I guess that could suggest that there are some
p.s. I guess this is more turning into a general hadoop issue, but I'll
keep the discussion here seeing that I have an audience, unless there are
objections.
On 24 October 2013 22:02, Harry Waye hw...@arachnys.com wrote:
So just a short update, I'll read into it a little more tomorrow. This
Your nodes are almost 50% idle... Might be something else. Sounds like it's not
your disks nor your CPU... Maybe too many RCPs?
Have you investigated the network side? netperf might be a good help for
you.
JM
2013/10/24 Harry Waye hw...@arachnys.com
p.s. I guess this is more turning into a
Excuse the ignorance, RCP?
On 24 October 2013 22:28, Jean-Marc Spaggiari jean-m...@spaggiari.orgwrote:
Your nodes are almost 50% idle... Might be something else. Sound it's not
your disks nor your CPU... Maybe to many RCPs?
Have you investigate on your network side? netperf might be a good
I guess Jean meant RPCs.
On Thu, Oct 24, 2013 at 2:34 PM, Harry Waye hw...@arachnys.com wrote:
Excuse the ignorance, RCP?
On 24 October 2013 22:28, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
Your nodes are almost 50% idle... Might be something else. Sound it's not
your disks
Remote calls to a server. Just forget about it ;) Please verify the network
bandwidth between your nodes.
2013/10/24 Harry Waye hw...@arachnys.com
Excuse the ignorance, RCP?
On 24 October 2013 22:28, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:
Your nodes are almost 50% idle...
Got it! Re. 50% utilisation, I forgot to mention that 6 cores does not
include hyper-threading. Foolish I know, but that would explain CPU0 being
at 50%. The nodes are as stated in
http://www.hetzner.de/en/hosting/produkte_rootserver/ex10 bar the RAID1.
On 24 October 2013 22:50, Jean-Marc
Using the HBase client API (scanners) for M/R is so oldish :). HFiles have a
well-defined format and it is much more efficient to read them directly.
Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com
Well that depends on your use case ;)
There are many nuances/code complexities to keep in mind:
- merging results of various HFiles (each region can have more than one)
- merging results of the WAL
- applying delete markers
- how about data which is only in the memory of region servers and nowhere else
The Phoenix team is pleased to announce the immediate availability of
Phoenix 2.1 [1].
More than 20 individuals contributed to the release. Here are some of the
new features
now available:
* Secondary Indexing [2] to create and automatically maintain global
indexes over your
primary table.
-
Congratulations~ will refresh and have a try today.
Best Regards, Julian
On Oct 25, 2013, at 08:24 AM, James Taylor jtay...@salesforce.com wrote:
The Phoenix team is pleased to announce the immediate availability of Phoenix 2.1 [1]. More than 20 individuals contributed to the release. Here are some
From https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing :
Is date_col a column from data table ?
CREATE INDEX my_index ON my_table (date_col DESC, v1) INCLUDE (v3)
SALT_BUCKETS=10, DATA_BLOCK_ENCODING='NONE';
On Thu, Oct 24, 2013 at 5:24 PM, James Taylor
Thanks, Ted. That was a typo which I've corrected. Yes, these are
references to columns from your primary table. It should have read like
this:
CREATE INDEX my_index ON my_table (v2 DESC, v1) INCLUDE (v3)
SALT_BUCKETS=10, DATA_BLOCK_ENCODING='NONE';
On Thu, Oct 24, 2013 at 5:40 PM, Ted Yu
yes! that was exactly what I was looking for.
thanks.
Ted
On 10/24/13, Toby Lazar tla...@capitaltg.com wrote:
I think Ted is looking for the SingleColumnValueFilter.setFilterIfMissing()
method. Try setting that to true.
Toby
***
Toby Lazar
Capital
Hi All,
I am running HBase 0.94.6 with 8 region servers and getting throughput of
around 15K Read OPS and 20K Write OPS per server through YCSB tests. Table
is pre-created with 8 regions per region server and it has 120 million
records of 700 bytes each.
I increased the number of region servers
We need to finish up HBASE-8369
From: Dhaval Shah prince_mithi...@yahoo.co.in
To: user@hbase.apache.org user@hbase.apache.org
Sent: Thursday, October 24, 2013 4:38 PM
Subject: Re: RE: Add Columnsize Filter for Scan Operation
Well that depends on your use
How many YCSB clients were used in each setting ?
Thanks
On Oct 24, 2013, at 9:45 PM, Ramu M S ramu.ma...@gmail.com wrote:
Hi All,
I am running HBase 0.94.6 with 8 region servers and getting throughput of
around 15K Read OPS and 20K Write OPS per server through YCSB tests. Table
is pre
Excellent. :)
From: A Laxmi a.lakshmi...@gmail.com
To: user@hbase.apache.org user@hbase.apache.org; lars hofhansl
la...@apache.org
Sent: Wednesday, October 23, 2013 12:43 PM
Subject: Re: 'hbase.client.scanner.caching' default value for HBase 0.90.6?
Hi
No, this is different.
All your data is still in the memstore.
The memstore is organized as a skip list; nobody has ever tested that with
72GB. 256MB, 512MB, 1GB, sure... 72GB... no way.
Same with a 96GB Java heap. Not with Oracle or OpenJDK and an application
specifically for such large