Sorry, I pressed send by mistake on my mobile phone. JM already provided
the solution to you.
On Tue, Jul 2, 2013 at 2:59 PM, Anil Gupta anilgupt...@gmail.com wrote:
In mapreduce, there is a proper
Best Regards,
Anil
On Jul 1, 2013, at 11:43 PM, Glen Arrowsmith garrowsm...@halfbrick.com
Thank you very much for the great support!
This is how I thought to design my key:
PATTERN: source|type|qualifier|hash(name)|timestamp
EXAMPLE:
google|appliance|oven|be9173589a7471a7179e928adc1a86f7|1372837702753
Do you think my key would fit my use case (my search will be
essentially by
I'm not sure if you're eliding this fact or not, but you'd be much
better off if you used a fixed-width format for your keys. So in your
example, you'd have:
PATTERN: source(4-byte-int).type(4-byte-int or smaller).fixed 128-bit
hash.8-byte timestamp
Example: \x00\x00\x00\x01\x00\x00\x02\x03
Yeah, I was thinking of using a normalization step in order to allow the use
of FuzzyRowFilter, but what is not clear to me is whether integers must also
be normalized or not.
I will explain myself better. Suppose that I follow your advice and I
produce keys like:
- 1|1|somehash|sometimestamp
-
When you build the row key and convert the int parts into byte[] (use
org.apache.hadoop.hbase.util.Bytes#toBytes(int)), it will give 4 bytes
for every int. Be careful about the ordering: when you convert a positive
and a negative integer into byte[] and you do a lexicographical compare (as
done in HBase) you will
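To make the ordering caveat concrete, here is a minimal sketch of the pitfall and one common workaround, flipping the sign bit before serializing (class and method names are just illustrative):

import org.apache.hadoop.hbase.util.Bytes;

public class SignOrdering {
    // Flip the sign bit so that the unsigned lexicographic byte order
    // HBase uses matches the signed integer order.
    static byte[] orderPreserving(int value) {
        return Bytes.toBytes(value ^ Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        // Raw two's complement: -1 is 0xFFFFFFFF and sorts AFTER 1 (0x00000001).
        System.out.println(Bytes.compareTo(Bytes.toBytes(-1), Bytes.toBytes(1)) > 0);    // true
        // With the sign bit flipped, -1 sorts before 1, as expected.
        System.out.println(Bytes.compareTo(orderPreserving(-1), orderPreserving(1)) < 0); // true
    }
}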
Hi Flavio,
Have you had a look at Phoenix (https://github.com/forcedotcom/phoenix)?
It will allow you to model your multi-part row key like this:
CREATE TABLE flavio.analytics (
    source INTEGER,
    type INTEGER,
    qual VARCHAR,
    hash VARCHAR,
    ts DATE,
    CONSTRAINT pk PRIMARY KEY (source, type, qual, hash, ts)
)
All my enums produce positive integers so I don't have the positive/negative
integer problem.
Obviously, if I use fixed-length row keys I could take away the separator.
Sorry, but I'm a real newbie in this field; I'm trying to understand how to
compose my key from the bytes.
Is the following correct?
final
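For what it's worth, one possible shape of that composition with Bytes.add (the original snippet was cut off here, so variable names and widths are illustrative only):

import org.apache.hadoop.hbase.util.Bytes;

public class KeyComposer {
    // Illustrative: source and type are positive enum ints, hash is a
    // fixed-width 16-byte digest, ts is epoch millis. Note that each
    // two-argument Bytes.add allocates a fresh intermediate array.
    static byte[] makeKey(int source, int type, byte[] hash, long ts) {
        return Bytes.add(
            Bytes.add(Bytes.toBytes(source), Bytes.toBytes(type)),
            Bytes.add(hash, Bytes.toBytes(ts)));
    }
}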
I ended up writing a tool which helps merge the table regions into a target
# of regions. For example if you want to go from N to N/8, then the tool
figures out the grouping and merges them in one pass. I will put it up in a
github repo soon and share it here.
The sad part of this approach is the
Would online merge help (https://issues.apache.org/jira/browse/HBASE-7403) ?
The feature is not in 0.94 though.
Cheers
On Wed, Jul 3, 2013 at 3:58 AM, Viral Bajaria viral.baja...@gmail.com wrote:
I ended up writing a tool which helps merge the table regions into a target
# of regions. For
No, I've never seen Phoenix, but it looks like a very useful project!
However I don't have such strict performance issues in my use case, I just
want regions that are as balanced as possible.
So I think that in this case I will still use Bytes concatenation, if
someone confirms I'm doing it in
The two-argument Bytes.add() calls:
return add(a, b, HConstants.EMPTY_BYTE_ARRAY);
where a new byte array is allocated:
byte [] result = new byte[a.length + b.length + c.length];
Meaning your code below would allocate two byte arrays.
Consider writing a method that accepts 4 byte []
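For example, a sketch of a helper that concatenates four key parts with a single allocation (the method name is illustrative):

static byte[] concat(byte[] a, byte[] b, byte[] c, byte[] d) {
    // One allocation for the final key instead of one per Bytes.add call.
    byte[] result = new byte[a.length + b.length + c.length + d.length];
    int off = 0;
    System.arraycopy(a, 0, result, off, a.length); off += a.length;
    System.arraycopy(b, 0, result, off, b.length); off += b.length;
    System.arraycopy(c, 0, result, off, c.length); off += c.length;
    System.arraycopy(d, 0, result, off, d.length);
    return result;
}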
Found this while going through the online merge jira...
https://issues.apache.org/jira/browse/HBASE-8217
The comments were interesting, and as a user I would agree that
supplying a patch is good and it's on me to decide whether I should use it
or not. The core committee obviously is
Sure, but FYI Phoenix is not just faster, but much easier as well (as
this email chain shows).
On 07/03/2013 04:25 AM, Flavio Pompermaier wrote:
No, I've never seen Phoenix, but it looks like a very useful project!
However I don't have such strict performance issues in my use case, I just
want
Hello,
We are seeing some increased load on our system and we are trying to
determine why. We have historical read/write request count data (from
jmx metrics) but it's hard to pick out a definitive correlation between
a particular table and overall system load. Is there a way to tell which
I have a major typo in the question so I apologize. I meant to say 5
families with 1000+ qualifiers each.
Let's work with an example (not the greatest example here, but still). Let's
say we have a genre class like this:
class HistoryBooks {
    ArrayList<Books> author1;
    ArrayList<Books> author2;
Really a bad title for the section.
Schema Smackdown? Really?
6.10.1 isn't really valid, is it? Rows versus versions?
IMHO it should be columns versus versions. (Do you put a timestamp in the
column qualifier name, versus having an enormous number of versions allowed?)
There's more, but
Hello Experts,
I am quite new to HBase and Hadoop, and above all new to Java too. I recently
started working on HBase and Java. I have successfully installed and
configured Hadoop and HBase, and also made my first ever Java application on
HBase using this tutorial
Thanks Ted.
-Original Message-
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Tuesday, July 02, 2013 6:11 PM
To: user@hbase.apache.org
Subject: Re: Scan performance
Tony:
Take a look at
Thanks Azuryy. Does it work across multiple clusters (e.g. HBase in cluster 1
and HDFS files in cluster 2)?
From: Azuryy Yu azury...@gmail.com
To: user@hbase.apache.org; S. Zhou myx...@yahoo.com
Sent: Tuesday, July 2, 2013 10:06 PM
Subject: Re:
Hi Kireet,
Have you had a look at Hannibal (https://github.com/sentric/hannibal)? It
graphs the distribution of a table's regions across your cluster as well as
region sizes and may point to unevenly distributed data which you could
then try to correlate to load on a particular node.
-Eric
On
Which HBase version are you using?
In 0.94, take a look at:
src/main/resources/hbase-webapps/master/table.jsp
where you would see:
String tableName = request.getParameter("name");
HTable table = new HTable(conf, tableName);
Cheers
On Wed, Jul 3, 2013 at 4:21 AM, SamSalman
Azuryy, I am looking at the MultipleInputs doc, but I could not figure out how
to add an HBase table as a Path input. Do you have some sample code? Thanks!
From: Azuryy Yu azury...@gmail.com
To: user@hbase.apache.org; S. Zhou myx...@yahoo.com
Sent:
Thanks for the tip, Himanshu.
I'm on 0.92.1 (cdh 4.1.2), so I imagine I don't have 7122.
On Tue, Jul 2, 2013 at 6:00 PM, Himanshu Vashishtha hv.cs...@gmail.com wrote:
Patrick,
What HBase version are you using for the master cluster? If 0.94.8, does
it have 7122?
Hi,
Like scan with a range, I would like to delete rows by range. Is this
supported from the hbase shell?
Let's say I have a table with keys like
A-34335
A-34353535
A-335353232
B-33435
B-4343
C-5353533
I want to delete all rows with prefix A using hbase shell.
Thanks,
Rahul
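The shell of that era has no built-in range delete, so one option is a small client-side program that scans the prefix and batches the Deletes; a hedged sketch against the 0.94-style API (table name is illustrative):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixDelete {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // illustrative table name
        byte[] prefix = Bytes.toBytes("A-");
        Scan scan = new Scan(prefix);               // start at the prefix
        scan.setFilter(new PrefixFilter(prefix));   // stop once rows no longer match
        List<Delete> deletes = new ArrayList<Delete>();
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                deletes.add(new Delete(r.getRow()));
            }
        } finally {
            scanner.close();
        }
        table.delete(deletes); // batch all the deletes
        table.close();
    }
}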
Thank you very much for your reply.
I am using hbase-0.94.6.
Yes, I see those two lines; what should I do with them? Do I need to change
something there, or something else?
Waiting for your response.
Thanks,
Try single quotes. The shell (ruby) may be trying to 'help you' by
interpreting your hex.
hbase(main):018:0> print "\x20\n"
hbase(main):019:0> print '\x20\n'
\x20\nhbase(main):020:0>
See how with double quotes it prints a space and a newline, whereas when I
single-quote it, it prints out the literal?
At
Not off hand.
But it's something that I think I could cobble up over the next couple of days
if my wife runs out of projects for me to do around the house. ;-)
On Jul 3, 2013, at 12:57 PM, Stack st...@duboce.net wrote:
On Wed, Jul 3, 2013 at 7:08 AM, Michael Segel
Can you update your CDH and HBase?
This seems like a pretty basic DNS setup issue:
13/07/03 11:22:36 ERROR hbase.HServerAddress: Could not resolve the DNS
name of CH35
java.lang.IllegalArgumentException: hostname can't be null
Can you fix this first?
St.Ack
On Tue, Jul 2, 2013 at 8:29 PM, ch huang
Hi Huang,
A few things.
1) cdh3.4 is a pretty old version. You should think about upgrading
to a more recent version.
2) Have you checked your hosts files? Can you ping CH35 (if it's your
host name) from your ZK and other servers?
JM
2013/7/2 ch huang justlo...@gmail.com:
I modified all hostnames
Do you have only 5 static author names?
Keep in mind the column family name is defined when creating the table.
Regarding tall vs wide debate:
HBase is first and foremost a key-value database; it reads and writes at
the column-value level, so it doesn't really care about rows.
But it's not
Did you somehow turn the security flag on for HBase? Your exception
is security related.
On Wednesday, July 3, 2013, SamSalman wrote:
Hello Experts,
I am quite new to HBase and Hadoop, and above all new to Java too. I recently
started working on HBase and Java. I have successfully
You may want to pull your data from HBase first in a separate map-only job
and then use its output along with the other HDFS input.
There is a significant disparity between reads from HDFS and reads from HBase.
On Jul 3, 2013, at 10:34 AM, S. Zhou myx...@yahoo.com wrote:
Azuryy, I am
Seems right. You can make it more efficient by creating your result array
in advance and then filling it.
Regarding time filtering: have you seen that in Scan you can set a start
time and an end time?
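For reference, a minimal sketch of that setter (the millisecond bounds are illustrative; min is inclusive, max exclusive):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Scan;

public class TimeRangeScan {
    public static void main(String[] args) throws IOException {
        Scan scan = new Scan();
        // Only cells whose timestamps fall in [min, max) are returned.
        scan.setTimeRange(1372636800000L, 1372723200000L);
    }
}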
On Wednesday, July 3, 2013, Flavio Pompermaier wrote:
All my enums produce positive integers so I don't
Hi,
1) You cannot feed two different clusters' data into one MR job.
2) If your data is in the same cluster, then:
conf.set(TableInputFormat.SCAN,
TableMapReduceUtil.convertScanToString(new Scan()));
conf.set(TableInputFormat.INPUT_TABLE, tableName);
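For completeness, a sketch of the standard single-cluster wiring through TableMapReduceUtil, which sets those same properties under the hood (table name, mapper, and output path are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TableScanJob {
    static class RowMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws java.io.IOException, InterruptedException {
            // Emit just the row key; real logic would read cells from 'value'.
            context.write(new Text(Bytes.toString(row.get(), row.getOffset(), row.getLength())),
                          new Text(""));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "table-scan");
        job.setJarByClass(TableScanJob.class);
        TableMapReduceUtil.initTableMapperJob(
            "mytable", new Scan(), RowMapper.class, Text.class, Text.class, job);
        job.setNumReduceTasks(0); // map-only, as suggested earlier in the thread
        FileOutputFormat.setOutputPath(job, new Path("/tmp/table-scan-out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}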
Generally, a network issue is the root cause of these problems.
On Thu, Jul 4, 2013 at 12:12 AM, Patrick Schless
patrick.schl...@gmail.com wrote:
Thanks for the tip, Himanshu.
I'm on 0.92.1 (cdh 4.1.2), so I imagine I don't have 7122.
On Tue, Jul 2, 2013 at 6:00 PM, Himanshu Vashishtha
ATT
For regions, see http://hbase.apache.org/book.html#regions.arch
Store file is related to http://hbase.apache.org/book.html#columnfamily
It would be easier to understand HBase architecture by reading
Thanks very much Azuryy. I will try it out.
From: Azuryy Yu azury...@gmail.com
To: user@hbase.apache.org
Sent: Wednesday, July 3, 2013 6:02 PM
Subject: Re: MapReduce job with mixed data sources: HBase table and HDFS files
Hi,
1) You cannot feed two
bq. disparity between the reads from HDFS and from HBase
Depending on consistency requirement, the following JIRA should reduce the
disparity (if reading slightly out of date data from HBase is acceptable):
HBASE-8369 MapReduce over snapshot files
Cheers
On Thu, Jul 4, 2013 at 5:19 AM,
I inserted 2GB of data into my HBase cluster, but I find the regions are not
balanced. Does an insert operation always cause the region count to differ
across region servers?
Here is my 60010 output:
Table Regions: Name | Region Server | Start Key | End Key
demo2,,1372914631698.f42281ec44d113df3e644d5ecdb35366.
Did you presplit your table?
Was the load balancer enabled?
What HBase version do you use?
Thanks
On Jul 3, 2013, at 10:21 PM, ch huang justlo...@gmail.com wrote:
I inserted 2GB of data into my HBase cluster, but I find the regions are not
balanced. Does an insert operation always cause the region count
I suppose the load balancer is enabled and there was no pre-split.
The LB runs periodically (every 5 minutes by default), so right after bulk
data is inserted into HBase the table is unbalanced, but it will be balanced
eventually after the round-robin load balancer runs.
On Thu, Jul 4, 2013 at 1:25 PM, Ted Yu