Hi All
I have a requirement to make my HBase system as close to real time as possible and to get incremental updates every 15 minutes.
I need key-value pairs in a tree-structure format, and hence chose HBase as the better option.
How frequently can I run the HBase incremental refresh
Hi
you can use the Job#setNumReduceTasks(int tasks) method to set the number of
reducers.
On Wed, Mar 13, 2013 at 2:15 PM, Vikas Jadhav vikascjadha...@gmail.com wrote:
Hello,
By default the Hadoop framework shuffles each (key,value) pair to only one
reducer.
I have a use case where I need
Hi
I am specifying the requirement again with an example.
I have a use case where I need to shuffle the same (key,value) pair to multiple
reducers.
For example, if we have the pair (1,ABC) and two reducers (reducer0 and
reducer1), then
by default this pair will go to reducer1 (because (key %
You can use a custom Partitioner for the same.
Regards,
Samir.
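A minimal, self-contained sketch of what such a Partitioner does. In a real job you would extend org.apache.hadoop.mapreduce.Partitioner, override getPartition, and register it with job.setPartitionerClass; the class below only mirrors that contract (the logic shown is the same as Hadoop's default HashPartitioner):

```java
// Illustrative stand-in for Hadoop's Partitioner contract, not the real API.
// A custom Partitioner decides which reducer a (key,value) pair goes to.
public class SimplePartitioner {
    // Same arithmetic as the default HashPartitioner: mask the sign bit,
    // then take the key's hash modulo the number of reducers.
    public int getPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

The important property is that the result depends only on the key and the reducer count, so every record with the same key lands on the same reducer.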
From: Vikas Jadhav [mailto:vikascjadha...@gmail.com]
Sent: 13 March 2013 14:29
To: user@hadoop.apache.org
Subject: Re: How to shuffle (Key,Value) pair from mapper to multiple reducer
Do you want the pair to go to both reducers, or do you want it to go to only
one, but in a random fashion?
AFAIK, the first is not possible. Someone on the list can correct me if I am wrong.
The second is possible by just implementing your own partitioner which randomizes
where each key goes (not sure what you
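A sketch of that second option (self-contained and illustrative; a real implementation would extend org.apache.hadoop.mapreduce.Partitioner and ignore the key entirely):

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative random partitioner: each record is routed to a uniformly
// random reducer. Note this deliberately breaks the usual guarantee that
// one reducer sees all values for a given key.
public class RandomPartitionerSketch {
    public int getPartition(String key, int numPartitions) {
        return ThreadLocalRandom.current().nextInt(numPartitions);
    }
}
```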
But the patch will work on 1.0.4 correct ?
On Wed, Mar 13, 2013 at 4:57 AM, George Datskos
george.dats...@jp.fujitsu.com wrote:
Leo
That JIRA says fix version=1.0.4 but it is not correct. The real JIRA
is MAPREDUCE-2374.
The actual fix version for this bug is 1.1.2.
George
or
don't wait for the patch, it's a very simple fix. Just do it.
On Mar 13, 2013 5:04 PM, Amit Sela am...@infolinks.com wrote:
Cyril,
how did you install hadoop?
when you start hadoop, do you start it from the location where it is
installed or from the user's home directory?
try setting HADOOP_HOME (it's deprecated, but it helps resolve issues like
where the config files are located, etc.)
On Wed, Mar 13, 2013 at 7:29
or set this variable HADOOP_PREFIX to the directory where hadoop is
installed.
On Wed, Mar 13, 2013 at 7:32 PM, Nitin Pawar nitinpawar...@gmail.com wrote:
I am trying to start the datanode on the slave node but when I check the
dfs I only have one node.
When I check the logs on the slave node I find the following output.
2013-03-13 10:22:14,608 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
was this namenode part of any other hadoop cluster?
did you format your namenode and forget to clean up the datanode?
you can refer Michael's blog for more details and solutions
Hello Cyril,
This is because your datanode has a different namespaceID from the
one which the master (namenode) actually has. Have you formatted the HDFS
recently? Were you able to format it properly? Every time you format HDFS,
the NameNode generates a new namespaceID, which should be the same in both
do you want an n:n join or a 1:n join?
On Mar 13, 2013 10:51 AM, Roth Effy effyr...@gmail.com wrote:
I want to join data from two tables in the reducer, so I need to find the start of
each table.
Someone said DataJoinReducerBase can help me; is that right?
2013/3/13 Azuryy Yu azury...@gmail.com
you cannot
I want an n:n join as a Cartesian product, but DataJoinReducerBase looks
like it only supports equi-joins.
I want a non-equi join, but I have no idea how to do it now.
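One common workaround (not from this thread, just a sketch) is to tag each record with its source table in the mapper so that, for a given grouping key, the reducer can split the values by table and emit the full cross product. The tag format "A|"/"B|" below is illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Reducer-side Cartesian product sketch: given all tagged values that
// arrived at one reducer call, separate them by source table and pair
// every A-record with every B-record.
public class CrossJoinSketch {
    public static List<String> crossJoin(List<String> taggedValues) {
        List<String> left = new ArrayList<>();
        List<String> right = new ArrayList<>();
        for (String v : taggedValues) {
            if (v.startsWith("A|")) left.add(v.substring(2));
            else if (v.startsWith("B|")) right.add(v.substring(2));
        }
        List<String> out = new ArrayList<>();
        for (String l : left)
            for (String r : right)
                out.add(l + "," + r);   // one output row per (A,B) pair
        return out;
    }
}
```

A non-equi predicate can then be applied as a filter inside the inner loop before emitting each pair.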
2013/3/13 Azuryy Yu azury...@gmail.com
thanks luke. that was informative.
unfortunately for me, we are still on hadoop v0.20.2.
if you or anyone still have any feedback, given the new information,
on how to resolve dependency conflicts for hadoop v0.20.x, please let
me know.
your help is greatly appreciated.
On Tue, Mar 12, 2013 at
Thank you both.
So what both of you are saying is that in order to
start and synchronize the cluster, I would have to format both nodes at the
same time, OK.
I was working on the master node without the second node and did not format
before trying to start the second one.
I
No, you don't have to format the NN every time you add a DN. Looking at your
case, it seems the second DN was part of some other cluster and has the ID
of that NN.
Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Wed, Mar 13, 2013 at 9:13 PM, Cyril Bogus
How about sending 0,x to 0 and 1,x to 1; reducer 0 can act based on the
value of x?
On Wed, Mar 13, 2013 at 2:29 AM, Vikas Jadhav vikascjadha...@gmail.com wrote:
Hello, I am not talking about a custom partitioner (a custom partitioner is
involved, but I want to write the same pair more than once)
I
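One way to get the same pair to every reducer (a sketch, not from this thread) is to emit it once per reducer from the mapper, prefixing each copy's key with the target partition number; a matching custom partitioner then routes on that prefix, and the reducer strips it. The "r|key" composite format below is illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Mapper-side fan-out sketch: emit one tagged copy of the key per reducer,
// so the same (key,value) pair reaches every reducer.
public class FanOutSketch {
    public static List<String> fanOutKeys(String key, int numReducers) {
        List<String> tagged = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            tagged.add(r + "|" + key);   // e.g. "0|1" and "1|1" for key "1"
        }
        return tagged;
    }

    // Partitioner-side sketch: route purely on the numeric prefix.
    public static int partitionFor(String taggedKey, int numReducers) {
        int bar = taggedKey.indexOf('|');
        return Integer.parseInt(taggedKey.substring(0, bar)) % numReducers;
    }
}
```

The cost is that the shuffle carries numReducers copies of each such pair, so this only makes sense for data that genuinely must be replicated.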
I have a question regarding the hadoop jar command. In a cluster of say
nodes n1,n2...n100
the node n1 has jar Myjar on its local file system.
If I run the command
hadoop jar local/path/to/Myjar Myclass other-args
Is the MR job executed just on n1 or any arbitrary node n1..n100?
If it is any
Hi
Any node can submit the job to the JobTracker, which distributes the jar to
the TaskTrackers; the individual MR tasks are then executed on nodes across the cluster.
Regards
Bejoy KS
Sent from remote device, Please excuse typos
-Original Message-
From:
Hi folks,
Thanks for your interest. The Cloudera blog post has a few additional
bullet points about the difference between Trevni and Parquet:
http://blog.cloudera.com/blog/2013/03/introducing-parquet-columnar-storage-for-apache-hadoop/
D
On Tue, Mar 12, 2013 at 3:40 PM, Luke Lu l...@apache.org
The blog indicates Trevni is giving way to Parquet, and there will be no need
for Trevni any more. Let us know if that is an incorrect interpretation.
- Original Message -
From: Dmitriy Ryaboy dvrya...@gmail.com
To: pig-u...@hadoop.apache.org user@hadoop.apache.org
Sent: Wednesday,
Hi, George,
Thanks. It works great.
For those who want more detail: you just need to add
export HADOOP_USER_NAME=user
in hadoop-env.sh under your Hadoop conf directory.
Dan
From: George Datskos [mailto:george.dats...@jp.fujitsu.com]
Sent: Tuesday, March 12, 2013 7:07 PM
To:
Hi Sandy,
I was going through the RunJar source code and the jar executes locally.
When the jar fires a mapreduce job,
the way I create JobConf is
JobConf conf = new JobConf(MyJob.class);
Does this set MyJar as the job jar?
Can you explain what is the difference between running an MR job
Hi Kay,
Yeah, that line does set your jar as the job jar. hadoop jar expects
java code to configure and submit your job. mapred job takes in a
job.xml configuration file and runs the job based on that.
-Sandy
On Wed, Mar 13, 2013 at 11:07 AM, KayVajj vajjalak...@gmail.com wrote:
Hi Sandy,
I think in your case it will have to be even, because all the slots will get
filled. A more interesting case is if you have 40 nodes: will you get exactly 5
slots used on each of the nodes? Or will some nodes get more than 5 mappers,
and others fewer? I don't remember the details, but I've had
Hello,
My System specification:
Ubuntu 12.04 i686 LTS, Hadoop 1.0.4.deb (I installed from deb file),
eclipse Juno 4.2.2, OpenJDK 6 and 7
I have downloaded your plugin
http://idatamining.org/resources/hadoop/hadoop-eclipse-plugin-1.0.4.jar and