Fwd: Hbase

2013-03-13 Thread burberry blues
Hi All, I have a requirement to make my HBase system as close to real time as possible and to get the incremental updates every 15 minutes. I need key-value pairs in a tree-structured format and have therefore chosen HBase as the better option. How frequently can I run the HBase refresh the incremental

Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread feng lu
Hi, you can use the Job#setNumReduceTasks(int tasks) method to set the number of reducers. On Wed, Mar 13, 2013 at 2:15 PM, Vikas Jadhav vikascjadha...@gmail.com wrote: Hello, As by default the Hadoop framework can shuffle a (key,value) pair to only one reducer, I have a use case where I need

Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread Vikas Jadhav
Hi, I am specifying the requirement again with an example. I have a use case where I need to shuffle the same (key,value) pair to multiple reducers. For example, if we have the pair (1,ABC) and two reducers (reducer0 and reducer1), then by default this pair will go to reducer1 (because (key %

RE: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread Samir Kumar Das Mohapatra
You can use a custom Partitioner for that. Regards, Samir. From: Vikas Jadhav [mailto:vikascjadha...@gmail.com] Sent: 13 March 2013 14:29 To: user@hadoop.apache.org Subject: Re: How to shuffle (Key,Value) pair from mapper to multiple reducer Hi, I am specifying the requirement again with

Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread samir das mohapatra
You can use a custom Partitioner for that. Regards, Samir. On Wed, Mar 13, 2013 at 2:29 PM, Vikas Jadhav vikascjadha...@gmail.com wrote: Hi, I am specifying the requirement again with an example. I have a use case where I need to shuffle the same (key,value) pair to multiple reducers. For

Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread Viral Bajaria
Do you want the pair to go to both reducers, or do you want it to go to only one but in a random fashion? AFAIK, the first is not possible. Someone on the list can correct me if I am wrong. The second is possible by just implementing your own partitioner which randomizes where each key goes (not sure what you

Re: Child error

2013-03-13 Thread Amit Sela
But the patch will work on 1.0.4, correct? On Wed, Mar 13, 2013 at 4:57 AM, George Datskos george.dats...@jp.fujitsu.com wrote: Leo, That JIRA says fix version=1.0.4 but it is not correct. The real JIRA is MAPREDUCE-2374. The actual fix version for this bug is 1.1.2. George or

Re: Child error

2013-03-13 Thread Azuryy Yu
Don't wait for the patch; it's a very simple fix. Just do it. On Mar 13, 2013 5:04 PM, Amit Sela am...@infolinks.com wrote: But the patch will work on 1.0.4, correct? On Wed, Mar 13, 2013 at 4:57 AM, George Datskos george.dats...@jp.fujitsu.com wrote: Leo, That JIRA says fix version=1.0.4 but it

Re: Small cluster setup.

2013-03-13 Thread Nitin Pawar
Cyril, how did you install Hadoop? When you start Hadoop, do you start it from the location where it is installed or from the user's home directory? Try setting HADOOP_HOME (it's deprecated, but it helps to resolve issues like where the config files are located, etc.). On Wed, Mar 13, 2013 at 7:29

Re: Small cluster setup.

2013-03-13 Thread Nitin Pawar
Or set the variable HADOOP_PREFIX to the directory where Hadoop is installed. On Wed, Mar 13, 2013 at 7:32 PM, Nitin Pawar nitinpawar...@gmail.com wrote: Cyril, how did you install Hadoop? When you start Hadoop, do you start it from the location where it is installed or from the user's home

Second node hdfs

2013-03-13 Thread Cyril Bogus
I am trying to start the datanode on the slave node but when I check the dfs I only have one node. When I check the logs on the slave node I find the following output. 2013-03-13 10:22:14,608 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:

Re: Second node hdfs

2013-03-13 Thread Nitin Pawar
Was this namenode part of any other Hadoop cluster? Did you format your namenode and forget to clean up the datanode? You can refer to Michael's blog for more details and solutions

Re: Second node hdfs

2013-03-13 Thread Mohammad Tariq
Hello Cyril, This is because your datanode has a different namespaceID from the one which the master (namenode) actually has. Have you formatted the HDFS recently? Were you able to format it properly? Every time you format HDFS, the NameNode generates a new namespaceID, which should be the same in both
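
The namespaceID mismatch described above can be checked by comparing the VERSION files that the namenode and the datanode keep in the current/ subdirectory of their storage directories. The following is a minimal sketch in plain Java, assuming those files are in Java properties format with a namespaceID entry. The class name, method names, and sample values are invented for illustration and are not part of Hadoop.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper: compares the namespaceID of two VERSION file contents. */
final class NamespaceIdCheck {

    /** Returns the namespaceID entry of a VERSION file's text, or null if absent. */
    static String namespaceId(String versionFileText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(versionFileText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props.getProperty("namespaceID");
    }

    /** True only when both texts carry the same, non-missing namespaceID. */
    static boolean sameNamespace(String nameNodeVersion, String dataNodeVersion) {
        String nn = namespaceId(nameNodeVersion);
        String dn = namespaceId(dataNodeVersion);
        return nn != null && nn.equals(dn);
    }

    public static void main(String[] args) {
        // Sample values are made up for the example.
        String nameNode = "namespaceID=1111\nstorageType=NAME_NODE\n";
        String dataNode = "namespaceID=2222\nstorageType=DATA_NODE\n";
        System.out.println("IDs match: " + sameNamespace(nameNode, dataNode));
    }
}
```

In practice the two texts would be read from the VERSION files on the master and the slave; a mismatch means the datanode's storage directory belongs to a different (or older) namespace.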

Re: How can I record some position of context in Reduce()?

2013-03-13 Thread Azuryy Yu
Do you want an n:n join or a 1:n join? On Mar 13, 2013 10:51 AM, Roth Effy effyr...@gmail.com wrote: I want to join two tables' data in the reducer, so I need to find the start of the table. Someone said DataJoinReducerBase can help me; is that right? 2013/3/13 Azuryy Yu azury...@gmail.com you cannot

Re: How can I record some position of context in Reduce()?

2013-03-13 Thread Roth Effy
I want an n:n join as a Cartesian product, but DataJoinReducerBase looks like it only supports equi-joins. I want a non-equi join, but I have no idea how for now. 2013/3/13 Azuryy Yu azury...@gmail.com Do you want an n:n join or a 1:n join? On Mar 13, 2013 10:51 AM, Roth Effy effyr...@gmail.com wrote: I want
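
One common way to approach a non-equi (theta) join on the reduce side is to buffer one table in memory and stream the other past it, testing every pair against the join condition. The sketch below shows only that pairing logic in plain Java and does not use the Hadoop API. The class and method names are hypothetical, and it assumes the buffered side fits in memory and that both tables' rows reach the same reduce call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical sketch of a nested-loop join with an arbitrary join condition. */
final class ThetaJoin {

    /**
     * Buffers the right side, then pairs each left row with every buffered
     * right row that satisfies the condition. An always-true condition yields
     * the full Cartesian product.
     */
    static <L, R> List<String> join(Iterable<L> left, Iterable<R> right,
                                    BiPredicate<L, R> condition) {
        List<R> buffered = new ArrayList<>();
        for (R r : right) {
            buffered.add(r); // an iterator can be walked only once, so keep a copy
        }
        List<String> out = new ArrayList<>();
        for (L l : left) {
            for (R r : buffered) {
                if (condition.test(l, r)) {
                    out.add(l + "," + r);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        List<Integer> b = Arrays.asList(2, 3);
        // Non-equi condition: keep pairs where the left value is smaller.
        System.out.println(join(a, b, (Integer l, Integer r) -> l < r));
    }
}
```

Buffering is the reason a "start of the table" position is not needed here: the buffered copy can be re-scanned for every row of the other table.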

Re: how to resolve conflicts with jar dependencies

2013-03-13 Thread Jane Wayne
Thanks, Luke. That was informative. Unfortunately for me, we are still on Hadoop v0.20.2. If you or anyone still has any feedback, given the new information, on how to resolve dependency conflicts for Hadoop v0.20.x, please let me know. Your help is greatly appreciated. On Tue, Mar 12, 2013 at

Re: Second node hdfs

2013-03-13 Thread Cyril Bogus
Thank you both. So what both of you were saying, which will be true, is that in order to start and synchronize the cluster, I will have to format both nodes at the same time. OK. I was working on the master node without the second node and did not format before trying to start the second one. I

Re: Second node hdfs

2013-03-13 Thread Mohammad Tariq
No, you don't have to format the NN every time you add a DN. Looking at your case, it seems the second DN was part of some other cluster and has the ID of that NN. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Wed, Mar 13, 2013 at 9:13 PM, Cyril Bogus

Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread Karthik Kambatla
How about sending 0,x to 0 and 1,x to 1; reduce 0 can act based on the value of x? On Wed, Mar 13, 2013 at 2:29 AM, Vikas Jadhav vikascjadha...@gmail.com wrote: Hello, I am not talking about a custom partitioner (a custom partitioner is involved, but I want to write the same pair multiple times) i
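
The suggestion above amounts to having the mapper emit one copy of the pair per reducer, each tagged with its target reducer, and then partitioning on the tag. The following is a plain-Java model of that idea, not Hadoop code. TaggedKey, replicate, and getPartition are hypothetical names. In a real job the routing would sit in a subclass of org.apache.hadoop.mapreduce.Partitioner, and the composite key would need to be a WritableComparable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of "emit once per reducer, route by tag". */
final class ReplicatingShuffle {

    /** Composite key: the target reducer index plus the original key. */
    static final class TaggedKey {
        final int target;
        final int originalKey;

        TaggedKey(int target, int originalKey) {
            this.target = target;
            this.originalKey = originalKey;
        }
    }

    /** Map side: produce one tagged copy of the key for every reducer. */
    static List<TaggedKey> replicate(int originalKey, int numReducers) {
        List<TaggedKey> copies = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            copies.add(new TaggedKey(r, originalKey));
        }
        return copies;
    }

    /** Partitioning rule: route on the tag alone, ignoring the original key. */
    static int getPartition(TaggedKey key, int numReducers) {
        return key.target % numReducers;
    }

    public static void main(String[] args) {
        // The pair with key 1 is sent to both of two reducers.
        for (TaggedKey k : replicate(1, 2)) {
            System.out.println("(" + k.target + "," + k.originalKey
                    + ") -> reducer" + getPartition(k, 2));
        }
    }
}
```

Each reducer then receives its own copy and can strip the tag to recover the original key. The cost is that the shuffled data volume grows with the number of copies.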

Question regarding hadoop jar command usage

2013-03-13 Thread KayVajj
I have a question regarding the hadoop jar command. In a cluster of, say, nodes n1,n2...n100, the node n1 has jar Myjar on its local file system. If I run the command hadoop jar local/path/to/Myjar Myclass other-args, is the MR job executed just on n1 or on any arbitrary node n1..n100? If it is any

Re: Question regarding hadoop jar command usage

2013-03-13 Thread bejoy . hadoop
Hi Any node would submit the job to JobTracker which distributes the jar to TaskTrackers and individual tasks are executed on nodes across the cluster. MR tasks are executed across the cluster. Regards Bejoy KS Sent from remote device, Please excuse typos -Original Message- From:

Re: Introducing Parquet: efficient columnar storage for Hadoop.

2013-03-13 Thread Dmitriy Ryaboy
Hi folks, Thanks for your interest. The Cloudera blog post has a few additional bullet points about the difference between Trevni and Parquet: http://blog.cloudera.com/blog/2013/03/introducing-parquet-columnar-storage-for-apache-hadoop/ D On Tue, Mar 12, 2013 at 3:40 PM, Luke Lu l...@apache.org

Re: Introducing Parquet: efficient columnar storage for Hadoop.

2013-03-13 Thread Abhishek Kashyap
The blog indicates Trevni is giving way to Parquet, and there will be no need for Trevni any more. Let us know if that is an incorrect interpretation. - Original Message - From: Dmitriy Ryaboy dvrya...@gmail.com To: pig-u...@hadoop.apache.org user@hadoop.apache.org Sent: Wednesday,

RE: access hadoop cluster from ubuntu on laptop

2013-03-13 Thread Danfeng Li
Hi George, Thanks. It works great. For those who want more detail about it: you just need to add export HADOOP_USER_NAME=user in hadoop-env.sh under your Hadoop conf directory. Dan From: George Datskos [mailto:george.dats...@jp.fujitsu.com] Sent: Tuesday, March 12, 2013 7:07 PM To:

Re: Question regarding hadoop jar command usage

2013-03-13 Thread KayVajj
Hi Sandy, I was going through the RunJar source code and the jar executes locally. When the jar fires a mapreduce job, the way I create JobConf is JobConf conf = new JobConf(MyJob.class); Does this set MyJar as the job jar? Can you explain what is the difference between running an MR job

Re: Question regarding hadoop jar command usage

2013-03-13 Thread Sandy Ryza
Hi Kay, Yeah, that line does set your jar as the job jar. hadoop jar expects java code to configure and submit your job. mapred job takes in a job.xml configuration file and runs the job based on that. -Sandy On Wed, Mar 13, 2013 at 11:07 AM, KayVajj vajjalak...@gmail.com wrote: Hi Sandy,
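
Passing a class to the JobConf constructor sets the job jar by looking up which jar on the classpath contains that class. The sketch below illustrates that kind of lookup with plain JDK calls only. It is an approximation for illustration, not Hadoop's actual code: JarLocator and its methods are hypothetical names, and URL decoding of the path is omitted.

```java
import java.net.URL;

/** Hypothetical helper: finds the jar file that a class was loaded from. */
final class JarLocator {

    /** Converts a class to its resource path, e.g. java/lang/String.class. */
    static String classResourceName(Class<?> clazz) {
        return clazz.getName().replace('.', '/') + ".class";
    }

    /**
     * Extracts the jar path from a URL of the form
     * jar:file:/path/to/Myjar.jar!/pkg/MyJob.class; returns null for any
     * URL that does not point inside a jar file.
     */
    static String jarPathFromUrl(String url) {
        String prefix = "jar:file:";
        int bang = url.indexOf('!');
        if (!url.startsWith(prefix) || bang < 0) {
            return null;
        }
        return url.substring(prefix.length(), bang);
    }

    /** Returns the containing jar's path, or null if the class is not in a jar. */
    static String findContainingJar(Class<?> clazz) {
        String name = classResourceName(clazz);
        ClassLoader loader = clazz.getClassLoader();
        // Classes from the bootstrap loader report a null class loader.
        URL url = (loader != null)
                ? loader.getResource(name)
                : ClassLoader.getSystemResource(name);
        return (url == null) ? null : jarPathFromUrl(url.toString());
    }

    public static void main(String[] args) {
        System.out.println(findContainingJar(JarLocator.class));
    }
}
```

When the class is loaded from a plain directory rather than a jar, the lookup returns null, which corresponds to the situation where no job jar can be inferred from the class.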

Re: Will hadoop always spread the work evenly between nodes?

2013-03-13 Thread Jeffrey Buell
I think in your case it will have to be even, because all the slots will get filled. A more interesting case is if you have 40 nodes, will you get exactly 5 slots used for each of the nodes? Or will some nodes get more than 5 mappers, and others less? I don't remember the details, but I've had

Fwd: [Hadoop-help] About Eclipse Setup

2013-03-13 Thread Mayur Patil
Hello, My system specification: Ubuntu 12.04 i686 LTS, Hadoop 1.0.4.deb (I installed from the deb file), Eclipse Juno 4.2.2, OpenJDK 6 and 7. I have downloaded your plugin [1] http://idatamining.org/resources/hadoop/hadoop-eclipse-plugin-1.0.4.jar and