[ANNOUNCE] Apache MRUnit 1.0.0 released

2013-04-15 Thread Dave Beech
The Apache MRUnit team is pleased to announce the release of MRUnit 1.0.0, a Java library that helps developers unit test Apache Hadoop MapReduce jobs. This is the fifth release of Apache MRUnit, and the first since graduation from the Incubator. The release is available here:
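
For readers new to the library, here is a minimal sketch of an MRUnit MapDriver test; the toy mapper and the input/output pairs are hypothetical illustrations, not part of the announcement:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Test;

    public class WordCountMapperTest {
        // Toy mapper: emits (line, 1) for each input line.
        static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.write(value, new IntWritable(1));
            }
        }

        @Test
        public void testMapper() throws IOException {
            MapDriver.newMapDriver(new WordCountMapper())
                     .withInput(new LongWritable(0), new Text("hello"))
                     .withOutput(new Text("hello"), new IntWritable(1))
                     .runTest();  // fails the test if the actual output differs
        }
    }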

Need help about the configuration optimization

2013-04-15 Thread 姚吉龙
Hi, I am new to Hadoop. We now have 32 nodes for Hadoop study, and I need to speed up Hadoop processing by finding the best configuration. For example: io.sort.mb, io.sort.record.percent, etc. But I do not know how to start with so many parameters available for optimization. BRs
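
As a starting point, a minimal sketch of setting a few of these sort-buffer parameters programmatically on a Hadoop 1.x job; the values shown are illustrative guesses, not tuned recommendations (the same keys can also go in mapred-site.xml):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class TunedJobSetup {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setInt("io.sort.mb", 200);                 // map-side sort buffer size, MB
            conf.setFloat("io.sort.record.percent", 0.1f);  // share of buffer for record metadata
            conf.setFloat("io.sort.spill.percent", 0.8f);   // buffer fill level that triggers a spill
            Job job = new Job(conf, "tuned-job");
            // ... set mapper/reducer/input/output here, then submit.
        }
    }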

Submitting mapreduce and nothing happens

2013-04-15 Thread Amit Sela
Hi all, I'm trying to submit a mapreduce job remotely using job.submit(). I get the following: [WARN ] org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. [INFO ] org.apache.hadoop.mapred.JobClient
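
For context, a minimal sketch of what a remote submission typically needs on Hadoop 1.x; the host names and ports are hypothetical, and a mismatch here is one common reason the cluster never sees the job:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class RemoteSubmit {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical addresses: point the client at the remote cluster.
            conf.set("fs.default.name", "hdfs://namenode-host:9000");
            conf.set("mapred.job.tracker", "jobtracker-host:9001");
            Job job = new Job(conf, "remote-job");
            job.setJarByClass(RemoteSubmit.class);  // so the job jar is shipped to the cluster
            job.submit();                           // returns immediately; does not wait
            System.out.println("Submitted " + job.getJobID());
        }
    }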

Re: block over-replicated

2013-04-15 Thread Yanbo Liang
You can reference this function; it removes excess replicas from the map. public void removeStoredBlock(Block block, DatanodeDescriptor node) 2013/4/12 lei liu liulei...@gmail.com I use hadoop-2.0.3. I find that when a block is over-replicated, the replicas to be added to excessReplicateMap

jobtracker not starting - access control exception - folder not owned by me (it claims)

2013-04-15 Thread Julian Bui
Hello hadoop users, I can't start my jobtracker and am getting an org.apache.hadoop.security.AccessControlException saying that my hdfs://localhost:9000/home/jbu/hadoop_local_install/hadoop-1.0.4/tmp/mapred/system is not owned by jbu (me, my user). However, I checked the folder and it is indeed

Re: jobtracker not starting - access control exception - folder not owned by me (it claims)

2013-04-15 Thread Azuryy Yu
I suppose you start-mapred as user mapred. Then: hadoop fs -chown -R mapred:mapred /home/jbu/hadoop_local_install/hadoop-1.0.4/tmp/mapred/system This is caused by the fair scheduler; please see MAPREDUCE-4398 (https://issues.apache.org/jira/browse/MAPREDUCE-4398). On Mon, Apr 15, 2013 at 6:43 PM,

How to set Rack Id of DataNodes?

2013-04-15 Thread Mohammad Mustaqeem
Hello everyone, I want to set the Rack Id of each DataNode. I have read somewhere that we have to write a script that gives the Rack Id of nodes. I want to confirm that the input of that script will be the IP address of a DataNode and the output will be the Rack Id. Is that right? -- *With regards ---*

Adjusting tasktracker heap size?

2013-04-15 Thread MARCOS MEDRADO RUBINELLI
Hi, I am currently tuning a cluster, and I haven't found much information on what factors to consider while adjusting the heap size of tasktrackers. Is it a direct multiple of the number of map+reduce slots? Is there anything else I should consider? Thank you, Marcos

RE: How to set Rack Id of DataNodes?

2013-04-15 Thread Vijay Thakorlal
Hi Mohammad, yes, that's correct: your rack awareness script takes the IP address of a node and returns the rack name/id. You then just have to ensure the script is executable and referenced (using an absolute path) in the parameter topology.script.file.name in core-site.xml. Regards,
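
To make this concrete, a minimal sketch of the wiring with a hypothetical script path and rack name; the script receives IP addresses (or host names) as arguments and prints one rack id per argument on stdout:

    <!-- core-site.xml -->
    <property>
        <name>topology.script.file.name</name>
        <value>/etc/hadoop/topology.sh</value>
    </property>

    $ /etc/hadoop/topology.sh 192.168.1.10
    /rack1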

Re: Submitting mapreduce and nothing happens

2013-04-15 Thread Harsh J
When you say nothing happens, what exactly do you mean? The client doesn't print anything, or the cluster doesn't run anything? On Mon, Apr 15, 2013 at 3:36 PM, Amit Sela am...@infolinks.com wrote: Hi all, I'm trying to submit a mapreduce job remotely using job.submit(). I get the following:

Re: jobtracker not starting - access control exception - folder not owned by me (it claims)

2013-04-15 Thread Harsh J
The folder the JT warns about is on HDFS, not the local filesystem. On Mon, Apr 15, 2013 at 4:13 PM, Julian Bui julian...@gmail.com wrote: Hello hadoop users, I can't start my jobtracker and am getting an org.apache.hadoop.security.AccessControlException saying that my

Re: Submitting mapreduce and nothing happens

2013-04-15 Thread Harsh J
That's interesting; is the JT you're running on the cluster started with the ID 201304150711 or something else? On Mon, Apr 15, 2013 at 6:47 PM, Amit Sela am...@infolinks.com wrote: The client prints the two lines I posted and the cluster shows nothing. Not even incrementing the number of

Re: Submitting mapreduce and nothing happens

2013-04-15 Thread Amit Sela
This is the JT ID, and there is no problem running jobs from the command line, just remotely. On Apr 15, 2013 4:24 PM, Harsh J ha...@cloudera.com wrote: That's interesting; is the JT you're running on the cluster started with the ID 201304150711 or something else? On Mon, Apr 15, 2013 at 6:47 PM,

Re: regarding hadoop

2013-04-15 Thread Rajashree Bagal
I have checked the earlier error and solved it after seeing the logs, but I still have some problems. Many of the solutions suggest checking the number of entries in /etc/hosts, but that is not confirmed, so I am trying to get replies from the mailing list. arpit@arpit:~/hadoop-1.0.3$ bin/hadoop jar hadoop-examples-1.0.3.jar

Can we setup Flume in Apache Hadoop?

2013-04-15 Thread Ramasubramanian Narayanan
Hi, can we set up Flume in Apache Hadoop? If yes, can someone share the URL for the steps to do so? Thanks and Regards, Rams

Re: Can we setup Flume in Apache Hadoop?

2013-04-15 Thread Nitin Pawar
In Apache Hadoop? What do you want to do? On Mon, Apr 15, 2013 at 8:10 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Hi, can we set up Flume in Apache Hadoop? If yes, can someone share the URL for the steps to do so? Thanks and Regards, Rams -- Nitin

Re: Can we setup Flume in Apache Hadoop?

2013-04-15 Thread Harsh J
Yes, Apache Flume works with Hadoop; see their user guide at http://flume.apache.org/FlumeUserGuide.html On Mon, Apr 15, 2013 at 8:10 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Hi, can we set up Flume in Apache Hadoop? If yes, can someone share the URL for the
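
As a taste of the setup, a minimal Flume NG agent sketch that tails a log file into HDFS; the agent name, file paths and NameNode address are hypothetical, and the full property reference is in the user guide linked above:

    # flume.conf -- hypothetical single-agent example
    agent.sources = src1
    agent.channels = ch1
    agent.sinks = sink1

    agent.sources.src1.type = exec
    agent.sources.src1.command = tail -F /var/log/app.log
    agent.sources.src1.channels = ch1

    agent.channels.ch1.type = memory

    agent.sinks.sink1.type = hdfs
    agent.sinks.sink1.hdfs.path = hdfs://namenode-host:9000/flume/events
    agent.sinks.sink1.channel = ch1

An agent like this is started with something like: flume-ng agent --conf conf --conf-file flume.conf --name agent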

Re: How to run hadoop jar command in a clustered environment

2013-04-15 Thread Chris Nauroth
Hello Thoihen, I'm moving this discussion from common-dev (questions about developing Hadoop) to user (questions about using Hadoop). If you haven't already seen it, then I recommend reading the cluster setup documentation. It's a bit different depending on the version of the Hadoop code that

Re: Adjusting tasktracker heap size?

2013-04-15 Thread Amal G Jose
It depends on the type of job that is frequently submitted and the RAM size of the machine. Heap size of tasktracker = (map slots + reduce slots) * JVM size. We can adjust this according to our requirements to fine-tune our cluster. This is my thought. On Mon, Apr 15, 2013 at 4:40 PM, MARCOS MEDRADO RUBINELLI
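
A worked instance of Amal's formula, with hypothetical slot counts and child JVM size (these correspond to mapred.tasktracker.map.tasks.maximum, mapred.tasktracker.reduce.tasks.maximum and the -Xmx in mapred.child.java.opts):

    (8 map slots + 4 reduce slots) * 512 MB per child JVM = 6 GB for task JVMs

Note the child tasks run in separate JVM processes, so this 6 GB is memory the node needs in addition to the TaskTracker daemon's own heap (HADOOP_HEAPSIZE in hadoop-env.sh) and any co-located DataNode.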

Re: R environment with Hadoop

2013-04-15 Thread Amal G Jose
Rhipe is good. From my experience, Rhipe is fine-tuned and jobs execute faster than RMR. RMR execution is just like a streaming job. Rhipe 0.73 will work on CDH4 MR1. Rhipe versions below 0.73 will not work on CDH4. On Sun, Apr 14, 2013 at 12:16 PM, Håvard Wahl Kongsgård

Re: jps show nothing but hadoop still running

2013-04-15 Thread Amal G Jose
One more issue can occur if your system has more than one Java installed. OpenJDK may be present as the default (jps is not present in OpenJDK); jps comes with Sun Java. If the running process is using OpenJDK, then that process will not be listed by jps. The jps will
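
A quick way to check which Java (and which jps) your shell resolves to; the paths here are hypothetical examples from a Linux box:

    $ readlink -f $(which java)
    /usr/lib/jvm/java-6-openjdk/jre/bin/java    # OpenJDK is the default here
    $ which jps || echo "jps not on PATH"
    $ /usr/java/jdk1.6.0_45/bin/jps             # run jps from the Sun JDK directly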

Re: Submitting mapreduce and nothing happens

2013-04-15 Thread Amit Sela
Reading my own message I understand that maybe it's not clear so just to clarify - the previously mentioned JT ID is indeed the correct ID. Thanks. On Apr 15, 2013 4:35 PM, Amit Sela am...@infolinks.com wrote: This is the JT ID and there is no problem running jobs from command line, just

Re: Bloom Filter analogy in SQL

2013-04-15 Thread Anupam Singh
Many join implementations use bloom filters built on the smaller table to eliminate rows from the larger tables in SQL queries. Many industrial RDBMS engines will show the use of bloom filters in SQL explain plans. For instance, Oracle explain plans call these join bloom filters as
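
A minimal sketch of the mechanism Anupam describes, assuming a toy bloom filter (fixed-size bit set, two cheap hash functions) built over the small side's join keys and probed by the large side's rows; real engines size the filter and choose hash families far more carefully:

    import java.util.Arrays;
    import java.util.BitSet;

    public class BloomJoinSketch {
        private static final int BITS = 1 << 16;
        private final BitSet bits = new BitSet(BITS);

        private int h1(String k) { return (k.hashCode() & 0x7fffffff) % BITS; }
        private int h2(String k) { return ((k.hashCode() * 31 + 17) & 0x7fffffff) % BITS; }

        void add(String key) { bits.set(h1(key)); bits.set(h2(key)); }

        // May say true for a key never added (false positive),
        // but never says false for a key that was added.
        boolean mightContain(String key) { return bits.get(h1(key)) && bits.get(h2(key)); }

        public static void main(String[] args) {
            BloomJoinSketch filter = new BloomJoinSketch();
            // Build phase: keys from the smaller join side.
            for (String k : Arrays.asList("alice", "bob")) filter.add(k);
            // Probe phase: large-side rows are discarded early when the filter says no.
            for (String k : Arrays.asList("alice", "carol", "bob", "dave")) {
                if (filter.mightContain(k)) System.out.println("join candidate: " + k);
            }
        }
    }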

Re: jobtracker not starting - access control exception - folder not owned by me (it claims)

2013-04-15 Thread Julian Bui
I thought it was talking about HDFS, but I don't recall ever making that directory, plus I thought everything was under /user/$USER in HDFS? I just stubbornly assumed it was talking about my local fs :/ Anyway, thanks guys! On Mon, Apr 15, 2013 at 6:12 AM, Harsh J ha...@cloudera.com

CDR files

2013-04-15 Thread oualid ait wafli
Hi, does someone use the Hadoop ecosystem (Hadoop, MapReduce, Hive, Pig, HBase, Flume...) to deploy and analyze CDR (call detail records) files? Thanks

searching any HttpFS Gateway Java Client.

2013-04-15 Thread Kenji Kawaguchi
Hi, I have a hadoop 2.0.3-alpha cluster (with security). I see the HttpFS Gateway client (HttpFSFileSystem.java); it supports http but not https. I would like to use https but cannot find any HttpFS Gateway Java client. Thanks for the help, Kenji

Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
Hi, I am new to Hadoop and set up Hadoop with the tarball. I have 5 nodes in the cluster: 2 NN nodes with QJM (3 Journal Nodes, one of them on a DN node) and 3 DN nodes with zookeepers. It works fine. When I reboot one data node machine which includes zookeeper, and after that restart all

Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Ted Yu
I think this question would be more appropriate for the HBase user mailing list. Moving hadoop user to bcc. Please tell us the HBase version you are using. Thanks On Mon, Apr 15, 2013 at 6:51 PM, dylan dwld0...@gmail.com wrote: Hi, I am new to hadoop, and set up hadoop with

Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
It is hbase-0.94.2-cdh4.2.0. From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: April 16, 2013 9:55 To: u...@hbase.apache.org Subject: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again I think this question would be more appropriate

Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Azuryy Yu
This is a zookeeper issue. Please paste the zookeeper log here. Thanks. On Tue, Apr 16, 2013 at 9:58 AM, dylan dwld0...@gmail.com wrote: It is hbase-0.94.2-cdh4.2.0. *From:* Ted Yu [mailto:yuzhih...@gmail.com] *Sent:* April 16, 2013 9:55 *To:* u...@hbase.apache.org *Subject:* Re: Region has

Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
How do I check the zookeeper log? It is a binary file; how do I transform it into a normal log? I found "org.apache.zookeeper.server.LogFormatter", but how do I run it? From: Azuryy Yu [mailto:azury...@gmail.com] Sent: April 16, 2013 10:01 To: user@hadoop.apache.org Subject: Re: Re:
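
For dylan's question, a sketch of invoking the formatter on a ZooKeeper transaction log; the jar versions and log file name below are hypothetical and vary by install (snapshots have a separate org.apache.zookeeper.server.SnapshotFormatter):

    $ cd $ZOOKEEPER_HOME
    $ java -cp zookeeper-3.4.5.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar \
        org.apache.zookeeper.server.LogFormatter /path/to/dataDir/version-2/log.100000001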

Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Azuryy Yu
It is located under hbase-home/logs/ if your zookeeper is managed by hbase. But I noticed you configured QJM; do your QJM and HBase share the same ZK cluster? If so, then just paste your QJM ZK configuration from hdfs-site.xml and your HBase ZK configuration from hbase-site.xml. On Tue, Apr

Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Azuryy Yu
And paste the ZK configuration in zookeeper_home/conf/zoo.cfg. On Tue, Apr 16, 2013 at 10:42 AM, Azuryy Yu azury...@gmail.com wrote: It is located under hbase-home/logs/ if your zookeeper is managed by hbase. But I noticed you configured QJM; do your QJM and HBase share the same ZK

Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
QJM zk configuration in core-site.xml:
<property>
  <name>ha.zookeeper.quorum</name>
  <value>Slave01:2181,Slave02:2181,Slave03:2181</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>Slave01,Slave02,Slave03</value>
</property>
hbase-home/logs has no zookeeper log. My

Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is

Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
I use hbase hbck -fix to fix hbase. It shows: RecoverableZooKeeper: The identifier of this process is 11286@Master 13/04/16 10:58:34 INFO zookeeper.ClientCnxn: Opening socket connection to server Slave02/192.168.75.243:2181. Will not attempt to authenticate using SASL (Unable to locate a

Re: Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
When I use hbase shell, it always shows: ERROR: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.PleaseHoldException: Master is initializing From: Azuryy Yu [mailto:azury...@gmail.com] Sent: April 16, 2013 10:59 To: user@hadoop.apache.org Subject: Re: Re: Re: Re: Region has been CLOSING

Re: Re: Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Azuryy Yu
Then, can you find the zookeeper log under zookeeper_home/zookeeper.out? On Tue, Apr 16, 2013 at 11:04 AM, dylan dwld0...@gmail.com wrote: When I use hbase shell, it always shows: ERROR: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.PleaseHoldException: Master is

Re: Re: Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread dylan
Yes, I have just discovered it. I found the Slave01 and Slave03 zookeeper.out under zookeeper_home/bin/, but on Slave02 (which was rebooted before), zookeeper_home is under the / directory. After the reboot, the Slave02 zookeeper.out shows: WARN [RecvWorker:1:QuorumCnxManager$RecvWorker@765] - Interrupting SendWorker

threads quota is exceeded question

2013-04-15 Thread rauljin
Hi: The hadoop cluster is running the balancer, and one datanode, 172.16.80.72, shows: Datanode: Not able to copy block -507744952197054725 to /172.16.80.73:51658 because threads quota is exceeded. ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:

Re: Re: Re: Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Azuryy Yu
I cannot find any useful information in the pasted logs. On Tue, Apr 16, 2013 at 11:22 AM, dylan dwld0...@gmail.com wrote: Yes, I have just discovered it. I found the Slave01 and Slave03 zookeeper.out under zookeeper_home/bin/, but on Slave02 (which was rebooted before), zookeeper_home

Re: Re: Re: Re: Re: Re: Region has been CLOSING for too long, this should eventually complete or the server will expire, send RPC again

2013-04-15 Thread Samir Ahmic
Hi Azuryy, these actions may resolve the RIT issue: 1. Try to restart the master. 2. If step 1 doesn't resolve the issue, run 'hbase zkcli', remove the hbase znode with 'rmr /hbase', and then restart the cluster. On Tue, Apr 16, 2013 at 5:47 AM, Azuryy Yu azury...@gmail.com wrote: I cannot find any useful information