Re: Realtime sensor's tcpip data to hadoop

2014-05-13 Thread Azuryy Yu
Hi Alex, you can try Apache Flume. On Wed, May 7, 2014 at 10:48 AM, Alex Lee eliy...@hotmail.com wrote: Sensors may send TCP/IP data to the server. Each sensor may send TCP/IP data as a stream to the server; the quantity of sensors and the data rate are both high. Firstly, how the

Re: speed of replication for under replicated blocks by namenode

2014-05-13 Thread Ravi Prakash
Hi Chandra! Replication is done according to priority (e.g. where only 1 block out of 3 remains is higher priority than when only 2 out of 3 remain). Every time a DN heartbeats into the NN, it *may* be assigned some replication work according to
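The priority rule described above can be sketched as follows. This is only an illustration of "fewest remaining replicas first", not the NameNode's actual implementation; the class `ReplicationPrioritySketch`, the `Block` type, and the block ids are all hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ReplicationPrioritySketch {
    /** Hypothetical record of a block and its replica counts. */
    static final class Block {
        final String id;
        final int liveReplicas;
        final int expectedReplicas;

        Block(String id, int liveReplicas, int expectedReplicas) {
            this.id = id;
            this.liveReplicas = liveReplicas;
            this.expectedReplicas = expectedReplicas;
        }
    }

    /**
     * Returns the ids of under-replicated blocks, most urgent first:
     * a block with 1 of 3 replicas left comes before one with 2 of 3.
     * Fully replicated blocks are skipped.
     */
    static List<String> replicationOrder(List<Block> blocks) {
        PriorityQueue<Block> queue = new PriorityQueue<>(
            Comparator.comparingInt((Block b) -> b.liveReplicas)
                      .thenComparing((Block b) -> b.id)); // tie-break for a stable order
        for (Block b : blocks) {
            if (b.liveReplicas < b.expectedReplicas) {
                queue.add(b);
            }
        }
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().id);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block("blk_A", 2, 3),
            new Block("blk_B", 1, 3),
            new Block("blk_C", 3, 3));
        System.out.println(replicationOrder(blocks)); // prints [blk_B, blk_A]
    }
}
```

In this model a heartbeating DataNode would simply be handed work from the front of that list; how much work it receives per heartbeat is not modelled here.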

Re: Conversion from MongoDB to hadoop

2014-05-13 Thread Raj K Singh
Are you using the aggregation mapReduce feature of MongoDB or some scripting language (Python) to emit key/value pairs? Raj K Singh http://in.linkedin.com/in/rajkrrsingh http://www.rajkrrsingh.blogspot.com Mobile Tel: +91 (0)9899821370 On Mon, May 12, 2014 at

Re: Data node with multiple disks

2014-05-13 Thread kishore alajangi
replication factor=1 On Tue, May 13, 2014 at 11:04 AM, SF Hadoop sfhad...@gmail.com wrote: Your question is unclear. Please restate and describe what you are attempting to do. Thanks. On Monday, May 12, 2014, Marcos Sousa falecom...@marcossousa.com wrote: Hi, I have 20 servers with

Re: Data node with multiple disks

2014-05-13 Thread Nitin Pawar
Hi Marcos, If these disks are not shared across nodes, I would not worry. Hadoop takes care of making sure data is not replicated to a single node. But if all these 20 nodes are sharing these 10 HDDs, then you may have to assign a specific disk to a specific node and make your cluster rack

Re: LVM to JBOD conversion without data loss

2014-05-13 Thread Akira AJISAKA
Hi Bharath, The steps do not look correct to me. Data loss can happen if you reduce the replication and remove a DataNode at the same time. 1) decommission a DataNode (or some DataNodes) 2) change the configuration of the DataNode(s) 3) add the DataNode(s) to the cluster. Repeat 1) - 3) for all

Re: LVM to JBOD conversion without data loss

2014-05-13 Thread Ravi Prakash
One way I can think of is decommissioning the nodes and then re-imaging them however you want to. Is that not an option? On 05/12/14 00:18, Bharath Kumar wrote: Hi, I have a query regarding JBOD. Is it possible to migrate from LVM to JBOD

all tasks failing for MR job on Hadoop 2.4

2014-05-13 Thread Gäde, Sebastian
Hi, I've set up a Hadoop 2.4 cluster with three nodes. Namenode and Resourcemanager are running on one node, Datanodes and Nodemanagers on the other two. All services are starting up without problems (as far as I can see), web apps show all nodes as running. However, I am not able to run

enable regular expression on which parameter?

2014-05-13 Thread Avinash Kujur
MAPREDUCE-5851: I can see many parameters in the DistCp class. In which parameter do we need to enable regular expressions? private static final String usage = NAME + [OPTIONS] srcurl* desturl + \n\nOPTIONS: + \n-p[rbugp] Preserve status + \n

Re: Data node with multiple disks

2014-05-13 Thread SF Hadoop
Your question is unclear. Please restate and describe what you are attempting to do. Thanks. On Monday, May 12, 2014, Marcos Sousa falecom...@marcossousa.com wrote: Hi, I have 20 servers with 10 HD with 400GB SATA. I'd like to use them to be my datanode: /vol1/hadoop/data

Using Lookup file in mapreduce

2014-05-13 Thread Siddharth Tiwari
Hi team, I have a huge lookup file, around 5 GB, and I need to use it to map users to categories in my MapReduce job. Can you suggest the best way to achieve this?

Re: Data node with multiple disks

2014-05-13 Thread Aitor Perez Cedres
If you specify a list in the property dfs.datanode.data.dir, Hadoop will distribute the data blocks among all those disks; it will not replicate data between them. If you want to use the disks as a single one, you have to create an LVM array or use any other solution to present them as a single one to
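To make the "distribute, not replicate" point concrete, here is a minimal sketch that models a comma-separated dfs.datanode.data.dir value and spreads blocks over the listed directories in round-robin order, so each block lands on exactly one disk. It is a simplified model, not DataNode code: the class name, the block ids, and the second path /vol2/hadoop/data are assumptions for illustration, and a real DataNode's volume choice also takes available space into account.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DataDirRoundRobinSketch {
    /** Splits a comma-separated directory list, trimming blanks. */
    static List<String> parseDataDirs(String value) {
        List<String> dirs = new ArrayList<>();
        for (String part : value.split(",")) {
            String dir = part.trim();
            if (!dir.isEmpty()) {
                dirs.add(dir);
            }
        }
        return dirs;
    }

    /**
     * Assigns each block to exactly one directory, cycling through the
     * directories in order. No block is stored in more than one directory.
     */
    static Map<String, List<String>> placeBlocks(List<String> dirs, List<String> blockIds) {
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("at least one data directory is required");
        }
        Map<String, List<String>> placement = new LinkedHashMap<>();
        for (String dir : dirs) {
            placement.put(dir, new ArrayList<>());
        }
        for (int i = 0; i < blockIds.size(); i++) {
            String dir = dirs.get(i % dirs.size());
            placement.get(dir).add(blockIds.get(i));
        }
        return placement;
    }

    public static void main(String[] args) {
        // Hypothetical property value; only /vol1/hadoop/data appears in the thread.
        List<String> dirs = parseDataDirs("/vol1/hadoop/data, /vol2/hadoop/data");
        Map<String, List<String>> placement =
            placeBlocks(dirs, Arrays.asList("b1", "b2", "b3"));
        System.out.println(placement);
        // prints {/vol1/hadoop/data=[b1, b3], /vol2/hadoop/data=[b2]}
    }
}
```

Losing one disk in this model loses only the blocks placed on it, which is why redundancy still has to come from the HDFS replication factor across nodes.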

Questions about Hadoop logs and mapred.local.dir

2014-05-13 Thread sam liu
Hi Experts, 1. The size of mapred.local.dir is big (30 GB). What methods can clean it correctly? 2. Are the logs of NameNode/DataNode/JobTracker/TaskTracker all rolling logs? What is their maximum size? I cannot find the specific settings for them in log4j.properties. 3. I find the

Re: No job can run in YARN (Hadoop-2.2)

2014-05-13 Thread Tao Xiao
The *FileNotFoundException* was thrown when I tried to submit a job calculating PI. No such exception is thrown when I submit a wordcount job, but I can still see Exception from container-launch... and any other job throws such exceptions. Every job runs successfully when I

Re: Data node with multiple disks

2014-05-13 Thread Marcos Sousa
Yes, I don't want to replicate, just use them as one disk. Isn't it possible to make this work? Best regards, Marcos On Tue, May 13, 2014 at 6:55 AM, Rahul Chaudhari rahulchaudhari0...@gmail.com wrote: Marcos, While configuring hadoop, the dfs.datanode.data.dir property in hdfs-default.xml

Re: enable regular expression on which parameter?

2014-05-13 Thread Ravi Prakash
Avinash! That JIRA is still open and does not seem to have been fixed. There are a lot of issues with providing regexes though. A long-standing issue has been https://issues.apache.org/jira/browse/HDFS-13 which makes it even harder. HTH, Ravi On