Appending file makes Hadoop cluster out of storage.

2015-08-24 Thread Quan Nguyen Hong
Hi all, Have a good day! I used the code below to append to a file in HDFS from a local file. The local file size is 85MB. The Hadoop cluster (CDH 5.4.2, hdfs 2.6, replica number is 3) has 140GB free. Inside a while loop I call FSDataOutputStream out = fs.append(outFile);

Re: Unable to use ./hdfs dfsadmin -report with HDFS Federation

2015-08-24 Thread Akira AJISAKA
If you want to get the summary from a cluster, I think you can use the -fs option to specify the NameNode of that cluster. The command will be hdfs dfsadmin -fs hdfs://nn1:port -report. dfsadmin -report does not support reporting the summary of all the clusters mounted on a viewfs. Regards,

Re: MultithreadedMapper - Sharing Data Structure

2015-08-24 Thread Harsh J
The MultiThreadedMapper won't solve your problem, as all it does is run parallel maps within the same map task JVM as a non-MT one. Your data structure won't be shared across the different map task JVMs on the host, but just within the map task's own multiple threads running the map() function
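
To illustrate the point above about sharing within one JVM only, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class name SharedLookupSketch, the LOOKUP map, and the map(String) method are all hypothetical stand-ins. It only shows that threads inside a single JVM read one copy of a structure that is loaded once; a second JVM (another map task) would have to load its own copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedLookupSketch {
    // Hypothetical shared structure: one copy per JVM, visible to every thread in it.
    static final Map<String, Integer> LOOKUP = new ConcurrentHashMap<>();
    // Counts how many times the structure was actually loaded in this JVM.
    static final AtomicInteger LOADS = new AtomicInteger();

    // Load the structure at most once per JVM, however many threads ask for it.
    static synchronized void loadOnce() {
        if (LOOKUP.isEmpty()) {
            LOOKUP.put("a", 1);
            LOOKUP.put("b", 2);
            LOADS.incrementAndGet();
        }
    }

    // Stand-in for a map() call (assumption, not the Hadoop signature).
    static int map(String key) {
        loadOnce();
        return LOOKUP.getOrDefault(key, 0);
    }

    // Run the stand-in map() on several threads of the same JVM and sum the results.
    static int runThreads(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger sum = new AtomicInteger();
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> sum.addAndGet(map("a") + map("b")));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum.get();
    }

    public static void main(String[] args) {
        int total = runThreads(4);
        System.out.println("sum=" + total + " loads=" + LOADS.get());
    }
}
```

With four threads the structure is still loaded only once in this JVM, which is the kind of sharing a multithreaded map task can give; it says nothing about sharing across separate task JVMs on the same host.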

Re: Unable to use ./hdfs dfsadmin -report with HDFS Federation

2015-08-24 Thread EUGEO 2015
who is this? On Sun, Aug 23, 2015 at 8:54 PM, Todd bit1...@163.com wrote: Hi, our hadoop cluster is using HDFS Federation, but when I use the following command to report the HDFS status, it gives me the following message saying that viewfs is NOT an HDFS filesystem. Then how can I proceed to report the

Re: MultithreadedMapper - Sharing Data Structure

2015-08-24 Thread Harsh J
Perhaps combining MultiThreaded mapper along with a CombineFileInputFormat may help (it reduces total # of maps, but you get more threads per map task). On Mon, Aug 24, 2015 at 2:16 PM twinkle sachdeva twinkle.sachd...@gmail.com wrote: Hi, We have been using the jvm reuse feature for the same

Re: MultithreadedMapper - Sharing Data Structure

2015-08-24 Thread twinkle sachdeva
Hi, We have been using the jvm reuse feature for the same reason of sharing the same structure across multiple Map Tasks. A multithreaded Map task does that partially, since the same copy is used across its threads. Depending upon the hardware availability, one can get the same performance.

yarn groups issue with ambari/hortonworks

2015-08-24 Thread REYANE OUKPEDJO
Hi there, I faced a weird issue and I think there is no need to give many details other than the actual issue. When I add the user yarn to another group, I can see from within the same shell where I added him that yarn belongs to the new group. But after restarting the node manager

Questions with regards to Yarn/Hadoop

2015-08-24 Thread Omid Alipourfard
Hi, I am running a Terasort benchmark (10 GB, 25 reducers, 50 mappers) that comes with Hadoop 2.7.1. I am experiencing unexpected behavior with Yarn, which I am hoping someone can shed some light on: I have a cluster of three machines, each with 2 cores and 3.75 GB of RAM. When I run