Anatomy of read in hdfs

2017-04-06 Thread Sidharth Kumar
Hi Genies, I have a small doubt: is the HDFS read operation a parallel or a sequential process? From my understanding it should be parallel, but the anatomy of a read in "Hadoop: The Definitive Guide", 4th edition, says "Data is streamed from the datanode back to the client, which calls read()
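
One way to picture the passage quoted in the question is a single client stream that consumes a file block by block, in order. The sketch below is only a toy model under that assumption: `BlockSequentialStream` and the byte strings are hypothetical and are not part of the HDFS client API. It shows sequential reading within one stream; it says nothing about several clients or tasks reading different blocks at the same time.

```python
class BlockSequentialStream:
    """Toy stand-in for one client-side stream over a file split into blocks."""

    def __init__(self, blocks):
        self.blocks = blocks      # list of bytes objects, one per block
        self.block_index = 0      # block currently being read
        self.offset = 0           # position inside the current block
        self.blocks_opened = []   # order in which blocks were first touched

    def read(self, size):
        """Return up to `size` bytes from the current block, b"" at end of file."""
        while self.block_index < len(self.blocks):
            block = self.blocks[self.block_index]
            if self.offset < len(block):
                if self.offset == 0:
                    self.blocks_opened.append(self.block_index)
                chunk = block[self.offset:self.offset + size]
                self.offset += len(chunk)
                return chunk
            # Current block is exhausted: move on to the next one.
            self.block_index += 1
            self.offset = 0
        return b""


# Hypothetical file made of three small "blocks".
stream = BlockSequentialStream([b"aaaa", b"bbbb", b"cc"])
data = b""
while True:
    chunk = stream.read(3)   # the caller calls read() repeatedly
    if not chunk:
        break
    data += chunk
```

In this model a read never spans two blocks, and the next block is touched only after the previous one is fully consumed.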

Customize Sqoop default property

2017-04-06 Thread Sidharth Kumar
Hi, I am importing data from an RDBMS into Hadoop using Sqoop, but my RDBMS data is multi-valued and contains the "," special character. While importing the data, Sqoop by default separates the columns with the "," character. Is there any property through which we can customize
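
The problem described here can be reproduced without Sqoop. The sketch below uses a hypothetical three-column row and plain Python to show why a comma inside a value breaks comma-delimited output, and how a different field delimiter or enclosing quotes avoids it. Sqoop's import tool documents output formatting arguments such as `--fields-terminated-by` and `--optionally-enclosed-by` for this purpose; the code does not call Sqoop and only illustrates the idea.

```python
import csv
import io

# Hypothetical row: the second column is multi-valued and contains commas.
row = ["42", "red,green,blue", "2017-04-06"]


def write_row(fields, delimiter):
    """Join fields with the delimiter, with no quoting or escaping."""
    return delimiter.join(fields)


def write_row_enclosed(fields, delimiter=",", quotechar='"'):
    """Write one row, enclosing only the fields that contain the delimiter."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, quotechar=quotechar,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="")
    writer.writerow(fields)
    return buf.getvalue()


comma_line = write_row(row, ",")        # ambiguous: value commas look like separators
tab_line = write_row(row, "\t")         # a delimiter that does not occur in the data
enclosed_line = write_row_enclosed(row)  # comma delimiter, quoted where needed

comma_fields = comma_line.split(",")
tab_fields = tab_line.split("\t")
enclosed_fields = next(csv.reader([enclosed_line]))
```

Splitting the plain comma-delimited line yields more fields than the row had columns, while the tab-delimited and enclosed variants round-trip to the original three values.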

Re: Physical memory (bytes) snapshot counter question - how to get maximum memory used in reduce task

2017-04-06 Thread Miklos Szegedi
There are two new counters, MAP_PHYSICAL_MEMORY_BYTES_MAX and REDUCE_PHYSICAL_MEMORY_BYTES_MAX that give you the max value for map and reduce respectively. Thanks, Miklos On Wed, Apr 5, 2017 at 6:37 PM, Aaron Eng wrote: > An important consideration is the difference between the
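
The difference between a snapshot counter and a max counter can be illustrated with a small sketch. It assumes a snapshot holds the most recent sample while a max counter keeps the largest sample seen so far; `MemoryTracker` and the sample values are hypothetical, and how Hadoop actually collects these counters is not shown in the thread.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class MemoryTracker:
    """Keeps the latest physical-memory sample and the largest one seen."""
    snapshot_bytes: int = 0
    max_bytes: int = 0

    def record(self, rss_bytes: int) -> None:
        self.snapshot_bytes = rss_bytes                   # overwritten on every sample
        self.max_bytes = max(self.max_bytes, rss_bytes)   # never decreases


# Hypothetical samples for one task: memory peaks mid-run, then drops.
samples = [200 * MB, 900 * MB, 350 * MB]

tracker = MemoryTracker()
for sample in samples:
    tracker.record(sample)
```

With these samples the final snapshot reports the last value, while the running max still reflects the mid-run peak.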

Re: Physical memory (bytes) snapshot counter question - how to get maximum memory used in reduce task

2017-04-06 Thread Aaron Eng
An important consideration is the difference between the RSS of the JVM process vs. the used heap size. Which of those are you looking for? And also, importantly, why/what do you plan to do with that info? A second important consideration is the length of time you are at/around your max