Hi, I have a small doubt about how HDFS manages files internally.
Assume I have a NameNode and 2 DataNodes, and I have inserted a CSV file of size 80 MB into HDFS using the 'hadoop fs -copyFromLocal' command. How will this file be stored in HDFS? Will it be split into two parts, one of 64 MB (the default block size) and the remaining 16 MB, and copied to the 2 DataNodes?

If that is the case, when I run a MapReduce job over the two DataNodes, the block boundaries are not line-oriented, so I may get unexpected results. How do I solve this type of issue? Please help me.

Thanks & Regards,
Shanmukhan.B
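To illustrate my worry, here is a small Python sketch (not HDFS itself, just made-up data and a made-up split size) showing how a fixed-size byte split can land in the middle of a CSV record:

```python
# Hypothetical file contents and "block size"; in HDFS the block size
# would be 64 MB, but the boundary problem is the same.
data = b"id,name\n1,alice\n2,bob\n3,carol\n"
split_size = 20  # pretend block size in bytes

part1, part2 = data[:split_size], data[split_size:]

# The cut lands mid-record: part1 ends with a partial line,
# and part2 starts with the rest of that line.
print(part1)  # ends with the incomplete record "2,bo"
print(part2)  # begins with the leftover "b\n"
```

If each DataNode's map task read only its own part, the record "2,bob" would be broken in half, which is exactly the kind of unexpected result I am worried about.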
