Hi, can somebody please clarify my doubts? Say I have a cluster of 30 nodes and I want to put some files into HDFS. The combined size of all the files is 10 TB, but each file is only roughly 1 GB, and there are 10 files in total.
1. In a real production environment, do we copy these 10 files into a folder in HDFS one by one? If so, how many mappers do we specify — 10? And do we use Hadoop's put command to transfer the files (see the sketch in the P.S. below)?
2. If that is not the case, do we instead pre-process the 10 files by merging them into one 10 TB file and copy that single file into HDFS (also sketched below)?

Regards,
Shashidhar
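P.S. To make question 1 concrete, this is the kind of copy I had in mind — just a sketch, with made-up paths (/data/input locally, /user/shashidhar/input in HDFS) and file names:

    # upload all 10 local files into an HDFS directory in one command
    hadoop fs -mkdir -p /user/shashidhar/input
    hadoop fs -put /data/input/file*.dat /user/shashidhar/input/

    # or upload them one at a time
    hadoop fs -put /data/input/file01.dat /user/shashidhar/input/

And for question 2, the pre-processing I was picturing — again only a sketch, assuming the files are plain text so they can simply be concatenated before the upload:

    # merge the 10 files into a single file, then upload that
    cat /data/input/file*.dat > /data/merged.dat
    hadoop fs -put /data/merged.dat /user/shashidhar/input/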
