Hi all, can anyone help me understand how companies like Facebook, Yahoo, etc. upload bulk files, say on the order of 100 petabytes, to a Hadoop HDFS cluster for processing, and how, after processing, they download those files from HDFS back to the local file system?
I don't think they would be using the command line hadoop fs -put to upload the files, as that would take far too long. Or do they split the data into, say, 10 parts of 10 petabytes each, compress them, and then use hadoop fs -put? Or do they use some other tool for uploading such huge files? Please help me. Thanks, thoihen
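
P.S. To be clear about which commands I mean by the command-line put and get, I'm thinking of something like the following (the paths here are just made-up placeholders for illustration):

    # copy a file from the local file system into HDFS
    hadoop fs -put /data/local/hugefile.dat /user/thoihen/input/

    # ... MapReduce job runs on the cluster ...

    # copy the job output from HDFS back to the local file system
    hadoop fs -get /user/thoihen/output/part-r-00000 /data/local/results/

I just can't imagine pushing 100 PB through a single machine this way, which is why I'm asking how it is actually done at that scale.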
