Ted Dunning-3 wrote:
> 
> 
> The simple solution is to package files together in such a way that you
> can start anywhere in the package and process a number of files.  Using
> this technique, you can pretty easily have a much smaller number of very
> large files.  Moreover, because of the start-anywhere design, these large
> files can be processed efficiently in Hadoop because they will be near
> the programs processing them and because they will be read from disk in
> large sequential swathes.
> 
> 

When you talk about packaging lots of small files together before putting
them into HDFS, what exactly do you have in mind? Something as simple as cat?
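Or do you mean packing them into something like a SequenceFile, with one
record per small file keyed by its name, so the result stays splittable?
Here is a rough sketch of what I imagine (the class name, argument handling,
and the Text/BytesWritable key/value choice are just my guesses, not anything
from your mail):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

import java.io.File;
import java.nio.file.Files;

public class PackSmallFiles {
    // Hypothetical usage: hadoop PackSmallFiles /local/src/dir /hdfs/packed.seq
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path(args[1]);

        // One SequenceFile on HDFS; each record is (file name, raw bytes).
        SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, out, Text.class, BytesWritable.class);
        try {
            for (File f : new File(args[0]).listFiles()) {
                if (!f.isFile()) continue;   // skip subdirectories
                byte[] data = Files.readAllBytes(f.toPath());
                writer.append(new Text(f.getName()), new BytesWritable(data));
            }
        } finally {
            writer.close();
        }
    }
}

A job could then read the packed file with SequenceFileInputFormat and get one
(name, contents) pair per map call, which seems to match the "start anywhere"
property you describe, since SequenceFiles can be split at their sync markers.
Is that the kind of packaging you mean, or something else entirely?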

Thanks