Evert Lammerts at Sara.nl did something similar to your problem: splitting a big 2.7 TB file into chunks of 10 GB. This work was presented at the BioAssist Programmers' Day in January of this year under the title
"Large-Scale Data Storage and Processing for Scientists in The Netherlands"

http://www.slideshare.net/evertlammerts

P.S.: I sent the message with a copy (CC) to him.

On 6/20/2011 10:38 AM, Niels Basjes wrote:
Hi,

On Mon, Jun 20, 2011 at 16:13, Mapred Learn<mapred.le...@gmail.com>  wrote:
But this file is a gzipped text file. In this case it will only go to 1 mapper, whereas if it were split into 60 1 GB files, the map-reduce job would finish earlier than with one 60 GB file, since it would have 60 mappers running in parallel. Isn't that so?
Yes, that is very true.
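For what it's worth, one way to get those parallel mappers is to re-split the uncompressed text on line boundaries and gzip each part separately, since a single gzip stream is not splittable. A minimal sketch with GNU coreutils; the file name, the stand-in input, and the chunk size are illustrative assumptions, not from the thread:

```shell
# Stand-in for the real 60 GB plain-text file (illustrative only).
seq 1 1000 > bigfile.txt

# Split on line boundaries into parts of at most 1 MB each
# (use e.g. -C 1G for real data), with numeric suffixes.
split -C 1M -d bigfile.txt part_

# Compress each part separately; each .gz then gets its own mapper.
gzip part_*
ls part_*.gz
```

Each resulting `part_NN.gz` is an independent gzip file, so a map-reduce job over the directory can assign one mapper per part.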


--
Marcos Luís Ortíz Valmaseda
 Software Engineer (UCI)
 http://marcosluis2186.posterous.com
 http://twitter.com/marcosluis2186
