I will, but deploying the application on a cluster is still far off; I'm just finishing the raw implementation. Cluster tuning is planned for the end of this month.

Thanks.

On 04/08/2012 09:06 PM, Harsh J wrote:
It will work. Pseudo-distributed mode shouldn't be all that different
from a fully distributed mode. Do let us know if it does not work as
intended.

On Sun, Apr 8, 2012 at 11:40 PM, Ondřej Klimpera<klimp...@fit.cvut.cz>  wrote:
Thanks for your advice, File.createTempFile() works great, at least in
pseudo-distributed mode; I hope it will work the same on a cluster. You
saved me hours of trying...



On 04/07/2012 11:29 PM, Harsh J wrote:
MapReduce sets "mapred.child.tmp" for all tasks to be the Task
Attempt's WorkingDir/tmp automatically. This also sets the
-Djava.io.tmpdir prop for each task at JVM boot.

Hence you may use the regular Java API to create a temporary file:

http://docs.oracle.com/javase/6/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String)

These files are also automatically deleted after the task
attempt is done.
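To make the suggestion concrete, here is a minimal, hedged sketch of what a map() body could do with the regular Java API: create a scratch file, write to it, and read it back. It uses only java.io; the class and method names (TempScratchDemo, roundTrip) are illustrative, not part of any Hadoop API. Inside a real task JVM the file would land under the attempt's working directory, because the framework points java.io.tmpdir there as described above.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class TempScratchDemo {
    // Sketch of work a map() body might do with a scratch file:
    // create it, write intermediate data, read it back, delete it.
    public static String roundTrip(String data) throws IOException {
        // Created under java.io.tmpdir; in a task JVM that is the
        // attempt's WorkingDir/tmp (via mapred.child.tmp).
        File tmp = File.createTempFile("map-scratch", ".txt");
        try {
            FileWriter w = new FileWriter(tmp);
            try {
                w.write(data);
            } finally {
                w.close();
            }
            BufferedReader r = new BufferedReader(new FileReader(tmp));
            try {
                return r.readLine();
            } finally {
                r.close();
            }
        } finally {
            // The framework cleans up after the attempt anyway, but
            // deleting eagerly keeps local disk usage low.
            tmp.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello"));
    }
}
```

The size limit is effectively the free space on the node's local disk (the volumes behind mapred.local.dir), not anything enforced by the temp-file API itself.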

On Sun, Apr 8, 2012 at 2:14 AM, Ondřej Klimpera<klimp...@fit.cvut.cz>
  wrote:
Hello,

I would like to ask you if it is possible to create and work with a
temporary file while in a map function.

I suppose that a map function runs on a single node in the Hadoop
cluster. So what is a safe way to create a temporary file and read from
it within one map() run? If it is possible, is there a size limit for
the file?

The file cannot be created before the Hadoop job starts; I need to
create and process the file inside map().

Thanks for your answer.

Ondrej Klimpera.




