Re: Lookup HashMap available within the Map

2008-11-30 Thread Shane Butler
Given the goal of shared data accessible across the Map instances,
can someone please explain some of the differences between using:
- setNumTasksToExecutePerJvm() and then having statically declared
data initialised in Mapper.configure(); and
- a MultithreadedMapRunner?

Regards,
Shane


On Wed, Nov 26, 2008 at 6:41 AM, Doug Cutting [EMAIL PROTECTED] wrote:
 tim robertson wrote:

 Thanks Alex - this will allow me to share the shapefile, but I need to
 read it, parse it and store the objects in the index only once per
 job per JVM.
 Is Mapper.configure() the best place to do this?  E.g. will it
 only be called once per job?

 In 0.19, with HADOOP-249, all tasks from a job can be run in a single JVM.
  So, yes, you could access a static cache from Mapper.configure().

 Doug
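
A minimal sketch of the static-cache idea discussed above, written in plain Java so it runs without the Hadoop jars. LookupCache and SketchMapper are hypothetical names, and the single map entry is a placeholder for the parsed shapefile data. In a real job the configure() body would sit in the Mapper.configure(JobConf) implementation, and the cache is only shared between tasks if JVM reuse is enabled (for example via JobConf.setNumTasksToExecutePerJvm()).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the lookup data; not part of Hadoop.
final class LookupCache {
    private static Map<String, String> index;  // one copy per JVM
    private static int loadCount = 0;          // kept only to show the load happens once

    private LookupCache() {}

    // Loads the data on the first call and returns the same map afterwards.
    // synchronized guards against two tasks or threads loading at the same time.
    static synchronized Map<String, String> get() {
        if (index == null) {
            Map<String, String> m = new HashMap<>();
            // Placeholder: read and parse the real file here.
            m.put("example-key", "example-value");
            index = Collections.unmodifiableMap(m);  // read-only after loading
            loadCount++;
        }
        return index;
    }

    static synchronized int loadCount() {
        return loadCount;
    }
}

// Stand-in for a Mapper; a real one would implement org.apache.hadoop.mapred.Mapper.
class SketchMapper {
    private Map<String, String> lookup;

    // Mirrors Mapper.configure(): runs once per task, but the load runs once per JVM.
    void configure() {
        lookup = LookupCache.get();
    }

    // Mirrors the lookup a map() call would do.
    String map(String key) {
        return lookup.get(key);
    }
}
```

Because the map is made unmodifiable after loading, concurrent reads need no further locking, which also matters if map() is called from several threads.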




Re: HDFS directory listing from the Java API?

2008-11-26 Thread Shane Butler
Got it! Thanks Jürgen and Lohit!

On Wed, Nov 26, 2008 at 11:10 PM, Jürgen Broß
[EMAIL PROTECTED] wrote:
 Hi Shane,

 I think what you are looking for is the following:

 // Needs org.apache.hadoop.fs.FileStatus, FileSystem and Path; conf is your Configuration.
 Path dirPath = new Path("<path to dir>");
 FileStatus[] files = FileSystem.get(conf).listStatus(dirPath);

 Each FileStatus entry in the above array contains a Path reference
 (files[i].getPath()) to the file or directory contained in dirPath.

 Greetings,
 Jürgen

 Shane Butler wrote:

 Hi all,

 Can someone please guide me on how to get a directory listing of files on
 HDFS using the Java API (0.19.0)?

 Regards,
 Shane



 --


 Jürgen Broß
 Institute of Computer Science
 Databases and Information Systems
 Freie Universität Berlin
 Takustr. 9
 D-14195 Berlin, Germany
 phone: +49 30 838-75108
 email: [EMAIL PROTECTED]




HDFS directory listing from the Java API?

2008-11-25 Thread Shane Butler
Hi all,

Can someone please guide me on how to get a directory listing of files on
HDFS using the Java API (0.19.0)?

Regards,
Shane