----- Original Message ----
From: vid sea <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, October 31, 2007 2:03:15 AM
Subject: Can I do a job(or task ) without using mapreduce on the hadoop dfs?

Hi all,

I have recently started using Hadoop and have three questions. Can anybody help me with them?

> First question: I have some existing code for Windows, along with some
> .exe and .dll files. Does Hadoop support calling .exe and .dll files?
> Can I write a .bat file? If not, does it support something similar, for
> example executable files and batch files under Linux? If it supports
> none of these, how can I use my existing .exe and .dll files?

Yes, you can. Hadoop Streaming and Pipes are meant for exactly that:

http://wiki.apache.org/lucene-hadoop/HadoopStreaming
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/pipes/package-summary.html

In your case, since you already have an .exe, Streaming might be the right option. Remember to ship the .dlls that the .exe depends on along with it, using the -file option.

> Second question: Can I run a job (or task) on the Hadoop DFS without
> using MapReduce? I have some existing code that I do not want to change,
> but I need the DFS to store my data, which does not fit on one machine.
> So I want to write a scheduling program that runs my original code on
> the DFS without mapred. Can I do that? If so, how can I avoid moving
> data and schedule each job (or task) on the datanode that stores the
> right data?

How about this: create a dummy data set and launch a map/reduce job over it. Once the tasks are scheduled, they are fed this dummy data. You simply ignore it and read the data you actually need from the DFS within your map/reduce tasks. It is a kind of scheduling, at the cost of some unnecessary reading of data.

> Third question: Can I control which part of the data is stored on which
> machine (datanode)? Is there an API for that? If not, can I do it by
> easily changing some of the Hadoop DFS code?

Do we need that? Remember that the data and its replicas have to be distributed across machines to cope with machine failures. If you were allowed to choose the machine yourself, wouldn't that create points of failure?

Thank you!!
realvsea 07.10.13
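
For the first answer, here is a minimal sketch of a Streaming mapper that wraps an existing executable. It is only an illustration under some assumptions: the executable reads one record on stdin and writes its result to stdout, and it can actually run on the task nodes (a Windows .exe will not run natively on Linux nodes). The names run_tool and map_lines, and the file name mapper.py used below, are made up for this example.

```python
#!/usr/bin/env python3
"""Streaming mapper sketch: feed each input line to an existing executable."""
import subprocess
import sys


def run_tool(record, cmd):
    """Run cmd once with record on its stdin; return its stdout, stripped."""
    result = subprocess.run(cmd, input=record, capture_output=True,
                            text=True, check=True)
    return result.stdout.strip()


def map_lines(lines, cmd):
    """Yield one tab-separated key/value line per non-empty input line."""
    for line in lines:
        record = line.rstrip("\n")
        if record:
            # Key is the input record, value is whatever the tool printed.
            yield record + "\t" + run_tool(record, cmd)


# Usage (assumed): mapper.py <executable> [args...]
# Without an executable on the command line the script does nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    for out in map_lines(sys.stdin, sys.argv[1:]):
        print(out)
```

The script would be given to Streaming as the -mapper command, with mapper.py, the executable and the libraries it depends on each shipped via -file. Starting one process per record is slow; if the executable already reads records from stdin and writes results to stdout, it can be passed to -mapper directly and no wrapper is needed.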

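
The dummy-data suggestion in the second answer could look roughly like the following Streaming mapper. It discards whatever records it is fed and instead pulls the real files out of the DFS with "hadoop fs -cat", piping each one into the unchanged program. This is a sketch, not a tested recipe: it assumes the hadoop command is on the PATH of the task nodes and that the existing program reads its data from stdin. All function names are hypothetical.

```python
#!/usr/bin/env python3
"""Sketch: ignore the dummy map input and read the real data from the DFS."""
import subprocess
import sys


def dfs_cat_command(path, hadoop="hadoop"):
    """Build the shell command that streams one DFS file to stdout."""
    return [hadoop, "fs", "-cat", path]


def pipe_through(source_cmd, tool_cmd):
    """Pipe source_cmd's stdout into tool_cmd's stdin; return tool output."""
    source = subprocess.Popen(source_cmd, stdout=subprocess.PIPE)
    try:
        tool = subprocess.run(tool_cmd, stdin=source.stdout,
                              stdout=subprocess.PIPE, text=True, check=True)
    finally:
        source.stdout.close()
        status = source.wait()
    if status != 0:
        raise RuntimeError("source command failed: %r" % (source_cmd,))
    return tool.stdout


def run_task(dummy_input, dfs_paths, tool_cmd, make_source=dfs_cat_command):
    """Discard the dummy records, then run tool_cmd over each DFS file."""
    for _ in dummy_input:
        pass  # the dummy data only exists to get the task scheduled
    return [pipe_through(make_source(p), tool_cmd) for p in dfs_paths]


# Usage (assumed): script.py <executable> <dfs-path> [<dfs-path> ...]
if __name__ == "__main__" and len(sys.argv) > 2:
    for output in run_task(sys.stdin, sys.argv[2:], [sys.argv[1]]):
        sys.stdout.write(output)
```

Note that every map task started this way processes the same list of paths, so the dummy input should either produce a single map task or the paths have to be divided between tasks in some way. Also, "hadoop fs -cat" reads over the network whenever the blocks are not on the local machine, so this gives scheduling but no guarantee that the task runs where its data is stored.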