Thank you very much! 'hadoop fs -cat <file> | mylegacyexe' is exactly the kind of method I came up with and was going to try it out. I'm glad to hear that it's actually an "official" alternative.
Thanks again. This is a great forum!

Grace

-----Original Message-----
From: Arun Murthy [mailto:a...@hortonworks.com]
Sent: Tuesday, August 23, 2011 10:36 AM
To: common-dev@hadoop.apache.org
Subject: Re: how to pass a hdfs file to a c++ process

That is a normal use case.

I'd encourage you to use Java MR (even pig/hive). If you really want to use your legacy app use streaming with a map cmd such as 'hadoop fs -cat <file> | mylegacyexe'

Arun

Sent from my iPhone

On Aug 23, 2011, at 8:00 AM, Zhixuan Zhu <z...@calpont.com> wrote:

> I'll actually invoke one executable from each of my map. Because this
> C++ program has been implemented and used in the past, I just want to
> integrate it to our Hadoop map/reduce without having to re-implement the
> process in java. So my map is going to be very simple with just calling
> the process and pass the input files.
>
> Thanks,
> Grace
>
> -----Original Message-----
> From: Arun C Murthy [mailto:a...@hortonworks.com]
> Sent: Tuesday, August 23, 2011 9:51 AM
> To: common-dev@hadoop.apache.org
> Subject: Re: how to pass a hdfs file to a c++ process
>
>
> On Aug 22, 2011, at 12:57 PM, Zhixuan Zhu wrote:
>
>> Hi All,
>>
>> I'm using hadoop-0.20.2 to try out some simple tasks. I asked a question
>> about FileInputFormat a few days ago and get some prompt replys from
>> this forum and it helped a lot. Thanks again! Now I have another
>> question. I'm trying to invoke a C++ process from my mapper for each
>> hdfs file in the input directory to achieve some parallel processing.
>
> That seems weird - why aren't you using more maps and one file per-map?
>
>> But how do I pass the file to the program? I would want to do something
>> like the following in my mapper:
>
> IAC, libhdfs is one way to do HDFS ops via c/c++.
>
> Arun
>
>>
>> Process lChldProc = Runtime.getRuntime().exec("myprocess -file
>> $filepath");
>>
>> How do I pass the hdfs filesystem to an outside process like that? Is
>> HadoopStreaming the direction I should go?
>>
>> Thanks very much for any reply in advance.
>>
>> Best,
>> Grace
>
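
For anyone who wants to try the 'hadoop fs -cat <file> | mylegacyexe' idea from plain Java rather than through streaming, here is a minimal sketch. One thing to keep in mind is that Runtime.exec() and ProcessBuilder do not go through a shell, so neither "$filepath" nor "|" is interpreted unless the command is handed to "sh -c". The class and method names (HdfsPipe, shellQuote, buildCommand, run) are made up for illustration, and the sketch assumes that "hadoop" and the legacy executable are on the PATH of the task node and that the executable reads its input from stdin.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: pipes an HDFS file into a legacy executable.
public class HdfsPipe {

    // Wrap a value in single quotes so the shell treats it as one literal word.
    public static String shellQuote(String s) {
        return "'" + s.replace("'", "'\\''") + "'";
    }

    // Build: sh -c "hadoop fs -cat '<hdfsPath>' | '<exe>'"
    // Assumes a POSIX "sh" and the "hadoop" launcher are available on the node.
    public static List<String> buildCommand(String hdfsPath, String exe) {
        String pipeline = "hadoop fs -cat " + shellQuote(hdfsPath)
                + " | " + shellQuote(exe);
        return Arrays.asList("sh", "-c", pipeline);
    }

    // Start the command, let it share this JVM's stdout/stderr, and wait for it.
    // For a pipeline, the returned status is that of the last command (the legacy exe).
    public static int run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: HdfsPipe <hdfs-path> <executable>");
            System.exit(2);
        }
        System.exit(run(buildCommand(args[0], args[1])));
    }
}
```

Calling HdfsPipe.run(HdfsPipe.buildCommand(path, "mylegacyexe")) from a mapper would launch one pipeline per input file. If the executable insists on a local "-file" argument instead of stdin, copying the file to local disk first (for example with 'hadoop fs -get') would be the equivalent step.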