I'm invoking the wordcount example on HOST1 with this command, but I get an error:
HOST1:$ bin/hadoop jar hadoop-examples-1.0.4.jar wordcount hdfs://HOST2:54310/gutenberg gutenberg-output

13/04/08 22:02:55 ERROR security.UserGroupInformation: PriviledgedActionException as:ubuntu cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://HOST2:54310/gutenberg
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://HOST2:54310/gutenberg

Can you be more specific about using FileInputFormat? I've configured MapReduce and HDFS on HOST1, and I don't know how to write a wordcount job that reads data from the HDFS filesystems on both HOST1 and HOST2. (A sketch of how I understand your suggestion follows below the quoted thread.)

On 8 April 2013 19:34, Harsh J <ha...@cloudera.com> wrote:
> You should be able to add fully qualified HDFS paths from N clusters
> into the same job via FileInputFormat.addInputPath(…) calls. Caveats
> may apply for secure environments, but for non-secure mode this should
> work just fine. Did you try this and did it not work?
>
> On Mon, Apr 8, 2013 at 9:56 PM, Pedro Sá da Costa <psdc1...@gmail.com> wrote:
> > Hi,
> >
> > I want to combine data that live in different HDFS filesystems so that
> > they can be processed in one job. Is it possible to do this with MR, or
> > is there another Apache tool that lets me do this?
> >
> > E.g.
> >
> > HDFS data in Cluster1 ----v
> > HDFS data in Cluster2 -> this job reads the data from Clusters 1 and 2
> >
> > Thanks,
> > --
> > Best regards,
>
> --
> Harsh J

--
Best regards,
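
Here is a minimal sketch of how I understand the suggestion, assuming non-secure Hadoop 1.x clusters and the hdfs://HOST1:54310 / hdfs://HOST2:54310 NameNode addresses from the command above; the class name MultiClusterWordCount and the reuse of the stock example mapper/reducer classes are placeholders, not something from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultiClusterWordCount {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "multi-cluster wordcount"); // Hadoop 1.x constructor

    job.setJarByClass(MultiClusterWordCount.class);
    // Reuse the mapper/reducer from the bundled wordcount example (illustrative).
    job.setMapperClass(org.apache.hadoop.examples.WordCount.TokenizerMapper.class);
    job.setCombinerClass(org.apache.hadoop.examples.WordCount.IntSumReducer.class);
    job.setReducerClass(org.apache.hadoop.examples.WordCount.IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Fully qualified paths: each one names its own NameNode, so a single
    // job can take input from both clusters.
    FileInputFormat.addInputPath(job, new Path("hdfs://HOST1:54310/gutenberg"));
    FileInputFormat.addInputPath(job, new Path("hdfs://HOST2:54310/gutenberg"));

    // Output goes to whichever cluster the path names (HOST1 here).
    FileOutputFormat.setOutputPath(job, new Path("hdfs://HOST1:54310/gutenberg-output"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Also, since the error says hdfs://HOST2:54310/gutenberg does not exist, it may be worth first confirming the files are really there, e.g. with bin/hadoop fs -ls hdfs://HOST2:54310/gutenberg, before re-running the job.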