Hi,

I am new to Hadoop, and my scenario is as follows: I have Hadoop installed on a Linux machine with IP 162.192.100.46, and I have another Windows machine with Eclipse and the Hadoop plugin installed. I am able to connect to the Linux Hadoop machine and can see the DFS location and the mapred folder through the plugin. I copied all the Hadoop jar files from the Linux machine to Windows and added them to my Eclipse build path.

I am trying to run a small sample job from Windows against the Linux Hadoop machine.
Code:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class TestDriver {

    public static void main(String[] args) {
        JobClient client = new JobClient();
        JobConf conf = new JobConf(TestDriver.class);

        // TODO: specify output types
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // TODO: specify input and output DIRECTORIES (not files)
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path("In"));
        FileOutputFormat.setOutputPath(conf, new Path("Out"));

        // TODO: specify a mapper
        conf.setMapperClass(org.apache.hadoop.mapred.lib.IdentityMapper.class);
        // TODO: specify a reducer
        conf.setReducerClass(org.apache.hadoop.mapred.lib.IdentityReducer.class);

        client.setConf(conf);
        try {
            JobClient.runJob(conf);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
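For context: the error below shows the job already resolving hdfs://162.192.100.46:54310, so the plugin seems to be supplying the cluster address. If it were not, my understanding is the addresses could be set explicitly on the JobConf. This is only a sketch; the JobTracker port 54311 is my assumption (only 54310 appears anywhere in my logs):

```java
// Sketch only: point the job explicitly at the remote cluster.
// 54310 is taken from the error message; 54311 for the JobTracker is an assumption.
conf.set("fs.default.name", "hdfs://162.192.100.46:54310");
conf.set("mapred.job.tracker", "162.192.100.46:54311");
```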
Whenever I try to run the code through the Eclipse plugin, I get the following error:
11/04/25 13:39:16 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/04/25 13:39:16 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://162.192.100.46:54310/user/hadoop/In
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:201)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
    at TestDriver.main(TestDriver.java:46)
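The exception says the input path hdfs://162.192.100.46:54310/user/hadoop/In does not exist. As a sketch (untested, assuming the hadoop command is on the PATH of the Linux machine and is run as the hadoop user; somefile.txt is a placeholder), the directory could be created and populated like this:

```shell
# Create the input directory the job expects and copy a local file into it,
# then list it to confirm the path now exists on HDFS.
hadoop fs -mkdir /user/hadoop/In
hadoop fs -put somefile.txt /user/hadoop/In/
hadoop fs -ls /user/hadoop/In
```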
I know I am doing something wrong. Can anyone tell me where, and how I can run my code from Windows against that Linux Hadoop machine?
Thanks,
Praveenesh