Hi,

could this be a typo?

import com.bejoy.sampels.worcount.WordCountDriver;

Should "worcount" be "wordcount"?
- alex

On Thu, Nov 24, 2011 at 3:45 PM, Bejoy Ks <[email protected]> wrote:

> Hi Denis
>
> I tried your code without the distributed cache locally and it worked
> fine for me. Please find it at
> http://pastebin.com/ki175YUx
>
> I echo Mike's words on submitting map reduce jobs remotely. The remote
> machine can be your local PC or any utility server as Mike specified. What
> you need to have on the remote machine is a replica of the hadoop jars and
> configuration files, the same as those of your hadoop cluster. (If you don't
> have a remote util server set up then you can use your dev machine for the
> same.) Just trigger the hadoop job on the local machine and the actual job
> will be submitted to and run on your cluster, based on the NN host and
> configuration parameters you have in your config files.
>
> Hope it helps!
>
> Regards
> Bejoy.K.S
>
> On Thu, Nov 24, 2011 at 7:09 PM, Michel Segel <[email protected]> wrote:
>
> > Denis...
> >
> > Sorry, you lost me.
> >
> > Just to make sure we're using the same terminology...
> > The cluster is comprised of two types of nodes...
> > The data nodes, which run DN, TT, and if you have HBase, RS.
> > Then there are control nodes, which run your NN, SN, JT and if you run
> > HBase, HM and ZKs...
> >
> > Outside of the cluster we have machines set up with Hadoop installed but
> > not running any of the processes. They are where our users launch their
> > jobs. We call them edge nodes. (It's not a good idea to let users directly
> > on the actual cluster.)
> >
> > Ok, having said all of that... You launch your job from the edge nodes...
> > Your data sits in HDFS so you don't need distributed cache at all. Does
> > that make sense?
> > Your job will run on the local machine, connect to the JT and then run.
> >
> > We set up the edge nodes so that all of the jars and config files are
> > already set up for the users and we can better control access...
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On Nov 24, 2011, at 7:22 AM, Denis Kreis <[email protected]> wrote:
> >
> > > Without using the distributed cache I'm getting the same error. It's
> > > because I start the job from a remote client / programmatically.
> > >
> > > 2011/11/24 Michel Segel <[email protected]>:
> > >> Silly question... Why do you need to use the distributed cache for the
> > >> word count program?
> > >> What are you trying to accomplish?
> > >>
> > >> I've only had to play with it for one project where we had to push out
> > >> a bunch of C++ code to the nodes as part of a job...
> > >>
> > >> Sent from a remote device. Please excuse any typos...
> > >>
> > >> Mike Segel
> > >>
> > >> On Nov 24, 2011, at 7:05 AM, Denis Kreis <[email protected]> wrote:
> > >>
> > >>> Hi Bejoy
> > >>>
> > >>> 1. Old API:
> > >>> The Map and Reduce classes are the same as in the example, the main
> > >>> method is as follows
> > >>>
> > >>> public static void main(String[] args) throws IOException,
> > >>>         InterruptedException {
> > >>>     UserGroupInformation ugi =
> > >>>         UserGroupInformation.createProxyUser("<remote user name>",
> > >>>             UserGroupInformation.getLoginUser());
> > >>>     ugi.doAs(new PrivilegedExceptionAction<Void>() {
> > >>>         public Void run() throws Exception {
> > >>>
> > >>>             JobConf conf = new JobConf(WordCount.class);
> > >>>             conf.setJobName("wordcount");
> > >>>
> > >>>             conf.setOutputKeyClass(Text.class);
> > >>>             conf.setOutputValueClass(IntWritable.class);
> > >>>
> > >>>             conf.setMapperClass(Map.class);
> > >>>             conf.setCombinerClass(Reduce.class);
> > >>>             conf.setReducerClass(Reduce.class);
> > >>>
> > >>>             conf.setInputFormat(TextInputFormat.class);
> > >>>             conf.setOutputFormat(TextOutputFormat.class);
> > >>>
> > >>>             FileInputFormat.setInputPaths(conf,
> > >>>                 new Path("<path to input dir>"));
> > >>>             FileOutputFormat.setOutputPath(conf,
> > >>>                 new Path("<path to output dir>"));
> > >>>
> > >>>             conf.set("mapred.job.tracker", "<ip:8021>");
> > >>>
> > >>>             FileSystem fs = FileSystem.get(
> > >>>                 new URI("hdfs://<ip>:8020"), new Configuration());
> > >>>             fs.mkdirs(new Path("<remote path>"));
> > >>>             fs.copyFromLocalFile(new Path("<local path>/test.jar"),
> > >>>                 new Path("<remote path>"));
> > >>>
> > >>
> > >

--
Alexander Lorenz
http://mapredit.blogspot.com
