Hi,
Is org.apache.hadoop.mapred.FileInputFormat to be considered obsolete/deprecated?
Thanks!

2011/11/15 Stuti Awasthi <[email protected]>

> Sure Doug,
> Thanks
>
> -----Original Message-----
> From: Doug Meil [mailto:[email protected]]
> Sent: Monday, November 14, 2011 9:08 PM
> To: [email protected]
> Subject: Re: MR - Input from Hbase output to HDFS
>
> Glad you worked through that and everything is working. I will add an
> example of MR to Hbase-to-HDFS in the book.
>
> On 11/14/11 1:24 AM, "Stuti Awasthi" <[email protected]> wrote:
>
> >Hi,
> >I think the issue is with the Filesystem Configuration, as in config, it
> >is picking HbaseConfiguration. When I modified my output directory path
> >to an absolute HDFS path:
> >FileOutputFormat.setOutputPath(job, new Path("hdfs://master:54310/MR/stuti3"));
> >
> >The MR job runs successfully and I am able to see the stuti3 directory
> >inside HDFS at the desired path.
> >
> >-----Original Message-----
> >From: Stuti Awasthi
> >Sent: Monday, November 14, 2011 11:40 AM
> >To: [email protected]
> >Subject: RE: MR - Input from Hbase output to HDFS
> >
> >Hi Joey,
> >Thanks for pointing this out. After importing "FileOutputFormat" as you
> >suggested, I am able to run the MR job from Eclipse (Windows). The only
> >problem is that I am not able to see the output directory this code is
> >creating. HDFS and HBase are on a Linux machine.
> >
> >Code :
> >    Configuration config = HBaseConfiguration.create();
> >    config.set("hbase.zookeeper.quorum", "master");
> >    config.set("hbase.zookeeper.property.clientPort", "2181");
> >
> >    Job job = new Job(config, "Hbase_Read_Write");
> >    job.setJarByClass(ReadWriteDriver.class);
> >    Scan scan = new Scan();
> >    scan.setCaching(500);
> >    scan.setCacheBlocks(false);
> >    TableMapReduceUtil.initTableMapperJob("users", scan,
> >        ReadWriteMapper.class, Text.class, IntWritable.class, job);
> >    job.setOutputFormatClass(TextOutputFormat.class);
> >    FileOutputFormat.setOutputPath(job, new Path("/stuti2"));
> >
> >After executing this code, the MR job runs successfully, but when I
> >look in HDFS, no "/stuti2" directory has been created. I also looked in
> >the local filesystem of the Linux machine as well as the Windows machine,
> >but was not able to find the output folder anywhere.
> >
> >Eclipse console Output :
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_27
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.home=C:\Program Files\Java\jdk1.6.0_27\jre
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.class.path=D:\workspace\Hbase\MRHbaseReadWrite\bin;D:\workspace\Hbase\MRHbaseReadWrite\lib\commons-cli-1.2.jar;D:\workspace\Hbase\MRHbaseReadWrite\lib\commons-httpclient-3.0.1.jar;D:\workspace\Hbase\MRHbaseReadWrite\lib\commons-logging-1.0.4.jar;D:\workspace\Hbase\MRHbaseReadWrite\lib\hadoop-0.20.2-core.jar;D:\workspace\Hbase\MRHbaseReadWrite\lib\hbase-0.90.3.jar;D:\workspace\Hbase\MRHbaseReadWrite\lib\log4j-1.2.15.jar;D:\workspace\Hbase\MRHbaseReadWrite\lib\zookeeper-3.3.2.jar
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.library.path=C:\Program Files\Java\jdk1.6.0_27\jre\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:/Program Files/Java/jre6/bin/client;C:/Program Files/Java/jre6/bin;C:/Program Files/Java/jre6/lib/i386;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Java\jdk1.6.0_27;C:\Program Files\TortoiseSVN\bin;C:\cygwin\bin;D:\apache-maven-3.0.3\bin;D:\eclipse;;.
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=C:\Users\STUTIA~1\AppData\Local\Temp\
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:os.name=Windows 7
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:os.arch=x86
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:os.version=6.1
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:user.name=stutiawasthi
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:user.home=C:\Users\stutiawasthi
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Client environment:user.dir=D:\workspace\Hbase\MRHbaseReadWrite
> >11/11/14 11:21:45 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181 sessionTimeout=180000 watcher=hconnection
> >11/11/14 11:21:45 INFO zookeeper.ClientCnxn: Opening socket connection to server master/10.33.64.235:2181
> >11/11/14 11:21:45 INFO zookeeper.ClientCnxn: Socket connection established to master/10.33.64.235:2181, initiating session
> >11/11/14 11:21:45 INFO zookeeper.ClientCnxn: Session establishment complete on server master/10.33.64.235:2181, sessionid = 0x33879243de00ec, negotiated timeout = 180000
> >11/11/14 11:21:46 INFO mapred.JobClient: Running job: job_local_0001
> >11/11/14 11:21:46 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181 sessionTimeout=180000 watcher=hconnection
> >11/11/14 11:21:46 INFO zookeeper.ClientCnxn: Opening socket connection to server master/10.33.64.235:2181
> >11/11/14 11:21:46 INFO zookeeper.ClientCnxn: Socket connection established to master/10.33.64.235:2181, initiating session
> >11/11/14 11:21:46 INFO zookeeper.ClientCnxn: Session establishment complete on server master/10.33.64.235:2181, sessionid = 0x33879243de00ed, negotiated timeout = 180000
> >11/11/14 11:21:46 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=master:2181 sessionTimeout=180000 watcher=hconnection
> >11/11/14 11:21:46 INFO zookeeper.ClientCnxn: Opening socket connection to server master/10.33.64.235:2181
> >11/11/14 11:21:46 INFO zookeeper.ClientCnxn: Socket connection established to master/10.33.64.235:2181, initiating session
> >11/11/14 11:21:46 INFO zookeeper.ClientCnxn: Session establishment complete on server master/10.33.64.235:2181, sessionid = 0x33879243de00ee, negotiated timeout = 180000
> >11/11/14 11:21:46 INFO mapred.MapTask: io.sort.mb = 100
> >11/11/14 11:21:46 INFO mapred.MapTask: data buffer = 79691776/99614720
> >11/11/14 11:21:46 INFO mapred.MapTask: record buffer = 262144/327680
> >...............................................
> >11/11/14 11:21:46 INFO mapred.MapTask: Finished spill 0
> >11/11/14 11:21:46 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
> >11/11/14 11:21:46 INFO mapred.LocalJobRunner:
> >11/11/14 11:21:46 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
> >11/11/14 11:21:46 INFO mapred.LocalJobRunner:
> >11/11/14 11:21:46 INFO mapred.Merger: Merging 1 sorted segments
> >11/11/14 11:21:46 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 103 bytes
> >11/11/14 11:21:46 INFO mapred.LocalJobRunner:
> >11/11/14 11:21:46 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
> >11/11/14 11:21:46 INFO mapred.LocalJobRunner:
> >11/11/14 11:21:46 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
> >11/11/14 11:21:46 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to /stuti2
> >11/11/14 11:21:46 INFO mapred.LocalJobRunner: reduce > reduce
> >11/11/14 11:21:46 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
> >11/11/14 11:21:47 INFO mapred.JobClient:  map 100% reduce 100%
> >11/11/14 11:21:47 INFO mapred.JobClient: Job complete: job_local_0001
> >11/11/14 11:21:47 INFO mapred.JobClient: Counters: 12
> >11/11/14 11:21:47 INFO mapred.JobClient:   FileSystemCounters
> >11/11/14 11:21:47 INFO mapred.JobClient:     FILE_BYTES_READ=40923
> >11/11/14 11:21:47 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=82343
> >11/11/14 11:21:47 INFO mapred.JobClient:   Map-Reduce Framework
> >11/11/14 11:21:47 INFO mapred.JobClient:     Reduce input groups=5
> >11/11/14 11:21:47 INFO mapred.JobClient:     Combine output records=0
> >11/11/14 11:21:47 INFO mapred.JobClient:     Map input records=5
> >11/11/14 11:21:47 INFO mapred.JobClient:     Reduce shuffle bytes=0
> >11/11/14 11:21:47 INFO mapred.JobClient:     Reduce output records=5
> >11/11/14 11:21:47 INFO mapred.JobClient:     Spilled Records=10
> >11/11/14 11:21:47 INFO mapred.JobClient:     Map output bytes=91
> >11/11/14 11:21:47 INFO mapred.JobClient:     Combine input records=0
> >11/11/14 11:21:47 INFO mapred.JobClient:     Map output records=5
> >11/11/14 11:21:47 INFO mapred.JobClient:     Reduce input records=5
> >
> >Please suggest.
> >
> >-----Original Message-----
> >From: Joey Echeverria [mailto:[email protected]]
> >Sent: Friday, November 11, 2011 10:38 PM
> >To: [email protected]
> >Subject: Re: MR - Input from Hbase output to HDFS
> >
> >There are two APIs (old and new), and you appear to be mixing them.
> >TableMapReduceUtil only works with the new API. The solution is to
> >import the new version of FileOutputFormat, which takes a Job:
> >
> >import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> >
> >-Joey
> >
> >On Fri, Nov 11, 2011 at 12:55 AM, Stuti Awasthi <[email protected]> wrote:
> >> The method "setOutputPath(JobConf, Path)" takes a JobConf as a
> >> parameter, not the Job object.
> >> At least this is the error I'm getting while compiling against the
> >> Hadoop 0.20.2 jar in Eclipse.
> >>
> >> FileOutputFormat.setOutputPath(conf, new Path("/output"));
> >>
> >> -----Original Message-----
> >> From: Prashant Sharma [mailto:[email protected]]
> >> Sent: Friday, November 11, 2011 11:20 AM
> >> To: [email protected]
> >> Subject: Re: MR - Input from Hbase output to HDFS
> >>
> >> Hi Stuti,
> >> I was wondering why you are not using the job object to set the output
> >> path, like this:
> >>
> >> FileOutputFormat.setOutputPath(job, new Path("outputReadWrite"));
> >>
> >> thanks
> >>
> >> On Fri, Nov 11, 2011 at 10:43 AM, Stuti Awasthi <[email protected]> wrote:
> >>
> >>> Hi Andrei,
> >>> Well, I am a bit confused. When I use JobConf and associate it with
> >>> JobClient to run the job, I get the error "Input directory is not set".
> >>> I want my input to come from the HBase table, which I already
> >>> configured with "TableMapReduceUtil.initTableMapperJob", so I don't
> >>> want to set an input directory via JobConf.
> >>> How do I mix these two so that I can get input from HBase and write
> >>> output to HDFS?
> >>>
> >>> Thanks
> >>>
> >>> -----Original Message-----
> >>> From: Andrei Cojocaru [mailto:[email protected]]
> >>> Sent: Thursday, November 10, 2011 7:09 PM
> >>> To: [email protected]
> >>> Subject: Re: MR - Input from Hbase output to HDFS
> >>>
> >>> Stuti,
> >>>
> >>> I don't see you associating JobConf with Job anywhere.
> >>> -Andrei
> >
> >--
> >Joseph Echeverria
> >Cloudera, Inc.
> >443.305.9434
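
For readers skimming the thread, the fix reported near the top (passing a full hdfs:// URI to FileOutputFormat.setOutputPath) can be illustrated without Hadoop on the classpath. The sketch below uses only java.net.URI to mimic how a scheme-less path such as /stuti2 ends up on whatever default filesystem the client configuration names. It is not Hadoop's own code: the class PathQualification and the helper qualify are hypothetical names, and the two default-filesystem values (file:/// and hdfs://master:54310) are assumptions based on the paths and log lines quoted in the thread. The job_local_0001 and LocalJobRunner entries suggest, but the thread does not confirm, that the Eclipse client was defaulting to the local filesystem.

```java
import java.net.URI;

// Hypothetical illustration only; Hadoop's Path class does its own qualification.
class PathQualification {

    // Resolve a path string against an assumed default filesystem URI.
    // A path that already carries a scheme (e.g. hdfs://...) is returned unchanged.
    static URI qualify(String defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already fully qualified, the default filesystem is ignored
        }
        // Scheme-less path: inherit scheme and authority from the default filesystem.
        return URI.create(defaultFs).resolve(p);
    }

    public static void main(String[] args) {
        String localDefault = "file:///";            // assumed client default
        String hdfsDefault = "hdfs://master:54310";  // assumed cluster default

        // The same "/stuti2" lands on different filesystems depending on the default.
        System.out.println(qualify(localDefault, "/stuti2"));
        System.out.println(qualify(hdfsDefault, "/stuti2"));

        // A fully qualified path is unaffected by the default.
        System.out.println(qualify(localDefault, "hdfs://master:54310/MR/stuti3"));
    }
}
```

Under these assumptions, the scheme-less /stuti2 follows whichever filesystem is the default, while hdfs://master:54310/MR/stuti3 is left untouched, which is consistent with what Stuti reported after switching to the absolute HDFS path.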
