You can load data into a Hive table (external and internal) directly from the local file system. The same goes for Pig. To Manish's point, you can do the same using hadoop fs commands. I have tried it both ways and have seen a difference in performance. I would be interested to hear from the rest of the community whether this is consistent with what they have seen.
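
For anyone following along, here is a rough sketch of the two routes. The paths input/logs and /user/dataset/ are borrowed from the example further down the thread; the table name raw_logs is only a placeholder and the table is assumed to exist already. By default the script just prints the commands (DRY_RUN=1), so it is safe to run on a machine without Hadoop; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: paths and the table name raw_logs are placeholders.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Route 1: copy the local files into HDFS first, then point Pig/Hive at HDFS.
load_via_hdfs() {
    run hadoop fs -copyFromLocal "$1" "$2"
}

# Route 2: let Hive read the local path itself and move the data
# into its warehouse directory (the target table must already exist).
load_via_hive_local() {
    run hive -e "LOAD DATA LOCAL INPATH '$1' INTO TABLE $2"
}

load_via_hdfs input/logs /user/dataset/
load_via_hive_local input/logs raw_logs
```

Either way the data ends up in HDFS; the second route simply lets Hive do the copy into its warehouse directory for you.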

Thanks,
Ranjith

On May 14, 2012, at 8:45 PM, "Manish Bhoge" <manishbh...@rocketmail.com> wrote:

> You first need to copy data using copyFromLocal to your HDFS, and then you can
> use Pig and Hive programs for further analysis, which run on MapReduce.
> Yes, the warehouse directory is in HDFS. If you want to run (test) Pig in local
> mode, then you don't need to copy data to HDFS.
> Sent from my BlackBerry, pls excuse typo
>
> -----Original Message-----
> From: Michael Wang <michael.w...@meredith.com>
> Date: Mon, 14 May 2012 18:43:47
> To: common-user@hadoop.apache.org<common-user@hadoop.apache.org>
> Reply-To: common-user@hadoop.apache.org
> Subject: RE: How to load raw log file into HDFS?
>
> I have the same question and I am glad to get you guys' help. I am also
> a novice in Hadoop :)
> I am using pig and hive to analyze the logs. My logs are in
> <LOCAL_FILE_PATH>.
> Do I need to use "hadoop fs -copyFromLocal" to put files to <HDFS_FILE_PATH>
> first, and then load data files to pig or hive from <HDFS_FILE_PATH>? Or can
> I just load logs from <LOCAL_FILE_PATH> directly to pig or hive? After I load
> the files to hive, I found they are put at /user/hive/warehouse. Is
> /user/hive/warehouse in HDFS?
> How do I know what <HDFS_FILE_PATH> are available?
>
> -----Original Message-----
> From: Alexander Fahlke [mailto:alexander.fahlke.mailingli...@googlemail.com]
> Sent: Monday, May 14, 2012 1:53 AM
> To: common-user@hadoop.apache.org
> Subject: Re: How to load raw log file into HDFS?
>
> Hi,
>
> the best would be to read the documentation and some books to get familiar
> with Hadoop.
>
> One of my favourite books is "Hadoop in Action" from Manning (
> http://www.manning.com/lam/)
> This book has an example for putting (log)-files into HDFS. Check out the
> source "listing-3-1"
>
> Later you can also check out Cloudera's Flume:
> https://github.com/cloudera/flume/wiki
>
> --
> BR
>
> Alexander Fahlke
> Java Developer
> www.nurago.com | www.fahlke.org
>
>
> On Mon, May 14, 2012 at 7:24 AM, Amith D K <amit...@huawei.com> wrote:
>
>> U can even use put/copyFromLocal
>>
>> both are similar and do the job via terminal.
>>
>> Or u can write a simple client program to do the job :)
>>
>> Amith
>>
>>
>> ________________________________________
>> From: samir das mohapatra [samir.help...@gmail.com]
>> Sent: Sunday, May 13, 2012 9:13 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: How to load raw log file into HDFS?
>>
>> Hi
>> To load any file from local
>> Command:
>> syntax: hadoop fs -copyFromLocal <LOCAL_FILE_PATH> <HDFS_FILE_PATH>
>> Example: hadoop fs -copyFromLocal input/logs hdfs://localhost/user/dataset/
>>
>> More commands:
>> http://hadoop.apache.org/common/docs/r0.17.1/hdfs_shell.html
>>
>>
>> On Sun, May 13, 2012 at 9:53 AM, AnExplorer <satishtha...@gmail.com>
>> wrote:
>>
>>>
>>> Hi, I am a novice in Hadoop. Kindly suggest how we load log files into
>>> hdfs.
>>> Please suggest the command and steps.
>>> Thanks in advance!!
>>> --
>>> View this message in context:
>>>
>> http://old.nabble.com/How-to-load-raw-log-file-into-HDFS--tp33815208p33815208.html
>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.