Hi,

What volume of data per hour or per day are you looking to put into HDFS? For dumping source data into HDFS there are a few options:
Option 1
=======
Have parallel threads dumping raw data into HDFS from your source.

Option 2
=======
Design how your objects will look, then write code to convert the raw input files into SequenceFiles and dump those into HDFS.

The community may have more options; it depends on your use case.

Regards
sanjay

From: Agarwal, Nikhil <[email protected]>
Reply-To: [email protected]
Date: Friday, May 31, 2013 12:24 AM
To: [email protected]
Subject: RE: MapReduce on Local FileSystem

Hi,

Thank you for your reply. One simple answer can be to reduce the time taken for ingesting the data in HDFS.

Regards,
Nikhil

From: Sanjay Subramanian [mailto:[email protected]]
Sent: Friday, May 31, 2013 12:50 PM
To: [email protected]
Cc: [email protected]
Subject: Re: MapReduce on Local FileSystem

Basic question: why would you want to do that? Also, I think the MapR Hadoop distribution has an NFS-mountable HDFS.

Sanjay

Sent from my iPhone

On May 30, 2013, at 11:37 PM, "Agarwal, Nikhil" <[email protected]> wrote:

Hi,

Is it possible to run MapReduce on multiple nodes using the local file system (file:///)? I am able to run it in a single-node setup, but in a multi-node setup the "slave" nodes are not able to access the "jobtoken" file, which is present in hadoop.tmp.dir on the "master" node.

Please let me know if it is possible to do this.

Thanks & Regards,
Nikhil

CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited.
If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.
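
A minimal sketch of Option 1 from the reply at the top of this thread (parallel threads dumping raw files into HDFS), not something posted by anyone in the thread. It assumes the `hadoop` command is on the PATH and shells out to `hadoop fs -put` once per file. The function names, the worker count and the example paths are hypothetical, and a real loader would also need retries and error reporting.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_put_command(local_path, hdfs_dir):
    """Return the argv that copies one local file into an HDFS directory."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]


def run_command(argv):
    """Run one command and return its exit code (0 means success)."""
    return subprocess.run(argv).returncode


def upload_all(local_paths, hdfs_dir, workers=4, runner=run_command):
    """Copy files concurrently; return a dict of {local_path: exit_code}.

    `runner` is injectable so the function can be exercised without a cluster.
    """
    paths = list(local_paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread launches one `hadoop fs -put` process.
        codes = pool.map(lambda p: runner(build_put_command(p, hdfs_dir)), paths)
        return dict(zip(paths, codes))
```

For example, `upload_all(["a.log", "b.log"], "/data/raw")` would start both copies at once; any non-zero exit code in the result marks a file that needs another attempt. Whether this actually reduces ingest time depends on where the bottleneck is (source disk, network, or the cluster itself).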
