You can indeed use file:/// URLs when the mount point is shared across all the nodes. Expect extreme I/O load on the machines hosting that mount point ;)
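A minimal sketch of what that invocation might look like, assuming the shared mount and paths from the question below; the `-m` flag (number of map tasks) is worth setting to cap how many copiers hit the NFS server at once. The `echo` makes this a dry run; remove it to actually copy.

```shell
SRC="file:///mnt/data"   # shared mount, visible on every hadoop node
DST="/input"             # destination directory in HDFS
MAPS=10                  # cap concurrent map tasks to limit load on the mount host

# Dry run: print the distcp command that would be executed.
# Remove 'echo' to run the copy for real on a live cluster.
echo hadoop distcp -m "$MAPS" "$SRC"/tr* "$DST"
```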
On Sat, Jan 23, 2010 at 8:57 AM, prasenjit mukherjee <[email protected]> wrote:

> I have hundreds of large files (~100MB) in a /mnt/ location which is
> shared by all my hadoop nodes. I was wondering if I could directly use
> "hadoop distcp file:///mnt/data/tr* /input" to parallelize/distribute the
> hadoop push. The hadoop push is indeed becoming a bottleneck for me, and
> any help in this regard is greatly appreciated. Currently I am using
> "hadoop fs -moveFromLocal ..." and it is taking too much time.
>
> -Thanks,
> Prasen

-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
