How can I make a directory, rather than a file, the InputSplit? I want the input split handed to a map task to be a directory, not a file. I will implement my own record reader, which will read the appropriate data from the directory and pass the records to the map task.
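A rough sketch of the grouping I have in mind, in plain Java with no Hadoop classes at all. The class name DirectorySplits and the method groupByDirectory are made up for illustration. My assumption is that logic like this would sit inside getSplits() of a custom InputFormat, with each group wrapped in a custom InputSplit, but I have not verified that this is the right place for it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: shows only the
// "one split per directory" grouping, using plain path strings.
class DirectorySplits {

    // Groups file paths by their parent directory. Each map entry
    // stands for one would-be split: the directory plus its files.
    static Map<String, List<String>> groupByDirectory(List<String> filePaths) {
        Map<String, List<String>> splits = new LinkedHashMap<>();
        for (String path : filePaths) {
            int slash = path.lastIndexOf('/');
            String dir;
            if (slash < 0) {
                dir = ".";                      // no directory component
            } else if (slash == 0) {
                dir = "/";                      // file directly under the root
            } else {
                dir = path.substring(0, slash); // everything before the file name
            }
            splits.computeIfAbsent(dir, k -> new ArrayList<>()).add(path);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Example paths are invented for this sketch.
        List<String> files = Arrays.asList(
                "/data/dirA/part-0",
                "/data/dirB/part-0",
                "/data/dirA/part-1");
        // One line per directory, i.e. per intended map task.
        groupByDirectory(files).forEach((dir, members) ->
                System.out.println(dir + " -> " + members));
    }
}
```

The record reader would then walk the files of a single group and emit records from them, so the map task never sees the individual files as separate splits.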
To put it another way: I have a list of directories distributed over HDFS, and I know that each of these directories is small enough to reside on a single node. I want each map task to be given one whole directory rather than the individual files in it. How can I do this?
Thanks, Akhil