Unify Map-Reduce and Streaming to take the same globbed input specification
---------------------------------------------------------------------------

                 Key: HADOOP-619
                 URL: http://issues.apache.org/jira/browse/HADOOP-619
             Project: Hadoop
          Issue Type: Improvement
            Reporter: eric baldeschwieler


Right now streaming input is specified very differently from other map-reduce 
input.  It would be good if these two apps could take much more similar input 
specs.

In particular, -input in streaming expects a file or glob pattern, while MR takes 
a directory.  It would be cool if both could take a glob pattern of files and if 
both took a directory by default (with some pattern excluded to allow logs, 
metadata and other framework output to be safely stored).

We want to be sure that MR input is backward compatible over this change.  I 
propose that either a single file or a single directory should be accepted as 
an input.  Globs should only match directories if the pattern is '/' 
terminated, to avoid massive inputs specified by mistake.
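
To make the proposed rules concrete, here is a rough sketch in Python against the 
local filesystem.  It is not Hadoop code: the names expand_input, list_directory 
and EXCLUDED_PREFIXES are made up for illustration, and the exclusion rule (names 
starting with '_' or '.') is only an assumption, since the proposal does not say 
which pattern should be excluded.

```python
import glob
import os

# Assumption: the proposal only says "some pattern excluded"; these
# prefixes are a hypothetical stand-in for that pattern.
EXCLUDED_PREFIXES = ("_", ".")


def list_directory(directory):
    """Return the plain files directly inside `directory`, minus excluded names."""
    files = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not name.startswith(EXCLUDED_PREFIXES):
            files.append(path)
    return files


def expand_input(spec):
    """Expand one input spec into a list of input files."""
    is_glob = any(ch in spec for ch in "*?[")
    if not is_glob:
        # Backward-compatible cases: a single directory or a single file.
        if os.path.isdir(spec):
            return list_directory(spec)
        return [spec] if os.path.isfile(spec) else []

    matches = sorted(glob.glob(spec))
    if spec.endswith("/"):
        # Only a '/'-terminated glob may pull in whole directories.
        files = []
        for match in matches:
            if os.path.isdir(match):
                files.extend(list_directory(match))
        return files
    # Without the trailing '/', directories matched by the glob are skipped.
    return [m for m in matches if os.path.isfile(m)]
```

Under these assumptions, a spec like "logs/2006-*" would pick up only files whose 
names match, while "logs/2006-*/" would expand every matching directory, so 
pulling in whole directories always takes the explicit trailing '/'.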

Thoughts?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
