Kasper Sørensen created METAMODEL-148:
-----------------------------------------

             Summary: Add a HdfsResource implementation
                 Key: METAMODEL-148
                 URL: https://issues.apache.org/jira/browse/METAMODEL-148
             Project: Apache MetaModel
          Issue Type: New Feature
            Reporter: Kasper Sørensen
            Assignee: Kasper Sørensen


I suggest implementing a Resource class that allows reading and writing 
files in Hadoop's HDFS.

Background:

Many of our file-based DataContext implementations accept a Resource, an 
interface that abstracts the file system. We currently have implementations 
such as FileResource, UrlResource and ClasspathResource.
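
To illustrate the idea of a file-system abstraction, here is a minimal sketch 
in Java. It does not use MetaModel's actual Resource interface: SimpleResource, 
InMemoryResource and CsvLineCounter are hypothetical names made up for this 
example, and the in-memory class merely stands in for a remote backend. The 
point is only that the consuming code never needs to know where the bytes come 
from, so an HDFS-backed implementation could presumably be swapped in the same 
way.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a resource abstraction.
// This is NOT MetaModel's real Resource interface.
interface SimpleResource {
    String getName();

    InputStream read() throws IOException;

    OutputStream write() throws IOException;
}

// In-memory implementation, used here in place of a remote backend such as
// HDFS so that the sketch runs without any external dependencies.
final class InMemoryResource implements SimpleResource {
    private final String name;
    private byte[] content = new byte[0];

    InMemoryResource(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public InputStream read() {
        return new ByteArrayInputStream(content);
    }

    @Override
    public OutputStream write() {
        // The buffered bytes replace the stored content when the stream is closed.
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                content = toByteArray();
            }
        };
    }
}

// Consumer code that depends only on the abstraction, not on the backend.
final class CsvLineCounter {
    static int countLines(SimpleResource resource) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(resource.read(), StandardCharsets.UTF_8))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }
}
```

A caller would write to the resource through write(), close the stream, and 
then hand the same object to CsvLineCounter.countLines(...). Nothing in the 
counter changes if the implementation behind SimpleResource is replaced.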

A request I often get is to also support Hadoop. Obviously the ideal Hadoop 
integration would not use MetaModel's query-based approach to data access at 
all, but for many simple use cases it does not matter much whether the job 
runs natively in Hadoop (for example in MapReduce) or the process simply 
fetches the file over the wire. For example, I have seen many cases of 
small-ish CSV files on Hadoop where it would actually be quicker to read 
through the file on a client than to submit a job to Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
