to HDFS of a remote Hadoop Cluster

Sourav Mazumder (JIRA) Thu, 13 Oct 2016 08:22:29 -0700

Sourav Mazumder created BAHIR-67:
------------------------------------

             Summary: Ability to read/write data in Spark from/to HDFS of a 
remote Hadoop Cluster
                 Key: BAHIR-67
                 URL: https://issues.apache.org/jira/browse/BAHIR-67
             Project: Bahir
          Issue Type: Improvement
          Components: Spark SQL Data Sources
    Affects Versions: Not Applicable
            Reporter: Sourav Mazumder
             Fix For: Spark-2.0.0



Many analytics use cases today need to access data from multiple remote data 
sources in Spark. Although Spark integrates well with a local Hadoop cluster, 
it has little support for connecting to a remote Hadoop cluster. In practice, 
not all enterprise data resides in Hadoop, and running the Spark cluster 
co-located with the Hadoop cluster is not always an option.

In this improvement we propose to create a connector that lets Spark read data 
from and write data to HDFS on a remote Hadoop cluster using the WebHDFS API.
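
As a rough illustration of what such a connector would do underneath, the 
sketch below builds the WebHDFS REST URLs for a read (op=OPEN) and a write 
(op=CREATE). It is not the proposed connector itself: the helper name 
webhdfs_url, the host namenode.example.com, the port 50070 and the file paths 
are all assumptions made for the example.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    Hypothetical helper for illustration only; not part of Spark or Bahir.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # 'op' selects the WebHDFS operation; extra parameters follow it.
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed NameNode address of the remote cluster (placeholder values).
HOST, PORT = "namenode.example.com", 50070

# Read: issued as an HTTP GET.
read_url = webhdfs_url(HOST, PORT, "/data/events.csv", "OPEN")

# Write: issued as an HTTP PUT.
write_url = webhdfs_url(HOST, PORT, "/data/out.csv", "CREATE", overwrite="true")

print(read_url)
print(write_url)
```

In WebHDFS, both requests are first sent to the NameNode, which answers with a 
redirect to a DataNode that serves or receives the file content. A connector 
would need to follow that redirect and handle authentication, which this 
sketch leaves out.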



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
