Luciano Resende commented on BAHIR-67:

Thanks [~sourav-mazumder], it would be great to enable the high-level SQL APIs to 
go over remote webhdfs, which can also help in multi-cluster or 
cloud/hybrid cloud environments. Are you planning to submit a PR for this?

> Ability to read/write data in Spark from/to HDFS of a remote Hadoop Cluster
> ---------------------------------------------------------------------------
>                 Key: BAHIR-67
>                 URL: https://issues.apache.org/jira/browse/BAHIR-67
>             Project: Bahir
>          Issue Type: Improvement
>          Components: Spark SQL Data Sources
>    Affects Versions: Not Applicable
>            Reporter: Sourav Mazumder
>             Fix For: Spark-2.0.0
>   Original Estimate: 336h
>  Remaining Estimate: 336h
> In today's world of analytics, many use cases need the capability to access data 
> from multiple remote data sources in Spark. Though Spark integrates well 
> with a local Hadoop cluster, it has little support for 
> connecting to a remote Hadoop cluster. In reality, not all enterprise data 
> is in Hadoop, and running the Spark cluster locally with the Hadoop cluster 
> is not always an option.
> In this improvement we propose to create a connector for accessing data (read 
> and write) from/to the HDFS of a remote Hadoop cluster from Spark using the 
> webhdfs API.
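
For context, the webhdfs API mentioned in the issue is Hadoop's REST interface, where each file operation is an HTTP request against a URL of the form http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>. The following is only a rough sketch of how such URLs are put together, not the proposed connector itself; the helper name `webhdfs_url`, the host name, the port, and the file paths are all assumptions made for illustration.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation.

    Hypothetical helper for illustration; it only formats the URL and does
    not contact any cluster.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # 'op' selects the file operation, e.g. OPEN (read) or CREATE (write).
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed NameNode address and paths, for illustration only.
read_url = webhdfs_url("namenode.example.com", 50070, "/data/events.csv", "OPEN")
write_url = webhdfs_url(
    "namenode.example.com", 50070, "/data/out.csv", "CREATE", overwrite="true"
)
print(read_url)
print(write_url)
```

A read is issued as an HTTP GET on the OPEN URL and a write as an HTTP PUT on the CREATE URL; a connector along the lines of this issue would presumably hide these calls behind the Spark data source API.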

This message was sent by Atlassian JIRA
