[ https://issues.apache.org/jira/browse/BAHIR-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15631140#comment-15631140 ]

ASF GitHub Bot commented on BAHIR-67:
-------------------------------------

Github user ckadner commented on the issue:

    https://github.com/apache/bahir/pull/25

    @sourav-mazumder please add a description to this PR, which could include
    any outstanding issues. Also tag the PR title with the JIRA key, and add
    the tag `[WIP]` while work on this PR is ongoing, i.e. the title could
    look like `"[BAHIR-67][WIP] Create WebHDFS data source for Spark"`
    -- Thanks


> WebHDFS Data Source for Spark SQL
> ---------------------------------
>
>                 Key: BAHIR-67
>                 URL: https://issues.apache.org/jira/browse/BAHIR-67
>             Project: Bahir
>          Issue Type: Improvement
>          Components: Spark SQL Data Sources
>            Reporter: Sourav Mazumder
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Ability to read/write data in Spark from/to HDFS of a remote Hadoop cluster.
> In today's world of analytics, many use cases need the capability to access
> data from multiple remote data sources in Spark. Though Spark integrates
> well with a local Hadoop cluster, it lacks the capability to connect to a
> remote Hadoop cluster. In reality, however, not all enterprise data is in
> Hadoop, and running a Spark cluster locally with the Hadoop cluster is not
> always a solution.
> In this improvement we propose to create a connector for accessing data
> (read and write) from/to HDFS of a remote Hadoop cluster from Spark using
> the WebHDFS API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)