[jira] [Comment Edited] (DRILL-4892) Swift Documentation

Sudheesh Katkam (JIRA) Fri, 16 Sep 2016 11:32:33 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497034#comment-15497034
 ]


Sudheesh Katkam edited comment on DRILL-4892 at 9/16/16 6:31 PM:
-----------------------------------------------------------------

From email:

{quote}
AFAIK, there is no documentation, and I am not sure anyone has tried it before. 
That said, according to \[1\], the Hadoop OpenStack module lets Apache Hadoop 
applications, including MapReduce jobs, read and write data to and from 
instances of the OpenStack Swift object store. Drill uses the HDFS client 
library, so using Swift through Drill should be possible.

My guess: create a storage plugin named “swift” and copy the contents of the 
“dfs” plugin into it. I am not sure exactly what the contents of “swift” should 
be; see \[1\] and \[2\]. The parameters and values listed in the “Configuring” 
section of \[1\] should be provided through the “config” map in the storage 
plugin (or maybe through conf/core-site.xml in the Drill installation 
directory).

Something like:
\{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": \{
    ...
  \},
  "formats": \{
    ...
  \},
  "config": \{
    ...
  \}
\}

A roundabout way would be to reach Swift through its S3-compatible API \[3\]. 
Again, I do not know the exact configuration details.

Once you get things to work, you can also add a section to the Drill docs based 
on your experience!

Thank you,
Sudheesh

\[1\] https://hadoop.apache.org/docs/stable2/hadoop-openstack/index.html
\[2\] http://drill.apache.org/docs/s3-storage-plugin/
\[3\] https://github.com/openstack/swift3
{quote}
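
As an untested sketch of how the “config” map in the quoted suggestion might be filled in: the property names below are the ones listed in the “Configuring” section of \[1\], and the service name “privatecloud” is taken from the host part of the connection URL (swift://container.service/path). The auth URL and every EXAMPLE_ value are placeholders, not real settings. The “workspaces” and “formats” sections are left out here and would be copied from the “dfs” plugin. This also assumes the hadoop-openstack jar has been made available on Drill's classpath, which nobody on this issue has confirmed works.

```json
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "config": {
    "fs.swift.impl": "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
    "fs.swift.service.privatecloud.auth.url": "https://EXAMPLE_KEYSTONE_HOST:5000/v2.0/tokens",
    "fs.swift.service.privatecloud.username": "EXAMPLE_USER",
    "fs.swift.service.privatecloud.password": "EXAMPLE_PASSWORD",
    "fs.swift.service.privatecloud.tenant": "EXAMPLE_TENANT",
    "fs.swift.service.privatecloud.public": "true"
  }
}
```

If the “config” map turns out not to be honored for these keys, the same property names and values could instead be tried in conf/core-site.xml, as the quoted email suggests.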


> Swift Documentation
> -------------------
>
>                 Key: DRILL-4892
>                 URL: https://issues.apache.org/jira/browse/DRILL-4892
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.6.0, 1.8.0
>            Reporter: Matt Keranen
>
> The Drill FAQ (https://drill.apache.org/faq/) suggests Swift is a data source:
> "Cloud storage: Amazon S3, Google Cloud Storage, Azure Blob Storage, Swift"
> However, there appears to be no documentation.
> Swift-specific docs would be very useful. We have a large Swift installation 
> and using Drill over files in it would be a valuable feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
