[ 
https://issues.apache.org/jira/browse/GRIFFIN-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978915#comment-16978915
 ] 

Chitral Verma commented on GRIFFIN-297:
---------------------------------------

[~guoyp] Thanks for merging this. I'm not sure why, but even though the PR is
closed, the JIRA assignee field is still set to Unassigned. Shouldn't it be
set to me?

> Allow support for additional file based data sources
> ----------------------------------------------------
>
>                 Key: GRIFFIN-297
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-297
>             Project: Griffin
>          Issue Type: Improvement
>            Reporter: Chitral Verma
>            Priority: Major
>              Labels: features
>             Fix For: 0.6.0
>
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> In the current version of Apache Griffin (0.5.0), support for file-based
> data sources is very limited: only Avro and text files are supported.
> I propose adding support for additional file-based data sources such as
> Parquet, CSV, TSV, and ORC in batch mode. Since most of these sources
> already have first-class support in Spark, the implementation is
> straightforward.
> This feature will also allow data to be read directly from standalone files
> as well as from directories, on both local and distributed filesystems.
> A sample config would look like:
> {noformat}
> {
>   "name": "source",
>   "baseline": true,
>   "connectors": [
>     {
>       "type": "file",
>       "version": "1.7",
>       "config": {
>         "format": "parquet",
>         "options": { 
>           "k1": "v1",
>           "k2": "v2"
>         },
>         "paths": [
>           "/home/chitral/path/to/source/",
>           "/home/chitral/path/to/test.parquet"
>         ]
>       }
>     }
>   ]
> }{noformat}
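
As a rough illustration of how a connector like the one above could map onto a Spark read, here is a minimal Python sketch. It is not Griffin's implementation (Griffin itself is written in Scala). The helper name `build_read_plan`, the set of accepted formats, and the validation rules are assumptions made for this example. The only Spark-specific fact it relies on is that Spark has no separate TSV source, so tab-separated files are read through the CSV source with a tab separator.

```python
import json

# Assumed set of formats for this sketch: the ones named in the issue plus
# the Avro and text formats that are already supported.
SUPPORTED_FORMATS = {"parquet", "csv", "tsv", "orc", "avro", "text"}


def build_read_plan(connector: dict) -> dict:
    """Translate one "file" connector into arguments for a Spark reader (hypothetical helper)."""
    if connector.get("type") != "file":
        raise ValueError("not a file connector")

    config = connector.get("config", {})
    fmt = str(config.get("format", "")).lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")

    paths = list(config.get("paths", []))
    if not paths:
        raise ValueError("at least one path is required")

    # Copy the options so the caller's config is never mutated.
    options = dict(config.get("options", {}))

    # Spark has no dedicated TSV source: read it as CSV with a tab separator,
    # unless the user already supplied their own "sep".
    if fmt == "tsv":
        fmt = "csv"
        options.setdefault("sep", "\t")

    return {"format": fmt, "options": options, "paths": paths}


sample = json.loads("""
{
  "name": "source",
  "baseline": true,
  "connectors": [
    {
      "type": "file",
      "version": "1.7",
      "config": {
        "format": "parquet",
        "options": {"k1": "v1", "k2": "v2"},
        "paths": [
          "/home/chitral/path/to/source/",
          "/home/chitral/path/to/test.parquet"
        ]
      }
    }
  ]
}
""")

plan = build_read_plan(sample["connectors"][0])
```

With PySpark available, such a plan could then be applied roughly as `spark.read.format(plan["format"]).options(**plan["options"]).load(plan["paths"])`, since `load` accepts a list of paths that may mix directories and single files.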



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
