[
https://issues.apache.org/jira/browse/DRILL-8086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511604#comment-17511604
]
ASF GitHub Bot commented on DRILL-8086:
---------------------------------------
paul-rogers commented on pull request #2485:
URL: https://github.com/apache/drill/pull/2485#issuecomment-1077279388
@jnturton you make good points. I somewhat worry that folks who build large
distributed systems are a bit under-represented in our current wonderful group
of contributors, so I do see a drift toward small-scale concerns. (For example,
I've never seen any shop query 1000s of Excel files, but folks do that every
day for JSON, Parquet and CSV.)
Similarly, the data science folks want Drill to do it all in SQL. But, at
scale, people build out data pipelines so that the files which Drill queries
are the result of a process to produce a query-optimized format such as
well-partitioned Parquet.
The idea of having two forks (or modes) was an attempt to preserve Drill's
distributed system heritage, while also allowing the current contributors to go
wild on the data science use cases.
Maybe a simpler solution is to just have different "bootstrap" files: the
at-scale edition, the data science edition, etc. Each edition includes plugins
and defaults optimized for the target use case.
--
> Convert the CSV (AKA "compliant text") reader to EVF V2
> -------------------------------------------------------
>
> Key: DRILL-8086
> URL: https://issues.apache.org/jira/browse/DRILL-8086
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.19.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
>
> Work was done some time ago to convert the CSV reader to use EVF V2. Merge
> that work into the master branch.