[GitHub] [drill] paul-rogers commented on pull request #2485: DRILL-8086: Convert the CSV (AKA "compliant text") reader to EVF V2

GitBox Wed, 23 Mar 2022 23:24:58 -0700


paul-rogers commented on pull request #2485:
URL: https://github.com/apache/drill/pull/2485#issuecomment-1077279388



   @jnturton you make good points. I somewhat worry that folks who build large 
distributed systems are a bit under-represented in our current wonderful group 
of contributors, so I do see a drift toward small-scale concerns. (For example, 
I've never seen any shop query 1000s of Excel files, but folks do that every 
day for JSON, Parquet and CSV.)
   
   Similarly, the data science folks want Drill to do it all in SQL. But, at 
scale, people build out data pipelines so that the files which Drill queries 
are the result of a process to produce a query-optimized format such as 
well-partitioned Parquet.
   
   The idea of having two forks (or modes) was an attempt to preserve Drill's 
distributed system heritage, while also allowing the current contributors to go 
wild on the data science use cases.
   
   Maybe a simpler solution is to just have different "bootstrap" files: the 
at-scale edition, the data science edition, etc. Each edition includes plugins 
and defaults optimized for the target use case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
