[GitHub] [drill] dzamo edited a comment on pull request #2282: DRILL-7978: Fixed Width Format Plugin

GitBox Mon, 08 Nov 2021 05:44:32 -0800


dzamo edited a comment on pull request #2282:
URL: https://github.com/apache/drill/pull/2282#issuecomment-963159298



   > Let's consider a real world use case: some fixed width log generated by a 
database. Since the fields may be mashed together, there isn't a delimiter that 
you can use to divide the fields. You _could_, however, use the logRegex reader 
to do this. That point aside for the moment, the way I imagined someone using 
this was that different configs could be set up and linked to workspaces such 
that if a file was in the `mysql_logs` folder, it would use the mysql log 
config, and if it was in the `postgres` folder it would use another.
   
   @cgivre  This use case would still work after two `CREATE SCHEMA` statements 
(one per folder) to set the names and data types, wouldn't it?  The schemas 
would then be applied to every subsequent query.
   
   > My opinion here is that the goal should be to get the cleanest possible 
data to the user without the user having to rely on CASTs and other 
complicating factors.
   
   Let's drop the CASTs; those aren't fun.  So we're left with different ways a 
user can specify column names and types.
   
   1. With a `CREATE SCHEMA` against a directory.
   2. With an inline schema to a table function.
   3. With some plugin-specific format config that works for this plugin but 
generally not for others.
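   
   As a rough sketch of what 1 and 2 could look like for the `mysql_logs` 
folder: the `dfs.logs` workspace and the column names and types below are 
hypothetical, and this assumes the fixed width reader honours a provided schema 
the way Drill's text reader does, which is not confirmed here.
   
   ```sql
   -- Option 1: persist a schema for a directory.
   -- Workspace `dfs.logs` and all column names are hypothetical.
   CREATE OR REPLACE SCHEMA
     (log_ts VARCHAR, severity VARCHAR, message VARCHAR)
   FOR TABLE dfs.logs.`mysql_logs`;
   
   -- Assumption: later queries against that directory pick up the stored schema.
   SELECT * FROM dfs.logs.`mysql_logs`;
   
   -- Option 2: supply the same schema inline, for this one query only.
   SELECT *
   FROM TABLE(dfs.logs.`mysql_logs`(
     schema => 'inline=(log_ts VARCHAR, severity VARCHAR, message VARCHAR)'));
   ```
   
   A second `CREATE OR REPLACE SCHEMA` against the `postgres` folder would 
cover the other log layout without any plugin-specific config.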
   
   Each one requires some effort, and each one gets you to `select *` returning 
nice results (disclaimer: is this claim I'm making actually true?), which is 
super valuable.  So shouldn't we avoid the quirky option 3 and commit to 
options 1 and 2 consistently wherever we can?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
