cgivre commented on pull request #2400:
URL: https://github.com/apache/drill/pull/2400#issuecomment-990305838


   @pjfanning Thanks for this PR.  You've hit on an issue that attracts a lot 
of different opinions: how do you read data without a schema, and what do you 
do when the data is inconsistent?
   
   I'm approaching this from the data scientist/user perspective, and I am just 
one data point, but here goes.  I really like the way Pandas handles this 
situation.  Specifically, Pandas attempts to infer data types, but it also 
gives the user a lot of control over what happens when inference fails.  I 
think Pandas lets the user decide whether it throws an exception, ignores the 
bad value, or coerces it.
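
   To make those three behaviors concrete, here is a minimal sketch in plain 
Python.  It is loosely modeled on the `errors` argument of `pandas.to_numeric` 
(which accepts `"raise"` and `"coerce"`, and in older releases `"ignore"`).  
The function name `to_numbers` and its signature are hypothetical, made up for 
illustration; this is not Pandas or Drill code.

```python
import math


def to_numbers(values, errors="raise"):
    """Parse each value as a float; `errors` picks the bad-value policy.

    Hypothetical helper, for illustration only:
      "raise"  -> raise ValueError on the first unparseable value
      "coerce" -> replace the unparseable value with NaN
      "ignore" -> keep the unparseable value unchanged
    """
    if errors not in ("raise", "coerce", "ignore"):
        raise ValueError(f"unknown errors mode: {errors!r}")

    result = []
    for value in values:
        try:
            result.append(float(value))
        except (TypeError, ValueError):
            if errors == "raise":
                raise ValueError(f"cannot parse {value!r} as a number") from None
            # "coerce" substitutes NaN; "ignore" passes the raw value through.
            result.append(math.nan if errors == "coerce" else value)
    return result


column = ["1", "2.5", "oops"]
coerced = to_numbers(column, errors="coerce")  # bad value becomes NaN
ignored = to_numbers(column, errors="ignore")  # bad value is left as "oops"
```

   The point is that the caller chooses the policy per call, rather than 
having to find out from a log afterwards that a value was mishandled.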
   
   I don't know if that's more work than you want to do, but I do feel that 
logging alone isn't enough, because it assumes that the user has logging 
enabled and can view the logs.  If there were a config option for error 
handling, it could be changed at query time with the `table()` function.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


