cgivre commented on pull request #2400: URL: https://github.com/apache/drill/pull/2400#issuecomment-990305838
> @pjfanning Thanks for this PR. You've hit on an issue that is subject to a lot of different opinions: how do you read data without a schema, and what do you do when the data is inconsistent? I'm approaching this from the data scientist/user perspective, and I am just one data point, but here goes. I really like the way Pandas handles this situation. Specifically, Pandas does attempt to infer data types, but it also gives the user a lot of control over what happens when things go wrong. I think Pandas lets the user decide whether a bad value throws an exception, is ignored, or is coerced. I don't know if that's more work than you want to do, but I do feel that logging isn't quite enough, because it assumes that the user has logging enabled and can view the logs. If there were a config option for error handling, it could be changed at query time with the `table()` function.
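
To make the three behaviors mentioned in the comment concrete, here is a minimal sketch in plain Python. It is not Drill code and does not call pandas; the function name `to_numbers` and the `errors` values are hypothetical, loosely modeled on the `errors` argument of `pandas.to_numeric`. It only illustrates what "throw, ignore, or coerce" could mean for a column with one bad value.

```python
import math


def to_numbers(values, errors="raise"):
    """Convert each value to float, handling bad values according to `errors`.

    Hypothetical modes (assumed for illustration, not an existing Drill option):
      "raise"  -> raise ValueError on the first value that cannot be parsed
      "coerce" -> replace unparseable values with NaN
      "ignore" -> leave unparseable values unchanged
    """
    if errors not in ("raise", "coerce", "ignore"):
        raise ValueError(f"unknown errors mode: {errors!r}")
    result = []
    for value in values:
        try:
            result.append(float(value))
        except (TypeError, ValueError):
            if errors == "raise":
                raise ValueError(f"cannot convert {value!r} to a number") from None
            # "coerce" substitutes NaN; "ignore" keeps the original value.
            result.append(math.nan if errors == "coerce" else value)
    return result


column = ["1", "2.5", "n/a"]
coerced = to_numbers(column, errors="coerce")  # bad value becomes NaN
ignored = to_numbers(column, errors="ignore")  # bad value is kept as-is
```

With the default `errors="raise"`, the same column raises a `ValueError`, so the user sees the failure directly instead of having to find it in a log.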
