dqkqd commented on code in PR #17796:
URL: https://github.com/apache/datafusion/pull/17796#discussion_r2383668162
##########
datafusion/core/src/datasource/file_format/csv.rs:
##########
@@ -470,6 +471,47 @@ mod tests {
Ok(())
}
+ #[tokio::test]
+ async fn test_infer_schema_stream_separated_chunks_with_nulls() -> Result<()> {
Review Comment:
DataFusion infers the data type of each chunk separately, then combines all
the possible types.
This test creates a
[ChunkedStore](https://docs.rs/object_store/latest/object_store/chunked/struct.ChunkedStore.html)
that reads each line as a separate chunk (one of which contains only nulls),
then ensures that type inference isn't skewed by the null-only chunk.
I should have added a comment to make the test clearer.
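
To illustrate the idea, here is a minimal standalone sketch of per-chunk inference followed by a merge. It is not DataFusion's actual implementation: `Candidate`, `infer_chunk`, and `resolve` are hypothetical names, the three-type lattice is a simplification, and treating an empty field as null is an assumption made for the example.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified set of candidate types. The derived `Ord`
// orders them from narrowest to widest: Int64 < Float64 < Utf8.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Candidate {
    Int64,
    Float64,
    Utf8,
}

// Collect the candidate types seen in one chunk of a single column.
// Assumption: an empty field is a null and contributes no candidate,
// so a null-only chunk yields an empty set.
fn infer_chunk(values: &[&str]) -> BTreeSet<Candidate> {
    values
        .iter()
        .filter(|v| !v.is_empty())
        .map(|v| {
            if v.parse::<i64>().is_ok() {
                Candidate::Int64
            } else if v.parse::<f64>().is_ok() {
                Candidate::Float64
            } else {
                Candidate::Utf8
            }
        })
        .collect()
}

// Union the candidates from every chunk, then pick the widest one.
// Returns `None` when every chunk contained only nulls.
fn resolve(chunks: &[Vec<&str>]) -> Option<Candidate> {
    let mut all = BTreeSet::new();
    for chunk in chunks {
        all.extend(infer_chunk(chunk));
    }
    all.into_iter().max()
}
```

With one value per chunk, as in the test, `resolve(&[vec!["1"], vec![""], vec!["2.5"]])` unions an `Int64` candidate, an empty set, and a `Float64` candidate. The null-only chunk adds nothing, so the merged result is decided by the non-null chunks alone.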
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]