alamb opened a new issue, #8502: URL: https://github.com/apache/arrow-datafusion/issues/8502
### Is your feature request related to a problem or challenge?

DataFusion can now automatically read CSV and Parquet files in parallel (see https://github.com/apache/arrow-datafusion/issues/6325 for CSV).

It would be great to do the same for "NDJSON" files, namely files that have multiple JSON objects placed one after the other.

### Describe the solution you'd like

Basically, implement what is described in https://github.com/apache/arrow-datafusion/issues/6325 for JSON, and read a single large NDJSON (newline-delimited JSON) file in parallel.

### Describe alternatives you've considered

Some research may be required; I am not sure if finding record boundaries is feasible.

### Additional context

I found this while writing tests for https://github.com/apache/arrow-datafusion/issues/8451
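
On the record-boundary question, here is a rough sketch of one possible approach, not DataFusion's implementation. It assumes the file is strict NDJSON: one JSON value per line, so a raw newline byte can only be a record separator (JSON strings must escape newlines). Under that assumption, a file can be cut into arbitrary byte ranges, and each reader takes only the records whose first byte falls inside its range. The names `byte_ranges`, `read_partition`, and `read_parallel` are hypothetical and exist only for illustration; Python is used just to keep the sketch short.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(size, n):
    """Cut [0, size) into at most n contiguous (start, end) byte ranges."""
    step = max(1, -(-size // n))  # ceiling division, never zero
    return [(s, min(s + step, size)) for s in range(0, size, step)]


def read_partition(path, start, end):
    """Parse the records whose first byte lies in [start, end)."""
    records = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard through the next newline.
            # If byte start-1 is a newline, a record begins exactly at
            # `start` and is kept; otherwise the partial record belongs
            # to the previous range and is skipped.
            f.seek(start - 1)
            f.readline()
        # A record that starts before `end` is read in full, even if it
        # runs past `end`; the next range skips it via the rule above.
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            if line.strip():  # tolerate blank lines
                records.append(json.loads(line))
    return records


def read_parallel(path, n):
    """Read an NDJSON file with n independent range readers, in file order."""
    ranges = byte_ranges(os.path.getsize(path), n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = pool.map(lambda r: read_partition(path, *r), ranges)
    return [rec for part in parts for rec in part]
```

Because each range decides ownership purely from the position of a record's first byte, no record is read twice or dropped, and the readers need no coordination. The sketch would not work for pretty-printed JSON whose objects span multiple lines, which is outside the NDJSON assumption stated above.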
