alexrafferty-qoria opened a new issue, #17222:
URL: https://github.com/apache/datafusion/issues/17222

   ### Is your feature request related to a problem or challenge?
   
   DataFusion can't read JSONL/ND-JSON files whose rows are arrays rather 
than objects, e.g.:
   
   ```
   [1, 2, 3]
   [4, 5, 6]
   [7, 8, 9]
   ```
   
   ### Describe the solution you'd like
   
   It would be great if DataFusion could directly consume ND-JSON files 
containing top-level arrays, mapping each array to a single field with a 
customisable name, like `data` or `items`.
   
   I notice that the underlying arrow library seems to support this concept via 
`datafusion::arrow::json::ReaderBuilder::new_with_field`, but DataFusion's JSON 
`FileFormat` doesn't seem to expose it.
   
   ### Describe alternatives you've considered
   
   At present I am working around the issue with a pre-processing step that 
wraps each line in a `{ "data": ... }` object. However, this comes with a 
heavy performance penalty.
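
For illustration, a minimal sketch of what such a pre-processing step might look like, written in Python with only the standard library. The function name `wrap_ndjson_lines` and its parameters are hypothetical and not part of DataFusion or arrow. It assumes every non-empty input line is already a valid JSON value, so it wraps lines by string concatenation without parsing them. It still needs a full extra pass over the data before DataFusion can read it.

```python
import io
import json


def wrap_ndjson_lines(src, dst, field_name="data"):
    """Wrap each non-empty line of `src` as {"<field_name>": <line>} and write it to `dst`.

    Assumption: each non-empty line is already valid JSON, so it is not re-parsed.
    Returns the number of lines written.
    """
    # json.dumps escapes the field name so the output stays valid JSON.
    prefix = "{" + json.dumps(field_name) + ": "
    count = 0
    for line in src:
        value = line.strip()
        if not value:
            continue  # skip blank lines
        dst.write(prefix + value + "}\n")
        count += 1
    return count


# Usage with in-memory text streams; real files opened in text mode work the same way.
src = io.StringIO("[1, 2, 3]\n[4, 5, 6]\n[7, 8, 9]\n")
dst = io.StringIO()
wrap_ndjson_lines(src, dst)
print(dst.getvalue(), end="")
```

Each output line is then an object with a single `data` field, which is the shape the existing JSON reader accepts.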
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

