R-JunmingChen commented on issue #35096:
URL: https://github.com/apache/arrow/issues/35096#issuecomment-1529412834

   > Sorry for the delay. `newlines_in_values` shouldn't actually affect the
   > resulting table. It mostly serves as a warning to the reader that the
   > source's JSON objects can't be reliably delimited by raw newlines - so a
   > more expensive chunking path is taken prior to each chunk being parsed
   > individually. Otherwise, parsing errors are very likely.
   >
   > In your case, when `newlines_in_values=false`, you would get an error if
   > you set `ReadOptions::block_size` to 64 (where the file size is 120).
   > However, it would work just fine with `newlines_in_values=true`.
   >
   > That being said, I'm not entirely sure why `newlines_in_values` isn't in
   > `ReadOptions` instead. Looking at the C++ implementation, the option
   > doesn't appear to be used by the parser at all.
   
   That resolves my confusion.
   Maybe we should refine the docs for parse_options? As written, it is hard
   to work out what `newlines_in_values` actually does.
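
To make the chunking point concrete, here is a minimal standard-library sketch of the two strategies described in the quoted explanation. It is not Arrow's implementation: the helper names (`chunk_at_newlines`, `chunk_at_object_boundaries`, `parse_objects`, `parse_all`) and the sample data are hypothetical, and the sizes are scaled down (a 50-character input with a block size of 16, rather than a 120-byte file with a block size of 64).

```python
import json

_decoder = json.JSONDecoder()


def parse_objects(text):
    """Parse a run of whitespace-separated JSON objects from one chunk."""
    objs, pos = [], 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return objs
        obj, pos = _decoder.raw_decode(text, pos)
        objs.append(obj)


def chunk_at_newlines(text, block_size):
    """Cheap path: end each block at the last raw newline inside it."""
    chunks, start = [], 0
    while start < len(text):
        end = start + block_size
        if end >= len(text):
            chunks.append(text[start:])
            break
        cut = text.rfind("\n", start, end)
        if cut == -1:
            raise ValueError("no newline found within block_size")
        chunks.append(text[start:cut + 1])
        start = cut + 1
    return chunks


def chunk_at_object_boundaries(text, block_size):
    """Costlier path: scan the structure so blocks end after whole objects."""
    chunks, start, pos = [], 0, 0
    while pos < len(text):
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        _, pos = _decoder.raw_decode(text, pos)
        if pos - start >= block_size:
            chunks.append(text[start:pos])
            start = pos
    if start < len(text):
        chunks.append(text[start:])
    return chunks


def parse_all(chunks):
    """Parse every chunk independently and concatenate the results."""
    return [obj for chunk in chunks for obj in parse_objects(chunk)]


# Two pretty-printed objects, each spanning several lines (50 characters).
data = '{\n  "a": 1,\n  "b": "x"\n}\n{\n  "a": 2,\n  "b": "y"\n}\n'
```

With this input, `parse_all(chunk_at_newlines(data, 16))` raises `json.JSONDecodeError` because the first block is cut in the middle of an object, while `parse_all(chunk_at_object_boundaries(data, 16))` returns both objects. With a block size at least as large as the input, the newline-based path also succeeds, and both paths produce the same result, which matches the point that the option should not change the resulting table.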


