Re: [I] support `skip_rows` for `CsvFormat` [arrow-datafusion]

via GitHub Fri, 26 Jan 2024 14:03:18 -0800


comphead commented on issue #8824:
URL: 
https://github.com/apache/arrow-datafusion/issues/8824#issuecomment-1912754490


   I think skipping N rows at the file level doesn't make much sense. What we 
can probably do is skip N rows at the dataframe level, but even then there is 
no guarantee exactly which 2 rows will be skipped, because of ordering, 
shuffling, etc. IMHO this looks more like a user task than a DataFusion task, 
since the user has more context when executing the query.
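   
   As a rough illustration of the "user task" route, here is a minimal sketch 
in plain Python (standard library only, not DataFusion API) that drops the 
first N physical lines of a CSV source before parsing. The helper name 
`read_csv_skipping` and the sample data are hypothetical, and the sketch assumes 
the skipped lines contain no quoted fields that span multiple lines.

```python
import csv
import io
from itertools import islice


def read_csv_skipping(source, skip_rows):
    """Drop the first `skip_rows` physical lines, then parse the rest as CSV.

    `source` is any iterable of text lines (an open file, io.StringIO, ...).
    The first line that survives the skip is treated as the header row.
    """
    remaining = islice(source, skip_rows, None)
    return list(csv.DictReader(remaining))


# Hypothetical input: two junk lines written by an export tool, then the CSV.
raw = io.StringIO("exported by some tool\n# comment line\nid,name\n1,a\n2,b\n")
rows = read_csv_skipping(raw, skip_rows=2)
```

   Because the skip happens while the file is read sequentially, "the first N 
lines" is well defined here, which is not the case once the data has been 
partitioned or reordered by a query engine.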
   
   I checked Spark but haven't found built-in functionality for this, probably 
because of the concerns above.
   
   @universalmind303 what is your vision as the ticket owner?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
