comphead commented on issue #8824: URL: https://github.com/apache/arrow-datafusion/issues/8824#issuecomment-1912754490
I don't think skipping N rows at the file level makes much sense. What we could probably do is skip N rows at the DataFrame level, but even then there is no guarantee exactly which 2 rows would be skipped, because of ordering, shuffling, etc.

IMHO this looks more like a user task than a DataFusion task, since the user has more context when executing the query. I checked Spark but haven't found any built-in functionality for this, probably because of the concerns above.

@universalmind303 what is your vision as the ticket owner?
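
To illustrate the ordering concern, here is a minimal sketch in plain Python. It does not use DataFusion or Spark APIs; the `skip_rows` helper and the two partitions are hypothetical and only show why "skip the first N rows" is ambiguous unless the rows are ordered first.

```python
from itertools import islice

def skip_rows(rows, n):
    """Drop the first n rows of an iterable, in whatever order it yields them."""
    return list(islice(rows, n, None))

# Two hypothetical partitions of the same table. A parallel engine may
# hand them back in either order.
part_a = [(3, "c"), (1, "a")]
part_b = [(4, "d"), (2, "b")]

run_1 = skip_rows(part_a + part_b, 2)  # partitions arrive as a, then b
run_2 = skip_rows(part_b + part_a, 2)  # partitions arrive as b, then a

# Without an explicit ordering, different rows survive in each run.
unordered_differs = run_1 != run_2

# Sorting first makes the skipped rows independent of arrival order.
sorted_1 = skip_rows(sorted(part_a + part_b), 2)
sorted_2 = skip_rows(sorted(part_b + part_a), 2)
```

In other words, the user would need to supply the ordering (or know the data has a single, stable one) for a row skip to be well defined.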
