WenyXu commented on issue #3725: URL: https://github.com/apache/incubator-opendal/issues/3725#issuecomment-1845278633
> Every `read` in opendal returns a `reader`: if the application calls `seek` on this reader, we will need to send a new request to read the data under the hood. I'm guessing the stream calls `seek` internally to read the data.

Yes. I added a debug case here: https://github.com/WenyXu/greptimedb/blob/3b6562c546f28a755706618bc12f8917db697783/tests-integration/src/tests/instance_test.rs#L647-L677

It requires some steps before starting:

1. Upload the parquet file (`/tests/data/parquet/tsbs.data`) to S3:
   ```
   aws s3 cp ./tests/data/parquet/tsbs.data s3://bucket/test-import/tsbs.data
   ```
2. Add the S3 secrets to `Connection`: https://github.com/WenyXu/greptimedb/blob/3b6562c546f28a755706618bc12f8917db697783/tests-integration/src/tests/instance_test.rs#L674

Some helpful info:

1. The parquet stream is built at: https://github.com/WenyXu/greptimedb/blob/3b6562c546f28a755706618bc12f8917db697783/src/operator/src/statement/copy_table_from.rs#L203-L224
2. The stream's `next` is invoked at: https://github.com/WenyXu/greptimedb/blob/3b6562c546f28a755706618bc12f8917db697783/src/operator/src/statement/copy_table_from.rs#L315-L341

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
