cnauroth commented on PR #7418:
URL: https://github.com/apache/hadoop/pull/7418#issuecomment-2680008490

> > @cnauroth @anujmodi2021 have either of you two implemented the vector read API yet?
> >
> > I ask as this PR currently maps the readVectored/3 call to the readVectored/2 call unless overridden, so the default implementation will leak buffers on failure, even if a release function is passed in.
> >
> > If I change it to passing the release call down, then any input stream which implemented readVectored/2 will not have the readVectored/3 call invoking it, unless they override that explicitly too. In this PR, everything in hadoop common does, and I will in S3AInputStream.
> >
> > I'm just trying to work out the best design for other streams. If all the implementations are in the hadoop source tree, I can do the overrides there and have a default which does release buffers everywhere else.
> >
> > * @mukund-thakur @ahmarsuhail @saikatroy038 @shameersss1
>
> Hi @steveloughran, I am working on the vectored read API feature from the ABFS driver team. We are still working on the design part of the feature and will pick up the implementation soon.

Hello @steveloughran !

GCS has an implementation of vectored read, overriding readVectored/2 here:

https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/src/main/java/com/google/cloud/hadoop/fs/gcs/GoogleHadoopFSInputStream.java#L176

Implementation details here:

https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/master/gcs/src/main/java/com/google/cloud/hadoop/fs/gcs/VectoredIOImpl.java

This is on the master branch and 3.0 release line, which is not yet in mainstream Dataproc use. We don't have vectored read in version 2.2 or earlier.

It sounds like once this change is in a Hadoop release, GCS should plan on picking this up and overriding readVectored/3. Do I have it right?

CC: @arunkumarchacko
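
For anyone following the dispatch question in the quoted text, here is a minimal self-contained sketch of it. Everything in it is a hypothetical, simplified stand-in: `VectoredReadable`, `LegacyStream`, `ReleasingStream` and `LeakDemo` are illustrative names, the methods return a list rather than completing futures, and none of the signatures are the ones in Hadoop or in this PR. It only shows why a three-argument default that delegates to the two-argument form never calls the release function, and one possible way an override could return buffers on failure.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

/** Hypothetical stand-in for a vectored-read stream; NOT the Hadoop interface. */
interface VectoredReadable {

  /** Two-argument form: buffers come from allocate and are never handed back. */
  List<ByteBuffer> readVectored(List<Integer> lengths, IntFunction<ByteBuffer> allocate);

  /**
   * Three-argument form. This default only delegates to the two-argument
   * form, so the release callback is ignored, even when the read fails.
   */
  default List<ByteBuffer> readVectored(List<Integer> lengths,
      IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) {
    return readVectored(lengths, allocate);
  }
}

/** Overrides only the two-argument form, as an existing connector might. */
class LegacyStream implements VectoredReadable {
  @Override
  public List<ByteBuffer> readVectored(List<Integer> lengths,
      IntFunction<ByteBuffer> allocate) {
    List<ByteBuffer> out = new ArrayList<>();
    for (int len : lengths) {
      out.add(allocate.apply(len));
    }
    // Simulate a failure after every buffer has been allocated.
    throw new IllegalStateException("simulated read failure");
  }
}

/** Also overrides the three-argument form and releases buffers on failure. */
class ReleasingStream extends LegacyStream {
  @Override
  public List<ByteBuffer> readVectored(List<Integer> lengths,
      IntFunction<ByteBuffer> allocate, Consumer<ByteBuffer> release) {
    List<ByteBuffer> allocated = new ArrayList<>();
    // Wrap the allocator so every buffer handed out is remembered.
    IntFunction<ByteBuffer> tracking = len -> {
      ByteBuffer b = allocate.apply(len);
      allocated.add(b);
      return b;
    };
    try {
      return readVectored(lengths, tracking);
    } catch (RuntimeException e) {
      allocated.forEach(release); // hand every buffer back before rethrowing
      throw e;
    }
  }
}

class LeakDemo {
  /** Returns how many buffers are still outstanding after a failed read. */
  static int outstandingAfterFailure(VectoredReadable stream) {
    int[] outstanding = {0};
    IntFunction<ByteBuffer> allocate = len -> {
      outstanding[0]++;
      return ByteBuffer.allocate(len);
    };
    Consumer<ByteBuffer> release = b -> outstanding[0]--;
    try {
      stream.readVectored(Arrays.asList(16, 32), allocate, release);
    } catch (IllegalStateException expected) {
      // The failure is simulated; only the buffer accounting matters here.
    }
    return outstanding[0];
  }
}
```

In this sketch, `LeakDemo.outstandingAfterFailure(new LegacyStream())` leaves both buffers outstanding because the call goes through the delegating default, while `new ReleasingStream()` leaves none. The tracking-allocator wrapper is just one assumed way to do the cleanup, not necessarily what the PR does.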
