nastra commented on code in PR #8168:
URL: https://github.com/apache/iceberg/pull/8168#discussion_r1284295926
##########
gcp/src/main/java/org/apache/iceberg/gcp/gcs/GCSFileIO.java:
##########
@@ -174,4 +181,46 @@ public void close() {
}
}
}
+
+ @Override
+ public Iterable<FileInfo> listPrefix(String prefix) {
+ GCSLocation location = new GCSLocation(prefix);
+ return () ->
+ client()
+ .list(location.bucket(), Storage.BlobListOption.prefix(location.prefix()))
+ .streamAll()
+ .map(
+ blob ->
+ new FileInfo(
+ String.format("gs://%s/%s", blob.getBucket(), blob.getName()),
+ blob.getSize(),
+ getCreateTimeMillis(blob)))
+ .iterator();
+ }
+
+ private long getCreateTimeMillis(Blob blob) {
+ if (blob.getCreateTimeOffsetDateTime() == null) {
+ return 0;
+ }
+ return blob.getCreateTimeOffsetDateTime().toInstant().toEpochMilli();
+ }
+
+ @Override
+ public void deletePrefix(String prefix) {
+ internalDeleteFiles(
+ () ->
+ Streams.stream(listPrefix(prefix))
+ .map(fileInfo -> BlobId.fromGsUtilUri(fileInfo.location()))
+ .iterator());
+ }
+
+ @Override
+ public void deleteFiles(Iterable<String> pathsToDelete) throws BulkDeletionFailureException {
+ internalDeleteFiles(() -> Streams.stream(pathsToDelete).map(BlobId::fromGsUtilUri).iterator());
+ }
+
+ private void internalDeleteFiles(Iterable<BlobId> blobIdsToDelete) {
Review Comment:
I was wondering whether we should pass the `Stream<BlobId>` straight through to this
method, rather than wrapping it in an `Iterable<BlobId>`, so that we end up with something like:
```
@Override
public void deletePrefix(String prefix) {
  internalDeleteFiles(
      Streams.stream(listPrefix(prefix))
          .map(fileInfo -> BlobId.fromGsUtilUri(fileInfo.location())));
}

@Override
public void deleteFiles(Iterable<String> pathsToDelete) throws BulkDeletionFailureException {
  internalDeleteFiles(Streams.stream(pathsToDelete).map(BlobId::fromGsUtilUri));
}

private void internalDeleteFiles(Stream<BlobId> blobIdsToDelete) {
  Streams.stream(Iterators.partition(blobIdsToDelete.iterator(), gcpProperties.deleteBatchSize()))
      .forEach(batch -> client().delete(batch));
}
```
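To illustrate the batching step in isolation, here is a minimal, stdlib-only sketch of what partitioning a stream into fixed-size batches does. It is not the PR's code: `BatchSketch` and `forEachBatch` are hypothetical names, plain strings stand in for `BlobId`, and the consumer stands in for `client().delete(batch)`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

class BatchSketch {
  // Hypothetical helper: pulls elements lazily from the stream and hands them
  // to the action in lists of at most batchSize elements.
  static <T> void forEachBatch(Stream<T> items, int batchSize, Consumer<List<T>> action) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    Iterator<T> it = items.iterator();
    List<T> batch = new ArrayList<>(batchSize);
    while (it.hasNext()) {
      batch.add(it.next());
      if (batch.size() == batchSize) {
        action.accept(batch);
        batch = new ArrayList<>(batchSize); // start a fresh list for the next batch
      }
    }
    if (!batch.isEmpty()) {
      action.accept(batch); // trailing partial batch
    }
  }

  public static void main(String[] args) {
    // Stand-in for a delete call: just record each batch that would be deleted.
    List<List<String>> batches = new ArrayList<>();
    forEachBatch(Stream.of("gs://bucket/a", "gs://bucket/b", "gs://bucket/c"), 2, batches::add);
    System.out.println(batches);
  }
}
```

The point of the sketch is that the stream is consumed once and never materialized as a whole; only one batch is held in memory at a time.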
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]