gianm commented on a change in pull request #8903: S3 input source
URL: https://github.com/apache/incubator-druid/pull/8903#discussion_r349862132
##########
File path:
core/src/main/java/org/apache/druid/data/input/RetryingInputEntity.java
##########
@@ -32,30 +33,39 @@
@Override
default InputStream open() throws IOException
{
- return new RetryingInputStream<>(
+ RetryingInputStream<?> retryingInputStream = new RetryingInputStream<>(
this,
new RetryingInputEntityOpenFunction(),
getRetryCondition(),
RetryUtils.DEFAULT_MAX_TRIES
);
+ return CompressionUtils.decompress(retryingInputStream, getDecompressionPath());
}
/**
- * Directly opens an {@link InputStream} on the input entity.
+ * Directly opens an {@link InputStream} on the input entity. Decompression should be handled externally, this should
+ * return the raw stream for the object.
*/
default InputStream readFromStart() throws IOException
{
return readFrom(0);
}
/**
- * Directly opens an {@link InputStream} starting at the given offset on the input entity.
+ * Directly opens an {@link InputStream} starting at the given offset on the input entity. Decompression should be
+ * handled externally, this should return the raw stream for the object.
*
* @param offset an offset to start reading from. A non-negative integer counting
* the number of bytes from the beginning of the entity
*/
InputStream readFrom(long offset) throws IOException;
+
+ /**
+ * Get path to decompress a compressed stream for the entity
Review comment:
I had trouble making sense of this comment. Could you please consider
rewording it?
At first glance it sounds like a path on local disk that the compressed
stream will be decompressed to, but judging by the implementations, that
doesn't seem right.
On a second look, it appears to be the filename corresponding to the input
entity, used only to determine whether the stream needs to be decompressed.
The javadoc should say something like that.
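
To illustrate the second reading, here is a minimal sketch of a path that is never opened and serves only to pick a decompressor by file extension. It is not the PR's code: the class `DecompressionPathSketch` and its methods `decompress`, `gzip`, and `readAll` are hypothetical stand-ins built on the JDK alone, the `bucket/key...` paths are made up, and handling only `.gz` is an assumption made to keep the example short. The doc comment on `decompress` shows the kind of wording the javadoc could use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class DecompressionPathSketch
{
  /**
   * Hypothetical stand-in for a decompress(stream, path) helper.
   *
   * @param raw  the raw (possibly compressed) stream for the entity
   * @param path the name of the entity, e.g. an object key or file name. It is
   *             never opened or written to; only its extension is inspected to
   *             decide whether the stream needs to be decompressed.
   */
  public static InputStream decompress(InputStream raw, String path) throws IOException
  {
    if (path != null && path.endsWith(".gz")) {
      return new GZIPInputStream(raw);
    }
    // Unknown or missing extension: assume the stream is not compressed.
    return raw;
  }

  /** Test helper: gzip-compresses a UTF-8 string in memory. */
  public static byte[] gzip(String text) throws IOException
  {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  /** Test helper: drains a stream into a UTF-8 string and closes it. */
  public static String readAll(InputStream in) throws IOException
  {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (InputStream stream = in) {
      byte[] buffer = new byte[4096];
      int n;
      while ((n = stream.read(buffer)) != -1) {
        bytes.write(buffer, 0, n);
      }
    }
    return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException
  {
    // Same payload, once gzip-compressed and once raw; only the name differs.
    InputStream compressed = new ByteArrayInputStream(gzip("hello"));
    InputStream plain = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));

    System.out.println(readAll(decompress(compressed, "bucket/key.json.gz")));
    System.out.println(readAll(decompress(plain, "bucket/key.json")));
  }
}
```

If that reading is correct, a name that reflects it might be clearer than "decompression path", though that is a judgment call for the author.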