[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5
    [ https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120454#comment-17120454 ]

ASF subversion and git services commented on JCLOUDS-1547:
----------------------------------------------------------

Commit 6e6f8ebf779d8edc5cedec687558637d8212ab18 in jclouds's branch refs/heads/master from Andrew Gaul
[ https://gitbox.apache.org/repos/asf?p=jclouds.git;h=6e6f8eb ]

JCLOUDS-912: JCLOUDS-1547: GCS InputStream single-part upload

Previously this provider worked around a RestAnnotationProcessor quirk
by using multi-part uploads for InputStream payloads. Instead work
around the quirk another way which allows a single-part upload. This
allows inclusion of the Content-MD5 header during object creation.
Backfill tests with both ByteSource and InputStream inputs.

> Google InputStream blob upload ignores MD5
> ------------------------------------------
>
>                 Key: JCLOUDS-1547
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-1547
>             Project: jclouds
>          Issue Type: Bug
>          Components: jclouds-blobstore
>    Affects Versions: 2.2.0, 2.2.1
>            Reporter: Alexander Chernavin
>            Assignee: Andrew Gaul
>            Priority: Major
>              Labels: google-cloud-storage, md5
>
> According to the [GCS blob upload documentation|https://cloud.google.com/storage/docs/xml-api/put-object-upload],
> when a Content-MD5 header is provided, Google uses it to verify the data integrity
> of an uploaded blob. This feature is crucial for us. We have file upload
> functionality that takes an input stream and uploads it to a cloud via
> JClouds. We want to be sure that file integrity is enforced.
>
> The JClouds blob builder allows specifying a content MD5, but this value is ignored
> with an InputStream payload: it simply is not propagated into the Content-MD5
> header.
>
> Here is a code snippet to reproduce the issue:
> {code:java}
> BlobStoreContext context = ContextBuilder.newBuilder("google-cloud-storage")
>       .credentials(clientEmail, privateKey)
>       .buildView(BlobStoreContext.class);
> // generate MD5 hash for some bogus content
> MessageDigest md5 = MessageDigest.getInstance("MD5");
> md5.update("bogus".getBytes());
> InputStream inputStream = new ByteArrayInputStream("hi".getBytes());
> BlobStore blobStore = context.getBlobStore();
> blobStore.putBlob(myContainer,
>       blobStore.blobBuilder("test.txt")
>             .payload(inputStream)
>             .contentLength(2)
>             .contentType("text/plain")
>             .contentMD5(HashCode.fromBytes(md5.digest()))
>             .build());
> {code}
> putBlob should have failed, because the payload is "hi" but the MD5 is calculated
> for the string "bogus".

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
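To make the mismatch in the reproducer concrete, here is a minimal sketch using only the JDK (no jclouds or Guava). It assumes the Content-MD5 header carries the Base64 encoding of the raw 16-byte MD5 digest, as in RFC 1864. The class and method names are hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of jclouds.
final class ContentMd5Sketch {

    /** Returns the Base64-encoded MD5 digest of the body (assumed Content-MD5 form). */
    static String contentMd5(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is expected to be present in a standard JDK; treat absence as a setup error.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // The bytes that are actually uploaded in the reproducer.
        String forPayload = contentMd5("hi".getBytes(StandardCharsets.UTF_8));
        // The digest the reproducer declares, computed over different content.
        String forBogus = contentMd5("bogus".getBytes(StandardCharsets.UTF_8));

        System.out.println("digest of uploaded bytes: " + forPayload);
        System.out.println("digest declared by client: " + forBogus);
        System.out.println("match: " + forPayload.equals(forBogus));
    }
}
```

The two values differ, which is why an upload that actually sends the declared digest would be expected to fail verification.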
[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5
    [ https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118212#comment-17118212 ]

Andrew Gaul commented on JCLOUDS-1547:
--------------------------------------

We just released 2.2.1, so I would estimate 3-6 months. In the meantime, you can use the SNAPSHOT releases (once this PR merges):

https://jclouds.apache.org/start/install/
[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5
    [ https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118145#comment-17118145 ]

Alexander Chernavin commented on JCLOUDS-1547:
----------------------------------------------

Hi [~gaul],

first of all, thank you for such a quick fix. I appreciate it. Amazing job!

I checked out your PR branch and I confirm that your fix works as expected. When I provided a wrong MD5 in my service, I got a 400 error.

When can we expect version 2.3.0 to be released?

Regards,
Alexander
[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5
    [ https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114523#comment-17114523 ]

Andrew Gaul commented on JCLOUDS-1547:
--------------------------------------

Alexander, please try out the referenced GitHub PR. Sorry I missed the {{InputStream}} citation; our tests were incomplete for this.
[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5
    [ https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114273#comment-17114273 ]

Alexander Chernavin commented on JCLOUDS-1547:
----------------------------------------------

Hi [~gaul],

I appreciate your quick response!

I looked into [testPutIncorrectContentMD5|https://github.com/jclouds/jclouds/blob/7af4d8e8f19c479b0a8f35e06dd68418a8367b3e/blobstore/src/test/java/org/jclouds/blobstore/integration/internal/BaseBlobIntegrationTest.java#L298]. It uploads a blob with a byte array payload:
{code:java}
byte[] payload = createTestInput(1024).read();
{code}
Here is the catch: if you replace the stream payload in my example with a byte array payload, then it will also start failing with the error you mentioned:
{code:java}
.payload("hi".getBytes())
{code}
Conversely, if you modify the test case to use an InputStream, the test itself should start failing, because the upload with the incorrect MD5 is no longer rejected.

This happens because the [putBlob|https://github.com/jclouds/jclouds/blob/master/providers/google-cloud-storage/src/main/java/org/jclouds/googlecloudstorage/blobstore/GoogleCloudStorageBlobStore.java#L211] method of GoogleCloudStorageBlobStore has a fork that treats payloads differently:
{code:java}
if (length != 0 && (options.isMultipart() || !blob.getPayload().isRepeatable())) {
   // JCLOUDS-912 prevents using single-part uploads with InputStream payloads.
   // Work around this with multi-part upload which buffers parts in-memory.
   return putMultipartBlob(container, blob, options);
} else {
   // skipped some lines for readability
   return api.getObjectApi().multipartUpload(container, template, blob.getPayload()).etag();
}
{code}
An input stream payload falls into the "if" branch; a byte array payload falls into the "else" branch.

The putMultipartBlob method eventually calls org.jclouds.googlecloudstorage.features.ObjectApi.simpleUpload. Here is the definition of this method:
{code:java}
GoogleCloudStorageObject simpleUpload(@PathParam("bucket") String bucketName,
      @HeaderParam("Content-Type") String contentType,
      @HeaderParam("Content-Length") Long contentLength,
      @PayloadParam("payload") Payload payload,
      InsertObjectOptions options);
{code}
Note that it has no Content-MD5 parameter, so the header is not passed to the server side.

Regards,
Alexander
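The fork described in this comment can be modeled in isolation. The following is a toy sketch, not jclouds code: it copies only the boolean condition quoted from putBlob and reports which branch a payload would take. The class, enum, and method names are hypothetical, and the repeatable flag stands in for `blob.getPayload().isRepeatable()`, assumed false for an InputStream and true for a byte array, as the comment describes.

```java
// Toy model of the branch selection quoted from GoogleCloudStorageBlobStore.putBlob.
final class UploadPathSketch {

    /** The two branches of the quoted fork. */
    enum Branch {
        /** The "if" branch (putMultipartBlob); per the thread, Content-MD5 is not sent here. */
        MULTIPART_BLOB,
        /** The "else" branch, taken by repeatable payloads such as byte arrays. */
        SINGLE_REQUEST
    }

    /** Applies the same condition as the quoted code to plain values. */
    static Branch choose(long length, boolean multipartRequested, boolean repeatable) {
        if (length != 0 && (multipartRequested || !repeatable)) {
            return Branch.MULTIPART_BLOB;
        }
        return Branch.SINGLE_REQUEST;
    }

    public static void main(String[] args) {
        // An InputStream can be read only once, so it is modeled as non-repeatable.
        System.out.println("InputStream payload -> " + choose(2, false, false));
        // A byte array can be re-read, so it is modeled as repeatable.
        System.out.println("byte[] payload      -> " + choose(2, false, true));
        // A zero-length payload never takes the multipart branch.
        System.out.println("empty InputStream   -> " + choose(0, false, false));
    }
}
```

Under these assumptions the two-byte InputStream from the reproducer lands in the branch where the declared MD5 is dropped, while the same content supplied as a byte array does not.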
[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5
    [ https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113731#comment-17113731 ]

Andrew Gaul commented on JCLOUDS-1547:
--------------------------------------

[~achernavin] We have a test that exercises this:
{code}
$ mvn integration-test -pl :google-cloud-storage -Plive \
    -Dtest.google-cloud-storage.identity="${JCLOUDS_IDENTITY}" \
    -Dtest.google-cloud-storage.credential="${JCLOUDS_CREDENTIAL}" \
    -Dtest.blobstore.container-count=4 \
    -Dtest=GoogleCloudStorageBlobIntegrationLiveTest#testPutIncorrectContentMD5 \
    -am -DfailIfNoTests=false
{code}
When I comment out the expected error handling, I see the exception:
{code}
failed with response: HTTP/1.1 400 Bad Request; content: [{
 "error": {
  "code": 400,
  "message": "Provided MD5 hash \"kK/72aGVTsn/Apt61xg6Fg==\" doesn't match calculated MD5 hash \"tlhTuN/s2y8UgsAEqO1+3g==\".",
  "errors": [
   {
    "message": "Provided MD5 hash \"kK/72aGVTsn/Apt61xg6Fg==\" doesn't match calculated MD5 hash \"tlhTuN/s2y8UgsAEqO1+3g==\".",
    "domain": "global",
    "reason": "invalid"
   }
  ]
 }
}
]
{code}
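The 400 response above compares a provided hash with one calculated over the received bytes. A rough sketch of that kind of check follows, using only the JDK. It is a toy model for illustration, not Google's implementation: the class and method names are hypothetical, and returning 400 for a header that is not valid Base64 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Toy model of a server-side Content-MD5 check; not GCS code.
final class Md5CheckSketch {

    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Returns 400 when a provided Content-MD5 does not match the body, otherwise 200. */
    static int statusFor(byte[] body, String contentMd5) {
        if (contentMd5 == null) {
            return 200; // no header, so there is nothing to verify
        }
        byte[] provided;
        try {
            provided = Base64.getDecoder().decode(contentMd5);
        } catch (IllegalArgumentException e) {
            return 400; // assumption: a malformed header is rejected
        }
        return MessageDigest.isEqual(provided, md5(body)) ? 200 : 400;
    }

    public static void main(String[] args) {
        byte[] body = "hi".getBytes(StandardCharsets.UTF_8);
        String right = Base64.getEncoder().encodeToString(md5(body));
        String wrong = Base64.getEncoder()
                .encodeToString(md5("bogus".getBytes(StandardCharsets.UTF_8)));

        System.out.println("matching header   -> " + statusFor(body, right));
        System.out.println("mismatched header -> " + statusFor(body, wrong));
        System.out.println("no header         -> " + statusFor(body, null));
    }
}
```

The last case mirrors the bug report: when the client never sends the header, a check of this shape has nothing to compare and the upload is accepted.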