[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5

2020-05-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120454#comment-17120454
 ] 

ASF subversion and git services commented on JCLOUDS-1547:
--

Commit 6e6f8ebf779d8edc5cedec687558637d8212ab18 in jclouds's branch 
refs/heads/master from Andrew Gaul
[ https://gitbox.apache.org/repos/asf?p=jclouds.git;h=6e6f8eb ]

JCLOUDS-912: JCLOUDS-1547: GCS InputStream single-part upload

Previously this provider worked around a RestAnnotationProcessor quirk
by using multi-part uploads for InputStream payloads.  Instead work
around the quirk another way which allows a single-part upload.  This
allows inclusion of the Content-MD5 header during object creation.
Backfill tests with both ByteSource and InputStream inputs.


> Google InputStream blob upload ignores MD5
> --
>
> Key: JCLOUDS-1547
> URL: https://issues.apache.org/jira/browse/JCLOUDS-1547
> Project: jclouds
>  Issue Type: Bug
>  Components: jclouds-blobstore
>Affects Versions: 2.2.0, 2.2.1
>Reporter: Alexander Chernavin
>Assignee: Andrew Gaul
>Priority: Major
>  Labels: google-cloud-storage, md5
>
> According to [GCS blob upload 
> documentation|[https://cloud.google.com/storage/docs/xml-api/put-object-upload]],
>  when Content-MD5 header is provided, Google uses it to verify data integrity 
> of an uploaded blob. This feature is crucial for us. We have a file upload 
> functionality that takes an input stream and uploads it to a cloud via 
> JClouds. We want to be sure that file integrity is enforced.
>  
> The JClouds blob builder allows specifying a content MD5, but this value is 
> ignored with an InputStream payload: it is simply not propagated into the 
> Content-MD5 header.
> Here is the code snippet to reproduce the issue:
> {code:java}
> BlobStoreContext context = ContextBuilder.newBuilder("google-cloud-storage")
>     .credentials(clientEmail, privateKey)
>     .buildView(BlobStoreContext.class);
> // generate MD5 hash for some bogus content
> MessageDigest md5 = MessageDigest.getInstance("MD5");
> md5.update("bogus".getBytes());
> InputStream inputStream = new ByteArrayInputStream("hi".getBytes());
> BlobStore blobStore = context.getBlobStore();
> blobStore.putBlob(myContainer,
>     blobStore.blobBuilder("test.txt")
>         .payload(inputStream)
>         .contentLength(2)
>         .contentType("text/plain")
>         .contentMD5(HashCode.fromBytes(md5.digest()))
>         .build());
> {code}
> putBlob should have failed, because the payload is "hi" but the MD5 is 
> calculated for the string "bogus".
>  
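
The mismatch in the reproduction above can be checked locally without any cloud account. The following is a minimal sketch using only the JDK; the class name ContentMd5Demo and the helper contentMd5 are illustrative and are not part of jclouds. It assumes the usual Content-MD5 convention of a Base64-encoded 128-bit MD5 digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative helper, not part of jclouds.
class ContentMd5Demo {
    /** Returns the Base64-encoded MD5 digest of the given bytes. */
    static String contentMd5(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Digest declared by the caller (computed over the wrong content).
        String declared = contentMd5("bogus".getBytes(StandardCharsets.UTF_8));
        // Digest of the bytes that are actually uploaded.
        String actual = contentMd5("hi".getBytes(StandardCharsets.UTF_8));
        System.out.println("declared: " + declared);
        System.out.println("actual:   " + actual);
        System.out.println("match:    " + declared.equals(actual));
    }
}
```

Because the two digests differ, a server that verifies Content-MD5 would be expected to reject the upload, provided the header actually reaches it.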



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5

2020-05-27 Thread Andrew Gaul (Jira)


[ 
https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118212#comment-17118212
 ] 

Andrew Gaul commented on JCLOUDS-1547:
--

We just released 2.2.1, so I would estimate 3-6 months.  In the meantime, you 
can use the SNAPSHOT releases (once this PR merges):

 

https://jclouds.apache.org/start/install/



[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5

2020-05-27 Thread Alexander Chernavin (Jira)


[ 
https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17118145#comment-17118145
 ] 

Alexander Chernavin commented on JCLOUDS-1547:
--

Hi [~gaul],

First of all, thank you for such a quick fix. I appreciate it. Amazing job!

I checked out your PR branch and can confirm that your fix works as expected. 
When I provided a wrong MD5 in my service, I got a 400 error.

When can we expect version 2.3.0 to be released?

Regards,

Alexander



[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5

2020-05-22 Thread Andrew Gaul (Jira)


[ 
https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114523#comment-17114523
 ] 

Andrew Gaul commented on JCLOUDS-1547:
--

Alexander, please try out the referenced GitHub PR.  Sorry I missed the 
{{InputStream}} citation; our tests were incomplete for this.



[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5

2020-05-22 Thread Alexander Chernavin (Jira)


[ 
https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114273#comment-17114273
 ] 

Alexander Chernavin commented on JCLOUDS-1547:
--

Hi [~gaul],

I appreciate your quick response!

I looked into [testPutIncorrectContentMD5 
|[https://github.com/jclouds/jclouds/blob/7af4d8e8f19c479b0a8f35e06dd68418a8367b3e/blobstore/src/test/java/org/jclouds/blobstore/integration/internal/BaseBlobIntegrationTest.java#L298]].
 It uploads a blob with a byte array payload:
{code:java}
byte[] payload = createTestInput(1024).read(); {code}
 

Here is the catch: if you replace the stream payload in my example with a byte 
array payload, then it will also start failing with the error you mentioned:
{code:java}
.payload("hi".getBytes()) {code}
Likewise, if you modify the test case to use an InputStream, the test itself 
should start failing.

 

This happens because the 
[putBlob|[https://github.com/jclouds/jclouds/blob/master/providers/google-cloud-storage/src/main/java/org/jclouds/googlecloudstorage/blobstore/GoogleCloudStorageBlobStore.java#L211]]
 method of GoogleCloudStorageBlobStore has a branch that treats payloads 
differently:
{code:java}
if (length != 0 && (options.isMultipart() || !blob.getPayload().isRepeatable())) {
  // JCLOUDS-912 prevents using single-part uploads with InputStream payloads.
  // Work around this with multi-part upload which buffers parts in-memory.
  return putMultipartBlob(container, blob, options);
} else {
  // skipped some lines for readability
  return api.getObjectApi().multipartUpload(container, template, blob.getPayload()).etag();
}{code}
An InputStream payload takes the "if" branch; a byte array payload takes the 
"else" branch.

 

The putMultipartBlob method eventually calls 
org.jclouds.googlecloudstorage.features.ObjectApi.simpleUpload. Here is the 
definition of this method:
{code:java}
GoogleCloudStorageObject simpleUpload(@PathParam("bucket") String bucketName,
    @HeaderParam("Content-Type") String contentType,
    @HeaderParam("Content-Length") Long contentLength,
    @PayloadParam("payload") Payload payload,
    InsertObjectOptions options);{code}
 

This method has no Content-MD5 parameter, so the header is never passed to the 
server.
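
The effect of that branch can be modeled in isolation. This is only a simplified sketch of the condition quoted in this comment, written against the JDK alone; UploadPathModel and digestForwarded are hypothetical names, and the model assumes (as described in this comment) that the multi-part path drops the caller's MD5 while the other path forwards it.

```java
// Simplified, hypothetical model of the pre-fix branch; not jclouds code.
class UploadPathModel {
    /**
     * Returns true when the modeled upload path would forward the caller's MD5.
     *
     * @param length             payload length in bytes
     * @param repeatable         whether the payload can be re-read (false for an InputStream)
     * @param multipartRequested whether the caller asked for a multi-part upload
     */
    static boolean digestForwarded(long length, boolean repeatable, boolean multipartRequested) {
        // Same condition as the quoted "if": these payloads go through the
        // multi-part path, which (per this comment) never sends Content-MD5.
        boolean takesMultipartPath = length != 0 && (multipartRequested || !repeatable);
        return !takesMultipartPath;
    }

    public static void main(String[] args) {
        System.out.println("InputStream payload: " + digestForwarded(2, false, false));
        System.out.println("byte[] payload:      " + digestForwarded(2, true, false));
    }
}
```

In this model the only difference between the two calls in main is repeatability, which matches the observation that swapping the InputStream for a byte array changes the outcome.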

 

Regards,

Alexander

> Google InputStream blob upload ignores MD5
> --
>
> Key: JCLOUDS-1547
> URL: https://issues.apache.org/jira/browse/JCLOUDS-1547
> Project: jclouds
>  Issue Type: Bug
>  Components: jclouds-blobstore
>Affects Versions: 2.2.0, 2.2.1
>Reporter: Alexander Chernavin
>Priority: Major
>  Labels: google-cloud-storage, md5


[jira] [Commented] (JCLOUDS-1547) Google InputStream blob upload ignores MD5

2020-05-21 Thread Andrew Gaul (Jira)


[ 
https://issues.apache.org/jira/browse/JCLOUDS-1547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113731#comment-17113731
 ] 

Andrew Gaul commented on JCLOUDS-1547:
--

[~achernavin] We have a test that exercises this:

{code}
$ mvn integration-test -pl :google-cloud-storage -Plive \
    -Dtest.google-cloud-storage.identity="${JCLOUDS_IDENTITY}" \
    -Dtest.google-cloud-storage.credential="${JCLOUDS_CREDENTIAL}" \
    -Dtest.blobstore.container-count=4 \
    -Dtest=GoogleCloudStorageBlobIntegrationLiveTest#testPutIncorrectContentMD5 -am \
    -DfailIfNoTests=false
{code}

When I comment out the expected error handling, I see the exception:

{code}
failed with response: HTTP/1.1 400 Bad Request; content: [{
  "error": {
    "code": 400,
    "message": "Provided MD5 hash \"kK/72aGVTsn/Apt61xg6Fg==\" doesn't match calculated MD5 hash \"tlhTuN/s2y8UgsAEqO1+3g==\".",
    "errors": [
      {
        "message": "Provided MD5 hash \"kK/72aGVTsn/Apt61xg6Fg==\" doesn't match calculated MD5 hash \"tlhTuN/s2y8UgsAEqO1+3g==\".",
        "domain": "global",
        "reason": "invalid"
      }
    ]
  }
}
]
{code}
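
The kind of check that produces such an error can be sketched with the JDK alone. How Google Cloud Storage implements it is not shown in this thread, so the following is only an assumed illustration: Md5IntegrityCheck, base64Md5, and verify are hypothetical names, and only the message wording is taken from the response above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Optional;

// Hypothetical receiver-side integrity check; not GCS or jclouds code.
class Md5IntegrityCheck {
    /** Returns the Base64-encoded MD5 digest of the given bytes. */
    static String base64Md5(byte[] data) {
        try {
            return Base64.getEncoder().encodeToString(MessageDigest.getInstance("MD5").digest(data));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Returns an error message when the provided digest does not match the body, else empty. */
    static Optional<String> verify(String providedMd5, byte[] body) {
        String calculated = base64Md5(body);
        if (calculated.equals(providedMd5)) {
            return Optional.empty();
        }
        return Optional.of("Provided MD5 hash \"" + providedMd5
                + "\" doesn't match calculated MD5 hash \"" + calculated + "\".");
    }

    public static void main(String[] args) {
        byte[] body = "hi".getBytes(StandardCharsets.UTF_8);
        String wrong = base64Md5("bogus".getBytes(StandardCharsets.UTF_8));
        // Correct digest: accepted. Wrong digest: rejected with a message.
        System.out.println(verify(base64Md5(body), body).orElse("accepted"));
        System.out.println(verify(wrong, body).orElse("accepted"));
    }
}
```

Such a check can only run when the client sends the digest, which is why the upload path matters for this issue.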

> Google InputStream blob upload ignores MD5
> --
>
> Key: JCLOUDS-1547
> URL: https://issues.apache.org/jira/browse/JCLOUDS-1547
> Project: jclouds
>  Issue Type: Bug
>  Components: jclouds-blobstore
>Affects Versions: 2.2.0, 2.2.1
>Reporter: Alexander Chernavin
>Priority: Major
>  Labels: gcs, google, md5