[ 
https://issues.apache.org/jira/browse/JCLOUDS-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Vermeulen updated JCLOUDS-1608:
-----------------------------------
    Description: 
MultipartUploadSlicingAlgorithm calculates slices for a large file by first 
using the defaultPartSize. If the resulting parts stay within the min/max part 
sizes (5MB and 5GB for GoogleCloudStorageBlobStore) but their number exceeds 
maxNumberOfParts (32 for GoogleCloudStorageBlobStore), the algorithm sets the 
number of parts to 32 and recalculates the part size. If there is any remainder 
after that, jclouds ends up uploading 33 parts in total to GCS, causing the 
process to fail in completeMultipartUpload() when recomposing the original 
content from the parts.
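
To make the arithmetic concrete, here is a minimal sketch of the capping step 
described above. It is not the jclouds implementation: the class and method 
names are hypothetical, and the 32 MiB default part size is an assumption used 
only for illustration.

```java
// Illustrative sketch only; not the actual MultipartUploadSlicingAlgorithm code.
public class SlicingOverflowSketch {

    /**
     * Returns the total number of parts that would be uploaded (full parts
     * plus one extra part for any remainder) when the part count is capped.
     */
    public static int totalPartsWhenCapped(long length, long defaultPartSize, int maxParts) {
        long partSize = defaultPartSize;
        int parts = (int) (length / partSize);
        if (parts > maxParts) {
            // Too many parts: cap the count and recompute the part size.
            parts = maxParts;
            partSize = length / parts; // integer division drops the remainder
        }
        long remaining = length - partSize * parts;
        return parts + (remaining > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        long length = 1024L * 1024 * 1200 + 33;      // file size used in the unit test
        long assumedDefaultSize = 32L * 1024 * 1024; // assumed default part size
        System.out.println(totalPartsWhenCapped(length, assumedDefaultSize, 32)); // prints 33
    }
}
```

With these inputs the capped part size is 39321601 bytes, which leaves a 
1-byte remainder and therefore a 33rd part.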

The following simple unit test demonstrates the problem:

{{public class AlgoritmTest extends TestCase {}}
{{    public void testSlicing() {}}
{{        // 5L forces long arithmetic; 1024*1024*1024*5 would overflow int}}
{{        MultipartUploadSlicingAlgorithm algorithm = new MultipartUploadSlicingAlgorithm(1024*1024*5, 5L*1024*1024*1024, 32);}}
{{        algorithm.calculateChunkSize(1024*1024*1200+33);}}
{{        // total parts = full parts plus one extra part for any remainder}}
{{        assertTrue(algorithm.getParts() + (algorithm.getRemaining() > 0 ? 1 : 0) <= 32);}}
{{    }}}
{{}}}

It simulates the slicing of a file of 1.2GB+33 bytes (to make sure there is a 
remainder).

The following patch fixes the issue:

{{      ...}}

{{      long remainder = length % unitPartSize;}}
{{      // SHB patch}}
{{      // remainder should be distributed over parts if we are at the maximumNumberOfParts}}
{{      // (if not, an additional part is uploaded to GCS thus exceeding the maximum allowed parts)}}
{{      // if (remainder == 0 && parts > 0) {}}
{{      //     parts -= 1;}}
{{      // }}}
{{      if (remainder > 0 && parts == maximumNumberOfParts) {}}
{{          parts -= 1;}}
{{          partSize = length/parts;}}
{{      }}}
{{      // end of SHB patch}}
{{      ...}}

I also commented out the code that reduces the number of parts when there is 
no remainder, since that ends up creating a remaining part that is the same 
size as the others.
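
As a rough check of the patched arithmetic, the sketch below applies the same 
adjustment to the file size from the unit test. It is an illustration under 
assumptions, not the jclouds code: the class and method names are hypothetical, 
the 32 MiB default part size is assumed, and the min/max part size checks are 
left out.

```java
// Illustrative sketch only; not the actual MultipartUploadSlicingAlgorithm code.
public class SlicingPatchSketch {

    /**
     * Returns the total number of parts (full parts plus one extra part for
     * any remainder) when the adjustment from the patch is applied.
     */
    public static int totalPartsWithPatch(long length, long defaultPartSize, int maxParts) {
        long partSize = defaultPartSize;
        int parts = (int) (length / partSize);
        if (parts > maxParts) {
            // Too many parts: cap the count and recompute the part size.
            parts = maxParts;
            partSize = length / parts;
        }
        long remainder = length % partSize;
        if (remainder > 0 && parts == maxParts) {
            // Free one slot for the remainder part, as the patch does.
            parts -= 1;
            partSize = length / parts;
        }
        long remaining = length - partSize * parts;
        return parts + (remaining > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        long length = 1024L * 1024 * 1200 + 33;      // file size used in the unit test
        long assumedDefaultSize = 32L * 1024 * 1024; // assumed default part size
        System.out.println(totalPartsWithPatch(length, assumedDefaultSize, 32)); // prints 32
    }
}
```

Here the count drops to 31 full parts of 40590039 bytes plus a 24-byte 
remainder part, for 32 parts in total. A file that divides evenly into 32 
parts keeps all 32, because the decrement for a zero remainder is not applied.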


> Slicing of large files can lead to exceed the 32 parts limit of GCS
> -------------------------------------------------------------------
>
>                 Key: JCLOUDS-1608
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-1608
>             Project: jclouds
>          Issue Type: Bug
>          Components: jclouds-blobstore
>    Affects Versions: 2.2.1, 2.5.0
>            Reporter: Jan Vermeulen
>            Priority: Major
>              Labels: patch
>             Fix For: 2.2.1, 2.6.0
>



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
