[ https://issues.apache.org/jira/browse/JCLOUDS-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749080#comment-17749080 ]
Jan Vermeulen commented on JCLOUDS-1608:
----------------------------------------

I had to patch MultipartUploadSlicingAlgorithm locally for a project where we upload large videos using JClouds, because uploads failed at the end when trying to pass more than 32 parts to GCS. Debugging the code, I found that the failure was due to a miscalculation when there is a remainder after dividing the size of the file into 32 parts.

Without this fix we simply could not use JClouds consistently without having (seemingly random) upload failures. We have been using the fixed code for over a year now (the bug was reported on 27/5/2022) without any problems.

I will get the code of the failing unit test and see what is failing.

> Slicing of large files can lead to exceed the 32 parts limit of GCS
> -------------------------------------------------------------------
>
>                 Key: JCLOUDS-1608
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-1608
>             Project: jclouds
>          Issue Type: Bug
>          Components: jclouds-blobstore
>    Affects Versions: 2.2.1, 2.5.0
>            Reporter: Jan Vermeulen
>            Priority: Major
>              Labels: google-cloud-storage
>             Fix For: 2.2.1, 2.6.0
>
>
> MultipartUploadSlicingAlgorithm calculates slices for a large file by first
> using the defaultPartSize. If that slicing gives parts whose size stays within
> the min/maxPartSizes (5MB and 5GB for GoogleCloudStorageBlobStore) but whose
> number exceeds the maxNumberOfParts (32 for GoogleCloudStorageBlobStore), the
> algorithm sets the number of parts to 32 and recalculates the size of the
> parts. If there is any remainder after that, JClouds ends up uploading 33
> parts in total to GCS, causing the process to fail in
> completeMultipartUpload() when recomposing the original content from the
> parts.
> The following simple unit test proves the case:
> {{public class AlgoritmTest extends TestCase {}}
> {{    public void testSlicing() {}}
> {{        // 1024L forces long arithmetic; 1024*1024*1024*5 overflows int}}
> {{        MultipartUploadSlicingAlgorithm algorithm = new MultipartUploadSlicingAlgorithm(1024*1024*5, 1024L*1024*1024*5, 32);}}
> {{        algorithm.calculateChunkSize(1024*1024*1200+33);}}
> {{        assertTrue(algorithm.getParts() + ((algorithm.getRemaining() > 0) ? 1 : 0) <= 32);}}
> {{    }}}
> {{}}}
> It simulates the slicing of a file of 1.2GB+33 bytes (to make sure there is a
> remainder).
> The following patch fixes the issue:
> {{    ...}}
> {{    long remainder = length % unitPartSize;}}
> {{    // SHB patch}}
> {{    // remainder should be distributed over parts if we are at the maximumNumberOfParts}}
> {{    // (if not, an additional part is uploaded to GCS thus exceeding the maximum allowed parts)}}
> {{    // if (remainder == 0 && parts > 0) {}}
> {{    //     parts -= 1;}}
> {{    if (remainder > 0 && parts == maximumNumberOfParts) {}}
> {{        parts -= 1;}}
> {{        partSize = length/parts;}}
> {{    // end of SHB patch}}
> {{    ...}}
> I also commented out the code that reduces the number of parts when there is
> no remainder, since that ends up creating a remaining part that is the same
> size as the others.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
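
For readers who want to check the arithmetic from the issue description without a jclouds checkout, here is a minimal standalone sketch. It is not the jclouds MultipartUploadSlicingAlgorithm: the class SlicingSketch and its two helper methods are hypothetical names, and the sketch models only the step where the part count has already been capped at the maximum. It assumes, as the description states, that any leftover bytes are uploaded as one extra part.

```java
// Standalone illustration only; SlicingSketch is a hypothetical class,
// not part of jclouds.
final class SlicingSketch {

    // Number of uploads when `length` bytes are cut into `parts` equal
    // parts and any leftover bytes go into one extra trailing part.
    static long totalParts(long length, long parts) {
        long partSize = length / parts;
        long remaining = length - partSize * parts;
        return parts + (remaining > 0 ? 1 : 0);
    }

    // Same computation, but mirroring the idea of the patch: when the cap
    // is reached and a remainder exists, use one full part fewer so the
    // trailing part still fits under the cap.
    static long totalPartsPatched(long length, long maxParts) {
        long parts = maxParts;
        if (length % parts > 0) {
            parts -= 1;
        }
        return totalParts(length, parts);
    }

    public static void main(String[] args) {
        long length = 1024L * 1024 * 1200 + 33; // file size used in the issue's test
        System.out.println(totalParts(length, 32));        // prints 33
        System.out.println(totalPartsPatched(length, 32)); // prints 32
    }
}
```

With the file size from the unit test, dividing into 32 parts leaves 1 byte over, which becomes a 33rd upload. Dropping to 31 full parts leaves 24 bytes over, for 32 uploads in total.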