Hi Dennis,

One thing to check... is your file 5.1 gibibytes or gigabytes?  NiFi's and
Amazon's multipart limits are in GiB, while Cloudian appears to be using GB,
based on this GitHub thread [1].  5.1 GB is about 4.75 GiB, so my guess is
that NiFi sees your file as roughly 4.75 GiB against a threshold of "4.8GB"
(which NiFi really interprets as 4.8 GiB), so it never attempts a multipart
upload; but 4.75 GiB is 5.1 GB, which is above Cloudian's 5 GB single-PUT
limit, so the Cloudian server rejects the upload.  Could you try dropping
your threshold to "4.7 GB" (really 4.7 GiB to NiFi) to see if that works?
That should put your file above NiFi's multipart threshold and trigger the
multipart upload logic.
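
To make the unit math concrete, here is a quick sanity check in plain Java.
The byte count comes from your error log, and the 5 GB single-PUT cap is the
documented Cloudian limit you mentioned; the "NiFi reads GB as GiB" part is
my reading of NiFi's data-size handling, so treat that as an assumption:

    // Sanity check of the GB-vs-GiB theory: assumes NiFi parses "4.8 GB"
    // as binary (4.8 GiB) while Cloudian's cap is a decimal 5 GB.
    public class UnitCheck {
        static final long GIB = 1024L * 1024 * 1024; // 2^30 bytes
        static final long GB  = 1_000_000_000L;      // 10^9 bytes

        public static void main(String[] args) {
            long fileSize = 5_109_625_339L; // "size=" from your error log

            System.out.printf("file: %.2f GB / %.2f GiB%n",
                    fileSize / (double) GB, fileSize / (double) GIB);

            long nifiThreshold = (long) (4.8 * GIB); // "4.8GB" as NiFi reads it
            long cloudianCap = 5 * GB;               // single-PUT limit

            // false -> NiFi never starts a multipart upload:
            System.out.println(fileSize > nifiThreshold);
            // true -> the single PUT exceeds Cloudian's cap:
            System.out.println(fileSize > cloudianCap);
            // true -> lowering the threshold to "4.7 GB" triggers multipart:
            System.out.println(fileSize > (long) (4.7 * GIB));
        }
    }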

[1] https://github.com/kahing/goofys/issues/139

Paul

On Tue, Jan 12, 2021 at 6:13 PM Dennis N Brown <dbr...@lenovo.com> wrote:

> Mark,  Sorry… also NiFi version 1.12.1
>
>
>
> Regards,
>
>
>
> *Dennis N Brown*
> Data Architect
> *Systems **CARE** Team*
> Lenovo USA
>
>
> US PostgreSQL Association
>
>
> dbr...@lenovo.com
>
>
> *From:* Mark Payne <marka...@hotmail.com>
> *Sent:* January 12, 2021 12:20
> *To:* users@nifi.apache.org
> *Subject:* Re: [External] Re: Having an issue with large files and
> PutS3Object
>
>
>
> Dennis,
>
>
>
> Do your logs have any stack traces in them? That would probably help to
> understand what’s happening pretty quickly. Also, which version of NiFi are
> you running?
>
>
>
> Thanks
>
> -Mark
>
>
>
>
>
> On Jan 12, 2021, at 12:12 PM, Dennis N Brown <dbr...@lenovo.com> wrote:
>
>
>
> Thanks Mark,  The “Multipart Threshold” defaulted to 5GB, and I have now
> adjusted it to 4.8GB to see if that makes any difference; it does not.  It
> seems to me that NiFi should detect the file size and initiate the
> multipart upload without even trying a normal S3 PutObject.  But I’m not
> seeing any “multipart” messages in the error (as I have seen in other
> posts about multipart uploads).
>
>
>
> The Cloudian implementation appears to be using the AWS libraries, as all
> of the messages mention Amazon or AWS, and the Cloudian documentation also
> states a 5GB limit on object size when not using multipart upload.
>
>
>
> Regards,
>
>
>
> *Dennis N Brown*
> Data Architect
> *Systems **CARE** Team*
> Lenovo USA
>
>
> US PostgreSQL Association
>
>
> dbr...@lenovo.com
>
>
> *From:* Mark Payne <marka...@hotmail.com>
> *Sent:* January 12, 2021 11:53
> *To:* users@nifi.apache.org
> *Subject:* [External] Re: Having an issue with large files and PutS3Object
>
>
>
> Dennis,
>
>
>
> It appears that the PutS3Object processor looks at the size of the
> FlowFile and compares that to the value set in the “Multipart Threshold”
> property. If the size of the FlowFile is larger than that, it will use the
> multipart upload, with the configured size of the parts. I’m not familiar
> with the Cloudian implementation, but it may have different thresholds than
> S3? What do you have configured for the size of the Multipart Threshold?
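>
> In pseudo-code, that decision is roughly the following (a simplified
> sketch of the behavior described above, not the actual processor source;
> the names are made up for illustration):
>
>     // Hypothetical illustration only, not NiFi's real internals.
>     long flowFileSize = flowFile.getSize();
>     long threshold = multipartThresholdBytes; // "Multipart Threshold" property
>
>     if (flowFileSize > threshold) {
>         // upload in parts of the configured "Multipart Part Size"
>         uploadMultipart(flowFile, partSizeBytes);
>     } else {
>         // single PutObject call; the server can still reject this if it
>         // enforces a smaller single-PUT limit than the threshold
>         uploadSinglePut(flowFile);
>     }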
>
>
>
> Thanks
>
> -Mark
>
>
>
> On Jan 12, 2021, at 11:32 AM, Dennis N Brown <dbr...@lenovo.com> wrote:
>
>
>
> Hello,  I’m having an issue attempting to upload a large file (5.1GB) to
> S3 storage (not AWS, but rather a Cloudian implementation).
>
>
>
> From everything I’ve read, it appears NiFi is supposed to fall back to a
> multipart upload if the size of the file is greater than the “Multipart
> Threshold” defined in the PutS3Object processor.  This is not happening
> for me; it just errors out with this message:
>
> ERROR o.a.nifi.processors.aws.s3.PutS3Object
> PutS3Object[id=cd683449-d9b3-1ce2-85ae-a0d900cfd488] Failed to put
> StandardFlowFileRecord[uuid=74a8d054-53cb-44d7-aca1-dabd94b50781,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1610459752598-174464,
> container=default, section=384], offset=59300,
> length=5109625339],offset=0,name=6477482,size=5109625339] to Amazon S3 due
> to com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload
> exceeds the maximum allowed object size. (Service: Amazon S3; Status Code:
> 400; Error Code: EntityTooLarge; Request ID:
> 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID:
> a647d24f02954de69d161d24c3e48081), S3 Extended Request ID:
> a647d24f02954de69d161d24c3e48081:
> com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload
> exceeds the maximum allowed object size. (Service: Amazon S3; Status Code:
> 400; Error Code: EntityTooLarge; Request ID:
> 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID:
> a647d24f02954de69d161d24c3e48081), S3 Extended Request ID:
> a647d24f02954de69d161d24c3e48081
>
> com.amazonaws.services.s3.model.AmazonS3Exception: Your proposed upload
> exceeds the maximum allowed object size. (Service: Amazon S3; Status Code:
> 400; Error Code: EntityTooLarge; Request ID:
> 2f967706-9745-1564-a246-0a94ef6266cb; S3 Extended Request ID:
> a647d24f02954de69d161d24c3e48081)
>
>
>
> So my question is: is NiFi supposed to detect the large file and then
> initiate the multipart upload, or is the server supposed to respond and
> cause NiFi to react to the size limit?
>
>
>
> Regards,
>
>
>
> *Dennis N Brown*
> Data Architect
> *Systems **CARE** Team*
> Lenovo USA
>
>
> US PostgreSQL Association
>
>
> dbr...@lenovo.com
