[ 
https://issues.apache.org/jira/browse/NIFI-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933999#comment-16933999
 ] 

Joseph Witt commented on NIFI-6313:
-----------------------------------

[~markap14] the GCE massive scale testing you did - was it with this processor?

[~jasperknulst] this is important to progress but it needs a PR and review 
traction before we set the fix version.  

> PutGCSObject performance seems slow
> -----------------------------------
>
>                 Key: NIFI-6313
>                 URL: https://issues.apache.org/jira/browse/NIFI-6313
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>    Affects Versions: 1.9.2
>            Reporter: Jasper Knulst
>            Priority: Major
>             Fix For: 1.10.0
>
>
> The PutGCSObject processor, which transfers files to a Google Cloud Platform 
> bucket, has poor transfer speeds.
> It is impossible to put any hard figures on the throughput, as it seems to 
> depend on:
> -Network location of the NiFi node (situated in GC or not)
> -Network bandwidth
> -NiFi node specs
>  
> After performing benchmarks on multiple NiFi clusters (ranging from test 
> setups to prod. sites) the throughput can range from 8MB/s to 800MB/s. 
> Slow here means slow in comparison to gsutil. If you run gsutil directly 
> from the NiFi node, the throughput is 5 to 8 times higher (without 
> 'parallel_composite_upload') and up to 16 times higher with 
> 'parallel_composite_upload'.
>  
> The GC Java API on which NiFi's GCS processors are built does not have the 
> same optimizations as gsutil and may not be actively supported/maintained. 
> The Storage.create method is even deprecated.
> Still, there must be ways to speed up transfers to GCS by implementing 
> parallel composite uploads in chunks and by adding config options to the GCS 
> processors.
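
To illustrate the chunking idea in the description above, here is a minimal 
sketch of how a file could be split into byte ranges for a parallel composite 
upload. It is not taken from the NiFi code base or from gsutil: the class 
CompositeUploadPlan, the Chunk type and the plan method are hypothetical names. 
The one external fact it relies on is that a single GCS compose request accepts 
at most 32 source objects, so the chunk size grows when the preferred size 
would produce more parts than that. Uploading the parts in parallel and 
composing them into the final object are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: plans byte ranges for a chunked (composite) upload.
class CompositeUploadPlan {
    // A single GCS compose request accepts at most 32 source objects.
    static final int MAX_COMPONENTS = 32;

    // One chunk covers the half-open byte range [offset, offset + length).
    static final class Chunk {
        final int index;
        final long offset;
        final long length;

        Chunk(int index, long offset, long length) {
            this.index = index;
            this.offset = offset;
            this.length = length;
        }
    }

    // Splits fileSize bytes into chunks of roughly preferredChunkSize bytes,
    // enlarging the chunk size when needed to stay within MAX_COMPONENTS.
    static List<Chunk> plan(long fileSize, long preferredChunkSize) {
        if (fileSize < 0 || preferredChunkSize <= 0) {
            throw new IllegalArgumentException("invalid size");
        }
        // Smallest chunk size that still yields at most MAX_COMPONENTS chunks.
        long minChunkSize = (fileSize + MAX_COMPONENTS - 1) / MAX_COMPONENTS;
        long chunkSize = Math.max(preferredChunkSize, minChunkSize);

        List<Chunk> chunks = new ArrayList<>();
        int index = 0;
        for (long offset = 0; offset < fileSize; offset += chunkSize) {
            long length = Math.min(chunkSize, fileSize - offset);
            chunks.add(new Chunk(index++, offset, length));
        }
        return chunks;
    }
}
```

Each planned chunk could then be uploaded as its own temporary object on a 
separate thread and the parts combined with a compose request. Whether that 
closes the gap to gsutil would have to be measured.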



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
