[ https://issues.apache.org/jira/browse/NIFI-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933999#comment-16933999 ]
Joseph Witt commented on NIFI-6313:
-----------------------------------

[~markap14] the GCE massive scale testing you did - was it with this processor?

[~jasperknulst] this is important to progress but it needs a PR and review traction before we set the fix version.

> PutGCSObject performance seems slow
> -----------------------------------
>
>                 Key: NIFI-6313
>                 URL: https://issues.apache.org/jira/browse/NIFI-6313
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>    Affects Versions: 1.9.2
>            Reporter: Jasper Knulst
>            Priority: Major
>             Fix For: 1.10.0
>
>
> The PutGCSObject processor, which transfers files to a Google Cloud Platform
> bucket, has poor transfer speeds.
> It is impossible to put hard figures on the throughput, as it seems to
> depend on:
> - Network location of the NiFi node (situated in GC or not)
> - Network bandwidth
> - NiFi node specs
>
> In benchmarks on multiple NiFi clusters (ranging from test setups to prod.
> sites) the throughput ranged from 8MB/s to 800MB/s.
> Slow here means slow in comparison to gsutil. If you run gsutil directly
> from the NiFi node, throughput is 5 to 8 times higher (without
> 'parallel_composite_upload') and up to 16 times higher with
> 'parallel_composite_upload'.
>
> The GC Java API on which NiFi's GCS processors are built does not have the
> same optimizations as gsutil and maybe isn't supported/maintained. The
> Storage.create method is even deprecated.
> Still, there must be ways to speed up transfers to GCS by implementing
> parallel composite uploads in chunks and config options on the GCS
> processors.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
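
For illustration only, the chunking step behind the "parallel composite uploads in chunks" idea in the description could be sketched roughly as below. This is not NiFi or gsutil code. The names plan_chunks, upload_in_parallel and upload_chunk are hypothetical, and the actual transfer and the final compose call are left to whatever client is used. The one hard number assumed is that a single GCS compose request accepts at most 32 source objects.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")

# Assumption: one GCS compose request accepts at most 32 source objects.
MAX_COMPONENTS = 32


def plan_chunks(total_size: int, chunk_size: int,
                max_components: int = MAX_COMPONENTS) -> List[Tuple[int, int]]:
    """Split total_size bytes into (offset, length) ranges.

    The chunk size is raised when needed so that the number of ranges
    never exceeds max_components.
    """
    if total_size < 0 or chunk_size <= 0 or max_components <= 0:
        raise ValueError("sizes must be positive")
    if total_size == 0:
        return []
    # Smallest chunk size that still fits in max_components pieces (ceiling division).
    min_chunk = -(-total_size // max_components)
    size = max(chunk_size, min_chunk)
    return [(offset, min(size, total_size - offset))
            for offset in range(0, total_size, size)]


def upload_in_parallel(ranges: List[Tuple[int, int]],
                       upload_chunk: Callable[[int, int], T],
                       workers: int = 4) -> List[T]:
    """Run the hypothetical upload_chunk(offset, length) for each range.

    Results come back in range order, which is the order a later
    compose step would need.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: upload_chunk(r[0], r[1]), ranges))
```

Here upload_chunk would read the given byte range and store it as a temporary object; composing those objects into the final one, and deleting the temporaries, would follow as separate steps.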