Hi Chamikaramj,

From my tests on a 1 TiB dataset with GCP Dataflow, this increases performance by about 10x and reduces the Dataflow worker disk space required by about 10x (from 40 TiB to 4 TiB). Overall, there is a 40x reduction in cost (Dataflow workers and PD).
[ Full content available at: https://github.com/apache/beam/pull/6407 ] This message was relayed via gitbox.apache.org.
