[
https://issues.apache.org/jira/browse/TEZ-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ming Ma updated TEZ-3216:
-------------------------
Attachment: TEZ-3216.patch
Here is the draft patch. The more accurate partition stats use 4 bytes per
partition, in units of MB. A new property,
{{tez.runtime.report.detailed.partition.stats}}, is defined. To enable the more
accurate partition stats, applications must also set the existing
{{tez.runtime.report.partition.stats}} to true. When the more accurate partition
stats are reported, the bitset-based partition stats are skipped.
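To make the interaction of the two properties concrete, here is a minimal, self-contained sketch. It is not code from the patch: the class {{PartitionStatsSketch}}, the {{Mode}} enum, and the helper names are hypothetical, a plain {{Map}} stands in for the real configuration object, the "false" defaults are assumptions of the sketch rather than Tez's actual defaults, and rounding up to the next MB is also an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class PartitionStatsSketch {
    // Property names as given in the comment above.
    static final String REPORT_STATS = "tez.runtime.report.partition.stats";
    static final String REPORT_DETAILED_STATS = "tez.runtime.report.detailed.partition.stats";

    // Hypothetical enum: which kind of partition stats would be reported.
    enum Mode { NONE, BITSET, DETAILED }

    // Detailed stats require BOTH properties to be true; when detailed stats
    // are chosen, the bitset-based stats are skipped.
    // Assumption: a missing property is treated as "false" in this sketch.
    static Mode selectMode(Map<String, String> conf) {
        boolean report = Boolean.parseBoolean(conf.getOrDefault(REPORT_STATS, "false"));
        boolean detailed = Boolean.parseBoolean(conf.getOrDefault(REPORT_DETAILED_STATS, "false"));
        if (!report) {
            return Mode.NONE;
        }
        return detailed ? Mode.DETAILED : Mode.BITSET;
    }

    // Encode each partition size as one 4-byte int, in units of MB.
    // Assumption: sizes are rounded up, so a non-empty partition never reports 0.
    static int[] toMegabytes(long[] partitionSizesInBytes) {
        final long mb = 1024L * 1024L;
        int[] result = new int[partitionSizesInBytes.length];
        for (int i = 0; i < partitionSizesInBytes.length; i++) {
            long bytes = partitionSizesInBytes[i];
            long sizeInMb = bytes / mb + (bytes % mb == 0 ? 0 : 1);
            // Clamp so the value always fits in a 4-byte int.
            result[i] = (int) Math.min(sizeInMb, Integer.MAX_VALUE);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(REPORT_STATS, "true");
        conf.put(REPORT_DETAILED_STATS, "true");
        System.out.println(selectMode(conf));
        System.out.println(Arrays.toString(
            toMegabytes(new long[] {0L, 1L, 1024L * 1024L, 1024L * 1024L + 1L})));
    }
}
```

Setting only {{tez.runtime.report.detailed.partition.stats}} has no effect in this sketch, which mirrors the requirement that the existing property must also be true.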
In addition to the main feature, the patch also includes the following changes:
* Make UnorderedPartitionedKVWriter honor the existing
{{tez.runtime.report.partition.stats}}.
* Move generateVMEvent to ShuffleUtils so that it can be shared between
ExternalSorter and UnorderedPartitionedKVWriter.
> Support for more precise partition stats in VertexManagerEvent
> --------------------------------------------------------------
>
> Key: TEZ-3216
> URL: https://issues.apache.org/jira/browse/TEZ-3216
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Ming Ma
> Attachments: TEZ-3216.patch
>
>
> Following up on the TEZ-3206 discussion: at least for some use cases, more
> accurate partition stats will be useful for DataMovementEvent routing. Maybe
> we can provide a config option to allow apps to choose the more accurate
> partition stats over RoaringBitmap.