[ https://issues.apache.org/jira/browse/BEAM-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Valentyn Tymofieiev reassigned BEAM-9184:
-----------------------------------------

    Assignee: Jeffrey Sorensen

> Add ToSet() combiner, similar to ToList() and ToDict()
> ------------------------------------------------------
>
>                 Key: BEAM-9184
>                 URL: https://issues.apache.org/jira/browse/BEAM-9184
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Jeffrey Sorensen
>            Assignee: Jeffrey Sorensen
>            Priority: Minor
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> ToList() doesn't do deduplication, and ToDict() requires key/value tuples.
> Sets are a different type than dicts in Python, so ToSet() is required to
> combine very large PCollections while deduplicating.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
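
For illustration only, the deduplicating combine described in the issue could be sketched roughly as below. This is not the Beam implementation: the class name SetCombineSketch and the helper combine_in_bundles are hypothetical, and the code deliberately does not import Beam so that it runs standalone. It only mirrors the four method names of Beam's CombineFn interface (create_accumulator, add_input, merge_accumulators, extract_output) to show why a set accumulator deduplicates while a list accumulator would not.

```python
class SetCombineSketch:
    """Hypothetical stand-in for a set-building combiner (not Beam's code)."""

    def create_accumulator(self):
        # Each bundle of input starts from an empty set.
        return set()

    def add_input(self, accumulator, element):
        # Adding to a set silently drops duplicates within a bundle.
        accumulator.add(element)
        return accumulator

    def merge_accumulators(self, accumulators):
        # Set union drops duplicates that appear across bundles.
        merged = set()
        for accumulator in accumulators:
            merged |= accumulator
        return merged

    def extract_output(self, accumulator):
        return accumulator


def combine_in_bundles(fn, bundles):
    """Hypothetical driver: combine each bundle separately, then merge."""
    partials = []
    for bundle in bundles:
        accumulator = fn.create_accumulator()
        for element in bundle:
            accumulator = fn.add_input(accumulator, element)
        partials.append(accumulator)
    return fn.extract_output(fn.merge_accumulators(partials))
```

Because each partial accumulator is already deduplicated before merging, the intermediate state stays no larger than the number of distinct elements seen, which is presumably the property the issue is after for very large PCollections.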