[jira] [Created] (BEAM-11881) DataFrame subpartitioning order is incorrect

Brian Hulette (Jira) Fri, 26 Feb 2021 16:05:04 -0800

Brian Hulette created BEAM-11881:
------------------------------------

             Summary: DataFrame subpartitioning order is incorrect
                 Key: BEAM-11881
                 URL: https://issues.apache.org/jira/browse/BEAM-11881
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
            Reporter: Brian Hulette



Currently we've defined

Nothing() < Index([i]) < Index([i,j]) < .. < Index() < Singleton()

such that Singleton is a subpartitioning of Index(), which is in turn a 
subpartitioning of Index([i,j]), but this is incorrect. The order should be 

Singleton() < Index([i]) < Index([i,j]) < .. < Index() < Nothing()

such that every other partitioning is a subpartitioning of Singleton. This is 
logical: Singleton collects the largest amount of data on a single node, 
partitioning by a single index level is a little more distributed, and 
partitioning by the full Index() is the most distributed.
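
A minimal sketch of the corrected order, to make the relation concrete. This is 
not Beam's implementation: the class names mirror the ones used in this issue, 
but the dataclasses and the helpers precedes_or_equals and 
is_subpartitioning_of are hypothetical. Comparing Index level sets by inclusion 
is also an assumption; the issue only spells out the chain 
Index([i]) < Index([i,j]) < .. < Index().

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Singleton:
    """All data collected on a single node (smallest element)."""


@dataclass(frozen=True)
class Index:
    """Partitioned by index levels; levels=None stands for the full Index()."""
    levels: Optional[FrozenSet[str]] = None


@dataclass(frozen=True)
class Nothing:
    """No partitioning guarantee (largest element)."""


def precedes_or_equals(a, b) -> bool:
    """Return True if a <= b in the corrected order from this issue."""
    # Singleton is below everything; Nothing is above everything.
    if isinstance(a, Singleton) or isinstance(b, Nothing):
        return True
    # Here a is not Singleton and b is not Nothing, so these cannot hold.
    if isinstance(a, Nothing) or isinstance(b, Singleton):
        return False
    # Both are Index. Assumption: fewer levels means earlier in the order,
    # and the full Index() sits above every explicit level set.
    if b.levels is None:
        return True
    if a.levels is None:
        return False
    return a.levels <= b.levels


def is_subpartitioning_of(a, b) -> bool:
    """a is a subpartitioning of b when b comes at or before a in the order."""
    return precedes_or_equals(b, a)


i = Index(frozenset({"i"}))
ij = Index(frozenset({"i", "j"}))
chain = [Singleton(), i, ij, Index(), Nothing()]
```

With these definitions every element of chain is a subpartitioning of 
Singleton(), and Singleton() is a subpartitioning only of itself.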



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
