Brian Hulette created BEAM-12132:
------------------------------------

             Summary: DataFrame API: Consider allowing partitioning by column in addition to Index
                 Key: BEAM-12132
                 URL: https://issues.apache.org/jira/browse/BEAM-12132
             Project: Beam
          Issue Type: Improvement
          Components: sdk-py-core
            Reporter: Brian Hulette


For some DataFrame use cases, it may be beneficial to partition a dataset by 
column as well as by index.

One example might be computing correlations in a DataFrame with a very large 
number of columns. Each pairwise correlation depends only on the two columns 
involved, so it would be beneficial to be able to compute the pairs on 
separate workers.
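
To illustrate why the correlation case splits naturally by column, here is a minimal sketch in plain Python. It is not Beam code and not a proposed API: a dict of lists stands in for the DataFrame, a thread pool stands in for separate workers, and the names `pearson` and `pairwise_correlations` are hypothetical helpers made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns, max_workers=4):
    """Compute the correlation of every column pair as an independent task.

    `columns` maps column name -> list of values (a stand-in for a DataFrame).
    Each task reads only the two columns in its pair, which is the property
    that would let a runner place different pairs on different workers.
    """
    pairs = list(combinations(sorted(columns), 2))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(
            lambda pair: pearson(columns[pair[0]], columns[pair[1]]), pairs
        )
        return dict(zip(pairs, results))


# Tiny illustrative dataset (made up for this sketch).
columns = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
correlations = pairwise_correlations(columns)
```

With N columns there are N * (N - 1) / 2 such pairs, so the amount of independent work grows quadratically with the column count, while partitioning by index alone never separates the pairs from one another.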



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
