Brian Hulette created BEAM-12132:
------------------------------------
Summary: DataFrame API: Consider allowing partitioning by column
in addition to Index
Key: BEAM-12132
URL: https://issues.apache.org/jira/browse/BEAM-12132
Project: Beam
Issue Type: Improvement
Components: sdk-py-core
Reporter: Brian Hulette
For some DataFrame use cases it may be beneficial to partition a dataset by
column as well as by index.
One example is computing correlations in a DataFrame with a very large number
of columns. Each pairwise column correlation depends only on its two columns,
so the pairs could be computed on separate workers.
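
A rough sketch of the correlation example, in plain Python rather than the Beam
DataFrame API. The helper names (pearson, column_pair_tasks,
pairwise_correlations) and the sample data are hypothetical and only
illustrate how the work splits into one independent unit per column pair.
Nothing here reflects how Beam would actually implement it.

```python
import itertools
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def column_pair_tasks(columns):
    """Yield one independent work unit per unordered pair of columns."""
    for a, b in itertools.combinations(sorted(columns), 2):
        yield (a, b), (columns[a], columns[b])


def pairwise_correlations(columns):
    # Each task needs only its own two columns, so a runner could in
    # principle hand every task to a different worker. Here they simply
    # run one after another in a single process.
    return {
        pair: pearson(xs, ys)
        for pair, (xs, ys) in column_pair_tasks(columns)
    }


# Hypothetical data: column name -> list of values.
data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
result = pairwise_correlations(data)
```

With n columns this produces n * (n - 1) / 2 tasks, which is why spreading
them across workers matters when n is very large.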
--
This message was sent by Atlassian Jira
(v8.3.4#803005)