TheNeuralBit commented on a change in pull request #11766:
URL: https://github.com/apache/beam/pull/11766#discussion_r434916844



##########
File path: sdks/python/apache_beam/dataframe/expressions.py
##########
@@ -85,16 +87,10 @@ def evaluate_at(self, session):  # type: (Session) -> T
     """Returns the result of self with the bindings given in session."""
     raise NotImplementedError(type(self))
 
-  def requires_partition_by_index(self):  # type: () -> bool
-    """Whether this expression requires its argument(s) to be partitioned
-    by index."""
-    # TODO: It might be necessary to support partitioning by part of the index,
-    # for some args, which would require returning more than a boolean here.
+  def requires_partition_by(self):  # type: () -> Partitioning
     raise NotImplementedError(type(self))
 
-  def preserves_partition_by_index(self):  # type: () -> bool
-    """Whether the result of this expression will be partitioned by index
-    whenever all of its inputs are partitioned by index."""
+  def preserves_partition_by(self):  # type: () -> Partitioning

Review comment:
       Ah makes sense. So perhaps "preserves" could be thought of as an upper 
bound on the partitioning of the output (similar to how "requires" is a lower 
bound on the partitioning of the input).
   
   It looks like every current expression has preserves set to either Nothing 
or Singleton. Wouldn't it be simpler to just keep preserves as a boolean? Or 
maybe you have some other expression in mind where a boolean won't be 
sufficient?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to