timsaucer commented on code in PR #993: URL: https://github.com/apache/datafusion-python/pull/993#discussion_r1966217074
########## python/datafusion/context.py: ########## @@ -783,7 +783,9 @@ def register_parquet( file_extension, skip_metadata, schema, - file_sort_order, + [[expr.raw_sort for expr in exprs] for exprs in file_sort_order] Review Comment: Can we use the same helper function `sort_list_to_raw_sort_list` that you use below to clean it up a tiny bit? ########## python/datafusion/catalog.py: ########## @@ -66,9 +66,10 @@ def __init__(self, table: df_internal.Table) -> None: """This constructor is not typically called by the end user.""" self.table = table + @property Review Comment: I've become convinced that the current code is broken and that this is the right change. Thank you @chenkovsky for doing this. I do think we need to put some notes to that effect in the section of user facing changes of the PR description. ########## python/datafusion/udf.py: ########## @@ -182,7 +182,7 @@ class AggregateUDF: def __init__( self, - name: Optional[str], + name: str, Review Comment: Why remove the `Optional[str]` if we still have the logic below to allow `None`? ########## python/datafusion/udf.py: ########## @@ -158,7 +158,7 @@ def state(self) -> List[pyarrow.Scalar]: pass @abstractmethod - def update(self, *values: pyarrow.Array) -> None: + def update(self, values: pyarrow.Array) -> None: Review Comment: This looks wrong to me. I think we want/need to allow a variable number of values passed for udafs that have multiple state elements. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org