timsaucer commented on code in PR #993:
URL: https://github.com/apache/datafusion-python/pull/993#discussion_r1966217074
##########
python/datafusion/context.py:
##########
@@ -783,7 +783,9 @@ def register_parquet(
file_extension,
skip_metadata,
schema,
- file_sort_order,
+ [[expr.raw_sort for expr in exprs] for exprs in file_sort_order]
Review Comment:
Can we use the same helper function `sort_list_to_raw_sort_list` that you
use below to clean it up a tiny bit?
##########
python/datafusion/catalog.py:
##########
@@ -66,9 +66,10 @@ def __init__(self, table: df_internal.Table) -> None:
"""This constructor is not typically called by the end user."""
self.table = table
+ @property
Review Comment:
I've become convinced that the current code is broken and that this is the
right change. Thank you @chenkovsky for doing this. I do think we need to put
some notes to that effect in the section of user facing changes of the PR
description.
##########
python/datafusion/udf.py:
##########
@@ -182,7 +182,7 @@ class AggregateUDF:
def __init__(
self,
- name: Optional[str],
+ name: str,
Review Comment:
Why remove the `Optional[str]` if we still have the logic below to allow
`None`?
##########
python/datafusion/udf.py:
##########
@@ -158,7 +158,7 @@ def state(self) -> List[pyarrow.Scalar]:
pass
@abstractmethod
- def update(self, *values: pyarrow.Array) -> None:
+ def update(self, values: pyarrow.Array) -> None:
Review Comment:
This looks wrong to me. I think we want/need to allow a variable number of
values passed for udafs that have multiple state elements.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]