zhangfengcdt commented on PR #254:
URL: https://github.com/apache/sedona-db/pull/254#issuecomment-3471311563

   > If this is a DataFusion issue it likely means that the aggregations in 
SedonaDB are broken (I am not sure we have integration tests for these and we 
probably should!).
   > 
   > I would be surprised if this is not solvable in DataFusion...in any case, 
the lack of field metadata propagation for our specific case is a bug (not 
necessarily the generic case, as you noted). Our return field in theory 
declares metadata properly!
   
   I got your points!  I have added e2e tests (aggregate.rs) to test these 
aggregation functions and verified that without the "fix" in this PR, they will 
fail with the same errors we see earlier.
   
   I understand that our aggregate_udf.rs has the correct return_field() method 
implemented that carries the metadata. However, it seems only be used in 
Datafusion logic plan planning. But in Datafusion's physical planning, as noted 
before, it does not honor the return_field(), instead, it creates a brand new 
field from scratch based only on the array's DataType, which is just Binary 
without extension metadata. The real fix is to make DataFusion's physical 
aggregate use the logical plan's field (with metadata) instead of creating a 
new one.
   
   Until we have that fix in Datafusion, it does not seem to have an easy way 
to fully fix it in sedona db. The "fix" in this PR should be able to unblock 
us, so I would suggest we:
   (1) add feature flag to control this fixing behavior
   (2) add todo to remove such "fix" once the root cause in datafusion is 
resolved.
   
   Please let me know if you'd be good with this or any other thoughts, and I 
can make the changes accordingly.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to