alamb commented on PR #8270: URL: https://github.com/apache/arrow-datafusion/pull/8270#issuecomment-1819202212
> The fix looks good. It might be good to fix some other bugs in ntile. Do you think a separate issue should be opened for these or is it something maybe worth addressing in this PR? > > ```sql > DataFusion CLI v33.0.0 > ❯ create table t1 (a int); > 0 rows in set. Query took 0.005 seconds. > > ❯ insert into t1 values (1),(2),(3); > +-------+ > | count | > +-------+ > | 3 | > +-------+ > 1 row in set. Query took 0.006 seconds. > > -- Do these results make sense? All other databases return ntile values 1,2,3. > -- Tested at https://dbfiddle.uk/ > ❯ select ntile(9223377) OVER(ORDER BY a) from t1; > +--------------------------------------------------------------------------------------------------------+ > | NTILE(Int64(9223377)) ORDER BY [t1.a ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW | > +--------------------------------------------------------------------------------------------------------+ > | 1 | > | 3074460 | > | 6148919 | > +--------------------------------------------------------------------------------------------------------+ > > -- This should return a regular error instead of an internal error > ❯ select ntile(9223372036854775809) OVER(ORDER BY a) from t1; > Internal error: Cannot convert UInt64(9223372036854775809) to i64. > This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker > > -- This should not panic and crash the datafusion cli > ❯ select ntile(-922337203685477580) OVER(ORDER BY a) from t1; > thread 'main' panicked at /home/ms/git/arrow-datafusion/datafusion/physical-expr/src/window/ntile.rs:100:23: > attempt to multiply with overflow > note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace > /home/ms/git/arrow-datafusion/datafusion-cli (ntile_output_type ✔) ᐅ > ``` Thanks @msirek . I recommend we file a ticket and fix these issues in a follow on PR. My rationale is that this PR is strictly better than main (even though you have identified other areas where it can be better). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
