icexelloss commented on PR #34645: URL: https://github.com/apache/arrow/pull/34645#issuecomment-1478391066
> > This is more aligned with behaviors of pandas/spark/SQL databases. > > Whoops, my comment above was intended to refer to my comment on the issue ([#34644 (comment)](https://github.com/apache/arrow/issues/34644#issuecomment-1477405158)), but it seems I forgot to add that. But so for overflow, this PR is _not_ more aligned with eg postgres (didn't check other databases, though). > > So I mostly wanted to mention, if we change the default, we don't necessarily have to use either one of "safe" or "unsafe", since we already have more fine grained control over the option class, we can also use something like > > ``` > auto options = compute::CastOptions::Unsafe(std::move(type_nullable.first)); > options.allow_int_overflow = false; > ``` > > Now, I am not up to date with the discussions in substrait around this, so I don't know if that matches better the intended default in substrait. But it might match better with user expectations from _some_ SQL databases (I only checked Postgres) The behavior for overflow is a valid concern. I tested on pandas and Spark, both which allows overflow by default and therefore, is different from the PostgreSQL behavior. I am slight +0.5 for matching pandas/Spark behavior because my background is from Python data science tooling but I can see the other argument as well. @jorisvandenbossche if you feel strongly about the overflow behavior then I don't object to disallowing it by default, but I hope to get some quick resolution as the "allow truncate" behavior is what I care about most. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
