Wait and see seems reasonable. As a data point, BigQuery's corresponding data types set precision to the maximum precision for the given type. I also seem to recall from prior DBs that p=38, scale=9 was a common default for columns when one didn't really know what to expect (though this might have just been my experience).
On Mon, Oct 11, 2021 at 1:21 PM David Li <lidav...@apache.org> wrote:

> Thanks all for chiming in. Perhaps we can take a wait-and-see approach for
> now; I agree we'll eventually have to support multiple behaviors for
> different use cases. I'll synthesize the discussion so far into JIRA.
>
> -David
>
> On Mon, Oct 11, 2021, at 16:12, Weston Pace wrote:
> > I can see valid arguments for both cases.
> >
> > Traditionally, "front end" users (e.g. Iceberg, Flink, SQL Server,
> > Postgres) parameterize decimal only by precision and scale and the
> > underlying storage is an implementation detail (this is not true for
> > integers, e.g. tinyint and smallint). From this perspective it would
> > be very strange that
> >
> > MULT(DECIMAL<5, 5>, DECIMAL<5, 5>) => DECIMAL<11, 10>
> > MULT(DECIMAL<40, 10>, DECIMAL<10, 10>) => DECIMAL<51, 20>
> > MULT(DECIMAL<20, 10>, DECIMAL<20, 10>) => ERROR
> >
> > On the other hand, Arrow DOES parameterize decimal by precision,
> > scale, and storage. So then Antoine's point holds, which is that we
> > don't upcast integers automatically.
> >
> > At the end of the day, I don't think we have to get it right. Arrow
> > is a back end library and front end libraries can always perform the
> > cast if automatic upcast is the behavior they desire. If they want
> > the plan rejected at plan-time they can do that too. Those are the
> > only two easy options available to Arrow anyways.
> >
> > Long term, I would think we'd probably end up being asked to support
> > the whole gamut of "saturate", "runtime error", and "go to null" for
> > each of the data types, controlled by an option, but I'd defer that
> > work until someone actually asked for it.
> >
> > If I had to choose an option I'd vote for automatic upcasting since I
> > think users will be thinking of decimals in terms of precision and
> > scale only.
> >
> > On Sun, Oct 10, 2021 at 10:39 PM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > On Fri, 08 Oct 2021 08:47:33 -0400
> > > "David Li" <lidav...@apache.org> wrote:
> > > > At max precision, we currently error out regardless of scale.
> > > >
> > > > It looks like SQL Server not only saturates at max precision, but
> > > > may also reduce scale. I think we can provide that as an option if
> > > > there's demand for it. But we can at least promote decimal128 to
> > > > decimal256 and perform overflow checks.
> > >
> > > I am not sure if promotion to decimal256 is something we want to do
> > > automatically (we don't promote int32 to int64 automatically, for
> > > example).
> > >
> > > Regards
> > >
> > > Antoine.
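For anyone skimming the thread later, here's a rough Python sketch of the rule Weston spelled out above (result precision = p1 + p2 + 1, result scale = s1 + s2) and the behaviors being debated once that exceeds decimal128's 38 digits. This is only an illustration, not what the Arrow kernels actually do; the function name, the "behavior" parameter, and the simplified "saturate" handling are invented for the example.

# Illustrative sketch of the decimal multiply result-type rule discussed
# above; not Arrow's actual implementation.

MAX_P_DECIMAL128 = 38
MAX_P_DECIMAL256 = 76

def multiply_result_type(p1, s1, p2, s2, behavior="error"):
    """Return (storage, precision, scale) for DECIMAL<p1, s1> * DECIMAL<p2, s2>."""
    # Rule from the thread: precision grows by p1 + p2 + 1, scale by s1 + s2.
    p, s = p1 + p2 + 1, s1 + s2
    if p <= MAX_P_DECIMAL128:
        return ("decimal128", p, s)
    if behavior == "upcast" and p <= MAX_P_DECIMAL256:
        # Automatic promotion to the wider storage type.
        return ("decimal256", p, s)
    if behavior == "saturate":
        # SQL Server-style clamping; a real implementation may also reduce
        # the scale so the integer part still fits.
        return ("decimal128", MAX_P_DECIMAL128, s)
    raise ValueError(f"DECIMAL<{p}, {s}> exceeds decimal128 precision")

# multiply_result_type(5, 5, 5, 5)                       -> ('decimal128', 11, 10)
# multiply_result_type(20, 10, 20, 10)                   -> raises (plan-time error)
# multiply_result_type(20, 10, 20, 10, behavior="upcast") -> ('decimal256', 41, 20)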