Beam types and BigQuery types do not have a 1:1 mapping. For example, Beam has several integer types, while BigQuery has only INT64. So far, each Beam type has mapped to exactly one BigQuery type. This is the first case where a single Beam type could map to two BigQuery types.
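For illustration only, here is a rough Java sketch of what "fits in NUMERIC" means for a single BigDecimal value. The class and method names (DecimalTargetSketch, fitsNumeric, candidateType) are made up for this message and are not part of Beam. The limits are the documented BigQuery NUMERIC precision of 38 and scale of 9, which leaves at most 29 integer digits. The sketch does not check BIGNUMERIC's own limits; it only shows that the right target depends on the values, not on the Beam type.

```java
import java.math.BigDecimal;

// Hypothetical helper for illustration; not part of Beam or BigQueryIO.
final class DecimalTargetSketch {
  // BigQuery NUMERIC: precision 38, scale 9, so at most 29 digits before the point.
  static final int NUMERIC_MAX_SCALE = 9;
  static final int NUMERIC_MAX_INTEGER_DIGITS = 29;

  // Returns true if the value can be stored in NUMERIC without rounding or overflow.
  static boolean fitsNumeric(BigDecimal value) {
    // Drop trailing zeros so that 1.5000000000 is treated the same as 1.5.
    BigDecimal v = value.stripTrailingZeros();
    // Digits after the decimal point (a negative scale means none).
    int fractionalDigits = Math.max(v.scale(), 0);
    // Digits before the decimal point; zero or negative for values below 1.
    int integerDigits = v.precision() - v.scale();
    return fractionalDigits <= NUMERIC_MAX_SCALE
        && integerDigits <= NUMERIC_MAX_INTEGER_DIGITS;
  }

  // Picks NUMERIC when the value fits, otherwise names BIGNUMERIC as the candidate.
  // Assumption: BIGNUMERIC's own range is not validated here.
  static String candidateType(BigDecimal value) {
    return fitsNumeric(value) ? "NUMERIC" : "BIGNUMERIC";
  }

  public static void main(String[] args) {
    System.out.println(candidateType(new BigDecimal("123.456")));
    System.out.println(candidateType(new BigDecimal("0.0000000001")));
  }
}
```

A per-value check like this is only useful for reasoning about the data. The table schema still needs a single type per column, which is why a user-facing option seems preferable.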
It would be ideal if we could provide a way for users to choose which BigQuery type to use. If they know their values fit in NUMERIC, they can choose NUMERIC. Otherwise they can choose BIGNUMERIC.

IIRC, the mapping in question [1] is used only in BigQueryUtils.toTableSchema [2], which is called in BigQueryTable.buildIOWriter [3] and schema inference in WriteResult [4].

For [3], BigQueryTable also has a member BigQueryUtils.ConversionOptions [5]. We can add an option to BigQueryUtils.ConversionOptions to specify whether to convert Decimal to NUMERIC or BIGNUMERIC, and pass BigQueryUtils.ConversionOptions as a new argument to BigQueryUtils.toTableSchema.

For [4], I'm not sure if we can add BigQueryUtils.ConversionOptions to WriteResult but even if we can't, users can specify the schema instead of using the inferred schema, so it seems fine to keep mapping Decimal to NUMERIC in schema inference.

Does this sound reasonable?

[1] https://github.com/apache/beam/blob/e039ca28d6f806f30b87cae82e6af86694c171cd/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L180
[2] https://github.com/apache/beam/blob/e039ca28d6f806f30b87cae82e6af86694c171cd/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L391
[3] https://github.com/apache/beam/blob/e039ca28d6f806f30b87cae82e6af86694c171cd/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTable.java#L192
[4] https://github.com/apache/beam/blob/e039ca28d6f806f30b87cae82e6af86694c171cd/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L2604
[5] https://github.com/apache/beam/blob/e039ca28d6f806f30b87cae82e6af86694c171cd/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTable.java#L81

On 2021/03/18 19:40:07, Reuven Lax <r...@google.com> wrote:
> It does not, which might've been a mistake. The user can pass in an
> arbitrary BigDecimal object, and we will encode whatever scale parameter
> is encoded. This means that for DECIMAL, each record encodes the scale.
>
> On Thu, Mar 18, 2021 at 12:33 PM Mingyu Zhong <my...@google.com> wrote:
>
> > Just wanted to clarify: BigQuery BIGNUMERIC type costs more than NUMERIC
> > type, so if NUMERIC is sufficient, the users likely won't want to switch to
> > BIGNUMERIC.
> >
> > Does Beam DECIMAL datatype contain the precision/scale parameters in the
> > metadata? If so, can we use those parameters to determine the mapped type?
> >
> > On Thu, Mar 18, 2021 at 12:08 PM Brian Hulette <bh...@google.com> wrote:
> >
> >> Hi Vachan,
> >> I don't think Beam DECIMAL is really a great mapping for either
> >> BigQuery's NUMERIC or BIGNUMERIC type. Beam's DECIMAL represents arbitrary
> >> precision decimals [1] to map well to Java's BigDecimal class [2].
> >>
> >> Maybe we should add a fixed-precision decimal logical type [3], then have
> >> specific instances of it with the appropriate precision that map to NUMERIC
> >> and BIGNUMERIC. We could also shunt Beam DECIMAL to BIGNUMERIC for
> >> compatibility.
> >>
> >> [1] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L424
> >> [2] https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html
> >> [3] https://github.com/apache/beam/tree/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes
> >>
> >> On Thu, Mar 18, 2021 at 12:00 PM Vachan Shetty <va...@google.com> wrote:
> >>
> >>> Hello, I am currently trying to add support for BigQuery's new
> >>> BIGNUMERIC datatype [1] in Beam's BigQueryIO. I am currently following the
> >>> steps that were used for adding the NUMERIC datatype [2]. AFAICT Beam's
> >>> DECIMAL is the most appropriate datatype to map to BIGNUMERIC in BQ.
> >>> However, the Beam DECIMAL datatype is already mapped to NUMERIC in BQ
> >>> [2, 3]. Given this, should I simply map all Beam DECIMAL to BQ BIGNUMERIC?
> >>> Or should this conversion be done based on other information?
> >>>
> >>> [1]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#decimal_types
> >>> [2]: https://issues.apache.org/jira/browse/BEAM-4417
> >>> [3]: https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryUtils.java#L188
> >>>
> >>
> >
> > --
> > Thanks,
> >
> > Mingyu
> >
>