[
https://issues.apache.org/jira/browse/ARROW-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Swenson updated ARROW-16427:
-------------------------------------
Component/s: Java
> [Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable
> decimal precision / scale
> -------------------------------------------------------------------------------------------------------
>
> Key: ARROW-16427
> URL: https://issues.apache.org/jira/browse/ARROW-16427
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java
> Affects Versions: 7.0.0
> Reporter: Jonathan Swenson
> Priority: Major
>
> When a JDBC driver returns a Numeric type whose precision / scale doesn't
> exactly align with what is in the JDBC metadata, jdbcToArrowVectors /
> sqlToArrowVectorIterator fails to process the result (failing while
> serializing the value into the DecimalVector).
> It appears that this is because JDBC drivers can return BigDecimal / Numeric
> values whose precision / scale differs from the metadata and is not
> consistent from row to row.
> Is there a recommended course of action for representing a decimal vector
> with variable precision / scale? In any case, it does not seem possible to
> convert JDBC data that uses these numeric types with the built-in utilities
> when the values come in this form.
> It seems that both the Oracle and Postgres JDBC drivers return metadata with
> a 0,0 precision / scale when the values in the result set have different
> (and varied) precision / scale.
> An example:
> Against Postgres, running a simple SQL query that produces numeric types can
> lead to a JDBC result set containing BigDecimal values with variable
> precision / scale.
> {code:java}
> SELECT value FROM (
> SELECT 1000000000000000.01 AS "value"
> UNION SELECT 1000000000300.0000001
> ) a {code}
>
> The Postgres JDBC driver produces a result set that looks like the
> following:
>
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|1000000000000000.01|18|2|
> |row 2|1000000000300.0000001|20|7|
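> For reference, the per-row precision / scale in the table can be reproduced
> from the BigDecimal values alone, without a database (plain Java sketch):

```java
import java.math.BigDecimal;

public class ScaleDemo {
    public static void main(String[] args) {
        // The two values from the query above, as ResultSet.getBigDecimal()
        // would materialize them.
        BigDecimal row1 = new BigDecimal("1000000000000000.01");
        BigDecimal row2 = new BigDecimal("1000000000300.0000001");

        System.out.println(row1.precision() + "," + row1.scale()); // 18,2
        System.out.println(row2.precision() + "," + row2.scale()); // 20,7
    }
}
```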
>
> Even a result set that returns a single value may contain Numeric values
> whose precision / scale does not match the precision / scale in the
> ResultSetMetaData.
>
> {code:java}
> SELECT AVG(one) from (
> SELECT 1000000000000000.01 as "one"
> UNION select 1000000000300.0000001
> ) a {code}
> produces a result set that looks like this:
>
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|500500000000150.0050001|22|7|
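> The same mismatch is visible here: the single returned BigDecimal carries
> its own precision / scale while the metadata still reports 0,0 (plain Java
> check of the value above):

```java
import java.math.BigDecimal;

public class AvgScaleDemo {
    public static void main(String[] args) {
        // The AVG(...) result as returned through ResultSet.getBigDecimal()
        BigDecimal avg = new BigDecimal("500500000000150.0050001");
        System.out.println(avg.precision() + "," + avg.scale()); // 22,7
    }
}
```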
>
> When processing the result set using the simple jdbcToArrowVectors (or
> sqlToArrowVectorIterator), setting the values extracted from the result set
> into the DecimalVector fails:
>
> {code:java}
> val calendar = JdbcToArrowUtils.getUtcCalendar()
> val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
> val root = VectorSchemaRoot.create(schema, RootAllocator())
> val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) {code}
> Error:
>
> {code:java}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0,
> length: 1 (expected: range(0, 0))
> at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
> at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
> at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
> at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
> at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
> at
> org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
> at
> org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
> at
> org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
> at
> org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
> {code}
>
> Using `sqlToArrowVectorIterator` also fails, with an error when trying to
> set data into the vector (this requires a little trickery to force creation
> of the package-private configuration):
>
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Error occurred while
> getting next schema root.
> at
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
> at
> com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
> at com.acme.AppKt.main(App.kt:54)
> at com.acme.AppKt.main(App.kt)
> Caused by: java.lang.RuntimeException: Error occurred while consuming data.
> at
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
> at
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
> at
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
> ... 3 more
> Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must
> equal that in the Arrow vector: 7 != 0
> at
> org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
> at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
> at
> org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
> at
> org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
> at
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
> ... 5 more {code}
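> A possible client-side workaround (a sketch, not part of the Arrow JDBC
> utilities): pick a fixed scale up front, create the vector with it, and
> rescale every BigDecimal before setting it, so that
> DecimalUtility.checkPrecisionAndScale passes. Note this can round away
> digits when a row has more fractional digits than the chosen scale:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalNormalizer {
    // Coerce a value to the scale the DecimalVector was created with.
    // RoundingMode.HALF_UP is an arbitrary choice for this sketch.
    static BigDecimal normalize(BigDecimal value, int targetScale) {
        return value.setScale(targetScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // Both rows from the example end up with the same scale (7),
        // so they can be written into one DecimalVector.
        System.out.println(normalize(new BigDecimal("1000000000000000.01"), 7));
        System.out.println(normalize(new BigDecimal("1000000000300.0000001"), 7));
    }
}
```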
>
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)