[ 
https://issues.apache.org/jira/browse/ARROW-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Swenson updated ARROW-16427:
-------------------------------------
    Description: 
When a JDBC driver returns a Numeric type that doesn't exactly align with what 
is in the JDBC metadata, jdbcToArrowVectors / sqlToArrowVectorIterator fails to 
process the result (failing on serializing the the value into the 
BigDecimalVector). 

It appears as though this is because JDBC drivers can return BigDecimal / 
Numeric values that are different between the metadata and not consistent 
between each of the rows. 

Is there a recommended course of action to represent a variable precision / 
scale decimal vector? In any case it does not seem possible to convert JDBC 
data with the built in utilities that uses these numeric types when they come 
in this form. 

It seems like both the Oracle and the Postgres JDBC driver also returns 
metadata with a 0,0 precision / scale when values in the result set have 
different (and varied) precision / scale. 

An example: 

Against postgres, running a simple SQL query that produces numeric types can 
lead to a JDBC result set with BigDecimal values with variable decimal 
precision/scale. 
{code:java}
SELECT value FROM (
  SELECT 1000000000000000.01 AS "value" 
  UNION SELECT 1000000000300.0000001
) a {code}
 

The postgres JDBC adapter produces a result set that looks like the following: 

 
|| ||value||precision||scale||
|metadata|N/A|0|0|
|row 1|1000000000000000.01|18|2|
|row 2|1000000000300.0000001|20|7|

 

Even a result set that returns a single value may Numeric values with precision 
/ scale that do not match the precision / scale in the ResultSetMetadata. 

 
{code:java}
SELECT AVG(one) from (
  SELECT 1000000000000000.01 as "one" 
  UNION select 1000000000300.0000001
) a {code}
produces a result set that looks like this

 
|| ||value||precision||scale||
|metadata|N/A|0|0|
|row 1|500500000000150.0050001|22|7|

 

When processing the result set using the simple jdbcToArrowVectors (or 
sqlToArrowVectorIterator) this fails to set the values extracted from the 
result set into the the DecimalVector

 
{code:java}
val calendar = JdbcToArrowUtils.getUtcCalendar()
val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
val root = VectorSchemaRoot.create(schema, RootAllocator())
val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) {code}
Error:

 
{code:java}
Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, 
length: 1 (expected: range(0, 0))
    at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
    at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
    at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
    at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
    at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
    at 
org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
    at 
org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
    at 
org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
    at 
org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
 {code}
 

using `sqlToArrowVectorIterator` also fails with an error trying to set data 
into the vector: (requires a little bit of trickery to force creation of the 
package private configuration)

 
{code:java}
Exception in thread "main" java.lang.RuntimeException: Error occurred while 
getting next schema root.
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
    at 
com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
    at com.acme.AppKt.main(App.kt:54)
    at com.acme.AppKt.main(App.kt)
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
    ... 3 more
Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must equal 
that in the Arrow vector: 7 != 0
    at 
org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
    at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
    at 
org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
    at 
org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
    ... 5 more {code}
 

 

  was:
When a JDBC driver returns a Numeric type that doesn't exactly align with what 
is in the JDBC metadata, khjdbcToArrowVectors / sqlToArrowVectorIterator fails 
to process the result (failing on serializing the the value into the 
BigDecimalVector). 

It appears as though this is because JDBC drivers can return BigDecimal / 
Numeric values that are different between the metadata and not consistent 
between each of the rows. 

Is there a recommended course of action to represent a variable precision / 
scale decimal vector? In any case it does not seem possible to convert JDBC 
data with the built in utilities that uses these numeric types when they come 
in this form. 

It seems like both the Oracle and the Postgres JDBC driver also returns 
metadata with a 0,0 precision / scale when values in the result set have 
different (and varied) precision / scale. 

An example: 

Against postgres, running a simple SQL query that produces numeric types can 
lead to a JDBC result set with BigDecimal values with variable decimal 
precision/scale. 
{code:java}
SELECT value FROM (
  SELECT 1000000000000000.01 AS "value" 
  UNION SELECT 1000000000300.0000001
) a {code}
 

The postgres JDBC adapter produces a result set that looks like the following: 

 
|| ||value||precision||scale||
|metadata|N/A|0|0|
|row 1|1000000000000000.01|18|2|
|row 2|1000000000300.0000001|20|7|

 

Even a result set that returns a single value may Numeric values with precision 
/ scale that do not match the precision / scale in the ResultSetMetadata. 

 
{code:java}
SELECT AVG(one) from (
  SELECT 1000000000000000.01 as "one" 
  UNION select 1000000000300.0000001
) a {code}
produces a result set that looks like this

 
|| ||value||precision||scale||
|metadata|N/A|0|0|
|row 1|500500000000150.0050001|22|7|

 

When processing the result set using the simple jdbcToArrowVectors (or 
sqlToArrowVectorIterator) this fails to set the values extracted from the 
result set into the the DecimalVector

 
{code:java}
val calendar = JdbcToArrowUtils.getUtcCalendar()
val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
val root = VectorSchemaRoot.create(schema, RootAllocator())
val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) {code}
Error:

 
{code:java}
Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, 
length: 1 (expected: range(0, 0))
    at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
    at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
    at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
    at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
    at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
    at 
org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
    at 
org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
    at 
org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
    at 
org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
 {code}
 

using `sqlToArrowVectorIterator` also fails with an error trying to set data 
into the vector: (requires a little bit of trickery to force creation of the 
package private configuration)

 
{code:java}
Exception in thread "main" java.lang.RuntimeException: Error occurred while 
getting next schema root.
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
    at 
com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
    at com.acme.AppKt.main(App.kt:54)
    at com.acme.AppKt.main(App.kt)
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
    ... 3 more
Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must equal 
that in the Arrow vector: 7 != 0
    at 
org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
    at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
    at 
org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
    at 
org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
    at 
org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
    ... 5 more {code}
 

 


> [Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable 
> decimal precision / scale
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-16427
>                 URL: https://issues.apache.org/jira/browse/ARROW-16427
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Jonathan Swenson
>            Priority: Major
>
> When a JDBC driver returns a Numeric type that doesn't exactly align with 
> what is in the JDBC metadata, jdbcToArrowVectors / sqlToArrowVectorIterator 
> fails to process the result (failing on serializing the the value into the 
> BigDecimalVector). 
> It appears as though this is because JDBC drivers can return BigDecimal / 
> Numeric values that are different between the metadata and not consistent 
> between each of the rows. 
> Is there a recommended course of action to represent a variable precision / 
> scale decimal vector? In any case it does not seem possible to convert JDBC 
> data with the built in utilities that uses these numeric types when they come 
> in this form. 
> It seems like both the Oracle and the Postgres JDBC driver also returns 
> metadata with a 0,0 precision / scale when values in the result set have 
> different (and varied) precision / scale. 
> An example: 
> Against postgres, running a simple SQL query that produces numeric types can 
> lead to a JDBC result set with BigDecimal values with variable decimal 
> precision/scale. 
> {code:java}
> SELECT value FROM (
>   SELECT 1000000000000000.01 AS "value" 
>   UNION SELECT 1000000000300.0000001
> ) a {code}
>  
> The postgres JDBC adapter produces a result set that looks like the 
> following: 
>  
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|1000000000000000.01|18|2|
> |row 2|1000000000300.0000001|20|7|
>  
> Even a result set that returns a single value may Numeric values with 
> precision / scale that do not match the precision / scale in the 
> ResultSetMetadata. 
>  
> {code:java}
> SELECT AVG(one) from (
>   SELECT 1000000000000000.01 as "one" 
>   UNION select 1000000000300.0000001
> ) a {code}
> produces a result set that looks like this
>  
> || ||value||precision||scale||
> |metadata|N/A|0|0|
> |row 1|500500000000150.0050001|22|7|
>  
> When processing the result set using the simple jdbcToArrowVectors (or 
> sqlToArrowVectorIterator) this fails to set the values extracted from the 
> result set into the the DecimalVector
>  
> {code:java}
> val calendar = JdbcToArrowUtils.getUtcCalendar()
> val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
> val root = VectorSchemaRoot.create(schema, RootAllocator())
> val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) {code}
> Error:
>  
> {code:java}
> Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, 
> length: 1 (expected: range(0, 0))
>     at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
>     at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
>     at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
>     at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
>     at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
>     at 
> org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
>     at 
> org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321)
>  {code}
>  
> using `sqlToArrowVectorIterator` also fails with an error trying to set data 
> into the vector: (requires a little bit of trickery to force creation of the 
> package private configuration)
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Error occurred while 
> getting next schema root.
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
>     at 
> com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
>     at com.acme.AppKt.main(App.kt:54)
>     at com.acme.AppKt.main(App.kt)
> Caused by: java.lang.RuntimeException: Error occurred while consuming data.
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
>     ... 3 more
> Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must 
> equal that in the Arrow vector: 7 != 0
>     at 
> org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
>     at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
>     at 
> org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
>     at 
> org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
>     ... 5 more {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to