[
https://issues.apache.org/jira/browse/DRILL-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dave Oshinsky updated DRILL-4184:
---------------------------------
Description:
Encoding a DECIMAL logical type in Parquet using the variable length BINARY
primitive type is not supported by Drill as of versions 1.3.0 and 1.4.0. The
problem first surfaces with the ClassCastException shown below, but fixing the
immediate cause of the exception is not sufficient to support this combination
(DECIMAL, BINARY) in a Parquet file.
In Drill, DECIMAL is currently assumed to be INT32, INT64, INT96, or
FIXED_LEN_BYTE_ARRAY. Are there any plans to support DECIMAL with variable
length BINARY? Avro definitely supports encoding DECIMAL in variable length
bytes (see https://avro.apache.org/docs/current/spec.html#Decimal), but this
support in Parquet is less clear.
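To make the encoding concrete, here is a minimal sketch of how a decimal stored as variable length bytes could be decoded. It follows the Avro rule that the bytes hold the big-endian two's-complement unscaled value, with the scale taken from the schema. That the BINARY values in this Parquet file use the same layout is an assumption. The class and method names are illustrative only and are not part of Drill, Parquet, or Avro.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative sketch only; BinaryDecimalSketch is a hypothetical helper, not Drill code.
final class BinaryDecimalSketch {

    // Decode big-endian two's-complement unscaled bytes into a BigDecimal.
    // The scale comes from the schema, e.g. 0 for DECIMAL(20,0) or 2 for DECIMAL(8,2).
    public static BigDecimal decode(byte[] unscaledBytes, int scale) {
        return new BigDecimal(new BigInteger(unscaledBytes), scale);
    }

    // Encode with the minimal number of bytes, which is why the length differs per value.
    // setScale throws ArithmeticException if the value does not fit the scale exactly.
    public static byte[] encode(BigDecimal value, int scale) {
        return value.setScale(scale).unscaledValue().toByteArray();
    }

    public static void main(String[] args) {
        byte[] acctNo = encode(new BigDecimal("70000020"), 0);
        byte[] curBal = encode(new BigDecimal("-12.50"), 2);
        System.out.println(acctNo.length + " bytes -> " + decode(acctNo, 0));
        System.out.println(curBal.length + " bytes -> " + decode(curBal, 2));
    }
}
```

Because the byte length depends on the magnitude of each value, a reader cannot treat such a column as fixed width without first padding or converting every value.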
Selecting on a BINARY DECIMAL field in a Parquet file throws an exception as
shown below (java.lang.ClassCastException:
org.apache.drill.exec.vector.Decimal28SparseVector cannot be cast to
org.apache.drill.exec.vector.VariableWidthVector). The successful query at the
bottom selected on a string field in the same file.
0: jdbc:drill:zk=local> select count(*) from
dfs.`c:/dao/DBArchivePredictor/tenrows.parquet` where acct_no=70000020;
org.apache.drill.common.exceptions.DrillRuntimeException: Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message sbi.acct_mstr {
required binary ACCT_NO (DECIMAL(20,0));
optional binary SF_NO (UTF8);
optional binary LF_NO (UTF8);
optional binary BRANCH_NO (DECIMAL(20,0));
optional binary INTRO_CUST_NO (DECIMAL(20,0));
optional binary INTRO_ACCT_NO (DECIMAL(20,0));
optional binary INTRO_SIGN (UTF8);
optional binary TYPE (UTF8);
optional binary OPR_MODE (UTF8);
optional binary CUR_ACCT_TYPE (UTF8);
optional binary TITLE (UTF8);
optional binary CORP_CUST_NO (DECIMAL(20,0));
optional binary APLNDT (UTF8);
optional binary OPNDT (UTF8);
optional binary VERI_EMP_NO (DECIMAL(20,0));
optional binary VERI_SIGN (UTF8);
optional binary MANAGER_SIGN (UTF8);
optional binary CURBAL (DECIMAL(8,2));
optional binary STATUS (UTF8);
}
, metadata: {parquet.avro.schema={"type":"record","name":"acct_mstr","namespace"
:"sbi","fields":[{"name":"ACCT_NO","type":{"type":"bytes","logicalType":"decimal
","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_co
lumn_class":"java.math.BigDecimal","cv_connection":"oracle.jdbc.driver.T4CConnec
tion","cv_currency":true,"cv_def_writable":false,"cv_nullable":0,"cv_precision":
20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_s
ubscript":1,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}},{"name":"SF_
NO","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":tru
e,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":fal
se,"cv_nullable":1,"cv_precision":10,"cv_read_only":false,"cv_scale":0,"cv_searc
hable":true,"cv_signed":true,"cv_subscript":2,"cv_type":12,"cv_typename":"VARCHA
R2","cv_writable":true}]},{"name":"LF_NO","type":["null",{"type":"string","cv_au
to_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv
_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":10,"cv_r
ead_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript
":3,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},{"name":"BRANCH_
NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale
":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.
BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_preci
sion":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true
,"cv_subscript":4,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},{"nam
e":"INTRO_CUST_NO","type":["null",{"type":"bytes","logicalType":"decimal","preci
sion":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_cla
ss":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullab
le":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"
cv_signed":true,"cv_subscript":5,"cv_type":2,"cv_typename":"NUMBER","cv_writable
":true}]},{"name":"INTRO_ACCT_NO","type":["null",{"type":"bytes","logicalType":"
decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false
,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":f
alse,"cv_nullable":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_sea
rchable":true,"cv_signed":true,"cv_subscript":6,"cv_type":2,"cv_typename":"NUMBE
R","cv_writable":true}]},{"name":"INTRO_SIGN","type":["null",{"type":"string","c
v_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String"
,"cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"c
v_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscr
ipt":7,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},{"name":"TYPE
","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,
"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false
,"cv_nullable":1,"cv_precision":2,"cv_read_only":false,"cv_scale":0,"cv_searchab
le":true,"cv_signed":true,"cv_subscript":8,"cv_type":12,"cv_typename":"VARCHAR2"
,"cv_writable":true}]},{"name":"OPR_MODE","type":["null",{"type":"string","cv_au
to_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv
_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":2,"cv_re
ad_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript"
:9,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},{"name":"CUR_ACCT
_TYPE","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":
true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":
false,"cv_nullable":1,"cv_precision":4,"cv_read_only":false,"cv_scale":0,"cv_sea
rchable":true,"cv_signed":true,"cv_subscript":10,"cv_type":12,"cv_typename":"VAR
CHAR2","cv_writable":true}]},{"name":"TITLE","type":["null",{"type":"string","cv
_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String",
"cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":30,"c
v_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscr
ipt":11,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},{"name":"COR
P_CUST_NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20
,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"jav
a.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"c
v_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signe
d":true,"cv_subscript":12,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}
]},{"name":"APLNDT","type":["null",{"type":"string","cv_auto_incr":false,"cv_cas
e_sensitive":false,"cv_column_class":"java.sql.Timestamp","cv_currency":false,"c
v_def_writable":false,"cv_nullable":1,"cv_precision":0,"cv_read_only":false,"cv_
scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":13,"cv_type":93,"c
v_typename":"DATE","cv_writable":true}]},{"name":"OPNDT","type":["null",{"type":
"string","cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.
sql.Timestamp","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_p
recision":0,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":t
rue,"cv_subscript":14,"cv_type":93,"cv_typename":"DATE","cv_writable":true}]},{"
name":"VERI_EMP_NO","type":["null",{"type":"bytes","logicalType":"decimal","prec
ision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_cl
ass":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nulla
ble":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,
"cv_signed":true,"cv_subscript":15,"cv_type":2,"cv_typename":"NUMBER","cv_writab
le":true}]},{"name":"VERI_SIGN","type":["null",{"type":"string","cv_auto_incr":f
alse,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency"
:false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"cv_read_only":f
alse,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":16,"cv_ty
pe":12,"cv_typename":"VARCHAR2","cv_writable":true}]},{"name":"MANAGER_SIGN","ty
pe":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_c
olumn_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_
nullable":1,"cv_precision":1,"cv_read_only":false,"cv_scale":0,"cv_searchable":t
rue,"cv_signed":true,"cv_subscript":17,"cv_type":12,"cv_typename":"VARCHAR2","cv
_writable":true}]},{"name":"CURBAL","type":["null",{"type":"bytes","logicalType"
:"decimal","precision":8,"scale":2,"cv_auto_incr":false,"cv_case_sensitive":fals
e,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":
false,"cv_nullable":1,"cv_precision":8,"cv_read_only":false,"cv_scale":2,"cv_sea
rchable":true,"cv_signed":true,"cv_subscript":18,"cv_type":2,"cv_typename":"NUMB
ER","cv_writable":true}]},{"name":"STATUS","type":["null",{"type":"string","cv_a
uto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","c
v_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"cv_r
ead_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript
":19,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]}]}}}, blocks: [B
lockMetaData{10, 1281 [ColumnMetaData{SNAPPY [ACCT_NO] BINARY [BIT_PACKED, PLAI
N], 4}, ColumnMetaData{SNAPPY [SF_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY
], 88}, ColumnMetaData{SNAPPY [LF_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY
], 163}, ColumnMetaData{SNAPPY [BRANCH_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTI
ONARY], 241}, ColumnMetaData{SNAPPY [INTRO_CUST_NO] BINARY [RLE, BIT_PACKED, PL
AIN_DICTIONARY], 298}, ColumnMetaData{SNAPPY [INTRO_ACCT_NO] BINARY [RLE, BIT_P
ACKED, PLAIN_DICTIONARY], 364}, ColumnMetaData{SNAPPY [INTRO_SIGN] BINARY [RLE,
BIT_PACKED, PLAIN_DICTIONARY], 421}, ColumnMetaData{SNAPPY [TYPE] BINARY [RLE,
BIT_PACKED, PLAIN_DICTIONARY], 478}, ColumnMetaData{SNAPPY [OPR_MODE] BINARY [
RLE, BIT_PACKED, PLAIN_DICTIONARY], 538}, ColumnMetaData{SNAPPY [CUR_ACCT_TYPE]
BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 598}, ColumnMetaData{SNAPPY [TITLE]
BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 658}, ColumnMetaData{SNAPPY [CORP_
CUST_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 736}, ColumnMetaData{SNAPP
Y [APLNDT] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 802}, ColumnMetaData{SNA
PPY [OPNDT] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 919}, ColumnMetaData{SN
APPY [VERI_EMP_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1036}, ColumnMet
aData{SNAPPY [VERI_SIGN] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1093}, Col
umnMetaData{SNAPPY [MANAGER_SIGN] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1
150}, ColumnMetaData{SNAPPY [CURBAL] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY]
, 1207}, ColumnMetaData{SNAPPY [STATUS] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONA
RY], 1270}]}]}
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleAndRaise(ParquetRecordReader.java:346)
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:339)
at org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:101)
at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:168)
at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:56)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:151)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:105)
at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: org.apache.drill.exec.vector.Decimal28SparseVector cannot be cast to org.apache.drill.exec.vector.VariableWidthVector
at org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.<init>(VarLengthValuesColumn.java:44)
at org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$Decimal28Column.<init>(VarLengthColumnReaders.java:52)
at org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.getReader(ColumnReaderFactory.java:178)
at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:319)
... 22 more
Error: SYSTEM ERROR: ClassCastException: org.apache.drill.exec.vector.Decimal28SparseVector cannot be cast to org.apache.drill.exec.vector.VariableWidthVector
Fragment 0:0
[Error Id: 22bfa8dd-1129-4300-9449-409e96d6c800 on DaveOshinsky-PC.gp.cv.commvault.com:31010] (state=,code=0)
0: jdbc:drill:zk=local> select count(*) from dfs.`c:/dao/DBArchivePredictor/tenrows.parquet` where opr_mode='JO';
+---------+
| EXPR$0 |
+---------+
| 10 |
+---------+
1 row selected (0.406 seconds)
0: jdbc:drill:zk=local>
The immediate cause of this exception is that Drill, in
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader, assumes
that all BINARY values are stored in VariableWidthVectors. For BINARY
DECIMAL this is not true: Decimal28SparseVector, for example, is a
FixedWidthVector, not a VariableWidthVector. The assumption that DECIMAL is
never encoded in variable length BINARY appears in a number of other places in
the Drill code, including:
org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory only
contains logic to handle DECIMAL with INT32, INT64, INT96, or
FIXED_LEN_BYTE_ARRAY. BINARY is not supported with DECIMAL.
org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders
does not provide a nullable reader for BINARY in its getNullableColumnReader
method.
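The failure mode described above can be reproduced in isolation with a few stand-in types. This is only a sketch: the interfaces and classes below are hypothetical simplifications that borrow Drill's names for readability, and the enum merely lists the primitive types named in this report. None of it is Drill's actual code.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative sketch only; every type here is a hypothetical stand-in, not a Drill class.
final class BinaryDecimalCastSketch {

    interface ValueVector {}
    interface FixedWidthVector extends ValueVector {}
    interface VariableWidthVector extends ValueVector {}

    // Per the report, the decimal vector is fixed width.
    static final class Decimal28SparseVector implements FixedWidthVector {}
    // Stand-in for a vector that really is variable width (e.g. one holding strings).
    static final class VarBinaryVector implements VariableWidthVector {}

    // Primitive types named in the report, and the subset handled for DECIMAL.
    enum PrimitiveType { INT32, INT64, INT96, FIXED_LEN_BYTE_ARRAY, BINARY }
    static final Set<PrimitiveType> DECIMAL_HANDLED = EnumSet.of(
            PrimitiveType.INT32, PrimitiveType.INT64,
            PrimitiveType.INT96, PrimitiveType.FIXED_LEN_BYTE_ARRAY);

    // Mirrors the assumption that every BINARY column is backed by a variable-width vector.
    public static VariableWidthVector asVariableWidth(ValueVector v) {
        return (VariableWidthVector) v; // throws ClassCastException for fixed-width vectors
    }

    // Returns true when the unchecked cast above fails for the given vector.
    public static boolean castFails(ValueVector v) {
        try {
            asVariableWidth(v);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("string column cast fails:  " + castFails(new VarBinaryVector()));
        System.out.println("decimal column cast fails: " + castFails(new Decimal28SparseVector()));
        System.out.println("BINARY handled for DECIMAL: "
                + DECIMAL_HANDLED.contains(PrimitiveType.BINARY));
    }
}
```

The sketch also shows why removing the cast alone would not be enough: BINARY is still missing from the set of primitive types that the DECIMAL handling covers.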
was:
The immediate cause of this exception is that Drill, in
ParquetRecordReader.java, assumes that all BINARY values are encoded in
VariableWidthVectors. For BINARY DECIMAL, this is not true, as for example
Decimal28SparseVector is a FixedWidthVector, not a VariableWidthVector. The
assumption that DECIMAL is encoded in INT32 or INT64 fields (not BINARY) is
found in a number of other places in the Drill code, including:
org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory only
contains logic to handle DECIMAL with INT32, INT64, INT96, or
FIXED_LEN_BYTE_ARRAY. BINARY is not supported with DECIMAL.
org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders
does not support a nullable reader for BINARY in getNullableColumnReader
method.
> Drill does not support Parquet DECIMAL values in variable length BINARY fields
> ------------------------------------------------------------------------------
>
> Key: DRILL-4184
> URL: https://issues.apache.org/jira/browse/DRILL-4184
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet
> Affects Versions: 1.4.0
> Environment: Windows 7 Professional, Java 1.8.0_66
> Reporter: Dave Oshinsky
>
> Encoding a DECIMAL logical type in Parquet using the variable length BINARY
> primitive type is not supported by Drill as of versions 1.3.0 and 1.4.0. The
> problem first surfaces with the ClassCastException shown below, but fixing
> the immediate cause of the exception is not sufficient to support this
> combination (DECIMAL, BINARY) in a Parquet file.
> In Drill, DECIMAL is currently assumed to be INT32, INT64, INT96, or
> FIXED_LEN_BYTE_ARRAY. Are there any plans to support DECIMAL with variable
> length BINARY? Avro definitely supports encoding DECIMAL in variable length
> bytes (see https://avro.apache.org/docs/current/spec.html#Decimal), but this
> support in Parquet is less clear.
> Selecting on a BINARY DECIMAL field in a Parquet file throws an exception as
> shown below (java.lang.ClassCastException:
> org.apache.drill.exec.vector.Decimal28SparseVector cannot be cast to
> org.apache.drill.exec.vector.VariableWidthVector). The successful query at the
> bottom selected on a string field in the same file.
> 0: jdbc:drill:zk=local> select count(*) from
> dfs.`c:/dao/DBArchivePredictor/tenrows.parquet` where acct_no=70000020;
> org.apache.drill.common.exceptions.DrillRuntimeException: Error in parquet
> record reader.
> Message: Failure in setting up reader
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message sbi.acct_mstr {
> required binary ACCT_NO (DECIMAL(20,0));
> optional binary SF_NO (UTF8);
> optional binary LF_NO (UTF8);
> optional binary BRANCH_NO (DECIMAL(20,0));
> optional binary INTRO_CUST_NO (DECIMAL(20,0));
> optional binary INTRO_ACCT_NO (DECIMAL(20,0));
> optional binary INTRO_SIGN (UTF8);
> optional binary TYPE (UTF8);
> optional binary OPR_MODE (UTF8);
> optional binary CUR_ACCT_TYPE (UTF8);
> optional binary TITLE (UTF8);
> optional binary CORP_CUST_NO (DECIMAL(20,0));
> optional binary APLNDT (UTF8);
> optional binary OPNDT (UTF8);
> optional binary VERI_EMP_NO (DECIMAL(20,0));
> optional binary VERI_SIGN (UTF8);
> optional binary MANAGER_SIGN (UTF8);
> optional binary CURBAL (DECIMAL(8,2));
> optional binary STATUS (UTF8);
> }
> , metadata:
> {parquet.avro.schema={"type":"record","name":"acct_mstr","namespace":"sbi","fields":[
> {"name":"ACCT_NO","type":{"type":"bytes","logicalType":"decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_connection":"oracle.jdbc.driver.T4CConnection","cv_currency":true,"cv_def_writable":false,"cv_nullable":0,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":1,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}},
> {"name":"SF_NO","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":10,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":2,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"LF_NO","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":10,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":3,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"BRANCH_NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":4,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},
> {"name":"INTRO_CUST_NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":5,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},
> {"name":"INTRO_ACCT_NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":6,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},
> {"name":"INTRO_SIGN","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":7,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"TYPE","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":2,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":8,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"OPR_MODE","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":2,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":9,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"CUR_ACCT_TYPE","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":4,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":10,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"TITLE","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":30,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":11,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"CORP_CUST_NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":12,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},
> {"name":"APLNDT","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.sql.Timestamp","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":0,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":13,"cv_type":93,"cv_typename":"DATE","cv_writable":true}]},
> {"name":"OPNDT","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.sql.Timestamp","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":0,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":14,"cv_type":93,"cv_typename":"DATE","cv_writable":true}]},
> {"name":"VERI_EMP_NO","type":["null",{"type":"bytes","logicalType":"decimal","precision":20,"scale":0,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_precision":20,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":15,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},
> {"name":"VERI_SIGN","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":16,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"MANAGER_SIGN","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":17,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]},
> {"name":"CURBAL","type":["null",{"type":"bytes","logicalType":"decimal","precision":8,"scale":2,"cv_auto_incr":false,"cv_case_sensitive":false,"cv_column_class":"java.math.BigDecimal","cv_currency":true,"cv_def_writable":false,"cv_nullable":1,"cv_precision":8,"cv_read_only":false,"cv_scale":2,"cv_searchable":true,"cv_signed":true,"cv_subscript":18,"cv_type":2,"cv_typename":"NUMBER","cv_writable":true}]},
> {"name":"STATUS","type":["null",{"type":"string","cv_auto_incr":false,"cv_case_sensitive":true,"cv_column_class":"java.lang.String","cv_currency":false,"cv_def_writable":false,"cv_nullable":1,"cv_precision":1,"cv_read_only":false,"cv_scale":0,"cv_searchable":true,"cv_signed":true,"cv_subscript":19,"cv_type":12,"cv_typename":"VARCHAR2","cv_writable":true}]}
> ]}}}, blocks:
> [BlockMetaData{10, 1281 [
> ColumnMetaData{SNAPPY [ACCT_NO] BINARY [BIT_PACKED, PLAIN], 4},
> ColumnMetaData{SNAPPY [SF_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 88},
> ColumnMetaData{SNAPPY [LF_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 163},
> ColumnMetaData{SNAPPY [BRANCH_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 241},
> ColumnMetaData{SNAPPY [INTRO_CUST_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 298},
> ColumnMetaData{SNAPPY [INTRO_ACCT_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 364},
> ColumnMetaData{SNAPPY [INTRO_SIGN] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 421},
> ColumnMetaData{SNAPPY [TYPE] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 478},
> ColumnMetaData{SNAPPY [OPR_MODE] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 538},
> ColumnMetaData{SNAPPY [CUR_ACCT_TYPE] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 598},
> ColumnMetaData{SNAPPY [TITLE] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 658},
> ColumnMetaData{SNAPPY [CORP_CUST_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 736},
> ColumnMetaData{SNAPPY [APLNDT] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 802},
> ColumnMetaData{SNAPPY [OPNDT] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 919},
> ColumnMetaData{SNAPPY [VERI_EMP_NO] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1036},
> ColumnMetaData{SNAPPY [VERI_SIGN] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1093},
> ColumnMetaData{SNAPPY [MANAGER_SIGN] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1150},
> ColumnMetaData{SNAPPY [CURBAL] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1207},
> ColumnMetaData{SNAPPY [STATUS] BINARY [RLE, BIT_PACKED, PLAIN_DICTIONARY], 1270}]}]}
> at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleAndRaise(ParquetRecordReader.java:346)
> at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:339)
> at org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:101)
> at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:168)
> at org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:56)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:151)
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
> at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
> at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:105)
> at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
> at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassCastException:
> org.apache.drill.exec.vector.Decimal28SparseVector cannot be cast to
> org.apache.drill.exec.vector.VariableWidthVector
> at org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.<init>(VarLengthValuesColumn.java:44)
> at org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$Decimal28Column.<init>(VarLengthColumnReaders.java:52)
> at org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory.getReader(ColumnReaderFactory.java:178)
> at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:319)
> ... 22 more
> Error: SYSTEM ERROR: ClassCastException:
> org.apache.drill.exec.vector.Decimal28SparseVector cannot be cast to
> org.apache.drill.exec.vector.VariableWidthVector
> Fragment 0:0
> [Error Id: 22bfa8dd-1129-4300-9449-409e96d6c800 on
> DaveOshinsky-PC.gp.cv.commvault.com:31010] (state=,code=0)
> 0: jdbc:drill:zk=local> select count(*) from
> dfs.`c:/dao/DBArchivePredictor/tenrows.parquet` where opr_mode='JO';
> +---------+
> | EXPR$0 |
> +---------+
> | 10 |
> +---------+
> 1 row selected (0.406 seconds)
> 0: jdbc:drill:zk=local>
> The immediate cause of this exception is that Drill, in
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader,
> assumes that all BINARY values are encoded in VariableWidthVectors. For
> BINARY DECIMAL, this is not true, as for example Decimal28SparseVector is a
> FixedWidthVector, not a VariableWidthVector. The assumption that DECIMAL is
> not encoded in variable length BINARY is found in a number of other places in
> the Drill code, including:
> - org.apache.drill.exec.store.parquet.columnreaders.ColumnReaderFactory only
>   contains logic to handle DECIMAL with INT32, INT64, INT96, or
>   FIXED_LEN_BYTE_ARRAY. BINARY is not supported with DECIMAL.
> - org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders
>   does not support a nullable reader for BINARY in its
>   getNullableColumnReader method.