[ https://issues.apache.org/jira/browse/PARQUET-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339338#comment-14339338 ]
Ryan Blue commented on PARQUET-180:
-----------------------------------
I don't think the one-time reflection strategy works. I get compile errors when
I build against Thrift 0.9.2. Is it safe to build against 0.9.0 or 0.7.0 and
assume the result works with 0.9.2?
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project parquet-thrift: Compilation failure
[ERROR] /home/blue/workspace/parquet-mr/parquet-thrift/src/main/java/parquet/hadoop/thrift/ThriftBytesWriteSupport.java:[136,34] cannot find symbol
[ERROR] symbol: method setReadLength(int)
[ERROR] location: class org.apache.thrift.protocol.TBinaryProtocol
{code}
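For reference, a call made purely through reflection has no compile-time dependency on the method, so it builds whether or not the class declares {{setReadLength}}. The sketch below is only an illustration under stated assumptions: {{ReadLengthGuard}}, {{WithLimit}}, {{WithoutLimit}} and {{trySetReadLength}} are hypothetical names, and the two nested classes stand in for a protocol class with and without the method. It is not the parquet-thrift code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ReadLengthGuard {

    // Hypothetical stand-in for a protocol class that still has setReadLength.
    public static class WithLimit {
        public int readLength = -1;

        public void setReadLength(int length) {
            this.readLength = length;
        }
    }

    // Hypothetical stand-in for a protocol class where the method was removed.
    public static class WithoutLimit {
    }

    /**
     * Calls setReadLength(int) on the given object if its class declares it.
     * Returns true when the limit was applied and false when the method is absent.
     */
    public static boolean trySetReadLength(Object protocol, int length) {
        try {
            // Looked up by name, so nothing here refers to the method at compile time.
            Method setter = protocol.getClass().getMethod("setReadLength", int.class);
            setter.invoke(protocol, length);
            return true;
        } catch (NoSuchMethodException e) {
            // Method not present on this class: skip the defensive limit.
            return false;
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("setReadLength exists but could not be called", e);
        }
    }

    public static void main(String[] args) {
        WithLimit older = new WithLimit();
        System.out.println(trySetReadLength(older, 1024));              // prints "true"
        System.out.println(trySetReadLength(new WithoutLimit(), 1024)); // prints "false"
    }
}
```

A one-time variant would cache the {{Method}} lookup (or its absence) per class instead of repeating it for every record.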
> Parquet-thrift compile issue with 0.9.2.
> ----------------------------------------
>
> Key: PARQUET-180
> URL: https://issues.apache.org/jira/browse/PARQUET-180
> Project: Parquet
> Issue Type: Bug
> Reporter: Ryan Blue
>
> Thrift 0.9.2 removed
> [{{setReadLength}}|https://github.com/apache/thrift/commit/2ca9c2028593782621c8876817d8772aa5f46ac7].
> This causes parquet-thrift to fail to compile because it calls the method on
> TBinaryProtocol. The call is defensive: a size is read from the data and then
> that many bytes are read, so setting a maximum read length causes an
> exception rather than a strange failure later on. The code also has a comment
> that says it is okay when the method can't be used.
> {code}
> /* Reduce the chance of OOM when data is corrupted. When readBinary is
>    called on TBinaryProtocol, it reads the length of the binary first,
>    so if the data is corrupted, it could read a big integer as the length
>    of the binary and therefore cause an OOM.
>    Currently this fix only applies to TBinaryProtocol, which has
>    setReadLength defined.
>  */
> if (protocol instanceof TBinaryProtocol) {
>   ((TBinaryProtocol) protocol).setReadLength(record.getLength());
> }
> {code}
> I think the fix is to remove the section above.