[ https://issues.apache.org/jira/browse/PARQUET-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206414#comment-17206414 ]

David Mollitor commented on PARQUET-1918:
-----------------------------------------

Unit tests fail with:

 
{code:java}
java.lang.Exception: java.nio.ReadOnlyBufferException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.nio.ReadOnlyBufferException
        at java.nio.ByteBuffer.array(ByteBuffer.java:996)
        at shaded.parquet.org.apache.thrift.protocol.TCompactProtocol.writeBinary(TCompactProtocol.java:375)
        at org.apache.parquet.format.InterningProtocol.writeBinary(InterningProtocol.java:135)
        at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:945)
        at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:820)
        at org.apache.parquet.format.ColumnIndex.write(ColumnIndex.java:728)
        at org.apache.parquet.format.Util.write(Util.java:372)
        at org.apache.parquet.format.Util.writeColumnIndex(Util.java:69)
        at org.apache.parquet.hadoop.ParquetFileWriter.serializeColumnIndexes(ParquetFileWriter.java:1087)
        at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1050)
 {code}
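
For context on the trace above: {{ByteBuffer.array()}} throws {{ReadOnlyBufferException}} whenever the buffer is a read-only view, even if it is heap-backed. The sketch below reproduces that with the JDK alone. It assumes (the trace itself does not confirm this) that the patched writer hands Parquet a read-only buffer, for example one obtained from {{ByteString.asReadOnlyByteBuffer()}}. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Parquet, Thrift, or Protobuf.
class ReadOnlyBufferDemo {

  // Mimics a serializer that reaches straight for the backing array,
  // as the TCompactProtocol.writeBinary frame in the trace does.
  static boolean arrayAccessFails(ByteBuffer buf) {
    try {
      buf.array();
      return false;
    } catch (ReadOnlyBufferException e) {
      return true;
    }
  }

  // Fallback that reads the remaining bytes through get(), which is
  // allowed on read-only buffers. duplicate() keeps the caller's
  // position untouched. Note that this path does copy the bytes.
  static byte[] copyRemaining(ByteBuffer buf) {
    byte[] out = new byte[buf.remaining()];
    buf.duplicate().get(out);
    return out;
  }

  public static void main(String[] args) {
    byte[] payload = "parquet".getBytes(StandardCharsets.UTF_8);
    ByteBuffer writable = ByteBuffer.wrap(payload);
    ByteBuffer readOnly = writable.asReadOnlyBuffer();

    System.out.println("writable.hasArray() = " + writable.hasArray());
    System.out.println("readOnly.hasArray() = " + readOnly.hasArray());
    System.out.println("array() fails on writable:  " + arrayAccessFails(writable));
    System.out.println("array() fails on read-only: " + arrayAccessFails(readOnly));
    System.out.println("bytes via get(): "
        + new String(copyRemaining(readOnly), StandardCharsets.UTF_8));
  }
}
```

{{hasArray()}} returns false for a read-only buffer, so code that wants the backing array can check it before calling {{array()}}. The {{copyRemaining}} fallback only illustrates safe access; because it copies, it does not by itself meet the zero-copy goal of this issue.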

> Avoid Copy of Bytes in Protobuf BinaryWriter
> --------------------------------------------
>
>                 Key: PARQUET-1918
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1918
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>
> {code:java|title=ProtoWriteSupport.java}
>   class BinaryWriter extends FieldWriter {
>     @Override
>     final void writeRawValue(Object value) {
>       ByteString byteString = (ByteString) value;
>       Binary binary = Binary.fromConstantByteArray(byteString.toByteArray());
>       recordConsumer.addBinary(binary);
>     }
>   }
> {code}
> {{toByteArray()}} creates a copy of the buffer.  Parquet and Protobuf both 
> already support passing a ByteBuffer instead, which avoids the copy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
