[ 
https://issues.apache.org/jira/browse/PARQUET-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021349#comment-17021349
 ] 

Brian Mwambazi commented on PARQUET-1773:
-----------------------------------------

This trace looks somewhat different from the original snippet in the bug 
report, but both errors appear to arise from an _Illegal State_ type of 
error. Since the failure is not deterministic, my guess is that you have a 
race condition in your application, particularly in _ParquetS3Writer_.
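
To illustrate the kind of guard I have in mind, here is a minimal sketch. I 
have not seen the _ParquetS3Writer_ code, so everything here is an assumption: 
_RecordSink_ is a hypothetical stand-in for whatever writer the class wraps, 
and _SerializedSink_ and _SerializedSinkDemo_ are made-up names. The idea is 
only that write and close calls are serialized through one lock, so that two 
threads can never drive the underlying writer's state at the same time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Demo: several threads write through one wrapper to a non-thread-safe sink.
class SerializedSinkDemo {
    static int run(int threads, int perThread) {
        List<Integer> stored = new ArrayList<>(); // deliberately not thread-safe
        RecordSink<Integer> raw = new RecordSink<Integer>() {
            @Override public void write(Integer record) { stored.add(record); }
            @Override public void close() { }
        };
        RecordSink<Integer> sink = new SerializedSink<>(raw);

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    for (int i = 0; i < perThread; i++) {
                        sink.write(i);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // join makes the workers' writes visible here
            }
            sink.close();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return stored.size();
    }

    public static void main(String[] args) {
        System.out.println("records stored: " + run(4, 1000));
    }
}

// Hypothetical stand-in for a writer that is not safe to share across threads.
interface RecordSink<T> {
    void write(T record) throws IOException;
    void close() throws IOException;
}

// Hypothetical wrapper: every call to the delegate happens under one lock.
final class SerializedSink<T> implements RecordSink<T> {
    private final RecordSink<T> delegate;
    private final Object lock = new Object();
    private boolean closed;

    SerializedSink(RecordSink<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(T record) throws IOException {
        synchronized (lock) {
            if (closed) {
                throw new IOException("writer already closed");
            }
            delegate.write(record);
        }
    }

    @Override
    public void close() throws IOException {
        synchronized (lock) {
            if (closed) {
                return; // a second close is a no-op
            }
            closed = true;
            delegate.close();
        }
    }
}
```

If the application already confines each writer to a single thread, a wrapper 
like this adds nothing, and the cause lies elsewhere.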

> Parquet file in invalid state while writing to S3 when calling 
> ParquetWriter.write
> ----------------------------------------------------------------------------------
>
>                 Key: PARQUET-1773
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1773
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.10.0
>            Reporter: Tristan Davolt
>            Priority: Major
>
> This may be related to PARQUET-632. I am also writing parquet to S3, but I am 
> calling ParquetWriter.write directly. I have multiple containerized instances 
> consuming messages from Kafka, converting them to Parquet, and then writing 
> to S3. One instance will begin to throw this exception for all new messages. 
> Sometimes, the container will recover. Other times, it must be restarted 
> manually to recover. I am unable to find any "error thrown previously."
> Exception:
>  java.io.IOException
>  Message:
>  The file being written is in an invalid state. Probably caused by an error 
> thrown previously. Current state: BLOCK
>  Stacktrace:
> {code:java}
> org.apache.parquet.hadoop.ParquetFileWriter$STATE.error(ParquetFileWriter.java:168)
> org.apache.parquet.hadoop.ParquetFileWriter$STATE.startBlock(ParquetFileWriter.java:160)
> org.apache.parquet.hadoop.ParquetFileWriter.startBlock(ParquetFileWriter.java:291)
> org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:171)
> org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114)
> org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:308)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
