[jira] [Commented] (AVRO-1419) java.io.IOException: Invalid sync! throw after random number of sync() calls.
[ https://issues.apache.org/jira/browse/AVRO-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542819#comment-17542819 ]

Abhishek MJ commented on AVRO-1419:
-----------------------------------

What is the update on this? I am still seeing the issue with Avro 1.8.2. Is it resolved in later versions? Which version should I use?

> java.io.IOException: Invalid sync! throw after random number of sync() calls.
> -----------------------------------------------------------------------------
>
>                 Key: AVRO-1419
>                 URL: https://issues.apache.org/jira/browse/AVRO-1419
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.5
>            Reporter: Deepak Kumar V
>            Priority: Major
>
> I have a 340 MB Avro data file that contains records sorted and identified by
> unique id (duplicate records exist). At the beginning of every unique record
> a synchronization point is created with DataFileWriter.sync(). (I cannot, or
> do not want to, save the sync points, and I do not want to use
> SortedKeyValueFile as the output format for the M/R job.)
> There are at least 25k synchronization points in the 340 MB file.
> Ex:
> Marker1_RecordA1_RecordA2_RecordA3_Marker2_RecordB1_RecordB2
> As the records are sorted and marked, a binary search is performed for
> efficient retrieval. Most of the time the search succeeds, but at times the
> code throws the following exception:
> --
> org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync! at
> org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:210)
> --
> I note the position that was passed to fileReader.sync(mid), catch the
> AvroRuntimeException, then close and reopen the file and call sync(mid)
> again; this time I do not see the exception.
> Why would Avro throw the exception the first time and not after reopening?
> Version 1.7.5 of the library is throwing this error. Raising a major defect;
> adjust the priority at your convenience.

--
This message was sent by Atlassian Jira (v8.20.7#820007)
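The access pattern in the report (seek to a byte offset, skip forward to the next sync marker, read the key found there, then halve the search window) can be sketched without Avro. The following is only a simplified, stdlib-only model and not Avro's implementation or the reporter's code: the class `SyncSearchSketch`, its toy file layout (a 16-byte marker, a 4-byte key, zero padding) and the fixed marker bytes are all assumptions made for illustration. A real Avro file uses a randomly generated 16-byte marker and block-encoded records, and the real calls would be `DataFileWriter.sync()` and `DataFileReader.sync(long)`.

```java
import java.nio.ByteBuffer;

// Toy model of "binary search over sync points"; not Avro code.
class SyncSearchSketch {
    static final int MARKER_LEN = 16; // Avro sync markers are 16 bytes long
    static final int KEY_LEN = 4;     // hypothetical fixed-width key

    // Fixed marker bytes 1..16 so the sketch is repeatable (Avro generates random ones).
    static byte[] fixedMarker() {
        byte[] marker = new byte[MARKER_LEN];
        for (int i = 0; i < MARKER_LEN; i++) {
            marker[i] = (byte) (i + 1);
        }
        return marker;
    }

    // Builds the toy "file": for every key, marker + big-endian key + zero padding.
    static byte[] buildFile(byte[] marker, int[] sortedKeys, int payloadLen) {
        int groupLen = MARKER_LEN + KEY_LEN + payloadLen;
        ByteBuffer buf = ByteBuffer.allocate(sortedKeys.length * groupLen);
        for (int key : sortedKeys) {
            buf.put(marker);
            buf.putInt(key);
            buf.put(new byte[payloadLen]);
        }
        return buf.array();
    }

    static boolean markerAt(byte[] file, byte[] marker, int pos) {
        for (int i = 0; i < MARKER_LEN; i++) {
            if (file[pos + i] != marker[i]) {
                return false;
            }
        }
        return true;
    }

    // Scans forward from pos and returns the offset just past the first marker
    // that starts at or after pos, or -1 if there is none.
    static int sync(byte[] file, byte[] marker, int pos) {
        for (int i = pos; i + MARKER_LEN <= file.length; i++) {
            if (markerAt(file, marker, i)) {
                return i + MARKER_LEN;
            }
        }
        return -1;
    }

    // Binary search over byte offsets. Returns the offset of the key field of
    // the group whose key equals target, or -1 if no such group exists.
    static int find(byte[] file, byte[] marker, int target) {
        int lo = 0;
        int hi = file.length; // a matching group, if any, starts in [lo, hi)
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            int keyPos = sync(file, marker, mid);
            if (keyPos < 0 || keyPos - MARKER_LEN >= hi) {
                hi = mid; // no group starts in [mid, hi)
                continue;
            }
            int key = ByteBuffer.wrap(file, keyPos, KEY_LEN).getInt();
            if (key == target) {
                return keyPos;
            } else if (key < target) {
                lo = keyPos; // the match must start after this marker
            } else {
                hi = mid;    // the match must start before mid
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] marker = fixedMarker();
        byte[] file = buildFile(marker, new int[] {10, 20, 30, 40, 50, 60, 70}, 37);
        System.out.println("key 50 found at offset " + find(file, marker, 50));
        System.out.println("key 35 found at offset " + find(file, marker, 35));
    }
}
```

In this model every probe is a pure function of the byte array, so repeating a probe always gives the same answer. The report describes the opposite with a real reader: the same `sync(mid)` position fails once and succeeds after the file is reopened.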
[jira] [Commented] (AVRO-1419) java.io.IOException: Invalid sync! throw after random number of sync() calls.
[ https://issues.apache.org/jira/browse/AVRO-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794718#comment-16794718 ]

Chandan Pasunoori commented on AVRO-1419:
-----------------------------------------

org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync! at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:210)

The issue is still occurring in 'org.apache.avro:avro:1.8.2'.
[jira] [Commented] (AVRO-1419) java.io.IOException: Invalid sync! throw after random number of sync() calls.
[ https://issues.apache.org/jira/browse/AVRO-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15503556#comment-15503556 ]

Myles Baker commented on AVRO-1419:
-----------------------------------

I've seen this for Spark jobs using YARN as the resource manager. The job aborts due to a stage failure caused by org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
[jira] [Commented] (AVRO-1419) java.io.IOException: Invalid sync! throw after random number of sync() calls.
[ https://issues.apache.org/jira/browse/AVRO-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871285#comment-13871285 ]

Russell Jurney commented on AVRO-1419:
--------------------------------------

I've seen this behavior too, in Pig's AvroStorage and piggybank's AvroStorage.
[jira] [Commented] (AVRO-1419) java.io.IOException: Invalid sync! throw after random number of sync() calls.
[ https://issues.apache.org/jira/browse/AVRO-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863948#comment-13863948 ]

Deepak Kumar V commented on AVRO-1419:
--------------------------------------

Doug,
Do you want me to provide a JUnit/TestNG test case, or the Java code that threw the exception?
[jira] [Commented] (AVRO-1419) java.io.IOException: Invalid sync! throw after random number of sync() calls.
[ https://issues.apache.org/jira/browse/AVRO-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862096#comment-13862096 ]

Doug Cutting commented on AVRO-1419:
------------------------------------

Can you provide a test case? Thanks!