[ 
https://issues.apache.org/jira/browse/PIG-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112907#comment-16112907
 ] 

Adam Szita commented on PIG-5284:
---------------------------------

I fixed two issues in [^PIG-5284.0.patch], both of them inside 
{{InterRecordReader#skipUntilMarkerOrSplitEndOrEOF}}.

The first is that we were using 0 as the default value of {{int b}}, which is 
a valid first byte for a sync marker. I've now set it to 
{{Integer.MIN_VALUE}}, which can never equal a byte value. Without this change 
the same records may appear multiple times in the output.
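
To illustrate the first point, here is a minimal sketch, not the actual Pig 
code. The class SentinelSketch and the helper looksLikeMarkerStart are 
hypothetical names made up for this example; the only assumption is that the 
loop compares the last value of b against the first marker byte before 
anything has been read.

```java
class SentinelSketch {
    // Hypothetical helper (not part of Pig): does the "last read" value b
    // look like the first byte of the sync marker?
    static boolean looksLikeMarkerStart(int b, byte[] marker) {
        // marker[0] is promoted to int, so it is always within -128..127
        return b == marker[0];
    }

    public static void main(String[] args) {
        // Assumed example marker whose first byte happens to be 0
        byte[] marker = {0, 1, 2};

        // Initial value 0: "nothing read yet" is mistaken for a marker start
        System.out.println(looksLikeMarkerStart(0, marker)); // true

        // Initial value Integer.MIN_VALUE: cannot collide with any byte
        System.out.println(looksLikeMarkerStart(Integer.MIN_VALUE, marker)); // false
    }
}
```

Integer.MIN_VALUE is also distinct from -1, the end-of-stream value returned 
by InputStream#read, so it does not interfere with EOF detection either.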

The other problem was the handling of sync markers at split beginnings: when 
the next sync marker starts exactly at the beginning of the next split, and the 
last byte before it (in the previous data/record) is the same as the first byte 
of the marker, we read past the split end instead of stopping. This also 
results in the same records appearing multiple times in the output.
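
The boundary rule can be sketched on a plain byte array as follows. This is a 
simplified model and not the patched method: SplitBoundarySketch, matchesAt and 
findMarkerInSplit are hypothetical names, and the sketch assumes that a marker 
belongs to the split in which its first byte lies, so a reader must not accept 
a marker that starts at or after its split end.

```java
class SplitBoundarySketch {
    // True if the whole marker is present in data starting at offset pos.
    static boolean matchesAt(byte[] data, int pos, byte[] marker) {
        if (pos + marker.length > data.length) {
            return false;
        }
        for (int i = 0; i < marker.length; i++) {
            if (data[pos + i] != marker[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the offset of the first marker that STARTS in [start, splitEnd),
    // or -1 if there is none. The marker itself may extend past splitEnd.
    static int findMarkerInSplit(byte[] data, int start, int splitEnd, byte[] marker) {
        for (int pos = start; pos < splitEnd; pos++) {
            if (matchesAt(data, pos, marker)) {
                return pos;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] marker = {7, 1, 2};
        // Index 2 holds 7 (same as marker[0]); the real marker starts at index 3.
        byte[] data = {9, 9, 7, 7, 1, 2, 5};
        int splitEnd = 3;

        // Reader of the first split [0, 3): must stop without a marker
        System.out.println(findMarkerInSplit(data, 0, splitEnd, marker)); // -1

        // Reader of the second split [3, 7): owns the marker at offset 3
        System.out.println(findMarkerInSplit(data, splitEnd, data.length, marker)); // 3
    }
}
```

With this rule only the second reader picks up the record that follows the 
marker, so it is emitted once.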

[~rohini], [~nkollar], please let me know what you think.

> Fix flakyness introduced by PIG-3655
> ------------------------------------
>
>                 Key: PIG-5284
>                 URL: https://issues.apache.org/jira/browse/PIG-5284
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>         Attachments: PIG-5284.0.patch
>
>
> It seems like some tests are flaky after PIG-3655.
> A recent error is:
> {code}
> Failed
> org.apache.pig.test.TestBinInterSedes.testSyncMarkerOverlappingMarker
> Failing for the past 1 build (Since Unstable#2523 )
> Took 13 sec.
> Error Message
> Comparing actual and expected results.  expected:<[(apple,1,1), 
> (kiwi,16909095,72624011372134400), (orange,2,2), (orange,4,4)]> but 
> was:<[(apple,1,1), (kiwi,16909095,72624011372134400), (orange,2,2), 
> (orange,4,4), (orange,4,4)]>
> Stacktrace
> junit.framework.AssertionFailedError: Comparing actual and expected results.  
> expected:<[(apple,1,1), (kiwi,16909095,72624011372134400), (orange,2,2), 
> (orange,4,4)]> but was:<[(apple,1,1), (kiwi,16909095,72624011372134400), 
> (orange,2,2), (orange,4,4), (orange,4,4)]>
>       at 
> org.apache.pig.test.Util.checkQueryOutputsAfterSortRecursive(Util.java:1290)
>       at 
> org.apache.pig.test.TestBinInterSedes.testInterStorageSyncMarker(TestBinInterSedes.java:428)
>       at 
> org.apache.pig.test.TestBinInterSedes.testSyncMarkerOverlappingMarker(TestBinInterSedes.java:350)
> {code}
> I've run the tests in TestBinInterSedes a couple of hundred times and 
> have spotted some failures that may come up depending on the randomly 
> generated sync marker and data.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
