I'll work on getting a PR together this morning, probably following
Eugene's suggestion #1.

On Tue, Jan 22, 2019 at 8:34 PM Udi Meiri <eh...@google.com> wrote:

> Alex, the only way to implement my suggestion #1 (that I know of) would be
> to write to a file and read it back.
> I don't have good example for #2.
>
> Eugene's suggestion no. 1 seems like a good idea. There are some example
> <https://github.com/apache/beam/blob/324a1bcc820945731ccce7dd7e5354247b841356/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIOWriteTest.java#L335-L340>
> in the codebase.
>
> On Tue, Jan 22, 2019 at 5:16 PM Eugene Kirpichov <kirpic...@google.com>
> wrote:
>
>> Yeah the "List<MatchResult.Metadata> expected" is constructed
>> from Files.getLastModifiedTime() calls before the files are actually
>> modified, the code is basically unconditionally broken rather than merely
>> flaky.
>>
>> There's several easy options:
>> 1) Use PAssert.that().satisfies() instead of .contains(), and use
>> assertThat().contains() inside that, with the list constructed at time the
>> assertion is applied rather than declared.
>> 2) Implement a Matcher<Metadata> that ignores last modified time and use
>> that
>>
>> Jeff - your option #3 is unfortunately also race-prone, because the code
>> may match the files after they have been written but before
>> setLastModifiedTime was called.
>>
>> On Tue, Jan 22, 2019 at 5:08 PM Jeff Klukas <j...@klukas.net> wrote:
>>
>>> Another option:
>>>
>>> #3 Have the writer thread call Files.setLastModifiedTime explicitly
>>> after each File.write. Then the lastModifiedMillis can be a stable value
>>> for each file and we can use those same static values in our expected
>>> result. I think that would also eliminate the race condition.
>>>
>>> On Tue, Jan 22, 2019 at 7:48 PM Alex Amato <ajam...@google.com> wrote:
>>>
>>>> Thanks Udi, is there a good example for either of these?
>>>> #1 - seems like you have to rewrite your assertion logic without the
>>>> PAssert? Is there some way to capture the pipeline output and iterate over
>>>> it? The pattern I have seen for this in the past also has thread safety
>>>> issues (Using a DoFn at the end of the pipeline to add the output to a
>>>> collection is not safe since the collection can be executed concurrently)
>>>> #2 - Would BigqueryMatcher be a good example for this? which is used in
>>>> BigQueryTornadoesIT.java Or is there another example you would suggest
>>>> looking at for reference?
>>>>
>>>>    - I guess to this you need to implement the SerializableMatcher
>>>>    interface and use the matcher as an option in the pipeline options.
>>>>
>>>>
>>>> On Tue, Jan 22, 2019 at 4:28 PM Udi Meiri <eh...@google.com> wrote:
>>>>
>>>>> Some options:
>>>>> - You could wait to assert until after p.waitForFinish().
>>>>> - You could PAssert using SerializableMatcher and allow any
>>>>> lastModifiedTime.
>>>>>
>>>>> On Tue, Jan 22, 2019 at 3:56 PM Alex Amato <ajam...@google.com> wrote:
>>>>>
>>>>>> +Jeff, Eugene,
>>>>>>
>>>>>> Hi Jeff and Eugene,
>>>>>>
>>>>>> I've noticed that Jeff's PR
>>>>>> <https://github.com/apache/beam/commit/410d6c7b5f933dcb0280894553c1e576ee4e4884>
>>>>>>  introduced
>>>>>> a race condition in this test, but its not clear exactly how to add 
>>>>>> Jeff's
>>>>>> test check in a thread safe way. I believe this to be the source of the
>>>>>> flakeyness Do you have any suggestions Eugene (since you authored this
>>>>>> test)?
>>>>>>
>>>>>> I added some details to this JIRA issue explaining in full
>>>>>> https://jira.apache.org/jira/browse/BEAM-6491?filter=-2
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 22, 2019 at 3:34 PM Alex Amato <ajam...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I've seen this fail in a few different PRs for different
>>>>>>> contributors, and its causing some issues during the presubmit process..
>>>>>>> This is a multithreadred test with a lot of sleeps, so it looks a bit
>>>>>>> suspicious as the source of the problem.
>>>>>>>
>>>>>>> https://builds.apache.org/job/beam_PreCommit_Java_Commit/3688/testReport/org.apache.beam.sdk.io/FileIOTest/testMatchWatchForNewFiles/
>>>>>>>
>>>>>>> I filed a JIRA for this issue:
>>>>>>> https://jira.apache.org/jira/browse/BEAM-6491?filter=-2
>>>>>>>
>>>>>>>
>>>>>>>

Reply via email to