Thanks everyone for the responses. I put up a WIP PR [1] that removes OnCreateMatcher and OnSuccessMatcher.
[1] https://github.com/apache/beam/pull/10056 On Fri, Nov 8, 2019 at 9:48 AM Luke Cwik <lc...@google.com> wrote: > It can be marked as deprecated and we can remove its usage everywhere but > leave this interface and mark it for removal at some future time. > > On Thu, Nov 7, 2019 at 2:23 PM Ismaël Mejía <ieme...@gmail.com> wrote: > >> Thanks for bringing this to the ML Brian >> >> +1 For full TestPipelineOptions deprecation. Even worth to remove it, >> bad part is that this class resides in 'sdks/core/main/java' and not >> in testing as I imagined so this could count as a 'breaking' change. >> >> On Thu, Nov 7, 2019 at 8:27 PM Luke Cwik <lc...@google.com> wrote: >> > >> > There was issue with asynchrony of p.run(), some runners blocked till >> the pipeline was complete with p.run() which was never meant to be the >> intent. >> > >> > The test timeout one makes sense to be able to configure it per runner >> (since Dataflow takes a lot longer than other runners) but we may be able >> to configure a Junit test timeout attribute instead. >> > >> > I would be for getting rid of them. >> > >> > >> > On Wed, Nov 6, 2019 at 3:36 PM Robert Bradshaw <rober...@google.com> >> wrote: >> >> >> >> +1 to all of these are probably obsolete at this point and would be >> >> nice to remove. >> >> >> >> >> >> On Wed, Nov 6, 2019 at 3:00 PM Kenneth Knowles <k...@apache.org> >> wrote: >> >> > >> >> > Good find. I think TestPipelineOptions is from very early days. It >> makes sense to me that these are all obsolete. Some guesses, though I >> haven't dug through commit history to confirm: >> >> > >> >> > - TempRoot: a while ago TempLocation was optional, so I think this >> would provide a default for things like gcpTempLocation and stagingLocation >> >> > - OnSuccessMatcher: for runners where pipeline used to not >> terminate in streaming mode. Now I think every runner can successfully >> waitUntilFinish. Also the current API for waitUntilFinish went through some >> evolutions around asynchrony so it wasn't always a good choice. >> >> > - OnCreateMatcher: just for symmetry? I don't know >> >> > - TestTimeoutSeconds: probably also for the >> asychrony/waitUntilfinish issue >> >> > >> >> > Kenn >> >> > >> >> > On Wed, Nov 6, 2019 at 12:19 PM Brian Hulette <bhule...@google.com> >> wrote: >> >> >> >> >> >> I recently came across TestPipelineOptions, and now I'm wondering >> if maybe it should be deprecated. It only seems to actually be supported >> for Spark and Dataflow (via TestSparkRunner and TestDataflowRunner), and I >> think it may make more sense to move the functionality it provides into the >> tests that need it. >> >> >> >> >> >> TestPipelineOptions currently has four attributes: >> >> >> >> >> >> # TempRoot >> >> >> It's purpose isn't documented, but many tests read TempRoot and use >> it to set a TempLocation (example). I think this attribute makes sense >> (e.g. we can set TempRoot once and each test has its own subdirectory), but >> I'm not sure. Can anyone confirm the motivation for it? I'd like to at >> least add a docstring for it. >> >> >> >> >> >> # OnCreateMatcher >> >> >> A way to register a matcher that will be checked right after a >> pipeline has started. It's never set except for in TestDataflowRunnerTest, >> so I think this is absolutely safe to remove. >> >> >> >> >> >> # OnSuccessMatcher >> >> >> A way to register a matcher that will be checked right after a >> pipeline has successfully completed. This is used in several tests >> (RequiresStableInputIT, WordCountIT, ... 8 total occurrences), but I don't >> see why they couldn't all be replaced with a `p.run().waitUntilFinish()`, >> followed by an assert. >> >> >> >> >> >> I think the current approach is actually dangerous, because running >> these tests with runners other than TestDataflowRunner or TestSparkRunner >> means the matchers are never actually checked. This is actually how I came >> across TestPipelineOptions - I tried running a test with the DirectRunner >> and couldn't make it fail. >> >> >> >> >> >> # TestTimeoutSeconds >> >> >> Seems to just be a wrapper for `waitUntilFinish(duration)`, and >> only used in one place. I think it would be cleaner for the test to be >> responsible for calling waitUntilFinish (which we do elsewhere), the only >> drawback is it requires a small refactor so the test has access to the >> PipelineResult object. >> >> >> >> >> >> >> >> >> So I have a couple of questions for the community >> >> >> 1) Are there thoughts on TempRoot? Can we get rid of it? >> >> >> 2) Are there any objections to removing the other three attributes? >> Am I missing something? Unless there are any objections I think I'll write >> a patch to remove them. >> >> >> >> >> >> Thanks, >> >> >> Brian >> >