GitHub user dragos commented on the pull request:
https://github.com/apache/spark/pull/7600#issuecomment-125741403
> On 28 Jul 2015, at 21:42, Tathagata Das <[email protected]> wrote:
>
> @dragos I didn't delete any comments! The comments may have been on earlier
> threads that are currently hidden due to outdated diffs.
>
Weird. You probably attached them to a commit that went away. They seemed to
refer to code that was no longer there, even though the comments were new.
> I know that these TestSuiteBase methods are the helper methods that I
> provided with Spark Streaming. They were written a long time ago, and since then we
> have learned quite a bit about the shortcomings of those methods (runStreams),
> and about when to use them and when it's cleaner not to use them.
>
How about you deprecate them? It would save time for everyone touching
Spark Streaming.
> Tests have unnecessary delays when designed using runStreams. You can
> see this for yourself. With or without runStreams, you have to use eventually to test
> your condition. So why wait for runStreams to complete (especially when
> running on a real clock) and then test with eventually?
If you run on a real clock, I don't see what you gain in terms of time.
It's not faster, is it?
> And how do you decide how many batches to run in runStreams?
It's quite simple: in my test I have 3 rates I want to observe, so I need
at least 3 batches.
> Is it even important to care about the number of batches to run,
> considering the test really wants to verify some other condition being
> eventually true?
Yes, since I need to make sure I observe the right number of rate updates,
with the right values.
> Also, why go to the complexity of attaching a TestOutputDStream if the
> test does not care about the output?
The 'complexity' we're talking about is 2 lines of code. I'm happy to be
proven wrong, but inlining the useful parts of runStreams will be more than
that.
> It will improve test times as well as simplify the test if in these cases
> you just test with eventually alone, without using abstractions that do not
> really help.
I think they help, but again, happy to be proven wrong.
How about you rewrite the last test and compare? If you're still convinced
it's better I'll rewrite the rest.
> Look at all the newly updated streaming test suites; none of them use
> runStreams, for a reason. I hope this clarifies why I am trying to move the code
> away from runStreams.