Re: IO IT Patterns: Simplifying data loading

Ted Yu Thu, 23 Mar 2017 17:00:07 -0700

Looks like you forgot to include the JIRA number: BEAM-1799

Cheers


On Thu, Mar 23, 2017 at 4:26 PM, Stephen Sisk <[email protected]>
wrote:

> hi!
>
> I just opened a jira ticket that I wanted to make sure the mailing list got
> a chance to see.
>
> The problem is that the current design pattern for loading data in IO
> ITs (either writing a small program or using an external tool) is complex
> and inefficient, and requires extra steps like installing external tools,
> probably on a VM. It also doesn't scale well to the larger data sizes
> we'd like to use for performance benchmarking.
>
> My proposal is that instead of trying to test read and write separately,
> the test should be a "write, then read back what you just wrote", all using
> the IO being tested. To support scenarios like "I want to run my read test
> repeatedly without re-writing the data", tests would add flags for
> "skipCleanUp" and "useExistingData".
>
> I think we've all likely seen this type of solution when testing storage
> layers in the past. I've previously shied away from it in this context,
> but now that I've seen some real ITs and thought about scaling them, I
> think it's the right solution here.
>
> Please take a look at the jira if you have questions - there's a lot more
> detail there.
>
> S
>
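
For illustration, here is a minimal sketch of the "write, then read back what
you just wrote" pattern quoted above, with the two proposed flags. It is not
Beam code: InMemoryStore and run_write_then_read_test are hypothetical names
standing in for the IO under test and the test body, and the snake_case
parameters skip_clean_up and use_existing_data are assumed spellings of the
"skipCleanUp" and "useExistingData" flags.

```python
class InMemoryStore:
    """Hypothetical stand-in for the data store behind the IO under test."""

    def __init__(self):
        self.rows = []

    def write(self, rows):
        self.rows.extend(rows)

    def read(self):
        return list(self.rows)

    def clear(self):
        self.rows.clear()


def run_write_then_read_test(store, expected_rows,
                             use_existing_data=False, skip_clean_up=False):
    """Write expected_rows through the IO, read them back, and compare."""
    try:
        # Skip the write phase when re-running the read test on existing data.
        if not use_existing_data:
            store.write(expected_rows)
        actual = store.read()
        # Compare ignoring order, since many stores do not preserve it.
        return sorted(actual) == sorted(expected_rows)
    finally:
        # Leave the data in place when a later run wants to reuse it.
        if not skip_clean_up:
            store.clear()


if __name__ == "__main__":
    store = InMemoryStore()
    rows = [3, 1, 2]
    # First run writes the data and keeps it around.
    print(run_write_then_read_test(store, rows, skip_clean_up=True))
    # Second run only reads the existing data, then cleans up.
    print(run_write_then_read_test(store, rows, use_existing_data=True))
```

Running the first call with skip_clean_up and the second with
use_existing_data covers the "run my read test repeatedly without re-writing
the data" scenario: only the first run pays the cost of the write.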
