thanks, appreciated :)

On Thu, Mar 23, 2017 at 4:59 PM Ted Yu <yuzhih...@gmail.com> wrote:
> Looks like you forgot to include JIRA number: BEAM-1799
>
> Cheers
>
> On Thu, Mar 23, 2017 at 4:26 PM, Stephen Sisk <s...@google.com.invalid>
> wrote:
>
> > hi!
> >
> > I just opened a jira ticket that I wanted to make sure the mailing list
> > got a chance to see.
> >
> > The problem is that the current design pattern for doing data loading in
> > IO ITs (either writing a small program or using an external tool) is
> > complex, inefficient and requires extra steps like installing external
> > tools/probably using a VM. It also really doesn't scale well to the
> > larger data sizes we'd like to use for performance benchmarking.
> >
> > My proposal is that instead of trying to test read and write separately,
> > the test should be a "write, then read back what you just wrote", all
> > using the IO being tested. To support scenarios like "I want to run my
> > read test repeatedly without re-writing the data", tests would add flags
> > for "skipCleanUp" and "useExistingData".
> >
> > I think we've all likely seen this type of solution when testing storage
> > layers in the past, and I've previously shied away from it in this
> > context, but I think now that I've seen some real ITs and thought about
> > scaling them, in this case it's the right solution.
> >
> > Please take a look at the jira if you have questions - there's a lot more
> > detail there.
> >
> > S
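
For anyone skimming the thread, here is a rough sketch of the "write, then read back what you just wrote" shape from the quoted proposal. It is only an illustration under assumptions: the flag names skipCleanUp and useExistingData come from the proposal, but InMemoryStore, run_write_then_read_test and the plain-Python form are hypothetical stand-ins, not Beam APIs or the design in BEAM-1799.

```python
class InMemoryStore:
    """Hypothetical stand-in for the data store behind the IO being tested."""

    def __init__(self):
        self.rows = []

    def write(self, rows):
        self.rows.extend(rows)

    def read(self):
        return list(self.rows)

    def clear(self):
        self.rows = []


def run_write_then_read_test(store, expected_rows,
                             skip_clean_up=False, use_existing_data=False):
    """Write the test data through the IO, read it back, and compare.

    skip_clean_up and use_existing_data mirror the proposed
    "skipCleanUp" and "useExistingData" flags.
    """
    try:
        if not use_existing_data:
            # Normal path: the test loads its own data with the IO under test.
            store.write(expected_rows)
        actual = store.read()
        # Order is not assumed to be preserved, so compare sorted copies.
        return sorted(actual) == sorted(expected_rows)
    finally:
        if not skip_clean_up:
            # Leave the store empty unless the caller wants to reuse the data.
            store.clear()
```

A first run with skip_clean_up=True leaves the data in place, so later runs can pass use_existing_data=True and exercise only the read path without re-writing.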