Re: IO IT Patterns: Simplifying data loading

Etienne Chauchot Tue, 28 Mar 2017 03:00:48 -0700

Hi Stephen,

I have some comments below:



On 24/03/2017 at 00:26, Stephen Sisk wrote:

hi!

I just opened a jira ticket that I wanted to make sure the mailing list got
a chance to see.

The problem is that the current design pattern for doing data loading in IO
ITs (either writing a small program or using an external tool) is complex,
inefficient and requires extra steps like installing external
tools/probably using a VM. It also really doesn't scale well to the larger
data sizes we'd like to use for performance benchmarking.

My proposal is that instead of trying to test read and write separately,
the test should be a "write, then read back what you just wrote", all using
the IO being tested.

Sure, joining the read and write tests will let us write less often and thus be more efficient. Indeed, instead of writing once for all the read test runs and writing again at each write test run, we will only write once per read+write test run. We will also avoid needing a separate place to write to.

To support scenarios like "I want to run my read test
repeatedly without re-writing the data", tests would add flags for
"skipCleanUp" and "useExistingData".

But this makes an assumption about the order of test runs: the write test needs to have run before the read test can happen. Isn't it a little dangerous to make this assumption?
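
To make the ordering concern concrete, here is a minimal sketch of a "write, then read back" test driven by the two proposed flags. Only the flag names "skipCleanUp" and "useExistingData" come from the proposal; their exact semantics, the in-memory FakeStore standing in for the real data store, and the function name run_write_then_read are all assumptions made for illustration, not the actual IT code.

```python
from dataclasses import dataclass, field


@dataclass
class Options:
    # Flag names come from the proposal; the semantics below are assumed.
    skip_clean_up: bool = False
    use_existing_data: bool = False


@dataclass
class FakeStore:
    """Hypothetical in-memory stand-in for the store behind the IO under test."""
    rows: list = field(default_factory=list)
    write_count: int = 0

    def write(self, rows):
        self.rows = list(rows)
        self.write_count += 1

    def read(self):
        return list(self.rows)

    def clear(self):
        self.rows = []


def run_write_then_read(store, expected, options):
    """Write the data (unless reusing it), read it back, then clean up."""
    if options.use_existing_data:
        # Make the ordering assumption explicit: fail loudly if no
        # earlier run left data behind, instead of reading nothing.
        if not store.rows:
            raise RuntimeError("useExistingData is set but the store is empty")
    else:
        store.write(expected)
    try:
        actual = store.read()
        assert sorted(actual) == sorted(expected), "read back different data"
    finally:
        # Leave the data in place only when a later run wants to reuse it.
        if not options.skip_clean_up:
            store.clear()
    return actual
```

In this sketch a first run with skip_clean_up=True leaves the data in place, a later run with use_existing_data=True reads it without writing again, and a run that sets use_existing_data against an empty store fails immediately rather than silently passing or failing for an unrelated reason.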


I think we've all likely seen this type of solution when testing storage
layers in the past, and I've previously shied away from it in this context,
but I think now that I've seen some real ITs and thought about scaling
them, in this case it's the right solution.

Please take a look at the jira if you have questions - there's a lot more
detail there.

S

Etienne
