Hi,

Is it crazy to use a MiniAccumuloCluster to measure the *relative* performance of two different implementations of iterators? Obviously it would be better to do it on a real Accumulo cluster, but that's not possible for several reasons.

The approach would be something like:

- Fire up a Mini cluster
- Bulk import a file
- Start timer
- Set up a BatchScanner with one of the iterator stacks and use it to query for lots of different ranges
- Iterate through the results of this
- Stop timer

Repeat with the other implementation of the iterators.

Of course, the difference in performance may not be measurable if the time is dominated by disk-seek time, but that would still be useful information. And the absolute performance wouldn't be representative of what you'd get on a real cluster, as there's no network latency in these trials, but that's fine, as I'm mainly interested in which of the two implementations of the iterators is more performant.

Similarly, could the same approach be used to compare the performance on SSD vs hard disk?

Thanks,

Dave.
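
For concreteness, the start-timer / scan / stop-timer part of the steps above could be sketched roughly as below. This is only an illustration under assumptions: the class `IteratorBenchmark`, its methods, and `simulatedScan` are hypothetical names, and `simulatedScan` is a stand-in for "set up a BatchScanner with one iterator stack, iterate through the results, and return the entry count". No Accumulo API is called here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical harness: compares two workloads by median wall-clock time.
class IteratorBenchmark {

    // Times a single run of the workload and returns elapsed nanoseconds.
    static long timeOnce(LongSupplier workload) {
        long start = System.nanoTime();
        long entries = workload.getAsLong(); // e.g. number of entries read back
        long elapsed = System.nanoTime() - start;
        if (entries < 0) {
            throw new IllegalStateException("workload reported a negative count");
        }
        return elapsed;
    }

    // Returns the median sample (the upper median when the count is even).
    static long median(List<Long> samples) {
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        return sorted.get(sorted.size() / 2);
    }

    // Runs untimed warm-ups, then alternates timed trials of a and b.
    // Returns {median nanos for a, median nanos for b}.
    static long[] compare(LongSupplier a, LongSupplier b, int warmups, int trials) {
        for (int i = 0; i < warmups; i++) {
            a.getAsLong();
            b.getAsLong();
        }
        List<Long> timesA = new ArrayList<>();
        List<Long> timesB = new ArrayList<>();
        for (int i = 0; i < trials; i++) {
            timesA.add(timeOnce(a));
            timesB.add(timeOnce(b));
        }
        return new long[] { median(timesA), median(timesB) };
    }

    // Placeholder for a real scan: counts the multiples of 7 below n.
    static long simulatedScan(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            if (i % 7 == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        LongSupplier stackA = () -> simulatedScan(200_000); // stand-in for iterator stack A
        LongSupplier stackB = () -> simulatedScan(400_000); // stand-in for iterator stack B
        long[] medians = compare(stackA, stackB, 3, 11);
        System.out.printf("A: %.3f ms, B: %.3f ms, B/A: %.2f%n",
                medians[0] / 1e6, medians[1] / 1e6, (double) medians[1] / medians[0]);
    }
}
```

Alternating the two workloads and reporting a median rather than a single measurement is one way to keep JVM warm-up and caching effects from favouring whichever implementation happens to run second; whether that is enough for a given setup is a judgment call.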
