The verification job is very sensitive to the number of rounds it takes to shuffle/sort the results. How many reducers have you used, and how much memory have you given them? More is better.
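
A rough way to reason about reducer count is to compare each reducer's run time with the fixed cost of starting its process. The sketch below is only back-of-the-envelope arithmetic, not anything taken from the Accumulo test scripts: the 5 second startup cost, the 10 hours of total reduce work, and both function names are hypothetical values chosen for illustration.

```python
import math


def overhead_fraction(startup_s: float, work_s: float) -> float:
    """Fraction of a task's wall-clock time spent on fixed startup cost."""
    return startup_s / (startup_s + work_s)


def reducers_for_target_runtime(total_work_s: float, target_task_s: float) -> int:
    """Number of reducers so that each one does about target_task_s of work."""
    return max(1, math.ceil(total_work_s / target_task_s))


# Hypothetical numbers, for illustration only.
STARTUP_S = 5.0          # assumed cost of launching one reducer process
TOTAL_WORK_S = 36_000.0  # assumed total reduce work (10 hours of CPU time)

for target_s in (10.0, 180.0):
    count = reducers_for_target_runtime(TOTAL_WORK_S, target_s)
    frac = overhead_fraction(STARTUP_S, target_s)
    print(f"{target_s:>5.0f}s per reducer -> {count} reducers, "
          f"{frac:.1%} of each task is startup overhead")
```

With these assumed inputs, very short reducers spend a large share of their time starting up, while reducers that run for a few minutes keep that share small. Measure the real startup cost and total work on your own cluster before picking a number.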
I think we've clocked the verification job for 24 hours of ingest in under 2 hours. This is from memory, so I could be wrong. But with a bad configuration (only a few small reducers), it can take a very long time.

Go with as many as 100 reducers per node and give the reducers a lot of memory. You want each reducer to run long enough to make the process creation overhead small, so each one should run for a few minutes.

Please post back with any improvements! We are about to enter a testing cycle, so I'll update the example configuration files with some better instructions.

I'm curious, how many key/value entries did you ingest in 24 hours?

-Eric

On Tue, Oct 22, 2013 at 4:56 PM, Billie Rinaldi <[email protected]> wrote:

> I believe it does take a long time to verify. Shorter than, but a similar
> order of magnitude as, the amount of time it took to write the data.
> Others may be able to give you more quantitative information.
>
>
> On Tue, Oct 22, 2013 at 12:56 PM, Ryan Fishel <[email protected]> wrote:
>
>> Hello,
>>
>> I am currently running through the test suites included with the Accumulo
>> package ($ACCUMULO_HOME/test/system) and am running into some rather long
>> verification times with the Continuous Test.
>>
>> I am running the continuous test for a 24 hour period on a 7 node cluster
>> with walkers, batch walkers, and that stats service turned on. All jobs
>> appear to run fine during the whole period. Since the test docs don't give
>> any indication, I was wondering if someone could provide typical run times
>> for the verification job? I'd like to appropriately set my expectations
>> before I start looking for a misconfiguration in the underlying cluster.
>>
>> Thank you!
>> Ryan Fishel
>>
