Thanks everyone, that should definitely help me out. While I feel silly for ignoring this issue at first, it should be interesting to see how much this influences the results.
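For anyone finding this thread later, here is a rough sketch of how the cache drop Eric suggests could be wrapped up for use between runs. It assumes a Linux node and root access. The function name `drop_os_caches` and the `DROP_CACHES_FILE` override are made up for illustration and are not part of Accumulo. The `sync` beforehand is there because the kernel only drops clean pages, so dirty ones need to be written out first.

```shell
#!/bin/sh
# Sketch only: flush dirty pages, then ask the kernel to drop clean caches.
# DROP_CACHES_FILE is a hypothetical override so the function can be tried
# without root; on a real node it falls back to the /proc path.
drop_os_caches() {
    target="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"
    mode="${1:-1}"   # 1 = page cache, 2 = dentries and inodes, 3 = both
    case "$mode" in
        1|2|3) ;;
        *) echo "mode must be 1, 2, or 3" >&2; return 1 ;;
    esac
    sync                       # write dirty pages out; only clean pages are dropped
    echo "$mode" > "$target"   # needs root when writing to /proc
}
```

Running `drop_os_caches 1` as root on every node between runs matches the command quoted below. Note that this only covers the OS caches. Anything cached inside the tablet server processes would presumably still need the full stop and restart Christopher describes.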

On Sat, Aug 4, 2012 at 7:19 AM, Eric Newton <[email protected]> wrote:

> You can drop the OS caches between runs:
>
> # echo 1 > /proc/sys/vm/drop_caches
>
>
> On Fri, Aug 3, 2012 at 9:41 PM, Christopher Tubbs <[email protected]> wrote:
>
>> Steve-
>>
>> I would probably design the experiment to test different cluster sizes
>> as completely independent. That means, taking the entire thing down
>> and back up again (possibly even rebooting the boxes, and/or
>> re-initializing the cluster at the new size). I'd also do several runs
>> while it is up at a particular cluster size, to capture any
>> performance difference between the first and a later run due to OS or
>> TServer caching, for analysis later.
>>
>> Essentially, when in doubt, take more data...
>>
>> --L
>>
>>
>> On Fri, Aug 3, 2012 at 5:50 PM, Steven Troxell <[email protected]> wrote:
>> > Hi all,
>> >
>> > I am running a benchmarking project on accumulo looking at RDF queries
>> > for clusters with different node sizes. While I intend to look at
>> > caching for optimizing each individual run, I do NOT want caching to
>> > interfere, for example, between runs involving the use of 10 and 8
>> > tablet servers.
>> >
>> > Up to now I'd just been killing nodes via the bin/stop-here.sh script,
>> > but I realize that may have allowed caching from previous runs with
>> > different node sizes to influence my results. It seemed weird to me,
>> > for example, when I realized dropping nodes actually increased
>> > performance (as measured by query return times) in some cases (though
>> > I acknowledge the code I'm working with has some serious issues with
>> > how ineffectively it is actually utilizing accumulo, but that's an
>> > issue I intend to address later).
>> >
>> > I suppose one way would be, between a change of node sizes, to stop
>> > and restart ALL nodes (as opposed to what I'd been doing in just
>> > killing 2 nodes, for example, in transitioning from a 10 to 8 node
>> > test). Will this be sure to clear the influence of caching across
>> > runs, and is there any cleaner way to do this?
>> >
>> > thanks,
>> > Steve
