Calling BatchScanner.iterator() is what starts the work on the server side. You should do this first for all 6 batch scanners, then iterate over all of them in parallel.
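
A minimal sketch of that ordering, assuming Scala with the standard-library Futures already used in this thread. `ParallelDrain` and `drainAll` are made-up names for illustration, not part of Accumulo. The helper accepts any java.lang.Iterable; a BatchScanner is an Iterable of Entry<Key, Value>, so the six scanners should be usable as the sources (the element type may need to be spelled out explicitly at the call site):

```scala
import java.lang.{Iterable => JIterable}
import scala.collection.JavaConverters._
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper, not an Accumulo API.
object ParallelDrain {
  def drainAll[T](sources: Seq[JIterable[T]], timeout: Duration = 5.minutes)
                 (implicit ec: ExecutionContext): Seq[List[T]] = {
    // Step 1: call iterator() on every source first, on the calling thread.
    // toList forces this eagerly even if `sources` is a lazy collection.
    val iterators = sources.map(_.iterator()).toList

    // Step 2: only now drain the iterators, each inside its own Future.
    val futures = iterators.map(it => Future(it.asScala.toList))

    // Block until all results are in; result order matches `sources`.
    Await.result(Future.sequence(futures), timeout)
  }
}
```

With an implicit ExecutionContext in scope (for example `scala.concurrent.ExecutionContext.Implicits.global`, or a dedicated pool with at least as many threads as scanners), the six BatchScanners from the quoted message would be collected into a Seq, passed to `drainAll`, and closed afterwards. Whether the per-scanner times actually drop still depends on what the tablet servers do with the concurrent requests.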

----- Original Message -----
From: "Sven Hodapp" <[email protected]>
To: "user" <[email protected]>
Sent: Thursday, August 25, 2016 4:53:41 AM
Subject: Re: Accumulo Seek performance

Hi,

I've changed the code a little bit, so that it uses a thread pool (via the Future):

val ranges500 = ranges.asScala.grouped(500) // this means 6 BatchScanners will be created

for (ranges <- ranges500) {
  val bscan = instance.createBatchScanner(ARTIFACTS, auths, 2)
  bscan.setRanges(ranges.asJava)
  Future {
    time("mult-scanner") {
      bscan.asScala.toList // toList forces the iteration of the iterator
    }
  }
}

Here are the results:

background log: info: mult-scanner time: 4807.289358 ms
background log: info: mult-scanner time: 4930.996522 ms
background log: info: mult-scanner time: 9510.010808 ms
background log: info: mult-scanner time: 11394.152391 ms
background log: info: mult-scanner time: 13297.247295 ms
background log: info: mult-scanner time: 14032.704837 ms
background log: info: single-scanner time: 15322.624393 ms

Every Future completes independently, but in return every batch scanner iterator needs more time to complete. :(
Does this mean the batch scanners aren't really processed in parallel on the server side?
Should I reconfigure something? Maybe the tablet servers haven't allocated (or can't allocate) enough threads or memory? (Each of the two nodes has 8 cores, 64GB of memory and storage with ~300MB/s...)

Regards,
Sven

--
Sven Hodapp, M.Sc.,
Fraunhofer Institute for Algorithms and Scientific Computing SCAI,
Department of Bioinformatics
Schloss Birlinghoven, 53754 Sankt Augustin, Germany
[email protected]
www.scai.fraunhofer.de

----- Original Message -----
> From: "Josh Elser" <[email protected]>
> To: "user" <[email protected]>
> Sent: Wednesday, August 24, 2016 18:36:42
> Subject: Re: Accumulo Seek performance

> Ahh duh. Bad advice from me in the first place :)
>
> Throw 'em in a threadpool locally.
>
> [email protected] wrote:
>> Doesn't this use the 6 batch scanners serially?
>>
>> ------------------------------------------------------------------------
>> *From: *"Sven Hodapp" <[email protected]>
>> *To: *"user" <[email protected]>
>> *Sent: *Wednesday, August 24, 2016 11:56:14 AM
>> *Subject: *Re: Accumulo Seek performance
>>
>> Hi Josh,
>>
>> thanks for your reply!
>>
>> I've tested your suggestion with an implementation like this:
>>
>> val ranges500 = ranges.asScala.grouped(500) // this means 6 BatchScanners will be created
>>
>> time("mult-scanner") {
>>   for (ranges <- ranges500) {
>>     val bscan = instance.createBatchScanner(ARTIFACTS, auths, 1)
>>     bscan.setRanges(ranges.asJava)
>>     for (entry <- bscan.asScala) yield {
>>       entry.getKey()
>>     }
>>   }
>> }
>>
>> And the result is a bit disappointing:
>>
>> background log: info: mult-scanner time: 18064.969281 ms
>> background log: info: single-scanner time: 6527.482383 ms
>>
>> Am I doing something wrong here?
>>
>>
>> Regards,
>> Sven
>>
>> --
>> Sven Hodapp, M.Sc.,
>> Fraunhofer Institute for Algorithms and Scientific Computing SCAI,
>> Department of Bioinformatics
>> Schloss Birlinghoven, 53754 Sankt Augustin, Germany
>> [email protected]
>> www.scai.fraunhofer.de
>>
>> ----- Original Message -----
>> > From: "Josh Elser" <[email protected]>
>> > To: "user" <[email protected]>
>> > Sent: Wednesday, August 24, 2016 16:33:37
>> > Subject: Re: Accumulo Seek performance
>>
>> > This reminded me of https://issues.apache.org/jira/browse/ACCUMULO-3710
>> >
>> > I don't feel like 3000 ranges is too many, but this isn't quantitative.
>> >
>> > IIRC, the BatchScanner will take each Range you provide, bin each Range
>> > to the TabletServer(s) currently hosting the corresponding data, clip
>> > (truncate) each Range to match the Tablet boundaries, and then do an
>> > RPC to each TabletServer with just the Ranges hosted there.
>> >
>> > Inside the TabletServer, it will then have many Ranges, binned by Tablet
>> > (KeyExtent, to be precise). This will spawn an
>> > org.apache.accumulo.tserver.scan.LookupTask which will start collecting
>> > results to send back to the client.
>> >
>> > The caveat here is that those ranges are processed serially on a
>> > TabletServer. Maybe you're swamping one TabletServer with lots of
>> > Ranges that it could be processing in parallel.
>> >
>> > Could you experiment with using multiple BatchScanners and something
>> > like Guava's Iterables.concat to make it appear like one Iterator?
>> >
>> > I'm curious if we should put an optimization into the BatchScanner
>> > itself to limit the number of ranges we send in one RPC to a
>> > TabletServer (e.g. one BatchScanner might open multiple
>> > MultiScanSessions to a TabletServer).
>> >
>> > Sven Hodapp wrote:
>> >> Hi there,
>> >>
>> >> currently we're experimenting with a two node Accumulo cluster (two tablet
>> >> servers) setup for document storage.
>> >> These documents are decomposed down to the sentence level.
>> >>
>> >> Now I'm using a BatchScanner to assemble the full document like this:
>> >>
>> >> val bscan = instance.createBatchScanner(ARTIFACTS, auths, 10) // ARTIFACTS table currently hosts ~30GB data, ~200M entries on ~45 tablets
>> >> bscan.setRanges(ranges) // there are like 3000 Range.exact's in the ranges-list
>> >> for (entry <- bscan.asScala) yield {
>> >>   val key = entry.getKey()
>> >>   val value = entry.getValue()
>> >>   // etc.
>> >> }
>> >>
>> >> For larger full documents (e.g. 3000 exact ranges), this operation will take
>> >> about 12 seconds.
>> >> But shorter documents are assembled blazing fast...
>> >>
>> >> Is that too much for a BatchScanner / am I misusing the BatchScanner?
>> >> Is that a normal time for such a (seek) operation?
>> >> Can I do something to get better seek performance?
>> >>
>> >> Note: I have already enabled bloom filtering on that table.
>> >>
>> >> Thank you for any advice!
>> >>
>> >> Regards,
>> >> Sven
