Ara,

What kind of query load are you generating within your batch scanners? Are you using an iterator that seeks around a lot? Are you grabbing many small batches (only a few keys per range) from the batch scanner?

As a wild guess, this could be the result of lots of seeks with a low cache hit rate, which would induce CPU load in HDFS fetching blocks and CPU load in Accumulo decrypting/decompressing those blocks. The monitor page will show you seek rates and cache hit rates.
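To make the cache hit rate point a bit more concrete, here is a rough sketch of the effect. It is a toy LRU block cache, not Accumulo's actual block cache, and every name and size in it (`hit_rate`, `keys_per_block`, `cache_blocks`, the key counts) is a made-up assumption for illustration only. It just compares a sequential read against scattered lookups over the same key space.

```python
import random
from collections import OrderedDict


def hit_rate(keys, keys_per_block=100, cache_blocks=8):
    """Fraction of key lookups served from a toy LRU block cache."""
    cache = OrderedDict()  # block id -> None, ordered by recency of use
    hits = 0
    for key in keys:
        block = key // keys_per_block  # which block holds this key
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = None  # simulate fetching the block
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(keys)


# Hypothetical workload: 10,000 lookups over a 100,000-key space.
rng = random.Random(0)
sequential = list(range(10_000))
scattered = [rng.randrange(100_000) for _ in range(10_000)]

seq_rate = hit_rate(sequential)
scat_rate = hit_rate(scattered)
print(f"sequential: {seq_rate:.2%}  scattered: {scat_rate:.2%}")
```

In this model the sequential pass misses only once per block, while the scattered lookups miss almost every time, so nearly every lookup turns into a block fetch plus decompression. If the monitor page shows a similar pattern (high seek rate, low hit rate), that would be consistent with the guess above.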
Adam

On Sat, Feb 7, 2015 at 8:48 PM, Ara Ebrahimi wrote:
> 2.4.0.2.1.
>
> Yeah seems like I need to do that. I was hoping I’d get some advice based on
> prior experience with google cloud environment.
>
> Ara.
>
> On Feb 7, 2015, at 11:23 AM, Josh Elser wrote:
>
> What version of Hadoop are you using?
>
> Have you considered hooking up a profiler to the Datanode on GCE to see
> where the time is being spent? That might help shed some light on the
> situation.
>
> Ara Ebrahimi wrote:
>
> Hi,
>
> We’re seeing some weird behavior from the hdfs daemon on google cloud
> environment when we use accumulo Scanner to sequentially scan a table. Top
> reports 200-300% cpu usage for the hdfs daemon. Accumulo is also around
> 500%. iostat %util is low. avgrq-sz is low, rMB/s is low, there’s lots of
> free memory. It seems like something causes the hdfs daemon to consume a lot
> of cpu and not to send enough read requests to the disk (ssd actually, so
> disk is super fast and vastly under-utilized). The process which sends scan
> requests to accumulo is 500% active (using 3 query batch threads and
> aggressive scan-batch-size/read-ahead-threshold values). So it seems like
> somehow hdfs is the bottleneck. On another cluster we rarely see hdfs daemon
> going over 10% cpu usage. Any idea what the issue could be?
>
> Thanks,
> Ara.
>
> ________________________________
>
> This message is for the designated recipient only and may contain
> privileged, proprietary, or otherwise confidential information. If you have
> received it in error, please notify the sender immediately and delete the
> original. Any other use of the e-mail by you is prohibited. Thank you in
> advance for your cooperation.
>
> ________________________________
