The link is corrupt; please use this one: http://issues.apache.org/jira/browse/HBASE-1778
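For anyone who wants to reproduce the numbers in that report: the tests are driven by HBase's bundled PerformanceEvaluation tool. A typical invocation looks something like the following (the row count and client count here are illustrative, not necessarily the settings used in the report):

    # single-client random-read run, no MapReduce (illustrative settings)
    $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation \
        --nomapred --rows=1048576 randomRead 1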
On Mon, Aug 24, 2009 at 9:50 PM, Schubert Zhang <[email protected]> wrote:

> The performance evaluation patch for 0.20.0 is available at
> http://issues.apache.org/jira/browse/HBASE-1778.
> With the comments of stack, JD, JG, ryan..., we generated a new test
> report; you can also download this report from the above jira link. Please
> have a review and give your comments.
>
> Schubert
>
> On Wed, Aug 19, 2009 at 4:17 AM, Ryan Rawson <[email protected]> wrote:
>
>> Sounds like you are running into RAM issues. Remember, 4GB of RAM is
>> what I have in my consumer MacBook (white). I would personally like
>> to outfit machines with 2-4GB per CORE.
>>
>> Jgray is right on here: the Java CMS GC trades time for memory, and
>> thus it requires more RAM to keep GC pauses low. If you are allocating
>> half your RAM to HBase, then you have precious little for the datanode
>> and any buffer cache you might need.
>>
>> As one option, try running datanodes and regionservers on different
>> machines. You could buy different machine configurations, one with
>> large disks, one with less. Or go with modern 8-core, 16GB-RAM machines.
>>
>> good luck,
>> -ryan
>>
>> On Tue, Aug 18, 2009 at 2:35 PM, Schubert Zhang <[email protected]> wrote:
>> > @JG and @stack
>> >
>> > Helpful!
>> >
>> > We were running the RS with 2GB because we have a heterogeneous node
>> > (slave-5), which has only 4GB RAM.
>> > For now I have removed this node from the cluster, and we are getting
>> > ~2ms random reads. It is fine now.
>> >
>> > Thank you very much.
>> >
>> > On Wed, Aug 19, 2009 at 2:52 AM, Jonathan Gray <[email protected]> wrote:
>> >
>> >> As stack says, but more strongly: if you have 4+ cores then you definitely
>> >> want to turn off incremental mode. Is there a reason you're running your RS
>> >> with 2GB given that you have 8GB of total memory? I'd up it to 4GB; after I
>> >> did that on our production cluster, things ran much more smoothly with CMS.
>> >>
>> >> I'd also drop your swappiness to 0; I've not heard a good argument for when
>> >> we would ever want to swap on an HBase/Hadoop cluster. If you end up swapping,
>> >> you're going to start seeing some weird behavior and very slow GC runs, and
>> >> likely killing off regionservers as ZK times out and assumes the RS is dead.
>> >>
>> >> stack wrote:
>> >>
>> >>> "-XX:+CMSIncrementalMode" is our default, but it's for nodes with 2 or
>> >>> fewer CPUs according to
>> >>> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html. You
>> >>> might try without this.
>> >>>
>> >>>> But I am surprised that the node (5) which has 8 CPU cores, 4GB RAM, and
>> >>>> 6 SATA disks in RAID1 has this problem.
>> >>>>
>> >>>> avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
>> >>>>            7.46   0.00     3.28    23.11    0.00  66.15
>> >>>>
>> >>>> Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
>> >>>> sda       84.83   25.12  485.57   2.49  53649.75  220.90    110.38      9.20  18.85   2.04  99.53
>> >>>> dm-0       0.00    0.00    0.00  25.12      0.00  201.00      8.00      0.01   0.27   0.01   0.02
>> >>>> dm-1       0.00    0.00  570.90   2.49  53655.72   19.90     93.61     10.74  18.72   1.74  99.53
>> >>>>
>> >>>> It seems the disk I/O is very busy.
>> >>>>
>> >>> Yeah. What's writing? Can you tell? Is it the NN or ZK node?
>> >>>
>> >>> St.Ack
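To make the heap and GC advice above concrete: both knobs live in conf/hbase-env.sh on each regionserver. A minimal sketch, assuming an 8GB box co-hosting a datanode; the 4000MB heap follows JG's suggestion and should be adjusted to your hardware:

    # conf/hbase-env.sh (sketch -- values are illustrative)
    # 4GB regionserver heap (HBASE_HEAPSIZE is in MB), leaving room for the DN and OS buffer cache
    export HBASE_HEAPSIZE=4000
    # CMS without incremental mode, per stack's note for machines with more than 2 cores
    export HBASE_OPTS="-XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError"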
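JG's swappiness suggestion is a one-line kernel setting (standard Linux sysctl, nothing HBase-specific); this applies it immediately and keeps it across reboots:

    # apply now
    sysctl -w vm.swappiness=0
    # persist across reboots
    echo "vm.swappiness = 0" >> /etc/sysctl.conf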
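On stack's question about what is hitting the disk on slave-5: iostat only shows per-device totals, so one way to see per-process I/O (assuming a kernel with task I/O accounting and iotop installed) is something like:

    # top I/O-active processes, batch mode, three samples
    iotop -o -b -n 3
    # or inspect a specific daemon's read/write counters directly
    cat /proc/$(pgrep -f DataNode | head -1)/io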
