No slow datanode in your cluster? When stuff is slow, can you figure out who everyone is trying to talk to?
St.Ack

On Mon, Sep 12, 2011 at 8:37 PM, Geoff Hendrey <[email protected]> wrote:
> OK Guys -
>
> We upgraded to 90.4, and made all the suggested config changes. The only
> thing we have not done yet, but will try soon, is switching from OpenJDK
> to the HotSpot JVM. Unfortunately, the problem recurs exactly as before.
> We will test with the HotSpot JVM shortly.
>
> -geoff
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of
> Jean-Daniel Cryans
> Sent: Monday, September 12, 2011 11:44 AM
> To: [email protected]
> Subject: Re: scanner deadlock?
>
>> I thought that as long as I specified neither -client nor -server, that
>> Server Class detection would automatically invoke the "-server" option.
>>
>> http://download.oracle.com/javase/6/docs/technotes/guides/vm/server-class.html
>>
>> We are running 12-core AMD Opteron which is AMD64, so according to the
>> guide above, -server is selected automatically. Please let me know if
>> I've misunderstood this. We *definitely* want to be running hotspot!
>
> It's two different JVMs, not a matter of using -client or -server
> (which are just different configurations). What you are running is:
>
> http://openjdk.java.net/
>
> What most people run is:
>
> http://www.oracle.com/us/technologies/java/index.html
>
>>
>> Regarding GC: we are generating GC logs for namenode, datanode, master
>> and regionserver. We do see long GC from time to time. In fact, I played
>> with the mslab option, but didn't find significant improvement. We've
>> seen times on the order of a minute in these logs, and have found no way
>> around it (spent countless days and nights experimenting with different
>> GC parameters, mslab, different heap sizes, etc).
>
> Sometimes it's just a matter of how much data you have in flight.
> That's why I mentioned scanner pre-caching (set via Scan.setCaching),
> because it can potentially load a lot of rows into the RS's heap. More
> concurrent scanners means also more data loaded into memory.
>
> Are you also inserting at the same time? What's your write buffer size?
>
> The discussion in this jira could be relevant:
> https://issues.apache.org/jira/browse/HBASE-3813
>
> A temporary fix got committed in 0.90.3 to make
> ipc.server.max.queue.size configurable.
>
> J-D
>
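
To put rough numbers on J-D's point about data in flight, here is a back-of-the-envelope sketch in plain Java (no HBase dependency). It assumes the worst case where every scanner has a full batch buffered on the same region server, and it only counts row payload, so actual heap use would be higher. The class name, the method name, the 10 KiB row size and the 50 scanners are all made-up values for illustration, not figures from this thread.

```java
public class ScannerHeapEstimate {

    /**
     * Rough estimate of the payload bytes buffered on one region server
     * if every scanner holds a full batch at once:
     *   caching (rows per batch) * avgRowBytes * concurrentScanners.
     * Per-object overhead is ignored, so treat the result as a floor.
     */
    static long estimateBytes(int caching, long avgRowBytes, int concurrentScanners) {
        long perScanner = Math.multiplyExact((long) caching, avgRowBytes);
        return Math.multiplyExact(perScanner, (long) concurrentScanners);
    }

    public static void main(String[] args) {
        long avgRowBytes = 10L * 1024; // hypothetical: 10 KiB per row
        int scanners = 50;             // hypothetical: 50 concurrent scanners

        // Compare a few values one might pass to Scan.setCaching.
        for (int caching : new int[] {1, 100, 1000, 10000}) {
            long bytes = estimateBytes(caching, avgRowBytes, scanners);
            System.out.printf("caching=%d -> ~%d MiB in flight%n",
                    caching, bytes / (1024 * 1024));
        }
    }
}
```

With these made-up inputs, a caching value of 1000 already works out to 512,000,000 bytes, roughly half a gigabyte of row data, before any concurrent inserts are counted. Plugging in your own row size and scanner count should show whether scanner caching alone could explain the long GC pauses.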
