Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
I have to give credit to Sun: once we finally got them to open the issue and take a look, they addressed it very quickly.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6401245

Thanks everyone...

Cheers,
Eric Molitor

On 3/13/06, Rick Jones [EMAIL PROTECTED] wrote:

So, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - i.e. in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet-based cwnd in the stack, but the application is definitely broken.

Even doing it in two pieces (header and data) would have worked.

Insofar as it would not have stumbled over the cwnd. Still, it should be presenting everything at one time - is there no gathering send to be made?

rick jones

-
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Eric Molitor wrote:

You are correct, the format is as follows.

Command Packet
* Header
  o length (4 bytes)
  o id (4 bytes)
  o flags (1 byte)
  o command set (1 byte)
  o command (1 byte)
* data (Variable)

Reply Packet
* Header
  o length (4 bytes)
  o id (4 bytes)
  o flags (1 byte)
  o error code (2 bytes)
* data (Variable)

Source: http://java.sun.com/j2se/1.4.2/docs/guide/jpda/jdwp-spec.html
Also maybe useful: http://java.sun.com/j2se/1.5.0/docs/guide/jpda/socketTransport-example.c

So, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - i.e. in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet-based cwnd in the stack, but the application is definitely broken.

rick jones
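To make the layout above concrete, here is a minimal sketch in C of filling the 11-byte command-packet header into one contiguous buffer, so that it can be handed to the transport in a single call. The function and macro names are hypothetical and not taken from the JDK sources; the only facts used are the field sizes listed above, the fact that the length field includes the header (see the "includes header" comment in the patch elsewhere in this thread), and network byte order as seen in the strace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4 (length) + 4 (id) + 1 (flags) + 1 (command set) + 1 (command) */
#define JDWP_HEADER_LEN 11

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Hypothetical helper (not a JDK function): fill a command-packet header.
 * data_len is the size of the variable data that follows the header.
 * Returns the number of header bytes written.
 */
size_t jdwp_pack_command_header(uint8_t out[JDWP_HEADER_LEN],
                                uint32_t data_len, uint32_t id,
                                uint8_t flags, uint8_t cmd_set, uint8_t cmd)
{
    assert(out != NULL);
    put_be32(out, JDWP_HEADER_LEN + data_len); /* length includes the header */
    put_be32(out + 4, id);
    out[8] = flags;
    out[9] = cmd_set;
    out[10] = cmd;
    return JDWP_HEADER_LEN;
}
```

Packing byte by byte avoids relying on compiler-specific struct packing, which a struct-based header would need to get an 11-byte layout.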
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
On Mon, 13 Mar 2006 09:25:44 -0800 Rick Jones [EMAIL PROTECTED] wrote:

Eric Molitor wrote:

You are correct, the format is as follows.

Command Packet
* Header
  o length (4 bytes)
  o id (4 bytes)
  o flags (1 byte)
  o command set (1 byte)
  o command (1 byte)
* data (Variable)

Reply Packet
* Header
  o length (4 bytes)
  o id (4 bytes)
  o flags (1 byte)
  o error code (2 bytes)
* data (Variable)

Source: http://java.sun.com/j2se/1.4.2/docs/guide/jpda/jdwp-spec.html
Also maybe useful: http://java.sun.com/j2se/1.5.0/docs/guide/jpda/socketTransport-example.c

So, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - i.e. in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet-based cwnd in the stack, but the application is definitely broken.

Even doing it in two pieces (header and data) would have worked.

--- socketTransport-example.c.orig 2006-03-13 10:14:00.0 -0800
+++ socketTransport-example.c 2006-03-13 10:24:02.0 -0800
@@ -445,7 +445,16 @@
 static jdwpTransportError JNICALL
 socketTransport_writePacket(jdwpTransportEnv* env, const jdwpPacket *packet)
 {
-    jint len, data_len, id;
+#pragma pack
+    struct {
+        jint len;
+        jint id;
+        jbyte flags;
+        union {
+            jshort errorCode;
+            jbyte cmd[2];
+        };
+    } header;
     jbyte *data;
 
     /* packet can't be null */
@@ -453,7 +462,6 @@
         RETURN_ERROR(JDWPTRANSPORT_ERROR_ILLEGAL_ARGUMENT, "packet is NULL");
     }
 
-    len = packet->type.cmd.len; /* includes header */
     data_len = len - 11;
 
     /* bad packet */
@@ -461,40 +469,22 @@
         RETURN_ERROR(JDWPTRANSPORT_ERROR_ILLEGAL_ARGUMENT, "invalid length");
     }
 
-    len = (jint)dbgsysHostToNetworkLong(len);
-
-    if (dbgsysSend(socketFD,(char *)&len,sizeof(jint),0) != sizeof(jint)) {
-        RETURN_IO_ERROR("send failed");
-    }
-
-    id = (jint)dbgsysHostToNetworkLong(packet->type.cmd.id);
-
-    if (dbgsysSend(socketFD,(char *)&(id),sizeof(jint),0) != sizeof(jint)) {
-        RETURN_IO_ERROR("send failed");
-    }
-
-    if (dbgsysSend(socketFD,(char *)&(packet->type.cmd.flags)
-                   ,sizeof(jbyte),0) != sizeof(jbyte)) {
-        RETURN_IO_ERROR("send failed");
-    }
+    header.len = (jint)dbgsysHostToNetworkLong(packet->type.cmd.len);
+    header.id = (jint)dbgsysHostToNetworkLong(packet->type.cmd.id);
+    header.flags = packet->type.cmd.flags;
 
     if (packet->type.cmd.flags & JDWPTRANSPORT_FLAGS_REPLY) {
-        jshort errorCode = dbgsysHostToNetworkShort(packet->type.reply.errorCode);
-        if (dbgsysSend(socketFD,(char *)&(errorCode)
-                       ,sizeof(jshort),0) != sizeof(jshort)) {
-            RETURN_IO_ERROR("send failed");
-        }
+        header.errorCode = dbgsysHostToNetworkShort(packet->type.reply.errorCode);
     } else {
-        if (dbgsysSend(socketFD,(char *)&(packet->type.cmd.cmdSet)
-                       ,sizeof(jbyte),0) != sizeof(jbyte)) {
-            RETURN_IO_ERROR("send failed");
-        }
-        if (dbgsysSend(socketFD,(char *)&(packet->type.cmd.cmd)
-                       ,sizeof(jbyte),0) != sizeof(jbyte)) {
-            RETURN_IO_ERROR("send failed");
-        }
+        header.cmd[0] = packet->type.cmd.cmdSet;
+        header.cmd[1] = packet->type.cmd.cmd;
     }
-
+
+    if (dbgsysSend(socketFD,(char *)&header, sizeof(header), 0)
+        != sizeof(header)) {
+        RETURN_IO_ERROR("send failed");
+    }
+
     data = packet->type.cmd.data;
     if (dbgsysSend(socketFD,(char *)data,data_len,0) != data_len) {
         RETURN_IO_ERROR("send failed");
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
So, since they know the length it means they know the data to send, which means there is no valid excuse for not sending all the data at one time - i.e. in one call to the transport. Their use of TCP_NODELAY was simply a kludge, and a massive one at that. I still think there are issues trying to map the byte-based RFC cwnds to a packet-based cwnd in the stack, but the application is definitely broken.

Even doing it in two pieces (header and data) would have worked.

Insofar as it would not have stumbled over the cwnd. Still, it should be presenting everything at one time - is there no gathering send to be made?

rick jones
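On the question of a gathering send: POSIX provides writev(), which takes an array of buffers and submits them in one system call. The sketch below is only an illustration of that idea under stated assumptions; send_gathered is a hypothetical helper name, and whether the JDK's dbgsys layer exposes anything equivalent is not known from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand header and payload to the transport in one
 * system call instead of one send() per field.
 * Returns the number of bytes written, or -1 on error (errno is set).
 */
ssize_t send_gathered(int fd, const void *hdr, size_t hdr_len,
                      const void *data, size_t data_len)
{
    struct iovec iov[2];

    assert(hdr != NULL && hdr_len > 0);

    /* No payload: a plain write of the header is already a single call. */
    if (data == NULL || data_len == 0)
        return write(fd, hdr, hdr_len);

    iov[0].iov_base = (void *)hdr;  /* iov_base is not const-qualified */
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)data;
    iov[1].iov_len = data_len;

    /* May still write fewer bytes than requested; callers should check. */
    return writev(fd, iov, 2);
}
```

A caller that must deliver the whole packet would loop on short writes, advancing through the two buffers until everything has been accepted.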
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
You are correct, the format is as follows.

Command Packet
* Header
  o length (4 bytes)
  o id (4 bytes)
  o flags (1 byte)
  o command set (1 byte)
  o command (1 byte)
* data (Variable)

Reply Packet
* Header
  o length (4 bytes)
  o id (4 bytes)
  o flags (1 byte)
  o error code (2 bytes)
* data (Variable)

Source: http://java.sun.com/j2se/1.4.2/docs/guide/jpda/jdwp-spec.html
Also maybe useful: http://java.sun.com/j2se/1.5.0/docs/guide/jpda/socketTransport-example.c

- Eric

On 3/10/06, Rick Jones [EMAIL PROTECTED] wrote:

Stephen Hemminger wrote:

The strace shows that the client? does lots of little sends; also, the response is handled in a different thread than the sender, so they spend a lot of time banging on futexes.

I got these by doing strace -ff -o /tmp/foo eclipse

sched_getscheduler(15164) = 0 (SCHED_OTHER)
gettimeofday({1142026997, 896025}, NULL) = 0
futex(0x80b77ec, FUTEX_WAKE, 1) = 1
futex(0x80b77e8, FUTEX_WAKE, 1) = 0
futex(0x80ce014, FUTEX_WAIT, 3, NULL) = 0
futex(0x805c638, FUTEX_WAIT, 2, NULL)= -1 EAGAIN (Resource temporarily unavailable)
futex(0x805c638, FUTEX_WAKE, 1) = 0
futex(0x80cca4c, FUTEX_WAIT, 5, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\37", 4, 0) = 4
send(3, "\0\0\0\2", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\4\0\0\0\4\0\0\0\10\0\0\0\10\0\0\0\10", 20, 0) = 20
futex(0x80cca4c, FUTEX_WAIT, 7, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\17", 4, 0) = 4
send(3, "\0\0\0\3", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\2", 4, 0) = 4

That sure smells like the first four bytes are a message length and the rest of it is the data, and if that is correct, it certainly does look like someone needs to be hit upside with the HP P/N 19511-80014 HP Attitude Adjuster.

It also explains why, after the cwnd finally grew to five packets, things appeared happy in the trace.

rick
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
The strace shows that the client? does lots of little sends; also, the response is handled in a different thread than the sender, so they spend a lot of time banging on futexes.

I got these by doing strace -ff -o /tmp/foo eclipse

sched_getscheduler(15164) = 0 (SCHED_OTHER)
gettimeofday({1142026997, 896025}, NULL) = 0
futex(0x80b77ec, FUTEX_WAKE, 1) = 1
futex(0x80b77e8, FUTEX_WAKE, 1) = 0
futex(0x80ce014, FUTEX_WAIT, 3, NULL) = 0
futex(0x805c638, FUTEX_WAIT, 2, NULL)= -1 EAGAIN (Resource temporarily unavailable)
futex(0x805c638, FUTEX_WAKE, 1) = 0
futex(0x80cca4c, FUTEX_WAIT, 5, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\37", 4, 0) = 4
send(3, "\0\0\0\2", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\4\0\0\0\4\0\0\0\10\0\0\0\10\0\0\0\10", 20, 0) = 20
futex(0x80cca4c, FUTEX_WAIT, 7, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\17", 4, 0) = 4
send(3, "\0\0\0\3", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\2", 4, 0) = 4
futex(0x80cca4c, FUTEX_WAIT, 9, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0

The other side is more sane:

futex(0x805b5c8, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\v\0\0\0\2\0\1\7", 11, 0) = 11
futex(0x831356c, FUTEX_WAIT, 3, NULL) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\21\0\0\0\3\0\17\1\t\0\0\0\0\0", 17, 0) = 17
futex(0x831356c, FUTEX_WAIT, 5, NULL) = 0
futex(0x8313568, FUTEX_WAIT, 2, NULL) = 0
futex(0x8313568, FUTEX_WAKE, 1) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\21\0\0\0\1\0\17\1\6\0\0\0\0\0", 17, 0) = 17
futex(0x831356c, FUTEX_WAIT, 7, NULL) = 0
futex(0x8313568, FUTEX_WAIT, 2, NULL) = 0
futex(0x8313568, FUTEX_WAKE, 1) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\21\0\0\0\4\0\17\1\7\0\0\0\0\0", 17, 0) = 17
futex(0x831356c, FUTEX_WAIT, 9, NULL) = 0
futex(0x8313568, FUTEX_WAIT, 2, NULL) = 0
futex(0x8313568, FUTEX_WAKE, 1) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Am I correct in assuming that by Client you mean Eclipse and by the other side you mean the JDK?

On 3/10/06, Stephen Hemminger [EMAIL PROTECTED] wrote:

The strace shows that the client? does lots of little sends; also, the response is handled in a different thread than the sender, so they spend a lot of time banging on futexes.

I got these by doing strace -ff -o /tmp/foo eclipse

sched_getscheduler(15164) = 0 (SCHED_OTHER)
gettimeofday({1142026997, 896025}, NULL) = 0
futex(0x80b77ec, FUTEX_WAKE, 1) = 1
futex(0x80b77e8, FUTEX_WAKE, 1) = 0
futex(0x80ce014, FUTEX_WAIT, 3, NULL) = 0
futex(0x805c638, FUTEX_WAIT, 2, NULL)= -1 EAGAIN (Resource temporarily unavailable)
futex(0x805c638, FUTEX_WAKE, 1) = 0
futex(0x80cca4c, FUTEX_WAIT, 5, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\37", 4, 0) = 4
send(3, "\0\0\0\2", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\4\0\0\0\4\0\0\0\10\0\0\0\10\0\0\0\10", 20, 0) = 20
futex(0x80cca4c, FUTEX_WAIT, 7, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\17", 4, 0) = 4
send(3, "\0\0\0\3", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\2", 4, 0) = 4
futex(0x80cca4c, FUTEX_WAIT, 9, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0

The other side is more sane:

futex(0x805b5c8, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\v\0\0\0\2\0\1\7", 11, 0) = 11
futex(0x831356c, FUTEX_WAIT, 3, NULL) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\21\0\0\0\3\0\17\1\t\0\0\0\0\0", 17, 0) = 17
futex(0x831356c, FUTEX_WAIT, 5, NULL) = 0
futex(0x8313568, FUTEX_WAIT, 2, NULL) = 0
futex(0x8313568, FUTEX_WAKE, 1) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\21\0\0\0\1\0\17\1\6\0\0\0\0\0", 17, 0) = 17
futex(0x831356c, FUTEX_WAIT, 7, NULL) = 0
futex(0x8313568, FUTEX_WAIT, 2, NULL) = 0
futex(0x8313568, FUTEX_WAKE, 1) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
send(9, "\0\0\0\21\0\0\0\4\0\17\1\7\0\0\0\0\0", 17, 0) = 17
futex(0x831356c, FUTEX_WAIT, 9, NULL) = 0
futex(0x8313568, FUTEX_WAIT, 2, NULL) = 0
futex(0x8313568, FUTEX_WAKE, 1) = 0
futex(0xaee3d008, FUTEX_WAKE, 1) = 0
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
On Fri, 10 Mar 2006 16:11:07 -0600 Eric Molitor [EMAIL PROTECTED] wrote:

Am I correct in assuming that by Client you mean Eclipse and by the other side you mean the JDK?

Since the processes are all inside the JVM it is impossible to really tell, but that's my assumption.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Stephen Hemminger wrote:

The strace shows that the client? does lots of little sends; also, the response is handled in a different thread than the sender, so they spend a lot of time banging on futexes.

I got these by doing strace -ff -o /tmp/foo eclipse

sched_getscheduler(15164) = 0 (SCHED_OTHER)
gettimeofday({1142026997, 896025}, NULL) = 0
futex(0x80b77ec, FUTEX_WAKE, 1) = 1
futex(0x80b77e8, FUTEX_WAKE, 1) = 0
futex(0x80ce014, FUTEX_WAIT, 3, NULL) = 0
futex(0x805c638, FUTEX_WAIT, 2, NULL)= -1 EAGAIN (Resource temporarily unavailable)
futex(0x805c638, FUTEX_WAKE, 1) = 0
futex(0x80cca4c, FUTEX_WAIT, 5, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\37", 4, 0) = 4
send(3, "\0\0\0\2", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\4\0\0\0\4\0\0\0\10\0\0\0\10\0\0\0\10", 20, 0) = 20
futex(0x80cca4c, FUTEX_WAIT, 7, NULL) = 0
futex(0x80cca48, FUTEX_WAIT, 2, NULL) = 0
futex(0x80cca48, FUTEX_WAKE, 1) = 0
futex(0x80cac18, FUTEX_WAKE, 1) = 0
send(3, "\0\0\0\17", 4, 0) = 4
send(3, "\0\0\0\3", 4, 0) = 4
send(3, "\200", 1, 0) = 1
send(3, "\0\0", 2, 0) = 2
send(3, "\0\0\0\2", 4, 0) = 4

That sure smells like the first four bytes are a message length and the rest of it is the data, and if that is correct, it certainly does look like someone needs to be hit upside with the HP P/N 19511-80014 HP Attitude Adjuster.

It also explains why, after the cwnd finally grew to five packets, things appeared happy in the trace.

rick
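The guess that the first four bytes are a message length can be checked against the trace itself. The sketch below, with hypothetical helper names, reads a big-endian 32-bit prefix and compares it with the sum of the sizes of the send() calls that make up one message. In the trace, \37 is octal for 31 and the five sends are 4 + 4 + 1 + 2 + 20 = 31 bytes; \17 is 15 and the next five sends are 4 + 4 + 1 + 2 + 4 = 15 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian (network order) value. */
uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical check: returns 1 if the sizes of the send() calls that make
 * up one message add up to the length announced in its first four bytes,
 * 0 otherwise. The length prefix counts itself, as the trace suggests.
 */
int sends_match_length(const uint8_t len_prefix[4],
                       const size_t *send_sizes, size_t n)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += send_sizes[i];
    return total == (size_t)read_be32(len_prefix);
}
```

Both messages in the quoted trace pass this check, which is consistent with a length-prefixed protocol being written out one field at a time.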
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Hello Stephen,

2) How to setup the same environment (for non-java savvy people) with freely available software (Sun JDK okay). I can figure out how to get JDK

IDEA is available for download at http://www.jetbrains.com/idea
You can use either an evaluation license or download an EA build and use a free EAP licence: http://www.intellij.net/eap/products/idea/download.jsp

1) a strace (system call trace) of what the JVM is doing during the transfer.

To enable the JPDA trace, IDEA must be started with the following VM property:

-Didea.debugger.trace=category[spacecategory]

where category is one of the following:

SENDS
RAW_SENDS
RECEIVES
RAW_RECEIVES
EVENTS
REFTYPES
OBJREFS
ALL

Examples:

-Didea.debugger.trace=SENDS RECEIVES
-Didea.debugger.trace=ALL

As for the code that needs to be debugged in order to reproduce the problem, perhaps Eric will help?

Best regards,
Eugene Zhuravlev
Software Developer
JetBrains Inc.
http://www.jetbrains.com
Develop with pleasure!

- Original Message -
From: Stephen Hemminger [EMAIL PROTECTED]
To: Andrew Morton [EMAIL PROTECTED]
Cc: Eric Molitor [EMAIL PROTECTED]; netdev@vger.kernel.org; [EMAIL PROTECTED]
Sent: Thursday, March 09, 2006 03:30
Subject: Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug

To chase this regression down, we need:

1) a strace (system call trace) of what the JVM is doing during the transfer. Is it writing lots of little buffers?
2) How to setup the same environment (for non-java savvy people) with freely available software (Sun JDK okay). I can figure out how to get JDK installed, but how to cause the debugging interaction to occur?
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Just to clarify, this should be reproducible with any Java debug tool (IntelliJ, Eclipse, etc.).

The slowdown increases with the scope of the current frame. If you have a simple scope of, say, 5 basic objects, then things are slow but liveable. If you have a large scope of, say, 22 objects, several of which are collections of a hundred or so objects, then it takes literally an hour to step over one line of code.

So last night I cracked out my W. R. Stevens books and plowed through a few man pages, RFCs and other specs, and I do agree that Java is probably doing something that it shouldn't. However, I'm a bit surprised that such a change occurred in 2.6.15 when every kernel from 2.2 on that I've tested *doesn't* have this issue.

I don't believe in changing an app to work around another app's issues, so I definitely agree that the JDK needs to get looked at. However, if I put on my risk-averse corporate hat, I'd be surprised that such a change occurred this late in a kernel release.

sudo sysctl -w net.ipv4.tcp_abc=0

does seem to help with 2.6.15, but it still doesn't feel as fast as 2.6.14 or previous kernels. I'm not 100% sure that this is the only issue, but I will post it as a workaround for now.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
On Wed, 08 Mar 2006 23:29:48 -0800 (PST) David S. Miller [EMAIL PROTECTED] wrote:

From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 08 Mar 2006 23:24:22 -0800

I have gotten massive straces and the java VM is:
1) Turning on TCP_NODELAY
2) Sending small packets.

Java is doing the wrong thing, obviously.

4) Fix java

And this is the only reasonable recourse.

You cannot turn on TCP_NODELAY and expect good performance when sending out small packets. You are asking for low latency and no delaying of packets in order to allow larger ones to accumulate. The kernel is doing exactly what Java is asking it to do.

In fact I consider the new behavior of the kernel a bug fix.

A possible solution would be to set cwnd bigger for loopback. If there was a clean way to know that the connection was over loopback, then doing something in tcp_init_metrics() to set INIT_CWND

if (IsLoopback(sk))
    dst->metrics[RTAX_INIT_CWND-1] = 10;

then tcp_init_cwnd() would return a bigger congestion window.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Just out of curiosity, was the window size changed in 2.6.15? Just trying to get an idea of what might have changed in 2.6.15 that triggered this. (In 2.6.14 and 2.4.27 things run very fast.)

On 3/9/06, Stephen Hemminger [EMAIL PROTECTED] wrote:

On Wed, 08 Mar 2006 23:29:48 -0800 (PST) David S. Miller [EMAIL PROTECTED] wrote:

From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 08 Mar 2006 23:24:22 -0800

I have gotten massive straces and the java VM is:
1) Turning on TCP_NODELAY
2) Sending small packets.

Java is doing the wrong thing, obviously.

4) Fix java

And this is the only reasonable recourse.

You cannot turn on TCP_NODELAY and expect good performance when sending out small packets. You are asking for low latency and no delaying of packets in order to allow larger ones to accumulate. The kernel is doing exactly what Java is asking it to do.

In fact I consider the new behavior of the kernel a bug fix.

A possible solution would be to set cwnd bigger for loopback. If there was a clean way to know that the connection was over loopback, then doing something in tcp_init_metrics() to set INIT_CWND

if (IsLoopback(sk))
    dst->metrics[RTAX_INIT_CWND-1] = 10;

then tcp_init_cwnd() would return a bigger congestion window.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
On Thu, 9 Mar 2006 12:29:08 -0600 Eric Molitor [EMAIL PROTECTED] wrote:

Just out of curiosity, was the window size changed in 2.6.15? Just trying to get an idea of what might have changed in 2.6.15 that triggered this. (In 2.6.14 and 2.4.27 things run very fast.)

No, window size hasn't changed, but how we account for it has. Appropriate Byte Count changes what constitutes a packet for increasing the congestion window. Without ABC, the congestion window is increased by one after each successful acknowledgement during slow start. With ABC, we don't increase the congestion window until we get acknowledgements for the number of bytes in a full TCP packet. This means that if you send small packets, the window will increase more slowly; read the RFC.
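The difference described above can be illustrated with a toy model. This is a deliberately simplified sketch of slow start only, not the kernel's implementation: the function name, the idea of a fixed number of bytes per ACK, and the MSS value passed in are all assumptions made for illustration.

```c
#include <assert.h>

/*
 * Toy model of slow-start window growth (not kernel code).
 * cwnd:          starting congestion window, in packets
 * acks:          number of acknowledgements received
 * bytes_per_ack: payload bytes covered by each ACK (assumed <= mss)
 * mss:           bytes in a full-sized segment
 * abc:           0 = count ACKs, nonzero = count acknowledged bytes
 * Returns the congestion window after all ACKs have been processed.
 */
unsigned cwnd_after_acks(unsigned cwnd, unsigned acks,
                         unsigned bytes_per_ack, unsigned mss, int abc)
{
    unsigned bytes_acked = 0;
    unsigned i;

    assert(mss > 0 && bytes_per_ack <= mss);

    for (i = 0; i < acks; i++) {
        if (!abc) {
            cwnd++;              /* one increment per ACK, whatever its size */
            continue;
        }
        bytes_acked += bytes_per_ack;
        if (bytes_acked >= mss) { /* a full segment's worth has been ACKed */
            bytes_acked -= mss;
            cwnd++;
        }
    }
    return cwnd;
}
```

In this model, 100 ACKs of 4 bytes each grow a window of 2 to 102 without byte counting, but leave it at 2 with byte counting and a 1460-byte MSS, since only 400 bytes have been acknowledged. That is the shape of the slowdown reported in this thread for many tiny sends.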
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Thu, 9 Mar 2006 08:33:15 -0800

A possible solution would be to set cwnd bigger for loopback. If there was a clean way to know that the connection was over loopback, then doing something in tcp_init_metrics() to set INIT_CWND

if (IsLoopback(sk))
    dst->metrics[RTAX_INIT_CWND-1] = 10;

then tcp_init_cwnd() would return a bigger congestion window.

I'm not even going to entertain workarounds for applications that set socket options and then things go wrong because the kernel actually does what the application has asked for.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
I did open up a bug with Sun about this. It looks like most clients don't set TCP_NODELAY on debug sockets, but the JDK itself has TCP_NODELAY hardcoded.

In the meantime, is there a way to set or disable Appropriate Byte Counting on a per-interface basis? (I know that it's a protocol, but the ability to set protocol options on a per-interface basis would seem nice.)

On 3/9/06, David S. Miller [EMAIL PROTECTED] wrote:

From: Stephen Hemminger [EMAIL PROTECTED]
Date: Thu, 9 Mar 2006 08:33:15 -0800

A possible solution would be to set cwnd bigger for loopback. If there was a clean way to know that the connection was over loopback, then doing something in tcp_init_metrics() to set INIT_CWND

if (IsLoopback(sk))
    dst->metrics[RTAX_INIT_CWND-1] = 10;

then tcp_init_cwnd() would return a bigger congestion window.

I'm not even going to entertain workarounds for applications that set socket options and then things go wrong because the kernel actually does what the application has asked for.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
On Thu, 9 Mar 2006 15:13:39 -0600 Eric Molitor [EMAIL PROTECTED] wrote:

I did open up a bug with Sun about this. It looks like most clients don't set TCP_NODELAY on debug sockets, but the JDK itself has TCP_NODELAY hardcoded.

In the meantime, is there a way to set or disable Appropriate Byte Counting on a per-interface basis? (I know that it's a protocol, but the ability to set protocol options on a per-interface basis would seem nice.)

No, but you may be able to set a bigger initial cwnd by altering the route for the loopback interface.
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
I am not sure if it is the same problem, but I am now able to reproduce slowness if I use eclipse and debug something. It is annoying, but not fatal.

If I turn off TCP appropriate byte count:

sudo sysctl -w net.ipv4.tcp_abc=0

then the problem goes away. See RFC 3465 http://www.apps.ietf.org/rfc/rfc3465.html for a description.

I have gotten massive straces and the java VM is:
1) Turning on TCP_NODELAY
2) Sending small packets.

So I think the small packets are now being counted against the congestion window, and the sender is getting blocked.

There are several possible options:
1) Ship with TCP ABC off (tcp_abc=0) -- bad because no one ever changes things to be more fair.
2) Ship with TCP ABC set to 2 -- makes it more aggressive, that may work.
3) Tweak TCP to know more about the loopback interface so it has a bigger cwnd
4) Fix java
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
From: Stephen Hemminger [EMAIL PROTECTED]
Date: Wed, 08 Mar 2006 23:24:22 -0800

I have gotten massive straces and the java VM is:
1) Turning on TCP_NODELAY
2) Sending small packets.

Java is doing the wrong thing, obviously.

4) Fix java

And this is the only reasonable recourse.

You cannot turn on TCP_NODELAY and expect good performance when sending out small packets. You are asking for low latency and no delaying of packets in order to allow larger ones to accumulate. The kernel is doing exactly what Java is asking it to do.

In fact I consider the new behavior of the kernel a bug fix.
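For readers unfamiliar with the option under discussion, the sketch below shows how an application turns TCP_NODELAY on or off with the standard setsockopt() call, and how to read the setting back. The helper names are hypothetical; this is not the JDK's code, only the socket API that such code would end up calling.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable (on != 0) or disable (on == 0) TCP_NODELAY on a TCP socket.
 * With the option on, small writes are sent immediately instead of being
 * held back to be combined with later data.
 * Returns 0 on success, -1 on error (errno is set).
 */
int set_nodelay(int fd, int on)
{
    int flag = on ? 1 : 0;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}

/* Returns 1 if TCP_NODELAY is set, 0 if it is clear, -1 on error. */
int get_nodelay(int fd)
{
    int flag = 0;
    socklen_t len = sizeof(flag);

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) != 0)
        return -1;
    return flag != 0;
}
```

An application that sets this option takes on the job of batching its own writes, which is the point being made above: each small send() then goes out as its own packet.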
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
Eric Molitor [EMAIL PROTECTED] wrote:

Attached is a TCP Dump, SYS output was...

tcpdump -i lo debug.dump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
31066 packets captured
93658 packets received by filter
459 packets dropped by kernel

This was while debugging iteration over a list of 10 items. On such a simple example on 2.6.14 it runs basically instantly. On 2.6.15 this took several minutes.

Thanks. The attachment was probably too large for the mailing list, so I've uploaded it to http://www.zip.com.au/~akpm/linux/patches/stuff/debug.dump.gz

On 3/6/06, Andrew Morton [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] wrote:

http://bugzilla.kernel.org/show_bug.cgi?id=6177

Summary: Java remote debugging is slow due to apparent networking bug
Kernel Version: 2.6.15
Status: NEW
Severity: normal
Owner: [EMAIL PROTECTED]
Submitter: [EMAIL PROTECTED]

Most recent kernel where this bug did not occur:
Distribution: Suse 10.0, Suse 10.1 Debian 3.1
Hardware Environment: ix86
Software Environment: 2.6.14

Problem Description: Sometime between 2.6.14 and 2.6.15 remote Java debugging has slowed to a crawl. Users have reported the problem on both Debian and Suse. Downgrading to 2.6.14 solves the problem. The problem occurs with IDEA IntelliJ, JBuilder, and even the Sun JDAPI examples. I've talked with many people about this and here is what is known so far.

http://www.jetbrains.net/jira/browse/IDEA-6540

The best quote to summarize that I know of is: tcpdump shows tons and tons of packets going back and forth, none of which individually look strange, but the fact that it took somewhere around the neighborhood of 2500 packets to open the key/value for a single Hash element was weird. Each packet has a very small payload of only a few bytes of information. (I will happily send the tcpdump dump files if anyone wants.)

I know that this bug report sucks because of the limited information, but it is a real issue and somewhat hard to provide a better test case. Eugene Zhuravlev [EMAIL PROTECTED] of IDEA (IntelliJ's publisher) has offered to help track this problem down.

Steps to reproduce: Make sure you are running 2.6.15 or higher (occurs in 2.6.16 pre as well), install Eclipse and start a remote debugging session. Downgrading to 2.6.14 will cause the app to run at its normal speed. It's probably easiest to install Tomcat and attach Eclipse to that.

Yes, if you can get the net guys a full tcpdump it would really help, thanks.

(Please respond via email rather than via bugzilla so the non-bugzilla-capable net developers get to see it, thanks ;))
Re: [Bugme-new] [Bug 6177] New: Java remote debugging is slow due to apparent networking bug
[EMAIL PROTECTED] wrote:

http://bugzilla.kernel.org/show_bug.cgi?id=6177

Summary: Java remote debugging is slow due to apparent networking bug
Kernel Version: 2.6.15
Status: NEW
Severity: normal
Owner: [EMAIL PROTECTED]
Submitter: [EMAIL PROTECTED]

Most recent kernel where this bug did not occur:
Distribution: Suse 10.0, Suse 10.1 Debian 3.1
Hardware Environment: ix86
Software Environment: 2.6.14

Problem Description: Sometime between 2.6.14 and 2.6.15 remote Java debugging has slowed to a crawl. Users have reported the problem on both Debian and Suse. Downgrading to 2.6.14 solves the problem. The problem occurs with IDEA IntelliJ, JBuilder, and even the Sun JDAPI examples. I've talked with many people about this and here is what is known so far.

http://www.jetbrains.net/jira/browse/IDEA-6540

The best quote to summarize that I know of is: tcpdump shows tons and tons of packets going back and forth, none of which individually look strange, but the fact that it took somewhere around the neighborhood of 2500 packets to open the key/value for a single Hash element was weird. Each packet has a very small payload of only a few bytes of information. (I will happily send the tcpdump dump files if anyone wants.)

I know that this bug report sucks because of the limited information, but it is a real issue and somewhat hard to provide a better test case. Eugene Zhuravlev [EMAIL PROTECTED] of IDEA (IntelliJ's publisher) has offered to help track this problem down.

Steps to reproduce: Make sure you are running 2.6.15 or higher (occurs in 2.6.16 pre as well), install Eclipse and start a remote debugging session. Downgrading to 2.6.14 will cause the app to run at its normal speed. It's probably easiest to install Tomcat and attach Eclipse to that.

Yes, if you can get the net guys a full tcpdump it would really help, thanks.

(Please respond via email rather than via bugzilla so the non-bugzilla-capable net developers get to see it, thanks ;))