Hi Arnold,

[email protected] wrote:
> I'm seeing things working with HTTP urls and failing with git:// URLs.
> Not just the gawk repo. It's around 5:40 a.m. east coast time as I
> write this.

Thank you for the updated report. To help explain things, let me ramble for a moment.

Imagine a Venn diagram with several overlapping circles, each drawn around a different problem area.

One is networking. We know there have been network problems. Some of those (dropped packets logged at the managed switches) seem to have been caused by the kernel change. However, even though reverting the kernel fixed those, reports of global network connectivity problems continue to come in. So while the kernel may have been one problem, it was not all of the problems.

Other network changes were implemented as well. I only have a small peephole into what changed there, but it was a large change.

The file systems were repackaged into different volumes as part of the hardware move.

A long-term problem is that the git daemons sometimes get stuck for unknown reasons. This has seemingly been exacerbated by the current network connectivity problems. When the stuck daemons consume all available slots, git:// stops working due to what *I* think is a bug in xinetd: it simply closes connections instead of queuing them.

This is why git:// may stop while ssh:// and http:// continue working. git:// uses the git daemon and is also the most popular protocol. ssh:// uses sshd and http:// uses apache. So the alternate protocols use different daemons, each of which has its own slot cap. Far fewer people access source through ssh:// and http:// than through git://. The many using git:// may overwhelm the system, and the stuck git daemon problem makes that worse. That is why git:// sees the problem the most. However, it is normally the fastest protocol to use, and most people will want to continue using it.

The immediate need is to solve the recent network connectivity problems. These are intermittent. As you say, you were able to access the machine. But at the same time others report failures.

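To make the slot exhaustion described above a bit more concrete, here is a toy model in Python. It is only a sketch: the slot caps, the request counts, and the rate at which processes get stuck are made-up numbers, and the Service class is a hypothetical stand-in, not how xinetd, sshd, or apache is actually implemented. It only shows why a service that leaks slots to stuck processes ends up refusing everything, while a service with its own slot pool is unaffected.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Toy model of a daemon with a fixed cap on concurrent connections."""
    name: str
    slot_cap: int   # hypothetical cap, not a real configured value
    busy: int = 0   # slots currently held, including stuck processes

    def connect(self) -> bool:
        """Take a slot if one is free; otherwise drop the connection.

        Dropping instead of queuing models the suspected behavior.
        """
        if self.busy >= self.slot_cap:
            return False
        self.busy += 1
        return True

    def finish(self, stuck: bool = False) -> None:
        """Release the slot when a request ends, unless the process is stuck."""
        if not stuck and self.busy > 0:
            self.busy -= 1


def simulate(service: Service, requests: int, stuck_every: int) -> int:
    """Run sequential requests and return how many were refused.

    Every `stuck_every`-th accepted request never releases its slot.
    A value of 0 means no request ever gets stuck.
    """
    refused = 0
    accepted = 0
    for _ in range(requests):
        if not service.connect():
            refused += 1
            continue
        accepted += 1
        is_stuck = stuck_every > 0 and accepted % stuck_every == 0
        service.finish(stuck=is_stuck)
    return refused


if __name__ == "__main__":
    git = Service("git://", slot_cap=4)
    http = Service("http://", slot_cap=4)
    print("git:// refused:", simulate(git, requests=100, stuck_every=10))
    print("http:// refused:", simulate(http, requests=100, stuck_every=0))
```

In this model the git:// service keeps answering until its last free slot is taken by a stuck process, and from then on every new connection is dropped. The http:// service has separate slots and no stuck processes, so it keeps working the whole time.
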
Then, longer term, we need to upgrade all of the software onto a new system, and that should hopefully solve some of these other problems.

Hope this helps!
Bob
