On 01/23/13 09:46 AM, Philip Brown wrote:
On 01/18/13 05:54 PM, Bart Smaalders wrote:
There's a slight difference, but nothing substantial. Ops Center must
be doing something silly.

Is the Intel kit slow under ops center as well, or just the SPARC?



Unfortunately, I had some issues getting our x86 machines to use opscenter this week, even though they were working previously.


I've finally been able to get back to investigating the problems we've been having. Turns out, it's not SSL... because opscenter doesn't use SSL for the actual package transfers.
The HUGE slowness problem we saw seems to be a combination of:
1. something weird opscenter does
2. something weird IPS does
3. a misconfiguration that leaked in from somewhere.

The good news is, I can now positively identify ALL of the above. So I'm posting a summary of findings to the list.

Background: opscenter is a distributed control system, with a "master" controller and assorted proxies to distribute load. Solaris 11 installation is supposed to be handed off to a "proxy controller".

It turns out that the proxy controller, for purposes of IPS installs, is a literal apache proxy, with a confusing multi-level httpd configuration. It's supposed to be a caching proxy, so I think it serves out the packages from cache after the initial load.

I explored the opscenter http configs, and found this shocking comment:

# The pkg client opens 20 parallel connections to the server when performing
# network operations.

This turned out to be the key. The MaxClients knob had gotten set too low, so the pkg client was starved for working connections.
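
To make the sizing arithmetic concrete, here is a rough sketch, assuming an Apache 2.2 style prefork configuration where MaxClients caps the number of simultaneous connections. The values and the client count are made up for illustration; this is not the actual opscenter configuration.

```apache
# Illustrative values only, NOT the real opscenter settings.
# Each pkg client may open up to 20 connections (per the comment quoted above),
# so N clients installing at once want roughly 20 * N slots.
# Hypothetical example: 8 simultaneous clients -> 8 * 20 = 160 connections.
ServerLimit 160
MaxClients  160
```

With a limit below 20, even a single client cannot get all the connections it asks for, which would match the starvation described above.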

Okay, this fixes my immediate problem. But what does that say about IPS? Seems like there are multiple problems there.

First of all: It shouldn't degrade to glacial speed when it can't open all 20 connections!!!

Secondly... why is it being so obnoxious about so many connections? I decided to put it to the test.

I created 20 files of 100 MB each, and downloaded them, first with "wget file1 file2 file3..." and then
"wget file1 & wget file2 & ..."
When dumping the data to /dev/null, I was surprised to find that 20 in parallel was actually faster: about 30 seconds vs 34 seconds, usually.

However: IPS doesn't dump downloaded packages to /dev/null. So time for some more realistic tests!

When I set my tests to save the files to a ZFS filesystem (with atime=off), I found that the transfer times were much more variable, and generally speaking, there was no significant difference between the two methods. They all took around 1 minute, or 1:30 on hardware with slower disks. Results tended to be within 1 second of each other.
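
For anyone who wants to repeat the comparison, the two runs above can be sketched roughly as follows. This is a minimal sketch, not the exact script used: the function names, the FETCH variable, the URLs and the /tank/dltest path are all hypothetical, and it assumes a `date` that supports `+%s`.

```shell
#!/bin/sh
# Hypothetical helpers for timing serial vs. parallel downloads.
# FETCH is the download command; the default discards the data.
FETCH=${FETCH:-"wget -q -O /dev/null"}

# One fetch process given every URL, like "wget file1 file2 file3...".
# $FETCH is left unquoted on purpose so its options split into words.
fetch_serial() {
    $FETCH "$@"
}

# One background fetch per URL, like "wget file1 & wget file2 & ...",
# then wait for all of them to finish.
fetch_parallel() {
    for url in "$@"; do
        $FETCH "$url" &
    done
    wait
}

# Run a command and print its wall-clock time in whole seconds.
elapsed() {
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo $((end - start))
}

# Example usage (the URLs are placeholders):
#   elapsed fetch_serial   http://server/file1 http://server/file2
#   elapsed fetch_parallel http://server/file1 http://server/file2
# To write to disk instead of /dev/null, set for example:
#   FETCH="wget -q -P /tank/dltest"
```

Whole-second resolution is coarse, but the differences reported above are on the order of seconds, so it should be enough to see whether the two methods diverge.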

So, I would suggest that IPS be fixed to use fewer connections. It's currently being obnoxious to any http front end, and for no significant benefit.
At the very minimum, it needs to handle connection starvation better.


_______________________________________________
pkg-discuss mailing list
pkg-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
