Hi Brian -

Thanks for the nice investigation!  Good work.

Curious - do you and/or Burt (aka CMS) have a goal in mind in terms of the number of jobs managed per machine? It is pretty cheap to max out the RAM in submit nodes, and even inexpensive 1U machines can hold 24GB, allowing them to host more than 25k shadows, right? I guess with a multi-shadow you could go north of 100k shadows on the same inexpensive server (assuming some other bottleneck doesn't prevent that)... but would you want to?

thanks
Todd

On 7/18/2012 4:38 PM, Brian Bockelman wrote:

On Jul 17, 2012, at 6:37 PM, Brian Bockelman wrote:

Hi,

When I last talked to Miron about multi-shadow, he suggested first wringing 
every last byte out of the current one before even proposing the multi-shadow.  
So, I spent about an hour with igprof and staring at smaps.

I measured a shadow as having 360KB of heap, about 550KB total unshared space, 
and 274KB of data live on the heap (so about 25% waste due to fragmentation).

Here's what I found that we could save.  List is in ascending order of 
difficulty to implement.
0) Turn off classad caching: 55KB.
1) Copy of job's classad inside the file transfer object: 8KB
2) gethostbyaddr -> gethostbyaddr_r (including all callsites, even in the 
logging code!  See ExecuteEvent::writeEvent): 5KB.
3) getpwnam, getpwuid to reentrant versions: 2KB
4) Remove stats object from DaemonCore for shadow: 7KB
5) libcondor_utils has 156KB of dirty writable memory (non-const statics?) that 
can't be shared: 100KB?  This part was not included in my heap calculations, 
but is indeed non-shared.
6) Cleanup of auth code to reduce heap fragmentation: 5-15KB
7) Un-loading the IpVerify table after usage: 9KB.
8) The configuration subsystem.  This would be one tough nugget to crack (note: 
would all be shared with the multi-shadow), but is very lightly used after the 
shadow fires up.  70KB.

Lessons learned:
- Classad caching does more harm than good for a single shadow (20% of heap)
- If we squeeze really hard at odds-n-ends in the heap, we can shrink the heap 
by 10%.  I don't think all the items listed above are plausible (especially 8).
- Non-const globals in libcondor_utils account for 25% of the total memory 
footprint.  There are 332 source files in libcondor_utils - whack-a-mole time?
  - Similarly, there are a few things sitting around in the other Condor 
libraries, but nothing as sizable.
- Obviously sharable resources for the multi-shadow (parameter subsystem, auth 
hash maps and tables, daemon core object) make up 50% of the heap.
- It's not immediately obvious how much the ClassAd cache will affect the 
multi-shadow, but I would expect a bit of sharing.  Let's estimate 50% of the 
current cache is sharable, or 10% of the total heap.

So, we can squeeze about 15% of the shadow size by continuing to shave things 
and turning off caching.

Assuming 10 jobs per shadow, we could realize a 60% memory savings.

Both numbers become more dramatic if we can figure out who's hanging out in the 
data segment.

Brian

PS - all numbers have been rounded and self-consistency is limited to my 
ability to do mental math.

PPS - after 5 minutes with 'nm', it appears the data segment consists primarily 
of the parameter table.  DOH!

_______________________________________________
Condor-devel mailing list
Condor-devel@cs.wisc.edu
https://lists.cs.wisc.edu/mailman/listinfo/condor-devel


Hi all,

Updated estimates after playing with the shadow for more than an hour:
- All the linked libraries add up more drastically than I originally 
thought (I had ignored all the small ones and only looked at the big ones in my 
first estimates).  Adding in all my optimizations, there are 888KB of 
Private_Dirty memory.  I squeezed libcondor_utils's data segment down from 
156KB to 96KB.  The savings are more dramatic on more recent versions 
of GCC (where LTO is available).  There's not much more left to squeeze out due 
to vtables, the GOT, and the PLT.
- There's probably 150KB of unique data per shadow, once you subtract out the things I 
mention above.  Hence, 738KB can be thought of as "overhead".  So, running 10 
jobs per shadow would result in a 75% memory savings.

All in all, this indicates that the multi-shadow would be beneficial and that 
"shadow squeezing" will continue to have diminishing returns.

Brian




--
Todd Tannenbaum <tanne...@cs.wisc.edu> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
Condor Project Technical Lead          1210 W. Dayton St. Rm #4257
Phone: (608) 263-7132                  Madison, WI 53706-1685


