Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-24 Thread Henrik Nordstrom
On Wed, 2008-09-24 at 14:14 +1200, Martin Langhoff wrote:

 Good hint, thanks! If we did have such a control, what is the wired
 memory that squid will use for each entry? In an email earlier I
 wrote...

For on-disk objects, about 100 bytes.

In-memory objects obviously use a lot more. Probably something like 1KB
+ the object size rounded up to 4KB pages.

Also disable the client db unless you need to use the maxconn acl

  client_db off

And don't configure with too many filedescriptors. The default 1024 is
probably reasonable for the environment. (Note: configure flag in
squid-3, squid.conf option in 2.7)
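
Putting that together, a minimal low-memory fragment for a 2.7
squid.conf might look like the following (values are illustrative
assumptions, not tested recommendations):

  # skip per-client accounting unless you need maxconn ACLs
  client_db off
  # 2.7 only; in squid-3 the limit is set by a configure flag instead
  max_filedescriptors 1024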

Regards
Henrik




Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-24 Thread Amos Jeffries

Henrik Nordstrom wrote:

On Wed, 2008-09-24 at 10:18 +0200, Henrik Nordstrom wrote:


For on-disk objects, about 100 bytes.

In-memory objects obviously use a lot more. Probably something like 1KB
+ the object size rounded up to 4KB pages.

Also disable the client db unless you need to use the maxconn acl

  client_db off

And don't configure with too many filedescriptors. The default 1024 is
probably reasonable for the environment. (Note: configure flag in
squid-3, squid.conf option in 2.7)


A quick inspection using nm also reveals that there are some data
elements which can easily be trimmed down:

size  type name
131072 b queried_keys
131072 b queried_addr
131072 b queried_keys
262144 B server_pconn_hist
262144 B client_pconn_hist

The first three are used by ICP and can be ripped out if you do not
need to support ICP. But if you need ICP then they are needed (though
they can be shrunk down a bit by limiting the ICP id range).

The last two histograms are purely informational statistics. It should
be fine to set PCONN_HIST_SZ to something much smaller such as 64, or
to disable this part of the code entirely, as it is not used for
anything other than the statistical information available via cachemgr.

There is probably a lot more junk being allocated at runtime which can
be trimmed, especially if you build with SSL support. But the fd_table
seems to be the only big one, and it cannot be significantly trimmed by
any means other than limiting the number of concurrent connections
(filedescriptors).

Regards
Henrik


I've had a suspicion since I first heard Squid was in use for OLPC that
we would soon need to provide configure options to remove features
such as ICP, which they and perhaps others in the *WRT mini-device space
don't need.


On that train of thought, --disable-htcp and --disable-wccp may be
useful if you don't use those features.
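
Something like this at build time, assuming a squid-3 source tree (flag
availability varies between Squid versions):

  ./configure --disable-htcp --disable-wccp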


There's a small project for someone: adding default-enabled configure
macros back into squid ;-)


Maybe an overall option like --disable-peering, to wholesale drop
cache_peer and all its related features for a slimline stand-alone
Squid. There's a good (estimated) 15% of the app footprint gone.


One feature I'm uncertain of utility-wise is the netdb cache
(--enable-icmp). It may be beneficial for schools on flaky links which
need to ensure fast retrieval of data from a set of stable peers before
the link dies. Though that does add more memory for the NetDB itself
and some baseline ICMP load to the link.


Amos
--
Please use Squid 2.7.STABLE4 or 3.0.STABLE9


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Adrian Chadd
2008/9/23 Martin Langhoff [EMAIL PROTECTED]:

 Any way we can kludge our way around it for the time being? Does squid
 take any signal that gets it to shed its index?

It'd be pretty trivial to write a few cachemgr hooks to implement that
kind of behaviour. 'flush memory cache', 'flush disk cache entirely',
etc.

The trouble is that the index is -required- at the moment for the disk
cache. If you flush the index, you flush the disk cache entirely.

 There's no hard limit for Squid, and Squid (any version) handles
 memory allocation failures very, very poorly (read: it crashes).

 Is it relatively sane to run it with a tight rlimit and restart it
 often? Or just monitor it and restart it?

It probably won't like that very much if you decide to also use disk caching.

 You can limit the amount of cache_mem which limits the memory cache
 size; you could probably modify the squid codebase to start purging
 objects at a certain object count rather than based on the disk+memory
 storage size. That wouldn't be difficult.

 Any chance of having patches that do this?

I could probably do that in a week or so once I've finished my upcoming travel.
Someone could try beating me to it..


 The big problem: you won't get Squid down to 24MB of RAM with the
 current tuning parameters. Well, I couldn't; and I'm playing around

 Hmmm...

 with Squid on OLPC-like hardware (SBC with 500MHz Geode, 256/512MB
 RAM.) It's something which will require quite a bit of development to
 slim some of the internals down to scale better with restricted
 memory footprints. It's on my personal TODO list (as it mostly is in
 line with a bunch of performance work I'm slowly working towards) but
 as the bulk of that is happening in my spare time, I do not have a
 fixed timeframe at the moment.

 Thanks for that -- at whatever pace, progress is progress. I'll stay
 tuned. I'm not on squid-devel, but generally interested in any news on
 this track; I'll be thankful if you CC me or rope me into relevant
 threads.

Ok.

 Is there interest within the squid dev team in moving towards a memory
 allocation model that is more tunable and/or relies more on the
 abilities of modern kernels to do memory mgmt? Or an alternative
 approach to handle scalability (both down to small devices and up to
 huge kit) more dynamically and predictably?

You'll generally find the squid dev team happy to move in whatever
directions make sense. The problem isn't direction so much as the
coding to make it happen. Making Squid operate well in small memory
footprints turns out to be quite relevant to higher performance and
scalability; the problem is in the doing.

I'm hoping to start work on some stuff to reduce the memory footprint
in my squid-2 branch (cacheboy) once the current round of IPv6
preparation is completed and stable. The developers working on Squid-3
are talking about similar stuff.


Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Henrik Nordstrom
On Tue, 2008-09-23 at 14:57 +0800, Adrian Chadd wrote:

  You can limit the amount of cache_mem which limits the memory cache
  size; you could probably modify the squid codebase to start purging
  objects at a certain object count rather than based on the disk+memory
  storage size. That wouldn't be difficult.
 
  Any chance of having patches that do this?
 
 I could probably do that in a week or so once I've finished my upcoming 
 travel.
 Someone could try beating me to it..

The relevant code location, if you want to take a stab at it yourself,
is the maintain function in each cache_dir type
(src/fs/*/store_dir_...)

Should be trivial to add a cache_dir parameter specifying the max number
of files in this cache_dir, and use this in the maintenance function.
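
As a sketch of how such a knob might look from squid.conf once
implemented (max-objects is a hypothetical option name; it does not
exist today):

  # hypothetical: cap this cache_dir at 50000 objects
  cache_dir aufs /var/spool/squid 2048 16 256 max-objects=50000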

Regards
Henrik




Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Martin Langhoff
On Wed, Sep 24, 2008 at 12:08 AM, Henrik Nordstrom
[EMAIL PROTECTED] wrote:
 On Tue, 2008-09-23 at 14:57 +0800, Adrian Chadd wrote:
 I could probably do that in a week or so once I've finished my upcoming 
 travel.
 Someone could try beating me to it..

 The relevant code location, if you want to take a stab at it yourself,
 is the maintain function in each cache_dir type
 (src/fs/*/store_dir_...)

 Should be trivial to add a cache_dir parameter specifying the max number
 of files in this cache_dir, and use this in the maintenance function.

Good hint, thanks! If we did have such a control, what is the wired
memory that squid will use for each entry? In an email earlier I
wrote...

 - Each index entry takes between 56 bytes and 88 bytes, plus
additional, unspecified overhead. Is 1KB per entry a reasonable
conservative estimate?

 - Discussions about compressing or hashing the URL in the index are
recurrent - is the uncompressed URL there? That means up to 4KB per
index entry?

The notes I read about the index structure were rather old...




m
-- 
 [EMAIL PROTECTED]
 [EMAIL PROTECTED] -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-23 Thread Adrian Chadd
2008/9/24 Martin Langhoff [EMAIL PROTECTED]:

 Good hint, thanks! If we did have such a control, what is the wired
 memory that squid will use for each entry? In an email earlier I
 wrote...

sizeof(StoreEntry) per index entry, basically.


  - Each index entry takes between 56 bytes and 88 bytes, plus
 additional, unspecified overhead. Is 1KB per entry a reasonable
 conservative estimate?

1KB per entry is pretty conservative. The per-object overhead includes
the StoreEntry, the couple of structures for the memory/disk
replacement policies, plus the MD5 URL hash for the index, and whatever
other stuff hangs off MemObject for in-memory objects.

You'll find that the RAM requirements grow a bit more for things like
in-memory cache objects, as the full reply headers stay in memory and
are copied whenever anyone requests the object.

  - Discussions about compressing or hashing the URL in the index are
 recurrent - is the uncompressed URL there? That means up to 4KB per
 index entry?

The uncompressed URL and headers are in memory during:

* request/reply handling
* the lifetime of in-memory objects (those with a MemObject allocated);
on-disk entries just have the MD5 URL hash per StoreEntry.
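
As a rough worked example of what the index costs (assuming ~100 bytes
of overhead per on-disk object and Squid's traditional 13KB mean object
size):

  2GB cache_dir / 13KB per object  = ~160,000 objects
  160,000 objects x ~100 bytes     = ~16MB of wired index memory

So the index for even a modest cache_dir already eats a large share of
a 24MB budget.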

HTH,

Oh, and I'll be in the US from October for a few months; I can always
do a side-trip out to see you guys if there's enough interest.


Adrian


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Adrian Chadd
G'day,

I've looked into this a bit (and have a couple of OLPC laptops to do
testing with) and .. well, it's going to take a bit of effort to make
squid fit.

There's no hard limit for Squid, and Squid (any version) handles
memory allocation failures very, very poorly (read: it crashes).

You can limit the amount of cache_mem which limits the memory cache
size; you could probably modify the squid codebase to start purging
objects at a certain object count rather than based on the disk+memory
storage size. That wouldn't be difficult.
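
Until such a patch exists, the knobs that are there only bound the
memory cache, not the index; an illustrative fragment (values are
assumptions for a small box, not tested recommendations):

  # keep the memory cache tiny; favour small objects in it
  cache_mem 8 MB
  maximum_object_size_in_memory 32 KB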

The big problem: you won't get Squid down to 24MB of RAM with the
current tuning parameters. Well, I couldn't; and I'm playing around
with Squid on OLPC-like hardware (SBC with 500MHz Geode, 256/512MB
RAM.) It's something which will require quite a bit of development to
slim some of the internals down to scale better with restricted
memory footprints. It's on my personal TODO list (as it mostly is in
line with a bunch of performance work I'm slowly working towards) but
as the bulk of that is happening in my spare time, I do not have a
fixed timeframe at the moment.


Adrian


2008/9/23 Martin Langhoff [EMAIL PROTECTED]:
 Hi!

 I am working on the School Server (aka XS: a Fedora 9 spin, tailored
 to run on fairly limited hw), and I'm preparing the configuration
 settings for it. It's a somewhat new area for me -- I've set up Squid
 before on mid-range hardware... but this is... different.

 So I'm interested in understanding more about the variables affecting
 memory footprint and how I can set a _hard limit_ on the wired memory
 that squid allocates.

 In brief:

  - The workload is relatively light - 3K clients is the upper bound.

  - The XS will (in some locations) be hooked to *very* unreliable
 power... uncontrolled shutdowns are the norm. Is this ever a problem with 
 Squid?

  - After a bad shutdown, graceful recovery is the most important
 aspect. If a few cached items are lost, we can cope...

  - The XS hardware runs many services (mostly web-based), so Squid gets
 only a limited slice of memory. To make matters worse, I *really*
 don't want the core working set (Squid, Pg, Apache/PHP) to get paged
 out. So I am interested in pegging the max memory Squid will take to itself.

  - The XS hw is varied. In small schools it may have 256MB RAM (likely
 to be running on XO hardware + usb-connected ext hard-drive).
 Medium-to-large schools will have the recommended 1GB RAM and a cheap
 SATA disk. A few very large schools will be graced with more RAM (2 or
 4GB).

 .. so RAM allocation for Squid will prob range between 24MB at the
 lower-end and 96MB at the 1GB recommended RAM.

 My main question is: how would you tune Squid 3 so that

  - it does not directly allocate more than 24MB / 96MB? (Assume that
 the Linux kernel will be smart about mmapped stuff, and aggressive
 about caching -- I am talking about the memory Squid will claim to
 itself).

  - still gives us good throughput? :-)



 So far Google has turned up very little info, and it seems to be
 rather old. What I've found can be summarised as follows:

  - The index is malloc'd, so the number of entries in the index will
 be the dominant concern WRT memory footprint.

  - Each index entry takes between 56 bytes and 88 bytes, plus
 additional, unspecified overhead. Is 1KB per entry a reasonable
 conservative estimate?

  - Discussions about compressing or hashing the URL in the index are
 recurrent - is the uncompressed URL there? That means up to 4KB per
 index entry?

  - The index does not seem to be mmappable or otherwise

 We can rely on the (modern) Linux kernel doing a fantastic job at
 caching disk IO and shedding those cached entries when under memory
 pressure, so I am likely to set Squid's own cache to something really
 small. Everything I read points to the index being my main concern -
 is there a way to limit (a) the total memory the index is allowed to
 take or (b) the number of index entries allowed?

 Does the above make sense in general? Or am I barking up the wrong tree?


 cheers,



 martin
 --
  [EMAIL PROTECTED]
  [EMAIL PROTECTED] -- School Server Architect
  - ask interesting questions
  - don't get distracted with shiny stuff - working code first
  - http://wiki.laptop.org/go/User:Martinlanghoff




Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Martin Langhoff
On Tue, Sep 23, 2008 at 3:09 PM, Adrian Chadd [EMAIL PROTECTED] wrote:
 I've looked into this a bit (and have a couple of OLPC laptops to do
 testing with) and .. well, it's going to take a bit of effort to make
 squid fit.

Any way we can kludge our way around it for the time being? Does squid
take any signal that gets it to shed its index?

 There's no hard limit for Squid, and Squid (any version) handles
 memory allocation failures very, very poorly (read: it crashes).

Is it relatively sane to run it with a tight rlimit and restart it
often? Or just monitor it and restart it?

 You can limit the amount of cache_mem which limits the memory cache
 size; you could probably modify the squid codebase to start purging
 objects at a certain object count rather than based on the disk+memory
 storage size. That wouldn't be difficult.

Any chance of having patches that do this?

 The big problem: you won't get Squid down to 24MB of RAM with the
 current tuning parameters. Well, I couldn't; and I'm playing around

Hmmm...

 with Squid on OLPC-like hardware (SBC with 500MHz Geode, 256/512MB
 RAM.) It's something which will require quite a bit of development to
 slim some of the internals down to scale better with restricted
 memory footprints. It's on my personal TODO list (as it mostly is in
 line with a bunch of performance work I'm slowly working towards) but
 as the bulk of that is happening in my spare time, I do not have a
 fixed timeframe at the moment.

Thanks for that -- at whatever pace, progress is progress. I'll stay
tuned. I'm not on squid-devel, but generally interested in any news on
this track; I'll be thankful if you CC me or rope me into relevant
threads.

Is there interest within the squid dev team in moving towards a memory
allocation model that is more tunable and/or relies more on the
abilities of modern kernels to do memory mgmt? Or an alternative
approach to handle scalability (both down to small devices and up to
huge kit) more dynamically and predictably?

cheers,



m
-- 
 [EMAIL PROTECTED]
 [EMAIL PROTECTED] -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Mark Nottingham


On 23/09/2008, at 1:42 PM, Martin Langhoff wrote:



There's no hard limit for Squid, and Squid (any version) handles
memory allocation failures very, very poorly (read: it crashes).


Is it relatively sane to run it with a tight rlimit and restart it
often? Or just monitor it and restart it?


That's about the worst thing you can do; it will go down hard if it  
hits that limit.


It's not that Squid's memory use will necessarily increase over time
(at least, when cache_mem is full); rather, it's that in-transit
objects and internal accounting use memory on top of cache_mem. As
such, intense traffic (e.g., lots of simultaneous connections) will
cause more memory use. Likewise, if you use disk caching, there's a
certain amount of overhead (I believe about 10MB of memory per GB of
disk).


FWIW, one of my standard squid packages uses 48MB of cache_mem, and I
advise people that it shouldn't use more than 96MB of memory (and that
rarely). However, that's predicated on a small number of local users
and no disk caching; if you have more users and connections are
long-lived (which I'd imagine they will be in an OLPC deployment),
there may be more overhead.
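
Translating those rules of thumb into the XS budgets mentioned earlier
(all figures rough, and the 10MB-per-GB estimate is itself approximate):

  96MB target: cache_mem 32 MB + ~4GB cache_dir (~40MB index) + headroom
  24MB target: cache_mem 8 MB  + ~1GB cache_dir (~10MB index) + headroom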



- The XS will (in some locations) be hooked to *very* unreliable
power... uncontrolled shutdowns are the norm. Is this ever a problem  
with Squid?


- After a bad shutdown, graceful recovery is the most important
aspect. If a few cached items are lost, we can cope...


Squid will handle being taken down roughly OK; at worst, the  
swap.state may get corrupted, which means it'll have to rebuild it  
next time around.



Overall, what do you want to use Squid for here; caching, access  
control..? If you want caching, realise that you're not going to see  
much benefit from such a resource-limited box, and indeed it may be  
more of a bottleneck than is worthwhile.


Cheers,




--
Mark Nottingham   [EMAIL PROTECTED]




Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Martin Langhoff
On Tue, Sep 23, 2008 at 4:12 PM, Mark Nottingham [EMAIL PROTECTED] wrote:
 Overall, what do you want to use Squid for here; caching, access control..?

Caching, and plugins such as squidGuard (does that qualify as access control?)

 If you want caching, realise that you're not going to see much benefit from
 such a resource-limited box, and indeed it may be more of a bottleneck than
 is worthwhile.

well, we are very constrained RAM-wise, but we have a reasonable hard
drive quota *and* a horrible internet connection. Picture 200 kids
behind a DSL line, 50 kids behind a satellite link or 3G modem.

Granted, YouTube isn't going to work, but well-behaved cacheable
content (in HTTP terms) can work well with a good proxy.
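
FWIW, what "well-behaved" means in practice is governed by the
refresh_pattern rules; the stock defaults are roughly (shown here as a
reminder, not a tuning suggestion):

  refresh_pattern ^ftp:  1440  20%  10080
  refresh_pattern .      0     20%  4320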

cheers,



martin
-- 
 [EMAIL PROTECTED]
 [EMAIL PROTECTED] -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


Re: [Server-devel] Squid tuning recommendations for OLPC School Server tuning...

2008-09-22 Thread Mark Nottingham


On 23/09/2008, at 2:40 PM, Martin Langhoff wrote:

On Tue, Sep 23, 2008 at 4:12 PM, Mark Nottingham [EMAIL PROTECTED]
wrote:
Overall, what do you want to use Squid for here; caching, access  
control..?


Caching, and plugins such as squidGuard (does that qualify as access
control?)


If you want caching, realise that you're not going to see much  
benefit from
such a resource-limited box, and indeed it may be more of a  
bottleneck than

is worthwhile.


well, we are very constrained RAM-wise, but we have a reasonable hard
drive quota *and* a horrible internet connection. Picture 200 kids
behind a DSL line, 50 kids behind a satellite link or 3G modem.


Hmm, way back when I administered a Squid with about 200 investment  
bankers behind an ISDN line... sounds familiar :)


The main problem is going to be the RAM; having it for cache removes a  
lot of the pressure on your disk, and most ISP caches with this kind  
of workload get IO bound. The per-connection overhead isn't trivial,  
either, when you've got a slow link, as users will tend to be  
impatient and retry, surf with multiple windows, etc.


BTW, completely out of left field, have you seen this?
  http://www.eecis.udel.edu/~nataraja/papers/nsdr2008.pdf

There is a group of folks talking about adding SCTP support to squid
(there are already patches for apache and firefox); with that, you
could use SCTP over the lossy/low-bandwidth hops to improve
performance. I think in your case it would require a server on the
other end of the link -- e.g., maybe a farm of HTTP/SCTP to HTTP/TCP
gateways -- but it may be worth considering...


Cheers,





Granted, YouTube isn't going to work, but well-behaved cacheable
content (in HTTP terms) can work well with a good proxy.

cheers,



martin
--
[EMAIL PROTECTED]
[EMAIL PROTECTED] -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff


--
Mark Nottingham   [EMAIL PROTECTED]