Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF and network overload + [Feature] multiple requests per conn on interactive port
Committed to trunk for testing, r1946. On Tue, Jan 13, 2009 at 10:41, Spike Spiegel fsm...@gmail.com wrote: Hi, I wanted to add a feature to gmetad so that it was possible to request multiple items per connection on the interactive port, and while doing so I uncovered a buffer overflow that crashes the gmetad server. I'm releasing both patches together because they are somewhat connected. Apologies if this is a problem and for the length of my email, the intent is to make it easier for reviewers but it might trigger brain filters. == Multiple requests per connection on interactive port We wanted to leverage ganglia to develop complex checks that rely on correlating various metrics over several hosts. With a large tree the best way to achieve that is by using the interactive port to request those specific metrics, but this incurred in a large number of connections given the limit of one item per request. The proposed patch removes this limit. I've picked up ':' as a separator since it's the natural one under Unix and wasn't included in the string process_request() checks against to sanitize input. To limit the changeset I've reused the existing code and simply wrapped process_path in a do while loop that splits the request in chunks and pass them to the function. As a side effect this happens: if you request something like /grid1/host1/load_one:/grid1/host1/load_five you'll get two times the grid and host xml tags rather than one grid/host tag with the two metrics merged inside. While at a first glance this looks ugly there are imho a bunch of things to consider: 1) doing the merge server side is a lot of work and a big changeset afaics. 2) I would argue that it's actually correct to not merge because you've requested for /grid/host/metric so you effectively get what you asked for. 3) Most likely on the client the processing will be done with a XML library resulting in something similar to res[grid][host] = val , which would make the merge fairly easy (at least in higher level languages like python). As to filters support, it is potentially a problem for multiple items per request, but as it is I don't see any. Right now there's only support for a SUMMARY filter which will only be applied to root/grid/cluster nodes and output HOSTS UP=\%u\ DOWN=\%u\ SOURCE=\gmetad\/. Being that the case with a request like /grid1:/grid2?filter=summary each grid will get its own summary which is the expected behavior. Nonetheless I wanted to point it out because in case more filters are added this would need to be accounted for. == Gmetad server buffer overflow and network overload These problems exist regardless of the above feature but would get far worse in that case, hence being linked. === Buffer overflow It is possible to instantly crash gmetad by crafting a special request to be sent to the interactive port. In process_path() a char element[256] is allocated to contain the pieces of the path as it is processed. If a request is made with a path element longer than that the strncpy call will write to invalid memory location, since there is no length checking performed on the input data to make sure it is less than the size of element. Fix is provided by malloc'ing the right size to accommodate any element's length. REQUESTLEN provides protection against abusing malloc and launching a memory exhaustion attack. PoC: echo /`python -c \print \\%s/%s\\ % ('a'*300,'b'*300)\` | nc localhost 8652 === DoS attacks 1) Given REQUESTLEN=2048, and 3 characters to be the minimum to craft a valid and nonexistent path /x, with the above feature implemented it would be possible to trigger 2048/3 calls to process_path which would possibly lead to CPU overload. 2) extension to 1) - as it is ganglia returns the entire tree if an element is not found. with large trees 2048/3 requests could easily result in several GBs of data being transferred. Related to this if you look at gmetad/server.c lines 601:606 you'll see this: err_msg(Got a malformed path request from %s, remote_ip); /* Send them the entire tree to discourage attacks. */ strcpy(request, /); which leads to the same scenario as above. What I propose is that for both cases, malformed request and non existent items, we log an error and bail out. This would solve 2) and most of 1) making the call possibly exist far quicker. == Patches summary - server-c-bof-dos.diff , address buffer overflow and network DoS - server-c-multi-item-request.diff , add the multi item per request feature I have a version available that does not use strsep if you think that's preferable, but it's a bit more code and strsep shouldn't be a dependency problem even on not-so-new systems. regards, Spike -- Behind every great man there's a great backpack - B. --
Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port
Sounds good, can we condense the description below into something that we can put into the 3.1 STATUS file so that we can start the process of getting this backported into 3.1? Brad On 1/14/2009 at 10:00 AM, in message dbdc3b250901140900ya3fc858qf17854e0b3be5...@mail.gmail.com, Jesse Becker haw...@gmail.com wrote: Committed to trunk for testing, r1946. On Tue, Jan 13, 2009 at 10:41, Spike Spiegel fsm...@gmail.com wrote: Hi, I wanted to add a feature to gmetad so that it was possible to request multiple items per connection on the interactive port, and while doing so I uncovered a buffer overflow that crashes the gmetad server. I'm releasing both patches together because they are somewhat connected. Apologies if this is a problem and for the length of my email, the intent is to make it easier for reviewers but it might trigger brain filters. == Multiple requests per connection on interactive port We wanted to leverage ganglia to develop complex checks that rely on correlating various metrics over several hosts. With a large tree the best way to achieve that is by using the interactive port to request those specific metrics, but this incurred in a large number of connections given the limit of one item per request. The proposed patch removes this limit. I've picked up ':' as a separator since it's the natural one under Unix and wasn't included in the string process_request() checks against to sanitize input. To limit the changeset I've reused the existing code and simply wrapped process_path in a do while loop that splits the request in chunks and pass them to the function. As a side effect this happens: if you request something like /grid1/host1/load_one:/grid1/host1/load_five you'll get two times the grid and host xml tags rather than one grid/host tag with the two metrics merged inside. While at a first glance this looks ugly there are imho a bunch of things to consider: 1) doing the merge server side is a lot of work and a big changeset afaics. 2) I would argue that it's actually correct to not merge because you've requested for /grid/host/metric so you effectively get what you asked for. 3) Most likely on the client the processing will be done with a XML library resulting in something similar to res[grid][host] = val , which would make the merge fairly easy (at least in higher level languages like python). As to filters support, it is potentially a problem for multiple items per request, but as it is I don't see any. Right now there's only support for a SUMMARY filter which will only be applied to root/grid/cluster nodes and output HOSTS UP=\%u\ DOWN=\%u\ SOURCE=\gmetad\/. Being that the case with a request like /grid1:/grid2?filter=summary each grid will get its own summary which is the expected behavior. Nonetheless I wanted to point it out because in case more filters are added this would need to be accounted for. == Gmetad server buffer overflow and network overload These problems exist regardless of the above feature but would get far worse in that case, hence being linked. === Buffer overflow It is possible to instantly crash gmetad by crafting a special request to be sent to the interactive port. In process_path() a char element[256] is allocated to contain the pieces of the path as it is processed. If a request is made with a path element longer than that the strncpy call will write to invalid memory location, since there is no length checking performed on the input data to make sure it is less than the size of element. Fix is provided by malloc'ing the right size to accommodate any element's length. REQUESTLEN provides protection against abusing malloc and launching a memory exhaustion attack. PoC: echo /`python -c \print \\%s/%s\\ % ('a'*300,'b'*300)\` | nc localhost 8652 === DoS attacks 1) Given REQUESTLEN=2048, and 3 characters to be the minimum to craft a valid and nonexistent path /x, with the above feature implemented it would be possible to trigger 2048/3 calls to process_path which would possibly lead to CPU overload. 2) extension to 1) - as it is ganglia returns the entire tree if an element is not found. with large trees 2048/3 requests could easily result in several GBs of data being transferred. Related to this if you look at gmetad/server.c lines 601:606 you'll see this: err_msg(Got a malformed path request from %s, remote_ip); /* Send them the entire tree to discourage attacks. */ strcpy(request, /); which leads to the same scenario as above. What I propose is that for both cases, malformed request and non existent items, we log an error and bail out. This would solve 2) and most of 1) making the call possibly exist far quicker. == Patches summary - server-c-bof-dos.diff , address buffer overflow and network DoS - server-c-multi-item-request.diff , add the multi item per request
Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port
On Wed, Jan 14, 2009 at 07:00:29PM -0500, Jesse Becker wrote: On Wed, Jan 14, 2009 at 18:45, Kostas Georgiou k.georg...@imperial.ac.uk wrote: On Wed, Jan 14, 2009 at 05:10:31PM -0500, Jesse Becker wrote: I suggest that once this is accepted, we release 3.1.2 ASAP. Any possibility for 3.0.8 as well? I don't see why not. Is that an offer to help test the patch? ;-) I've build rpms for Fedora 9 and I'll be pushing them shortly after I've done some local testing. If anyone wants to have a look they are availbale in koji [1]. Kostas [1] https://koji.fedoraproject.org/koji/buildinfo?buildID=78558 -- This SF.net email is sponsored by: SourcForge Community SourceForge wants to tell your story. http://p.sf.net/sfu/sf-spreadtheword ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers