Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-15 Thread Spike Spiegel
On Fri, Jan 16, 2009 at 7:04 AM, Kostas Georgiou
 wrote:
> On Thu, Jan 15, 2009 at 01:41:53PM -0700, Brad Nicholes wrote:
>
>> >>> On 1/15/2009 at 8:56 AM, in message
>> <496efa2a02ac0003a...@lucius.provo.novell.com>, "Brad Nicholes"
>>  wrote:
>>
>> After taking a little closer look at the patch, I think we are OK as
>> far as the recursive call to process_path() is concerned since this
>> case is an error condition and should stop processing rather than
>> continuing in the recursive loop.

indeed, this should work just fine.

>>  The other two concerns are still
>> there however.  I still think that we are off-by-one in the malloc
>> call.  It should be len+1 and I still think that we should limit the
>> malloc to 256 rather than allowing it to be unlimited.
>
> I agree about the off-by-one

argh, my bad sorry, double dumb since I even considered the case.
len+1 it is and the comment should go, thanks.

> but I am not too worried about a malloc
> limit, from what I can tell it can only get as high as REQUESTLEN.

I agree with Kostas, as I wrote in my initial email I didn't worry
about that because of the REQUESTLEN boundary which is enforced in
readline.

as to limiting the path to 256 I actually did that in my first
implementation, but eventually converted to a malloc solution because
I was reminded that "640 KB ought to be enough for everybody" and I
could see no downsides.

>
> The malloc call needs to be checked for NULL and the comment that
> "The recursive structure doesn't require any memory allocations" is
> false now if malloc replaces the stack allocation.

correct

thanks everybody

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-15 Thread Kostas Georgiou
On Thu, Jan 15, 2009 at 01:41:53PM -0700, Brad Nicholes wrote:

> >>> On 1/15/2009 at 8:56 AM, in message
> <496efa2a02ac0003a...@lucius.provo.novell.com>, "Brad Nicholes"
>  wrote:
> 
> After taking a little closer look at the patch, I think we are OK as
> far as the recursive call to process_path() is concerned since this
> case is an error condition and should stop processing rather than
> continuing in the recursive loop.  The other two concerns are still
> there however.  I still think that we are off-by-one in the malloc
> call.  It should be len+1 and I still think that we should limit the
> malloc to 256 rather than allowing it to be unlimited. 

I agree about the off-by-one but I am not too worried about a malloc
limit, from what I can tell it can only get as high as REQUESTLEN.

The malloc call needs to be checked for NULL and the comment that
"The recursive structure doesn't require any memory allocations" is
false now if malloc replaces the stack allocation.

Kostas

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-15 Thread Brad Nicholes
>>> On 1/15/2009 at 8:56 AM, in message
<496efa2a02ac0003a...@lucius.provo.novell.com>, "Brad Nicholes"
 wrote:
 On 1/14/2009 at 3:10 PM, in message
> , "Jesse Becker"
>  wrote:
>> On Wed, Jan 14, 2009 at 12:56, Brad Nicholes  wrote:
>>>   Sounds good, can we condense the description below into something that we 
>> can put into the 3.1 STATUS file so that we can start the process of getting 
> 
>> this backported into 3.1?
>> 
>> Done.  Committed as r1947.
>> 
>> I suggest that once this is accepted, we release 3.1.2 ASAP.
>> 
>> Also, a big thanks to Spike for finding and tracking this problem down.
> 
> 
> Spike,
> Correct me if I am wrong, but I think we still have a one-off problem 
> with the patch.
> 
>  len = q-p;
>  /* +1 not needed as q-p is already accounting for that */
>  element = malloc(len);
>  strncpy(element, p, len);
>  element[len] = '\0';
> 
> 
> In the above code suppose "len" is 10 so we malloc() 10 bytes which means 
> that the array indexing for element is 0-9.  Then we strncpy() 10 bytes into 
> "element" which is OK.  However setting element[len] = "\0", IOW element[10] 
> = "\0" is indexing into the element array 1 byte beyond the actual length of 
> element.  So either the code needs to be element = malloc(len+1) or 
> element[len-1] = "\0" correct?
> 
> I would suggest that maybe rather than allowing the code to malloc() any 
> size buffer to match the incoming element length that instead or in addition 
> we limit the element length as well.  Allowing the code to malloc() an 
> unlimited buffer can also lead to security issues.  Since the original code 
> defined a 256 byte buffer, 256 seems to be a good limit for elements of the 
> path.  It might be a better idea to simply add the limit checking rather than 
> doing a lot of malloc and free calls.  If an element is larger than the 
> limit, then produce a mal-formed error message and bail out.
> 
> Also I don't think that the current patch will work without the multi-path 
> patch that was also submitted, without being reworked.  The current patch 
> removes the recursive call to process_path() which means that without the for 
> loop that was provided in the multi-path patch, only one element of the path 
> will be processed and the rest will be ignored.  The current patch probably 
> needs to be reworked and separated from the multi-path patch. 
> 

After taking a little closer look at the patch, I think we are OK as far as the 
recursive call to process_path() is concerned since this case is an error 
condition and should stop processing rather than continuing in the recursive 
loop.  The other two concerns are still there however.  I still think that we 
are off-by-one in the malloc call.  It should be len+1 and I still think that 
we should limit the malloc to 256 rather than allowing it to be unlimited. 

Brad


--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-15 Thread Jesse Becker
On Thu, Jan 15, 2009 at 10:56, Brad Nicholes  wrote:
 On 1/14/2009 at 3:10 PM, in message
> , "Jesse Becker"
>  wrote:
>> On Wed, Jan 14, 2009 at 12:56, Brad Nicholes  wrote:
>>>   Sounds good, can we condense the description below into something that we
>> can put into the 3.1 STATUS file so that we can start the process of getting
>> this backported into 3.1?
>>
>> Done.  Committed as r1947.
>>
>> I suggest that once this is accepted, we release 3.1.2 ASAP.
>>
>> Also, a big thanks to Spike for finding and tracking this problem down.
>
>
> Spike,
>Correct me if I am wrong, but I think we still have a one-off problem with 
> the patch.
>
> len = q-p;
> /* +1 not needed as q-p is already accounting for that */
> element = malloc(len);
> strncpy(element, p, len);
> element[len] = '\0';
>
>
> In the above code suppose "len" is 10 so we malloc() 10 bytes which means 
> that the array indexing for element is 0-9.  Then we strncpy() 10 bytes into 
> "element" which is OK.  However setting element[len] = "\0", IOW element[10] 
> = "\0" is indexing into the element array 1 byte beyond the actual length of 
> element.  So either the code needs to be element = malloc(len+1) or 
> element[len-1] = "\0" correct?

That sounds right.

> I would suggest that maybe rather than allowing the code to malloc() any size 
> buffer to match the incoming element length that instead or in addition we 
> limit the element length as well.  Allowing the code to malloc() an unlimited 
> buffer can also lead to security issues.  Since the original code defined a 
> 256 byte buffer, 256 seems to be a good limit for elements of the path.  It 
> might be a better idea to simply add the limit checking rather than doing a 
> lot of malloc and free calls.  If an element is larger than the limit, then 
> produce a mal-formed error message and bail out.

Well, part of the original problem was that a static-length buffer was
used, and there wasn't proper bounds checking in the first place. ;-)

>
> Also I don't think that the current patch will work without the multi-path 
> patch that was also submitted, without being reworked.  The current patch 
> removes the recursive call to process_path() which means that without the for 
> loop that was provided in the multi-path patch, only one element of the path 
> will be processed and the rest will be ignored.  The current patch probably 
> needs to be reworked and separated from the multi-path patch.

For reference, most filesystem don't allow individual file or
directory names to be longer than 255 characters.  However, the full
path can be much longer...

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-15 Thread Brad Nicholes
>>> On 1/14/2009 at 3:10 PM, in message
, "Jesse Becker"
 wrote:
> On Wed, Jan 14, 2009 at 12:56, Brad Nicholes  wrote:
>>   Sounds good, can we condense the description below into something that we 
> can put into the 3.1 STATUS file so that we can start the process of getting 
> this backported into 3.1?
> 
> Done.  Committed as r1947.
> 
> I suggest that once this is accepted, we release 3.1.2 ASAP.
> 
> Also, a big thanks to Spike for finding and tracking this problem down.


Spike,
Correct me if I am wrong, but I think we still have a one-off problem with 
the patch.

 len = q-p;
 /* +1 not needed as q-p is already accounting for that */
 element = malloc(len);
 strncpy(element, p, len);
 element[len] = '\0';


In the above code suppose "len" is 10 so we malloc() 10 bytes which means that 
the array indexing for element is 0-9.  Then we strncpy() 10 bytes into 
"element" which is OK.  However setting element[len] = "\0", IOW element[10] = 
"\0" is indexing into the element array 1 byte beyond the actual length of 
element.  So either the code needs to be element = malloc(len+1) or 
element[len-1] = "\0" correct?

I would suggest that maybe rather than allowing the code to malloc() any size 
buffer to match the incoming element length that instead or in addition we 
limit the element length as well.  Allowing the code to malloc() an unlimited 
buffer can also lead to security issues.  Since the original code defined a 256 
byte buffer, 256 seems to be a good limit for elements of the path.  It might 
be a better idea to simply add the limit checking rather than doing a lot of 
malloc and free calls.  If an element is larger than the limit, then produce a 
mal-formed error message and bail out.

Also I don't think that the current patch will work without the multi-path 
patch that was also submitted, without being reworked.  The current patch 
removes the recursive call to process_path() which means that without the for 
loop that was provided in the multi-path patch, only one element of the path 
will be processed and the rest will be ignored.  The current patch probably 
needs to be reworked and separated from the multi-path patch. 

Brad


--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-14 Thread Kostas Georgiou
On Wed, Jan 14, 2009 at 07:00:29PM -0500, Jesse Becker wrote:

> On Wed, Jan 14, 2009 at 18:45, Kostas Georgiou
>  wrote:
> > On Wed, Jan 14, 2009 at 05:10:31PM -0500, Jesse Becker wrote:
> >
> >> I suggest that once this is accepted, we release 3.1.2 ASAP.
> >
> > Any possibility for 3.0.8 as well?
> 
> I don't see why not.  Is that an offer to help test the patch?  ;-)

I've build rpms for Fedora 9 and I'll be pushing them shortly after I've
done some local testing. If anyone wants to have a look they are
availbale in koji [1].

Kostas

[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=78558

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-14 Thread Jesse Becker
On Wed, Jan 14, 2009 at 18:45, Kostas Georgiou
 wrote:
> On Wed, Jan 14, 2009 at 05:10:31PM -0500, Jesse Becker wrote:
>
>> I suggest that once this is accepted, we release 3.1.2 ASAP.
>
> Any possibility for 3.0.8 as well?

I don't see why not.  Is that an offer to help test the patch?  ;-)

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-14 Thread Kostas Georgiou
On Wed, Jan 14, 2009 at 05:10:31PM -0500, Jesse Becker wrote:

> I suggest that once this is accepted, we release 3.1.2 ASAP.

Any possibility for 3.0.8 as well?

Kostas

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-14 Thread Jesse Becker
On Wed, Jan 14, 2009 at 12:56, Brad Nicholes  wrote:
>   Sounds good, can we condense the description below into something that we 
> can put into the 3.1 STATUS file so that we can start the process of getting 
> this backported into 3.1?

Done.  Committed as r1947.

I suggest that once this is accepted, we release 3.1.2 ASAP.

Also, a big thanks to Spike for finding and tracking this problem down.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0  2720 0083 0931 9A2B 06A2

--
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
___
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers


Re: [Ganglia-developers] patches for: [Sec] Gmetad server BoF andnetwork overload + [Feature] multiple requests per conn oninteractive port

2009-01-14 Thread Brad Nicholes
   Sounds good, can we condense the description below into something that we 
can put into the 3.1 STATUS file so that we can start the process of getting 
this backported into 3.1?

Brad

>>> On 1/14/2009 at 10:00 AM, in message
, "Jesse Becker"
 wrote:
> Committed to trunk for testing, r1946.
> 
> On Tue, Jan 13, 2009 at 10:41, Spike Spiegel  wrote:
>> Hi,
>>
>> I wanted to add a feature to gmetad so that it was possible to request 
> multiple
>> items per connection on the interactive port, and while doing so I
>> uncovered a buffer overflow that
>> crashes the gmetad server. I'm releasing both patches together because they
>> are somewhat connected. Apologies if this is a problem and for the length of 
> my
>> email, the intent is to make it easier for reviewers but it might trigger
>> brain filters.
>>
>> == Multiple requests per connection on interactive port
>> We wanted to leverage ganglia to develop complex checks that rely on
>> correlating various metrics over several hosts. With a large tree the best 
> way
>> to achieve that is by using the interactive port to request those specific
>> metrics, but this incurred in a large number of connections given the limit 
> of
>> one item per request. The proposed patch removes this limit.
>>
>> I've picked up ':' as a separator since it's the natural one under Unix and
>> wasn't included in the string process_request() checks against to
>> sanitize input.
>>
>> To limit the changeset I've reused the existing code and simply wrapped
>> process_path in a do while loop that splits the request in chunks and pass 
> them
>> to the function. As a side effect this happens:
>> if you request something like
>> /grid1/host1/load_one:/grid1/host1/load_five you'll
>> get two times the grid and host xml tags rather than one grid/host tag with
>> the two metrics merged inside. While at a first glance this looks ugly there
>> are imho a bunch of things to consider:
>>
>> 1) doing the merge server side is a lot of work and a big changeset afaics.
>> 2) I would argue that it's actually correct to not merge because you've
>> requested for /grid/host/metric so you effectively get what you asked for.
>> 3) Most likely on the client the processing will be done with a XML library
>> resulting in something similar to res[grid][host] = val , which
>> would make the merge fairly easy (at least in higher level languages like
>> python).
>>
>> As to filters support, it is potentially a problem for multiple items per
>> request, but as it is I don't see any. Right now there's only support for a
>> SUMMARY filter which will only be applied to root/grid/cluster nodes and 
> output
>> "". Being that the case with 
>> a
>> request like "/grid1:/grid2?filter=summary" each grid will get its own 
> summary
>> which is the expected behavior. Nonetheless I wanted to point it out because 
> in
>> case more filters are added this would need to be accounted for.
>>
>> == Gmetad server buffer overflow and network overload
>> These problems exist regardless of the above feature but would get far worse 
> in
>> that case, hence being linked.
>>
>> === Buffer overflow
>> It is possible to instantly crash gmetad by crafting a special request to be
>> sent to the interactive port.
>>
>> In process_path() a char element[256] is allocated to contain the pieces of 
> the
>> path as it is processed. If a request is made with a path element longer 
> than
>> that the strncpy call will write to invalid memory location, since there is 
> no
>> length checking performed on the input data to make sure it is less than the
>> size of element.
>>
>> Fix is provided by malloc'ing the right size to accommodate any element's
>> length. REQUESTLEN provides protection against abusing malloc and launching 
> a
>> memory exhaustion attack.
>>
>> PoC: echo "/`python -c \"print \\"%s/%s\\" % ('a'*300,'b'*300)\"`" |
>> nc localhost 8652
>>
>> === DoS attacks
>> 1) Given REQUESTLEN=2048, and 3 characters to be the minimum to craft a 
> valid
>> and nonexistent path "/x", with the above feature implemented it would be
>> possible to trigger 2048/3 calls to process_path which would possibly lead 
> to
>> CPU overload.
>> 2) extension to 1) - as it is ganglia returns the entire tree if an element 
> is
>> not found. with large trees 2048/3 requests could easily result in several 
> GBs
>> of data being transferred. Related to this if you look at gmetad/server.c 
> lines
>> 601:606 you'll see this:
>> err_msg("Got a malformed path request from %s", 
> remote_ip);
>> /* Send them the entire tree to discourage attacks. */
>> strcpy(request, "/");
>> which leads to the same scenario as above.
>>
>> What I propose is that for both cases, malformed request and non existent
>> items, we log an error and bail out. This would solve 2) and most of 1) 
> making
>> the call possibly exist far quicker.
>>
>> == Patches summary
>> - server-c-bof-dos.diff , address buffer overflo