An interesting discussion. The admin of the server I use takes malicious
connections rather seriously. His way of preventing further malicious
connections is to route the offending (incoming) source IP address to
127.0.0.1 after X errors have been reported from that single IP address
within Y minutes.
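
Just to illustrate the idea (the thresholds, the log format and the exact
route command below are my own assumptions, not necessarily what he
actually runs), such a rule could be scripted along these lines:

# Illustrative only: count errors per source IP in a sliding window and
# null-route offenders. Thresholds, log format and the route command are
# assumptions; a blackhole route has much the same effect as routing the
# address to 127.0.0.1.
import re
import subprocess
import sys
import time
from collections import defaultdict, deque

MAX_ERRORS = 10          # "X errors ..."
WINDOW_SECONDS = 300     # "... within Y minutes"
errors = defaultdict(deque)   # ip -> timestamps of recent errors
blocked = set()

# Expects lines like "1.2.3.4 404" on stdin (e.g. piped from the access log).
for line in sys.stdin:
    m = re.match(r"(\S+)\s+(\d{3})", line)
    if not m:
        continue
    ip, status = m.group(1), int(m.group(2))
    if status < 400:
        continue
    now = time.time()
    q = errors[ip]
    q.append(now)
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    if len(q) >= MAX_ERRORS and ip not in blocked:
        blocked.add(ip)
        # Drop all return traffic to that client (requires root on Linux).
        subprocess.run(["ip", "route", "add", "blackhole", ip], check=False)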

From the logic presented here, I guess that makes his server susceptible
to a DoS (denial-of-service) attack: many bots could attack his server and
tie up all the available ports, since no ACKs ever come back. However,
there is also code in the server OS (SYN cookies, for example) to combat
this particular problem.

So, yes: if servers delay answering, they hold ports longer, and this
consumes IP (port) resources. On the other hand, if delayed responses
annoy the bot-masters enough that they re-program their bots to massively
spam a slow server, the bot-masters also run a greater risk of detection.
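
As a back-of-the-envelope illustration of that resource cost (all numbers
below are invented):

# Rough illustration of how delaying responses ties up connections/ports.
request_rate = 50          # incoming scanner requests per second (invented)
normal_delay_s = 0.01      # ~10 ms to send a 404 immediately
artificial_delay_s = 1.0   # ~1 s if the 404 is deliberately delayed

# Little's law: connections held open ~= arrival rate * time each is held.
held_normal = request_rate * normal_delay_s
held_delayed = request_rate * artificial_delay_s
print(f"ports held without delay: ~{held_normal:.1f}")
print(f"ports held with delay:    ~{held_delayed:.1f}")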

My impression is that, all in all, a good discussion has been started - at
least people are (again?) talking about possibilities for managing "noise"
in a healthy way. Maybe - let's hope! - a good suggestion or idea will
come out of it. The fact that the problem is hard is not a good reason to
stop working towards a solution, a.k.a. to surrender.

So, to the person who started this discussion: thanks for the impulse.

On Fri, May 3, 2013 at 2:37 PM, André Warnier <a...@ice-sa.com> wrote:

> Tom Evans wrote:
>
>> On Fri, May 3, 2013 at 10:54 AM, André Warnier <a...@ice-sa.com> wrote:
>>
>>> So here is a challenge for the Apache devs : describe how a bot-writer
>>> could update his software to avoid the consequences of the scheme that
>>> I am advocating, without any consequences for the effectiveness of his
>>> URL-scanning.
>>>
>>
>> This has been explained several times. The bot makes requests
>> asynchronously with a short select() timeout. If it doesn't have a
>> response from one of its current requests due to artificial delays, it
>> makes an additional request, not necessarily to the same server.
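
To make that concrete, the kind of loop being described might look roughly
like this (a sketch only; hosts, paths, ports and limits are invented):

# Many probes kept in flight at once, so one slow (delayed) response does
# not stall the others. Purely illustrative.
import asyncio

CONCURRENCY = 100
PER_REQUEST_TIMEOUT = 5.0   # give up on any single slow server

async def probe(host, path):
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, 80), PER_REQUEST_TIMEOUT)
        writer.write(f"HEAD {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
        await writer.drain()
        status_line = await asyncio.wait_for(reader.readline(),
                                             PER_REQUEST_TIMEOUT)
        writer.close()
        return host, path, status_line.decode(errors="replace").strip()
    except (OSError, asyncio.TimeoutError):
        return host, path, None   # slow or unreachable: just move on

async def scan(targets):
    sem = asyncio.Semaphore(CONCURRENCY)
    async def limited(target):
        async with sem:
            return await probe(*target)
    return await asyncio.gather(*(limited(t) for t in targets))

# targets = [("192.0.2.1", "/admin.php"), ...]   # invented examples
# results = asyncio.run(scan(targets))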
>>
>> The fact that a single response takes longer to arrive is not relevant;
>> overall, the bot can process roughly as many requests in the same period
>> as it would without any delay. The amount of concurrency required would
>> be proportional to the artificial delay plus the network RTT.
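
Put differently (again with invented numbers), the number of in-flight
connections the bot needs only grows with the per-request latency:

# Connections needed ~= desired request rate * (network RTT + artificial delay).
rate = 100           # desired requests per second (invented)
rtt = 0.05           # 50 ms network round trip (invented)
for delay in (0.0, 1.0, 2.0):   # artificial 404 delay in seconds
    print(delay, "->", rate * (rtt + delay), "connections in flight")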
>>
>> There is a little overhead due to the extra concurrency, but not much:
>> you are not processing any more requests in a given time period, nor
>> using more network traffic than without the concurrency. The only real
>> cost is more simultaneous network connections, most of which sit idle
>> waiting for the artificial delay to expire.
>>
>> I would not be surprised if bots already behave like this, as it is a
>> useful way of increasing scanning rate if you have servers that are
>> slow to respond already, or have high network RTT.
>>
>>
> Ok, maybe I am misunderstanding this. But I am open to being proven
> wrong.
>
> Suppose a bot is scanning 10000 IPs, 100 at a time concurrently (*), for
> 20 potentially vulnerable URLs per server. That is thus 200,000 HTTP
> requests to make.
> And let's suppose that the bot cannot tell, from the delay experienced
> when receiving any particular response, whether this is a server that is
> artificially delaying responses, or whether this is a normal delay due to
> whatever condition (**).
> And let's also suppose that, of the total of 200,000 requests, only 1%
> (2000) will be "hits" (where the URL actually responds with something
> other than a 404). That leaves 99% of the requests (198,000) responding
> with a 404.
> And let's suppose that the bot is extra-smart and always keeps its "pool"
> of 100 outgoing connections busy, in the sense that as soon as a response
> has been received on one connection, that connection is closed and
> immediately re-opened for another HTTP request.
>
> If no webserver implements the scheme, we assume 10 ms per 404 response.
>
> So the bot launches the first batch of 100 requests (taking 10 ms to do
> so), then goes back to check its first connection and finds a response.
> If the response is not a 404, it's a "hit" and gets added to the table of
> vulnerable IPs
> (and, to gain some extra time, any remaining URLs still to be scanned on
> that same server could now be skipped - although this could be disputed).
> If the response is a 404, it's a "miss". But it doesn't mean that there
> are no other vulnerable URLs on that server, so it still needs to scan the
> others.
> All in all, if the bot can keep issuing requests and processing responses
> at the rate of 100 per 10 ms on average, it will take it a total of
> 200,000 / 100 * 10 ms = 20,000 ms (about 20 seconds) to perform the scan
> of the 200,000 URLs, and it will have collected 2000 hits after doing so.
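
As a quick check of that arithmetic, with exactly the numbers above:

# Baseline: no server delays.
requests = 10_000 * 20        # 200,000 URLs to probe
concurrency = 100
time_per_404_ms = 10
batches = requests / concurrency
print(batches * time_per_404_ms, "ms")   # 20,000 ms, i.e. about 20 seconds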
>
> Now let's suppose that out of these 10000 servers, 10% of them implement
> the scheme, and delay their 404 responses by an average of 1000 ms.
> So now the bot launches the first 100 requests in 10 ms, then goes back to
> check the status of the first one. With a probability of 0.1, this could be
> one of the delayed ones.
> In that case, no response will be there yet, and the bot skips to the next
> connection.
> At the end of this pass, the bot will thus have received 90 responses (10
> are still delayed), and re-issued 90 new requests. On the next pass, the
> same 10 delayed responses will (on average) still be pending, and among
> the 90 new ones, 9 will be delayed too.
> So now it can only issue 81 new requests, and by the time it comes back
> to check, 10 + 9 + 8 = 27 connections will be waiting on delayed
> responses.
> Basically, after a few cycles like this, all of its 100 pooled
> connections will be waiting for a response, and it will have no choice
> but either to wait, or to start killing the connections that have been
> waiting for more than a certain amount of time.
> Or to increase its number of connections and become more conspicuous
> (***).
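
A tiny simulation of that pile-up, just to illustrate the dynamic (pool
size, pass length and delay as in the example; the randomness is mine):

# How quickly the pool fills up with connections stuck on delayed 404s.
import random

random.seed(1)
POOL = 100
DELAY_PASSES = 100    # 1000 ms delay ~= 100 passes of 10 ms each
P_DELAYED = 0.10

waiting = []          # remaining passes for each currently delayed connection
for p in range(1, 11):
    waiting = [w - 1 for w in waiting if w > 1]
    free = POOL - len(waiting)
    # each free connection issues a new request; ~10% of those get delayed
    newly_delayed = sum(1 for _ in range(free) if random.random() < P_DELAYED)
    waiting += [DELAY_PASSES] * newly_delayed
    print(f"pass {p}: {len(waiting)} of {POOL} connections stuck waiting")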
>
> If it chooses to wait, the total extra waiting comes to 200,000 * 10% *
> 1000 ms = 20,000,000 ms; even spread over its 100 parallel connections,
> that still adds roughly 200,000 ms of wall-clock time to a scan that
> previously took about 20,000 ms.
> If it chooses not to wait, then it will never know whether those URLs
> were vulnerable or not.
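
And a quick check of those numbers (illustrative only):

# Same numbers as above, with 10% of servers delaying their 404s by 1000 ms
# and the bot waiting the delays out.
requests = 200_000
concurrency = 100
normal_ms = 10
delayed_fraction = 0.10
extra_delay_ms = 1000

baseline = requests / concurrency * normal_ms                     # 20,000 ms
total_extra_wait = requests * delayed_fraction * extra_delay_ms   # 20,000,000 ms
wall_clock_extra = total_extra_wait / concurrency                 # ~200,000 ms
print("baseline scan time:", baseline, "ms")
print("extra wall-clock time with delays:", wall_clock_extra, "ms")
print("slowdown factor: ~", (baseline + wall_clock_extra) / baseline)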
>
> Is there a flaw in this reasoning ?
>
> If not, then the avoidance-scheme based on becoming more parallel would be
> quite ineffective, no ?
>
>
>
> (*) I pick 100 at a time, imagining that as the number of established
> outgoing connections increases, a bot becomes more and more visible on
> the host it is running on. So I imagine that there is a reasonable limit
> to how many of them it can open at a time.
>
> (**) this being because the server varies the individual 404 delay
> randomly between two reasonable values (e.g. 100 ms and 2000 ms), which
> can happen on any normal server.
>
> (***) I would say that a bot which would be opening 100 outgoing
> connections in parallel on average would already be *very* conspicuous.
>
>
