Tim,
Aleksandar,

On 9/8/20 11:18 PM, Aleksandar Lazic wrote:
> On 08.09.20 22:54, Tim Düsterhus wrote:
>> Reinhard,
>> Björn,
>>
>> On 08.09.20 at 21:39, Björn Jacke wrote:
>>>> the only officially supported way to identify a google bot is to run a
>>>> reverse DNS lookup on the accessing IP address and run a forward DNS
>>>> lookup on the result to verify that it points to accessing IP address
>>>> and the resulting domain name is in either googlebot.com or google.com
>>>> domain.
>>>> ...
>>>
>>> thanks for asking this again, I brought this up earlier this year and I
>>> got no answer:
>>>
>>> https://www.mail-archive.com/haproxy@formilux.org/msg37301.html
>>>
>>> I would expect that this is something that most sites would actually
>>> want to check and I'm surprised that there is no solution for this, or
>>> at least none that is obvious to find.
>>
>> The usually recommended solution for this kind of check is either Lua
>> or the SPOA, running the actual logic out of process.
>>
>> For Lua my haproxy-auth-request script is a batteries included solution
>> to query an arbitrary HTTP service:
>> https://github.com/TimWolla/haproxy-auth-request. It comes with the
>> drawback that Lua runs single-threaded within HAProxy, so you might not
>> want to use this if the checks need to run in the hot path, handling
>> thousands of requests per second.
>>
>> It should be possible to cache the results of the script using a stick
>> table or a map.
>>
>> Back in nginx times I used nginx' auth_request to query a local service
>> that checked whether the client IP address was a Tor exit node. It
>> worked well.
>>
>> For SPOA there's this random IP reputation service within the HAProxy
>> repository:
>> https://github.com/haproxy/haproxy/tree/master/contrib/spoa_example. I
>> never used the SPOA feature, so I can't comment on whether that example
>> generally works and how hard it would be to extend it. It certainly
>> comes with the restriction that you are limited to C or Python (or a
>> manual implementation of the SPOA protocol) vs anything that speaks
>> HTTP.
>
> In addition to Tim's answer you can also try to use spoa_server which
> supports `-n <workers>`.
> https://github.com/haproxy/haproxy/tree/master/contrib/spoa_server
>
Thanks for your reply and the information. Sorry for the late reply, but
I only had time to test today. I tried to get the spoa server working on
Ubuntu bionic (18.04.4) with haproxy 2.2.3-2ppa1~bionic from the vbernat
PPA. I could compile the spoa server with Python 3.6 support from the
latest GitHub sources without obvious problems, and it also started
without problems with the example Python script (./spoa -d -f
ps_python.py).

If I start haproxy with the following command:

haproxy -f spoa-server.conf -d

haproxy segfaults on the first request to port 10001.

If I start haproxy with the additional parameter -Ws, it does not
segfault, but only the first and every 4th request get (correctly?)
forwarded to the spoa server; the 3 requests in between are answered
with an empty %[var(sess.iprep.ip_score)].

Here are the log files of a working request:

from haproxy:
00000000:test.accept(0008)=0014 from [127.0.0.1:57570] ALPN=<none>
00000000:test.clireq[0014:ffffffff]: GET / HTTP/1.1
00000000:test.clihdr[0014:ffffffff]: host: localhost:10001
00000000:test.clihdr[0014:ffffffff]: user-agent: curl/7.58.0
00000000:test.clihdr[0014:ffffffff]: accept: */*
00000000:test.clicls[0014:ffffffff]
00000000:test.closed[0014:ffffffff]

from spoa server:
1599906552.714422 [01] New connection from HAProxy accepted
1599906552.714593 [01] Hello handshake done: version=2.0 - max-frame-size=16380 - healthcheck=false
1599906552.714780 [01] Notify frame received: stream-id=0 - frame-id=1
1599906552.714800 [01]   Message 'check-client-ip' received
[{'name': '', 'value': True},
 {'name': '', 'value': 1234},
 {'name': '', 'value': IPv4Address('127.0.0.1')},
 {'name': '', 'value': IPv6Address('::55')},
 {'name': '', 'value': 'localhost:10001'}]
1599906552.716741 [01] Ack frame sent: stream-id=0 - frame-id=1

And here are the logs from a failing request:

from haproxy:
0000001f:test.accept(0008)=0015 from [127.0.0.1:57634] ALPN=<none>
0000001f:test.clireq[0015:ffffffff]: GET / HTTP/1.1
0000001f:test.clihdr[0015:ffffffff]: host: localhost:10001
0000001f:test.clihdr[0015:ffffffff]: user-agent: curl/7.58.0
0000001f:test.clihdr[0015:ffffffff]: accept: */*
0000001f:test.clicls[0015:ffffffff]
0000001f:test.closed[0015:ffffffff]
00000020:spoe-server.srvcls[ffffffff:adfd]
00000020:spoe-server.clicls[ffffffff:adfd]
00000020:spoe-server.closed[ffffffff:adfd]

The spoa server does not log anything during the request, but after a
while the following lines are logged:

1599906689.387816 [01] New connection from HAProxy accepted
1599906689.387848 [01] Failed to write Agent frame
1599906689.387853 [01] Close the client socket because of I/O errors

Every request works if there are at least 30 seconds between requests,
because after 30 seconds the spoa server logs that it closes the connection:

1599907270.605946 [01] Disconnect frame received: reason=normal
1599907270.606078 [01] Disconnect frame sent: reason=normal

But the following also works for the first and last curl request, and
it takes a lot less than 30 seconds:

curl -i localhost:10001; curl -i localhost:10001;
curl -i localhost:10001; curl -i localhost:10001; curl -i localhost:10001

I am unsure whether I am making some stupid mistake, whether I should
test with an older haproxy version, or how to debug the issue further.
Any pointers are very much appreciated.
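
For reference, the check quoted at the top of this thread (reverse lookup
on the client IP, then a forward lookup on the result to confirm it points
back to the same IP, and a domain test) could be sketched in Python roughly
as follows. This is only a sketch under assumptions: the function names and
the ALLOWED_SUFFIXES constant are hypothetical, it is not wired into the
spoa server, and the two resolver functions are passed in as parameters
purely so the logic can be exercised without network access.

```python
import ipaddress
import socket

# Domains named in the quoted description; the leading dot avoids
# matching look-alikes such as "fakegooglebot.com".
ALLOWED_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the PTR host name for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the set of address strings that hostname resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def _normalize(addr):
    """Parse an address string so that equal IPs compare equal."""
    try:
        return ipaddress.ip_address(addr)
    except ValueError:
        return None


def is_verified_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS check (hypothetical helper)."""
    client = _normalize(ip)
    if client is None:
        return False
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(ALLOWED_SUFFIXES):
        return False
    # The forward lookup must point back to the accessing IP address.
    return client in {_normalize(a) for a in forward(hostname)}
```

Both lookups block, so a check like this would have to run out of process
(or have its result cached), as Tim suggested above.
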
>> Best regards
>> Tim Düsterhus
>
> Regards
> Aleks
>
Regards
Reinhard

