[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-22 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

> Are you aware of what currently supported platforms have SO_REUSEPORT defined 
> where the *reuse_port* parameter doesn't actually work? This would be quite 
> helpful to know. 

This was reported by a partner that was working porting our code to Android but 
might be fixed current Android API levels. I cannot test this myself.

> .. we don't directly support every single available platform, as making the 
> code specific to an OS incurs a long-term maintenance cost

Fully agree, part of the reason why I was suggesting getting rid of 
reuse_address and reuse_port completely on this higher level API. 

But I'm happy with Antoine's proposal which effectively kills the reuse_address 
parameter, just suggesting we also kill reuse_port as well.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-21 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

>  A higher-level interface like asyncio doesn't necessarily want to map its 
> kwargs directly to non-portable low-level options like this.

Also reuse_port has portability issues, the whole portability mess i s nicely 
summed up in: 
https://stackoverflow.com/questions/14388706/how-do-so-reuseaddr-and-so-reuseport-differ

And the docs say that "If the SO_REUSEPORT constant is not defined then this 
capability is unsupported." which is true but the converse is not - some 
platforms apparently do have SO_REUSEPORT defined but the option still doesn't 
work, resulting in a ValueError exception from create_datagram_endpoint().

So to create truly portable code, the developer who really wants to use these 
options has to know differences between underlying OS and kernel versions, 
catch errors that shouldn't happen according to docs and then try again with 
different options. 

This is ok for lower level APIs like socket.socket where things map 1:1 to 
system calls but not a good thing on higher level APIs as you point out. In my 
C++ projects that use these, I just have a #ifdef block for every OS and don't 
even try to abstract it.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-20 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

> We just chose an unfortunate default for one of them

I'd like to point out that it is also documented completely wrong up to this 
point in time and thus people who chose True are most likely to be unaware of 
the actual consequences. A user's explicit choice based on misinformation is 
not really demonstrating intent to shoot oneself in the foot.

So after changing my mind a couple of times I'm still attached to the idea that 
in 3.10 reuse_address and reuse_port are both removed and the tiny portion of 
developers who need them will just create a suitably setsockopt'd socket and 
pass it to the function. 2 lines of additional code is all it takes.

If the documentation had been correct all along and just the default bad, this 
mess would be much smaller.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-20 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

Going to SO_REUSEPORT will fix the security issue and emitting a deprecation 
warning for default value invocation will catch the eyes of some maintainers 
but it will not prevent what caused me to catch this issue in the first place - 
starting two processes (with same UID) accidentally listening on the same UDP 
port (in my case it was port 5060 as a default SIP port).

Since there already is a reuse_port parameter to create_datagram_endpoint(), I 
assume the proposal is to set default value for reuse_addresss=False and 
reuse_port=True? But if reuse_address is explicitly set to True, are we going 
to just set SO_REUSEPORT instead and always leave SO_REUSEADDR unset? This 
would leave the reuse_address parameter completely useless and still allow 
accidental port reuse.

What if I really do want SO_REUSEADDR? Ok I can create a socket separately, 
call setsockopt() on it and pass it as the sock parameter to 
create_datagram_endpoint(). 

Maybe I'm not fully grasping the proposal.. or maybe we should just deprecate 
reuse_port from both create_datagram_endpoint() and create_server() + 
reuse_addr from create_datagram_endpoint()? 

This would leave the TCP create_server() with the reuse_addr parameter, 
defaulting reasonably to True. To use TCP/UDP SO_REUSEPORT or UDP SO_REUSEADDR, 
the docs would tell you to bake your own socket with socket.socket(). Those few 
(like me) who really need the functionality can survive without all-in-one 
convenience functions on asyncio.loop

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-20 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

David, in terms of documentation changes and the emitted deprecation warning 
itself, I think it would be appropriate to instruct that please set the 
parameter explicitly to True or False to silence the warning AND point out that 
setting it to True has significant security and previously incorrectly 
documented functional implications.

Now your updated docs and warning read more like we are working around a Linux 
security bug which is not really the case - this behavior was intentionally 
added to the kernels and some of the code I do for a living relies on it to 
work properly. Admittedly the restriction of having the same UID wouldn't hurt.

And browsing again through the hits to my github searches, it makes me cringe 
how many people are already explicitly setting reuse_address=True in their code 
because the current documentation mistakenly makes it seem harmless and 
desirable. Makes me wonder if we need to put out a CVE? At the very least, I 
will be putting in PRs to the asyncio packages that I myself use and understand.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-19 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

I had a quick search through github for calls to create_datagram_endpoint() and 
the reuse_address is either not set or set explicitly to True, probably due to 
the error in the documentation.

Only in one case (of my admittedly small sample) did it seem like the parameter 
was set to True intentionally to get the behavior of binding multiple processes 
to the same port.

Due to the nature of the function, it is also often buried inside packages 
implementing higher level protocols (SIP, STUN, DNS, RTP) which want to create 
a convenience function to create asyncio servers and clients.

The deprecation warning and getting it into 3.6 and later sounds very 
reasonable to me.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-18 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

Sure, I fully appreciate implications of changing default behaviour and will 
post on python-dev.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-18 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

I definitely propose changing the default for UDP sockets. Having multiple 
processes binding to a the same port and load-balancing incoming UDP traffic 
intentionally is a rare scenario which should be supported but not the default.

For TCP the default is reasonable because everyone creating server sockets will 
usually want to set it.

--

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-11-18 Thread Jukka Väisänen

Jukka Väisänen  added the comment:

Can we still consider this for 3.9? I still think this is an easy way to 
accidentally blow your foot off with something that will probably only show up 
in production.

--
versions: +Python 3.9 -Python 3.7

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37228] UDP sockets created by create_datagram_endpoint() allow by default multiple processes to bind the same port

2019-06-11 Thread Jukka Väisänen

New submission from Jukka Väisänen :

When using loop.create_datagram_endpoint(), the default for reuse_address=True, 
which sets the SO_REUSEADDR sockopt for the UDP socket. 

This is a dangerous and unreasonable default for UDP, because in Linux it 
allows multiple processes to create listening sockets for the same UDP port and 
the kernel will randomly give incoming packets to these processes.

I discovered this by accidentally starting two Python asyncio programs with the 
same UDP port and instead of getting an exception that the address is already 
in use, everything looked to be working except half my packets went to the 
wrong process.

The documentation also refers to behavior with TCP sockets:
https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.create_datagram_endpoint

"reuse_address tells the kernel to reuse a local socket in TIME_WAIT state, 
without waiting for its natural timeout to expire. If not specified will 
automatically be set to True on Unix."

This is true for TCP but not UDP.

Either the documentation should reflect the current (dangerous) behavior or the 
behavior should be changed for UDP sockets. 

Workaround is of course to create your own socket without SO_REUSEADDR and pass 
it to create_datagram_endpoint().

--
components: asyncio
messages: 345209
nosy: Jukka Väisänen, asvetlov, yselivanov
priority: normal
severity: normal
status: open
title: UDP sockets created by create_datagram_endpoint() allow by default  
multiple processes to bind the same port
type: behavior
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue37228>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com