On 2020-11-07 01:01, Fabien COELHO wrote:
Hello Marina,
Hello, Fabien! Thank you for your comments!
While trying to test a patch that adds a synchronization barrier in pgbench [1] on Windows,Thanks for trying that, I do not have a windows setup for testing, and the sync code I wrote for Windows is basically blind coding:-(
FYI:1) It looks like pgbench will no longer support Windows XP due to the function DeleteSynchronizationBarrier. From https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier :
Minimum supported client: Windows 8 [desktop apps only] Minimum supported server: Windows Server 2012 [desktop apps only]On Windows Server 2008 R2 (MSVC 2013) the 6-th version of the patch [1] has compiled without (new) warnings, but when running pgbench I got the following error:
The procedure entry point DeleteSynchronizationBarrier could not be located in the dynamic link library KERNEL32.dll.
2) On Windows Server 2019 (MSVC 2019) the 6-th version of the patch [1] with fix_max_client_conn_on_Windows.patch has compiled without (new) warnings. I made a few runs (-M prepared -c 100 -j 10 -T 10 -P1 -S) with and without your patches. On Linux (-M prepared -c 1000 -j 500 -T 10 -P1 -S) your patches fix problems with progress reports as in [2], but on Windows I did not notice such changes, see attached pgbench_runs_linux_vs_windows.zip.
The almost same thing happens with reindexdb and vacuumdb (build on commit [3]):Windows fd implementation is somehow buggy because it does not return the smallest number available, and then with the assumption that select uses a dense array indexed with them (true on linux, less so on Windows which probably uses a sparse array), so that the number gets over the limit, even if less are actually used, hence the catch, as you noted.
I agree with you. It looks like the structure fd_set just contains used sockets by this application on Windows, and the macro FD_SETSIZE is used only here.
From https://docs.microsoft.com/en-us/windows/win32/api/winsock/ns-winsock-fd_set :
typedef struct fd_set { u_int fd_count; SOCKET fd_array[FD_SETSIZE]; } fd_set, FD_SET, *PFD_SET, *LPFD_SET;From https://docs.microsoft.com/en-us/windows/win32/winsock/maximum-number-of-sockets-supported-2 :
The maximum number of sockets that a Windows Sockets application can use is not affected by the manifest constant FD_SETSIZE. This value defined in the Winsock2.h header file is used in constructing the FD_SET structures used with select function.
IIUC the checks below are not correct on Windows, since on this system sockets can have values equal to or greater than FD_SETSIZE (see Windows documentation [4] and pgbench debug output in attached pgbench_debug.txt).Okay. But then, how may one detect that there are too many fds in the set? I think that an earlier version of the code needed to make assumptions about the internal implementation of windows (there is a counter somewhere in windows fd_set struct), which was rejected because if was breaking the interface. Now your patch is basically resurrecting that.
I tried to keep the behaviour "we check if the socket value can be used in select() at runtime", but now I will also read that thread...
Why not if there is no other solution, but this is quite depressing, and because it breaks the interface it would be broken if windows changed its internals for some reason:-(
It looks like if the internals of the structure fd_set are changed, we will also have problems with the function pgwin32_select from src/backend/port/win32/socket.c, because it uses fd_set.fd_count too?..
(I'm writing responses to the rest of your comments but it takes time...)
[1] https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2011021726390.989361%40pseudo [2] https://www.postgresql.org/message-id/20200227185129.hikscyenomnlrord%40alap3.anarazel.de
-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
<<attachment: pgbench_runs_linux_vs_windows.zip>>