[issue4944] os.fsync() doesn't work as expect in Windows
Emil Styrke added the comment: eryksun, thank you for the explanation, the penny finally dropped for me. It's the *other* child processes that keep the file open, not the parent. Actually this bug started hitting me after I *stopped* doing exactly what you suggest (fdopen on the temp file), and instead started using file names generated by my program (opened with regular open) and now I see why that is. If I use os.fdopen with the O_NOINHERIT flag, it works as expected. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
eryksun added the comment: Emil, Your example child process opens the file with only read sharing, which fails with a sharing violation if some other process inherits the file handle with write access. The `with` block only prevents this in a single-threaded environment. When you spawn 10 children, each from a separate thread, there's a good chance that one child will inherit a handle that triggers a sharing violation in another child. Using close_fds is a blunt solution since it prevents inheriting all inheritable handles. What you really need here has actually already been done for you. Just use the file descriptor that mkstemp returns, i.e. use os.fdopen(infd, 'wb'). mkstemp opens the file with O_NOINHERIT set in the flags. -- nosy: +eryksun ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Emil Styrke added the comment: I have experienced this issue with Python 2.7.8 and 2.7.9. It is almost the same issue as the OP experiences as far as I can tell: spawn some Python threads that each create a file, flush, fsync, close, then start a subprocess which uses the file through the Windows API. One thing that may differ is that I create the file and spawn the child from the same thread. I tried with close_fds=True, and it indeed works then, but like the OP, in my production code I need to get the output from the process, so it is not a usable workaround for me. Test script and child program code is available at https://gist.github.com/anonymous/5d7d198cd3c22b31fa2a (the file upload button doesn't work for me it seems). Running the script on my machine will print at least one failure most of the time, but not always. -- nosy: +Emil.Styrke versions: +Python 2.7 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Terry J. Reedy tjre...@udel.edu added the comment: To be a tracker bug, as opposed to feature request, behavior must disagree with the manual or doc string, not personal expectation. In any case, OP declined to test on 2.6 and suggested closing, so I am. This still remains on tracker to be searched. Anyone who considers reopening should verify a discrepancy with 2.7 or later current version. -- nosy: +tjreedy resolution: - postponed status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Changes by Brian Curtin cur...@acm.org: -- nosy: +brian.curtin priority: - normal stage: - test needed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: python2.6 is not so different from python2.5. Which performance concerns do you have? ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: I don't have Python2.6 installed. I just have ever read a bench article comparing the performance between different version of Python, including 2.5, 2.6 and 3.0. That article shows, for the same script, Python2.6 has longer running time than Python2.5. My application is a build system and the users are very care about the time spent in build. That's why I have the performance concern. What concerns me more is the stdout/stderr redirection. Python2.6 manual says they cannot be redirected if close_fds is set to True. My application relies on the I/O redirection to control the screen output from subprocess. Actually I have had a work-around for this issue. It works very well so far. I reported a bug here just because I want the Python to be better. I learned it one year ago but I love this language. I just hope nobody else encounter such problem again. If you guys think it won't be fixed in Python 2.5 or has been fixed in Python 2.6, it's OK to close this tracker. Thanks for your time. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: I'm using Python2.5 in which close_fds is not available in Windows. And I cannot use Python2.6 because I need to redirect the stdout and stderr and there's performance concern. I have questions on the root cause: a. why doesn't fopen() has sharing issue? b. why don't os.close() and FileObject.close() really close the file? ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: I did trace the order of file creation and process launch. It shows the file is indeed created before the process launch. I did another investigation. I added a new process, cmd.exe /c copy Makefile NewMakefile, to copy the the Makefile created, just before launching the nmake.exe process. The strangest thing happened again: each makefile was copied successful but there's still makefile not found reported by nmake.exe process. I tried standalone copy application xcopy.exe and the result was the same. So I guess that cmd.exe, xcopy.exe and nmake.exe might use different families of API or different mode (sync vs async) to access file. I decided to try the code you provided. In checkfile.c, fopen is used to test the file existence. I changed it to the Windows native API CreateFile and I also added a file write operation in order to make it more similar to the real case. Eventually, the problem was reproduced in your code. Following are the successful number of creating 1000 files 5 times in different file sizes: Create file in 403 bytes, 985, 992, 984, 989, 992 (no flush after creation) 883, 886, 907, 909, 915 (flush after creation) Create file in 4061 bytes 983, 976, 982, 977, 983 (no flush after creation) 654, 672, 684, 686, 648 (flush after creation) Create file in 16461 bytes: 970, 967, 963, 963, 971 (no flush after creation) 598, 664, 711, 653, 623 (flush after creation) In summary: a. Using fopen to test a file existence in check_file.c will never report failure, no matter buffer is flushed or not. b. Using CreateFile (read mode) to test a file existence in check_file.c will always report failure. The bigger the file size will cause more failure reported; the flush operation after file creation in test_file_flush.py will cause more failure reported; the flush operation after new file creation in check_file.c will cause more failure reported; no flush operation in both check_file.c and test_file_flush.py almost cause no failure. I don't know what's root cause: GIL, Python thread state switch, Python file buffer or Windows FileCreate API. I'm just certain there's race condition between Python and Windows. Added file: http://bugs.python.org/file12807/checkfile.c ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: The test code and script which can reproduce the problem has been uploaded. Added file: http://bugs.python.org/file12808/test_file_flush.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Gabriel Genellina gagsl-...@yahoo.com.ar added the comment: Thanks for adapting the smaller example. I think I figured out what's the problem. The error reported by checkfile.c is 0x20 = ERROR_SHARING_VIOLATION The process cannot access the file because it is being used by another process. I modified the subprocess call, adding the parameter close_fds=True -- and now I see no errors. Please do a similar change in your application and see if it works. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: I'm quite certain that the makefile is generated before the make program is launched in separated process. Follow is the original code where the makefile is created (line 14) and a task is put in queue (line 19). It's executed in the main thread (no parallel execution here). There's no way that the process is launched before the makefile generation. 01for Module in Pa.Platform.Modules: 02Ma = ModuleAutoGen(Wa, Module, BuildTarget, ToolChain, Arch, self.PlatformFile) 03if Ma == None: 04continue 05# generate AutoGen files and Makefile here 06if self.Target not in ['clean', 'cleanlib', 'cleanall', 'run', 'fds']: 07# for target which must generate AutoGen code and makefile 08if not self.SkipAutoGen or self.Target == 'genc': 09Ma.CreateCodeFile(True) 10if self.Target == genc: 11continue 12 13if not self.SkipAutoGen or self.Target == 'genmake': 14Ma.CreateMakeFile(True) 15if self.Target == genmake: 16continue 17 18# Generate build task for the module which will be launched separately 19Bt = BuildTask.New(ModuleMakeUnit(Ma, self.Target)) I think the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS should not be used for fsync in the Python core code (in posixmodule.c, right?). Because fsync is mostly used to solve file operation issue in parallel execution context, it should not give out CPU time to other threads or processes when it's running. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: gagenellina, My application is a little bit different from your test code. It won't wait for the exit of new process and there're file writing operations during makefile running. I changed the test code to be real multi-process and tried many file sizes. But I cannot reproduce it. Maybe my application is more complicated situation. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: The created files are on local drive. I saw the problem on the laptop (XP-SP2), desktop(XP-SP3) and server (Win2003). But there's no such problem on the Linux and Mac boxes. I tried to use fopen/fwrite in my extension according to your suggestion. The problem wasn't be reproduced. It seems the bug is more alike in Python part. My application is a build system and is also an open source project. Is it possible for you to download it and try it in your box? I created temporary user (py123, password: 123py123) for you (just in case) and here's steps of how to reproduce it. 1. Checkout the build system source code in, for example, c:\test C:\test svn co --username py123 --password 123py123 https://buildtools.tianocore.org/svn/buildtools/trunk/BaseTools tools 2. Checkout the source code to be built against in c:\test C:\test svn co --username py123 --password 123py123 https://edk2.tianocore.org/svn/edk2/trunk/edk2 edk2 3. Change the source code between line 222 and line 229 of c:\test\tools\Source\Python\Common\Misc.py (SaveFileOnChange function) like below: Fd = open(File, wb) Fd.write(Content) Fd.flush() os.fsync(Fd.fileno()) Fd.close() 4. In c:\test\edk2, run C:\test\edk2 edksetup.bat C:\test\edk2 set PYTHONPATH=C:\test\tools\Source\Python C:\test\edk2 python.exe C:\test\tools\Source\Python\build\build.py -n 2 -p MdeModulePkg\MdeModulePkg.dsc -a IA32 -s 5. If the application stops with message like makefile not found or AutoGen.h not found message, that means the problem happened. Visual Studio 2005 is needed to reproduce it. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: There would be more chances to see the problem by doing this: C:\test\edk2 python.exe C:\test\tools\Source\Python\build\build.py -n 2 -p IntelFrameworkModulePkg\IntelFrameworkModulePkg.dsc -a IA32 -s ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: I really tried, but I can't run your script. first PYTHONPATH must be set to some value (I tried with tools\Source\Python); Also the tool is complaining about missing WORKSPACE and EDK_TOOLS_PATH env vars, and a missing file Conf\target.txt. This seems very complicated for a test case... furthermore it comes with binaries already linked with a specific version of python... what if I want to use a debug build? Another question: is there a running anti-virus? if yes, can you try to temporarily disable it? ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Javen Wang jave...@gmail.com added the comment: I agree multithread programming is kind of difficult and tricky. But I don't think there's a race condition in _MultiThreadBuildPlatform method, because I do have only one producer. And the consumer consumes the product only when it's done (i.e. the file is created and closed). The only race condition, I think, it's in the Python or Windows file system, in which the file created might not be seen by other process due to the file buffer mechanism. I think the flush() method of file object and os.fsync() are to remove the race condition. But it seems that they are not working as they're supposed to be. What I know is that os.fsync() calls the _commit() which calls FlushFileBuffers(). Why no problem if I call the FlushFileBuffers() directly? That's why I think the most possible race condition is in Python file buffer operation which is out of the control of my Python code. I'm sorry that I didn't realize there's 150M code to checkout. Thanks for your patience. Actually they are not code of my application itself. They are the code used to test my application because my application is a build system which needs source code to build. The real code of my application is in the, for my example, directory of C:\test\tools\Source\Python with just about 3M source files :-) And I think I have narrowed down the issue in the file creation in SaveFileOnChange function in C:\test\tools\Source\Python\Common\Misc.py. I know it's very hard to reproduce issue in multi-thread context. And I cannot say absolutely there's no bug in my code. It's OK for you to close this tracker. But it would be better to let it open for a few days so that I can do more investigation then. Anyway, thanks for the trying again. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
New submission from Javen Wang jave...@gmail.com: I encountered a very strange issue in file flush operation in Windows. Here's the scenario of my application: 1. The main thread of my application will create makefiles sequentially. 2. Once a makefile is generated, launch a separate process calling nmake.exe to run it in parallell. The main thread then create another makefile until no more makefiles to create. 3. The number of new processes is limited by command line options. My application has been running for almost a year without any problem. But, after I made some changes recently to the content of makefile generated, nmake.exe in separate process sometimes complains that the makefile was not found. But when I went into the directory, the makefile was there. Because I didn't change any thing related to file creation and the new makefiles are a little bit less than before, I guessed that the makefile just created hasn't been flushed to disk because of size change so the new process could not see it in a short time. So I decided add code to force flush file buffer after writing the file content (I didn't flush the file explicitly before). I did it like below: Fd = open(File, w) Fd.write(Content) Fd.flush() os.fsync(Fd.fileno()) Fd.close() The strangest thing happened. The makefile missing happened more frequently than no flush operation. I searched the web but no answer there. Finally I tried to use Windows file API to create the file via pywin32 extension. The problem's gone. import win32file Fd = win32file.CreateFile( File, win32file.GENERIC_WRITE, 0, None, win32file.CREATE_ALWAYS, win32file.FILE_ATTRIBUTE_NORMAL, None ) win32file.WriteFile(Fd, str(Content), None) win32file.FlushFileBuffers(Fd) win32file.CloseHandle(Fd) I tried writing small python extension in C to make use Windows API to create file like above. It also works well, even I removed the FlushFileBuffers() calling. I think that there's a bug in Python file buffer mechanism. -- components: Windows messages: 79829 nosy: javen72 severity: normal status: open title: os.fsync() doesn't work as expect in Windows type: behavior versions: Python 2.5 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Changes by Gabriel Genellina gagsl-...@yahoo.com.ar: Added file: http://bugs.python.org/file12738/checkfile.c ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4944] os.fsync() doesn't work as expect in Windows
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: python2.5 uses the functions of the fopen() family: fwrite(), fclose(). Does the problem reproduce if you use these functions in your extension module? Are your files located on the local hard drive, or on a network storage? -- nosy: +amaury.forgeotdarc ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4944 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com