Re: Curiosity about file access performance
I/O to/from /dev/zero or /dev/null could be special-cased.

Benchmarking file system performance can be fraught.

-- 
        -Barry Shein, co-author of nfsstones benchmark

Software Tool & Die    | b...@theworld.com             | http://www.TheWorld.com
Purveyors to the Trade | Voice: +1 617-STD-WRLD       | 800-THE-WRLD
The World: Since 1989  | A Public Information Utility | *oo*

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
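
One way to act on that caution is to repeat the dd test from this thread with a regular file on both ends and non-zero data. The following is only a sketch under stated assumptions: it assumes GNU dd (for bs=1M and conv=fsync), the 16 MiB size and the file names are made up for illustration, and the read pass will most likely still be served from the OS cache, so it is not a cold-disk measurement.

```sh
#!/bin/sh
# Sketch only: throughput check that avoids /dev/zero as the data source
# and /dev/null as the sink. Sizes and file names are illustrative.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Source data comes from /dev/urandom, so the input is not a run of zeros.
dd if=/dev/urandom of="$workdir/src.dat" bs=1M count=16 2>/dev/null

# Write test: file to file. conv=fsync (GNU dd) makes dd flush the output
# file to storage before it reports its timing line.
dd if="$workdir/src.dat" of="$workdir/copy.dat" bs=1M conv=fsync 2>&1 | tail -n 1

# Read test: pipe the data into cksum, which has to consume every byte.
# dd's timing line goes to a log file so it can be printed afterwards.
dd if="$workdir/copy.dat" bs=1M 2>"$workdir/read.log" | cksum >"$workdir/copy.sum"
tail -n 1 "$workdir/read.log"
```

Running the same script under Cygwin and under WSL1 gives a like-for-like pair of numbers, but only for bulk transfer; it says nothing about per-file open cost.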
Re: Curiosity about file access performance
On 10/29/2021 11:44 AM, Adam Dinwoodie wrote:

> AIUI it's a fundamental part of the trade-offs that NTFS makes:
> compared to common Linux file systems like ext4, NTFS is much slower
> at things like parsing directory structures (which is a necessary part
> of opening any given file). In the same way that native Windows
> programs tend to use threading implementations that work differently
> to fork(), native Windows applications will also often much prefer
> large monolithic data files, where native *nix applications are much
> more likely to have lots of small files.
>
> As a result, for things that require opening lots of files, WSL (at
> least if you're using the native WSL disk, which will be a *nix disk
> image stored in a file, rather than files under /mnt/c or similar)
> will likely be quicker than a similar operation through Cygwin, as
> Cygwin will always be affected by those NTFS overheads.

Ah, that's interesting. The files in question, that seem to be opened
(and *maybe* read) faster are in the *nix hierarchy, while my book files
are all in Windows (/mnt/c on WSL1). So the huge speedup reading those
makes sense.

The speedup processing the rest still doesn't quite make sense, unless
maybe WSL1's parsed-directory caching is more effective than Cygwin's or
something. (I assume something like that is going on, to reduce
conversions of directories to *nix format.)

Regards - Eliot
Re: Curiosity about file access performance
There are a bunch of different possibilities:

(*) temporary files - there was an improvement here in recent cygwin
    versions, which means that if your machine has lots of memory and
    your program creates lots of temporary files, then it will now be
    significantly faster

(*) file name lookup - linux has a path name cache, which makes it quite
    a bit faster than Windows for heavy use (git is the poster child
    here)

(*) file information lookup - some of the "default" Unix APIs will look
    up a bunch of information which is cheap on unix, but expensive on
    Windows. Normally there are alternative APIs which will only load
    the minimal set of information, which will then be cheaper on
    Windows.

(*) spawning - it is quite possible that Latex is making heavy use of
    spawning child processes to do various things, which is
    unfortunately more expensive on Windows.
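
The spawning item can be checked in isolation with a rough probe like the one below. This is a sketch, not a calibrated benchmark: the iteration count of 200 and the TRUE_BIN variable are hypothetical, /bin/true is assumed to exist (it normally does on both Cygwin and Linux), and `date +%s%N` is a GNU date extension.

```sh
#!/bin/sh
# Sketch only: time N launches of a trivial external program to get a
# feel for per-process spawn cost on the current system.
set -eu

n=${1:-200}                      # hypothetical default iteration count
true_bin=${TRUE_BIN:-/bin/true}  # external binary, not the shell builtin

start=$(date +%s%N)              # nanoseconds since the epoch (GNU date)
i=0
while [ "$i" -lt "$n" ]; do
    "$true_bin"                  # each iteration creates one child process
    i=$((i + 1))
done
end=$(date +%s%N)

elapsed_ms=$(( (end - start) / 1000000 ))
echo "$n spawns in ${elapsed_ms} ms"
```

If the figure per spawn is large and the LaTeX run does launch helper programs, that alone could account for part of the gap; if the run spawns nothing, this item can be ruled out.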
Re: Curiosity about file access performance
On Fri, 29 Oct 2021 at 10:36, Eliot Moss wrote:

> I think a lot of us know that fork() under Cygwin is slower than on Linux and
> have some grasp of why. But I have noticed that file access is rather slower
> under Cygwin as well. My "poster child" for this is running latex. I am
> working on writing a book, which includes a huge number of LaTeX style files
> and such. Under WSL1 (which has the same fork cost issues as Cygwin for
> similar reasons), reading the style files goes by in little more than the
> blink of an eye (about 1 sec), while on Cygwin it takes a little over 17
> seconds.
>
> The time to process the body of the book is 23 seconds under WSL1 and 35 under
> Cygwin. So the total times are 53 seconds under Cygwin and 24 under WSL1. I
> believe the LaTeX installations are the same versions, and I get the same
> outputs. Both LaTeX's are 64 bit programs. There is not much forking here
> (at least I don't believe there is, but maybe there is under the cover for
> doing things with pdf figures or something), but a fair amount of file I/O.
>
> For many / most things, the Cygwin overhead is tolerable; for running this
> book, since I will be doing it over and over, it was worth investing in
> getting everything set up on WSL1.
>
> But it got me wondering as to why?

AIUI it's a fundamental part of the trade-offs that NTFS makes:
compared to common Linux file systems like ext4, NTFS is much slower
at things like parsing directory structures (which is a necessary part
of opening any given file). In the same way that native Windows
programs tend to use threading implementations that work differently
to fork(), native Windows applications will also often much prefer
large monolithic data files, where native *nix applications are much
more likely to have lots of small files.

As a result, for things that require opening lots of files, WSL (at
least if you're using the native WSL disk, which will be a *nix disk
image stored in a file, rather than files under /mnt/c or similar)
will likely be quicker than a similar operation through Cygwin, as
Cygwin will always be affected by those NTFS overheads.
Re: Curiosity about file access performance
Sorry, it could depend on what we mean by "file access", so allow me to
try to clarify. I am grateful for your data, since they show that raw
data handling speed is good. But to read a file you have to open it. I
suspect that file lookup and opening may be an issue.

Which reminds me, I should check and see if any of the TeX lookup paths
are significantly different between the two cases!

Best wishes - Eliot
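
The open-versus-throughput distinction can be probed with many tiny files instead of one large one. The sketch below assumes GNU date for `%N`; the file count of 500, the COUNT variable, and the `.sty` names are invented for illustration and are not the actual style files. Freshly created files will be cached, so this mostly exercises name lookup and open, not disk reads.

```sh
#!/bin/sh
# Sketch only: create many small files, then time one pass that opens
# and reads all of them. Per-file cost dominates because each file is tiny.
set -eu

count=${COUNT:-500}   # hypothetical number of small files
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# Create the small files, one short line each.
i=0
while [ "$i" -lt "$count" ]; do
    printf 'file %s\n' "$i" >"$dir/f$i.sty"
    i=$((i + 1))
done

# A single cat process opens every file in turn; wc counts the bytes read.
start=$(date +%s%N)
total_bytes=$(cat "$dir"/*.sty | wc -c)
end=$(date +%s%N)

elapsed_ms=$(( (end - start) / 1000000 ))
echo "read $count files ($total_bytes bytes) in ${elapsed_ms} ms"
```

Comparing this figure between Cygwin and WSL1, on the same underlying directory where possible, separates per-file overhead from the bulk throughput that dd measures.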
Re: Curiosity about file access performance
On Fri, 29 Oct 2021 10:35:08 +0100 Eliot Moss wrote:

> I think a lot of us know that fork() under Cygwin is slower than on Linux and
> have some grasp of why. But I have noticed that file access is rather slower
> under Cygwin as well. My "poster child" for this is running latex. I am
> working on writing a book, which includes a huge number of LaTeX style files
> and such. Under WSL1 (which has the same fork cost issues as Cygwin for
> similar reasons), reading the style files goes by in little more than the
> blink of an eye (about 1 sec), while on Cygwin it takes a little over 17
> seconds.
>
> The time to process the body of the book is 23 seconds under WSL1 and 35 under
> Cygwin. So the total times are 53 seconds under Cygwin and 24 under WSL1. I
> believe the LaTeX installations are the same versions, and I get the same
> outputs. Both LaTeX's are 64 bit programs. There is not much forking here
> (at least I don't believe there is, but maybe there is under the cover for
> doing things with pdf figures or something), but a fair amount of file I/O.
>
> For many / most things, the Cygwin overhead is tolerable; for running this
> book, since I will be doing it over and over, it was worth investing in
> getting everything set up on WSL1.
>
> But it got me wondering as to why?

Why do you think the cause is the file access performance?

I tested the file access speed using dd as follows.

In cygwin:
[yano@Express5800-S70 ~]$ dd if=/dev/zero of=test.dat bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.186714 s, 2.8 GB/s
[yano@Express5800-S70 ~]$ dd if=test.dat of=/dev/null bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.125709 s, 4.2 GB/s

In WSL1:
Express5800-S70:~> dd if=/dev/zero of=test.dat bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.301657 s, 1.7 GB/s
Express5800-S70:~> dd if=test.dat of=/dev/null bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.229617 s, 2.3 GB/s

The result shows the file access performance of cygwin is better than
WSL1. I think the cause of your problem is something other than file
access performance.

-- 
Takashi Yano
Curiosity about file access performance
Dear Cygwiners -

I think a lot of us know that fork() under Cygwin is slower than on Linux and
have some grasp of why. But I have noticed that file access is rather slower
under Cygwin as well. My "poster child" for this is running latex. I am
working on writing a book, which includes a huge number of LaTeX style files
and such. Under WSL1 (which has the same fork cost issues as Cygwin for
similar reasons), reading the style files goes by in little more than the
blink of an eye (about 1 sec), while on Cygwin it takes a little over 17
seconds.

The time to process the body of the book is 23 seconds under WSL1 and 35 under
Cygwin. So the total times are 53 seconds under Cygwin and 24 under WSL1. I
believe the LaTeX installations are the same versions, and I get the same
outputs. Both LaTeX's are 64 bit programs. There is not much forking here
(at least I don't believe there is, but maybe there is under the cover for
doing things with pdf figures or something), but a fair amount of file I/O.

For many / most things, the Cygwin overhead is tolerable; for running this
book, since I will be doing it over and over, it was worth investing in
getting everything set up on WSL1.

But it got me wondering as to why?

Best wishes - Eliot