Re: read two files simultaneously
That's true. Using a bigger buffer will help, but it doesn't explain why reading large files is slower than reading small files.

On Sat, Feb 21, 2009 at 5:56 PM, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl wrote:
>> I'm just guessing that the inode structure or the physical file location on the HDD might be related to this. But if I read only one file, the size doesn't matter. Reading a file (10M, 100M, 700M) consistently gives about 70MB/s, and the weird thing happens only when I read 2 big files.
>
> If you use O_DIRECT, data is read from disk exactly as you specified, without readahead, so you do a lot of seeks. Simply use a bigger buffer, like 1MB.

-- 
Junsuk
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: read two files simultaneously
Both of them.

Reading two 100M files interleaved with a 16K buffer: 62MB/s
Reading two 700M files interleaved with a 16K buffer: 9MB/s
Reading two 100M files interleaved with a 1M buffer: 55MB/s (somehow gets worse with the larger buffer)
Reading two 700M files interleaved with a 1M buffer: 34MB/s (gets better with the larger buffer, but there is still a difference, 55 vs 34)

I cannot find the reason for this. gstat(8) also shows low rates when reading large files interleaved, but not for small files.

On Sun, Feb 22, 2009 at 5:20 PM, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl wrote:
>> That's true. Using a bigger buffer will help, but it doesn't explain why reading large files is slower than reading small files.
>
> Really slower? Or just a bigger difference with large files?
>
>> On Sat, Feb 21, 2009 at 5:56 PM, Wojciech Puchar woj...@wojtek.tensor.gdynia.pl wrote:
>>>> I'm just guessing that the inode structure or the physical file location on the HDD might be related to this. But if I read only one file, the size doesn't matter. Reading a file (10M, 100M, 700M) consistently gives about 70MB/s, and the weird thing happens only when I read 2 big files.
>>>
>>> If you use O_DIRECT, data is read from disk exactly as you specified, without readahead, so you do a lot of seeks. Simply use a bigger buffer, like 1MB.

-- 
Junsuk
read two files simultaneously
Hello,

I need to read two files simultaneously, and I simply interleave read(2) calls to do this. The problem is that performance varies dramatically depending on the file size, and I'm wondering what causes this.

The test application does the following:
- open 2 files (both files are the same size)
- since I read only once, bypass the cache with O_DIRECT
- read 16Kbytes of file1, then read 16K of file2, and so on

Simplified code looks like this:

    fd1 = open(file1, O_RDONLY | O_DIRECT);
    fd2 = open(file2, O_RDONLY | O_DIRECT);
    for(...) {
        /* read 16K of file1 */
        while(...) {
            count = read(fd1,...);
        }
        /* read 16K of file2 */
        while(...) {
            count = read(fd2,...);
        }
    }

When I tested with two 100M files, it took 3.17 seconds (about 31MB/s per file, 62MB/s in total). However, with two 700M files, it took 162 seconds (about 4.5MB/s per file, 9MB/s in total).

I'm just guessing that the inode structure or the physical file location on the HDD might be related to this. But if I read only one file, the size doesn't matter. Reading a file (10M, 100M, 700M) consistently gives about 70MB/s, and the weird thing happens only when I read 2 big files.

Seek time might be related to this, but the difference looks too large for that alone. What is going on here?

Thanks.

-- 
Junsuk
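For anyone who wants to reproduce the test, the interleaved loop described above can be sketched roughly as follows. This is only a minimal sketch, not the poster's actual program: read_interleaved() and read_chunk() are hypothetical names, the 4096-byte buffer alignment is an assumption (direct I/O alignment requirements differ between systems), and the open flags are passed in by the caller so the same code can be tried with and without O_DIRECT.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Fill buf with up to len bytes, retrying after short reads.
 * Returns the number of bytes read (0 at EOF) or -1 on error. */
static ssize_t read_chunk(int fd, char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* EOF */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Hypothetical helper: alternate chunk-sized reads between two files
 * until both reach EOF. extra_flags is OR-ed into the open(2) flags
 * (the test in this thread used O_DIRECT).
 * Returns the total number of bytes read, or -1 on error. */
long long read_interleaved(const char *path1, const char *path2,
                           size_t chunk, int extra_flags)
{
    int fd[2];
    int live[2] = { 1, 1 };     /* 1 while the file has not hit EOF */
    void *mem = NULL;
    long long total = 0;

    assert(chunk > 0);
    fd[0] = open(path1, O_RDONLY | extra_flags);
    fd[1] = open(path2, O_RDONLY | extra_flags);
    /* Assumption: 4096-byte alignment is sufficient for direct I/O. */
    if (fd[0] < 0 || fd[1] < 0 || posix_memalign(&mem, 4096, chunk) != 0) {
        mem = NULL;
        total = -1;
    }

    while (total >= 0 && (live[0] || live[1])) {
        for (int i = 0; i < 2 && total >= 0; i++) {
            if (!live[i])
                continue;
            ssize_t n = read_chunk(fd[i], mem, chunk);
            if (n < 0)
                total = -1;     /* read error: stop */
            else if (n == 0)
                live[i] = 0;    /* this file is finished */
            else
                total += n;
        }
    }

    free(mem);
    for (int i = 0; i < 2; i++)
        if (fd[i] >= 0)
            close(fd[i]);
    return total;
}
```

Calling read_interleaved(file1, file2, 16 * 1024, O_DIRECT) corresponds to the 16K case, and passing 1024 * 1024 as the chunk size corresponds to the 1M case, so the same helper can be timed with both buffer sizes.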
read() vs fread()
Hi BSD guys,

While doing a simple file read test, I found that there is a huge difference in file read performance between read() and fread(). I'm wondering if I'm doing something wrong or if someone has experienced similar things.

Here is what I did. For this specific application, I need to bypass the cache (I read only once, and that's all). The test file is a 700Mbyte dummy file. The test app just reads the whole file. The test was done on FreeBSD 7.1 amd64, Celeron E1200, WD Caviar SE16 SATA 7200 RPM.

For test 1:

    fd = open(name, O_RDONLY | O_DIRECT);
    while(...) {
        cnt = read();
    }

For test 2:

    fd = open(name, O_RDONLY | O_DIRECT);
    file = fdopen(fd, "r");
    while(...) {
        cnt = fread();
    }

Test 1 takes about 11.64 seconds (63 MBytes/s), and test 2 takes about 51.53 seconds (14 MBytes/s).

If I use the pair of fopen() and fread(), the cache comes into play, so the result doesn't say much about HDD performance. Personally, I don't think the overhead of fread() (a wrapper in libc) is that huge. What would be the reason for this?

Thanks.

-- 
Junsuk
Re: read() vs fread()
setvbuf(file, buf, _IOFBF, bufsize) solved the problem perfectly. Thanks a lot.

On Fri, Feb 20, 2009 at 4:09 PM, Pieter de Goeje pie...@degoeje.nl wrote:
> On Friday 20 February 2009 21:07:57 Junsuk Shin wrote:
>> Hi BSD guys,
>>
>> While doing a simple file read test, I found that there is a huge difference in file read performance between read() and fread(). I'm wondering if I'm doing something wrong or if someone has experienced similar things.
>>
>> Here is what I did. For this specific application, I need to bypass the cache (I read only once, and that's all). The test file is a 700Mbyte dummy file. The test app just reads the whole file. The test was done on FreeBSD 7.1 amd64, Celeron E1200, WD Caviar SE16 SATA 7200 RPM.
>>
>> For test 1:
>>
>>     fd = open(name, O_RDONLY | O_DIRECT);
>>     while(...) {
>>         cnt = read();
>>     }
>>
>> For test 2:
>>
>>     fd = open(name, O_RDONLY | O_DIRECT);
>>     file = fdopen(fd, "r");
>>     while(...) {
>>         cnt = fread();
>>     }
>>
>> Test 1 takes about 11.64 seconds (63 MBytes/s), and test 2 takes about 51.53 seconds (14 MBytes/s).
>>
>> If I use the pair of fopen() and fread(), the cache comes into play, so the result doesn't say much about HDD performance. Personally, I don't think the overhead of fread() (a wrapper in libc) is that huge. What would be the reason for this?
>
> The reason is that by default a FILE has a really small internal buffer. Take a look at gstat(8) while running the test: you can clearly see an insane number of I/O requests being done (almost 5000 reads per second on my HDD). To solve this, call setvbuf(3):
>
>     setvbuf(file, buf, _IOFBF, bufsize);
>
> A bufsize of 16k or bigger should help a lot. After this modification, I see about 900 reads per second (using bufsize = 64k) and the read speed is equal to the read(2) case.
>
> Regards,
> Pieter de Goeje

-- 
Junsuk