[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread paolo dot carlini at oracle dot com


--- Comment #1 from paolo dot carlini at oracle dot com  2010-09-07 09:42 ---
If the problem is in the stdio sync code, then file a glibc PR.


-- 

paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread tstarling at wikimedia dot org


--- Comment #2 from tstarling at wikimedia dot org  2010-09-07 10:46 ---
(In reply to comment #1)
> If the problem is in the stdio sync code, then file a glibc PR.

I mean the stdio sync code as in the code in libstdc++ which synchronises
with glibc, not actual code within glibc. If there was a problem with glibc,
glibc would be slow, but it isn't.


-- 

tstarling at wikimedia dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread paolo dot carlini at oracle dot com


--- Comment #3 from paolo dot carlini at oracle dot com  2010-09-07 11:15 ---
There is nothing we can do to further speed up the v3 side of the synced code.
Thus, unless you have evidence that other implementations perform much better
than v3, and provide details, this is closed.


-- 

paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread tstarling at wikimedia dot org


--- Comment #4 from tstarling at wikimedia dot org  2010-09-07 17:18 ---
Benchmarking on Solaris indicates that cin.getline() takes only 1us per
iteration there, but I don't think the source code is available, so it's hard
to provide details. 

However, I think that a huge speedup could be achieved by making
basic_istream<char>::getline() into a simple wrapper around a GNU-specific
virtual function in basic_streambuf. This would allow it to be specialised in
stdio_sync_filebuf, where it could be implemented using fgets() or getdelim()
instead of getc(). 

This would have the additional positive impact of making it atomic. Currently,
cin.getline() does not properly lock the underlying libc stream with
flockfile(). This means that if one thread is calling cin.getline(), and
another thread is calling getc(), then cin.getline() may return mangled partial
lines due to interleaved calls to getc() from the other thread.
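
The locking idea described above can be sketched as follows. This is only an illustration, assuming a POSIX system where flockfile(), funlockfile() and getc_unlocked() are available; locked_getline is a hypothetical helper name and is not part of libstdc++. It shows the atomicity point (the stream lock is held for the whole line), not the proposed speedup, since it still reads one character at a time.

```cpp
#include <stdio.h>   // FILE, EOF, flockfile(), funlockfile(), getc_unlocked() (POSIX)
#include <string>

// Hypothetical helper: reads one delimiter-terminated line from fp while
// holding the stdio stream lock, so that getc() calls from other threads
// cannot interleave with this read. The delimiter is consumed but not stored.
// Returns false only if end-of-file (or an error) is hit before any
// character was read.
bool locked_getline(FILE* fp, std::string& line, char delim = '\n') {
    line.clear();
    bool got_any = false;

    flockfile(fp);                       // take the lock once per line
    int c;
    while ((c = getc_unlocked(fp)) != EOF) {
        got_any = true;
        if (c == static_cast<unsigned char>(delim)) {
            break;                       // end of line, drop the delimiter
        }
        line.push_back(static_cast<char>(c));
    }
    funlockfile(fp);                     // release before returning

    return got_any;
}
```

Calling locked_getline(stdin, line) in a loop would read standard input line by line, with each line extracted under a single lock acquisition.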


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread paolo dot carlini at oracle dot com


--- Comment #5 from paolo dot carlini at oracle dot com  2010-09-07 17:49 ---
For sure we cannot add virtual functions to basic_streambuf without breaking
the ABI. Also, getline certainly isn't just fgets: it takes a delim char, uses
traits, etc. Sure, in principle you can often speed up special cases, but
given that in ~5-7 years nobody else has reported anything about the
performance of the synced getline, I don't think anything is going to happen
anytime soon. I could keep this open, but it would be futile: we have a lot of
work to do, for C++0x in particular.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread paolo dot carlini at oracle dot com


--- Comment #6 from paolo dot carlini at oracle dot com  2010-09-07 17:55 ---
By the way, I don't know anything about your testcase (it would be a good idea
to attach something here, just in case), but on my machines, i7 mostly, I don't
see anything similar to your performance gap. I see something closer to
9-10x, which, considering that a real synced mode must be unbuffered, seems
completely reasonable to me.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread redi at gcc dot gnu dot org


--- Comment #7 from redi at gcc dot gnu dot org  2010-09-07 19:50 ---
(In reply to comment #0)
> Calling ios::sync_with_stdio(false) before the loop start reduces the time per
> line to around 0.3us, on par with fgets(). This suggests that the problem is
> with the stdio synchronisation code.

It's well known (though maybe not well enough) that you should use
sync_with_stdio(false) to get good performance, unless you specifically need
the synchronisation.

(In reply to comment #4)
> Benchmarking on Solaris indicates that cin.getline() takes only 1us per
> iteration there, but I don't think the source code is available, so it's hard
> to provide details.

If you mean the classic iostreams provided with Sun Studio (rather than GCC on
Solaris or something else) then that stream library is not standard-conforming
and you're comparing apples and oranges.  If you mean the STLport iostreams
provided with Sun Studio and enabled by -library=stlport4, the source is
available, but I'd be surprised if you see a significant speed difference.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread tstarling at wikimedia dot org


--- Comment #8 from tstarling at wikimedia dot org  2010-09-08 01:34 ---
(In reply to comment #5)
> For sure we cannot add virtual functions to basic_streambuf without breaking
> the ABI.

I'm mostly looking for a long-term fix, to improve the speed of libstdc++
applications generally, especially those that don't have developers who would
go to the trouble to track down the source of slowness in their programs. The
short-term fix is to call ios::sync_with_stdio(false). So it's fine for me to
wait for the next major version. 

> Also, getline certainly isn't just fgets: it takes a delim char, uses
> traits, etc.

The delim char can be taken care of with getdelim(). I don't think it's
unreasonable to specialise for default traits; that would take care of 99% of
use cases.
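
To illustrate the getdelim() point, here is a minimal sketch of a line read with an arbitrary delimiter, assuming a POSIX.1-2008 system. getdelim_line is a hypothetical name for this example; it is not the proposed libstdc++ change, only a demonstration that the C library call accepts a caller-chosen delimiter.

```cpp
#include <stdio.h>      // FILE, getdelim() (POSIX.1-2008)
#include <stdlib.h>     // free()
#include <sys/types.h>  // ssize_t
#include <string>

// Hypothetical helper: reads up to and including the next `delim` from fp
// with a single getdelim() call and stores the text, without the delimiter,
// in `out`. Returns false at end-of-file or on error.
// For simplicity the buffer is allocated and freed on every call; a real
// implementation would reuse it across calls.
bool getdelim_line(FILE* fp, std::string& out, char delim) {
    char* buf = nullptr;   // getdelim() allocates when given a null buffer
    size_t cap = 0;
    const ssize_t n = getdelim(&buf, &cap, delim, fp);

    const bool ok = (n != -1);
    if (ok) {
        size_t len = static_cast<size_t>(n);
        if (len > 0 && buf[len - 1] == delim) {
            --len;         // strip the trailing delimiter, if present
        }
        out.assign(buf, len);
    }
    free(buf);             // the buffer must be freed even on failure
    return ok;
}
```

With delim set to '\n' this behaves like a line reader; any other single character can be used in the same way.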

> Sure, in principle you can often speed up special cases, but
> given that in ~5-7 years nobody else has reported anything about the
> performance of the synced getline, I don't think anything is going to happen
> anytime soon. I could keep this open, but it would be futile: we have a lot of
> work to do, for C++0x in particular.

OK, let's keep it open. 

(In reply to comment #6)
> By the way, I don't know anything about your testcase (it would be a good idea
> to attach something here, just in case), but on my machines, i7 mostly, I don't
> see anything similar to your performance gap. I see something closer to
> 9-10x, which, considering that a real synced mode must be unbuffered, seems
> completely reasonable to me.

Probably the main difference is the number of bytes per line in the input file.
I'm using a file with 1M lines and an average of 429 bytes per line. Using fewer
bytes per line would put more pressure on the constant per-line overhead,
and less on the inner loop.

But a 9-10x difference doesn't sound reasonable to me. The synced mode is not
unbuffered, before or after my suggested change; it uses the internal buffer in
glibc.

(In reply to comment #7)
> It's well known (though maybe not well enough) that you should use
> sync_with_stdio(false) to get good performance, unless you specifically need
> the synchronisation.

Maybe you should tell that to Paolo Carlini, who closed bug 15002 as resolved
fixed in 2004, or to Loren Rittle, who closed bug 5001 as resolved fixed in
2003, declaring "This issue was addressed by gcc 3.2.X such that
sync_with_stdio was no longer required for reasonable performance."


-- 

tstarling at wikimedia dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574



[Bug libstdc++/45574] ifstream::getline() is extremely slow

2010-09-07 Thread tstarling at wikimedia dot org


--- Comment #9 from tstarling at wikimedia dot org  2010-09-08 02:36 ---
Created an attachment (id=21732)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21732&action=view)
10 lines, 500 bytes per line

Test file attached as requested, compressed with gzip. Test code follows.

getline-test.cpp:

#include <iostream>

int main(int argc, char** argv) {
    char buffer[65536];
    // Read stdin line by line through the (synced) C++ stream; discard the data.
    while (std::cin.getline(buffer, sizeof(buffer), '\n'));
    return 0;
}

fgets-test.cpp:

#include <stdio.h>

int main(int argc, char** argv) {
    char buffer[65536];
    /* Read stdin line by line through C stdio; discard the data. */
    while (fgets(buffer, sizeof(buffer), stdin));
    return 0;
}

$ time ./fgets-test < 500x100k.txt

real    0m0.076s
user    0m0.040s
sys     0m0.032s

$ time ./getline-test < 500x100k.txt

real    0m2.727s
user    0m2.672s
sys     0m0.028s


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45574