On Thu, Jan 8, 2026 at 10:55 AM Tomasz Kaminski <[email protected]> wrote:
> > > On Thu, Jan 8, 2026 at 10:38 AM Tomasz Kaminski <[email protected]> > wrote: > >> I have performed some microbenchmarking, on following example: >> * bench_empty: input is empty istreamstream, creates iterator and >> compares against end >> * bench_stream_create: only resetting stringstream with new content >> * bench_preinc: for loop over using pre increment >> * bench_postinc: for loop over using post increment >> * bench_deref_posting: while with *it++ >> >> This are the result I have observed, for the changes with the patch: >> -------------------------------------------------------------- >> Benchmark Time CPU Iterations >> -------------------------------------------------------------- >> bench_empty 2.10 ns 2.09 ns 337921434 >> bench_stream_create 10.6 ns 10.6 ns 65736840 >> bench_preinc 419 ns 419 ns 1662732 >> bench_postinc 436 ns 435 ns 1609828 >> bench_deref_postinc 306 ns 305 ns 22866 >> >> And the result before the patch: >> -------------------------------------------------------------- >> Benchmark Time CPU Iterations >> -------------------------------------------------------------- >> bench_empty 2.27 ns 2.26 ns 314944156 >> bench_stream_create 11.7 ns 11.6 ns 60679957 >> bench_preinc 873 ns 871 ns 807352 >> bench_postinc 920 ns 917 ns 751565 >> bench_deref_postinc 706 ns 704 ns 1038942 >> >> For sanity, I have also check the result on g++ 15.2 that ships with my >> system: >> -------------------------------------------------------------- >> Benchmark Time CPU Iterations >> -------------------------------------------------------------- >> bench_empty 2.30 ns 2.30 ns 303070664 >> bench_stream_create 10.6 ns 10.6 ns 65778662 >> bench_preinc 767 ns 765 ns 915085 >> bench_postinc 767 ns 764 ns 913985 >> bench_deref_postinc 675 ns 673 ns 1107772 >> >> So it looks like eliminating save to member from both equal and operator* >> makes implementation >> two times faster. I will look into that. >> >> Attaching also my benchmark file. >> >> Regards, >> Tomasz >> >> On Wed, Jan 7, 2026 at 3:41 PM Tomasz Kamiński <[email protected]> >> wrote: >> >>> The warning was produced by following seqeunce, given an >>> istream_iterator<char> >>> it, such that *it will result in hitting EoF in it->_M_get(), and thus >>> clearing >>> _M_sbuf, the subsequent call to ++it, will result in _M_sbuf->sbumpc() >>> call >>> on null pointer dereference. This is however an false-positive, as in >>> such sitution >>> it == istream_iteator() returns true, and the iterator should not be >>> dereferenced >>> in first place. >>> >>> This patch addresses the above by clearing the _M_sbuf in operator++, >>> instead of >>> _M_get(). This removes the need for making _M_sbuf mutable, and thus >>> make the >>> implementation conforming with regards to C++11 [res.on.data.races] p3. >>> >>> This change should not or positive have performance impact on the usual >>> iteration >>> patterns, in form: >>> while (it != end) { process(*it); ++it; } >>> In case when it is end-of-stream iterator, the it != end returns in one >>> call of >>> _M_sbuf->sgetc() both before and after the change. However we do not >>> modify _M_sbuf >>> in this case. For non-empty range, we replace call to _M_sbuf->sbumpc() >>> with >>> _M_sbuf->snextc() in pre-increment, and extract the check again from *it >>> to >>> ++it. However, as _M_sbuf is now cleared during increment, so last it != >>> end >>> check avoids _M_sbuf->sgetc() call to check against EoF. >>> >>> However, this change impact the behavior of the post-increment (*it++), >>> as >>> we now load both current character (for return value) and next character >>> (to >>> check against EoF). In consequence we call both sgetc() and snextc(), in >>> contrast >>> to previous single sbumpc() call. >>> >>> PR libstdc++/105580 >>> >>> libstdc++-v3/ChangeLog: >>> >>> * include/bits/streambuf_iterator.h >>> (istreambuf_iterator::_M_sbuf): >>> Remove mutable and adjust whitespace. >>> (istreambuf_iterator::_M_c): Adjust whitespace. >>> (istreambuf_iterator::operator++()): Clear _M_sbuf if next >>> character >>> is EoF. >>> (istreambuf_iterator::operator++(int)): Use _M_sbuf->sgetc() to >>> load current character, and define in terms of __*this. >>> (istreambuf_iterator::_M_get()): Do not clear _M_sbuf in case of >>> EoF. >>> * testsuite/24_iterators/istreambuf_iterator/2.cc: Test for using >>> multiple iterators to same rdbuf. >>> --- >>> This warning turned out to be false-positive (as explained in the commit >>> decription), so we can just add add pragrams to address that. But this >>> change seem to be beneficial anyway, as we are no longer causing >>> data-races. >>> >>> Tested on x86_64-linux. Locally tested with -Wnull-dereference, >>> and the istream_iterator warning in 24_iterators/istreambuf_iterator/2.cc >>> disappears (there are some other warning present I haven't looked into >>> that yet). >>> >>> .../include/bits/streambuf_iterator.h | 20 +++---- >>> .../24_iterators/istreambuf_iterator/2.cc | 58 +++++++++++++++++++ >>> 2 files changed, 66 insertions(+), 12 deletions(-) >>> >>> diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h >>> b/libstdc++-v3/include/bits/streambuf_iterator.h >>> index 93d3dd24f93..095928ca4d8 100644 >>> --- a/libstdc++-v3/include/bits/streambuf_iterator.h >>> +++ b/libstdc++-v3/include/bits/streambuf_iterator.h >>> @@ -112,8 +112,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >>> // the "end of stream" iterator value. >>> // NB: This implementation assumes the "end of stream" value >>> // is EOF, or -1. >>> - mutable streambuf_type* _M_sbuf; >>> - int_type _M_c; >>> + streambuf_type* _M_sbuf; >>> + int_type _M_c; >>> >>> public: >>> /// Construct end of input stream iterator. >>> @@ -172,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >>> >>> _M_message(__gnu_debug::__msg_inc_istreambuf) >>> ._M_iterator(*this)); >>> >>> - _M_sbuf->sbumpc(); >>> + if (_S_is_eof(_M_sbuf->snextc())) >>> + _M_sbuf = 0; >>> _M_c = traits_type::eof(); >>> return *this; >>> } >>> @@ -181,14 +182,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >>> istreambuf_iterator >>> operator++(int) >>> { >>> - __glibcxx_requires_cond(_M_sbuf && >>> - (!_S_is_eof(_M_c) || >>> !_S_is_eof(_M_sbuf->sgetc())), >>> - >>> _M_message(__gnu_debug::__msg_inc_istreambuf) >>> - ._M_iterator(*this)); >>> - >>> istreambuf_iterator __old = *this; >>> - __old._M_c = _M_sbuf->sbumpc(); >>> - _M_c = traits_type::eof(); >>> + __old._M_c = _M_sbuf->sgetc(); >>> + ++*this; >>> return __old; >>> } >>> >>> @@ -206,8 +202,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >>> _M_get() const >>> { >>> int_type __ret = _M_c; >>> - if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret = >>> _M_sbuf->sgetc())) >>> - _M_sbuf = 0; >>> + if (_M_sbuf && _S_is_eof(__ret)) >>> + __ret = _M_sbuf->sgetc(); >>> >> The benefit seem to be result of eliminating the if condition here, as we > are now having > one less eof check per loop, previously we got: > O: 4: 2x2 in _M_get per operator* and equal > N: 3: 2x1 in _M_get per operator* and equal, and one in operator++ > We could also eliminate the _S_is_eof(__ret) check, by returning __proxy instead of istream_iterator for pre-increment, and eliminate uses of _M_c completely. This will however break already non-portable code that uses: istream_iteartor i; istream_iteartor i2 = i++; We cannot provide the conversion operator, as this would require reintroducing S_is_eof check. I have prototyped that, and there is a measurable benefit from doing it, but I think something that we should do in stage 1 for GCC 17. return __ret; >>> } >>> >>> diff --git >>> a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >>> b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >>> index 78701d71cee..0318a677f35 100644 >>> --- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >>> +++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc >>> @@ -109,8 +109,66 @@ void test02(void) >>> } >>> } >>> >>> +void >>> +test_empty() >>> +{ >>> + std::istreambuf_iterator<char> null(0), end; >>> + VERIFY( null == end ); >>> + >>> + std::istringstream ess; >>> + // Thie specification hear seem to indicate, that such iterators >>> + // are not end-of-stream iterators, as rdbuf pointer is nullptr, >>> + // however we treat them as such, as otherwise we would produce >>> + // a range containing single eof character. >>> + std::istreambuf_iterator<char> it1(ess), it2(ess); >>> + VERIFY( it1 == end ); >>> + VERIFY( it2 == end ); >>> +} >>> + >>> +void >>> +test_multi() >>> +{ >>> + // C++98 (and later) define operator* in [istreambuf.iterator.ops] as: >>> + // Returns: The character obtained via the streambuf member >>> sbuf_->sgetc(). >>> + // This nails down behavior of mulptiple iterators to same sequence. >>> + std::istringstream iss("abcd"); >>> + std::istreambuf_iterator<char> it1(iss), it2(iss), end; >>> + >>> + VERIFY( it1 != end ); >>> + VERIFY( it2 != end ); >>> + VERIFY( *it1 == 'a' ); >>> + VERIFY( *it2 == 'a' ); >>> + ++it1; >>> + >>> + VERIFY( it1 != end ); >>> + VERIFY( it2 != end ); >>> + VERIFY( *it1 == 'b' ); >>> + VERIFY( *it2 == 'b' ); >>> + ++it2; >>> + >>> + VERIFY( it1 != end ); >>> + VERIFY( it2 != end ); >>> + VERIFY( *it1 == 'c' ); >>> + VERIFY( *it2 == 'c' ); >>> + ++it2; >>> + >>> + VERIFY( it1 != end ); >>> + VERIFY( it2 != end ); >>> + VERIFY( *it1 == 'd' ); >>> + VERIFY( *it2 == 'd' ); >>> + // second dereference >>> + VERIFY( *it1 == 'd' ); >>> + VERIFY( *it2 == 'd' ); >>> + ++it1; >>> + >>> + VERIFY( it1 == end ); >>> + VERIFY( it2 == end ); >>> +} >>> + >>> int main() >>> { >>> test02(); >>> + test_empty(); >>> + test_multi(); >>> return 0; >>> } >>> -- >>> 2.52.0 >>> >>>
