I have performed some microbenchmarking, on following example:
* bench_empty: input is empty istreamstream, creates iterator and compares
against end
* bench_stream_create: only resetting stringstream with new content
* bench_preinc: for loop over using pre increment
* bench_postinc: for loop over using post increment
* bench_deref_posting: while with *it++
This are the result I have observed, for the changes with the patch:
--------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------
bench_empty 2.10 ns 2.09 ns 337921434
bench_stream_create 10.6 ns 10.6 ns 65736840
bench_preinc 419 ns 419 ns 1662732
bench_postinc 436 ns 435 ns 1609828
bench_deref_postinc 306 ns 305 ns 22866
And the result before the patch:
--------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------
bench_empty 2.27 ns 2.26 ns 314944156
bench_stream_create 11.7 ns 11.6 ns 60679957
bench_preinc 873 ns 871 ns 807352
bench_postinc 920 ns 917 ns 751565
bench_deref_postinc 706 ns 704 ns 1038942
For sanity, I have also check the result on g++ 15.2 that ships with my
system:
--------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------
bench_empty 2.30 ns 2.30 ns 303070664
bench_stream_create 10.6 ns 10.6 ns 65778662
bench_preinc 767 ns 765 ns 915085
bench_postinc 767 ns 764 ns 913985
bench_deref_postinc 675 ns 673 ns 1107772
So it looks like eliminating save to member from both equal and operator*
makes implementation
two times faster. I will look into that.
Attaching also my benchmark file.
Regards,
Tomasz
On Wed, Jan 7, 2026 at 3:41 PM Tomasz Kamiński <[email protected]> wrote:
> The warning was produced by following seqeunce, given an
> istream_iterator<char>
> it, such that *it will result in hitting EoF in it->_M_get(), and thus
> clearing
> _M_sbuf, the subsequent call to ++it, will result in _M_sbuf->sbumpc() call
> on null pointer dereference. This is however an false-positive, as in such
> sitution
> it == istream_iteator() returns true, and the iterator should not be
> dereferenced
> in first place.
>
> This patch addresses the above by clearing the _M_sbuf in operator++,
> instead of
> _M_get(). This removes the need for making _M_sbuf mutable, and thus make
> the
> implementation conforming with regards to C++11 [res.on.data.races] p3.
>
> This change should not or positive have performance impact on the usual
> iteration
> patterns, in form:
> while (it != end) { process(*it); ++it; }
> In case when it is end-of-stream iterator, the it != end returns in one
> call of
> _M_sbuf->sgetc() both before and after the change. However we do not
> modify _M_sbuf
> in this case. For non-empty range, we replace call to _M_sbuf->sbumpc()
> with
> _M_sbuf->snextc() in pre-increment, and extract the check again from *it to
> ++it. However, as _M_sbuf is now cleared during increment, so last it !=
> end
> check avoids _M_sbuf->sgetc() call to check against EoF.
>
> However, this change impact the behavior of the post-increment (*it++), as
> we now load both current character (for return value) and next character
> (to
> check against EoF). In consequence we call both sgetc() and snextc(), in
> contrast
> to previous single sbumpc() call.
>
> PR libstdc++/105580
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/streambuf_iterator.h (istreambuf_iterator::_M_sbuf):
> Remove mutable and adjust whitespace.
> (istreambuf_iterator::_M_c): Adjust whitespace.
> (istreambuf_iterator::operator++()): Clear _M_sbuf if next
> character
> is EoF.
> (istreambuf_iterator::operator++(int)): Use _M_sbuf->sgetc() to
> load current character, and define in terms of __*this.
> (istreambuf_iterator::_M_get()): Do not clear _M_sbuf in case of
> EoF.
> * testsuite/24_iterators/istreambuf_iterator/2.cc: Test for using
> multiple iterators to same rdbuf.
> ---
> This warning turned out to be false-positive (as explained in the commit
> decription), so we can just add add pragrams to address that. But this
> change seem to be beneficial anyway, as we are no longer causing
> data-races.
>
> Tested on x86_64-linux. Locally tested with -Wnull-dereference,
> and the istream_iterator warning in 24_iterators/istreambuf_iterator/2.cc
> disappears (there are some other warning present I haven't looked into
> that yet).
>
> .../include/bits/streambuf_iterator.h | 20 +++----
> .../24_iterators/istreambuf_iterator/2.cc | 58 +++++++++++++++++++
> 2 files changed, 66 insertions(+), 12 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h
> b/libstdc++-v3/include/bits/streambuf_iterator.h
> index 93d3dd24f93..095928ca4d8 100644
> --- a/libstdc++-v3/include/bits/streambuf_iterator.h
> +++ b/libstdc++-v3/include/bits/streambuf_iterator.h
> @@ -112,8 +112,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> // the "end of stream" iterator value.
> // NB: This implementation assumes the "end of stream" value
> // is EOF, or -1.
> - mutable streambuf_type* _M_sbuf;
> - int_type _M_c;
> + streambuf_type* _M_sbuf;
> + int_type _M_c;
>
> public:
> /// Construct end of input stream iterator.
> @@ -172,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
> _M_message(__gnu_debug::__msg_inc_istreambuf)
> ._M_iterator(*this));
>
> - _M_sbuf->sbumpc();
> + if (_S_is_eof(_M_sbuf->snextc()))
> + _M_sbuf = 0;
> _M_c = traits_type::eof();
> return *this;
> }
> @@ -181,14 +182,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> istreambuf_iterator
> operator++(int)
> {
> - __glibcxx_requires_cond(_M_sbuf &&
> - (!_S_is_eof(_M_c) ||
> !_S_is_eof(_M_sbuf->sgetc())),
> -
> _M_message(__gnu_debug::__msg_inc_istreambuf)
> - ._M_iterator(*this));
> -
> istreambuf_iterator __old = *this;
> - __old._M_c = _M_sbuf->sbumpc();
> - _M_c = traits_type::eof();
> + __old._M_c = _M_sbuf->sgetc();
> + ++*this;
> return __old;
> }
>
> @@ -206,8 +202,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> _M_get() const
> {
> int_type __ret = _M_c;
> - if (_M_sbuf && _S_is_eof(__ret) && _S_is_eof(__ret =
> _M_sbuf->sgetc()))
> - _M_sbuf = 0;
> + if (_M_sbuf && _S_is_eof(__ret))
> + __ret = _M_sbuf->sgetc();
> return __ret;
> }
>
> diff --git a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
> b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
> index 78701d71cee..0318a677f35 100644
> --- a/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
> +++ b/libstdc++-v3/testsuite/24_iterators/istreambuf_iterator/2.cc
> @@ -109,8 +109,66 @@ void test02(void)
> }
> }
>
> +void
> +test_empty()
> +{
> + std::istreambuf_iterator<char> null(0), end;
> + VERIFY( null == end );
> +
> + std::istringstream ess;
> + // Thie specification hear seem to indicate, that such iterators
> + // are not end-of-stream iterators, as rdbuf pointer is nullptr,
> + // however we treat them as such, as otherwise we would produce
> + // a range containing single eof character.
> + std::istreambuf_iterator<char> it1(ess), it2(ess);
> + VERIFY( it1 == end );
> + VERIFY( it2 == end );
> +}
> +
> +void
> +test_multi()
> +{
> + // C++98 (and later) define operator* in [istreambuf.iterator.ops] as:
> + // Returns: The character obtained via the streambuf member
> sbuf_->sgetc().
> + // This nails down behavior of mulptiple iterators to same sequence.
> + std::istringstream iss("abcd");
> + std::istreambuf_iterator<char> it1(iss), it2(iss), end;
> +
> + VERIFY( it1 != end );
> + VERIFY( it2 != end );
> + VERIFY( *it1 == 'a' );
> + VERIFY( *it2 == 'a' );
> + ++it1;
> +
> + VERIFY( it1 != end );
> + VERIFY( it2 != end );
> + VERIFY( *it1 == 'b' );
> + VERIFY( *it2 == 'b' );
> + ++it2;
> +
> + VERIFY( it1 != end );
> + VERIFY( it2 != end );
> + VERIFY( *it1 == 'c' );
> + VERIFY( *it2 == 'c' );
> + ++it2;
> +
> + VERIFY( it1 != end );
> + VERIFY( it2 != end );
> + VERIFY( *it1 == 'd' );
> + VERIFY( *it2 == 'd' );
> + // second dereference
> + VERIFY( *it1 == 'd' );
> + VERIFY( *it2 == 'd' );
> + ++it1;
> +
> + VERIFY( it1 == end );
> + VERIFY( it2 == end );
> +}
> +
> int main()
> {
> test02();
> + test_empty();
> + test_multi();
> return 0;
> }
> --
> 2.52.0
>
>
#include <benchmark/benchmark.h>
#include <iterator>
#include <sstream>
const std::istreambuf_iterator<char> end;
const size_t n = 1000;
const std::string src(n, 'a');
void bench_empty(benchmark::State& s) {
std::istringstream empty;
while (s.KeepRunning())
{
std::istreambuf_iterator<char> it(empty);
benchmark::DoNotOptimize(it != end);
}
}
BENCHMARK(bench_empty);
void bench_stream_create(benchmark::State& s) {
std::istringstream is(src);
while (s.KeepRunning())
{
is.str(src);
benchmark::DoNotOptimize(is);
}
}
BENCHMARK(bench_stream_create);
void bench_preinc(benchmark::State& s) {
std::istringstream is(src);
while (s.KeepRunning())
{
is.str(src);
for (std::istreambuf_iterator<char> it(is); it != end; ++it)
benchmark::DoNotOptimize(*it);
}
}
BENCHMARK(bench_preinc);
void bench_postinc(benchmark::State& s) {
std::istringstream is(src);
while (s.KeepRunning())
{
is.str(src);
for (std::istreambuf_iterator<char> it(is); it != end; it++)
benchmark::DoNotOptimize(*it);
}
}
BENCHMARK(bench_postinc);
void bench_deref_postinc(benchmark::State& s) {
std::istringstream is(src);
while (s.KeepRunning())
{
is.str(src);
std::istreambuf_iterator<char> it(is);
while (it != end)
benchmark::DoNotOptimize(*it++);
}
}
BENCHMARK(bench_deref_postinc);
BENCHMARK_MAIN();