[Bug libstdc++/67922] std::unordered_map::clear should take time linear in the number of elements

2015-11-02 Thread yegor.derevenets at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922

--- Comment #3 from Yegor Derevenets  ---
A small correction. A colleague of mine bothered to read the source code of
libc++ and noticed that its implementation of clear() method also generally
takes time, linear in the number of buckets. This was not visible in the tests
I already presented, because libc++ has special handling of the empty container
case:

template 
void
__hash_table<_Tp, _Hash, _Equal, _Alloc>::clear() _NOEXCEPT
{
if (size() > 0)
{
__deallocate(__p1_.first().__next_);
__p1_.first().__next_ = nullptr;
size_type __bc = bucket_count();
for (size_type __i = 0; __i < __bc; ++__i)
__bucket_list_[__i] = nullptr;
size() = 0;
}
}

If we modify the test example slightly, we get:

$ cat test_clear.cpp
#include 

int main() {
std::unordered_map m;
for (int i = 0; i < 100; ++i) {
m[i] = i;
}
for (int i = 0; i < 1000; ++i) {
m[i] = i;
m.clear();
}
}

$ clang++-3.7 -O2 -std=c++11 -stdlib=libstdc++ test_clear.cpp && time ./a.out

real0m4.054s
user0m4.000s
sys 0m0.036s

$ clang++-3.7 -O2 -std=c++11 -stdlib=libc++ test_clear.cpp && time ./a.out

real0m6.114s
user0m6.000s
sys 0m0.036s

$ cat test_erase.cpp
#include 

int main() {
std::unordered_map m;
for (int i = 0; i < 100; ++i) {
m[i] = i;
}
for (int i = 0; i < 1000; ++i) {
m[i] = i;
m.erase(m.begin(), m.end());
}
}

$ clang++-3.7 -O2 -std=c++11 -stdlib=libstdc++ test_erase.cpp && time ./a.out

real0m0.151s
user0m0.116s
sys 0m0.036s

$ clang++-3.7 -O2 -std=c++11 -stdlib=libc++ test_erase.cpp && time ./a.out

real0m0.187s
user0m0.156s
sys 0m0.028s

I find it strange that both libraries implement clear() less efficiently than
erase(m.begin(), m.end()). Maybe there is a rationale for this which I do not
understand?

[Bug libstdc++/67922] std::unordered_map::clear should take time linear in the number of elements

2015-10-12 Thread yegor.derevenets at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922

--- Comment #2 from Yegor Derevenets  ---
> But then the issue is that clear () doesn't shrink the map.
No, the issue is that clear() touches all the buckets, instead of touching only
those containing the elements. libc++'s implementation does not shrink the map
(does not decrease the number of buckets), and is still fast. Actually, if in
libstdc++ you do m.erase(m.begin(), m.end()) instead of m.clear(), it will be
fast too.

#include 
#include 

int main() {
std::unordered_map m;
for (int i = 0; i < 100; ++i) {
m[i] = i;
}
std::cout << "Before clear(): " << m.bucket_count() << std::endl;
m.clear();
std::cout << "After clear(): " << m.bucket_count() << std::endl;
}

$ clang++-3.7 -stdlib=libstdc++ -O2 -std=c++11 test.cpp && time ./a.out
Before clear(): 1056323
After clear(): 1056323

real0m0.074s
user0m0.076s
sys 0m0.000s

$ clang++-3.7 -stdlib=libc++ -O2 -std=c++11 test.cpp && time ./a.out
Before clear(): 1646237
After clear(): 1646237

real0m0.102s
user0m0.080s
sys 0m0.016s


[Bug libstdc++/67922] New: std::unordered_map::clear should take time linear in the number of elements

2015-10-10 Thread yegor.derevenets at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922

Bug ID: 67922
   Summary: std::unordered_map::clear should take time linear in
the number of elements
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yegor.derevenets at gmail dot com
  Target Milestone: ---

std::unordered_map::clear internally clears the whole array of buckets using
memset. Consequently, the clearing time is proportional to the number of
buckets, instead of being proportional to the number of elements, which one
would expect, and what arguably should be the case according to the C++
Standard. (The Standard specifies that clear's complexity is linear, but
unfortunately does not say, linear in what.) This leads to terrible performance
of the following code:

#include 

int main() {
std::unordered_map m;
for (int i = 0; i < 100; ++i) {
m[i] = i;
}
for (int i = 0; i < 1000; ++i) {
m.clear();
}
}

$ g++-5 -O2 test.cpp -std=c++11 && time ./a.out

real0m4.009s
user0m3.924s
sys 0m0.028s

$ clang++-3.7 -stdlib=libstdc++ -O2 test.cpp -std=c++11 && time ./a.out

real0m4.153s
user0m3.976s
sys 0m0.036s

If you build the same program with libc++, the problem goes away:

$ clang++-3.7 -stdlib=libc++ -O2 test.cpp -std=c++11 && time ./a.out

real0m0.165s
user0m0.128s
sys 0m0.036s

You can achieve similar performance with libstdc++ if you implement clear()
manually in the most naive way:

#include 

int main() {
std::unordered_map m;
for (int i = 0; i < 100; ++i) {
m[i] = i;
}
for (int i = 0; i < 1000; ++i) {
while (m.begin() != m.end()) {
m.erase(m.begin());
}
}
}

$ g++-5 -O2 test.cpp -std=c++11 && time ./a.out

real0m0.167s
user0m0.132s
sys 0m0.028s

$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 5.2.1-21'
--with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 5.2.1 20151003 (Debian 5.2.1-21)


[Bug libstdc++/55123] [C++11] Construction of shared_ptr from unique_ptr fails

2012-10-29 Thread yegor.derevenets at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55123



--- Comment #4 from Yegor Derevenets  
2012-10-29 18:35:45 UTC ---

Confirm, the patch works for me.


[Bug libstdc++/55123] [C++11] Construction of shared_ptr from unique_ptr fails

2012-10-29 Thread yegor.derevenets at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55123



--- Comment #1 from Yegor Derevenets  
2012-10-29 17:09:49 UTC ---

Created attachment 28561

  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28561

Preprocessed test.cpp


[Bug libstdc++/55123] New: [C++11] Construction of shared_ptr from unique_ptr fails

2012-10-29 Thread yegor.derevenets at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55123

 Bug #: 55123
   Summary: [C++11] Construction of shared_ptr from
unique_ptr fails
Classification: Unclassified
   Product: gcc
   Version: 4.7.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: yegor.dereven...@gmail.com


Hello,

construction of a shared_ptr from a unique_ptr to a const object seems not to
work, although it should. Preprocessed file is attached.

[yegor@tomato tmp]$ cat test.cpp
#include 

void f() {
std::unique_ptr y;
std::shared_ptr x = std::move(y);
}


[yegor@tomato tmp]$ g++ -c -std=c++11 -save-temps test.cpp
In file included from
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/x86_64-unknown-linux-gnu/bits/c++allocator.h:34:0,
 from
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/allocator.h:48,
 from
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/memory:65,
 from test.cpp:1:
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/ext/new_allocator.h:
In instantiation of ‘struct __gnu_cxx::new_allocator’:
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/allocator.h:89:11:
  required from ‘class std::allocator’
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/shared_ptr_base.h:329:14:
  required from ‘struct std::_Sp_counted_deleter, std::allocator,
(__gnu_cxx::_Lock_policy)2u>::_My_Deleter’
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/shared_ptr_base.h:374:24:
  required from ‘class std::_Sp_counted_deleter, std::allocator,
(__gnu_cxx::_Lock_policy)2u>’
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/shared_ptr_base.h:625:39:
  required from ‘static std::_Sp_counted_base<_Lp>*
std::__shared_count<_Lp>::_S_create_from_up(std::unique_ptr<_Up, _Ep>&&,
typename std::enable_if<(! std::is_reference<_Del>::value)>::type*) [with _Tp =
const int; _Del = std::default_delete; __gnu_cxx::_Lock_policy _Lp =
(__gnu_cxx::_Lock_policy)2u; typename std::enable_if<(!
std::is_reference<_Del>::value)>::type = void]’
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/shared_ptr_base.h:549:43:
  required from ‘std::__shared_count<_Lp>::__shared_count(std::unique_ptr<_Up,
_Ep>&&) [with _Tp = const int; _Del = std::default_delete;
__gnu_cxx::_Lock_policy _Lp = (__gnu_cxx::_Lock_policy)2u]’
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/shared_ptr_base.h:855:4:
  required from ‘std::__shared_ptr<_Tp, _Lp>::__shared_ptr(std::unique_ptr<_Up,
_Ep>&&) [with _Tp1 = const int; _Del = std::default_delete; _Tp =
const int; __gnu_cxx::_Lock_policy _Lp = (__gnu_cxx::_Lock_policy)2u]’
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/bits/shared_ptr.h:259:36:
  required from ‘std::shared_ptr<_Tp>::shared_ptr(std::unique_ptr<_Up, _Ep>&&)
[with _Tp1 = const int; _Del = std::default_delete; _Tp = const
int]’
test.cpp:5:44:   required from here
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/ext/new_allocator.h:83:7:
error: ‘const _Tp*
__gnu_cxx::new_allocator<_Tp>::address(__gnu_cxx::new_allocator<_Tp>::const_reference)
const [with _Tp = const int; __gnu_cxx::new_allocator<_Tp>::const_pointer =
const int*; __gnu_cxx::new_allocator<_Tp>::const_reference = const int&]’
cannot be overloaded
/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/../../../../include/c++/4.7.2/ext/new_allocator.h:79:7:
error: with ‘_Tp*
__gnu_cxx::new_allocator<_Tp>::address(__gnu_cxx::new_allocator<_Tp>::reference)
const [with _Tp = const int; __gnu_cxx::new_allocator<_Tp>::pointer = const
int*; __gnu_cxx::new_allocator<_Tp>::reference = const int&]’


[yegor@tomato tmp]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /build/src/gcc-4.7.2/configure --prefix=/usr --libdir=/usr/lib
--libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared
--enable-threads=posix --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
--enable-libstdcxx-time --enable-gnu-unique-object --enable-linker-build-id
--with-ppl --enable-cloog-backend=isl --disable-ppl-version-check
--disable-cloog-version-check --enable-lto --enable-gold --enable-ld=default
--enable-plugin --with-plugin-ld=ld.gold --with-linker-hash-style=gnu
--enable-multilib --disable-libssp --disable-build-with-cxx
--disable-build-poststage1-with-cxx --enable-checking=release