[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-30 Thread paolo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #27 from paolo at gcc dot gnu.org paolo at gcc dot gnu.org ---
Author: paolo
Date: Mon Sep 30 17:42:31 2013
New Revision: 203035

URL: http://gcc.gnu.org/viewcvs?rev=203035root=gccview=rev
Log:
2013-09-30  Chris Jefferson  ch...@bubblescope.net

PR libstdc++/58437
* include/bits/stl_algo.h (__move_median_first): Rename to
__move_median_to_first, change to take an addition argument.
(__unguarded_partition_pivot): Adjust.
* testsuite/performance/25_algorithms/sort.cc: New.
* testsuite/performance/25_algorithms/sort_heap.cc: Likewise.
* testsuite/performance/25_algorithms/stable_sort.cc: Likewise.

Added:
trunk/libstdc++-v3/testsuite/performance/25_algorithms/sort.cc
trunk/libstdc++-v3/testsuite/performance/25_algorithms/sort_heap.cc
trunk/libstdc++-v3/testsuite/performance/25_algorithms/stable_sort.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/stl_algo.h


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-30 Thread paolo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #28 from paolo at gcc dot gnu.org paolo at gcc dot gnu.org ---
Author: paolo
Date: Mon Sep 30 17:42:52 2013
New Revision: 203036

URL: http://gcc.gnu.org/viewcvs?rev=203036root=gccview=rev
Log:
2013-09-30  Chris Jefferson  ch...@bubblescope.net

PR libstdc++/58437
* include/bits/stl_algo.h (__move_median_first): Rename to
__move_median_to_first, change to take an addition argument.
(__unguarded_partition_pivot): Adjust.
* testsuite/performance/25_algorithms/sort.cc: New.
* testsuite/performance/25_algorithms/sort_heap.cc: Likewise.
* testsuite/performance/25_algorithms/stable_sort.cc: Likewise.

Added:
   
branches/gcc-4_8-branch/libstdc++-v3/testsuite/performance/25_algorithms/sort.cc
   
branches/gcc-4_8-branch/libstdc++-v3/testsuite/performance/25_algorithms/sort_heap.cc
   
branches/gcc-4_8-branch/libstdc++-v3/testsuite/performance/25_algorithms/stable_sort.cc
Modified:
branches/gcc-4_8-branch/libstdc++-v3/ChangeLog
branches/gcc-4_8-branch/libstdc++-v3/include/bits/stl_algo.h


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-30 Thread paolo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #29 from paolo at gcc dot gnu.org paolo at gcc dot gnu.org ---
Author: paolo
Date: Mon Sep 30 17:43:05 2013
New Revision: 203037

URL: http://gcc.gnu.org/viewcvs?rev=203037root=gccview=rev
Log:
2013-09-30  Chris Jefferson  ch...@bubblescope.net

PR libstdc++/58437
* include/bits/stl_algo.h (__move_median_first): Rename to
__move_median_to_first, change to take an addition argument.
(__unguarded_partition_pivot): Adjust.
* testsuite/performance/25_algorithms/sort.cc: New.
* testsuite/performance/25_algorithms/sort_heap.cc: Likewise.
* testsuite/performance/25_algorithms/stable_sort.cc: Likewise.

Added:
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/performance/25_algorithms/sort.cc
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/performance/25_algorithms/sort_heap.cc
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/performance/25_algorithms/stable_sort.cc
Modified:
branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
branches/gcc-4_7-branch/libstdc++-v3/include/bits/stl_algo.h


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-30 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

Paolo Carlini paolo.carlini at oracle dot com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #30 from Paolo Carlini paolo.carlini at oracle dot com ---
Fixed for 4.7.4/4.8.2/4.9.0.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-23 Thread tammy at Cadence dot COM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #26 from Tammy Hsu tammy at Cadence dot COM ---
Thanks a lot for the patch. It addresses the issue on the standalone test and
shows also in our real tool test results. With this patch, the major
degradations are gone. Compared to gcc445, there is a slight overall
improvement for this tool now.

Really appreciate all the comments and the creation of a patch in such a short
of time


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread chris at bubblescope dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #13 from Chris Jefferson chris at bubblescope dot net ---
Created attachment 30861
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30861action=edit
Sort patch

Wow, this an embarrassing bug to get through testing. Obviously not enough
profiling done!

This patch (I hope) gets performance back to g++-4.4 levels, and in my
experiments slightly (but not noticeably) better. I would be interested in
anyone trying out this patch on their own data.

For the interested, here is a description of the bug. I will assume you
understand quicksort!

Quicksort requires taking a pivot, then partitioning the array by that pivot.
Getting O(n log n) behaviour requires that you choose a good pivot. Like most
people, in g++ we choose the median of the first, middle, and last value in the
array as the pivot.

In the C++03 world, after we chose the pivot we took a copy of it. In C++11, we
cannot copy values in std::sort any more, only move and swap. 

Trying to keep track of the pivot as it moves around during the partitioning is
horrible and inefficient. So, I had the bright idea to do swap(*first,
*pivot). Now the pivot is at the first location in the array, and we can just
partition [first+1, end) and the partitioning is fine. See the mistake yet?

The bug comes when we recursively partition the first cell. We again choose the
median from the first, mid and last of this cell, but *first is always the
biggest thing in the cell (as we pivoted on it one level above), meaning our
median calculation is unbalanced and often chooses terrible values, leading to
us often hitting the O(n^2) behaviour of quicksort (actually we technically get
o(n log n), as we switch to heapsort when things are going badly, but the time
is still bad).

There are a selection of ways of fixing this. This patch switches the median
calculation from considering then first, mid, last locations to on first+1,
mid, last-1 (there is no reason in this bug to consider last-1 instead of last,
but I thought it wouldn't do any harm, and might avoid other issues of
accidentally choosing a bad pivot.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #14 from Marc Glisse glisse at gcc dot gnu.org ---
Hi Chris,

(detail: could you pass -u10, or at least -p, to diff to make the patches
easier to read? It isn't required so you don't have to)

I don't really understand why the pivot is still considered in lower levels of
the recursion at all. So, first we move the pivot to the first position with a
swap:
(pivot, rest...)
Then we partition:
(pivot, small..., big...)
Then we know what the right position is for the pivot (this isn't a stable
sort), so we could put it there with another swap:
(small..., pivot, big...)
And now we can recurse on small and big, and no one needs to look at the pivot
ever again.

From what I understand of the code, we are instead relying on recursive calls
to eventually sort pivot to the right position, which seems strange to me.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #15 from Paolo Carlini paolo.carlini at oracle dot com ---
Thanks a lot guys, I appreciate all the help you are providing. While fixing
this, let's remember that this regressed even in the old 4.7.x branch. Thus, if
we are sure we are minimally restoring the performance of the C++03 code, we
could also envisage a very minimal but *very* safe fix for 4.7.4 and 4.8.2 and
something a tad more aggressive for 4.9.0.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread chris at bubblescope dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #16 from Chris Jefferson chris at bubblescope dot net ---
Indeed, if std::sort had never used lower-level partitioning to get the pivot
in the correct location, we would never have had this problem in the first
place!

This is not too serious a problem performance-wise (as the pivot ends up in the
right place fairly quickly), but it would certainly be better to do that in
future. Unfortunately doing this requires changing quite a few functions, to
make all code which uses partition skip over the pivot value.

I would suggest my current patch (which is simpler than it looks from the diff)
for previous versions, then investigate pivot. In general I would like to
modernise the sort to match more modern thoughts on sorting such as making use
of partly sorted data (which would include reverse-ordered). The only problem
there is satisfying all the requirements of std::sort).

(detail: -p doesn't help, you just get things like  -110,16 +112,18 @@
_GLIBCXX_BEGIN_NAMESPACE_VERSION. Neither -u 10, -x -u 10 or -x -u 10 seem
to work, I just get errors. If you can give me the exact svn diff command to
run to get nicer output, I am happy to do so).


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #17 from Paolo Carlini paolo.carlini at oracle dot com ---
Can I ask you also a rather simple test for
testsuite/performance/25_algorithms?


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #18 from Marc Glisse glisse at gcc dot gnu.org ---
[Ugh, bugzilla seems half broken at the moment]

 I would suggest my current patch (which is simpler than it looks from the 
 diff) for previous versions, then investigate pivot.

Well, you're the one doing all the work ;-)

(svn diff --diff-cmd diff -x -u10 (actually I have diff-cmd = diff-for-svn in
~/.subversion/config where diff-for-svn is a script where I put the options I
like, so svn diff does it by default), too bad -p often doesn't work well with
C++)


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

Paolo Carlini paolo.carlini at oracle dot com changed:

   What|Removed |Added

 CC||paolo.carlini at oracle dot com

--- Comment #19 from Paolo Carlini paolo.carlini at oracle dot com ---
I ran some quick tests and indeed the performance seems equal of better of
those of the old C++03 code. Note that the patch is very small and can be
installed even on top of the installed headers, without rebuilding anything,
thus I would really encourage the bug submitters to test it and report back.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread chris at bubblescope dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #20 from Chris Jefferson chris at bubblescope dot net ---
Created attachment 30867
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30867action=edit
Performance tests for sort

This is some performance tests for performance checking. Sorry for tar rather
than patch, just pop them in performance/25_algorithms.

I decided to check stable_sort and heap_sort as well as just sort, just to
check nothing else got broken at the same time (turns out, nothing else did).

Not sure exactly what the appropriate memory / time tradeoffs are for the
performance test suite, so any comments welcome.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread jmbnyc at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #21 from Jeffrey M. Birnbaum jmbnyc at gmail dot com ---
(In reply to Paolo Carlini from comment #19)
 I ran some quick tests and indeed the performance seems equal of better of
 those of the old C++03 code. Note that the patch is very small and can be
 installed even on top of the installed headers, without rebuilding anything,
 thus I would really encourage the bug submitters to test it and report back.

Excellent. Will test and report back.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread tammy at Cadence dot COM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #24 from Tammy Hsu tammy at Cadence dot COM ---
Yes, it is a question. And thank you for the answer... :-)


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #23 from Paolo Carlini paolo.carlini at oracle dot com ---
If this is a question, the answer is YES.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread tammy at Cadence dot COM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #22 from Tammy Hsu tammy at Cadence dot COM ---
Thanks a lot. We will test it out with our real application.
Just want to do things right, the patch is the one described in Attachment
30861. And I just need to patch the
gcc481_root/include/c++/4.8.1/bits/stl_algo.h, don't need to rebuild gcc481.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-19 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #25 from Marc Glisse glisse at gcc dot gnu.org ---
Note that naively doing what I am proposing in comment #14 (it's just an
iter_swap and a +-1) also makes reverse-sorted arrays a bad case, because of
the way we do partitioning, so it isn't an alternative to Chris's first+1
approach, more of an orthogonal optimization.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |4.7.4
Summary|[4.7/4.7/4.9 Regression]|[4.7/4.8/4.9 Regression]
   |Sorting value in reverse|Sorting value in reverse
   |order is much slower|order is much slower
   |compare to gcc44|compare to gcc44


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread chris at bubblescope dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #7 from Chris Jefferson chris at bubblescope dot net ---
I will look at this next week. Quicksorts are always suboptimal for mostly
sorted (forwards or backwards) data, and it would be nice to fix that.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #8 from Paolo Carlini paolo.carlini at oracle dot com ---
Thanks Chris. Note that, according at least to the Dup bug, some slow down is
noticeable in other less special cases too.


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread jmbnyc at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #10 from Jeffrey M. Birnbaum jmbnyc at gmail dot com ---
Tammy,

Something must have been in the air because your timestamp on the bug
submission means that right at the time you were reporting the bug I was
actively looking into why I was seeing horrible sort performance. I am working
on an algo that requires a post sort of something 1B entries so I wanted to see
worst case for std::sort on 500M and was stunned to see how poor it was. My CPU
was Haswell so wondered if there was an issue so I tried another bug (that
happened to have gcc 4.4.7) and realized it appeared to be a compiler or std
lib regression. 

Anyhow, given the length of time that this has been out there it is weird that
we reported the same bug within hours of each other.

Best,
/JMB


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread tammy at Cadence dot COM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #11 from Tammy Hsu tammy at Cadence dot COM ---
Jeffrey,

It's weird indeed. We are still using gcc445 for our production releases and
are in the process of evaluating gcc481/gcc473. The performance tests show
slowness in some tests

I also don't have gcc45x available in house, so don't know if this bug exists
in gcc45. BTW, it's really nice to know someone else encountered the same
issue.

Tammy


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread tammy at Cadence dot COM
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #9 from Tammy Hsu tammy at Cadence dot COM ---
Marc/Paolo/Chris,

Thanks a lot for your help. It's awesome to see so many activities within hours
after submitting the bug.

Thanks, Tammy


[Bug libstdc++/58437] [4.7/4.8/4.9 Regression] Sorting value in reverse order is much slower compare to gcc44

2013-09-17 Thread jmbnyc at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58437

--- Comment #12 from Jeffrey M. Birnbaum jmbnyc at gmail dot com ---
Tammy,

We tested gcc 4.7.2, 4.6.2 and 4.4.3/5 (the bug is not in either 4.4.3/5). I
have gcc 4.8.1 on my laptop but have not tried it yet. I confirmed the issue by
compiling my test (almost identical to the one you submitted but using 500M
elements) on 4.4.5 and then moving the executable over to a box with 4.7.2
installed. the native compiled program performed poorly compared to the one
compiled with 4.4.5 (this also ruled out chip issues, i.e. haswell vs
sandybridge).

I knew something was wrong when my own single threaded merge sort that produces
a gradient instead of sorting the data in place was outperforming the std::sort
using -D_GLIBCXX_PARALLEL, i.e. parallel sort of 500M entries. 

Best,
/JMB