Hey all,
Couldn't sleep, so here are some benchmarks. Run on gcc14, test files are
generated by the gensort program (http://www.ordinal.com/gensort.html).
./current is my implementation, ./sortgl is the version glen lenker submitted.
rand01, rand03, and rand03 are the files generated by the
To: Chen Guo cheng...@yahoo.com
Cc: Bug Coreutils bug-coreutils@gnu.org
Sent: Sunday, October 18, 2009 10:31:51 PM
Subject: Re: [PATCH] sort: Add --threads option, which parallelizes internal
sort.
Chen Guo wrote:
Ah, how ridiculously careless of me.
I've run through the checklist you provided
Forgot to cc the mailing list:
- Forwarded Message
From: Chen Guo cheng...@yahoo.com
To: Jim Meyering j...@meyering.net
Sent: Monday, October 19, 2009 9:25:15 PM
Subject: Re: [PATCH] sort: Add --threads option, which parallelizes internal
sort.
Oooh one thing that got lost amongst
Chen Guo wrote:
In my last patch submission I noted while sorting in LC_ALL the endline
characters of a couple of lines would be randomly cut off. The cause was
memcoll being not threadsafe, I've since included a workaround.
Thank you for continuing to work on this change!
In order to
Chen Guo wrote:
In my last patch submission I noted while sorting in LC_ALL the endline
characters of a couple of lines would be randomly cut off. The cause was
memcoll being not threadsafe, I've since included a workaround.
...
+ /* If singlethreaded, the merge uses the memory
Message
From: Jim Meyering j...@meyering.net
To: Chen Guo cheng...@yahoo.com
Cc: bug-coreutils@gnu.org
Sent: Sunday, October 18, 2009 11:16:18 AM
Subject: Re: [PATCH] sort: Add --threads option, which parallelizes internal
sort.
Chen Guo wrote:
In my last patch submission I noted while sorting
Chen Guo wrote:
Ah, how ridiculously careless of me.
I've run through the checklist you provided, minus the mallocs.
Thanks!
When would it not be OK to exit upon malloc failure? I've run through all of
sort.c and it seems in all cases of memory allocation xmalloc or xnmalloc are
used.
Hi all,
In my last patch submission I noted while sorting in LC_ALL the endline
characters of a couple of lines would be randomly cut off. The cause was
memcoll being not threadsafe, I've since included a workaround.
diff --git a/bootstrap.conf b/bootstrap.conf
index 6671027..a0959b8 100644
Hi all,
I originally only sent this to Jim because I had no clue how to
properly respond to the thread, hope I'm doing it right this time. I
should mention that the times posted below are elapsed, user, and system
times, and I should also note that my test file is cat'ed from
/dev/random,
Hi Glen,
* glen lenker wrote on Fri, Jun 19, 2009 at 01:03:10PM CEST:
On Fri, Jun 12, 2009 at 12:54 AM, Ralf Wildenhues wrote:
No, I did not specify the amount of RAM. The system I tested on has
plenty of RAM, way more than is needed for the sort. Specifying
something like 2G of RAM
On Fri, Jun 12, 2009 at 12:54 AM, Ralf Wildenhues ralf.wildenh...@gmx.de wrote:
* Jim Meyering wrote on Thu, May 28, 2009 at 09:33:21PM CEST:
Glen Lenker wrote:
On Thu, Mar 26, 2009 at 09:50:08PM +, Ralf Wildenhues wrote:
Example run, on an 8-way, and with cat'ed instances of the
* Jim Meyering wrote on Thu, May 28, 2009 at 09:33:21PM CEST:
Glen Lenker wrote:
On Thu, Mar 26, 2009 at 09:50:08PM +, Ralf Wildenhues wrote:
Example run, on an 8-way, and with cat'ed instances of the dictionary,
on tmpfs, timings best of three:
Hey Ralf, did you happen to specify
Glen Lenker wrote:
On Thu, Mar 26, 2009 at 09:50:08PM +, Ralf Wildenhues wrote:
Hi Paul, all,
Paul Eggert writes:
This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism
within an internal sort. On our
On Thu, Mar 26, 2009 at 09:50:08PM +, Ralf Wildenhues wrote:
Hi Paul, all,
Paul Eggert writes:
This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism
within an internal sort. On our simple tests on a
Ralf Wildenhues wrote:
I've looked at this in a bit more detail; no big conclusion but maybe
a few more hints that could help.
I am now pretty confident that your patch implements the threading
correctly. When inserting some
expensive_computation ();
in the sortlines function right
* Pádraig Brady wrote on Mon, Apr 20, 2009 at 01:57:59AM CEST:
Ralf Wildenhues wrote:
This comparison isn't entirely fair, as the splicing was done as a
precomputation. However, the difference is so pronounced that even
taking the splicing into account, the non-thread version would be
Hello Paul, Glen, Jim, all,
* Paul Eggert wrote on Fri, Apr 03, 2009 at 09:57:54PM CEST:
Of course we cannot reasonably expect this one performance improvement
to make 'sort' run 16x faster on a 16-CPU machine. That is because the
improvement parallelizes only one part of 'sort'. Even
On Fri, Apr 3, 2009 at 8:57 PM, Paul Eggert egg...@cs.ucla.edu wrote:
More important, it's not clear to me what the role of the test suite
ought to be. Should the test really fail if it doesn't get enough
performance improvement with 2 threads? How do we decide what's
enough? None of our
Hello Paul, Glen, Jim,
[ I already wrote this in part to Glen off-list, sorry for the
duplication ]
* Paul Eggert wrote on Fri, Apr 03, 2009 at 09:57:54PM CEST:
Of course we cannot reasonably expect this one performance improvement
to make 'sort' run 16x faster on a 16-CPU machine.
No.
* Paul Eggert wrote on Fri, Apr 03, 2009 at 09:57:54PM CEST:
More important, it's not clear to me what the role of the test suite
ought to be. Should the test really fail if it doesn't get enough
performance improvement with 2 threads? How do we decide what's
enough? None of our other tests
Jim Meyering j...@meyering.net writes:
Ramping up to 5M lines, the resulting test takes almost 2 minutes and
the sort itself took 34s on this particular quad-core system. ... A
more interesting test would be to ensure that when run on a multi-core
system sorting with --threads=2 is at least
Paul Eggert wrote:
Jim Meyering j...@meyering.net writes:
Ramping up to 5M lines, the resulting test takes almost 2 minutes and
the sort itself took 34s on this particular quad-core system. ... A
more interesting test would be to ensure that when run on a multi-core
system sorting with
Ralf Wildenhues wrote:
Hi Paul, all,
Paul Eggert writes:
This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism
within an internal sort. On our simple tests on a 2-core desktop x86,
overall performance improved
Paul Eggert writes:
This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism
within an internal sort. On our simple tests on a 2-core desktop x86,
overall performance improved by roughly a factor of 1.6.
In my
Sorry, I forgot to cc the mailing list...
On Sat, Mar 28, 2009 at 04:51:52PM +0100, Ralf Wildenhues wrote:
Hello Glen,
* Glen Lenker wrote on Fri, Mar 27, 2009 at 11:07:19PM CET:
On Thu, Mar 26, 2009 at 09:50:08PM +, Ralf Wildenhues wrote:
Example run, on an 8-way, and with cat'ed
Hi Paul, all,
Paul Eggert writes:
This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism
within an internal sort. On our simple tests on a 2-core desktop x86,
overall performance improved by roughly a factor of