On 09/12/15 00:02, Jeff Janes wrote:
> The second one consumes that giant tape run along with 232 small tape
> runs.
In terms of number of comparisons, binary merge works best when the
inputs are of similar length. I'd assume the same goes for n-ary
merge, but I don't know if comparison count is
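A toy illustration of the point about input lengths (plain Python, not the tape-merge code under discussion): building one sorted output from many runs costs far fewer comparisons when merge inputs are kept at similar lengths than when one giant run is repeatedly merged with tiny ones.

```python
def merge(a, b, counter):
    """Standard two-way merge of sorted lists, counting element comparisons."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        counter[0] += 1
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def balanced_merge(runs, counter):
    """Merge runs pairwise, so merge inputs stay similar in length."""
    while len(runs) > 1:
        nxt = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                nxt.append(merge(runs[i], runs[i + 1], counter))
            else:
                nxt.append(runs[i])
        runs = nxt
    return runs[0]

def skewed_merge(runs, counter):
    """Fold each run into one growing accumulator: a giant input vs tiny ones."""
    acc = runs[0]
    for r in runs[1:]:
        acc = merge(acc, r, counter)
    return acc

singletons = [[x] for x in range(64)]
bal, skew = [0], [0]
assert balanced_merge([r[:] for r in singletons], bal) == list(range(64))
assert skewed_merge([r[:] for r in singletons], skew) == list(range(64))
# For this input: bal[0] == 192 vs skew[0] == 2016 comparisons.
```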
On 07/12/15 16:44, Simon Riggs wrote:
> There are many optimizations we might adopt, yet planning time is a factor.
> It seems simple enough to ignore more complex optimizations if we have
> already achieved a threshold cost (say 10). Such a test would add nearly
> zero time for the common case.
On 08/10/15 17:09, Robert Haas wrote:
> - You consider pushing down ORDER BY if any prefix of the query
> pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
> If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
> ordering the result set by a doesn't help us much.
On 27/09/15 21:58, Gavin Wahl wrote:
> Somewhat harder but still possible would be using BRIN indexes to
> accelerate ORDER BY. This would require a sorting algorithm that can take
> advantage of mostly-sorted inputs. You would sort the page ranges by their
> minimum or maximum value, then feed
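A sketch of how that could work (my reading of the idea, not PostgreSQL code): visit block ranges in order of their BRIN minimum, hold only the overlapping ranges' values in a heap, and emit anything below the next unread range's minimum, since no later range can contain a smaller value.

```python
import heapq

def brin_ordered_scan(blocks):
    """Emit all values in ascending order by reading block ranges in order of
    their minimum value. Values below the next unread range's minimum are safe
    to emit immediately, so only overlapping ranges sit in memory at once."""
    blocks = sorted(blocks, key=min)          # order ranges by BRIN minimum
    heap = []
    for i, block in enumerate(blocks):
        for v in block:                       # each block is internally unsorted
            heapq.heappush(heap, v)
        bound = min(blocks[i + 1]) if i + 1 < len(blocks) else None
        while heap and (bound is None or heap[0] <= bound):
            yield heapq.heappop(heap)

blocks = [[3, 1, 2], [6, 4, 5], [9, 8, 7]]    # unsorted within each range
assert list(brin_ordered_scan(blocks)) == list(range(1, 10))
```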
On 31/07/15 18:31, Robert Haas wrote:
On Fri, Jul 31, 2015 at 7:21 AM, Jeremy Harris j...@wizmail.org wrote:
Heapification is O(n) already, whether siftup (existing) or down.
That's not my impression, or what Wikipedia says. Source?
Measurements done last year:
http://www.postgresql.org
On 30/07/15 02:05, Peter Geoghegan wrote:
Since heapification is now a big fraction of the total cost of a sort
sometimes, even where the heap invariant need not be maintained for
any length of time afterwards, it might be worth revisiting the patch
to make that an O(n) rather than a O(log n)
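For reference, the two heapify strategies being compared, in schematic Python (counters added to show the comparison gap; this is a sketch, not the tuplesort.c code):

```python
def siftdown(a, i, n, counter):
    """Push a[i] down until the min-heap property holds below it."""
    while True:
        l, r, small = 2 * i + 1, 2 * i + 2, i
        if l < n:
            counter[0] += 1
            if a[l] < a[small]:
                small = l
        if r < n:
            counter[0] += 1
            if a[r] < a[small]:
                small = r
        if small == i:
            return
        a[i], a[small] = a[small], a[i]
        i = small

def heapify_siftdown(a, counter):
    """Floyd's bottom-up construction: O(n) comparisons in total."""
    for i in range(len(a) // 2 - 1, -1, -1):
        siftdown(a, i, len(a), counter)

def heapify_siftup(a, counter):
    """Insert one element at a time, bubbling each up: O(n log n) worst case."""
    for k in range(1, len(a)):
        i = k
        while i > 0:
            parent = (i - 1) // 2
            counter[0] += 1
            if a[i] < a[parent]:
                a[i], a[parent] = a[parent], a[i]
                i = parent
            else:
                break

def is_heap(a):
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

data = list(range(1023, -1, -1))      # descending input: worst case for siftup
down_c, up_c = [0], [0]
h1, h2 = data[:], data[:]
heapify_siftdown(h1, down_c)
heapify_siftup(h2, up_c)
assert is_heap(h1) and is_heap(h2)
assert down_c[0] < up_c[0]            # siftdown needs far fewer compares here
```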
On 22/05/15 22:30, Josh Berkus wrote:
At CoreOS Fest, Intel presented about a technology which they used to
improve write times for the nonrelational data store Etcd. It's called
Asynchronous DRAM Self-Refresh, or ADR. This is supposedly a feature of
all of their chips since E5 which allows
On 03/03/15 03:08, Arthur Silva wrote:
Does it always perform mergesort instead of quicksort when enabled?
Yes; there seemed no advantage to any additional complexity.
The merge consistently performs fewer comparisons than the
quicksort, on random input - and many fewer if there are
any sorted
On 25/02/15 00:42, Peter Geoghegan wrote:
On Tue, Feb 24, 2015 at 4:32 PM, Jeremy Harris j...@wizmail.org wrote:
On 23/02/15 16:40, Tomas Vondra wrote:
On 22.2.2015 22:30, Peter Geoghegan wrote:
You should try it with the data fully sorted like this, but with one
tiny difference: The very
On 23/02/15 16:40, Tomas Vondra wrote:
On 22.2.2015 22:30, Peter Geoghegan wrote:
You should try it with the data fully sorted like this, but with one
tiny difference: The very last tuple is out of order. How does that
look?
If this case is actually important, a merge-sort can take
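What a merge-sort buys on that input, sketched in Python (a natural mergesort, i.e. one that detects pre-existing runs; not the patch itself): fully sorted data with one trailing out-of-order tuple decomposes into exactly two runs, so a single cheap merge finishes the sort.

```python
import heapq

def natural_runs(a):
    """Split the input into maximal non-decreasing runs."""
    runs, start = [], 0
    for i in range(1, len(a)):
        if a[i] < a[i - 1]:
            runs.append(a[start:i])
            start = i
    runs.append(a[start:])
    return runs

def natural_mergesort(a):
    """Merge the detected runs; nearly-sorted input costs almost nothing."""
    return list(heapq.merge(*natural_runs(a))) if a else []

almost = list(range(100)) + [50.5]    # sorted except the very last item
assert len(natural_runs(almost)) == 2
assert natural_mergesort(almost) == sorted(almost)
```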
On 14/11/14 00:46, Simon Riggs wrote:
Limit (cost= rows=20 width=175) (actual time= rows=20 loops=1)
- Sort (cost= rows=568733 width=175) (actual time=
rows=20 loops=1)
Sort Method: top-N heapsort
Going off on a tangent, when I was playing with a merge-sort
On 14/11/14 14:54, Tom Lane wrote:
Jeremy Harris j...@wizmail.org writes:
On 14/11/14 00:46, Simon Riggs wrote:
Limit (cost= rows=20 width=175) (actual time= rows=20 loops=1)
- Sort (cost= rows=568733 width=175) (actual time=
rows=20 loops=1)
Sort Method: top-N heapsort
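The "top-N heapsort" method named in those plans keeps a bounded heap of only the LIMIT-sized prefix, so memory stays O(N) for any row count. A minimal sketch of the idea (plain values rather than tuples; not the executor code):

```python
import heapq
import random

def limit_sorted(values, n):
    """Keep a max-heap (via negation) of the n smallest values seen so far.
    Each later value is compared with the current worst kept value and is
    either discarded or swapped in; memory never exceeds n entries."""
    heap = []
    for v in values:
        if len(heap) < n:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:                  # beats the current worst kept value
            heapq.heapreplace(heap, -v)
    return sorted(-x for x in heap)

rng = random.Random(42)
data = [rng.randrange(10**6) for _ in range(5687)]   # stand-in for a big scan
assert limit_sorted(data, 20) == sorted(data)[:20]
```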
On 29/10/14 16:11, Andres Freund wrote:
I do think checking the link count to
be 1 is safe though.
You will fail against certain styles of online-backup.
--
Cheers,
Jeremy
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
On 26/02/14 03:08, Robert Haas wrote:
On Tue, Feb 25, 2014 at 2:55 PM, Jeremy Harris j...@wizmail.org wrote:
On 24/02/14 17:38, Robert Haas wrote:
On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:
Run under cachegrind, it takes about N/10 last-level cache misses,
all
On 24/02/14 17:38, Robert Haas wrote:
On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:
Run under cachegrind, it takes about N/10 last-level cache misses,
all for the new item being introduced to the heap. The existing
code takes none at all.
Can you explain this further
On 09/02/14 17:11, Jeremy Harris wrote:
On 06/02/14 18:21, Jeff Janes wrote:
Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets? We will also need to test it on
strings. I usually use md5(random()::text) to generate strings for such
purposes, at least
On 06/02/14 22:12, Jeremy Harris wrote:
Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets?
Summary (low numbers better):
Random ints: 83% compares, level on time.
Sorted ints: level compares, 70% time.
Reverse-sorted ints: 10% compares, 15
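For anyone wanting to reproduce those shapes, here is one way to generate the test distributions mentioned (random, sorted, reverse-sorted, pipe-organ, and md5 strings per Jeff's suggestion); the exact sizes and seed are my own choices:

```python
import hashlib
import random

def datasets(n, seed=0):
    """Input shapes used in the sort measurements above."""
    rng = random.Random(seed)
    asc = list(range(n))
    return {
        "random": rng.sample(asc, n),
        "sorted": asc,
        "reverse": asc[::-1],
        "pipe_organ": asc[: n // 2] + asc[n // 2:][::-1],  # rises, then falls
        "md5_strings": [hashlib.md5(str(rng.random()).encode()).hexdigest()
                        for _ in range(n)],
    }

d = datasets(1000)
assert d["sorted"] == sorted(d["random"])
assert all(len(s) == 32 for s in d["md5_strings"])
```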
On 05/02/14 21:56, Robert Haas wrote:
On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris j...@wizmail.org wrote:
The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify
On 06/02/14 18:21, Jeff Janes wrote:
How big of
sets were you sorting in each case?
Big enough to go external. The timings and compare-counts given are
purely for the heapify stage not the total for the sort, so are
constrained by the work_mem not by the sort size per se.
I'm limited to
On 06/02/14 14:22, Robert Haas wrote:
On Thu, Feb 6, 2014 at 4:22 AM, Jeremy Harris j...@wizmail.org wrote:
On 05/02/14 21:56, Robert Haas wrote:
On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris j...@wizmail.org wrote:
The attached patch replaces the existing siftup method for heapify
The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.
Both algorithms appear to be O(n) (contradicting Wikipedia's claim
On 27/01/14 18:00, Simon Riggs wrote:
On 27 January 2014 17:44, Pavel Stehule pavel.steh...@gmail.com wrote:
This topic is interesting - we found very bad performance with hashing large
tables with high work_mem. MergeJoin with quicksort was significantly
faster.
I've seen this also.
I
In ordered_set_startup() sorts are initialised in non-randomAccess mode
(tuplesort_begin_heap() and ~datum(), last argument).
The use of tuplesort_skip_tuples() feels very like a random access to
me. I think it doesn't fail because the only use (and implementation)
is to skip forwards; if
On 22/01/14 03:16, Jon Nelson wrote:
Greetings -hackers:
I have worked up a patch to PostgreSQL which elides tuples during an
external sort. The primary use case is when sorted input is being used
to feed a DISTINCT operation. The idea is to throw out tuples that
compare as identical whenever
On 22/01/14 03:53, Tom Lane wrote:
Jon Nelson jnelson+pg...@jamponi.net writes:
A rough summary of the patch follows:
- a GUC variable enables or disables this capability
- in nodeAgg.c, eliding duplicate tuples is enabled if the number of
distinct columns is equal to the number of sort
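The elision idea reduces to one comparison against the last tuple emitted during the merge, since sorted order places duplicates adjacently. A schematic Python version of the DISTINCT-feeding case (not the patch itself):

```python
import heapq

def merge_distinct(sorted_runs):
    """Merge externally sorted runs, dropping any tuple equal to the one just
    emitted. Sorted order guarantees duplicates arrive adjacently."""
    last = object()                  # sentinel unequal to any real tuple
    for item in heapq.merge(*sorted_runs):
        if item != last:
            yield item
            last = item

runs = [[1, 1, 3, 5], [1, 2, 3, 3], [2, 5]]
assert list(merge_distinct(runs)) == [1, 2, 3, 5]
```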
On 13/01/14 18:01, Alexander Korotkov wrote:
Thanks. It's included in the attached version of the patch, as well as estimation
improvements, more comments and a regression test fix.
Would it be possible to totally separate the two sets of sort-keys,
only giving the non-index set to the tuplesort? At
On 31/12/13 01:41, Andreas Karlsson wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan - sort. The
On 22/12/13 20:26, Alexander Korotkov wrote:
On Sat, Dec 14, 2013 at 6:30 PM, Jeremy Harris j...@wizmail.org wrote:
On 14/12/13 12:54, Andres Freund wrote:
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre
On 14/01/14 22:23, Dave Chinner wrote:
On Tue, Jan 14, 2014 at 11:40:38AM -0800, Kevin Grittner wrote:
To quantify that, in a production setting we were seeing pauses of
up to two minutes with shared_buffers set to 8GB and default dirty
On 14/12/13 12:54, Andres Freund wrote:
On 2013-12-14 13:59:02 +0400, Alexander Korotkov wrote:
Currently when we need to get ordered result from table we have to choose
one of two approaches: get results from index in exact order we need or do
sort of tuples. However, it could be useful to mix
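The mixed approach sketched in Python (my own illustration of the proposal, not the patch): when rows already arrive ordered on a leading key, e.g. from an index scan, only each equal-prefix group needs an in-memory sort on the remaining columns.

```python
from itertools import groupby
from operator import itemgetter

def partial_sort(rows, n_presorted):
    """rows are already ordered on their first n_presorted columns (>= 1);
    sort each equal-prefix group on the remaining columns only."""
    prefix = itemgetter(*range(n_presorted))
    for _, group in groupby(rows, key=prefix):
        yield from sorted(group)     # small per-group sorts, not one big one

rows = [(1, 9), (1, 2), (2, 5), (2, 1), (3, 0)]   # ordered on column 0 only
assert list(partial_sort(rows, 1)) == sorted(rows)
```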
On 20/11/13 19:58, Robert Haas wrote:
Parallel sort, and then parallel other stuff. Eventually general
parallel query.
I have recently updated https://wiki.postgresql.org/wiki/Parallel_Sort
and you may find that interesting/helpful as a statement of intent.
I've been playing with an internal
On 13/11/13 09:07, Leonardo Francalanci wrote:
Problem? having 4 btree indexes on random values (imsi+imei * 2,
since we have calling and caller) kills the performance in insertion
after a while.
Surely there's good correlation between IMSI and IMEI, so have a separate
table to translate one to
Hi,
I don't see the regression tests running any index-hash operations.
What am I missing?
--
Cheers,
Jeremy
On 06/09/13 15:44, Robert Haas wrote:
On Fri, Sep 6, 2013 at 3:34 AM, Jeremy Harris j...@wizmail.org wrote:
I don't see the regression tests running any index-hash operations.
What am I missing?
What's an index-hash operation?
Ones that hit tuplesort_begin_index_hash()
--
Cheers
On 06/08/13 13:59, Robert Haas wrote:
My thought is to create a
queue abstraction that sits on top of the dynamic shared memory
infrastructure, so that you can set aside a portion of your dynamic
shared memory segment to use as a ring buffer and send messages back
and forth with using some
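The ring-buffer shape of such a queue, reduced to its indices (a sketch only: no actual shared memory segment, locking, or variable-length message handling, all of which the real abstraction would need):

```python
class RingQueue:
    """Fixed-capacity ring buffer with head/tail indices - the layout a
    message queue carved out of a shared memory segment would use."""
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0          # next slot to read
        self.tail = 0          # next slot to write
        self.size = 0

    def put(self, msg):
        if self.size == len(self.buf):
            raise BufferError("ring full - sender must wait")
        self.buf[self.tail] = msg
        self.tail = (self.tail + 1) % len(self.buf)
        self.size += 1

    def get(self):
        if self.size == 0:
            raise BufferError("ring empty - receiver must wait")
        msg = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.size -= 1
        return msg

q = RingQueue(4)
for m in "abcd":
    q.put(m)
assert q.get() == "a"
q.put("e")                     # wraps around into the freed slot
assert [q.get() for _ in range(4)] == ["b", "c", "d", "e"]
```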
On 10/05/2012 09:03 PM, Tom Lane wrote:
Note that allowing subsequent requests to jump the queue would not be a
good fix for this; if you do that, it's likely the ex-lock will never be
granted, at least not till the next system idle time.
Offering that option to the admin sounds like a good
On 2012-03-18 15:25, Tom Lane wrote:
Jeff Janesjeff.ja...@gmail.com writes:
On Wed, Mar 7, 2012 at 11:55 AM, Robert Haasrobertmh...@gmail.com wrote:
The problem there is that none of the files can be deleted until it
was entirely read, so you end up with all the data on disk twice. I
don't
On 2012-01-05 10:04, Benedikt Grundmann wrote:
I have a question of how to benchmark hardware to determine
the appropriate ratio of seq_page_cost vs random_page_cost.
It'd be really nice if the DBMS measured actual experienced values..
--
Jeremy
On 2012-01-03 04:44, Robert Haas wrote:
On read-only workloads, you get spinlock contention, because everyone
who wants a snapshot has to take the LWLock mutex to increment the
shared lock count and again (just a moment later) to decrement it.
Does the LWLock protect anything but the shared