[HACKERS] Building with MinGW

2015-09-27 Thread Jeff Janes
Has anyone had success following the instructions at
https://wiki.postgresql.org/wiki/Building_With_MinGW#Installing_Git
recently?

I've followed the instructions to set up the build environment on a Windows
box, and I can't build from git.  I can from the nightly tarball.  So I
think the most recent environment you get when following these instructions
has suffered some regression in the bison, yacc, or something else involved
in building from git but not from tarball.

Also, is there any way to know the exact git commit which the nightly
tarballs are built against?

Cheers,

Jeff


Re: [HACKERS] PATCH: use foreign keys to improve join estimates v1

2015-09-27 Thread David Rowley
On 26 September 2015 at 01:57, Tomas Vondra 
wrote:

> Hi,
>
> On 09/25/2015 03:39 AM, David Rowley wrote:
>
>> I looked at this again, and I'm not all that sure it's possible to
>>
> assume that 1.0 /  is valid when there's more than one
>> relation at either side of the join.
>>
> >
>
>> My reasoning for this is that the whole basis for the patch is that a
>> if we find a foreign key match, then we can be sure enough, as far as
>> row estimations go, that exactly 1 tuple will exist matching that
>> condition. This assumption of the 1 tuple no longer holds when 2
>> relations have already been joined, as this can either cause tuple
>> duplication, or elimination.
>>
>
> I don't see why that would be the case. Of course, you need to take the
> right , i.e. the "target" of the foreign key (the table with UNIQUE
> constraint) so that the selectivity matches the fact that exactly 1 tuple
> (on the PK side) matches.
>

hmm, ok. You're right. It appears I was a bit confused, but thanks for
explaining it again. I get it now.

I've been working on this again. I've put back the code you wrote for
looping over each combination of relations from either side of the join.

I've also added some code to get around the problem with eclass joins and
the RestrictInfo having some alternative Vars that don't belong to the
foreign key. Basically I'm just checking if the RestrictInfo has a
parent_ec, and if it does just loop over the members to try and find the
Vars that belong to the foreign key. I've tested it with the following, and
it seems to work:

create table a as select i as a_id1, i as a_id2, i as dummy1 from
generate_series(0,999) s(i);
alter table a add unique (a_id1, a_id2);
create table b as select i as b_id1, i as b_id2 from generate_series(0,332)
s(i);

analyze a;
analyze b;

alter table b add foreign key (b_id1, b_id2) references a (a_id1, a_id2);

explain analyze select * from a inner join b on a.dummy1 = b.b_id1 and
a.a_id2 = b.b_id2 where a.a_id1 = a.dummy1;

                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=18.57..26.41 rows=2 width=20) (actual time=0.775..1.046 rows=333 loops=1)
   Hash Cond: ((b.b_id1 = a.dummy1) AND (b.b_id2 = a.a_id2))
   ->  Seq Scan on b  (cost=0.00..5.33 rows=333 width=8) (actual time=0.013..0.046 rows=333 loops=1)
   ->  Hash  (cost=18.50..18.50 rows=5 width=12) (actual time=0.737..0.737 rows=1000 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 51kB
         ->  Seq Scan on a  (cost=0.00..18.50 rows=5 width=12) (actual time=0.014..0.389 rows=1000 loops=1)
               Filter: (dummy1 = a_id1)

The non-patched version estimates 1 row. The patched estimates 2 rows, but
that's due to the bad estimate on dummy1 = a_id1.

The 2 comes from ceil(5 * 0.333).
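
For reference, the rows=5 on the Seq Scan of "a" above comes from the
column-vs-column filter. A minimal way to see that mis-estimate in
isolation, using the same tables:

explain select * from a where dummy1 = a_id1;
-- estimates about 5 rows (presumably the default equality selectivity of
-- 0.005 over 1000 rows), while in fact all 1000 rows match, because the
-- planner knows nothing about the correlation between the two columns.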

Perhaps you have a better test case for this?

Regards

David Rowley

--
 David Rowley   http://www.2ndQuadrant.com/

 PostgreSQL Development, 24x7 Support, Training & Services


estimation-with-fkeys-v2_davidv3.patch
Description: Binary data



Re: [HACKERS] pg_dump LOCK TABLE ONLY question

2015-09-27 Thread Tom Lane
Filip Rembiałkowski  writes:
> I'm running pg_dump constrained to one schema. It appears that pg_dump runs
> "LOCK TABLE %s IN ACCESS SHARE MODE" for each table.
>
> Naturally it makes sense, but...
>
> This schema has a table that serves as parent for thousands of child
> tables (via INHERITS).
>
> So effectively, to dump this schema, I have to LOCK all these tables
> not only parent.

They'd all end up locked anyway, wouldn't they?

regards, tom lane




Re: [HACKERS] BRIN indexes for MAX, MIN, ORDER BY?

2015-09-27 Thread Tom Lane
Alvaro Herrera  writes:
> Gavin Wahl wrote:
>> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
>> just find the page range with the largest/smallest value, and then only
>> scan that one. Would that be hard to implement? I'm interested in working
>> on it if someone can give me some pointers.

> I think this means you need to represent this operation as a specific
> Path in some way.  See build_minmax_path() and its callers in planagg;
> you probably need to tweak preprocess_minmax_aggregates() to consider
> this.

> This doesn't look like a simple project to me, mind.

>> Somewhat harder but still possible would be using BRIN indexes to
>> accelerate ORDER BY. This would require a sorting algorithm that can take
>> advantage of mostly-sorted inputs. You would sort the page ranges by their
>> minimum or maximum value, then feed the sorting algorithm in that order.

> I wouldn't know where to start for this.  Maybe once Tom is done with
> planner rejiggering it would make sense to consider looking at how to do
> it.

Yeah.  I would urgently recommend that people *not* try to build new
things like planagg.c right now.  A large part of the point of upper
planner path-ification is to have a less grotty way of dealing with
things like specialized aggregate implementations.

(And yes, I am working on that.  Honest.)

regards, tom lane




Re: [HACKERS] Rework the way multixact truncations work

2015-09-27 Thread Andres Freund
On 2015-09-27 14:21:08 -0500, Jim Nasby wrote:
> IMHO just doing a LOG for something this serious isn't enough; it should
> at least be a WARNING.

In postgres, LOG is, somewhat confusingly, more severe than WARNING.

> I think the concern about upgrading a replica before the master is valid; is
> there some way we could over-ride a PANIC when that's exactly what someone
> is trying to do? Check for a special file maybe?

I don't understand this concern - that's just the situation we have in
all released branches today.

> + bool        sawTruncationInCkptCycle;
> What happens if someone downgrades the master, back to a version that no
> longer logs truncation? (I don't think assuming that the replica will need
> to restart if that happens is a safe bet...)

It'll just do the legacy truncation again - without a restart on the
standby being required.

> - if (MultiXactIdPrecedes(oldestMXact, earliest))
> + /* If there's nothing to remove, we can bail out early. */
> + if (MultiXactIdPrecedes(oldestMulti, earliest))
>   {
> - DetermineSafeOldestOffset(oldestMXact);
> + LWLockRelease(MultiXactTruncationLock);
> If/when this is backpatched, would it be safer to just leave this alone?

What do you mean? This can't just be left alone in isolation?




Re: [HACKERS] BRIN indexes for MAX, MIN, ORDER BY?

2015-09-27 Thread Alvaro Herrera
Gavin Wahl wrote:
> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
> just find the page range with the largest/smallest value, and then only
> scan that one. Would that be hard to implement? I'm interested in working
> on it if someone can give me some pointers.

I think this means you need to represent this operation as a specific
Path in some way.  See build_minmax_path() and its callers in planagg;
you probably need to tweak preprocess_minmax_aggregates() to consider
this.

This doesn't look like a simple project to me, mind.

> Somewhat harder but still possible would be using BRIN indexes to
> accelerate ORDER BY. This would require a sorting algorithm that can take
> advantage of mostly-sorted inputs. You would sort the page ranges by their
> minimum or maximum value, then feed the sorting algorithm in that order.

I wouldn't know where to start for this.  Maybe once Tom is done with
planner rejiggering it would make sense to consider looking at how to do
it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] BRIN indexes for MAX, MIN, ORDER BY?

2015-09-27 Thread Gavin Wahl
It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in working
on it if someone can give me some pointers.

Somewhat harder but still possible would be using BRIN indexes to
accelerate ORDER BY. This would require a sorting algorithm that can take
advantage of mostly-sorted inputs. You would sort the page ranges by their
minimum or maximum value, then feed the sorting algorithm in that order.
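
A minimal sketch of the sort of case this would target (table and index
names are made up here), assuming the table is loaded in roughly the same
order as the indexed column:

create table events (created_at timestamptz not null, payload text);
create index events_created_brin on events using brin (created_at);

-- Today this is planned as a scan of the whole table plus an aggregate;
-- the idea above would let the planner read only the page range whose
-- BRIN summary contains the largest created_at, then check visibility.
select max(created_at) from events;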


Re: [HACKERS] BRIN indexes for MAX, MIN, ORDER BY?

2015-09-27 Thread Thomas Munro
On Mon, Sep 28, 2015 at 9:58 AM, Gavin Wahl  wrote:
> It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
> just find the page range with the largest/smallest value, and then only scan
> that one.

You might need to scan more than that if you don't find any rows that
are visible.

> Would that be hard to implement? I'm interested in working on it
> if someone can give me some pointers.
>
> Somewhat harder but still possible would be using BRIN indexes to accelerate
> ORDER BY. This would require a sorting algorithm that can take advantage of
> mostly-sorted inputs. You would sort the page ranges by their minimum or
> maximum value, then feed the sorting algorithm in that order.

Currently you get a Bitmap Index Scan and Bitmap Heap Scan, and then a
Sort node (quicksort or external sort).  So the sort is already
receiving data sorted in BRIN-block order, isn't it?  Are you saying
that the sort node should switch to something like insertion sort in
this case?

http://www.sorting-algorithms.com/nearly-sorted-initial-order
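
For concreteness, a hypothetical example of the plan shape being discussed
(names are made up; whether the planner actually chooses the BRIN index
depends on the data and statistics):

create table log (ts timestamptz, msg text);
create index log_ts_brin on log using brin (ts);

explain
select * from log
where ts >= now() - interval '1 day'
order by ts;
-- typically: Sort -> Bitmap Heap Scan on log -> Bitmap Index Scan on log_ts_brin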

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Building with MinGW

2015-09-27 Thread Andrew Dunstan



On 09/27/2015 02:11 PM, Jeff Janes wrote:
> Has anyone had success following the instructions at
> https://wiki.postgresql.org/wiki/Building_With_MinGW#Installing_Git
> recently?
>
> I've followed the instructions to set up the build environment on a
> Windows box, and I can't build from git.  I can from the nightly
> tarball.  So I think the most recent environment you get when
> following these instructions has suffered some regression in the
> bison, yacc, or something else involved in building from git but not
> from tarball.

The Msys DTK contains a bison and flex that are adequate for our
purposes. Are you sure you installed the developer tools?


I just followed this recipe. However, I had to do the following
additional small steps on a fresh Windows Server 2012 instance on Amazon:


 * the git installer had to be run as Administrator or it got
   permissions violations
 * I had to add this to the end of the MSys PATH: ":/c/prog/git/cmd" (I
   installed it in c:/prog/git)
 * I had to add "C:/mingw /mingw" to the fstab


Given that, I was able to clone the code, build it and run the 
regression set without incident. This was using the gcc that comes with 
the mingw installer, not the separate one from mingw-w64.

> Also, is there any way to know the exact git commit which the nightly
> tarballs are built against?

Not to my knowledge. Perhaps the tarball generator should create a file
reporting the head commit on the branch.


cheers

andrew





Re: [HACKERS] [PROPOSAL] Effective storage of duplicates in B-tree index.

2015-09-27 Thread Peter Geoghegan
On Thu, Sep 3, 2015 at 8:35 AM, Anastasia Lubennikova
 wrote:
>> * Since everything is aligned within B-Tree, it's probably worth
>> considering the alignment boundaries when doing prefix compression, if
>> you want to go that way. We can probably imagine a world where
>> alignment is not required for B-Tree, which would work on x86
>> machines, but I can't see it happening soon. It isn't worth
>> compressing unless it compresses enough to cross an "alignment
>> boundary", where we're not actually obliged to store as much data on
>> disk. This point may be obvious, not sure.
>
> That is another reason, why I doubt prefix compression, whereas effective
> duplicate storage hasn't this problem.

Okay. That sounds reasonable. I think duplicate handling is a good project.

A good learning tool for Postgres B-Trees -- or at least one of the
better ones -- is my amcheck tool. See:

https://github.com/petergeoghegan/postgres/tree/amcheck

This is a tool for verifying B-Tree invariants hold, which is loosely
based on pageinspect. It checks that certain conditions hold for
B-Trees. A simple example is that all items on each page be in the
correct, logical order. Some invariants checked are far more
complicated, though, and span multiple pages or multiple levels. See
the source code for exact details. This tool works well when running
the regression tests (see stress.sql -- I used it with pgbench), with
no problems reported last I checked. It often only needs light locks
on relations, and single shared locks on buffers. (Buffers are copied
to local memory for the tool to operate on, much like
contrib/pageinspect).
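
Basic usage looks roughly like the following sketch (assuming the
SQL-callable entry point is named bt_index_check(regclass) and that the
extension, table and index names here are just placeholders; see the
repository for the real interface):

create extension amcheck;

create table t (id integer);
insert into t select generate_series(1, 100000);
create index t_id_idx on t (id);

-- Raises an error if one of the checked invariants is violated,
-- and returns silently otherwise.
select bt_index_check('t_id_idx');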

While I have yet to formally submit amcheck to a CF (I once asked for
input on the goals for the project on -hackers), the comments are
fairly comprehensive, and it wouldn't be too hard to adapt this to
guide your work on duplicate handling. Maybe it'll happen for 9.6.
Feedback appreciated.

The tool calls _bt_compare() for many things currently, but doesn't
care about many lower level details, which is (very roughly speaking)
the level that duplicate handling will work at. You aren't actually
proposing to change anything about the fundamental structure that
B-Tree indexes have, so the tool could be quite useful and low-effort
for debugging your code during development.

Debugging this stuff is sometimes like keyhole surgery. If you could
just see at/get to the structure that you care about, it would be 10
times easier. Hopefully this tool makes it easier to identify problems.

-- 
Peter Geoghegan




Re: [HACKERS] Rework the way multixact truncations work

2015-09-27 Thread Jim Nasby

On 9/23/15 1:48 PM, Andres Freund wrote:

> > Honestly, I wonder whether this message
> >     ereport(LOG,
> >             (errmsg("performing legacy multixact truncation"),
> >              errdetail("Legacy truncations are sometimes performed when replaying WAL from an older primary."),
> >              errhint("Upgrade the primary, it is susceptible to data corruption.")));
> > shouldn't rather be a PANIC.  (The main reason not to, I think, is that
> > once you see this, there is no way to put the standby in a working state
> > without recloning).
>
> Huh? The behaviour in that case is still better than what we have in
> 9.3+ today (not delayed till the restartpoint). Don't see why that
> should be a panic. That'd imo make it pretty much impossible to upgrade
> a pair of primary/master where you normally upgrade the standby first?

IMHO just doing a LOG for something this serious isn't enough; it should
at least be a WARNING.


I think the concern about upgrading a replica before the master is 
valid; is there some way we could over-ride a PANIC when that's exactly 
what someone is trying to do? Check for a special file maybe?


+   bool        sawTruncationInCkptCycle;
What happens if someone downgrades the master, back to a version that no 
longer logs truncation? (I don't think assuming that the replica will 
need to restart if that happens is a safe bet...)


-   if (MultiXactIdPrecedes(oldestMXact, earliest))
+   /* If there's nothing to remove, we can bail out early. */
+   if (MultiXactIdPrecedes(oldestMulti, earliest))
{
-   DetermineSafeOldestOffset(oldestMXact);
+   LWLockRelease(MultiXactTruncationLock);
If/when this is backpatched, would it be safer to just leave this alone?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] [PROPOSAL] Effective storage of duplicates in B-tree index.

2015-09-27 Thread Peter Geoghegan
On Sun, Sep 27, 2015 at 4:11 PM, Peter Geoghegan  wrote:
> Debugging this stuff is sometimes like keyhole surgery. If you could
> just see at/get to the structure that you care about, it would be 10
> times easier. Hopefully this tool makes it easier to identify problems.

I should add that the way that the L&Y (Lehman & Yao) technique works, and the way
that Postgres code is generally very robust/defensive can make direct
testing a difficult thing. I have seen cases where a completely messed
up B-Tree still gave correct results most of the time, and was just
slower. That can happen, for example, because the "move right" thing
results in a degenerate linear scan of the entire index. The
comparisons in the internal pages were totally messed up, but it
"didn't matter" once a scan could get to leaf pages and could move
right and find the value that way.

I wrote amcheck because I thought it was scary how B-Tree indexes
could be *completely* messed up without it being obvious; what hope is
there of a test finding a subtle problem in their structure, then?
Testing the invariants directly seemed like the only way to have a
chance of not introducing bugs when adding new stuff to the B-Tree
code. I believe that adding optimizations to the B-Tree code will be
important in the next couple of years, and there is no other way to
approach it IMV.

-- 
Peter Geoghegan




Re: [HACKERS] BRIN indexes for MAX, MIN, ORDER BY?

2015-09-27 Thread Gavin Wahl
> Yeah.  I would urgently recommend that people *not* try to build new
> things like planagg.c right now.  A large part of the point of upper
> planner path-ification is to have a less grotty way of dealing with
> things like specialized aggregate implementations.

Ok. I will wait and ask again later.


[HACKERS] pg_xlogdump and .partial files

2015-09-27 Thread Peter Eisentraut
The pg_xlogdump man page states that it cannot read .partial WAL files
and that they need to be renamed.  It seems to me with about the same
number of lines we could just make it accept those files, no?




[HACKERS] Patch: Revised documentation on base backups

2015-09-27 Thread Amir Rohan
On 09/28/2015 06:33 AM, Michael Paquier wrote:
> On Sun, Sep 27, 2015 at 11:27 AM, Amir Rohan wrote:
>> Further editing. See attached V3.
>
> I think that you should consider adding this patch to the next commit
> fest so as we do not lose track of it:
> https://commitfest.postgresql.org/7/
>


I've recently picked up the base backup section of the documentation:
http://www.postgresql.org/docs/9.4/static/continuous-archiving.html
(Section 24.3.2) and went through it like a tutorial. I'm a relative
newcomer to postgres, so this was the real deal.

The docs seem to be in need of some improvement, primarily because
they have the feel of having been monkey-patched repeatedly in the
past. The rewrite needs someone with more experience to read through it
and comment on technical accuracy and any knee-slapper errors I
might have introduced.


One pain point in rewriting is the definition of terms, or lack
thereof, in the manual. I didn't find a section defining terms
conclusively for consistent referencing throughout the documentation.
For example, how should one refer to the directory housing
"postgresql.conf"?, is it:

1. The data directory
2. The storage directory
3. The PGDATA directory
4. The cluster directory
5. The server root directory
6. The config file directory
7. etc'

This lack of clear guidelines and uniformly applied terms
makes for some awkward verbal manoeuvring and compromises the
quality of the documentation (but admittedly, not its copiousness).

Amir

>From f763259e6e91d69dd1c41f67169fa0666b1f82d9 Mon Sep 17 00:00:00 2001
From: root 
Date: Sat, 26 Sep 2015 11:50:43 +0300
Subject: [PATCH] Rewritten documentation on base backups V2


diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 7413666..e2727e1 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -427,8 +427,8 @@ tar -cf backup.tar /usr/local/pgsql/data
   
If simultaneous snapshots are not possible, one option is to shut down
the database server long enough to establish all the frozen snapshots.
-   Another option is to perform a continuous archiving base backup () because such backups are immune to file
+   Another option is to create a base backup using the continuous archiving feature
+   () because such backups are immune to file
system changes during the backup.  This requires enabling continuous
archiving just during the backup process; restore is done using
continuous archive recovery ().
@@ -752,60 +752,58 @@ test ! -f /mnt/server/archivedir/000100A90065  cp pg_xlog/
Making a Base Backup
 

-The easiest way to perform a base backup is to use the
- tool. It can create
-a base backup either as regular files or as a tar archive. If more
-flexibility than  can provide is
-required, you can also make a base backup using the low level API
-(see ).
+A base backup consists of one or more WAL files and a .
+Together they are sufficient to recreate the database's state at some point in the past.
+Once a base backup is made, the WAL files that precede its creation are no longer
+necessary in order to recover the database to some later point in time.
+   
+
+   
+The interval between base backups should usually be
+chosen based on how much storage you want to expend on archived WAL
+files, since you must keep all the archived WAL files back to your
+last base backup.
+You should also consider how long you are prepared to spend
+recovering, if recovery should be necessary -- the system will have to
+replay all those WAL segments, and that could take awhile if it has
+been a long time since the last base backup.

 

-It is not necessary to be concerned about the amount of time it takes
-to make a base backup. However, if you normally run the
-server with full_page_writes disabled, you might notice a drop
-in performance while the backup runs since full_page_writes is
-effectively forced on during backup mode.
+Creating a base backup may be a lengthy process if you have a lot of data.
+Be aware that if you normally run the server with full_page_writes
+disabled, you might notice a drop in performance while the backup runs since
+full_page_writes is effectively forced on during backup mode.

 

 To make use of the backup, you will need to keep all the WAL
 segment files generated during and after the file system backup.
-To aid you in doing this, the base backup process
-creates a backup history file that is immediately
-stored into the WAL archive area. This file is named after the first
-WAL segment file that you need for the file system backup.
-For example, if the starting WAL file is
-0001123455CD the backup history file will be
-named something like
-0001123455CD.007C9330.backup. (The second
-part of the file name stands for an exact position within the WAL

Re: [HACKERS] pg_xlogdump and .partial files

2015-09-27 Thread Michael Paquier
On Mon, Sep 28, 2015 at 9:47 AM, Peter Eisentraut  wrote:
> The pg_xlogdump man page states that it cannot read .partial WAL files
> and that they need to be renamed.  It seems to me with about the same
> number of lines we could just make it accept those files, no?

FWIW, the discussion has happened here:
http://www.postgresql.org/message-id/cab7npqs_x5-m+s9yo_xd7oxwq1etvwv-1tsjuqnovqyyazt...@mail.gmail.com
In short, I was under the impression that this was not worth it, and I
still am under this impression knowing that this is not a common use
case.

Now, I imagine that this is actually what is disturbing for the
user: we could allow the definition of a .partial file as an end
segment. We would need to be careful though in XLogDumpXLogRead when
doing the segment switch when the last segment is being read. As
pg_xlogdump does not handle timeline jumps this would be useful to
allow the user to have a look at the segments of a previous timeline
up to the point the promoted standby has switched to, particularly if
the .partial file, archived by a promoted standby, and its complete
version, archived by old master are present at the same position.
Still this seems like a very narrow use-case to me, so opinions are
welcome.
Regards,
-- 
Michael




Re: [HACKERS] Patch: Optimize memory allocation in function 'bringetbitmap'

2015-09-27 Thread Alvaro Herrera
Jinyu Zhang wrote:
> 
> BRIN Scan: Optimize memory allocation in function 'bringetbitmap'.
> We can allocate memory for some pointers once before the long loop instead
> of allocating memory inside the loop.
> memory in long loop.
> 
> Before optimizing code (warm run)
> postgres=# select count(*) from lineitem where l_orderkey=1;
> Time: 456.219 ms
> 
> After optimizing code (warm run)
> postgres=# select count(*) from lineitem where l_orderkey=1;
> Time: 349.219 ms

Hmm, did you measure this in a cassert-enabled build?  Those are slower
in allocation because of the memory-clobber thingy; without that, the
allocations would be a lot faster, so perhaps the gains are not so
interesting.  Still, it's a lot of palloc/pfree calls that do not happen
with your patch, so perhaps it's still worthwhile, but please do measure
it.
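
(As a side note, whether a given build has assertions enabled can be
checked from SQL with

show debug_assertions;

which should report "on" for an --enable-cassert build.)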

However I wonder if it would be simpler to have the dtup structure have
the pointers, so that you can pass it as NULL in the first call and then
followup calls reuse the one allocated in the first call.  That makes
the callers a bit less ugly, I think.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] On-demand running query plans using auto_explain and signals

2015-09-27 Thread Pavel Stehule
2015-09-25 19:13 GMT+02:00 Shulgin, Oleksandr :
>
>
> Some problems:
>
> There is a potential problem with the limited total number of DSM
> segments: it is reserved in a way to only allow 2 per backend (on average)
> and 64 additional per server, so if you run with the new option enabled at
> all times, you're down to only 1 additional DSM per backend (again, on
> average).  Not sure how critical this can be, but no one is forced to run
> with this option enabled for all backends.
>
> If you don't want to run it enabled at all times, then enabling the GUC
> per-backend can be problematic.  It's still possible to update the conf
> file and send SIGHUP to a single backend, but it's harder to accomplish
> over psql, for example.  I think here we might still have some luck with
> the signals: use another array of per-backend slots with flags, set the
> target backend's flag and send it SIGUSR1.  The backend wakes on the signal
> and examines its slot, then toggles the GUC if needed.  Sounds pretty safe,
> eh?
>
> No documentation changes yet, waiting for your comments. :-)
>

Preparing the content before execution is an interesting idea that could be
used more. Most queries and plans are not too big, so when the size of the
content is below some limit - say, less than 1MB - one DSM segment can be
used for all backends. When the size of the content is bigger than the
limit, a DSM segment can be allocated specifically for this content. The
pointer to the DSM segment and the offset can be stored in the requested
process's slot. Reading and writing the requested slot should be protected
by a spinlock, but it would block only the two related processes for a
short time (copying memory). Another possibility is showing the size of the
content in the requested process's slot; then the requester can preallocate
enough shared memory. But this doesn't solve the issue of the requester
having to wait for the content. So the first variant is pretty simple, and
should be preferred. The disadvantage is clear - it can cause a possibly
significant slowdown of fast queries.

Regards

Pavel





> Happy hacking!
> --
> Alex
>
>


[HACKERS] Patch: Optimize memory allocation in function 'bringetbitmap'

2015-09-27 Thread Jinyu Zhang

BRIN Scan: Optimize memory allocation in function 'bringetbitmap'.
We can allocate memory for some pointers once before the long loop instead
of allocating memory inside the loop.

Before optimizing code (warm run)
postgres=# select count(*) from lineitem where l_orderkey=1;
 count
---
 6
(1 row)

Time: 456.219 ms

After optimizing code (warm run)
postgres=# select count(*) from lineitem where l_orderkey=1;
 count
---
 6
(1 row)

Time: 349.219 ms

The following shows the DDL of this test case.
CREATE TABLE LINEITEM ( L_ORDERKEY      INTEGER NOT NULL,
                        L_PARTKEY       INTEGER NOT NULL,
                        L_SUPPKEY       INTEGER NOT NULL,
                        L_LINENUMBER    INTEGER NOT NULL,
                        L_QUANTITY      DECIMAL(15,2) NOT NULL,
                        L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
                        L_DISCOUNT      DECIMAL(15,2) NOT NULL,
                        L_TAX           DECIMAL(15,2) NOT NULL,
                        L_RETURNFLAG    CHAR(1) NOT NULL,
                        L_LINESTATUS    CHAR(1) NOT NULL,
                        L_SHIPDATE      DATE NOT NULL,
                        L_COMMITDATE    DATE NOT NULL,
                        L_RECEIPTDATE   DATE NOT NULL,
                        L_SHIPINSTRUCT  CHAR(25) NOT NULL,
                        L_SHIPMODE      CHAR(10) NOT NULL,
                        L_COMMENT       VARCHAR(44) NOT NULL);

copy lineitem from '/home/jinyu/mywork/dbgen/lineitem.tbl' delimiter '|';
create index brinLineitem on lineitem using brin(L_ORDERKEY) 
with(pages_per_range = 1);
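
To confirm the BRIN path is the one being exercised here, something along
these lines should show a Bitmap Heap Scan fed by a Bitmap Index Scan on
brinlineitem (the exact cost, row and buffer numbers will of course vary):

explain (analyze, buffers)
select count(*) from lineitem where l_orderkey = 1;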

Jinyu Zhang






patch_optimize_mem
Description: Binary data



[HACKERS] pg_dump LOCK TABLE ONLY question

2015-09-27 Thread Filip Rembiałkowski
Hi.

I'm running pg_dump constrained to one schema. It appears that pg_dump runs
"LOCK TABLE %s IN ACCESS SHARE MODE" for each table.

Naturally it makes sense, but...

This schema has a table that serves as parent for thousands of child
tables (via INHERITS).

So effectively, to dump this schema, I have to LOCK all these tables,
not only the parent.

pg_dump does this automatically; I checked in current trunk - it does
not add the ONLY keyword. Should it?
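
If I am reading the LOCK TABLE syntax right, the difference would be
something like this (table name is made up):

begin;
-- What pg_dump effectively issues today; without ONLY, all descendant
-- tables are locked as well.
lock table parent_tbl in access share mode;
-- With ONLY, just the named table would be locked.
lock table only parent_tbl in access share mode;
commit;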

I wonder if it counts as a bug.
If not a bug, maybe it counts as a feature request?

Thanks.

