Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread David Johnston
Tom Lane wrote
 Unfortunately, while testing it I noticed that there's a potentially
 fatal backwards-compatibility problem, namely that the COPY n status
 gets printed on stdout, which is the same place that COPY OUT data is
 going.  While this isn't such a big problem for interactive use,
 usages like this one are pretty popular:
 
psql -c 'copy mytable to stdout' mydatabase | some-program
 
 With the patch, COPY n gets included in the data sent to some-program,
 which never happened before and is surely not what the user wants.
 The same if the -c string uses \copy.
 
 There are several things we could do about this:
 
 1. Treat this as a non-backwards-compatible change, and document that
 people have to use -q if they don't want the COPY tag in the output.
 I'm not sure this is acceptable.

I've mostly used COPY TO with files, and so wouldn't mind if STDOUT had the
COPY n sent to it, as long as the target file contains just the copy contents.
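
Under option #1, the popular pipeline quoted above would simply need -q added:

   psql -q -c 'copy mytable to stdout' mydatabase | some-program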


 2. Kluge ProcessResult so that it continues to not pass back a PGresult
 for the COPY TO STDOUT case, or does so only in limited circumstances
 (perhaps only if isatty(stdout), for instance).

The main problem with this is that people will test by sending output to a
TTY and see the COPY n.  Still, if it can be done consistently, then you
minimize the backward incompatibility and encourage people to enforce quiet
mode while the command runs...


 3. Modify PrintQueryStatus so that command status goes to stderr not
 stdout.  While this is probably how it should've been done in the first
 place, this would be a far more severe compatibility break than #1.
 (For one thing, there are probably scripts out there that think that any
 output to stderr is an error message.)  I'm afraid this one is definitely
 not acceptable, though it would be by far the cleanest solution were it
 not for compatibility concerns.

Yes, it's a moot point but I'm not sure it would be best anyway.


 4. As #3, but print the command status to stderr only if it's COPY n,
 otherwise to stdout.  This is a smaller compatibility break than #3,
 but still a break since COPY status was formerly issued to stdout
 in non TO STDOUT/FROM STDIN cases.  (Note that PrintQueryStatus can't
 tell whether it was COPY TO STDOUT rather than any other kind of COPY;
 if we want that to factor into the behavior, we need ProcessResult to
 do it.)

Since we are considering stderr: my (admittedly inexperienced) gut says that
using stderr for this is generally undesirable, especially given our
existing precedent.  stdout is typically the correct target, and the
existing quiet-mode toggle provides sufficient control for typical needs.


 5. Give up on the print-the-tag aspect of the change, and just fix the
 wrong-line-number issue (so we'd still introduce the copyStream variable,
 but not change how PGresults are passed around).
 
 I'm inclined to think #2 is the best answer if we can't stomach #1.
 But the exact rule for when to print a COPY OUT result probably
 still requires some debate.  Or maybe someone has another idea?
 
 Also, I'm thinking we should back-patch the aspects of the patch
 needed to fix the wrong-line-number issue.  That appears to have been
 introduced in 9.2; older versions of PG get the above example right.
 
 Comments?

I'd like COPY TO targeting anything but STDOUT to emit a COPY n on STDOUT,
unless suppressed by -q(uiet).

Document that COPY TO STDOUT does not emit COPY n because STDOUT is already
assigned for data and so is not available for notifications.  Since COPY is
more typically used for ETL than a bare SELECT, this default behavior seems
reasonable on top of the backward-compatibility concerns.

Would it be possible to store the n somewhere and provide a command - like
GET DIAGNOSTICS in pl/pgsql - if the user really wants to know how many rows
were sent to STDOUT?  I doubt this is even useful in the typical use-case
for COPY TO STDOUT, but figured I'd toss the idea out there.

Is there anything, besides a desire for consistency, that anyone can put
forth as a use-case for COPY TO STDOUT emitting COPY n on STDOUT as well?
If you are going to view the content inline, and also want a quick count,
ISTM you would be more likely to use SELECT to take advantage of all its
pretty-print features.

If we really need to cater to this use case then maybe a --loud-copy-to-stdout
switch can be provided to override the default quiet mode.

David J.







Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread Pavel Stehule
2014-03-12 7:10 GMT+01:00 David Johnston pol...@yahoo.com:

 I'd like COPY TO targeting anything but STDOUT to emit a COPY n on STDOUT,
 unless suppressed by -q(uiet).


+1

This information can be really interesting and sometimes important, when
people have no idea how long their tables are.

Regards

Pavel




Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

2014-03-12 Thread Kouhei Kaigai
Thanks for your efforts!
                   Head      Patched    Diff
 Select - 500K     772ms     2659ms     -200%
 Insert - 400K     3429ms    1948ms     43%    (I am not sure how it improved in this case)
 Delete - 200K     2066ms    3978ms     -92%
 Update - 200K     3915ms    5899ms     -50%
 
 This patch has shown how the custom scan can be used very well, but the
 patch as it is has some performance problems which need to be
 investigated.
 
 I attached the test script file used for the performance test.

First of all, it seems to me your test case has too small a data set, which
allows all the data to be held in memory - roughly 500K records of 200 bytes
each will consume about 100MB. Your configuration allocates 512MB of
shared_buffers, and about 3GB of OS-level page cache is available.
(Note that Linux uses free memory as disk cache adaptively.)
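
(As a quick sanity check of those numbers - assuming the test table from the
script is simply named test - the working set can be compared against the
cache sizes:)

   select pg_size_pretty(pg_relation_size('test'));  -- data actually on disk
   show shared_buffers;                              -- PostgreSQL's own cache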

This cache is designed to hide the latency of disk accesses, so this test
case does not fit its intention.
(Also, the primary purpose of this module is to demonstrate
heap_page_prune_hook for hooking vacuuming, so simple code was preferred
over a complicated implementation with better performance.)

I could reproduce the overall trend: a no-cache scan is faster than the
cached scan if the buffer is in memory. Probably, it comes from the cost
of walking down the T-tree index using the ctid per reference.
The performance penalty around UPDATE and DELETE likely comes from
trigger invocation per row.
I could observe a small performance gain on INSERT.
It's strange for me, also. :-(

On the other hand, the discussion around the custom-plan interface
affects this module, because it uses that API as its foundation.
Please wait for a few days while I rebase the cache_scan module onto
the newer custom-plan interface that I submitted just a moment ago.

Also, is it really necessary to tune the performance stuff in this
example module for heap_page_prune_hook?
Even though I have a few ideas to improve the cache performance,
like insertion of multiple rows at once or a local chunk copy instead
of a T-tree walk-down, I'm not sure whether it is productive in the
current v9.4 timeframe. ;-(

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Haribabu Kommi
 Sent: Wednesday, March 12, 2014 1:14 PM
 To: Kohei KaiGai
 Cc: Kaigai Kouhei(海外 浩平); Tom Lane; PgHacker; Robert Haas
 Subject: Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only
 table scan?)
 
 On Thu, Mar 6, 2014 at 10:15 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
  2014-03-06 18:17 GMT+09:00 Haribabu Kommi kommi.harib...@gmail.com:
  I will update you later regarding the performance test results.
 
 
 I ran the performance test on the cache scan patch and below are the readings.
 
 Configuration:
 
 Shared_buffers - 512MB
 cache_scan.num_blocks - 600
 checkpoint_segments - 255
 
 Machine:
 OS - centos - 6.4
 CPU - 4 core 2.5 GHZ
 Memory - 4GB
 
                    Head      Patched    Diff
  Select - 500K     772ms     2659ms     -200%
  Insert - 400K     3429ms    1948ms     43%    (I am not sure how it improved in this case)
  Delete - 200K     2066ms    3978ms     -92%
  Update - 200K     3915ms    5899ms     -50%
 
  This patch has shown how the custom scan can be used very well, but the
  patch as it is has some performance problems which need to be
  investigated.
 
 I attached the test script file used for the performance test.
 
 Regards,
 Hari Babu
 Fujitsu Australia




Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

2014-03-12 Thread Haribabu Kommi
On Wed, Mar 12, 2014 at 5:26 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 Thanks for your efforts!
                   Head      Patched    Diff
 Select - 500K     772ms     2659ms     -200%
 Insert - 400K     3429ms    1948ms     43%    (I am not sure how it improved in this case)
 Delete - 200K     2066ms    3978ms     -92%
 Update - 200K     3915ms    5899ms     -50%

 This patch has shown how the custom scan can be used very well, but the
 patch as it is has some performance problems which need to be
 investigated.

 I attached the test script file used for the performance test.

 First of all, it seems to me your test case has too small a data set, which
 allows all the data to be held in memory - roughly 500K records of 200 bytes
 each will consume about 100MB. Your configuration allocates 512MB of
 shared_buffers, and about 3GB of OS-level page cache is available.
 (Note that Linux uses free memory as disk cache adaptively.)

Thanks for the information, and a small correction: the total number of
records is 5 million. The select operation is selecting 500K records. The
total table size is around 1GB.

Once I get your new patch re-based on the custom scan patch, I will test
the performance again by increasing my database size beyond the RAM size.
And I will also make sure that the memory available for disk cache is less.

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty

2014-03-12 Thread Haribabu Kommi
On Tue, Mar 11, 2014 at 2:59 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Mon, Mar 10, 2014 at 1:13 PM, Haribabu Kommi
 kommi.harib...@gmail.com wrote:
 On Mon, Mar 10, 2014 at 4:24 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Mon, Mar 10, 2014 at 5:58 AM, Wang, Jing ji...@fast.au.fujitsu.com 
 wrote:
  Enclosed is the patch to implement the requirement to issue a log message
  suggesting VACUUM FULL if a table is nearly empty.
 
  The requirement comes from the PostgreSQL TODO list.
 
 I think it would be better if we can use some existing stats to issue the
 warning message rather than traversing the FSM for all pages. For example,
 after vacuuming a page in lazy_scan_heap(), we update the freespace for the
 page. You can refer to the below line in lazy_scan_heap():
 freespace = PageGetHeapFreeSpace(page);

 Now it might be possible that we do not get the freespace info easily, as
 it is not accumulated across previous vacuums. In case there is no viable
 way to get it through vacuum stats, we are already updating the FSM after
 vacuum via FreeSpaceMapVacuum(), where I think it should be possible
 to get the freespace.

 Yes, this way it works without extra penalty. But the problem is how to
 calculate the free space left in the pages skipped because of the
 visibility bit.

 One way could be to extrapolate (vac_estimate_reltuples) like we do for
 some other stats, but I am not sure if we can get correct estimates. The
 main reason is that if you observe that code path, all the decisions are
 mainly made on the basis of vacrelstats. I have not checked in detail
 whether this purpose can be achieved using any other stats; maybe you can
 look into it.

I checked the vac_estimate_reltuples() function, but was not able to find
a proper way to identify the free space.

 By the way, have you checked whether FreeSpaceMapVacuum() can serve your
 purpose? This call already traverses the FSM in depth-first order to
 update the freespace, so using this call, or a wrapper on it that also
 returns the total freespace apart from updating it, might serve the need.

Thanks for the information. We can get the table's free space by writing a
wrapper or by modifying the FreeSpaceMapVacuum() function a little. This way
it will not add any extra overhead in identifying whether the table is
almost empty.
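
(For reference, the contrib pg_freespacemap module already exposes the
per-page FSM values at the SQL level, so the total such a wrapper would
compute can be sketched as follows; 'mytable' is an illustrative name:)

   CREATE EXTENSION pg_freespacemap;
   -- total free space recorded in the FSM across all pages of the table
   SELECT pg_size_pretty(sum(avail)) FROM pg_freespace('mytable'::regclass);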

Regards,
Hari Babu
Fujitsu Australia




Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-12 Thread Amit Kapila
On Tue, Mar 11, 2014 at 2:32 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 On 11/03/14 13:23, Amit Kapila wrote:
 Could you please check (if you are comfortable doing so) wherever this
 patch is passing a tuple, whether it is okay to pass it based on visibility
 rules; else I will verify it once again.

 I think I got all places, but it would be nice to have a confirmation.

How about in IndexBuildHeapScan()? This also uses SnapshotAny to
scan the heap data.

Normally in this function, passing the tuple to the lock routine should not
be a problem, as an exclusive lock on the relation would have been ensured
before reaching this stage; but the below comment in this function leads me
to think that there can be a problem during system catalog scans.

/*
* Since caller should hold ShareLock or better, normally
* the only way to see this is if it was inserted earlier
* in our own transaction. However, it can happen in
* system catalogs, since we tend to release write lock
* before commit there.  Give a warning if neither case
* applies.
*/

I could not immediately think of a testcase which can validate it, but this
is certainly a point to ponder.
Do you think it is safe to pass the tuple to XactLockTableWaitWithInfo()
in this function?


I think the other things in your patch are now good; just these tuple
visibility validations are tricky, and it is taking time to validate the
paths, because these get called in nested paths where in a few cases even
dirty or SnapshotAny scans seem to be safe w.r.t. displaying the tuple.
Anyway, today I have checked most paths; maybe I will give it one more look
with a fresh mind and then pass it on to a committer.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Magnus Hagander
On Mar 12, 2014 1:46 AM, Josh Berkus j...@agliodbs.com wrote:

 On 03/11/2014 06:57 AM, Tom Lane wrote:
  Mind you, I wouldn't be unhappy to see it go away; it's a kluge and always
  has been.  I'm just expecting lots of push-back if we try.  And it's kind
  of hard to resist push-back when you don't have a substitute to offer.

 Yeah, what we really need is encapsulated per-DB users and local
 superusers.  I think everyone agrees that this is the goal, but nobody
 wants to put in the work to implement a generalized solution.


Encapsulated would probably be the doable part. But a local superuser?
Given that a superuser can load and run binaries, how would you propose to
restrict that superuser from doing anything they want? And if you don't
need that functionality, then how is it really different from being the
database owner?

/Magnus


Re: [HACKERS] The case against multixact GUCs

2014-03-12 Thread Albe Laurenz
Josh Berkus wrote:
 What makes these GUCs worse is that nobody knows how to set them; nobody
 on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
 users (or 1 in 10 people on this list) know what a multixact *is*.

I won't contend your first statement, but multixacts are explained
in the documentation:

http://www.postgresql.org/docs/9.3/static/routine-vacuuming.html#VACUUM-FOR-MULTIXACT-WRAPAROUND

Yours,
Laurenz Albe



Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Heikki Linnakangas

On 03/10/2014 09:44 PM, Robert Haas wrote:

On Mon, Apr 30, 2012 at 1:34 PM, Noah Misch n...@leadboat.com wrote:

When GIN changes a metapage, we WAL-log its ex-header content and never use a
backup block.  This reduces WAL volume since the vast majority of the metapage
is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
content if the metapage LSN predates the WAL record LSN.  If a metapage write
tore and updated the LSN but not the other content, we would fail to complete
the update.  Instead, unconditionally reinitialize the metapage similar to how
_bt_restore_meta() handles the situation.

I found this problem by code reading and did not attempt to build a test case
illustrating its practical consequences.  It's possible that there's no
problem in practice on account of some reason I haven't contemplated.


The attached patch doesn't apply any more, but it looks like this
issue still exists.


Fixed.

- Heikki




Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-03-12 Thread MauMau

From: Alvaro Herrera alvhe...@2ndquadrant.com

MauMau escribió:

Do you know how I can reply to an email which was deleted locally?
I thought I could download an old mail by clicking raw link and
import it to the mailer.  However, it requires username/password
input, and it seems to be different from the one for editing
CommitFest.  I couldn't find how to authenticate myself.


The box that asks for password tells you what the user/password is.  I
think it's something like archives/archives or similar.  The password is
there only to keep spammers out, not to have any real auth.


Thank you, the user/password was certainly displayed in the box --  
archives/antispam.  The raw link only gave the mail in text format.  I 
hoped to import the mail into Windows Mail on Windows Vista, but I couldn't.


Regards
MauMau





Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-12 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

That, more or less.  There is exactly zero provision in the Postgres
code for multiple threads to exist inside a backend process.  It's
possible that PL/Java manages to completely insulate the Java world
from the C world, so that the C code never sees more than one thread.
But any leakage at all in that abstraction is probably going to cause
bugs; and as I said, we (PG hackers) are not going to consider such
bugs to be our problem.

On platforms where the standard libc supports threading (which is most,
these days), I'd be particularly worried about leakage along the path
java - libc - postgres.  If libc becomes aware that there are multiple
threads executing inside the process, it's likely to change behaviors.


I see... even Tom-san is suspicious of PL/Java's design, or of the use
of SPI from code linked with libpthread.so.  I'll communicate this to the
PL/Java community.


Regards
MauMau





Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 The attached patch doesn't apply any more, but it looks like this
 issue still exists.

 Fixed.

Did you forget to push?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Heikki Linnakangas

On 03/12/2014 02:05 PM, Robert Haas wrote:

On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

The attached patch doesn't apply any more, but it looks like this
issue still exists.


Fixed.


Did you forget to push?


Yep. Pushed now.

- Heikki




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-12 Thread Fabrízio de Royes Mello
On Tue, Mar 11, 2014 at 8:42 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 March 2014 18:33, Tom Lane t...@sss.pgh.pa.us wrote:
  Simon Riggs si...@2ndquadrant.com writes:
  -1 to *requiring* validation for table-level options for exactly the
  same reasons we no longer validate custom GUCs.
 
   Well, that is an interesting analogy, but I'm not sure how much it
   applies here.  In the case of a GUC, you can fairly easily validate it
   once the module does get loaded (and before the module actually tries to
   do anything with it).  I don't see how that's going to work for table
   options.  I trust nobody is seriously proposing that on module load,
   we're going to scan the whole of pg_class looking to see if there are
   incorrect settings.  (Even if we did, what would we do about it?  Not
   try to force a pg_class update, for sure.  And what if the module is
   loading into the postmaster thanks to a preload spec?)

 Thank goodness for that. Strict validation does seem scary.

   I don't really think partial validation makes sense.  We could just
   remove the whole topic, and tell extension authors that it's up to them
   to defend themselves against bizarre values stored for their table
   options.  But I'm wondering if there's really so much use-case for a
   feature like that.

 DBAs are fairly used to the idea that if you put crap data in the
 database then bad things happen. We provide the table, they provide
 the data. Validation is possible, but not enforced as essential.
 (Except in terms of the datatype - but then we are also validating
 data to specific types here).

 So I think that DBAs will also cope rather well with table-level
 options without us nannying them.

  There is nothing more annoying than needing to run scripts in a
 specific sequence to make them work, or dumps that fail because
 certain modules aren't loaded yet (or cannot ever be so). And maybe
 the DBA wants to annotate tables based on a design and then later move
 to implement modules to take advantage of the annotation.

 Having an option be set and yet be unvalidated and/or unused is no
 more annoying than having a column in a table that is known incorrect
 and/or not accessed. Searching for badly set options needs to be
  possible, even easy, but hard validation can cause problems. And if we
  try to force it, what's to stop people from using a dummy validator
  just to circumvent the strictness?


Then I think my patch is more in line with these conclusions, apart from
the adjustments suggested by Tom Lane and mentioned by Alvaro Herrera [1].

Am I correct?


[1]
http://www.postgresql.org/message-id/20140307205649.gf4...@eldon.alvh.no-ip.org

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Alvaro Herrera
Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  On 03/11/2014 09:37 PM, Tom Lane wrote:
  In particular, I'd like to see an exclusion that prevents local users
  from having the same name as any global user, so that we don't have
  ambiguity in GRANT and similar commands.  This doesn't seem simple to
  enforce (if we supported partial indexes on system catalogs, it would
  be ...) but surely this representation is more amenable to enforcing it
  than the existing one.
 
  Should be workable if you're creating a local name - just check against 
  the list of global roles.
 
 Concurrent creations won't be safe without some sort of locking scheme.
 A unique index would be a lot better way of plugging that hole than a
 system-wide lock on user creation.  But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

Isn't this just a case of creating a suitable operator and an exclusion
constraint?  Defining the constraint in BKI might require extra
infrastructure, but it should be possible.
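
(A toy sketch of that idea on an ordinary table, since system catalogs can't
carry such a constraint today, as noted downthread: map "global" to a range
that overlaps every database.  Table and column names are illustrative only.)

   CREATE EXTENSION IF NOT EXISTS btree_gist;   -- GiST equality on text

   CREATE TABLE authid_demo (
       rolname     text NOT NULL,
       roldatabase int  NOT NULL,               -- 0 means "global user"
       EXCLUDE USING gist (
           rolname WITH =,
           (CASE WHEN roldatabase = 0 THEN int4range(NULL, NULL)
                 ELSE int4range(roldatabase, roldatabase, '[]') END) WITH &&
       )
   );

   INSERT INTO authid_demo VALUES ('joe', 1);   -- (joe, db1): accepted
   INSERT INTO authid_demo VALUES ('joe', 2);   -- (joe, db2): accepted
   INSERT INTO authid_demo VALUES ('joe', 0);   -- global joe: rejected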

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




[HACKERS] git-review: linking commits to review discussion in git

2014-03-12 Thread Murtuza Mukadam
From: Murtuza Mukadam murtuza.i.muka...@gmail.com
Date: Sat, Mar 8, 2014 at 1:56 PM
Subject: Re: [HACKERS] git-review: linking commits to review discussion in git
To: Heikki Linnakangas hlinnakan...@vmware.com
Cc: pgsql-hackers@postgresql.org


Hi Heikki

We have linked git commits and reviews in a web interface. If you
enter a commit hash, you will be redirected to the email archive of
the peer review discussion:
http://cesel.encs.concordia.ca/git-reviewed-tracker.php

This work is part of my thesis, so feedback is much appreciated. If
you have another git repo and mailing lists that you'd like linked,
please let us know.

Cheers,
Murtuza

How do we do the linking? We take each email patch, eliminate whitespace,
and hash each line. We then compare the lines with those in
commits to the same files. The commit that changes the same files and
has the largest number of matching lines is considered to be the
reviewed commit.



On Tue, Jan 28, 2014 at 2:10 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 01/27/2014 11:36 PM, Murtuza Mukadam wrote:

 Hello All,
We have linked peer review discussions on
 'pgsql-hackers' to their respective commits within the main
 postgresql.git repository. You can view the linked reviews from 2012
 until present in the GitHub repo at
 https://github.com/mmukadam/postgres/tree/review

 If you want to work with these reviews locally, you can use our
 git-review tool. It allows you to create reviews and attach them to
 commits in git. We didn't modify git, instead we added some scripts
 that use standard git commands. git-review is beta, but since it only
 adds a detached 'review' branch and modifies the contents of this
 branch, it has minimal impact and can easily be removed by deleting
 the 'review' branch and scripts.

 The online man-page is here:
 http://users.encs.concordia.ca/~m_mukada/git-review.html

 In order to install git-review, you need to clone the repository:
 https://github.com/mmukadam/git-review.git

 The online tutorial is available here:
 http://users.encs.concordia.ca/~m_mukada/git-review-tutorial.html

 The clone of postgresql.git with linked review discussions is here (new
 review discussions are linked nightly):
 https://github.com/mmukadam/postgres

 This work is part of my Master's thesis. If you'd like us to change
 the tool to better suit your review process, have another git repo
 you'd like us to link commits with review discussion, or have other
 feedback, please let us know.


 I don't understand what this does. The repository at
 https://github.com/mmukadam/postgres looks like just a clone of the main
 PostgreSQL repository, with no extra links anywhere. And the repository at
 https://github.com/mmukadam/postgres/tree/review looks like a mailing list
 archive turned into a git repository, but I don't see any links to the
 commits in the main repository there.

 Am I missing something?

 - Heikki




[HACKERS] Rowtype column and domain subfield with DEFAULT and NOT NULL constraint

2014-03-12 Thread Julien Tachoires

Hi,

A customer has reported to us a strange behaviour regarding a rowtype
column with a domain subfield:


test=# CREATE DOMAIN my_int_not_null_1 AS INTEGER DEFAULT 1 NOT NULL;
CREATE DOMAIN
test=# CREATE TYPE  my_int_rowtype AS (
test(# f1 INTEGER,
test(# f2 my_int_not_null_1
test(# );
CREATE TYPE
test=# CREATE TABLE test (id SERIAL, col1 my_int_rowtype);
CREATE TABLE
test=# INSERT INTO test (col1.f1) VALUES (1);
INSERT 0 1
test=# INSERT INTO test (id, col1.f1) VALUES (2, 1);
INSERT 0 1
test=# INSERT INTO test (col1) VALUES ((1,NULL));
ERROR:  domain my_int_not_null_1 does not allow null values
test=# SELECT * FROM test;
 id | col1
+--
  1 | (1,)
  2 | (1,)
(2 rows)

It seems:

- the DEFAULT value (from the domain) is not inserted
- the NOT NULL constraint is not applied except when we explicitly set
the value to NULL; it looks like the same issue as the previous one:
when there is no DEFAULT, the parser/rewriter simply sets the
column/subfield to NULL.


Is build_column_default() the right place to handle that case?

--
Julien Tachoires
http://dalibo.com - http://dalibo.org




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Tomas Vondra
On 12 March 2014, 0:41, Peter Geoghegan wrote:
 On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote:
   ERROR:  index row size 1416 exceeds maximum 1352 for index gin_idx

 All index AMs have similar restrictions.

Yes, I know, and I have no problem with restrictions in general. You may
run into similar issues with btree indexes on text columns with long
values, for example. The thing is that people don't generally index long
text directly, because it usually does not make much sense; they use
tsvector etc. instead.

But with jsonb it's more likely, because indexing is one of the goodies (at
least for me). And the discussions I had recently with several people
interested in storing json data often went like this:

me: It seems we'll have a better json datatype in 9.4.
them: Nice!
me: And it will be possible to do searches on arbitrary keys.
them: Yay!
me: And we actually got pretty significant improvements in GIN indexes.
them: Awesome!
me: But the values you may index need to be less than ~1500B.
them: Bummer :-(
me: Well, you can use GIST then.

 A good example of such a header is dkim-signature, which basically
 contains the whole message digitally signed with DKIM. The value
 tends to be long and, being a signature, non-compressible.

 I'm wondering what's the best way around this, because I suspect many
 new users (especially those attracted by jsonb and the GIN improvements)
 will run into this. Maybe not immediately, but eventually they'll try to
 insert a jsonb with a long value, and it will fail ...

 The jsonb_hash_ops operator class just stores a 32-bit integer hash
 value (it always sets the recheck flag, which only some of the other
 default GIN opclass' strategies do). It only supports containment, and
 not the full variety of operators that the default opclass supports,
 which is why it isn't the default. I think that in practice the
 general recommendation will be that when indexing at the top level,
 use jsonb_hash_ops. When indexing nested items, use the more flexible
 default GIN opclass. That seems like a pretty smart trade-off to me.

OK, I'll look into the jsonb_hash_ops - that sounds more or less like what
I was thinking about (and sure, storing hashes makes some operations
impossible to support).
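
(Spelled out, the two suggested shapes would presumably look like the
following; the opclass name is the one from the patch under discussion, so
treat it as tentative, and the table/column are illustrative:)

   -- top level: containment-only, compact hashed keys
   CREATE INDEX docs_gin_idx ON docs USING gin (doc jsonb_hash_ops);

   -- one nested item: expression index using the default opclass
   CREATE INDEX docs_addr_gin_idx ON docs USING gin ((doc -> 'address'));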

The other thing I was thinking about is introducing some kind of upper
limit on the value length - e.g. index just the first 1kB, or something
like that. My experience is that most values are way shorter, or actually
differ within the first 1kB, so this should allow most decisions to be
made. But I'm not really that familiar with how GIN works, so maybe this
is nonsense.

 The more I think about it, the more inclined I am to lose GiST support
 entirely for the time being. It lets us throw out about 700 lines of C
 code, which is a very significant fraction of the total, removes the
 one open bug, and removes the least understood part of the code. The
 GiST opclass is not particularly compelling for this.

I disagree with that. I see GiST as a simple fallback option for the cases
I described. I wasn't able to create a GIN index because of exceeding the
max item length, but the GiST index was created just fine. It was
considerably slower, but it worked.

Tomas





Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Andres Freund
On 2014-03-12 10:03:42 -0300, Alvaro Herrera wrote:
 Tom Lane wrote:
  Andrew Dunstan and...@dunslane.net writes:
   On 03/11/2014 09:37 PM, Tom Lane wrote:
   In particular, I'd like to see an exclusion that prevents local users
   from having the same name as any global user, so that we don't have
   ambiguity in GRANT and similar commands.  This doesn't seem simple to
   enforce (if we supported partial indexes on system catalogs, it would
   be ...) but surely this representation is more amenable to enforcing it
   than the existing one.
  
   Should be workable if you're creating a local name - just check against 
   the list of global roles.
  
  Concurrent creations won't be safe without some sort of locking scheme.
  A unique index would be a lot better way of plugging that hole than a
  system-wide lock on user creation.  But not sure how to define a unique
  index that allows (joe, db1) to coexist with (joe, db2) but not with
  (joe, 0).
 
 Isn't this just a case of creating a suitable operator and an exclusion
 constraint?  Defining the constraint in BKI might require extra
 infrastructure, but it should be possible.

Except that we don't have the infrastructure to perform such checks
(neither partial indexes, nor expression indexes, nor exclusion
constraints) on system tables atm. So it's not an entirely trivial thing
to do.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Tomas Vondra
On 12 March 2014, 0:51, Peter Geoghegan wrote:
 On Tue, Mar 11, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote:
 I think that in practice the
 general recommendation will be that when indexing at the top level,
 use jsonb_hash_ops. When indexing nested items, use the more flexible
 default GIN opclass. That seems like a pretty smart trade-off to me.

 By which I mean: index nested items using an expressional GIN index.

I'm still not sure how that would look. Does that mean I'd have to create
multiple GIN indexes - one for each possible key - or something like that?
Can you give an example?

regards
Tomas





[HACKERS] Postgresql XML parsing

2014-03-12 Thread Ashoke
Hi,

   I am working on adding functionality to PostgreSQL. I need to parse
the XML-format query plan (produced by PostgreSQL v9.3) and save it in a
simple data structure (say, a C structure). I was wondering if PostgreSQL
already had any parsing functions implemented that I can use to do the XML
parsing. This is getting difficult, as I was not able to find any DTD or
XML Schema for the XML files generated by PostgreSQL.

   I found the files xpath.c/xslt_proc.c, which contain parsing-related
code, but none of their functions are called for any XML-related query
I issue to the database, and some of the functions in those files are
marked as deprecated.

   It would be greatly helpful if someone could guide me on this.

   Thank you.

-- 
Regards,
Ashoke


Re: [HACKERS] Postgresql XML parsing

2014-03-12 Thread Andrew Dunstan


On 03/12/2014 09:36 AM, Ashoke wrote:

Hi,

   I am working on adding functionality to PostgreSQL. I need to
parse the XML-format query plan (produced by PostgreSQL v9.3) and save
it in a simple data structure (say, a C structure). I was wondering if
PostgreSQL already had any parsing functions implemented that I can
use to do the XML parsing. This is getting difficult, as I was not
able to find any DTD or XML Schema for the XML files generated by
PostgreSQL.

   I found the files xpath.c/xslt_proc.c, which contain parsing-related
code, but none of their functions are called for any XML-related query
I issue to the database, and some of the functions in those files are
marked as deprecated.

   It would be greatly helpful if someone could guide me on this.

The only XML parsing we have is where Postgres is built with libxml, in
which case we use its parser. But query-plan XML is delivered to a
client (or to a log file, which means more or less the same thing here).
If you want to parse it, it should be parsed in the client - that's why
we provide it. Inside Postgres I don't see a point in parsing the XML
rather than handling the query plan directly.


The worst possible option would be to hand-roll an XML parser, either
in the client or the server - XML parsing has all sorts of wrinkles that
can bite you badly.
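
(That said, if the goal is just to pull a few fields out of a plan rather
than build a C structure, a libxml-enabled server can already slice the
document with the built-in xpath() function - a minimal sketch, using the
namespace EXPLAIN (FORMAT XML) emits:)

   SELECT xpath('//e:Node-Type/text()', plan,
                ARRAY[ARRAY['e', 'http://www.postgresql.org/2009/explain']])
   FROM (SELECT '<explain xmlns="http://www.postgresql.org/2009/explain">
                   <Query><Plan>
                     <Node-Type>Seq Scan</Node-Type>
                     <Relation-Name>t</Relation-Name>
                   </Plan></Query>
                 </explain>'::xml AS plan) AS s;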


cheers

andrew




[HACKERS] pgstat wait timeout (RE: contrib/cache_scan)

2014-03-12 Thread Kouhei Kaigai
This is another topic from the main thread.

I noticed the following messages under a test case that
takes a heavy INSERT workload, provided by Haribabu.

[kaigai@iwashi ~]$ createdb mytest
[kaigai@iwashi ~]$ psql -af ~/cache_scan.sql mytest
\timing
Timing is on.
--cache scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));
CREATE TABLE
Time: 22.373 ms
truncate table test;
TRUNCATE TABLE
Time: 17.705 ms
insert into test values (generate_series(1,5000000), 'fujitsu', 1.1,
'Australia software tech pvt ltd');
WARNING:  pgstat wait timeout
WARNING:  pgstat wait timeout
WARNING:  pgstat wait timeout
WARNING:  pgstat wait timeout
   :

Once I got the above messages, write performance was dramatically
degraded, even though I didn't investigate in detail.

I could reproduce it on the latest master branch without my
enhancement, so I guess it is not a problem specific to me.
One other strangeness: right now, this problem only happens
in my virtual machine environment - VMware ESXi 5.5.0.
I couldn't reproduce it on my physical environment
(Fedora 20, Core i5-4570S).
Any ideas?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] Rowtype column and domain subfield with DEFAULT and NOT NULL constraint

2014-03-12 Thread Tom Lane
Julien Tachoires julien.tachoi...@dalibo.com writes:
 A customer has reported to us a strange behaviour regarding a rowtype
 column with a domain subfield:

Rowtypes in general do not support defaults for component fields.

 Is build_column_default() the right place to handle that case ?

It's unlikely that this is simple to change.  As an example, should
the default be inserted during a cast to the rowtype?  How about
plpgsql variable initialization?  What are you going to do about
scalar-NULL values of the rowtype (note that forbidding them is likely
to cause all sorts of collateral damage)?

But in any case, none of the examples you showed have anything to
do with build_column_default().  That would only get applied if
the INSERT's targetlist didn't mention col1 at all.
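
(Concretely, with the types from the report: the cast path already applies
the domain's NOT NULL, while a bare scalar NULL of the rowtype is accepted,
which is what makes the questions above awkward:)

   SELECT ROW(1, NULL)::my_int_rowtype;  -- errors: domain forbids NULL
   SELECT NULL::my_int_rowtype;          -- scalar NULL of the rowtype: allowed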

regards, tom lane




Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)

2014-03-12 Thread Tom Lane
Kouhei Kaigai kai...@ak.jp.nec.com writes:
 WARNING:  pgstat wait timeout
 WARNING:  pgstat wait timeout
 WARNING:  pgstat wait timeout
 WARNING:  pgstat wait timeout

 Once I got above messages, write performance is dramatically
 degraded, even though I didn't take detailed investigation.

 I could reproduce it on the latest master branch without my
 enhancement, so I guess it is not a problem something special
 to me.
 One other strangeness is, right now, this problem is only
 happen on my virtual machine environment - VMware ESXi 5.5.0.
 I couldn't reproduce the problem on my physical environment
 (Fedora20, core i5-4570S).

We've seen sporadic reports of that sort of behavior for years, but no
developer has ever been able to reproduce it reliably.  Now that you've
got a reproducible case, do you want to poke into it and see what's going
on?

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Tom Lane
Jaime Casanova ja...@2ndquadrant.com writes:
 On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But not sure how to define a unique
 index that allows (joe, db1) to coexist with (joe, db2) but not with
 (joe, 0).

 and why do you want that restriction?

So that if I say GRANT SELECT ON mytable TO joe, it's unambiguous which
user I'm granting to.  There should be at most one joe that can access
any given database.

If we don't have such a restriction, we'll need syntax kluges in GRANT,
ALTER OWNER, and probably other commands to disambiguate whether a local
or global user is meant.  Right now with the db_user_namespace mechanism,
you have to say GRANT ... TO joe@db1 if you were granting to a local
user in db1.  That's unambiguous all right, but it has little else to
recommend it.

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Yeah, what we really need is encapsulated per-DB users and local
 superusers.  I think everyone agrees that this is the goal, but nobody
 wants to put in the work to implement a generalized solution.

 Encapsulated would probably be the doable part. But a local superuser?
 Given that a superuser can load and run binaries, how would you propose to
 restrict that superuser from doing anything they want? And if you don't
 need that functionality, then how is it really different from being the
 database owner?

A local user with the superuser privilege would not be able to log into
another database, because superuser doesn't give you any extra privilege
until you've logged in.

Yeah, as superuser you could still break things as much as you pleased,
but not through SQL.

I share your doubts as to how useful such a concept actually is, but
it'd work if we had real local users.

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Magnus Hagander
On Wed, Mar 12, 2014 at 3:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  Yeah, what we really need is encapsulated per-DB users and local
  superusers.  I think everyone agrees that this is the goal, but nobody
  wants to put in the work to implement a generalized solution.

  Encapsulated would probably be the doable part. But a local superuser?
  Given that a superuser can load and run binaries, how would you propose
  to restrict that superuser from doing anything they want? And if you
  don't need that functionality, then how is it really different from
  being the database owner?

 A local user with the superuser privilege would not be able to log into
 another database, because superuser doesn't give you any extra privilege
 until you've logged in.

 Yeah, as superuser you could still break things as much as you pleased,
 but not through SQL.


You could COPY over the hba file or something like that :)  Or just
pg_read_binary_file() on the files in another database, which is accessible
through SQL as well.


 I share your doubts as to how useful such a concept actually is, but
 it'd work if we had real local users.


Such a user could also do interesting things like ALTER SYSTEM,
replication, backups, etc., all of which could be used to escalate
privileges beyond the local database.

So you'd have to somehow restrict those, at which point what's the point of
the property in the first place?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
 On Wed, Mar 12, 2014 at 3:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I share your doubts as to how useful such a concept actually is, but
  it'd work if we had real local users.

 
  Such a user could also do interesting things like ALTER SYSTEM,
  replication, backups, etc., all of which could be used to escalate
  privileges beyond the local database.

Probably DROP ROLE for global users too.

 So you'd have to somehow restrict those, at which point what's the point of
 the property in the first place?

We've been asked quite often for a not-quite-superuser, as in, one which
can bypass the normal GRANT-based permission system but which can't do
things like create untrusted functions or do other particularly bad
activities.  I can certainly see value in that.  Another oft-requested
option is a read-only role which pg_dump or an auditor could use.

Anyway, this is getting a bit far afield from the original discussion,
which looked like it might actually be heading somewhere interesting..

Thanks,

Stephen




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Wed, Mar 12, 2014 at 3:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 A local user with the superuser privilege would not be able to log into
 another database, because superuser doesn't give you any extra privilege
 until you've logged in.
 
 Yeah, as superuser you could still break things as much as you pleased,
 but not through SQL.

 You could COPY over the hba file or something like that :)  Or just
 pg_read_binary_file() on the files in another database, which is accessible
 through SQL as well.

More directly, he could alter pg_authid to make himself a not-local user.
But I don't see that it's our responsibility to prevent that.  As long as
the combination of features works in a straightforward way, I'm happy
with it --- and it would, AFAICS.

regards, tom lane




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Magnus Hagander mag...@hagander.net writes:
  You could COPY over the hba file or something like that :)  Or just
  pg_read_binary_file() on the files in another database, which is accessible
  through SQL as well.
 
 More directly, he could alter pg_authid to make himself a not-local user.
 But I don't see that it's our responsibility to prevent that.  As long as
 the combination of features works in a straightforward way, I'm happy
 with it --- and it would, AFAICS.

That depends on exactly what you mean by 'those features'.  There's
quite a difference between "you can set the superuser flag on a local
user and then that user will be a superuser" and "a local user with the
superuser flag will only be able to impact the database they are local
to".  I agree that there's nothing stopping us, at a technical level,
from having a local user which is marked as a superuser.

What Magnus and I are worried about is what the *implication* of such a
configuration is and what the user will think it means.  Specifically,
there will be an assumption that local users can only access or impact
the databases which they have access to, which wouldn't be accurate for
a local user who is a superuser.  Certainly, documenting this would help
with that, but consider how many warnings we'd have to put up: that it's
dangerous, that it isn't actually going to prevent that superuser from
accessing the other databases if they really wanted to, that it won't
prevent them from making a global superuser account, etc.  I'm just not
convinced that it's worth it for the feature of allowing a local account
to be set as superuser; I don't see a huge use-case there.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)

2014-03-12 Thread Tomas Vondra
On 12 March 2014, 14:54, Kouhei Kaigai wrote:
 It is another topic from the main thread,

 I noticed the following message with a test case that
 takes a heavy INSERT workload, provided by Haribabu.

 [kaigai@iwashi ~]$ createdb mytest
 [kaigai@iwashi ~]$ psql -af ~/cache_scan.sql mytest
 \timing
 Timing is on.
 --cache scan select 5 million
 create table test(f1 int, f2 char(70), f3 float, f4 char(100));
 CREATE TABLE
 Time: 22.373 ms
 truncate table test;
 TRUNCATE TABLE
 Time: 17.705 ms
 insert into test values (generate_series(1,5000000), 'fujitsu', 1.1,
 'Australia software tech pvt ltd');
 WARNING:  pgstat wait timeout
 WARNING:  pgstat wait timeout
 WARNING:  pgstat wait timeout
 WARNING:  pgstat wait timeout
:

 Once I got the above messages, write performance degraded
 dramatically, though I haven't investigated in detail.

 I could reproduce it on the latest master branch without my
 enhancement, so I guess it is not a problem specific to my patch.
 One other strangeness is that, right now, this problem only
 happens in my virtual machine environment (VMware ESXi 5.5.0).
 I couldn't reproduce the problem on my physical environment
 (Fedora20, core i5-4570S).
 Any ideas?

I've seen this happen in cases when it was impossible to write
the stat file for some reason. IIRC these are the basic causes I've
seen in the past:

(1) writing the stat copy failed - for example when the temporary stat
directory was placed in tmpfs, but it was too small

(2) writing the stat copy took too long - e.g. with tmpfs and memory
pressure, forcing the system to swap to free space for the stat copy

(3) IIRC the inquiry (backend -> postmaster) to write the file is sent
using UDP, which may be dropped in some cases (e.g. when the system is
overloaded), so the postmaster does not even know it should write the file

I'm not familiar with VMware ESXi virtualization, but I suppose it might
be relevant to all three causes.
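A quick sketch of how to check for (1) and (2) (stats_temp_directory is a
real GUC; what to compare it against is just a suggestion):

    -- where are the temporary stats files written?
    SHOW stats_temp_directory;
    -- then compare that filesystem's size (often a tmpfs) with the size
    -- of the stats files being written there, and watch for swapping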

regards
Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread Tom Lane
Rajeev rastogi rajeev.rast...@huawei.com writes:
 On 11 March 2014 19:52, Tom Lane wrote:
 After sleeping on it, I'm inclined to think we should continue to not
 print status for COPY TO STDOUT.  Aside from the risk of breaking
 scripts, there's a decent analogy to be made to SELECT: we don't print
 a status tag for that either.

 It is correct that SELECT does not print a conventional status tag, but it
 still prints the number of rows selected (e.g. "(2 rows)") along with the
 actual row values, which can very well be considered a kind of status.
 From this result, the user can tell how many rows have been selected.

 But in the case of COPY TO STDOUT, if we don't print anything, then the
 user has no direct way of finding out how many rows were copied from the
 table to STDOUT, which might have been very useful.

Uh, you mean other than the data rows that were just printed?  I fail
to see how this is much different from the SELECT case:

regression=# \copy int8_tbl to stdout
123	456
123	4567890123456789
4567890123456789	123
4567890123456789	4567890123456789
4567890123456789	-4567890123456789
regression=# 

(Note that I'm defining TO STDOUT from psql's perspective, ie the rows are
going to the queryFout file, which is the same place the COPY status would
get printed to.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Heikki Linnakangas

On 02/26/2014 11:25 PM, Alexander Korotkov wrote:

On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.comwrote:


On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:


On 02/09/2014 12:11 PM, Alexander Korotkov wrote:


I've rebased catalog changes with last master. Patch is attached. I've
rerun my test suite with both last master ('committed') and attached
patch ('ternary-consistent').



Thanks!


 method                 |     sum
------------------------+---------------
 committed              |  143491.71501
 fast-scan-11           |  126916.11199
 fast-scan-light        |  137321.211
 fast-scan-light-heikki |  138168.02801
 master                 |  446976.288
 ternary-consistent     |  125923.514

I explain the regression in the latest master by the change of
MAX_MAYBE_ENTRIES from 8 to 4.



Yeah, probably. I set MAX_MAYBE_ENTRIES to 8 in initial versions to make
sure we get similar behavior in Tomas' tests that used 6 search terms. But
I always felt that it was too large for real queries, once we have the
catalog changes, which is why I lowered it to 4 when committing. If an opclass
benefits greatly from fast scan, it should provide the ternary consistent
function, and not rely on the shim implementation.


  I'm not sure about the decision to reserve a separate procedure number for
ternary consistent. Probably it would be better to add a ginConfig method.
It would be useful for post-9.4 improvements.



Hmm, it might be useful for an opclass to provide both a boolean and a
ternary consistent function, if the boolean version is significantly more
efficient when all the arguments are TRUE/FALSE. OTOH, you could also do a
quick check through the array to see if there are any MAYBE arguments,
within the consistent function. But I'm inclined to keep the possibility to
provide both versions. As long as we support the boolean version at all,
there's not much difference in terms of the amount of code to support
having them both for the same opclass.

A ginConfig could be useful for many other things, but I don't think it's
worth adding it now.


What's the difference between returning GIN_MAYBE and GIN_TRUE+recheck?
We discussed that earlier, but didn't reach any conclusion. That needs to
be clarified in the docs. One possibility is to document that they're
equivalent. Another is to forbid one of them. Yet another is to assign a
different meaning to each.

I've been thinking that it might be useful to define them so that a MAYBE
result from the tri-consistent function means that it cannot decide if you
have a match or not, because some of the inputs were MAYBE. And
TRUE+recheck means that even if all the MAYBE inputs were passed as TRUE or
FALSE, the result would be the same, TRUE+recheck. The practical difference
would be that if the tri-consistent function returns TRUE+recheck, ginget.c
wouldn't need to bother fetching the other entries, it could just return
the entry with recheck=true immediately. While with MAYBE result, it would
fetch the other entries and call tri-consistent again. ginget.c doesn't
currently use the tri-consistent function that way - it always fetches all
the entries for a potential match before calling tri-consistent, but it
could. I had it do that in some of the patch versions, but Tomas' testing
showed that it was a big loss on some queries, because the consistent
function was called much more often. Still, something like that might be
sensible in the future, so it might be good to distinguish those cases in
the API now. Note that ginarrayproc is already using the return values like
that: in GinContainedStrategy, it always returns TRUE+recheck regardless of
the inputs, but in other cases it uses GIN_MAYBE.



Next revision of the patch is attached.

In this version an opclass should provide at least one consistent function:
binary or ternary. It's expected to achieve the best performance when an
opclass provides both of them. However, tests show the opposite :( I have
to recheck it carefully.



However, it's not!
This revision of the patch completes my test case in 123330 ms. This is
slightly faster than the previous revision.


Great. Committed, I finally got around to it.

Some minor changes: I reworded the docs and also updated the table of 
support functions in xindex.sgml. I refactored the query in 
opr_sanity.sql, to make it easier to read, and to check more carefully 
that the required support functions are present. I also added a runtime 
check to avoid crashing if neither is present.
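
As a rough illustration of what that check boils down to, assuming the 9.4
catalogs (the ternary variant is GIN support function number 6), this lists
the operator families that provide it:

    SELECT opf.opfname
      FROM pg_amproc p
      JOIN pg_opfamily opf ON opf.oid = p.amprocfamily
      JOIN pg_am am ON am.oid = opf.opfmethod
     WHERE am.amname = 'gin' AND p.amprocnum = 6;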


A few things we ought to still discuss:

* I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, 
rather than GIN_TRUE. The equivalent boolean version returns 'true' 
without recheck. Is that a typo, or was there some reason for the 
discrepancy?


* Come to think of it, I'm not too happy with the name GinLogicValue. 
It's too vague. It ought to include "ternary" or "tri-value" or 
something like that. How about renaming it to 

Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread Tom Lane
David Johnston pol...@yahoo.com writes:
 Tom Lane-2 wrote
 1. Treat this as a non-backwards-compatible change, and document that
 people have to use -q if they don't want the COPY tag in the output.
 I'm not sure this is acceptable.

 I've mostly used copy to with files and so wouldn't mind if STDOUT had the
 COPY n sent to it as long as the target file is just the copy contents.

I think you're missing the point: the case I'm concerned about is exactly
that the target file is psql's stdout, or more specifically the same place
that the COPY status would get printed to.

 2. Kluge ProcessResult so that it continues to not pass back a PGresult
 for the COPY TO STDOUT case, or does so only in limited circumstances
 (perhaps only if isatty(stdout), for instance).

 The main problem with this is that people will test by sending output to a
 TTY and see the COPY n.  Although if it can be done consistently then you
 minimize backward incompatibility and encourage people to enforce quiet mode
 while the command runs...

Yeah, the inconsistency of behavior that this solution would cause is not
a good thing.  My inclination now (see later traffic) is to suppress the
status report when the COPY destination is the same as pset.queryFout
(ie, a simple test whether the FILE pointers are equal).  This would
suppress the status report for \copy to stdout and COPY TO STDOUT
cases, and also for \copy to pstdout if you'd not redirected queryFout
with \o.
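
To make that concrete, a sketch of the intended behavior (int8_tbl as in my
earlier example; this is not what the code does today):

    regression=# \copy int8_tbl to stdout
    ... just the five data rows, no status line, since the data and the
    ... status would go to the same stream
    regression=# \copy int8_tbl to '/tmp/f'
    COPY 5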

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The case against multixact GUCs

2014-03-12 Thread Robert Haas
On Tue, Mar 11, 2014 at 3:14 PM, Josh Berkus j...@agliodbs.com wrote:
 In the 9.3.3 updates, we added three new GUCs to control multixact
 freezing.  This was an unprecedented move in my memory -- I can't recall
 ever adding a GUC to a minor release which wasn't backwards
 compatibility for a security fix.  This was a mistake.

I disagree.  I think it was the right decision.  I think it was a
mistake not including all of that stuff in the first place, and I
think it's good that we've now corrected that oversight.

 What makes these GUCs worse is that nobody knows how to set them; nobody
 on this list and nobody in the field.  Heck, I doubt 1 in 1000 of our
 users (or 1 in 10 people on this list) know what a multixact *is*.

Yeah, and that's a problem.   See, it turns out that we love periodic
full-table scans to freeze xmin so much that, in 9.3, we committed to
a design that requires us to make periodic full-table scans to freeze
xmax, too.  That may or may not have been a good decision, but at this
point we're stuck with it.  People are going to have to come to
understand the requirements there just as they do for freezing xmin.
Denying the user the ability to adjust the thresholds is not going to
accelerate the process of figuring out how they should be set.

 Further, there's no clear justification why these cannot be set to be
 the same as our other freeze ages (which our users also don't
 understand), or a constant calculated portion of them, or just a
 constant.

On most systems, mxid consumption will be much slower than xid
consumption because most users won't use tuple locks all that heavily.
 If we made all the defaults the same, then a full-table scan for xid
freezing would likely conclude that many or all of the mxids
weren't old enough to be frozen yet.  To the greatest extent possible,
we want full-table vacuums for either XID freezing or MXID freezing to
advance both relfrozenxid and relminmxid so that we don't go through
and freeze for one reason and then have to come back and freeze for
the other reasons shortly thereafter.  Nobody knows exactly how to set
the settings to make that happen just yet, so we need settings at
least until people can determine what values work well in practice -
and probably permanently, because unfortunately I think the answer is
likely workload-dependent.
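
For reference, the settings in question, with what I believe are the
shipped defaults:

    SHOW vacuum_multixact_freeze_min_age;        -- 5 million
    SHOW vacuum_multixact_freeze_table_age;      -- 150 million
    SHOW autovacuum_multixact_freeze_max_age;    -- 400 million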

 Since nobody anticipated someone adding a GUC in a minor
 release, there was no discussion of this topic that I can find; the new
 GUCs were added as a side effect of fixing the multixact vacuum issue.
  Certainly I would have raised a red flag if the discussion of the new
 GUCs hadn't been buried deep inside really long emails.

Alvaro did explicitly ask if anyone wanted to oppose back-patching.  I
don't think you can really blame him if you didn't see/read that
email.

 Adding new GUCs which nobody has any idea how to set, or can even
 explain to new users, is not a service to our users.  These should be
 removed.

The need for these GUCs is an outgrowth of the fkey locking stuff.
Unless we rip that out again or rewrite it completely, the need for
them doesn't seem likely to go away - so we're going to need to learn
to live with it, not pretend like it isn't a problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Heikki Linnakangas

On 03/12/2014 12:09 AM, Tomas Vondra wrote:

Hi all,

a quick question that just occurred to me - do you plan to tweak the cost
estimation for GIN indexes in this patch?

IMHO it would be appropriate, given the improvements and gains, but it
seems to me gincostestimate() was not touched by this patch.


Good point. We have made two major changes to GIN in this release cycle: 
changed the data page format and made it possible to skip items without 
fetching all the keys (fast scan). gincostestimate doesn't know about 
either change.


Adjusting gincostestimate for the more compact data page format seems 
easy. When I hacked on that, I assumed all along that gincostestimate 
doesn't need to be changed as the index will just be smaller, which will 
be taken into account automatically. But now that I look at 
gincostestimate, it assumes that the size of one item on a posting tree 
page is a constant 6 bytes (SizeOfIptrData), which is no longer true. 
I'll go fix that.


Adjusting for the effects of skipping is harder. gincostestimate needs 
to do the same preparation steps as startScanKey: sort the query keys by 
frequency, and call the consistent function to split the keys into 
"required" and "additional" sets. And then model that the "additional" 
entries only need to be fetched when the other keys match. That's doable 
in principle, but requires a bunch of extra code.


Alexander, any thoughts on that? It's getting awfully late to add new 
code for that, but it sure would be nice to somehow take fast scan into 
account.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)

2014-03-12 Thread Jeff Janes
On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Kouhei Kaigai kai...@ak.jp.nec.com writes:
  WARNING:  pgstat wait timeout
  WARNING:  pgstat wait timeout
  WARNING:  pgstat wait timeout
  WARNING:  pgstat wait timeout

  Once I got the above messages, write performance degraded
  dramatically, though I haven't investigated in detail.

  I could reproduce it on the latest master branch without my
  enhancement, so I guess it is not a problem specific to my patch.
  One other strangeness is that, right now, this problem only
  happens in my virtual machine environment (VMware ESXi 5.5.0).
  I couldn't reproduce the problem on my physical environment
  (Fedora20, core i5-4570S).

 We've seen sporadic reports of that sort of behavior for years, but no
 developer has ever been able to reproduce it reliably.  Now that you've
 got a reproducible case, do you want to poke into it and see what's going
 on?


I didn't know we were trying to reproduce it, nor that it was a mystery.
 Do anything that causes serious IO constipation, and you will probably see
that message.  For example, turn off synchronous_commit, run the default
pgbench transaction at a scale that is large but still fits comfortably in
RAM, and wait for a checkpoint sync phase to kick in.

The pgstat wait timeout is a symptom, not the cause.

Cheers,

Jeff


Re: [HACKERS] The case against multixact GUCs

2014-03-12 Thread Heikki Linnakangas

On 03/12/2014 06:26 PM, Robert Haas wrote:

On Tue, Mar 11, 2014 at 3:14 PM, Josh Berkus j...@agliodbs.com wrote:

In the 9.3.3 updates, we added three new GUCs to control multixact
freezing.  This was an unprecedented move in my memory -- I can't recall
ever adding a GUC to a minor release which wasn't backwards
compatibility for a security fix.  This was a mistake.


I disagree.  I think it was the right decision.  I think it was a
mistake not including all of that stuff in the first place, and I
think it's good that we've now corrected that oversight.


In hindsight, I think permanent multixids in their current form was a 
mistake. Before 9.3, the thing that made multixids special was that they 
could just be thrown away at a restart. They didn't need freezing. Now 
that they do, why not just use regular XIDs for them? We had to 
duplicate much of the wraparound and freezing logic for multixids that 
simply would not have been an issue if we had used regular XIDs instead.


We could've perhaps kept the old multixids for their original purpose, 
as transient xids that can be forgotten about after all the old 
snapshots are gone. But for the permanent ones, it would've been simpler 
if we handled them more like subxids; make them part of the same XID 
space as regular XIDs.


This is pretty hand-wavy of course, and it's too late now.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)

2014-03-12 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 We've seen sporadic reports of that sort of behavior for years, but no
 developer has ever been able to reproduce it reliably.  Now that you've
 got a reproducible case, do you want to poke into it and see what's going
 on?

 I didn't know we were trying to reproduce it, nor that it was a mystery.
  Do anything that causes serious IO constipation, and you will probably see
 that message.

The cases that are a mystery to me are where there's no reason to believe
that I/O is particularly overloaded.  But perhaps Kaigai-san's example is
only that ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] The case against multixact GUCs

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 12:45 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 03/12/2014 06:26 PM, Robert Haas wrote:
 On Tue, Mar 11, 2014 at 3:14 PM, Josh Berkus j...@agliodbs.com wrote:
 In the 9.3.3 updates, we added three new GUCs to control multixact
 freezing.  This was an unprecedented move in my memory -- I can't recall
 ever adding a GUC to a minor release which wasn't backwards
 compatibility for a security fix.  This was a mistake.

 I disagree.  I think it was the right decision.  I think it was a
 mistake not including all of that stuff in the first place, and I
 think it's good that we've now corrected that oversight.

 In hindsight, I think permanent multixids in their current form was a
 mistake. Before 9.3, the thing that made multixids special was that they
 could just be thrown away at a restart. They didn't need freezing. Now that
 they do, why not just use regular XIDs for them?

Well, the numbering of MXIDs is closely bound up with their storage
format.  To do what you're proposing, we'd need to invent some new way
of associating an XID-used-as-MXID with update XID, list of lockers,
and lock modes.  Which is certainly possible, but it's not obvious
that it's a good idea.

I *am* concerned that we didn't adequately weigh the costs of adding
another thing that has to be frozen before we did it.  Clearly, the
feature has a lot of benefit, or will once we've flushed out most of
the bugs.  But it's hard to say at this point how much the cost is
going to be, and I do think that's cause for concern.  But I'm not
convinced that unifying the XID and MXID spaces would have addressed
that concern to any measurable degree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Alexander Korotkov
On Wed, Mar 12, 2014 at 8:29 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 03/12/2014 12:09 AM, Tomas Vondra wrote:

 Hi all,

 a quick question that just occurred to me - do you plan to tweak the cost
 estimation for GIN indexes in this patch?

 IMHO it would be appropriate, given the improvements and gains, but it
 seems to me gincostestimate() was not touched by this patch.


 Good point. We have made two major changes to GIN in this release cycle:
 changed the data page format and made it possible to skip items without
 fetching all the keys (fast scan). gincostestimate doesn't know about
 either change.

 Adjusting gincostestimate for the more compact data page format seems
 easy. When I hacked on that, I assumed all along that gincostestimate
 doesn't need to be changed as the index will just be smaller, which will be
 taken into account automatically. But now that I look at gincostestimate,
 it assumes that the size of one item on a posting tree page is a constant 6
 bytes (SizeOfIptrData), which is no longer true. I'll go fix that.

 Adjusting for the effects of skipping is harder. gincostestimate needs to
 do the same preparation steps as startScanKey: sort the query keys by
 frequency, and call the consistent function to split the keys into
 "required" and "additional" sets. And then model that the "additional" entries only
 need to be fetched when the other keys match. That's doable in principle,
 but requires a bunch of extra code.

 Alexander, any thoughts on that? It's getting awfully late to add new code
 for that, but it sure would be nice to somehow take fast scan into account.


The preparation we do in startScanKey requires knowledge of the estimated
size of posting lists/trees. We make this estimate by traversing to the
leaf pages. I think gincostestimate is expected to be much cheaper. So we
probably need some rougher estimate there, don't we?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 David Johnston pol...@yahoo.com writes:
 Tom Lane-2 wrote
 1. Treat this as a non-backwards-compatible change, and document that
 people have to use -q if they don't want the COPY tag in the output.
 I'm not sure this is acceptable.

 I've mostly used copy to with files and so wouldn't mind if STDOUT had the
 COPY n sent to it as long as the target file is just the copy contents.

 I think you're missing the point: the case I'm concerned about is exactly
 that the target file is psql's stdout, or more specifically the same place
 that the COPY status would get printed to.

 2. Kluge ProcessResult so that it continues to not pass back a PGresult
 for the COPY TO STDOUT case, or does so only in limited circumstances
 (perhaps only if isatty(stdout), for instance).

 The main problem with this is that people will test by sending output to a
 TTY and see the COPY n.  Although if it can be done consistently then you
 minimize backward incompatibility and encourage people to enforce quiet mode
 while the command runs...

 Yeah, the inconsistency of behavior that this solution would cause is not
 a good thing.  My inclination now (see later traffic) is to suppress the
 status report when the COPY destination is the same as pset.queryFout
 (ie, a simple test whether the FILE pointers are equal).  This would
 suppress the status report for \copy to stdout and COPY TO STDOUT
 cases, and also for \copy to pstdout if you'd not redirected queryFout
 with \o.

This is reasonably similar to what we already do for SELECT, isn't it?
 I mean, the server always sends back a command tag, but psql
sometimes opts not to print it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Alexander Korotkov
On Wed, Mar 12, 2014 at 8:02 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 02/26/2014 11:25 PM, Alexander Korotkov wrote:

 On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com
 wrote:

  On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

  On 02/09/2014 12:11 PM, Alexander Korotkov wrote:

  I've rebased catalog changes with last master. Patch is attached. I've
 rerun my test suite with both last master ('committed') and attached
 patch ('ternary-consistent').


 Thanks!


  method                 |     sum
 ------------------------+---------------
  committed              |  143491.71501
  fast-scan-11           |  126916.11199
  fast-scan-light        |  137321.211
  fast-scan-light-heikki |  138168.02801
  master                 |  446976.288
  ternary-consistent     |  125923.514

 I explain the regression in the latest master by the change of
 MAX_MAYBE_ENTRIES from 8 to 4.


 Yeah, probably. I set MAX_MAYBE_ENTRIES to 8 in initial versions to make
 sure we get similar behavior in Tomas' tests that used 6 search terms.
 But
 I always felt that it was too large for real queries, once we have the
 catalog changes, which is why I lowered it to 4 when committing. If an
 opclass
 benefits greatly from fast scan, it should provide the ternary
 consistent
 function, and not rely on the shim implementation.


   I'm not sure about the decision to reserve a separate procedure number for
 ternary consistent. Probably it would be better to add a ginConfig method.
 It would be useful for post-9.4 improvements.


 Hmm, it might be useful for an opclass to provide both a boolean and a
 ternary consistent function, if the boolean version is significantly
 more
 efficient when all the arguments are TRUE/FALSE. OTOH, you could also
 do a
 quick check through the array to see if there are any MAYBE arguments,
 within the consistent function. But I'm inclined to keep the
 possibility to
 provide both versions. As long as we support the boolean version at all,
 there's not much difference in terms of the amount of code to support
 having them both for the same opclass.

 A ginConfig could be useful for many other things, but I don't think
 it's
 worth adding it now.


 What's the difference between returning GIN_MAYBE and GIN_TRUE+recheck?
 We discussed that earlier, but didn't reach any conclusion. That needs
 to
 be clarified in the docs. One possibility is to document that they're
 equivalent. Another is to forbid one of them. Yet another is to assign a
 different meaning to each.

 I've been thinking that it might be useful to define them so that a
 MAYBE
 result from the tri-consistent function means that it cannot decide if
 you
 have a match or not, because some of the inputs were MAYBE. And
 TRUE+recheck means that even if all the MAYBE inputs were passed as
 TRUE or
 FALSE, the result would be the same, TRUE+recheck. The practical
 difference
 would be that if the tri-consistent function returns TRUE+recheck,
 ginget.c
 wouldn't need to bother fetching the other entries, it could just return
 the entry with recheck=true immediately. While with MAYBE result, it
 would
 fetch the other entries and call tri-consistent again. ginget.c doesn't
 currently use the tri-consistent function that way - it always fetches
 all
 the entries for a potential match before calling tri-consistent, but it
 could. I had it do that in some of the patch versions, but Tomas'
 testing
 showed that it was a big loss on some queries, because the consistent
 function was called much more often. Still, something like that might be
 sensible in the future, so it might be good to distinguish those cases
 in
 the API now. Note that ginarrayproc is already using the return values
 like
 that: in GinContainedStrategy, it always returns TRUE+recheck
 regardless of
 the inputs, but in other cases it uses GIN_MAYBE.



 Next revision of the patch is attached.

 In this version an opclass should provide at least one consistent function:
 binary or ternary. It's expected to achieve the best performance when an
 opclass provides both of them. However, tests show the opposite :( I have
 to recheck it carefully.


 However, it's not!
 This revision of the patch completes my test case in 123330 ms. This is
 slightly faster than the previous revision.


 Great. Committed, I finally got around to it.

 Some minor changes: I reworded the docs and also updated the table of
 support functions in xindex.sgml. I refactored the query in opr_sanity.sql,
 to make it easier to read, and to check more carefully that the required
 support functions are present. I also added a runtime check to avoid
 crashing if neither is present.

 A few things we ought to still discuss:

 * I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE,
 rather than GIN_TRUE. The equivalent boolean version returns 'true' without
 recheck. Is that a typo, or was there some reason for the discrepancy?


Actually, 

Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 1:52 PM, Alexander Korotkov
aekorot...@gmail.com wrote:
 * This patch added a triConsistent function for array and tsvector
 opclasses. Were you planning to submit a patch to do that for the rest of
 the opclasses, like pg_trgm? (it's getting awfully late for that...)

 Yes. I can try to get it into 9.4 if it's possible.

It seems to me that we'd be wise to push that to 9.5 at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical decoding documentation?

2014-03-12 Thread Robert Haas
On Tue, Mar 11, 2014 at 4:16 PM, Andres Freund and...@2ndquadrant.com wrote:
 Could you perhaps commit the attached patch fixing the issues you
 mentioned?

I committed this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Josh Berkus
On 03/12/2014 12:22 AM, Magnus Hagander wrote:
 On Mar 12, 2014 1:46 AM, Josh Berkus j...@agliodbs.com wrote:
 Yeah, what we really need is encapsulated per-DB users and local
 superusers.  I think everyone agrees that this is the goal, but nobody
 wants to put in the work to implement a generalized solution.

 
 Encapsulated would probably be the doable part. But local superuser? Given
 that a superuser can load and run binaries, how would you propose you
 restrict that superuser from doing anything they want? And if you don't
 need that functionality, then how's it really different from being the
 database owner?

Well, if you really want my "I want a pony" list:

Local superusers (maybe this concept needs another name) would be able
to do the following things in a *single* database:

1 change permissions for other users on that database and its objects
2 load extensions from a predefined .so directory / list
3 create/modify untrusted language functions
4 create per-database users and change their settings
5 change database settings (SET stuff)
6 NOT change their own user settings
7 NOT change any global users
8 NOT run SET PERSISTENT or other commands with global effect

The above is fairly similar to what Amazon currently offers for Postgres
RDS, except that they only have 1 database per instance, so clearly they
haven't worked out some of the security issues.

Now, obviously permission (3) could be used to escalate a local
superuser to global superuser permissions, so local superusers aren't
really a secure concept, unless you don't add any untrusted languages to
the list of allowed extensions.  Alternately, we could drop (3) from the
list of features.

Hmmm. On the other foot, though: all of 1, 2, 4 and 5 could conceivably
be done via a set of Security Definer functions loaded into the
database, with a lot less complexity and security risk.  So if we're
sacrificing untrusted languages, then we really don't need a local
superuser at all; just a bunch of variadic SD functions which can handle
user creation and permissions issues. It would limit the permissions
syntax we can express, but not critically so in my opinion. This could
all be done as an extension, and would probably be the better for it.
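
A minimal sketch of the kind of function I mean (every name here is made
up; the point is just that the privileged operation is wrapped rather than
granted):

    CREATE FUNCTION create_local_user(uname text, pw text)
    RETURNS void LANGUAGE plpgsql SECURITY DEFINER AS $$
    BEGIN
      EXECUTE format('CREATE ROLE %I LOGIN PASSWORD %L', uname, pw);
    END;
    $$;
    REVOKE ALL ON FUNCTION create_local_user(text, text) FROM PUBLIC;
    GRANT EXECUTE ON FUNCTION create_local_user(text, text) TO db_admin;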

Assuming we have db-local users in the first place, of course.


On 03/11/2014 09:39 PM, David Johnston wrote:

 So if dave is already a user in db1, only that specific dave can be made a
 global user - any other dave would be disallowed.

Correct.  Well, unless the other dave was promoted first.  However, I
personally don't see any reason why we should even support promoting
users from local to global.  It adds complexity to the concept, and the
value of it eludes me.


 Would user + password be a better PK? Even with the obvious issue that
 the password part of the key can change.  user + password to database is
 properly a many-to-many relationship.  Or see next for something simpler.

I have no idea where you're going with this.

 A simple implementation would simply have the global users copied into
 each database as it is constructed.  There would also be a link between
 each of the database-specific users and the global master so that a
 password change issued against the global user propagates to all the
 database-specific versions.

That's called a "massive backwards-compatibility failure", and I don't
think we'll go any further with your concept.

 It would be nice if all users could be global and there were some way to
 give them permissions on databases.

Um, there is:
http://www.postgresql.org/docs/9.3/static/sql-grant.html

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2014-03-12 Thread Heikki Linnakangas

On 03/12/2014 07:42 PM, Alexander Korotkov wrote:

 The preparation we do in startScanKey requires knowledge of the estimated
 size of posting lists/trees. We make this estimate by traversing to the
 leaf pages. I think gincostestimate is expected to be much cheaper. So we
 probably need some rougher estimate there, don't we?


Yeah, maybe. We do something similar for b-tree MIN/MAX currently, but 
with a lot of keys, it could be a lot more expensive to traverse down to 
all of them.


I wonder if we could easily at least catch the common skewed cases, 
where e.g. the logic of the consistent function is to AND all the keys. 
The opclass would know that, but we would somehow need to pass down the 
information to gincostestimate.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Andrew Dunstan


On 03/12/2014 02:09 PM, Josh Berkus wrote:

On 03/12/2014 12:22 AM, Magnus Hagander wrote:

On Mar 12, 2014 1:46 AM, Josh Berkus j...@agliodbs.com wrote:

Yeah, what we really need is encapsulated per-DB users and local
superusers.  I think everyone agrees that this is the goal, but nobody
wants to put in the work to implement a generalized solution.


Encapsulated would probably be the doable part. But local superuser? Given
that a superuser can load and run binaries, how would you propose you
restrict that superuser from doing anything they want? And if you don't
need that functionality, then how's it really different from being the
database owner?

Well, if you really want my "I want a pony" list:

Local superusers (maybe this concept needs another name) would be able
to do the following things in a *single* database:

1 change permissions for other users on that database and its objects
2 load extensions from a predefined .so directory / list
3 create/modify untrusted language functions
4 create per-database users and change their settings
5 change database settings (SET stuff)
6 NOT change their own user settings
7 NOT change any global users
8 NOT run SET PERSISTENT or other commands with global effect


Item 3 gives away the store. AFAIK Amazon doesn't load untrusted 
languages, period.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-03-12 Thread Alvaro Herrera
MauMau escribió:

 The raw link only gave the mail in text format.  I hoped to import
 the mail into Windows Mail on Windows Vista, but I couldn't.

You might need to run a conversion process by which you transform the
raw file (in mbox format) into EML format or whatever it is that Windows
Mail uses.  I vaguely recall there are tools for this.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 Local superusers (maybe this concept needs another name) would be able
 to do the following things in a *single* database:
 
 1 change permissions for other users on that database and its objects

What about bypass permissions, a la what superuser does today?  Or are
you saying we'd only need to allow this new kind of role to bypass the
checks in the GRANT/REVOKE system?

 2 load extensions from a predefined .so directory / list

This would obviously have to be a curated list that avoids things like
'adminpack'...

 3 create/modify untrusted language functions

Uhh, I don't believe RDS allows you to do this..?

 4 create per-database users and change their settings

Presumably just for the 'local' DB?

 5 change database settings (SET stuff)

This can be done by the database-owner already, no?

 6 NOT change their own user settings

Don't think this is quite that simple (passwords?).

 7 NOT change any global users

What about role membership, wrt local vs. global roles?

 8 NOT run SET PERSISTENT or other commands with global effect

Indeed, or use 'COPY'..

 Now, obviously permission (3) could be used to escalate a local
 superuser to global superuser permissions, so local superusers aren't
 really a secure concept, unless you don't add any untrusted languages to
 the list of allowed extensions.  Alternately, we could drop (3) from the
 list of features.

That'd certainly be the main issue that I see with this proposal.  Doing
the rest but allowing untrusted languages would just get the naive in
trouble and not help those of us who want this, as we wouldn't be able
to use it.

  Hmmm. On the other foot, though: all of 1, 2, 4 and 5 could conceivably
 be done via a set of Security Definer functions loaded into the
 database, with a lot less complexity and security risk.

For my part, I don't see having everyone write their own set of SECURITY
DEFINER functions as being either less complex or less risky.  They're
also a lot less convenient to use.  That's not what RDS did, is it?  No,
and I agree with them on that part.

 On 03/11/2014 09:39 PM, David Johnston wrote:
  So if dave is already a user in db1, only that specific dave can be made a
  global user - any other dave would be disallowed.
 
 Correct.  Well, unless the other dave was promoted first.  However, I
 personally don't see any reason why we should even support promoting
 users from local to global.  It adds complexity to the concept, and the
 value of it eludes me.

Agreed.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-03-12 Thread Fujii Masao
On Tue, Feb 25, 2014 at 1:03 PM, Rajeev rastogi
rajeev.rast...@huawei.com wrote:
 On 04 February 2014 14:38, Myself wrote:


  On 4th February 2014, Christian Kruse wrote:
   On 04/02/14 12:38, Fujii Masao wrote:
    ISTM that the phrase "Request queue" is not used much around the
    lock.  Using the phrase "wait queue" or Simon's suggestion sounds
    better, to me at least.  Thought?
 
  Sounds reasonable to me. Attached patch changes messages to the
  following:
 
  Process holding the lock: A. Wait queue: B.
  Processes holding the lock: A, B. Wait queue: C.

 This looks good to me also.

 I have tested the revised patch and found it ready to be committed.

 I am marking this as Ready for Committer.

Committed!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Josh Berkus
On 03/12/2014 11:25 AM, Stephen Frost wrote:
 * Josh Berkus (j...@agliodbs.com) wrote:
 Local superusers (maybe this concept needs another name) would be able
 to do the following things in a *single* database:

 1 change permissions for other users on that database and its objects
 
 What about bypass permissions, ala what superuser does today?  Or are
 you saying we'd only need to allow this new kind of role to bypass the
 checks in the GRANT/REVOKE system?

More like what we have for the database owner role today.

 2 load extensions from a predefined .so directory / list
 
 This would obviously have to be a curated list that avoids things like
 'adminpack'...

It would need to be a list created by the global superuser.  By default,
nothing would be on it.

 4 create per-database users and change their settings
 
 Presumably just for the 'local' DB?

Right.

 5 change database settings (SET stuff)
 
 This can be done by the database-owner already, no?

Oh, point.

 6 NOT change their own user settings
 
 Don't think this is quite that simple (passwords?).

Well, we already limit what things users can change about themselves;
they can't promote themselves to superuser, for example.

 7 NOT change any global users
 
 What about role membership, wrt local vs. global roles?

Allowing global users to join local ROLEs is its own can'o'worms that
would merit an entire other thread.  Assuming, of course, that we had
local users in the first place.

 8 NOT run SET PERSISTENT or other commands with global effect
 
 Indeed, or use 'COPY'..

Yeah.  In theory, we should allow the local superuser to use COPY; in
practice, nobody will care because they'll be using client-side COPY
since the entire use-case for this is cloud-hosted DBs anyway.
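
(Client-side COPY meaning psql's \copy, which opens the file on the client
rather than on the server, e.g.:

    \copy mytable from 'data.csv' csv

with mytable and the file name standing in for whatever the user has.)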

 For my part, I don't see having everyone write their own set of SECURITY
 DEFINER functions as being either less complex or less risky.  They're
 also a lot less convenient to use.  That's not what RDS did, is it?  No,
 and I agree with them on that part.

I was thinking of having an extension in contrib, actually.  That is, a
canonical set of security definer functions.  But if you think it's
easier to implement the permissions restrictions in the utility functions
themselves, I wouldn't argue.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY table FROM STDIN doesn't show count tag

2014-03-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Mar 12, 2014 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 My inclination now (see later traffic) is to suppress the
 status report when the COPY destination is the same as pset.queryFout
 (ie, a simple test whether the FILE pointers are equal).  This would
 suppress the status report for \copy to stdout and COPY TO STDOUT
 cases, and also for \copy to pstdout if you'd not redirected queryFout
 with \o.

 This is reasonably similar to what we already do for SELECT, isn't it?
  I mean, the server always sends back a command tag, but psql
 sometimes opts not to print it.

Right, the analogy to SELECT gives some comfort that this is reasonable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 03/12/2014 02:09 PM, Josh Berkus wrote:
 Well, if you really want my "I want a pony" list:
 
 Local superusers (maybe this concept needs another name) would be able
 to do the following things in a *single* database:
 
 1 change permissions for other users on that database and its objects
 2 load extensions from a predefined .so directory / list
 3 create/modify untrusted language functions
 4 create per-database users and change their settings
 5 change database settings (SET stuff)
 6 NOT change their own user settings
 7 NOT change any global users
 8 NOT run SET PERSISTENT or other commands with global effect

 Item 3 gives away the store.

Indeed.  If you can do (3), you can break out of any of the other
constraints.  I suspect even (1) and/or (5) would be enough to mount
trojan-horse attacks against real superusers who visit your database.

I do not put any stock in the notion of a "constrained superuser".

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GSoC 2014

2014-03-12 Thread Ashutosh Dhundhara
Hello all,
I am Ashutosh Dhundhara from Thapar University, Patiala, India, presently 
pursuing a Bachelor's degree in Computer Science and Engineering.
This year I wish to work on PostgreSQL under the flagship of GSoC 2014, so 
please help me with the following questions:

1) Do I have to choose all of my ideas from the GSoC wiki page, or just one
of them?
2) What is the deadline for fixing bugs which will count towards the
selection procedure?

Please guide me on how to proceed.
 
Regards, 
Ashutosh Dhundhara

Re: [HACKERS] GSoC 2014

2014-03-12 Thread Atri Sharma
On Wed, Mar 12, 2014 at 8:05 PM, Ashutosh Dhundhara 
ashutoshdhundh...@yahoo.com wrote:

 Hello all,
 I am Ashutosh Dhundhara from Thapar University, Patiala, India, presently
 pursuing a Bachelor's degree in Computer Science and Engineering.
 This year I wish to work on PostgreSQL under the flagship of GSoC 2014, so
 please help me with the following questions:

 1) Do I have to choose all of my ideas from the GSoC wiki page, or just
 one of them?
 2) What is the deadline for fixing bugs which will count towards the
 selection procedure?

 Please guide me on how to proceed.



You can propose your own ideas as well. You can pick any number of ideas
from the GSoC 2014 wiki page and send proposals for them.

The deadline for proposals is next Friday, I believe.

Regards,

Atri


-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-12 Thread Robert Haas
On Tue, Mar 11, 2014 at 3:53 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Places where the tuple info is not available:

 LOG:  process 5788 still waiting for ShareLock on transaction 679 after 
 1014.000
  ms
 CONTEXT:  while attempting to operate in relation public.idx_t1 of 
 database
 postgres

The way the context message is assembled piecemeal in
XactLockTableWaitErrorContextCallback violates translation guidelines.
 You need to have completely separate strings for each variant.

"While attempting to operate in"?  That seems like unhelpful
weasel-wording.  I wonder if we ought to have separate messages for
each possibility, like "delete tuple (X,Y)" when called from
heap_delete(), "update tuple (X,Y)", "check exclusion constraint on
tuple (X,Y)" when called from check_exclusion_constraint, etc.  That
seems like it would be handy information to have.

Why can't check_exclusion_constraint, for example, pass the TID, so
that at least that much information is available?

I'm not very happy with the idea of including the tuple details only
when the level is less than ERROR.  For one thing, to do that in a way
that respects translatability guidelines will require two versions of
every string that would otherwise require only one.  For another
thing, it seems like it's punting a pretty important case.  If we're
gonna add context detail to lots of cases (instead of only the "still
waiting" case that people probably mostly care about) then we should
actually print the details more-or-less consistently in all of those
cases, not pretend like a solution that only works in the narrow case
is more general than it really is.  I think we should really try hard
to make the amount of detail provided as uniform as possible across
all the cases, even if that means removing information from some cases
where it might have been available.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Oleg Bartunov
Also, the GiST index is faster for create/update operations. I really hope
we will improve jsonb indexing in the next one or two releases. For now I'd
suggest people use expression indexes to index just the interesting keys, or
use GiST.
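
For example, a sketch (the table and key names are hypothetical):

    -- index only the key you actually query, instead of whole documents
    CREATE INDEX ON documents USING gin ((data -> 'tags'));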

On Wed, Mar 12, 2014 at 5:15 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 12 March 2014, 0:41, Peter Geoghegan wrote:
 On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote:
   ERROR:  index row size 1416 exceeds maximum 1352 for index "gin_idx"

 All index AMs have similar restrictions.

 Yes, I know, and I have no problem with restrictions in general. You may
 run into similar issues with btree indexes on text columns with long values,
 for example. The thing is that people don't generally index long text
 directly, because it usually does not make much sense; they index a
 tsvector etc. instead.

 But with jsonb it's more likely because indexing is one of the goodies (at
 least for me). And the discussions with several people interested in
 storing json data I had recently went often like this:

 me: It seems we'll have a better json datatype in 9.4.
 them: Nice!
 me: And it will be possible to do searches on arbitrary keys.
 them: Yay!
 me: And we actually got pretty significant improvements in GIN indexes.
 them: Awesome!
 me: But the values you may index need to be less than ~1500B.
 them: Bummer :-(
 me: Well, you can use GIST then.

 A good example of such a header is dkim-signature, which basically
 contains the whole message digitally signed with DKIM. The signature
 tends to be long and non-compressible, precisely because it is a
 signature.

 I'm wondering what's the best way around this, because I suspect many
 new users (especially those attracted by jsonb and GIN improvements)
 will run into this. Maybe not immediately, but eventually they'll try to
 insert a jsonb with a long value, and it will fail ...

 The jsonb_hash_ops operator class just stores a 32-bit integer hash
 value (it always sets the recheck flag, which only some of the other
 default GIN opclass' strategies do). It only supports containment, and
 not the full variety of operators that the default opclass supports,
 which is why it isn't the default. I think that in practice the
 general recommendation will be that when indexing at the top level,
 use jsonb_hash_ops. When indexing nested items, use the more flexible
 default GIN opclass. That seems like a pretty smart trade-off to me.

 OK, I'll look into the jsonb_hash_ops - that sounds more or less like what
 I was thinking about (and sure, storing hashes makes some operations
 impossible to support).

 The other thing I was thinking about is introducing some kind of upper
 limit for the value length - e.g. index just the first 1kB, or something
 like that. My experience is most values are way shorter, or actually
 differ in the first 1kB, so this should allow most decisions to be made.
 But I'm not really that familiar with how GIN works, so maybe this is
 nonsense.

 The more I think about it, the more inclined I am to lose GiST support
 entirely for the time being. It lets us throw out about 700 lines of C
 code, which is a very significant fraction of the total, removes the
 one open bug, and removes the least understood part of the code. The
 GiST opclass is not particularly compelling for this.

 I disagree with that. I see GiST as a simple fallback option for the cases
 I described. I wasn't able to create a GIN index because it exceeded the
 max item length, but a GiST index was created just fine. It was considerably
 slower, but it worked.

 Tomas



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Replication slots and footguns

2014-03-12 Thread Josh Berkus
All:

I was just reading Michael's explanation of replication slots
(http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
and realized there was something which had completely escaped me in the
pre-commit discussion:

select pg_drop_replication_slot('slot_1');
ERROR:  55006: replication slot slot_1 is already active
LOCATION:  ReplicationSlotAcquire, slot.c:339

What defines an active slot?

It seems like there's no way for a DBA to drop slots from the master if
it's rapidly running out of disk WAL space without doing a restart, and
there's no way to drop the slot for a replica which the DBA knows is
permanently offline but was connected earlier.  Am I missing something?

If I'm not, that seems like something to fix before 9.4 release.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Josh Berkus
On 03/12/2014 11:40 AM, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 03/12/2014 02:09 PM, Josh Berkus wrote:
 Well, if you really want my I want a pony list:

 Local superusers (maybe this concept needs another name) would be able
 to do the following things in a *single* database:

 1 change permissions for other users on that database and its objects
 2 load extensions from a predefined .so directory / list
 3 create/modify untrusted language functions
 4 create per-database users and change their settings
 5 change database settings (SET stuff)
 6 NOT change their own user settings
 7 NOT change any global users
 8 NOT run SET PERSISTENT or other commands with global effect
 
 Item 3 gives away the store.
 
 Indeed.  If you can do (3), you can break out of any of the other
 constraints.  I suspect even (1) and/or (5) would be enough to mount
 trojan-horse attacks against real superusers who visit your database.

... nobody reads my whole post, except Stephen.  :-(

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Andres Freund
Hi,

On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
 I was just reading Michael's explanation of replication slots
 (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
 and realized there was something which had completely escaped me in the
 pre-commit discussion:
 
 select pg_drop_replication_slot('slot_1');
 ERROR:  55006: replication slot slot_1 is already active
 LOCATION:  ReplicationSlotAcquire, slot.c:339
 
 What defines an active slot?

One with a connected walsender.

 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?

It's sufficient to terminate the walsender and then drop the slot. That
seems ok for now?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 On 03/12/2014 11:25 AM, Stephen Frost wrote:
  * Josh Berkus (j...@agliodbs.com) wrote:
  Local superusers (maybe this concept needs another name) would be able
  to do the following things in a *single* database:
 
  1 change permissions for other users on that database and its objects
  
  What about bypass permissions, ala what superuser does today?  Or are
  you saying we'd only need to allow this new kind of role to bypass the
  checks in the GRANT/REVOKE system?
 
 More like what we have for the database owner role today.

eh?  Can you clarify what you think the database owner role can do today
that actually matches what you're asking for above, or even what I'm
suggesting?

  2 load extensions from a predefined .so directory / list
  
  This would obviously have to be a curated list that avoids things like
  'adminpack'...
 
 It would need to be a list created by the global superuser.  By default,
 nothing would be on it.

Hmm, so this would be independent from what's installed on the system at
a file level.  That's an interesting idea.

  4 create per-database users and change their settings
  
  Presumably just for the 'local' DB?
 
 Right.
 
  5 change database settings (SET stuff)
  
  This can be done by the database-owner already, no?
 
 Oh, point.
 
  6 NOT change their own user settings
  
  Don't think this is quite that simple (passwords?).
 
 Well, we already limit what things users can change about themselves;
 they can't promote themselves to superuser, for example.

Just saying that we need to be more specific on this point.

  7 NOT change any global users
  
  What about role membership, wrt local vs. global roles?
 
 Allowing global users to join local ROLEs is its own can'o'worms that
 would merit an entire other thread.  

What would the caveats here be then..?  Local roles can join global
ones, but global roles can't join local ones?  I'm not sure how much of
an issue this will really be if they're all sharing one catalog with
different namespaces (eg: the '@db' stuff).

 Assuming, of course, that we had
 local users in the first place.

That's what this thread is about, isn't it..?

  8 NOT run SET PERSISTENT or other commands with global effect
  
  Indeed, or use 'COPY'..
 
 Yeah.  In theory, we should allow the local superuser to use COPY; in
 practice, nobody will care because they'll be using client-side COPY
 since the entire use-case for this is cloud-hosted DBs anyway.
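
 (By client-side COPY I mean psql's \copy, which reads and writes files
 on the client - e.g., with a made-up path:

   \copy mytable from '/tmp/data.csv' csv

 so no server-side file access is involved.)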

How could we allow the local superuser to use COPY?  They'd be able
to overwrite files in other databases, or possibly modify
postgresql.conf, etc, depending on the overall system environment.

  For my part- I don't see having everyone write their own set of SECURITY
  DEFINER functions as being either less complex or less risk.  They're
  also a lot less convenient to use.  That's not what RDS did, is it?  No,
  and I agree with them on that part.
 
 I was thinking of having an extension in contrib, actually.  That is, a
 canonical set of security definer functions.  But if you think it's
 easier to actually implement the permissions restrictions in the actual
 utility functions, I wouldn't argue.

Having it in contrib would be better than nothing, but I'd advocate
putting this in as part of our actual permissions model, along with the
read-only/auditor-type option.  Having this not-quite-superuser which is
able to read all tables, possibly modify any of them, without having to
change the permissions system could be extremely useful and would be
rather painful to implement through SD functions.

Thanks,

Stephen




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
 I was just reading Michael's explanation of replication slots
 (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
 and realized there was something which had completely escaped me in the
 pre-commit discussion:

 select pg_drop_replication_slot('slot_1');
 ERROR:  55006: replication slot slot_1 is already active
 LOCATION:  ReplicationSlotAcquire, slot.c:339

 What defines an active slot?

 One with a connected walsender.

 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?

 It's sufficient to terminate the walsender and then drop the slot. That
 seems ok for now?

Urgh.  That error message looks susceptible to improvement.  How about:

replication slot %s cannot be dropped because it is currently in use

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Josh Berkus
On 03/12/2014 12:03 PM, Andres Freund wrote:
 Hi,
 
 On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
 I was just reading Michael's explanation of replication slots
 (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
 and realized there was something which had completely escaped me in the
 pre-commit discussion:

 select pg_drop_replication_slot('slot_1');
 ERROR:  55006: replication slot slot_1 is already active
 LOCATION:  ReplicationSlotAcquire, slot.c:339

 What defines an active slot?
 
 One with a connected walsender.

In a world of network proxies, a walsender could be connected for
hours after the replica has ceased to exist.  Fortunately,
wal_sender_timeout is changeable on a reload.  We check for actual
standby feedback for the timeout, yes?

 
 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?
 
 It's sufficient to terminate the walsender and then drop the slot. That
 seems ok for now?

We have no safe way to terminate the walsender that I know of;
pg_terminate_backend() doesn't include walsenders last I checked.

So the procedure for this would be:

1) set wal_sender_timeout to some low value (1);
2) reload
3) call pg_drop_replication_slot('slotname')

Clumsy, but it will do for a first pass; we can make it better (for
example, by adding a force boolean to pg_drop_replication_slot) in 9.5.
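
In SQL, the clumsy version would look something like this (slot name made
up; assuming ALTER SYSTEM, or editing postgresql.conf by hand):

  alter system set wal_sender_timeout = '1s';
  select pg_reload_conf();
  -- once the walsender has timed out:
  select pg_drop_replication_slot('slotname');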

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Andres Freund
On 2014-03-12 15:18:04 -0400, Robert Haas wrote:
 On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund and...@2ndquadrant.com wrote:
  Hi,
 
  On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
  I was just reading Michael's explanation of replication slots
  (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
  and realized there was something which had completely escaped me in the
  pre-commit discussion:
 
  select pg_drop_replication_slot('slot_1');
  ERROR:  55006: replication slot slot_1 is already active
  LOCATION:  ReplicationSlotAcquire, slot.c:339
 
  What defines an active slot?
 
  One with a connected walsender.
 
  It seems like there's no way for a DBA to drop slots from the master if
  it's rapidly running out of disk WAL space without doing a restart, and
  there's no way to drop the slot for a replica which the DBA knows is
  permanently offline but was connected earlier.  Am I missing something?
 
  It's sufficient to terminate the walsender and then drop the slot. That
  seems ok for now?
 
 Urgh.  That error message looks susceptible to improvement.  How about:
 
 replication slot %s cannot be dropped because it is currently in use

I think that'd require duplicating some code between acquire and drop,
but how about replication slot %s is in use by another backend?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Andres Freund
On 2014-03-12 12:23:01 -0700, Josh Berkus wrote:
 On 03/12/2014 12:03 PM, Andres Freund wrote:
  Hi,
  
  On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
  I was just reading Michael's explanation of replication slots
  (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
  and realized there was something which had completely escaped me in the
  pre-commit discussion:
 
  select pg_drop_replication_slot('slot_1');
  ERROR:  55006: replication slot slot_1 is already active
  LOCATION:  ReplicationSlotAcquire, slot.c:339
 
  What defines an active slot?
  
  One with a connected walsender.
 
 In a world of network proxies, a walsender could be connected for
 hours after the replica has ceased to exist.  Fortunately,
 wal_sender_timeout is changeable on a reload.  We check for actual
 standby feedback for the timeout, yes?

Yep.

  It seems like there's no way for a DBA to drop slots from the master if
  it's rapidly running out of disk WAL space without doing a restart, and
  there's no way to drop the slot for a replica which the DBA knows is
  permanently offline but was connected earlier.  Am I missing something?
  
  It's sufficient to terminate the walsender and then drop the slot. That
  seems ok for now?
 
 We have no safe way to terminate the walsender that I know of;
 pg_terminate_backend() doesn't include walsenders last I checked.

SELECT pg_terminate_backend(pid) FROM pg_stat_replication;

works.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-03-12 Thread Andres Freund
On 2014-03-07 17:54:32 +0200, Heikki Linnakangas wrote:
 So there are some unexplained differences there, but based on these results,
 I'm still OK with committing the patch.

So, I am looking at this right now.

I think there are some minor things I'd like to see addressed:

1) I think there needs to be a good sized comment explaining why
   WaitXLogInsertionsToFinish() isn't racy due to the unlocked read at
   the beginning of LWLockWait(). I think it's safe because we're
   reading Insert->CurrBytePos inside a spinlock, and it will only ever
   increment. As SpinLockAcquire() has to be a read barrier we can
   assume that every skewed read in LWLockWait() will be for a lock
   protecting a newer insertingAt?
2) I am not particularly happy about the LWLockWait() LWLockWakeup()
   function names. They sound too much like a part of the normal lwlock
   implementation to me. But admittedly I don't have a great idea for
   a better naming scheme. Maybe LWLockWaitForVar(),
   LWLockWakeupVarWaiter()?
3) I am the wrong one to complain, I know, but the comments above struct
   WALInsertLock are pretty hard to read from the sentence structure.
4) WALInsertLockAcquire() needs to comment on acquiring/waking all but
   the last slot. Generally the trick of exclusive xlog insertion lock
   acquisition only really using the last lock could use a bit more
   docs.
5) WALInsertLockRelease() comments on the reset of insertingAt being
   optional, but I am not convinced that that's true anymore. If an
   exclusive acquisition isn't seen as 0 or
   INT64CONST(0x) by another backend we're in trouble,
   right? Absolutely not sure without thinking on it for longer than I
   can concentrate right now.
6) Pretty minor, but from a style POV it seems nicer to separate
   exclusive/nonexclusive out of WALInsertLockAcquire(). The cases don't
   share any code now.

A patch containing some trivial changes is attached...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 484b9c5..8a55c6b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1628,8 +1628,6 @@ WALInsertLockRelease(void)
 static void
 WALInsertLockWakeup(XLogRecPtr insertingAt)
 {
-	int			i;
-
 	if (holdingAllLocks)
 	{
 		/*
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index f88bf76..2695128 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -873,6 +873,9 @@ LWLockWait(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval)
 	int			extraWaits = 0;
 	bool		result = false;
 
+	/* can't be used with shared locks for now */
+	Assert(lock->shared == 0);
+
 	/*
 	 * Quick test first to see if it the slot is free right now.
 	 *
@@ -905,6 +908,8 @@ LWLockWait(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval)
 		SpinLockAcquire(lock->mutex);
 #endif
 
+		Assert(lock->shared == 0);
+
 		/* Is the lock now free, and if not, does the value match? */
 		if (lock->exclusive == 0)
 		{
@@ -1022,6 +1027,7 @@ LWLockWakeup(LWLock *l, uint64 *valptr, uint64 val)
 	SpinLockAcquire(lock->mutex);
 
 	/* we should hold the lock */
+	Assert(LWLockHeldByMe(l));
 	Assert(lock->exclusive == 1);
 
 	/* Update the lock's value */



Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 3:25 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-03-12 15:18:04 -0400, Robert Haas wrote:
 On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Hi,
 
  On 2014-03-12 12:00:25 -0700, Josh Berkus wrote:
  I was just reading Michael's explanation of replication slots
  (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
  and realized there was something which had completely escaped me in the
  pre-commit discussion:
 
  select pg_drop_replication_slot('slot_1');
  ERROR:  55006: replication slot slot_1 is already active
  LOCATION:  ReplicationSlotAcquire, slot.c:339
 
  What defines an active slot?
 
  One with a connected walsender.
 
  It seems like there's no way for a DBA to drop slots from the master if
  it's rapidly running out of disk WAL space without doing a restart, and
  there's no way to drop the slot for a replica which the DBA knows is
  permanently offline but was connected earlier.  Am I missing something?
 
  It's sufficient to terminate the walsender and then drop the slot. That
  seems ok for now?

 Urgh.  That error message looks susceptible to improvement.  How about:

 replication slot %s cannot be dropped because it is currently in use

 I think that'd require duplicating some code between acquire and drop,
 but how about replication slot %s is in use by another backend?

Sold.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Peter Geoghegan
On Wed, Mar 12, 2014 at 6:20 AM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm still not sure how would that look. Does that mean I'd have to create
 multiple GIN indexes - one for each possible key or something like that?
 Can you give an example?

It could mean that you're obliged to create multiple indexes, yes. For
an example, and to get a better sense of what I mean, look at the
documentation in the patch.

The idea that you're going to create one index on a jsonb, and it's
going to be able to usefully index a lot of different queries doesn't
seem practical for most use-cases. Mostly, people will have fairly
homogeneous json documents, and they'll want to index certain nested
fields common to all or at least a large majority of those documents.

By indexing entire jsonb datums, do you hope to get much benefit out
of the indexed values (as opposed to keys) being stored (in serialized
form) in the GIN index? Because you *are* indexing a large nested
structure as a value. Is that large nested structure going to appear
in your query predicate, or are you just going to subscript the jsonb
to get to the level that's of interest and query that? I'm pretty sure
that people want the latter. Are you sure that your complaint isn't
just that the default GIN opclass indexes values (as distinct from
keys) that are large and unwieldy, and not terribly useful?

I don't think expressional indexes are some kind of unfortunate work
around for a jsonb limitation. I think that they're the natural way to
approach indexing a nested structure in Postgres. MongoDB, for
example, does not magically index everything. You're still required to
make choices about indexing that consider the access patterns.
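
To be concrete, the kind of thing I have in mind (document shape and
names made up):

  create table mail (msg jsonb);

  -- index only the nested field you actually query:
  create index mail_from_idx on mail ((msg #>> '{headers,from}'));

  -- which then serves queries like:
  select * from mail where msg #>> '{headers,from}' = 'someone@example.com';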

-- 
Peter Geoghegan




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Peter Geoghegan
On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote:
 Also, GiST index is faster for create/update operations. I really hope we will
 improve jsonb indexing in the next one-two releases. For now I'd suggest 
 people
 index expressional indexes to index just interesting keys or use GiST.

When do you ever want to index non-interesting keys?

-- 
Peter Geoghegan




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Josh Berkus
On 03/12/2014 12:26 PM, Andres Freund wrote:
 On 2014-03-12 12:23:01 -0700, Josh Berkus wrote:
 On 03/12/2014 12:03 PM, Andres Freund wrote:
 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?

 It's sufficient to terminate the walsender and then drop the slot. That
 seems ok for now?

 We have no safe way to terminate the walsender that I know of;
 pg_terminate_backend() doesn't include walsenders last I checked.
 
 SELECT pg_terminate_backend(pid) FROM pg_stat_replication;

Aha!  Ok, I'll work on some documentation.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Josh Berkus
On 03/12/2014 12:34 PM, Robert Haas wrote:
 Urgh.  That error message looks susceptible to improvement.  How about:
 
  replication slot %s cannot be dropped because it is currently in use
 
  I think that'd require duplicating some code between acquire and drop,
  but how about replication slot %s is in use by another backend?
 Sold.

Wait ... before you go further ... I object to dropping the word
active from the error message.  The column is called active, and
that's where a DBA should look; that word needs to stay in the error
message.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Oleg Bartunov
On Thu, Mar 13, 2014 at 12:10 AM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote:
 Also, GiST index is faster for create/update operations. I really hope we 
 will
 improve jsonb indexing in the next one-two releases. For now I'd suggest 
 people
 index expressional indexes to index just interesting keys or use GiST.

 When do you ever want to index non-interesting keys?

A regular user may just index all keys.

I mean that a json document can contain keys which are never searched,
so there's no need to index them, and skipping them saves index size. We
could probably provide an option in CREATE INDEX to specify what to
index and what not to index, but that requires the planner to know that
information.



 --
 Peter Geoghegan




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Andrew Dunstan


On 03/12/2014 04:10 PM, Peter Geoghegan wrote:

On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote:

Also, GiST index is faster for create/update operations. I really hope we will
improve jsonb indexing in the next one-two releases. For now I'd suggest people
index expressional indexes to index just interesting keys or use GiST.

When do you ever want to index non-interesting keys?




The problem is: when do you know they are interesting?

One major use case for using treeish data types in the first place is 
that you don't know when you're designing the database exactly what 
shape the data will be. If you don't know that, then how are you 
supposed to know what in it will be interesting? It's somewhat analogous 
to full text indexing, where we don't know in advance what phrases or 
words will be interesting. Here, a key is the equivalent of a word and a 
key path or subpath is the equivalent of a phrase.


Maybe I'm dreaming, since I have no idea how to go about this sort of 
indexing, but it's where I'd like to see lots of effort.


I agree with Oleg that we need to be very creative about jsonb indexing. 
One of my hopes is that by going down the road we are on, we'll get much 
wider interest in this, and that both ideas and money might flow towards 
addressing it in a way that we probably wouldn't have seen otherwise.


cheers

andrew




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Thom Brown
On 12 March 2014 19:00, Josh Berkus j...@agliodbs.com wrote:
 All:

 I was just reading Michael's explanation of replication slots
 (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
 and realized there was something which had completely escaped me in the
 pre-commit discussion:

 select pg_drop_replication_slot('slot_1');
 ERROR:  55006: replication slot slot_1 is already active
 LOCATION:  ReplicationSlotAcquire, slot.c:339

 What defines an active slot?

 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?

I'm not clear on why dropping an active replication slot would
solve disk space problems related to WAL.  I thought it was inactive
slots that were the problem in this regard?

-- 
Thom




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Josh Berkus
Andrew, Peter:

Just so I'm clear on the limits here, lemme make sure I understand this:

a) GIN indexing is limited to ~1500 chars

b) The value, which includes everything other than the top level set
of keys, is one item as far as GIN is concerned.

Therefore: we are limited to indexing JSON where nothing below a
top-level key is more than 1500 bytes?

I'm asking for documentation purposes.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Peter Geoghegan
On Wed, Mar 12, 2014 at 1:37 PM, Andrew Dunstan and...@dunslane.net wrote:
 One major use case for using treeish data types in the first place is that
 you don't know when you're designing the database exactly what shape the
 data will be. If you don't know that, then how are you supposed to know what
 in it will be interesting? It's somewhat analogous to full text indexing,
 where we don't know in advance what phrases or words will be interesting.
 Here, a key is the equivalent of a word and a key path or subpath is the
 equivalent of a phrase.

You don't know exactly how, but you have some idea. The major benefit
is that you can add new things to new documents as the need arises,
and that's not a big deal, nor does it require a migration with DDL.
If we continue to take MongoDB as representative of how people will
use jsonb, they pretty strongly encourage the idea that you have to
have some structure or design. Google "mongodb schema design" to see
what I mean - you'll find plenty. It has more to do with making
querying the data possible than anything else. There is a limited
amount you can do with a bunch of documents that share little in
common in terms of their structure - what does a query (that can use
an index just in principle) even look like there?

The use case you describe here doesn't sound like something similar to
full text search. It sounds like something identical.

In any case, let's focus on what we have right now. I think that the
indexing facilities proposed here are solid. In any case they do not
preclude working on better indexing strategies as the need emerges.

-- 
Peter Geoghegan




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Andrew Dunstan


On 03/12/2014 04:58 PM, Peter Geoghegan wrote:

In any case, let's focus on what we have right now. I think that the
indexing facilities proposed here are solid. In any case they do not
preclude working on better indexing strategies as the need emerges.




I quite agree, didn't mean to suggest otherwise.

cheers

andrew





Re: [HACKERS] 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

2014-03-12 Thread Rukh Meski
Hi,

Here's an updated patch.  I had to push the LIMIT processing into ModifyTable 
to make the behaviour sane in parallel scenarios.  As usual, please ignore if 
you're busy with 9.4.  I will work on better docs and more tests from now on 
and am preparing to make a solid case for adding this.
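
For anyone catching up, the sort of statement this enables (names made
up):

  delete from job_queue
  where state = 'done'
  order by finished_at
  limit 1000;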



♜ 

update_delete_order_by_limit_v1.diff
Description: Binary data



Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Tomas Vondra
On 12.3.2014 20:40, Peter Geoghegan wrote:
 On Wed, Mar 12, 2014 at 6:20 AM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm still not sure how would that look. Does that mean I'd have to
 create multiple GIN indexes - one for each possible key or
 something like that? Can you give an example?
 
 It could mean that you're obliged to create multiple indexes, yes.
 For an example, and to get a better sense of what I mean, look at
 the documentation in the patch.

OK, will do.

 The idea that you're going to create one index on a jsonb, and it's
 going to be able to usefully index a lot of different queries doesn't
 seem practical for most use-cases. Mostly, people will have fairly
 homogeneous json documents, and they'll want to index certain nested
 fields common to all or at least a large majority of those documents.

I think that's an unfounded assumption. Many users actually have very
little control over the documents or queries - a nice example may be a
mail archive, with headers stored in hstore/jsonb. I have absolutely
no control over the headers or the queries.

But I think this is a feedback loop too - what if many users actually
want that functionality, but realize that expression indexes are not
sufficient for their needs and thus don't even try (and so we don't hear
about them)?

And my experience is that this is actually one of the very cool hstore
features - being able to index the whole structure and then run arbitrary
queries over it.

The only reason why I'm looking at jsonb is its improved support
for data types (especially arrays).

So I have my doubts about the claims that users have homogeneous
documents and only want to index some fields with expression indexes.

 By indexing entire jsonb datums, do you hope to get much benefit out
 of the indexed values (as opposed to keys) being stored (in serialized
 form) in the GIN index? Because you *are* indexing a large nested
 structure as a value. Is that large nested structure going to appear
 in your query predicate, or are you just going to subscript the jsonb
 to get to the level that's of interest and query that? I'm pretty sure
 that people want the latter. Are you sure that your complaint isn't
 just that the default GIN opclass indexes values (as distinct from
 keys) that are large and unwieldy, and not terribly useful?

No, I don't expect a large nested structure to appear in the query. And
I expect most people won't need that, although I can imagine queries
with @> doing that (not sure if that checks for equality or 'subset').

But I'm not sure I understand how's this related to my original post?

All I was asking was whether it wouldn't be enough to store a hash
instead of the original value, i.e. instead of this:

  {"from" : "j...@example.com",
   "to" : "j...@example.com",
   "content-type" : "text/plain; charset=us-ascii",
   "dkim-signature" : "very long value"}

this

  {129812 : 29382,
   459821 : 1029381,
   21083 : 102941,
   111390 : 129010292}

which would solve issues with the long values and might still support
the queries (with recheck, of course). I don't know if that's what
jsonb_hash_ops do or if it's even possible / compatible with GIN.
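
(The queries would still look the same, e.g. with a made-up headers
column:

  select * from mail
  where headers @> '{"content-type": "text/plain; charset=us-ascii"}';

with the recheck against the heap discarding any hash collisions.)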

 I don't think expressional indexes are some kind of unfortunate work
 around for a jsonb limitation. I think that they're the natural way to
 approach indexing a nested structure in Postgres. MongoDB, for
 example, does not magically index everything. You're still required to
 make choices about indexing that consider the access patterns.

For many usecases, expressional indexes are the right tool. But not for
all and I see no reason to just throw some tools away.

regards
Tomas





Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-03-12 Thread Heikki Linnakangas

On 03/04/2014 01:58 PM, Amit Kapila wrote:

On Mon, Mar 3, 2014 at 7:57 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

On 02/16/2014 01:51 PM, Amit Kapila wrote:


On Wed, Feb 5, 2014 at 5:29 PM, Heikki Linnakangas
hlinnakan...@vmware.com  wrote:


Thanks. I have to agree with Robert though that using the pglz encoding when
we're just checking for a common prefix/suffix is a pretty crappy way of
going about it [1].

As the patch stands, it includes the NULL bitmap when checking for a common
prefix. That's probably not a good idea, because it defeats the prefix
detection in the common case that you update a field from NULL to not-NULL
or vice versa.

Attached is a rewritten version, which does the prefix/suffix tests directly
in heapam.c, and adds the prefix/suffix lengths directly as fields in the
WAL record. Could you take one more look at this version to check if
I've missed anything?


I had verified the patch and found few minor points:
...


Fixed those.


One Question:
+ rdata[1].data = (char *) xlrec;
Earlier it seems to store the record header as the first segment rdata[0];
what's the reason for changing it?


I found the code easier to read that way. The order of rdata entries 
used to be:


0: xl_heap_update struct
1: full-page reference to oldbuf (no data)
2: xl_heap_header_len struct for the new tuple
3-7: logical decoding stuff

The prefix/suffix fields made that order a bit awkward, IMHO. They are 
logically part of the header, even though they're not part of the struct 
(they are documented in comments inside the struct). So they ought to 
stay together with the xl_heap_update struct. Another option would've 
been to move them after the xl_heap_header_len struct.


Note that this doesn't affect the on-disk format of the WAL record, 
because the moved rdata entry is just a full-page reference, with no 
payload of its own.



I have verified the patch by doing crash recovery for the scenarios below
and it worked fine:
a. no change in old and new tuple
b. all changed in new tuple
c. half changed (update half of the values to NULLS) in new tuple
d. only prefix same in new tuple
e. only suffix same in new tuple
f.  prefix-suffix same, other columns values changed in new tuple.


Thanks!


The conclusion is that the patch shows good WAL reduction and performance
improvement for favourable cases, without CPU overhead for non-favourable
cases.


Ok, great. Committed!

I left out the regression tests. It was good to have them while 
developing this, but I don't think there's a lot of value in including 
them permanently in the regression suite. Low-level things like the 
alignment-sensitive test are fragile, and can easily stop testing the 
thing they're supposed to test, depending on the platform and future 
changes in the code. And the current algorithm doesn't care much about 
alignment anyway.


- Heikki




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Tomas Vondra
On 12.3.2014 21:58, Peter Geoghegan wrote:
 
 The use case you describe here doesn't sound like something similar to
 full text search. It sounds like something identical.

I think this very depends on the definition of full text search.

 In any case, let's focus on what we have right now. I think that the
 indexing facilities proposed here are solid. In any case they do not
 preclude working on better indexing strategies as the need emerges.

+1

Tomas




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Peter Geoghegan
On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I think that's unfounded assumption. Many users actually have very
 little control over the documents or queries - a nice example may be the
 mail archive, with headers stored in a hstore/jsonb. I have absolutely
 no control over the headers or queries.

Maybe, but what do you want me to do to help them? Indexing a typical
jsonb field is a bad idea, unless you really do want something
essentially equivalent to full text search (which could be justified),
or unless you know ahead of time that your documents are not going to
be heavily nested. The whole basis of your complaints seems to be that
people won't know that at all.

 For many usecases, expressional indexes are the right tool. But not for
 all and I see no reason to just throw some tools away.

If the tool you're talking about throwing away is the GiST opclass, I
do not propose to throw that away. I don't think it's important enough
to justify inclusion in our first cut at this, especially given the
fact that the code has bugs, and is quite a bit more complex than GIN.
What's wrong with those reasons?

-- 
Peter Geoghegan




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Tomas Vondra
On 12.3.2014 21:55, Josh Berkus wrote:
 Andrew, Peter:
 
 Just so I'm clear on the limits here, lemme make sure I understand this:
 
 a) GIN indexing is limited to ~~1500chars

The exact message I get is this:

ERROR: index row size 1944 exceeds maximum 1352 for index tmp_idx

so it's 1352B. But IIRC this is closely related to the block size, so with
larger block sizes you'll get different limits. Also, this is a limit on
the compressed value, which makes it less user-friendly, as it's difficult
to predict whether a row is OK or not :-(

And I just discovered this:

  create table tmp (val jsonb);
  create index tmp_gin_idx on tmp using gin (val);
  insert into tmp
    select ('{"z" : "' || repeat('z', 1000000) || '"}')::jsonb;

which tries to insert a well-compressible string ('z' repeated 1e6 times),
and fails with this:

ERROR: index row requires 11472 bytes, maximum size is 8191

So I think it's quite difficult to give a simple and exact explanation in
the docs, other than "there are limits, but it's difficult to say when
you hit them".

Tomas




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Stephen Frost
* Tomas Vondra (t...@fuzzy.cz) wrote:
 So I think it's quite difficult to give a simple and exact explanation in
 the docs, other than "there are limits, but it's difficult to say when
 you hit them".

Arrays have more-or-less the same issue...
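
For instance (untested sketch - a long, poorly-compressible array element
should hit the same index row size error):

  create table ta (vals text[]);
  create index ta_gin_idx on ta using gin (vals);
  insert into ta
    select array[(select string_agg(md5(i::text), '')
                  from generate_series(1, 100) i)];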

Thanks,

Stephen




Re: [HACKERS] jsonb and nested hstore

2014-03-12 Thread Tomas Vondra
On 12.3.2014 22:43, Peter Geoghegan wrote:
 On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I think that's unfounded assumption. Many users actually have very
 little control over the documents or queries - a nice example may be the
 mail archive, with headers stored in a hstore/jsonb. I have absolutely
 no control over the headers or queries.
 
 Maybe, but what do you want me to do to help them? Indexing a
 typical jsonb field is a bad idea, unless you really do want
 something essentially equivalent to full text search (which could be
 justified), or unless you know ahead of time that your documents are
 not going to be heavily nested. The whole basis of your complaints
 seems to be that people won't know that at all.

Well, I would be quite happy with the GIN indexing without the limit I
ran into. I don't think we need to invent something entirely new.

You're right that the index is pretty futile with a condition matching a
single field/value combination. But what if I'm doing a query with
multiple such conditions, and the combination matches just a small
fraction of rows? A GIN index works for that (and the patches from
Alexander improve this case tremendously, IIRC).
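
For example (table and header names made up):

  select * from mail
  where headers @> '{"content-type": "text/plain"}'
    and headers @> '{"precedence": "list"}';

Each condition alone may match a lot of rows, but GIN can combine the
matching entry streams before touching the heap.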

I still don't understand how this is similar to fulltext - that seems
pretty unsuitable for a treeish structure, assuming you can't flatten
it. Which you can't, if the queries use paths to access just parts of
the json value.

 For many use cases, expressional indexes are the right tool. But not for
 all, and I see no reason to just throw some tools away.
 
 If the tool you're talking about throwing away is the GiST opclass, I
 do not propose to throw that away. I don't think it's important enough
 to justify inclusion in our first cut at this, especially given the
 fact that the code has bugs, and is quite a bit more complex than GIN.
 What's wrong with those reasons?

Meh, I accidentally mixed two responses :-/

I have no problem with expression indexes, but they're not a good solution
to all problems. I certainly can't use them to achieve what I'd like, and
I disagree with your assumptions that it doesn't make sense to index
everything / non-interesting keys, or that the documents have a
well-defined structure. I can live with larger / less efficient indexes
on all fields.

Regarding GiST - I understand your concerns about complexity, and you
may be right that not shipping it now is preferable to shipping it with
bugs. The thing is it doesn't have the issues with value lengths that
prevent me from using GIN, and although GiST is slower, it's at least
some indexing. But maybe jsonb_hash_ops will work; I haven't tried yet.

regards
Tomas




Re: [HACKERS] 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

2014-03-12 Thread Rukh Meski
Oops.  Of course I shouldn't try to change how INSERT works.  Latest
version attached.



♜

update_delete_order_by_limit_v2.diff
Description: Binary data



Re: [HACKERS] [PATCH] Store Extension Options

2014-03-12 Thread Robert Haas
On Mon, Mar 10, 2014 at 9:33 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I haven't touched pg_dump yet, but if this proposed design sits well
 with everyone, my intention is that the dump output will contain the
 pg_register_option_namespace() calls necessary so that a table
 definition will be able to do the SET calls to set the values the
 original table has, and succeed.  In other words, restoring a dump will
 preserve the values you had, without a need of having the module loaded
 in the new server.  I think this is what was discussed.  Robert, do you
 agree?

No, I wasn't imagining anything like pg_register_option_namespace().
My thought was that you'd need to have any relevant modules loaded at
restore time.  In essence, patching in a new option via an extension
module would work about like adding one by patching the core code: you
need a server version that supports that option in order to set it.

I don't like the idea of using reloptions to let people attach
arbitrary unvalidated settings to tables.  I consider the way things
work with GUCs to be a bug, not a feature, and definitely not
something I want to propagate into every other area of the system
where the underlying storage format happens to allow it.

I also kind of think that what you're going to find if you try to
press forward with the pg_register_option_namespace() idea is that
what you really want is CREATE RELOPTION NAMESPACE, ALTER RELOPTION
NAMESPACE, DROP RELOPTION NAMESPACE.  Short of that, you're going to
end up with a bunch of kludges, I suspect.  And some kind of real DDL
syntax (with better naming) is OK with me, but as you observed
elsewhere on the thread, now you're looking at a new catalog and a
bunch more complexity.

I kind of think that this is too half-baked for 9.4 and we ought to
punt it to 9.5.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 5:30 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Ok, great. Committed!

Awesome.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Michael Paquier
On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown t...@linux.com wrote:
 On 12 March 2014 19:00, Josh Berkus j...@agliodbs.com wrote:
 All:

 I was just reading Michael's explanation of replication slots
 (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
 and realized there was something which had completely escaped me in the
 pre-commit discussion:

 select pg_drop_replication_slot('slot_1');
 ERROR:  55006: replication slot slot_1 is already active
 LOCATION:  ReplicationSlotAcquire, slot.c:339

 What defines an active slot?

 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?

 I'm not clear on why would dropping an active replication slot would
 solve disk space problems related to WAL.  I thought it was inactive
 slots that were the problem in this regard?
You could still have an active slot with a standby that is not able to
catch up AFAIK.
--
Michael




Re: [HACKERS] Rowtype column and domain subfield with DEFAULT and NOT NULL constraint

2014-03-12 Thread Michael Paquier
On Wed, Mar 12, 2014 at 11:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Julien Tachoires julien.tachoi...@dalibo.com writes:
 A customer has reported us a strange behaviour regarding a rowtype
 column with a domain subfield:

 Rowtypes in general do not support defaults for component fields.
And what about adding a TODO item?
"Support default values for component fields of rowtypes"

We could also, for the time being, improve the documentation to mention
that with some examples - for instance some more content on the INSERT
page.
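
For instance, a sketch of what might be worth showing there:

  create type rt as (x int);
  create table t (r rt);
  -- no per-field default is possible; a default for the whole
  -- composite column is the closest workaround:
  alter table t alter column r set default row(42)::rt;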
Regards,
-- 
Michael




Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-03-12 Thread Christian Kruse
Hi,

On 13/03/14 03:27, Fujii Masao wrote:
 Committed!

Thank you very much!

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 9:19 AM, Andres Freund and...@2ndquadrant.com wrote:
 Isn't this just a case of creating a suitable operator and an exclusion
 constraint?  Defining the constraint in BKI might require extra
 infrastructure, but it should be possible.

 Except that we don't have the infrastructure to perform such checks
 (neither partial, nor expression indexes, no exclusion constraints) on
 system tables atm. So it's not a entirely trivial thing to do.

I'm probably woefully underinformed here, but it seems like getting
exclusion constraints working might be simpler than partial indexes or
expression indexes, because both of those involve being able to
evaluate arbitrary predicates, whereas exclusion constraints just
involve invoking index access methods to look for conflicting rows via
smarts built into your index AM.  The latter seems to involve less
risk of circularity (but I might be wrong).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Thom Brown
On 12 March 2014 23:17, Michael Paquier michael.paqu...@gmail.com wrote:
 On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown t...@linux.com wrote:
 On 12 March 2014 19:00, Josh Berkus j...@agliodbs.com wrote:
 All:

 I was just reading Michael's explanation of replication slots
 (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/)
 and realized there was something which had completely escaped me in the
 pre-commit discussion:

 select pg_drop_replication_slot('slot_1');
 ERROR:  55006: replication slot slot_1 is already active
 LOCATION:  ReplicationSlotAcquire, slot.c:339

 What defines an active slot?

 It seems like there's no way for a DBA to drop slots from the master if
 it's rapidly running out of disk WAL space without doing a restart, and
 there's no way to drop the slot for a replica which the DBA knows is
 permanently offline but was connected earlier.  Am I missing something?

 I'm not clear on why would dropping an active replication slot would
 solve disk space problems related to WAL.  I thought it was inactive
 slots that were the problem in this regard?
 You could still have an active slot with a standby that is not able to
 catch up AFAIK.

In that scenario, why would one wish to drop the replication slot?  If
it can't keep up, dropping the replication slot would likely mean
you'd orphan the standby due to the primary no longer holding on to
the necessary WAL, and the standby is then useless.  In which case, if
the standby is causing such problems, why not shut down that standby,
thereby effectively decommissioning it, then delete the slot?

-- 
Thom




[HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6

2014-03-12 Thread Robert Haas
On the pgsql-packagers list, there has been some (OT for that list)
discussion of whether commit 9a57858f1103b89a5674f0d50c5fe1f756411df6
is sufficiently serious to justify yet another immediate minor release
of 9.3.x.  The relevant questions seem to be:

1. Is it really bad?

2. Does it affect a lot of people or only a few?

3. Are there more, equally bad bugs that are unfixed, or perhaps even
unreported, yet?

Obviously, we don't want to leave serious bugs unpatched.  On the
other hand, as Tom pointed out in that discussion, releases are a lot
of work, and we can't do them for every commit.

Discuss.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Postgresql XML parsing

2014-03-12 Thread Kyotaro HORIGUCHI
Hello,

 On 03/12/2014 09:36 AM, Ashoke wrote:
  Hi,
 
 I am working on adding functionality to PostgreSQL. I need to parse
 the XML-format query plan (produced by PostgreSQL v9.3) and save it in
 a simple data structure (say a C structure). I was wondering if
...
 The only XML parsing we have is where Postgres is built with libxml,
 in which case we use its parser. But query plan XML is delivered to a
 client (or a log file, which means more or less the same thing
 here).

As a -hackers matter, EXPLAIN output can be obtained from
ExplainPrintPlan() in any format inside the backend. I don't know if
that fits your case, though.

 If you want to parse it then it should be parsed in the client
 - that's why we provide it. Inside postgres I don't see a point in
 parsing the XML rather than handling the query plan directly.
 
 The worst possible option would be to make a hand-cut XML parser,
 either in the client or the server - XML parsing has all sorts of
 wrinkles that can bite you badly.

I agree with that. If XML input is not essential, the JSON format would
be easier to parse than XML. 9.3 already has JSON parser infrastructure
available for that purpose.
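
For example, the client can simply request the plan as JSON:

  explain (format json) select * from pg_class;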

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] [PATCH] Store Extension Options

2014-03-12 Thread Josh Berkus
On 03/12/2014 03:58 PM, Robert Haas wrote:
 I don't like the idea of using reloptions to let people attach
 arbitrary unvalidated settings to tables.  I consider the way things
 work with GUCs to be a bug, not a feature, and definitely not
 something I want to propagate into every other area of the system
 where the underlying storage format happens to allow it.

+1.  Relopts are one of the uglier warts we have.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Replication slots and footguns

2014-03-12 Thread Josh Berkus
On 03/12/2014 04:52 PM, Thom Brown wrote:
 On 12 March 2014 23:17, Michael Paquier michael.paqu...@gmail.com wrote:
 On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown t...@linux.com wrote:
 I'm not clear on why would dropping an active replication slot would
 solve disk space problems related to WAL.  I thought it was inactive
 slots that were the problem in this regard?
 You could still have an active slot with a standby that is not able to
 catch up AFAIK.
 
 In that scenario, why would one wish to drop the replication slot?  If
 it can't keep up, dropping the replication slot would likely mean
 you'd orphan the standby due to the primary no longer holding on to
 the necessary WAL, and the standby is then useless.  In which case, if
 the standby is causing such problems, why not shut down that standby,
 thereby effectively decommissioning it, then delete the slot?

The problem I'm anticipating is that the replica server is actually
offline, but the master doesn't know it yet.  So here's the situation:

1. replica with a slot dies
2. wal logs start piling up and master is running low on disk space
3. replica is still marked active because we're waiting for the default
tcp timeout (3+ hours) or for the proxy to kill the connection (which may
never happen).

But as Andres has shown, there's a two ways to fix the above.  So we're
in good shape.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] db_user_namespace a temporary measure

2014-03-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Mar 12, 2014 at 9:19 AM, Andres Freund and...@2ndquadrant.com wrote:
 Except that we don't have the infrastructure to perform such checks
 (neither partial, nor expression indexes, no exclusion constraints) on
 system tables atm. So it's not a entirely trivial thing to do.

 I'm probably woefully underinformed here, but it seems like getting
 exclusion constraints working might be simpler than partial indexes or
 expression indexes, because both of those involve being able to
 evaluate arbitrary predicates, whereas exclusion constraints just
 involve invoking index access methods to look for conflicting rows via
 smarts built into your index AM.  The latter seems to involve less
 risk of circularity (but I might be wrong).

You might be right.  I don't think anyone's ever looked at what it
would take to support that particular case.  We have looked at the
other cases and run away screaming ... but I think that was before
exclusion constraints existed.

regards, tom lane



