Re: [HACKERS] ExecutorCheckPerms() hook
The attached patch is a revised one for DML permission checks. List of updates:
- Source code comments in the patched functions were revised.
- ExecCheckRTPerms() and ExecCheckRTEPerms() were moved to aclchk.c, and renamed to chkpriv_relation_perms() and chkpriv_rte_perms().
- The checker now takes a second argument (bool abort) as a hint for its behavior on access violations.
- It also returns AclResult, instead of bool.
- I assume RI_Initial_Check() is not broken right now, so this patch just reworks DML permission checks without any bugfixes.
- The ESP hook was moved from ExecCheckRTEPerms() to ExecCheckRTPerms().
- DoCopy() and RI_Initial_Check() call the checker function with list_make1(&rte), instead of &rte.
- In DoCopy(), required_access is used to store either ACL_SELECT or ACL_INSERT; it is initialized at the head of the function.
- In DoCopy(), selectedCols or modifiedCols of the RTE is initialized depending on is_from, instead of using columnsSet.

ToDo:
- makeRangeTblEntry() stuff to allocate an RTE node with the given parameters is not done yet.

Thanks,

(2010/05/26 12:04), KaiGai Kohei wrote:
> (2010/05/26 11:12), Stephen Frost wrote:
>> * KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
>>>> #2: REALLY BIG ISSUE- You've added ExecutorCheckPerms_hook as part of this patch- don't, we're in feature-freeze right now and should not be adding hooks at this time.
>>>
>>> The patch is intended to submit for the v9.1 development, not v9.0, isn't it?
>>
>> That really depends on if this is actually fixing a bug in the existing code or not. I'm on the fence about that at the moment, to be honest. I was trying to find if we explicitly say that SELECT rights are needed to reference a column but wasn't able to. If every code path is expecting that, then perhaps we should just document it that way and move on. In that case, all these changes would be for 9.1. If we decide the current behavior is a bug, it might be something which could be fixed in 9.0 and maybe back-patched.
>
> Ahh, because I found out an independent problem during the discussion, it made us confused. Please make it clear that this patch does not intend to fix the bug.
>
> If we decide it is an actual bug to be fixed/informed, I also agree it should be worked on in a separate patch.
>
> Well, the rest of the discussion should be held in a different thread.
>
>> In *either* case, given that one is a 'clean-up' patch and the other is 'new functionality', they should be independent *anyway*. Small incremental changes that don't break things when applied is what we're shooting for here.
>
> Agreed.
>
>>>> #3: You didn't move ExecCheckRTPerms() and ExecCheckRTEPerms() to utils/acl and instead added executor/executor.h to ri_triggers.c. I don't particularly like that. I admit that DoCopy() already knew about the executor, and if that were the only case outside of the executor where ExecCheckRTPerms() was getting called it'd probably be alright, but we already have another place that wants to use it, so let's move it to a more appropriate place.
>>>
>>> Sorry, I'm a bit confused. It seemed to me you suggested to utilize ExecCheckRTPerms() rather than moving its logic anywhere, so I kept it here. (Was it a misunderstanding?)
>>
>> I'm talking about moving the whole function (all 3 lines of it) to somewhere else and then reworking the function to be more appropriate based on its new location (including renaming and changing arguments and return values, as appropriate).
>
> OK, I agreed.
>
>>> If so, but, I doubt utils/acl is the best place for the moved ExecCheckRTPerms(), because the checker function calls both the default acl functions and an optional external security function.
>>
>> Can you explain why you think that having a function in utils/acl (eg: include/utils/acl.h and backend/utils/aclchk.c) which calls default acl functions and allows for an external hook would be a bad idea?
>>
>>> It means the ExecCheckRTPerms() is a caller of acl functions, not an acl function itself, isn't it?
>>
>> It's providing a higher-level service, sure, but there's nothing particularly interesting or special about what it's doing in this case, and we need it in multiple places. Why duplicate it?
>
> If the number of checker functions is the only reason why we move ExecCheckRTPerms() into backend/utils/aclchk.c right now, I don't have any opposition. When it reaches a dozen, we can consider a new location. Right?
>
> Sorry, the name of pg_rangetbl_aclcheck() was misleading for me.
>
>>> I agreed the checker function is not a part of the executor, but it is also not a part of the acl functions in my opinion.
>>>
>>> If it is disinclined to create a new directory to deploy the checker function, my preference is src/backend/utils/adt/security.c and src/include/utils/security.h .
>>
>> We don't need a new directory or file for one function, as Robert
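For readers following the patch summary at the top of this message, the reworked entry point might look roughly like this. This is pseudocode based only on the names mentioned in the summary (chkpriv_relation_perms(), the bool abort hint, the AclResult return value, the relocated ESP hook); the actual patch may differ in details:

```
/* sketch: checker moved from execMain.c into aclchk.c */
AclResult chkpriv_relation_perms(List *rangeTable, bool abort)
    for each RangeTblEntry rte in rangeTable:
        result = chkpriv_rte_perms(rte, abort)   /* default ACL checks   */
        if result != ACLCHECK_OK:
            return result     /* with abort=true, ereport() was raised */
    if ExecutorCheckPerms_hook:                  /* ESP hook lives here  */
        return ExecutorCheckPerms_hook(rangeTable, abort)
    return ACLCHECK_OK

/* DoCopy() and RI_Initial_Check() wrap their single RTE in a list: */
chkpriv_relation_perms(list_make1(&rte), true)
```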
Re: [HACKERS] Synchronization levels in SR
On Wed, 2010-05-26 at 13:03 +0900, Fujii Masao wrote: > On Wed, May 26, 2010 at 1:04 AM, Simon Riggs wrote: > > On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote: > >> On Tue, May 25, 2010 at 10:29 AM, Josh Berkus wrote: > >> > I agree that #4 should be done last, but it will be needed, not in the > >> > least by your employer ;-) . I don't see any obvious way to make #4 > >> > compatible with any significant query load on the slave, but in general > >> > I'd think that users of #4 are far more concerned with 0% data loss than > >> > they are with getting the slave to run read queries. > >> > >> Since #2 and #3 are enough for 0% data loss, I think that such users > >> would be more concerned about what results are visible in the standby. > >> No? > > > > Please add #4 also. You can do that easily at the same time as #2 and > > #3, and it will leave me free to fix the perceived conflict problems. > > I think that we should implement the feature in small steps rather than > submit one big patch at a time. So I'd like to focus on #2 and #3 at first, > and #4 later (maybe third or fourth CF). We both know if you do #2 and #3 then doing #4 also is trivial. If you leave it out then we'll end up missing something that is required and have to rework everything. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 23:59 -0400, Robert Haas wrote:
> Quorum commit is definitely an extra knob, IMHO.

No, it's about three less, as I have explained. Explain your position, don't just demand others listen.

-- Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
On 25/05/10 23:56, Josh Berkus wrote:
> Do we get a bit in the visibility map for a page which has aborted transaction rows on it?

If there's a tuple with an aborted xmin on a page, the bit in the visibility map is not set. A tuple with an aborted xmax doesn't matter.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] tsvector pg_stats seems quite a bit off.
On 26/05/2010, at 01.16, Jan Urbański wrote:

On 19/05/10 21:01, Jesper Krogh wrote: The document base is around 350,000 documents and I have set the statistics target on the tsvector column to 1000, since the 100 seems way off.

So for tsvectors the statistics target means more or less "at any time track at most 10 * statistics target lexemes simultaneously" where "track" means keeping them in memory while going through the tuples being analysed. Remember that the measure is in lexemes, not whole tsvectors, and the factor of 10 is meant to approximate the average number of unique lexemes in a tsvector. If your documents are very large, this might not be a good approximation.

I just did an avg(length(document_tsvector)), which is 154. Doc count is 1.3m now in my sample set. But the distribution is very "flat" at the end; the last 128 values are exactly 1.00189e-05, which means that any term sitting outside the array would get an estimate of 1.00189e-05 * 350174 / 2 = 1.75 ~ 2 rows.

Yeah, this might mean that you could try cranking up the stats target a lot, to make the set of simultaneously tracked lexemes larger (it will cost time and memory during analyse though). If the documents have completely different contents, what can happen is that almost all lexemes are only seen a few times and get removed during the pruning of the working set. I have seen similar behaviour while working on the typanalyze function for tsvectors.

I think I would prefer something less "magic". I can increase the statistics target and get more reliable data, but that also increases the amount of tuples being picked out for analysis, which is really time consuming.

But that also means that what gets stored as the lower bound of the histogram isn't anywhere near the lower bound; it is more the lower bound of the "artificial" histogram that happened after the last pruning. I would suggest that the pruning in the end should be aware of this.
Perhaps by keeping track of the least frequent value that never got pruned and using that as the lower bound after the last pruning?

Thanks a lot for the explanation; it fits fairly well with why I couldn't construct a simple test set that had the problem.

So far I have no idea if this is bad or good, so a couple of sample runs of stuff that is sitting outside the "most_common_vals" array: [gathered statistics suck]

So the "most_common_vals" seems to contain a lot of values that should never have been kept in favor of other values that are more common. In practice, just cranking the statistics estimate up high enough seems to solve the problem, but doesn't there seem to be something wrong in how the statistics are collected?

The algorithm to determine most common vals does not do it accurately. That would require keeping all lexemes from the analysed tsvectors in memory, which would be impractical. If you want to learn more about the algorithm being used, try reading http://www.vldb.org/conf/2002/S10P03.pdf and the corresponding comments in ts_typanalyze.c

I'll do some reading.

Jesper
Re: [HACKERS] Synchronization levels in SR
On Wed, May 26, 2010 at 1:04 AM, Simon Riggs wrote: > On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote: >> On Tue, May 25, 2010 at 10:29 AM, Josh Berkus wrote: >> > I agree that #4 should be done last, but it will be needed, not in the >> > least by your employer ;-) . I don't see any obvious way to make #4 >> > compatible with any significant query load on the slave, but in general >> > I'd think that users of #4 are far more concerned with 0% data loss than >> > they are with getting the slave to run read queries. >> >> Since #2 and #3 are enough for 0% data loss, I think that such users >> would be more concerned about what results are visible in the standby. >> No? > > Please add #4 also. You can do that easily at the same time as #2 and > #3, and it will leave me free to fix the perceived conflict problems. I think that we should implement the feature in small steps rather than submit one big patch at a time. So I'd like to focus on #2 and #3 at first, and #4 later (maybe third or fourth CF). Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Synchronization levels in SR
On Tue, May 25, 2010 at 11:36 PM, Fujii Masao wrote: > On Wed, May 26, 2010 at 2:10 AM, Simon Riggs wrote: >> My suggestion is simply to have a single parameter (name unimportant) >> >> number_of_synch_servers_we_wait_for = N >> >> which is much easier to understand because it is phrased in terms of the >> guarantee given to the transaction, not in terms of what the admin >> thinks is the situation. > > How can we choose #2, #3 or #4 by using your proposed option? > > As the result of the discussion, I'm inclined towards choosing the > "mix" approach. How about specifying #1, #2, #3 or #4 per standby, > and specifying the number of "synchronous" (i.e., means #2, #3 or > #4) standbys the transaction commit waits for at the master as > Simon suggests? > > We add new option "replication_mode" (better name?) specifying > when the standby sends the ACK meaning the completion of replication > to the master into recovery.conf. Valid values are "async", "recv", > "fsync" and "redo". Those correspond to #1, #2, #3 and #4 I defined > on the top of the thread. > > If "async", the standby never sends any ACK. If "recv", "fsync", > or "redo", the standby sends the ACK when it has received, fsynced > or replayed the WAL from the master, respectively. > > On the other hand, we add new GUC "max_synchronous_standbys" > (I prefer it to "number_of_synch_servers_we_wait_for", but does > anyone have better name?) as PGC_USERSET into postgresql.conf. > It specifies the maximum number of standbys which transaction > commit must wait for the ACK from. > > If max_synchronous_standbys is 0, no transaction commit waits for > ACK even if some connected standbys set their replication_mode to > "recv", "fsync" or "redo". If it's positive, transaction commit waits > for N ACKs. N is the smaller number between max_synchronous_standbys > and the actual number of connected "synchronous" standbys. > > Thought? I think we're over-engineering this. For a first version we should do something simple. 
Then we can add some of these extra knobs in a follow-on patch. Quorum commit is definitely an extra knob, IMHO. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] ExecutorCheckPerms() hook
(2010/05/26 12:17), Tom Lane wrote:
> Stephen Frost writes:
>> That may be the case. I'm certainly more concerned with a bug in the existing code than any new code that we're working on. The question is- is this actually a user-visible bug? Or do we require that a user creating an FK needs SELECT rights on the primary table? If so, it's still a bug, but at that point it's a bug in our documentation where we don't mention that SELECT rights are also needed.
>
> Having an FK to another table certainly allows at least an indirect form of SELECT, because you can determine whether any given key exists in the PK table by seeing if you're allowed to insert a referencing row. I haven't dug in the SQL spec to see if that addresses the point, but it wouldn't bother me in the least to insist that both REFERENCES and SELECT privilege are required to create an FK.
>
> In any case, RI_Initial_Check isn't broken, because if it can't do the SELECTs it just falls back to a slower method. It's arguable that the FK triggers themselves are assuming more than they should about permissions, but I don't think that RI_Initial_Check can be claimed to be buggy.

Hmm. If both REFERENCES and SELECT privilege are required to create a new FK constraint, why does RI_Initial_Check() need to check SELECT permission prior to SPI_execute()? It eventually checks SELECT privilege during execution of the secondary query. It is unclear to me why we need to provide a slower fallback.

Thanks,
-- KaiGai Kohei
Re: [HACKERS] Synchronization levels in SR
On Wed, May 26, 2010 at 2:10 AM, Simon Riggs wrote:
> My suggestion is simply to have a single parameter (name unimportant)
>
> number_of_synch_servers_we_wait_for = N
>
> which is much easier to understand because it is phrased in terms of the guarantee given to the transaction, not in terms of what the admin thinks is the situation.

How can we choose #2, #3 or #4 by using your proposed option?

As the result of the discussion, I'm inclined towards choosing the "mix" approach. How about specifying #1, #2, #3 or #4 per standby, and specifying the number of "synchronous" (i.e., #2, #3 or #4) standbys the transaction commit waits for at the master, as Simon suggests?

We add a new option "replication_mode" (better name?) into recovery.conf, specifying when the standby sends the ACK meaning the completion of replication to the master. Valid values are "async", "recv", "fsync" and "redo". Those correspond to #1, #2, #3 and #4 I defined at the top of the thread.

If "async", the standby never sends any ACK. If "recv", "fsync", or "redo", the standby sends the ACK when it has received, fsynced or replayed the WAL from the master, respectively.

On the other hand, we add a new GUC "max_synchronous_standbys" (I prefer it to "number_of_synch_servers_we_wait_for", but does anyone have a better name?) as PGC_USERSET into postgresql.conf. It specifies the maximum number of standbys which transaction commit must wait for the ACK from.

If max_synchronous_standbys is 0, no transaction commit waits for ACK even if some connected standbys set their replication_mode to "recv", "fsync" or "redo". If it's positive, transaction commit waits for N ACKs. N is the smaller number between max_synchronous_standbys and the actual number of connected "synchronous" standbys.

Thought? 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
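Fujii's proposal above, if adopted, would look something like this in the configuration files. The option names and values here are only the ones proposed in this thread, not anything released:

```
# recovery.conf on each standby: when to send the ACK back to the master.
#   async: never    recv: on WAL receipt    fsync: after fsync    redo: after replay
replication_mode = 'fsync'

# postgresql.conf on the master (PGC_USERSET): how many "synchronous"
# standbys (recv/fsync/redo) a transaction commit waits for; 0 disables
# waiting entirely, even if synchronous standbys are connected.
max_synchronous_standbys = 1
```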
Re: [HACKERS] Open Item: pg_controldata - machine readable?
Joe Conway wrote: > >> There is an open item "pg_controldata - machine readable?" in the list: > >> http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items > >> Should we add the module to 9.0? > > > > No. This is a new feature that wasn't even under consideration, > > let alone written, at the time of 9.0 feature freeze. It does not > > get into either core or contrib this time around. > > Yup, agreed. That was why I put it on github instead of sending a patch > to the list. It was also a quick and dirty hack -- maybe it could be > cleaned up for 9.1, but I'm not sure there was consensus that it was > really needed. OK, I moved it from 9.0 open items to new features for 9.1. Regards, --- Takahiro Itagaki NTT Open Source Software Center
Re: [HACKERS] Open Item: invalid declspec for PG_MODULE_MAGIC
Takahiro Itagaki writes: > * Should we backport the fix to previous releases? Certainly not. It hasn't gotten through beta, and the risk of breaking third-party modules is nonnegligible. regards, tom lane
Re: [HACKERS] ExecutorCheckPerms() hook
Stephen Frost writes: > That may be the case. I'm certainly more concerned with a bug in the > existing code than any new code that we're working on. The question is- > is this actually a user-visible bug? Or do we require that a user > creating an FK needs SELECT rights on the primary table? If so, it's > still a bug, but at that point it's a bug in our documentation where we > don't mention that SELECT rights are also needed. Having an FK to another table certainly allows at least an indirect form of SELECT, because you can determine whether any given key exists in the PK table by seeing if you're allowed to insert a referencing row. I haven't dug in the SQL spec to see if that addresses the point, but it wouldn't bother me in the least to insist that both REFERENCES and SELECT privilege are required to create an FK. In any case, RI_Initial_Check isn't broken, because if it can't do the SELECTs it just falls back to a slower method. It's arguable that the FK triggers themselves are assuming more than they should about permissions, but I don't think that RI_Initial_Check can be claimed to be buggy. regards, tom lane
Re: [HACKERS] Open Item: pg_controldata - machine readable?
On 05/25/2010 08:03 PM, Tom Lane wrote: > Takahiro Itagaki writes: >> There is an open item "pg_controldata - machine readable?" in the list: >> http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items > >> The proposal by Joe Conway is adding a new contrib module. >> http://archives.postgresql.org/message-id/4b959d7a.6010...@joeconway.com >> http://github.com/jconway/pg_controldata > >> Should we add the module to 9.0? > > No. This is a new feature that wasn't even under consideration, > let alone written, at the time of 9.0 feature freeze. It does not > get into either core or contrib this time around. Yup, agreed. That was why I put it on github instead of sending a patch to the list. It was also a quick and dirty hack -- maybe it could be cleaned up for 9.1, but I'm not sure there was consensus that it was really needed. Joe
Re: [HACKERS] ExecutorCheckPerms() hook
(2010/05/26 11:12), Stephen Frost wrote: > * KaiGai Kohei (kai...@ak.jp.nec.com) wrote: >>> #2: REALLY BIG ISSUE- You've added ExecutorCheckPerms_hook as part of >>> this patch- don't, we're in feature-freeze right now and should not be >>> adding hooks at this time. >> >> The patch is intended to submit for the v9.1 development, not v9.0, isn't it? > > That really depends on if this is actually fixing a bug in the existing > code or not. I'm on the fence about that at the moment, to be honest. > I was trying to find if we explicitly say that SELECT rights are needed > to reference a column but wasn't able to. If every code path is > expecting that, then perhaps we should just document it that way and > move on. In that case, all these changes would be for 9.1. If we > decide the current behavior is a bug, it might be something which could > be fixed in 9.0 and maybe back-patched. Ahh, because I found out an independent problem during the discussion, it made us confused. Please make it clear that this patch does not intend to fix the bug. If we decide it is an actual bug to be fixed/informed, I also agree it should be worked on in a separate patch. Well, the rest of the discussion should be held in a different thread. > In *either* case, given that one is a 'clean-up' patch and the other is > 'new functionality', they should be independent *anyway*. Small > incremental changes that don't break things when applied is what we're > shooting for here. Agreed. >>> #3: You didn't move ExecCheckRTPerms() and ExecCheckRTEPerms() to >>> utils/acl and instead added executor/executor.h to ri_triggers.c. >>> I don't particularly like that. I admit that DoCopy() already knew >>> about the executor, and if that were the only case outside of the >>> executor where ExecCheckRTPerms() was getting called it'd probably be >>> alright, but we already have another place that wants to use it, so >>> let's move it to a more appropriate place. >> >> Sorry, I'm a bit confused. 
>> It seemed to me you suggested to utilize ExecCheckRTPerms() rather than >> moving its logic anywhere, so I kept it here. (Was it a misunderstanding?) > > I'm talking about moving the whole function (all 3 lines of it) to > somewhere else and then reworking the function to be more appropriate > based on its new location (including renaming and changing arguments > and return values, as appropriate). OK, I agreed. >> If so, but, I doubt utils/acl is the best place for the moved >> ExecCheckRTPerms(), because the checker function calls both the >> default acl functions and an optional external security function. > > Can you explain why you think that having a function in utils/acl (eg: > include/utils/acl.h and backend/utils/aclchk.c) which calls default acl > functions and allows for an external hook would be a bad idea? > >> It means the ExecCheckRTPerms() is a caller of acl functions, not an acl >> function itself, isn't it? > > It's providing a higher-level service, sure, but there's nothing > particularly interesting or special about what it's doing in this case, > and, we need it in multiple places. Why duplicate it? If the number of checker functions is the only reason why we move ExecCheckRTPerms() into backend/utils/aclchk.c right now, I don't have any opposition. When it reaches a dozen, we can consider a new location. Right? Sorry, the name of pg_rangetbl_aclcheck() was misleading for me. >> I agreed the checker function is not a part of the executor, but it is >> also not a part of the acl functions in my opinion. >> >> If it is disinclined to create a new directory to deploy the checker >> function, my preference is src/backend/utils/adt/security.c and >> src/include/utils/security.h . > > We don't need a new directory or file for one function, as Robert > already pointed out. OK, let's consider when aclchk.c holds a dozen checker functions. 
>>> #6: I haven't checked yet, but if there are other things in an RTE which >>> would make sense in the DoCopy case, beyond just what's needed for the >>> permissions checking, and which wouldn't be 'correct' with a NULL'd >>> value, I would set those. Yes, we're building the RTE to check >>> permissions, but we don't want someone downstream to be surprised when >>> they make a change to something in the permissions checking and discover >>> that a value in the RTE they expected to be there wasn't valid. Even more >>> so, if there are function helpers which can be used to build an RTE, we >>> should be using them. The same goes for RI_Initial_Check(). >> >> Are you saying something like makeFuncExpr()? >> I basically agree. However, should it be done in this patch? > > Actually, I mean looking for, and using, things like > markRTEForSelectPriv() and addRangeTableEntry() or > addRangeTableEntryForRelation(). OK, I'll make it in a separate patch. >>> #8: When moving ExecCheckRTPerms(), you should rename it to be more like >>> the other function calls in acl.h. Perhaps pg_rangetbl_aclcheck()? >>> Also, it sh
Re: [HACKERS] Open Item: pg_controldata - machine readable?
Takahiro Itagaki writes: > There is an open item "pg_controldata - machine readable?" in the list: > http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items > The proposal by Joe Conway is adding a new contrib module. > http://archives.postgresql.org/message-id/4b959d7a.6010...@joeconway.com > http://github.com/jconway/pg_controldata > Should we add the module to 9.0? No. This is a new feature that wasn't even under consideration, let alone written, at the time of 9.0 feature freeze. It does not get into either core or contrib this time around. regards, tom lane
Re: [HACKERS] Exposing the Xact commit order to the user
On 5/25/2010 4:16 PM, Tom Lane wrote: Jan Wieck writes: No, I meant how will the *function* know, if a superuser and/or some background process can purge records at any time? The data contains timestamps which are supposedly taken in commit order. You can *not* rely on the commit timestamps to be in exact order. (Perhaps approximate ordering is good enough for what you want here, but just be careful to not fall into the trap of assuming that they're exactly ordered.) I am well aware of the fact that commit timestamps within the WAL can go backwards and that the serial numbers of this proposed implementation of commit order can even be different from what the timestamps AND the WAL are saying. As long as the serial number (record position inside of segment) is determined while the transaction still holds all its locks, this is going to be good enough for what async replication users today are used to. Again, it will not magically make it possible to determine a serializable order of actions, that happened from transactions running in read committed isolation level, post mortem. I don't even think that is possible at all. And I don't think anyone proposed a solution for that problem anyways. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
[HACKERS] libpq should not be using SSL_CTX_set_client_cert_cb
I've been experimenting with SSL setups involving chains of CA certificates, ie, where the server or client cert itself is signed by an intermediate CA rather than a trusted root CA. This appears to work well enough on the server side if you configure the server correctly (see discussion of bug #5468). However, libpq is not able to work with a client certificate unless that cert is directly signed by a CA that the server trusts (ie, one listed directly in the server's root.crt file). This is because there is no good way to feed back any intermediate CA certs to the server. The man page for SSL_CTX_set_client_cert_cb says in so many words that the client_cert_cb API is maldesigned: BUGS The client_cert_cb() cannot return a complete certificate chain, it can only return one client certificate. If the chain only has a length of 2, the root CA certificate may be omitted according to the TLS standard and thus a standard conforming answer can be sent to the server. For a longer chain, the client must send the complete chain (with the option to leave out the root CA certificate). This can only be accomplished by either adding the intermediate CA certificates into the trusted certificate store for the SSL_CTX object (resulting in having to add CA certificates that otherwise maybe would not be trusted), or by adding the chain certificates using the SSL_CTX_add_extra_chain_cert(3) function, which is only available for the SSL_CTX object as a whole and that therefore probably can only apply for one client certificate, making the concept of the callback function (to allow the choice from several certificates) questionable. It strikes me that we could not only fix this case, but make the libpq code simpler and more like the backend case, if we got rid of client_cert_cb and instead preloaded the ~/.postgresql/postgresql.crt file using SSL_CTX_use_certificate_chain_file(). Then, using an indirectly signed client cert would only require including the full cert chain in that file. 
So I'm wondering if there was any specific reason behind using the callback API to start with. Anybody remember? regards, tom lane
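For reference, with the proposed switch to SSL_CTX_use_certificate_chain_file(), the client-side fix for an intermediate CA would just be concatenating the chain into the one file libpq loads. A sketch (the certificate file names are illustrative, and placeholder files stand in for real PEM certificates):

```shell
# Placeholder files standing in for real PEM certificates.
mkdir -p "$HOME/.postgresql"
printf -- '-----CLIENT CERT (placeholder)-----\n' > client.crt
printf -- '-----INTERMEDIATE CA (placeholder)-----\n' > intermediate-ca.crt

# The actual fix: build postgresql.crt as a chain, the client certificate
# first, then each intermediate CA cert in leaf-to-root order (the trusted
# root itself may be omitted, per the TLS standard).
cat client.crt intermediate-ca.crt > "$HOME/.postgresql/postgresql.crt"
```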
[HACKERS] Open Item: invalid declspec for PG_MODULE_MAGIC
This open item is for replacing the PGDLLIMPORT markers for PG_MODULE_MAGIC and PG_FUNCTION_INFO_V1 with __declspec(dllexport), because they are always exported by user modules rather than by the core code. http://archives.postgresql.org/message-id/20100329184705.a60e.52131...@oss.ntt.co.jp The fix is simple, so I think we can include it in 9.0. Arguable issues for the patch are: * Is there a better name than PGMODULEEXPORT? I like PGDLLEXPORT because it is similar to PGDLLIMPORT, but it might be too similar. * Should we backport the fix to previous releases? I'd like to backport it because it should not break any existing third-party modules, because they cannot even be built on Windows. Regards, --- Takahiro Itagaki NTT Open Source Software Center
Re: [HACKERS] ExecutorCheckPerms() hook
* KaiGai Kohei (kai...@ak.jp.nec.com) wrote: > The reason why the user must have SELECT privileges on the PK/FK tables is > that validateForeignKeyConstraint() always calls SPI_execute() to verify > that FK constraints can be established between two tables (even in the fallback path). > > And, the reason why RI_Initial_Check() now calls pg_class_aclcheck() is > to try to avoid an unexpected access violation error from SPI_execute(). > However, the fallback path also calls SPI_execute(), so I concluded > the permission checks in RI_Initial_Check() are nonsense. That may be the case. I'm certainly more concerned with a bug in the existing code than any new code that we're working on. The question is- is this actually a user-visible bug? Or do we require that a user creating an FK needs SELECT rights on the primary table? If so, it's still a bug, but at that point it's a bug in our documentation where we don't mention that SELECT rights are also needed. Anyone know what the SQL spec says about this (if anything...)? > However, it is an independent issue right now, so I kept it as is. Uh, I don't really see it as independent.. If we have a bug there that we need to fix, and it's because we have two different bits of code trying to do the same checking, we should fix it by eliminating the duplicate checking, imv. > The origin of the matter is that we apply unnecessary permission checks, > although it is purely internal use and the user was already checked for > permission to execute the whole ALTER TABLE statement. Right? That's certainly a nice thought, but given the complexity in ALTER TABLE, in particular with regard to permissions checking, I have no idea if what it's doing is intentional or wrong. Thanks, Stephen
Re: [HACKERS] ExecutorCheckPerms() hook
* KaiGai Kohei (kai...@ak.jp.nec.com) wrote: > > #2: REALLY BIG ISSUE- You've added ExecutorCheckPerms_hook as part of > > this patch- don't, we're in feature-freeze right now and should not be > > adding hooks at this time. > > The patch is intended to submit for the v9.1 development, not v9.0, isn't it? That really depends on if this is actually fixing a bug in the existing code or not. I'm on the fence about that at the moment, to be honest. I was trying to find if we explicitly say that SELECT rights are needed to reference a column but wasn't able to. If every code path is expecting that, then perhaps we should just document it that way and move on. In that case, all these changes would be for 9.1. If we decide the current behavior is a bug, it might be something which could be fixed in 9.0 and maybe back-patched. In *either* case, given that one is a 'clean-up' patch and the other is 'new functionality', they should be independent *anyway*. Small incremental changes that don't break things when applied is what we're shooting for here. > > #3: You didn't move ExecCheckRTPerms() and ExecCheckRTEPerms() to > > utils/acl and instead added executor/executor.h to rt_triggers.c. > > I don't particularly like that. I admit that DoCopy() already knew > > about the executor, and if that were the only case outside of the > > executor where ExecCheckRTPerms() was getting called it'd probably be > > alright, but we already have another place that wants to use it, so > > let's move it to a more appropriate place. > > Sorry, I'm a bit confused. > It seemed to me you suggested utilizing ExecCheckRTPerms() rather than > moving its logic anywhere, so I kept it here. (Did I misunderstand?) I'm talking about moving the whole function (all 3 lines of it) to somewhere else and then reworking the function to be more appropriate based on its new location (including renaming and changing arguments and return values, as appropriate). 
> If so, I doubt utils/acl is the best place for the moved > ExecCheckRTPerms(), because the checker function calls both the > default acl functions and an optional external security function. Can you explain why you think that having a function in utils/acl (eg: include/utils/acl.h and backend/utils/aclchk.c) which calls default acl functions and allows for an external hook would be a bad idea? > It means that ExecCheckRTPerms() is a caller of acl functions, not an acl > function itself, doesn't it? It's providing a higher-level service, sure, but there's nothing particularly interesting or special about what it's doing in this case, and, we need it in multiple places. Why duplicate it? > I agree the checker function is not part of the executor, but it is > also not part of the acl functions in my opinion. > > If creating a new directory for the checker function is undesirable, > my preference is src/backend/utils/adt/security.c and > src/include/utils/security.h . We don't need a new directory or file for one function, as Robert already pointed out. > > #6: I haven't checked yet, but if there are other things in an RTE which > > would make sense in the DoCopy case, beyond just what's needed for the > > permissions checking, and which wouldn't be 'correct' with a NULL'd > > value, I would set those. Yes, we're building the RTE to check > > permissions, but we don't want someone downstream to be surprised when > > they make a change to something in the permissions checking and discover > > that a value in RTE they expected to be there wasn't valid. Even more > > so, if there are function helpers which can be used to build an RTE, we > > should be using them. The same goes for RI_Initial_Check(). > > Are you saying something like makeFuncExpr()? > I basically agree. However, should it be done in this patch? Actually, I mean looking for, and using, things like markRTEForSelectPriv() and addRangeTableEntry() or addRangeTableEntryForRelation(). 
> > #8: When moving ExecCheckRTPerms(), you should rename it to be more like > > the other function calls in acl.h Perhaps pg_rangetbl_aclcheck()? > > Also, it should return an actual AclResult instead of just true/false. > > See the comments in #3. > And, if the caller has to handle aclcheck_error(), the user cannot > distinguish access violation errors raised by the default PG permissions > from those raised by any other external security stuff, can they? I'm not suggesting that the caller handle aclcheck_error().. ExecCheckRTPerms() could just as easily have a flag which indicates whether it will call aclcheck_error() or not, and if not, to return an AclResult to the caller. That flag could then be passed to ExecCheckRTEPerms() as you have it now. Thanks, Stephen
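The flag-based arrangement Stephen describes — one shared checker that either raises the access-violation error itself or hands back an AclResult, with the external hook applied in the shared entry point so it can never be skipped — can be modelled roughly like this. Python is used purely as illustration; every name below is a hypothetical stand-in for the C originals (ExecCheckRTPerms, ExecCheckRTEPerms, ExecutorCheckPerms_hook, aclcheck_error):

```python
from enum import Enum

class AclResult(Enum):
    ACLCHECK_OK = 0
    ACLCHECK_NO_PRIV = 1

class AclError(Exception):
    """Stand-in for ereport()ing an access-violation error (aclcheck_error)."""

def default_acl_check(rte):
    # stand-in for the per-RTE permission test (ExecCheckRTEPerms);
    # 'rte' here is just a dict, not a real RangeTblEntry
    return AclResult.ACLCHECK_OK if rte.get("allowed") else AclResult.ACLCHECK_NO_PRIV

# optional external security hook, as with ExecutorCheckPerms_hook
security_hook = None

def check_rt_perms(range_table, abort_on_violation):
    """Shared entry point: the executor, DoCopy() and RI_Initial_Check()
    would all come through here, so the hook is never bypassed."""
    for rte in range_table:
        result = default_acl_check(rte)
        if result is AclResult.ACLCHECK_OK and security_hook is not None:
            result = security_hook(rte)
        if result is not AclResult.ACLCHECK_OK:
            if abort_on_violation:
                raise AclError("permission denied for %s" % rte.get("relname"))
            return result
    return AclResult.ACLCHECK_OK
```

The executor would call check_rt_perms(rtable, True) and let the error propagate, while a caller like RI_Initial_Check() would pass False and fall back to its slow path on ACLCHECK_NO_PRIV.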
[HACKERS] Fwd: PDXPUG Day at OSCON 2010
It was recommended to me to forward this to -hackers. Regards, Mark -- Forwarded message -- From: Mark Wong Date: Tue, May 18, 2010 at 6:57 AM Subject: PDXPUG Day at OSCON 2010 To: pgsql-annou...@postgresql.org Thanks to the generosity of O'Reilly, we will be having a full day of free PostgreSQL sessions on Sunday, July 18 at the Oregon Convention Center. Location details and schedule information can be found on the wiki at: http://wiki.postgresql.org/wiki/PDXPUGDay2010 We will ask for a $30 donation towards PostgreSQL at the conference, but no one will be turned away. Sign up here: https://spreadsheets.google.com/viewform?hl=en&formkey=dDVBRnJGWVlZRkdycFdXbXVuYTNiU2c6MQ Please submit your talk proposal here: http://spreadsheets.google.com/viewform?hl=en&formkey=dHBFMGFIWmxJUzhRM3R6dXVlWWxYQ1E6MQ. Proposals will be decided upon by June 7th and the wiki updated.
[HACKERS] Open Item: pg_controldata - machine readable?
There is an open item "pg_controldata - machine readable?" in the list: http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items The proposal by Joe Conway is adding a new contrib module. http://archives.postgresql.org/message-id/4b959d7a.6010...@joeconway.com http://github.com/jconway/pg_controldata Should we add the module to 9.0? If we do so, SGML documentation is required. IMHO, I'd like to put the feature into the core instead of a contrib module, but we cannot change the catalog version at this time. So, how about providing control file information through the pg_settings view? We would retrieve those variables as GUC options. Regards, --- Takahiro Itagaki NTT Open Source Software Center
Re: [HACKERS] Fwd: Hiding data in postgresql
Hector, * Hector Beyers (hqbey...@gmail.com) wrote: > Does someone have any ideas how I can hide data without the meta data > noticing? To explain further, I would like to save some collection of data > where the meta-data does not see it. I am trying to do some security through > obscurity. It is for research purposes. This explanation doesn't actually explain anything, near as I can tell. Perhaps if you would share what your actual problem is, we could recommend a solution. Thanks, Stephen
Re: [HACKERS] Synchronization levels in SR
On May 25, 2010, at 22:16 , Simon Riggs wrote: > All of these issues show why I want to specify the synchronisation mode > as a USERSET. That will allow us to specify more easily which parts of > our application are important when the cluster is degraded and which > data is so critical it must reach multiple servers. Hm, but since flushing an important COMMIT to the slave(s) will also need to flush all previous (potentially unimportant) COMMITs to the slave(s), isn't there a substantial chance of priority-inversion type problems there? Then again, if asynchronous commit proved to be effective then so will this, probably, so maybe my fear is unjustified. best regards, Florian Pflug
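The inversion Florian worries about follows directly from WAL being a single ordered stream: before a synchronous commit at some LSN can be acknowledged, every WAL record before that LSN — including other sessions' unimportant commits — must reach the standby first. A toy model (record layout entirely hypothetical):

```python
def wait_set(records, commit_lsn):
    """WAL ships strictly in order, so acknowledging a synchronous commit at
    commit_lsn requires shipping every record at or below it first."""
    return [tag for lsn, tag in records if lsn <= commit_lsn]

records = [
    (100, "async commit A"),   # unimportant, but earlier in the stream
    (200, "async commit B"),
    (300, "SYNC commit C"),    # the 'important' transaction
]
print(wait_set(records, 300))  # C's ack waits on A and B too
```

So a single slow or bursty async writer ahead of you in the stream delays your synchronous acknowledgement, which is the priority-inversion shape Florian describes.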
Re: [HACKERS] ExecutorCheckPerms() hook
(2010/05/25 22:59), Stephen Frost wrote: > * KaiGai Kohei (kai...@ak.jp.nec.com) wrote: >> * DoCopy() and RI_Initial_Check() were reworked to call ExecCheckRTEPerms() >>with locally built RangeTblEntry. > > Maybe I missed it somewhere, but we still need to address the case where > the user doesn't have those SELECT permissions that we're looking for in > RI_Initial_Check(), right? KaiGai, your patch should be addressing that > in a similar fashion.. The reason why the user must have SELECT privileges on the PK/FK tables is that validateForeignKeyConstraint() always calls SPI_execute() to verify that FK constraints can be established between two tables (even in the fallback path). And, the reason why RI_Initial_Check() now calls pg_class_aclcheck() is to try to avoid an unexpected access violation error from SPI_execute(). However, the fallback path also calls SPI_execute(), so I concluded the permission checks in RI_Initial_Check() are nonsense. However, it is an independent issue right now, so I kept it as is. The origin of the matter is that we apply unnecessary permission checks, although it is purely internal use and the user was already checked for permission to execute the whole ALTER TABLE statement. Right? Thanks, -- KaiGai Kohei
Re: [HACKERS] ExecutorCheckPerms() hook
(2010/05/25 21:44), Stephen Frost wrote: > KaiGai, > > * KaiGai Kohei (kai...@ak.jp.nec.com) wrote: >> OK, the attached patch reworks it according to the way. > > Reviewing this patch, there are a whole slew of problems. > > #1: REALLY BIG ISSUE- Insufficient comment updates. You've changed > function definitions in a pretty serious way as well as moved some code > around such that some of the previous comments don't make sense. You > have got to update comments when you're writing a patch. Indeed, the > places I see a change in comments are when you've removed what appears > to still be valid and appropriate comments, or places where you've added > comments which are just blatantly wrong with the submitted patch. Hmm. I'll revise/add the comments around the patched code. > #2: REALLY BIG ISSUE- You've added ExecutorCheckPerms_hook as part of > this patch- don't, we're in feature-freeze right now and should not be > adding hooks at this time. The patch is intended to submit for the v9.1 development, not v9.0, isn't it? > #3: You didn't move ExecCheckRTPerms() and ExecCheckRTEPerms() to > utils/acl and instead added executor/executor.h to rt_triggers.c. > I don't particularly like that. I admit that DoCopy() already knew > about the executor, and if that were the only case outside of the > executor where ExecCheckRTPerms() was getting called it'd probably be > alright, but we already have another place that wants to use it, so > let's move it to a more appropriate place. Sorry, I'm a bit confused. It seemed to me you suggested utilizing ExecCheckRTPerms() rather than moving its logic anywhere, so I kept it here. (Did I misunderstand?) If so, I doubt utils/acl is the best place for the moved ExecCheckRTPerms(), because the checker function calls both the default acl functions and an optional external security function. It means that ExecCheckRTPerms() is a caller of acl functions, not an acl function itself, doesn't it? 
In other words, I wonder whether we should categorize a function X which calls A and (optionally) B as a part of A. I agree the checker function is not part of the executor, but it is also not part of the acl functions in my opinion. If creating a new directory for the checker function is undesirable, my preference is src/backend/utils/adt/security.c and src/include/utils/security.h . > #4: As mentioned previously, the hook (which should be added in a > separate patch anyway) makes more sense to me to be in > ExecCheckRTPerms(), not ExecCheckRTEPerms(). This also means that we > need to be calling ExecCheckRTPerms() from DoCopy and > RI_Initial_Check(), to make sure that the hook gets called. To that > end, I wouldn't even expose ExecCheckRTEPerms() outside of acl.c. Also, > there should be a big comment about not using or calling > ExecCheckRTEPerms() directly outside of ExecCheckRTPerms() since the > hook would then be skipped. I don't have any difference in preference between ExecCheckRTPerms() and ExecCheckRTEPerms(), except that DoCopy() and RI_Initial_Check() have to call the checker function with list_make1(&rte), instead of &rte. > #5: In DoCopy, you can remove relPerms and remainingPerms, but I'd > probably leave required_access up near the top and then just use it to > set rte->required_access directly rather than moving that bit deep down > into the function. OK, > #6: I haven't checked yet, but if there are other things in an RTE which > would make sense in the DoCopy case, beyond just what's needed for the > permissions checking, and which wouldn't be 'correct' with a NULL'd > value, I would set those. Yes, we're building the RTE to check > permissions, but we don't want someone downstream to be surprised when > they make a change to something in the permissions checking and discover > that a value in RTE they expected to be there wasn't valid. Even more > so, if there are function helpers which can be used to build an RTE, we > should be using them. 
The same goes for RI_Initial_Check(). Are you saying something like makeFuncExpr()? I basically agree. However, should it be done in this patch? > #7: I'd move the conditional if (is_from) into the foreach which is > building the columnsSet and eliminate the need for columnsSet; I don't > see that it's really adding much here. OK, > #8: When moving ExecCheckRTPerms(), you should rename it to be more like > the other function calls in acl.h Perhaps pg_rangetbl_aclcheck()? > Also, it should return an actual AclResult instead of just true/false. See the comments in #3. And, if the caller has to handle aclcheck_error(), the user cannot distinguish access violation errors raised by the default PG permissions from those raised by any other external security stuff, can they? Thanks, -- KaiGai Kohei
Re: [HACKERS] tsvector pg_stats seems quite a bit off.
On 19/05/10 21:01, Jesper Krogh wrote: > The document base is around 350,000 documents and > I have set the statistics target on the tsvector column > to 1000 since the 100 seems way off. So for tsvectors the statistics target means more or less "at any time track at most 10 * statistics_target lexemes simultaneously" where "track" means keeping them in memory while going through the tuples being analysed. Remember that the measure is in lexemes, not whole tsvectors, and the factor of 10 is meant to approximate the average number of unique lexemes in a tsvector. If your documents are very large, this might not be a good approximation. > # ANALYZE verbose reference (document_tsvector); > INFO: analyzing "reference" > INFO: "reference": scanned 14486 of 14486 pages, containing 350174 live > rows and 6027 dead rows; 30 rows in sample, 350174 estimated total rows > ANALYZE > > Ok, so analyze almost examined all rows. Looking into > "most_common_freqs" I find > # select count(unnest) from (select unnest(most_common_freqs) from > pg_stats where attname = 'document_tsvector') as foo; > count > --- > 2810 > (1 row) So the size of the most_common_freqs and most_common_vals rows in pg_statistics for tsvectors has an upper bound of statistics_target * 10 (for the same reasons as mentioned before) and holds lexemes (not whole tsvectors). What happens also is that lexemes that were seen only once while going through the analysed set are discarded, so that's why you can actually get fewer entries in these arrays, even if your document set is big. > But the distribution is very "flat" at the end, the last 128 values are > exactly > 1.00189e-05 > which means that any term sitting outside the array would get an > estimate of > 1.00189e-05 * 350174 / 2 = 1.75 ~ 2 rows Yeah, this might mean that you could try cranking up the stats target a lot, to make the set of simultaneously tracked lexemes larger (it will cost time and memory during analyse though). 
If the documents have completely different contents, what can happen is that almost all lexemes are only seen a few times and get removed during the pruning of the working set. I have seen similar behaviour while working on the typanalyze function for tsvectors. > So far I have no idea if this is bad or good, so a couple of sample runs > of stuff that > is sitting outside the "most_common_vals" array: > > [gathered statistics suck] > So the "most_common_vals" seems to contain a lot of values that should > never have been kept in favor > of other values that are more common. > In practice, just cranking the statistics estimate up high enough seems > to solve the problem, but doesn't > there seem to be something wrong in how the statistics are collected? The algorithm to determine the most common vals does not do it accurately. That would require keeping all lexemes from the analysed tsvectors in memory, which would be impractical. If you want to learn more about the algorithm being used, try reading http://www.vldb.org/conf/2002/S10P03.pdf and the corresponding comments in ts_typanalyze.c It would be interesting to know what's the average size of a tsvector in your document set (ie. how many unique lexemes does a tsvector have on average). In general, the tsvector typanalyze function is designed to suck less than the constant factor that has been used previously, but it only works really well on the most common lexemes (thus preventing most gross misestimates). I'm not very surprised it misses the difference between 1612/350174 and 4/350174 and I'm quite happy that it gets that if you set the stats target really high :o) There's always the possibility that there's some stupid bug there, but I think you just set your expectations too high for the tsvector typanalyze function. If you could come up with a better way of doing tsvector stats, that would be awesome - currently it's just doing its best to prevent the most outrageous errors. 
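The pruning behaviour described above is the Lossy Counting algorithm from the cited VLDB 2002 paper: each tracked item carries a count plus an error bound, and at every bucket boundary items whose count plus error no longer exceeds the bucket number are discarded — which is exactly why lexemes seen only once between prunes vanish. A condensed sketch (not the actual ts_typanalyze.c code):

```python
def lossy_count(stream, bucket_width):
    """Lossy Counting sketch: track (count, error) per item; at each bucket
    boundary, prune items whose count + error <= current bucket number."""
    counts, bucket = {}, 1
    for n, item in enumerate(stream, start=1):
        if item in counts:
            c, e = counts[item]
            counts[item] = (c + 1, e)
        else:
            # a newly tracked item may have been pruned before, so its
            # true count could be up to (bucket - 1) higher than recorded
            counts[item] = (1, bucket - 1)
        if n % bucket_width == 0:
            counts = {k: (c, e) for k, (c, e) in counts.items() if c + e > bucket}
            bucket += 1
    return counts

# a genuinely common lexeme survives pruning; one-off lexemes are dropped
stream = ["common", "rare1", "common", "rare2",
          "common", "rare3", "common", "rare4"]
result = lossy_count(stream, bucket_width=4)
print(sorted(result))  # only 'common' remains tracked
```

The bucket width is derived from the allowed error, so raising the statistics target effectively widens the working set and lets rarer lexemes survive — matching the observation that a very high target fixes the estimates.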
Cheers, Jan
Re: [spf:guess] Re: [HACKERS] ROLLBACK TO SAVEPOINT
Florian Pflug wrote: > On May 25, 2010, at 12:18 , Heikki Linnakangas wrote: > >> On 25/05/10 13:03, Florian Pflug wrote: >> >>> On May 25, 2010, at 6:08 , Sam Vilain wrote: >>> http://www.postgresql.org/docs/8.4/static/sql-savepoint.html Lead us to believe that if you roll back to the same savepoint name twice in a row, that you might start walking back through the savepoints. I guess I missed the note on ROLLBACK TO SAVEPOINT that that is not how it works. Here is the section: SQL requires a savepoint to be destroyed automatically when another savepoint with the same name is established. In PostgreSQL, the old savepoint is kept, though only the more recent one will be used when rolling back or releasing. (Releasing the newer savepoint will cause the older one to again become accessible to ROLLBACK TO SAVEPOINT and RELEASE SAVEPOINT.) Otherwise, SAVEPOINT is fully SQL conforming. >>> I'm confused. The sentence in brackets "Releasing the newer savepoint will >>> cause the older one to again become accessible to ROLLBACK TO SAVEPOINT and >>> RELEASE SAVEPOINT" implies that you *will* walk backwards through all the >>> savepoints named "a" if you repeatedly issue "ROLLBACK TO SAVEPOINT a", no? >>> If that is not how it actually works, then this whole paragraph is wrong, >>> I'd say. >>> >> Releasing the newer savepoint will cause the older one to again become >> accessible, as the doc says, but rolling back to a savepoint does not >> implicitly release it. You'll have to use RELEASE SAVEPOINT for that. >> > > Ah, now I get it. Thanks. > > Would changing "Releasing the newer savepoint will cause ... " to "Explicitly > releasing the newer savepoint" or maybe even "Explicitly releasing the newer > savepoint with RELEASE SAVEPOINT will cause ..." make things clearer? 
> Yes, probably - your misreading matches my misreading of it :-) There is another way you can get there - releasing a savepoint established before the re-used savepoint name will also release the savepoints after it. ie savepoint foo; savepoint bar; savepoint foo; release savepoint bar; release savepoint foo; After the first release, the second 'foo' savepoint is gone. I think this is a key advantage in saving the old savepoints. Cheers, Sam
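The keep-the-old-savepoint behaviour being discussed, including Sam's foo/bar/foo example, can be modelled as a simple stack: ROLLBACK TO keeps the matched savepoint and discards only what came after it, while RELEASE discards the matched savepoint too, taking any later savepoints with it. A sketch of the semantics (not real server code):

```python
class SavepointStack:
    def __init__(self):
        self.stack = []

    def savepoint(self, name):
        self.stack.append(name)  # duplicate names allowed; newest shadows older

    def _newest(self, name):
        # ROLLBACK TO / RELEASE always address the most recent match
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i] == name:
                return i
        raise KeyError(name)

    def rollback_to(self, name):
        # the savepoint itself survives: only entries above it are destroyed
        self.stack = self.stack[: self._newest(name) + 1]

    def release(self, name):
        # destroys the savepoint and everything established after it
        self.stack = self.stack[: self._newest(name)]

s = SavepointStack()
for name in ["foo", "bar", "foo"]:
    s.savepoint(name)
s.release("bar")    # takes the second "foo" with it
print(s.stack)      # ['foo'] -- the first "foo" is accessible again
```

This also shows why repeated ROLLBACK TO SAVEPOINT a does not walk backwards through older savepoints named "a": rollback_to never pops the matched entry.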
Re: [HACKERS] Exposing the Xact commit order to the user
On 5/25/2010 4:50 PM, Simon Riggs wrote: On Tue, 2010-05-25 at 16:41 -0400, Jan Wieck wrote: On 5/25/2010 12:03 PM, Simon Riggs wrote: > On Sun, 2010-05-23 at 16:21 -0400, Jan Wieck wrote: > >> In some systems (data warehousing, replication), the order of commits is >> important, since that is the order in which changes have become visible. >> This information could theoretically be extracted from the WAL, but >> scanning the entire WAL just to extract this tidbit of information would >> be excruciatingly painful. > > I think it would be quite simple to read WAL. WALSender reads the WAL > file after it's been flushed, so it would be simple for it to read a blob > of WAL and then extract the commit order from it. > > Overall though, it would be easier and more efficient to *add* info to > WAL and then do all this processing *after* WAL has been transported > elsewhere. Extracting info with DDL triggers, normal triggers, commit > order and everything else seems like too much work to me. Every other > RDBMS has moved away from trigger-based replication and we should give > that serious consideration also. Reading the entire WAL just to find all COMMIT records, then go back to the origin database to get the actual replication log you're looking for is simpler and more efficient? I don't think so. Agreed, but I think I've not explained myself well enough. I proposed two completely separate ideas; the first one was this: If you must get commit order, get it from WAL on *origin*, using exact same code that current WALSender provides, plus some logic to read through the WAL records and extract commit/aborts. That seems much simpler than the proposal you outlined and as SR shows, its low latency as well since commits write to WAL. No need to generate event ticks either, just use XLogRecPtrs as WALSender already does. I see no problem with integrating that into core, technically or philosophically. 
Which means that if I want to allow a consumer of that commit order data to go offline for three days or so to replicate the 5 requested, low volume tables, the origin needs to hang on to the entire WAL log from all 100 other high volume tables? Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
> Correct. The problem actually is aborted transactions. Just because an > XID is really old doesn't mean it was committed. Yes, that's the main issue with my idea; XIDs which fell off the CLOG would become visible even if they'd aborted. Do we get a bit in the visibility map for a page which has aborted transaction rows on it? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Confused about the buffer pool size
MMK, > But it does not tell me anything about what the actual buffer size is. > How do I know what the real buffer size is? I am using 8.4.4 and I am > running only one query at a time. Please move this discussion to the pgsql-general or pgsql-performance lists. pgsql-hackers is for working on PostgreSQL code, and further questions on this list will probably not be answered. Other than that, I have no idea what you mean by "buffer size", nor why you need to know it. I'd suggest starting your post on the other mailing list by explaining what specific problem you're trying to solve. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
On 5/24/2010 9:30 AM, Heikki Linnakangas wrote: On 22/05/10 16:35, Tom Lane wrote: Josh Berkus writes: From a discussion at dinner at pgcon, I wanted to send this to the list for people to poke holes in it: Somebody (I think Joe or Heikki) poked a big hole in this last night at the Royal Oak. Me. Although the scheme would get rid of the need to replace old XIDs with FrozenXid, it does not get rid of the need to set hint bits before you can truncate CLOG. Hmm, we don't rely on setting hint bits to truncate CLOG anymore (http://archives.postgresql.org/pgsql-committers/2006-11/msg00026.php). It's the replacement of xids with FrozenXid that matters, the hint bits are really just hints. Doesn't change the conclusion, though: you still need to replace XIDs with FrozenXids to truncate the clog. Conceivably we could keep around more than 2^32 transactions in clog with this scheme, but then you need a lot more space for the clog. But perhaps it would be better to do that than to launch anti-wraparound vacuums, or to refuse more updates in the extreme cases. Correct. The problem actually is aborted transactions. Just because an XID is really old doesn't mean it was committed. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
Re: [HACKERS] Exposing the Xact commit order to the user
On Tue, 2010-05-25 at 16:41 -0400, Jan Wieck wrote: > On 5/25/2010 12:03 PM, Simon Riggs wrote: > > On Sun, 2010-05-23 at 16:21 -0400, Jan Wieck wrote: > > > >> In some systems (data warehousing, replication), the order of commits is > >> important, since that is the order in which changes have become visible. > >> This information could theoretically be extracted from the WAL, but > >> scanning the entire WAL just to extract this tidbit of information would > >> be excruciatingly painful. > > > > I think it would be quite simple to read WAL. WALSender reads the WAL > > file after it's been flushed, so it would be simple for it to read a blob > > of WAL and then extract the commit order from it. > > > > Overall though, it would be easier and more efficient to *add* info to > > WAL and then do all this processing *after* WAL has been transported > > elsewhere. Extracting info with DDL triggers, normal triggers, commit > > order and everything else seems like too much work to me. Every other > > RDBMS has moved away from trigger-based replication and we should give > > that serious consideration also. > > Reading the entire WAL just to find all COMMIT records, then go back to > the origin database to get the actual replication log you're looking for > is simpler and more efficient? I don't think so. Agreed, but I think I've not explained myself well enough. I proposed two completely separate ideas; the first one was this: If you must get commit order, get it from WAL on *origin*, using exact same code that current WALSender provides, plus some logic to read through the WAL records and extract commit/aborts. That seems much simpler than the proposal you outlined and as SR shows, its low latency as well since commits write to WAL. No need to generate event ticks either, just use XLogRecPtrs as WALSender already does. I see no problem with integrating that into core, technically or philosophically. 
-- Simon Riggs www.2ndQuadrant.com
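Simon's first idea — walk the WAL stream on the origin and pull out just the transaction-end records — amounts to a filter over an ordered record stream; WAL order of the commit records *is* the commit order, and the XLogRecPtr (LSN) serves as the tick. A toy model (the dict record layout is entirely hypothetical, not the real XLOG format):

```python
def commit_order(wal_stream):
    """Yield (lsn, xid, action) for transaction-end records, in WAL order --
    which is exactly the commit order, since commit writes to WAL."""
    return [(r["lsn"], r["xid"], r["type"])
            for r in wal_stream
            if r["type"] in ("COMMIT", "ABORT")]

wal = [
    {"lsn": 10, "xid": 101, "type": "HEAP_INSERT"},
    {"lsn": 20, "xid": 102, "type": "COMMIT"},
    {"lsn": 30, "xid": 101, "type": "COMMIT"},
]
print(commit_order(wal))  # xid 102 committed before 101, despite starting later
```

Jan's objection is about retention, not extraction: a consumer that goes offline forces the origin to keep the whole WAL stream around, even for tables the consumer never asked about.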
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
Alvaro, >> This sounds like extending Xid to 64 bits, without having to store the >> high bits everywhere. Was this discussed in the PGCon devs meeting? Essentially, yes. One of the main objections to raising XID to 64-bit has been the per-row overhead. But adding 4 bytes per page wouldn't be much of an impact. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
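The "high bits per page" idea sketched here: store one 4-byte XID epoch per heap page, and reconstruct a logically 64-bit XID from the page epoch plus the 32-bit XID already present in each tuple header. With full-width XIDs, comparison becomes plain integer order, so the circular modulo-2^32 arithmetic (and the wraparound it forces) goes away. Roughly (names hypothetical):

```python
XID32 = 1 << 32  # the 32-bit XID space wraps at this boundary

def full_xid(page_epoch, xid32):
    """Reconstruct a 64-bit XID: 4-byte epoch stored once per page,
    4-byte XID stored per tuple as today."""
    return (page_epoch << 32) | xid32

def precedes(a, b):
    # with full 64-bit XIDs this is plain integer comparison -- no circular
    # modulo-2^32 logic, hence no need to freeze tuples to dodge wraparound
    return a < b

old = full_xid(page_epoch=1, xid32=XID32 - 5)   # assigned late in epoch 1
new = full_xid(page_epoch=2, xid32=3)           # assigned early in epoch 2
print(precedes(old, new))  # ordering survives the 32-bit wrap
```

As the thread notes, this removes the FrozenXid replacement problem but not the CLOG-truncation one: a very old 64-bit XID still needs its commit/abort status resolvable somehow.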
Re: [HACKERS] Exposing the Xact commit order to the user
On 5/25/2010 12:03 PM, Simon Riggs wrote: On Sun, 2010-05-23 at 16:21 -0400, Jan Wieck wrote: In some systems (data warehousing, replication), the order of commits is important, since that is the order in which changes have become visible. This information could theoretically be extracted from the WAL, but scanning the entire WAL just to extract this tidbit of information would be excruciatingly painful. I think it would be quite simple to read WAL. WALSender reads the WAL file after it's been flushed, so it would be simple for it to read a blob of WAL and then extract the commit order from it. Overall though, it would be easier and more efficient to *add* info to WAL and then do all this processing *after* WAL has been transported elsewhere. Extracting info with DDL triggers, normal triggers, commit order and everything else seems like too much work to me. Every other RDBMS has moved away from trigger-based replication and we should give that serious consideration also. Reading the entire WAL just to find all COMMIT records, then go back to the origin database to get the actual replication log you're looking for is simpler and more efficient? I don't think so. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
Re: [HACKERS] Exposing the Xact commit order to the user
Robert Haas wrote: > maybe we should get serializable working and committed on one > node first and then worry about how to distribute it. I think > there might be other approaches to this problem Well, I've got two or three other ideas on how we can manage this for HS, but since I now realize that I've totally misunderstood the main use case for this (which is to support trigger-based replication), I'd like to be clear on something before letting it drop. The big question is, do such replicas need to support serializable access to the data modified by serializable transactions in the source database? That is, is there a need for such replicas to only see states which are possible in some serial order of execution of serializable transactions on the source database? Or to phrase the same question a third way, should there be a way to run queries on such replicas with confidence that what is viewed is consistent with user-defined constraints and business rules? If not, there's no intersection between this feature and SSI. If there is, I think we should think through at least a general strategy sooner, rather than later. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 21:19 +0200, Yeb Havinga wrote: > Simon Riggs wrote: > > How we handle degraded mode is important, yes. Whatever parameters we > > choose the problem will remain the same. > > > > Should we just ignore degraded mode and respond as if nothing bad had > > happened? Most people would say not. > > > > If we specify server1 = synch and server2 = async we then also need to > > specify what happens if server1 is down. People might often specify > > if (server1 == down) server2 = synch. > > > I have a hard time imagining including async servers in the quorum. If > an async servers vote is necessary to reach quorum due to a 'real' sync > standby server failure, it would mean that the async-intended standby is > now also in sync with the master transactions. IMHO this is a bad > situation, since instead of the DBA getting the error: "not enough sync > standbys to reach quorum", he'll now get "database is slow" complaints, > only to find out later that too much sync standby servers went south. > (under the assumption that async servers are mostly on too slow links to > consider for sync standby). Yeh, there's difficulty either way. We don't need to think of servers as being "synch" or "async", more likely we would rate them in terms of typical synchronisation delay. So yeh, calling them "fast" and "slow" in terms of synchronisation delay makes sense. Some people with low xact rate and high need for protection might want to switch across to the slow server and keep running. If not, the max_synch_delay would trip and you would then select synch_failure_action = rollback. The realistic response is to add a second "fast" sync server, to allow you to stay up even when you lose one of the fast servers. That now gives you 4 servers and the failure modes start to get real complex. Specifying rules to achieve what you're after would be much harder. 
Some people might want that, but most people won't in the general case and if they did specify them they'd likely get them wrong. All of these issues show why I want to specify the synchronisation mode as a USERSET. That will allow us to specify more easily which parts of our application are important when the cluster is degraded and which data is so critical it must reach multiple servers. -- Simon Riggs www.2ndQuadrant.com
Re: [HACKERS] Exposing the Xact commit order to the user
Jan Wieck writes: >> No, I meant how will the *function* know, if a superuser and/or some >> background process can purge records at any time? > The data contains timestamps which are supposedly taken in commit order. You can *not* rely on the commit timestamps to be in exact order. (Perhaps approximate ordering is good enough for what you want here, but just be careful to not fall into the trap of assuming that they're exactly ordered.) regards, tom lane
Re: [HACKERS] Fwd: Hiding data in postgresql
On Tue, May 25, 2010 at 3:39 PM, Hector Beyers wrote: > > Hi guys, > (I tried the question in another forum first) > Does someone have any ideas how I can hide data without the meta data > noticing? To explain further, I would like to save some collection of data > where the meta-data does not see it. I am trying to do some security through > obscurity. It is for research purposes. > For example, populate a table with 1000 rows, but the meta-data only knows > of about 500 of them? Only on an export of a dump can you find all the data > again. Or maybe to make a hidden duplicate schema that can point to the > hidden data? > Does someone have any good ideas on how to achieve this or something > similar? > Kind regards > Hector > > > On Mon, May 24, 2010 at 9:16 PM, Hector Beyers wrote: >> >> Hi guys, >> does ANYONE have any tips on hiding data on a database server? This means >> that data is stored in places that is not necessarily picked up in the >> schema of the database. I am doing some research on databases and need some >> direction. >> Any help or direction will be highly appreciated. >> Kind regards >> Hector Not sure if this is helpful, but be sure to know about views, which can be used to filter out rows of a table. 
Example:

CREATE TABLE foo (name TEXT, visible BOOL);
INSERT INTO foo VALUES ('two', true);
INSERT INTO foo VALUES ('three', true);
INSERT INTO foo VALUES ('four', false);
INSERT INTO foo VALUES ('five', true);
INSERT INTO foo VALUES ('six', false);
INSERT INTO foo VALUES ('seven', true);
INSERT INTO foo VALUES ('eight', false);
INSERT INTO foo VALUES ('nine', false);
INSERT INTO foo VALUES ('ten', false);
INSERT INTO foo VALUES ('eleven', true);
CREATE VIEW foo_view AS SELECT foo.name FROM foo WHERE visible=true;

=> SELECT * FROM foo;
  name  | visible
--------+---------
 two    | t
 three  | t
 four   | f
 five   | t
 six    | f
 seven  | t
 eight  | f
 nine   | f
 ten    | f
 eleven | t
(10 rows)

=> SELECT * FROM foo_view;
  name
--------
 two
 three
 five
 seven
 eleven
(5 rows)

Note that views are SELECT-only, but you can use CREATE RULE to simulate an updatable view. You may also want to read about Veil: http://veil.projects.postgresql.org/curdocs/main.html
[HACKERS] mergejoin null handling (was Re: [PERFORM] merge join killing performance)
Scott Marlowe writes: > So, Tom, so you think it's possible that the planner isn't noticing > all those nulls and thinks it'll just take a row or two to get to the > value it needs to join on? I dug through this and have concluded that it's really an oversight in the patch I wrote some years ago in response to this: http://archives.postgresql.org/pgsql-performance/2005-05/msg00219.php That patch taught nodeMergejoin that a row containing a NULL key can't possibly match anything on the other side. However, its response to observing a NULL is just to advance to the next row of that input. What we should do, if the NULL is in the first merge column and the sort order is nulls-high, is realize that every following row in that input must also contain a NULL and so we can just terminate the mergejoin immediately. The original patch works well for cases where there are just a few nulls in one input and the important factor is to not read all the rest of the other input --- but it fails to cover the case where there are many nulls and the important factor is to not read all the rest of the nulls. The problem can be demonstrated if you modify the example given in the above-referenced message so that table t1 contains lots of nulls rather than just a few: explain analyze will show that all of t1 gets read by the mergejoin, and that's not necessary. I'm inclined to think this is a performance bug and should be back-patched, assuming the fix is simple (which I think it is, but haven't coded/tested yet). It'd probably be reasonable to go back to 8.3; before that, sorting nulls high versus nulls low was pretty poorly defined and so there'd be risk of breaking cases that gave the right answers before. Comments? regards, tom lane
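The early-termination idea Tom describes can be sketched outside PostgreSQL. This is a toy merge join in plain Python (not nodeMergejoin's actual logic), with None standing in for SQL NULL and both inputs sorted ascending with NULLs last; the names are illustrative only:

```python
# Sketch of early termination for a merge join whose inputs are sorted
# ascending with NULLs last ("nulls high"). None plays the role of SQL NULL.
def merge_join(left, right):
    """Join two sorted key lists; duplicates handled naively for brevity."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        l, r = left[i], right[j]
        # A NULL key can't match anything, and with nulls sorted high every
        # remaining row on that side is also NULL, so we can stop entirely
        # instead of merely advancing past this row.
        if l is None or r is None:
            break
        if l < r:
            i += 1
        elif l > r:
            j += 1
        else:
            out.append(l)
            i += 1
            j += 1
    return out
```

With many trailing NULLs on one side, the `break` avoids reading them all, which is exactly the case the original patch missed.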
Re: [HACKERS] Exposing the Xact commit order to the user
On 5/24/2010 9:30 AM, Greg Sabino Mullane wrote: In light of the proposed purging scheme, how would it be able to distinguish between those two cases (nothing there yet vs. was there but purged)? There is a difference between an empty result set and an exception. No, I meant how will the *function* know, if a superuser and/or some background process can purge records at any time? The data contains timestamps which are supposedly taken in commit order. Checking the age of the last entry in the file should be simple enough to determine if the segment matches the "max age" configuration (if set). In the case of a superuser telling what to purge he would just call a function with a serial number (telling the obsolete segments). Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
Re: [HACKERS] Exposing the Xact commit order to the user
Florian Pflug wrote: > Hm, but for there to be an actual problem (and not a false > positive), an actual dangerous circle has to exist in the > dependency graph. The existence of a dangerous structure is just a > necessary (but not sufficient) and easily checked-for condition > for that, right? Now, if a read-only transaction only ever has > outgoing edges, it cannot be part of a (dangerous or not) circle, > and hence any dangerous structure it is part of is a false > positive. > > I guess my line of reasoning is flawed somehow, but I cannot > figure out why... Here's why: We're tracking rw-dependencies, where the "time-arrow" showing effective order of execution points from the reader to the writer (since the reader sees a state prior to the write, it effectively executes before it). These are important because there have to be two such dependencies, one in to the pivot and one out from the pivot, for a problem to exist. (See various works by Dr. Alan Fekete, et al, for details.) But other dependencies can imply an order of execution. In particular, a wr-dependency, where a transaction *can* see data committed by another transaction, implies that the *writer* came first in the order of execution. In this example, the transaction which lists the receipts successfully reads the control table update, but is not able to read the receipt insert. This completes the cycle, making it a real anomaly and not a false positive. Note that the wr-dependency can actually exist outside the database, making it pretty much impossible to accurately tell a false positive from a true anomaly when the pivot exists and the transaction writing data which the pivot can't read commits first. For example, let's say that the update to the control table is committed from an application which, seeing that its update came back without error, proceeds to list the receipts for the old date in a subsequent transaction. 
You have a wr-dependency which is, in reality, quite real and solid with no way to notice it within the database engine. That's why the techniques used in SSI are pretty hard to improve upon beyond more detailed and accurate tracking of rw-conflicts. -Kevin
Re: [HACKERS] Exposing the Xact commit order to the user
2010/5/25 Florian Pflug : > Hm, but for there to be an actual problem (and not a false positive), an > actual dangerous circle has to exist in the dependency graph. The > existence of a dangerous structure is just a necessary (but not > sufficient) and easily checked-for condition for that, right? Now, if a > read-only transaction only ever has outgoing edges, it cannot be part > of a (dangerous or not) circle, and hence any dangerous structure it is > part of is a false positive. > > I guess my line of reasoning is flawed somehow, but I cannot figure out why... In the general case, "wr" dependencies also create "must be serialized before" edges. It seems that those edges can be discarded when finding a pivot, but if you want to go "back to basics": ("<" means "must be serialized before".) * T1 < T2, because T1 reads a version of a data element for which T2 later creates a newer version (rw between T1 and T2). * T3 < T1, because T3 reads a version of a data element for which T1 later creates a newer version (rw between T3 and T1). * T2 < T3, because T2 creates a version of a data element, which is then read by T3 (wr between T2 and T3). (As you can see, those 3 edges form a cycle.) Nicolas
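The three "must be serialized before" edges Nicolas lists can be checked mechanically. A toy sketch (illustrative Python, nothing like the backend's actual bookkeeping) that detects a cycle in such a graph by depth-first search:

```python
# Edges mean "must be serialized before". A cycle means no serial order
# of the transactions exists.
def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited / on current path / done
    color = {n: WHITE for n in graph}

    def dfs(n):
        color[n] = GRAY
        for m in graph[n]:
            if color[m] == GRAY:      # back edge: cycle found
                return True
            if color[m] == WHITE and dfs(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)

# The edges from the message: T1 < T2 (rw), T3 < T1 (rw), T2 < T3 (wr).
edges = [("T1", "T2"), ("T3", "T1"), ("T2", "T3")]
# has_cycle(edges) is True; drop any one edge and the cycle disappears.
```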
[HACKERS] Fwd: Hiding data in postgresql
Hi guys, (I tried the question in another forum first) Does someone have any ideas how I can hide data without the meta data noticing? To explain further, I would like to save some collection of data where the meta-data does not see it. I am trying to do some security through obscurity. It is for research purposes. For example, populate a table with 1000 rows, but the meta-data only knows of about 500 of them? Only on an export of a dump can you find all the data again. Or maybe to make a hidden duplicate schema that can point to the hidden data? Does someone have any good ideas on how to achieve this or something similar? Kind regards Hector On Mon, May 24, 2010 at 9:16 PM, Hector Beyers wrote: > > Hi guys, > > does ANYONE have any tips on hiding data on a database server? This means > that data is stored in places that is not necessarily picked up in the > schema of the database. I am doing some research on databases and need some > direction. > > Any help or direction will be highly appreciated. > > Kind regards > > Hector > >
Re: [HACKERS] Exposing the Xact commit order to the user
2010/5/25 Florian Pflug : > On May 25, 2010, at 20:18 , Dan Ports wrote: > >> T3, which is a read-only transaction, sees the incremented date and an >> empty list of receipts. But T1 later commits a new entry in the >> receipts table with the old date. No serializable ordering allows this. >> >> However, if T3 hadn't performed its read, there'd be no problem; we'd >> just serialize T1 before T2 and no one would be the wiser. > > Hm, so in fact SSI sometimes allows the database to be inconsistent, but only > as long as nobody tries to observe it? I would not call this an inconsistent state: it would become inconsistent only after someone (e.g., T3) has observed it _and_ T1 commits. Nicolas
Re: [HACKERS] Exposing the Xact commit order to the user
On May 25, 2010, at 20:48 , Dan Ports wrote: > On Tue, May 25, 2010 at 08:35:44PM +0200, Florian Pflug wrote: >> Hm, so in fact SSI sometimes allows the database to be inconsistent, but >> only as long as nobody tries to observe it? > > Yes. Note that even while it's in an inconsistent state, you can still > perform any query that doesn't observe the inconsistency -- hopefully > most queries fall into this category. Yeah, as long as you just walk by without looking, the database is happy ;-) >> Btw, I still don't get how this follows from the Cahill paper. For a >> transaction to lie on a dangerous circle, it needs incoming and outgoing >> edges in the conflict graph, right? But I'd have though that conflicts are >> always between a reader and a writer or between two writers. So how can a >> read-only transaction have incoming and outgoing edges? > > Right, the read-only transaction can't have incoming edges, but it can > have outgoing edges. So it can't be the "pivot" itself (the transaction > with both outgoing and incoming edges), but it can cause *another* > transaction to be. > > In the example I gave, T3 (the r/o transaction) has an outgoing edge to > T1, because it didn't see T1's concurrent update. T1 already had an > outgoing edge to T2, so adding in this incoming edge from T3 creates > the dangerous structure. Hm, but for there to be an actual problem (and not a false positive), an actual dangerous circle has to exist in the dependency graph. The existence of a dangerous structure is just a necessary (but not sufficient) and easily checked-for condition for that, right? Now, if a read-only transaction only ever has outgoing edges, it cannot be part of a (dangerous or not) circle, and hence any dangerous structure it is part of is a false positive. I guess my line of reasoning is flawed somehow, but I cannot figure out why... 
best regards, Florian Pflug
Re: [HACKERS] Exposing the Xact commit order to the user
2010/5/25 Dan Ports : > On Tue, May 25, 2010 at 02:00:42PM +0200, Nicolas Barbier wrote: > >> I don't understand the problem. According to me, in the context of >> SSI, a read-only slave can just map SERIALIZABLE to the technical >> implementation of REPEATABLE READ (i.e., the currently-existing >> "SERIALIZABLE"). The union of the transactions on the master and the >> slave(s) will still exhibit SERIALIZABLE behavior because the >> transactions on the slave cannot write anything and are therefore >> irrelevant. > > This, unfortunately, isn't true in SSI. > > Consider read-only transactions on a single node SSI database -- the > situation is the same for read-only transactions that run on a slave. > These transactions can be part of anomalies, so they need to be checked > for conflicts and potentially aborted. > > Consider Kevin's favorite example, where one table contains the current > date and the other is a list of receipts (initially empty). > T1 inserts (select current_date) into receipts, but doesn't commit > T2 increments current_date and commits > T3 reads both current_date and the receipt table > T1 commits > > T3, which is a read-only transaction, sees the incremented date and an > empty list of receipts. But T1 later commits a new entry in the > receipts table with the old date. No serializable ordering allows this. > However, if T3 hadn't performed its read, there'd be no problem; we'd > just serialize T1 before T2 and no one would be the wiser. > > SSI would detect a potential conflict here, which we could resolve by > aborting T3. (We could also abort T1, but if this is a replicated > system this isn't always an option -- T3 might be running on the > slave, so only the slave will know about the conflict, and it can't > very well abort an update transaction on the master.) Ah, indeed. I made the same reasoning mistake as Florian (presumably) did: I didn't think of the fact that the read-only transaction doesn't need to be the pivot. 
Nicolas
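The receipts example discussed in this thread can also be verified by brute force: enumerate every serial order of T1, T2 and T3 and observe that none reproduces both what T3 saw and the committed result. A toy sketch (the numeric encoding of "current_date" and the transaction bodies are made up for illustration):

```python
from itertools import permutations

# T1 inserts a receipt for current_date; T2 increments current_date;
# T3 (read-only) reads both. Dates are encoded as 0 (old) and 1 (new).
def run_serial(order):
    date, receipts, t3_view = 0, [], None
    for t in order:
        if t == "T1":                  # insert (select current_date)
            receipts.append(date)
        elif t == "T2":                # increment current_date
            date += 1
        elif t == "T3":                # read date and the receipts list
            t3_view = (date, tuple(receipts))
    return t3_view, receipts

# In the interleaved run, T3 saw the new date and an empty receipts list,
# yet the receipt T1 later committed carries the old date.
observed_view, final_receipts = (1, ()), [0]
consistent = [o for o in permutations(["T1", "T2", "T3"])
              if run_serial(o) == (observed_view, final_receipts)]
# consistent == [] -- no serial ordering explains the run, as Dan said.
```

The near-miss is T2, T3, T1: it matches T3's view, but then the committed receipt would carry the new date, contradicting the final state.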
Re: [HACKERS] Confused about the buffer pool size
Hello Heikki: This is what the documentation says (see below). But it does not tell me anything about what the actual buffer size is. How do I know what the real buffer size is? I am using 8.4.4 and I am running only one query at a time. Cheers, MMK. Sets the planner's assumption about the effective size of the disk cache that is available to a single query. This is factored into estimates of the cost of using an index; a higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be used. When setting this parameter you should consider both PostgreSQL's shared buffers and the portion of the kernel's disk cache that will be used for PostgreSQL data files. Also, take into account the expected number of concurrent queries on different tables, since they will have to share the available space. This parameter has no effect on the size of shared memory allocated by PostgreSQL, nor does it reserve kernel disk cache; it is used only for estimation purposes. The default is 128 megabytes (128MB). --- On Tue, 5/25/10, Heikki Linnakangas wrote: On 25/05/10 19:49, MMK wrote: > Hello All: > In the code (costsize.c), I see that effective_cache_size is set to > DEFAULT_EFFECTIVE_CACHE_SIZE. > This is defined as follows in cost.h > #define DEFAULT_EFFECTIVE_CACHE_SIZE 16384 > But when I say > show shared_buffers in psql I get, > shared_buffers 28MB > In postgresql.conf file, the following lines appear: > shared_buffers = 28MB # min 128kB (change requires restart) > #temp_buffers = 8MB # min 800kB > So I am assuming that the buffer pool size is 28MB = 28 * 128 = 3584 8K pages. > So should effective_cache_size be set to 3584 rather than the 16384? No. 
Please see the manual for what effective_cache_size means: http://www.postgresql.org/docs/8.4/interactive/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
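Incidentally, the numbers in the question are plain unit conversion: both settings are measured internally in pages of BLCKSZ bytes (8 kB by default). A quick sketch of the arithmetic:

```python
# Unit arithmetic behind the numbers in this thread: shared_buffers and
# effective_cache_size are both counted in 8 kB pages internally.
BLCKSZ = 8192                      # default PostgreSQL block size in bytes

def mb_to_pages(mb, blcksz=BLCKSZ):
    return mb * 1024 * 1024 // blcksz

def pages_to_mb(pages, blcksz=BLCKSZ):
    return pages * blcksz // (1024 * 1024)

# shared_buffers = 28MB -> 3584 pages, as computed in the question.
# DEFAULT_EFFECTIVE_CACHE_SIZE = 16384 pages -> 128MB, matching the docs.
```

The two settings measure different things, though: shared_buffers is memory PostgreSQL actually allocates, while effective_cache_size is only a planner estimate of total cache (shared buffers plus kernel cache).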
Re: [HACKERS] Synchronization levels in SR
Hi, Simon Riggs writes: > On Tue, 2010-05-25 at 19:08 +0200, Alastair Turner wrote: >> On Tue, May 25, 2010 at 6:28 PM, Simon Riggs wrote: >> > The best parameter we can specify is the number of servers that we wish >> > to wait for confirmation from. >> >> This may be an incredibly naive question, but what happens to the >> transaction on the master if the number of confirmations is not >> received? > > It's much easier to say you want to wait for N servers to respond, but > don't care which they are. One parameter, simple and flexible. [...] > So whatever we do, we need additional parameters to specify timeouts > (including wait-forever as an option) and action-on-timeout: commit or > rollback. I was preparing an email along the lines that we need each slave to declare its desired minimum level of synchronicity, and have the master filter that with what the transaction wants. Scratch that. Thinking about it some more, I see that Simon's proposal is both simpler and more effective: we already have Hot Standby and admin functions that tell us the last replayed LSN. The bigger LSN wins, so in case of failover we know which slave to choose. The only use case I can see for what I had in mind is to allow the user to choose which server is trusted to have accurate data or better read-only performance. But if the link is slow, the code will soon enough notice, mind you. I'm still not sure about my preference here, but I can see why Simon's proposal is simpler and addresses all concerns, apart from forcing the servers into a non-optimal setup for a gain that is hard to see. Regards, -- dim
Re: [HACKERS] Synchronization levels in SR
Simon Riggs wrote: How we handle degraded mode is important, yes. Whatever parameters we choose the problem will remain the same. Should we just ignore degraded mode and respond as if nothing bad had happened? Most people would say not. If we specify server1 = synch and server2 = async we then also need to specify what happens if server1 is down. People might often specify if (server1 == down) server2 = synch. I have a hard time imagining including async servers in the quorum. If an async servers vote is necessary to reach quorum due to a 'real' sync standby server failure, it would mean that the async-intended standby is now also in sync with the master transactions. IMHO this is a bad situation, since instead of the DBA getting the error: "not enough sync standbys to reach quorum", he'll now get "database is slow" complaints, only to find out later that too much sync standby servers went south. (under the assumption that async servers are mostly on too slow links to consider for sync standby). regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
Jan Wieck wrote: > Have you ever looked at one of those queries, that Londiste or > Slony issue against the provider DB in order to get all the log > data that has been committed between two snapshots? Is that really > the best you can think of? No, I admit I haven't. In fact, I was thinking primarily in terms of log-driven situations, like HS. What would be the best place for me to look to come up to speed on your use case? (I'm relatively sure that the issue isn't that there's no information to find, but that a sequential pass over all available information would take a *long* time.) I've been working through the issues on WAL-based replicas, and have some additional ideas and alternatives, but I'd like to see the "big picture", including trigger-based replication, before posting. -Kevin
Re: [HACKERS] Exposing the Xact commit order to the user
On Tue, May 25, 2010 at 08:35:44PM +0200, Florian Pflug wrote: > Hm, so in fact SSI sometimes allows the database to be inconsistent, but only > as long as nobody tries to observe it? Yes. Note that even while it's in an inconsistent state, you can still perform any query that doesn't observe the inconsistency -- hopefully most queries fall into this category. > Btw, I still don't get how this follows from the Cahill paper. For a > transaction to lie on a dangerous circle, it needs incoming and outgoing > edges in the conflict graph, right? But I'd have though that conflicts are > always between a reader and a writer or between two writers. So how can a > read-only transaction have incoming and outgoing edges? Right, the read-only transaction can't have incoming edges, but it can have outgoing edges. So it can't be the "pivot" itself (the transaction with both outgoing and incoming edges), but it can cause *another* transaction to be. In the example I gave, T3 (the r/o transaction) has an outgoing edge to T1, because it didn't see T1's concurrent update. T1 already had an outgoing edge to T2, so adding in this incoming edge from T3 creates the dangerous structure. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/
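Dan's point can be made concrete with a toy check for a pivot, i.e. a transaction with both an incoming and an outgoing rw-conflict edge, where edges run from reader to writer. This is illustrative Python only, nothing like the real SSI bookkeeping:

```python
# A "pivot" has both incoming and outgoing rw-conflict edges.
# Edges are (reader, writer) pairs: reader -> writer.
def find_pivots(rw_edges):
    incoming, outgoing = set(), set()
    for reader, writer in rw_edges:
        outgoing.add(reader)
        incoming.add(writer)
    return incoming & outgoing

# T1 already had an outgoing edge to T2; the read-only T3 adds an edge
# into T1, making T1 the pivot. T3 itself can never be one: it writes
# nothing, so no edge can point into it.
edges = [("T1", "T2"), ("T3", "T1")]
# find_pivots(edges) == {"T1"}
```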
Re: [HACKERS] Exposing the Xact commit order to the user
Florian Pflug wrote: > Hm, so in fact SSI sometimes allows the database to be > inconsistent, but only as long as nobody tries to observe it? Not exactly. The eventually-persisted state is always consistent, but there can be a transitory committed state which would violate user-defined constraints or business rules *if viewed*. This is what I've been on about -- the commit sequence is not necessarily the same as the apparent order of execution. A read-only transaction, if run before the overlapping commits "settle", can view a state which is not consistent with any serial order of execution, and might therefore break the rules. SSI detects that and rolls one of the transactions back if they're all running at serializable transaction isolation in a single SSI database, but the question is how to handle this when the read happens in a replica. > Btw, I still don't get how this follows from the Cahill paper. For > a transaction to lie on a dangerous circle, it needs incoming and > outgoing edges in the conflict graph, right? At least one of the transactions participating in the cycle does. There's no requirement that they all do. -Kevin
Re: [HACKERS] tsvector pg_stats seems quite a bit off.
Excerpts from Jesper Krogh's message of mié may 19 15:01:18 -0400 2010: > But the distribution is very "flat" at the end, the last 128 values are > exactly > 1.00189e-05 > which means that any term sitting outside the array would get an estimate of > 1.00189e-05 * 350174 / 2 = 1.75 ~ 2 rows I don't know if this is related, but tsvector stats are computed and stored per term, not per datum. This is different from all other datatypes. Maybe there's code somewhere that's assuming per-datum and coming up with the wrong estimates? Or maybe the tsvector-specific code contains a bug somewhere; maybe a rounding error? -- Álvaro Herrera
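As a sanity check on the quoted arithmetic (numbers taken from the message; the rule shown -- half the minimum stored frequency for a term outside the array -- is the estimate Jesper describes):

```python
# A term falling outside the most-common-elements array is estimated at
# half the smallest frequency stored in the array.
min_freq = 1.00189e-05        # the flat frequency at the tail of the array
n_rows = 350174               # table row count used in the message
estimate = min_freq * n_rows / 2
# estimate is about 1.75, which shows up as the ~2-row estimate quoted above
```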
Re: [HACKERS] Exposing the Xact commit order to the user
On May 25, 2010, at 20:18 , Dan Ports wrote: > On Tue, May 25, 2010 at 02:00:42PM +0200, Nicolas Barbier wrote: >> I don't understand the problem. According to me, in the context of >> SSI, a read-only slave can just map SERIALIZABLE to the technical >> implementation of REPEATABLE READ (i.e., the currently-existing >> "SERIALIZABLE"). The union of the transactions on the master and the >> slave(s) will still exhibit SERIALIZABLE behavior because the >> transactions on the slave cannot write anything and are therefore >> irrelevant. > > This, unfortunately, isn't true in SSI. > > Consider read-only transactions on a single node SSI database -- the > situation is the same for read-only transactions that run on a slave. > These transactions can be part of anomalies, so they need to be checked > for conflicts and potentially aborted. > > Consider Kevin's favorite example, where one table contains the current > date and the other is a list of receipts (initially empty). > T1 inserts (select current_date) into receipts, but doesn't commit > T2 increments current_date and commits > T3 reads both current_date and the receipt table > T1 commits > > T3, which is a read-only transaction, sees the incremented date and an > empty list of receipts. But T1 later commits a new entry in the > receipts table with the old date. No serializable ordering allows this. > > However, if T3 hadn't performed its read, there'd be no problem; we'd > just serialize T1 before T2 and no one would be the wiser. Hm, so in fact SSI sometimes allows the database to be inconsistent, but only as long as nobody tries to observe it? Btw, I still don't get how this follows from the Cahill paper. For a transaction to lie on a dangerous circle, it needs incoming and outgoing edges in the conflict graph, right? But I'd have though that conflicts are always between a reader and a writer or between two writers. So how can a read-only transaction have incoming and outgoing edges? 
best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing the Xact commit order to the user
On Tue, May 25, 2010 at 02:00:42PM +0200, Nicolas Barbier wrote: > I don't understand the problem. According to me, in the context of > SSI, a read-only slave can just map SERIALIZABLE to the technical > implementation of REPEATABLE READ (i.e., the currently-existing > "SERIALIZABLE"). The union of the transactions on the master and the > slave(s) will still exhibit SERIALIZABLE behavior because the > transactions on the slave cannot write anything and are therefore > irrelevant. This, unfortunately, isn't true in SSI. Consider read-only transactions on a single node SSI database -- the situation is the same for read-only transactions that run on a slave. These transactions can be part of anomalies, so they need to be checked for conflicts and potentially aborted. Consider Kevin's favorite example, where one table contains the current date and the other is a list of receipts (initially empty). T1 inserts (select current_date) into receipts, but doesn't commit T2 increments current_date and commits T3 reads both current_date and the receipt table T1 commits T3, which is a read-only transaction, sees the incremented date and an empty list of receipts. But T1 later commits a new entry in the receipts table with the old date. No serializable ordering allows this. However, if T3 hadn't performed its read, there'd be no problem; we'd just serialize T1 before T2 and no one would be the wiser. SSI would detect a potential conflict here, which we could resolve by aborting T3. (We could also abort T1, but if this is a replicated system this isn't always an option -- T3 might be running on the slave, so only the slave will know about the conflict, and it can't very well abort an update transaction on the master.) There's another example of a read-only transaction anomaly that could cause similar problems at http://portal.acm.org/citation.cfm?doid=1031570.1031573, but I think this one is easier to follow. Dan -- Dan R. K. 
Ports MIT CSAIL http://drkp.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
Alvaro Herrera writes: > This sounds like extending Xid to 64 bits, without having to store the > high bits everywhere. Was this discussed in the PGCon devs meeting? Yeah, that's what it would amount to. It was not discussed at the dev meeting --- it was an idea that came up one evening at PGCon. I'm not sure whether this would imply having to widen xid to 64 bits internally. That could be a bit unpleasant as far as CPU and shared memory space go, although every year that goes by makes 32-bit machines less interesting as DB servers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
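A sketch of what widening to 64 bits buys, assuming the usual modulo-2^32 XID ordering (the difference interpreted as a signed 32-bit integer, as in TransactionIdPrecedes, and ignoring the special XIDs below FirstNormalTransactionId):

```python
# 32-bit circular comparison: the 32-bit difference is read as signed.
def xid_precedes(a, b):
    diff = (a - b) & 0xFFFFFFFF
    return diff >= 0x80000000   # signed (a - b) < 0

# Normal case: the lower xid is older.
print(xid_precedes(100, 200))         # True

# Circular case: an xid just below wraparound counts as older than a
# small one -- this is why anti-wraparound freezing exists at all.
print(xid_precedes(0xFFFFFFF0, 100))  # True

# With an epoch carried in the high bits, plain integer comparison
# suffices and wraparound never occurs in practice.
def full_xid(epoch, xid):
    return (epoch << 32) | xid

print(full_xid(1, 100) > full_xid(0, 0xFFFFFFF0))  # True
```

The idea discussed above is to get the second behavior without physically storing the high 32 bits in every tuple header, which is where the CPU and shared-memory cost question comes in.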
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Robert Haas writes: > On Tue, May 25, 2010 at 1:09 PM, Mike Fowler wrote: >>> We're unlikely to accept this patch if it changes the minimum version >>> of libxml2 required to compile PostgreSQL >> >> Why? 2.6.27 is almost 4 years old. > Because we work hard to minimize our dependencies and make them as > non-onerous as possible. > At a minimum, I think it's fair to say that the burden is on you to > justify what it's worth bumping the version number. Yes. Increasing the minimum required version of some library is a Big Deal, we don't do it on a whim. And we definitely don't do it just because it's old. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Robert Haas wrote: On Tue, May 25, 2010 at 1:09 PM, Mike Fowler wrote: We're unlikely to accept this patch if it changes the minimum version of libxml2 required to compile PostgreSQL Why? 2.6.27 is almost 4 years old. Because we work hard to minimize our dependencies and make them as non-onerous as possible. At a minimum, I think it's fair to say that the burden is on you to justify what it's worth bumping the version number. If there is some major speed or performance advantage to using the newer API, maybe we'll consider it. But if it's just a few extra lines of code to work around it, then it's better to write those extra lines of code rather than potentially force users to upgrade packages they're otherwise happy with. The real issue is what's going to be available on most of the platforms we build on. Unfortunately, 2.6.26 is what's on my CentOS 5.4 boxes, for example. I'm sure we don't want to make 9.1 not buildable with the installed libraries on still fairly current RedHat-derived platforms. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, May 25, 2010 at 6:28 PM, Simon Riggs wrote: ... > > The best parameter we can specify is the number of servers that we wish > to wait for confirmation from. That is a definition that easily manages > the complexity of having various servers up/down at any one time. It > also survives misconfiguration more easily, as well as providing a > workaround if replicating across a bursty network where we can't > guarantee response times, even if the typical response time is good. > This may be an incredibly naive question, but what happens to the transaction on the master if the number of confirmations is not received? Is this intended to create a situation where the master effectively becomes unavailable for write operations when its synchronous slaves are unavailable? Alastair "Bell" Turner ^F5 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 13:31 -0400, Robert Haas wrote: > So I agree that we need to talk about whether or not we want to do > this. I'll give my opinion. I am not sure how useful this really is. > Consider a master with two standbys. The master commits a > transaction and waits for one of the two standbys, then acknowledges > the commit back to the user. Then the master crashes. Now what? > It's not immediately obvious which standby we should bring online as > the primary, and if we guess wrong we could lose transactions thought > to be committed. This is probably a solvable problem, with enough > work: we can write a script to check the last LSN received by each of > the two standbys and promote whichever one is further along. > > But... what happens if the master and one standby BOTH crash > simultaneously? There's no way of knowing (until we get at least one > of them back up) whether it's safe to promote the other standby. Not much of a problem really, is it? If you have one server left out of 3, then you promote it OR you stay down - your choice. There is no "safe to promote" knowledge in *any* scenario; you never know what was on the primary, only what was received by the standby. If you have N standbys still up, you can pick which using the algorithm you mention. Remember that the WAL is sequential, so it's not like the commit order of transactions will differ across servers if we use quorum commit. So not a problem. The multiple simultaneous case is fairly common for people that pick the "synch to server in next rack" because there's a 100 reasons why we'd take out both at the same time, ask JD. > I like the idea of a "quorum commit" type feature where we promise the > user that things are committed when "enough" servers have acknowledged > the commit. But I think most people are not going to want that > configuration unless we also provide some really good management tools > that we don't have today. Good name. 
Management tools has nothing to do with this; completely orthogonal. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
On 05/25/2010 01:09 PM, Mike Fowler wrote: Why? 2.6.27 is almost 4 years old. RHEL 5 ships with 2.6.26. I imagine that supporting it is very desirable, regardless of its age, since that is unfortunately still the latest version of RHEL. -- m. tharp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 19:08 +0200, Alastair Turner wrote: > On Tue, May 25, 2010 at 6:28 PM, Simon Riggs wrote: > ... > > > > The best parameter we can specify is the number of servers that we wish > > to wait for confirmation from. That is a definition that easily manages > > the complexity of having various servers up/down at any one time. It > > also survives misconfiguration more easily, as well as providing a > > workaround if replicating across a bursty network where we can't > > guarantee response times, even if the typical response time is good. > > > > This may be an incredibly naive question, but what happens to the > transaction on the master if the number of confirmations is not > received? Is this intended to create a situation where the master > effectively becomes unavailable for write operations when its > synchronous slaves are unavailable? How we handle degraded mode is important, yes. Whatever parameters we choose the problem will remain the same. Should we just ignore degraded mode and respond as if nothing bad had happened? Most people would say not. If we specify server1 = synch and server2 = async we then also need to specify what happens if server1 is down. People might often specify if (server1 == down) server2 = synch. So now we have 3 configuration settings, one quite complex. It's much easier to say you want to wait for N servers to respond, but don't care which they are. One parameter, simple and flexible. In both cases, we have to figure what to do if we can't get either server to respond. In replication there is no such thing as "server down" just a "server didn't reply in time X". So we need to define timeouts. So whatever we do, we need additional parameters to specify timeouts (including wait-forever as an option) and action-on-timeout: commit or rollback. 
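The wait-for-N-plus-timeout policy described above can be sketched in a few lines. This is purely illustrative — the function name, arguments, and the idea of simulating acknowledgement arrival times are invented for the example, not PostgreSQL code:

```python
# Sketch: commit waits until `needed` standbys have acknowledged, or
# until `timeout` seconds pass, at which point the configured
# action-on-timeout applies. ack_times gives the arrival time of each
# standby's ack in seconds (None = standby never replies).
def wait_for_quorum(ack_times, needed, timeout, action_on_timeout="commit"):
    arrived = sorted(t for t in ack_times if t is not None and t <= timeout)
    if len(arrived) >= needed:
        return "commit", arrived[needed - 1]  # done when the Nth ack lands
    # Degraded mode: quorum was not reached within the timeout.
    return action_on_timeout, timeout

# Two nearby standbys plus a slow remote one; only one ack is needed,
# so the slow remote never delays the commit.
print(wait_for_quorum([0.002, 0.003, 1.5], needed=1, timeout=0.1))

# Both nearby standbys down: fall back to the configured action.
print(wait_for_quorum([None, None, 1.5], needed=1, timeout=0.1,
                      action_on_timeout="rollback"))
```

This also shows why "server down" never appears as a state: the only observable fact is that an ack did or did not arrive within the timeout.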
-- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
On Tue, May 25, 2010 at 1:09 PM, Mike Fowler wrote: >> We're unlikely to accept this patch if it changes the minimum version >> of libxml2 required to compile PostgreSQL > > Why? 2.6.27 is almost 4 years old. Because we work hard to minimize our dependencies and make them as non-onerous as possible. At a minimum, I think it's fair to say that the burden is on you to justify why it's worth bumping the version number. If there is some major speed or performance advantage to using the newer API, maybe we'll consider it. But if it's just a few extra lines of code to work around it, then it's better to write those extra lines of code rather than potentially force users to upgrade packages they're otherwise happy with. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 13:31 -0400, Robert Haas wrote: > On Tue, May 25, 2010 at 1:10 PM, Simon Riggs wrote: > > On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote: > >> Robert Haas wrote: > >> > Simon Riggs wrote: > >> >> If we define robustness at the standby level then robustness > >> >> depends upon unseen administrators, as well as the current > >> >> up/down state of standbys. This is action-at-a-distance in its > >> >> worst form. > >> > > >> > Maybe, but I can't help thinking people are going to want some > >> > form of this. The case where someone wants to do sync rep to the > >> > machine in the next rack over and async rep to a server at a > >> > remote site seems too important to ignore. > >> > >> I think there may be a terminology issue here -- I took "configure > >> by standby" to mean that *at the master* you would specify rules for > >> each standby. I think Simon took it to mean that each standby would > >> define the rules for replication to it. Maybe this issue can > >> resolve gracefully with a bit of clarification? > > > > The use case of "machine in the next rack over and async rep to a server > > at a remote site" would require the settings > > > > server.nextrack = synch > > server.remotesite = async > > > > which leaves open the question of what happens when "nextrack" is down. > > > > In many cases, to give adequate performance in that situation people add > > an additional server, so the config becomes > > > > server.nextrack1 = synch > > server.nextrack2 = synch > > server.remotesite = async > > > > We then want to specify for performance reasons that we can get a reply > > from either nextrack1 or nextrack2, so it all still works safely and > > quickly if one of them is down. How can we express that rule concisely? > > With some difficulty. > > Perhaps the difficulty here is that those still look like per-server > settings to me. Just maybe with a different set of semantics. (Those are the per-server settings.) 
-- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Confused about the buffer pool size
On 25/05/10 19:49, MMK wrote: Hello All: In the code (costsize.c), I see that effective_cache_size is set to DEFAULT_EFFECTIVE_CACHE_SIZE. This is defined as follows in cost.h #define DEFAULT_EFFECTIVE_CACHE_SIZE 16384 But when I say show shared_buffers in psql I get, shared_buffers 28MB In postgresql.conf file, the following lines appear shared_buffers = 28MB # min 128kB # (change requires restart)#temp_buffers = 8MB # min 800kB So I am assuming that the buffer pool size is 28MB = 28 * 128 = 3584 8K pages. So should effective_cache_size be set to 3584 rather than the 16384? No. Please see the manual for what effective_cache_size means: http://www.postgresql.org/docs/8.4/interactive/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
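For what it's worth, the page arithmetic in the question is internally consistent; the confusion is only about what the two settings mean. shared_buffers is memory PostgreSQL actually allocates, while effective_cache_size is a planner-only estimate of how much data the OS cache plus shared buffers can hold. A quick sketch of the unit conversions (both are measured in 8 kB pages internally):

```python
BLCKSZ = 8 * 1024  # default PostgreSQL page size, 8 kB

# shared_buffers = 28MB, expressed in pages:
shared_buffers_pages = 28 * 1024 * 1024 // BLCKSZ
print(shared_buffers_pages)  # 3584 -- matches the question's arithmetic

# DEFAULT_EFFECTIVE_CACHE_SIZE from cost.h, expressed in MB:
DEFAULT_EFFECTIVE_CACHE_SIZE = 16384  # pages
print(DEFAULT_EFFECTIVE_CACHE_SIZE * BLCKSZ // (1024 * 1024))  # 128 (MB)
```

So the default effective_cache_size corresponds to 128 MB of assumed caching, which is deliberately larger than a 28 MB buffer pool: it accounts for the kernel's cache as well, not just shared_buffers.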
Re: [HACKERS] Synchronization levels in SR
On Tue, May 25, 2010 at 1:10 PM, Simon Riggs wrote: > On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote: >> Robert Haas wrote: >> > Simon Riggs wrote: >> >> If we define robustness at the standby level then robustness >> >> depends upon unseen administrators, as well as the current >> >> up/down state of standbys. This is action-at-a-distance in its >> >> worst form. >> > >> > Maybe, but I can't help thinking people are going to want some >> > form of this. The case where someone wants to do sync rep to the >> > machine in the next rack over and async rep to a server at a >> > remote site seems too important to ignore. >> >> I think there may be a terminology issue here -- I took "configure >> by standby" to mean that *at the master* you would specify rules for >> each standby. I think Simon took it to mean that each standby would >> define the rules for replication to it. Maybe this issue can >> resolve gracefully with a bit of clarification? > > The use case of "machine in the next rack over and async rep to a server > at a remote site" would require the settings > > server.nextrack = synch > server.remotesite = async > > which leaves open the question of what happens when "nextrack" is down. > > In many cases, to give adequate performance in that situation people add > an additional server, so the config becomes > > server.nextrack1 = synch > server.nextrack2 = synch > server.remotesite = async > > We then want to specify for performance reasons that we can get a reply > from either nextrack1 or nextrack2, so it all still works safely and > quickly if one of them is down. How can we express that rule concisely? > With some difficulty. Perhaps the difficulty here is that those still look like per-server settings to me. Just maybe with a different set of semantics. 
> My suggestion is simply to have a single parameter (name unimportant) > > number_of_synch_servers_we_wait_for = N > > which is much easier to understand because it is phrased in terms of the > guarantee given to the transaction, not in terms of what the admin > thinks is the situation. So I agree that we need to talk about whether or not we want to do this. I'll give my opinion. I am not sure how useful this really is. Consider a master with two standbys. The master commits a transaction and waits for one of the two standbys, then acknowledges the commit back to the user. Then the master crashes. Now what? It's not immediately obvious which standby we should bring online as the primary, and if we guess wrong we could lose transactions thought to be committed. This is probably a solvable problem, with enough work: we can write a script to check the last LSN received by each of the two standbys and promote whichever one is further along. But... what happens if the master and one standby BOTH crash simultaneously? There's no way of knowing (until we get at least one of them back up) whether it's safe to promote the other standby. I like the idea of a "quorum commit" type feature where we promise the user that things are committed when "enough" servers have acknowledged the commit. But I think most people are not going to want that configuration unless we also provide some really good management tools that we don't have today. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
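The "check the last LSN received by each standby" script mentioned above is easy to sketch, since WAL locations print as two hexadecimal words separated by a slash (high word / low word). The standby names and values below are made up for illustration:

```python
# Sketch of promotion-by-LSN: parse each standby's reported WAL receive
# location and promote whichever has received the most WAL.
def parse_lsn(text):
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

# Hypothetical values, as might be reported by each standby.
standbys = {
    "standby1": "0/3000158",
    "standby2": "0/3000220",
}
best = max(standbys, key=lambda name: parse_lsn(standbys[name]))
print(best)  # standby2 -- it is further along
```

Note this only picks the best *surviving* candidate; it cannot tell you whether some WAL existed on the crashed master that no standby ever received, which is the "master and one standby both crash" gap discussed above.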
Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages
Excerpts from Heikki Linnakangas's message of mar may 25 04:41:30 -0400 2010: > On 24/05/10 22:49, Alvaro Herrera wrote: > > I think this is nonsense. If you have 3-years-old sales transactions, > > and your database has any interesting churn, tuples in those pages have > > been frozen for a very long time *already*. > What's missing from the suggestion is that relfrozenxid and datfrozenxid > also need to be expanded to 8-bytes. That way you effectively have > 8-byte XIDs, which means that you never need to vacuum to avoid XID > wraparound. Hmm, so are we going to use the "xid epoch" more officially? That's entirely a new line of development, perhaps it opens new possibilities. This sounds like extending Xid to 64 bits, without having to store the high bits everywhere. Was this discussed in the PGCon devs meeting? -- Álvaro Herrera -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 11:52 -0500, Kevin Grittner wrote: > Robert Haas wrote: > > Simon Riggs wrote: > >> If we define robustness at the standby level then robustness > >> depends upon unseen administrators, as well as the current > >> up/down state of standbys. This is action-at-a-distance in its > >> worst form. > > > > Maybe, but I can't help thinking people are going to want some > > form of this. The case where someone wants to do sync rep to the > > machine in the next rack over and async rep to a server at a > > remote site seems too important to ignore. > > I think there may be a terminology issue here -- I took "configure > by standby" to mean that *at the master* you would specify rules for > each standby. I think Simon took it to mean that each standby would > define the rules for replication to it. Maybe this issue can > resolve gracefully with a bit of clarification? The use case of "machine in the next rack over and async rep to a server at a remote site" would require the settings server.nextrack = synch server.remotesite = async which leaves open the question of what happens when "nextrack" is down. In many cases, to give adequate performance in that situation people add an additional server, so the config becomes server.nextrack1 = synch server.nextrack2 = synch server.remotesite = async We then want to specify for performance reasons that we can get a reply from either nextrack1 or nextrack2, so it all still works safely and quickly if one of them is down. How can we express that rule concisely? With some difficulty. My suggestion is simply to have a single parameter (name unimportant) number_of_synch_servers_we_wait_for = N which is much easier to understand because it is phrased in terms of the guarantee given to the transaction, not in terms of what the admin thinks is the situation. 
-- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Robert Haas wrote: On Tue, May 25, 2010 at 12:04 PM, Mike Fowler wrote: Erik Rijkers wrote: libxml2.x86_64 2.6.26-2.1.2.8 installed libxml2-devel.x86_642.6.26-2.1.2.8 installed Thanks for testing my patch Erik. It turns out I've got libxml2 installed at version 2.7.5. Searching the gnome mailing lists, it turns out xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very next version from yours, 2.6.27 (see: http://mail.gnome.org/archives/xml/2006-October/msg00119.html). We're unlikely to accept this patch if it changes the minimum version of libxml2 required to compile PostgreSQL Why? 2.6.27 is almost 4 years old. I realise that my patch didn't update configure and configure.in, and indeed I didn't think of it when I responded to Erik (I'm too used to the Java world where people manage their own dependencies). I've now attached the updated patch which ups the check from version 2.6.23 to 2.6.27. Regards, -- Mike Fowler Registered Linux user: 379787 Index: configure === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/configure,v retrieving revision 1.679 diff -c -r1.679 configure *** configure 13 May 2010 22:07:40 - 1.679 --- configure 25 May 2010 16:57:49 - *** *** 9079,9087 if test "$with_libxml" = yes ; then ! { $as_echo "$as_me:$LINENO: checking for xmlSaveToBuffer in -lxml2" >&5 ! $as_echo_n "checking for xmlSaveToBuffer in -lxml2... " >&6; } ! if test "${ac_cv_lib_xml2_xmlSaveToBuffer+set}" = set; then $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS --- 9079,9087 if test "$with_libxml" = yes ; then ! { $as_echo "$as_me:$LINENO: checking for xmlXPathCompiledEvalToBoolean in -lxml2" >&5 ! $as_echo_n "checking for xmlXPathCompiledEvalToBoolean in -lxml2... " >&6; } ! if test "${ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean+set}" = set; then $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS *** *** 9099,9109 #ifdef __cplusplus extern "C" #endif ! char xmlSaveToBuffer (); int main () { ! 
return xmlSaveToBuffer (); ; return 0; } --- 9099,9109 #ifdef __cplusplus extern "C" #endif ! char xmlXPathCompiledEvalToBoolean (); int main () { ! return xmlXPathCompiledEvalToBoolean (); ; return 0; } *** *** 9129,9140 test "$cross_compiling" = yes || $as_test_x conftest$ac_exeext }; then ! ac_cv_lib_xml2_xmlSaveToBuffer=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ! ac_cv_lib_xml2_xmlSaveToBuffer=no fi rm -rf conftest.dSYM --- 9129,9140 test "$cross_compiling" = yes || $as_test_x conftest$ac_exeext }; then ! ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean=yes else $as_echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ! ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean=no fi rm -rf conftest.dSYM *** *** 9142,9150 conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi ! { $as_echo "$as_me:$LINENO: result: $ac_cv_lib_xml2_xmlSaveToBuffer" >&5 ! $as_echo "$ac_cv_lib_xml2_xmlSaveToBuffer" >&6; } ! if test "x$ac_cv_lib_xml2_xmlSaveToBuffer" = x""yes; then cat >>confdefs.h <<_ACEOF #define HAVE_LIBXML2 1 _ACEOF --- 9142,9150 conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi ! { $as_echo "$as_me:$LINENO: result: $ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean" >&5 ! $as_echo "$ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean" >&6; } ! if test "x$ac_cv_lib_xml2_xmlXPathCompiledEvalToBoolean" = x""yes; then cat >>confdefs.h <<_ACEOF #define HAVE_LIBXML2 1 _ACEOF *** *** 9152,9159 LIBS="-lxml2 $LIBS" else ! { { $as_echo "$as_me:$LINENO: error: library 'xml2' (version >= 2.6.23) is required for XML support" >&5 ! $as_echo "$as_me: error: library 'xml2' (version >= 2.6.23) is required for XML support" >&2;} { (exit 1); exit 1; }; } fi --- 9152,9159 LIBS="-lxml2 $LIBS" else ! { { $as_echo "$as_me:$LINENO: error: library 'xml2' (version >= 2.6.27) is required for XML support" >&5 ! 
$as_echo "$as_me: error: library 'xml2' (version >= 2.6.27) is required for XML support" >&2;} { (exit 1); exit 1; }; } fi Index: configure.in === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/configure.in,v retrieving revision 1.627 diff -c -r1.627 configure.in *** configure.in 13 May 2010 22:07:42 - 1.627 --- configure.in 25 May 2010 16:22:32 - *** *** 940,946 fi if test "$with_libxml" = yes ; then ! AC_CHECK_LIB(xml2, xmlSaveToBuffer, [], [AC_MSG_ERROR([library 'xml2' (version >= 2.6.23) is required for XML support])]) fi if test "$with_libxslt" = yes ; then --- 940,946 ---
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 12:40 -0400, Robert Haas wrote: > On Tue, May 25, 2010 at 12:28 PM, Simon Riggs wrote: > > Synchronous replication implies that a commit should wait. This wait is > > experienced by the transaction, not by other parts of the system. If we > > define robustness at the standby level then robustness depends upon > > unseen administrators, as well as the current up/down state of standbys. > > This is action-at-a-distance in its worst form. > > Maybe, but I can't help thinking people are going to want some form of > this. > The case where someone wants to do sync rep to the machine in > the next rack over and async rep to a server at a remote site seems > too important to ignore. The use case of "machine in the next rack over and async rep to a server at a remote site" *is* important, but you give no explanation as to why that implies "per-standby" is the solution to it. If you read the rest of my email, you'll see that I have explained the problems "per-standby" settings would cause. Please don't be so quick to claim it is me ignoring anything. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Add _PG_init to PL language handler documentation
Howdy, This tiny doc patch adds _PG_init to the skeleton example code for a PL. The information is quite valuable to PL authors, who might miss it when it is described in the shared library documentation. This patch was based off of 6e2ba96 in the git mirror and a colorized diff can be viewed here: http://github.com/leto/postgres/commit/a9e265a7f55a0605fb4c6135f0f689c8b89e9623 Duke -- Jonathan "Duke" Leto jonat...@leto.net http://leto.net pginit.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
Robert Haas wrote: > Simon Riggs wrote: >> If we define robustness at the standby level then robustness >> depends upon unseen administrators, as well as the current >> up/down state of standbys. This is action-at-a-distance in its >> worst form. > > Maybe, but I can't help thinking people are going to want some > form of this. The case where someone wants to do sync rep to the > machine in the next rack over and async rep to a server at a > remote site seems too important to ignore. I think there may be a terminology issue here -- I took "configure by standby" to mean that *at the master* you would specify rules for each standby. I think Simon took it to mean that each standby would define the rules for replication to it. Maybe this issue can resolve gracefully with a bit of clarification? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON manipulation functions
On Tue, May 25, 2010 at 10:52 AM, Joseph Adams wrote: >> Well, I think it's fine to use the wiki for brainstorming, but before >> you change the design you probably need to talk about it here. You >> can't rely on everyone on -hackers to follow changes on a wiki page >> somewhere. It looks like the API has been overhauled pretty heavily >> since the last version we talked about here, and I'm not sure I >> understand it. > > I'll try to explain it in one big nutshell: > > Instead of, for instance, json_to_number('5') and number_to_json(5), I > propose changing it to from_json(5)::INT and to_json('5'). Note how > from_json simply returns TEXT containing the underlying value for the > user to cast. I plan to make calling to_json/from_json with arrays or > objects (e.g. to_json(ARRAY[1,2,3]) and from_json('[1,2,3]') ) throw > an error for now, as implementing all the specifics of this could be > quite distracting. I don't see how that's an improvement over the previous design. It seems like it adds a lot of extra casting and removes useful list operations without any corresponding advantage. > If I'm not mistaken, json_object([content [AS name] [, ...]] | *) > RETURNS json can't be implemented without augmenting the grammar (as > was done with xmlforest), so I considered making it take a RECORD > parameter like the hstore(RECORD) function does, as was suggested on > IRC. However, this may be inadequate for selecting some columns but > not others. Using examples from hstore: > > SELECT hstore(foo) FROM foo; => '"e"=>"2.71828", "pi"=>"3.14159"' > -- this works, but what if we only want one field? > > SELECT hstore(pi) FROM foo; > -- function type error > > SELECT hstore(row(pi)) FROM foo; => '"f1"=>"3.14159"' > -- field name is lost > > SELECT hstore(bar) FROM (select pi FROM foo) AS bar; => '"f1"=>"3.14159"' > -- ugly, and field name is *still* lost Yeah. I'm not sure what to do about this problem. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Confused about the buffer pool size
Hello All: In the code (costsize.c), I see that effective_cache_size is set to DEFAULT_EFFECTIVE_CACHE_SIZE. This is defined as follows in cost.h #define DEFAULT_EFFECTIVE_CACHE_SIZE 16384 But when I say show shared_buffers in psql I get, shared_buffers 28MB In postgresql.conf file, the following lines appear shared_buffers = 28MB # min 128kB # (change requires restart)#temp_buffers = 8MB # min 800kB So I am assuming that the buffer pool size is 28MB = 28 * 128 = 3584 8K pages. So should effective_cache_size be set to 3584 rather than the 16384? Thanks, MMK.
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 12:40 -0400, Robert Haas wrote:
> On Tue, May 25, 2010 at 12:28 PM, Simon Riggs wrote:
> > Synchronous replication implies that a commit should wait. This wait is
> > experienced by the transaction, not by other parts of the system. If we
> > define robustness at the standby level then robustness depends upon
> > unseen administrators, as well as the current up/down state of standbys.
> > This is action-at-a-distance in its worst form.
>
> Maybe, but I can't help thinking people are going to want some form of
> this. The case where someone wants to do sync rep to the machine in
> the next rack over and async rep to a server at a remote site seems
> too important to ignore.

Uhh yeah, that is pretty much the standard use case. The "next rack" is only 50% of the equation. The next part is the disaster recovery rack over 100Mb (or even 10Mb) that is half way across the country. It is common, very common.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
On Tue, May 25, 2010 at 12:04 PM, Mike Fowler wrote:
> Erik Rijkers wrote:
>>
>> libxml2.x86_64 2.6.26-2.1.2.8 installed
>> libxml2-devel.x86_64 2.6.26-2.1.2.8 installed
>>
>
> Thanks for testing my patch Erik. It turns out I've got libxml2 installed at
> version 2.7.5. Searching the gnome mailing lists, it turns out
> xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very next
> version from yours, 2.6.27 (see:
> http://mail.gnome.org/archives/xml/2006-October/msg00119.html).

We're unlikely to accept this patch if it changes the minimum version of libxml2 required to compile PostgreSQL.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be
On Tue, May 25, 2010 at 12:36 PM, Simon Riggs wrote: > On Tue, 2010-05-25 at 19:12 +0900, Fujii Masao wrote: >> On Mon, May 17, 2010 at 5:33 PM, Fujii Masao wrote: >> > On Sat, May 15, 2010 at 3:20 AM, Robert Haas wrote: >> >> Hmm, OK, I think that makes sense. Would you care to propose a patch? >> > >> > Yep. Here is the patch. >> > >> > This patch distinguishes normal shutdown from unexpected exit, while the >> > server is in recovery. That is, when smart or fast shutdown is requested >> > during recovery, the bgwriter sets the ControlFile->state to new-introduced >> > DB_SHUTDOWNED_IN_RECOVERY state. >> >> This patch is worth applying for 9.0? If not, I'll add it into >> the next CF for 9.1. > > Presumably Robert will be applying the patch? It seems to address the > concern raised on the thread. Yes, I was planning to review it. But if you or someone else would like to cut in, that's OK too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronization levels in SR
On Tue, May 25, 2010 at 12:28 PM, Simon Riggs wrote:
> Synchronous replication implies that a commit should wait. This wait is
> experienced by the transaction, not by other parts of the system. If we
> define robustness at the standby level then robustness depends upon
> unseen administrators, as well as the current up/down state of standbys.
> This is action-at-a-distance in its worst form.

Maybe, but I can't help thinking people are going to want some form of this. The case where someone wants to do sync rep to the machine in the next rack over and async rep to a server at a remote site seems too important to ignore.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] recovery getting interrupted is not so unusual as it used to be
On Tue, 2010-05-25 at 19:12 +0900, Fujii Masao wrote:
> On Mon, May 17, 2010 at 5:33 PM, Fujii Masao wrote:
> > On Sat, May 15, 2010 at 3:20 AM, Robert Haas wrote:
> >> Hmm, OK, I think that makes sense. Would you care to propose a patch?
> >
> > Yep. Here is the patch.
> >
> > This patch distinguishes normal shutdown from unexpected exit, while the
> > server is in recovery. That is, when smart or fast shutdown is requested
> > during recovery, the bgwriter sets the ControlFile->state to the newly-introduced
> > DB_SHUTDOWNED_IN_RECOVERY state.
>
> This patch is worth applying for 9.0? If not, I'll add it into
> the next CF for 9.1.

Presumably Robert will be applying the patch? It seems to address the concern raised on the thread.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Synchronization levels in SR
On Mon, 2010-05-24 at 18:29 -0700, Josh Berkus wrote:
> If people agree that the above is our roadmap, implementing
> "per-standby" first makes sense, and then we can implement "per-session"
> GUC later.

IMHO "per-standby" sounds simple, but is dangerously simplistic, as explained on another part of the thread. We need to think clearly about failure modes and how they will be handled. Failure modes and edge cases completely govern the design here. "All running smoothly" isn't a major concern and so it appears that the user interface can be done various ways.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Synchronization levels in SR
On Mon, 2010-05-24 at 22:20 +0900, Fujii Masao wrote:
> Second, we need to discuss how to specify the synch
> level. There are three approaches:
>
> * Per standby
> Since the purpose, location and H/W resource often differ
> from one standby to another, specifying the level per standby
> (i.e., we set the level in recovery.conf) is a
> straightforward approach, I think. For example, we can
> choose #3 for a high-availability standby near the master,
> and choose #1 (async) for the remote disaster recovery
> standby.
>
> * Per transaction
> Define the PGC_USERSET option specifying the level and
> specify it on the master in response to the purpose of the
> transaction. In this approach, for example, we can choose
> #4 for a transaction which should be visible on the
> standby as soon as a "success" of the commit has been
> returned to a client. We can also choose #1 for
> time-critical but not mission-critical transactions.
>
> * Mix
> Allow users to specify the level per standby and per
> transaction at the same time, and then calculate the real
> level from them by using some algorithm.
>
> Which should we adopt for 9.1? I'd like to implement the
> "per-standby" approach at first since it's simple and seems
> to cover more use cases. Thought?

-1

Synchronous replication implies that a commit should wait. This wait is experienced by the transaction, not by other parts of the system. If we define robustness at the standby level then robustness depends upon unseen administrators, as well as the current up/down state of standbys. This is action-at-a-distance in its worst form.

Imagine having 2 standbys, 1 synch, 1 async. If the synch server goes down, performance will improve and robustness will have been lost. What good would that be?

Imagine a standby connected over a long distance. The DBA brings up the standby in synch mode accidentally and the primary server hits massive performance problems without any way of directly controlling this.

The worst aspect of standby-level controls is that nobody ever knows how safe a transaction is. There is no definition or test for us to check exactly how safe any particular transaction is. Also, the lack of safety occurs at the time when you least want it - when one of your servers is already down. So I call "per-standby" settings simple, and broken in multiple ways.

Putting the control in the hands of the transaction owner (i.e. on the master) is exactly where the control should be. I personally like the idea of that being a USERSET, though I could live with system-wide settings if need be. But the control must be on the *master*, not on the standbys.

The best parameter we can specify is the number of servers that we wish to wait for confirmation from. That is a definition that easily manages the complexity of having various servers up/down at any one time. It also survives misconfiguration more easily, as well as providing a workaround if replicating across a bursty network where we can't guarantee response times, even if the typical response time is good.

(We've discussed this many times before over a period of years, and I'm not really sure why we have to re-discuss this repeatedly just because people disagree. You don't mention the earlier discussions, not sure why. If we want to follow the community process, then all previous discussions need to be taken into account, unless things have changed - which they haven't: same topic, same people, AFAICS.)

--
Simon Riggs
www.2ndQuadrant.com
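Simon's "wait for confirmation from N servers" idea can be illustrated with a small sketch. All names here are illustrative stand-ins, not walsender code: a commit returns once any `quorum` of the attached standbys have acknowledged, whichever they are, so the guarantee survives individual standbys being down or slow.

```python
# Sketch of quorum-style synchronous commit: the master waits until
# `quorum` standbys have confirmed the commit record, whichever they are.

def wait_for_quorum(ack_times, quorum, timeout):
    """Return True if at least `quorum` standbys ack within `timeout`.

    ack_times: per-standby acknowledgement time in ms; None = never acks.
    """
    acked = sorted(t for t in ack_times if t is not None)
    # The commit releases when the quorum-th fastest ack arrives.
    return len(acked) >= quorum and acked[quorum - 1] <= timeout

# Three standbys: the next-rack one acks at 2 ms, the remote DR one at
# 90 ms, and one is down entirely (None).
acks = [2, 90, None]
print(wait_for_quorum(acks, quorum=1, timeout=10))  # True: next-rack ack suffices
print(wait_for_quorum(acks, quorum=2, timeout=10))  # False: would need the remote ack
```

The point of the sketch: with a count-based setting, losing the nearby standby degrades commit latency but the guarantee ("N acks") stays well defined, whereas a per-standby flag silently changes what a commit means.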
Re: [HACKERS] Synchronization levels in SR
On Tue, 2010-05-25 at 12:40 +0900, Fujii Masao wrote:
> On Tue, May 25, 2010 at 10:29 AM, Josh Berkus wrote:
> > I agree that #4 should be done last, but it will be needed, not in the
> > least by your employer ;-) . I don't see any obvious way to make #4
> > compatible with any significant query load on the slave, but in general
> > I'd think that users of #4 are far more concerned with 0% data loss than
> > they are with getting the slave to run read queries.
>
> Since #2 and #3 are enough for 0% data loss, I think that such users
> would be more concerned about what results are visible in the standby.
> No?

Please add #4 also. You can do that easily at the same time as #2 and #3, and it will leave me free to fix the perceived conflict problems.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] Exposing the Xact commit order to the user
On Sun, 2010-05-23 at 16:21 -0400, Jan Wieck wrote:
> In some systems (data warehousing, replication), the order of commits is
> important, since that is the order in which changes have become visible.
> This information could theoretically be extracted from the WAL, but
> scanning the entire WAL just to extract this tidbit of information would
> be excruciatingly painful.

I think it would be quite simple to read WAL. WALSender reads the WAL file after it's been flushed, so it would be simple for it to read a blob of WAL and then extract the commit order from it.

Overall though, it would be easier and more efficient to *add* info to WAL and then do all this processing *after* WAL has been transported elsewhere. Extracting info with DDL triggers, normal triggers, commit order and everything else seems like too much work to me. Every other RDBMS has moved away from trigger-based replication and we should give that serious consideration also.

--
Simon Riggs
www.2ndQuadrant.com
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard
Erik Rijkers wrote:
> libxml2.x86_64 2.6.26-2.1.2.8 installed
> libxml2-devel.x86_64 2.6.26-2.1.2.8 installed

Thanks for testing my patch Erik. It turns out I've got libxml2 installed at version 2.7.5. Searching the gnome mailing lists, it turns out xmlXPathCompiledEvalToBoolean was added (unbelievably) in the very next version from yours, 2.6.27 (see: http://mail.gnome.org/archives/xml/2006-October/msg00119.html).

Regards,

--
Mike Fowler
Registered Linux user: 379787
Re: [HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard (was: Add xpath_exists Function)
On Tue, May 25, 2010 16:31, Mike Fowler wrote:
> I've been reading the SQL/XML standard and discovered that it defines a
> function named XMLEXISTS that does exactly what the todo item
> xpath_exists defines. My original patch named the function as per the
> todo but I think using the function name from the standard is a better
> idea. So this patch is the same as before, but the function is now named
> XMLEXISTS instead of xpath_exists.

I tried this patch (cvs HEAD, applies without error), but get this error:

[...]
utils/adt/xml.o: In function `xmlexists':
/var/data1/pg_stuff/pg_sandbox/pgsql.xmlexists/src/backend/utils/adt/xml.c:3639: undefined reference to `xmlXPathCompiledEvalToBoolean'
collect2: ld returned 1 exit status
make[2]: *** [postgres] Error 1
make[2]: Leaving directory `/var/data1/pg_stuff/pg_sandbox/pgsql.xmlexists/src/backend'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/var/data1/pg_stuff/pg_sandbox/pgsql.xmlexists/src'
make: *** [all] Error 2

./configure --prefix=/var/data1/pg_stuff/pg_installations/pgsql.xmlexists --with-pgport=6548 --quiet --enable-depend --enable-cassert --enable-debug --with-openssl --with-perl --with-libxml --with-libxslt

centos 5.4 2.6.18-164.el5 x86_64 GNU/Linux
libxml2.x86_64 2.6.26-2.1.2.8 installed
libxml2-devel.x86_64 2.6.26-2.1.2.8 installed

Erik Rijkers
Re: [HACKERS] Clearing psql`s input buffer after auto-reconnect
> 3. Have CheckConnection do longjmp(sigint_interrupt_jmp) after resetting
...
> Now #1 might be the best long-term solution but I have no particular
> appetite to tackle it, and #2 is just too ugly to contemplate. That
> leaves #3, which is a bit ugly in its own right but seems like the best
> fix we're likely to get.
>
> Comments, better ideas?

I like #3. If this were a more common event I might lean towards #1, but it's not, so #3 seems fine.

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005251113
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] pg_upgrade docs
Robert Haas wrote: > On Mon, May 24, 2010 at 11:35 PM, Bruce Momjian wrote: > > Have you read the docs? ?It does mention the issue with /contrib and > > stuff. ?How do I document a limitation I don't know about? ?This is all > > very vague. ?Please suggest some wording. > > OK, here's an attempt. Please fact-check. > > -- > > General Limitations > > pg_upgrade relies on binary compatibility between the old and new > on-disk formats, including the on-disk formats of individual data > types. pg_upgrade attempts to detect cases in which the on-disk > format has changed; for example, it verifies that the old and new > clusters have the same value for --enable-integer-datetimes. However, > there is no systematic way for pg_upgrade to detect problems of this > type; it has hard-coded knowledge of the specific cases known to exist > in core PostgreSQL, including /contrib. If third-party or > user-defined data types or access methods are used, it is the user's > responsibility to verify that the versions loaded into the old and new > clusters use compatible on-disk formats. If they do not, pg_upgrade > may appear to work but subsequently crash or silently corrupt data. OK, I have added a mention of the issues above, in a more abbreviated format. > pg_upgrade also relies on ABI compatibility between modules loaded > into the old and new clusters. For example, if an SQL function in the > old cluster is defined to call a particular C function, pg_upgrade > will recreate SQL function in the new cluster and will configure it to > call the same C function. If no such C function can be found by the > new cluster, pg_upgrade will simply fail. However, if a C function of > the same name exists in the new cluster, but expects a different > number of arguments or different types of arguments, then it is likely > to crash the system when called. In the worst case, data corruption > could result. These issues are not unique to pg_upgrade, and could happen even in a pg_dump restore. 
> Also, the following sentence appears not to fit with our "only to 9.0"
> policy: "For Windows users, note that due to different integer
> datetimes settings used by the one-click installer and the MSI
> installer, it is only possible to upgrade from version 8.3 of the
> one-click distribution to version 8.4 of the one-click distribution.
> It is not possible to upgrade from the MSI installer to the one-click
> installer."

Agreed. I added an "8.4 or later" mention. It is not worth calling it "9.0 or later" because then I would have to update this mention for every major release.

Applied patch attached.

--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

Index: doc/src/sgml/pgupgrade.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/pgupgrade.sgml,v
retrieving revision 1.10
diff -c -c -r1.10 pgupgrade.sgml
*** doc/src/sgml/pgupgrade.sgml 24 May 2010 17:43:39 - 1.10
--- doc/src/sgml/pgupgrade.sgml 25 May 2010 14:50:36 -
***
*** 16,21 ****
--- 16,31 ----
  9.0.1 -> 9.0.4.
+
+ pg_upgrade works because, though new features are
+ regularly added to Postgres major releases, the internal data storage
+ format rarely changes. pg_upgrade does its best to
+ make sure the old and new clusters are binary-compatible, e.g. by
+ checking for compatible compile-time settings. It is important that
+ any external modules are also binary compatible, though this cannot
+ be checked by pg_upgrade.
+
  Supported Versions
***
*** 440,446 ****
  Limitations in migrating from PostgreSQL 8.3
-
  Upgrading from PostgreSQL 8.3 has additional restrictions not present
  when upgrading from later PostgreSQL releases. For example,
--- 450,455 ----
***
*** 502,509 ****
  For Windows users, note that due to different integer datetimes settings
  used by the one-click installer and the MSI installer, it is only
  possible to upgrade from version 8.3 of the one-click distribution to
! version 8.4 of the one-click distribution. It is not possible to upgrade
! from the MSI installer to the one-click installer.
--- 511,518 ----
  For Windows users, note that due to different integer datetimes settings
  used by the one-click installer and the MSI installer, it is only
  possible to upgrade from version 8.3 of the one-click distribution to
! version 8.4 or later of the one-click distribution. It is not
! possible to upgrade from the MSI installer to the one-click installer.
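The kind of compile-time-setting comparison being discussed here can be sketched abstractly. This is only a toy model of the idea (real pg_upgrade inspects the clusters' control data; the field names below are illustrative, not pg_controldata's actual labels):

```python
# Toy compatibility check in the spirit of pg_upgrade: compare the
# on-disk-format-relevant settings of the old and new clusters and
# report any mismatches that would make a binary upgrade unsafe.

FORMAT_CRITICAL = ["block_size", "integer_datetimes", "maximum_alignment"]

def find_incompatibilities(old, new):
    """Return the format-critical settings on which the clusters differ."""
    return [k for k in FORMAT_CRITICAL if old.get(k) != new.get(k)]

old_cluster = {"block_size": 8192, "integer_datetimes": False, "maximum_alignment": 8}
new_cluster = {"block_size": 8192, "integer_datetimes": True,  "maximum_alignment": 8}

print(find_incompatibilities(old_cluster, new_cluster))  # ['integer_datetimes']
```

As the thread notes, this only covers settings the tool knows to look at; third-party data types with their own on-disk formats fall outside any such hard-coded list.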
Re: [HACKERS] JSON manipulation functions
> Well, I think it's fine to use the wiki for brainstorming, but before > you change the design you probably need to talk about it here. You > can't rely on everyone on -hackers to follow changes on a wiki page > somewhere. It looks like the API has been overhauled pretty heavily > since the last version we talked about here, and I'm not sure I > understand it. I'll try to explain it in one big nutshell: Instead of, for instance, json_to_number('5') and number_to_json(5), I propose changing it to from_json(5)::INT and to_json('5'). Note how from_json simply returns TEXT containing the underlying value for the user to cast. I plan to make calling to_json/from_json with arrays or objects (e.g. to_json(ARRAY[1,2,3]) and from_json('[1,2,3]') ) throw an error for now, as implementing all the specifics of this could be quite distracting. If I'm not mistaken, json_object([content [AS name] [, ...]] | *) RETURNS json can't be implemented without augmenting the grammar (as was done with xmlforest), so I considered making it take a RECORD parameter like the hstore(RECORD) function does, as was suggested on IRC. However, this may be inadequate for selecting some columns but not others. Using examples from hstore: SELECT hstore(foo) FROM foo; => '"e"=>"2.71828", "pi"=>"3.14159"' -- this works, but what if we only want one field? SELECT hstore(pi) FROM foo; -- function type error SELECT hstore(row(pi)) FROM foo; => '"f1"=>"3.14159"' -- field name is lost SELECT hstore(bar) FROM (select pi FROM foo) AS bar; => '"f1"=>"3.14159"' -- ugly, and field name is *still* lost To get (and set, which I overlooked before), use json_get and json_set. These take "JSONPath" expressions, but I don't plan to implement all sorts of fancy features during the summer. 
However, I do plan to support some kind of parameter substitution so you can do this:

json_get('[0,1,4,9,16,25]', '[%]' %% 2) => '4'::TEXT

For this use case, though, it would be simpler to say:

'[0,1,4,9,16,25]'::JSON -> 2
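As a rough illustration of what json_get with parameter substitution might behave like, here is a sketch in Python. The JSONPath dialect and the %% operator are still unsettled in the proposal, so this only mirrors the `'[%]' %% 2` example above and supports nothing beyond bracketed indexes:

```python
import json

# Toy model of json_get('[0,1,4,9,16,25]', '[%]' %% 2): substitute each
# parameter into the path template, then walk the parsed document.
def json_get(doc: str, path_template: str, *params) -> str:
    path = path_template
    for p in params:
        path = path.replace("%", json.dumps(p), 1)  # '[%]' -> '[2]'
    value = json.loads(doc)
    # Support only the bracketed-index form used in the example.
    for index in path.strip("[]").split("]["):
        value = value[int(index)]
    return json.dumps(value)  # result comes back as text, as with from_json

print(json_get("[0,1,4,9,16,25]", "[%]", 2))  # prints 4
```

The text return value matches the proposal's convention that extraction functions hand back TEXT for the caller to cast.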
Re: [HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT
Alex Goncharov wrote:

,--- I/Alex (Mon, 24 May 2010 12:25:18 -0400) *
| No equivalent of FETCH_COUNT is available at the libpq level, so I
| assume that the interface I am using is smart enough not to send
| gigabytes of data to FE.
|
| Where does the result set (GBs of data) reside after I call
| PQexecPrepared? On BE, I hope?

Sorry for asking again... No sarcasm meant: is there no straightforward answer here? Or nobody is certain? Or a wrong list?

You have been given the answer. Please re-read the replies, e.g. the one from Abhijit Menon-Sen. The data is saved on the client side before the call returns. If that uses too much memory, use a cursor.

cheers

andrew
[HACKERS] [PATCH] Add XMLEXISTS function from the SQL/XML standard (was: Add xpath_exists Function)
I've been reading the SQL/XML standard and discovered that it defines a function named XMLEXISTS that does exactly what the todo item xpath_exists defines. My original patch named the function as per the todo but I think using the function name from the standard is a better idea. So this patch is the same as before, but the function is now named XMLEXISTS instead of xpath_exists. Regards, -- Mike Fowler Registered Linux user: 379787 Index: src/backend/utils/adt/xml.c === RCS file: /home/mfowler/cvsrepo/pgrepo/pgsql/src/backend/utils/adt/xml.c,v retrieving revision 1.97 diff -c -r1.97 xml.c *** src/backend/utils/adt/xml.c 3 Mar 2010 17:29:45 - 1.97 --- src/backend/utils/adt/xml.c 25 May 2010 14:02:33 - *** *** 3495,3497 --- 3495,3668 return 0; #endif } + + /* + * Determines if the node specified by the supplied XPath exists + * in a given XML document, returning a boolean. + * + * It is up to the user to ensure that the XML passed is in fact + * an XML document - XPath doesn't work easily on fragments without + * a context node being known. + */ + Datum + xmlexists(PG_FUNCTION_ARGS) + { + #ifdef USE_LIBXML + text *xpath_expr_text = PG_GETARG_TEXT_P(0); + xmltype*data = PG_GETARG_XML_P(1); + ArrayType *namespaces = PG_GETARG_ARRAYTYPE_P(2); + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + char *datastr; + int32 len; + int32 xpath_len; + xmlChar*string; + xmlChar*xpath_expr; + int i; + int ndim; + Datum *ns_names_uris; + bool *ns_names_uris_nulls; + int ns_count; + int result; + + /* + * Namespace mappings are passed as text[]. If an empty array is passed + * (ndim = 0, "0-dimensional"), then there are no namespace mappings. + * Else, a 2-dimensional array with length of the second axis being equal + * to 2 should be passed, i.e., every subarray contains 2 elements, the + * first element defining the name, the second one the URI. 
Example: + * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', + * 'http://example2.com']]. + */ + ndim = ARR_NDIM(namespaces); + if (ndim != 0) + { + int *dims; + + dims = ARR_DIMS(namespaces); + + if (ndim != 2 || dims[1] != 2) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg("invalid array for XML namespace mapping"), + errdetail("The array must be two-dimensional with length of the second axis equal to 2."))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + deconstruct_array(namespaces, TEXTOID, -1, false, 'i', + &ns_names_uris, &ns_names_uris_nulls, + &ns_count); + + Assert((ns_count % 2) == 0); /* checked above */ + ns_count /= 2; /* count pairs only */ + } + else + { + ns_names_uris = NULL; + ns_names_uris_nulls = NULL; + ns_count = 0; + } + + datastr = VARDATA(data); + len = VARSIZE(data) - VARHDRSZ; + xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ; + if (xpath_len == 0) + ereport(ERROR, + (errcode(ERRCODE_DATA_EXCEPTION), + errmsg("empty XPath expression"))); + + string = (xmlChar *) palloc((len + 1) * sizeof(xmlChar)); + memcpy(string, datastr, len); + string[len] = '\0'; + + xpath_expr = (xmlChar *) palloc((xpath_len + 1) * sizeof(xmlChar)); + memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len); + xpath_expr[xpath_len] = '\0'; + + pg_xml_init(); + xmlInitParser(); + + PG_TRY(); + { + /* + * redundant XML parsing (two parsings for the same value during one + * command execution are possible) + */ + ctxt = xmlNewParserCtxt(); + if (ctxt == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + "could not allocate parser context"); + doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0); + if (doc == NULL) + xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT, + "could not parse XML document"); + xpathctx = xmlXPathNewContext(doc); + if (xpathctx == NULL) + xml_ereport(ERROR, ERRCODE_OUT_OF_MEMORY, + "could not allocate XPath context"); + xpathctx->node = xmlDocGetRootElement(doc); + if (xpathctx->node == NULL) + 
xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR, + "could not find root XML element"); + + /* register namespaces, if any */ + if (ns_count > 0) + { + for (i = 0; i < ns_count; i++) + { + char *ns_name; + char *ns_uri; + + if (ns_names_uris_nulls[i * 2] || + ns_names_uris_nulls[i * 2 + 1]) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg("neither namespace name nor URI may be null"))); + ns_name = TextDatumGetCString(ns_names_uris[i * 2]); + ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]); + if (xmlXPathRegisterNs(xpathctx, + (xmlChar *) ns_name, + (xmlChar *) ns_uri) != 0) + e
Re: [HACKERS] ExecutorCheckPerms() hook
* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
> * DoCopy() and RI_Initial_Check() were reworked to call ExecCheckRTEPerms()
> with locally built RangeTblEntry.

Maybe I missed it somewhere, but we still need to address the case where the user doesn't have those SELECT permissions that we're looking for in RI_Initial_Check(), right? KaiGai, your patch should be addressing that in a similar fashion.

Thanks,

Stephen
Re: [HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT
On 05/25/2010 07:35 AM, Alex Goncharov wrote:

,--- I/Alex (Mon, 24 May 2010 12:25:18 -0400) *
| No equivalent of FETCH_COUNT is available at the libpq level, so I
| assume that the interface I am using is smart enough not to send
| gigabytes of data to FE.
|
| Where does the result set (GBs of data) reside after I call
| PQexecPrepared? On BE, I hope?

Sorry for asking again... No sarcasm meant: is there no straightforward answer here? Or nobody is certain? Or a wrong list?

Issue multiple queries and make use of LIMIT/OFFSET. You'll have to go manual on this one.

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
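The LIMIT/OFFSET suggestion amounts to paging through the result set with repeated queries. Here is a sketch of the client-side loop, simulated against an in-memory "table" so the shape is visible without a server (any real code would issue the SELECT through libpq or, better, use an SQL-level cursor):

```python
# Sketch of client-side paging with LIMIT/OFFSET, simulated in memory.
# Caveat: OFFSET-based paging makes the server skip over the earlier
# rows on every query, so for very large results DECLARE ... / FETCH n
# (an SQL-level cursor) is usually the better tool.

def run_query(table, limit, offset):
    """Stand-in for executing SELECT ... LIMIT <limit> OFFSET <offset>."""
    return table[offset:offset + limit]

def fetch_all_in_batches(table, batch_size):
    offset = 0
    while True:
        rows = run_query(table, batch_size, offset)
        if not rows:
            break
        yield rows          # process one manageable chunk at a time
        offset += batch_size

table = list(range(10))     # pretend this is a 10-row result set
batches = list(fetch_all_in_batches(table, batch_size=4))
print(batches)              # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Note that without REPEATABLE READ or a cursor, rows can move between pages if the table changes mid-loop, which is another reason the cursor answer given elsewhere in the thread is the safer default.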
Re: [HACKERS] pg_upgrade docs
On Mon, May 24, 2010 at 11:35 PM, Bruce Momjian wrote: > Have you read the docs? It does mention the issue with /contrib and > stuff. How do I document a limitation I don't know about? This is all > very vague. Please suggest some wording. OK, here's an attempt. Please fact-check. -- General Limitations pg_upgrade relies on binary compatibility between the old and new on-disk formats, including the on-disk formats of individual data types. pg_upgrade attempts to detect cases in which the on-disk format has changed; for example, it verifies that the old and new clusters have the same value for --enable-integer-datetimes. However, there is no systematic way for pg_upgrade to detect problems of this type; it has hard-coded knowledge of the specific cases known to exist in core PostgreSQL, including /contrib. If third-party or user-defined data types or access methods are used, it is the user's responsibility to verify that the versions loaded into the old and new clusters use compatible on-disk formats. If they do not, pg_upgrade may appear to work but subsequently crash or silently corrupt data. pg_upgrade also relies on ABI compatibility between modules loaded into the old and new clusters. For example, if an SQL function in the old cluster is defined to call a particular C function, pg_upgrade will recreate SQL function in the new cluster and will configure it to call the same C function. If no such C function can be found by the new cluster, pg_upgrade will simply fail. However, if a C function of the same name exists in the new cluster, but expects a different number of arguments or different types of arguments, then it is likely to crash the system when called. In the worst case, data corruption could result. 
-- Also, the following sentence appears not to fit with our "only to 9.0" policy: "For Windows users, note that due to different integer datetimes settings used by the one-click installer and the MSI installer, it is only possible to upgrade from version 8.3 of the one-click distribution to version 8.4 of the one-click distribution. It is not possible to upgrade from the MSI installer to the one-click installer." -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT
,--- Abhijit Menon-Sen (Tue, 25 May 2010 17:26:18 +0530) *
| Unless you explicitly declare and fetch from an SQL-level cursor, your
| many GBs of data are going to be transmitted to libpq, which will eat
| lots of memory. (The wire protocol does have something like cursors,
| but libpq does not use them, it retrieves the entire result set.)

,--- Yeb Havinga (Tue, 25 May 2010 14:08:51 +0200) *
| The GBs of data are gathered at the site of the libpq client (pgresult
| object gathered/allocated while consuming result input from backend).
`--*

Thank you very much!

-- Alex -- alex-goncha...@comcast.net --
Re: [HACKERS] ExecutorCheckPerms() hook
KaiGai,

* KaiGai Kohei (kai...@ak.jp.nec.com) wrote:
> OK, the attached patch reworks it according to the way.

Reviewing this patch, there are a whole slew of problems.

#1: REALLY BIG ISSUE- Insufficient comment updates. You've changed function
definitions in a pretty serious way as well as moved some code around such
that some of the previous comments don't make sense. You have got to update
comments when you're writing a patch. Indeed, the places I see changes in
comments are where you've removed what appear to still be valid and
appropriate comments, or places where you've added comments which are just
blatantly wrong with the submitted patch.

#2: REALLY BIG ISSUE- You've added ExecutorCheckPerms_hook as part of this
patch- don't, we're in feature-freeze right now and should not be adding
hooks at this time.

#3: You didn't move ExecCheckRTPerms() and ExecCheckRTEPerms() to utils/acl
and instead added executor/executor.h to ri_triggers.c. I don't
particularly like that. I admit that DoCopy() already knew about the
executor, and if that were the only case outside of the executor where
ExecCheckRTPerms() was getting called it'd probably be alright, but we
already have another place that wants to use it, so let's move it to a more
appropriate place.

#4: As mentioned previously, the hook (which should be added in a separate
patch anyway) makes more sense to me to be in ExecCheckRTPerms(), not
ExecCheckRTEPerms(). This also means that we need to be calling
ExecCheckRTPerms() from DoCopy and RI_Initial_Check(), to make sure that
the hook gets called. To that end, I wouldn't even expose
ExecCheckRTEPerms() outside of acl.c. Also, there should be a big comment
about not using or calling ExecCheckRTEPerms() directly outside of
ExecCheckRTPerms(), since the hook would then be skipped.
#5: In DoCopy, you can remove relPerms and remainingPerms, but I'd probably
leave required_access up near the top and then just use it to set
rte->required_access directly rather than moving that bit deep down into
the function.

#6: I haven't checked yet, but if there are other things in an RTE which
would make sense in the DoCopy case, beyond just what's needed for the
permissions checking, and which wouldn't be 'correct' with a NULL'd value,
I would set those. Yes, we're building the RTE to check permissions, but we
don't want someone downstream to be surprised when they make a change to
something in the permissions checking and discover that a value in the RTE
they expected to be there wasn't valid. Even more so, if there are helper
functions which can be used to build an RTE, we should be using them. The
same goes for RI_Initial_Check().

#7: I'd move the conditional if (is_from) into the foreach which is
building the columnsSet and eliminate the need for columnsSet; I don't see
that it's really adding much here.

#8: When moving ExecCheckRTPerms(), you should rename it to be more like
the other function calls in acl.h. Perhaps pg_rangetbl_aclcheck()? Also, it
should return an actual AclResult instead of just true/false.

Thanks,

Stephen
Re: [HACKERS] JSON manipulation functions
On Tue, May 25, 2010 at 12:57, Robert Haas wrote:
> On Tue, May 25, 2010 at 5:37 AM, Joseph Adams wrote:
>> I started a wiki article for brainstorming the JSON API:
>> http://wiki.postgresql.org/wiki/JSON_API_Brainstorm . I also made
>> substantial changes to the draft of the API based on discussion here
>> and on the #postgresql IRC channel.
>>
>> Is it alright to use the wiki for brainstorming, or should it stay on
>> the mailing list or go somewhere else?
>
> Well, I think it's fine to use the wiki for brainstorming, but before
> you change the design you probably need to talk about it here. You
> can't rely on everyone on -hackers to follow changes on a wiki page
> somewhere. It looks like the API has been overhauled pretty heavily
> since the last version we talked about here, and I'm not sure I
> understand it.

The general idea that most people have been using, and that I think is
correct, is to have the discussion here on the list, and then keep a
summary of the current state of it on the wiki page so it's easier for
someone entering the discussion to catch up on where it is.

>> I'll try not to spend too much time quibbling over the specifics as I
>> tend to do. While the brainstorming is going on, I plan to start
>> implementing the datatype by itself so I can establish an initial
>> working codebase.
>
> Sounds good.

Agreed.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT
At 2010-05-25 07:35:34 -0400, alex-goncha...@comcast.net wrote:
>
> | Where does the result set (GBs of data) reside after I call
> | PQexecPrepared? On BE, I hope?

Unless you explicitly declare and fetch from an SQL-level cursor, your many
GBs of data are going to be transmitted to libpq, which will eat lots of
memory. (The wire protocol does have something like cursors, but libpq does
not use them, it retrieves the entire result set.)

-- ams
Re: [HACKERS] libpq, PQexecPrepared, data size sent to FE vs. FETCH_COUNT
Alex Goncharov wrote:

,--- I/Alex (Mon, 24 May 2010 12:25:18 -0400) *
| No equivalent of FETCH_COUNT is available at the libpq level, so I
| assume that the interface I am using is smart enough not to send
| gigabytes of data to FE.
|
| Where does the result set (GBs of data) reside after I call
| PQexecPrepared? On BE, I hope?

Sorry for asking again... No sarcasm meant: is there no straightforward
answer here? Or nobody is certain? Or a wrong list?

The straightforward answer is that the libpq frontend C library does not
support something like the JDBC client's setFetchSize. The GBs of data are
gathered at the site of the libpq client (pgresult object
gathered/allocated while consuming result input from backend).

regards,
Yeb Havinga
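Since libpq itself offers no FETCH_COUNT knob, the workaround both replies point at is an explicit SQL-level cursor, fetched in batches. A minimal sketch of that pattern (the empty connection string, the table name "big", and the batch size of 1000 are assumptions for illustration; a live server is required, so this is not a standalone test program):

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("");   /* connection parameters from environment */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    PQclear(PQexec(conn, "BEGIN"));   /* cursors only live inside a transaction */
    PQclear(PQexec(conn, "DECLARE c NO SCROLL CURSOR FOR SELECT * FROM big"));

    for (;;)
    {
        PGresult *res = PQexec(conn, "FETCH 1000 FROM c");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "fetch failed: %s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        int ntuples = PQntuples(res);
        /* ... process at most 1000 rows here; only this batch is ever
         * held in client memory ... */
        PQclear(res);
        if (ntuples == 0)             /* cursor exhausted */
            break;
    }

    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}
```

This is essentially what psql does internally when FETCH_COUNT is set: the backend materializes the portal, and the client holds only one batch of rows at a time.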
Re: [HACKERS] Exposing the Xact commit order to the user
2010/5/25 Dan Ports:
> On Mon, May 24, 2010 at 10:24:07AM -0500, Kevin Grittner wrote:
>
>> Replicating or recreating the whole predicate locking and conflict
>> detection on slaves is not feasible for performance reasons. (I
>> won't elaborate unless someone feels that's not intuitively
>> obvious.) The only sane way I can see to have a slave database allow
>> serializable behavior is to WAL-log the acquisition of a snapshot by
>> a serializable transaction, and the rollback or commit, on the
>> master, and to have the serializable snapshot build on a slave
>> exclude any serializable transactions for which there are still
>> concurrent serializable transactions. Yes, that does mean WAL-logging
>> the snapshot acquisition even if the transaction doesn't yet have an
>> xid, and WAL-logging the commit or rollback even if it never
>> acquires an xid.
>
> One important observation is that any anomaly that occurs on the slave
> can be resolved by aborting a local read-only transaction. This is a
> good thing, because the alternatives are too horrible to consider.
>
> You could possibly cut the costs of predicate locking by having the
> master ship with each transaction the list of predicate locks it
> acquired. But you'd still have to track locks for read-only
> transactions, so maybe that's not a significant cost improvement. On
> the other hand, if you're willing to pay the price of serializability
> on the master, why not the slaves too?

I don't understand the problem. According to me, in the context of SSI, a
read-only slave can just map SERIALIZABLE to the technical implementation
of REPEATABLE READ (i.e., the currently-existing "SERIALIZABLE"). The union
of the transactions on the master and the slave(s) will still exhibit
SERIALIZABLE behavior because the transactions on the slave cannot write
anything and are therefore irrelevant. Is anything wrong with that
reasoning?
Nicolas