Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-03-08 Thread Amit Kapila
On Thu, Feb 27, 2014 at 4:14 PM, Christian Kruse
 wrote:
> On 25/02/14 16:11, Robert Haas wrote:
>> Reading this over, I'm not sure I understand why this is a CONTEXT at
>> all and not just a DETAIL for the particular error message that it's
>> supposed to be decorating.  Generally CONTEXT should be used for
>> information that will be relevant to all errors in a given code path,
>> and DETAIL for extra information specific to a particular error.
>
> Because there is more than one scenario where this context is useful,
> not just log_lock_wait messages.
>
>> If we're going to stick with CONTEXT, we could rephrase it like this:
>>
>> CONTEXT: while attempting to lock tuple (1,2) in relation with OID 3456
>>
>> or when the relation name is known:
>>
>> CONTEXT: while attempting to lock tuple (1,2) in relation "public"."foo"
>
> Accepted. Patch attached.

Thanks. I have reviewed this patch and found a couple of things
that I feel should be addressed.

1.
"> ISTM that you should be printing just the value and the unique index
> there, and not any information about the tuple proper.

Good point, I will have a look at this."

This point is still not handled in the patch, due to which the message
it displays seems incomplete. The message it displays is as below:

LOG:  process 1800 still waiting for ShareLock on transaction 679 after 1014.000 ms
CONTEXT:  while attempting to lock in relation "public"."idx_t1" of database postgres

Here it is not clear what it is attempting to *lock in*.

2. In function XactLockTableWaitErrorContextCallback(), dropped columns
in the tuple should be ignored; otherwise it leads to the following failure:

Session-1

postgres=# select * from t1;
 c1 | c2
----+----
  2 | 99
(1 row)


postgres=# Alter table t1 drop column c2;
ALTER TABLE
postgres=# begin;
BEGIN
postgres=# delete from t1 where c1=2;

Session-2
postgres=# begin;
BEGIN
postgres=# delete from t1 where c1=2;
ERROR:  cache lookup failed for type 0
CONTEXT:  while attempting to lock tuple (0,2) in relation "public"."t1" of database postgres

The problem is in Session-2 (cache lookup failed), when it tries to print
values for the dropped attribute.

3.
" in relation \"%s\".\"%s\" of database %s",
Why display only the relation name in quotes?
I think the database name should also be displayed in quotes.

4.
Context message:
"while attempting to lock tuple (0,2) ".

I think it would be better to rephrase it (maybe by using 'operate on'
instead of 'lock'), because we are actually trying to lock on the
transaction, so "lock on tuple" doesn't make good sense.

5.
+ XactLockTableWaitWithInfo(relation, &tp, xwait);

+ MultiXactIdWaitWithErrorContext(relation, &tp, (MultiXactId) xwait,
+  MultiXactStatusUpdate, NULL, infomask);

I think we should make the name of MultiXactIdWaitWithErrorContext()
similar to XactLockTableWaitWithInfo.

6.
@@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation
relation, int lockmode,
  if (TransactionIdIsValid(SnapshotDirty.xmax))
  {
  ReleaseBuffer(buffer);
- XactLockTableWait(SnapshotDirty.xmax);
+ XactLockTableWaitWithInfo(relation, &tuple,
+  SnapshotDirty.xmax);

In function EvalPlanQualFetch(), we are using SnapshotDirty to fetch
the tuple, so I think there is a chance that it will log a tuple that
would otherwise not be visible. I don't think this is right.


7.
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1307,7 +1307,7 @@ retry:
  if (TransactionIdIsValid(xwait))
  {
  index_endscan(index_scan);
- XactLockTableWait(xwait);
+ XactLockTableWaitWithInfo(heap, tup, xwait);

In function check_exclusion_constraint(), a dirty snapshot is used,
so it seems to me that this will also have the problem of logging
a tuple that would otherwise not be visible.

8.
 void
 WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode)
 {
- List   *l;
+ List   *l;

The extra spacing is not needed.

9.
/*
 * XactLockTableWaitErrorContextCallback
 * error context callback set up by
 * XactLockTableWaitWithInfo. It reports some
 * tuple information and the relation of a lock to acquire
 *
 */
Please improve the indentation of the above function header.

10.
+#include "access/htup.h"
+
+struct XactLockTableWaitLockInfo
+{
+ Relation rel;
+ HeapTuple tuple;
+};

I think it is better to add comments for this structure.
You can refer comments for struct HeapUpdateFailureData


11.
+ *
+ * Use this instead of calling XactTableLockWait()

In the above comment, the function name used is wrong, and I am not sure
this is the right statement to use, because we still use XactLockTableWait()
in a few places, such as in function Do_MultiXactIdWait() and, recently,
in function SnapBuildFindSnapshot().

12.
heap_update()
{
..
..
/*
 * There was no UPDATE in the MultiXact; or it aborted. No
 * TransactionIdIsInProgress() call needed here, since we called
 * MultiXactIdWait() above.

Change the function name in the above comment.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list

Re: [HACKERS] ANALYZE sampling is too good

2014-03-08 Thread Bruce Momjian

I assume we never came up with a TODO from this thread:

---

On Tue, Dec  3, 2013 at 11:30:44PM +, Greg Stark wrote:
> At multiple conferences I've heard about people trying all sorts of
> gymnastics to avoid ANALYZE which they expect to take too long and
> consume too much I/O. This is especially a big complaint after upgrades
> when their new database performs poorly until the new statistics are
> in and they did pg_upgrade to avoid an extended downtime and complain
> about ANALYZE taking hours.
> 
> I always gave the party line that ANALYZE only takes a small
> constant-sized sample so even very large tables should be very quick.
> But after hearing the same story again at Heroku I looked into it a
> bit further. I was kind of shocked by the numbers.
> 
> ANALYZE takes a sample of 300 * statistics_target rows. That sounds
> pretty reasonable but with default_statistics_target set to 100 that's
> 30,000 rows. If I'm reading the code right, it takes this sample by
> sampling 30,000 blocks and then (if the table is large enough) taking
> an average of one row per block. Each block is 8192 bytes so that
> means it's reading 240MB of each table. That's a lot more than I
> realized.
> 
> It means if your table is anywhere up to 240MB you're effectively
> doing a full table scan and then throwing out nearly all the data
> read.
> 
> Worse, my experience with the posix_fadvise benchmarking is that on
> spinning media reading one out of every 16 blocks takes about the same
> time as reading them all. Presumably this is because the seek time
> between tracks dominates and reading one out of every 16 blocks is
> still reading every track. So in fact if your table is up to about
> 3-4G ANALYZE is still effectively going to do a full table scan, at
> least as far as I/O time goes.
> 
> The current algorithm seems like it was designed with a 100G+ table in
> mind but the consequences on the more common 100M-100G tables weren't
> really considered. Consider what this means for partitioned tables. If
> they partition their terabyte table into 10 partitions ANALYZE will
> suddenly want to use 10x as much I/O which seems like a perverse
> consequence.
> 
> I'm not sure I have a prescription but my general feeling is that
> we're spending an awful lot of resources going after a statistically
> valid sample when we can spend a lot less resources and get something
> 90% as good. Or if we're really going to read that much data that we
> might as well use more of the rows we find.
> 
> -- 
> greg

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




[HACKERS] Re: [COMMITTERS] pgsql: libpq: change PQconndefaults() to ignore invalid service files

2014-03-08 Thread Bruce Momjian
On Tue, Dec  3, 2013 at 11:42:08AM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Tue, Dec  3, 2013 at 11:30:15AM -0500, Tom Lane wrote:
> >> Why does this patch remove the errorMessage argument from
> >> pg_fe_getauthname?  I gauge the amount of thought that went into
> >> that choice by the fact that the comment wasn't updated.
> 
> > Oh, the argument was not used, so I remove it.  C comment updated. 
> > Thanks.
> 
> My point was that I didn't think you'd thought about error handling.
> 
> In particular, it appears to me that if the strdup in pg_fe_getauthname
> fails, we'll just let that pass without comment, which is inconsistent
> with all the other out-of-memory cases in conninfo_add_defaults.
> (I wonder whether any code downstream of this supposes that we always
> have a value for the "user" option.  It's a pretty safe bet that the
> case is hardly ever tested.)
> 
> More generally, why is it that we'd want to eliminate any prospect
> of reporting other errors in pg_fe_getauthname?  Is it such a great

[Just getting back to this.]

Agreed.  I have developed the attached patch which passes the strdup()
failure up from pg_fe_getauthname() and maps the failure to
PQconndefaults(), which is now documented as being memory allocation
failure.

FYI, there was odd coding in pg_fe_getauthname() where we set local
variable 'name' to NULL, then tested to see if it was NULL --- I
removed that test.

> idea that we're ignoring failure returns from pqGetpwuid/GetUserName?

If we want pqGetpwuid/GetUserName failures to produce a special return
value, we would need to modify PQconndefaults()'s API, which doesn't
seem worth it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
new file mode 100644
index e10c970..3312678
*** a/src/interfaces/libpq/fe-auth.c
--- b/src/interfaces/libpq/fe-auth.c
*** pg_fe_getauthname(void)
*** 741,756 
  	 */
  	pglock_thread();
  
- 	if (!name)
- 	{
  #ifdef WIN32
! 		if (GetUserName(username, &namesize))
! 			name = username;
  #else
! 		if (pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pw) == 0)
! 			name = pw->pw_name;
  #endif
- 	}
  
  	authn = name ? strdup(name) : NULL;
  
--- 741,753 
  	 */
  	pglock_thread();
  
  #ifdef WIN32
! 	if (GetUserName(username, &namesize))
! 		name = username;
  #else
! 	if (pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pw) == 0)
! 		name = pw->pw_name;
  #endif
  
  	authn = name ? strdup(name) : NULL;
  
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
new file mode 100644
index 1f0eeaf..669bcf5
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
*** conninfo_add_defaults(PQconninfoOption *
*** 4483,4488 
--- 4483,4495 
  		if (strcmp(option->keyword, "user") == 0)
  		{
  			option->val = pg_fe_getauthname();
+ 			if (!option->val)
+ 			{
+ if (errorMessage)
+ 	printfPQExpBuffer(errorMessage,
+ 	  libpq_gettext("out of memory\n"));
+ return false;
+ 			}
  			continue;
  		}
  	}



Re: [HACKERS] git-review: linking commits to review discussion in git

2014-03-08 Thread Murtuza Mukadam
Hi Heikki

We have linked git commits and reviews in a web interface. If you
enter a commit hash, you will be redirected to the email archive of
the peer review discussion:
http://cesel.encs.concordia.ca/git-reviewed-tracker.php

This work is part of my thesis, so feedback is much appreciated. If
you have another git repo and mailing lists that you'd like linked,
please let us know.

Cheers,
Murtuza

How do we do the linking? We take each email patch, eliminate
whitespace, and hash each line. We then compare the lines with those in
commits to the same files. The commit that changes the same files and
has the largest number of matching lines is considered to be the
reviewed commit.
Regards,
Murtuza


On Tue, Jan 28, 2014 at 2:10 AM, Heikki Linnakangas
 wrote:
> On 01/27/2014 11:36 PM, Murtuza Mukadam wrote:
>>
>> Hello All,
>>We have linked peer review discussions on
>> 'pgsql-hackers' to their respective commits within the main
>> postgresql.git repository. You can view the linked reviews from 2012
>> until present in the GitHub repo at
>> https://github.com/mmukadam/postgres/tree/review
>>
>> If you want to work with these reviews locally, you can use our
>> git-review tool. It allows you to create reviews and attach them to
>> commits in git. We didn't modify git, instead we added some scripts
>> that use standard git commands. git-review is beta, but since it only
>> adds a detached 'review' branch and modifies the contents of this
>> branch, it has minimal impact and can easily be removed by deleting
>> the 'review' branch and scripts.
>>
>> The online man-page is here:
>> http://users.encs.concordia.ca/~m_mukada/git-review.html
>>
>> In order to install git-review, you need to clone the repository:
>> https://github.com/mmukadam/git-review.git
>>
>> The online tutorial is available here:
>> http://users.encs.concordia.ca/~m_mukada/git-review-tutorial.html
>>
>> The clone of postgresql.git with linked review discussion is here (new
>> review discussion are linked nightly)
>> https://github.com/mmukadam/postgres
>>
>> This work is part of my Master's thesis. If you'd like us to change
>> the tool to better suit your review process, have another git repo
>> you'd like us to link commits with review discussion, or have other
>> feedback, please let us know.
>
>
> I don't understand what this does. The repository at
> https://github.com/mmukadam/postgres looks like just a clone of the main
> PostgreSQL repository, with no extra links anywhere. And the repository at
> https://github.com/mmukadam/postgres/tree/review looks like a mailing list
> archive turned into a git repository, but I don't see any links to the
> commits in the main repository there.
>
> Am I missing something?
>
> - Heikki




Re: [HACKERS] Improve timestamp substraction to be DST-aware

2014-03-08 Thread Bruce Momjian
On Mon, Dec  2, 2013 at 11:43:06PM +0100, Piotr Marcinczyk wrote:
> > The alternative proposal that's been on the table for awhile (see the
> > preceding entry in the TODO list) is to remove the interval_justify_hours
> > call in timestamp_mi, which would also have the effect of fixing the
> > inconsistency that T1 + (T2 - T1) doesn't necessarily yield T2.  And it
> > would do that a lot more straightforwardly, with less risk that there's
> > still corner cases that would misbehave.
> > 
> > If it's not the T1 + (T2 - T1) issue that's bothering you, perhaps
> > you should explain exactly what results you're hoping to get by changing
> > this behavior.
> > 
> 
> In SQL99 "4.7 Datetimes and intervals" I read, that day-time intervals
> (I think, that our interval has this type) should have hours in range
> 0-23. I suggest to remove preceding entry from TODO list, and not treat
> this as alternative. Current behavior is OK.
> 
> Regarding this, we have two options: use session TZ, or wait for
> implementation of TZ saved in timestamp field. In my opinion saving TZ
> in field doesn't give serious benefits, and it's probable that it will
> never be implemented. In this case, using session TZ is sufficient.
> 
> Can You explain, why do You read this proposal as fuzzy? I believe that
> using session context is normal in many cases. Maybe I should consider
> saving TZ in timestamp once again?

I have removed the TODO item and added an interval subtraction/addition
section to the docs for PG 9.4:

http://www.postgresql.org/docs/devel/static/functions-datetime.html


The paragraphs begin with "When adding an interval value to" and
"Subtraction of dates and timestamps can also be complex."  Is there
anything more to add there?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Comment - uniqueness of relfilenode

2014-03-08 Thread Bruce Momjian
On Thu, Mar  6, 2014 at 09:20:39PM -0500, Bruce Momjian wrote:
> On Thu, Mar  6, 2014 at 08:34:29AM +0100, Antonin Houska wrote:
> > >>> You're right. I missed the fact that Postgres (unlike another DBMS that
> > >>> I worked with) allows for tablespace to be shared across databases.
> > >>
> > >> I have update the C comment:
> > >>
> > >> <  * Notice that relNode is only unique within a particular 
> > >> database.
> > >> ---
> > >> >  * Notice that relNode is only unique within a particular 
> > >> tablespace.
> > > 
> > > Yep. But the new text is no more correct than the old text.  Did you
> > > read what I wrote upthread?
> > 
> > Perhaps "... unique within a particular (tablespace, database)
> > combination." ?
> 
> Oh, I thought people just wanted the text to be consistent with other
> mentions, rather than improving it.  Attached is a patch which updates
> comments about the tablespace/database issue.

Applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] pg_ctl status with nonexistent data directory

2014-03-08 Thread Bruce Momjian
On Sat, Mar  8, 2014 at 09:08:31AM +0530, Amit Kapila wrote:
> On Fri, Mar 7, 2014 at 7:59 PM, Bruce Momjian  wrote:
> > On Fri, Mar  7, 2014 at 07:18:04PM +0530, Amit Kapila wrote:
> >> > OK, done with the attached patch.  Three is returned for status, but one
> >> > for other cases.
> >>
> >> I think it would have been better if it return status as 4 for cases where
> >> directory/file is not accessible (current new cases added by this patch).
> >>
> >> I think status 3 should be returned only when the data directory is 
> >> valid
> >> and pid file is missing, because in script after getting this code, the 
> >> next
> >> thing they will probably do is to start the server which doesn't seem a
> >> good fit for newly added cases.
> >
> > OK, I have updated the attached patch to reflect this, and added a C
> > comment.
> 
> This is fine, do you think we should mention about this in docs?
> I have added one line in docs (updated patch attached), if you think
> it makes sense then add it otherwise ignore it.

I have applied your version of the patch with slightly different text:

If an accessible data directory is not specified, the process returns an
exit status of 4.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Little confusing things about client_min_messages.

2014-03-08 Thread Tom Lane
Bruce Momjian  writes:
> On Sat, Mar  8, 2014 at 11:31:22AM -0500, Tom Lane wrote:
>> Tomonari Katsumata  writes:
>>> [ client_min_messages = info is not documented ]

>> That's intentional, because it's not a useful setting.  Even more so
>> for the other two.

> Well, 'info' is between other settings we do document, so I am not clear
> why info should be excluded.  It is because we always output INFO to the
> client?  From elog.c:

> if (ClientAuthInProgress)
> output_to_client = (elevel >= ERROR);
> else
> output_to_client = (elevel >= client_min_messages ||
> elevel == INFO);

Right, so if you did set it to that, it wouldn't be functionally different
from NOTICE.

I'm not real sure why we allow setting client_min_messages to FATAL or
PANIC at all; seems to me that would break the FE/BE protocol, which says
that command cycles end with either the expected response or ErrorMessage.
In some quick experimentation, libpq/psql don't seem to get as confused as
I thought they would; but the user is sure likely to.

regression=# select 1/0;
ERROR:  division by zero
regression=# set client_min_messages TO panic;
SET
regression=# select 1/0;
regression=# slect
regression-# ;
regression=# select 1.0;
 ?column? 
----------
      1.0
(1 row)

regression=# foo;
regression=# 

So no, I don't think we ought to be advertising these as suggested
values.  A saner proposed patch would be to remove them from the
valid values altogether.  We probably had some good reason for leaving
them in the list back when, but I'm having a hard time reconstructing
what that would be.

regards, tom lane




Re: [HACKERS] Little confusing things about client_min_messages.

2014-03-08 Thread Bruce Momjian
On Sat, Mar  8, 2014 at 11:31:22AM -0500, Tom Lane wrote:
> Tomonari Katsumata  writes:
> > [ client_min_messages = info is not documented ]
> 
> That's intentional, because it's not a useful setting.  Even more so
> for the other two.

Well, 'info' is between other settings we do document, so I am not clear
why info should be excluded.  It is because we always output INFO to the
client?  From elog.c:

if (ClientAuthInProgress)
output_to_client = (elevel >= ERROR);
else
output_to_client = (elevel >= client_min_messages ||
elevel == INFO);


-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] gaussian distribution pgbench

2014-03-08 Thread Fabien COELHO


Hello Mitsumasa-san,


New "\setrandom" interface is here.
 \setrandom var min max [gaussian threshold | exponential threshold]


Attached patch realizes this interface, but it has a little bit ugly coding in 
executeStatement() and process_commands().


I think it is not too bad. The "ignore extra arguments on the line" behavior is 
a little pre-existing mess anyway.



What do you think?


I'm okay with this UI and its implementation.

--
Fabien.




Re: [HACKERS] Little confusing things about client_min_messages.

2014-03-08 Thread Tom Lane
Tomonari Katsumata  writes:
> [ client_min_messages = info is not documented ]

That's intentional, because it's not a useful setting.  Even more so
for the other two.

regards, tom lane




Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-08 Thread Tom Lane
"MauMau"  writes:
> Is PL/Java safe to use in terms of its threading design?  I'm going to ask 
> the PL/Java community about this too, but I'd ask for opinions here because 
> I believe people in this community have seasoned knowledge of OS and SPI.

> To put the question in other words, is it safe to load a multi-threaded PL 
> library in the single-threaded backend process, if the PL only calls SPI in 
> the main thread?

When it breaks, we're not going to be concerned.

regards, tom lane




Re: [HACKERS] Shave a few instructions from child-process startup sequence

2014-03-08 Thread Tom Lane
Bruce Momjian  writes:
> On Sun, Dec  1, 2013 at 12:07:21PM -0500, Gurjeet Singh wrote:
>> Yes, this is a performance patch, but as the subject says, it saves a few
>> instructions. I don't know how to write a test case that can measure savings 
>> of
>> skipping a few instructions in a startup sequence that potentially takes
>> thousands, or even millions, of instructions.

> Are we saying we don't want this patch?

I don't --- I think it makes the code less robust for no gain worthy
of the name.

regards, tom lane




Re: [HACKERS] Selection of join algorithm.

2014-03-08 Thread Atri Sharma
On Saturday, March 8, 2014, Ishaya Bhatt  wrote:

> Hi,
>
> I am trying to analyze join performance. But I see that even for a table
> having 100,000 rows and join attribute as primary key, postgres always
> performs hash join.
>
> Can anyone please tell me under which conditions merge join or nested loop
> join is invoked?
>
>
>
Nested loop is generally performed when one of the tables being joined is
small so the inner loop will be fast.

Hash joins are the preferred join type unless the hash table does not fit
in work_mem. If that is the case, then some other join algorithm is preferred.

Regards,

Atri


-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Static Code Analysis Exploits.

2014-03-08 Thread Patrick Curran


On 03/07/2014 07:19 PM, Tom Lane wrote:

Patrick Curran  writes:

We use Postgres in our product and we have a client that requires a
static code analysis scan to detect vulnerabilities. They are concerned
because the tool (Veracode) found several flaws in Postgres and they
believe there might be a security risk. I'm sure there are lots of
companies that use Postgres that have security policies like theirs in
place, so I'm hoping someone has the experience to know that these are
false positives or that they are not a security risk for some reason.
Below is a description of the vulnerability and the location in the
source code. Version 9.3.2.1 was scanned. Please let me know if there is
a better place to ask this kind of question.

TBH, I don't think anyone's going to bother with going through this list
in this form.  Line numbers in something that's not an official community
release are not very helpful, and the descriptions are far too vague for
someone looking at the list to be sure exactly what their tool is on
about.  I took one example at random:


Stack-based Buffer Overflow (CWE ID 121)(13 flaws):
There is a potential buffer overflow with these functions. If an
attacker can control the data written into the buffer, the overflow may
result in execution of arbitrary code.
libpq.dll .../interfaces/libpq/pqexpbuffer.c 369

This seems to be complaining about the memcpy in appendBinaryPQExpBuffer.
Well, I don't see anything unsafe about it: the preceding call to
enlargePQExpBuffer should have made sure that the target buffer is big
enough.  And the reference to stack-based buffer overflow is completely
nonsensical, because no PQExpBuffer keeps its buffer on the stack.  It's
conceivable that their tool has spotted some unsafe pattern in some call
site, but this report is unhelpful about identifying what that would be.

I did look at a few of the other items, and none of the ones I looked at
were any more clear.

FWIW, we do have access to Coverity scans of the Postgres sources, and
we do make efforts to fix things Coverity complains about.  But we're
unlikely to take reports like this one seriously: there's simply not
enough information to know what the tool is unhappy about, nor any
clear reason to believe that it's spotted something that both human
readers and Coverity have missed.

Sorry if that's not the answer you wanted; but a more positive response
is going to require a substantially greater amount of work on your end.
In particular, given the very substantial amounts of work that have
already gone into hardening the Postgres code, I think the burden of
proof is on you or your client to show that these are issues, not on
us to disprove claims that are too vague to be disproven.

regards, tom lane
I understand. It makes perfect sense. Thanks for your response. It's 
actually been quite helpful.


Thanks again,
Patrick





[HACKERS] Selection of join algorithm.

2014-03-08 Thread Ishaya Bhatt
Hi,

I am trying to analyze join performance. But I see that even for a table
having 100,000 rows and join attribute as primary key, postgres always
performs hash join.

Can anyone please tell me under which conditions merge join or nested loop
join is invoked?

Thanks and Regards,
Ishaya


[HACKERS] Little confusing things about client_min_messages.

2014-03-08 Thread Tomonari Katsumata
Hi,

I noticed that the behavior of client_min_messages is not consistent
between the documentation, 'pg_settings', and postgresql.conf.


The behavior is:
I can set 'INFO', 'FATAL' and 'PANIC' as valid values.

postgres=# set client_min_messages to info;
SET
postgres=# set client_min_messages to fatal;
SET
postgres=# set client_min_messages to panic;
SET


document says:

DEBUG1, LOG, NOTICE,
WARNING, ERROR, FATAL,
and PANIC.  Each level

I couldn't understand the reason for 'INFO' disappearing from the document.


pg_settings says:

{debug5,debug4,debug3,debug2,debug1,log,notice,warning,error}

and

postgresql.conf says:

#client_min_messages = notice   # values in order of decreasing detail:
#   debug5
#   debug4
#   debug3
#   debug2
#   debug1
#   log
#   notice
#   warning
#   error


I also couldn't understand the reason for 'info', 'fatal' and 'panic'
disappearing from them.



My proposal is that all valid values should be presented to users.
I fixed this, please see the attached patch.

regards,
---
Tomonari Katsumata


clear_client_min_messages.patch
Description: Binary data



[HACKERS] Is SPI safe to use in multi-threaded PL/Java?

2014-03-08 Thread MauMau

Hello,

Is PL/Java safe to use in terms of its threading design?  I'm going to ask 
the PL/Java community about this too, but I'd ask for opinions here because 
I believe people in this community have seasoned knowledge of OS and SPI.


To put the question in other words, is it safe to load a multi-threaded PL 
library in the single-threaded backend process, if the PL only calls SPI in 
the main thread?


PL/Java (pljava.so) is linked with the JNI (Java Native Interface) library, 
libjvm.so, in JRE.  libjvm.so is linked with libpthread.so, because Java VM 
is multi-threaded.  So, "ldd pljava.so" shows libjvm.so and libpthread.so. 
pljava.so doesn't seem to be built for multi-threading --- none 
of -mt, -D_REENTRANT or -D_POSIX_C_SOURCE is specified when building it.


When the application calls Java stored function, pljava.so calls a function 
in libjvm.so to create a JVM in the backend process, then invokes the 
user-defined Java method in the main thread.  The user-defined Java method 
calls JDBC methods to access database.  The JDBC method calls are translated 
to backend SPI function calls through JNI.


The main thread can create Java threads using Java Thread API, and those 
threads can call JDBC methods.  However, PL/Java intercepts JDBC method 
calls and serializes SPI calls.  So, only one thread calls SPI functions at 
a time.  I'm wondering if this is the reason why PL/Java is safe for use.


What I'm concerned about is whether multi-threaded code (Java VM) can run 
safely in a single-threaded code (postgres).  I don't know what can be a 
particular problem with PL/Java, but in general, the mixture of 
single-threaded code and multi-threaded one seems to cause trouble around 
handling errno, memory and file handles/pointers.


FYI, JNI specification says that the code called from Java VM should be 
built for multi-threading as follows.  But postgres is not.


http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/design.html#wp9502

[Excerpt]
Compiling, Loading and Linking Native Methods
Since the Java VM is multithreaded, native libraries should also be compiled 
and linked with multithread aware native compilers. For example, the -mt 
flag should be used for C++ code compiled with the Sun Studio compiler. For 
code compiled with the GNU gcc compiler, the flags -D_REENTRANT 
or -D_POSIX_C_SOURCE should be used. For more information please refer to 
the native compiler documentation.


Regards
MauMau


