Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-20 Thread Gabriele Bartolini

Hi Noah,

thanks for your invaluable review, rich in useful and thorough comments 
and notes. Marco and I will add your proposed tests as soon as 
possible (most likely after the Italian PGDay, which is this week). 
However, given the feedback received from other developers too 
(including Tom), I would first concentrate on defining the syntax and 
how referential integrity actions should work.


On 17/11/11 05:28, Noah Misch wrote:
 Removing values from the array seems best to me. There's no doubt 
 about what ON UPDATE CASCADE should do, and having ON DELETE CASCADE 
 excise individual array elements is consistent with that. It's less 
 clear for SET NULL, but I'd continue with a per-element treatment. I'd 
 continue to forbid SET DEFAULT. However, Jeff Davis did expect ON 
 DELETE CASCADE to remove entire rows: 
 http://archives.postgresql.org/message-id/1288119207.15279.24.camel@jdavis-ux.asterdata.local 
 So, perhaps the behavior needs to be user-selectable.
I would agree with what Tom is saying here, given that the SQL specs do 
not say anything about this feature. We could leave the standard 
REFERENCES keyword handling the array value as it does now. If a user 
wants to take advantage of in-array referential integrity, we could 
implement the special keyword ARRAY REFERENCES as Tom proposes (or a 
similar keyword).


Consequently, we need to agree on what the delete and update actions 
are. In the case of ARRAY REFERENCES, I would be inclined to keep the 
same row-scope meaning for the CASCADE and SET NULL actions, while 
disallowing the SET DEFAULT action (as Noah suggests too). At the same 
time, I would add two actions for ARRAY REFERENCES which will process 
array elements:


* ARRAY CASCADE
* ARRAY SET NULL

(Of course, you are welcome to propose a better naming convention.) 
This table summarises the scope of the actions.


----------------+---------+---------+
                |   ON    |   ON    |
 Action         | DELETE  | UPDATE  |
----------------+---------+---------+
 CASCADE        |   Row   | Element |
 SET NULL       |   Row   |   Row   |
 ARRAY CASCADE  | Element | Element |
 ARRAY SET NULL | Element | Element |
 SET DEFAULT    |  Error  |  Error  |
 NO ACTION      |    -    |    -    |
 RESTRICT       |    -    |    -    |
----------------+---------+---------+

For instance, with ARRAY REFERENCES ... ON DELETE CASCADE, I would 
expect the whole row to be deleted (as Jeff et al. say). However, if I 
specify ARRAY REFERENCES ... ON DELETE ARRAY CASCADE, I would expect 
the matching elements in the referencing array to be removed. 
Similarly, ARRAY REFERENCES ... ON DELETE SET NULL will set the whole 
referencing array to NULL, whereas ARRAY REFERENCES ... ON DELETE 
ARRAY SET NULL will set individual elements in the referencing array 
to NULL.
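
To make the distinction concrete, here is a sketch in the proposed
syntax. This is hypothetical DDL: ARRAY REFERENCES and the ARRAY
actions are proposals, not accepted grammar, and the table names are
only illustrative.

CREATE TABLE vertexes (
    id integer PRIMARY KEY
);

CREATE TABLE polygons (
    id       integer PRIMARY KEY,
    vertexes integer[] ARRAY REFERENCES vertexes (id)
                       ON DELETE ARRAY CASCADE
                       ON UPDATE ARRAY CASCADE
);

-- With ON DELETE ARRAY CASCADE, deleting vertex 3 would turn a
-- referencing array {1,2,3} into {1,2}; with plain ON DELETE CASCADE,
-- the whole polygons row would be deleted instead.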


In the case of updates, SET NULL and ARRAY SET NULL work analogously 
(updating the whole row or the individual elements, respectively). 
CASCADE and ARRAY CASCADE are synonyms, as they would both operate on 
individual elements (which is the action that makes the most sense anyway).


I believe that, before we proceed with one implementation or another, it 
is important that we discuss this sort of thing and agree on a possible 
long-term path (so that we can organise intermediate deliverables).


Thanks,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: better support for debugging of overloaded functions

2011-11-20 Thread Pavel Stehule
2011/11/18 Robert Haas robertmh...@gmail.com:
 On Fri, Nov 18, 2011 at 6:24 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 CONTEXT:  PL/pgSQL function assign_rslts line 50 at assignment (oid: 65903)

 \sf+ 65903

 I'm pretty unenthused by the idea of making OIDs more user-visible
 than they already are.  If the message is ambiguous, we should include
 argument types and (if not the object that would be visible under the
 current search_path) a schema qualification.  Spitting out a five (or
 six or seven or eight) digit number doesn't seem like a usability
 improvement.

Would it be possible to add a GUC variable plpgsql.log_function_signature
(or maybe just log_function_signature, for all PLs)? I am not sure about
the GUC name.

When this variable is true, the CONTEXT line will contain the qualified
function signature instead of just the function name.

I would not want to check whether the function name is ambiguous after
an exception is raised. That requires access to the system tables, and
then exception handling can be slower. Using a qualified name is
necessary because the psql meta-commands are not smart: they are based
on search_path, so the fact that a name is not ambiguous doesn't help
there.
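
As a sketch, assuming the proposed GUC were accepted under this name
(nothing here exists today, and the signature shown is made up):

SET plpgsql.log_function_signature = on;

-- the CONTEXT line would then carry a schema-qualified signature:
-- CONTEXT:  PL/pgSQL function public.assign_rslts(integer, text) line 50 at assignment
-- instead of:
-- CONTEXT:  PL/pgSQL function assign_rslts line 50 at assignment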

Regards

Pavel Stehule

p.s. Another issue is the missing CONTEXT line for RAISE EXCEPTION


 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-20 Thread Gianni Ciolli
Hi Gabriele and Marco,

On Sun, Nov 20, 2011 at 10:36:15AM +0100, Gabriele Bartolini wrote:
 ----------------+---------+---------+
                 |   ON    |   ON    |
  Action         | DELETE  | UPDATE  |
 ----------------+---------+---------+
  CASCADE        |   Row   | Element |
  SET NULL       |   Row   |   Row   |
  ARRAY CASCADE  | Element | Element |
  ARRAY SET NULL | Element | Element |
  SET DEFAULT    |  Error  |  Error  |
  NO ACTION      |    -    |    -    |
  RESTRICT       |    -    |    -    |
 ----------------+---------+---------+

thank you for this very clear and concise summary!

I agree with your appeal for a broad discussion on the proposed
syntax, and I will use the same language to express my proposal (for
clarity and to simplify the discussion):

-------------------+---------+---------------+
                   |   ON    |      ON       |
 Action            | DELETE  |    UPDATE     |
-------------------+---------+---------------+
 CASCADE           | Element |    Element    |
 SET NULL          | Element |    Element    |
 SET DEFAULT       |  Error  |     Error     |
 ARRAY CASCADE     |   Row   | Element = Row |
 ARRAY SET NULL    |   Row   |      Row      |
 ARRAY SET DEFAULT |   Row   |      Row      |
 NO ACTION         |    -    |       -       |
 RESTRICT          |    -    |       -       |
-------------------+---------+---------------+

I have swapped your syntax in the following way, which looks cleaner to
me: the ARRAY (CASCADE | SET NULL | SET DEFAULT) syntax denotes
operations that happen on the whole array, while plain CASCADE | SET
NULL | SET DEFAULT denotes operations that happen on the elements of
the array.

Associating the Element behaviour with the ON DELETE CASCADE syntax
is also consistent with the case where the referencing table A is
constructed as a GROUP BY of another table B, and the array reference on
A is built by aggregating a non-array reference on B that uses the ON
DELETE CASCADE syntax. In other words, the same syntax (ON DELETE
CASCADE) would denote the same behaviour in both the aggregated case
(= one row per object, using array references) and the non-aggregated
case (multiple rows per object, using equality references), which
represent two distinct implementations of the same abstraction; see
the sketch below.
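
A sketch of the two implementations (hypothetical names; the aggregated
case uses the proposed, not yet accepted, syntax):

-- Non-aggregated: one row per (polygon, vertex) pair; deleting a
-- vertex cascades to the rows that mention it.
CREATE TABLE polygon_vertexes (
    polygon_id integer,
    vertex_id  integer REFERENCES vertexes (id) ON DELETE CASCADE
);

-- Aggregated: one row per polygon; under this proposal, plain
-- ON DELETE CASCADE removes the matching element, mirroring the
-- row-level cascade of the non-aggregated form.
CREATE TABLE polygons (
    id         integer PRIMARY KEY,
    vertex_ids integer[] ARRAY REFERENCES vertexes (id) ON DELETE CASCADE
);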

The Row behaviour would instead be associated with a new syntax (ON
DELETE ARRAY CASCADE), which cannot be obtained via the existing
syntax in the non-aggregated implementation, on the grounds that it
might be useful for some semantics (for instance: if you remove a
vertex from a polygon, you can either destroy the polygon or transform
it into a polygon with fewer vertices).

The same principle of considering the two implementations as the same
abstraction would also confirm your choice to raise an exception on ON
(DELETE | UPDATE) SET DEFAULT.

It would also suggest enabling ON (DELETE | UPDATE) ARRAY SET
DEFAULT. The reasoning is that we can actually specify a default array
for the referencing column, but we can't specify a default element.
I briefly considered using the referenced column's default as a
default for the single element, but it seems a bad idea: a default is
an expression (possibly non-constant) which is evaluated only when a
new row is created in the referenced table, and using it outside of
that context looks inappropriate.

Regarding ON UPDATE ARRAY CASCADE, I agree to make it a synonym, since
updating the whole array to take into account the update on the
referenced field is equivalent to updating the single element to take
into account the same fact.

Finally, ON UPDATE ARRAY SET NULL would still have a use case as a
behaviour distinct from ON UPDATE SET NULL, which we make available
to the database designer: instead of replacing the updated element in
the array with a NULL, we replace the whole array with a NULL. This is
essentially the same difference that we have between ON DELETE ARRAY
CASCADE and ON DELETE CASCADE.

Thanks,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2011-11-20 Thread Noah Misch
On Sun, Nov 20, 2011 at 10:36:15AM +0100, Gabriele Bartolini wrote:
 I would agree with what Tom is saying here, given that the SQL specs do
 not say anything about this feature. We could leave the standard
 REFERENCES keyword handling the array value as it does now. If a user
 wants to take advantage of in-array referential integrity, we could
 implement the special keyword ARRAY REFERENCES as Tom proposes (or a
 similar keyword).

No objection to that.

 ----------------+---------+---------+
                 |   ON    |   ON    |
  Action         | DELETE  | UPDATE  |
 ----------------+---------+---------+
  CASCADE        |   Row   | Element |
  SET NULL       |   Row   |   Row   |
  ARRAY CASCADE  | Element | Element |
  ARRAY SET NULL | Element | Element |
  SET DEFAULT    |  Error  |  Error  |
  NO ACTION      |    -    |    -    |
  RESTRICT       |    -    |    -    |
 ----------------+---------+---------+

I like this.

 CASCADE and ARRAY CASCADE are synonyms, as they would both operate on
 individual elements (which is the action that makes the most sense anyway).

What about making ON UPDATE CASCADE an error?  That way, we can say that ARRAY
action always applies to array elements, and plain action always applies to
entire rows.

SET DEFAULT should now be fine to allow.  It's ARRAY SET DEFAULT, in your new
terminology, that wouldn't make sense.
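
Combining those two suggestions, the matrix would presumably become:

-------------------+---------+---------+
                   |   ON    |   ON    |
 Action            | DELETE  | UPDATE  |
-------------------+---------+---------+
 CASCADE           |   Row   |  Error  |
 SET NULL          |   Row   |   Row   |
 SET DEFAULT       |   Row   |   Row   |
 ARRAY CASCADE     | Element | Element |
 ARRAY SET NULL    | Element | Element |
 ARRAY SET DEFAULT |  Error  |  Error  |
 NO ACTION         |    -    |    -    |
 RESTRICT          |    -    |    -    |
-------------------+---------+---------+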

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql setenv command

2011-11-20 Thread Josh Kupershmidt
On Wed, Nov 2, 2011 at 5:36 PM, Andrew Dunstan and...@dunslane.net wrote:
 Updated patch is attached - adding to Nov commitfest.

Review of the v2 patch:

* Submission Review
Patch applies with some fuzz and builds without warnings. I noticed
some tab characters being used in psql-ref.sgml where they shouldn't
be.

* Usability Review
I sort of doubt I'll use the feature myself, but there was at least
one +1 for the feature idea already, so it seems useful enough. That
said, the stated motivation for the patch:

  First, it would be useful to be able to set pager options and possibly other
 settings, so my suggestion is for a \setenv command that could be put in a
 .psqlrc file, something like:

seems like it would be better suited to the user's ~/.profile (perhaps
via an 'alias' if you didn't want the changes to apply globally). So
unless I'm missing something, the real niche for the patch is letting
people interactively change environment settings while within psql.
Again, not something I'm keen on for my own use, but if others think
it's useful, that's good enough for me.
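
For reference, the intended usage looks like this, as I understand the
patch (a value sets the variable; omitting the value unsets it):

test=# \setenv PAGER less
test=# \! echo $PAGER
less
test=# \setenv PAGER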

* Coding Review / Nitpicks
The patch implements \setenv via calls to unsetenv() and putenv(). As
the comment notes,

| That means we
| leak a bit of memory here, but not enough to worry about.

which seems true; the man page basically says there's nothing to be
done about the situation.

The calls to putenv() and unsetenv() are done without any real input
checking. So this admittedly-pathological case produces surprising
results:

test=# \setenv foo=bar baz
test=# \! echo $foo
bar=baz

Perhaps 'envvar' arguments containing a '=' should just be disallowed.

Also, should the malloc() of newval just use pg_malloc() instead?

* Reviewer Review
I haven't tested on Windows.

Josh

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] testing ProcArrayLock patches

2011-11-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Hmm.  There's obviously something that's different in your
 environment or configuration from what I tested, but I don't know
 what it is. The fact that your scale factor is larger than
 shared_buffers might matter; or Intel vs. AMD.  Or maybe you're
 running with synchronous_commit=on?
 
Yes, I had synchronous_commit = on for these runs.  Here are the
settings:
 
cat >> $PGDATA/postgresql.conf <<EOM;
max_connections = 200
max_pred_locks_per_transaction = 256
shared_buffers = 10GB
maintenance_work_mem = 1GB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
EOM
 
Is there any chance that having pg_xlog on a separate RAID 10 set of
drives with its own BBU controller would explain anything?  I mean,
I always knew that was a good idea for a big, heavily-loaded box,
but I remember being surprised at how *big* a difference that made
when a box accidentally went into production without moving the
pg_xlog directory there.

There is one other thing which might matter: I didn't use the -n
pgbench option, and in the sample you showed, you were using it.
 
Here is the median of five from the latest runs.  On these
read/write tests there was very little spread within each set of
five samples, with no extreme outliers like I had on the SELECT-only
tests.  In the first position s means simple protocol and p means
prepared protocol.  In the second position m means master, f means
with the flexlock patch.
 
sm1 tps = 1092.269228 (including connections establishing)
sf1 tps = 1090.511552 (including connections establishing)
sm2 tps = 2171.867100 (including connections establishing)
sf2 tps = 2158.609189 (including connections establishing)
sm4 tps = 4278.541453 (including connections establishing)
sf4 tps = 4269.921594 (including connections establishing)
sm8 tps = 8472.257182 (including connections establishing)
sf8 tps = 8476.150588 (including connections establishing)
sm16 tps = 15905.074160 (including connections establishing)
sf16 tps = 15937.372689 (including connections establishing)
sm32 tps = 22331.817413 (including connections establishing)
sf32 tps = 22861.258757 (including connections establishing)
sm64 tps = 26388.391614 (including connections establishing)
sf64 tps = 26529.152361 (including connections establishing)
sm80 tps = 25617.651194 (including connections establishing)
sf80 tps = 26560.541237 (including connections establishing)
sm96 tps = 24105.455175 (including connections establishing)
sf96 tps = 26569.244384 (including connections establishing)
sm128 tps = 21467.530210 (including connections establishing)
sf128 tps = 25883.023093 (including connections establishing)
 
pm1 tps = 1629.265970 (including connections establishing)
pf1 tps = 1619.024905 (including connections establishing)
pm2 tps = 3164.061963 (including connections establishing)
pf2 tps = 3137.469377 (including connections establishing)
pm4 tps = 6114.787505 (including connections establishing)
pf4 tps = 6061.750200 (including connections establishing)
pm8 tps = 11884.534375 (including connections establishing)
pf8 tps = 11870.670086 (including connections establishing)
pm16 tps = 20575.737107 (including connections establishing)
pf16 tps = 20437.648809 (including connections establishing)
pm32 tps = 27664.381103 (including connections establishing)
pf32 tps = 28046.846479 (including connections establishing)
pm64 tps = 26764.294547 (including connections establishing)
pf64 tps = 26631.589294 (including connections establishing)
pm80 tps = 27716.198263 (including connections establishing)
pf80 tps = 28393.642871 (including connections establishing)
pm96 tps = 26616.076293 (including connections establishing)
pf96 tps = 28055.921427 (including connections establishing)
pm128 tps = 23282.912620 (including connections establishing)
pf128 tps = 23072.766829 (including connections establishing)
 
Note that on this 32-core box, performance on the read/write pgbench
peaks at 64 clients, but without a lot of variance between 32
and 96 clients.  And with the patch, performance still hasn't fallen
off too badly at 128 clients.  This is good news in terms of not
having to sweat connection pool sizing quite as much as in earlier
releases.
 
Next I will get the profile for the SELECT-only runs.  It seems to
make sense to profile at the peak performance level, which was 64
clients.
 
I can run one more set of tests tonight before I have to give it
back to the guy who's putting it into production.  It sounds like a
set like the above except with synchronous_commit = off might be
desirable?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython SPI cursors

2011-11-20 Thread Steve Singer

On 11-10-15 07:28 PM, Jan Urbański wrote:

Hi,

attached is a patch implementing the usage of SPI cursors in PL/Python.
Currently, when trying to process a large table in PL/Python you have to
slurp it all into memory (that's what plpy.execute does).

J


I found a few bugs (see my testing section below) that will need fixing, 
plus a few questions about the code.



Overview & Feature Review
---
This patch adds cursor support to plpython.  SPI cursors will allow
a plpython function to read and process a large result set without having
to read it all into memory at once.  This is a good thing.  Without this
patch I think you could accomplish the same by using SQL DECLARE CURSOR
and SQL FETCH.  This feature allows you to use a Python cursor as
an iterator, resulting in much cleaner Python code than the SQL FETCH
approach.  I think the feature is worth having.


Usability Review
--
 The patch adds the user methods
cursor=plpy.cursor(query_or_plan)
cursor.fetch(100)
cursor.close()
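
For illustration, a minimal sketch of the intended usage (assuming, per
the patch, that the cursor object is iterable and that fetch() returns
a list of rows):

create or replace function sum_series() returns bigint as $$
total = 0
cur = plpy.cursor('select generate_series(1, 1000000) as n')
for row in cur:
    # rows are fetched incrementally instead of being slurped at once
    total += row['n']
cur.close()
return total
$$ language plpythonu;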

Do we like the name plpy.cursor, or would we rather call it something like
plpy.execute_cursor(...) or plpy.cursor_open(...) or plpy.create_cursor(...)?
Since we will be mostly stuck with the API once we release 9.2, this is worth
some opinions on.  I like cursor(), but if anyone disagrees, now is the time.

This patch does not provide a wrapper around SPI_cursor_move.  The patch
is useful without that, and I don't see anything that precludes someone else
from adding it later if they see a need.


Documentation Review
-

The patch includes documentation updates that describe the new feature.
The Database Access page doesn't provide an API-style list of database access
functions like the plperl
http://www.postgresql.org/docs/9.1/interactive/plperl-builtins.html page
does.  I think the organization of the perl page is
clearer than the python one and we should think about doing some
documentation refactoring.  That should be done as a separate patch and
shouldn't be a barrier to committing this one.




Code Review


in PLy_cursor_plan line 4080

+	PG_TRY();
+	{
+		Portal		portal;
+		char	   *volatile nulls;
+		volatile int j;
+
+		if (nargs > 0)
+			nulls = palloc(nargs * sizeof(char));
+		else
+			nulls = NULL;
+
+		for (j = 0; j < nargs; j++)
+		{
+			PyObject   *elem;

I am probably not seeing a code path, or I am misunderstanding something
about the setjmp/longjmp usage, but I don't see why nulls and j need to be
volatile here.

line 444

 PLy_cursor(PyObject *self, PyObject *args)
+{
+	char	   *query;
+	PyObject   *plan;
+	PyObject   *planargs = NULL;
+
+	if (PyArg_ParseTuple(args, "s", &query))
+		return PLy_cursor_query(query);
+

Should query be freed with PyMem_free()?



Testing
---

I tested both python 2.6 and 3 on a Linux system



create or replace function x() returns text as $$
cur = None
try:
    with plpy.subtransaction():
        cur = plpy.cursor('select generate_series(1,1000)')
        rows = cur.fetch(10)
        plpy.execute('select f()')
except plpy.SPIError:
    rows = cur.fetch(10)
    return rows[0]['generate_series']
return 'none'
$$ language plpythonu;
select x();

crashes the backend
test=# select x();
The connection to the server was lost. Attempting reset: LOG:  server 
process (PID 3166) was terminated by signal 11: Segmentation fault


The test below gives me a strange error message:

create or replace function x1() returns text as $$
plan = None
try:
    with plpy.subtransaction():
        plpy.execute('CREATE TEMP TABLE z AS select generate_series(1,1000)')
        plan = plpy.prepare('select * FROM z')
        plpy.execute('select * FROM does_not_exist')
except plpy.SPIError, e:
    cur = plpy.cursor(plan)
    rows = cur.fetch(10)
    return rows[0]['generate_series']
return '1'
$$ language plpythonu;
select x1();

test=# select x1()
test-# ;
ERROR:  TypeError: Expected sequence of 82187072 arguments, got 0: NULL
CONTEXT:  Traceback (most recent call last):
  PL/Python function x1, line 9, in module
cur=plpy.cursor(plan)
PL/Python function x1
STATEMENT:  select x1()


I was expecting an error from the function, just a bit more useful one.


Performance
---
I did not do any specific performance testing, but I don't see this patch 
as having any impact on performance.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpython SPI cursors

2011-11-20 Thread Jan Urbański

On 20/11/11 19:14, Steve Singer wrote:

On 11-10-15 07:28 PM, Jan Urbański wrote:

Hi,

attached is a patch implementing the usage of SPI cursors in PL/Python.



I found a few bugs (see my testing section below) that will need fixing,
plus a few questions about the code.


Hi Steve,

thanks a lot for the review, I'll investigate the errors you were 
getting and post a follow-up.


Good catch on trying cursors with explicit subtransactions; I didn't 
think about how they would interact.


Cheers,
Jan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] testing ProcArrayLock patches

2011-11-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I was actually thinking it would be interesting to oprofile the
 read-only test; see if we can figure out where those slowdowns are
 coming from.
 
CPU: Intel Core/i7, speed 2262 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with
a unit mask of 0x00 (No unit mask) count 10
samples   %        image name  symbol name
3124242   5.7137   postgres    s_lock
254       4.6737   postgres    AllocSetAlloc
2403412   4.3954   postgres    GetSnapshotData
1967132   3.5975   postgres    SearchCatCache
1872176   3.4239   postgres    base_yyparse
1327256   2.4273   postgres    hash_search_with_hash_value
1040131   1.9022   postgres    _bt_compare
1038976   1.9001   postgres    LWLockAcquire
817122    1.4944   postgres    MemoryContextAllocZeroAligned
738321    1.3503   postgres    core_yylex
622613    1.1386   postgres    MemoryContextAlloc
597054    1.0919   postgres    PinBuffer
556138    1.0171   postgres    ScanKeywordLookup
552318    1.0101   postgres    expression_tree_walker
494279    0.9039   postgres    LWLockRelease
488628    0.8936   postgres    hash_any
472906    0.8649   postgres    nocachegetattr
396482    0.7251   postgres    grouping_planner
382974    0.7004   postgres    LockAcquireExtended
375186    0.6861   postgres    AllocSetFree
375072    0.6859   postgres    ProcArrayLockRelease
373668    0.6834   postgres    new_list
365917    0.6692   postgres    fmgr_info_cxt_security
301398    0.5512   postgres    ProcArrayLockAcquire
300647    0.5498   postgres    LockReleaseAll
292073    0.5341   postgres    DirectFunctionCall1Coll
285745    0.5226   postgres    MemoryContextAllocZero
284684    0.5206   postgres    FunctionCall2Coll
282701    0.5170   postgres    SearchSysCache
 
max_connections = 100
max_pred_locks_per_transaction = 64
shared_buffers = 8GB
maintenance_work_mem = 1GB
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.05
effective_cache_size = 40GB
default_transaction_isolation = '$iso'
 
pgbench -i -s 100
pgbench -S -M simple -T 300 -c 80 -j 80
 
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 80
number of threads: 80
duration: 300 s
number of transactions actually processed: 104391011
tps = 347964.636256 (including connections establishing)
tps = 347976.389034 (excluding connections establishing)
 
vmstat 1 showed differently this time -- no clue why.
 
procs --memory- ---swap-- -io ---system -cpu--
 r  b   swpd  free  buff  cache  si  so  bi  bo  in  cs us sy id wa st
91  0   8196 4189436 203925700 5231449200 0 0 32255 1522807 85 13  1  0  0
92  0   8196 4189404 203925700 5231449200 0 0 32796 1525463 85 14  1  0  0
67  0   8196 4189404 203925700 5231448800 0 0 32343 1527988 85 13  1  0  0
93  0   8196 4189404 203925700 5231448800 0 0 32701 1535827 85 13  1  0  0
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] FOSDEM 2012 - PostgreSQL Devroom: Call for Speakers

2011-11-20 Thread Andreas 'ads' Scherbaum


Hi all,

FOSDEM 2012 - PostgreSQL Devroom: Call for Speakers


The PostgreSQL project will have a Devroom at FOSDEM 2012, which takes 
place on February 4-5 in Brussels, Belgium. The Devroom will mainly 
cover topics for PostgreSQL users, developers and contributors. For more 
information about the event itself, please see the website at 
http://www.fosdem.org/2012/ .



We are now accepting proposals for talks. Please note that we only 
accept talks in English.


Each session will last 45 minutes (including discussion), and may be on 
any topic related to PostgreSQL. Suggested topic areas include:


* Developing applications for PostgreSQL
* Administering large scale PostgreSQL installations
* Case studies and/or success stories of PostgreSQL deployments
* PostgreSQL tools and utilities
* PostgreSQL hacking
* Community & user groups
* Tuning the server
* Migrating from other systems
* Scaling/replication
* Benchmarking & hardware
* PostgreSQL related products

Of course, we're happy to receive proposals for talks on other 
PostgreSQL related topics as well.



Please use our conference website to submit your proposal:

https://www.postgresql.eu/events/callforpapers/fosdem2012/


The deadline for submissions is December 20th, 2011. The schedule will 
be published and speakers will be informed by the end of the year.



Please also note my email about hotel reservation:

http://archives.postgresql.org/pgeu-general/2011-11/msg0.php


--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inlining comparators as a performance optimisation

2011-11-20 Thread Peter Geoghegan
Part of my problem with not producing specialisations, which I really
neglected to complain about until now, is the inconsistency between
types, and the need to deal with that.

We benefit from assuming in our specialisation that we're only dealing
with POD types, simply not considering things that we would otherwise
have had to for the benefit of types that have a Datum representation
that is pass-by-reference, or have collations, or have multiple
scankeys. Leaving aside the question of straight compiler-optimisation
benefits for the moment, the remaining benefit of what I've done comes
not so much from avoiding the usual function call machinery per se, as
from doing so *as well as* cutting down on what currently happens in
comparetup_heap to handle every single compound and scalar type.
Compare the function comparetup_heap with my meta-function
TYPE##AppSort to see what I mean. The function comparetup_heap is the
comparator directly used by qsort_arg when sorting heap tuples, and
qsort_arg outsources to comparetup_heap some things that you might not
expect it to (very little has changed about qsort_arg since we lifted
it from NetBSD back in 2006). So while you might imagine that that
loop in comparetup_heap and things like its use of the heap_getattr
macros won't expand to that many instructions, you really don't want
them in the inner loop of a long operation like qsorting, paid for up
to O(n ^ 2) times. There's also the obvious implications for compiler
optimisations, particularly relating to effective usage of CPU cache.

I'm trying to get you what you asked for: A straight choice between
what Tom suggested and what I suggested, with perhaps some compromises
between the two positions. That's sort of tricky though, especially
considering the issues raised above.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Singleton range constructors versus functional coercion notation

2011-11-20 Thread Jeff Davis
On Sat, 2011-11-19 at 15:57 -0500, Tom Lane wrote:
  I'm hesitant to remove them because the alternative is significantly
  more verbose:
numrange(1.0, 1.0, '[]');
 
 Right.  The question is, does the case occur in practice often enough
 to justify a shorter notation?  I'm not sure.

Well, if there were a good shorter notation, then probably so. But it
doesn't look like we have a good idea, so I'm fine with dropping it.
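
For context, the notation in question (current behaviour; the bounds
argument defaults to '[)'):

SELECT numrange(1.0, 1.0, '[]');   -- singleton range containing only 1.0
SELECT numrange(1.0, 1.0);         -- '[)' makes this the empty range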

 One thing I've been thinking for a bit is that for discrete ranges,
 I find the '[)' canonical form to be surprising/confusing.  If we
 were to fix range_adjacent along the lines suggested by Florian,
 would it be practical to go over to '[]' as the canonical form?
 One good thing about that approach is that canonicalization wouldn't
 have to involve generating values that might overflow.

I think we had that discussion before, and Florian brought up some good
points (both then and in a reply now). Also, Robert wasn't convinced
that '[]' is necessarily better for discrete ranges:

http://archives.postgresql.org/pgsql-hackers/2011-10/msg00598.php

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-11-20 Thread Etsuro Fujita
Hi Hanada-san,

Thank you for your valuable comments.  I will improve the items pointed
out by you in the next version of the patch, including documentation on
the purpose of AnalyzeForeignTable, how to write it, and so on.  Here I
comment on only one point:

 - Why does file_fdw skip sample tuples which have NULL values?  AFAIS
 the original acquire_sample_rows doesn't do so.

To be precise, I've implemented it to skip tuples that have null values
in columns that are not allowed to contain null values due to a NOT NULL
constraint.  file_fdw's sample_row_acquirer considers those tuples to be
dead tuples.  This is for consistency with the NOT NULL constraint.
(But I don't know why the fileIterateForeignScan routine allows such
dead tuples.  I may have missed something.)

Best regards,
Etsuro Fujita

(2011/11/18 21:00), Shigeru Hanada wrote:
 (2011/11/18 16:25), Etsuro Fujita wrote:
 Thank you for your testing.  I updated the patch according to your
 comments.  Attached is the updated version of the patch.
 
 I'd like to share result of my review even though it's not fully
 finished.  So far I looked from viewpoint of API design, code
 formatting, and documentation.  I'll examine effectiveness of the patch
 and details of implementation next week, and hopefully try writing
 ANALYZE handler for pgsql_fdw :)
 
 New patch has correct format, and it could be applied to HEAD of master
 branch cleanly.  All regression tests including those of contrib modules
 have passed.  It contains changes of codes and regression tests related
 to the issue, and they have enough comments.  IMO the document in this
 patch is not enough to show how to write analyze handler to FDW authors,
 but it can be enhanced afterward.  With this patch, FDW author can
 provide optional ANALYZE handler which updates statistics of foreign
 tables.  Planner would be able to generate better plan by using statistics.
 
 Yes.  But in the updated version, I've refactored analyze.c a little bit
 to allow FDW authors to simply call do_analyze_rel().
 snip
 The updated version enables FDW authors to just write their own
 acquire_sample_rows().  On the other hand, by retaining to hook
 AnalyzeForeignTable routine at analyze_rel(), higher level than
 acquire_sample_rows() as before, it allows FDW authors to write
 AnalyzeForeignTable routine for foreign tables on a remote server to ask
 the server for its current stats instead, as pointed out earlier by Tom
 Lane.
 
 IIUC, this patch offers three options to FDWs: a) set
 AnalyzeForeignTable to NULL to indicate lack of capability, b) provide
 AnalyzeForeignTable which calls do_analyze_rel with custom
 sample_row_acquirer, and c) create statistics data from scratch by FDW
 itself by doing similar things to do_analyze_rel with given argument or
 copying statistics from remote PostgreSQL server.
 
 ISTM that this design is well-balanced between simplicity and
 flexibility.  Maybe these three options would suit web-based wrappers,
 file-based or RDBMS wrappers, and pgsql_fdw respectively.  I think that
 adding more details of FdwRoutine, such as purpose of new callback
 function and difference from required ones, would help FDW authors,
 including me :)
 
 I have some random comments:
 
 - I think separated typedef of sample_acquire_rows would make codes more
 readable.  In addition, parameters of the function should be described
 explicitly.
 - I couldn't see the reason why file_fdw sets ctid of sample tuples,
 though I guess it's for Vitter's random sampling algorithm.  If every
 FDW must set valid ctid to sample tuples, it should be mentioned in
 document of AnalyzeForeignTable.  Exporting some functions from
 analyze.c relates this issue?
 - Why does file_fdw skip sample tuples which have NULL values?  AFAIS
 the original acquire_sample_rows doesn't do so.
 - Some comment lines go past 80 columns.
 - Some headers included in file_fdw.c seems unnecessary.
 
 Regards,


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Collecting statistics on CSV file data

2011-11-20 Thread Etsuro Fujita

(2011/11/19 0:54), Robert Haas wrote:

2011/11/18 Shigeru Hanada shigeru.han...@gmail.com:

- I couldn't see the reason why file_fdw sets ctid of sample tuples,
though I guess it's for Vitter's random sampling algorithm.  If every
FDW must set valid ctid to sample tuples, it should be mentioned in
document of AnalyzeForeignTable.  Exporting some functions from
analyze.c relates this issue?


If every FDW must set valid ctid to sample tuples, it should be fixed
so that they don't have to, I would think.


It's for neither Vitter's algorithm nor exporting functions from 
analyze.c.  It's for the foreign index scan on CSV file data that I plan 
to propose in the next CF.  So it is meaningless for now.  I'm sorry. 
I will fix it in the next version of the patch so that they don't have to.


Best regards,
Etsuro Fujita

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers