[HACKERS] Usage of planner_ctx

2012-04-24 Thread Amit Kapila
PlannerInfo->planner_ctx is assigned either MessageContext or
PortalHeapMemory, depending on the path from which the planner is invoked.

My understanding is that during the whole planning process, until the plan
is stored in PlannedStmt, all memory is allocated from CurrentMemoryContext,
which is the same as PlannerInfo->planner_ctx.

 

So what is the use of having PlannerInfo->planner_ctx when it only holds
CurrentMemoryContext?

 



Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

2012-04-24 Thread Kyotaro HORIGUCHI
Hello,

   - xlog.c: Make StandbyMode shared.
 
   - checkpointer.c: Use IsStandbyMode() to check if postmaster is
    under standby mode.
 
 IsStandbyMode() looks overkill to me. The standby mode flag is forcibly
 turned off at the end of recovery, but that change doesn't need to be shared
 with the checkpointer process; IOW, the shared flag doesn't need to change
 after startup, like XLogCtl->archiveCleanupCommand, I think. So we can
 simplify the code that shares the flag with the checkpointer. See the
 attached patch (though not tested yet).

Hmm. My understanding is that the spinlock and the volatile pointer
used when reading shared memory are there to secure memory consistency
on SMPs with weak memory ordering, and to keep the compiler from
over-optimizing accesses through a non-volatile pointer, respectively.

You removed both of them in the patch.

If we are allowed to be tolerant of a temporary lack of coherence in
shared memory there, the spinlock could be removed. But the possibility
of reading garbage when using XLogCtl itself to access standbyMode does
not seem tolerable. What do you think about that?


 The comments in checkpointer.c seem to need more revision. For
 example,
 
 +  * XLogInsert that actually triggers a checkpoint when
 
 Currently a checkpoint is triggered by XLogWrite (not XLogInsert), so the
 above needs to be corrected.

I will be careful about such outdated descriptions.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 idea, tablesample

2012-04-24 Thread Sandro Santilli
On Mon, Apr 23, 2012 at 08:34:44PM +0300, Ants Aasma wrote:
 On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli s...@keybit.net wrote:
  I'd love to see enhanced CTID operators, to fetch all visible tuples in a 
  page
  using a tidscan.  Something like: WHERE ctid =~ '(501,*)' or a ctidrange.
 
 Among other things, this would enable user-space implementation of
 tablesample. Given the operator =~(tid, int) that matches the page
 number and planner/executor integration so that it results in a TID
 scan, you would need the following functions:
 
 random_pages(tbl regclass, samples int) returns int[]
 aggregate function:
 reservoir_sample(item anyelement, samples int) returns anyarray
 
 Implementations for both of the functions could be adapted from analyze.c.
 
 Then tablesample could be implemented with the following query:
 SELECT (SELECT reservoir_sample(some_table, 50) AS samples
FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
 FROM random_pages('some_table', 50) AS rnd_pgtids;
 
 Actually, now that I think about it, it could actually be implemented
 without any modifications to core at some cost to efficiency.
 random_pages would have to return tid[] that contains for each
 generated pagenumber all possible tids on that page.

This is exactly what I'm after.
I've actually started crafting such a TableSample function, and I'm in the
process of refining the signature, so your suggested interface above is
very useful, thanks!

But I don't understand the reservoir_sample call; what is it supposed to do?
And how flexible would the anyarray return be? Could you return arbitrarily
typed rowtypes from it?

 By making the building blocks available users get more flexibility.
 The downside would be that we can't automatically make better sampling
 methods available.

One approach doesn't preclude the other. TABLESAMPLE will still be useful,
also for SQL compliance.
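
To make the compliance point concrete, here is a sketch of the SQL-standard
syntax that TABLESAMPLE work would target (the spec's two sampling methods;
not supported by PostgreSQL at the time of this thread):

SELECT * FROM some_table TABLESAMPLE SYSTEM (1);     -- sample roughly 1% of pages
SELECT * FROM some_table TABLESAMPLE BERNOULLI (1);  -- sample roughly 1% of rows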

--strk; 

  ,--o-. 
  |   __/  |Delivering high quality PostGIS 2.0 !
  |  / 2.0 |http://strk.keybit.net - http://vizzuality.com
  `-o--'


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

2012-04-24 Thread Heikki Linnakangas

On 23.04.2012 02:59, Fujii Masao wrote:

On Thu, Apr 19, 2012 at 2:20 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp  wrote:

Hello, this is new version of standby checkpoint_segments patch.


Thanks for the patch!


This still makes catching up in standby mode slower, as you get many 
more restartpoints. The reason for ignoring checkpoint_segments during 
recovery was to avoid that. Maybe it's still better than what we have 
currently, I'm not sure, but at least it needs to be discussed. Would be 
good to do some performance testing of recovery with various 
checkpoint_segments and _timeout settings, with and without this patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DOMAINs and CASTs

2012-04-24 Thread Jaime Casanova
On Tue, Nov 29, 2011 at 10:12 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Nov 29, 2011 at 11:11 AM, Jaime Casanova ja...@2ndquadrant.com 
 wrote:
 On Tue, Nov 29, 2011 at 10:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, if we apply this, it has the possibility to break existing
 dumps.

 BTW, it occurs to me that we could dodge that objection, with much less
 work than Robert suggests, if we just made the message be a WARNING not
 an ERROR.  I think that'd do just as well in terms of what the message
 could usefully accomplish, ie, steer people away from doing things that
 won't work.  Still not sure that it's worth doing though,


 I'm fine with a WARNING

 Me too; I suggested it before (so did you).


are we going to put this warning in this release?

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-24 Thread Boszormenyi Zoltan

On 2012-04-23 15:08, Marc Cousin wrote:

On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:

On 2012-04-10 09:02, Boszormenyi Zoltan wrote:

On 2012-04-06 14:47, Cousin Marc wrote:

On 05/04/12 08:02, Boszormenyi Zoltan wrote:

On 2012-04-04 21:30, Alvaro Herrera wrote:

I think this patch is doing two things: first touching infrastructure
stuff and then adding lock_timeout on top of that.  Would it work to
split the patch in two pieces?


Sure. Attached is the split version.

Best regards,
Zoltán Böszörményi


Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this: TRAP: FailedAssertion(
"!(locallock->holdsStrongLockCount == 0)", File: "lock.c",
Line: 749).

It can also be done without a statement_timeout, with a control-C on the
second lock table.

I didn't touch anything but this. It occurs every time when asserts are
enabled.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since then? Or is the assert
still valid?

Cheers


Attached are the new patches. I rebased them to current GIT and
they are expected to be applied after Robert Haas' patch in the
"bug in fast-path locking" thread.

Now it survives the above scenario.

Best regards,
Zoltán Böszörményi

New patch attached, rebased to today's GIT.

Best regards,
Zoltán Böszörményi


Ok, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this
patch doesn't belong to the current commitfest, but to the next one.


It was added to 2012-Next when I posted it, 2012-01 was already
closed for new additions.


Is the patch in context diff format?
Yes

Does it apply cleanly to the current git master?
Yes

Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.

Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
Yes

Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc.
without worrying about bringing the application down because of lock
contention (see the usage sketch after this review).

Do we already have it?
No.

Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.

Does it include pg_dump support (if applicable)?
Not applicable

Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of functions managing all the SIGALRM code,
but it heavily touches this part.

Have all the bases been covered?
I tried all the SQL statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding a constraint, lock).

I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.

Rolling back to a savepoint while holding locks doesn't crash PostgreSQL anymore.

Other timeouts such as archive_timeout and checkpoint_timeout still
work.

Does the feature work as advertised?
Yes

Are there corner cases the author has failed to consider?
I didn't find any.

Are there any assertion failures or crashes?
No.

Does the patch slow down simple tests?
No

If it claims to improve performance, does it?
Not applicable

Does it slow down other things?
No

Does it follow the project coding guidelines?
I think so

Are there portability issues?
No: all the platform-dependent code (acquiring locks and manipulating
SIGALRM) is the same as before.

Will it work on Windows/BSD etc?
It should. I couldn't test it though.

Are the comments sufficient and accurate?
Yes

Does it do what it says, correctly?
Yes

Does it produce compiler warnings?
No

Can you make it crash?
Not anymore

Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way of handling sigalarm seems more robust to
me.

Are there interdependencies that can cause problems?
I don't see any.

Regards,

Marc
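
As promised above, a minimal usage sketch of the GUC under review, assuming
the patch's semantics of aborting a statement once its lock wait exceeds the
timeout (the unit and the exact error behavior here are assumptions, not
taken from the patch):

SET lock_timeout = 2000;      -- assumed unit: milliseconds
BEGIN;
LOCK TABLE pgbench_accounts;  -- fails with an error if the lock cannot
                              -- be acquired within 2 seconds
ROLLBACK;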


Thanks for the review.

Best regards,
Zoltán 

Re: [HACKERS] B-tree page deletion boundary cases

2012-04-24 Thread Nikhil Sontakke
  Was wondering if there's a similar bug which gets triggered while using
  VACUUM FULL. See for instance this thread:
 
 
 http://postgresql.1045698.n5.nabble.com/index-corruption-in-PG-8-3-13-td4257589.html
 
  This issue has been reported on and off from time to time, and in most
  cases VACUUM or VACUUM FULL appears to be involved. We have usually
  attributed it to hardware issues, and reindex has been recommended by
  default as a solution/workaround.

 I do not perceive much similarity.  The bug I've raised can produce wrong
 query results transiently.  It might permit injecting a tuple into the
 wrong spot in the tree, yielding persistent wrong results.  It would not
 introduce tree-structural anomalies like sibling pointers directed at
 zeroed pages or internal pages in a 1-level tree.  Given the symptoms you
 reported, I share Robert's suspicion of WAL replay in your scenario.


Thanks for taking the time to analyze this Noah.

Regards,
Nikhils


Re: [HACKERS] Gsoc2012 idea, tablesample

2012-04-24 Thread Sandro Santilli
On Tue, Apr 24, 2012 at 08:49:26AM +0200, Sandro Santilli wrote:
 On Mon, Apr 23, 2012 at 08:34:44PM +0300, Ants Aasma wrote:

  SELECT (SELECT reservoir_sample(some_table, 50) AS samples
 FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
  FROM random_pages('some_table', 50) AS rnd_pgtids;
 
 But I don't understand the reservoir_sample call, what is it supposed to do?

Ok got it, that was probably to avoid:

 ERROR:  more than one row returned by a subquery used as an expression

But this also works nicely:

 SELECT * FROM lots_of_points
 WHERE ctid = ANY ( ARRAY[(SELECT random_tids('lots_of_points', 10))] );

and still uses tidscan.

The advanced TID operator would be for random_tids to only return pages rather
than full tids...
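
For illustration, a minimal sketch of such a random_tids helper (a
hypothetical function, assuming 8 kB pages; emitting nonexistent TIDs is
harmless because they simply match nothing):

CREATE OR REPLACE FUNCTION random_tids(tbl regclass, npages int)
RETURNS SETOF tid LANGUAGE plpgsql AS $$
DECLARE
  relpages bigint := pg_relation_size(tbl) / 8192;  -- assumes 8 kB blocks
  p        bigint;
  result   tid;
BEGIN
  FOR i IN 1 .. npages LOOP
    p := floor(random() * relpages);      -- random page number, 0-based
    FOR t IN 1 .. 291 LOOP                -- 291 = max heap tuples per 8 kB page
      result := format('(%s,%s)', p, t);  -- plpgsql coerces the text to tid
      RETURN NEXT result;
    END LOOP;
  END LOOP;
END;
$$;

SELECT * FROM lots_of_points
 WHERE ctid = ANY (ARRAY(SELECT random_tids('lots_of_points', 10)));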

--strk; 

  ,--o-. 
  |   __/  |Delivering high quality PostGIS 2.0 !
  |  / 2.0 |http://strk.keybit.net - http://vizzuality.com
  `-o--'


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PL/PGSQL bug in handling composite types

2012-04-24 Thread Boszormenyi Zoltan

Hi,

we have found a way to make pl/pgsql throw an error for
a legitimate use case that works in plain SQL.

Minimal test case:

create table x1 (id serial primary key, d timestamptz);
create table x2 (id serial primary key, d timestamptz);
insert into x2 (d) values ('now');
create type mytype as (id bigint, d timestamptz);

Casting a set of values to mytype works in SQL:

=# select (max(id),min(d))::mytype from x2;
                 row
--------------------------------------
 (1,"2012-04-24 09:04:07.475315+02")
(1 row)

=# select (max(id),min(d))::mytype from x1;
 row 
-----
 (,)
(1 row)

Even empty data in plain text cast to mytype works,
and individual fields can be tested for IS NULL:

=# select '(,)'::mytype;
 mytype 
--------
 (,)
(1 row)

=# select ('(,)'::mytype).id is null;
 ?column? 
----------
 t
(1 row)

Now, try this from PL/PGSQL:

create or replace function testfunc1() returns mytype as $$declare
  v_sql text;
  mt mytype;
begin
  v_sql := 'select (max(id),min(d))::mytype from x1';
  execute v_sql into mt;
  return mt;
end;$$ language plpgsql;

=# select testfunc1();
ERROR:  invalid input syntax for integer: "(,)"
CONTEXT:  PL/pgSQL function testfunc1 line 6 at EXECUTE statement

The same error happens with table x2 with data in it:

create or replace function testfunc2() returns mytype as $$declare
  v_sql text;
  mt mytype;
begin
  v_sql := 'select (max(id),min(d))::mytype from x2';
  execute v_sql into mt;
  return mt;
end;$$ language plpgsql;

=# select testfunc2();
ERROR:  invalid input syntax for integer: "(1,"2012-04-24 09:04:07.475315+02")"
CONTEXT:  PL/pgSQL function testfunc2 line 6 at EXECUTE statement

The same also happens with non-dynamic queries:

create or replace function testfunc1a() returns mytype as $$declare
  mt mytype;
begin
  select (max(id),min(d))::mytype into mt from x1;
  return mt;
end;$$ language plpgsql;

=# select testfunc1a();
ERROR:  invalid input syntax for integer: "(,)"
CONTEXT:  PL/pgSQL function testfunc1a line 4 at SQL statement

create or replace function testfunc2a() returns mytype as $$declare
  mt mytype;
begin
  select (max(id),min(d))::mytype into mt from x2;
  return mt;
end;$$ language plpgsql;

=# select testfunc2a();
ERROR:  invalid input syntax for integer: "(1,"2012-04-24 09:04:07.475315+02")"
CONTEXT:  PL/pgSQL function testfunc2a line 4 at SQL statement
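
For what it's worth, selecting the aggregate columns separately, so that
PL/pgSQL assigns them field by field instead of round-tripping a composite
value through its text form, sidesteps the error (a sketch against the
tables above; testfunc1b is a made-up name):

create or replace function testfunc1b() returns mytype as $$declare
  mt mytype;
begin
  -- assign into the row variable's fields directly, avoiding the
  -- composite-to-text-to-field conversion that fails above
  select max(id), min(d) into mt.id, mt.d from x1;
  return mt;
end;$$ language plpgsql;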

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/PGSQL bug in handling composite types

2012-04-24 Thread Boszormenyi Zoltan

On 2012-04-24 09:59, Boszormenyi Zoltan wrote:

we have found a way to make pl/pgsql throw an error for
a legitimate use case that works in plain SQL.

[...]



Sorry, I didn't mention the version tested: 9.0.6 and 9.1.3.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/PGSQL bug in handling composite types

2012-04-24 Thread Boszormenyi Zoltan

On 2012-04-24 10:01, Boszormenyi Zoltan wrote:

On 2012-04-24 09:59, Boszormenyi Zoltan wrote:

we have found a way to make pl/pgsql throw an error for
a legitimate use case that works in plain SQL.

[...]

Sorry, I didn't mention the version tested: 9.0.6 and 9.1.3.


The same also happens with current 9.2 GIT.



Best regards,
Zoltán Böszörményi




--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree page deletion boundary cases

2012-04-24 Thread Simon Riggs
On Sat, Apr 21, 2012 at 5:52 PM, Noah Misch n...@leadboat.com wrote:

 As I mentioned[1] peripherally back in November, that algorithm has been
 insufficient since the introduction of non-XID-bearing transactions in
 PostgreSQL 8.3.  Such transactions do not restrain RecentXmin.  If no running
 transaction has an XID, RecentXmin == ReadNewTransactionId() and the page
 incorrectly becomes available for immediate reuse.

Good observation.

 The fix is to compare the stored XID to RecentGlobalXmin, not RecentXmin.  We
 already use RecentGlobalXmin when wal_level = hot_standby.  If no running
 transaction has an XID and all running transactions began since the last
 transaction that did bear an XID, RecentGlobalXmin == ReadNewTransactionId().
 Therefore, the correct test is btpo.xact < RecentGlobalXmin, not
 btpo.xact <= RecentGlobalXmin as we have today.  This also cleanly removes
 the need for the bit of code in _bt_getbuf() that decrements btpo.xact
 before sending it down for ResolveRecoveryConflictWithSnapshot().  I
 suggested[2] that decrement on an unprincipled basis; it was just masking
 the off-by-one of using <= RecentGlobalXmin instead of < RecentGlobalXmin
 in _bt_page_recyclable().

Looks like the right fix. I'll apply this to 9.0/9.1/HEAD.

 This change makes empty B-tree pages wait through two generations of running
 transactions before reuse, so some additional bloat will arise.

Could arise, in some circumstances. But that assumes VACUUMs are
fairly frequent and that they would be delayed/rendered less effective
by this, which I don't think will be the case.

I note that we don't take any account of the number of pages that may
be reused when we VACUUM, so when HOT avoids a VACUUM we may
accumulate pages for a longer period. Looks like there is more work to
be done yet in cleaning indexes.
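
For anyone who wants to observe this on a live index, a small inspection
sketch using contrib/pageinspect (the index name and block number are
placeholders):

CREATE EXTENSION pageinspect;
-- for a deleted page, btpo holds the stored XID discussed above, which
-- must now fall behind RecentGlobalXmin before the page can be recycled
SELECT btpo_flags, btpo FROM bt_page_stats('some_index', 5);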

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG FETCH readahead

2012-04-24 Thread Michael Meskes
 OK, I will implement #2. Another question popped up: what to do
 with FETCH ALL? Either the current readahead window size or temporarily
 bumping it to, say, some tens of thousands could be used. We may not
 know in advance how many records "all" is. The former, although it
 lowers performance, saves memory.

I would say doing a large fetch in two or three batches won't cost too much in
terms of performance.

 Please, don't apply this patch yet. I discovered a rather big hole
 that can confuse the cursor position tracking if you do this:
 ...
 That will also need a new round of review. Sorry for that.

No problem, better to find it now instead of after release. 

Anyway, I moved the patch to 2012-next (I hope I did it correctly) so 2012-1
can be closed. Let's try to get this patch done in the next commit fest.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at googlemail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 idea, tablesample

2012-04-24 Thread Ants Aasma
On Tue, Apr 24, 2012 at 10:31 AM, Sandro Santilli s...@keybit.net wrote:
 On Tue, Apr 24, 2012 at 08:49:26AM +0200, Sandro Santilli wrote:
 On Mon, Apr 23, 2012 at 08:34:44PM +0300, Ants Aasma wrote:

  SELECT (SELECT reservoir_sample(some_table, 50) AS samples
     FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
  FROM random_pages('some_table', 50) AS rnd_pgtids;

  But I don't understand the reservoir_sample call, what is it supposed to do?

 Ok got it, that was probably to avoid:

  ERROR:  more than one row returned by a subquery used as an expression

No, it's to avoid bias towards tuples on more sparsely populated
pages. See http://en.wikipedia.org/wiki/Reservoir_sampling or
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=ff271644e0f93ee99bfe9c1f536f3dd48455d8d2;hb=HEAD#l1027
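
For readers unfamiliar with the technique, a minimal PL/pgSQL sketch of the
classic reservoir algorithm (reservoir_sample_tids is a made-up name, and
this version scans the whole table, which is precisely the cost that
page-level sampling is meant to avoid):

CREATE OR REPLACE FUNCTION reservoir_sample_tids(tbl regclass, k int)
RETURNS tid[] LANGUAGE plpgsql AS $$
DECLARE
  r    record;
  pool tid[] := '{}';
  n    bigint := 0;
  j    int;
BEGIN
  FOR r IN EXECUTE format('SELECT ctid FROM %s', tbl) LOOP
    n := n + 1;
    IF n <= k THEN
      pool := pool || r.ctid;             -- fill the reservoir first
    ELSE
      j := 1 + floor(random() * n)::int;  -- uniform in 1..n
      IF j <= k THEN
        pool[j] := r.ctid;                -- replace with probability k/n
      END IF;
    END IF;
  END LOOP;
  RETURN pool;
END;
$$;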

 The advanced TID operator would be for random_tids to only return pages rather
 than full tids...

Exactly. But when mainly IO-bound (i.e. sampling from a large table on
spinning rust), the overhead of probing with a TID scan, as opposed to
sequentially scanning the pages, should be small enough. When CPU-bound,
I suspect that the function call machinery overhead for
reservoir_sample is going to become a large issue, so a built-in
tablesample also has an edge there.


Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Foreign table scan estimates

2012-04-24 Thread Albe Laurenz
Tom Lane wrote:
 While playing around with ANALYZE on foreign tables, I noticed
 that the row count estimate for foreign scans is still
 initialized to 1000 even if there are statistics for the
 foreign table.  I think that this should be improved.

 The attached patch illustrates my suggestion.

 I don't think this is appropriate; it will just waste cycles because
 the FDW will have to repeat the calculations after obtaining a real
 estimate of the foreign table size.  If we trusted pg_class.reltuples
 to be up to date, there might be some value in this.  But we don't
 trust that for regular tables (cf. plancat.c), and I don't see why
 we would do so for foreign tables.
 
 I think on the whole it's better to abdicate responsibility here and
 require the FDW to do something in its GetForeignRelSize function.
 It's not like we'd be saving the FDW a lot of code in the (unlikely)
 case that this is exactly what it would do anyway.

I agree.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree page deletion boundary cases

2012-04-24 Thread Noah Misch
On Tue, Apr 24, 2012 at 09:08:36AM +0100, Simon Riggs wrote:
 On Sat, Apr 21, 2012 at 5:52 PM, Noah Misch n...@leadboat.com wrote:
  The fix is to compare the stored XID to RecentGlobalXmin, not RecentXmin.
  We already use RecentGlobalXmin when wal_level = hot_standby.  If no running
  transaction has an XID and all running transactions began since the last
  transaction that did bear an XID, RecentGlobalXmin == ReadNewTransactionId().
  Therefore, the correct test is btpo.xact < RecentGlobalXmin, not
  btpo.xact <= RecentGlobalXmin as we have today.  This also cleanly removes
  the need for the bit of code in _bt_getbuf() that decrements btpo.xact
  before sending it down for ResolveRecoveryConflictWithSnapshot().  I
  suggested[2] that decrement on an unprincipled basis; it was just masking
  the off-by-one of using <= RecentGlobalXmin instead of < RecentGlobalXmin
  in _bt_page_recyclable().
 
 Looks like the right fix. I'll apply this to 9.0/9.1/HEAD.

Thanks.  8.3 and 8.4 need it, too, albeit simplified due to some of the
affected code not existing yet.

  This change makes empty B-tree pages wait through two generations of running
  transactions before reuse, so some additional bloat will arise.
 
 Could arise, in some circumstances. But that assumes VACUUMs are
 fairly frequent and that they would be delayed/rendered less effective
 by this, which I don't think will be the case.
 
 I note that we don't take any account of the number of pages that may
 be reused when we VACUUM, so when HOT avoids a VACUUM we may
 accumulate pages for a longer period. Looks like there is more work to
 be done yet in cleaning indexes.

Agreed.  Taking one VACUUM visit to mark the page BTP_DELETED and another to
add it to the FSM is fairly pessimistic; perhaps we can do better.

nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] DOMAINs and CASTs

2012-04-24 Thread Robert Haas
On Tue, Apr 24, 2012 at 3:00 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 are we going to put this warning in this release?

Done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dump EXTENSION sequences too

2012-04-24 Thread Robert Haas
On Thu, Apr 19, 2012 at 6:42 AM, Gianni Ciolli
gianni.cio...@2ndquadrant.it wrote:
 currently an EXTENSION can mark some of its tables as configuration
 tables using pg_catalog.pg_extension_config_dump(), so that pg_dump
 does the right thing.

 I think it would be useful to mark sequences too, but unfortunately it
 is not possible; hence, each time a dump is reloaded, all the
 sequences in the extension are reset to 1, causing all the related
 problems.

 Moreover, the error message that we get if we try to mark a sequence
 does not mention the requirement that the relation is a table. The
 "OID %u does not refer to a table" error message seems to be wrongly
 restricted to the case when get_rel_name can't find a relation.

 Is there any objection to the above proposal? I did a little search of
 the archives, but I couldn't find any related discussions; I apologise
 if I missed something.

Seems like a good idea to me.
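
For reference, the existing mechanism that the proposal would extend to
sequences; in an extension script a configuration table is flagged like
this (the table name is a placeholder; the second argument is an optional
WHERE filter condition):

CREATE TABLE my_config (key text, value text);
SELECT pg_catalog.pg_extension_config_dump('my_config', '');
-- the proposal would allow the same call for a sequence, e.g.:
-- SELECT pg_catalog.pg_extension_config_dump('my_config_id_seq', '');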

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres install program files folder

2012-04-24 Thread Robert Haas
On Wed, Apr 18, 2012 at 12:00 PM, Flavio Vodzinski
developer.fla...@gmail.com wrote:
 Hello,
 Is there a problem with installing Postgres in the Program Files folder
 in a Windows environment? I ask because Windows applies locking
 restrictions to that folder.

This question is off-topic for pgsql-hackers, which is for discussion
of PostgreSQL-related development.  Please try pgsql-general.

I'm not sure whether you are saying that you think there might be a
problem, or that you are actually having a problem.  The installers do
work in general; if you have a specific problem, please see
http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Usage of planner_ctx

2012-04-24 Thread Tom Lane
Amit Kapila amit.kap...@huawei.com writes:
 So what is the use of having PlannerInfo->planner_ctx when it only holds
 CurrentMemoryContext?

It might be clearer if you read up on the memory management in GEQO
planning mode.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timsort performance, quicksort (was: Re: Memory usage during sorting)

2012-04-24 Thread Robert Haas
On Wed, Apr 18, 2012 at 9:31 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Thoughts?

Interesting work.  I thought about trying to code up timsort at one
point, but I've been running short of round tuits.

I did some quick tests of quicksort using half a million random
strings.  On my MacBook Pro, if the strings are in random order,
quicksort takes about 12 seconds.  If they are presorted, it takes
about 800 ms.  If they are in sorted order with an empty string
appended onto the end, it takes about 25 seconds.
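
A sketch of one way to reproduce that last data set (table and column names
are made up; this is not Robert's actual harness):

CREATE TABLE sort_test AS
    SELECT md5(g::text) AS s
    FROM generate_series(1, 500000) g
    ORDER BY 1;                       -- store the strings presorted
INSERT INTO sort_test VALUES ('');    -- append one empty string at the end
EXPLAIN ANALYZE SELECT s FROM sort_test ORDER BY s;  -- time the sort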

If I modify gen_qsort_tuple.pl to perform the presorted input check
only when n > 40, the time for the
presorted-except-the-last-element-is-small test drops from 25 seconds
to about 21.5 seconds, without apparently harming either of the other
two cases.  If I remove the check altogether, the time further drops
to about 13.5 seconds, the time to sort completely-ordered data rises
to about 10-10.5 seconds, and the time to sort randomly ordered data
still doesn't seem to change much.

Based on that, I'm inclined to propose rejiggering things so that the
presorted-input check runs only at the top level, and not during any
recursive steps.  The theory is that it won't cost much because if the
data is randomly ordered we won't do many comparisons before falling
out, and that seems to be true.  But the only point of doing it in the
recursive steps is to win when the data is partially ordered, and we
actually seem to be losing big-time in that case, perhaps because when
the data is partially ordered, the presort check will frequently
run through a significant part of the array - wasting cycles - but
fall out before it actually gets to the end.

Of course, even if we did that, it might not be as good as your
timsort numbers, but that doesn't mean we shouldn't do it...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New sync commit mode remote_write

2012-04-24 Thread Robert Haas
On Fri, Apr 20, 2012 at 3:58 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sat, Apr 21, 2012 at 12:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Apr 19, 2012 at 7:50 PM, Robert Haas robertmh...@gmail.com wrote:
 On 4/19/12, Jeff Janes jeff.ja...@gmail.com wrote:
 The work around would be for the master to refuse to automatically
 restart after a crash, insisting on a fail-over instead (or a manual
 forcing of recovery)?

 I suppose that would work, but I think Simon's idea is better: don't
 let the slave replay the WAL until either (a) it's promoted or (b) the
 master finishes the fsync.   That boils down to adding some more
 handshaking to the replication protocol, I think.

 It would be 8 bytes on every data message sent to the standby.

 There seems to be another problem to solve. In the current design of
 streaming replication, we cannot send any WAL records before writing them
 locally, which would mess up the mode that makes a transaction wait for the
 remote write but not the local one. We should change walsender so that it
 can send WAL records before they are written, e.g., send from wal_buffers?

In theory, writing WAL should be quick, since it's only going out to
the OS cache, and flushing it should be the slow part, since that
involves waiting for the actual disk write to complete.  Some
instrumentation I shoved in here reveals that there actually are some
cases where the write can take a long time, when Linux starts to get
worried about the amount of dirty data in cache and punishes anyone
who tries to write anything, but I'm not sure whether that's common
enough to warrant a change here.

One thing that does seem to be a problem is using WALWriteLock to
cover both the WAL write and the WAL flush.  Suppose that we're
writing WAL very quickly, so that wal_buffers fills up.  We can't
continue writing WAL until some of what's in the buffer has been
*written*, but the WAL writer process will grab WALWriteLock, write
*and flush* a chunk of WAL, and everybody who wants to insert WAL has
to wait for both the write and the flush.  It's probably possible to
do better, here.  Streaming directly from wal_buffers would allow sync
rep to dodge this problem altogether, but it's a general performance
problem as well so it would be nice to have a general solution that
would improve latency and throughput across the board, if such a
solution is possible.
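
For context, the mode under discussion is exposed as a new
synchronous_commit value; a minimal usage sketch:

SET synchronous_commit = remote_write;  -- wait for the standby to write
                                        -- (not flush) the commit record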

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New sync commit mode remote_write

2012-04-24 Thread Jeff Janes
On Thu, Apr 19, 2012 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
 On 4/19/12, Jeff Janes jeff.ja...@gmail.com wrote:
 The work around would be for the master to refuse to automatically
 restart after a crash, insisting on a fail-over instead (or a manual
 forcing of recovery)?

 I suppose that would work, but I think Simon's idea is better: don't
 let the slave replay the WAL until either (a) it's promoted or (b) the
 master finishes the fsync.   That boils down to adding some more
 handshaking to the replication protocol, I think.

Alternative c) is that the master automatically recovers from a crash,
but doesn't replay that particular wal record because it doesn't find
it on disk, so the slave has to be instructed to throw it away.  (Or
perhaps the slave could feed the wal back to the master, so the master
could replay it?)

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timsort performance, quicksort (was: Re: Memory usage during sorting)

2012-04-24 Thread Greg Stark
On Tue, Apr 24, 2012 at 4:17 PM, Robert Haas robertmh...@gmail.com wrote:
 Based on that, I'm inclined to propose rejiggering things so that the
 presorted-input check runs only at the top level, and not during any
 recursive steps.

Just a thought: what about running it only every nth step? Maybe
something like every 4th step.

But actually I'm confused; this seems misguided to me. Quicksort
isn't stable, so even if you have a partially sorted data set, the
recursive partitions are going to be at best partially sorted after a
pivot. I haven't walked through it, but I suspect even your
all-but-one-sorted data set does not find the data sorted in either
partition on the next iteration.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timsort performance, quicksort (was: Re: Memory usage during sorting)

2012-04-24 Thread Peter Geoghegan
On 24 April 2012 16:17, Robert Haas robertmh...@gmail.com wrote:
 If they are in sorted order with an empty string
 appended onto the end, it takes about 25 seconds.

That's exactly what I'd have expected, but was surprised not to have
found with my own test. Perhaps it was some kind of fluke (i.e. a
re-creatable one - I'm quite confident that my benchmark was not
methodologically flawed, at least in execution).

 Based on that, I'm inclined to propose rejiggering things so that the
 presorted-input check runs only at the top level, and not during any
 recursive steps.  The theory is that it won't cost much because if the
 data is randomly ordered we won't do many comparisons before falling
 out, and that seems to be true.  But the only point of doing it in the
 recursive steps is to win when the data is partially ordered, and we
 actually seem to be losing big-time in that case, perhaps because when
 the data is partially ordered, the presort check will frequently
 run through a significant part of the array - wasting cycles - but
 fall out before it actually gets to the end.

That makes sense to me, but obviously more data is needed here.

 Of course, even if we did that, it might not be as good as your
 timsort numbers, but that doesn't mean we shouldn't do it...

The frustrating thing about my timsort numbers, as I mentioned in
reply to Dimitri (he modified the subject a bit, so that might appear
to be a different thread to you), is that they appear to be almost
consistently winning when you consider the number of comparisons, but
in fact lose when you measure the duration of execution or TPS or
whatever. I expected a certain amount of this, but not enough to
entirely derail the case for replacing quicksort with timsort when
sorting a single key of text, which is obviously the really compelling
case for optimisation here. This situation is only going to be made
worse by the work you've done on SortSupport for text, which,
incidentally, I agree is worthwhile.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Welcome 2012 GSOC students

2012-04-24 Thread Josh Berkus
Hackers,

We've chosen the 5 GSOC projects for this year:

* JDBC Foreign Data Wrapper, by Atri, mentored by Merlin Moncure
* Document Collection Foreign Data Wrapper, by Zheng Yang (a returning
student), mentored by Satoshi Nagayasu
* Implementing TABLESAMPLE, by Qi, mentored by Stephen Frost
* Better Indexing for Ranges, by Alexander (returning), mentored by
Heikki Linnakangas
* xReader Streaming xLog Reader, by Aakash, mentored by Kevin Grittner

Congratulations to the selected students, and expect to see them here on
-hackers working on their projects.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

2012-04-24 Thread Fujii Masao
On Tue, Apr 24, 2012 at 3:53 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 23.04.2012 02:59, Fujii Masao wrote:

 On Thu, Apr 19, 2012 at 2:20 PM, Kyotaro HORIGUCHI
 horiguchi.kyot...@lab.ntt.co.jp  wrote:

 Hello, this is new version of standby checkpoint_segments patch.


 Thanks for the patch!


 This still makes catching up in standby mode slower, as you get many more
 restartpoints. The reason for ignoring checkpoint_segments during recovery
 was to avoid that. Maybe it's still better than what we have currently, I'm
 not sure, but at least it needs to be discussed.

I see your point. Agreed.

Another aspect of this problem is that if we ignore checkpoint_segments during
recovery, a restartpoint would take a long time, which prevents WAL files from
being removed from pg_xlog for a while. That might cause the disk to fill up
with WAL files. This trouble is unlikely to happen in 9.1 or before because the
archived WAL files are always restored with a temporary name. OTOH, in 9.2,
cascading replication patch changed the recovery logic so that the archived
WAL files are restored with the correct name, so the number of WAL files in
pg_xlog keeps increasing until a restartpoint removes them. The disk is more
likely to fill up, in 9.2.

To alleviate the above problem, at least we might have to change the recovery
logic so that the archived WAL files are restored with a temporary name,
if cascading replication is not enabled (i.e., !standby_mode || !hot_standby ||
max_wal_senders = 0). Or we might have to remove the restored WAL file
after replaying it and before opening the next one, without waiting for
a restartpoint to remove the restored WAL files. Thoughts?

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

2012-04-24 Thread Fujii Masao
On Tue, Apr 24, 2012 at 3:47 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Hello,

   - xlog.c: Make StandbyMode shared.
 
   - checkpointer.c: Use IsStandbyMode() to check if postmaster is
    under standby mode.

 IsStandbyMode() looks overkill to me. The standby mode flag is forcibly
 turned off at the end of recovery, but that change doesn't need to be shared
 with the checkpointer process; IOW, the shared flag doesn't need to change
 after startup, like XLogCtl->archiveCleanupCommand, I think. So we can
 simplify the code that shares the flag with the checkpointer. See the
 attached patch (though not tested yet).

 Hmm. My understanding is that the spinlock and the volatile pointer
 used when reading shared memory are there to secure memory consistency
 on SMPs with weak memory ordering, and to keep the compiler from
 over-optimizing accesses through a non-volatile pointer, respectively.

 You removed both of them in the patch.

 If we are allowed to be tolerant of a temporary lack of coherence in
 shared memory there, the spinlock could be removed. But the possibility
 of reading garbage when using XLogCtl itself to access standbyMode does
 not seem tolerable. What do you think about that?

I'm not sure if we really need to worry about that for such a shared variable
that doesn't change after it's been initialized at the start of recovery.
Anyway, if we really need to worry about that, we need to protect the
shared variable RecoveryTargetTLI and archiveCleanupCommand with
the spinlock because they are in the same situation as standbyMode.

Regards,

-- 
Fujii Masao

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Welcome 2012 GSOC students

2012-04-24 Thread Atri Sharma
On Tue, Apr 24, 2012 at 10:36 PM, Josh Berkus j...@agliodbs.com wrote:
 Hackers,

 We've chosen the 5 GSOC projects for this year:

 * JDBC Foreign Data Wrapper, by Atri, mentored by Merlin Moncure
 * Document Collection Foreign Data Wrapper, by Zheng Yang (a returning
 student), mentored by Satoshi Nagayasu
 * Implementing TABLESAMPLE, by Qi, mentored by Stephen Frost
 * Better Indexing for Ranges, by Alexander (returning), mentored by
 Heikki Linnakangas
 * xReader Streaming xLog Reader, by Aakash, mentored by Kevin Grittner

 Congratulations to the selected students, and expect to see them here on
 -hackers working on their projects.


Hi Josh,

Thanks a lot for welcoming me.

I would also like to thank the entire community and you for giving me
this wonderful opportunity. I shall definitely try my hardest to
achieve the goal and meet your expectations.

I shall be regularly posting updates on the progress of my project to
the -hackers list.

I will need the support of the entire community.

Thanks once again,

Atri


-- 
Regards,

Atri
l'apprenant

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] remove dead ports?

2012-04-24 Thread Peter Eisentraut
I propose that we remove support for the following OS ports from our
source tree.  They are totally dead, definitely don't work, and/or
probably no one remembers what they even were.  The code just bit rots
and is in the way of future improvements.

* Dead/remove:

dgux
nextstep
sunos4
svr4
ultrix4
univel

* Dubious, but keep for now:

bsdi
irix
osf
sco

* No concern:

aix
cygwin
darwin
freebsd
hpux
linux
netbsd
openbsd
solaris
unixware
win32


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] remove dead ports?

2012-04-24 Thread Stefan Kaltenbrunner
On 04/24/2012 08:29 PM, Peter Eisentraut wrote:
 I propose that we remove support for the following OS ports from our
 source tree.  They are totally dead, definitely don't work, and/or
 probably no one remembers what they even were.  The code just bit rots
 and is in the way of future improvements.
 
 * Dead/remove:
 
 dgux
 nextstep
 sunos4
 svr4
 ultrix4
 univel

+1

 
 * Dubious, but keep for now:
 
 bsdi
 irix
 osf
 sco

I'm pretty sure I have seen at least semi-recent reports of users on at
least irix and sco. As for the rest, I think Bruce recently stopped
using bsdi, so the only known user of that platform is gone...


Stefan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3: summary of corruption detection / checksums / CRCs discussion

2012-04-24 Thread Robert Haas
On Sat, Apr 21, 2012 at 5:40 PM, Jeff Davis pg...@j-davis.com wrote:
 * In addition to detecting random garbage, we also need to be able to
 detect zeroing of pages. Right now, a zero page is not considered
 corrupt, so that's a problem. We'll need to WAL table extension
 operations, and we'll need to mitigate the performance impact of doing
 so. I think we can do that by extending larger tables by many pages
 (say, 16 at a time) so we can amortize the cost of WAL and avoid
 contention.

I think that extending tables in larger chunks is probably a very good
idea for performance reasons completely apart from checksums.
However, I don't think that WAL-logging relation extension will allow
you to treat a zero page as an error condition unless you not only
WAL-log the operation but also *flush WAL* before performing the
actual table-extension operation.  Otherwise, we might crash after the
actual extension and before the WAL record hits the disk, and now
we're back to having a zero'd page in the file.  And the impact of
writing *and flushing* WAL for every extension seems likely to be more
than we're willing to pay.  If we extended 16 pages at a time, that
means waiting for 8 WAL fsyncs per MB of relation extension (a MB
being 128 8 kB pages).  On my
MacBook Pro, which is admittedly a pessimal case for fsyncs, that
would work out to an extra half second of elapsed time per MB written,
so pgbench -i -s 100 would probably take about an extra 640 seconds.
If that's a correct analysis, that sounds pretty bad, because right
now it's taking 143 seconds.

 * Should we try to use existing space in header? It's very appealing to
 be able to avoid the upgrade work by using existing space in the header.
 There was a surprising amount of discussion about which space to use:
 pd_pagesize_version or pd_tli. There was also some discussion of using a
 few bits from various places.

It still seems to me that pd_tli is the obvious choice, since there is
no code anywhere in the system that relies on that field having any
particular value, so we can pretty much whack it around at will
without breaking anything.  AFAICS, the only reason to bump the page
format is if we want a longer checksum - 32 bits, say, instead of 16.
But I am inclined to think that 16 ought to be enough to detect the
vast majority of cases of corruption.

 * Table-level, or system level? Table-level would be appealing if there
 turns out to be a significant performance impact. But there are
 challenges during recovery, because no relcache is available. It seems
 like a relatively minor problem, because pages would indicate whether
 they have a checksum or not, but there are some details to be worked
 out.

I think the root of the issue here is that it's hard to turn checksums
on and off *online*.  If checksums were an initdb-time option, the
design here would be pretty simple: store the flag in the control
file.  And it wouldn't even be hard to allow the flag to be flipped
while the cluster is shut down: just write a utility to checksum and
rewrite all the blocks, fsync everything, and then flip the flag in
the control file and fsync that; also, force the user to rebuild all
their standbys.  This might not be very convenient, but it would be
comparatively simple to implement.

However, I've been assuming that we do want to handle turning
checksums on and off without shutting down the cluster, and that
definitely makes things more complicated.  In theory it could work
more or less the same way as in the off-line case: you launch some
command or function that visits every data block in the cluster and
rewrites them all, xlogging as it goes.  However, I'm a little
concerned that it won't be very usable in that form for some of the
same reasons that the off-line solution might not be very usable: you
essentially force the user to do a very large data-rewriting operation
all at once.  We already know that kicking off a big vacuum
in the background can impact the performance of the whole cluster, and
that's just one table; this would be the whole cluster all at once.

Moreover, if it's based on the existing vacuum code, which seems
generally desirable, it seems like it's going to have to be something
you run in one database at a time until you've hit them all.  In that
case, you're going to need some way to indicate which tables or
databases have been processed already and which haven't yet ... and if
you have that, then it's not a very big step to think that maybe we
ought to just control it on a per-table or per-database level to begin
with, since that would also have some of the collateral benefits you
mention.  The alternative is to have a new command, or a new form of
vacuum, that processes every table in every database regardless of
which DB you're connected to at the moment.  That's a lot of new,
fairly special-purpose code and it doesn't seem very convenient from a
management perspective, but it is probably simpler otherwise.

I'm not sure what the right decision is here.  I 

Re: [HACKERS] 9.3: summary of corruption detection / checksums / CRCs discussion

2012-04-24 Thread Robert Haas
On Sat, Apr 21, 2012 at 7:08 PM, Greg Stark st...@mit.edu wrote:
 The earlier consensus was to move all the hint bits to a dedicated
 area and exclude them from the checksum. I think double-write buffers
 seem to have become more fashionable but a summary that doesn't
 describe the former is definitely incomplete.

I don't think we ever had any consensus that moving the hint bits
around was a good idea.  For one thing, they are only hints in one
direction.  It's OK to clear them by accident, but it's not OK to set
them by accident.  For two things, it's not exactly clear how we'd
rearrange the page to make this work at all: where are those hint bits
gonna go, if not in the tuple headers?  For three things, index pages
have hint-type changes that are not single-bit changes.

 That link points to the MVCC-safe truncate patch. I don't follow how
 optimizations in bulk loads are relevant to wal logging hint bit
 updates.

That patch actually has more than one optimization in it, I think, but
the basic idea is that if we could figure out a way to set
HEAP_XMIN_COMMITTED when loading data into a table created or
truncated within the same transaction, the need to set hint bits on
first scan of the table would be eliminated.  Writing the xmin as
FrozenTransactionId would save even more, though it introduces some
additional complexity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] remove dead ports?

2012-04-24 Thread Robert Haas
On Tue, Apr 24, 2012 at 2:29 PM, Peter Eisentraut pete...@gmx.net wrote:
 I propose that we remove support for the following OS ports from our
 source tree.  They are totally dead, definitely don't work, and/or
 probably no one remembers what they even were.  The code just bit rots
 and is in the way of future improvements.

I have no position on whether those operating systems are dead enough
to warrant removing support, but on a related point, I would like it
if we could get rid of as many spinlock implementations as are
applicable only to platforms that are effectively defunct.  I'm
suspicious of s_lock.h's support for National Semiconductor 32K,
Renesas' M32R, Renesas' SuperH, UNIVEL, SINIX / Reliant UNIX,
Nextstep, and Sun3, all of which are either on your list above, or
stuff I've never heard of.  I have no problem keeping whatever people
are still using, but it would be nice to eliminate anything that's
actually dead for the reasons you state.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] New sync commit mode remote_write

2012-04-24 Thread Robert Haas
On Tue, Apr 24, 2012 at 12:21 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Thu, Apr 19, 2012 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
 On 4/19/12, Jeff Janes jeff.ja...@gmail.com wrote:
 The workaround would be for the master to refuse to automatically
 restart after a crash, insisting on a fail-over instead (or a manual
 forcing of recovery)?

 I suppose that would work, but I think Simon's idea is better: don't
 let the slave replay the WAL until either (a) it's promoted or (b) the
 master finishes the fsync.   That boils down to adding some more
 handshaking to the replication protocol, I think.

 Alternative c) is that the master automatically recovers from a crash,
 but doesn't replay that particular WAL record because it doesn't find
 it on disk, so the slave has to be instructed to throw it away.

Right.  Which kind of stinks.

 (Or
 perhaps the slave could feed the WAL back to the master, so the master
 could replay it?)

Yes, that would be a very nice enhancement, I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Timsort performance, quicksort (was: Re: Memory usage during sorting)

2012-04-24 Thread Robert Haas
On Tue, Apr 24, 2012 at 12:30 PM, Greg Stark st...@mit.edu wrote:
 On Tue, Apr 24, 2012 at 4:17 PM, Robert Haas robertmh...@gmail.com wrote:
 Based on that, I'm inclined to propose rejiggering things so that the
 presorted-input check runs only at the top level, and not during any
 recursive steps.

 Just a thought: what about running it only every nth step?  Maybe
 something like every 4th step.

If there's actually some advantage to that, sure.  But so far, it
looks like less is more.

 But actually I'm confused. This seems misguided to me. Quicksort
 isn't stable, so even if you have a partially sorted data set, the
 recursive partitions are going to be at best partially sorted after a
 pivot.

Exactly.  That's why I think we should do it only at the topmost
level, before we've done any pivoting.  Doing it at any lower level is
apparently a waste of energy and counterproductive.
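
To make that concrete, here is a minimal sketch of hoisting the check
out of the recursion, using integer keys and illustrative names rather
than pg_qsort's actual identifiers:

#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int
cmp_int(const void *a, const void *b)
{
    int         av = *(const int *) a;
    int         bv = *(const int *) b;

    return (av > bv) - (av < bv);
}

static bool
is_presorted(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i] < a[i - 1])
            return false;       /* random input falls out almost at once */
    return true;
}

static void
sort_ints(int *a, size_t n)
{
    if (is_presorted(a, n))
        return;                 /* sorted input: done after one O(n) pass */
    qsort(a, n, sizeof(int), cmp_int);  /* no presort check inside */
}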

 I haven't walked through it, but I suspect even your
 all-but-one-sorted data set is not finding the data sorted in either
 partition on the next iteration.

I suspect that, too.  Actually, I'm a bit confused about why it's
picking such terrible pivots.  Our median-of-medians optimization
should be doing better than this, I would think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Timsort performance, quicksort (was: Re: Memory usage during sorting)

2012-04-24 Thread Robert Haas
On Tue, Apr 24, 2012 at 12:51 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 24 April 2012 16:17, Robert Haas robertmh...@gmail.com wrote:
 If they are in sorted order with an empty string
 appended onto the end, it takes about 25 seconds.

 That's exactly what I'd have expected, but was surprised not to have
 found with my own test. Perhaps it was the same kind of fluke (i.e. a
 re-creatable one - I'm quite confident that my benchmark was not
 methodologically flawed, at least in execution).

Oh, I read your results as showing something quite similar.

 Based on that, I'm inclined to propose rejiggering things so that the
 presorted-input check runs only at the top level, and not during any
 recursive steps.  The theory is that it won't cost much because if the
 data is randomly ordered we won't do many comparisons before falling
 out, and that seems to be true.  But the only point of doing it in the
 recursive steps is to win when the data is partially ordered, and we
 actually seem to be losing big-time in that case, perhaps because when
 the data is partially ordered, the presort check will frequently run
 through a significant part of the array - wasting cycles - but
 fall out before it actually gets to the end.

 That makes sense to me, but obviously more data is needed here.

What more data do you think is needed?  I've been suspicious of that
code since the first time I looked at it, and I'm now fairly well
convinced that it's full of suckitude.  Honestly, I'm not sure I could
manage to contrive a case where that code wins if I set out to do so.

 Of course, even if we did that, it might not be as good as your
 timsort numbers, but that doesn't mean we shouldn't do it...

 The frustrating thing about my timsort numbers, as I mentioned in
 reply to Dimitri (He modified the subject a bit, so that might appear
 to be a different thread to you), is that they appear to be almost
 consistently winning when you consider the number of comparisons, but
 in fact lose when you measure the duration of execution or TPS or
 whatever. I expected a certain amount of this, but not enough to
 entirely derail the case for replacing quicksort with timsort when
 sorting a single key of text, which is obviously the really compelling
 case for optimisation here. This situation is only going to be made
 worse by the work you've done on SortSupport for text, ...

That is quite baffling.  Have you profiled it at all?

 ...which,
 incidentally, I agree is worthwhile.

Cool, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Timsort performance, quicksort (was: Re: Memory usage during sorting)

2012-04-24 Thread Robert Haas
On Tue, Apr 24, 2012 at 7:16 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 That makes sense to me, but obviously more data is needed here.

 What more data do you think is needed?  I've been suspicious of that
 code since the first time I looked at it, and I'm now fairly well
 convinced that it's full of suckitude.  Honestly, I'm not sure I could
 manage to contrive a case where that code wins if I set out to do so.

 Yeah, I thought that the rationale for introducing the pre-sorted
 check, as it appeared in a commit message, was a little weak. I don't
 know that I'd go as far as to say that it was full of suckitude. The
 worst thing about that code to my mind is that despite being highly
 performance-critical, it has exactly no comments beyond a brief
 description, and the names of variables are arcanely curt.

Well, what I don't like about it is that it doesn't work.  Having a
special case for pre-sorted input makes sense to me, but doing that
check recursively at every level is pointless unless it helps with
almost-sorted input, and doubling the runtime doesn't meet my
definition of helping.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] 9.3: summary of corruption detection / checksums / CRCs discussion

2012-04-24 Thread Greg Stark
On Tue, Apr 24, 2012 at 9:40 PM, Robert Haas robertmh...@gmail.com wrote:
  For three things, index pages
 have hint-type changes that are not single-bit changes.

? Just how big are these? Part of the reason hint bit updates are safe
is that one bit absolutely has to be entirely within one page: you
can't tear a page in the middle of a bit. In reality the atomic write
unit is much larger, probably 4k and almost certainly at least 512
bytes. But the postgres block layout doesn't really offer many
guarantees about the location of anything relative to those 512-byte
blocks, so probably anything larger than a word is unsafe to update.

The main problem with the approach was that we kept finding more hint
bits we had forgotten about. Once the coding idiom was established, it
seems to have become a handy hammer for a lot of problems.
-- 
greg



Re: [HACKERS] remove dead ports?

2012-04-24 Thread Greg Stark
On Tue, Apr 24, 2012 at 9:49 PM, Robert Haas robertmh...@gmail.com wrote:
  I'm
 suspicious of s_lock.h's support for National Semiconductor 32K,
 Renesas' M32R, Renesas' SuperH, UNIVEL, SINIX / Reliant UNIX,
 Nextstep, and Sun3

Were there ever multiprocessor Nextstep or Sun3 machines anyways?
Wouldn't someone on these OSes want spinlocks to immediately sleep
anyways?

I did experiment a while back with getting a vax emulator going to
build postgres on it but even then I was running NetBSD.

-- 
greg



Re: [HACKERS] 9.3: summary of corruption detection / checksums / CRCs discussion

2012-04-24 Thread Josh Berkus
On 4/21/12 2:40 PM, Jeff Davis wrote:
 If we do use WAL for hint bit updates, that has an impact on Hot
 Standby, because HS can't write WAL. So, it would seem that HS could not
 set hint bits.

If we're WAL-logging hint bits, then the standby would be receiving
them, so it doesn't *need* to write them.

However, I suspect that WAL-logging hint bits would be prohibitively
expensive.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



[HACKERS] Temporary tables under hot standby

2012-04-24 Thread Noah Misch
A key barrier to migrations from trigger-based replication to WAL-based
replication is the lack of temporary tables under hot standby.  I'd like to
close that gap; the changes needed will also reduce the master-side cost of
temporary table usage.  Here is a high-level design for your advice and
comments.  Much of this builds on ideas from past postings noted below.

Our temporary tables are cataloged and filled like permanent tables.  This has
the major advantage of making most code operate on tables with minimal regard
for their relpersistence.  It also brings disadvantages:

1. High catalog turnover in rapid create/drop workloads.  Heavy temporary
   table users often need to REINDEX relation-oriented catalogs.  Hot standby
   cannot assign OIDs or modify system catalogs at all.
2. Consumption of permanent XIDs for DML on the table.  This increases COMMIT
   cost on the master and is a non-starter under hot standby.
3. autovacuum cannot VACUUM temporary tables, but their relfrozenxid values
   delay pg_clog truncation and can trigger a wraparound-prevention shutdown.
4. sinval traffic from every CREATE TEMP TABLE et al.
5. CREATE TEMP TABLE records AccessExclusiveLock in the global lock table.
6. We don't automatically drop temporary tables that existed at the point of a
   crash, because they look much like permanent tables.

To resolve points 2 and 3, let's change the XID values stored in temporary
tables from regular TransactionId to LocalTransactionId.  This entails an lxid
counterpart for clog.c.  Functions like GetCurrentTransactionId() and
HeapTupleSatisfiesVisibility() will take a Relation argument to identify the
XID type.  One open question is whether to add conditional logic to functions
like HeapTupleSatisfiesMVCC() or to have parallel implementations like
HeapTupleSatisfiesMVCCLocal().  I lean toward the latter, perhaps with the
help of some code generation.  I don't think a counterpart for pg_subtrans
will be necessary; the backend knows its own XID tree, and the
TRANSACTION_STATUS_SUB_COMMITTED interlock is superfluous with only one
backend as reader and writer.  I'm also thinking the local clog can live
strictly in memory; a session that retains a temporary table across 2B local
transactions can afford 512 MiB of RAM.  With this change, VACUUM can ignore
relfrozenxid of temporary tables when calculating a new datfrozenxid.  This
change can form an independent patch.
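
To illustrate the arithmetic behind that 512 MiB figure (2^31 local
xids times two status bits each), here is a self-contained sketch of
such an in-memory local clog.  The names and layout are illustrative,
not a proposed API, and error handling is omitted:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t LocalTransactionId;

#define LCLOG_XACTS_PER_BYTE        4   /* two status bits per xid */
#define LCLOG_STATUS_IN_PROGRESS    0
#define LCLOG_STATUS_COMMITTED      1
#define LCLOG_STATUS_ABORTED        2

static unsigned char *local_clog;       /* grown on demand */
static size_t local_clog_size;          /* bytes currently allocated */

static void
local_clog_set(LocalTransactionId lxid, int status)
{
    size_t      byteno = lxid / LCLOG_XACTS_PER_BYTE;
    int         shift = (int) (lxid % LCLOG_XACTS_PER_BYTE) * 2;

    if (byteno >= local_clog_size)
    {
        size_t      newsize = (byteno + 1) * 2;    /* grow geometrically */

        local_clog = realloc(local_clog, newsize);
        memset(local_clog + local_clog_size, 0, newsize - local_clog_size);
        local_clog_size = newsize;
    }
    local_clog[byteno] = (unsigned char)
        ((local_clog[byteno] & ~(3 << shift)) | (status << shift));
}

static int
local_clog_get(LocalTransactionId lxid)
{
    size_t      byteno = lxid / LCLOG_XACTS_PER_BYTE;

    if (byteno >= local_clog_size)
        return LCLOG_STATUS_IN_PROGRESS;    /* never written: in progress */
    return (local_clog[byteno] >> ((lxid % LCLOG_XACTS_PER_BYTE) * 2)) & 3;
}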

I do not see a clean behind-the-scenes fix for points 1, 4 and 5.  We can
resolve those by adding a new variety of temporary table, one coincidentally
matching the SQL standard's notion of a temporary table.  The developer will
declare it once, after which all sessions observe it as an initially-empty
table whose contents remain local to the session.  Most relation catalog
entries, including all OIDs, are readily sharable among sessions.  The
exceptions are relpages, reltuples, relallvisible, relfrozenxid, and
pg_statistic rows.  I will handle the pg_class columns by introducing new
backend APIs abstracting them.  Those APIs will consult the relcache for
permanent tables and a local-memory hash for temporary tables.  For
statistics, add a new catalog pg_temp_statistic, an inheritance child of
pg_statistic and itself one of these new-variety temporary tables.
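
To show the shape of those APIs, here is a minimal sketch: session-local
values for the new-variety temporary tables, catalog values for
everything else.  All names here are illustrative, not proposed
interfaces:

#include <stdbool.h>
#include <stdint.h>

typedef struct SessionRelStats
{
    uint32_t    relid;          /* pg_class OID of the temp table */
    uint32_t    relpages;
    double      reltuples;
} SessionRelStats;

#define MAX_SESSION_RELS 64     /* a real implementation would use a hash */
static SessionRelStats session_stats[MAX_SESSION_RELS];
static int  n_session_stats;

static uint32_t
relation_get_pages(uint32_t relid, bool session_local, uint32_t catalog_pages)
{
    if (session_local)
    {
        for (int i = 0; i < n_session_stats; i++)
            if (session_stats[i].relid == relid)
                return session_stats[i].relpages;
        return 0;               /* never stamped: table still empty here */
    }
    return catalog_pages;       /* shared value, as found in the relcache */
}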

Past discussions have raised the issue of interaction between commands like
ALTER TABLE and sessions using the new-variety temporary table.  As a first
cut, let's keep this simple and have ongoing use of the table block any
operation requiring AccessExclusiveLock.  Note that you can always just
make a new
temporary table with a different name to deploy a change quickly.  Implement
this with a heavyweight lock having a novel life cycle.  When a session first
takes an ordinary relation lock on the table, acquire the longer-term lock and
schedule it for release on transaction abort.  On TRUNCATE, schedule a release
on transaction commit.  Of course, also release the lock at session end.

For new-variety temporary tables, change file names from relfilenode[_fork]
to relfilenode[_fork].pid.localnode.  During crash recovery, delete all files
conforming to that pattern for relfilenodes of known temporary tables.  This
also lets backends share pg_class.relfilenode.  The localnode field would
distinguish multiple generations of a table across VACUUM FULL, CLUSTER, and
TRUNCATE.  We could build on this strategy to safely resolve point 6 for the
existing kind of temporary table, but that entails enough other details to
probably not mix it into the same patch.
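
The crash-recovery sweep that naming scheme enables could look roughly
like the following; the pattern matching and names are illustrative
only, and a real implementation would also verify the relfilenode
against known temporary tables:

#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

/* Delete files named "relfilenode[_fork].pid.localnode" in one
 * database directory; plain "relfilenode" or "relfilenode_fork"
 * names never match, so permanent relations are untouched. */
static void
sweep_temp_relfiles(const char *dbdir)
{
    DIR        *dir = opendir(dbdir);
    struct dirent *de;

    if (dir == NULL)
        return;
    while ((de = readdir(dir)) != NULL)
    {
        unsigned    relnode,
                    localnode;
        int         pid;
        char        path[1024];

        if (sscanf(de->d_name, "%u.%d.%u", &relnode, &pid, &localnode) == 3 ||
            sscanf(de->d_name, "%u_%*[a-z].%d.%u",
                   &relnode, &pid, &localnode) == 3)
        {
            snprintf(path, sizeof(path), "%s/%s", dbdir, de->d_name);
            unlink(path);       /* session-local data; safe to remove */
        }
    }
    closedir(dir);
}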

A third patch will permit the following commands in read-only transactions,
where they will throw an error if the subject is not a temporary table:

INSERT
UPDATE
DELETE
COPY ... FROM
TRUNCATE
ANALYZE
VACUUM (including VACUUM FULL)
CLUSTER (without USING clause)
REINDEX

I considered whether to instead separate the set of commands allowed in a
read-only transaction from the set allowed under hot standby.  This proposal
is closer 

Re: [HACKERS] remove dead ports?

2012-04-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I have no position on whether those operating systems are dead enough
 to warrant removing support, but on a related point, I would like it
 if we could get rid of as many spinlock implementations as are
 applicable only to platforms that are effectively defunct.  I'm
 suspicious of s_lock.h's support for National Semiconductor 32K,
 Renesas' M32R, Renesas' SuperH, UNIVEL, SINIX / Reliant UNIX,
 Nextstep, and Sun3, all of which are either on your list above, or
 stuff I've never heard of.  I have no problem keeping whatever people
 are still using, but it would be nice to eliminate anything that's
 actually dead for the reasons you state.

The Renesas implementations were added pretty darn recently, so I think
there are users for those.  The others you mention seem dead to me.
On the other hand, exactly how much is it costing us to leave those
sections of s_lock.h in there?  It's not like we have any plans to
redefine the spinlock interfaces.

regards, tom lane



Re: [HACKERS] Welcome 2012 GSOC students

2012-04-24 Thread Qi Huang
Thanks for the hackers' support. The discussion on the mailing list is quite
encouraging, and I expect to enjoy the three months of fun with Postgres. I'm
still in the final exam period at my university and will participate more
after the exams finish.
Thanks!

Sent from my Windows Phone

From: Josh Berkus
Sent: 25/4/2012 1:06 AM
To: PostgreSQL-development
Subject: [HACKERS] Welcome 2012 GSOC students

Hackers,

We've chosen the 5 GSOC projects for this year:

* JDBC Foreign Data Wrapper, by Atri, mentored by Merlin Moncure
* Document Collection Foreign Data Wrapper, by Zheng Yang (a returning
student), mentored by Satoshi Nagayasu
* Implementing TABLESAMPLE, by Qi, mentored by Stephen Frost
* Better Indexing for Ranges, by Alexander (returning), mentored by
Heikki Linnakangas
* xReader Streaming xLog Reader, by Aakash, mentored by Kevin Grittner

Congratulations to the selected students, and expect to see them here on
-hackers working on their projects.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
