Re: [HACKERS] pg_upgrade incorrectly equates pg_default and database tablespace

2012-04-06 Thread Jeff Davis
On Fri, 2012-03-30 at 13:11 -0700, Jeff Davis wrote:
> I confirmed this bug upgrading 9.1 to master, and that this patch fixes
> it. Thank you for the report!
> 
> Patch looks good to me as well, with one very minor nitpick: the added
> comment is missing an apostrophe.
> 
> Bruce, can you take a look at this?

Adding this to the next commitfest, just so it doesn't get forgotten.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: improve SLRU replacement algorithm

2012-04-06 Thread Jignesh Shah
On Wed, Apr 4, 2012 at 7:06 PM, Josh Berkus  wrote:
> On 4/4/12 4:02 PM, Tom Lane wrote:
>> Greg Stark  writes:
>>> On Wed, Apr 4, 2012 at 9:34 PM, Simon Riggs  wrote:
>>>> Why is this pgbench run accessing so much unhinted data that is > 1
>>>> million transactions old? Do you believe those numbers? Looks weird.
>>
>>> I think this is in the nature of the workload pgbench does. Because
>>> the updates are uniformly distributed, not concentrated 90% in 10% of
>>> the buffers like most real-world systems, (and I believe pgbench only
>>> does index lookups) the second time a tuple is looked at is going to
>>> average N/2 transactions later where N is the number of tuples.
>>
>> That's a good point, and it makes me wonder whether pgbench is the right
>> test case to be micro-optimizing around.  It would be a good idea to at
>> least compare the numbers for something with more locality of reference.
>
> Jignesh, would DVDstore help for this?
>
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com

I will try this out next week. I am scrounging for decent hardware, and
I think I have found a machine, though I will only get access to it on
Monday.

The way to get locality in DVDStore is to do the build with, say, a
10GB size, but specify the size as just 1GB for the actual run; that
should get you the 10% active population scenario.
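[Editor's note: Greg Stark's locality point upthread, and the 10%-hot-set trick here, can be illustrated with a toy simulation. This is plain Python, nothing to do with pgbench internals: under uniform access the typical gap between two touches of the same key is on the order of the key count, while a 90/10 workload makes most gaps far shorter.]

```python
import random
import statistics

def reuse_gaps(pick, n_keys=1000, n_ops=200_000, seed=1):
    """Collect gaps (in operations) between successive touches of a key."""
    rng = random.Random(seed)
    last_seen, gaps = {}, []
    for i in range(n_ops):
        k = pick(rng, n_keys)
        if k in last_seen:
            gaps.append(i - last_seen[k])
        last_seen[k] = i
    return gaps

def uniform(rng, n):
    # every key equally likely, as in stock pgbench updates
    return rng.randrange(n)

def skewed(rng, n):
    # 90% of accesses land on a "hot" 10% of the keys
    return rng.randrange(n // 10) if rng.random() < 0.9 else rng.randrange(n)

med_uniform = statistics.median(reuse_gaps(uniform))
med_skewed = statistics.median(reuse_gaps(skewed))
print("median reuse gap, uniform:", med_uniform)
print("median reuse gap, 90/10 skew:", med_skewed)
```

With 1000 keys the uniform median gap comes out in the high hundreds, while the skewed one is roughly an order of magnitude smaller, which is why recently-hinted or cached blocks are so much more likely to be touched again under a workload with locality.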


Regards,
Jignesh



Re: [HACKERS] WIP: Collecting statistics on CSV file data

2012-04-06 Thread Shigeru Hanada
On Sat, Apr 7, 2012 at 4:19 AM, Tom Lane  wrote:
> Shigeru HANADA  writes:
>> Just after my post, Fujita-san posted another v7 patch[1], so I merged
>> v7 patches into v8 patch.
>
> I've committed a modified version of this, but right after pushing it
> I had a better idea about what the AnalyzeForeignTable API should do.
> An issue that I'd not previously figured out is how analysis of an
> inheritance tree could deal with foreign-table members, because it wants
> to estimate the members' sizes before collecting the actual sample rows.
> However, given that we've got the work split into a precheck phase and
> a sample collection phase, that's not hard to solve: we could insist
> that the FDW give back a size estimate in the precheck phase not the
> sample collection phase.  I'm off to fix that up ...

Thanks.  I'll improve pgsql_fdw so that it can collect statistics of
foreign data with this new API.

Regards,
-- 
Shigeru Hanada



Re: [HACKERS] Range Types regression failure

2012-04-06 Thread Thom Brown
On 6 April 2012 22:35, Tom Lane  wrote:
> Thom Brown  writes:
>> I can't explain why I'm seeing a range type installcheck failure as I
>> don't see the same problem on the buildfarm, but out of all the tests
>> run, the range types test is the only one to fail.
>
> I can duplicate that output ordering if I force it to use indexscans,
> but it's quite unclear why it would do so by default for a six-row
> table.  Are you using weird planner cost parameters?  (I'd have expected
> such to result in a lot more diffs than this, though.)

Ah, you've nailed it.  It's performing an index-only scan:

thom@regression=# explain select * from numrange_test where nr <
numrange(-1000.0, -1000.0,'[]');
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Index Only Scan using numrange_test_btree on numrange_test  (cost=0.00..20.00 rows=437 width=32)
   Index Cond: (nr < '[-1000.0,-1000.0]'::numrange)
(2 rows)

And you are right about my cost settings.  I have random_page_cost set
to 1.1 as I'm using an SSD.  Setting it back to 4.0 and re-running the
tests removes the issue.  My fault for swapping in my edited conf file
prior to tests.
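[Editor's note: for readers wondering why a lower random_page_cost flips the plan, a deliberately oversimplified toy model shows the mechanism. This is nothing like PostgreSQL's actual cost equations, and the page counts are hypothetical; it only illustrates how the random/sequential cost ratio tips the planner's choice.]

```python
# Oversimplified toy cost model; PostgreSQL's real costing is far richer.
def seq_scan_cost(table_pages, seq_page_cost=1.0):
    # a sequential scan reads every page at the cheap sequential rate
    return table_pages * seq_page_cost

def index_scan_cost(pages_fetched, random_page_cost):
    # an index scan touches fewer pages, but each fetch is a random read
    return pages_fetched * random_page_cost

TABLE_PAGES = 10   # hypothetical tiny table
INDEX_PAGES = 4    # hypothetical pages touched via the index

for rpc in (1.1, 4.0):
    idx = index_scan_cost(INDEX_PAGES, rpc)
    seq = seq_scan_cost(TABLE_PAGES)
    choice = "index scan" if idx < seq else "seq scan"
    print(f"random_page_cost={rpc}: index={idx:.1f} seq={seq:.1f} -> {choice}")
```

At 1.1 the index scan wins (4.4 vs 10.0); at the default 4.0 it loses (16.0 vs 10.0), matching the flip Thom observed.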

Thanks.

-- 
Thom



Re: [HACKERS] Range Types regression failure

2012-04-06 Thread Tom Lane
Thom Brown  writes:
> I can't explain why I'm seeing a range type installcheck failure as I
> don't see the same problem on the buildfarm, but out of all the tests
> run, the range types test is the only one to fail.

I can duplicate that output ordering if I force it to use indexscans,
but it's quite unclear why it would do so by default for a six-row
table.  Are you using weird planner cost parameters?  (I'd have expected
such to result in a lot more diffs than this, though.)

regards, tom lane



Re: [HACKERS] System catalog typos

2012-04-06 Thread Thom Brown
On 6 April 2012 21:56, Peter Eisentraut  wrote:
> On fre, 2012-04-06 at 01:24 +0100, Thom Brown wrote:
>> > I also found a couple typos in completely different sections. (patch
>> attached)
>>
>> Apologies, that last patch had one correction in the wrong place.
>> Reattached.
>>
> Committed.

Cheers!

-- 
Thom



[HACKERS] Range Types regression failure

2012-04-06 Thread Thom Brown
Hi,

I can't explain why I'm seeing a range type installcheck failure as I
don't see the same problem on the buildfarm, but out of all the tests
run, the range types test is the only one to fail.

I've attached the diff and the rangetypes.out file.  It appears that
while the rows output are the same, they aren't in the same order.
What I can't explain is why the ordering on my system is different to
all the buildfarm animals.

This is on Ubuntu 11.10 64-bit (Linux kernel 3.0.0) and the following
build options:
--enable-depend
--enable-cassert
--enable-debug
--with-ossp-uuid
--with-libxml
--with-libxslt

-- 
Thom


regression.diffs
Description: Binary data


rangetypes.out
Description: Binary data



Re: [HACKERS] System catalog typos

2012-04-06 Thread Peter Eisentraut
On fre, 2012-04-06 at 01:24 +0100, Thom Brown wrote:
> > I also found a couple typos in completely different sections. (patch
> attached)
> 
> Apologies, that last patch had one correction in the wrong place.
> Reattached.
> 
Committed.




Re: [HACKERS] System catalog typos

2012-04-06 Thread Peter Eisentraut
On tor, 2012-04-05 at 23:22 +0100, Thom Brown wrote:
> I attach a patch to correct various system catalog/view definitions in the 
> docs.

Committed.




Re: [HACKERS] WIP: Collecting statistics on CSV file data

2012-04-06 Thread Tom Lane
Shigeru HANADA  writes:
> Just after my post, Fujita-san posted another v7 patch[1], so I merged
> v7 patches into v8 patch.

I've committed a modified version of this, but right after pushing it
I had a better idea about what the AnalyzeForeignTable API should do.
An issue that I'd not previously figured out is how analysis of an
inheritance tree could deal with foreign-table members, because it wants
to estimate the members' sizes before collecting the actual sample rows.
However, given that we've got the work split into a precheck phase and
a sample collection phase, that's not hard to solve: we could insist
that the FDW give back a size estimate in the precheck phase not the
sample collection phase.  I'm off to fix that up ...
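[Editor's note: the split Tom describes, where the FDW returns a size estimate in the precheck phase and collects sample rows only later, can be sketched roughly as follows. This is illustrative Python with invented names; the real AnalyzeForeignTable interface is a C function pointer in the FDW's FdwRoutine struct.]

```python
# Hypothetical sketch only: names and shapes are invented for illustration.
class ExampleFdw:
    def __init__(self, tables):
        self.tables = tables  # stand-in for remote data: name -> rows

    def analyze_foreign_table(self, relation):
        """Precheck phase: decide whether we can analyze this table and,
        if so, return a size estimate *now*, so inheritance-tree planning
        can size the members before any sample rows are collected."""
        if relation not in self.tables:
            return None  # nothing to do; the backend skips this table
        est_pages = max(1, len(self.tables[relation]) // 100)
        return est_pages, self.acquire_sample_rows

    def acquire_sample_rows(self, relation, target_rows):
        """Sample-collection phase, run later by ANALYZE."""
        return self.tables[relation][:target_rows]

fdw = ExampleFdw({"t1": list(range(1000))})
result = fdw.analyze_foreign_table("t1")
if result is not None:
    pages, sampler = result
    sample = sampler("t1", 30)
    print(pages, len(sample))  # prints: 10 30
```

The key point is the ordering: the estimate is available up front, while row collection is deferred to a callback invoked later.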

regards, tom lane



Re: [HACKERS] Optimizing Nested Correlated Queries by decorrelation: GSOC 2012 Project

2012-04-06 Thread Josh Berkus
On 3/30/12 7:29 AM, Tom Lane wrote:
> Arun Chaitanya  writes:
>> The link to the paper is
>> http://www.iith.ac.in/~ravig/courses/cs5050/papers/decorrelation-cesar.pdf
> 
> Given the authorship of that paper, I'd have to wonder whether Microsoft
> has filed for any patents regarding these ideas.

Feh, too bad Jim Gray isn't around anymore; I could just ask.

Is there some smaller query optimization project which Arun could
reasonably tackle?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Question regarding SSL code in backend and frontend

2012-04-06 Thread Magnus Hagander
On Fri, Apr 6, 2012 at 18:43, Tom Lane  wrote:
> Magnus Hagander  writes:
>> True. I guess I was just assuming that JDBC (and npgsql i think?) were
>> using TLS - I would assume that to be the default in both Java and
>> .NET. We'd have to check that before making a change of course - and
>> I'm not convinced we need to make the change. But if we're making a
>> change to align those two with each other, that's the direction the
>> change should be in.
>
> Agreed, but should we align them?  IIUC, changing the server would cause
> it to reject connections from old non-TLS-aware clients.  Seems like
> that isn't a particularly good idea.

Well, it would be a good idea for those that want to be sure they're
using TLS for security reasons (tlsv1 is more secure than sslv3 - see
e.g. http://en.wikipedia.org/wiki/Transport_Layer_Security#Security).
We could also add a server parameter saying ssl_tls_only or something
like that which would switch it...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Question regarding SSL code in backend and frontend

2012-04-06 Thread Tom Lane
Magnus Hagander  writes:
> True. I guess I was just assuming that JDBC (and npgsql i think?) were
> using TLS - I would assume that to be the default in both Java and
> .NET. We'd have to check that before making a change of course - and
> I'm not convinced we need to make the change. But if we're making a
> change to align those two with each other, that's the direction the
> change should be in.

Agreed, but should we align them?  IIUC, changing the server would cause
it to reject connections from old non-TLS-aware clients.  Seems like
that isn't a particularly good idea.

regards, tom lane



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-04-06 Thread Tom Lane
Shigeru Hanada  writes:
> On Fri, Apr 6, 2012 at 11:20 PM, Tom Lane  wrote:
>> In
>> particular I do not like the specific way it's done in the v7 patch
>> (I've not looked at v8 yet) because the interposed logic has a
>> hard-wired assumption that foreign tables do not have inheritance
>> children.  I think that assumption has a life expectancy measured in
>> months at most, and I don't want to have to try to fix every FDW when
>> it changes.  But I think we can easily revise the hook details to fix
>> that, and I'm hoping to get that done today.

> I'll try implementing the design you suggested.

I've already got it fixed up ...

regards, tom lane



Re: [HACKERS] Question regarding SSL code in backend and frontend

2012-04-06 Thread Magnus Hagander
On Fri, Apr 6, 2012 at 02:05, Tom Lane  wrote:
> Magnus Hagander  writes:
>> If anything, we should be changing it to TLSv1 in both client and
>> server, since every client out there now should be using that anyway,
>> given that the client has been specifying it for a long time.
>
> Huh?  libpq isn't every client.

True. I guess I was just assuming that JDBC (and npgsql i think?) were
using TLS - I would assume that to be the default in both Java and
.NET. We'd have to check that before making a change of course - and
I'm not convinced we need to make the change. But if we're making a
change to align those two with each other, that's the direction the
change should be in.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-04-06 Thread Shigeru Hanada
On Fri, Apr 6, 2012 at 11:20 PM, Tom Lane  wrote:
>> Another concern is the place where we hook the process of ANALYZE.  IOW,
>> how much portion of ANALYZE should be overridable?
>
> Not much, IMO.  The FDW should be able to decide whether or not to
> analyze a particular table, and it should be in charge of implementing
> its own version of acquire_sample_rows, but no more than that.

ISTM that we have rough consensus about what FDW should do for an
ANALYZE request.  FDW should choose either of:
a) get sample rows and return them to backend
b) tell backend that the FDW has nothing to do for the request

> In
> particular I do not like the specific way it's done in the v7 patch
> (I've not looked at v8 yet) because the interposed logic has a
> hard-wired assumption that foreign tables do not have inheritance
> children.  I think that assumption has a life expectancy measured in
> months at most, and I don't want to have to try to fix every FDW when
> it changes.  But I think we can easily revise the hook details to fix
> that, and I'm hoping to get that done today.

I'll try implementing the design you suggested.

Regards,
-- 
Shigeru Hanada



Re: [HACKERS] parallel pg_dump

2012-04-06 Thread Stefan Kaltenbrunner
On 04/05/2012 12:32 PM, Joachim Wieland wrote:
> So here's a pg_dump benchmark from a real world database as requested
> earlier. This is a ~750 GB large 9.0.6 database, and the backup has
> been done over the internal network from a different machine. Both
> machines run Linux.
> 
> I am attaching a chart that shows the table size distribution of the
> largest tables and the overall pg_dump runtime. The resulting (zlib
> compressed) dump directory was 28 GB.
> 
> Here are the raw numbers:
> 
> -Fc dump
> real168m58.005s
> user146m29.175s
> sys 7m1.113s
> 
> -j 2
> real90m6.152s
> user155m23.887s
> sys 15m15.521s
> 
> -j 3
> real61m5.787s
> user155m33.118s
> sys 13m24.618s
> 
> -j 4
> real44m16.757s
> user155m25.917s
> sys 13m13.599s
> 
> -j 6
> real36m11.743s
> user156m30.794s
> sys 12m39.029s
> 
> -j 8
> real36m16.662s
> user154m37.495s
> sys 11m47.141s


Interesting numbers. Any details on the network speed between the boxes,
the number of cores, the uncompressed size of the dump, and what the
apparent bottleneck was?
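[Editor's note: the raw timings above tabulate into speedups as follows; a quick script makes the scaling plateau visible.]

```python
# Tabulate the "real" times quoted above into speedup and parallel efficiency.
def seconds(m, s):
    return m * 60 + s

runs = {
    1: seconds(168, 58.005),  # plain -Fc dump, single process
    2: seconds(90, 6.152),
    3: seconds(61, 5.787),
    4: seconds(44, 16.757),
    6: seconds(36, 11.743),
    8: seconds(36, 16.662),
}

base = runs[1]
for j, t in sorted(runs.items()):
    print(f"-j {j}: {t / 60:6.1f} min  speedup {base / t:.2f}x  "
          f"efficiency {base / (t * j):.0%}")
```

Speedup is nearly linear through -j 4 (about 3.8x), but -j 6 and -j 8 come out essentially identical at roughly 4.7x, which is exactly why Stefan's question about the bottleneck (network, cores, or compression) is the interesting one.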


Stefan



Re: [HACKERS] [PATCH] Support for foreign keys with arrays

2012-04-06 Thread Noah Misch
On Fri, Apr 06, 2012 at 09:21:17AM +0300, Peter Eisentraut wrote:
> On Sat, 2012-03-24 at 10:01, Gianni Ciolli wrote:
> > ON (DELETE | UPDATE) actions for EACH foreign keys
> > ==================================================
> > 
> > ------------------ ----------- -----------
> >                   |    ON     |    ON     |
> > Action            |  DELETE   |  UPDATE   |
> > ------------------ ----------- -----------
> > CASCADE           |    Row    | Forbidden |
> > SET NULL          |    Row    |    Row    |
> > SET DEFAULT       |    Row    |    Row    |
> > EACH CASCADE      |  Element  |  Element  |
> > EACH SET NULL     |  Element  |  Element  |
> > EACH SET DEFAULT  | Forbidden | Forbidden |
> > NO ACTION         |     -     |     -     |
> > RESTRICT          |     -     |     -     |
> > ------------------ ----------- -----------
> > 
> I took another fresh look at this feature after not having looked for a
> month or two.  I think the functionality is probably OK, but I find the
> interfaces somewhat poorly named.  Consider, "PostgreSQL adds EACH
> foreign keys" -- huh?  I think the key word ELEMENT would be more
> descriptive and precise, and it also leaves the door open to other kinds
> of non-atomic foreign key relationships outside of arrays.  EACH has no
> relationship with arrays.  It might as well refer to each row.

Good points.  Your proposed naming works for me.

> On the matter of the above chart, there has been a long back and forth
> about whether the row or the element case should be the default.  Both
> cases are probably useful, but unfortunately you have now settled on
> making maximum destruction the default.  Additionally, we would now have
> the case that sometimes, depending on some configuration elsewhere, an
> ON DELETE CASCADE deletes more than what was actually involved in the
> foreign key.  What I'd suggest is to make both cases explicit.  That is,
> forbid ON DELETE CASCADE altogether and make people write ON DELETE
> CASCADE ROW or ON DELETE CASCADE ELEMENT.  In addition to making things
> more explicit and safer, it would again leave the door open to other
> kinds of relationships later.

I'm ambivalent on this one.  ON DELETE CASCADE truly does the same thing
regardless of whether the FK incorporates an EACH column.  The current syntax
arises from that symmetry rather than a decision to dub its behavior as a
default.  That said, when a user wants CASCADE ELEMENT, your proposal would
more rapidly divert him from wrongly using CASCADE ROW.

Thanks,
nm



Re: [HACKERS] checkpoint patches

2012-04-06 Thread Greg Smith

On 04/05/2012 02:23 PM, Jim Nasby wrote:

If there's a fundamental flaw in how linux deals with heavy writes that
means you can't rely on certain latency windows, perhaps we should be
looking at using a different OS to test those cases...


Performance under this sort of write overload is something that's been a 
major focus of more recent Linux kernel versions than I've tested yet. 
It may get better just via the passage of time.  Today is surely far 
improved over the status quo a few years ago, when ext3 was the only 
viable filesystem choice and tens of seconds could pass with no activity.


The other thing to recognize here is that some heavy write operations 
get quite a throughput improvement from how things work now, with VACUUM 
being the most obvious one.  If I retune Linux to act more like other 
operating systems, with a smaller and more frequently flushed write 
cache, it will trash VACUUM write performance in the process.  That's 
one of the reasons I submitted the MB/s logging to VACUUM for 9.2, to 
make it easier to measure what happens to that as write cache changes 
are made.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)

2012-04-06 Thread Peter Geoghegan
On 29 March 2012 21:05, Tom Lane  wrote:
> Barring objections I'll go fix this, and then this patch can be
> considered closed except for possible future tweaking of the
> sticky-entry decay rule.

Attached patch fixes a bug, and tweaks sticky-entry decay.

The extant code bumps usage (though not call counts) in two hooks
(pgss_post_parse_analyze() and pgss_ExecutorEnd()), so prepared
queries will always have about half the usage of an equivalent simple
query, which is clearly not desirable. With the proposed patch,
"usage" should be similar to "calls" until the first call of
entry_dealloc(), rather than usually having a value that's about twice
as high. With the patch, a run of pgbench with and without "-M
prepared" results in a usage of calls + 1 for each query from both
runs.

The approach I've taken with decay is to maintain a server-wide median
usage value (well, a convenient approximation), which is assigned to
sticky entries. This makes it hard to evict the entries in the first
couple of calls to entry_dealloc(). On the other hand, if there really
is contention for entries, it will soon become really easy to evict
sticky entries, because we use a much more aggressive multiplier of
0.5 for their decay.

I rather conservatively initially assume that the median usage is 10,
which is a very low value considering the use of the multiplier trick.
In any case, in the real world it won't take too long to call
entry_dealloc() to set the median value, if in fact it actually
matters.
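[Editor's note: the decay rule can be sketched numerically. The 0.5 multiplier for sticky entries and the conservative initial median of 10 come from the message above; the 0.99 multiplier used for ordinary entries here is an assumed value for comparison only, not a constant taken from pg_stat_statements.]

```python
# Toy model of the sticky-entry decay rule described above.
NORMAL_DECAY = 0.99   # assumed multiplier for ordinary entries (illustration)
STICKY_DECAY = 0.5    # aggressive multiplier for sticky entries (from the patch)

median_usage = 10.0   # conservative initial median estimate (from the message)
sticky = median_usage # a new sticky entry is assigned the median usage
normal = median_usage

for n in range(1, 5):
    sticky *= STICKY_DECAY
    normal *= NORMAL_DECAY
    print(f"after entry_dealloc() #{n}: sticky={sticky:.3f} normal={normal:.3f}")
```

After only four entry_dealloc() calls the sticky entry's usage has collapsed to a fraction of its starting value while ordinary entries are barely touched: hard to evict in the first couple of passes, but quickly disposable under real contention, which is the behavior the patch aims for.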

You described entries as precious. This isn't quite the full picture;
while pg_stat_statements will malthusianistically burn through pretty
much as many entries as you care to give it, or so you might think, I
believe that in the real world the rate at which the module burns
through them would frequently look logarithmic. In other words, after
an entry_dealloc() call the hashtable is 95% full, but it might take
rather a long time to reach 100% again; the first 5% is consumed
dramatically faster than the last. The user might not actually care if
you need to cache a sticky value for a few hours in one of their
slots while they run an epic reporting query, even though the hashtable
is over 95% full.

The idea is to avoid evicting a sticky entry just because there
happened to be an infrequent entry_dealloc() at the wrong time, and
the least marginal of the most marginal 5% of non-sticky entries (that
is, the 5% up for eviction) happened to have a call count/usage
higher than the magic value of 3, which I find quite plausible.

If I apply your test for dead sticky entries after the regression
tests (serial schedule) were run, my approach compares very favourably
(granted, presumably usage values were double-counted for your test,
making our results less than completely comparable).

For the purposes of this experiment, I've just commented out "if
(calls == 0) continue;" within the pg_stat_statements() function,
obviously:

postgres=# select calls = 0, count(*) from pg_stat_statements() group
by calls = 0;
-[ RECORD 1 ]-
?column? | f
count    | 959
-[ RECORD 2 ]-
?column? | t
count    | 3    <--- this includes the above query itself

postgres=# select calls = 0, count(*) from pg_stat_statements() group
by calls = 0;
-[ RECORD 1 ]-
?column? | f
count    | 960

Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-04-06 Thread Tom Lane
Shigeru HANADA  writes:
> (2012/04/06 1:29), Tom Lane wrote:
>> The one thing that seems pretty clear from this discussion is that one
>> size doesn't fit all.  I think we really ought to provide a hook so that
>> the FDW can determine whether ANALYZE applies to its foreign tables at
>> all, and how to obtain the sample rows if it does.

> To support foreign-table ANALYZE by adding a new hook, we would need a
> mechanism (or at least documented guide lines) to manage the chain of
> hook functions, because such hook might be used by multiple FDWs (or
> other plug-ins) at the same time.  A wrongly-written plug-in can easily
> break the hook chain.

Sorry, I used the word "hook" loosely, not with the intention of meaning
a global function pointer.  We should of course implement this with
another per-FDW method pointer, as in the postgresql-analyze patch
series.

> Another concern is the place where we hook the process of ANALYZE.  IOW,
> how much portion of ANALYZE should be overridable?

Not much, IMO.  The FDW should be able to decide whether or not to
analyze a particular table, and it should be in charge of implementing
its own version of acquire_sample_rows, but no more than that.  In
particular I do not like the specific way it's done in the v7 patch
(I've not looked at v8 yet) because the interposed logic has a
hard-wired assumption that foreign tables do not have inheritance
children.  I think that assumption has a life expectancy measured in
months at most, and I don't want to have to try to fix every FDW when
it changes.  But I think we can easily revise the hook details to fix
that, and I'm hoping to get that done today.

regards, tom lane



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-04-06 Thread Kyotaro HORIGUCHI
Excuse me for cutting in,

2012/4/6 Shigeru HANADA :
> To support foreign-table ANALYZE by adding a new hook, we would need a
> mechanism (or at least documented guide lines) to manage the chain of
> hook functions, because such hook might be used by multiple FDWs (or
> other plug-ins) at the same time.  A wrongly-written plug-in can easily
> break the hook chain.  We might need to provide register/unregister API
> for this hook point, like RegisterResourceReleaseCallback, and call each
> registered function until either of them processes the request.  Is
> there any other hook point which has similar issue?

+1

The plain hook mechanism in PostgreSQL is, I think, meant for hanging a
bunch of anonymous callbacks that are registered, unregistered and
called all together. It is not a good fit for managing individual
callbacks that may be registered or unregistered in arbitrary order and
need to be called separately.

Even if we provided RegisterResourceReleaseCallback-like machinery, it
would be far more complicated than an additional field in FdwRoutine
plus some analyze_rel() modifications on the core side, and having each
plug-in confirm whether it is really its turn to act would be tedious
work on the plug-in side.

Of course, I don't think there will be that many fdw-analyze callbacks
registered, but even two would be enough to cause this problem.

The current modifications in analyze_rel() do not look definitive, but
they are not so bad, and seem more stable than a simple hook point
that would be abandoned before long.

> Another concern is the place where we hook the process of ANALYZE.  IOW,
> how much portion of ANALYZE should be overridable?  Replacing
> analyze_rel or do_analyze_rel wholly requires plug-ins to copy most of
> codes from original function in order to implement hook function.  From
> the perspective of FDW author, I think that row sampling
> (acquire_sample_rows) function seems handy place to hook.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-06 Thread Cousin Marc
On 05/04/12 08:02, Boszormenyi Zoltan wrote:
> On 2012-04-04 21:30, Alvaro Herrera wrote:
>> I think this patch is doing two things: first touching infrastructure
>> stuff and then adding lock_timeout on top of that.  Would it work to
>> split the patch in two pieces?
>>
>
> Sure. Attached is the split version.
>
> Best regards,
> Zoltán Böszörményi
>
Hi,

I've started looking at and testing both patches.

Technically speaking, I think the source looks much better than the
first version of lock timeout, and may help adding other timeouts in the
future. I haven't tested it in depth though, because I encountered the
following problem:

While testing the patch, I found a way to crash PG. But what's weird is
that it crashes also with an unpatched git version.

Here is the way to reproduce it (I have done it with a pgbench schema):

- Set a small statement_timeout (just to save time during the tests)

Session1:
=#BEGIN;
=#lock TABLE pgbench_accounts ;

Session 2:
=#BEGIN;
=#lock TABLE pgbench_accounts ;
ERROR:  canceling statement due to statement timeout
=# lock TABLE pgbench_accounts ;

I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
done with a rollback to savepoint of course.

Session 2 crashes with this: TRAP: FailedAssertion("!(locallock->holdsStrongLockCount == 0)", File: "lock.c", Line: 749).

It can also be done without a statement_timeout, and a control-C on the
second lock table.

I didn't touch anything but this. It occurs every time, when asserts
are enabled.

I tried it on 9.1.3, and I couldn't make it crash with the same sequence
of events. So maybe it's something introduced since then? Or is the
assert still valid?

Cheers



Re: [HACKERS] Faster compression, again

2012-04-06 Thread Huchev
Well, the patent argument, used like this, looks like a wild card, which
can be freely interpreted as a mortal danger for some, and a non-issue for
others. A perfect scare-mongerer.
Quite frankly, I don't buy that one implementation is safer because there
is "Google backing it". I can't think of any reason why byte-aligned LZ77
algorithm could face any risk. And btw, just look at the number of
companies which had to pay protection money to Microsoft or face litigation
with Apple because they were using Google's Android. It looks to me that
Google is more a magnet for such dangers than a protector.

Regarding test tools : Yes, this is correct, Snappy C has more fuzzer tools
provided within the package.

Regarding integration to BTRFS, and therefore into Linux, both
implementation look on equal terms. I haven't seen anything which tells
that one has more chances than the other being part of Linux 3.5. In fact,
maybe both will be integrated at the same time.

However, a little publicized fact is that quite a few people tried both
implementation (Snappy C and LZ4), and there were more
failures/difficulties reported on Snappy C. It doesn't mean that Snappy C
is bad, just more complex to use. It seems that the LZ4 implementation is
more straightforward : less dependancies, less risks, less time spent to
properly optimize it,
well, in a word, simpler.
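[Editor's note: for readers unfamiliar with the term, a byte-aligned LZ77 scheme of the kind discussed here can be sketched in a few lines. This toy is illustrative only: its token format is invented for clarity, and real LZ4 packs literal runs and matches far more compactly and finds matches with a hash table rather than brute force.]

```python
def compress(data: bytes, window=255, min_match=4):
    """Greedy byte-aligned LZ77: emits 0x00 + literal byte, or
    0x01 + offset byte + length byte (token format invented here)."""
    out, i = bytearray(), 0
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):
            l = 0
            while i + l < len(data) and l < 255 and data[j + l] == data[i + l]:
                l += 1
            if l >= min_match and l > best_len:
                best_len, best_off = l, i - j
        if best_len:
            out += bytes([0x01, best_off, best_len])
            i += best_len
        else:
            out += bytes([0x00, data[i]])
            i += 1
    return bytes(out)

def decompress(blob: bytes) -> bytes:
    out, i = bytearray(), 0
    while i < len(blob):
        if blob[i] == 0x00:               # literal token
            out.append(blob[i + 1])
            i += 2
        else:                             # match token: copy byte-by-byte,
            off, length = blob[i + 1], blob[i + 2]
            for _ in range(length):       # so overlapping (run-length)
                out.append(out[-off])     # copies work as in real LZ77
            i += 3
    return bytes(out)

sample = b"to be or not to be, that is the question" * 4
packed = compress(sample)
assert decompress(packed) == sample
print(len(sample), "->", len(packed))
```

Everything operates on whole bytes, with no bit-level packing: that byte alignment is what makes this family of codecs so fast to decode, and it is also why the patent-risk surface is argued to be small; the core idea dates back to Ziv and Lempel's 1977 scheme.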



On 5 April 2012 01:11, Daniel Farina-4 [via PostgreSQL] <
ml-node+s1045698n5619199...@n5.nabble.com> wrote:

> On Tue, Apr 3, 2012 at 7:29 AM, Huchev <[hidden email]> wrote:
> > For a C implementation, it could be interesting to consider the LZ4
> > algorithm, since it is written natively in this language. In contrast,
> > Snappy has been ported to C by Andy from the original C++ Google code,
> > which also translates into less extensive usage and tests.
>
> From what I can tell, the C implementation of snappy has more tests
> than this LZ4 implementation, including a fuzz tester.  It's a
> maintained part of Linux as well, and used for btrfs --- this is why
> it was ported.  The high compression version of LZ4 is apparently
> LGPL.  And, finally, there is the issue of patents: snappy has several
> multi-billion dollar companies that can be held liable (originator
> Google, as well as anyone connected to Linux), and to the best of my
> knowledge, nobody has been held to extortion yet.
>
> Consider me unconvinced as to this line of argument.
>
> --
> fdr
>



Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-04-06 Thread Shigeru HANADA
(2012/04/06 1:29), Tom Lane wrote:
> "Albe Laurenz"  writes:
>> Maybe the FDW API could be extended so that foreign data wrappers
>> can provide a random sample to avoid a full table scan.
> 
> The one thing that seems pretty clear from this discussion is that one
> size doesn't fit all.  I think we really ought to provide a hook so that
> the FDW can determine whether ANALYZE applies to its foreign tables at
> all, and how to obtain the sample rows if it does.
>
> Since we've already whacked the FDW API around in incompatible ways for
> 9.2, now is probably a good time to add that.  I'm inclined to say this
> should happen whether or not we accept any of the currently proposed
> patches for 9.2, because if the hook is there it will provide a way for
> people to experiment with foreign-table ANALYZE operations outside of
> core.

To support foreign-table ANALYZE by adding a new hook, we would need a
mechanism (or at least documented guidelines) to manage the chain of
hook functions, because such a hook might be used by multiple FDWs (or
other plug-ins) at the same time.  A wrongly-written plug-in can easily
break the hook chain.  We might need to provide a register/unregister API
for this hook point, like RegisterResourceReleaseCallback, and call each
registered function until one of them processes the request.  Is
there any other hook point which has a similar issue?

Another concern is the place where we hook the process of ANALYZE.  IOW,
how much of ANALYZE should be overridable?  Replacing analyze_rel or
do_analyze_rel wholesale would require plug-ins to copy most of the code
from the original function in order to implement the hook.  From an FDW
author's perspective, I think the row-sampling function
(acquire_sample_rows) seems a handy place to hook.
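[Editor's aside: a minimal sketch of the register/unregister style being
proposed, in the spirit of RegisterResourceReleaseCallback.  Every name
below (RegisterAnalyzeHook, RunAnalyzeHooks, etc.) is invented for
illustration; none of this is actual PostgreSQL code.  The idea is that
ANALYZE walks a list of callbacks and stops at the first one that claims
the relation, so a plug-in cannot break the chain by mishandling a saved
function pointer.]

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical foreign-table ANALYZE hook: returns true if this FDW
 * handled row sampling for the named relation. */
typedef bool (*AnalyzeForeignTable_hook)(const char *relname);

#define MAX_ANALYZE_HOOKS 32
static AnalyzeForeignTable_hook analyze_hooks[MAX_ANALYZE_HOOKS];
static int num_analyze_hooks = 0;

void
RegisterAnalyzeHook(AnalyzeForeignTable_hook fn)
{
    if (num_analyze_hooks < MAX_ANALYZE_HOOKS)
        analyze_hooks[num_analyze_hooks++] = fn;
}

void
UnregisterAnalyzeHook(AnalyzeForeignTable_hook fn)
{
    for (int i = 0; i < num_analyze_hooks; i++)
    {
        if (analyze_hooks[i] == fn)
        {
            /* Close the gap so the chain stays intact. */
            memmove(&analyze_hooks[i], &analyze_hooks[i + 1],
                    (num_analyze_hooks - i - 1) * sizeof(analyze_hooks[0]));
            num_analyze_hooks--;
            return;
        }
    }
}

/* Called from ANALYZE: true if some FDW handled the sampling,
 * false means fall back to the built-in acquire_sample_rows path. */
bool
RunAnalyzeHooks(const char *relname)
{
    for (int i = 0; i < num_analyze_hooks; i++)
    {
        if (analyze_hooks[i](relname))
            return true;        /* first taker wins */
    }
    return false;
}
```

The design choice to manage the list centrally (rather than having each
plug-in save and call the previous hook pointer, as PostgreSQL's single-slot
hooks do today) is exactly what avoids the "wrongly-written plug-in breaks
the chain" failure mode described above.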

Regards,
-- 
Shigeru HANADA

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Last gasp

2012-04-06 Thread Greg Smith

On 04/05/2012 05:03 PM, Daniel Farina wrote:

To get to the point, I wonder if it makes sense for someone who has a
better sense a priori of what they're looking for in a committable patch
(i.e. a committer, or someone with a telepathic link to one or more)
to delegate specific pieces of patches for thorough review, sketching
some thoughts about pitfalls or hazards to look for.  Think of it as a
patch checklist that is informed by the experience of a committer
skimming its contents.


It's tricky to add this sort of idea into the current patch workflow in 
all cases.  A decent percentage of submissions bounce off a reviewer who 
doesn't commit, such that no committer spends time on them yet they 
still move forward.  Any idea that tries to involve committers earlier 
in the process risks using more of their time on things that ultimately 
are never going to pass toward commit anyway.  That is, after all, where 
this whole process started before CFs, and we know that didn't scale 
well.


We already have a loose bar in this exact area, one that suggests larger 
patches should first appear earlier in the development cycle.  That 
happened in this case, with a first submission in mid November, but it 
wasn't enough to make things go smoothly.


When I trace back how that played out, I think the gap in this case came 
from not being sure who was going to commit the result at the end.  If 
it's early January, and the community knows a large feature is 
approaching because it's been around for over a month already, we had 
better already be asking who is eventually going to commit it.  Many 
patch submitters agonize over "which committer is going to grill me 
most?", and the bigger the patch, the more that weighs on them.


Nailing that down and putting together the sort of early committer 
checklist you're describing strikes me as something that might happen in 
the first week of a CF, before there is normally a large pile of "Ready 
for Committer" items around.  Raising these questions--who is 
interested in committing big things, and where they consider the bar to 
be--is something submitters who have a stake in seeing their patches go 
in might do.


It's better if people who are already working hard on features don't 
feel like they need to wander around begging for someone to pay 
attention to them though.  Ideally it would be the sort of thing a 
proper CF manager would highlight for them.  Unfortunately we often 
don't quite have one of those, instead just having people who sometimes 
make a bunch of noise at the beginning of a CF and then go AWOL. 
(cough)  No one is happy about that, and it's something that clearly 
needs to be solved better too.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



Re: [HACKERS] Last gasp

2012-04-06 Thread Tom Lane
Greg Smith  writes:
> On 04/05/2012 04:27 PM, Simon Riggs wrote:
>> It's shocking since after months of work and an especially extended
>> edition CF, we expect people to deliver something, not just shunt the
>> whole thing off as rejected with 1 day's notice to alter that
>> outcome.

> I don't think this is being fair to Robert.

If we're going to ship a release at all, somebody's got to be willing
to say "no".  Personally, been there, done that, got the t-shirt [1].
Robert's just pointing out what has to be pointed out.

regards, tom lane

[1] http://imageshack.us/photo/my-images/199/b9w.jpg/
