Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Stefan Kaltenbrunner

Tom Lane wrote:

Josh Berkus  writes:

On 3/29/10 7:46 AM, Joachim Wieland wrote:

I actually assume that whenever people are interested
in a very fast dump, it is because they are doing some maintenance
task (like migrating to a different server) that involves pg_dump. In
these cases, they would stop their system anyway.



Actually, I'd say that there's a broad set of cases of people who want
to do a parallel pg_dump while their system is active.  Parallel pg_dump
on a stopped system will help some people (for migration, particularly)
but parallel pg_dump with snapshot cloning will help a lot more people.


I doubt that.  My thought about it is that parallel dump will suck
enough resources from the source server, both disk and CPU, that you
would never want to use it on a live production machine.  Not even at
2am.  And your proposed use case is hardly a "broad set" in any case.
Thus, Joachim's approach seems perfectly sane from here.  I certainly
don't see that there's an argument for spending 10x more development
effort to pick up such use cases.

Another question that's worth asking is exactly what the use case would
be for parallel pg_dump against a live server, whether the snapshots are
synchronized or not.  You will not be able to use that dump as a basis
for PITR, so there is no practical way of incorporating any changes that
occur after the dump begins.  So what are you making it for?  If it's a
routine backup for disaster recovery, fine, but it's not apparent why
you want max speed and to heck with live performance for that purpose.
I think migration to a new server version (that's too incompatible for
PITR or pg_migrate migration) is really the only likely use case.


I really doubt that - on fast systems pg_dump is completely CPU 
bottlenecked, and the typical 1-2U hardware you get these days has 
8-16 cores, so simply dedicating a few cores to dumping the database 
during quieter times is very realistic.
Databases are growing larger and larger, and the single-threaded nature 
of pg_dump makes it very hard to even stay within reasonable time 
limits for doing the backup.



Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Stefan Kaltenbrunner

Alvaro Herrera wrote:

Josh Berkus escribió:


And snapshots have more compile dependencies than releases do.


As far as I know, a snapshot is identical to a "release" in that regard.
If they are not, that's a bug and we can fix it before the weekend.


yeah - snapshots do have the same compile-time dependencies as release 
tarballs do.



Stefan



Re: [HACKERS] GSoC

2010-03-29 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, Mar 30, 2010 at 01:33:59AM +0530, Anindya Jyoti Roy wrote:
> I have some idea of implementing an image database system, which will let 
> you search against the image and fingerprint:

As others have already pointed out, this seems like a huge task for one
summer, unless you already have the image processing parts sorted out. That said, you
might want to have a look at GNU Gift [1], [2]. They seem to have sorted
out the image fingerprinting and comparing bits and have a sound
academic foundation.

Note that the project itself is under the GNU GPL, thus you won't
be able to just copy code from there to build "into" PostgreSQL. But
you can turn to this project for inspiration and algorithms (or you
can realize it as an external project, as others have proposed).

[1] 
[2] 

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFLsZEKBcgs9XrR2kYRAhmYAJ0dwQdYxTq1dOt5dszPKEUk6E6dfQCdFFF7
Nh0cErVA+KgBxsn8O7MSOno=
=zGRr
-END PGP SIGNATURE-



[HACKERS] Questions about 9.0 release note

2010-03-29 Thread Takahiro Itagaki
Hi, I have some questions about 9.0 release note.
I'd like to work on some of them if required. Comments welcome.

* Allow per-tablespace sequential and random page cost variables
  (seq_page_cost/(random_page_cost)) via ALTER TABLESPACE ... SET/RESET
  ^^
Are those parentheses around random_page_cost intentional?
They seem to mean just "(seq_page_cost and random_page_cost)".

* EXCLUDE constraints have no tags to link to.
The item in the release notes points to the following page,
http://developer.postgresql.org/pgdocs/postgres/sql-createtable.html#SQL-CREATETABLE-DESCRIPTION
but the actual description of EXCLUDE constraints is in a subentry
of the "Parameters" section. Can we add a tag to  for EXCLUDE?

* "EXCLUDE constraints" is not indexed from the Index page.
Should we have for it? Unique Constraints have a section for them:
http://developer.postgresql.org/pgdocs/postgres/ddl-constraints.html#AEN2431
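For reference, the per-tablespace cost feature mentioned in the first item can be exercised like this (the tablespace name is made up for illustration):

```sql
-- set per-tablespace cost estimates; "fastspace" is a hypothetical tablespace
ALTER TABLESPACE fastspace SET (seq_page_cost = 0.5, random_page_cost = 1.5);

-- revert one of them to the server-wide default
ALTER TABLESPACE fastspace RESET (random_page_cost);
```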

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: [HACKERS] Parameter name standby_mode

2010-03-29 Thread Fujii Masao
On Wed, Mar 3, 2010 at 9:41 PM, Fujii Masao  wrote:
> On Wed, Feb 24, 2010 at 2:18 PM, Fujii Masao  wrote:
>> If standby_mode is enabled and neither primary_conninfo nor restore_command
>> is set, the standby will get stuck. How about forbidding this wrong setting
>> (i.e., raising a FATAL error)?
>
> Here is the patch which forbids that wrong setting of recovery.conf.

I think that this patch should be applied. Otherwise, if you set neither
primary_conninfo nor restore_command in recovery.conf, the standby server
will do nothing and get stuck, because it doesn't know where to retrieve
the WAL files from. Banning the incorrect setting makes sense to me.
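For illustration, a minimal valid recovery.conf pairs standby_mode with at least one WAL source (the connection string and archive path here are hypothetical):

```
# recovery.conf -- standby_mode alone is not enough
standby_mode = 'on'

# at least one of the following must be set, or the standby cannot make progress
primary_conninfo = 'host=primary.example.com port=5432 user=replication'
restore_command = 'cp /mnt/archive/%f "%p"'
```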

Is anyone going to commit the patch? Does anyone have any comments?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] GSoC

2010-03-29 Thread Jeff Davis
On Tue, 2010-03-30 at 01:33 +0530, Anindya Jyoti Roy wrote:
> I have some idea of implementing an image database system, which will 
> let you search against the image and fingerprint:
> 
> The idea, crudely, is the following:
> 
> I want to implement an image database system. It will have the 
> following qualities:
> 1> it will store images along with the other attributes
> 2> the database search engine will be able to search for images as well
> 3> it will list the matching images in order of degree of match
> 4> in this matching system I will likely divide the 
> image into important parts and match the parts
> 5> the database will also contain fingerprints, which may be the primary 
> key
> 6> it will match the fingerprints on the basis of their nodes, then 
> build a bitmap and match the coordinates within some error margin
> 7> to retain some accuracy with respect to the angle of the picture, 
> equality searches will allow a reasonable amount of error
> 8> this project can be coded by me, as I have some previous experience in 
> this type of project
> 9> this will make a definitely good search engine, as well as being innovative 
> and uncommon

If you plan to spend your time refining complex image processing
algorithms, or learning about image processing as you go, I think the
project will be impossible to finish in one summer. Not only that, you
would need to find someone capable of reviewing such code.

I believe you should focus on a couple of simple image processing
algorithms that you already understand very well. If you have written image
processing code in the past, I believe it could be a good project. The
way I see it, you would need to:

  1. Create a user-defined type for the fingerprint.
  2. Create a user-defined function to transform a raster image into a
fingerprint.
  3. Code to index fingerprints using the GiST or GIN API.
  4. Code to search for fingerprints that match a given fingerprint,
using the GiST or GIN API and a custom operator.
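The steps above could be sketched in SQL along these lines (every name here is invented for illustration, and the C-level functions are assumed rather than shown):

```sql
-- 1. shell type, later completed with C input/output functions
CREATE TYPE fingerprint;

CREATE FUNCTION fingerprint_in(cstring) RETURNS fingerprint
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION fingerprint_out(fingerprint) RETURNS cstring
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;

CREATE TYPE fingerprint (
    INPUT = fingerprint_in,
    OUTPUT = fingerprint_out,
    INTERNALLENGTH = VARIABLE
);

-- 2. transform a raster image (as bytea) into a fingerprint
CREATE FUNCTION image_to_fingerprint(bytea) RETURNS fingerprint
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;

-- 4. a "matches" operator that a GiST/GIN operator class could then index
CREATE FUNCTION fingerprint_match(fingerprint, fingerprint) RETURNS boolean
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT;
CREATE OPERATOR ~= (
    LEFTARG = fingerprint, RIGHTARG = fingerprint,
    PROCEDURE = fingerprint_match
);
```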

After you have the basics completed, you can introduce more
sophisticated fingerprint representations and comparison algorithms.
That would probably be an ongoing effort after the GSoC project is
complete.
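As a toy illustration of what a fingerprint comparison might look like (pure Python, independent of PostgreSQL; the average-hash scheme here is deliberately simplistic):

```python
def average_hash(pixels):
    """Toy fingerprint: one bit per pixel, set if above the mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(a, b):
    """Distance between two fingerprints; lower means a closer match."""
    return sum(x != y for x, y in zip(a, b))

# tiny 2x2 grayscale "images"
img1 = [[10, 200], [30, 220]]
img2 = [[12, 198], [28, 225]]   # slightly perturbed copy of img1
img3 = [[200, 10], [220, 30]]   # mirrored image

f1, f2, f3 = (average_hash(i) for i in (img1, img2, img3))
print(hamming(f1, f2))  # 0 -- near-duplicates share the fingerprint
print(hamming(f1, f3))  # 4 -- the mirrored image differs in every bit
```

A GiST or GIN operator class would organize such fingerprints so that low-distance candidates can be found without scanning every row.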

Regards,
Jeff Davis






Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Alvaro Herrera
Josh Berkus escribió:

> And snapshots have more compile dependencies than releases do.

As far as I know, a snapshot is identical to a "release" in that regard.
If they are not, that's a bug and we can fix it before the weekend.

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] GSoC

2010-03-29 Thread Jeff Davis
On Mon, 2010-03-29 at 19:08 -0400, Robert Haas wrote:
> To be really useful, this would probably need
> some kind of index support, maybe based on GIN.

Yes.

>   You're not going to
> get that done by August even if you start now and don't sleep.

The extensibility mechanism is designed so that an expert in one field
can add interesting operators and indexing strategies with a reasonable
amount of effort. If he already understands the image processing aspect
well (and has at least some working code somewhere), I would like to
think that it's possible to develop very basic support in PostgreSQL in
a few months.

It doesn't need to be as complete as PostGIS after one summer of work.
It just needs to be able to search images with basic functionality, and
the rest can be added later after he's already hooked on PostgreSQL ;)

>  Even
> if you did, my guess is that this is the sort of thing that the PG
> community would prefer to see maintained as a separate project that
> integrates with PG, rather than a part of core PG.

That isn't necessarily a bad thing. PostGIS is an incredibly successful
extension that is developed outside the project. It may be a problem
with GSoC unless we have some sort of community buy-in, but from a
technical standpoint I don't see a problem here.

Regards,
Jeff Davis




Re: [HACKERS] GSoC

2010-03-29 Thread Dann Corbit
Maybe this can be helpful:
http://sourceforge.net/search/?type_of_search=soft&words=fingerprint
http://sourceforge.net/search/?type_of_search=soft&words=image+recognition


> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
> ow...@postgresql.org] On Behalf Of Anindya Jyoti Roy
> Sent: Monday, March 29, 2010 1:04 PM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] GSoC
> 
> I have some idea of implementing an image database system, which will
> let you search against the image and fingerprint:
> 
> The idea, crudely, is the following:
> 
> I want to implement an image database system. It will have the
> following qualities:
> 1> it will store images along with the other attributes
> 2> the database search engine will be able to search for images as well
> 3> it will list the matching images in order of degree of match
> 4> in this matching system I will likely divide the image into
> important parts and match the parts
> 5> the database will also contain fingerprints, which may be the
> primary key
> 6> it will match the fingerprints on the basis of their nodes, then
> build a bitmap and match the coordinates within some error margin
> 7> to retain some accuracy with respect to the angle of the picture,
> equality searches will allow a reasonable amount of error
> 8> this project can be coded by me, as I have some previous experience
> in this type of project
> 9> this will make a definitely good search engine, as well as being
> innovative and uncommon
> 
> 
> 
> About me:
> 
> I am a sophomore at the Indian Institute of Technology, Kanpur, in CSE
> BTech. I have a good academic record to date and a keen interest in data
> mining, machine learning, and image processing. I am attaching a short
> resume with this mail, hoping it may help you to judge my credibility. I
> am strong in coding and maths and able to solve problems within a given
> timeline.
> 
> I can give a more detailed explanation of my project, but I just wanted
> to give an outline here.
> 
> Let me know your feedback.
> Thanks for your time.
> 
> Yours faithfully,
> 
>  Anindya Jyoti Roy
>  Sophomore
>  CSE BTech
>  IIT Kanpur



Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Magnus Hagander
The last I heard from Dave on that topic is that there's no chance of that
happening that quickly. He's on a plane now, but I'm sure he'll confirm that
when he lands.

/Magnus

On Mar 29, 2010 6:14 PM, "Peter Eisentraut"  wrote:

On sön, 2010-03-28 at 19:22 -0400, Robert Haas wrote:
> On Sun, Mar 28, 2010 at 4:40 PM, Josh Berkus...
Feel free to do it.  It's documented at
.  Ask me if
something is unclear.

But as was said downthread, getting someone to do the Windows installer
would be good.





Re: [HACKERS] GSoC

2010-03-29 Thread Josh Berkus

> The idea crudely is the following:
> 
> I want to implement a image database system: This will have the
> following quality:

This seems fine, but I believe it's been done before.  You might want to
search for existing projects, and make an improvement on an existing
project rather than starting from scratch.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] GSoC

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 4:03 PM, Anindya Jyoti Roy  wrote:
> I want to implement an image database system. It will have the following
> qualities:
> 1> it will store images along with the other attributes
> 2> the database search engine will be able to search for images as well
> 3> it will list the matching images in order of degree of match
> 4> in this matching system I will likely divide the image into
> important parts and match the parts
> 5> the database will also contain fingerprints, which may be the primary
> key
> 6> it will match the fingerprints on the basis of their nodes, then build
> a bitmap and match the coordinates within some error margin
> 7> to retain some accuracy with respect to the angle of the picture,
> equality searches will allow a reasonable amount of error
> 8> this project can be coded by me, as I have some previous experience in
> this type of project
> 9> this will make a definitely good search engine, as well as being innovative
> and uncommon

While you can certainly use PostgreSQL's extensible type system to
implement something like this, you'll probably need a good deal more
help and guidance from someone who is an expert on image matching
algorithms than you will from pgsql-hackers.  But feel free to reach
out to this community for help with the PG bits.

Having said that, I wouldn't suggest submitting this as a GSoC
project.  It sounds far too hard to be done well (or maybe even
poorly) in one summer.  To be really useful, this would probably need
some kind of index support, maybe based on GIN.  You're not going to
get that done by August even if you start now and don't sleep.  Even
if you did, my guess is that this is the sort of thing that the PG
community would prefer to see maintained as a separate project that
integrates with PG, rather than a part of core PG.

See also:

http://archives.postgresql.org/pgsql-hackers/2010-03/msg01034.php
http://archives.postgresql.org/pgsql-hackers/2010-03/msg01116.php

...Robert



Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Peter Eisentraut
On sön, 2010-03-28 at 19:22 -0400, Robert Haas wrote:
> On Sun, Mar 28, 2010 at 4:40 PM, Josh Berkus  wrote:
> > We've got two locations and some individuals signed up for a test-fest
> > this weekend.  Would it be possible to do an alpha release this week?
> > It would really help to be testing later code than Alpha4.
> 
> I'm willing to do the CVS bits, if that's helpful.  Or maybe Peter
> wants to do it.  Anyway I have no problem with the idea.
> 
> ...Robert
> 

Feel free to do it.  It's documented at
.  Ask me if
something is unclear.

But as was said downthread, getting someone to do the Windows installer
would be good.




Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Jaime Casanova
On Mon, Mar 29, 2010 at 3:57 PM, Robert Haas  wrote:
> On Mon, Mar 29, 2010 at 3:55 PM, Tom Lane  wrote:
>> Simon Riggs  writes:
>>> On Mon, 2010-03-29 at 13:31 -0600, Alex Hunsaker wrote:
 On Mon, Mar 29, 2010 at 12:17, Simon Riggs  wrote:
> You're not addressing the original point. I have been asked how would
> users know which tables have been removed and whether there is a way of
> checking that.

 Uhh... they won't be in the EXPLAIN output...  Seems simple enough.
>>
>>> That is exactly what I replied, though nobody felt that was a great
>>> answer.
>>
>> Who complained about that exactly?  It seems like a perfectly
>> appropriate answer to me.
>
> Jaime Casanova.
>

I'm just finishing reading a 347-row EXPLAIN ANALYZE, so yes, I'd
rather have a DEBUG message than hunt for missing rels.

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157



Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 4:40 PM, Stefan Kaltenbrunner
 wrote:
> yeah, but you also need people to change the website - and, probably more
> important given that Josh wants Windows as well, help from Dave for doing a
> new Windows installer :)

True...  well, I can't help with those bits.  :-)

...Robert



Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Andrew Dunstan



Robert Haas wrote:


It's completely possible that you could want to clone a server for dev
and have more CPU and I/O bandwidth available than can be efficiently
used by a non-parallel pg_dump.  But certainly what Joachim is talking
about will be a good start.  I think there is merit to the
synchronized snapshot stuff for pg_dump and perhaps other applications
as well, but I think Joachim's (well-taken) point is that we don't
have to treat it as a hard prerequisite.


  


Possibly. I think the most useful thing that could be done right now is 
probably the least controversial, namely creating a directory type of 
archive, with support for pg_restore, including parallel pg_restore.


Personally I think that's worth doing in its own right anyway.

cheers

andrew



Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Stefan Kaltenbrunner

Robert Haas wrote:

On Mon, Mar 29, 2010 at 2:52 PM, Josh Berkus  wrote:

or just use a specific recent snapshot and let people test that just in
case it is not feasible doing a new alpha on short notice.

Doesn't work if we want to test it on Windows.  And snapshots have more
compile dependencies than releases do.

Also ... this isn't short notice.  I requested a new alpha, this week, 2
weeks ago.


Also, I already said I would do it (unless another committer wants
to).  Somebody just has to tell me what they want done and when.
Presumably that means providing a patch to the release notes for me to
check in, a date to make the branch, and where they want the tarball
put.


yeah, but you also need people to change the website - and, probably more 
important given that Josh wants Windows as well, help from Dave for doing 
a new Windows installer :)



Stefan



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Alex Hunsaker  writes:
> It comes to mind that you can probably do this with a planner hook
> (whatever happened to those hooks to auto create/recommend indexes?)

Uh, we made some hooks to let someone build an *external* module that
would do such a thing.  Whether anybody got anywhere with it I dunno.
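For reference, a minimal sketch of such an external module using the planner_hook of that era (module and function names are invented; this assumes the 8.4/9.0-era hook signature and would be built against the server headers, not standalone):

```c
/* sketch of an external module hooking the planner; illustrative only */
#include "postgres.h"
#include "fmgr.h"
#include "optimizer/planner.h"

PG_MODULE_MAGIC;

static planner_hook_type prev_planner_hook = NULL;

static PlannedStmt *
advisor_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
    PlannedStmt *result;

    /* chain to any previously installed hook, else the standard planner */
    if (prev_planner_hook)
        result = prev_planner_hook(parse, cursorOptions, boundParams);
    else
        result = standard_planner(parse, cursorOptions, boundParams);

    /* here one could inspect 'result', e.g. to recommend indexes or to
     * compare range tables and report joins the planner removed */
    return result;
}

void
_PG_init(void)
{
    prev_planner_hook = planner_hook;
    planner_hook = advisor_planner;
}
```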

regards, tom lane



Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 4:11 PM, Tom Lane  wrote:
> Josh Berkus  writes:
>> On 3/29/10 7:46 AM, Joachim Wieland wrote:
>>> I actually assume that whenever people are interested
>>> in a very fast dump, it is because they are doing some maintenance
>>> task (like migrating to a different server) that involves pg_dump. In
>>> these cases, they would stop their system anyway.
>
>> Actually, I'd say that there's a broad set of cases of people who want
>> to do a parallel pg_dump while their system is active.  Parallel pg_dump
>> on a stopped system will help some people (for migration, particularly)
>> but parallel pg_dump with snapshot cloning will help a lot more people.
>
> I doubt that.  My thought about it is that parallel dump will suck
> enough resources from the source server, both disk and CPU, that you
> would never want to use it on a live production machine.  Not even at
> 2am.  And your proposed use case is hardly a "broad set" in any case.
> Thus, Joachim's approach seems perfectly sane from here.  I certainly
> don't see that there's an argument for spending 10x more development
> effort to pick up such use cases.
>
> Another question that's worth asking is exactly what the use case would
> be for parallel pg_dump against a live server, whether the snapshots are
> synchronized or not.  You will not be able to use that dump as a basis
> for PITR, so there is no practical way of incorporating any changes that
> occur after the dump begins.  So what are you making it for?  If it's a
> routine backup for disaster recovery, fine, but it's not apparent why
> you want max speed and to heck with live performance for that purpose.
> I think migration to a new server version (that's too incompatible for
> PITR or pg_migrate migration) is really the only likely use case.

It's completely possible that you could want to clone a server for dev
and have more CPU and I/O bandwidth available than can be efficiently
used by a non-parallel pg_dump.  But certainly what Joachim is talking
about will be a good start.  I think there is merit to the
synchronized snapshot stuff for pg_dump and perhaps other applications
as well, but I think Joachim's (well-taken) point is that we don't
have to treat it as a hard prerequisite.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs
On Mon, 2010-03-29 at 13:54 -0600, Alex Hunsaker wrote:

> whatever happened to those hooks to auto create/recommend indexes?

You haven't published them yet??

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Tom Lane
Dimitri Fontaine  writes:
> Tom Lane  writes:
>> If this is going to end up being one fairly small C file implementing
>> a spec that is not a moving target, I'd vote against depending on an
>> external library instead, no matter how spiffy and license-compatible
>> the external library might be.

> My understanding is that it's possible to include (fork) MIT- or
> BSD-licensed source code into our source tree, right? (Some other
> licenses certainly apply too.)

MIT or 2-clause BSD would be ok for such a thing, other licenses
probably not.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs
On Mon, 2010-03-29 at 15:55 -0400, Tom Lane wrote:
> Simon Riggs  writes:
> > On Mon, 2010-03-29 at 13:31 -0600, Alex Hunsaker wrote:
> >> On Mon, Mar 29, 2010 at 12:17, Simon Riggs  wrote:
> >>> You're not addressing the original point. I have been asked how would
> >>> users know which tables have been removed and whether there is a way of
> >>> checking that.
> >> 
>>> Uhh... they won't be in the EXPLAIN output...  Seems simple enough.
> 
> > That is exactly what I replied, though nobody felt that was a great
> > answer.
> 
> Who complained about that exactly?  It seems like a perfectly
> appropriate answer to me.

I'm relaying feedback from others not on this list. People expect me to
do this. I shouldn't need to name them for us to accept the feedback,
nor should there be doubt that I relay this accurately (why else would I
raise the subject?!?). If it comes from me, I say so.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Tom Lane
Josh Berkus  writes:
> On 3/29/10 7:46 AM, Joachim Wieland wrote:
>> I actually assume that whenever people are interested
>> in a very fast dump, it is because they are doing some maintenance
>> task (like migrating to a different server) that involves pg_dump. In
>> these cases, they would stop their system anyway.

> Actually, I'd say that there's a broad set of cases of people who want
> to do a parallel pg_dump while their system is active.  Parallel pg_dump
> on a stopped system will help some people (for migration, particularly)
> but parallel pg_dump with snapshot cloning will help a lot more people.

I doubt that.  My thought about it is that parallel dump will suck
enough resources from the source server, both disk and CPU, that you
would never want to use it on a live production machine.  Not even at
2am.  And your proposed use case is hardly a "broad set" in any case.
Thus, Joachim's approach seems perfectly sane from here.  I certainly
don't see that there's an argument for spending 10x more development
effort to pick up such use cases.

Another question that's worth asking is exactly what the use case would
be for parallel pg_dump against a live server, whether the snapshots are
synchronized or not.  You will not be able to use that dump as a basis
for PITR, so there is no practical way of incorporating any changes that
occur after the dump begins.  So what are you making it for?  If it's a
routine backup for disaster recovery, fine, but it's not apparent why
you want max speed and to heck with live performance for that purpose.
I think migration to a new server version (that's too incompatible for
PITR or pg_migrate migration) is really the only likely use case.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 3:55 PM, Tom Lane  wrote:
> Simon Riggs  writes:
>> On Mon, 2010-03-29 at 13:31 -0600, Alex Hunsaker wrote:
>>> On Mon, Mar 29, 2010 at 12:17, Simon Riggs  wrote:
 You're not addressing the original point. I have been asked how would
 users know which tables have been removed and whether there is a way of
 checking that.
>>>
>>> Uhh... they won't be in the EXPLAIN output...  Seems simple enough.
>
>> That is exactly what I replied, though nobody felt that was a great
>> answer.
>
> Who complained about that exactly?  It seems like a perfectly
> appropriate answer to me.

Jaime Casanova.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Simon Riggs  writes:
> On Mon, 2010-03-29 at 13:31 -0600, Alex Hunsaker wrote:
>> On Mon, Mar 29, 2010 at 12:17, Simon Riggs  wrote:
>>> You're not addressing the original point. I have been asked how would
>>> users know which tables have been removed and whether there is a way of
>>> checking that.
>> 
>> Uhh... they won't be in the EXPLAIN output...  Seems simple enough.

> That is exactly what I replied, though nobody felt that was a great
> answer.

Who complained about that exactly?  It seems like a perfectly
appropriate answer to me.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Alex Hunsaker
On Mon, Mar 29, 2010 at 13:41, Simon Riggs  wrote:
>> Uhh... they won't be in the EXPLAIN output...  Seems simple enough.
>
> That is exactly what I replied, though nobody felt that was a great
> answer.

Maybe I missed it, but why exactly do they care?  About the only
reason I can think *I* would care is: if I were running the same SQL
on multiple database products (like MySQL) and thought "Sweet, now I
can use this new feature to clean up my SQL so it runs better on
product X or version of Postgres Y".  Is there some reason other
than that it would be _cool_ to know?  Or is it FUD that it might be
buggy, and so they wish to be able to turn it off?

It comes to mind that you can probably do this with a planner hook.
(Whatever happened to those hooks to auto create/recommend indexes?)



Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Dimitri Fontaine
Tom Lane  writes:
> If this is going to end up being one fairly small C file implementing
> a spec that is not a moving target, I'd vote against depending on an
> external library instead, no matter how spiffy and license-compatible
> the external library might be.

My understanding is that it's possible to include (fork) MIT- or
BSD-licensed source code into our source tree, right? (Some other
licenses certainly apply too.)

Regards,
-- 
dim



Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Josh Berkus
On 3/29/10 7:46 AM, Joachim Wieland wrote:
> I actually assume that whenever people are interested
> in a very fast dump, it is because they are doing some maintenance
> task (like migrating to a different server) that involves pg_dump. In
> these cases, they would stop their system anyway.

Actually, I'd say that there's a broad set of cases of people who want
to do a parallel pg_dump while their system is active.  Parallel pg_dump
on a stopped system will help some people (for migration, particularly)
but parallel pg_dump with snapshot cloning will help a lot more people.

For example, imagine a user who has a 16-core machine on a 14-drive RAID
10, and a 100-table 1TB database.  At 2am, this person might reasonably
want to allocate a large portion of the machine's resources to the dump by
giving it 4 threads, without cutting off access to the application.

So: if parallel dump in single-user mode is what you can get done, then
do it.  We can always improve it later, and we have to start somewhere.
 But we will eventually need parallel pg_dump on active systems, and
that should remain on the TODO list.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 2:52 PM, Josh Berkus  wrote:
>
>> or just use a specific recent snapshot and let people test that just in
>> case it is not feasible doing a new alpha on short notice.
>
> Doesn't work if we want to test it on Windows.  And snapshots have more
> compile dependencies than releases do.
>
> Also ... this isn't short notice.  I requested a new alpha, this week, 2
> weeks ago.

Also, I already said I would do it (unless another committer wants
to).  Somebody just has to tell me what they want done and when.
Presumably that means providing a patch to the release notes for me to
check in, a date to make the branch, and where they want the tarball
put.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs
On Mon, 2010-03-29 at 13:31 -0600, Alex Hunsaker wrote:
> On Mon, Mar 29, 2010 at 12:17, Simon Riggs  wrote:
> > You're not addressing the original point. I have been asked how users
> > would know which tables have been removed and whether there is a way of
> > checking that.
> 
> Uhh... they won't be in the explain output...  Seems simple enough.

That is exactly what I replied, though nobody felt that was a great
answer.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Alex Hunsaker
On Mon, Mar 29, 2010 at 12:17, Simon Riggs  wrote:
> You're not addressing the original point. I have been asked how users
> would know which tables have been removed and whether there is a way of
> checking that.

Uhh... they won't be in the explain output...  Seems simple enough.



Re: [HACKERS] booleans in recovery.conf

2010-03-29 Thread Simon Riggs
On Tue, 2010-03-23 at 19:43 -0400, Bruce Momjian wrote:
> Is there a reason that recovery.conf uses true/false, while
> postgresql.conf uses on/off?
> 
>   #recovery_target_inclusive = 'true' # 'true' or 'false'
> 
> or are these settings more boolean for some reason?

The code accepts any of on|off|true|false and uses the same parsing
code as postgresql.conf.

I've changed the standby_mode to on|off as per the docs. I left the
above parameter because true|false reads better.
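As an illustration only (the server's real GUC parser is C code and is more permissive than this), the accepted spellings described above behave like the following sketch; the function name is made up:

```python
def parse_recovery_bool(value):
    """Accept any of on/off/true/false, case-insensitively, the way
    the recovery.conf booleans described above are parsed."""
    v = value.strip().lower()
    if v in ("on", "true"):
        return True
    if v in ("off", "false"):
        return False
    raise ValueError("invalid boolean value: %r" % value)

# Both spellings of each value map to the same setting.
print(parse_recovery_bool("true"), parse_recovery_bool("off"))  # True False
```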

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Josh Berkus

> or just use a specific recent snapshot and let people test that just in
> case it is not feasible doing a new alpha on short notice.

Doesn't work if we want to test it on Windows.  And snapshots have more
compile dependencies than releases do.

Also ... this isn't short notice.  I requested a new alpha, this week, 2
weeks ago.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Joseph Adams
On Mon, Mar 29, 2010 at 2:23 PM, David E. Wheeler  wrote:
> On Mar 29, 2010, at 9:02 AM, Tom Lane wrote:
>
>> If this is going to end up being one fairly small C file implementing
>> a spec that is not a moving target, I'd vote against depending on an
>> external library instead, no matter how spiffy and license-compatible
>> the external library might be.
>
> Perhaps you could fork one, in that case.
>
> Best,
>
> David
>
>

I'm considering using and adapting cJSON instead of continuing with my
redundant implementation.  I could run `indent -kr -i4` on it (will
that match PostgreSQL's coding style?), add support for UTF-16
surrogate pairs (a pair of \u... escapes for each character above
U+FFFF, as required by the JSON spec), and add a switch to turn pure
ASCII output on/off.
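The surrogate-pair requirement mentioned above can be sketched in a few lines; this illustrates the JSON-spec rule only and is not code from cJSON or the patch (the function name is made up):

```python
def json_escape_codepoint(cp):
    """Return the JSON \\uXXXX escape(s) for one Unicode code point.

    Code points above U+FFFF do not fit in a single \\uXXXX escape,
    so the JSON spec requires writing them as a UTF-16 surrogate pair.
    """
    if cp <= 0xFFFF:
        return "\\u%04x" % cp
    # Split into a high (U+D800..U+DBFF) and low (U+DC00..U+DFFF) surrogate.
    cp -= 0x10000
    high = 0xD800 + (cp >> 10)
    low = 0xDC00 + (cp & 0x3FF)
    return "\\u%04x\\u%04x" % (high, low)

# U+1D11E (MUSICAL SYMBOL G CLEF) becomes the pair \ud834\udd1e
print(json_escape_codepoint(0x1D11E))
```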

P.S.: Sorry for the repeat, David.  I forgot to CC the mailing list.



Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread David E. Wheeler
On Mar 29, 2010, at 9:02 AM, Tom Lane wrote:

> If this is going to end up being one fairly small C file implementing
> a spec that is not a moving target, I'd vote against depending on an
> external library instead, no matter how spiffy and license-compatible
> the external library might be.

Perhaps you could fork one, in that case.

Best,

David




Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs
On Mon, 2010-03-29 at 11:46 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > OK, I'll write a patch for that; and if a consensus emerges that we
> > should also have enable_joinremoval, then I will add that as well.  I
> > think the only argument for NOT having enable_joinremoval is that you
> > can always modify the query to say SELECT * rather than some more
> > specific SELECT list,
> 
> Uh, no, the argument for not having enable_joinremoval is that it's
> useless.
> 
> In particular, I categorically deny the argument that putting it in will
> reduce user confusion.  If anyone is confused because EXPLAIN shows that
> some table isn't getting joined to, you think that the fact that
> somewhere in the manual is a mention of enable_joinremoval will
> un-confuse them?  If they knew that switch was there or what it did,
> they wouldn't be confused to begin with.

You're not addressing the original point. I have been asked how users
would know which tables have been removed and whether there is a way of
checking that. That is not a request for a tuning feature, or something
to reduce user confusion. If you don't like "enable_joinremoval", that's
fine, but it would be good to answer the original request with an
alternative proposal.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 1:16 PM, Stefan Kaltenbrunner
 wrote:
> Robert Haas wrote:
>>
>> On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland  wrote:
>
> [...]
>>>
>>> - Regarding the output of pg_dump I am proposing two solutions. The
>>> first one is to introduce a new archive type "directory" where each
>>> table and each blob is a file in a directory, similar to the
>>> experimental "files" archive type. Also the idea has come up that you
>>> should be able to specify multiple directories in order to make use of
>>> several physical disk drives. Thinking this further, in order to
>>> manage all the mess that you can create with this, every file of the
>>> same backup needs to have a unique identifier and pg_restore should
>>> have a check parameter that tells you if your backup directory is in a
>>> sane and complete state (think about moving a file from one backup
>>> directory to another one or trying to restore from two directories
>>> which are from different backup sets...).
>>
>> I think that specifying several directories is a piece of complexity
>> that would be best left alone for a first version of this.  But a
>> single directory with multiple files sounds pretty reasonable.  Of
>> course we'll also need to support that format in non-parallel mode,
>> and in pg_restore.
>>
>>> The second solution to the single-file-problem is to generate no
>>> output at all, i.e. whatever you export from your source database you
>>> import directly into your target database, which in the end turns out
>>> to be a parallel form of "pg_dump | psql".
>>
>> This is a very interesting idea but you might want to get the other
>> thing merged first, as it's going to present a different set of
>> issues.
>
> I had some prior discussion with Joachim on this (and I suspect I had some
> influence on his trying to implement it).
> The reason why this is really needed is that the current pg_restore -j is
> actually a net loss (vs "pg_dump | psql") in a LOT of scenarios that are
> basically "duplicate this database to that location" (or any migration
> really).
> The example at hand is a 240GB production database with around 850 tables; it
> takes ~145min to dump that database single-threaded (completely CPU bound),
> simply loading the SQL using psql can restore it in ~150min (again CPU bound
> both for COPY and index creation), and -j8 brings that down to ~55min.
> So if you do the math (and a bit of handwaving):
>
> * using pg_dump | psql you get greatest(145,150) -> 150min
> * using pg_dump -Z0 -Fc && pg_restore -j8 you get 145+55 -> 200min
> * using a theoretical parallel pg_dump and the existing parallel restore you
> would get 50 (just a guess at how fast it might be) + 55 -> 105min
> * a parallel dump & restore that can pipeline would end up at
> greatest(50,55) -> 55min
>
> So a parallel dump alone would only give you a 50% speedup in total time for
> doing a migration/upgrade/dump-to-devbox, despite the fact that it uses 8x
> the resources.  A pipelined solution would result in a ~3x speedup in total
> time, and you don't even have to think about problems like having available
> disk space on the source/destination to hold a full temporary dump (if you
> don't, you might even have to add some transfer time as well).

It's a great idea - but there are two features here.  I've seen many
patches implementing two features during my relatively short time with
the project and if the rejection rate hasn't been 100% it's certainly
been close.  If Joachim thinks he's got it all working, by all means
submit both patches.  One can apply over the other if they are
interdependent.  But I STRONGLY suggest separating this into two
pieces - it is MUCH easier to get things applied that way, for good
and valid reasons.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs
On Mon, 2010-03-29 at 09:37 -0400, Robert Haas wrote:

> In many
> cases it's not horribly difficult to work around because you can do
> SELECT * FROM ... instead of your original select list, but there
> might be some cases with multiple levels of views where it isn't that
> easy.  I think it would be good to add this back.

People have asked me how they would know whether join removal has
worked. If there is a workaround that works for most cases, I think that
might be all we need. Do you have an example where that doesn't work, so
we can judge how common that would be? 

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Josh Berkus
On 3/28/10 8:52 PM, Hitoshi Harada wrote:
> There's another choice, called BSON.
>
> http://www.mongodb.org/display/DOCS/BSON
>
> I've not yet researched it deeply, but it seems reasonable to store in
> databases, as it was invented for MongoDB.

I wouldn't take that for granted.  The MongoDB project involves a lot of
"re-inventing the wheel" and I'd scrutinize any of their innovations
pretty thoroughly.

Besides, I thought the point of a JSON type was to be compatible with
the *majority* of JSON users?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs
On Mon, 2010-03-29 at 11:27 -0400, Tom Lane wrote:
> Robert Haas  writes:
> > And for the record, I believe I find it rather amusing that you're
> > asking me if I "have the faintest idea how many there would be".
> 
> Actually that was directed more at Simon.

I think you should drop the personal comments. They have no place here.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Stefan Kaltenbrunner

Robert Haas wrote:

On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland  wrote:

[...]

- Regarding the output of pg_dump I am proposing two solutions. The
first one is to introduce a new archive type "directory" where each
table and each blob is a file in a directory, similar to the
experimental "files" archive type. Also the idea has come up that you
should be able to specify multiple directories in order to make use of
several physical disk drives. Thinking this further, in order to
manage all the mess that you can create with this, every file of the
same backup needs to have a unique identifier and pg_restore should
have a check parameter that tells you if your backup directory is in a
sane and complete state (think about moving a file from one backup
directory to another one or trying to restore from two directories
which are from different backup sets...).


I think that specifying several directories is a piece of complexity
that would be best left alone for a first version of this.  But a
single directory with multiple files sounds pretty reasonable.  Of
course we'll also need to support that format in non-parallel mode,
and in pg_restore.


The second solution to the single-file-problem is to generate no
output at all, i.e. whatever you export from your source database you
import directly into your target database, which in the end turns out
to be a parallel form of "pg_dump | psql".


This is a very interesting idea but you might want to get the other
thing merged first, as it's going to present a different set of
issues.


I had some prior discussion with Joachim on this (and I suspect I had 
some influence on his trying to implement it).
The reason why this is really needed is that the current pg_restore -j 
is actually a net loss (vs "pg_dump | psql") in a LOT of scenarios that 
are basically "duplicate this database to that location" (or any 
migration really).
The example at hand is a 240GB production database with around 850 
tables; it takes ~145min to dump that database single-threaded 
(completely CPU bound), simply loading the SQL using psql can restore 
it in ~150min (again CPU bound both for COPY and index creation), and 
-j8 brings that down to ~55min.

So if you do the math (and a bit of handwaving):

* using pg_dump | psql you get greatest(145,150) -> 150min
* using pg_dump -Z0 -Fc && pg_restore -j8 you get 145+55 -> 200min
* using a theoretical parallel pg_dump and the existing parallel restore 
you would get 50 (just a guess at how fast it might be) + 55 -> 105min
* a parallel dump & restore that can pipeline would end up at 
greatest(50,55) -> 55min

So a parallel dump alone would only give you a 50% speedup in total time 
for doing a migration/upgrade/dump-to-devbox, despite the fact that it 
uses 8x the resources.  A pipelined solution would result in a ~3x 
speedup in total time, and you don't even have to think about problems 
like having available disk space on the source/destination to hold a 
full temporary dump (if you don't, you might even have to add some 
transfer time as well).
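The back-of-the-envelope arithmetic above can be checked directly; this sketch just encodes the stated timings (the 50 min parallel-dump figure is, as the mail says, only a guess):

```python
# Measured / guessed timings, in minutes, from the example above.
dump_single = 145    # single-threaded pg_dump, CPU bound
restore_psql = 150   # plain "psql < dump"
restore_j8 = 55      # pg_restore -j8
dump_parallel = 50   # hypothetical parallel pg_dump (a guess)

# pg_dump | psql: both sides run concurrently, so the slower one dominates.
pipe_today = max(dump_single, restore_psql)          # 150 min
# Dump to an archive, then restore in parallel: steps run back to back.
dump_then_restore = dump_single + restore_j8         # 200 min
# Parallel dump, then parallel restore.
par_dump_then_restore = dump_parallel + restore_j8   # 105 min
# Pipelined parallel dump & restore: again the slower side dominates.
pipelined = max(dump_parallel, restore_j8)           # 55 min

print(pipe_today, dump_then_restore, par_dump_then_restore, pipelined)
# -> 150 200 105 55
```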




Stefan



Re: [HACKERS] Using HStore type in TSearch

2010-03-29 Thread Łukasz Dejneka
2010/3/29 Tom Lane 

> Łukasz Dejneka  writes:
> > What I want to do is to use HStore data type (namely the HStore keys) in
> > TSearch. I don't want to use TSVector and the original ts_match_vq function,
> > because I want to be able to control exact values that are passed to the
> > search and not use their lexemes.
>
> If that's what you're after, why don't you set up a text search
> configuration in which the parser/dictionary are trivial and do no
> transformations of the strings (beyond perhaps splitting at whitespace)?
> Seems a lot easier than constructing your own datatype and all the
> required support functions.
>
>regards, tom lane
>

Hi,

Thanks for the idea, but unfortunately it's not an option for me... This
needs to be extensible in the near future, so it has to be coded as a
separate function :/


Re: [HACKERS] proposal - structured funcid and lineno as new fields in error message

2010-03-29 Thread Pavel Stehule
2010/3/29 Tom Lane :
> Pavel Stehule  writes:
>> can we add well structured information about function id and lineno to
>> ErrorData?
>
> The idea that I was toying with was to report the function OID and line
> number, which might as well be two separate fields rather than messing
> around with anything "structured".
>
> The OID might be a bit inconvenient from the client side, but the
> trouble with trying to do more is that constructing a complete function
> descriptor will require catalog lookups, which is exactly what you don't
> want to be doing in an already-failed transaction.  (We just fixed some
> bugs along that line :-()
>
> In any case, the real problem we have is not so much that we lack error
> message fields: the messages we emit for plpgsql syntax errors are quite
> complete already.  The work that is needed is to provide that same
> infrastructure for run-time errors.

I was thinking about it as a tool for admin software. But it is a
little more complex than I thought :(. Often we don't need the OID of
the very last function - mostly that will be some C function - so we
need something like a stack. I understand that we have to roll back
before any use of the OID. But we have to do that in all cases - the
OID alone is useless, we need the source code - so a rollback is
necessary. This information can exist together with the current
information - e.g. show some fast info before the rollback and show
more detailed info after the rollback. Some parts of the error data
are saved before the rollback - but that is a task for the client.

Regards
Pavel Stehule

>
>                        regards, tom lane
>



Re: [HACKERS] proposal - structured funcid and lineno as new fields in error message

2010-03-29 Thread Tom Lane
Pavel Stehule  writes:
> can we add well structured information about function id and lineno to
> ErrorData?

The idea that I was toying with was to report the function OID and line
number, which might as well be two separate fields rather than messing
around with anything "structured".

The OID might be a bit inconvenient from the client side, but the
trouble with trying to do more is that constructing a complete function
descriptor will require catalog lookups, which is exactly what you don't
want to be doing in an already-failed transaction.  (We just fixed some
bugs along that line :-()

In any case, the real problem we have is not so much that we lack error
message fields: the messages we emit for plpgsql syntax errors are quite
complete already.  The work that is needed is to provide that same
infrastructure for run-time errors.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Greg Smith

Tom Lane wrote:

The problem with this line of thought is that it imagines you can look
at worked-out alternative plans.  You can't, because the planner doesn't
pursue rejected alternatives that far (and you'd not want to wait long
enough for it to do so...)
  


Not on any production system, sure.  I know plenty of people who would 
gladly let a rejected plan enumerator run for *a day* on their 
development box if it let them figure out exactly why the costing on the 
plan they expected ended up higher than the plan they actually get.  
While I know you don't run into this, regular people can easily spend a 
week on one such problem without gaining even that much insight, given 
the current level of instrumentation and diagnostic tools available.  
"Read the source" and "ask Tom" are both effective ways to resolve that 
but have their limits.  (Not because of you, of course--my bigger 
problem is people who just can't share their plans with the lists for 
privacy or security reasons.)


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 12:17 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Mar 29, 2010 at 12:03 PM, Greg Smith  wrote:
>>> at the top of my list in this area for 9.1 is to track down the rumored
>>> patch that exports information about the rejected plans considered and get
>>> that committed.  That always seems to be what I want to look at for
>>> answering the question "why this plan instead of what I was expecting?"
>
>> Having looked at that patch, I am skeptical of it, but we can
>> certainly take a fresh look.
>
> The problem with this line of thought is that it imagines you can look
> at worked-out alternative plans.  You can't, because the planner doesn't
> pursue rejected alternatives that far (and you'd not want to wait long
> enough for it to do so...)

Right.  And that's not what the patch did.  But a detailed discussion
of this topic should be (a) conducted on a separate thread and (b)
occur after we've all refamiliarized ourselves with it.

...Robert



[HACKERS] psql: edit function, show function commands patch

2010-03-29 Thread Pavel Stehule
Hello

This simple patch allows specifying the cursor row when a function is
opened in the editor.

\e aaa.txt
\e aaa.txt 3 ... move cursor to 3rd line of text
\ef foo
\ef foo 3 ... move cursor to 3rd line of function body
\sf foo ... show function body
\sf+ foo ... show function body with line numbers
\sf[+] foo n ... show function body from line n

postgres=# \sf foo

CREATE OR REPLACE FUNCTION public.foo()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
begin
  return 10/0;
end;
$function$

postgres=# \sf+ foo

        CREATE OR REPLACE FUNCTION public.foo()
         RETURNS integer
         LANGUAGE plpgsql
   1    AS $function$
   2    begin
   3      return 10/0;
   4    end;
        $function$

Regards
Pavel Stehule


editfce.diff
Description: Binary data



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Robert Haas  writes:
> On Mon, Mar 29, 2010 at 12:03 PM, Greg Smith  wrote:
>> at the top of my list in this area for 9.1 is to track down the rumored
>> patch that exports information about the rejected plans considered and get
>> that committed.  That always seems to be what I want to look at for
>> answering the question "why this plan instead of what I was expecting?"

> Having looked at that patch, I am skeptical of it, but we can
> certainly take a fresh look.

The problem with this line of thought is that it imagines you can look
at worked-out alternative plans.  You can't, because the planner doesn't
pursue rejected alternatives that far (and you'd not want to wait long
enough for it to do so...)

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 12:03 PM, Greg Smith  wrote:
> Robert Haas wrote:
>>
>> The query planner is a great piece of code but it
>> is not so transparently simple that it doesn't need debugging or
>> instrumentation, and "why did the planner do X" has got to be one of
>> our top ten most-frequently asked questions.
>
> Debugging and instrumentation are two slightly different issues.

Yeah, you're right.  This is debugging, not instrumentation.

> There is a
> lot more instrumentation needed in the query optimizer before people have
> better odds of understanding what's going on in this part of the database.
>  Recent features like pg_stat_statements and auto_explain are just the first
> round of what people really want here.  Now that we can get the explain data
> out in usable formats (XML, JSON, YAML) for a tool to manage them, the thing
> at the top of my list in this area for 9.1 is to track down the rumored
> patch that exports information about the rejected plans considered and get
> that committed.  That always seems to be what I want to look at for
> answering the question "why this plan instead of what I was expecting?"

Having looked at that patch, I am skeptical of it, but we can
certainly take a fresh look.

> Stepping away from that, from the debugging perspective it seems one way to
> answer the question "is this unexpected behavior being caused by the new
> join removal code or not?" is to provide a way to toggle it off and see what
> changes.  Much like enable_seqscan, just because we don't ever want people
> to use it in production doesn't necessarily mean it's a bad idea to expose
> it.
>
> Also, given that this is a complicated feature, I think it's reasonable to
> ask whether allowing it to be turned off is the right thing just from the
> pragmatic basis that it provides a, ahem, backup plan in case there's
> unexpected difficulty with it in the field.

Yep.

...Robert



Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 12:02 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sun, Mar 28, 2010 at 11:24 PM, Joseph Adams
>>  wrote:
>>> My reasoning for "It should be built-in" is:
>>>  * It would be nice to have a built-in serialization format that's
>>> available by default.
>>>  * It might be a little faster because it doesn't have to link to an
>>> external library.
>
>> I don't think either of these reasons is valid.
>
> FWIW, our track record with relying on external libraries has been less
> than great --- "upstream will maintain it" sounds good but has fallen
> over with respect to both the regex engine and the snowball stemmers,
> to take two examples.  And libxml2 has been nothing but a source of pain.
>
> If this is going to end up being one fairly small C file implementing
> a spec that is not a moving target, I'd vote against depending on an
> external library instead, no matter how spiffy and license-compatible
> the external library might be.

Fair enough.  Note that I did go on to say which reasons I did think
were potentially valid.  ;-)

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Greg Smith

Robert Haas wrote:

The query planner is a great piece of code but it
is not so transparently simple that it doesn't need debugging or
instrumentation, and "why did the planner do X" has got to be one of
our top ten most-frequently asked questions.


Debugging and instrumentation are two slightly different issues.  There 
is a lot more instrumentation needed in the query optimizer before 
people have better odds of understanding what's going on in this part of 
the database.  Recent features like pg_stat_statements and auto_explain 
are just the first round of what people really want here.  Now that we 
can get the explain data out in usable formats (XML, JSON, YAML) for a 
tool to manage them, the thing at the top of my list in this area for 
9.1 is to track down the rumored patch that exports information about 
the rejected plans considered and get that committed.  That always seems 
to be what I want to look at for answering the question "why this plan 
instead of what I was expecting?"


Stepping away from that, from the debugging perspective it seems one way 
to answer the question "is this unexpected behavior being caused by the 
new join removal code or not?" is to provide a way to toggle it off and 
see what changes.  Much like enable_seqscan, just because we don't ever 
want people to use it in production doesn't necessarily mean it's a bad 
idea to expose it.


Also, given that this is a complicated feature, I think it's reasonable 
to ask whether allowing it to be turned off is the right thing just from 
the pragmatic basis that it provides a, ahem, backup plan in case 
there's unexpected difficulty with it in the field.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Tom Lane
Robert Haas  writes:
> On Sun, Mar 28, 2010 at 11:24 PM, Joseph Adams
>  wrote:
>> My reasoning for "It should be built-in" is:
>>  * It would be nice to have a built-in serialization format that's
>> available by default.
>>  * It might be a little faster because it doesn't have to link to an
>> external library.

> I don't think either of these reasons is valid.

FWIW, our track record with relying on external libraries has been less
than great --- "upstream will maintain it" sounds good but has fallen
over with respect to both the regex engine and the snowball stemmers,
to take two examples.  And libxml2 has been nothing but a source of pain.

If this is going to end up being one fairly small C file implementing
a spec that is not a moving target, I'd vote against depending on an
external library instead, no matter how spiffy and license-compatible
the external library might be.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 11:46 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> OK, I'll write a patch for that; if a consensus emerges that we
>> should also have enable_joinremoval, then I will add that as well.  I
>> think the only argument for NOT having enable_joinremoval is that you
>> can always modify the query to say SELECT * rather than some more
>> specific SELECT list,
>
> Uh, no, the argument for not having enable_joinremoval is that it's
> useless.
>
> In particular, I categorically deny the argument that putting it in will
> reduce user confusion.  If anyone is confused because EXPLAIN shows that
> some table isn't getting joined to, you think that the fact that
> somewhere in the manual is a mention of enable_joinremoval will
> un-confuse them?  If they knew that switch was there or what it did,
> they wouldn't be confused to begin with.

Uh, wow, no, it doesn't make any sense from that point of view.  What
I think the use case is, is seeing how join removal changed the plan.
It could work out that the plan WITH join removal is significantly
different from the plan WITHOUT join removal.  Most of the time that
won't be the case - the join will just get snipped out.  But suppose
we're joining A to B and then to C and then to D, and that's the
actual join order.  Suppose further that C is most cheaply done as a
hash join.  It seems just barely possible to think that if the join to
C actually doesn't need to be done at all, then the join to D might be
done via some other method, because with the disappearance of the join
to C the join to D will receive its left input in some kind of sorted
order.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Robert Haas  writes:
> OK, I'll write a patch for that; if a consensus emerges that we
> should also have enable_joinremoval, then I will add that as well.  I
> think the only argument for NOT having enable_joinremoval is that you
> can always modify the query to say SELECT * rather than some more
> specific SELECT list,

Uh, no, the argument for not having enable_joinremoval is that it's
useless.

In particular, I categorically deny the argument that putting it in will
reduce user confusion.  If anyone is confused because EXPLAIN shows that
some table isn't getting joined to, you think that the fact that
somewhere in the manual is a mention of enable_joinremoval will
un-confuse them?  If they knew that switch was there or what it did,
they wouldn't be confused to begin with.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Jaime Casanova
On Mon, Mar 29, 2010 at 11:29 AM, Robert Haas  wrote:
> On Mon, Mar 29, 2010 at 11:23 AM, Jaime Casanova
>  wrote:
>> On Mon, Mar 29, 2010 at 10:42 AM, Alvaro Herrera
>>  wrote:
>>>
>>> It seems that what's really needed is some debug output to be able to
>>> find out what it did.
>>
>> +1, I was preparing an environment for testing this, but meanwhile I
>> was wondering how I could know what happened... counting the tables in
>> the EXPLAIN ANALYZE output? It's possible, but not ideal.
>
> I don't actually see much value in making EXPLAIN show which tables
> were removed.

No, I was thinking of something like a DEBUG message indicating that
the rel is being removed... maybe a DEBUG1 message.

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157



Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland  wrote:
> - There are ideas on how to solve the issue with the consistent
> snapshot but in the end you can always solve it by stopping your
> application(s). I actually assume that whenever people are interested
> in a very fast dump, it is because they are doing some maintenance
> task (like migrating to a different server) that involves pg_dump. In
> these cases, they would stop their system anyway.
> Even if we had consistent snapshots in a future version, would we
> forbid people to run parallel dumps against old server versions? What
> I suggest is to just display a big warning if run against a server
> without consistent snapshot support (which currently is every
> version).

Seems reasonable.

> - Regarding the output of pg_dump I am proposing two solutions. The
> first one is to introduce a new archive type "directory" where each
> table and each blob is a file in a directory, similar to the
> experimental "files" archive type. Also the idea has come up that you
> should be able to specify multiple directories in order to make use of
> several physical disk drives. Thinking this further, in order to
> manage all the mess that you can create with this, every file of the
> same backup needs to have a unique identifier and pg_restore should
> have a check parameter that tells you if your backup directory is in a
> sane and complete state (think about moving a file from one backup
> directory to another one or trying to restore from two directories
> which are from different backup sets...).

I think that specifying several directories is a piece of complexity
that would be best left alone for a first version of this.  But a
single directory with multiple files sounds pretty reasonable.  Of
course we'll also need to support that format in non-parallel mode,
and in pg_restore.

> The second solution to the single-file-problem is to generate no
> output at all, i.e. whatever you export from your source database you
> import directly into your target database, which in the end turns out
> to be a parallel form of "pg_dump | psql".

This is a very interesting idea but you might want to get the other
thing merged first, as it's going to present a different set of
issues.

> I am currently not planning to make parallel dumps work with the
> custom format even though this would be possible if we changed the
> format to a certain degree.

I'm thinking we probably don't want to change the existing formats.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 11:27 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> And for the record, I believe I find it rather amusing that you're
>> asking me if I "have the faintest idea how many there would be".
>
> Actually that was directed more at Simon.
>
>> I venture to say that after yourself I might be the person who knows
>> this code best.  I know how many there will be, if I get my way, and
>> that number is two.
>
> If you're speaking of adding a switch for the materialize-insertion
> behavior, I didn't object to that; I agree that turning that off might
> be an interesting thing to do.  But I remain of the opinion that a
> switch to disable join removal is just useless code and user-visible
> complexity.

OK, I'll write a patch for that; if a consensus emerges that we
should also have enable_joinremoval, then I will add that as well.  I
think the only argument for NOT having enable_joinremoval is that you
can always modify the query to say SELECT * rather than some more
specific SELECT list, but I think when there are several levels of
views involved it may not be so simple - you'll have to define
temporary versions of all the intermediate views, which is possibly
somewhat painful and certainly error-prone.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 11:23 AM, Jaime Casanova
 wrote:
> On Mon, Mar 29, 2010 at 10:42 AM, Alvaro Herrera
>  wrote:
>>
>> It seems that what's really needed is some debug output to be able to
>> find out what it did.
>
> +1, I was preparing an environment for testing this, but meanwhile I
> was wondering how I could know what happened... counting the tables in
> the EXPLAIN ANALYZE output? It's possible, but not ideal.

I don't actually see much value in making EXPLAIN show which tables
were removed.  The usefulness of doing that presumes that the user is
asking a particular and very specific question, namely, exactly which
tables got removed?, or perhaps, did this particular table get
removed?  The latter question is not too hard to figure out anyway -
just look for that table in the query.  The former is a little harder
but if you really need to know you can always look at which tables
WERE referenced and work backwards from there, or write a little
script to post-process the JSON, XML, or YAML output.  Of course
sometimes it would be handy to have but if we start adding too many
bells and whistles to the EXPLAIN output it will get cluttered and
unuseful.
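For what it's worth, the little post-processing script mentioned above really is only a few lines. A rough Python sketch (the "Plan", "Plans" and "Relation Name" keys are what EXPLAIN (FORMAT JSON) emits; the list of tables named in the query is supplied by hand here):

```python
import json

def referenced_relations(node):
    """Recursively collect every "Relation Name" appearing in an
    EXPLAIN (FORMAT JSON) plan tree."""
    rels = set()
    if "Relation Name" in node:
        rels.add(node["Relation Name"])
    for child in node.get("Plans", []):
        rels |= referenced_relations(child)
    return rels

def removed_tables(explain_json, tables_in_query):
    """Tables named in the query but absent from the plan were
    presumably dropped by join removal."""
    plan = json.loads(explain_json)[0]["Plan"]
    return set(tables_in_query) - referenced_relations(plan)
```

Feed it the JSON document EXPLAIN produces and the tables you expected to see, and whatever comes back is the set the planner removed.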

I think the value of an enable_joinremoval switch is that it lets
someone throw the switch and see how the plan changes.  It might
change in a very simple way or it might have a dramatic effect on the
whole plan.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Robert Haas  writes:
> And for the record, I believe I find it rather amusing that you're
> asking me if I "have the faintest idea how many there would be".

Actually that was directed more at Simon.

> I venture to say that after yourself I might be the person who knows
> this code best.  I know how many there will be, if I get my way, and
> that number is two.

If you're speaking of adding a switch for the materialize-insertion
behavior, I didn't object to that; I agree that turning that off might
be an interesting thing to do.  But I remain of the opinion that a
switch to disable join removal is just useless code and user-visible
complexity.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Jaime Casanova
On Mon, Mar 29, 2010 at 10:42 AM, Alvaro Herrera
 wrote:
>
> It seems that what's really needed is some debug output to be able to
> find out what it did.
>
>

+1, I was preparing an environment for testing this, but meanwhile I
was wondering how I could know what happened... counting the tables in
the EXPLAIN ANALYZE output? It's possible, but not ideal.

-- 
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 10:36 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Mar 29, 2010 at 4:33 AM, Simon Riggs  wrote:
>>> So I think we need a parameter for join removal also.
>
>> I had this in my original patch but Tom wanted it taken out.
>
> And I still don't want it.  We are NOT going in the direction of adding
> an enable_ knob for every single planner activity --- do you have the
> faintest idea how many there would be?  We have such knobs for a small
> number of cases where it's arguable that the action might be the wrong
> thing for a particular query.  Join removal, if applicable, can't
> possibly be the wrong choice; it's better than every other join strategy.

As Tom Lane would put it, you're attacking a straw man.  No one has
proposed adding a planner knob for "every single planner activity".
What Simon and I have proposed is adding some planner knobs that are
virtually parallel to the existing ones, which are useful, and your
argument (or rather, your assertion) against that is apparently that
somehow even though the OTHER ones are useful, these almost identical
ones for cases that aren't covered by the existing knobs will be
unuseful.

The argument that we don't need a knob to control join removal because
it has to always be the best strategy presupposes that enable_seqscan
exists because a sequential scan might not be the best strategy, or
that enable_hashjoin exists because a hash join might not be the best
strategy, which is emphatically nonsense.  If that were the purpose of
those knobs, that would by definition make them planner hints, albeit
incredibly poorly designed ones, and we would be having discussions
about the best way to turn them into more useful planner hints, as by
allowing them to apply to only certain portions of the query tree
and/or changing them from booleans to floats so that you could vary
the relative level of discouragement given to any particular planner
method, rather than only allowing zero and infinity.  Of course, we
are NOT having those conversations because that ISN'T the purpose of
those knobs.  Rather, their purpose, at least AIUI, is to allow the
user to see what the planner would have done had it not had those
strategies at its disposal - which is just as legitimate for join
removal or materialization as it is for hash join or merge join.  If
someone wants to ask a question like "how much does join removal speed
up this query?" or "how much does this material node (that didn't
exist in 8.4) speed up or slow down this query?", there is going to be
no easy way for them to understand that without these knobs.

And for the record, I believe I find it rather amusing that you're
asking me if I "have the faintest idea how many there would be".  I
venture to say that after yourself I might be the person who knows
this code best.  I know how many there will be, if I get my way, and
that number is two.  The query planner is a great piece of code but it
is not so transparently simple that it doesn't need debugging or
instrumentation, and "why did the planner do X" has got to be one of
our top ten most-frequently asked questions.  Failing to provide a few
trivially simple debugging tools for the small number of major new
planner features added in 9.0 is essentially pointless cruelty and
will result in an endless series of unanswerable questions on
-performance, particularly where materialization is concerned, since
for join removal there is at least a workaround (maybe not such an
easy one to use in all cases, but it exists - just select all the
columns).

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Alvaro Herrera  writes:
> It seems that what's really needed is some debug output to be able to
> find out what it did.

Isn't EXPLAIN good enough?

regards, tom lane



Re: [HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread David Fetter
On Mon, Mar 29, 2010 at 04:46:48PM +0200, Joachim Wieland wrote:
> People have been talking about a parallel version of pg_dump a few
> times already. I have been working on some proof-of-concept code for
> this feature every now and then and I am planning to contribute this
> for 9.1.
> 
> There are two main issues with a parallel version of pg_dump:
> 
> The first one is that it requires a consistent snapshot among
> multiple pg_dump clients

Cloning snapshots seems like the way to fix this.  I don't know how
far this project has drifted from the PostgreSQL code, but you might
want to look here:

http://pgfoundry.org/projects/snapclone/

> and the second is that currently the output goes to a single file
> and it is unclear what to do about multiple processes writing into a
> single file.

I don't think that's a good idea.  Coming up with a directory
structure for the new parallel pg_dump seems like a much better idea.

Andrew, do you have some notes on this?

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



[HACKERS] Parallel pg_dump for 9.1

2010-03-29 Thread Joachim Wieland
People have been talking about a parallel version of pg_dump a few
times already. I have been working on some proof-of-concept code for
this feature every now and then and I am planning to contribute this
for 9.1.

There are two main issues with a parallel version of pg_dump:

The first one is that it requires a consistent snapshot among multiple
pg_dump clients and the second is that currently the output goes to a
single file and it is unclear what to do about multiple processes
writing into a single file.

- There are ideas on how to solve the issue with the consistent
snapshot but in the end you can always solve it by stopping your
application(s). I actually assume that whenever people are interested
in a very fast dump, it is because they are doing some maintenance
task (like migrating to a different server) that involves pg_dump. In
these cases, they would stop their system anyway.
Even if we had consistent snapshots in a future version, would we
forbid people to run parallel dumps against old server versions? What
I suggest is to just display a big warning if run against a server
without consistent snapshot support (which currently is every
version).

- Regarding the output of pg_dump I am proposing two solutions. The
first one is to introduce a new archive type "directory" where each
table and each blob is a file in a directory, similar to the
experimental "files" archive type. Also the idea has come up that you
should be able to specify multiple directories in order to make use of
several physical disk drives. Thinking this further, in order to
manage all the mess that you can create with this, every file of the
same backup needs to have a unique identifier and pg_restore should
have a check parameter that tells you if your backup directory is in a
sane and complete state (think about moving a file from one backup
directory to another one or trying to restore from two directories
which are from different backup sets...).
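Such a sanity check is cheap to sketch. A hypothetical Python model (the toc.json layout and per-file header line are invented here purely for illustration; the real directory format would define its own metadata):

```python
import json
import pathlib

def check_backup_dir(path):
    """Sanity-check a (hypothetical) directory-format dump: every data
    file must carry the TOC's backup id, and every file the TOC lists
    must be present.  Returns a list of problems, empty if sane."""
    path = pathlib.Path(path)
    toc = json.loads((path / "toc.json").read_text())
    problems = []
    for name in toc["files"]:
        f = path / name
        if not f.exists():
            problems.append("missing: " + name)
            continue
        # assume the first line of each data file is a JSON header
        header = json.loads(f.read_text().splitlines()[0])
        if header.get("backup_id") != toc["backup_id"]:
            problems.append("foreign backup set: " + name)
    return problems
```

This catches both failure modes described above: a file moved in from a different backup set, and an incomplete directory.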

The second solution to the single-file-problem is to generate no
output at all, i.e. whatever you export from your source database you
import directly into your target database, which in the end turns out
to be a parallel form of "pg_dump | psql".

In fact, technically this is rather a parallel pg_restore than a
pg_dump as you need to respect the dependencies between objects. The
good news is that with the parallel pg_restore of the custom archive
format we have everything in place already for this dependency
checking. The addition is a new archive type that dumps (just-in-time)
whatever the dependency-algorithm decides to restore next.
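The just-in-time dispatch described here can be modeled in a few lines. A toy Python simulation (not the actual pg_restore scheduler; it assumes the dependency graph is acyclic):

```python
from collections import deque

def parallel_order(deps, workers=2):
    """Simulate dispatching restore items to `workers` slots, releasing
    an item only once all of its dependencies are done.  `deps` maps
    item -> set of items it depends on; the graph must be acyclic.
    Returns the completion order, which is a valid topological order."""
    done, order = set(), []
    pending = dict(deps)
    running = deque()
    while pending or running:
        # fill free worker slots with items whose dependencies are done
        ready = [item for item, d in pending.items() if d <= done]
        for item in ready:
            if len(running) >= workers:
                break
            running.append(item)
            del pending[item]
        # complete the oldest running item, unblocking its dependents
        item = running.popleft()
        done.add(item)
        order.append(item)
    return order
```

The real implementation gets this dependency bookkeeping from the existing parallel-restore code; the point of the sketch is just that nothing is emitted before everything it depends on.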

This is probably the fastest way that we can copy or upgrade a
database when pg_migrator cannot be used (for example when you migrate
to a different hardware architecture).

As said, I have some working code for the features described (Unix
only); if anybody would like to give it a try now, just let me
know. I'd be happy to get some early test reports, and you could check
what speedup to expect. But before I continue, I'd like to have a
discussion about what it is that people actually want and what is the
best way to go forward here.

I am currently not planning to make parallel dumps work with the
custom format even though this would be possible if we changed the
format to a certain degree.

Comments?


Joachim



Re: [HACKERS] Using HStore type in TSearch

2010-03-29 Thread Tom Lane
Łukasz Dejneka  writes:
> What I want to do is to use HStore data type (namely the HStore keys) in
> TSearch. I don't want to use TSVector and the original ts_match_vq function,
> because I want to be able to control exact values that are passed to the
> search and not use their lexemes.

If that's what you're after, why don't you set up a text search
configuration in which the parser/dictionary are trivial and do no
transformations of the strings (beyond perhaps splitting at whitespace)?
Seems a lot easier than constructing your own datatype and all the
required support functions.

regards, tom lane



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Alvaro Herrera
Tom Lane escribió:
> Robert Haas  writes:
> > On Mon, Mar 29, 2010 at 4:33 AM, Simon Riggs  wrote:
> >> So I think we need a parameter for join removal also.
> 
> > I had this in my original patch but Tom wanted it taken out.
> 
> And I still don't want it.  We are NOT going in the direction of adding
> an enable_ knob for every single planner activity --- do you have the
> faintest idea how many there would be?  We have such knobs for a small
> number of cases where it's arguable that the action might be the wrong
> thing for a particular query.  Join removal, if applicable, can't
> possibly be the wrong choice; it's better than every other join strategy.

It seems that what's really needed is some debug output to be able to
find out what it did.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Tom Lane
Robert Haas  writes:
> On Mon, Mar 29, 2010 at 4:33 AM, Simon Riggs  wrote:
>> So I think we need a parameter for join removal also.

> I had this in my original patch but Tom wanted it taken out.

And I still don't want it.  We are NOT going in the direction of adding
an enable_ knob for every single planner activity --- do you have the
faintest idea how many there would be?  We have such knobs for a small
number of cases where it's arguable that the action might be the wrong
thing for a particular query.  Join removal, if applicable, can't
possibly be the wrong choice; it's better than every other join strategy.

regards, tom lane



Re: [HACKERS] five-key syscaches

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 4:21 AM, Joachim Wieland  wrote:
> On Mon, Mar 29, 2010 at 12:32 AM, Robert Haas  wrote:
>> Per previous discussion, PFA a patch to change the maximum number of
>> keys for a syscache from 4 to 5.
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-02/msg01105.php
>>
>> This is intended for application to 9.1, and is supporting
>> infrastructure for knngist.
>
> It looks like there should be a 5 rather than a 4 for nkeys of
> SearchSysCacheList().
>
> +#define SearchSysCacheList5(cacheId, key1, key2, key3, key4, key5) \
> +       SearchSysCacheList(cacheId, 4, key1, key2, key3, key4, key5)

Good catch.  Will fix.

...Robert



Re: [HACKERS] enable_joinremoval

2010-03-29 Thread Robert Haas
On Mon, Mar 29, 2010 at 4:33 AM, Simon Riggs  wrote:
>
> We have planner method parameters for many important parts of the
> planner. We don't have any parameter for join removal, AFAIK.
>
> The first question I get asked is "can I see the tables it removed?".
> This has been asked of me 3 times now this year, always first question.
> Same issue applies to constraint exclusion.
>
> But we *can* do this with constraint_exclusion, simply by turning it off
> and checking the two outputs.
>
> So I think we need a parameter for join removal also.
>
> I don't want to turn it off, but I think we need a way for people to
> check themselves that the removal of the joins is not an error. I
> foresee many false bug reports along the lines of "optimizer ate my join
> and I want it back".

I had this in my original patch but Tom wanted it taken out.  In many
cases it's not horribly difficult to work around because you can do
SELECT * FROM ... instead of your original select list, but there
might be some cases with multiple levels of views where it isn't that
easy.  I think it would be good to add this back.

An even worse problem I've been noticing is that there is no easy way
to determine whether the planner's new penchant for inserting
Materialize nodes in all sorts of fun and exciting places is in fact
improving performance or not.  I'm not sure there's even a difficult
way.  I really, really think we need to do something about this - the
current situation is really quite horrible.

...Robert



Re: [HACKERS] Problems with variable cursorname in ecpg

2010-03-29 Thread Boszormenyi Zoltan
Hi,

Michael Meskes írta:
> Hi,
>
> I did some more testing on ecpg and found that allowing variables as cursor
> names seems to produce more problems than I anticipated. But then maybe it's
> just some missing checks to throw out error messages. Anyway, I attach a small
> test program that, from my understanding, should work, but doesn't. Could
> somebody with access to embedded SQL precompilers from other DBMSes please try
> if this test case works with them? 
>   

I have modified your code a little so it compiles with ESQL/C.
Attached are both the embedded SQL and the processed source.

> The problem we seem to have right now comes from the original logic in ecpg
> moving the declare cursor statement to the position of the open cursor
> statemend at compile time. With the cursor name being unique this never has
> been a problem. However, with variables as cursor names, this uniqueness need
> not hold anymore. If it does, i.e. each cursor gets its own variable, all is
> well, but if not, it doesn't work correctly at all times.
>   

This was what I found some time ago when the same issue, i.e.
two DECLAREs for the same cursor name in IF/ELSE appeared
for different queries:

The standard says (SQL:2008, section 14.1 <declare cursor>,
Syntax Rules):

"
Syntax Rules
1) If a <declare cursor> is contained in an
<SQL-client module definition> M, then:
  a) The <cursor name> shall not be equivalent to the <cursor name>
 of any other <declare cursor> or <dynamic declare cursor>
 in M.
  b) The scope of the <cursor name> is M with the exception
 of any <SQL schema statement> contained in M.
  c) Any <host parameter name> contained in the <cursor specification>
 shall be defined in a <host parameter declaration> in the
 <SQL-client module definition> that contains an
 <externally-invoked procedure> that specifies the <host parameter name>
 and is contained in the scope of that <host parameter declaration>.
"

The standard text doesn't say a word about DECLARE having to imply
a function call; ESQL/C does call a function, ECPG doesn't.
Also, in the same section, under General Rules:

"
General Rules
1) A cursor declaration descriptor CDD is created. CDD includes
indications that:
  a) The kind of cursor is a standing cursor.
  b) The provenance of the cursor is an indication of the SQL-client
 module whose <module contents> contains the
 <declare cursor>.
  c) The name of the cursor is the <cursor name>.
  d) The cursor's origin is the <cursor specification> contained in
 the <declare cursor>.
  e) The cursor's declared properties are as determined by the
 <cursor specification>.
"

This says "A cursor declaration descriptor CDD is created." - it doesn't
say where, and ECPG treats it as internal descriptor (as opposed to
a runtime descriptor in the processed C code as done by ESQL/C)
and currently it uses it to enforce the rules about cursors in embedded
SQL programs, like (in section 21.1 <embedded SQL host program>):

"
14) A <declare cursor> that is contained in an <embedded SQL host program>
shall precede in the text of that <embedded SQL host program> any
SQL-statement that references the <cursor name> of the <declare cursor>.

15) A <dynamic declare cursor> that is contained in an
<embedded SQL host program> shall precede in the text of that
<embedded SQL host program> any SQL-statement that references the
<cursor name> of the <dynamic declare cursor>.
"

and the paragraph 1)a) cited above in "Syntax Rules" of section 14.1.

The above was described as "the DECLARE statement is declarative"
by you, the ECPG maintainer, when I fixed another bug that was reported
by our client. If you remember, the bug was that Informix resets
SQLCA upon executing DECLARE, and PostgreSQL didn't do it because
DECLARE wasn't calling any function; it only set up the
internal descriptor for the cursor. You accepted a fix for this
for the Informix compatible mode of ECPG but not for the native mode,
to keep the declarative nature of DECLARE. But this didn't change
the fact that DECLARE still doesn't involve any function call that
uses name of the cursor.

The interpretation of the standard in the above way (DECLARE is declarative,
not functional) leads to the situation where the ECPG transformation cannot
know the cursor's real name during runtime (ECPG is not a VM after all),
only at transformation time. This means that with a dynamic cursorname
the only thing it can check and match is the ":variablename" cursorname
so OPEN, FETCH and CLOSE will all mandatorily have to use the same
variable as was used in the DECLARE statement.

BTW, the declarative nature of the DECLARE statement means that
it (a DECLARE statement) can appear outside of any functions
in ECPG's native mode, while ESQL/C's failure in (or interpretation of)
conforming to the standard treats it as an error.


The uniqueness problem can only be solved with modifying
the runtime library to keep track of the cursor names in the client.
It would ruin the declarative nature of DECLARE but would increase
compatibility with Informix, and we would also need to implement
correct "FREE cursorname" behaviour, too. Which would also bring
the consequence that the ECPG client library would need to
forbid cursors and prepared statements with the same name as
"FREE" can also free cursors and prepared statements.
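The client-side tracking described above amounts to a single shared namespace. A toy Python model (purely hypothetical, just to illustrate why FREE forces cursors and prepared statements into one name space):

```python
class NameRegistry:
    """Toy model of the proposed runtime tracking: one namespace shared
    by cursors and prepared statements, since FREE can free either kind
    by name and therefore cannot tolerate a name collision."""

    def __init__(self):
        self._kind = {}   # name -> "cursor" or "statement"

    def declare(self, name, kind):
        if name in self._kind:
            raise ValueError("name already in use: " + name)
        self._kind[name] = kind

    def free(self, name):
        # FREE works on both cursors and prepared statements
        self._kind.pop(name)
```

Once a name is freed it can be reused for either kind of object, which is exactly the bookkeeping the ECPG runtime library would have to take on.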

But there's a workaround that is usable under ECPG.
One of the cursors can be put into a different source file,
and the different statements for cursors (DECLARE, OPEN, FETCH,
CLOSE) can be put into different functions. You can
even have different cursor names passed into the same
DECLARE using different state

[HACKERS] Known Issues Page

2010-03-29 Thread Tharakan, George (GE Healthcare)
Hi,
 
I have been using PostgreSQL as a part of our healthcare product. As an
important part of releasing a stable product it is important to also
document the known issues found in a PostgreSQL release. I would be
grateful if someone could point me to the Known Issues page (if any).
On these lines http://www.postgresql.org/support/security.html has been
very helpful but it only caters to security.
 
Rgds,
gmt


Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Andrew Dunstan



Robert Haas wrote:

I feel pretty strongly that the data should be stored in the database
in the format in which it will be returned to the user - any
conversion which is necessary should happen on the way in.  I am not
100% sure to what extent we should attempt to canonicalize the input
and to what extent we should simply store it in whichever way the user
chooses to provide it.

  


ISTM that implies that, with a possible exception when the server 
encoding is utf8, you would have to \u escape the data on the way in 
fairly pessimistically.


I'd be inclined to say we should store and validate it exactly as the 
client gives it to us (converted to the server encoding, as it would be, 
of course). In practice that would mean that for non-utf8 databases the 
client would need to \u escape it. I suspect most uses of this would be 
in utf8-encoded databases anyway.


I also think we should provide a function to do the escaping, so users 
could do something like:


   insert into foo (myjson) values (json_escape('some json text here'));

I also thought about a switch to turn on \u escaping on output - that 
might be useful for pg_dump for instance.


cheers

andrew






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Problems with variable cursorname in ecpg

2010-03-29 Thread Michael Meskes
Hi,

I did some more testing on ecpg and found that allowing variables as cursor
names seems to produce more problems than I anticipated. But then maybe it's
just some missing checks to throw out error messages. Anyway, I attach a small
test program that, from my understanding, should work, but doesn't. Could
somebody with access to embedded SQL precompilers from other DBMSes please try
whether this test case works with them? 

The problem we seem to have right now comes from the original logic in ecpg
moving the declare cursor statement to the position of the open cursor
statement at compile time. With the cursor name being unique this has never
been a problem. However, with variables as cursor names, this uniqueness need
not hold anymore. If it does, i.e. each cursor gets its own variable, all is
well, but if not, it doesn't work correctly at all times.

BTW I can modify the test case so it works fine, but ecpg will still throw an
error message, which is not a good situation either.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
ICQ 179140304, AIM/Yahoo/Skype michaelmeskes, Jabber mes...@jabber.org
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL
#include <stdio.h>
#include <string.h>

exec sql whenever sqlerror sqlprint;

void runit(char *);

int
main(void)
{
exec sql begin declare section;
char *cursor = strdup("A");
exec sql end declare section;

exec sql connect to test;
exec sql declare :cursor cursor for select 1;
runit(cursor);
exec sql close :cursor;
exec sql declare :cursor cursor for select 2;
runit(cursor);
exec sql close :cursor;
exec sql disconnect;
}

void runit(char *name)
{
exec sql begin declare section;
char *mycursor = name;
int i;
exec sql end declare section;

exec sql open :mycursor;
exec sql fetch :mycursor into :i;
printf("Found %d\n", i);
} 



[HACKERS] Using HStore type in TSearch

2010-03-29 Thread Łukasz Dejneka
Hi,

I've asked this question in novice group and another Postgres forum, but
didn't get any help.

What I want to do is to use HStore data type (namely the HStore keys) in
TSearch. I don't want use TSVector and the original ts_match_vq function,
because I want to be able to control exact values that are passed to the
search and not use their lexemes.

I'm pretty green about Postgres and C, so please forgive me for any mistakes
I make here...

What I tried was to modify the ts_match_vq function and use it to pass
values from HStore directly to the recursive TS_execute function.
Unfortunately the HStore ARRPTR() and STRPTR() macros do not return the
same structures as their TSVector counterparts. HStore, as I gathered, returns
a pointer to a string, but TSVector (and TS_execute) needs an ltree of
search entries. I guess I need to convert the HStore values to the ltree
structure, but I can't get it right.

This is the partially modified ts_match_vq:

#undef CALCDATASIZE
#undef ARRPTR
#undef STRPTR
#define CALCDATASIZE(x, lenstr) ( (x) * sizeof(HEntry) + HSHRDSIZE + (lenstr) )
#define ARRPTR(x)   ( (HEntry*) ( (char*)(x) + HSHRDSIZE ) )
#define STRPTR(x)   ( (char*)(x) + HSHRDSIZE + ( sizeof(HEntry) * ((HStore*)x)->size ) )

PG_FUNCTION_INFO_V1(ljd_test);
Datum ljd_test(PG_FUNCTION_ARGS);

Datum
ljd_test(PG_FUNCTION_ARGS)
{
    /* get arguments */
    HStore     *hs = PG_GETARG_HS(0);
    TSQuery     tq = PG_GETARG_TSQUERY(1);
    TSQuery     tq_in;
    CHKVAL      chkvalKeys;
    bool        res;

    /* check for empty values */
    if (!hs->size || !tq->size)
    {
        PG_FREE_IF_COPY(hs, 0);
        PG_FREE_IF_COPY(tq, 1);
        PG_RETURN_BOOL(false);
    }
    /* process TSQuery */
    tq_in = TSQueryGetDatum(tq);
    /* process HStore */
    chkvalKeys.arrb = ARRPTR(hs);
    chkvalKeys.arre = chkvalKeys.arrb + hs->size;
    chkvalKeys.values = STRPTR(hs);
    chkvalKeys.operand = GETQUERY(tq_in);
    res = ljd_exec(
        GETQUERY(tq_in),
        &chkvalKeys,
        true,
        checkcondition_str
    );

    PG_RETURN_BOOL(res);
}
Thank you in advance for any help and tips on how to solve this.


Re: [HACKERS] GSoC Query

2010-03-29 Thread Gokulakannan Somasundaram
>  Similarly using the no. of select hits on a table we can check that if
>> maximum no. of times it is on a non-index field we can index on that field
>> to make select faster.
>>
>
> It's impractical to figure out where indexes should go at without
> simulating what the optimizer would then do with them against a sample set
> of queries.  You can't do anything useful just with basic statistics about
> the tables.
>
> I would recommend
> http://msdn.microsoft.com/en-us/library/aa226167(SQL.70).aspx as
> a good, practical introduction to the topic of what it takes to figure
> out where indexes go at, from someone who came up with a reasonable solution
> to that problem.  You can find a list of the underlying research they cite
> (and an idea what has been done since then) at
> http://portal.acm.org/citation.cfm?id=673646
>
>
Even if you have devised a way to find the appropriate set of indexes, just
have an index adviser, which would suggest a set of indexes for a set of
queries and let the DBA and the application user take the final call after
looking at them.

Gokul.


Re: [HACKERS] [BUGS] BUG #5394: invalid declspec for PG_MODULE_MAGIC

2010-03-29 Thread Takahiro Itagaki

"Vladimir Barzionov"  wrote:

> Same problem was already discussed for example here
> http://dbaspot.com/forums/postgresql/393683-re-general-custom-c-function-palloc-broken.html
> 
> Looks like the simplest way for correcting the issue is declaring additional
> macro (something like PGMODULEEXPORT)

Sure, I agree it is a longstanding bug in PostgreSQL. Developers who use
MSVC (not MinGW) always encounter the bug; machines in the buildfarm can
build Windows binaries only because they have non-standard setups.

A patch is attached. The name "PGMODULEEXPORT" might be arguable.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



PGMODULEEXPORT.patch
Description: Binary data



Re: [HACKERS] Patch for 9.1: initdb -C option

2010-03-29 Thread Peter Eisentraut
On mån, 2010-03-29 at 00:04 -0500, David Christensen wrote:
> Enclosed is a patch to add a -C option to initdb to allow you to easily 
> append configuration directives to the generated postgresql.conf file for use 
> in programmatic generation.

I like this idea, but please use a lowercase -c for consistency with the
postgres program.






Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Peter Eisentraut
On sön, 2010-03-28 at 23:24 -0400, Joseph Adams wrote:
> Thus, here's an example of how (in my opinion) character sets and such
> should be handled in the JSON code:
> 
> Suppose the client's encoding is UTF-16, and the server's encoding is
> Latin-1.  When JSON is stored to the database:
>  1. The client is responsible and sends a valid UTF-16 JSON string.
>  2. PostgreSQL checks to make sure it is valid UTF-16, then converts
> it to UTF-8.
>  3. The JSON code parses it (to ensure it's valid).
>  4. The JSON code unparses it (to get a representation without
> needless whitespace).  It is given a flag indicating it should only
> output ASCII text.
>  5. The ASCII is stored in the server, since it is valid Latin-1.
> 
> When JSON is retrieved from the database:
>  1. ASCII is retrieved from the server
>  2. If user needs to extract one or more fields, the JSON is parsed,
> and the fields are extracted.
>  3. Otherwise, the JSON text is converted to UTF-16 and sent to the client.

The problem I see here is that a data type output function is normally
not aware of the client encoding.  The alternatives that I see are that
you always escape everything to plain ASCII, so it's valid in
every server encoding (but that would result in pretty sad behavior for
users of languages that don't use a lot of ASCII characters), or you
decree a nonstandard JSON variant that uses whatever
encoding you decide.





[HACKERS] proposal - structured funcid and lineno as new fields in error message

2010-03-29 Thread Pavel Stehule
Hello

can we add well-structured information about function id and line number to
ErrorData?

Currently we send some of this info, but formatted as a string and only when
Log_error_verbosity >= PGERROR_VERBOSE.

I hope this feature can help interactive GUI managers such as
pgAdmin, and psql too.

Regards

Pavel Stehule



Re: [HACKERS] Alpha release this week?

2010-03-29 Thread Stefan Kaltenbrunner

Robert Haas wrote:

On Sun, Mar 28, 2010 at 4:40 PM, Josh Berkus  wrote:

We've got two locations and some individuals signed up for a test-fest
this weekend.  Would it be possible to do an alpha release this week?
It would really help to be testing later code than Alpha4.


I'm willing to do the CVS bits, if that's helpful.  Or maybe Peter
wants to do it.  Anyway I have no problem with the idea.


Or just use a specific recent snapshot and let people test that, in
case doing a new alpha on short notice is not feasible.



Stefan



[HACKERS] enable_joinremoval

2010-03-29 Thread Simon Riggs

We have planner method parameters for many important parts of the
planner. We don't have any parameter for join removal, AFAIK.

The first question I get asked is "can I see the tables it removed?".
This has been asked of me 3 times now this year, always as the first question.
The same issue applies to constraint exclusion.

But with constraint_exclusion we *can* answer it, simply by turning it off
and comparing the two outputs.

So I think we need a parameter for join removal also.

I don't want to turn it off, but I think we need a way for people to
check themselves that the removal of the joins is not an error. I
foresee many false bug reports along the lines of "optimizer ate my join
and I want it back". 

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] join removal

2010-03-29 Thread Pavel Stehule
2010/3/29 Marko Tiikkaja :
> On 2010-03-29 11:19 +0200, Pavel Stehule wrote:
>> postgres=# explain select  a from a left join b on true;

You are right.

I forgot about SELECT DISTINCT.

regards

Pavel
>
> This is effectively a cross join and it would give wrong answers.  Try
> SELECT a FROM a LEFT JOIN b ON a.a = b.b;
>
>
> Regards,
> Marko Tiikkaja
>



Re: [HACKERS] join removal

2010-03-29 Thread Marko Tiikkaja
On 2010-03-29 11:19 +0200, Pavel Stehule wrote:
> postgres=# explain select  a from a left join b on true;

This is effectively a cross join and it would give wrong answers.  Try
SELECT a FROM a LEFT JOIN b ON a.a = b.b;


Regards,
Marko Tiikkaja



Re: [HACKERS] five-key syscaches

2010-03-29 Thread Joachim Wieland
On Mon, Mar 29, 2010 at 12:32 AM, Robert Haas  wrote:
> Per previous discussion, PFA a patch to change the maximum number of
> keys for a syscache from 4 to 5.
>
> http://archives.postgresql.org/pgsql-hackers/2010-02/msg01105.php
>
> This is intended for application to 9.1, and is supporting
> infrastructure for knngist.

It looks like there should be a 5 rather than a 4 for nkeys of
SearchSysCacheList().

+#define SearchSysCacheList5(cacheId, key1, key2, key3, key4, key5) \
+   SearchSysCacheList(cacheId, 4, key1, key2, key3, key4, key5)


Joachim



Re: [HACKERS] join removal

2010-03-29 Thread Pavel Stehule
Hello

is any reason why join removal doesn't remove useless relation b?

postgres=# \d a
      Table "public.a"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
Indexes:
    "a_a_idx" UNIQUE, btree (a)

postgres=# \d b
      Table "public.b"
 Column |  Type   | Modifiers
--------+---------+-----------
 b      | integer |
Indexes:
    "b_b_idx" UNIQUE, btree (b)

postgres=# explain select a from a left join b on true;
                           QUERY PLAN
-----------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..72074.00 rows=576 width=4)
   ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
   ->  Materialize  (cost=0.00..46.00 rows=2400 width=0)
         ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=0)
(4 rows)

postgres=# explain select distinct a from a left join b on true;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Unique  (cost=0.00..86520.25 rows=2400 width=4)
   ->  Nested Loop Left Join  (cost=0.00..72120.25 rows=576 width=4)
         ->  Index Scan using a_a_idx on a  (cost=0.00..80.25 rows=2400 width=4)
         ->  Materialize  (cost=0.00..46.00 rows=2400 width=0)
               ->  Seq Scan on b  (cost=0.00..34.00 rows=2400 width=0)
(5 rows)

Regards
Pavel Stehule



Re: [HACKERS] More idle thoughts

2010-03-29 Thread Simon Riggs
On Sun, 2010-03-28 at 12:47 -0400, Tom Lane wrote:
> Simon Riggs  writes:
> > On Fri, 2010-03-26 at 18:59 +, Greg Stark wrote:
> >> It occurs to me we could do the same for CHECK_FOR_INTERRUPTS() by
> >> conditionally having it call a function which calls gettimeofday and
> >> compares with the previous timestamp received at the last CFI().
> 
> > Reducing latency sounds good, but what has CFI got to do with that?
> 
> It took me about five minutes to figure out what Greg was on about too.
> His point is that we need to locate code paths in which an extremely
> long time can pass between successive CFI calls, because that means
> the backend will fail to respond to SIGINT/SIGTERM for a long time.
> Instrumenting CFI itself is a possible tool for that.

I was thinking we could do this via signals, but actually instrumenting
the code paths seems better.

There probably are a few paths still to improve. Dare I suggest we
follow the tried and tested open source approach of wait-for-complaint?

Reducing latency elsewhere would be time better spent (!). I was
thinking of adding a "reason" field onto ReadBuffer, so we can diagnose
the source of buffer waits.

-- 
 Simon Riggs   www.2ndQuadrant.com




Re: [HACKERS] Proposal: Add JSON support

2010-03-29 Thread Dimitri Fontaine
Hi,

Joseph Adams  writes:
> As for reinventing the wheel, I'm in the process of writing yet
> another JSON implementation simply because I didn't find the other
> ones I looked at palatable.

Even this one (ANSI C, MIT Licenced)?

  cJSON -- An ultra-lightweight, portable, single-file, simple-as-can-be
  ANSI-C compliant JSON parser, under MIT license.

  http://sourceforge.net/projects/cjson/
  http://cjson.svn.sourceforge.net/viewvc/cjson/README?revision=7&view=markup
  http://cjson.svn.sourceforge.net/viewvc/cjson/cJSON.c?revision=33&view=markup

And from the cJSON.h we read that it could be somewhat easy to integrate
into PostgreSQL's memory management:
   56 typedef struct cJSON_Hooks {
   57   void *(*malloc_fn)(size_t sz);
   58   void (*free_fn)(void *ptr);
   59 } cJSON_Hooks;

Just adding some data points, hoping that's not adding only confusion.

Regards,
-- 
dim



Re: [HACKERS] GSoC Query

2010-03-29 Thread Greg Smith

gaurav gupta wrote:
My idea is to add a functionality of Auto tuning and Auto Indexing/ 
Reindexing in DB languages.


Ah, the classic request to start with the tuning and index wizards.  
Really fun to work on, always lots of interest in it.  Here's the 
thing:  projects in this area attract endless interest.  If it were 
possible to write something useful in a couple of months, we'd have a 
hundred such programs fighting for attention.  So the fact that we 
actually have zero of them should tell you something about the actual 
difficulty level of the work.  You could spend the whole summer just 
reading research papers on this topic and maybe catch up to the late 
90's by the end.


Here's the usual advice I give to students looking to make a useful 
contribution to any mature development project:  the more boring the 
work sounds, the more likely it is you'll actually do something people 
can use.  It's easy to find people who want to work on fun projects--so 
easy that they've all been done already.  What's left is either much 
harder than it looks, or kind of dull to do.  The idea behind 
intentionally picking a boring one is that you're more likely to get one 
that's unfinished for that reason, rather than because it's actually a 
year or two of work to complete.  Or, in the case you're asking about, a 
decade or three if you were to start from scratch and were really 
smart.  If you started working on this now rather than stopping to 
follow the research already done you might catch up to 
http://portal.acm.org/citation.cfm?id=810505 in a couple of months.


Similarly using the no. of select hits on a table we can check that if 
maximum no. of times it is on a non-index field we can index on that 
field to make select faster.


It's impractical to figure out where indexes should go at without 
simulating what the optimizer would then do with them against a sample 
set of queries.  You can't do anything useful just with basic statistics 
about the tables.


I would recommend 
http://msdn.microsoft.com/en-us/library/aa226167(SQL.70).aspx as a good, 
practical introduction to the topic of what it takes to figure out where 
indexes go at, from someone who came up with a reasonable solution to 
that problem.  You can find a list of the underlying research they cite 
(and an idea what has been done since then) at 
http://portal.acm.org/citation.cfm?id=673646


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

