Re: [HACKERS] Patch queue - wiki (was varadic patch)

2008-04-03 Thread Dave Page
On Thu, Apr 3, 2008 at 12:35 AM, Bruce Momjian [EMAIL PROTECTED] wrote:
  It is not clear to me how a wiki can be easily created for 2k emails and
  then maintained in a reasonable way, or how emails can be added to it
  easily.

That seems like a *really* odd thing for one of the founders of the
world's most advanced OSS DBMS project to say. It's all relational
(which we do do pretty well) - we can add links to the wiki to threads
in the archives, and anything posted from then on is self-maintaining
(except when new threads are started - but even if each patch gets 5
threads that's not a huge chore).

I see no reason to go manually copying all 2k emails to the wiki.


-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Tom Dunstan
I had some thoughts about similar issues when looking at what it would
take to make pl/java yum-installable. The end goal was to be able to
say e.g. yum install pljava; echo create language pljava; | psql
mydb. Currently there's a non-trivial install process involving
running an sql script and java class.

My idea was to have a createlang_init kind of function that could be
called when installing a language to set up appropriate functions,
tables etc. There would be a similar function to clean up when
dropping the lang.

On Thu, Apr 3, 2008 at 6:12 AM, Ron Mayer [EMAIL PROTECTED] wrote:
 Agreed.  Such a mechanism would only really apply for things
 that are installed in the database.   But from an end user's
 point of view, installing functions, index types, languages,
 data types, etc all see to fit the pg_install postgis -d mydb,
 pg_install pl_ruby -d mydb, etc. pattern pretty well.


Well, there are a couple of major differences. Firstly cpan, gem etc
are able to install all required dependencies themselves, at least
where no native compilation is required, because they are basically
their own platform. PG libs more or less require a build environment.

Secondly, and more importantly, module installation for those
environments happens once and is global; installation of native libs
for pgsql is different to instllation in a database. What happens in
the above scenario when the postgis libs are already installed? And
what about cleanup? Also, it would seem that such an install process
requires the server to be running - so much for packaging as
RPMs/debs/win32 installer etc.

I think a better solution would be to have a pg_install be a
distribution mechanism capable of installing binaries / scripts /
other resources, but have pgsql itself handle module installation into
a particular database. I'm thinking a CREATE MODULE foo; kind of
thing that would be capable of finding either a module install script
or a foo_init() function in libfoo.so/foo.dll. Similarly for cleanup,
so cleanup isn't dependent on pg_install lying around or the version
that was install still being the latest when pg_install looks for an
uninstall script.

This would allow modules to be installed site-wide but optionally
created / dropped from specific databases in a much saner manner, and
standard pgsql permissions could apply to installation of modules. It
would also allow creation of rpms etc that can be shipped by a
distribution, and then enabled by the user by calling the appropriate
command.

 Finally, setting up modules so they can be built for Windows,
especially using MSVC, will probably be quite a challenge.


 Indeed.   Seems ruby gems give you the option of installing a ruby
 version or a windows version that I'm guessing has pre-compiled
 object files.

Yeah, setting up Cygwin to build postgres is a pain (or was when I
last did so). If we're serious about setting up a central repository,
we should consider having a virtualized windows machine capable of
building binaries for the modules that people upload.

Cheers

Tom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Zeugswetter Andreas OSB SD

  The closest analogy to what I'm thinking is the perl CPAN 
 or ruby gems.

I think this is more a developer thing. I don't think an ISP would want
all that automagic (and certainly does not do that for joe user).

 One thing that might be worth looking at is an install command at the 
 SQL level, so the INSTALL foo would run the install script for the
foo 
 module in the current database, assuming it's in the standard
location.

Yes.

 We don't have a central repository of non-standard modules, like CPAN,

 and so of course no facility for fetching / building / installing
them.

I think that is not a problem, since the service providers would rather
want
readily fetched built and regression tested modules, not anything fancy
or magic.
The readily built modules would simply be part of their binary
distibution.

Andreas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Andrew Dunstan



Zeugswetter Andreas OSB SD wrote:
The closest analogy to what I'm thinking is the perl CPAN 
  

or ruby gems.



I think this is more a developer thing. I don't think an ISP would want
all that automagic (and certainly does not do that for joe user).
  



I think you are missing an essential part of the vision. This is not 
just targetted at developers. Binary distro authors typically include a 
huge number of CPAN modules as well as core Perl, and ISPs typically 
install them. We want to have something CPAN-like so we can get the same 
effect.


At any rate, that's a bit blue sky right now. I haven't seen any 
disagreement with our kissing contrib goodbye as a name, so let's work 
on that. Unfortunately, that's going to involve a bit of pain, including 
in the buildfarm, whose client relies on the name. I'll go to work on 
fixing that, and we can get a new version out so when we make the switch 
the buildfarm doesn't go dark.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Dawid Kuroczko
Hi!

I have sent a patch to pgsql-patches:
  http://archives.postgresql.org/pgsql-patches/2008-04/msg00050.php
...which adds \G command to psql client.

The idea of \G command is to perform the query, but with printing
query results using extended table output format.

For example:

postgres=# SELECT * FROM pg_stat_activity;
 datid | datname  | procpid | usesysid | usename  |
current_query  | waiting |  xact_start   |
 query_start  | backend_start |
client_addr | client_port
---+--+-+--+--+-+-+---+---+---+-+-
 11511 | postgres |   11729 |   10 | postgres | SELECT * FROM
pg_stat_activity; | f   | 2008-04-03 14:40:15.277272+02 |
2008-04-03 14:40:15.277272+02 | 2008-04-03 14:39:50.050512+02 |
 |  -1
(1 row)

postgres=# SELECT * FROM pg_stat_activity\G
-[ RECORD 1 ]-+---
datid | 11511
datname   | postgres
procpid   | 11729
usesysid  | 10
usename   | postgres
current_query | SELECT * FROM pg_stat_activity
waiting   | f
xact_start| 2008-04-03 14:41:47.533763+02
query_start   | 2008-04-03 14:41:47.533763+02
backend_start | 2008-04-03 14:39:50.050512+02
client_addr   |
client_port   | -1

postgres=# SELECT * FROM pg_stat_activity\g
 datid | datname  | procpid | usesysid | usename  |
current_query  | waiting |  xact_start   |
 query_start  | backend_start |
client_addr | client_port
---+--+-+--+--++-+---+---+---+-+-
 11511 | postgres |   11729 |   10 | postgres | SELECT * FROM
pg_stat_activity | f   | 2008-04-03 14:42:09.940897+02 |
2008-04-03 14:42:09.940897+02 | 2008-04-03 14:39:50.050512+02 |
 |  -1
(1 row)

Comments anyone?

   Regards,
   Dawid

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] COPY Transform support

2008-04-03 Thread Dimitri Fontaine
Hi,

Here's a proposal for COPY to support the T part of an ETL, that is adding the 
capability for COPY FROM to Transform the data it gets.

The idea is quite simple: adding to COPY FROM the option to run a function on 
the data before to call datatype_in functions. This needs some syntax 
addition to be worked out at the COPY side, then the COPY code will have to 
run the given function on the read data and consider giving the output of it 
to current COPY code (datatype input function).

The function could either get the data as text or bytea, and would have to 
return either text or bytea. bytea seems the more sensible choice, as long as 
we don't lose encoding information there, which I'm not sure about.

The syntax could be something like:
 COPY mytable FROM '/my/file.txt' WITH COLUMN x CONVERT USING myfunc;

I tried to only add keywords already present in [1], while getting something 
meaningfull... and x is intended to be the column number, counting from 1.
 [1] http://www.postgresql.org/docs/8.3/static/sql-keywords-appendix.html

Comments?
-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] modules

2008-04-03 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 At any rate, that's a bit blue sky right now. I haven't seen any
 disagreement with our kissing contrib goodbye as a name, so let's work
 on that. Unfortunately, that's going to involve a bit of pain,

Yes, I'm not sure I see the point of it. It's got a bad name, but changing
it is just putting lipstick on a pig. End users don't know, and don't care,
about contrib. Sysadmins and casual DBAs only care what they can yum install.
That only leaves packagers and hard-core developers, both of whom already
know how contrib works.

Not that I wouldn't want to see some of the good ideas raised in this thread
explored. In particular, I'd love to see some of the more standard contrib
things installable as simple as:

postgres=# INSTALL earthdistance;

Right now contrib is a real catch-all of various things; it would be nice to
categorize them somehow. And by categorize, I emphatically do NOT mean
move to pgfoundry, which is pretty much a kiss of death.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200804030953
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkf04VUACgkQvJuQZxSWSsjmPACeMoaDTXgjqXBKlthPad6D3sWV
qooAn2y0cwnafYwnGonGBEq/6IAbXzlF
=SO7r
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread NikhilS
Hi,


On Thu, Apr 3, 2008 at 6:47 PM, Dimitri Fontaine [EMAIL PROTECTED]
wrote:

 Here's a proposal for COPY to support the T part of an ETL, that is adding
 the
 capability for COPY FROM to Transform the data it gets.

 The idea is quite simple: adding to COPY FROM the option to run a function
 on
 the data before to call datatype_in functions. This needs some syntax
 addition to be worked out at the COPY side, then the COPY code will have
 to
 run the given function on the read data and consider giving the output of
 it
 to current COPY code (datatype input function).

 The function could either get the data as text or bytea, and would have to
 return either text or bytea. bytea seems the more sensible choice, as long
 as
 we don't lose encoding information there, which I'm not sure about.

 The syntax could be something like:
  COPY mytable FROM '/my/file.txt' WITH COLUMN x CONVERT USING myfunc;

 I tried to only add keywords already present in [1], while getting
 something
 meaningfull... and x is intended to be the column number, counting from 1.
  [1] http://www.postgresql.org/docs/8.3/static/sql-keywords-appendix.html

 Comments?
 --
 dim


+1

Data transformation while doing a data load is a requirement now and then.
Considering that users will have to do mass updates *after* the load
completes to mend the data to their liking should be reason enough to do
this while the loading is happening. I think to go about it the right way we
should support the following:


* The ability to provide per-column transformation expressions

COPY mytable (col1 transform to col1 + 10, col2 transform to 'Post' ||
'greSQL', col3...) FROM ..

* The ability to use any kind of expressions while doing the transformation
The transformation expression should be any expression (basically
ExecEvalExpr) that can be evaluated to give a resulting value and obviously
a corresponding is_null value too. It should and could be system in-built
functions (e.g. UPPER, TRIM, TO_CHAR, TO_NUMBER etc.) or user defined
functions too

* The transformation expression can refer to other columns involved in the
load. So that when the current row is extracted from the input file, the
current values should be used to generate the new resultant values before
doing a heap_form_tuple. E.g.

(col1 transform col1 + 10, col2 transform col1 * col2, col3 transform
UPPER(col1 || col3),...)

I have spent some thoughts on how to do this and will be happy to share the
same if the list is interested. Personally, I think data transformation
using such expressions is a pretty powerful and important activity while
doing the data load itself.

Regards,
Nikhils
-- 
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] modules

2008-04-03 Thread Aidan Van Dyk
* Greg Sabino Mullane [EMAIL PROTECTED] [080403 09:54]:
 
 Right now contrib is a real catch-all of various things; it would be nice to
 categorize them somehow. And by categorize, I emphatically do NOT mean
 move to pgfoundry, which is pretty much a kiss of death.

But that begs the question of *why* it's a kiss of death?

For instance, in perl land, having something in CPAN and not in
perl core is most certainly *not* a kiss of death?  Why is it so
different for PostgreSQL?

Is it because the infrastructure behind CPAN is much better than that
behind pgfoundry?

Or is it because CPAN is better vetted and organized than pgfoundry?

Or is it because the projects that go into CPAN are better quality and
projects in pgroundry?

Or is it something else?

I'm pretty sure the answers to the above questions aren't all yes...

a.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 That line of argument could be used to justify putting anything and
 everything in core.  I think that our extensible architecture is an
 important feature and one we should not hesitate to use to the fullest.

I agree, but part of the problem here is that pgcrypto is extraordinary
overkill for people who just want a better hash function than md5.

Our extensible architecture is a feature, but our
contrib/packaging/gborg/pgfoundry situation is a mess. It's only the
efforts of the distro package maintainers that's kept things from
being even worse.


Here's what it boils down to for me:

1) Postgres has the md5() function, which is not ever getting removed.

2) Since it exists, people are using it.

3) Not having a builtin sha1() means we are less compatible with other
databases. Fair? Perhaps not. But requiring an installation of
pgcrypto, or plperl, is another hurdle to be cleared by people porting
and using applications with Postgres as a backend.

4) We're also encouraging the use of md5() by making it the only option.
Yes, we can talk about why people *shouldn't* use it for this purpose
or that, but they will.

5) It seems unwise to go through the trouble of just adding sha1(), when
we could easily add some better hashes, which has the nice side effect
of making us stand out more and push the envelope, rather than play follow
the leader, as was mentioned at PGCon East.


- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200804031020
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkf06AIACgkQvJuQZxSWSshJGACcDlE/sUBTJNx36zMW7C9G2FqE
n0QAoLOj50gGura/g2JCk+3sFxR0cLb1
=K8sl
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch queue - wiki (was varadic patch)

2008-04-03 Thread Aidan Van Dyk
The one concern I have with the way the last commitfest went (and I say
this as strictly an observer), there was no discussion on anything.

Now, I know that discussion happened, but it happened somewhere, in some
web-forum, in a community that seems to generally promote mailing lists
as the preferred method of discussion.

As an observer, who generally doesn't have much input code wise, but
occasionally might have an observation as a user, *I* would love to see
the commitfest patch-queue be something pretty simple, along the
lines of a big list of:

1) item name, submission date, author
2a) item intention (maybe a see $MSGID)
2b) item (see $MSGID)
3) status summary (in discussion, applied, needs $improvements,
rejected, see $MSGID

Note I said item because it appears as if the consensus is that the
commit-fest has to deal with more than just patches, but also proposals,
and fork-in-the-road details.

And no, I don't think it should included the 2K emails.  It should can
the $N items needing to be dealt with, and a list of pointers to
messages (which generally lead to threads), with a simple status
list/summary for each one (again with pointers to $MSGID where specific
information might be needed).

Basically, I would like to see the patch queue be more a
summary/pointer of/to discussion, then some web forum where the
discussion happens.  And I would like the mailling lists be where the
discussion of items in the patch queue happens.

But all this is the opinion of an observing devellopper, not involved in
any of the heavy-lifting, but as someone who would like to keep an eye
on what patches are presented, and their strengths/deficiencies, so that
when I present my first patch/proposal, hopefully I can avoid most of
the pitfalls.

But don't cater to me.  Cater to Tom and Bruce, who are the ones who
actually use whatever is in place.  Since they are the ones doing the
work, I have to accept (or ignore) whatever system they use.

a.

* Bruce Momjian [EMAIL PROTECTED] [080402 19:36]:
 
 It is not clear to me how a wiki can be easily created for 2k emails and
 then maintained in a reasonable way, or how emails can be added to it
 easily.
 
 There are several steps:
 
   o  getting those 2k emails to start the commit fest
   o  getting them into a wiki in a way that is fast/efficient
   o  updating the wiki for changes efficiently
 
 Keep in mind the patch emails are pretty dynamic.  As you get closer to
 the end of the commit fest, the wiki is easier because the list of open
 items becomes more stable.
 
 I am able to give others the ability to add, move, and delete emails in
 my patch queue, if desired.
 
 If people want to use the wiki, go ahead --- this would be one less job
 for me to do.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Tom Lane
Dawid Kuroczko [EMAIL PROTECTED] writes:
 The idea of \G command is to perform the query, but with printing
 query results using extended table output format.

Seems a bit useless --- if you prefer \x format, wouldn't you prefer it
all the time?  Or at least often enough that the toggling command is
fine?  I'm dubious that this is worth eating up a command letter for.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread PFC


Data transformation while doing a data load is a requirement now and  
then.

Considering that users will have to do mass updates *after* the load
completes to mend the data to their liking should be reason enough to do
this while the loading is happening. I think to go about it the right  
way we

should support the following:



* The ability to provide per-column transformation expressions
* The ability to use any kind of expressions while doing the  
transformation

The transformation expression should be any expression (basically
ExecEvalExpr) that can be evaluated to give a resulting value and  
obviously

a corresponding is_null value too. It should and could be system in-built
functions (e.g. UPPER, TRIM, TO_CHAR, TO_NUMBER etc.) or user defined
functions too
* The transformation expression can refer to other columns involved in  
the

load. So that when the current row is extracted from the input file, the
current values should be used to generate the new resultant values before
doing a heap_form_tuple. E.g.
(col1 transform col1 + 10, col2 transform col1 * col2, col3 transform
UPPER(col1 || col3),...)
I have spent some thoughts on how to do this and will be happy to share  
the

same if the list is interested. Personally, I think data transformation
using such expressions is a pretty powerful and important activity while
doing the data load itself.



	Well, since COPY is about as fast as INSERT INTO ... SELECT plus the  
parsing overead, I suggest adding a special SELECT form that can read from  
a file instead of a table, which returns tuples, and which therefore can  
be used and abused to the user's liking. This is a much more powerful  
feature because :


- there is almost no new syntax
- it is much simpler for the user
- lots of existing stuff can be leveraged

EXAMPLE :

	Suppose I want to import a MySQL dump file (gasp !) which obviously  
contains lots of crap like -00-00 dates, '' instead of NULL, borken  
foreign keys, etc.


Let's have a new command :

CREATE FLATFILE READER mydump (
id  INTEGER,
dateTEXT,
...
) FROM file 'dump.txt'
(followed by delimiter specification syntax identical to COPY, etc)
;

	This command would create a set-returning function which is basically a  
wrapper around the existing parser in COPY.
	Column definition gives a name and type to the fields in the text file,  
and tells the parser what to expect and what to return.
	It looks like a table definition, and this is actually pretty normal : it  
is, after all, very close to a table.


	INSERT INTO mytable (id, date, ...) SELECT id, NULLIF( date, '-00-00'  
), ... FROM mydump WHERE (FKs check and drop the borken records);


	Now I can import data and transform it at will using a simple SELECT. The  
advantage is that everybody will know what to do without learning a new  
command, no awkward syntax (transform...), you can combine columns in  
expressions, JOIN to ckeck FKs, use ORDER to get a clustered table,  
anything you want, without any extension to the Postgres engine besides  
the creation of this file-parsing set-returning function, which should be  
pretty simple.


	Or, if I have a few gigabytes of logs, but I am absolutely not interested  
in inserting them into a table, instead I want to make some statistics, or  
perhaps I want to insert into my table some aggregate computation from  
this data, I would just :


CREATE FLATFILE READER accesses_dump (
dateTEXT,
ip  INET,
...
) FROM file 'web_server_logtxt';

And I can do some stats without even loading the data :

	SELECT ip, count(*) FROM accesses_dump GROUP BY ip ORDER BY count(*)  
HAVING count(*)  1000;


	Much better than having to load those gigabytes just to make a query on  
them...
































--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch queue - wiki (was varadic patch)

2008-04-03 Thread Tom Lane
Aidan Van Dyk [EMAIL PROTECTED] writes:
 The one concern I have with the way the last commitfest went (and I say
 this as strictly an observer), there was no discussion on anything.

Umm ... in the first place, the fest isn't over yet.  In the second
place, the reason you haven't seen much discussion is that we've been
working primarily on the stuff that could be committed without much
discussion.  That underbrush has mostly been cleared away at this point,
and we're starting to get down to the stuff that actually will need
extended discussion.  That should definitely happen on this list.

The remaining open issues are listed here:
http://momjian.us/cgi-bin/pgpatches
Feel free to start talking about any of them ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Andrew Dunstan



Greg Sabino Mullane wrote:

At any rate, that's a bit blue sky right now. I haven't seen any
disagreement with our kissing contrib goodbye as a name, so let's work
on that. Unfortunately, that's going to involve a bit of pain,



Yes, I'm not sure I see the point of it. It's got a bad name, but changing
it is just putting lipstick on a pig. End users don't know, and don't care,
about contrib. Sysadmins and casual DBAs only care what they can yum install.
That only leaves packagers and hard-core developers, both of whom already
know how contrib works.


  
If this were at all true we would not not have seen the complaints from 
people along the lines of My ISP won't install contrib. But we have, 
and quite a number of times. We have concrete evidence that calling it 
contrib actually works against us.


It's also worth pointing out that WE HAVE HAD THIS DISCUSSION BEFORE. 
Sometimes I get rather frustrated by our habit of turning time into a 
circle and running Groundhog Day.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Dawid Kuroczko
On Thu, Apr 3, 2008 at 4:35 PM, Tom Lane [EMAIL PROTECTED] wrote:
 Dawid Kuroczko [EMAIL PROTECTED] writes:
   The idea of \G command is to perform the query, but with printing
   query results using extended table output format.

  Seems a bit useless --- if you prefer \x format, wouldn't you prefer it
  all the time?  Or at least often enough that the toggling command is
  fine?  I'm dubious that this is worth eating up a command letter for.

No, the point is that I usually have mixed queries -- ones which are
most comfortably viewed in normal format (many not-so-long rows),
and ones which are best viewed expanded (little rows, many columns).

Alternating between formats using \x is, at least for me, a bit
cumbersome: usually _after_ I wrote a query I realize it would
look more readable in expanded format, which is a bit too late.
So I run the query, ctrl+c, \x, rerun the query... and forget to
turn expanded mode off afterwards.

I think that ability to decide about the format after the query,
not before, can be quite useful especially when writing ad-hoc
queries. Incidentally \g and \G is also used more or less
similarily by our dolphin-loving friends -- which doesn't help
using \G for other things.

   Regards,
 Dawid

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Tom Lane
Dimitri Fontaine [EMAIL PROTECTED] writes:
 Here's a proposal for COPY to support the T part of an ETL, that is adding 
 the 
 capability for COPY FROM to Transform the data it gets.

 The idea is quite simple: adding to COPY FROM the option to run a function on 
 the data before to call datatype_in functions.

The major concern I have about this is to ensure that no detectable
overhead is added to COPY when the feature isn't being used.

I am not actually convinced that the column-by-column design you seem to
have in mind is worth anything.  The examples that I remember seeing
often involve removing columns, generating one column from multiple ones
or vice versa, dealing with nonstandard column delimiters, etc.  What
would makes sense in my mind is a single function taking and returning
text, which is invoked once on each complete input line before it is
broken into fields.

This is, of course, just a substitute for running a sed or perl or
similar script over the data before feeding it to COPY --- and probably
not an amazingly good substitute at that.  For instance, assuming you
like perl for text-wrangling, I'd fully expect the function approach
to be slower than an external script because of the large overhead of
getting into and out of libperl for each line,

In situations where it's actually useful to apply SQL functions rather
than text-mangling operations to the data, you always have the option to
COPY into a temp table and then do INSERT/SELECT from there.

So the whole thing seems just marginally attractive to me.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Dimitri Fontaine
Le jeudi 03 avril 2008, PFC a écrit :
 CREATE FLATFILE READER mydump (
 id  INTEGER,
 dateTEXT,
 ...
 ) FROM file 'dump.txt'
 (followed by delimiter specification syntax identical to COPY, etc)
 ;
[...]
 INSERT INTO mytable (id, date, ...) SELECT id, NULLIF( date,
 '-00-00'   ), ... FROM mydump WHERE (FKs check and drop the borken
 records);

What do we gain against current way of doing it, which is:
  COPY loadtable FROM 'dump.txt' WITH ...
  INSERT INTO destination_table(...) SELECT ... FROM loadtable; 

-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Csaba Nagy
On Thu, 2008-04-03 at 16:44 +0200, PFC wrote:
 CREATE FLATFILE READER mydump (
   id  INTEGER,
   dateTEXT,
   ...
 ) FROM file 'dump.txt'
 (followed by delimiter specification syntax identical to COPY, etc)
 ;

Very cool idea, but why would you need to create a reader object
first ? You should be able to use COPY directly with the target table
being omitted,  meaning the copy will not pump it's result in the target
but be equivalent to a select... and use it in any place where a select
can be used. This would have absolutely no new  syntax, just the rules
changed... 

Now that I had a second look you actually need the field definitions to
meaningfully interpret the file, but then why not use a record
specification instead of the table in the normal COPY command ? I'm not
sure if there's any existing syntax for that but I would guess yes...

In any case, such a feature would help a lot in processing input files
based also on other existing data in the DB.

Cheers,
Csaba.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread PFC

INSERT INTO mytable (id, date, ...) SELECT id, NULLIF( date,
'-00-00'   ), ... FROM mydump WHERE (FKs check and drop the borken
records);


What do we gain against current way of doing it, which is:
  COPY loadtable FROM 'dump.txt' WITH ...
  INSERT INTO destination_table(...) SELECT ... FROM loadtable;


	You read and write the data only once instead of twice (faster) if you  
want to import all of it.
	If you just want to compute some aggregates and store the results in a  
table, you just read the data once and don't write it at all.


	The advantages are the same than your proposed transformations to COPY,  
except I feel this way of doing it opens more options (like, you can  
combine columns, check FKs at load, do queries on data without loading it,  
don't necessarily have to insert the data in a table, don't have to invent  
a new syntax to express the transformations, etc).







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread PFC

On Thu, 03 Apr 2008 16:57:53 +0200, Csaba Nagy [EMAIL PROTECTED] wrote:


On Thu, 2008-04-03 at 16:44 +0200, PFC wrote:

CREATE FLATFILE READER mydump (
id  INTEGER,
dateTEXT,
...
) FROM file 'dump.txt'
(followed by delimiter specification syntax identical to COPY, etc)
;


Very cool idea, but why would you need to create a reader object
first ? You should be able to use COPY directly with the target table
being omitted,  meaning the copy will not pump it's result in the target
but be equivalent to a select... and use it in any place where a select
can be used. This would have absolutely no new  syntax, just the rules
changed...

Now that I had a second look you actually need the field definitions to
meaningfully interpret the file,


	Yeah, you need to tell Postgres the field names, types, and NULLness  
before it can parse them... or else it's just a plain flat text file which  
makes no sense...

but then why not use a record
specification instead of the table in the normal COPY command ? I'm not
sure if there's any existing syntax for that but I would guess yes...


	Hm, yeah, that's even simpler, just create a type for the row (or just  
use table%ROWTYPE if you have a table that fits the description), and tell  
COPY to parse according to the row type definition... smart...


Like :

CREATE TYPE import_rowtype AS (id INTEGER, date TEXT);
INSERT INTO mytable (id, date, ...)
  SELECT id, NULLIF( date, '-00-00' )::DATE
  FROM (COPY AS import_rowtype FROM 'mysql_trash.txt') AS foo
  WHERE (FKs check and drop the borken records);

Looks clean...

	Obviously, in this case (and also in my proposal's case) you must use  
COPY and not \copy since it is the database server which will be reading  
the file.
	This could probably be hacked so the client sends the file via the \copy  
interface, too...



In any case, such a feature would help a lot in processing input files
based also on other existing data in the DB.


Yeah, it would be cool.
	Also, since COPY TO can use a SELECT as a data source, you could use  
postgres to read from a file/pipe, process data, and write to a file/pipe  
(kinda better than sed, lol)


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Mark Mielke

Greg Sabino Mullane wrote:

4) We're also encouraging the use of md5() by making it the only option.
Yes, we can talk about why people *shouldn't* use it for this purpose
or that, but they will.
  


There is always the Java route - internal classes have package-scope 
constructors to specifically prevent them from being accidentally used 
(and relied on).


I prefer the let them use it, but warn them not to have expectations 
route, which is what PostgreSQL is doing today. The above is not a 
legitimate reason to provide additional functions in the core.



5) It seems unwise to go through the trouble of just adding sha1(), when
we could easily add some better hashes, which has the nice side effect
of making us stand out more and push the envelope, rather than play follow
the leader, as was mentioned at PGCon East


This presumes that better hashes truly exist. It is basic math to show 
that all hashes will include collisions. Ignoring the possibility that 
one hash has theoretical better distribution for real documents, the 
real benefit of SHA-1 over MD5, is that it has more bits. The 
ultimate solution here, is to store the original using the full copy 
hash technique, with 0 chance of collision. This extreme defeats the 
purpose of a hash to start with.


Why does PostgreSQL need something better than md5 as part of core? 
Bragging rights?


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Tom Dunstan
On Thu, Apr 3, 2008 at 8:25 PM, Andrew Dunstan [EMAIL PROTECTED] wrote:

  If this were at all true we would not not have seen the complaints from
 people along the lines of My ISP won't install contrib. But we have, and
 quite a number of times. We have concrete evidence that calling it contrib
 actually works against us.

It's hard to see ISPs who won't install contrib from installing
${random module} from the big bad internet as has been discussed in
this thread, but who knows?

If we go with a solution that allows users to say install mymodule;
or whatever into their own database, is there any reason not to
install (as in make install) all modules currently called contrib by
default? Are there any security issues with modules in there? I seem
to remember something coming up involving dblink a while back...

Cheers

Tom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 3 Apr 2008 21:03:05 +0530
Tom Dunstan [EMAIL PROTECTED] wrote:

 On Thu, Apr 3, 2008 at 8:25 PM, Andrew Dunstan [EMAIL PROTECTED]
 wrote:
 
   If this were at all true we would not not have seen the complaints
  from people along the lines of My ISP won't install contrib. But
  we have, and quite a number of times. We have concrete evidence
  that calling it contrib actually works against us.
 
 It's hard to see ISPs who won't install contrib from installing
 ${random module} from the big bad internet as has been discussed in
 this thread, but who knows?

Sure it is. The very word contrib brings about ideas of things like:

Unstable, Cooker, unofficial.

modules is completely different (from a perception perspective).

IMO the core modules should be compiled via configure with something
like:

./configure --enable-module=ALL 

or

./configure --enable-module=pgcrypto --enable-module=cube

This would install all the modules but not enable them in the database
itself (of course). This could also be extended to the pls so that we
have exactly one mechanism to control those options as well.

./configure --enable-module=pgcrypto --enable-module=plperl

Sincerely,

Joshua D. Drake



- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH9PwCATb/zqfZUUQRAoWtAKCdbdcv4KdOIdiF8gcjebWTIrub1gCgg8RU
QaatCVhlETRkA6+5wyYNdRM=
=z1gI
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Dimitri Fontaine
Le jeudi 03 avril 2008, Tom Lane a écrit :
 The major concern I have about this is to ensure that no detectable
 overhead is added to COPY when the feature isn't being used.

Well, when COLUMN x CONVERT USING or whatever syntax we choose is not used, we 
default to current code path, that is we do not mess with data content at all 
before to consider it's valid input syntax for target table datatypes.

And the syntax check is done only once, before beginning to read the data 
lines from the file.

 I am not actually convinced that the column-by-column design you seem to
 have in mind is worth anything.  The examples that I remember seeing
 often involve removing columns, generating one column from multiple ones
 or vice versa, dealing with nonstandard column delimiters, etc.

Yes, this is another need, but actually better solved, in my opinion, with 
loading data into a (temp) loadtable then process it with SQL:
  INSERT INTO destination_table SELECT whatever FROM loadtable;

The problem I'm trying to solve is not this one, I'm trying to have COPY able 
to load data into a table when the representation of it we have into the file 
does not match what datatype input function expects.

An example might help us talking about the same thing. mysqldump CSV outputs 
timestamp sometimes (depending on server version) as '20041002152952' when 
PostgreSQL expects '2004-10-02 15:29:52'. I'd like COPY to be able to cope 
with this situation.

Now, another syntax proposal could have both the needs solved. We basically 
need to be able to transform input fields and process them into input 
columns, in a way that N input fields (found in the data file) will get us M 
input columns:

  COPY destination_table(col1, col2, col3, col4) 
 USING (field1, field2 || field3, myfunc(field4, field5))
  FROM 'file.txt'
  WITH ...

This could get better than preprocessing then COPY then INSERT INTO ... SELECT 
because we don't need a temp table (don't need to care about its name being 
unique, nor to mess up with temp_buffers), etc. 
You're the one able to tell why it'll be better to have one COPY command 
instead of a two table steps load, I'm just guessing ;)

And if it's better for the user to preprocess in perl then COPY, he still has 
the option.
-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [HACKERS] Patch queue - wiki (was varadic patch)

2008-04-03 Thread Bruce Momjian
Dave Page wrote:
 On Thu, Apr 3, 2008 at 12:35 AM, Bruce Momjian [EMAIL PROTECTED] wrote:
   It is not clear to me how a wiki can be easily created for 2k emails and
   then maintained in a reasonable way, or how emails can be added to it
   easily.
 
 That seems like a *really* odd thing for one of the founders of the
 world's most advanced OSS DBMS project to say. It's all relational
 (which we do do pretty well) - we can add links to the wiki to threads
 in the archives, and anything posted from then on is self-maintaining
 (except when new threads are started - but even if each patch gets 5
 threads that's not a huge chore).
 
 I see no reason to go manually copying all 2k emails to the wiki.

Well, I am waiting for someone to show me how it is done because I can't
figure out a way.  Do it and I will gladly stop doing what I am doing.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Small TRUNCATE glitch

2008-04-03 Thread Tom Lane
Just noticed that TRUNCATE fails to clear the stats collector's counts
for the table.  I am not sure if it should reset the event counts or
not (any thoughts?) but surely it is wrong to not zero the live/dead
tuple counts.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Peter Eisentraut
Am Donnerstag, 3. April 2008 schrieb Andrew Dunstan:
 If this were at all true we would not not have seen the complaints from
 people along the lines of My ISP won't install contrib. But we have,
 and quite a number of times. We have concrete evidence that calling it
 contrib actually works against us.

ISPs also won't install additional Perl modules, for example.  Yet, CPAN does 
exist successfully.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Bruce Momjian
Dawid Kuroczko wrote:
 On Thu, Apr 3, 2008 at 4:35 PM, Tom Lane [EMAIL PROTECTED] wrote:
  Dawid Kuroczko [EMAIL PROTECTED] writes:
The idea of \G command is to perform the query, but with printing
query results using extended table output format.
 
   Seems a bit useless --- if you prefer \x format, wouldn't you prefer it
   all the time?  Or at least often enough that the toggling command is
   fine?  I'm dubious that this is worth eating up a command letter for.
 
 No, the point is that I usually have mixed queries -- ones which are
 most comfortably viewed in normal format (many not-so-long rows),
 and ones which are best viewed expanded (little rows, many columns).
 
 Alternating between formats using \x is, at least for me, a bit
 cumbersome: usually _after_ I wrote a query I realize it would
 look more readable in expanded format, which is a bit too late.
 So I run the query, ctrl+c, \x, rerun the query... and forget to
 turn expanded mode off afterwards.
 
 I think that ability to decide about the format after the query,
 not before, can be quite useful especially when writing ad-hoc
 queries. Incidentally \g and \G is also used more or less
 similarily by our dolphin-loving friends -- which doesn't help
 using \G for other things.

It seems more helpful if there were \x option to use extended format
only when the output is too wide.  TODO already has:

o Add auto-expanded mode so expanded output is used if the row
  length is wider than the screen width.

  Consider using auto-expanded mode for backslash commands like \df+.


-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] printTable API (was: Show INHERIT in \du)

2008-04-03 Thread Brendan Jurd
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 31/03/2008, Tom Lane  wrote:
 Brendan Jurd  writes:
1. describe malloc's the cells to zero, but print just does a local
   calloc without any initialisation.

 There isn't any functional difference there.  I am not sure, but I think
  the reason print.c has its own malloc wrappers instead of depending on
  common.c's is that we use print.c in some bin/scripts/ programs that
  do not want common.c too.


Yeah, it looks like createlang and droplang use print.c to emit a list
of installed languages.

2. describe only does an mbvalidate for WIN32, but print does it in all 
 cases.

 I don't know why describe only does that for WIN32; it looks
  inconsistent to me too.  Possibly some trolling in the CVS history would
  give a clue about this.


Well, mbvalidate was originally added to print.c in 2001, as part of a
big patch to add multibyte support to psql [1].  However, it was only
added to describe much later (2003) in response to a bug report about
8-bit  characters not displaying correctly on the Windows console [2].
 I think that because the bug was only observed in Windows, the patch
was added #ifdef WIN32, even though print.c was already using
mbvalidate for all content.

This nicely illustrates the nuisance inherent to duplication of code!

Based on this, I'm going to go ahead with using mbvalidate in all cases.

Cheers,
BJ

[1] http://repo.or.cz/w/PostgreSQL.git?a=commit;h=a428cef1
[2] http://repo.or.cz/w/PostgreSQL.git?a=commit;h=e6a16c17
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFH9QFt5YBsbHkuyV0RAv2ZAJ4/rfyjgFOh8XZo6aJo68dz5NsovQCgmf40
fCXMlsHdg1r4oTpfZD5DH+0=
=PrN1
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-03 Thread Pavan Deolasee
On Thu, Apr 3, 2008 at 4:34 AM, Tom Lane [EMAIL PROTECTED] wrote:


  The right way seems to be to treat our own insertions as live during
  ANALYZE, but then subtract off our own pending insertions from the
  live-tuples count sent to the stats collector.  pgstat_report_analyze()
  can handle the latter part by groveling through the backend's pending
  statistics data.


Seems like a right approach to me. I assume we shall do the same for
DELETE_IN_PROGRESS tuples.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Tom Dunstan
On Thu, Apr 3, 2008 at 9:17 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
   It's hard to see ISPs who won't install contrib from installing
   ${random module} from the big bad internet as has been discussed in
   this thread, but who knows?

  Sure it is. The very word contrib brings about ideas of things like:

  Unstable, Cooker, unofficial.

Point taken, and I completely agree. Part of the problem is that we
have explicitly encouraged this perception, ie it's in contrib so the
barrier to entry is lower. That may not be the case anymore, or it
may just be that the bar is really really high for non-contrib stuff
vs other projects. Whatever the actual case is, I agree that the name
is unfortunate.

When I wrote the above I was thinking about it from the other way
around: doing a cpan or gem install of some random module seems even
less safe to me, but maybe I'm just revealing confidence in pgsql or
fear of some cpan code etc that ISPs don't share.

  This would install all the modules but not enable them in the database
  itself (of course). This could also be extended to the pls so that we
  have exactly one mechanism to control those options as well.

  ./configure --enable-module=pgcrypto --enable-module=plperl

That's basically where I was heading, although I took it a step
further: why not build and install all possible modules by default, if
we think they're up to quality?

One answer is: what do you do if some required library isn't
available? Do you fail with an error message or just don't build that
module? I don't like the idea of e.g. accidentally and silently not
installing pl/perl just because the sysadmin hadn't installed their
perl-devel package or whatever.

--enable-module=ALL could be pretty good, though, especially if it
build pl/perl etc that most sysadmins will want to install but do so
in less configure args. :)

Cheers

Tom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 3 Apr 2008 21:45:52 +0530
Tom Dunstan [EMAIL PROTECTED] wrote:

   This would install all the modules but not enable them in the
  database itself (of course). This could also be extended to the pls
  so that we have exactly one mechanism to control those options as
  well.
 
   ./configure --enable-module=pgcrypto --enable-module=plperl
 
 That's basically where I was heading, although I took it a step
 further: why not build and install all possible modules by default, if
 we think they're up to quality?

Good point.

 
 One answer is: what do you do if some required library isn't
 available? 

If we build by default, then when a library isn't found the configure
output tells you:

Looking for Perl Development packages: No , disabling plperl build.
 
 --enable-module=ALL could be pretty good, though, especially if it
 build pl/perl etc that most sysadmins will want to install but do so
 in less configure args. :)

Right. I am using the Apache model here.

Sincerely,

Joshua D. Drake



- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH9QQ2ATb/zqfZUUQRAsD9AJ9b9/12ZtaJ/CpnQ3y0xH7U3a0EYACfVeUJ
FKUyEmuuw9nx3F+sk4mL2eQ=
=sA7I
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Ron Mayer

Aidan Van Dyk wrote:

* Greg Sabino Mullane [EMAIL PROTECTED] [080403 09:54]:

I emphatically do NOT mean
move to pgfoundry, which is pretty much a kiss of death.


But that begs the question of *why* it's a kiss of death?



For instance, in perl land, having something in CPAN and not in
perl core is most certainly *not* a kiss of death?  Why is it so
different for PostgreSQL?
Is it because the infrastructure behind CPAN is much better than that
behind pgfoundry?


I wouldn't say one is better than the other.  PGFoundry and CPAN have
totally disjoint feature sets.  PgFoundry's like SoruceForge +
Bugtrackers + Discussion Forums + Surveys + Mailing Lists -- pretty
much everything except installable packages.

CPAN and RubyGems is very much focused on installable packages.


Or is it because CPAN is better vetted and organized than pgfoundry?

Or is it because the projects that go into CPAN are better quality and
projects in pgroundry?


To simplify those two:
CPAN contains installers that mostly just work.
PGFoundry contains mostly works-in-progress without installers.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Svenne Krap

Mark Mielke wrote:
This presumes that better hashes truly exist. It is basic math to show 
that all hashes will include collisions. Ignoring the possibility that 
one hash has theoretical better distribution for real documents, the 
real benefit of SHA-1 over MD5, is that it has more bits. The 
ultimate solution here, is to store the original using the full 
copy hash technique, with 0 chance of collision. This extreme defeats 
the purpose of a hash to start with.


Why does PostgreSQL need something better than md5 as part of core? 
Bragging rights?
Having more than one hash algorithm significantly decreases the risk of 
(common) collisions.


As a non-developer (who does track most messages on the list anyways), I 
surely find the SHA* functions will add significantly value and they 
should be easy to install (well-defined functions) with no maintainance 
afterwards.
Hashes are an absolute minimum for keeping passwords stored somehat 
safely in a database.


More two or even three different hashes with different collion-points 
will strongly increase the security.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Mark Mielke

Svenne Krap wrote:

Mark Mielke wrote:
This presumes that better hashes truly exist. It is basic math to 
show that all hashes will include collisions. Ignoring the 
possibility that one hash has theoretical better distribution for 
real documents, the real benefit of SHA-1 over MD5, is that it has 
more bits. The ultimate solution here, is to store the original 
using the full copy hash technique, with 0 chance of collision. 
This extreme defeats the purpose of a hash to start with.


Why does PostgreSQL need something better than md5 as part of core? 
Bragging rights?
Having more than one hash algorithm significantly decreases the risk 
of (common) collisions.


No it doesn't. More bits reduces risk of collisions. Additional 
algorithms just muddy the waters.


As a non-developer (who does track most messages on the list anyways), 
I surely find the SHA* functions will add significantly value and they 
should be easy to install (well-defined functions) with no 
maintainance afterwards.
Hashes are an absolute minimum for keeping passwords stored somehat 
safely in a database.


It has yet to be proven that MD5 is insufficient for this purpose. 
Significant value being what?


More two or even three different hashes with different collion-points 
will strongly increase the security.


No it doesn't unless you are thinking about a security through obscurity 
argument.


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Ron Mayer

D'Arcy J.M. Cain wrote:


Check out NetBSD pkgsrc as a model.  It is very flexible.  One nice
thing would be the ability to specify where the packages are rather
than always insisting that they be on pgfoundry.


Yup - a feature shared by RubyGems:
  gem install rails –source http://gems.rubyonrails.org

Many of the most popular modules seem to live outside of
pgfoundry anyway (postgis, the contrib ones, etc); so I'd
think even if we maintain a central repository we want to
make sure it can install from other sites.



Perl and Ruby are languages - Postgres is a very different animal.


...Overall though
I don't think that what is being installed to changes much.  The basics
remain the same - define the package with latest version, download if
necessary,check that the source package is the correct, tested one,
build, install, register.


+1.   From the end user I think he cares that the software is installed
with the required dependencies and passes any included regression tests.
Bonus points if it also registers itself in his database.

And in the ruby/gems world the Windows guys seem not to have liked
the check...source packages...build so they include precompiled
windows libraries for those guys in many Ruby Gems.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-03 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 Please see the attached patch. One change I made is to hold the SHARE lock
 on the page while ANALYZE is reading tuples from it. I thought it would
 be a right thing to do instead of repeatedly acquiring/releasing the lock.

I've applied a modified/extended form of this patch for 8.3.2.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread D'Arcy J.M. Cain
On Thu,  3 Apr 2008 13:54:11 -
Greg Sabino Mullane [EMAIL PROTECTED] wrote:
 Right now contrib is a real catch-all of various things; it would be nice to
 categorize them somehow. And by categorize, I emphatically do NOT mean
 move to pgfoundry, which is pretty much a kiss of death.

Yes!  I have plenty of FTP servers to put up my own open source
projects.  It would annoy me if I was forced to use someone else's
development environment.  Whatever we do should allow for packages to
be picked up from anywhere.  We can use MD5 checksums to assure users
that no one has changed the file since it was tested and packaged.

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Brendan Jurd
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 04/04/2008, Joshua D. Drake  wrote:
 Tom Dunstan  wrote:
  
   One answer is: what do you do if some required library isn't
   available?


 If we build by default, then when a library isn't found the configure
  output tells you:

  Looking for Perl Development packages: No , disabling plperl build.


That might easily go unnoticed in amongst all the other configure
output.  It would only be effective if the messages were repeated
again at the end of the configure, or configure somehow draws
attention to the fact that there was a problem.

Another approach I've come across is to fail with an error message like

Perl development files not found, required to build module plperl.
Install these files or configure with --disable-module=plperl

Cheers,
BJ
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFH9Qfu5YBsbHkuyV0RAmKIAJ9eBkAGaw5kBmahk4CzJ4JbrkmitACff9DB
eYYSl1SiANAaAyky/3QBSIs=
=Fg12
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 3 Apr 2008 12:35:31 -0400
D'Arcy J.M. Cain [EMAIL PROTECTED] wrote:

 On Thu,  3 Apr 2008 13:54:11 -
 Greg Sabino Mullane [EMAIL PROTECTED] wrote:
  Right now contrib is a real catch-all of various things; it would
  be nice to categorize them somehow. And by categorize, I
  emphatically do NOT mean move to pgfoundry, which is pretty much a
  kiss of death.

Pgfoundry is not a kiss of death except that you spread falsehoods like
that. PGfoundry is a very alive project that is constantly adding
content and has continuing and very active projects.

Joshua D. Drake



- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH9QjVATb/zqfZUUQRAmuMAKCR/+mgHqB9TTsdI0G3Ax2Y5ry4SQCfQMNt
d7+jcUa3pDirWo34n7dqg2o=
=p4Oq
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Magnus Hagander
Andrew Dunstan wrote:
 
 
 Ron Mayer wrote:
  Andrew Dunstan wrote:
  Tom Lane wrote:
  as having better system support for packages or modules or
  whatever you want to call them; and maybe we also need some
  marketing-type
 
  ...re-raise the question of getting rid of contrib...
  The PostgreSQL Standard Modules. 
 
  While renaming, could we go one step further and come up with a
  clear definition of what it takes for something to qualify as
  a module?   In particular I think standardizing the installation
  would go a long way to letting packagers automate the installation
  of modules from pgfoundry.
 
  I think it'd be especially cool if one could one-day have a command
 
pg_install_module  [modulename] -d [databasename]
 
  and it would magically get (or verify that it had) the latest
  version from pgfoundry; compile it (if needed) and install it
  in the specified database.
 
  The closest analogy to what I'm thinking is the perl CPAN or ruby
  gems.
 
 
 Yes, and the CPAN analogy that has been in several minds, but it only 
 goes so far. Perl and Ruby are languages - Postgres is a very
 different animal.
 
 We do in fact have some support for building / installing some
 modules in a standard way. It's called pgxs and it is used by quite a
 number of existing modules.

On Windows we also have the StackBuilder application which is used for
installation of binary modules.


//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Alvaro Herrera
Bruce Momjian escribió:

 It seems more helpful if there were \x option to use extended format
 only when the output is too wide.  TODO already has:
 
 o Add auto-expanded mode so expanded output is used if the row
   length is wider than the screen width.
 
   Consider using auto-expanded mode for backslash commands like \df+.

Some sort of \x auto?  Sounds interesting ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread D'Arcy J.M. Cain
On Thu, 3 Apr 2008 09:41:57 -0700
Joshua D. Drake [EMAIL PROTECTED] wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On Thu, 3 Apr 2008 12:35:31 -0400
 D'Arcy J.M. Cain [EMAIL PROTECTED] wrote:
 
  On Thu,  3 Apr 2008 13:54:11 -
  Greg Sabino Mullane [EMAIL PROTECTED] wrote:
   Right now contrib is a real catch-all of various things; it would
   be nice to categorize them somehow. And by categorize, I
   emphatically do NOT mean move to pgfoundry, which is pretty much a
   kiss of death.
 
 Pgfoundry is not a kiss of death except that you spread falsehoods like
 that. PGfoundry is a very alive project that is constantly adding
 content and has continuing and very active projects.

Eep!  Careful with attributions.  There is not a single word of mine in
what you included.  I know it technically says that but since your
comments were directed at Greg you really should have replied to his
email and not to mine that included his.

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Martijn van Oosterhout
On Thu, Apr 03, 2008 at 12:07:54PM -0400, Bruce Momjian wrote:
  Alternating between formats using \x is, at least for me, a bit
  cumbersome: usually _after_ I wrote a query I realize it would
  look more readable in expanded format, which is a bit too late.
  So I run the query, ctrl+c, \x, rerun the query... and forget to
  turn expanded mode off afterwards.
 
 It seems more helpful if there were \x option to use extended format
 only when the output is too wide.  TODO already has:

I was thinking that maybe \x should have a one-shot mode, i.e.

\x query

does it only for this one statement. It would solve the OPs problem.

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [HACKERS] modules

2008-04-03 Thread Steve Atkins


On Apr 3, 2008, at 7:01 AM, Aidan Van Dyk wrote:

* Greg Sabino Mullane [EMAIL PROTECTED] [080403 09:54]:

Right now contrib is a real catch-all of various things; it would  
be nice to
categorize them somehow. And by categorize, I emphatically do NOT  
mean

move to pgfoundry, which is pretty much a kiss of death.


But that begs the question of *why* it's a kiss of death?

For instance, in perl land, having something in CPAN and not in
perl core is most certainly *not* a kiss of death?  Why is it so
different for PostgreSQL?

Is it because the infrastructure behind CPAN is much better than that
behind pgfoundry?


Yes. I can install a package from a CPAN mirror with a one-line
incantation and be sufficiently sure it works that on the very rare
occasions it doesn't I'm really surprised.

On the Windows end of things I can usually get pre-built binaries
of those same packages installed, in the cases where a compiler
is needed to build them. The exact process is a bit different, but it's
consistent across most packages and uses the same namespace.

Or is it because CPAN is better vetted and organized than  
pgfoundry?


Partly. Vetted is partly self-vetting - you're expected to pass your  
self
tests and install cleanly before you publish to CPAN. The naming  
hierarchy

helps with the CPAN organization, and makes it easier to use than the
trove approach, once you're familiar with the perl namespace habits.

Some of that is applicable to a postgresql package distribution method,
but the neat organization is a perl thing, not a CPAN thing, so that  
idea

doesn't really transfer.




Or is it because the projects that go into CPAN are better quality and
projects in pgroundry?


Partly. There are some dubious packages on CPAN but they're finished,
and with extremely few exceptions download, pass their self tests and
do what it says on the box (the main flaws are packages going stale
and occasionally dependency problems).

Pgfoundry is a development site with a search engine and has projects
in various stages of completion from vaporware to production tested
usable code.


Or is it something else?


Projects vs Packages sums up the differences.

Cheers,
  Steve

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-03 Thread Pavan Deolasee
On Thu, Apr 3, 2008 at 10:02 PM, Tom Lane [EMAIL PROTECTED] wrote:


  I've applied a modified/extended form of this patch for 8.3.2.


Thanks. I had another concern about VACUUM not reporting DEAD line
pointers (please see up thread). Any comments on that ?

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Sam Mason
On Thu, Apr 03, 2008 at 06:14:17PM +0200, Svenne Krap wrote:
 Hashes are an absolute minimum for keeping passwords stored somehat 
 safely in a database.

 More two or even three different hashes with different collion-points 
 will strongly increase the security.

Not only that, but they also increase the complexity of the system.
Increases in complexity tend to mean decreases in reliability and,
by implication, security.  As an example, someone may do some fancy
cryptanalysis and discover that having lots of hashes will actually make
it easier.  As another point, most passwords have significantly less
state than a 128bit hash allowing attacks like rainbow tables become
viable.


  Sam

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 3 Apr 2008 12:46:30 -0400
D'Arcy J.M. Cain [EMAIL PROTECTED] wrote:

 On Thu, 3 Apr 2008 09:41:57 -0700
 Joshua D. Drake [EMAIL PROTECTED] wrote:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
  
  On Thu, 3 Apr 2008 12:35:31 -0400
  D'Arcy J.M. Cain [EMAIL PROTECTED] wrote:
  
   On Thu,  3 Apr 2008 13:54:11 -
   Greg Sabino Mullane [EMAIL PROTECTED] wrote:
Right now contrib is a real catch-all of various things; it
would be nice to categorize them somehow. And by categorize, I
emphatically do NOT mean move to pgfoundry, which is pretty
much a kiss of death.
  
  Pgfoundry is not a kiss of death except that you spread falsehoods
  like that. PGfoundry is a very alive project that is constantly
  adding content and has continuing and very active projects.
 
 Eep!  Careful with attributions.  There is not a single word of mine
 in what you included.  I know it technically says that but since your
 comments were directed at Greg you really should have replied to his
 email and not to mine that included his.

Sorry Darcy :).

Joshua D. Drake 


- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH9Qv2ATb/zqfZUUQRAtBCAJ4yRvm6IydAstjb06G2mM8XhkVfPACfdmCy
oa3KN6PmkXzZgFlFOSHseVk=
=T5sZ
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread D'Arcy J.M. Cain
On Thu, 03 Apr 2008 09:31:01 -0700
Ron Mayer [EMAIL PROTECTED] wrote:
 D'Arcy J.M. Cain wrote:
  Check out NetBSD pkgsrc as a model.  It is very flexible.  One nice
  thing would be the ability to specify where the packages are rather
  than always insisting that they be on pgfoundry.
 
 Yup - a feature shared by RubyGems:
gem install rails ?source http://gems.rubyonrails.org

Yes but what I am suggesting goes beyond that.  My idea is that there
is a modules directory that contains a file for each installable
module.  This file would contain all the information about the module
such as name, version, where to get the actual package, an MD5 checksum
of the package, minimum and maximum PostgreSQL versions required, etc.

Naturally we should allow for people to define their own local packages
as well.

In fact, this may be the way to deprecate contrib.  Start building
modules and move the contrib packages to it one at a time.  That way
people using contrib have some time to switch and we can point people
to modules if they are just starting out.

Is there support for this idea?  I would like to start exploring this
if so.

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small TRUNCATE glitch

2008-04-03 Thread Martijn van Oosterhout
On Thu, Apr 03, 2008 at 11:58:11AM -0400, Tom Lane wrote:
 Just noticed that TRUNCATE fails to clear the stats collector's counts
 for the table.  I am not sure if it should reset the event counts or
 not (any thoughts?) but surely it is wrong to not zero the live/dead
 tuple counts.

Wern't there complaints from people regularly truncating and
refilling tables getting bad plans because they lost the statistics?

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Bruce Momjian
Alvaro Herrera wrote:
 Bruce Momjian escribi?:
 
  It seems more helpful if there were \x option to use extended format
  only when the output is too wide.  TODO already has:
  
  o Add auto-expanded mode so expanded output is used if the row
length is wider than the screen width.
  
Consider using auto-expanded mode for backslash commands like 
  \df+.
 
 Some sort of \x auto?  Sounds interesting ...

Yep.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Andrew Dunstan



D'Arcy J.M. Cain wrote:


In fact, this may be the way to deprecate contrib.  Start building
modules and move the contrib packages to it one at a time.  That way
people using contrib have some time to switch and we can point people
to modules if they are just starting out.

Is there support for this idea?  I would like to start exploring this
if so.

  


No. I don't want to deprecate it, I want to get rid of it, lock, stock 
and barrel. If you think that we need more than renaming then we can 
discuss it, but I don't want a long death, I want one that is certain 
and swift.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Bruce Momjian
Martijn van Oosterhout wrote:
-- Start of PGP signed section.
 On Thu, Apr 03, 2008 at 12:07:54PM -0400, Bruce Momjian wrote:
   Alternating between formats using \x is, at least for me, a bit
   cumbersome: usually _after_ I wrote a query I realize it would
   look more readable in expanded format, which is a bit too late.
   So I run the query, ctrl+c, \x, rerun the query... and forget to
   turn expanded mode off afterwards.
  
  It seems more helpful if there were \x option to use extended format
  only when the output is too wide.  TODO already has:
 
 I was thinking that maybe \x should have a one-shot mode, i.e.
 
 \x query
 
 does it only for this one statement. It would solve the OPs problem.

But break for others who want all output \x.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small TRUNCATE glitch

2008-04-03 Thread Tom Lane
Martijn van Oosterhout [EMAIL PROTECTED] writes:
 On Thu, Apr 03, 2008 at 11:58:11AM -0400, Tom Lane wrote:
 Just noticed that TRUNCATE fails to clear the stats collector's counts
 for the table.  I am not sure if it should reset the event counts or
 not (any thoughts?) but surely it is wrong to not zero the live/dead
 tuple counts.

 Wern't there complaints from people regularly truncating and
 refilling tables getting bad plans because they lost the statistics?

Not related --- the planner doesn't look at pgstats data.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Svenne Krap

Mark Mielke wrote:
More two or even three different hashes with different collion-points 
will strongly increase the security.
No it doesn't unless you are thinking about a security through 
obscurity argument.

It is really the same argument on all your questions

If I have a simple table now

ID serial
Username varchar
Password varchar

I currently save only md5(id || username || 'password')*  into password, 
if I had access to sha1 (for example) i would add another password 
column so, having for example


ID serial
Username varchar
Password_md5 varchar
Password_sha1 varchar

No matter how you see it, I get more bits of hash to check against.

I would drop md5 totally and use sha1 and ripemd-160 if possible.. but 
currently i use only md5 as it is the only available one..  Loading 
pgcrypto is overkill for something as simple as hash-functions.


Svenne

* I prepend the id and the username to guard users with weak passwords 
against known hashvalues (rainbow tables) should the box ever get 
comprised ... if you are in doubt about the value of this, try google 
for 40e94aa51dc5c0ccc5aad4e6aefdde2a and guess the secret password...


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

2008-04-03 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 Thanks. I had another concern about VACUUM not reporting DEAD line
 pointers (please see up thread). Any comments on that ?

If you want to work on that, go ahead, but I wanted it separate because
I didn't think it merited back-patching.  It's strictly cosmetic in
terms of being about what VACUUM VERBOSE prints, no?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Dimitri Fontaine [EMAIL PROTECTED] writes:
 Here's a proposal for COPY to support the T part of an ETL, that is adding 
 the 
 capability for COPY FROM to Transform the data it gets.

 The idea is quite simple: adding to COPY FROM the option to run a function 
 on 
 the data before to call datatype_in functions.

 The major concern I have about this is to ensure that no detectable
 overhead is added to COPY when the feature isn't being used.

 I am not actually convinced that the column-by-column design you seem to
 have in mind is worth anything.  The examples that I remember seeing
 often involve removing columns, generating one column from multiple ones
 or vice versa, dealing with nonstandard column delimiters, etc.  What
 would makes sense in my mind is a single function taking and returning
 text, which is invoked once on each complete input line before it is
 broken into fields.

I think not having to deal with separating fields is actually one of the few
reasons to do this within COPY. If you can separate out yourself or need to do
something more clever than COPY is capable of to split the columns then you're
better off preprocessing it with perl or something anyways.

To that end all the other use cases you describe could be handled with his
plan. There's nothing stopping you from doing

CREATE READER foo (a integer, b integer)
INSERT INTO b (SELECT a+b FROM foo);
or
INSERT INTO b (SELECT 1, a, b, greatest(a,b) FROM foo)

However I'm not sure we even need new syntax for CREATE READER. I would think
something like this would make more sense:

CREATE FUNCTION transform(integer, integer) RETURNS SETOF b;

COPY b FROM 'foo' USING transform(integer,integer);

 So the whole thing seems just marginally attractive to me.

Everything about ETL is only marginally attractive, but it's something people
spend a lot of time doing. Nobody's come up with any particularly clean
solutions I think.

AFAIK the state of the art is actually to load the data into a table which
closely matches the source material, sometimes just columns of text. Then copy
it all to another table doing transformations. Not impressed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread D'Arcy J.M. Cain
On Thu, 03 Apr 2008 13:06:25 -0400
Andrew Dunstan [EMAIL PROTECTED] wrote:
 D'Arcy J.M. Cain wrote:
  In fact, this may be the way to deprecate contrib.  Start building
  modules and move the contrib packages to it one at a time.  That way
  people using contrib have some time to switch and we can point people
  to modules if they are just starting out.
 
  Is there support for this idea?  I would like to start exploring this
  if so.
 
 No. I don't want to deprecate it, I want to get rid of it, lock, stock 
 and barrel. If you think that we need more than renaming then we can 
 discuss it, but I don't want a long death, I want one that is certain 
 and swift.

Well, OK, but given that this is a huge public project with lots of
users expecting things to be in certain places, how fast do you think
we could make such a change.  It seems to me that we are going to have
to make things look the same for some time at least otherwise we are
going to have lots of complaints.  How swift is swift?  To me, swift
means add the alternate functionality to the next release and remove
the old in the release after. Do you see things happening any faster?

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Tom Dunstan
On Thu, Apr 3, 2008 at 10:36 PM, Andrew Dunstan [EMAIL PROTECTED] wrote:
  No. I don't want to deprecate it, I want to get rid of it, lock, stock and
 barrel. If you think that we need more than renaming then we can discuss it,
 but I don't want a long death, I want one that is certain and swift.

I'll admit that I had thought that moving contrib modules over to a
modules dir as they were, uh, modularized would be the way forward.
Anything that doesn't fit the database-owner-installable pattern
(pgbench? start-scripts? others?) could end up in a utils dir, and
anything left in contrib shows us what's left to do before e.g. 8.4.
The end goal would be no more contrib dir by the next major release.

As a side note, how were you intending to rename contrib? Directory
shenanigans in CVS are horrible, particularly if you want  all your
old branches to still work.

Cheers

Tom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Andrew Dunstan



Svenne Krap wrote:


If I have a simple table now

ID serial
Username varchar
Password varchar

I currently save only md5(id || username || 'password')*  into 
password, if I had access to sha1 (for example) i would add another 
password column so, having for example


ID serial
Username varchar
Password_md5 varchar
Password_sha1 varchar

No matter how you see it, I get more bits of hash to check against.



Really? Why stop at two, then? How many hash functions is enough?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] best way for export gram.y symbols

2008-04-03 Thread Pavel Stehule
Hello

I would to use main scanner from plpgsql. I need some values from parser/parse.h

#define SELECT 543
#define PARAM 642


and YYSTYPE

any ideas how to do it?

I would not copy it by hand.

Regards
Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 3 Apr 2008 13:27:03 -0400
D'Arcy J.M. Cain [EMAIL PROTECTED] wrote:
 
 Well, OK, but given that this is a huge public project with lots of
 users expecting things to be in certain places, how fast do you think
 we could make such a change. 

8.4.

Joshua D. Drake




- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD4DBQFH9RY/ATb/zqfZUUQRAu0SAJ9+bnPyHmVIRb/QgbD8plEmGBRC2gCY0uS2
L+stcsM5h97QAzT23VD8zw==
=z+FW
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Svenne Krap

Mark Mielke wrote:

Svenne Krap wrote:

Mark Mielke wrote:

Svenne Krap wrote:
More two or even three different hashes with different 
collion-points will strongly increase the security.
No it doesn't unless you are thinking about a security through 
obscurity argument
Your logic is invalid - the best quality would be to not use a hash at 
all, and store in plain text, or ROT-13. Then you will have no 
collisions. If you truly believe more bits are better, don't use a 
hash to start with.




Ooops, went offlist by a wrong click. Putting it back onliste

I am aware that plain text (or any 1:1 mapping) has no chance of 
collision, but on the other hand if the box is compromised it gives an 
easy target for stealing passwords (and a lot of users use the same 
passwords a lot of places).
I believe that hashing through one hash function is an acceptable 
compromise between collisions (i.e. people get in with the wrong 
password) and password safety (evil hacker cannot read passwords) given 
you deploy anti rainbow table meassures.


I would still prefer two hash functions as they do add a better 
safeguard towards collisions (the gentoo distribtion actually hashes the 
files by three different algorithms SHA1, SHA256 and RMD160)  - i would 
be inclined to use three hashes too, if they were instantly available.


Svenne



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Andrew Dunstan



D'Arcy J.M. Cain wrote:

On Thu, 03 Apr 2008 13:06:25 -0400
Andrew Dunstan [EMAIL PROTECTED] wrote:
  

D'Arcy J.M. Cain wrote:


In fact, this may be the way to deprecate contrib.  Start building
modules and move the contrib packages to it one at a time.  That way
people using contrib have some time to switch and we can point people
to modules if they are just starting out.

Is there support for this idea?  I would like to start exploring this
if so.
  
No. I don't want to deprecate it, I want to get rid of it, lock, stock 
and barrel. If you think that we need more than renaming then we can 
discuss it, but I don't want a long death, I want one that is certain 
and swift.



Well, OK, but given that this is a huge public project with lots of
users expecting things to be in certain places, how fast do you think
we could make such a change.  It seems to me that we are going to have
to make things look the same for some time at least otherwise we are
going to have lots of complaints.  How swift is swift?  To me, swift
means add the alternate functionality to the next release and remove
the old in the release after. Do you see things happening any faster?

  



I don't understand this at all. We are talking about directory and 
package organisation here. How do you do that with transition 
arrangements? I guess we could put in a symlink from contrib, but I just 
don't see the point. I don't think we are under any obligation to 
preserve the way we package or split packages between releases. And 
doing this reorganisation now, fairly early in the release cycle, would 
let us give people like packagers plenty of advance notice.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Aidan Van Dyk
* Tom Dunstan [EMAIL PROTECTED] [080403 13:30]:

 As a side note, how were you intending to rename contrib? Directory
 shenanigans in CVS are horrible, particularly if you want  all your
 old branches to still work.

Well, please, anybody doing this, just simply copy and use cvs remove and
cvs add...  

We're using CVS, so we live with disjoint history on renames... As long
as the commit comment is clear, the history isn't lost, just another
command away.

a.
-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


[HACKERS] Locale, Collation, ICU patch

2008-04-03 Thread Gregory Stark

Regarding the ICU patch in the commitfest here's my plan. 

IMHO the idea of making ICU a hard dependency which Postgres will have to use
forevermore on all systems is a non-starter. I'm not entirely against having
ICU as a supported collation system which packagers on systems where the
system locale support is weak can choose to make a dependency of their binary
packages though, assuming the issues raised elsewhere about ICU are resolved.

As long as this bogeyman is scaring us though it's preventing us from having
the SQL standard collation syntax and the accompanying catalog and planner
changes.

And as long as we don't have that support -- which is a big job -- nobody
who's interested in implementing ICU or strcoll_l() or any other interfaces
for a new platform will get around to it. The actual porting glue to call
those functions on each platform is fairly lightweight and could easily be
done by experts on that platform who aren't catalog and planner mavens.

So we have a bit of a chicken and egg problem. We aren't getting the planner
and syntax changes because we aren't sure the support would be good on every
platform and we aren't getting the platform support because we don't have the
planner and catalog changes.

What I want to do is focus on adding the planner and catalog changes somehow.

We implement a kind of baseline locale support something only slightly better
than what we have now using setlocale before every comparison. This is clearly
not the recommended configuration but as long as it handles what we handle
today without a performance hit and a bit more besides it would be a big
start.

I'm assuming we would check if the desired locale is the current locale and
skip the assignment. So if only one locale is *actually* in use then basically
no additional overhead is incurred. Moreover if the desired locale is C then
we can skip the assignment and use strcmp directly. So actually as long as
only one non-C locale is in use then no additional overhead would be incurred.

The big gotcha is what collation to use when comparing with data in the system
tables, especially the shared system tables. I think we do need to define a
database-wide encoding and collation to use for system tables. (Unless we can
get by with varchar_pattern_ops indexes on system tables?)

So the following use cases arise:

a) They're actually using only one collation for both the system tables and
their own data. This is well handled by our existing setup and would be
basically unchanged in the new setup.

b) They're using multiple collations for their data but only one at a time.
Either one per database or one per session. In which case they don't incur any
overhead

c) They're using multiple collations for their data but only one collation in
a given application unit of work. This is probably the most common case for
OLTP application since each unit of work represents some particular user's
operation. In this case as long as the system tables are set up to use the C
locale then this would require at most one setlocale() call per unit of work
though.

d) They're actively using multiple collations in a single query, possibly even
within a single sort (something like ORDER BY a COLLATION en_US, b COLLATION
es_US). This would perform passably on glibc but abysmally on most other
libc's.

From that point forward we would go about adding support for strcoll_l() and
other interfaces to handle case (d) on various platforms. For platforms with
no reasonable interface we could add a --enable-ICU users or packagers could
choose to use.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Separate psql commands from arguments (was: psql command aliases support)

2008-04-03 Thread Brendan Jurd
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Moving to -hackers ...

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFH9RwN5YBsbHkuyV0RAr9ZAKD+XwNYYw3ugsTvowvKImOlKMZzPQCfTHkQ
u9jLkEIAWI/0MbNzzxBt0ok=
=So1n
-END PGP SIGNATURE-

On Fri, Apr 4, 2008 at 4:19 AM, Tom Lane [EMAIL PROTECTED] wrote:
  Yeah, the fundamental difference between the backslash command situation
  and aliases in shells and suchlike is that, because we've historically
  allowed no space between command name and argument, it's not that easy
  to tell what string ought to be compared against alias names.

  I think that an alias facility would only be acceptably safe if we
  disallowed that syntax (ie, start to *require* a space between command
  and args).  Are we ready to do that?


As far as i know, this behaviour isn't documented anywhere.  In fact,
the manual denies its existence:

The format of a psql command is the backslash, followed
immediately by a command verb, then any arguments. The arguments are
separated from the command verb and each other by any number of
whitespace characters.

(unless you interpret any number to include zero, but that's quite a
stretch in this context)

For what it's worth, I've been using Postgres actively for about five
years, and I've not once suspected that it was possible to omit the
space between a psql command and its argument.

The idea of writing the command and its arguments in one word is so
completely bizarre to me that I can't imagine anyone even trying it
casually to see if it works.  Although it is likely that some people
have stumbled upon it accidentally via typos, it's hard to imagine
them wanting to use it in any kind of pratical application.

+1 for dropping this quirk.  And, if there are no objections (or other
takers), I volunteer to write a patch.

Regards,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Sam Mason
On Thu, Apr 03, 2008 at 07:07:56PM +0200, Svenne Krap wrote:
 I currently save only md5(id || username || 'password')*  into password, 
 if I had access to sha1 (for example) i would add another password 
 column so, having for example
 
 ID serial
 Username varchar
 Password_md5 varchar
 Password_sha1 varchar
 
 No matter how you see it, I get more bits of hash to check against.

Are you a cryptanalyst and are you sure that this doesn't actually make
things worse?  I'm sure it gives you a warm fuzzy feeling that it's
*got* to be better, but unless someone has done some hard maths I'm not
sure how you can be so sure.

Why not just use SHA-512, you get many more quality bits that way.

 I would drop md5 totally and use sha1 and ripemd-160 if possible.. but 
 currently i use only md5 as it is the only available one..  Loading 
 pgcrypto is overkill for something as simple as hash-functions.

Sounds like a good reason for moving the current md5 function out into
pgcrypto as well! :)

 * I prepend the id and the username to guard users with weak passwords 
 against known hashvalues (rainbow tables) should the box ever get 
 comprised ... 

I take it your threat model doesn't include the attacker logging
incoming queries to look for the clear-text password.


  Sam

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread David Fetter
On Thu, Apr 03, 2008 at 01:06:26PM -0400, Bruce Momjian wrote:
 Alvaro Herrera wrote:
  Bruce Momjian escribi?:
  
   It seems more helpful if there were \x option to use extended format
   only when the output is too wide.  TODO already has:
   
   o Add auto-expanded mode so expanded output is used if the row
 length is wider than the screen width.
   
 Consider using auto-expanded mode for backslash commands like 
   \df+.
  
  Some sort of \x auto?  Sounds interesting ...
 
 Yep.

Having \df+ go to \x automatically sounds like a really great idea :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] best way for export gram.y symbols

2008-04-03 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 I would to use main scanner from plpgsql. I need some values from 
 parser/parse.h

 #define SELECT 543
 #define PARAM 642

No, you don't.  Whatever you think you need those for, there's probably
a better way to do it.  We got out of the business of letting anything
but scan.c and gram.c depend on Bison symbol numbers years ago, and
I don't much want to re-introduce that dependency.

What exactly are you trying to accomplish?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] About numeric division again

2008-04-03 Thread Tom Lane
One of the items on the commit-fest list is my patch from last year
to rewrite the numeric division operator using schoolbook division:
http://archives.postgresql.org/pgsql-patches/2007-06/msg00173.php

The code that's currently in there sometimes has to propagate rounding
to the left, meaning that you can never be certain whether all of the
digits you have so far are good, and that means that it can sometimes
generate an incorrect truncated output.  This leads to the bugs cited
in the above message.

The reason I didn't just commit it last year is that I was dissatisfied
with the speed penalty --- on very long inputs (dozens or hundreds of
digits) division is about 4X slower than with our existing code.
However, no one has come up with a better answer; and as a wise man once
said, I can make this program arbitrarily fast, if it doesn't have to
give the right answer.  Correctness has to trump speed.

One thing that occurs to me is that we could keep the existing
approximate division code in there too, and use it internally in the
transcendental function implementations.  Those are not particularly
interested in getting exact truncated results, and they are the worst
case for the speed penalty because they do lots of divisions on values
that are likely to be long.  However this idea could fairly be charged
with being code bloat.

Comments?

Also, there was some discussion of providing a SQL-level numeric
integer division operator or function, that is the equivalent of
trunc(x/y) except faster (since it'd not need to compute fractional
digits that would then be thrown away).  Is this worth doing, and if so
what should we call it exactly?  The amount of new code needed should
be pretty small (just an interface function), so I'm willing to take
care of it if we want one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Locale, Collation, ICU patch

2008-04-03 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 The big gotcha is what collation to use when comparing with data in the system
 tables, especially the shared system tables. I think we do need to define a
 database-wide encoding and collation to use for system tables.

You mean cluster-wide?  If we can get away with that, it'd solve a lot
of problems.

Note that the stuff in the system tables is mostly type name not text,
and the comparison semantics for that have always been strcmp(), so the
question of collation doesn't really apply.  Name in itself doesn't care
about encoding either, but I think we have to restrict encoding to avoid
the problem of injecting data that's invalidly encoded into one database
from another via the shared catalogs.

The other issue that'd have to be resolved is the problem of system log
output.  I think we'd wish that log messages are written in a uniform
encoding (CSV output in particular is going to have a hard time
otherwise) but what do you do when you need to report something that
includes a character not present in that encoding?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Darcy Buskermolen
On Thursday 03 April 2008 08:47:12 Joshua D. Drake wrote:
 On Thu, 3 Apr 2008 21:03:05 +0530

 Tom Dunstan [EMAIL PROTECTED] wrote:
  On Thu, Apr 3, 2008 at 8:25 PM, Andrew Dunstan [EMAIL PROTECTED]
 
  wrote:
If this were at all true we would not not have seen the complaints
   from people along the lines of My ISP won't install contrib. But
   we have, and quite a number of times. We have concrete evidence
   that calling it contrib actually works against us.
 
  It's hard to see ISPs who won't install contrib from installing
  ${random module} from the big bad internet as has been discussed in
  this thread, but who knows?

 Sure it is. The very word contrib brings about ideas of things like:

 Unstable, Cooker, unofficial.

 modules is completely different (from a perception perspective).

 IMO the core modules should be compiled via configure with something
 like:

 ./configure --enable-module=ALL

 or

 ./configure --enable-module=pgcrypto --enable-module=cube

 This would install all the modules but not enable them in the database
 itself (of course). This could also be extended to the pls so that we
 have exactly one mechanism to control those options as well.

 ./configure --enable-module=pgcrypto --enable-module=plperl

I think --enable-module might be the wrong term here, since you specificaly 
state we are not enabling them in the database.

I think --with-module=... might be a better way to go.


 Sincerely,

 Joshua D. Drake



 --
 The PostgreSQL Company since 1997: http://www.commandprompt.com/
 PostgreSQL Community Conference: http://www.postgresqlconference.org/
 United States PostgreSQL Association: http://www.postgresql.us/
 Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate



-- 
Darcy Buskermolen
Command Prompt, Inc.
+1.503.667.4564 X 102
http://www.commandprompt.com/
PostgreSQL solutions since 1997

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] modules

2008-04-03 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, 3 Apr 2008 12:03:43 -0700
Darcy Buskermolen [EMAIL PROTECTED] wrote:

  This would install all the modules but not enable them in the
  database itself (of course). This could also be extended to the pls
  so that we have exactly one mechanism to control those options as
  well.
 
  ./configure --enable-module=pgcrypto --enable-module=plperl
 
 I think --enable-module might be the wrong term here, since you
 specificaly state we are not enabling them in the database.
 
 I think --with-module=... might be a better way to go.
 

That would work.

Joshua D. Drake

- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH9S8HATb/zqfZUUQRAjAFAJ0dsH4Cwr3WuiLXVKw9tReOarhKSQCeNuKL
GkaxyLV8eC/YhUzgfd4YTEI=
=6C6r
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Locale, Collation, ICU patch

2008-04-03 Thread Andrew Dunstan



Tom Lane wrote:

The other issue that'd have to be resolved is the problem of system log
output.  I think we'd wish that log messages are written in a uniform
encoding (CSV output in particular is going to have a hard time
otherwise) but what do you do when you need to report something that
includes a character not present in that encoding?


  


I think the only problem with CSV logs would be in trying to read them 
back into Postgres (which I agree is the main point of having them).


We need to be more aggressive about dealing with these problems, or else 
how will we ever get to per-column charsets/collations?


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Separate psql commands from arguments (was: psql command aliases support)

2008-04-03 Thread Bernd Helmle
--On Freitag, April 04, 2008 05:04:04 +1100 Brendan Jurd 
[EMAIL PROTECTED] wrote:


For everyone else who hasn't read the original discussion on -patches, 
here's a link into the archives:


http://archives.postgresql.org/pgsql-patches/2008-04/msg5.php



On Fri, Apr 4, 2008 at 4:19 AM, Tom Lane [EMAIL PROTECTED] wrote:

 Yeah, the fundamental difference between the backslash command situation
 and aliases in shells and suchlike is that, because we've historically
 allowed no space between command name and argument, it's not that easy
 to tell what string ought to be compared against alias names.

 I think that an alias facility would only be acceptably safe if we
 disallowed that syntax (ie, start to *require* a space between command
 and args).  Are we ready to do that?



Okay, the more we talked about that, the more i got that feeling, too.



As far as i know, this behaviour isn't documented anywhere.  In fact,
the manual denies its existence:

The format of a psql command is the backslash, followed
immediately by a command verb, then any arguments. The arguments are
separated from the command verb and each other by any number of
whitespace characters.



Not only there, the code itself  doesn't encourage the use of this syntax 
(src/bin/psql/command.c), too:



   /*
* If the command was not recognized, try to parse it as a 
one-letter
* command with immediately following argument (a 
still-supported, but

* no longer encouraged, syntax).
*/


(unless you interpret any number to include zero, but that's quite a
stretch in this context)

For what it's worth, I've been using Postgres actively for about five
years, and I've not once suspected that it was possible to omit the
space between a psql command and its argument.

The idea of writing the command and its arguments in one word is so
completely bizarre to me that I can't imagine anyone even trying it
casually to see if it works.  Although it is likely that some people
have stumbled upon it accidentally via typos, it's hard to imagine
them wanting to use it in any kind of pratical application.

+1 for dropping this quirk.  And, if there are no objections (or other
takers), I volunteer to write a patch.



Here's a quick and dirty patch which removes the responsible code from psql 
(maybe not enough, but short testing shows it's working). Sorry for the 
unified diff


+1, too. I advised people not to use that syntax for years now, maybe we're 
in luck and everyone else was doing the same ;)


--
 Thanks

   Bernddiff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 1392972..c1194c2 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -90,30 +90,6 @@ HandleSlashCmds(PsqlScanState scan_state,
 	/* And try to execute it */
 	status = exec_command(cmd, scan_state, query_buf);
 
-	if (status == PSQL_CMD_UNKNOWN  strlen(cmd)  1)
-	{
-		/*
-		 * If the command was not recognized, try to parse it as a one-letter
-		 * command with immediately following argument (a still-supported, but
-		 * no longer encouraged, syntax).
-		 */
-		char		new_cmd[2];
-
-		/* don't change cmd until we know it's okay */
-		new_cmd[0] = cmd[0];
-		new_cmd[1] = '\0';
-
-		psql_scan_slash_pushback(scan_state, cmd + 1);
-
-		status = exec_command(new_cmd, scan_state, query_buf);
-
-		if (status != PSQL_CMD_UNKNOWN)
-		{
-			/* adjust cmd for possible messages below */
-			cmd[1] = '\0';
-		}
-	}
-
 	if (status == PSQL_CMD_UNKNOWN)
 	{
 		if (pset.cur_cmd_interactive)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Merlin Moncure
On Thu, Apr 3, 2008 at 2:43 PM, David Fetter [EMAIL PROTECTED] wrote:
 On Thu, Apr 03, 2008 at 01:06:26PM -0400, Bruce Momjian wrote:
Some sort of \x auto?  Sounds interesting ...
  
   Yep.

  Having \df+ go to \x automatically sounds like a really great idea :)

you can get pretty good resultsr currently for \df+ if you set up your
'less' pager a particular way.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Separate psql commands from arguments (was: psql command aliases support)

2008-04-03 Thread Brendan Jurd
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Fri, Apr 4, 2008 at 6:35 AM, Bernd Helmle  wrote:
  Here's a quick and dirty patch which removes the responsible code from psql
 (maybe not enough, but short testing shows it's working). Sorry for the
 unified diff


I didn't realise it would be that straightforward!

You could probably also do away with psql_scan_slash_pushback()
(psqlcan.h, psqlcan.l) as it is only used by the code you removed.

Cheers,
BJ
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: http://getfiregpg.org

iD8DBQFH9TV85YBsbHkuyV0RAq2aAKC5ay/QJQu9De9Ivq1kcgUFOGUa9gCgyB0c
wt1IjpIZH26O7DiWxf0nXE0=
=7Ybj
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 AFAIK the state of the art is actually to load the data into a table which
 closely matches the source material, sometimes just columns of text. Then copy
 it all to another table doing transformations. Not impressed.

I liked the idea of allowing COPY FROM to act as a table source in a
larger SELECT or INSERT...SELECT.  Not at all sure what would be
involved to implement that, but it seems a lot more flexible than
any other approach.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread David Fetter
On Thu, Apr 03, 2008 at 03:43:50PM -0400, Merlin Moncure wrote:
 On Thu, Apr 3, 2008 at 2:43 PM, David Fetter [EMAIL PROTECTED] wrote:
  On Thu, Apr 03, 2008 at 01:06:26PM -0400, Bruce Momjian wrote:
 Some sort of \x auto?  Sounds interesting ...
   
Yep.
 
   Having \df+ go to \x automatically sounds like a really great
   idea :)
 
 you can get pretty good resultsr currently for \df+ if you set up
 your 'less' pager a particular way.

Does 'less' have a way to re-arrange columns?!?

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Separate psql commands from arguments (was: psql command aliases support)

2008-04-03 Thread Bernd Helmle
--On Freitag, April 04, 2008 06:52:37 +1100 Brendan Jurd 
[EMAIL PROTECTED] wrote:



I didn't realise it would be that straightforward!


Stumbled across it during hacking...



You could probably also do away with psql_scan_slash_pushback()
(psqlcan.h, psqlcan.l) as it is only used by the code you removed.


It's not clean yet, but i thought we need something quick so people can try 
and comment on it.


--
 Thanks

   Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] best way for export gram.y symbols

2008-04-03 Thread Pavel Stehule


 No, you don't.  Whatever you think you need those for, there's probably
  a better way to do it.  We got out of the business of letting anything
  but scan.c and gram.c depend on Bison symbol numbers years ago, and
  I don't much want to re-introduce that dependency.

  What exactly are you trying to accomplish?

when I build CASE expression, I have to merge some PLpgSQL_expr
together. Then I have to reparse expr-query and I have to find params
and actualize it.
I found some else. I can't include parser/parse.h in gram.y file,
because there is name's conflict. But I can do it in other file. It's
better, because is less risk of wrong preproces. So I have function:

#include parser/parse.h
#include parser/gramparse.h

extern char *base_yytext;

int
plpgsql_querylex(int *param, char **ttext)
{
int tok = base_yylex();

if (tok == 0)
return PLPGSQL_QUERYLEX_DONE;

*ttext = base_yytext;
switch (tok)
{
case SELECT:
return PLPGSQL_QUERYLEX_SELECT;

case PARAM:
*param = base_yylval.ival;
return PLPGSQL_QUERYLEX_PARAM;

default:
return PLPGSQL_QUERYLEX_NONPARAM;
}
}


and then I can merge queries in function:
/*
 * This function joins an PLpgSQL_expr to expression stack. It's used
 * for CASE statement where from some expr is created one expression.
 * Reparsing is necessary for detecting parameters in SQL query.
 */
static void
add_expr(PLpgSQL_expr *expr, PLpgSQL_dstring *ds, int *nparams, int *params)
{
charbuff[32];
int lex;
int pnum;
char*yytext;


scanner_init(expr-query);

/* First lexem have to be SELECT */
if (plpgsql_querylex(pnum, yytext) != PLPGSQL_QUERYLEX_SELECT)
{
plpgsql_error_lineno = plpgsql_scanner_lineno();
/* internal error */
elog(ERROR, expected \SELECT \, got \%s\,
yytext);
}

while((lex = plpgsql_querylex(pnum, yytext)) != PLPGSQL_QUERYLEX_DONE)
{
if (lex == PLPGSQL_QUERYLEX_PARAM)
{
int dno;
int i;

if (pnum  1 || pnum = MAX_EXPR_PARAMS)
elog(ERROR, parsing query failure,
wrong param $%d, pnum);

dno = expr-params[pnum-1];
for (i = 0; i  *nparams; i++)
if (params[i] == dno)
break;

snprintf(buff, sizeof(buff), $%d, i+1);
/* when not found variable */
if (i = *nparams)
{
if (*nparams = MAX_EXPR_PARAMS)
{
plpgsql_error_lineno =
plpgsql_scanner_lineno();
ereport(ERROR,

(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
 errmsg(too many
variables specified in SQL statement)));
}
params[*nparams] = dno;
(*nparams)++;
}
plpgsql_dstring_append(ds, buff);
}
else
plpgsql_dstring_append(ds, yytext);
}

scanner_finish();
}

Regards
Pavel Stehule


 regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Mark Mielke

Svenne Krap wrote:
I would still prefer two hash functions as they do add a better 
safeguard towards collisions (the gentoo distribtion actually hashes 
the files by three different algorithms SHA1, SHA256 and RMD160)  - i 
would be inclined to use three hashes too, if they were instantly 
available. 
Technically MD5 (128 bits) + SHA1 (160 bits) gives better strength than 
MD5 on its own, or SHA1 on its own, in that finding one collision is 
likely to be insufficient to break in, however, I doubt you could 
mathematically prove that it would perform equal to a 128 + 160 = 288 
bit equivalent strength hash function. At the bare minimum, I refer you 
to the fact that each component on its own represents a self-contained 
hash of the entire document, and a small change in a real-life document 
will presumedly affect both values (this is how hash functions are 
designed), therefore, for real-life documents, there are bit patterns 
that will never coincide, therefore, not all 288 bits are being used. 
For real-life documents, it is highly likely that the MD5 and the SHA1 
will always be a ~1:1 mapping (how many collision have truly been found 
in practice?), and the effective bit strength of the total for real 
documents approaches the highest of the two - which is 160 bits. I 
suggest that MD5 + SHA1 is not 128 bits better than SHA1 on its own.


In any case, this is all irrelevant, because md5 passwords are still 
very useful, and the argument that more = better is a never ending 
infinite resource trap. More is not better. Better is better. If you can 
prove md5 is insufficient for PostgreSQL passwords, the correct decision 
would be to switch to something better, and deprecate md5 from the core.


Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Merlin Moncure
On Thu, Apr 3, 2008 at 4:08 PM, David Fetter [EMAIL PROTECTED] wrote:

 On Thu, Apr 03, 2008 at 03:43:50PM -0400, Merlin Moncure wrote:
   On Thu, Apr 3, 2008 at 2:43 PM, David Fetter [EMAIL PROTECTED] wrote:
On Thu, Apr 03, 2008 at 01:06:26PM -0400, Bruce Momjian wrote:
   Some sort of \x auto?  Sounds interesting ...
 
  Yep.
   
 Having \df+ go to \x automatically sounds like a really great
 idea :)
  
   you can get pretty good resultsr currently for \df+ if you set up
   your 'less' pager a particular way.

  Does 'less' have a way to re-arrange columns?!?

no, but being able to scroll left/right with the arrow keys is
(usually) just as good.

that said, \G is pretty neat...I use it once in a while on the 'other'
open source database.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Alvaro Herrera
Bruce Momjian escribió:
 Martijn van Oosterhout wrote:

  I was thinking that maybe \x should have a one-shot mode, i.e.
  
  \x query
  
  does it only for this one statement. It would solve the OPs problem.
 
 But break for others who want all output \x.

I think Martijn is proposing using it as some sort of prefix which would
take effect only on the current query.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Row estimation for var const and for NOT (...) queries

2008-04-03 Thread Nikolay Samokhvalov
Hi,

I have a table table1 with ~100k rows, the table having flag1
column. The value of flag1 is NULL in 85k+ rows, and it's TRUE in
7k+ rows, and FALSE in 6k rows. I use EXPLAIN to get apprx. number of
rows for simple SELECT queries. But in case of ...WHERE NOT flag1
the optimizer is completely wrong:

 -- it's OK here, the estimation is fine
test=# EXPLAIN ANALYZE SELECT * FROM table1 WHERE flag1;
 QUERY PLAN
-
 Seq Scan on table1  (cost=0.00..9962.84 rows=7875 width=532) (actual
time=0.107..134.729 rows=7652 loops=1)
   Filter: flag1
 Total runtime: 139.460 ms
(3 rows)

 -- here optimizer thinks that we have 90k+ rows with flag1 = FALSE,
while the real number of rows is 6k+
test=# EXPLAIN ANALYZE SELECT * FROM table1 WHERE NOT flag1;
  QUERY PLAN
--
 Seq Scan on table1  (cost=0.00..9962.84 rows=91809 width=532) (actual
time=0.087..110.596 rows=6243 loops=1)
   Filter: (NOT flag1)
 Total runtime: 114.414 ms
(3 rows)

I've checked statistics available and have found that Postgres
actually knows how many FALSE values are present (approximately) in
the table:
test=# SELECT null_frac, n_distinct, most_common_vals,
most_common_freqs FROM pg_stats
WHERE tablename='table1' AND attname='flag1';
 null_frac | n_distinct | most_common_vals | most_common_freqs
---++--+---
  0.864667 |  2 | {t,f}| {0.079,0.056}
(1 row)

So, I've started to think that this is a shortcoming of the optimizer
code, which makes Postgres count both FALSEs and NULLs when estimating
var  const expressions.

1) backend/utils/adt/selfuncs.c, in neqsel() we have:

...
result = DatumGetFloat8(DirectFunctionCall4(eqsel, ...
...
result = 1.0 - result;
PG_RETURN_FLOAT8(result);
...

-- so, there is a wrong assumption that for var  const expressions
we may just use estimation for var = const and subtract it from 1.
In fact, NULLs are ignored here. According to ternary logic, in this
case we must subtract the number of NULLs also. This will improve row
estimation for var  const queries (but not in case when we deal
with boolean datatype, look at (2)!). If there are no objections, I'll
send the patch, which is straightforward.

2). In case of WHERE flag1 = FALSE or WHERE flag1  TRUE the
planner rewrites the query to WHERE NOT flag1 and then uses the
logic defined in backend/optimizer/path/clausesel.c, where, again, we
see the wrong approach which ignores NULLs:

...
else if (not_clause(clause))
{
/* inverse of the selectivity of the underlying clause */
s1 = 1.0 - clause_selectivity(root,
  (Node *) 
get_notclausearg((Expr *) clause),
  
varRelid,
  
jointype);
...
I have no idea how to improve this. AFAIKS, at this level we have no
knowledge about the data we're dealing with (am I right?) -- so, I'm
afraid that for booleans there is no way to improve the optimizer. If
my thoughts described in (1) are correct and we improve the estimation
for , we will have a situation where using booleans might decrease
the performance due to wrong rows count estimation.

I'll appreciate any help and ideas that will allow to improve the situation.

P. S. I use current HEAD version of Postgres; before running queries
the statistic was updated with ANALYZE
-- 
Best regards,
Nikolay

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Heikki Linnakangas

Mark Mielke wrote:
In any case, this is all irrelevant, because md5 passwords are still 
very useful, and the argument that more = better is a never ending 
infinite resource trap. More is not better. Better is better. If you can 
prove md5 is insufficient for PostgreSQL passwords, the correct decision 
would be to switch to something better, and deprecate md5 from the core.


Agreed.

One must also remember that if you use two hashes, if *either* one of 
them is broken in the future so that you can reconstruct the password 
from the hash, you're screwed.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Bruce Momjian escribió:
 Martijn van Oosterhout wrote:
 I was thinking that maybe \x should have a one-shot mode, i.e.
 \x query
 does it only for this one statement. It would solve the OPs problem.
 
 But break for others who want all output \x.

 I think Martijn is proposing using it as some sort of prefix which would
 take effect only on the current query.

A bigger problem is that it doesn't play nicely at all with multi-line
queries.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Svenne Krap

Heikki Linnakangas wrote:

Mark Mielke wrote:
One must also remember that if you use two hashes, if *either* one of 
them is broken in the future so that you can reconstruct the password 
from the hash, you're screwed.

That is quite a good argument actually :)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY Transform support

2008-04-03 Thread Andrew Dunstan



Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:
  

AFAIK the state of the art is actually to load the data into a table which
closely matches the source material, sometimes just columns of text. Then copy
it all to another table doing transformations. Not impressed.



I liked the idea of allowing COPY FROM to act as a table source in a
larger SELECT or INSERT...SELECT.  Not at all sure what would be
involved to implement that, but it seems a lot more flexible than
any other approach.


  


Several years ago Bruce and I discussed the then theoretical use of a 
SELECT query as the source for COPY TO, and we agreed that the sane 
analog would be to have an INSERT query as the target of COPY FROM.


This idea seems to take that rather further. If doable I think it would 
be cool, as long as people don't try using it as an alternative storage 
engine. I can just imagine people creating views over such SELECT 
statements ...


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row estimation for var const and for NOT (...) queries

2008-04-03 Thread Tom Lane
Nikolay Samokhvalov [EMAIL PROTECTED] writes:
 I have a table table1 with ~100k rows, the table having flag1
 column. The value of flag1 is NULL in 85k+ rows, and it's TRUE in
 7k+ rows, and FALSE in 6k rows.

Yeah, you're going to have some problems with so many NULLs, I'm sure.

 -- so, there is a wrong assumption that for var  const expressions
 we may just use estimation for var = const and subtract it from 1.
 In fact, NULLs are ignored here. According to ternary logic, in this
 case we must subtract the number of NULLs also. This will improve row
 estimation for var  const queries (but not in case when we deal
 with boolean datatype, look at (2)!). If there are no objections, I'll
 send the patch, which is straightforward.

It doesn't seem all that straightforward to me, unless your intent is to
copy-and-paste all of eqsel(), which I wouldn't regard as a very
acceptable solution.  Otherwise you're going to need some refactoring.

 2). In case of WHERE flag1 = FALSE or WHERE flag1  TRUE the
 planner rewrites the query to WHERE NOT flag1 and then uses the
 logic defined in backend/optimizer/path/clausesel.c, where, again, we
 see the wrong approach which ignores NULLs:

I think the only case where we could hope to improve that is where the
argument is a simple bool variable --- but of course that's also the
only case where we could've done much with the flag1 = FALSE form, so
the rewriting isn't really hurting here.  I'd suggest pushing the work
into selfuncs.c and seeing if examine_variable can do anything with the
argument.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] best way for export gram.y symbols

2008-04-03 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 What exactly are you trying to accomplish?

 when I build CASE expression, I have to merge some PLpgSQL_expr
 together. Then I have to reparse expr-query and I have to find params
 and actualize it.

There has to be a better way than that.  What CASE syntax are you
trying to implement, anyhow?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] SHA1 on postgres 8.3

2008-04-03 Thread Svenne Krap

Sam Mason wrote:

Are you a cryptanalyst and are you sure that this doesn't actually make
things worse?  I'm sure it gives you a warm fuzzy feeling that it's
*got* to be better, but unless someone has done some hard maths I'm not
sure how you can be so sure.
  

No sadly I am no cryptoanalyst.

Why not just use SHA-512, you get many more quality bits that way.
  

I would, if it was available in core.
  
I would drop md5 totally and use sha1 and ripemd-160 if possible.. but 
currently i use only md5 as it is the only available one..  Loading 
pgcrypto is overkill for something as simple as hash-functions.


Sounds like a good reason for moving the current md5 function out into
pgcrypto as well! :)
  
I am not sure how I am to understand that comment. But again I am just a 
user...
* I prepend the id and the username to guard users with weak passwords 
against known hashvalues (rainbow tables) should the box ever get 
comprised ... 



I take it your threat model doesn't include the attacker logging
incoming queries to look for the clear-text password.
  

No it doesn't, I am mostly concerned with the grab and run scenario.

I am still convinced having more (and better) hash-functions in core is 
a gain for some users.


And it is fairly un-intrusive as the hash functions are well-defined and 
never going to change (new ones can be added and old ones deleted, but 
SHA256 for example will never change).


I think I will drop the issue as I cannot present formal proof of my 
case, sorry to have wasted your time.


Svenne


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Alvaro Herrera
Tom Lane escribió:
 Alvaro Herrera [EMAIL PROTECTED] writes:

  I think Martijn is proposing using it as some sort of prefix which would
  take effect only on the current query.
 
 A bigger problem is that it doesn't play nicely at all with multi-line
 queries.

Hmm, why wouldn't it?  I assume it would only be recognized if the query
buffer is empty.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small TRUNCATE glitch

2008-04-03 Thread Alvaro Herrera
Tom Lane wrote:
 Just noticed that TRUNCATE fails to clear the stats collector's counts
 for the table.  I am not sure if it should reset the event counts or
 not (any thoughts?) but surely it is wrong to not zero the live/dead
 tuple counts.

Agreed, the live/dead counters should be reset.  Regarding event counts,
my take is that we should have a separate statement count for truncate
(obviously not a tuple count), and the others should be left alone.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane escribió:
 A bigger problem is that it doesn't play nicely at all with multi-line
 queries.

 Hmm, why wouldn't it?  I assume it would only be recognized if the query
 buffer is empty.

Huh?  The proposed syntax was

\x  query...

What do you do when you'd like the query to extend over multiple lines?
Backslash commands can't cross lines.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small TRUNCATE glitch

2008-04-03 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Just noticed that TRUNCATE fails to clear the stats collector's counts
 for the table.  I am not sure if it should reset the event counts or
 not (any thoughts?) but surely it is wrong to not zero the live/dead
 tuple counts.

 Agreed, the live/dead counters should be reset.  Regarding event counts,
 my take is that we should have a separate statement count for truncate
 (obviously not a tuple count), and the others should be left alone.

I thought some more about how to do it, and stumbled over how to cope
with TRUNCATE being rolled back.  That nixed my first idea of just
having TRUNCATE send a zero-the-counters-now message.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Alvaro Herrera
Tom Lane escribió:

 Huh?  The proposed syntax was
 
   \x  query...
 
 What do you do when you'd like the query to extend over multiple lines?
 Backslash commands can't cross lines.

Save the fact that the current query is extended, until query end?  I
haven't actually looked at what the implementation would look like.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \G command -- send query and output using extended format

2008-04-03 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane escribió:
 Huh?  The proposed syntax was
 
 \x  query...
 
 What do you do when you'd like the query to extend over multiple lines?
 Backslash commands can't cross lines.

 Save the fact that the current query is extended, until query end?

Yech.  To name just a couple of problems, what if you decide after
typing another line or two that you didn't want \x after all?
Action-at-a-distance commands suck.  You'd also find that this didn't
play very nicely with history recall, since the history stuff assumes
that a line starting with a backslash command is separate from those
around it.

If we want this at all, we should do it the way first proposed (\G).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >