Re: [HACKERS] WIP: Deferrable unique constraints

2009-07-29 Thread Dean Rasheed
2009/7/28 Tom Lane t...@sss.pgh.pa.us:
 ... speaking of adding catalog columns, I just discovered that the patch
 adds searches of pg_depend and pg_constraint to BuildIndexInfo.  This
 seems utterly unacceptable on two grounds:

 * It's sheer luck that it gets through bootstrap without crashing.
 Those aren't part of the core set of catalogs that we expect to be
 accessed by primitive catalog searches.  I wouldn't be too surprised
 if it gets the wrong answer, and the only reason there's not a visible
 bug is none of the core catalogs have deferrable indexes (so there's
 no pg_depend entry to be found).

 * It increases the runtime of BuildIndexInfo by what seems likely to
 be orders of magnitude (note that get_index_constraint is not a
 syscacheable operation).  Did anyone do any performance checks on
 this patch, like seeing if pgbench got slower?

 I think we had better add the deferrability state to pg_index
 to avoid this.

 I tried to see if we could dispense with storing the flags in IndexInfo
 at all, so as not to have to do that.  It looks to me like the only
 place where it's really needed is in ExecInsertIndexTuples:

        if (is_vacuum || !relationDescs[i]->rd_index->indisunique)
            uniqueCheck = UNIQUE_CHECK_NO;
 ==>    else if (indexInfo->ii_Deferrable)
            uniqueCheck = UNIQUE_CHECK_PARTIAL;
        else
            uniqueCheck = UNIQUE_CHECK_YES;

 Since this code has its hands on the pg_index row already, it definitely
 doesn't need a copy in IndexInfo if the state is in pg_index.  However,
 I also notice that it doesn't particularly care about the deferrability
 state in the sense that the triggers use (ie, whether to check at end of
 statement or end of transaction).  What we want to know here is whether
 to force an old-school immediate uniqueness check in the index AM.  So
 I'm thinking that we only need one bool added to pg_index, not two.

 And I'm further thinking about intentionally calling it something other
 than "deferred", to emphasize that the semantics aren't quite like
 regular constraint deferral.  Maybe invert the sense and call it
 "immediate".  Comments?

                        regards, tom lane


Yes that makes sense. Sorry I didn't spot this - it was a performance
regression, which I should have spotted with pgbench.

 - Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Deferrable unique constraints

2009-07-29 Thread Dean Rasheed
2009/7/29 Tom Lane t...@sss.pgh.pa.us:
 Another thought on the index AM API issues: after poking through the
 code I realized that there is *nobody* paying any attention to the
 existing bool result of aminsert() (ie, did we insert anything or not).
 So I think that instead of adding a bool* parameter, we should repurpose
 the function result, along the lines of this spec:

  <para>
   The method's boolean result value is significant only when
   <literal>checkUnique</literal> is <literal>UNIQUE_CHECK_PARTIAL</literal>.
   In this case a TRUE result means the new entry is known unique, whereas
   FALSE means it might be non-unique (and a deferred uniqueness check must
   be scheduled).  For other cases a constant FALSE result is recommended.
  </para>


And you'll be moving the ereport() back into the btree code? Makes
sense, provided that nothing is ever going to care whether the index
actually inserted an entry. I can see arguments for making the
recommended return value for other cases either TRUE or FALSE, but I
guess it doesn't matter since nothing is going to check it.


  <para>
   For non-unique indexes, it is not required that <function>aminsert</function>
   do anything; it might for instance refuse to index NULLs.
  </para>


Doesn't this comment apply equally to unique indexes?

 - Dean



Re: [HACKERS] WIP: to_char, support for EEEE format

2009-07-29 Thread Brendan Jurd
2009/7/29 Euler Taveira de Oliveira eu...@timbira.com:
 Looks better but I did some tests and caught some strange behaviors.


Pavel,

Any comment on these cases?  When I looked at the patch I wasn't
really sure why it filled the string with '#' either, but I didn't
question it.

Cheers,
BJ



Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread Peter Eisentraut
On Tuesday 28 July 2009 23:30:23 pg...@mohawksoft.com wrote:
 The thing that perplexed me was that it was not obvious from the docs how,
 exactly, to get the functionality that was simple and straightforward in
 XML2.

I continue to be in favor of adding

xpath_string
xpath_number
xpath_boolean

functions, which would be both easier to use and provide a more casting-free 
approach to pass the data around.  In the past there were some doubts and 
objections about that, but I think it could be done.



[HACKERS] Revised signal multiplexer patch

2009-07-29 Thread Fujii Masao
Hi,

I revised the patch according to the suggestion.

On Tue, Jul 28, 2009 at 4:08 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Jul 28, 2009 at 12:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think you're making things more complicated when they should be
 getting simpler.

 It strikes me that the current API of "pass the BackendId if known or
 InvalidBackendId if not" still works for processes without a BackendId,
 as long as you can tolerate a bit of extra search overhead for them.
 (You could reduce the search overhead by searching the array back to
 front.)  So a new process index may be overkill.

 Yeah, this is very simple. I'll change the patch according to your suggestion.

Done.

 Umm... the patch should cover a notify interrupt which currently uses
 SIGUSR2?

 Getting rid of the separate SIGUSR2 handler would definitely be a good
 proof of concept that the mechanism works for more than one use.

 OK. I'll change the patch as above.

Done.

But there is one issue: the extra search is always required when sending a
notify interrupt, because pg_listener doesn't store a backend ID that we
could pass to SendProcSignal. To solve this, should we add a backend ID
field to pg_listener?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


signal_multiplexer_0729.patch
Description: Binary data



Re: [HACKERS] WIP: to_char, support for EEEE format

2009-07-29 Thread Pavel Stehule
2009/7/29 Brendan Jurd dire...@gmail.com:
 2009/7/29 Euler Taveira de Oliveira eu...@timbira.com:
 Looks better but I did some tests and caught some strange behaviors.


 Pavel,

 Any comment on these cases?  When I looked at the patch I wasn't
 really sure why it filled the string with '#' either, but I didn't
 question it.

This is consistent with postgres.

Pavel


 Cheers,
 BJ




Re: [HACKERS] plpgsql: support identif%TYPE[], (from ToDo)

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 1:44 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2009/7/29 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 I would like to solve some points from the ToDo. I began with %TYPE[] support.

 plpgsql's %type support is a crock that's going to have to be rewritten
 from the ground up as soon as we consolidate the lexer with the core.
 I wouldn't suggest spending any time in that area now.

 ook

There's also the fact that we still have almost half the patches in
this CommitFest that have yet to be closed out.  If you're looking for
something to do...

...Robert



Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-07-29 Thread Robert Haas
On Tue, Jul 28, 2009 at 9:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 The other possibility here is that this just doesn't work.  :-)

 That's why we wanted to test it ;-).

 I don't have time to look right now, but ISTM the original discussion
 that led to making that patch had ideas about scenarios where it would
 be faster.  It'd be worth digging that up and seeing if the current
 tests covered the case or not.

This is what I've been able to find on a quick look:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg00678.php

Sounds like Kevin may want to try renaming some of his indices to
produce intermingling...

...Robert



Re: [HACKERS] ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

2009-07-29 Thread Robert Haas
On Sat, Jul 18, 2009 at 5:54 PM, Dimitri Fontaine dfonta...@hi-media.com wrote:
 I'm going to update the status of patch and resume work on it next week.

Any update?

...Robert



Re: [HACKERS] Non-blocking communication between a frontend and a backend (pqcomm)

2009-07-29 Thread Fujii Masao
Hi,

Thanks for the comment!

On Tue, Jul 28, 2009 at 12:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, actually, this description perfectly illustrates my basic
 complaint: the patch breaks the API abstraction provided by pqcomm.c.
 Callers are encouraged/forced to deal with the next layer down, and to
 the extent that pqcomm.c does anything useful, callers have to allow for
 that too.

 As far as the read side of pq_wait goes, it should probably be more
 like
* return true if any bytes are in the buffer
* else try to read some bytes into the buffer, without blocking
* now return true or false depending on whether bytes are in
  the buffer.
 (Or perhaps instead of true/false, return the count of available bytes.)

Seems good. But when we wait for 4 bytes and there is 1 byte in the buffer,
the above pq_wait always returns immediately, and we cannot get the 4 bytes.
So I think that we should make pq_wait monitor only the socket, and make
the newly-introduced low-level I/O functions (such as pq_getbufbytes)
load some bytes into the buffer without blocking (i.e., pq_wait with no
timeout is called by those functions) and return true/false according to
whether the requested bytes are in there. What is your opinion?

 Also, I suspect the API needs to make a distinction between "no more
 bytes available yet" and EOF (channel closed, so no more bytes ever
 will be available).

That's right.

 I'm not sure about the write side.  The patch isn't really addressing
 blocking-on-write, except in the offhand way of having a forWrite option
 in pq_wait, but that doesn't seem too well thought out to me.

I'm inclined to use the write side in Synch Rep. So, I'll consider it carefully.

  Again,
 the buffering pqcomm does makes a direct query of the underlying state
 seem pretty suspicious.

Sorry, I didn't get your point. Is your concern only about receiving a query
from the frontend?

I'm not planning to change the existing communication of queries and results
between the backend and frontend. Non-blocking communication is used in
log-shipping between walsender and walreceiver. In order to speed up
log-shipping, the walsender should send the XLOG records and receive the
reply from the walreceiver as concurrently as possible. In other words, I'd
like to prevent the pause in which the walsender cannot send outstanding
records because it is blocked on a receive. That pause might be long over a
slow network.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-07-29 Thread Peter Eisentraut
On Tuesday 14 July 2009 22:12:28 Oleg Bartunov wrote:
 we'd like to introduce filtering dictionaries support for text search
 and new contrib module unaccent, which provides useful example of
 filtering dictionary. It finally solves the known problem of
 incorrect generation of headlines of text with accents.

What is the source of the unaccent rules, and how complete is the rule set?



Re: [HACKERS] dependencies for generated header files

2009-07-29 Thread Peter Eisentraut
On Sunday 28 June 2009 21:21:35 Robert Haas wrote:
 I think that our dependencies for generated header files (gram.h,
 fmgroids.h, probes.h) are not as good as they could be.  What we do
 right now is make src/backend/Makefile rebuild these before recursing
 through its subdirectories.  This works OK for a top-level make, but
 if you run make further down in the tree (like under
 src/backend/commands) it won't necessarily rebuild everything that it
 should.

 The attached patch moves some of this logic from src/backend/Makefile
 to src/Makefile.global.in.

Have you tried putting them in src/backend/common.mk?  That might be a better 
place.

Also, modulo that change, do you think it would be OK to apply this patch now 
while the other patch about header generation is under discussion?  Or would 
that create a complete merge disaster if we ended up going with some variant 
of the header generation patch?



Re: [HACKERS] dependencies for generated header files

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 8:43 AM, Peter Eisentraut pete...@gmx.net wrote:
 On Sunday 28 June 2009 21:21:35 Robert Haas wrote:
 I think that our dependencies for generated header files (gram.h,
 fmgroids.h, probes.h) are not as good as they could be.  What we do
 right now is make src/backend/Makefile rebuild these before recursing
 through its subdirectories.  This works OK for a top-level make, but
 if you run make further down in the tree (like under
 src/backend/commands) it won't necessarily rebuild everything that it
 should.

 The attached patch moves some of this logic from src/backend/Makefile
 to src/Makefile.global.in.

 Have you tried putting them in src/backend/common.mk?  That might be a better
 place.

No, I haven't.  I can look at that at some point, but I'm a bit backed
up right now.

 Also, modulo that change, do you think it would be OK to apply this patch now
 while the other patch about header generation is under discussion?  Or would
 that create a complete merge disaster if we ended up going with some variant
 of the header generation patch?

Well, since this one is only an 83-line patch, I don't think it will
be too bad.  Of course if committing this gets you excited about
overhauling the Makefiles and you decide to change a bunch of other
things too, that might be more of an issue...

...Robert



Re: [HACKERS] WIP: Deferrable unique constraints

2009-07-29 Thread Tom Lane
Dean Rasheed dean.a.rash...@googlemail.com writes:
 2009/7/29 Tom Lane t...@sss.pgh.pa.us:
   For non-unique indexes, it is not required that <function>aminsert</function>
   do anything; it might for instance refuse to index NULLs.

 Doesn't this comment apply equally to unique indexes?

Hmm, I was thinking that a unique-capable index would have to index all
tuples.  But I guess if it's restricted to one index column (like hash)
it could omit nulls and still enforce uniqueness correctly.  I'll change
that comment.

regards, tom lane



Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-07-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 28, 2009 at 9:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't have time to look right now, but ISTM the original discussion
 that led to making that patch had ideas about scenarios where it would
 be faster.

 This is what I've been able to find on a quick look:
 http://archives.postgresql.org/pgsql-hackers/2009-05/msg00678.php

 Sounds like Kevin may want to try renaming some of his indices to
 produce intermingling...

Also, the followup to that message points out that the 8.4.0 code has a
potential O(N^2) dependency on the total number of TOC items in the
dump.  So it might be interesting to check the behavior with very large
numbers of tables/indexes.

regards, tom lane



Re: [HACKERS] Filtering dictionaries support and unaccent dictionary

2009-07-29 Thread Oleg Bartunov

On Wed, 29 Jul 2009, Peter Eisentraut wrote:


On Tuesday 14 July 2009 22:12:28 Oleg Bartunov wrote:

we'd like to introduce filtering dictionaries support for text search
and new contrib module unaccent, which provides useful example of
filtering dictionary. It finally solves the known problem of
incorrect generation of headlines of text with accents.


What is the source of the unaccent rules, and how complete is the rule set?


Unicode tables from unicode.org. It'd be nice if someone checked the completeness.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-07-29 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 This is what I've been able to find on a quick look:
 
 http://archives.postgresql.org/pgsql-hackers/2009-05/msg00678.php
 
 Sounds like Kevin may want to try renaming some of his indices to
 produce intermingling...
 
Thanks, I'll give that a try.  Renaming them is one thing, getting a
new dump is another, though.  I probably won't be able to test that
theory until tomorrow night.
 
Last night's test yielded a couple interesting results.
 
For one thing, while the optimized postgresql.conf was 2.5% faster
than the default file for a single job in one database transaction, it
was 10% *slower* than the default for multi-job restores.  I'll check
on that more later, to see what might be helping and what is hurting.
 
For another thing, with the default settings, the patched version ran
an additional 1% faster than the unpatched, although I don't have
enough samples to have a high degree of confidence that it wasn't noise.
I'll run another slew of tests tonight with the existing dump file to
confirm or debunk that result, while I create a new dump file to test
with name intermingling on later nights.
 
For the record, the default postgresql.conf:
 
port = 5678
datestyle = 'iso, mdy'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
escape_string_warning = off
standard_conforming_strings = on
 
The optimized file adds these:
 
max_connections = 100
shared_buffers = 256MB
work_mem = 50MB
maintenance_work_mem = 500MB
bgwriter_lru_maxpages = 600
bgwriter_lru_multiplier = 10.0
fsync = off
full_page_writes = off
wal_buffers = 4MB
random_page_cost = 2.0
effective_cache_size = 3GB
logging_collector = on
log_line_prefix = '[%m] %p %q%u %d %r '
track_counts = off
autovacuum = off
sql_inheritance = off
 
I'm sure that there is a wealth of opinion on which of these are
slowing things down, but I'm going to withhold any guesses in favor of
testing them.  (They all proved themselves neutral or beneficial in
objective testing for the single-job restores under 8.3.)
 
-Kevin



Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-07-29 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: 
 
 Also, the followup to that message points out that the 8.4.0 code
 has a potential O(N^2) dependency on the total number of TOC items
 in the dump.  So it might be interesting to check the behavior with
 very large numbers of tables/indexes.
 
I've got 431 user tables with 578 indexes.  How high should I push
this?  Can I just create a bunch of randomly named empty tables with
primary keys to provoke this effect?
 
-Kevin



Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread pgsql
 On Tuesday 28 July 2009 23:30:23 pg...@mohawksoft.com wrote:
 The thing that perplexed me was that it was not obvious from the docs how,
 exactly, to get the functionality that was simple and straightforward in
 XML2.

 I continue to be in favor of adding

 xpath_string
 xpath_number
 xpath_boolean

 functions, which would be both easier to use and provide a more
 casting-free
 approach to pass the data around.  In the past there were some doubts and
 objections about that, but I think it could be done.


I totally agree, but I tend to be more of a pragmatist than a purist. It
seems to me that purists tend to like a lot of topical consistency in an
API, like the new implementation of xpath over the former, whereas a
pragmatist will violate some of the rules to make something easier to use.

The issue I have with the xpath implementation is that it seems more
geared to an XML implementation on top of SQL instead of an XML
implementation embedded within SQL.

For instance, I use an XML column in a database for metadata about
customers and other objects. So, we have a base table of objects and the
specifics of each object is contained within XML.


So, the former API was perfect for this use:

select datum from objects where key = 'GUID' and
xpath_string(datum, E'foo/bar') = 'frobozz';

The logic of the function is that it is intended to use extracted
XML within a query. The new xpath functionality seems not to be designed
to facilitate this, requiring a pretty arcane query structure to do the
same thing:

select datum from objects where key='GUID' and (xpath(E'foo/bar',
XMLPARSE(CONTENT datum))::text())[1] = 'frobozz';



Re: [HACKERS] ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

2009-07-29 Thread Dimitri Fontaine

Le 29 juil. 09 à 13:07, Robert Haas a écrit :
On Sat, Jul 18, 2009 at 5:54 PM, Dimitri Fontaine dfonta...@hi-media.com
 wrote:
I'm going to update the status of patch and resume work on it next  
week.


Any update?



Sorry about delays, resuming tomorrow evening is the current plan.
--
dim


Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-07-29 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote: 
 Also, the followup to that message points out that the 8.4.0 code
 has a potential O(N^2) dependency on the total number of TOC items
 in the dump.  So it might be interesting to check the behavior with
 very large numbers of tables/indexes.
 
 I've got 431 user tables with 578 indexes.  How high should I push
 this?  Can I just create a bunch of randomly named empty tables with
 primary keys to provoke this effect?

Yeah, just add a bunch of empty tables.  Ten thousand or so, perhaps.

regards, tom lane



[HACKERS] date_part()/EXTRACT(second) behaviour with time data type

2009-07-29 Thread Gregory Stark

I think we broke date_part for extracting seconds from time arguments. It
appears we leave out the milliseconds whereas we don't for timestamp
arguments. This was not the case in 8.3 where we included the milliseconds for
both data types.

Unless this is intentional? I know we whacked around both the meaning of SECOND
for intervals and the code for date_part in a lot of ways. But I don't see why
it would be different for time versus timestamp.

postgres=# select extract(second from now()::time with time zone);
 date_part 
---
27
(1 row)

postgres=# select extract(second from now()::time without time zone);
 date_part 
---
27
(1 row)

postgres=# select extract(second from now()::timestamp with time zone);
 date_part 
---
 27.782458
(1 row)

postgres=# select extract(second from now()::timestamp without time zone);
 date_part 
---
 27.782458
(1 row)


-- 
  Gregory Stark
  http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] plpgsql: support identif%TYPE[], (from ToDo)

2009-07-29 Thread Pavel Stehule
2009/7/29 Robert Haas robertmh...@gmail.com:
 On Wed, Jul 29, 2009 at 1:44 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2009/7/29 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 I would like to solve some points from the ToDo. I began with %TYPE[] support.

 plpgsql's %type support is a crock that's going to have to be rewritten
 from the ground up as soon as we consolidate the lexer with the core.
 I wouldn't suggest spending any time in that area now.

 ook

 There's also the fact that we still have almost half the patches in
 this CommitFest that have yet to be closed out.  If you're looking for
 something to do...


I am not a good reviewer, and I only understand plpgsql a little. I
am preparing a plan for the next commitfest.

Pavel

 ...Robert




Re: [HACKERS] plpgsql: support identif%TYPE[], (from ToDo)

2009-07-29 Thread Pavel Stehule
2009/7/28 Martijn van Oosterhout klep...@svana.org:
 On Tue, Jul 28, 2009 at 10:53:08PM +0200, Pavel Stehule wrote:
 Hello

 I would like to solve some points from the ToDo. I began with %TYPE[] support.
 I think this should be relatively simple, but there is one issue.

 snip

 My first idea is using word element:

 create or replace function x(a int[])
 ...
 declare f a%element;
 begin
 ...

 I would have thought:

 declare f a[0]%type;

 as in: the type of a[0]. Perhaps a[]%type but that seems less natural.


It's clear that a[0] means an element of the array, but why zero? A
PostgreSQL array can start from any int value :(. And this syntax is a
little bit difficult to parse.

What do you think about:

declare
  f array of a%type;
  x element of f%type;

??

regards
Pavel


 Have a nice day,
 --
 Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while
 boarding. Thank you for flying nlogn airlines.

 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.9 (GNU/Linux)

 iD8DBQFKb2lxIB7bNG8LQkwRApwuAJ4snkAtixsMa2ju8r0jYYIH2hIJ6ACeOVQj
 /oiBdGmX8zQddwmwsvjLnOM=
 =xtg2
 -END PGP SIGNATURE-





Re: [HACKERS] WIP: to_char, support for EEEE format

2009-07-29 Thread Brendan Jurd
2009/7/29 Euler Taveira de Oliveira eu...@timbira.com:
 This is not a problem with your patch but something that needs to be fixed in
 PostgreSQL to match Oracle behavior. The following example should emit an
 error. IMHO, filling the string with # is very strange. TODO?

 euler=# SELECT to_char(1234.56789, '9.080');
  to_char
 -
  #.#8#
 (1 row)

The formatting functions have a lot of weird corner cases.  I've been
trying to clean up some of the more bizarre behaviours in the
date/time formatting functions, but haven't touched the numeric
formatting because I haven't ever needed to use it.

Filling unused characters in the string with # may be strange, but
changing it would require a much broader patch that covers all of the
numeric formatting styles, not just EEEE.  A TODO is probably the way
to go.


 Couldn't the following code be put inside switch clause? If not, you should
 add a comment why the validation is outside switch.

 +   if (IS_EEEE(num) && n->key->id != NUM_E)
 +   {
 +       NUM_cache_remove(last_NUMCacheEntry);
 +       ereport(ERROR,
 +               (errcode(ERRCODE_SYNTAX_ERROR),
 +                errmsg("\"EEEE\" must be the last pattern used")));
 +   }
 +
     switch (n->key->id)
     {
         case NUM_9:

The switch is on (n->key->id), but the test you mentioned above is
looking for any keyword *other than* the EEEE keyword, where EEEE has
previously been parsed.

So if you put the test inside the switch, it would need to appear in
every single branch of the switch except for the NUM_E one.  I'm
confused about why you think this needs a comment.  Perhaps I
misunderstood you?


 Oracle has a different overflow limit [1] but I think we could stay with the
 PostgreSQL one. But the #.### is not the intended behavior. IIRC you're
 limited to a 99 exponent.

 SQL> SELECT to_char(1.234567E+308, '9.999EEEE');
 SELECT to_char(1.234567E+308, '9.999EEEE')

 ERROR at line 1:
 ORA-01426: numeric overflow

 euler=# SELECT to_char(1.234567E+308, '9.999EEEE');
  to_char
 ---
  #.###
 (1 row)

I don't see any problem with extending this to allow up to 3 exponent
digits ... Pavel, any comment?


 The problem is in numeric_to_char() and float8_to_char(). You could fix it
 with the following code. Besides that I think you should comment why '5' or
 '6' in the other *_to_char() functions.

 +       /* 6 means '.' (decimal point), 'E', '+', and 3 exponent digits */

Agreed about the comment; I'll add it.

Cheers,
BJ



Re: [HACKERS] WIP: to_char, support for EEEE format

2009-07-29 Thread Pavel Stehule
2009/7/29 Brendan Jurd dire...@gmail.com:
 2009/7/29 Euler Taveira de Oliveira eu...@timbira.com:
 This is not a problem with your patch but something that needs to be fixed in
 PostgreSQL to match Oracle behavior. The following example should emit an
 error. IMHO, filling the string with # is very strange. TODO?

 euler=# SELECT to_char(1234.56789, '9.080');
  to_char
 -
  #.#8#
 (1 row)

 The formatting functions have a lot of weird corner cases.  I've been
 trying to clean up some of the more bizarre behaviours in the
 date/time formatting functions, but haven't touched the numeric
 formatting because I haven't ever needed to use it.

 Filling unused characters in the string with # may be strange, but
 changing it would require a much broader patch that covers all of the
 numeric formatting styles, not just EEEE.  A TODO is probably the way
 to go.


 Couldn't the following code be put inside switch clause? If not, you should
 add a comment why the validation is outside switch.

 +   if (IS_EEEE(num) && n->key->id != NUM_E)
 +   {
 +       NUM_cache_remove(last_NUMCacheEntry);
 +       ereport(ERROR,
 +               (errcode(ERRCODE_SYNTAX_ERROR),
 +                errmsg("\"EEEE\" must be the last pattern used")));
 +   }
 +
    switch (n->key->id)
    {
        case NUM_9:

 The switch is on (n->key->id), but the test you mentioned above is
 looking for any keywords *other than* the EEEE keyword, where EEEE has
 previously been parsed.

 So if you put the test inside the switch, it would need to appear in
 every single branch of the switch except for the NUM_E one.  I'm
 confused about why you think this needs a comment.  Perhaps I
 misunderstood you?


 Oracle has a different overflow limit [1] but I think we could stay with the
 PostgreSQL one. But the #.### is not the intended behavior. IIRC you're
 limited to a two-digit (99) exponent.

 SQL> SELECT to_char(1.234567E+308, '9.999EEEE');
 SELECT to_char(1.234567E+308, '9.999EEEE')

 ERROR at line 1:
 ORA-01426: numeric overflow

 euler=# SELECT to_char(1.234567E+308, '9.999EEEE');
  to_char
 ---
  #.###
 (1 row)

 I don't see any problem with extending this to allow up to 3 exponent
 digits ... Pavel, any comment?


I am not sure - this function should be used in reports with fixed
line width. And I am thinking it shouldn't be a problem - I prefer
showing some #.### chars. It's a clear signal that something is wrong,
but it doesn't break generation of long-running reports (like an
exception in Oracle) and doesn't break formatting with 3 exponent digits.

Pavel


 The problem is in numeric_to_char() and float8_to_char(). You could fix it
 with the following code. Besides that, I think you should comment on why '5'
 or '6' is used in the other *_to_char() functions.

 +       /* 6 means '.' (decimal point), 'E', '+', and 3 exponent digits */

 Agreed about the comment; I'll add it.

 Cheers,
 BJ




Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread Andrew Dunstan



pg...@mohawksoft.com wrote:

On Tuesday 28 July 2009 23:30:23 pg...@mohawksoft.com wrote:


The thing that perplexed me was that it was not obvious from the docs
how, exactly, to get the functionality that was simple and straight
forward in XML2.

I continue to be in favor of adding

xpath_string
xpath_number
xpath_boolean

functions, which would be both easier to use and provide a more
casting-free
approach to pass the data around.  In the past there were some doubts and
objections about that, but I think it could be done.




I totally agree, but I tend to be more of a pragmatist than a purist. It
seems to me that purists tend to like a lot of topical consistency in an
API, like the new implementation of xpath over the former, whereas a
pragmatist will violate some of the rules to make something easier to use.

The issue I have with the xpath implementation is that it seems more
geared to an XML implementation on top of SQL instead of an XML
implementation embedded within SQL.

For instance, I use an XML column in a database for metadata about
customers and other objects. So, we have a base table of objects and the
specifics of each object is contained within XML.


So, the former API was perfect for this use:

select datum from objects where key ='GUID' and
xpath_string(datum,E'foo/bar') = 'frobozz';

The logic of the function is that it is intended to use extracted
XML within a query. The new xpath functionality seems not to be designed
to facilitate this, requiring a pretty arcane query structure to do the
same thing:

select datum from objects where key='GUID' and (xpath(E'foo/bar',
XMLPARSE(CONTENT datum))::text())[1] = 'frobozz';

  


It's not that arcane. Mike Rylander and I came up with the same answer 
independently within a very short time of you posting your query. I 
guess it depends how used you are to using XPath.


It's also probably not terribly hard to produce a wrapper to do what 
you'd like.


I have no problem with adding some convenience functions. I do have a 
problem with functions where we try to make things easy and instead muck 
them up. We just ripped out a convenience from our xpath processing 
that was totally braindead, so this isn't an idle concern.


I would argue that xpath_string is a fairly horrible name for what the 
xml2 module provides. Specifically, my objection is that an xpath query 
returns a nodeset, and what this function returns is not the string 
value of the nodeset, but the string value of the *contents* of those 
nodes, which is not the same thing at all. To that extent the xml2 
module documentation is at best imprecise and at worst plain wrong.


cheers

andrew



Re: [HACKERS] date_part()/EXTRACT(second) behaviour with time data type

2009-07-29 Thread Tom Lane
Gregory Stark st...@mit.edu writes:
 I think we broke date_part for extracting seconds from time arguments. It
 appears we leave out the milliseconds whereas we don't for timestamp
 arguments. This was not the case in 8.3 where we included the milliseconds for
 both data types.

It's not new.  This appears to be a difference between the integer and
float timestamp code paths, and I'd say it's probably a thinko:

case DTK_SECOND:
#ifdef HAVE_INT64_TIMESTAMP
result = tm->tm_sec + fsec / USECS_PER_SEC;
#else
result = tm->tm_sec + fsec;
#endif
break;

In the integer case, fsec is an integer and so the division loses the
fraction.  timestamptz_part does this instead:

case DTK_SECOND:
#ifdef HAVE_INT64_TIMESTAMP
result = tm->tm_sec + fsec / 1000000.0;
#else
result = tm->tm_sec + fsec;
#endif
break;

I agree that we should change it, but should we back-patch it, and if so
how far?

regards, tom lane



Re: [HACKERS] system timezone regression failure

2009-07-29 Thread Andrew Dunstan



Andrew Dunstan wrote:



Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:
 
Building 8.4 on a client's system, I get a regression failure 
apparently due to some difference between the system's timezone DB 
and what out regression tests expect, as shown below.



Those regression tests were *intentionally* set up to fail if someone's
TZ support is not Y2038 clean.  This is not a bug.  Advise your client
to get some less-obsolete timezone data files; or don't depend on the
system TZ database.  (The only reason why you should do so is if it's
being kept up to date, eh?)

   
  


Oh, you're right, I misread the diffs. The client is getting the 
machines updated.


Well, this is interesting:


and...@jimbo:~ rpm -q -i timezone
Name        : timezone                  Relocations: (not relocatable)
Version     : 2.4                       Vendor: SUSE LINUX Products GmbH, Nuernberg, Germany
Release     : 31.61                     Build Date: Thu 02 Apr 2009 02:31:18 PM EDT
Install Date: Wed 29 Jul 2009 10:52:55 AM EDT    Build Host: cherubini.suse.de
Group       : System/Base               Source RPM: glibc-2.4-31.61.src.rpm
Size        : 712873                    License: BSD 3-Clause; GPL v2 or later; LGPL v2.1 or later
Signature   : DSA/SHA1, Fri 03 Apr 2009 09:20:05 AM EDT, Key ID a84edae89c800aca
Packager    : http://bugs.opensuse.org
URL         : http://www.gnu.org/software/libc/libc.html
Summary     : Timezone descriptions
Description :
These are configuration files that describe available time zones. You
can select an appropriate time zone for your system with YaST.
Distribution: SUSE Linux Enterprise 10 (X86-64)


And I still get these errors. Looks like at least SUSE is not keeping 
up. I'll build without the system timezones.


cheers

andrew




Re: [HACKERS] WIP: to_char, support for EEEE format

2009-07-29 Thread Euler Taveira de Oliveira
Brendan Jurd escreveu:
 Filling unused characters in the string with # may be strange, but
 changing it would require a much broader patch that covers all of the
 numeric formatting styles, not just EEEE.  A TODO is probably the way
 to go.
 
That was my suggestion: a TODO.

 So if you put the test inside the switch, it would need to appear in
 every single branch of the switch except for the NUM_E one.  I'm
 confused about why you think this needs a comment.  Perhaps I
 misunderstood you?
 
Yes, I know you need to modify every 'case' clause to test if EEEE was
previously used (that was one suggestion), but I said if you don't want
to go that way, add a comment explaining why you're using that 'if'
above the 'switch' and not inside it.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [HACKERS] system timezone regression failure

2009-07-29 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 And I still get these errors. Looks like at least SUSE is not keeping 
 up. I'll build without the system timezones.

Or just live with it.  The purpose of the regression tests is to let
you know there's a problem, not to dictate what you do about it.
In this case you need to consider whether the application cares about
post-Y2038 date arithmetic, and whether that's more important than
having regularly updated timezone details.

regards, tom lane



Re: [HACKERS] date_part()/EXTRACT(second) behaviour with time data type

2009-07-29 Thread Greg Stark
On Wed, Jul 29, 2009 at 5:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I agree that we should change it, but should we back-patch it, and if so
 how far?


Well at least to 8.4 so someone who has just always been using
downloaded binaries or binaries compiled with the default
configuration continues to get the same behaviour.

My inclination would be to backpatch it further back. But I'm not 100%
sure either.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] Higher TOAST compression.

2009-07-29 Thread decibel

On Jul 23, 2009, at 6:22 AM, Laurent Laborde wrote:
On Wed, Jul 22, 2009 at 10:54 AM, Laurent Laborde kerdez...@gmail.com wrote:

My 1st applied patch is the safest and simpliest :
in pg_lzcompress.c :

static const PGLZ_Strategy strategy_default_data = {
    256,   /* data chunks smaller than 256 bytes are not compressed */
    256,   /* force compression on data chunks >= 256 bytes */
    1,     /* compression rate below 1% falls back to uncompressed */
    256,   /* stop history lookup if a match of 256 bytes is found */
    6      /* lower good match size by 6% at every lookup iteration */
};
const PGLZ_Strategy *const PGLZ_strategy_default = &strategy_default_data;


I'm testing in production since yesterday.
It greatly improved %IOwait.

My 1st guess is that postgresql keeps more data inline instead of
moving it out to the toast table, massively reducing IO seeks and
resulting in a higher IO throughput.
(iostat shows a 5~25MB/s bandwidth at 100%util instead of 2~5MB/s at
100%util).


So... now i'm not sure anymore about lowering the tuple per page  
from 4 to 8.

Doing that would mean more data in TOAST table ...
What's the typical size of your data that's being toasted? I actually
have a number of cases where I'd like to push data into external
storage, because it seriously hurts tuple density (and I doubt it'd
compress well).

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828





Re: [HACKERS] [RFC] new digest datatypes, or generic fixed-len hex types?

2009-07-29 Thread decibel

On Jul 28, 2009, at 6:15 AM, Peter Eisentraut wrote:

On Monday 27 July 2009 14:50:30 Alvaro Herrera wrote:
We've developed some code to implement fixed-length datatypes for
well-known digest function output (MD5, SHA1 and the various SHA2 types).
These types have minimal overhead and are quite complete, including
btree and hash opclasses.

We're wondering about proposing them for inclusion in pgcrypto.  I asked
Marko Kreen but he is not sure about it; according to him it would be
better to have general fixed-length hex types.  (I guess it would be
possible to implement the digest types as domains over those.)


I think equipping bytea with a length restriction would be a very
natural, simple, and useful addition.  If we ever want to move the
bytea type closer to the SQL standard blob type, this will need to
happen anyway.

The case for separate fixed-length data types seems very dubious,
unless you can show very impressive performance numbers.  For one
thing, they would make the whole type system more complicated, or in
the alternative, would have little function and operator support.

bytea doesn't cast well to and from text when you're dealing with hex
data; you end up using the same amount of space as a varchar. What
would probably work well is a hex datatype that internally works like
bytea but requires that the input data is hex (I know you can use
encode/decode, but that added step is a pain). A similar argument
could be made for base64 encoded data.

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828





Re: [HACKERS] slow count in window query

2009-07-29 Thread Hitoshi Harada
2009/7/18 Hitoshi Harada umi.tan...@gmail.com:
 If I understand explain analyze correctly and it tells us the fact,
 WindowAgg without ORDER BY clause gets unreasonably slow. Let me see.


I haven't determined the difference between with and without ORDER BY
clause in OVER(), but I took a benchmark that throws an interesting
result.

$ bin/psql regression -c 'explain analyze select count(*) over() from x'
                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=0.00..2741.00 rows=100000 width=0) (actual time=3725.294..4559.828 rows=100000 loops=1)
   ->  Seq Scan on x  (cost=0.00..1491.00 rows=100000 width=0) (actual time=0.112..310.349 rows=100000 loops=1)
 Total runtime: 4811.115 ms
(3 rows)

The query is quite slow because the profiling hook function calls
gettimeofday() each time. And here's the result, counting
eval_windowaggregate() calls and its child functions. Elapsed time is
in seconds, with the total gettimeofday() overhead subtracted.

eval_windowaggregates:
Count 10
Elapse 0.588426

Address  |Name  |Count |Elapse(Total)
0x8204067|initialize_windowaggregate| 1| 0.000277
0x8204d4a|spool_tuples  |12| 0.620092
0x83dcd08|tuplestore_select_read_pointer|11| 0.011080
0x83dda2f|tuplestore_gettupleslot   |11| 0.049005
0x8204fdd|row_is_in_frame   |10| 0.014978
0x8204168|advance_windowaggregate   |10| 0.025675
0x81ead8a|ExecClearTuple|10| 0.022105
0x8204462|finalize_windowaggregate  | 1| 0.15
0x8204120|MemoryContextSwitchTo | 2| 0.00

spool_tuples() is dominant in eval_windowaggregates(). I think it is
not needed if the query contains only a simple aggregate like count(*)
OVER (), but currently we copy all the rows from the source table to
the tuplestore. Even if it fits in memory, the copy operation costs too
much.

I am thinking about how to avoid unnecessary copy overhead...


Regards,

---
Hitoshi Harada



Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread pgsql


 select datum from objects where key ='GUID' and
 xpath_string(datum,E'foo/bar') = 'frobozz';

 The logic of the function is that it is intended to use extracted
 XML within a query. The new xpath functionality seems not to be designed
 to facilitate this, requiring a pretty arcane query structure to do the
 same thing:

 select datum from objects where key='GUID' and (xpath(E'foo/bar',
 XMLPARSE(CONTENT datum))::text())[1] = 'frobozz';



 It's not that arcane. Mike Rylander and I came up with the same answer
 independently within a very short time of you posting your query. I
 guess it depends how used you are to using XPath.

That is sort of the point I was making. It just seems arcane, by the very
definition of arcane, within a SQL context. It is workable and it can be
used, but I don't think the change was designed to make writing queries
easier. It was designed to be more about XPath than SQL.


 It's also probably not terribly hard to produce a wrapper to do what
 you'd like.

No, it isn't, but you haven't made the usage of XPath any easier in the
more general case.


 I have no problem with adding some convenience functions. I do have a
 problem with functions where we try to make things easy and instead muck
 them up. We just ripped out a convenience from our xpath processing
 that was totally braindead, so this isn't an idle concern.

 I would argue that xpath_string is a fairly horrible name for what the
 xml2 module provides. Specifically, my objection is that an xpath query
 returns a nodeset, and what this function returns is not the string
 value of the nodeset, but the string value of the *contents* of those
 nodes, which is not the same thing at all. To that extent the xml2
 module documentation is at best imprecise and at worst plain wrong.

Well, the API is there; it is where, I guess, PostgreSQL is going. But I
think, philosophically, the API needs to treat the XML contained within
SQL columns as able to represent variable and optional columns in
object-oriented environments easily. The harder it is to use a feature,
the less usable the feature is.

Do you disagree?



 cheers

 andrew






[HACKERS] 8.4 win32 shared memory patch

2009-07-29 Thread Kevin Field
On Win2k3 Std SP2, the service won't start once I've applied the
patch.  In the log, I get:

%t LOG:  CreateProcess call failed: A blocking operation was
interrupted by a call to WSACancelBlockingCall.

 (error code 2)
%t LOG:  could not fork autovacuum worker process: A blocking
operation was interrupted by a call to WSACancelBlockingCall.

[ and then 47k worth of that whole message repeated, and then: ]

%t LOG:  received fast shutdown request
%t LOG:  aborting any active transactions
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t FATAL:  terminating connection due to administrator command
%t LOG:  autovacuum launcher shutting down
%t FATAL:  terminating connection due to administrator command
%t LOG:  shutting down
%t LOG:  database system is shut down


(BTW, I noticed the %t everywhere in all of my 8.4 logs, whereas in my
8.3 logs there're actual timestamps instead.  Bug?)

I put the old binary back and it works fine.

Cheers,
Kev



Re: [HACKERS] 8.4 win32 shared memory patch

2009-07-29 Thread Kevin Field
 I put the old binary back and it works fine.

For the record, "fine" meaning I've never had the shared memory
problem.

Kev



Re: [HACKERS] date_part()/EXTRACT(second) behaviour with time data type

2009-07-29 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 On Wed, Jul 29, 2009 at 5:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I agree that we should change it, but should we back-patch it, and if so
 how far?

 Well at least to 8.4 so someone who has just always been using
 downloaded binaries or binaries compiled with the default
 configuration continues to get the same behaviour.

 My inclination would be to backpatch it further back. But I'm not 100%
 sure either.

Given the lack of prior complaints, I'm thinking just to 8.4 is a good
compromise.  Any objections out there?

regards, tom lane



Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread Andrew Dunstan



pg...@mohawksoft.com wrote:


Well, the API is there; it is where, I guess, PostgreSQL is going. But I
think, philosophically, the API needs to treat the XML contained within
SQL columns as able to represent variable and optional columns in
object-oriented environments easily. The harder it is to use a feature,
the less usable the feature is.

Do you disagree?

  


There is always a degree of tradeoff between power and ease of use.  But 
whether or not you like the way the xpath() function now works hardly 
matters - we're not going to change the behaviour of an existing 
function except to fix a bug.


As I said upthread, I think we can easily provide some extra convenience 
functions which will do what you want. The only thing I was really 
arguing about was the function name for such a gadget.


cheers

andrew



Re: [HACKERS] WIP: Deferrable unique constraints

2009-07-29 Thread Tom Lane
Dean Rasheed dean.a.rash...@googlemail.com writes:
 [ deferrable unique constraints patch ]

Applied after rather extensive editorialization.  Aside from the points
we already discussed, I redid the logic in _bt_check_unique ... it
didn't look right to me, and I also felt that we need a cross-check
to verify that the original tuple's index entry gets found in the
UNIQUE_CHECK_EXISTING search.

I'm going to go add the point about better support of bulk updates
to TODO.

regards, tom lane



Re: [HACKERS] WIP: Deferrable unique constraints

2009-07-29 Thread Alvaro Herrera
Tom Lane wrote:
 Dean Rasheed dean.a.rash...@googlemail.com writes:
  [ deferrable unique constraints patch ]
 
 Applied after rather extensive editorialization.

Kudos!!

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] WIP: Deferrable unique constraints

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 5:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Dean Rasheed dean.a.rash...@googlemail.com writes:
 [ deferrable unique constraints patch ]

 Applied after rather extensive editorialization.  Aside from the points

Wow, cool.

...Robert



Re: [HACKERS] WIP patch for TODO Item: Add prompt escape to display the client and server versions

2009-07-29 Thread Robert Haas
On Tue, Jul 21, 2009 at 3:55 PM, Dickson S. Guedes lis...@guedesoft.net wrote:
 I think there is enough support for the patch.  So please adjust it to
 report the server version correctly.

 Thanks Peter, I'll adjust the patch and post a new version ASAP.

As this patch was reviewed over a week ago and has not been updated, I
am marking it "Returned with Feedback".  Please resubmit for
CommitFest 2009-09.

Thanks,

...Robert



Re: [HACKERS] Proposal: More portable way to support 64bit platforms

2009-07-29 Thread Robert Haas
On Fri, Jul 24, 2009 at 4:24 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Friday 26 June 2009 12:07:24 Tsutomu Yamada wrote:
 Included is a conceptual patch to use intptr_t. Comments are welcome.

 After closer inspection, not having a win64 box available, I have my doubts
 whether this patch actually does anything.  Foremost, it doesn't touch the
 definition of the Datum type, which ought to be at the core of a change like
 this.

 Now I see that you call this a conceptual patch.  Perhaps we should wait
 until you have developed it into a complete patch?

Is there any reason to consider this patch any further during this
CommitFest?  It seems that this is a long way from being ready to go.

...Robert



Re: [HACKERS] multi-threaded pgbench

2009-07-29 Thread Josh Williams
On Tue, 2009-07-28 at 23:38 -0400, Josh Williams wrote:
 Huh, running the patched version on a single thread with 128 clients
 just got it to crash.  Actually consistently, three times now.  Will try
 the same thing on the development box tomorrow morning to get some
 better debugging information.

So yeah, buffer overrun.

In pgbench.c FD_SETSIZE is redefined to get around the Windows default
of 64.  But this is done after bringing in winsock2.h (a couple levels
in as a result of first including postgres_fe.h).  So any fd_set is
built with an array of 64 descriptors, while pgbench thinks it has 1024
available to work with.

This was introduced a while back; the multi-threaded patch just makes it
visible by giving it an important pointer to write over.  Previously it
would just run over into the loop counter (and probably a couple other
things) and thus it'd continue on happily with the [sub]set it has.

In either case this seems to be a simple fix, to move that #define
earlier (see pgbench_win32.patch.)

- Josh Williams

diff -c -r1.87 pgbench.c
*** contrib/pgbench/pgbench.c	11 Jun 2009 14:48:51 -0000	1.87
--- contrib/pgbench/pgbench.c	29 Jul 2009 21:18:18 -0000
***************
*** 26,31 ****
--- 26,36 ----
   * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
   *
   */
+ 
+ #ifdef WIN32
+ #define FD_SETSIZE 1024		/* set before winsock2.h is included */
+ #endif							/* WIN32 */
+ 
  #include "postgres_fe.h"
  
  #include "libpq-fe.h"
***************
*** 34,41 ****
  #include <ctype.h>
  
  #ifdef WIN32
- #undef FD_SETSIZE
- #define FD_SETSIZE 1024
  #include "win32.h"
  #else
  #include <signal.h>
--- 39,44 ----



Re: [HACKERS] improvements for dict_xsyn extended synonym dictionary - RRR

2009-07-29 Thread Andres Freund
Hi Sergey,

Sorry that the second round took almost as long as the first one...

On Monday 27 July 2009 12:01:46 Sergey V. Karpov wrote:
  - Imho mode=MAP should error out if keeporig is false
  - I personally find the names for the different modes a bit
  nondescriptive. One possibility would be to introduce parameters like:
  - matchorig
  - matchsynonym
  - keeporig
  - keepsynonym
  That sounds much easier to grasp to me.
 Yes, I agree. That way the user has complete (and more
 straightforward) control over the dictionary behaviour.

 Here is the revised patch version, with following options:

  * matchorig controls whether the original word is accepted by the
dictionary. Default is true.

  * keeporig controls whether the original word is included (if true)
in results, or only its synonyms (if false). Default is true.

  * matchsynonyms controls whether any of the synonyms is accepted by
the dictionary (if true). Default is false.

  * keepsynonyms controls whether synonyms are returned by the
dictionary (if true). Default is true.

 Defaults are set to keep default behaviour compatible with original
 version.
Looks nice. The only small gripe I have is that the patch adds trailing
whitespace in a lot of places...

Apart from that, I don't see a need for any more changes...

Andres



[HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Steve Prentice
Is there a reason we force plpgsql IN parameters to constant? The
reason I ask is that having them mutable would go a long way toward
easing a port from Informix's SPL. For better or worse, we have a fair
amount of code in SPL that does something like:


   -- pObjectId is an IN parameter
   IF pObjectId IS NULL THEN
   pObjectId := newid();
   END IF;

I understand it may be better to use a different technique here, but
we have a substantial amount of SPL (40k lines), and if we could make
the IN parameters mutable, it would make my day.


Looking at the history of the code, it looks like this has been the
way since the beginning. Tom added a comment in 1995 asking why we
force the IN parameters to constant, but the "why?" part of the comment
was removed in a later change to support OUT and INOUT parameters.


I've attached a patch that would change this behavior. Also, the
test2(int) function below works with the patch, but would fail to
compile without it. I also checked that the parameter isn't passed by
reference, and it is not. The test at the bottom returns 't', meaning
test2(int) did not change the a variable in test1().


CREATE OR REPLACE FUNCTION test1() RETURNS INT AS $$
DECLARE
a INT;
BEGIN
a := 1;
PERFORM test2(a);
RETURN a;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION test2(a INT) RETURNS VOID AS $$
BEGIN
a := 2;
END
$$ LANGUAGE plpgsql;

SELECT test1() = 1;

If this change would be acceptable, I'll proceed to finish the
patch by updating docs and adding regression tests.


-Steve



notconstant.patch
Description: Binary data






Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Andrew Dunstan



Steve Prentice wrote:
Is there a reason we force plpgsql IN parameters to constant? The 
reason I ask is because having them mutable would go a long way in 
easing a port from Informix's SPL. For better or worse, we have a fair 
amount of code in SPL that does something like:


   -- pObjectId is an IN parameter
   IF pObjectId IS NULL THEN
       pObjectId := newid();
   END IF;

I understand it may be better to use a different technique here, but 
we have a substantial amount of SPL (40k lines) and if we could make 
the IN parameters mutable, it would make my day.




First reaction is that it would mean we could never pass them by 
reference. I know PLPerl uses in effect pass by copy, but what does 
PLPgsql do?


cheers

andrew



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 7:55 PM, Steve Prentice <prent...@cisco.com> wrote:
 Is there a reason we force plpgsql IN parameters to constant? The reason I
 ask is because having them mutable would go a long way in easing a port from
 Informix's SPL. For better or worse, we have a fair amount of code in SPL
 that does something like:

   -- pObjectId is an IN parameter
   IF pObjectId IS NULL THEN
       pObjectId := newid();
   END IF;

 I understand it may be better to use a different technique here, but we have
 a substantial amount of SPL (40k lines) and if we could make the IN
 parameters mutable, it would make my day.

 Looking at the history of the code, it looks like this has been the way it
 has been since the beginning. Tom added a comment in 1995 asking why we
 force the IN parameters to constant, but the "why?" part of the comment was
 removed in a later change to support OUT and INOUT parameters.

 I've attached a patch that would change this behavior. Also, the test2(int)
 function below works with the patch, but would fail to compile without. I
 also checked to make sure the parameter wasn't passed by reference and it is
 not. The test at the bottom returns 't' meaning test2(int) did not change
 the a variable in test1().

 CREATE OR REPLACE FUNCTION test1() RETURNS INT AS $$
 DECLARE
    a INT;
 BEGIN
    a := 1;
    PERFORM test2(a);
    RETURN a;
 END
 $$ LANGUAGE plpgsql;

 CREATE OR REPLACE FUNCTION test2(a INT) RETURNS VOID AS $$
 BEGIN
    a := 2;
 END
 $$ LANGUAGE plpgsql;

 SELECT test1() = 1;

 If this change would be acceptable, I'll proceed in finishing the patch by
 updating docs and adding regression tests.

Wow.  I can imagine about a thousand ways that this could break
existing applications.  I would not be prepared to bet a dollar that
anything I've written would survive the impact unscathed.

I have a feeling someone else is going to shoot you out of the water
completely, but all I'll say is it would definitely need to be
OPTIONAL.

...Robert



Re: [HACKERS] multi-threaded pgbench

2009-07-29 Thread Greg Smith
This patch is wrapping up nicely.  I re-tested against the updated 
pgbench-mt_20090724 and now I get similar results whether or not 
--enable-thread-safety is enabled on Linux, so that problem is gone. 
Josh's successful Windows tests, along with the bug he found and attached
a patch for, are also encouraging.


I re-ran my performance tests with the same basic setup (16 core system, 
database scale=10, read-only tests) but this time increased shared_buffers 
to 256MB just to see if results popped up significantly (they didn't).


Here's a comparison of the original pgbench select-only TPS against the 
new version using 1 thread:


            clients
threads     16      32      64      128
none        91763   69707   68465   63730
1           90797   70117   66324   63626

I ran these a few times and those are basically the same result.  If
there's a regression from using 1 thread instead of 1 process, which I
thought I was seeing at one point with j=1/c=128, closer investigation
shows it would have to be much smaller than the run-to-run variation of
pgbench, because it vanished when I collected many runs of data.


Running the new pgbench with thread safety turned on:

clients
threads 16  32  64  128
1   89503   67849   67120   63499
2   97883   91888   87556   84430
4   95319   96409   90445   83569
8   96002   95411   88988   82383
16  103798  95056   87701   82253
32  X   95869   88253   82253

Running it without thread safety turned on so it uses processes instead 
(this is the case I couldn't report on before):


clients
threads 16  32  64  128
1   89706   68702   64545   62770
2   99224   91677   88812   82442
4   96124   96552   90245   83311
8   97066   96000   89149   83266
16  103276  96088   88276   82652
32  X   97405   90082   83672

Those two tables are also identical relative to the run-to-run pgbench
noise.


This looks ready for a committer review to me; I'm happy that the patch
performs as expected, and it seems to work across two platforms.


To step back for a second, I'm testing a fairly optimistic situation--the
standard RHEL 2.6.18 kernel, which doesn't have any major issues here--and
I see a decent-sized speedup (30%) in the worst case.  I've reported
before that running pgbench on newer Linux kernels (>=2.6.23) is horribly
slow, and sure enough the original results kicking off this thread showed
the same thing:  only 11600 TPS on a modern 8 core system.  That's less
than 1/4 of what that server is capable of, and this patch allows working
around that issue nicely.  pgbench not scaling up is really a much worse
problem than my test results suggest.
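
For anyone reproducing these numbers: the patch splits the client
connections across worker threads (or processes, when built without
--enable-thread-safety) via a new thread-count option, spelled -j in the
version under review.  A run like the ones tabulated above would be
invoked roughly as follows (a sketch; exact flag spelling depends on the
final patch):

```shell
# Select-only (read-only) test: 64 client connections divided over
# 8 worker threads, running for 60 seconds against database "pgbench".
pgbench -S -c 64 -j 8 -T 60 pgbench
```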


--
* Greg Smith <gsm...@gregsmith.com> http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] improvements for dict_xsyn extended synonym dictionary - RRR

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 6:59 PM, Andres Freund <and...@anarazel.de> wrote:
 Looks nice. The only small gripe I have is that the patch adds trailing
 whitespaces at a lot of places...

 Except maybe that I see no need for changes anymore...

I have fixed this for Sergey in the attached version using git apply
--whitespace=fix.  (For those who may be using git to develop
patches, I highly recommend git diff --check to catch these types of
issues before submitting.)

I will mark this Ready for Committer.

...Robert
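
For context, a usage sketch of the options this patch adds (the option
names are taken from the patch below; xsyn_template and the xsyn_sample
rules file are the stock contrib ones, and the dictionary name is
illustrative):

```sql
CREATE TEXT SEARCH DICTIONARY my_xsyn (
    TEMPLATE = xsyn_template,
    RULES = 'xsyn_sample',
    KEEPORIG = true,       -- existing option: keep the original word
    MATCHORIG = true,      -- new: match on the first word of each line
    MATCHSYNONYMS = true,  -- new: also match on the synonyms themselves
    KEEPSYNONYMS = true    -- new: emit the synonyms in the output
);
```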
*** a/contrib/dict_xsyn/dict_xsyn.c
--- b/contrib/dict_xsyn/dict_xsyn.c
***
*** 26,31  typedef struct
--- 26,32 
  	char	   *key;			/* Word */
  	char	   *value;			/* Unparsed list of synonyms, including the
  									 * word itself */
+ 	int			pos;			/* Position of key word in original string */
  } Syn;
  
  typedef struct
***
*** 33,39  typedef struct
--- 34,44 
  	int			len;
  	Syn		   *syn;
  
+ 	bool		matchorig;
  	bool		keeporig;
+ 	bool		matchsynonyms;
+ 	bool		keepsynonyms;
+ 
  } DictSyn;
  
  
***
*** 88,93  read_dictionary(DictSyn *d, char *filename)
--- 93,99 
  	{
  		char	   *value;
  		char	   *key;
+ 		char   *pos;
  		char	   *end = NULL;
  
  		if (*line == '\0')
***
*** 96,121  read_dictionary(DictSyn *d, char *filename)
  		value = lowerstr(line);
  		pfree(line);
  
! 		key = find_word(value, &end);
! 		if (!key)
! 		{
! 			pfree(value);
! 			continue;
! 		}

! 		if (cur == d->len)
  		{
! 			d->len = (d->len > 0) ? 2 * d->len : 16;
! 			if (d->syn)
! 				d->syn = (Syn *) repalloc(d->syn, sizeof(Syn) * d->len);
! 			else
! 				d->syn = (Syn *) palloc(sizeof(Syn) * d->len);
! 		}

! 		d->syn[cur].key = pnstrdup(key, end - key);
! 		d->syn[cur].value = value;

! 		cur++;
  	}
  
  	tsearch_readline_end(trst);
--- 102,140 
  		value = lowerstr(line);
  		pfree(line);
  
! 		pos = value;

! 		while ((key = find_word(pos, &end)) != NULL)
  		{
! 			if (cur == d->len)
! 			{
! 				d->len = (d->len > 0) ? 2 * d->len : 16;
! 				if (d->syn)
! 					d->syn = (Syn *) repalloc(d->syn, sizeof(Syn) * d->len);
! 				else
! 					d->syn = (Syn *) palloc(sizeof(Syn) * d->len);
! 			}
! 
! 			/* Read first word only if we will match it */
! 			if (pos != value || d->matchorig)
! 			{
! 				d->syn[cur].key = pnstrdup(key, end - key);
! 				d->syn[cur].value = pstrdup(value);
! 				d->syn[cur].pos = key - value;
! 
! 				cur++;
! 			}
! 
! 			pos = end;

! 			/* Don't read synonyms if we do not match them */
! 			if (!d->matchsynonyms)
! 			{
! 				break;
! 			}
! 		}

! 		pfree(value);
  	}
  
  	tsearch_readline_end(trst);
***
*** 133,155  dxsyn_init(PG_FUNCTION_ARGS)
  	List	   *dictoptions = (List *) PG_GETARG_POINTER(0);
  	DictSyn    *d;
  	ListCell   *l;

  	d = (DictSyn *) palloc0(sizeof(DictSyn));
  	d->len = 0;
  	d->syn = NULL;
  	d->keeporig = true;
  
  	foreach(l, dictoptions)
  	{
  		DefElem    *defel = (DefElem *) lfirst(l);

! 		if (pg_strcasecmp(defel->defname, "KEEPORIG") == 0)
  		{
  			d->keeporig = defGetBoolean(defel);
  		}
  		else if (pg_strcasecmp(defel->defname, "RULES") == 0)
  		{
! 			read_dictionary(d, defGetString(defel));
  		}
  		else
  		{
--- 152,191 
  	List	   *dictoptions = (List *) PG_GETARG_POINTER(0);
  	DictSyn    *d;
  	ListCell   *l;
+ 	char	   *filename = NULL;

  	d = (DictSyn *) palloc0(sizeof(DictSyn));
  	d->len = 0;
  	d->syn = NULL;
+ 	d->matchorig = true;
  	d->keeporig = true;
+ 	d->matchsynonyms = false;
+ 	d->keepsynonyms = true;
  
  	foreach(l, dictoptions)
  	{
  		DefElem    *defel = (DefElem *) lfirst(l);

! 		if (pg_strcasecmp(defel->defname, "MATCHORIG") == 0)
! 		{
! 			d->matchorig = defGetBoolean(defel);
! 		}
! 		else if (pg_strcasecmp(defel->defname, "KEEPORIG") == 0)
  		{
  			d->keeporig = defGetBoolean(defel);
  		}
+ 		else if (pg_strcasecmp(defel->defname, "MATCHSYNONYMS") == 0)
+ 		{
+ 			d->matchsynonyms = defGetBoolean(defel);
+ 		}
+ 		else if (pg_strcasecmp(defel->defname, "KEEPSYNONYMS") == 0)
+ 		{
+ 			d->keepsynonyms = defGetBoolean(defel);
+ 		}
  		else if (pg_strcasecmp(defel->defname, "RULES") == 0)
  		{
! 			/* we can't read the rules before parsing all options! */
! 			filename = pstrdup(defGetString(defel));
  		}
  		else
  		{
***
*** 160,165  dxsyn_init(PG_FUNCTION_ARGS)
--- 196,207 
  		}
  	}
  
+ 	if(filename)
+ 	{
+ 		read_dictionary(d, filename);
+ 		pfree(filename);
+ 	}
+ 
  	PG_RETURN_POINTER(d);
  }
  
***
*** 198,204  dxsyn_lexize(PG_FUNCTION_ARGS)
  		int			value_length = strlen(value);
  		char	   *pos = value;
  		int			nsyns = 0;
- 		bool		is_first = true;
  
  		res = palloc(0);
  
--- 240,245 
***
*** 214,221  dxsyn_lexize(PG_FUNCTION_ARGS)
  			res = repalloc(res, sizeof(TSLexeme) * (nsyns + 2));
  			res[nsyns].lexeme = NULL;
  
! 			/* first word is added to result only if KEEPORIG flag is set */
! 			if (d->keeporig || !is_first)
  			{
  

Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Steve Prentice

On Jul 29, 2009, at 5:26 PM, Robert Haas wrote:

Wow.  I can imagine about a thousand ways that this could break
existing applications.  I would not be prepared to bet a dollar that
anything I've written would survive the impact unscathed.

I have a feeling someone else is going to shoot you out of the water
completely, but all I'll say is it would definitely need to be
OPTIONAL.


I guess I don't get how it would break existing applications. All of  
the regression tests pass. The parameters are passed as a copy, so it  
can't modify your variable that you pass in. Perhaps I'm missing  
something--can you elaborate on how this would break existing  
applications?


-Steve



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Steve Prentice

On Jul 29, 2009, at 5:23 PM, Andrew Dunstan wrote:

First reaction is that it would mean we could never pass them by  
reference. I know PLPerl uses in effect pass by copy, but what does  
PLPgsql do?


Isn't this effectively what we accomplish with an IN/OUT parameter?
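
For example, a minimal sketch of the INOUT route (stock plpgsql;
names and the default value are illustrative):

```sql
-- An INOUT parameter is assignable in the function body, and its
-- final value is returned to the caller.
CREATE OR REPLACE FUNCTION set_default_id(INOUT pObjectId INT) AS $$
BEGIN
    IF pObjectId IS NULL THEN
        pObjectId := 0;  -- illustrative default
    END IF;
END
$$ LANGUAGE plpgsql;
```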




Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Tom Lane
Robert Haas <robertmh...@gmail.com> writes:
 On Wed, Jul 29, 2009 at 7:55 PM, Steve Prentice <prent...@cisco.com> wrote:
 Is there a reason we force plpgsql IN parameters to constant?

 Wow.  I can imagine about a thousand ways that this could break
 existing applications.  I would not be prepared to bet a dollar that
 anything I've written would survive the impact unscathed.

Really?  AFAICS the only impact is that if you tried to assign to a
parameter variable, it would do it instead of throwing a compile-time
error.  It's hard to imagine that breaking any code that works now.

It's true that doing this might foreclose some implementation shortcuts
in future versions of plpgsql, but it's not going to be an issue in
anything that works remotely like the way that plpgsql variables work
now.

Also, if we think it's a good idea, why are we only forcing CONST
for scalar arguments and not composite arguments?  (The fact that
plpgsql doesn't even have a way to mark composites as CONST might
be the reason ;-), but it's surely not a real good reason.)  And
special trigger arguments like tg_name aren't marked CONST either,
for even less reason.

Now having said all that, I'm not really in favor of Steve's
proposal --- it seems like it mostly would be encouraging dubious
programming practices.  But it's hard to say that the arguments
against are more than theoretical/aesthetic ones.

Does anyone happen to know how it works in Oracle's PL/SQL?
I think that following their lead is usually the argument-settler
when it comes to plpgsql behavior.

regards, tom lane



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Tom Lane
Andrew Dunstan <and...@dunslane.net> writes:
 First reaction is that it would mean we could never pass them by 
 reference. I know PLPerl uses in effect pass by copy, but what does 
 PLPgsql do?

It's not really an issue, because plpgsql keeps track of whether
the current value of the variable belongs to it or not.  Look
at the freeval manipulations in pl_exec.c.

regards, tom lane



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 9:05 PM, Steve Prentice <prent...@cisco.com> wrote:
 On Jul 29, 2009, at 5:26 PM, Robert Haas wrote:

 Wow.  I can imagine about a thousand ways that this could break
 existing applications.  I would not be prepared to bet a dollar that
 anything I've written would survive the impact unscathed.

 I have a feeling someone else is going to shoot you out of the water
 completely, but all I'll say is it would definitely need to be
 OPTIONAL.

 I guess I don't get how it would break existing applications. All of the
 regression tests pass. The parameters are passed as a copy, so it can't
 modify your variable that you pass in. Perhaps I'm missing something--can
 you elaborate on how this would break existing applications?

Well, in my imagination, if you were proposing something completely
different, it would...  sorry for the noise.

...Robert



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Robert Haas
On Wed, Jul 29, 2009 at 9:08 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 Robert Haas <robertmh...@gmail.com> writes:
 On Wed, Jul 29, 2009 at 7:55 PM, Steve Prentice <prent...@cisco.com> wrote:
 Is there a reason we force plpgsql IN parameters to constant?

 Wow.  I can imagine about a thousand ways that this could break
 existing applications.  I would not be prepared to bet a dollar that
 anything I've written would survive the impact unscathed.

 Really?  AFAICS the only impact is that if you tried to assign to a
 parameter variable, it would do it instead of throwing a compile-time
 error.  It's hard to imagine that breaking any code that works now.

Yeah, I'm wrong.  Sorry.

 It's true that doing this might foreclose some implementation shortcuts
 in future versions of plpgsql, but it's not going to be an issue in
 anything that works remotely like the way that plpgsql variables work
 now.

 Also, if we think it's a good idea, why are we only forcing CONST
 for scalar arguments and not composite arguments?  (The fact that
 plpgsql doesn't even have a way to mark composites as CONST might
 be the reason ;-), but it's surely not a real good reason.)  And
 special trigger arguments like tg_name aren't marked CONST either,
 for even less reason.

 Now having said all that, I'm not really in favor of Steve's
 proposal --- it seems like it mostly would be encouraging dubious
 programming practices.  But it's hard to say that the arguments
 against are more than theoretical/aesthetic ones.

 Does anyone happen to know how it works in Oracle's PL/SQL?
 I think that following their lead is usually the argument-settler
 when it comes to plpgsql behavior.

Hmm, well if I understand this correctly (now), it's similar to allowing:

void
test2(int a)
{
a = 2;
return;
}

...which is a fairly common programming practice that I don't think is
particularly considered bad style, or at least certainly not by
everyone.

...Robert



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Steve Prentice

On Jul 29, 2009, at 4:55 PM, Steve Prentice wrote:


Tom added a comment in 1995


For the record, I meant 2005.

-Steve



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Tom Lane
Robert Haas <robertmh...@gmail.com> writes:
 Hmm, well if I understand this correctly (now), it's similar to allowing:

 void
 test2(int a)
 {
 a = 2;
 return;
 }

 ...which is a fairly common programming practice that I don't think is
 particularly considered bad style, or at least certainly not by
 everyone.

Right, you could only change the local copy of the parameter value.
It's certainly not demonstrably horrible ...

regards, tom lane



Re: [HACKERS] RFD: Don't force plpgsql IN parameters to constant

2009-07-29 Thread Tom Lane
Steve Prentice <prent...@cisco.com> writes:
 On Jul 29, 2009, at 4:55 PM, Steve Prentice wrote:
 Tom added a comment in 1995

 For the record, I meant 2005.

I was intending to say something like I've been around this project
a long time, but not THAT long ...

regards, tom lane



Re: [HACKERS] question about the _SPI_save_plan() and plan cache

2009-07-29 Thread Tom Lane
Tao Ma <feng_e...@163.com> writes:
 I knew that the delete_function() will reclaim the memory context
 allocated for the function. But I did not find any code for removing
 the plan(SPI plan memory context), saved by calling _SPI_save_plan.

Hmmm ... good point, those probably won't get cleaned up.  In an
environment where functions are getting changed constantly, that
might be worth doing.

regards, tom lane



Re: [HACKERS] Patch for 8.5, transformationHook

2009-07-29 Thread Robert Haas
On Sun, Jul 26, 2009 at 9:29 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
 Hello

 The new patch adds a new contrib module, transformations, with three
 modules: anotation, decode and json.

 These modules are ported from my older work.

 Before applying this patch, please apply the named-fixed patch too. The
 hook doesn't need it, but the anotation and json modules depend on it.

These are pretty good examples, but the whole thing still feels a bit
grotty to me.  The set of syntax transformations that can be performed
with a hook of this type is extremely limited - in particular, it's
the set of things where the parser thinks it's valid and that the
structure is reasonably similar to what you have in mind, but the
meaning is somewhat different.  The fact that two of your three
examples require your named and mixed parameters patch seems to me to
be evidence of that.

The JSON transformation provides functionality which is very similar
to what we also offer for XML.  I sort of think we ought to just
provide that, rather than making it an add-on.  I have found it to be
a tremendously attractive alternative to XML.

With regard to the annotation transformation, if we're about to
diverge from SQL:201x, do we want to rethink our opposition to foo(bar
=> baz)?  Just asking.

I'm not dead set against this patch.  But I'm not really sold either.
I think we need some more input from other people.

...Robert



Re: [HACKERS] Patch for 8.5, transformationHook

2009-07-29 Thread Pavel Stehule
Hello

2009/7/30 Robert Haas <robertmh...@gmail.com>:
 On Sun, Jul 26, 2009 at 9:29 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
 Hello

 The new patch adds a new contrib module, transformations, with three
 modules: anotation, decode and json.

 These modules are ported from my older work.

 Before applying this patch, please apply the named-fixed patch too. The
 hook doesn't need it, but the anotation and json modules depend on it.

 These are pretty good examples, but the whole thing still feels a bit
 grotty to me.  The set of syntax transformations that can be performed
 with a hook of this type is extremely limited - in particular, it's
 the set of things where the parser thinks it's valid and that the
 structure is reasonably similar to what you have in mind, but the
 meaning is somewhat different.  The fact that two of your three
 examples require your named and mixed parameters patch seems to me to
 be evidence of that.


I see the main use of the hook as an open door to functionality like
decode and json; anotation is a bit of a corner use case. We don't need
to change the syntax or the rules in the parser, but I do need to get
some information for functions at the parse stage (as with JSON), or to
replace the standard coercion rules (as with decode). A hook is the
simplest and most general technique I found for this. I think there are
other techniques, but they need a more invasive patch and are not as
general, and generality is the value of any hooking.

I don't think there will be any real extensible parser based on bison in
the near or far future. I think it is theoretically possible, but nobody
is working on it. What's more, even with an extensible parser we would
still need the transformation hook, because we need the change in the
transformation stage for decode and json.

 The JSON transformation provides functionality which is very similar
 to what we also offer for XML.  I sort of think we ought to just
 provide that, rather than making it an add-on.  I have found it to be
 a tremendously attractive alternative to XML.

JSON is only one use case (there could be output to any format), and I
agree that this should be in core. But integrating a similar function
into core takes one or two years. The hook allows doing these things
faster and from an external library. It's a rather lighter process to
put your project on pgfoundry than into PostgreSQL core.

Pavel

 With regard to the annotation transformation, if we're about to
 diverge from SQL:201x, do we want to rethink our opposition to foo(bar
 => baz)?  Just asking.

 I'm not dead set against this patch.  But I'm not really sold either.
 I think we need some more input from other people.



 ...Robert

