date:20041030

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Tatsuo Ishii

> On Sun, 31 Oct 2004, Tatsuo Ishii wrote:
> 
> > > and now we could add functions that work with this charset
> > > 
> > >   CREATE FUNCTION bar (x VARCHAR(255) CHARACTER SET foo) 
> > > 
> > > What we are saying is that we don't want to be able to do this?
> > 
> > Not sure we want to add above. Is it something defined in the
> > standard?
> 
> The syntax in sql2003 do allow it. The exact semantics is difficult to 
> get a clear picture of from the spec (as always).

I wonder what is the intention to allow such that syntax. It seems
it's just useless since we could make a function bar() which accepts
any charsets.

> I thought this question was the whole argument. We can't have something 
> like the above with the pg overloading since then the resolving process 
> will be too hard according to Tom.
> 
> > 2) make shared tables such as pg_database and pg_shadow can handle
> >multiple charsets. this is necessary because database names and
> >user names could be repsented in different charsets
> 
> Shouldn't we just define the charset for user names and database names?  
> Either one fixed or one that's set during initdb.  You don't mean that we
> want different user numes to be defined using different charsets?

What I have in my mind was:

CREATE DATABASE foo CHARSET bar;

If we allow this, we need 2).

> The rest of the points looks good to me. The main problem is still what to
> do with the function definitions as above. Is it something we want or not?
> 
> Is the charset something that makes two text types different or not?

I understand your point. If we are going to allow that, probably we
have to regard each text type distinct which has different charset or
collation each other. This will lead to
number-of-types-explosion-problem which I really want to avoid...
--
Tatsuo Ishii

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] not null partial index?

2004-10-30 Thread Tom Lane

Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> Sorry for the confusing and foolish question. However still I wonder
> why my expression(functional) index does not work.

You could likely have gotten it to match to a query like
SELECT ... WHERE (bid is not null) = true;
which would have the proper form of "(indexed value) = constant".

Whether the planner would have picked an indexscan for that without
coercion is another issue.  IIRC 7.4 does not keep statistics for
functional indexes and so it is unlikely to get the rowcount estimates
right for a query expressed this way.  (Looking back at your example,
you don't seem to have run an ANALYZE anyway :-()

A partial index is likely to work better for this problem on both
counts: you can write just "WHERE bid is not null", and the normal
stats will (I think) be able to estimate that well.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Dennis Bjorklund

On Sun, 31 Oct 2004, Tatsuo Ishii wrote:

> > and now we could add functions that work with this charset
> > 
> >   CREATE FUNCTION bar (x VARCHAR(255) CHARACTER SET foo) 
> > 
> > What we are saying is that we don't want to be able to do this?
> 
> Not sure we want to add above. Is it something defined in the
> standard?

The syntax in sql2003 do allow it. The exact semantics is difficult to 
get a clear picture of from the spec (as always).

I thought this question was the whole argument. We can't have something 
like the above with the pg overloading since then the resolving process 
will be too hard according to Tom.

> 2) make shared tables such as pg_database and pg_shadow can handle
>multiple charsets. this is necessary because database names and
>user names could be repsented in different charsets

Shouldn't we just define the charset for user names and database names?  
Either one fixed or one that's set during initdb.  You don't mean that we
want different user numes to be defined using different charsets?

The rest of the points looks good to me. The main problem is still what to
do with the function definitions as above. Is it something we want or not?

Is the charset something that makes two text types different or not?

-- 
/Dennis Björklund

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

Re: [HACKERS] not null partial index?

2004-10-30 Thread Tatsuo Ishii

> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > It seems partial indexes with not null condition do not work:
> 
> What you created wasn't a partial index, it was a functional index.
> Try something like
>   create index nonnullindex on accounts(bid) where bid is not null;

Sorry for the confusing and foolish question. However still I wonder
why my expression(functional) index does not work.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Tatsuo Ishii

> > For me that seems to be the right way. I'm not sure if two oids are
> > the right solution but we need to store extra info in varlena
> > structure to support charset/collation anyway. In my understanding
> > TOAST has already done in similar way.
> > 
> > Other than charset/collation we also need coercibility info to meet
> > with the SQL standard. This could only be represented in each text,
> > not by function parameters.
> 
> Arn't we limiting ourself in how we can use charsets when we remove it
> from the type.

? That is a news to me. In my understanding nobody did remove charsets
from the type. The charset info has not been in text type since
PostgreSQL was born.

> The reason why I started to look at the function parameters
> is because in the standard one can do this:
> 
>   CREATE CHARACTER SET foo GET Latin1;

Surely we want add this to PostgreSQL.

> and now we could add functions that work with this charset
> 
>   CREATE FUNCTION bar (x VARCHAR(255) CHARACTER SET foo) 
> 
> What we are saying is that we don't want to be able to do this?

Not sure we want to add above. Is it something defined in the
standard?

Though I think we want to add something like:

SELECT bar(x CHARACTER SET foo COLLATE buz) 

>  I just 
> want to understand all the implications of simplifying the types.
> 
> Same thing if the user wants to create new collations using CREATE
> COLLATION. How can we override functions for these new charsets and
> collations if all we can define are functions like foo(x VARCHAR)?
> 
> Maybe one wants the information in both place.

I think we could make our text type richer so that it could include
charset, collate info and so on.

At this point my plan to implement CREATE CHARSET/COLLATE includes:

1) add charset/collation info to pg_attribute
2) make shared tables such as pg_database and pg_shadow can handle
   multiple charsets. this is necessary because database names and
   user names could be repsented in different charsets
3) create new system catalogs to keep defined charsets and collations
4) make text type on memory representation richer so that it can keep
   chaset/collate/coercibility etc. info
5) enhance pg_am and friends
6) create our own locale database
--
Tatsuo Ishii

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] not null partial index?

2004-10-30 Thread Tom Lane

Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> It seems partial indexes with not null condition do not work:

What you created wasn't a partial index, it was a functional index.
Try something like
create index nonnullindex on accounts(bid) where bid is not null;

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Dennis Bjorklund

On Sun, 31 Oct 2004, Tatsuo Ishii wrote:

> > So you want to expand every string with 8 bytes (two oid's)?
> 
> For me that seems to be the right way. I'm not sure if two oids are
> the right solution but we need to store extra info in varlena
> structure to support charset/collation anyway. In my understanding
> TOAST has already done in similar way.
> 
> Other than charset/collation we also need coercibility info to meet
> with the SQL standard. This could only be represented in each text,
> not by function parameters.

Arn't we limiting ourself in how we can use charsets when we remove it
from the type. The reason why I started to look at the function parameters
is because in the standard one can do this:

  CREATE CHARACTER SET foo GET Latin1;

and now we could add functions that work with this charset

  CREATE FUNCTION bar (x VARCHAR(255) CHARACTER SET foo) 

What we are saying is that we don't want to be able to do this? I just 
want to understand all the implications of simplifying the types.

Same thing if the user wants to create new collations using CREATE
COLLATION. How can we override functions for these new charsets and
collations if all we can define are functions like foo(x VARCHAR)?

Maybe one wants the information in both place.

-- 
/Dennis Björklund

---(end of broadcast)---
TIP 8: explain analyze is your friend

[HACKERS] not null partial index?

2004-10-30 Thread Tatsuo Ishii

It seems partial indexes with not null condition do not work:
I did some testings with pgbench database and I observe:

1) statistics information is slghtly incorrect

2) partial index is not used

Am I missing something?
--
Tatsuo Ishii

$ psql test
Welcome to psql 7.4.6, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help on internal slash commands
   \g or terminate with semicolon to execute query
   \q to quit

Pager usage is off.
test=# \d accounts
   Table "public.accounts"
  Column  | Type  | Modifiers 
--+---+---
 aid  | integer   | not null
 bid  | integer   | 
 abalance | integer   | 
 filler   | character(84) | 
Indexes:
"accounts_pkey" primary key, btree (aid)

test=# update accounts set bid = NULL;
UPDATE 10
test=# update accounts set bid = 1 where aid = 1;
UPDATE 1
test=# create index nonnullindex on accounts((bid is not null));
CREATE INDEX
test=# vacuum analyze accounts;
VACUUM
test=# explain select * from accounts where bid is not null;
QUERY PLAN
--
 Seq Scan on accounts  (cost=0.00..4227.00 rows=10 width=100)
   Filter: (bid IS NOT NULL)
(2 rows)

test=# vacuum full accounts;
VACUUM
test=# explain select * from accounts where bid is not null;
QUERY PLAN
--
 Seq Scan on accounts  (cost=0.00..2588.00 rows=10 width=100)
   Filter: (bid IS NOT NULL)
(2 rows)

test=# select count(*) from accounts where bid is not null;
 count 
---
 1
(1 row)

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Tatsuo Ishii

> On Sat, 30 Oct 2004, Tom Lane wrote:
> 
> > > The alternative is storing the charset and collation inside each string.  
> > > That seems like a too big price to pay, it belong in the type.
> > 
> > No, the alternative you're proposing is too big a price to pay.
> 
> So you want to expand every string with 8 bytes (two oid's)?

For me that seems to be the right way. I'm not sure if two oids are
the right solution but we need to store extra info in varlena
structure to support charset/collation anyway. In my understanding
TOAST has already done in similar way.

Other than charset/collation we also need coercibility info to meet
with the SQL standard. This could only be represented in each text,
not by function parameters.

> Or special case the storing in tables so that you only store the text and
> not the charset/collation info?

On disk representation we don't need charset/collation since they will
be stored in probably pg_attribute. Coercibility info is not necessary
on disk too.
--
Tatsuo Ishii

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] array_to_column function

2004-10-30 Thread Markus Bertheau

Ð ÐÐÑ, 30.10.2004, Ð 21:54, David Fetter ÐÐÑÐÑ:
> Kind people,

> CREATE OR REPLACE FUNCTION array_to_column (ANYARRAY)

You want to see that function distributed with PostgreSQL? It would
probably have to be implemented in C then, because PL/pgSQL-support has
to be explicitly "enabled" for every database.

-- 
Markus Bertheau <[EMAIL PROTECTED]>


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Dennis Bjorklund

On Sat, 30 Oct 2004, Tom Lane wrote:

> > The alternative is storing the charset and collation inside each string.  
> > That seems like a too big price to pay, it belong in the type.
> 
> No, the alternative you're proposing is too big a price to pay.

So you want to expand every string with 8 bytes (two oid's)?

Or special case the storing in tables so that you only store the text and
not the charset/collation info?

-- 
/Dennis Björklund

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Tom Lane

Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> On Sat, 30 Oct 2004, Tom Lane wrote:
>> I'm worried about the fact that instead of, say, one length(text)
>> function, we would now have to have a different one for every
>> characterset/collation.

> For some functions one really want different ones depending on the
> charset. For example the length function, then we will need to calculate
> the length differently for each charset. We can never have one length
> function that works for every possible charset.

Strange ... we seem to have one now ...

> We could have one pg function that do N different things inside
> depending on the charset, but that's not really a simplification.

Yes it is.  It's about the same amount of actual code either way, and it
gets the job done without introducing huge headaches for the overloaded-
function resolution rules.  That stuff is complicated and fragile enough
without introducing issues like whether to match optional subtype
information.

Basically, you're talking about a substantial increase in complexity in
the type resolution rules, and breaking a great number of internal APIs
(because they don't pass the information that would need to be passed),
in order to have a slightly different layout of the code that will
actually implement character-set-specific operations.  I can't see that
that's a sensible use of development effort.

> The alternative is storing the charset and collation inside each string.  
> That seems like a too big price to pay, it belong in the type.

No, the alternative you're proposing is too big a price to pay.  It's
too much work for too little gain, and it's also too inflexible (I think
there will be cases where we don't want to nail down the character set
at datatype-resolution time).

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Dennis Bjorklund

On Sat, 30 Oct 2004, Tom Lane wrote:

> > Are you worried about performance or is it the smaller change that you
> > want?
> 
> I'm worried about the fact that instead of, say, one length(text)
> function, we would now have to have a different one for every
> characterset/collation.

This is not about how the parameter information is stored, but let's 
discuss that anyway. It's important issues.

I was hoping that we could implement functions where one didn't have to 
specify the charset and collation (but could if we want to).

For some functions one really want different ones depending on the
charset. For example the length function, then we will need to calculate
the length differently for each charset. We can never have one length
function that works for every possible charset. We could have one pg
function that do N different things inside depending on the charset, but
that's not really a simplification.

For functions where one have not specified the charset of an argument then
we need to be able to pass on that type information to where ever we use
that argument. Variables already have a type and if we have a (pseudo
code) function like

foo (a varchar) returns int
{
  select length(a);
}

and call it with

foo ('foo' charset latin1) 

then we need to make sure that variable a inside the function body of foo
get the type from the caller and then the function call to length(a) will
work out since it would select the length function for latin1. I think it
should work but an implementation is the only way to know.

Every string do in the end need to know what charset and what collation it
is in. Otherwise it can not be used for anything, not even to compare it
with another string.

I could even imagine to have different functions for each
charset/collation. It's not that many functions built in that are affected
and not all of them need to work with every collation. The user just need
to call them with the correct one. I don't expect any functions like

  foo (a varchar collation sv_SE,
   b varchar collation en_US)

or any other combination of a and b. If any then a and be will be the same
type. So there would not be arbitrary many combinations (but still a lot).

The alternative is storing the charset and collation inside each string.  
That seems like a too big price to pay, it belong in the type.

> Not to mention one for every possible N in varchar(N).

This doesn't matter since one can always implement functions to take 
varchar arguments without any limit and then any shorter string can be 
implictly casted up to that type. Or one can treat the length exactly like 
the charset above.

Of course you do not want one length function for each length.

-- 
/Dennis Björklund

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] 8.0b4: COMMIT outside of a transaction echoes ROLLBACK

2004-10-30 Thread Tom Lane

Alvaro Herrera <[EMAIL PROTECTED]> writes:
> On Wed, Oct 27, 2004 at 09:29:21PM -0400, Tom Lane wrote:
>> Wouldn't it be better to just stay in TBLOCK_STARTED state, as if the
>> COMMIT were just some random utility command?

> It's the same thing, because CommitTransactionCommand acts identically
> either way.  I changed it anyway because it seems simpler.

Patch applied.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

[HACKERS] Problems using pgxs on Win32

2004-10-30 Thread Thomas Hallgren

I'm trying to change the Makefile system for PL/Java so that it uses 
PGXS instead of compiling using a complete PostgreSQL source tree. As it 
turns out, the directory "include/port/win32" is not present in the 
PostgreSQL binary installation. Without it, it's not possible to compile 
on win32.

Do I need some special configuration in order to get the missing pieces?
Regards,
Thomas Hallgren
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

[HACKERS] array_to_column function

2004-10-30 Thread David Fetter

Kind people,

Here's something I came up with, having accidentally discovered the
ARRAY() constructor (BTW, I think at least some pointer to it should
be in the array section of functions & operators).

CREATE OR REPLACE FUNCTION array_to_column (ANYARRAY)
RETURNS SETOF ANYELEMENT
IMMUTABLE
LANGUAGE plpgsql
AS $$
BEGIN
IF (position('][' IN array_dims($1)) <> 0)
THEN
RAISE EXCEPTION 'Only 1-dimensional arrays are allowed!';
END IF;
FOR i IN array_lower($1, 1) .. array_upper($1, 1)
LOOP
RETURN NEXT $1[i];
END LOOP;
RETURN;
END;
$$;

Thanks to Markus Bertheau aka ska-fan for help with removing an
unneeded regex compare and with spelling. :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Signature change for SPI_cursor_open

2004-10-30 Thread Tom Lane

Thomas Hallgren <[EMAIL PROTECTED]> writes:
> Ok, now I understand. Thanks for the explanation. I guess that if 
> read_only is set to true and an attempt is made to execute a plan that 
> has side effect, that will cause an ERROR?

Right, it'll bounce anything except a SELECT query.  (This test is not
watertight; consider SELECT nextval() ... but on the other hand that's a
non-transactional operation that does not care about snapshots anyway.)

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend

Re: [HACKERS] Signature change for SPI_cursor_open

2004-10-30 Thread Thomas Hallgren

Tom Lane wrote:
> You should set the flag if and only if you are executing a pl/java
> function that has a provolatile setting of "stable" or "immutable".
> The new rule is that only functions declared "volatile" are allowed
> to have side effects on the database.  See pghackers discussions in
> early September about updating snapshots, doing CommandCounterIncrement,
> and so forth within functions.
>
Ok, now I understand. Thanks for the explanation. I guess that if 
read_only is set to true and an attempt is made to execute a plan that 
has side effect, that will cause an ERROR?

Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

Re: [HACKERS] 8.0b4: COMMIT outside of a transaction echoes ROLLBACK

2004-10-30 Thread Alvaro Herrera

On Wed, Oct 27, 2004 at 09:29:21PM -0400, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > On Wed, Oct 27, 2004 at 04:21:53PM -0400, Tom Lane wrote:
> >> On the other hand, it's also a pretty minor issue, and if it turns out
> >> to require a lot of code rejiggering to make it do that, I'd not think
> >> it worthwhile.
> 
> > Patch attached.  It passes the regression tests.  It shouldn't have
> > secondary effects, but please test.
> 
> Wouldn't it be better to just stay in TBLOCK_STARTED state, as if the
> COMMIT were just some random utility command?

It's the same thing, because CommitTransactionCommand acts identically
either way.  I changed it anyway because it seems simpler.

-- 
Alvaro Herrera ()
Licensee shall have no right to use the Licensed Software
for productive or commercial use. (Licencia de StarOffice 6.0 beta)
Index: src/backend/access/transam/xact.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.192
diff -c -r1.192 xact.c
*** src/backend/access/transam/xact.c   16 Oct 2004 18:57:22 -  1.192
--- src/backend/access/transam/xact.c   30 Oct 2004 18:18:16 -
***
*** 2537,2552 
break;

/*
!* here, the user issued COMMIT when not inside a transaction.
!* Issue a WARNING and go to abort state.  The upcoming call
!* to CommitTransactionCommand() will then put us back into
!* the default state.
 */
case TBLOCK_STARTED:
ereport(WARNING,
(errcode(ERRCODE_NO_ACTIVE_SQL_TRANSACTION),
 errmsg("there is no transaction in 
progress")));
!   s->blockState = TBLOCK_ABORT_PENDING;
break;

/* These cases are invalid. */
--- 2537,2552 
break;

/*
!* The user issued COMMIT when not inside a transaction.  
Issue a
!* WARNING, staying in TBLOCK_STARTED state.  The upcoming 
call to
!* CommitTransactionCommand() will then close the transaction 
and
!* put us back into the default state.
 */
case TBLOCK_STARTED:
ereport(WARNING,
(errcode(ERRCODE_NO_ACTIVE_SQL_TRANSACTION),
 errmsg("there is no transaction in 
progress")));
!   result = true;
break;

/* These cases are invalid. */

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Tom Lane

Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> On Sat, 30 Oct 2004, Tom Lane wrote:
>> Why would we not keep this information right in the string values?

> We could, but then we would need to parse it every time.

Huh?  We'd store it in the most compact pre-parsed form we could think
of; probably some sort of index into a list of supported character sets
and collations.  (This is not so different from representing timezones
inside timestamptz values, instead of using a global setting.)

> Are you worried about performance or is it the smaller change that you
> want?

I'm worried about the fact that instead of, say, one length(text)
function, we would now have to have a different one for every
characterset/collation.  Not to mention one for every possible N in
varchar(N).  Making those properties part of a function's parameter
signature is unworkable on its face --- it'll cause an exponential
explosion in the number of pg_proc entries, and probably make it
impossible to resolve a unique candidate function in many situations.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Dennis Bjorklund

On Sat, 30 Oct 2004, Tom Lane wrote:

> Why would we not keep this information right in the string values?

We could, but then we would need to parse it every time. Storing it in a 
structured way seems like the database solution and at least as a user 
from the client side it makes sense.

Are you worried about performance or is it the smaller change that you
want?

I can't tell how much, if any, is gained in speed by having an array of
strings instead of another system table.

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [HACKERS] psql and schemas

2004-10-30 Thread Tom Lane

Neil Conway <[EMAIL PROTECTED]> writes:
> psql's slash commands for schemas seem a little weird to me.

The behaviors you mention were written at different times by different
people, and mostly have nothing to do with schemas per se.  I agree that
some more consistency would probably be good.  Do you have a specific
proposal?

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Tom Lane

Dennis Bjorklund <[EMAIL PROTECTED]> writes:
> Today we store the max length of a string in the typmod field, but that
> has to be extended so we also store the charset and the collation of the
> string.

Why would we not keep this information right in the string values?

> [ unworkable proposal snipped ]

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html

[HACKERS] Charset/collate support and function parameters

2004-10-30 Thread Dennis Bjorklund

I have a long term plan to implement charset support in pg and now when I
have dropped the work on the timestamps, I've been looking into this
subject.

Today we store the max length of a string in the typmod field, but that
has to be extended so we also store the charset and the collation of the
string. That's simple but we need functions that take a string of a
specific charset and collation as an input and give that as a result.
Currently all information we have about function arguments are the OID of
the type. The function argument OID's are stored in an array in pg_proc
and I suggest that we instead of this array have a table pg_parameters
that is much like

http://www.postgresql.org/docs/7.4/static/infoschema-parameters.html

Notice how there are a lot of columns describing the dynamic parts of a
type, like character_maximum_length, character_set_name,
datetime_precision. We would of course not store the name of a charset,
but the oid (and so on).

Most of these are NULL since they only apply to a specific type, but
that's okay since NULL values are stored in a bitmap so the row width will
still be small.

Before one start to work on charset/collation support I think it would be
good of one can make the above change with just the old properties. As a
result we could write functions like

  foo (bar varchar(5))

We probably won't write functions like that very often. but as a first
step this is what we want.

Changing this is a lot of work, especially when one look in pg_proc.h and 
realize that one need to alter 3000 lines of

  DATA(insert OID = 2238 ( bit_and PGNSP PGUID 12 t f f f i 1 23 "23" _null_ 
aggregate_dummy - _null_));
  DESCR("bitwise-and integer aggregate");

into another form. The "23" should be pulled out and it would become a row 
in the pg_parameters table. Maybe some job for a script :-)

  Sometimes I wish that (at least part of) the bootstrap was in a higher 
  level and that the above was just normal sql statements:

  CREATE FUNCTION bit_and (  ) AS ...

In addition to the function arguments we also need to treat the function
return value in a similar way. The natural solution is to extend pg_proc
with many of the same columns as in the pg_parameters table. One could
also reuse the pg_parameters table and store a parameter with ordinal
number 0 to be the return value. But then there would be some columns that
do not apply to return values.

My current plan is

A) Implement a pg_parameters table and let everything else work
   as today. Also, the return values have to be taken care of in a 
   similar way.

B) Change function overloading so we can have functions with the same 
   name but different properties. For example for strings that means 
   different max lengths are used to resolve overloading.

C) Work on charset / collation.

All of these will probably not happen for 8.1 but I hope to finish A and
B. It all depends on how much trouble I run into and how much time I can
put into it. The function overload parts in pg are far from trivial, but I
will not worry about that until I get that far.

Any comments about this plan?

-- 
/Dennis Björklund


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [HACKERS] Signature change for SPI_cursor_open

2004-10-30 Thread Tom Lane

Thomas Hallgren <[EMAIL PROTECTED]> writes:
> Apparently the signature for function SPI_cursor_open got an additional 
> "read_only" parameter starting with 8.0.0beta3. The documentation states 
> that this flag should be "true for read-only execution".

> Is it correct to assume that this flag implies that some kind of 
> optimization has been added and that if this flag is "false", the 
> execution of the statement will take somewhat longer? I need some advice 
> in how to handle this.

You should set the flag if and only if you are executing a pl/java
function that has a provolatile setting of "stable" or "immutable".
The new rule is that only functions declared "volatile" are allowed
to have side effects on the database.  See pghackers discussions in
early September about updating snapshots, doing CommandCounterIncrement,
and so forth within functions.

For code examples see the commits in the pl languages here:

2004-09-13 16:07  tgl

* src/: include/executor/execdesc.h, include/executor/executor.h,
include/executor/spi.h, include/tcop/pquery.h,
include/tcop/utility.h, include/utils/tqual.h, pl/plperl/plperl.c,
pl/plperl/spi_internal.c, pl/plperl/spi_internal.h,
pl/plpgsql/src/pl_comp.c, pl/plpgsql/src/pl_exec.c,
pl/plpgsql/src/plpgsql.h, pl/plpython/plpython.c, pl/tcl/pltcl.c,
test/regress/expected/transactions.out,
test/regress/sql/transactions.sql: Redesign query-snapshot timing
so that volatile functions in READ COMMITTED mode see a fresh
snapshot for each command in the function, rather than using the
latest interactive command's snapshot.  Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and
IMMUTABLE functions, instead using the snapshot taken for the most
closely nested regular query.  (This behavior is only sane for
read-only functions, so the patch also enforces that such functions
contain only SELECT commands.) As per my proposal of 6-Sep-2004; I
note that I floated essentially the same proposal on 19-Jun-2002,
but that discussion tailed off without any action.  Since 8.0 seems
like the right place to be taking possibly nontrivial backwards
compatibility hits, let's get it done now.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

2004-10-30 Thread Tom Lane

Jan Wieck <[EMAIL PROTECTED]> writes:
> On 10/26/2004 1:53 AM, Tom Lane wrote:
>> Greg Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>>> Another issue is what we do with the effective_cache_size value once we
>>> have a number we trust.  We can't readily change the size of the ARC
>>> lists on the fly.
>> 
> Huh? I thought effective_cache_size was just used as an factor the cost
> estimation equation.
>> 
>> Today, that is true.  Jan is speculating about using it as a parameter
>> of the ARC cache management algorithm ... and that worries me.

> If we need another config option, it's not that we are running out of 
> possible names, is it?

No, the point is that the value is not very trustworthy at the moment.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

2004-10-30 Thread Jan Wieck

On 10/26/2004 1:53 AM, Tom Lane wrote:
Greg Stark <[EMAIL PROTECTED]> writes:
Tom Lane <[EMAIL PROTECTED]> writes:
Another issue is what we do with the effective_cache_size value once we
have a number we trust.  We can't readily change the size of the ARC
lists on the fly.

Huh? I thought effective_cache_size was just used as an factor the cost
estimation equation.
Today, that is true.  Jan is speculating about using it as a parameter
of the ARC cache management algorithm ... and that worries me.
If we need another config option, it's not that we are running out of 
possible names, is it?

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

[HACKERS] Signature change for SPI_cursor_open

2004-10-30 Thread Thomas Hallgren

Apparently the signature for function SPI_cursor_open got an additional 
"read_only" parameter starting with 8.0.0beta3. The documentation states 
that this flag should be "true for read-only execution".

I can't see any mention of this on the hackers list nor in the change logs.
Is it correct to assume that this flag implies that some kind of 
optimization has been added and that if this flag is "false", the 
execution of the statement will take somewhat longer? I need some advice 
in how to handle this.

In PL/Java, I use this method in my JDBC implementation and have little 
knowledge of what the intentions are at that point in time since I don't 
write the actual code that will use it.

A couple of questions:
Besides from the plain obvious like INSERT, UPDATE, and DELETE, any 
SELECT that calls a function may potentially change something right?

What are the implications of setting this flag to false always?
Why doesn't SPI_prepare figure this out with some help from attributes 
that defines the characteristics of involved functions? I.e. why isn't 
the "read-only" a property of the statement itself rather than something 
to be provided when the statement is executed.

Can one single statement be executed in both read-only and not read-only 
mode?

Regards,
Thomas Hallgren
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] not null partial index?

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] not null partial index?

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] not null partial index?

Re: [HACKERS] Charset/collate support and function parameters

[HACKERS] not null partial index?

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] array_to_column function

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] 8.0b4: COMMIT outside of a transaction echoes ROLLBACK

[HACKERS] Problems using pgxs on Win32

[HACKERS] array_to_column function

Re: [HACKERS] Signature change for SPI_cursor_open

Re: [HACKERS] Signature change for SPI_cursor_open

Re: [HACKERS] 8.0b4: COMMIT outside of a transaction echoes ROLLBACK

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] Charset/collate support and function parameters

Re: [HACKERS] psql and schemas

Re: [HACKERS] Charset/collate support and function parameters

[HACKERS] Charset/collate support and function parameters

Re: [HACKERS] Signature change for SPI_cursor_open

Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

[HACKERS] Signature change for SPI_cursor_open

28 matches

Site Navigation

Mail list logo

Footer information