Re: [HACKERS] "anyelement2" pseudotype

2007-02-13 Thread Tom Lane
"Matt Miller" <[EMAIL PROTECTED]> writes:
> A few months ago at
> http://archives.postgresql.org/pgsql-general/2006-11/msg01770.php the
> notion of adding an "anyelement2" pseudotype was discussed.  The context
> was a compatibility SQL function to support Oracle's DECODE function.
> Assuming this new pseudotype has not been added yet, I'm ready to look
> into doing this myself, and I'd like a bit of shove in the right direction.

The reason it's not in there already is we didn't seem to have quite
enough use-case to justify it.  Do you have more?

As for actually adding it, grep for all references to ANYELEMENT and add
code accordingly; shouldn't be that hard.  Note you'd need to add an
anyarray2 at the same time for things to keep working sanely.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] "anyelement2" pseudotype

2007-02-13 Thread Matt Miller
A few months ago at
http://archives.postgresql.org/pgsql-general/2006-11/msg01770.php the
notion of adding an "anyelement2" pseudotype was discussed.  The context
was a compatibility SQL function to support Oracle's DECODE function.
Assuming this new pseudotype has not been added yet, I'm ready to look
into doing this myself, and I'd like a bit of shove in the right direction.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Cosmetic note: hit rates in logs

2007-02-13 Thread Jeroen T. Vermeulen
Just noticed a small cosmetic point in the logs when logging statement
performance data: if a statement accesses 0 blocks, the "hit rate" is
given as 0%.

I can see that that makes sense mathematically--insofar as 0/0 makes
mathematical sense at all--but wouldn't it be more helpful to represent
this as a 100% hit rate?

I guess computing hit rate as the limit of 0/x is as valid as computing
the limit of x/x (with x being the number of accesses that approaches
zero).  But when I look at the logs I find myself going "low hit rate
here--oh wait, that's for zero accesses" all the time.  Or would the
change make other people "good hit rate here--oh wait, that's for zero
accesses"?


Jeroen



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] Re: [PATCHES] Re: [BUGS] BUG #2724: Could not check connection status with "ssl=on"

2007-02-13 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Bruce Momjian <[EMAIL PROTECTED]> writes:
> >>> ! if (SOCK_ERRNO != ECONNRESET)
> >>> ! SSL_shutdown(conn->ssl);
> >> 
> >> Ummm ... what is this supposed to fix exactly, and what are the odds
> 
> > I think the user was getting SIGPIPE on SSL_shutdown() of a closed
> > connection.
> 
> It seems moderately improbable that by the time control arrives here,
> errno still has anything to do with the last operation on the SSL
> socket.

Yep, I was wondering that too.  It is called SOCK_ERRNO, but in fact it
is just errno on Unix.

I generated the patch just to try to give an example of what the user
might be suggesting.  Let's see if anyone wants to research this more.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Day and month name localization uses wrong locale category

2007-02-13 Thread Bruce Momjian

Would someone update this patch with the optimization below.  The patch
is at?

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Tom Lane wrote:
> Peter Eisentraut <[EMAIL PROTECTED]> writes:
> > What's concerning me about the way this is written is that it calls 
> > setlocale() for each formatting instance, which will be very slow.
> 
> Perhaps, the first time the info is needed, do setlocale(), ask strftime
> for the 12+7 strings we need and save them away, then revert to C locale
> and proceed from there.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Various breakages in new contrib/isn module

2007-02-13 Thread Bruce Momjian

Added to TODO:

* Clean up casting in /contrib/isn

  http://archives.postgresql.org/pgsql-hackers/2006-11/msg00245.php


---

Tom Lane wrote:
> It occurred to me to run the regression tests type_sanity and opr_sanity
> over the contrib/isn code.  (The easy way to do this is to copy the isn
> install script into the regress/sql directory and then add it to
> serial_schedule just before those two tests.)  It turned up a couple
> of moderately serious problems:
> 
> * There are a whole bunch of "shell" operators created; try
>   select oid::regoperator from pg_operator where oprcode = 0;
> after loading isn.  I didn't track it down in detail, but it looked
> like most or all of these come from dangling oprcom links, ie, there's
> an operator that claims to have a commutator but you never supplied one.
> This is very bad, because the planner *will* try to use those operators
> given the right kind of query.
> 
> * There are hash opclasses for these datatypes but the corresponding
> equality operators are not marked hashable.  This is not quite as bad,
> but should be fixed.
> 
> Please submit a patch that fixes these.
> 
> Note to hackers: it might be worth trying the same thing with the other
> contrib modules; I don't have time right now though.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] "recovering prepared transaction" after server restart message

2007-02-13 Thread Bruce Momjian

Added to TODO:

* Improve logging of prepared statements recovered during startup

  http://archives.postgresql.org/pgsql-hackers/2006-11/msg00092.php



---

Joachim Wieland wrote:
> There have been several reports that people could not vacuum any more or
> observed strange locks even after server restart. The reason was that they
> still had uncommitted prepared transactions around.
> 
> 
> I wonder if it could help to change the log level from
> 
> ereport(LOG,
> (errmsg("recovering prepared transaction %u", xid)));
> 
> to WARNING maybe in order to make that message more striking within the
> normal startup messages.
> 
> 
> 
> Joachim
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Deadlock with pg_dump?

2007-02-13 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Are we ready to set 'log_min_error_statement = error' by default for
> > 8.3?
> 
> We already did that in 8.2.

Oh, interesting.  Oops again.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Deadlock with pg_dump?

2007-02-13 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Are we ready to set 'log_min_error_statement = error' by default for
> 8.3?

We already did that in 8.2.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Deadlock with pg_dump?

2007-02-13 Thread Bruce Momjian
Tom Lane wrote:
> "Albe Laurenz" <[EMAIL PROTECTED]> writes:
> >> [ Memo to hackers: why is it that log_min_error_statement = error
> >> isn't the default? ]
> 
> > To avoid spamming the logs with every failed SQL statement?
> 
> Certainly there are people who will turn it off, but that's why it's
> configurable.  I've had to answer "how do I find out what's causing
> error message FOO" often enough that I'm starting to think logging error
> statements is a more useful default than not logging 'em ...

Are we ready to set 'log_min_error_statement = error' by default for
8.3?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Deadlock with pg_dump?

2007-02-13 Thread Bruce Momjian
Simon Riggs wrote:
> On Thu, 2006-10-26 at 18:45 -0400, Tom Lane wrote:
> > Chris Campbell <[EMAIL PROTECTED]> writes:
> > > Is there additional logging information I can turn on to get more  
> > > details? I guess I need to see exactly what locks both processes  
> > > hold, and what queries they were running when the deadlock occurred?  
> > > Is that easily done, without turning on logging for *all* statements?
> > 
> > log_min_error_statement = error would at least get you the statements
> > reporting the deadlocks, though not what they're conflicting against.
> 
> Yeh, we need a much better locking logger for performance analysis.
> 
> We really need to dump the whole wait-for graph for deadlocks, since
> this might be more complex than just two statements involved. Deadlocks
> ought to be so infrequent that we can afford the log space to do this -
> plus if we did this it would likely lead to fewer deadlocks.
> 
> For 8.3 I'd like to have a log_min_duration_lockwait (secs) parameter
> that would allow you to dump the wait-for graph for any data-level locks
> that wait too long, rather than just those that deadlock. Many
> applications experience heavy locking because of lack of holistic
> design. That will also show up the need for other utilities to act
> CONCURRENTLY, if possible.

Old email, but I don't see how our current output is not good enough?

test=> lock a;
ERROR:  deadlock detected
DETAIL:  Process 6855 waits for AccessExclusiveLock on relation 16394 of
database 16384; blocked by process 6795.
Process 6795 waits for AccessExclusiveLock on relation 16396 of database
16384; blocked by process 6855.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Fixing insecure security definer functions

2007-02-13 Thread Andrew Dunstan
Peter Eisentraut wrote:
> Regarding the advisory on possibly insecure security definer functions
> that I just sent out (by overriding the search path you can make the
> function do whatever you want with the privileges of the function
> owner), the favored solution after some initial discussion in the core
> team was to save the search path at creation time with each function.
> This measure will arguably also increase the robustness of functions in
> general, and it seems to be desirable as part of the effort to make
> plan invalidation work.
>
> Quite probably, there will be all sorts of consequences in terms of
> backward compatibility and preserving the possibility of valid uses of
> the old behavior and so on.  So I'm inviting input on how to fix the
> problem in general and how to avoid the mentioned follow-up problems in
> particular.


Maybe we need an option on CREATE  ... SECURITY DEFINER to allow the
function to inherit the caller's search path.

cheers

andrew



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Foreign keys for non-default datatypes, redux

2007-02-13 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes:
> On Mon, 12 Feb 2007, Tom Lane wrote:
>> It turns out this isn't sufficient: ri_Check_Pk_Match() wants to
>> generate PK = PK checks, and the PK = FK operator isn't the right one
>> for that.

> Ugh, right, for modifications of the pk side with no action to make sure
> there isn't a new row with that key.

When I looked closer I found out that ri_AttributesEqual() is applied to
pairs of FK values as well as pairs of PK values.  The old coding was
looking up the default btree operator for the types, which is close but
not close enough, if we want to allow for the possibility of unique
indexes built on non-default operator classes.  So I ended up with three
new columns in pg_constraint --- PK=FK, PK=PK, FK=FK operator OIDs.
Anyway it's all done and committed.

It strikes me BTW that the RI mechanism was vulnerable to the same sort
of search_path-based shenanigans that Peter is worried about: any RI
checks triggered within a SECURITY DEFINER function could have been
subverted by substituting a different "=" operator.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Fixing insecure security definer functions

2007-02-13 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
> Stephen Frost <[EMAIL PROTECTED]> writes:
> > It'll break most of the functions that we have in our production
> > systems...  They're not security definer functions but it's routine for
> > us to switch between different schemas to run a function on.
> 
> > What about pushing all the in-function references down to the
> > specific objects referenced at plan creation time (err, I thought this
> > was done?)?
> 
> Wouldn't that break exactly the cases you're worried about?  It would be
> an enormous amount of work, too.

No, because what we tend to do is build up a query in a string and 
then call it using execute.  It doesn't matter to the execute'd string
if the references in the functions are mapped to oids or not at creation
time (since the query being built in the string couldn't possibly be
affected).  If the search path is forced to something that'll screw up
the query being execute'd tho.

The calls to build up the query don't use things in the current search 
path much (they're generally refering to a seperate specific reference 
schema).  Once the command is built it's then run, but it could be run 
in a number of different schemas (because they all have basically the 
exact same set of tables) which is based on the search path.  This
allows us to have one set of functions (I think we're up to around 80
now) which can work against a number of schemas.

Indeed, what I tend to do is set up the search path something like:

set search_path = user1_tables, user1_results, func_schema;
select do_scan();

set search_path = user2_tables, user2_results, func_schema;
select do_scan();

etc, etc.  The queries are run against each user's tables and the
results put into a seperate schema for each user.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Fixing insecure security definer functions

2007-02-13 Thread Tom Lane
Stephen Frost <[EMAIL PROTECTED]> writes:
> It'll break most of the functions that we have in our production
> systems...  They're not security definer functions but it's routine for
> us to switch between different schemas to run a function on.

> What about pushing all the in-function references down to the
> specific objects referenced at plan creation time (err, I thought this
> was done?)?

Wouldn't that break exactly the cases you're worried about?  It would be
an enormous amount of work, too.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Writing triggers in C++

2007-02-13 Thread Tom Lane
Jacob Rief <[EMAIL PROTECTED]> writes:
> I tried to write a trigger using C++.

That is most likely not going to work anyway, because the backend
operating environment is C not C++.  If you dumb it down enough
--- no exceptions, no RTTI, no use of C++ library --- then it might
work, but at that point you're really coding in C anyway.

> Is there any convention how to rename such identifiers? If I would
> rename those identifiers (I simply would add an underscore to each of
> them), would such a patch be accepted and adopted onto one of the next
> releases? 

No.  Because of the above problems, we don't see much reason to avoid
C++'s extra keywords.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Fixing insecure security definer functions

2007-02-13 Thread Tom Lane
Stephen Frost <[EMAIL PROTECTED]> writes:
> * Peter Eisentraut ([EMAIL PROTECTED]) wrote:
>> Regarding the advisory on possibly insecure security definer functions
>> that I just sent out (by overriding the search path you can make the
>> function do whatever you want with the privileges of the function
>> owner), the favored solution after some initial discussion in the core
>> team was to save the search path at creation time with each function. 

> Would this be done only on security-definer functions?

I would like to see it done for all functions, security-definer or not.
There are efficiency reasons: if you keep the search path from thrashing
then you can cache plans more effectively.  (Currently, plpgsql's plan
caching doesn't pay any attention to whether the search path has
changed, but it's impossible to argue that that's not broken.)

I would suggest that the search path be added as an explicit parameter
to CREATE FUNCTION, with a default of the current setting.  The main
reason for this is that it's going to be a real PITA for pg_dump if we
don't allow an explicit specification.

It might also be worth allowing "PATH NULL" or some such locution to
specify the current behavior, for those who really want it.  (In
particular, most C functions would want this to avoid useless overhead
for calls to things that aren't affected by search path.)

Bottom line here is that this feature is really orthogonal to SECURITY
DEFINER ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] array_accum aggregate

2007-02-13 Thread Bruce Momjian

What is the status of this feature addition?

---

Stephen Frost wrote:
-- Start of PGP signed section.
> * Tom Lane ([EMAIL PROTECTED]) wrote:
> > (However, now that we support nulls in arrays, meseems a more consistent
> > definition would be that it allows null inputs and just includes them in
> > the output.  So probably you do need it non-strict.)
> 
> This was my intention.
> 
> > I'm inclined to think that this example demonstrates a deficiency in the
> > aggregate-function design: there should be a way to declare what we're
> > really doing.  But I don't know exactly what that should look like.
> 
> I agree and would much rather have a clean solution which works with the
> design than one which has to work outside it.  When I first was trying
> to decide on the state-type I was looking through the PG catalogs for
> essentially a "complex C type" which translated to a void*.  Perhaps
> such a type could be added.  Unless that's considered along the lines of
> an 'any' type it'd cause problems for the polymorphism aspect.  
> 
> Another alternative would be to provide a seperate area for each 
> aggregate to put any other information it needs.  This would almost
> certainly only be available to C functions but would essentially be a
> void* which is provided through the AggState structure but tracked by
> the aggregator routines and reset for each aggregate function being 
> run.  If that's acceptable, I don't think it'd be all that difficult to
> implement.  With that, aaccum_sfunc and aaccum_ffunc would ignore the 
> state variable passed to them in favor of their custom structure 
> available through fcinfo->AggState (I expect they'd just keep the 
> state variable NULL and be marked non-strict, or set it to some constant
> if necessary).  The pointer would have to be tracked somewhere and then
> copied in/out on each call, but that doesn't seem too difficult to me.
> After all, the state variable is already being tracked somewhere, this
> would just sit next to it, in my head anyway.
> 
> I've got some time this weekend and would be happy to take a shot at
> the second proposal if that's generally acceptable.
> 
>   Thanks,
> 
>   Stephen
-- End of PGP section, PGP failed!

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Fixing insecure security definer functions

2007-02-13 Thread Stephen Frost
* Peter Eisentraut ([EMAIL PROTECTED]) wrote:
> Regarding the advisory on possibly insecure security definer functions 
> that I just sent out (by overriding the search path you can make the 
> function do whatever you want with the privileges of the function 
> owner), the favored solution after some initial discussion in the core 
> team was to save the search path at creation time with each function.  

Would this be done only on security-definer functions?

> This measure will arguably also increase the robustness of functions in 
> general, and it seems to be desirable as part of the effort to make 
> plan invalidation work.
> 
> Quite probably, there will be all sorts of consequences in terms of 
> backward compatibility and preserving the possibility of valid uses of 
> the old behavior and so on.  So I'm inviting input on how to fix the 
> problem in general and how to avoid the mentioned follow-up problems in 
> particular.

It'll break most of the functions that we have in our production
systems...  They're not security definer functions but it's routine for
us to switch between different schemas to run a function on.  I
certainly don't want to have to have seperate functions for these either
as it'd be completely duplicated code with just different search paths
set.  I'm honestly not the least bit impressed with this solution to
the problem.

What about pushing all the in-function references down to the
specific objects referenced at plan creation time (err, I thought this
was done?)?  In most cases what we're doing is building up queries and 
then running them with execute so I *think* that'd still work for us.
Also, it seems like that's the way to deal with the plan-invocation
problem (since that's really going to have to go down to object level
anyway eventually, isn't it?) rather than trying to handle using the
search path to figure out if it's invalidated now or not based on what's
currently there.

Note that I'm not suggesting users would change their source code for
this but rather that the 'create function' command would implicitly do
this ala what create view does.  I really could have sworn something
like this was done (where OIDs are saved).

Another option might be to modify the 'create function' syntax to have
an option of 'search path' to set what search path the function should
have at the start.  Then, for security definer functions at least, issue
a WARNING if that isn't being set at CREATE FUNCTION time.  I'm pretty
sure that in most cases (certainly for us) that'd be noticed and at
least investigated.  Another option would be to ERROR if it's not
provided and allow the previous behaviour by allowing it to be set to
NULL (again, mainly on security definer functions..  maybe warning on
others or something).

Hope these thoughts help...

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCHES] Use non-deprecated APIs for dynloader/darwin.c

2007-02-13 Thread Bruce Momjian

The Darwin dlopen() patch has already been applied.  Where are we on the
Bonjour patch?  Do we want code that works on Darwin 10.2 and 10.3?

---

Chris Campbell wrote:
> On Oct 8, 2006, at 14:29, Tom Lane wrote:
> 
> > Looks good, but I don't think we want to abandon OSX 10.2 support
> > just yet.  I'll revise this to use a configure probe for dlopen.
> 
> Maybe we can abandon Mac OS X 10.2 in 8.3 and later? And not back- 
> port these patches to the 7.x, 8.0, and 8.1 branches?
> 
> BTW, I think the configure probe (only on Darwin, correct?) should  
> test for the existence of .
> 
> > My inclination is to apply this one now, since it only affects OSX
> > and should be easily testable, but to hold off on your other patch
> > for portable Bonjour support until 8.3 devel starts.  The portability
> > implications of that one are unclear, and I don't know how to test it
> > either, so I think putting it in now is too much risk.
> 
> The Bonjour patch wasn't intended to be portable to other platforms  
> just yet. As submitted, it has the same risks/advantages as this  
> dlopen() patch -- it only works on 10.3 and later, but isn't  
> deprecated in 10.4.
> 
> If we want to keep 10.2 support for Bonjour, we can test for both  
> DNSServiceDiscovery.h and dns_sd.h in ./configure, and prefer  
> dns_sd.h if it's found (which will be the case for 10.3 and 10.4) but  
> use DNSServiceDiscovery.h if not (which will be the case for 10.2).
> 
> Thanks!
> 
> - Chris
> 
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Writing triggers in C++

2007-02-13 Thread Peter Eisentraut
Jacob Rief wrote:
> Is there any convention how to rename such identifiers? If I would
> rename those identifiers (I simply would add an underscore to each of
> them), would such a patch be accepted and adopted onto one of the
> next releases?

Couldn't you do the required renamings as preprocessor macros, e.g.,

#define typename _typename
#include 
#undef typename

#include 

your_code;


I would expect very little enthusiasm for making PostgreSQL code C++ 
safe.  There is already too much trouble keeping up with all the 
variants of C.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Fixing insecure security definer functions

2007-02-13 Thread Peter Eisentraut
Regarding the advisory on possibly insecure security definer functions 
that I just sent out (by overriding the search path you can make the 
function do whatever you want with the privileges of the function 
owner), the favored solution after some initial discussion in the core 
team was to save the search path at creation time with each function.  
This measure will arguably also increase the robustness of functions in 
general, and it seems to be desirable as part of the effort to make 
plan invalidation work.

Quite probably, there will be all sorts of consequences in terms of 
backward compatibility and preserving the possibility of valid uses of 
the old behavior and so on.  So I'm inviting input on how to fix the 
problem in general and how to avoid the mentioned follow-up problems in 
particular.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Writing triggers in C++

2007-02-13 Thread Jacob Rief
I tried to write a trigger using C++. This requires to include the
following header-files:

extern "C" {
#include 
#include 
#include 
#include 
}

Unfortunately some of the included headers define some structs and
functions where a few identifiers are C++ keywords.
The compiler-directive 'extern "C"' does not help here, it just tells
the compiler not to mangle C-identifiers. 'extern "C"' does not rename
C++ keywords into something else. Therefore AFAIK, if someone wants to
include those headers files into a C++ program, the identifiers causing
problems have to be renamed manually.

For instance, Postgresql version 8.2.3
/usr/include/pgsql/server/nodes/primnodes.h:950:
List  *using;/* USING clause, if any (list of String) */
'using' is a C++ keyword

/usr/include/pgsql/server/nodes/parsenodes.h:179:
Oid   typeid;/* type identified by OID */
'typeid' is a C++ keyword

/usr/include/pgsql/server/nodes/parsenodes.h:249,265,401,943,1309:
TypeName   *typename;
'typename' is a C++ keyword

/usr/include/pgsql/server/utils/builtins.h:544:
extern char *quote_qualified_identifier(const char *namespace,
'namespace' is a C++ keyword

Is there any convention how to rename such identifiers? If I would
rename those identifiers (I simply would add an underscore to each of
them), would such a patch be accepted and adopted onto one of the next
releases? 

Regards, Jacob



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] NULL and plpgsql rows

2007-02-13 Thread Bruce Momjian

Is there a TODO here?

---

Jim Nasby wrote:
> On Oct 2, 2006, at 6:28 PM, Tom Lane wrote:
> > "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> >> However, the test right above that means that we'll fail if the user
> >> tries something like "row_variable := NULL;":
> >
> > The patch you seem to have in mind would allow
> > row_variable := int_variable;
> > to succeed if the int_variable chanced to contain NULL, which is  
> > surely
> > not very desirable.
> 
> Hrm... is there any reasonable way to catch that?
> 
> > The real issue here is that the bare NULL has type UNKNOWN and  
> > we're not
> > making any effort to cast it.  I'm not sure whether it'd work to  
> > simply
> > apply exec_cast_value --- that looks like it's only meant to handle
> > scalars, where in general you'd need something close to
> > ExecEvalConvertRowtype().
> >
> >> Of course, setting a row variable to null is a lot more useful if  
> >> we can
> >> actually test for it after the fact, and I'm not really sure how  
> >> to make
> >> that happen.
> >
> > Doesn't IS NULL work (as of CVS HEAD)?
> 
> Ahh, so it does. Doesn't work with RECORD, though... which seems a  
> bit surprising. I can't really think of a good reason why they should  
> differ.
> 
> ERROR:  record "v_row" is not assigned yet
> DETAIL:  The tuple structure of a not-yet-assigned record is  
> indeterminate.
> CONTEXT:  PL/pgSQL function "test" line 4 at return
> 
> --
> Jim Nasby[EMAIL PROTECTED]
> EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
> 
> 
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> Tom Lane wrote:
>>> delay.  I just stuck a fixed 100msec delay into the accept-failed code
>>> path.
>> 
>> Seems worth mentioning that bgwriter sleeps 1 sec in case of failure.
>> (And so does the autovac code I'm currently looking at).

> There is probably a good case for a shorter delay in postmaster, though.

Yeah, that's what I thought.  We don't really care if either bgwriter or
autovac goes AWOL for a little while, but if the postmaster's asleep
then nobody can connect.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Andrew Dunstan

Alvaro Herrera wrote:

Tom Lane wrote:
  

Jim Nasby <[EMAIL PROTECTED]> writes:


On Feb 13, 2007, at 12:15 PM, Tom Lane wrote:
  

We could possibly sleep() a bit before retrying,
just to not suck 100% CPU, but that doesn't really *fix* anything ...

Well, not only that, but the machine is currently writing to the  
postmaster log at the rate of 2-3MB/s. ISTM some kind of sleep  
(perhaps growing exponentially to some limit) would be a good idea.
  

Well, since the code has always behaved that way and no one noticed
before, I don't think it's worth anything as complicated as a variable
delay.  I just stuck a fixed 100msec delay into the accept-failed code
path.



Seems worth mentioning that bgwriter sleeps 1 sec in case of failure.
(And so does the autovac code I'm currently looking at).

  


There is probably a good case for a shorter delay in postmaster, though.

cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Alvaro Herrera
Tom Lane wrote:
> Jim Nasby <[EMAIL PROTECTED]> writes:
> > On Feb 13, 2007, at 12:15 PM, Tom Lane wrote:
> >> We could possibly sleep() a bit before retrying,
> >> just to not suck 100% CPU, but that doesn't really *fix* anything ...
> 
> > Well, not only that, but the machine is currently writing to the  
> > postmaster log at the rate of 2-3MB/s. ISTM some kind of sleep  
> > (perhaps growing exponentially to some limit) would be a good idea.
> 
> Well, since the code has always behaved that way and no one noticed
> before, I don't think it's worth anything as complicated as a variable
> delay.  I just stuck a fixed 100msec delay into the accept-failed code
> path.

Seems worth mentioning that bgwriter sleeps 1 sec in case of failure.
(And so does the autovac code I'm currently looking at).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] DROP DATABASE and prepared xacts

2007-02-13 Thread Bruce Momjian

Added to TODO:

* Improve failure message when DROP DATABASE is used on a database that
  has prepared transactions


---

Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > I think we should set things up so that prepared transactions are
> > dropped when they concern a database being dropped.  Opinions?
> 
> Agreed, if you want to drop the database, you don't care about the 
> transactions in it anymore.
> 
> It seems straightforward to implement. We'll need a version of
> FinishPreparedTransaction that takes an xid instead of a global
> transaction id. Then that needs to be called at roughly the same time as
> DatabaseCancelAutovacuumActivity. Preferably there isn't a wide window 
> between rolling back the prepared transactions and committing to 
> dropping the database...
> 
> I just realized that you can prepare a transaction in one database, 
> connect to another database in the same cluster, and issue a "COMMIT 
> PREPARED" there. At least NOTIFY/LISTEN gets confused when you do that, 
> and sends the notification to the another database, not the one where 
> the original transaction was running :(.
> 
> Do we consider committing a transaction from another database a feature, 
> and fix NOTIFY/LISTEN, or should COMMIT PREPARED throw an error if 
> you're not connected to the same database?
> 
> Actually, I think we should completely separate the namespaces of the 
> global transaction identifiers, so that you could use the same gid in 
> two different databases without a conflict.
> 
> -- 
>Heikki Linnakangas
>EnterpriseDB   http://www.enterprisedb.com
> 
> 
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Andrew Hammond

On 2/13/07, Peter Eisentraut <[EMAIL PROTECTED]> wrote:

Andrew Hammond wrote:
> The FreeBSD database/postgres* ports depend on them. Which is
> probably why Marc insists on keeping them.

I hesitate to believe that seeing that they don't actually work, whereas
we have heard no complaints that the FreeBSD ports don't work.


I am not convinced anyone who is serious about postgresql uses the
ports for reasons outlined in a prior post. However, they certainly
are used in the ports (FreeBSD 6.2, ports cvsup'd about 2 mins ago):

Script started on Tue Feb 13 19:25:28 2007
[EMAIL PROTECTED] /usr/ports/databases/postgresql82-server]# make


 === BACKUP YOUR DATA! =
 As always, backup your data before
 upgrading. If the upgrade leads to a higher
 minor revision (e.g. 7.3.x -> 7.4), a dump
 and restore of all databases is
 required. This is *NOT* done by the port!

 Press ctrl-C *now* if you need to pg_dump.
 ===

===>  Found saved configuration for postgresql-server-8.2.3
=> postgresql-base-8.2.3.tar.bz2 doesn't seem to exist in
/usr/ports/distfiles/postgresql.
=> Attempting to fetch from
ftp://ftp8.us.postgresql.org/postgresql/source/v8.2.3/.
postgresql-base-8.2.3.tar.bz2 100% of 8301 kB  619 kBps 00m00s
=> postgresql-opt-8.2.3.tar.bz2 doesn't seem to exist in
/usr/ports/distfiles/postgresql.
=> Attempting to fetch from
ftp://ftp8.us.postgresql.org/postgresql/source/v8.2.3/.
postgresql-opt-8.2.3.tar.bz2  100% of  163 kB  171 kBps
=> postgresql-test-8.2.3.tar.bz2 doesn't seem to exist in
/usr/ports/distfiles/postgresql.
=> Attempting to fetch from
ftp://ftp8.us.postgresql.org/postgresql/source/v8.2.3/.
postgresql-test-8.2.3.tar.bz2 100% of  962 kB  254 kBps
===>  Extracting for postgresql-server-8.2.3
=> MD5 Checksum OK for postgresql/postgresql-base-8.2.3.tar.bz2.
=> SHA256 Checksum OK for postgresql/postgresql-base-8.2.3.tar.bz2.
=> MD5 Checksum OK for postgresql/postgresql-opt-8.2.3.tar.bz2.
=> SHA256 Checksum OK for postgresql/postgresql-opt-8.2.3.tar.bz2.
=> MD5 Checksum OK for postgresql/postgresql-test-8.2.3.tar.bz2.
=> SHA256 Checksum OK for postgresql/postgresql-test-8.2.3.tar.bz2.
-- snip --

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] OT: IRC nick to real world mapping

2007-02-13 Thread Dennis Bjorklund

Tom Lane skrev:


Is there any cross-check on the correctness of this list?


As have been said, there is a registration service that makes it harder 
to steal nicks.


There is no guarantee that anyone who claim to be this or that really is 
who he say he is. On the other hand, a lot of us have been there most 
every day the last 5 years or so and after a while you do get to know 
the guy (or girl) behind the nick.




(Hint: if someone shows up in IRC claiming to be me, he's more than
likely lying.)


In the special case of you I'm pretty sure we would spot it very fast if 
someone is pretending to be you. And if someone can fool us he would be 
just as good as having the real thing in the channel :-)


/Dennis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Alvaro Herrera
Peter Eisentraut wrote:
> Andrew Hammond wrote:
> > The FreeBSD database/postgres* ports depend on them. Which is
> > probably why Marc insists on keeping them.
> 
> I hesitate to believe that seeing that they don't actually work, whereas 
> we have heard no complaints that the FreeBSD ports don't work.

Perhaps what it does is install all the split tarballs and build from
there, which would be an extremely clever use of split tarballs indeed.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Andrew Hammond wrote:
>> The FreeBSD database/postgres* ports depend on them. Which is
>> probably why Marc insists on keeping them.

> I hesitate to believe that seeing that they don't actually work, whereas 
> we have heard no complaints that the FreeBSD ports don't work.

I would assume that "depends on" means "they prefer to download all the
smaller tarballs instead of the one big one".  But they must be building
with the complete tree in place, so this seems a mighty weak form of
dependency.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Tom Lane
Jim Nasby <[EMAIL PROTECTED]> writes:
> On Feb 13, 2007, at 12:15 PM, Tom Lane wrote:
>> We could possibly sleep() a bit before retrying,
>> just to not suck 100% CPU, but that doesn't really *fix* anything ...

> Well, not only that, but the machine is currently writing to the  
> postmaster log at the rate of 2-3MB/s. ISTM some kind of sleep  
> (perhaps growing exponentially to some limit) would be a good idea.

Well, since the code has always behaved that way and no one noticed
before, I don't think it's worth anything as complicated as a variable
delay.  I just stuck a fixed 100msec delay into the accept-failed code
path.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Jim Nasby

On Feb 13, 2007, at 12:15 PM, Tom Lane wrote:

Interesting.  So accept() fails because it can't allocate an FD, which
means that the select condition isn't cleared, so we keep retrying
forever.  I don't see what else we could do though.  Having the
postmaster abort on what might well be a transient condition doesn't
sound like a hot idea.  We could possibly sleep() a bit before  
retrying,

just to not suck 100% CPU, but that doesn't really *fix* anything ...


Well, not only that, but the machine is currently writing to the  
postmaster log at the rate of 2-3MB/s. ISTM some kind of sleep  
(perhaps growing exponentially to some limit) would be a good idea.


I've been meaning to bug you about increasing cuckoo's FD limit  
anyway;

it keeps failing in the regression tests.

ulimit is set to 1224 open files, though I seem to keep bumping  
into that

(anyone know what the system-level limit is, or how to change it?)


On my OS X machine, "ulimit -n unlimited" seems to set the limit to
10240 (or so a subsequent ulimit -a reports).  But you could probably
fix it using the buildfarm parameter that cuts the number of  
concurrent

regression test runs.


Odd... that works on my MBP (sudo bash; ulimit -n unlimited) and I  
get 12288. But the same thing doesn't work on cuckoo, which is a G4;  
the limit stays at 1224 no matter what. Perhaps because I'm setting  
maxfiles in launchd.conf.


In any case, I've upped it to a bit over 2k; we'll see what that  
does. I find it interesting that aubrac isn't affected by this, since  
it's still running with the default of only 256 open files.


I'm thinking we might want to change the default value for  
max_files_per_process on OS X, or have initdb test it like it does  
for other things.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Andrew Hammond
> > The FreeBSD database/postgres* ports depend on them. Which is probably
> > why Marc insists on keeping them.
>
> Well, I think that's a horrid dependency to have. Other packaging
> systems (e.g. the RPM builds) seem quite able to split up a single
> unified build into multiple packages - what can't FBSD? What would we do
> if some other packaging system wanted to ask us for a different split?

I am not particularly impressed with the FreeBSD database/postgres*
ports. The emphasis on splitting postgres into -server -client and -
contrib packages, while in keeping with the rest of the ports
collection seems misplaced when you consider that they offer no
mechanism (at least of which I am aware) to support multiple versions
of the binary.

I can't imagine a situation where I would care about having separate
packages, aside from being annoyed that some of the more valuable
stuff in contrib is not built / installed. Does anyone operate a
production environment without at least pgstattuple? On the other
hand, every production server I've worked on has had at least 2 binary
packages installed and ready for use at all times (the current build
and the last production build in case we're forced to roll back). In
many cases servers I've worked on have had multiple back-ends running,
often with different binaries.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] add to ToDo, please

2007-02-13 Thread Pavel Stehule

Hello

please add to ToDo: Holdable cursor support in SPI

Regards
Pavel Stehule

_
Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci. 
http://messenger.msn.cz/



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> Any such site is already broken, because with the proposed design, code
>> is only exposed to short-header datums if it is also exposed to toasted
>> datums.  

> Well the toaster doesn't run until we were about to put the tuple on disk. I
> was afraid there might be circumstances where data from in-memory tuples are
> returned and we use them without fearing them being toasted. In particular I
> was fearing the record/row handling in plpgsql.

True, which is another argument why datatype-specific functions should
never return short-header datums: there are places that call such
functions directly and probably expect an untoasted result.  I'm not
very concerned about stuff that's gone through a tuple though: in most
cases that could be expected to contain toasted values anyway.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Peter Eisentraut
Andrew Hammond wrote:
> The FreeBSD database/postgres* ports depend on them. Which is
> probably why Marc insists on keeping them.

I hesitate to believe that seeing that they don't actually work, whereas 
we have heard no complaints that the FreeBSD ports don't work.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Gregory Stark wrote:
> Alternatively, what does the trailing "a" in varlena signify? Would this be
> varlenb?

"attribute"

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> You keep suggesting things that I've previously considered and rejected --
>> perhaps prematurely. Off the top of my head I recall the following four
>> options from our discussions. It looks like we're circling around option 4.
>
> No, I think I'm arguing for option 3.

oops, sorry, got my own numbering mixed up.

>> And I take it you're not worried about sites that might not detoast a datum 
>> or
>> detoast one in the wrong memory context where previously they were guaranteed
>> it wouldn't generate a copy? In particular I'm worried about btree code and
>> plpgsql row/record handling.
>
> Any such site is already broken, because with the proposed design, code
> is only exposed to short-header datums if it is also exposed to toasted
> datums.  

Well the toaster doesn't run until we were about to put the tuple on disk. I
was afraid there might be circumstances where data from in-memory tuples are
returned and we use them without fearing them being toasted. In particular I
was fearing the record/row handling in plpgsql.

I was also worried about the tuples that indexes stores. But it seems they
don't go through heap_form_tuple at all. It does seem that the
INDEX_TOAST_HACK would need some work -- but its existence means there can't
be anyone else trusting index tuples not to contain toasted data which is
convenient.

> I'm inclined to think that we might want to set things up so that
> varlena datatypes can individually opt-in or opt-out of this treatment;
> a datatype that needs alignment of its content might well wish to
> opt-out to avoid copying overhead.  We could do that either with a
> different typlen code, or still typlen -1 but pay attention to whether
> typalign is 'c'.

Any reason to go with typalign? You can always use typalign='i' to get the
regular headers.

Alternatively, what does the trailing "a" in varlena signify? Would this be
varlenb?

>> Option 1)
>
>> We detect cases where the typmod guarantees either a fixed size or a maximum
>> size < 256 bytes.
>
> After last week I would veto this option anyway: it fails unless we
> always know typmod exactly, and I'm here to tell you we don't.

Oh, heh, timing is everything I guess.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Faster StrNCpy

2007-02-13 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> This is from October 2006.  Is there a TODO here?

I think we had decided that the code that's in there is fine.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Andrew Dunstan

Andrew Hammond wrote:

On Feb 12, 5:16 pm, [EMAIL PROTECTED] ("Joshua D. Drake") wrote:
  

Peter Eisentraut wrote:


Jeroen T. Vermeulen wrote:
  

Is this a known problem?  Is there any test procedure that builds the
"base" distribution before release?


Most of the core team is convinced that the postgresql-foo tarballs are
useless, but Marc insists on keeping them.  But since they are nearly
useless, no one tests them, so it is not surprising that they don't
work.
  

Why do we keep them again? I can't recall at any point in the life of
CMD us ever using the -foo tarballs. Not to mention they just take up space.

Let's dump them.



The FreeBSD database/postgres* ports depend on them. Which is probably
why Marc insists on keeping them.

  


Well, I think that's a horrid dependency to have. Other packaging 
systems (e.g. the RPM builds) seem quite able to split up a single 
unified build into multiple packages - what can't FBSD? What would we do 
if some other packaging system wanted to ask us for a different split?


cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> The postmaster is stuck in the following loop, according to
> ktrace/kdump:

>   2023 postgres CALL  select(0x8,0xbfffe194,0,0,0xbfffe16c)
>   2023 postgres RET   select 1
>   2023 postgres CALL  sigprocmask(0x3,0x2f0d38,0)
>   2023 postgres RET   sigprocmask 0
>   2023 postgres CALL  accept(0x7,0x200148c,0x200150c)
>   2023 postgres RET   accept -1 errno 24 Too many open files
>   2023 postgres CALL  write(0x2,0x2003928,0x3b)
>   2023 postgres GIO   fd 2 wrote 59 bytes
>"LOG:  could not accept new connection: Too many open files
>"
>   2023 postgres RET   write 59/0x3b
>   2023 postgres CALL  close(0x)
>   2023 postgres RET   close -1 errno 9 Bad file descriptor
>   2023 postgres CALL  sigprocmask(0x3,0x2e6400,0)
>   2023 postgres RET   sigprocmask 0
>   2023 postgres CALL  select(0x8,0xbfffe194,0,0,0xbfffe16c)
>   2023 postgres RET   select 1

Interesting.  So accept() fails because it can't allocate an FD, which
means that the select condition isn't cleared, so we keep retrying
forever.  I don't see what else we could do though.  Having the
postmaster abort on what might well be a transient condition doesn't
sound like a hot idea.  We could possibly sleep() a bit before retrying,
just to not suck 100% CPU, but that doesn't really *fix* anything ...

I've been meaning to bug you about increasing cuckoo's FD limit anyway;
it keeps failing in the regression tests.

> ulimit is set to 1224 open files, though I seem to keep bumping into that
> (anyone know what the system-level limit is, or how to change it?)

On my OS X machine, "ulimit -n unlimited" seems to set the limit to
10240 (or so a subsequent ulimit -a reports).  But you could probably
fix it using the buildfarm parameter that cuts the number of concurrent
regression test runs.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Faster StrNCpy

2007-02-13 Thread Bruce Momjian

This is from October 2006.  Is there a TODO here?

---

Tom Lane wrote:
> I did a couple more tests using x86 architectures.  On a rather old
> Pentium-4 machine running Fedora 5 (gcc 4.1.1, glibc-2.4-11):
> 
> $ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected 
> to be very slow."' -DN="(1024*1024)" -o x x.c y.c strlcpy.c  
> NONE:786305 us
> MEMCPY: 9887843 us
> STRNCPY:   15045780 us
> STRLCPY:   1763 us
> U_STRLCPY: 14994639 us
> LENCPY:19700346 us
> 
> $ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected 
> to be very slow."' -DN=1 -o x x.c y.c strlcpy.c 
> NONE:562001 us
> MEMCPY: 2026546 us
> STRNCPY:   11149134 us
> STRLCPY:   13747827 us
> U_STRLCPY: 12467527 us
> LENCPY:17672899 us
> 
> (STRLCPY is our CVS HEAD code, U_STRLCPY is the unrolled version)
> 
> On a Mac Mini (Intel Core Duo, OS X 10.4.8, gcc 4.0.1), the system has a
> version of strlcpy, but it seems to suck :-(
> 
> $ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected 
> to be very slow."' -DN="(1024*1024)" -o x x.c y.c strlcpy.c ; ./x
> NONE:480298 us
> MEMCPY: 7857291 us
> STRNCPY:   10485948 us
> STRLCPY:   16745154 us
> U_STRLCPY: 18337286 us
> S_STRLCPY: 20920213 us
> LENCPY:22878114 us
> 
> $ gcc -O3 -std=c99 -DSTRING='"This is a very long sentence that is expected 
> to be very slow."' -DN=1 -o x x.c y.c strlcpy.c ; ./x
> NONE:480269 us
> MEMCPY: 1858974 us
> STRNCPY:5405618 us
> STRLCPY:   16364452 us
> U_STRLCPY: 16439753 us
> S_STRLCPY: 19134538 us
> LENCPY:22873141 us
> 
> It's interesting that the unrolled version is actually slower here.
> I didn't dig into the assembly code, but maybe Apple's compiler isn't
> doing a very good job with it?
> 
> Anyway, these results make me less excited about the unrolled version.
> 
> In any case, I don't think we should put too much emphasis on the
> long-source-string case.  In essentially all cases, the true source
> string length will be much shorter than the target buffer (were this
> not so, we'd probably be needing to make the buffer bigger), and strlcpy
> will certainly beat out strncpy in those situations.  The memcpy numbers
> look attractive, but they ignore the problem that in practice we usually
> don't know the source string length in advance --- so I don't think
> those represent something achievable.
> 
> One thing that seems real clear is that the LENCPY method loses across
> the board, which surprised me, but it's hard to argue with numbers.
> 
> I'm still interested to experiment with MemSet-then-strlcpy for namestrcpy,
> but given the LENCPY results this may be a loser too.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Missing directory when building 8.2.3-base

2007-02-13 Thread Andrew Hammond
On Feb 12, 5:16 pm, [EMAIL PROTECTED] ("Joshua D. Drake") wrote:
> Peter Eisentraut wrote:
> > Jeroen T. Vermeulen wrote:
> >> Is this a known problem?  Is there any test procedure that builds the
> >> "base" distribution before release?
>
> > Most of the core team is convinced that the postgresql-foo tarballs are
> > useless, but Marc insists on keeping them.  But since they are nearly
> > useless, no one tests them, so it is not surprising that they don't
> > work.
>
> Why do we keep them again? I can't recall at any point in the life of
> CMD us ever using the -foo tarballs. Not to mention they just take up space.
>
> Let's dump them.

The FreeBSD database/postgres* ports depend on them. Which is probably
why Marc insists on keeping them.

Andrew


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Re: [COMMITTERS] pgsql: Add ORDER BY to vacummdb so databases are scaned in the same

2007-02-13 Thread Bruce Momjian
Tom Lane wrote:
> [EMAIL PROTECTED] (Bruce Momjian) writes:
> > Log Message:
> > ---
> > Add ORDER BY to vacummdb so databases are scaned in the same order as
> > pg_dumpall.
> 
> If we're gonna do that (which I have no objection to), shouldn't
> clusterdb and reindexdb do the same?

Done, thanks.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [COMMITTERS] pgsql: Add ORDER BY to vacummdb so databases are scaned in the same

2007-02-13 Thread Tom Lane
[EMAIL PROTECTED] (Bruce Momjian) writes:
> Log Message:
> ---
> Add ORDER BY to vacummdb so databases are scaned in the same order as
> pg_dumpall.

If we're gonna do that (which I have no objection to), shouldn't
clusterdb and reindexdb do the same?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Tom Lane wrote:
> > We detect cases where the typmod guarantees either a fixed size or a maximum
> > size < 256 bytes.
> 
> After last week I would veto this option anyway: it fails unless we
> always know typmod exactly, and I'm here to tell you we don't.

If we can pull this off, it handles short values stored in TEXT fields
too, which is a big win over the typmod idea I had.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> You keep suggesting things that I've previously considered and rejected --
> perhaps prematurely. Off the top of my head I recall the following four
> options from our discussions. It looks like we're circling around option 4.

No, I think I'm arguing for option 3.

> And I take it you're not worried about sites that might not detoast a datum or
> detoast one in the wrong memory context where previously they were guaranteed
> it wouldn't generate a copy? In particular I'm worried about btree code and
> plpgsql row/record handling.

Any such site is already broken, because with the proposed design, code
is only exposed to short-header datums if it is also exposed to toasted
datums.  It's possible that we would find a few places that hadn't
gotten converted, but any such place is a must-fix anyway.  In practice
the TOAST support has been in there long enough that I don't think we'll
find any ... at least not in the core code.  It's entirely possible that
there is user-defined code out there that uses varlena format and isn't
careful about detoasting.  This is an argument for allowing opt-out
(see below).

> I'm not sure what to do about the alignment issue. We could just never align
> 1-byte headers. That would work just fine as long a the data types that need
> alignment don't get ported to the new macros. It seems silly to waste space on
> disk in order to save a cpu memcpy that we aren't even going to be saving for
> now anyways.

No, a datum that is in 1-byte-header format wouldn't have any special
alignment inside a tuple.  There are two paths to get at it: detoast it
(producing an aligned, 4-byte-header, palloc'd datum) or use the
still-to-be-named macros that let you access the unaligned content
directly.

I'm inclined to think that we might want to set things up so that
varlena datatypes can individually opt-in or opt-out of this treatment;
a datatype that needs alignment of its content might well wish to
opt-out to avoid copying overhead.  We could do that either with a
different typlen code, or still typlen -1 but pay attention to whether
typalign is 'c'.

> Option 1)

> We detect cases where the typmod guarantees either a fixed size or a maximum
> size < 256 bytes.

After last week I would veto this option anyway: it fails unless we
always know typmod exactly, and I'm here to tell you we don't.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] cuckoo is hung during regression test

2007-02-13 Thread Jim C. Nasby
The 8.1 build for cuckoo is currently hung, with the *postmaster* taking
all the CPU it can get. The build started almost 5 hours ago.

The postmaster is stuck in the following loop, according to
ktrace/kdump:

  2023 postgres RET   write 59/0x3b
  2023 postgres CALL  close(0x)
  2023 postgres RET   close -1 errno 9 Bad file descriptor
  2023 postgres CALL  sigprocmask(0x3,0x2e6400,0)
  2023 postgres RET   sigprocmask 0
  2023 postgres CALL  select(0x8,0xbfffe194,0,0,0xbfffe16c)
  2023 postgres RET   select 1
  2023 postgres CALL  sigprocmask(0x3,0x2f0d38,0)
  2023 postgres RET   sigprocmask 0
  2023 postgres CALL  accept(0x7,0x200148c,0x200150c)
  2023 postgres RET   accept -1 errno 24 Too many open files
  2023 postgres CALL  write(0x2,0x2003928,0x3b)
  2023 postgres GIO   fd 2 wrote 59 bytes
   "LOG:  could not accept new connection: Too many open files
   "
  2023 postgres RET   write 59/0x3b
  2023 postgres CALL  close(0x)
  2023 postgres RET   close -1 errno 9 Bad file descriptor
  2023 postgres CALL  sigprocmask(0x3,0x2e6400,0)
  2023 postgres RET   sigprocmask 0
  2023 postgres CALL  select(0x8,0xbfffe194,0,0,0xbfffe16c)
  2023 postgres RET   select 1
  2023 postgres CALL  sigprocmask(0x3,0x2f0d38,0)
  2023 postgres RET   sigprocmask 0
  2023 postgres CALL  accept(0x7,0x200148c,0x200150c)
  2023 postgres RET   accept -1 errno 24 Too many open files
  2023 postgres CALL  write(0x2,0x200381c,0x3b)
  2023 postgres GIO   fd 2 wrote 59 bytes
   "LOG:  could not accept new connection: Too many open files
   "
  2023 postgres RET   write 59/0x3b

ulimit is set to 1224 open files, though I seem to keep bumping into that
(anyone know what the system-level limit is, or how to change it?)

Is there other useful info to be had about this process, or should I just kill
it?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread mark
Hey all:

This is a randomly inserted distraction to tell you that I really like to
read about these ideas. No input from myself at this point. I'm happy with
the direction you are taking.

Thanks,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
>> Uh, if the macros can read 1 and 4-byte headers, why do we need to
>> allocate memory for them?

> Because the macros can't read 1 and 4-byte headers. If they could we would
> have the problem with VARDATA for code sites that write to the data before
> they write the size.

The way I see this working is that VARDATA keeps its current behavior
and hence can only be used with datums that are known to be in
4-byte-header form; hence, to avoid breaking code that uses it,
PG_DETOAST_DATUM has to produce a 4-byte-header datum always.

After we have the infrastructure in place, we'd make a pass over
high-traffic functions to replace uses of PG_DETOAST_DATUM with
something that doesn't forcibly expand 1-byte-header datums, and replace
uses of VARDATA on the result with something that handles both header
formats (and would be unsuitable for generating result datums, since
it'd have to assume that the length is already filled in).

I don't see any good reason why datatype-specific functions would ever
need to generate the short-header format directly.  The only point where
it's worth trimming the header size is during heap_form_tuple, and we
can do it there at no significant efficiency cost.  So uses of VARDATA
in connection with building a new datum need not be touched.

I'm inclined also to suggest that VARSIZE() need only support 4-byte
format: we could have a second macro that understands both formats and
gets used in the same high-traffic functions in which we are replacing
uses of VARDATA().  There's no benefit in making other sites support
1-byte format for VARSIZE() if they aren't going to support it for
VARDATA().

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> The point I'm trying to get across here is to do things one small step
> at a time; if you insist on a "big bang" patch then it'll never get
> done.  You might want to go back and review the CVS history for some
> other big changes like TOAST and the version-1 function-call protocol
> to see our previous uses of this approach.

Believe me I'm not looking for a "big bang" approach. It's just that I've
found problems with any of the incremental approaches. I suppose it's
inevitable that "big bang" approaches always seem like the most perfect since
they automatically involve fixing anything that could be a problem.

You keep suggesting things that I've previously considered and rejected --
perhaps prematurely. Off the top of my head I recall the following four
options from our discussions. It looks like we're circling around option 4.

As long as we're willing to live with the palloc/memcpy overhead at least for
now given that we can reduce it by whittling away at the sites that use the
old macros, that seems like a good compromise and the shortest development
path except perhaps option 1.

And I take it you're not worried about sites that might not detoast a datum or
detoast one in the wrong memory context where previously they were guaranteed
it wouldn't generate a copy? In particular I'm worried about btree code and
plpgsql row/record handling.

I'm not sure what to do about the alignment issue. We could just never align
1-byte headers. That would work just fine as long a the data types that need
alignment don't get ported to the new macros. It seems silly to waste space on
disk in order to save a cpu memcpy that we aren't even going to be saving for
now anyways.



Option 1)

We detect cases where the typmod guarantees either a fixed size or a maximum
size < 256 bytes. In which case instead of copying the typlen from pg_type
into tupledesc and the table's attlen we use the implied attlen or store -3
indicating single byte headers everywhere.

Cons: This implies pallocing and memcpying the datum in heap_deform_tuple.

  It doesn't help text or varchar() at all even if we mostly store
  small data in them.


Pros: it buys us 0-byte headers for things like numeric(1,0) and even char(n)
  on single-byte encodings. It buys us 1-byte headers for most numerics
  and char(n) or varchar(n).


Option 2)

We have heap_form_tuple/heap_deform_tuple compress and uncompress the headers.
The rule is that headers are always compressed in a tuple and never compressed
when at the end of a datum.

Cons: This implies pallocing and memcpying the datum in heap_deform_tuple

  It requires restricting the range of 1-byte headers to 127 or 63 bytes
  and always uses 1 byte even for fixed size data. (We could get 0-byte
  headers for a small range (ascii characters and numeric integers up to
  127) but then 1-byte headers would be down to 31 byte data.)

  It implies a different varlena format for on-disk and in-memory

Pros: it works for text/varchar as long as we store small data.

  It lets us use network byte order or some other format for the on-disk
  headers without changing the macros to access in-memory headers.


Option 3)

We have the toaster (or heap_form_tuple as a shortcut) compress the headers
but delay decompression until pg_detoast_datum. tuples only contain compressed
headers but Datums sometimes point to compressed headers and sometimes
uncompressed headers just as they sometimes point to toasted data and
sometimes detoasted data.

Cons: This still implies pallocing and memcpying the datum at least for now

  There could be cases where code expects to deform_tuple and be
  guaranteed a non-toasted pointer into the tuple.

  requires replacing VARATT_SIZEP with SET_VARLENA_LEN()

Pros: It allows for future macros to examine the compressed datum without
  decompressing it.


Option 4)

We have compressed data constructed on the fly everywhere possible.

Cons: requires replacing VARATT_SIZEP and also requires hacking VARDATA to
  find the data in the appropriate place. Might need an additional pair of
  macros for backwards compatibility in code that really needs to
  construct a 4-byte headered varlena.

  fragility with risk of VARSIZE / VARDATA being filled in out of order

  Requires changing header to be in network byte order all the time.

Pros: one consistent representation for varlena everywhere.



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Gregory Stark wrote:
> 
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> 
> >> But I'm surprised, I had the impression that you were reluctant to consider
> >> memcpying all the data all the time.
> >
> > Uh, if the macros can read 1 and 4-byte headers, why do we need to
> > allocate memory for them?
> 
> Because the macros can't read 1 and 4-byte headers. If they could we would
> have the problem with VARDATA for code sites that write to the data before
> they write the size.

OK, so what if we always create 4-byte headers, and read both.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] TODO item: update source/timezone for 64-bit tz files

2007-02-13 Thread Bruce Momjian

Added to TODO:

* Update our code to handle 64-bit timezone files to match the zic
  source code, which now uses them


---

Tom Lane wrote:
> Back when we converted src/timezone to use int64 for pg_time_t, we
> wondered what to do about extending the compiled timezone data file
> format for int64, so that it would work for years beyound 2038.  We
> shelved the problem waiting to see what the upstream zic folks would do.
> Well, it looks like they've done something about it.  So I think we
> ought to plan on updating our code to match theirs, so that we fix the
> y2038 problem while keeping it possible to use a standard zic-database
> installation with Postgres.  This is not urgent (I surely see no need
> to hold up 8.2 to fix it), but it ought to go on the TODO list.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 4: Have you searched our list archives?
> 
>http://archives.postgresql.org

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Gregory Stark

"Bruce Momjian" <[EMAIL PROTECTED]> writes:

>> But I'm surprised, I had the impression that you were reluctant to consider
>> memcpying all the data all the time.
>
> Uh, if the macros can read 1 and 4-byte headers, why do we need to
> allocate memory for them?

Because the macros can't read 1 and 4-byte headers. If they could we would
have the problem with VARDATA for code sites that write to the data before
they write the size.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing likelihood of deadlocks (was referential Integrity and SHARE locks)

2007-02-13 Thread Marc Munro
On Tue, 2007-13-02 at 11:38 -0500, Tom Lane wrote:
> Marc Munro <[EMAIL PROTECTED]> writes:
> > From an application developer's standpoint there are few options, none
> > of them ideal:
> 
> How about
> 
> 4) Make all the FK constraints deferred, so that they are only checked
> at end of transaction.  Then the locking order of transactions that only
> modify C is always C1, C2, ..., P.

Excellent suggestion.  Thank you.

__
Marc


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Reducing likelihood of deadlocks (was referential Integrity and SHARE locks)

2007-02-13 Thread Tom Lane
Marc Munro <[EMAIL PROTECTED]> writes:
> From an application developer's standpoint there are few options, none
> of them ideal:

How about

4) Make all the FK constraints deferred, so that they are only checked
at end of transaction.  Then the locking order of transactions that only
modify C is always C1, C2, ..., P.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> On Tue, Feb 13, 2007 at 05:23:56PM +0100, Peter Eisentraut wrote:
>> It turns out that gcc warns about it anyway.  Does anyone have some sort 
>> of clever recipe to catch warnings more easily than by carefully 
>> reading the make output or manually grepping build log files or 
>> something?

> Perhaps something we could have the buildfarm do as well, if it can be
> automated?

I tend to do "make >make.out 2>make.err" and then look at make.err.
The normal situation with a gcc build is that make.err contains one or
two warnings due to flex's bad habits.  We could possibly get that down
to zero if we wanted to work at it.  However, most non-gcc compilers
I've looked at generate dozens of mostly-silly warnings, so I'm not sure
if the buildfarm could use this technique or not.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Bruce Momjian
Peter Eisentraut wrote:
> Tom Lane wrote:
> > Magnus Hagander <[EMAIL PROTECTED]> writes:
> > > From what I can tell, this is because the Assert() puts code (the
> > > do {} loop) *before* the declaration of StringInfoData buf, which
> > > is not permitted.
> >
> > This will fail on every ANSI-C compiler, not just vc.  Please fix.
> 
> We seem to have very poor coverage of such compilers in the build farm, 
> it seems.  Is the vcbuild ready to support a regular build farm run 
> yet?
> 
> It turns out that gcc warns about it anyway.  Does anyone have some sort 
> of clever recipe to catch warnings more easily than by carefully 
> reading the make output or manually grepping build log files or 
> something?

Yes, I run /src/tools/pgtest, which shows the warning lines at the end,
after the regression tests are run.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Have psql show current sequnce values - (Resubmission)

2007-02-13 Thread Bruce Momjian

Based on this patch review, I am removing the patch from the patch
queue and requiring a resubmission.

---

Tom Lane wrote:
> Dhanaraj M <[EMAIL PROTECTED]> writes:
> > Sorry for resubmitting this patch.
> > Just now I found a problem.
> > Instead of assigning initial sequence value to 1,
> > I assign LLONG_MAX to avoid the buffer overflow problem.
> > Please find the current version here.
> 
> This patch is a mess.  In the first place, it's completely unkosher for
> an application to scribble on a PGresult's contents, even if you do take
> steps like the above to try to make sure there's enough space.  But said
> step does not work anyway -- LLONG_MAX might not exist on the client, or
> might exist but be smaller than the server's value.
> 
> Another problem with it is it's not schema-aware and not proof against
> quoting requirements for the sequence name (try it with a mixed-case
> sequence name for instance).  It also ought to pay some attention to
> the possibility that the SELECT for last_value fails --- quite aside
> from communication failure or such, there might be a permissions problem
> preventing the last_value from being read.
> 
>   regards, tom lane
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>choose an index scan if your joining column's datatypes do not
>match

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Magnus Hagander
On Tue, Feb 13, 2007 at 05:23:56PM +0100, Peter Eisentraut wrote:
> Tom Lane wrote:
> > Magnus Hagander <[EMAIL PROTECTED]> writes:
> > > From what I can tell, this is because the Assert() puts code (the
> > > do {} loop) *before* the declaration of StringInfoData buf, which
> > > is not permitted.
> >
> > This will fail on every ANSI-C compiler, not just vc.  Please fix.
> 
> We seem to have very poor coverage of such compilers in the build farm, 
> it seems.  Is the vcbuild ready to support a regular build farm run 
> yet?

Backend-wise, yes. It does require some changes to the buildfarm code itself
(can't call make and such), which I beleive Andrew is working on (when
he has free time).

(the vcregress script I just committed was the final "major step"
towards it, I think)

> It turns out that gcc warns about it anyway.  Does anyone have some sort 
> of clever recipe to catch warnings more easily than by carefully 
> reading the make output or manually grepping build log files or 
> something?

Perhaps something we could have the buildfarm do as well, if it can be
automated?

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> I'd be inclined to put the intelligence into heap_form_tuple and thereby
>> avoid getting the TOAST code involved unless there are wide fields to
>> deal with.

> And have heap_deform_tuple / heap_getattr palloc and memcpy the the datum on
> the way out? Or wait until detoast time and then do it?

No, heap_deform_tuple / heap_getattr are not responsible for palloc'ing
anything, only for computing appropriate pointers into the tuple.
Existing functions that use PG_DETOAST_DATUM would incur a palloc to
produce a 4-byte-header version of a short-header datum.  We could then
work on modifying one function at a time to use some alternative macro
that doesn't force a useless palloc, but the system wouldn't be broken
meanwhile; and only the high-traffic functions would be worth fixing
at all.

The point I'm trying to get across here is to do things one small step
at a time; if you insist on a "big bang" patch then it'll never get
done.  You might want to go back and review the CVS history for some
other big changes like TOAST and the version-1 function-call protocol
to see our previous uses of this approach.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Peter Eisentraut
Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
> > From what I can tell, this is because the Assert() puts code (the
> > do {} loop) *before* the declaration of StringInfoData buf, which
> > is not permitted.
>
> This will fail on every ANSI-C compiler, not just vc.  Please fix.

We seem to have very poor coverage of such compilers in the build farm, 
it seems.  Is the vcbuild ready to support a regular build farm run 
yet?

It turns out that gcc warns about it anyway.  Does anyone have some sort 
of clever recipe to catch warnings more easily than by carefully 
reading the make output or manually grepping build log files or 
something?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing likelihood of deadlocks (was referential Integrity and SHARE locks)

2007-02-13 Thread Marc Munro
On Mon, 2007-12-02 at 00:10 -0500, Tom Lane wrote:
> Marc Munro <[EMAIL PROTECTED]> writes:
> > Consider a table C containing 2 child records C1 and C2, of parent P.
> > If transaction T1 updates C1 and C2, the locking order of the the
> > records will be C1, P, C2.  Another transaction, T2, that attempts to
> > update only C2, will lock the records in order C2, P.
> 
> > The locks on C2 and P are taken in different orders by the two
> > transactions, leading to the possibility of deadlock.
> 
> But the lock on P is shared, hence no deadlock.

Doh!  Yes, you are right.  It is not that simple.

For deadlock to occur, we need a transaction that takes an exclusive
lock on P as well as on one of the children.  Let us replace T2 with a
new transaction, T3, which is going to update P and only one of its
children.

If T3 is going to update P and C1 without the possibility of deadlock
against T1, then it must take out the locks in the order C1, P.  If, on
the other hand, it is going to update P and C2, then the locks must be
taken in the order P, C2.

This means that there is no single strategy we can apply to T3 that will
guarantee to avoid deadlocks with transactions that update only C (ie
transactions, which to a developers point of view do nothing to P, and
so should be unable to deadlock with T3).

From an application developer's standpoint there are few options, none
of them ideal:

1) Insist on a locking policy that requires updates to first lock their
parent records.

This is horrible for so many reasons.  It should be unnecessary; it
causes exclusive locking on parent records, thereby eliminating the
gains made by introducing row share locks in 8.1; it is onerous on the
developers; it is error-prone; etc

2) Remove FK constraints to eliminate the possibility of RI-triggered
deadlocks.

Ugh.

3) Encapsulate all transactions in some form of retry mechanism that
traps deadlocks and retries those transactions.

This may not be practicable, and incurs all of the overhead of
encountering and trapping deadlocks in the first place.  Also, as each
deadlock occurs, a number of locks will be left active before deadlock
detection kicks in, increasing the window for further deadlocks.  On a
busy system, the first deadlock may well trigger a cascade of further
deadlocks.

__
Marc


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Greg Stark wrote:
> Tom Lane <[EMAIL PROTECTED]> writes:
> 
> > > So (nigh) every tuple will get deformed and reformed once before it goes 
> > > to
> > > disk? Currently the toast code doesn't even look at a tuple if it's small
> > > enough, but in this case we would want it to fire even on very narrow 
> > > rows.
> > 
> > I'd be inclined to put the intelligence into heap_form_tuple and thereby
> > avoid getting the TOAST code involved unless there are wide fields to
> > deal with.
> 
> And have heap_deform_tuple / heap_getattr palloc and memcpy the the datum on
> the way out? Or wait until detoast time and then do it?
> 
> If we do it on the way out of the heaptuple then we could have a rule that
> headers are always compressed in a tuple and always uncompressed out of a
> tuple.
> 
> But I'm surprised, I had the impression that you were reluctant to consider
> memcpying all the data all the time.

Uh, if the macros can read 1 and 4-byte headers, why do we need to
allocate memory for them?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> > So (nigh) every tuple will get deformed and reformed once before it goes to
> > disk? Currently the toast code doesn't even look at a tuple if it's small
> > enough, but in this case we would want it to fire even on very narrow rows.
> 
> I'd be inclined to put the intelligence into heap_form_tuple and thereby
> avoid getting the TOAST code involved unless there are wide fields to
> deal with.

And have heap_deform_tuple / heap_getattr palloc and memcpy the the datum on
the way out? Or wait until detoast time and then do it?

If we do it on the way out of the heaptuple then we could have a rule that
headers are always compressed in a tuple and always uncompressed out of a
tuple.

But I'm surprised, I had the impression that you were reluctant to consider
memcpying all the data all the time.

-- 
greg


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Gregory Stark wrote:
> 
> Tom Lane <[EMAIL PROTECTED]> writes:
> 
> > Gregory Stark <[EMAIL PROTECTED]> writes:
> > > I don't really see a way around it though. Places that fill in VARDATA 
> > > before
> > > the size (formatting.c seems to be the worst case) will just have to be
> > > changed and it'll be a fairly fragile point.
> > 
> > No, we're not going there: it'd break too much code now and it'd be a
> > continuing source of bugs for the foreseeable future.  The sane way to
> > design this is that
> > 
> > (1) code written to existing practice will always generate 4-byte
> > headers.  (Hence, VARDATA() acts the same as now.)  That's the format
> > that generally gets passed around in memory.
> 
> So then we don't need to replace VARSIZE with SET_VARLENA_LEN at all.
> 
> > (2) creation of a short header is handled by the TOAST code just before
> > the tuple goes to disk.
> > 
> > (3) replacement of a short header with a 4-byte header is considered
> > part of de-TOASTing.
> 
> So (nigh) every tuple will get deformed and reformed once before it goes to
> disk? Currently the toast code doesn't even look at a tuple if it's small
> enough, but in this case we would want it to fire even on very narrow rows.

One weird idea I had was that the macros can read 1 and 4-byte headers,
but can only create 4-byte headers.  The code that writes to the shared
buffer pages would to compression from 1 to 4 bytes as needed.

This might avoid changing any macros.  It also allows us to carry around
4-byte headers in memory, which I think might be more efficient.  I am
not sure if I have heard this idea proposed already or not.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Alvaro Herrera
Magnus Hagander wrote:
> The latest set of XML changes (I think latest, at least fairly recent)
> broke the win32vc build with asserts enabled. The line:
>   Assert(fully_escaped || !escape_period);
> 
> From what I can tell, this is because the Assert() puts code (the do {}
> loop) *before* the declaration of StringInfoData buf, which is not
> permitted.

Certainly.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> (1) code written to existing practice will always generate 4-byte
>> headers.  (Hence, VARDATA() acts the same as now.)  That's the format
>> that generally gets passed around in memory.

> So then we don't need to replace VARSIZE with SET_VARLENA_LEN at all.

Yes, we do, because we have to alter the representation of 4-byte headers.
Otherwise we can't tell which header format a datum is using.

> So (nigh) every tuple will get deformed and reformed once before it goes to
> disk? Currently the toast code doesn't even look at a tuple if it's small
> enough, but in this case we would want it to fire even on very narrow rows.

I'd be inclined to put the intelligence into heap_form_tuple and thereby
avoid getting the TOAST code involved unless there are wide fields to
deal with.

> What I had had in mind was to prohibit using smaller headers than the
> alignment of the data type. But that was on the assumption we would continue
> to use the compressed header in memory and not copy it.

Well, it wouldn't be too unreasonable to limit this whole mechanism to
datatypes that have no alignment requirements on the *content* of their
datums; which in practice is probably just text/varchar/char and perhaps
inet.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Magnus Hagander
On Tue, Feb 13, 2007 at 10:50:30AM -0500, Tom Lane wrote:
> Magnus Hagander <[EMAIL PROTECTED]> writes:
> > From what I can tell, this is because the Assert() puts code (the do {}
> > loop) *before* the declaration of StringInfoData buf, which is not
> > permitted.
> 
> This will fail on every ANSI-C compiler, not just vc.  Please fix.

Applied.

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Tom Lane
Magnus Hagander <[EMAIL PROTECTED]> writes:
> From what I can tell, this is because the Assert() puts code (the do {}
> loop) *before* the declaration of StringInfoData buf, which is not
> permitted.

This will fail on every ANSI-C compiler, not just vc.  Please fix.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Gregory Stark

Tom Lane <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
> > I don't really see a way around it though. Places that fill in VARDATA 
> > before
> > the size (formatting.c seems to be the worst case) will just have to be
> > changed and it'll be a fairly fragile point.
> 
> No, we're not going there: it'd break too much code now and it'd be a
> continuing source of bugs for the foreseeable future.  The sane way to
> design this is that
> 
> (1) code written to existing practice will always generate 4-byte
> headers.  (Hence, VARDATA() acts the same as now.)  That's the format
> that generally gets passed around in memory.

So then we don't need to replace VARSIZE with SET_VARLENA_LEN at all.

> (2) creation of a short header is handled by the TOAST code just before
> the tuple goes to disk.
> 
> (3) replacement of a short header with a 4-byte header is considered
> part of de-TOASTing.

So (nigh) every tuple will get deformed and reformed once before it goes to
disk? Currently the toast code doesn't even look at a tuple if it's small
enough, but in this case we would want it to fire even on very narrow rows.

One design possibility I considered was doing this in heap_deform_tuple and
heap_form_tuple. Basically skipping the extra deform/form_tuple cycle in the
toast code. I had considered having heap_deform_tuple palloc copies of these
data before returning them. But that has the same problems.

The other problem is that there may be places in the code that receive a datum
from someplace where they have every right to expect it not to be toasted. For
example, plpgsql deforming a tuple they just formed, or even as the return
value from a function. They might be quite surprised to receive a toasted
tuple.

Note also that that's going to force us to palloc and memcpy these data. Are
there going to be circumstances where existing code where this changes the
memory context lifetime of some data? If, for example, soemthing like the inet
code knows its arguments can never be large enough to get toasted and doesn't
do a FREE_IF_COPY on its btree operator arguments.


> After we have that working, we can work on offering alternative macros
> that let specific functions avoid the overhead of conversion between
> 4-byte headers and short ones, in much the same way that there are TOAST
> macros now that let specific functions get down-and-dirty with the
> out-of-line TOAST representation.  But first we have to get to the point
> where 4-byte-header datums can be distinguished from short-header datums
> by inspection; and that requires either network byte order in the 4-byte
> length word or some other change in its representation.
> 
> > Actually I think neither htonl nor bitshifting the entire 4-byte word is 
> > going
> > to really work here. Both will require 4-byte alignment.
> 
> And your point is what?  The 4-byte form can continue to require
> alignment, and *will* require it in any case, since many of the affected
> datatypes expect alignment of the data within the varlena.  The trick is
> that when we are examining a non-aligned address within a tuple, we have
> to be able to tell whether we are looking at the first byte of a
> short-header datum (not aligned) or a pad byte.  This is easily done,
> for instance by decreeing that pad bytes must be zeroes.

Well if we're doing it in toast then the alignment of the payload really
doesn't matter at all. It'll be realigned after detoasting anyways.

What I had had in mind was to prohibit using smaller headers than the
alignment of the data type. But that was on the assumption we would continue
to use the compressed header in memory and not copy it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> But if you are walking through attributes, how do you know to look at
> the next byte or the next aligned byte?  We have to force zeros in
> there?

Yup: pad bytes must be zeroes (they are already) and a 1-byte-header
can't be a zero (easily done if its length includes itself).  So the
tuple-walking code would do something like

if (looking-at-a-zero && not-at-4-byte-boundary)
advance to next 4-byte boundary;
check current byte to determine if 1-byte or 4-byte header;

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Heikki Linnakangas wrote:
> Bruce Momjian wrote:
> > Heikki Linnakangas wrote:
> >> We would still require all datums with a 4-byte header to be 4-byte 
> >> aligned, right? When reading, you would first check if it's a compressed 
> >> or uncompressed header. If compressed, read the 1 byte header, if 
> >> uncompressed, read the 4-byte header and do htonl or bitshifting. No 
> >> need to do htonl or bitshifting on unaligned datums.
> > 
> > I am not sure how to handle the alignment issue.  If we require 1-byte
> > headers to be 4-byte aligned, we lose a lot of the benefits of the
> > 1-byte header.
> 
> Why would we require that?
> 
> I don't see a problem with having 4-byte header 4-byte aligned, and 
> 1-byte headers not aligned. The first access to the header is to check 
> if it's a 4 or 1 byte header. That's a 1 byte wide access, requiring no 
> alignment. After that you know which one it is.

But if you are walking through attributes, how do you know to look at
the next byte or the next aligned byte?  We have to force zeros in
there?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Magnus Hagander
On Tue, Feb 13, 2007 at 04:29:16PM +0100, Magnus Hagander wrote:
> The latest set of XML changes (I think latest, at least fairly recent)
> broke the win32vc build with asserts enabled. The line:
>   Assert(fully_escaped || !escape_period);
> 
> From what I can tell, this is because the Assert() puts code (the do {}
> loop) *before* the declaration of StringInfoData buf, which is not
> permitted.
> 
> Attached patch seems to fix this. Can someone confirm this is correct
> before I put it in?

I just realised I should of course move the comment as well :-) Thus,
the attached patch is more correct. 

//Magnus

Index: src/backend/utils/adt/xml.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.27
diff -c -r1.27 xml.c
*** src/backend/utils/adt/xml.c 11 Feb 2007 22:18:15 -  1.27
--- src/backend/utils/adt/xml.c 13 Feb 2007 15:38:16 -
***
*** 1320,1335 
  char *
  map_sql_identifier_to_xml_name(char *ident, bool fully_escaped, bool 
escape_period)
  {
/*
 * SQL/XML doesn't make use of this case anywhere, so it's
 * probably a mistake.
 */
Assert(fully_escaped || !escape_period);

- #ifdef USE_LIBXML
-   StringInfoData buf;
-   char *p;
-
initStringInfo(&buf);

for (p = ident; *p; p += pg_mblen(p))
--- 1320,1335 
  char *
  map_sql_identifier_to_xml_name(char *ident, bool fully_escaped, bool 
escape_period)
  {
+ #ifdef USE_LIBXML
+   StringInfoData buf;
+   char *p;
+
/*
 * SQL/XML doesn't make use of this case anywhere, so it's
 * probably a mistake.
 */
Assert(fully_escaped || !escape_period);

initStringInfo(&buf);

for (p = ident; *p; p += pg_mblen(p))

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-13 Thread Pavan Deolasee

On 2/13/07, Tom Lane <[EMAIL PROTECTED]> wrote:


Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Hannu Krosing wrote:
>> Are we actually doing that ? I.E are null bitmaps really allocated in 1
>> byte steps nowadays ?

> Yes.

Not really; we still have to MAXALIGN at the end of the bitmap.  The
point is that you can get 8 bits in there before paying the first
additional MAXALIGN increment.

It's all moot anyway since 8 bits isn't enough for a pointer ...



We could live with 8 bits actually. We can store only the least
significant 8 bits of the pointer. It would point to a set of tuples and
we may need to search within that set to find the required tuple.
This would still be better than scanning the entire page.

But I agree that utilizing those 8 bits would result in a penalty
for tables with few columns.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Heikki Linnakangas

Bruce Momjian wrote:

Heikki Linnakangas wrote:
We would still require all datums with a 4-byte header to be 4-byte 
aligned, right? When reading, you would first check if it's a compressed 
or uncompressed header. If compressed, read the 1 byte header, if 
uncompressed, read the 4-byte header and do htonl or bitshifting. No 
need to do htonl or bitshifting on unaligned datums.


I am not sure how to handle the alignment issue.  If we require 1-byte
headers to be 4-byte aligned, we lose a lot of the benefits of the
1-byte header.


Why would we require that?

I don't see a problem with having 4-byte header 4-byte aligned, and 
1-byte headers not aligned. The first access to the header is to check 
if it's a 4 or 1 byte header. That's a 1 byte wide access, requiring no 
alignment. After that you know which one it is.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[HACKERS] XML changes broke assert-enabled vcbuild

2007-02-13 Thread Magnus Hagander
The latest set of XML changes (I think latest, at least fairly recent)
broke the win32vc build with asserts enabled. The line:
Assert(fully_escaped || !escape_period);

>From what I can tell, this is because the Assert() puts code (the do {}
loop) *before* the declaration of StringInfoData buf, which is not
permitted.

Attached patch seems to fix this. Can someone confirm this is correct
before I put it in?

//Magnus

Index: src/backend/utils/adt/xml.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.27
diff -c -r1.27 xml.c
*** src/backend/utils/adt/xml.c 11 Feb 2007 22:18:15 -  1.27
--- src/backend/utils/adt/xml.c 13 Feb 2007 15:27:02 -
***
*** 1324,1335 
 * SQL/XML doesn't make use of this case anywhere, so it's
 * probably a mistake.
 */
-   Assert(fully_escaped || !escape_period);
  
  #ifdef USE_LIBXML
StringInfoData buf;
char *p;
  
initStringInfo(&buf);
  
for (p = ident; *p; p += pg_mblen(p))
--- 1324,1336 
 * SQL/XML doesn't make use of this case anywhere, so it's
 * probably a mistake.
 */
  
  #ifdef USE_LIBXML
StringInfoData buf;
char *p;
  
+   Assert(fully_escaped || !escape_period);
+ 
initStringInfo(&buf);
  
for (p = ident; *p; p += pg_mblen(p))

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Heikki Linnakangas wrote:
> Gregory Stark wrote:
> > "Tom Lane" <[EMAIL PROTECTED]> writes:
> >> For example it'd be easy to implement the previously-discussed design
> >> involving storing uncompressed length words in network byte order:
> >> SET_VARLENA_LEN does htonl() and VARSIZE does ntohl() and nothing else in
> >> the per-datatype functions needs to change. Another idea that we were
> >> kicking around is to make an explicit distinction between little-endian and
> >> big-endian hardware: on big-endian hardware, store the two TOAST flag bits
> >> in the MSBs as now, but on little-endian, store them in the LSBs, shifting
> >> the length value up two bits. This would probably be marginally faster than
> >> htonl/ntohl depending on hardware and compiler intelligence, but either way
> >> you get to guarantee that the flag bits are in the physically first byte,
> >> which is the critical thing needed to be able to tell the difference 
> >> between
> >> compressed and uncompressed length values.
> > 
> > Actually I think neither htonl nor bitshifting the entire 4-byte word is 
> > going
> > to really work here. Both will require 4-byte alignment. Instead I think we
> > have to access the length byte by byte as a (char*) and do arithmetic. Since
> > it's the pointer being passed to VARSIZE that isn't too hard, but it might
> > perform poorly.
> 
> We would still require all datums with a 4-byte header to be 4-byte 
> aligned, right? When reading, you would first check if it's a compressed 
> or uncompressed header. If compressed, read the 1 byte header, if 
> uncompressed, read the 4-byte header and do htonl or bitshifting. No 
> need to do htonl or bitshifting on unaligned datums.

I am not sure how to handle the alignment issue.  If we require 1-byte
headers to be 4-byte aligned, we lose a lot of the benefits of the
1-byte header.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> I don't really see a way around it though. Places that fill in VARDATA before
> the size (formatting.c seems to be the worst case) will just have to be
> changed and it'll be a fairly fragile point.

No, we're not going there: it'd break too much code now and it'd be a
continuing source of bugs for the foreseeable future.  The sane way to
design this is that

(1) code written to existing practice will always generate 4-byte
headers.  (Hence, VARDATA() acts the same as now.)  That's the format
that generally gets passed around in memory.

(2) creation of a short header is handled by the TOAST code just before
the tuple goes to disk.

(3) replacement of a short header with a 4-byte header is considered
part of de-TOASTing.

After we have that working, we can work on offering alternative macros
that let specific functions avoid the overhead of conversion between
4-byte headers and short ones, in much the same way that there are TOAST
macros now that let specific functions get down-and-dirty with the
out-of-line TOAST representation.  But first we have to get to the point
where 4-byte-header datums can be distinguished from short-header datums
by inspection; and that requires either network byte order in the 4-byte
length word or some other change in its representation.

> Actually I think neither htonl nor bitshifting the entire 4-byte word is going
> to really work here. Both will require 4-byte alignment.

And your point is what?  The 4-byte form can continue to require
alignment, and *will* require it in any case, since many of the affected
datatypes expect alignment of the data within the varlena.  The trick is
that when we are examining a non-aligned address within a tuple, we have
to be able to tell whether we are looking at the first byte of a
short-header datum (not aligned) or a pad byte.  This is easily done,
for instance by decreeing that pad bytes must be zeroes.

I think we should probably consider making use of different alignment
codes for different varlena datatypes.  For instance the geometry types
probably will still need align 'd' since they contain doubles; this may
mean that we should just punt on any short-header optimization for them.
But text and friends could have align 'c' showing that they need no
padding and would be perfectly happy with a nonaligned VARDATA pointer.
(Actually, maybe we should only do this whole thing for 'c'-alignable
data types?  But NUMERIC is a bit of a problem, it'd like
's'-alignment.  OTOH we could just switch NUMERIC to an all-two-byte
format that's independent of TOAST per se.)

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Magnus Hagander
On Tue, Feb 13, 2007 at 09:44:03AM -0500, Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Magnus Hagander wrote:
> >> Could the "new style" macros be back-ported to previous releases in case
> >> we do this?
> 
> > Yes, Tom and I talked about this.  It could appear in the next minor
> > release of all branches.
> 
> I don't really see the point of that.  Third-party authors who want
> their code to be backwards-compatible would do something like
> 
> #ifndef SET_VARLENA_LEN
> #define SET_VARLENA_LEN(var,len)  (VARATT_SIZEP(var) = (len))
> #endif
> 
> While we could provide this same macro in later updates of the current
> release branches, those authors are still going to want to include the
> above in their code so as to be able to compile against existing
> releases.  Therefore there's not really much point in us doing it too.

It'd be a help to those who wouldn't be building against releases with
known security issues in them for one ;-)

Sure, it's not important or a dealbreaker or so, but it would be
convenient.

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Foreign keys for non-default datatypes, redux

2007-02-13 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> As far as I understood the proposal, tgargs wouldn't go away, it would
> just not be populated for RI triggers.

Yes, of course.  I wasn't suggesting that we take away the ability to
pass arguments to triggers in general.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Peter Eisentraut
Gregory Stark wrote:
> a) To have two sets of macros, one of which, VARATT_DATA and
> VARATT_SIZEP are for constructing new tuples and behaves exactly as
> it does now. So you always construct a four-byte header datum. Then
> in heap_form*tuple we check if you can use a shorter header and
> convert. VARDATA/VARSIZE would be for looking at existing datums and
> would interpret the header bits.

Has any thought been given to headers *longer* than four bytes?  I don't 
exactly recall a flood of field reports that one gigabyte per datum is 
too little, but as long as the encoding of variable length data is 
changed, one might as well prepare a little for the future.

Of course, that would put a dent into any plan that wants to normalize 
the header to four bytes somewhere along the way.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Magnus Hagander wrote:
>> Could the "new style" macros be back-ported to previous releases in case
>> we do this?

> Yes, Tom and I talked about this.  It could appear in the next minor
> release of all branches.

I don't really see the point of that.  Third-party authors who want
their code to be backwards-compatible would do something like

#ifndef SET_VARLENA_LEN
#define SET_VARLENA_LEN(var,len)  (VARATT_SIZEP(var) = (len))
#endif

While we could provide this same macro in later updates of the current
release branches, those authors are still going to want to include the
above in their code so as to be able to compile against existing
releases.  Therefore there's not really much point in us doing it too.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Gregory Stark
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:

>> Actually I think neither htonl nor bitshifting the entire 4-byte word is 
>> going
>> to really work here. Both will require 4-byte alignment. Instead I think we
>> have to access the length byte by byte as a (char*) and do arithmetic. Since
>> it's the pointer being passed to VARSIZE that isn't too hard, but it might
>> perform poorly.
>
> We would still require all datums with a 4-byte header to be 4-byte aligned,
> right? When reading, you would first check if it's a compressed or 
> uncompressed
> header. If compressed, read the 1 byte header, if uncompressed, read the 
> 4-byte
> header and do htonl or bitshifting. No need to do htonl or bitshifting on
> unaligned datums.

It's not easy to require datums with 4-byte headers to be 4-byte aligned. How
do you know where to look for the bits to show it's an uncompressed header if
you don't know where it's aligned yet?

It could be done if you rule that if you're on an unaligned byte and see a \0
then scan forward until the aligned byte. But that seems just as cpu expensive
as just doing the arithmetic. And wastes space to boot.

I'm thinking VARSIZE would look something like:

#define VARSIZE((datum)) \
   uint8*)(datum))[0] & 0x80) ? 
(((uint8*)(datum))[0] & 0x7F) : \
(((uint8*)(datum))[0]<< 24 | ((uint8*)(datum))[1]<<16 | 
((uint8*)(datum))[2]<<8 | ((uint8*)(datum))[0]))

Which is effectively the same as doing ntohl except that it only works for
left hand sides -- luckily VARSIZE always has a lhs. It also works for
unaligned accesses. It's going to be fairly slow but no slower than doing an
unaligned access looking at nul padding bytes.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-13 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Hannu Krosing wrote:
>> Are we actually doing that ? I.E are null bitmaps really allocated in 1
>> byte steps nowadays ?

> Yes.

Not really; we still have to MAXALIGN at the end of the bitmap.  The
point is that you can get 8 bits in there before paying the first
additional MAXALIGN increment.

It's all moot anyway since 8 bits isn't enough for a pointer ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Heikki Linnakangas

Gregory Stark wrote:

"Tom Lane" <[EMAIL PROTECTED]> writes:

For example it'd be easy to implement the previously-discussed design
involving storing uncompressed length words in network byte order:
SET_VARLENA_LEN does htonl() and VARSIZE does ntohl() and nothing else in
the per-datatype functions needs to change. Another idea that we were
kicking around is to make an explicit distinction between little-endian and
big-endian hardware: on big-endian hardware, store the two TOAST flag bits
in the MSBs as now, but on little-endian, store them in the LSBs, shifting
the length value up two bits. This would probably be marginally faster than
htonl/ntohl depending on hardware and compiler intelligence, but either way
you get to guarantee that the flag bits are in the physically first byte,
which is the critical thing needed to be able to tell the difference between
compressed and uncompressed length values.


Actually I think neither htonl nor bitshifting the entire 4-byte word is going
to really work here. Both will require 4-byte alignment. Instead I think we
have to access the length byte by byte as a (char*) and do arithmetic. Since
it's the pointer being passed to VARSIZE that isn't too hard, but it might
perform poorly.


We would still require all datums with a 4-byte header to be 4-byte 
aligned, right? When reading, you would first check if it's a compressed 
or uncompressed header. If compressed, read the 1 byte header, if 
uncompressed, read the 4-byte header and do htonl or bitshifting. No 
need to do htonl or bitshifting on unaligned datums.



The important point here is that VARSIZE() still works, so only code that
creates a new varlena value need be affected, not code that examines one.


So what would VARSIZE() return, the size of the payload plus VARHDRSZ
regardless of what actual size the header was? That seems like it would break
the least existing code though removing all the VARHDRSZ offsets seems like it
would be cleaner.


My vote would be to change every caller. Though there's a lot of 
callers, it's a very simple change.


To make it posible to compile an external module against 8.2 and 8.3, 
you could have a simple ifdef block to map the new macro to old 
behavior. Or we could backport the macro definitions as Magnus suggested.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Bruce Momjian
Magnus Hagander wrote:
> On Mon, Feb 12, 2007 at 11:19:14PM -0500, Tom Lane wrote:
> > Gregory Stark <[EMAIL PROTECTED]> writes:
> 
> > By my count there are only 170 uses of VARATT_SIZEP in the entire
> > backend (including contrib) so this is not an especially daunting
> > change.  It would break existing user-written functions that return
> > varlena values, but the fix wouldn't be painful for them either.
> 
> Could the "new style" macros be back-ported to previous releases in case
> we do this? That way module maintainers wouldn't need to maintain
> two different sets of code for it - they could use the new style and
> just compile it against an older version of pg?

Yes, Tom and I talked about this.  It could appear in the next minor
release of all branches.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> If we replaced that line with something like
>
>   SET_VARLENA_LEN(result, len + VARHDRSZ);
>
> then we'd have a *whole* lot more control.  

I think that part was already clear. The problem was in VARDATA.

I don't really see a way around it though. Places that fill in VARDATA before
the size (formatting.c seems to be the worst case) will just have to be
changed and it'll be a fairly fragile point.

> For example it'd be easy to implement the previously-discussed design
> involving storing uncompressed length words in network byte order:
> SET_VARLENA_LEN does htonl() and VARSIZE does ntohl() and nothing else in
> the per-datatype functions needs to change. Another idea that we were
> kicking around is to make an explicit distinction between little-endian and
> big-endian hardware: on big-endian hardware, store the two TOAST flag bits
> in the MSBs as now, but on little-endian, store them in the LSBs, shifting
> the length value up two bits. This would probably be marginally faster than
> htonl/ntohl depending on hardware and compiler intelligence, but either way
> you get to guarantee that the flag bits are in the physically first byte,
> which is the critical thing needed to be able to tell the difference between
> compressed and uncompressed length values.

Actually I think neither htonl nor bitshifting the entire 4-byte word is going
to really work here. Both will require 4-byte alignment. Instead I think we
have to access the length byte by byte as a (char*) and do arithmetic. Since
it's the pointer being passed to VARSIZE that isn't too hard, but it might
perform poorly.

> The important point here is that VARSIZE() still works, so only code that
> creates a new varlena value need be affected, not code that examines one.

So what would VARSIZE() return, the size of the payload plus VARHDRSZ
regardless of what actual size the header was? That seems like it would break
the least existing code though removing all the VARHDRSZ offsets seems like it
would be cleaner.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Ooops ... seems we need a re-release pronto

2007-02-13 Thread Martijn van Oosterhout
On Mon, Feb 12, 2007 at 08:08:31PM -0500, Robert Treat wrote:
> > In my memory I remember a site that displayed the code coverage of the
> > regression tests, but I can't find it now. Does anybody know?
> >
> 
> Are you thinking of spikesource? According to thier numbers, we currently 
> cover about 40% of the code base. 
> 
> http://developer.spikesource.com/info/search.php?c=POSTGRESQL&view=details

Yes, that was it, except I can't access the details, the redirection is
broken. However, with that info it would be nice to see which areas
could be better covered.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Foreign keys for non-default datatypes, redux

2007-02-13 Thread Florian G. Pflug

Robert Treat wrote:

On Saturday 10 February 2007 13:59, Tom Lane wrote:

Stephan Szabo <[EMAIL PROTECTED]> writes:

I'd say we probably want to keep the tgargs info for at least a version
or two after changing the implementation.  Getting rid of using the args
info sounds like a good idea.

We whack the catalogs around in incompatible ways in every release.  I'm
willing to keep filling tgargs if someone can point to a real use-case,
but not just because there might be code out there somewhere using it.



I'm pretty sure we use tgargs in phppgadmin, though exactly why escapes me... 
I am thinking it would be to display a triggers arguments?   Assuming we can 
still get all the same information one way or another I suppose we can update 
our code, though right now that code is pretty well intermixed with the 
normal function code iirc (I don't think it has been updated at all in the 
8.x series, so my memory is pretty fuzzy on this), so if I could avoid 
changing it I would... 


As far as I understood the proposal, tgargs wouldn't go away, it would just not
be populated for RI triggers. So as long as pgadmin3 doesn't use tgargs to get
information about constraints, pgadmin would be fine I believe...

greetings, Florian Pflug


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] HOT for PostgreSQL 8.3

2007-02-13 Thread Heikki Linnakangas

Hannu Krosing wrote:
But actually that 1 free byte in the header is not currently just waste 
of space. If you have any nulls in your tuple, there's going to be a 
null bitmap in addition to the header. 1 byte is conveniently enough to 
store the null bitmap for a table with max 8 columns,


Are we actually doing that ? I.E are null bitmaps really allocated in 1
byte steps nowadays ?


Yes.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[HACKERS] 8.1 stats issues on Win32

2007-02-13 Thread Magnus Hagander
I've been looking at backporting the stats fix committed to head and 8.2
into 8.1, but realised that it's just not going to work. 8.1 still uses
the dual stats processor stuff, which means that the simplification just
is not possible.

The most obvious result is that autovacuum is very likely to fail on 8.1
if your system load is high enough. (all of stats fail of course, but
autovac is a very common user of this)

Should we note this somewhere?

Oh, and if we were "looking for reasons" to deprecate 8.1, this sounds
like a pretty good one for me. I still think we should keep patchin it,
but it is a very good reason to encourage our users to switch to 8.2.

Now, we could try to fix it there, but we've seen a lot of issues since
day one coming from the "inherit socket in two steps", so even if we can
get this one fix, there could be more lurking around in the dual-process
model. I personally don't think it's worth investing the required time
into fixing that on 8.1.

//Magnus

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Teodor Sigaev

Precise definition for "latin" in C locale please. Are you saying that
single byte encoding with range 0-7f? is "latin"? If so, it seems they
are exacty same as ASCII.


p_islatin returns true for ASCII alpha characters.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Tatsuo Ishii
> > I know. My guess is the parser does not read the stop word file at
> > least with default configuration.
> 
> Parser should not read stopword file: its deal for dictionaries.

I'll come up with more detailed info, explaining why stopword file is
not read.

> > So if a character is not ASCII, it returns 0 even if p_isalpha returns
> > 1. Is this what you expect?
> No, p_islatin should return true only for latin characters, not for national 
> ones.

Precise definition for "latin" in C locale please. Are you saying that
single byte encoding with range 0-7f? is "latin"? If so, it seems they
are exacty same as ASCII.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> > In our case, we added JAPANESE_STOP_WORD into english.stop then:
> > select to_tsvector(JAPANESE_STOP_WORD)
> > which returns words even they are in JAPANESE_STOP_WORD.
> > And with the patches the problem was solved.
> 
> Pls, show your configuration for lexemes/dictionaries. I suspect that you 
> have 
> en_stem dictionary on for lword lexemes type. Better way is to use 'simple' 
> distionary (it's support stopword the same way as en_stem does) and set it for
> nlword, word, part_hword, nlpart_hword, hword, nlhword lexeme's types. Note, 
> leave unchanged en_stem for any latin word.
> 
> -- 
> Teodor Sigaev   E-mail: [EMAIL PROTECTED]

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Variable length varlena headers redux

2007-02-13 Thread Magnus Hagander
On Mon, Feb 12, 2007 at 11:19:14PM -0500, Tom Lane wrote:
> Gregory Stark <[EMAIL PROTECTED]> writes:

> By my count there are only 170 uses of VARATT_SIZEP in the entire
> backend (including contrib) so this is not an especially daunting
> change.  It would break existing user-written functions that return
> varlena values, but the fix wouldn't be painful for them either.

Could the "new style" macros be back-ported to previous releases in case
we do this? That way module maintainers wouldn't need to maintain
two different sets of code for it - they could use the new style and
just compile it against an older version of pg?


//Magnus

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Teodor Sigaev

I know. My guess is the parser does not read the stop word file at
least with default configuration.


Parser should not read stopword file: its deal for dictionaries.



So if a character is not ASCII, it returns 0 even if p_isalpha returns
1. Is this what you expect?

No, p_islatin should return true only for latin characters, not for national 
ones.



In our case, we added JAPANESE_STOP_WORD into english.stop then:
select to_tsvector(JAPANESE_STOP_WORD)
which returns words even they are in JAPANESE_STOP_WORD.
And with the patches the problem was solved.


Pls, show your configuration for lexemes/dictionaries. I suspect that you have 
en_stem dictionary on for lword lexemes type. Better way is to use 'simple' 
distionary (it's support stopword the same way as en_stem does) and set it for
nlword, word, part_hword, nlpart_hword, hword, nlhword lexeme's types. Note, 
leave unchanged en_stem for any latin word.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] OT: IRC nick to real world mapping

2007-02-13 Thread Niels Breet
On Tue, February 13, 2007 07:39, Tom Lane wrote:
> Lukas Kahwe Smith <[EMAIL PROTECTED]> writes:
>
>> Bruce Momjian wrote:
>>
>>> Lukas Kahwe Smith wrote:
>>>
 http://developer.postgresql.org/index.php/IRC2RWNames

>>>
>>> Ah, excellent.  Should we put this in the IRC topic line?
>>>
>
>> if there is still some space in the topic ... sure!
>
> Is there any cross-check on the correctness of this list?
>
>
> (Hint: if someone shows up in IRC claiming to be me, he's more than
> likely lying.)
>

That is what registering your nick is supposed to help for.
http://freenode.net/faq.shtml#registering

- Niels



---(end of broadcast)---
TIP 6: explain analyze is your friend