Re: [HACKERS] [PATCHES] WAL logging freezing

2006-11-01 Thread Simon Riggs
On Tue, 2006-10-31 at 11:04 -0500, Tom Lane wrote:

> It seems that we're converging on the conclusion that not truncating
> clog early is the least bad alternative.  This has the advantage of
> making things a lot simpler --- we won't need to track minxid at all.
> Allow me to summarize what I think has to happen:
> 
> * VACUUM will determine a freeze cutoff XID the same way it does now,
> except that instead of using a hard-wired freeze window of 1G
> transactions, we'll either reduce the window to (say) 100M transactions
> or provide a GUC variable that can be adjusted over some reasonable
> range.
> 
> * All XIDs present in the table that are older than the cutoff XID will
> be replaced by FrozenXid or InvalidXid as required, and such actions
> will be WAL-logged.  (I think we need to consider all 3 of xmin, xmax,
> and xvac here.)
> 
> * On successful completion, the cutoff XID is stored in
> pg_class.relvacuumxid, and pg_database.datvacuumxid is updated
> if appropriate.  (The minxid columns are now useless, but unless there
> is another reason to force initdb before 8.2, I'm inclined to leave them
> there and unused.  We can remove 'em in 8.3.)
> 
> * pg_clog is truncated according to the oldest pg_database.datvacuumxid.
> We should WAL-log this action, because replaying such an entry will
> allow a PITR slave to truncate its own clog and thereby avoid wraparound
> conflicts.  Note that we no longer need a checkpoint before truncating
> --- what we need is XLogFlush, instead.  ("WAL before data")
> 
> These changes get us back into the regime where the hint bits truly are
> hints, because the underlying pg_clog data is still there, both in a
> master database and in a PITR slave.  So we don't need to worry about
> WAL-logging hint bits.  We also avoid needing any flushes/fsyncs or
> extra checkpoints.  The added WAL volume should be pretty minimal,
> because only tuples that have gone untouched for a long time incur extra
> work.  The added storage space for pg_clog could be annoying for a small
> database, but reducing the freeze window ameliorates that objection.

Complete agreement, nice summary.

Do we need another GUC? I thought your observation about a PITR slave
having that set lower than its master was still unresolved. Perhaps we
should set it via pg_config_manual.h, so it's much less likely that
the two would be out of step?

Thanks to Heikki, for spotting the original bug before it was reported
in production.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Extended protocol logging

2006-11-01 Thread Simon Riggs
On Tue, 2006-10-31 at 23:51 -0500, Tom Lane wrote:
> Dave Cramer <[EMAIL PROTECTED]> writes:
> > These are logs from Beta 2.
> 
> With what logging settings?  log_duration has rather different behavior
> from what it used to do.

I think it would be useful to have the log results from a test program
in the protocol section, so interface designers know what will get
logged from various protocol sequences.

That way JDBC people and others can interpret what their own interfaces
should look like for 8.2.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Teodor Sigaev

> We are trying to get something faster than ~ '%foo%';
> which Tsearch2 does not give us :)


Hmm, why not with GIN? :) Although GIN doesn't currently support prefix
search, it's still possible to use it for this.


Brain storm method:

Develop a dictionary which returns all substrings of a lexeme; for
example, for the word foobar it will return 'foobar fooba foob foo fo
oobar ooba oob oo obar oba ob bar ba ar'. Then make a GIN functional
index over your column (to save disk space).

So your query would look like:
select ... where to_tsvector(text_column) @@ 'foo';
Notes:
Search time with GIN depends only weakly on the number of words (unlike
tsearch2/GiST), but row insertion may be quite slow.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Joshua D. Drake
Teodor Sigaev wrote:
>> We are trying to get something faster than ~ '%foo%';
>> Which Tsearch2 does not give us :)
> 
> Hmm, why not with GIN? :) Although, right now GIN doesn't support prefix
> search, it's possible to use it.

Well they run 8.1 :)

Joshua D. Drake

> 
> Brain storm method:
> 
> Develop a dictionary which returns all substring for lexeme, for example
> for word foobar it will be 'foobar fooba foob foo fo oobar ooba oob oo
> obar oba ob bar ba ar'. And make GIN functional index over your column
> (to save disk space).
> So, your query will be looked as
> select ... where to_tsvector(text_column) @@ 'foo';
> Notices:
> Time of search in GIN weak depend on number of words (opposite to
> tsearch2/GiST), but insertion of row may be slow enough
> 


-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCHES] WAL logging freezing

2006-11-01 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> Do we need another GUC? I thought your observation about a PITR slave
> having that set lower than its master still remains unresolved.

No, AFAICS that's not an issue in this design.  The facts-on-the-ground
are whatever is recorded in pg_class.relvacuumxid, and whether a
particular table has been vacuumed with a shorter or longer freeze
window doesn't affect correctness.  In particular, a slave with
ambitions towards having a shorter window would still be unable to
truncate its clog before having re-vacuumed everything.  

So, not only could we have a GUC variable, but it could be USERSET;
there's no breakage risk as long as we constrain the value range to
something sane.

It strikes me that VACUUM FREEZE could be replaced by
SET vacuum_freeze_limit = 0
VACUUM ...
which would be a good thing because the FREEZE keyword has to be
partially reserved in this syntax, and that is contrary to spec.

regards, tom lane



Re: [HACKERS] Extended protocol logging

2006-11-01 Thread Dave Cramer


On 31-Oct-06, at 11:51 PM, Tom Lane wrote:

> Dave Cramer <[EMAIL PROTECTED]> writes:
>> These are logs from Beta 2.
>
> With what logging settings?  log_duration has rather different behavior
> from what it used to do.

To be honest I don't know, and looking at the logs I suspect that this
is just logging duration; however, it's still looking pretty ambiguous.
(I will get the settings; my client is on the other side of the world.)


What exactly does it mean?  The total operation was 4.365ms, the parse
was .672, and the bind was .128?  Is it possible for different
connections to be interleaved?  I still think having the parse, bind,
and execute messages show the statement name makes sense, if for no
other reason than clarity.  I would think writing a log parser would be
fairly challenging without them.


Dave


regards, tom lane



[HACKERS] [Fwd: pg_migrator: in-place upgrade tool at pgFoundry]

2006-11-01 Thread korryd





EnterpriseDB has created a new project at pgFoundry - http://pgfoundry.org/projects/pg-migrator/

pg_migrator is a tool that can upgrade existing data in place, without the usual dump/reload cycle.

The pg_migrator project site (at pgFoundry) contains a complete implementation of the functionality described below as well as a copy of the introductory document that I've included in this message.

We would welcome feedback on implementation details and ideas for improvements.  

    -- Korry






--
  Korry Douglas    [EMAIL PROTECTED]
  EnterpriseDB  http://www.enterprisedb.com




=

--
PG_MIGRATOR: IN-PLACE UPGRADES FOR POSTGRESQL 
--

Upgrading a PostgreSQL database from one release to another can be an
expensive process. For minor upgrades, you can simply install new executables
and forget about upgrading existing data. But for major upgrades, you have to
export all of your data (using pg_dump), install the new release, run initdb
to create a new cluster, and then import your old data. If you have a lot of
data, that can take a considerable amount of time (hours, even days). If you have
too much data, you may have to buy more storage since you need enough room to
hold the original data plus the exported data.

EnterpriseDB is contributing a new tool, pg_migrator, that can reduce the
amount of time (and disk space) required for many upgrades.

--
WHAT IT DOES
--

pg_migrator is a tool (not a complete solution) that performs an in-place
upgrade of existing data. For many upgrades, the data stored in user-defined
tables does not have to change when moving from one version of PostgreSQL to
another. Some upgrades require changes in the on-disk representation of data;
pg_migrator cannot help in those upgrades. However, many upgrades require no
changes to the on-disk representation of a user-defined table and, in those
cases, pg_migrator will move existing user-defined tables from the old
database cluster into the new cluster.

There are two factors that determine whether an in-place upgrade is practical.

Every table in a cluster (actually, every table created by a given version)
shares the same infrastructure layout. By infrastructure, we mean the on-disk
representation of the table headers and trailers and the on-disk
representation of tuple headers. If the infrastructure changes between the old
version of PostgreSQL and the new version, pg_migrator cannot move existing
tables to the new cluster (you'll have to pg_dump the old data and then import
that data into the new cluster).

Occasionally, a PostgreSQL release introduces a change to the on-disk
representation of a data type. For example, PostgreSQL version 8.2 changes the
layout for values of type INET and CIDR. If you are not storing any values of
type INET (or CIDR), pg_migrator can upgrade any table in your cluster. If you
are storing values of type INET (or CIDR) in some tables, you must
export/import those tables, but pg_migrator can in-place upgrade other tables
(a change in infrastructure means that you have to export/import every table).

If a new version of PostgreSQL does not change the infrastructure layout and
does not change the on-disk representation of a data type (that you are
using), you can use pg_migrator to save a tremendous amount of time (and
disk space).

--
HOW IT WORKS
--

To use pg_migrator during an upgrade, you start by installing a fresh cluster
(using the newest version) in a new directory. When you've finished installing
the new version, the new cluster will contain the new executables (postmaster,
pg_dump, ...) and the usual template0, template1, and postgres databases,
but no user-defined tables. At this point, you can shut down the new
postmaster (we presume that you shut down the old postmaster prior to
creating the new cluster) and invoke pg_migrator.

When pg_migrator starts, it runs through a verification process that ensures
that all required executables (the old postmaster, the new postmaster,
pg_dump, pg_resetxlog, ...) are present and contain the expected version
numbers. The verification process also checks the old and new $PGDATA
directories to ensure that the expected files and subdirectories (base,
global, pg_clog, pg_xlog, ...) are in place.  If the verification process
succeeds, pg_migrator starts the old postmaster and runs pg_dumpall
--schema-only to capture the metadata contained in the old cluster. The script
produced by pg_dumpall will be used in a 

[HACKERS] ¿¿¿past chunk end???

2006-11-01 Thread luis garcia
Hi. We have been making some modifications to the Postgres catalog, but
there seems to be a problem with one of the fields we just added,
exactly at the time of inserting the right values into that specific
field.

1. This is what happens:

**
template1=# CREATE TABLE test() AS VALID EVENT 2 WITH FREQUENCY
'00:00:50.00' AS REAL TIME;
WARNING:  detected write past chunk end in CacheMemoryContext 0xb7212e0c
CREATE TABLE
**

The creation is possible; in fact, we have added 7 fields to the
Postgres catalog (in pg_class) and there are no problems with the other
fields' values, just with this one.

2. The field is defined this way:

// FILE: parsenodes.h
// STRUCTURE: FrequencyDefinition

typedef struct FrequencyDefinition
{
    NodeTag          type;
    bool             timeFrequency;
    FrequencyType    fttype;
    const char      *frequencyMeasure; // THIS ONE
} FrequencyDefinition;

3. The values are inserted this way:

// FILE: relcache.c
// FUNCTION: RelationBuildLocalRelation

**
rel->rd_rel->relhasfrequency = freqDef->timeFrequency;  // Perfect assignment

if ((freqDef->fttype) == REAL_TIME)
    namestrcpy(&rel->rd_rel->relfrequencytype, (const char *) "REAL TIME");
else
    namestrcpy(&rel->rd_rel->relfrequencytype, (const char *) "HISTORIC");

namestrcpy(&rel->rd_rel->relfrequency, freqDef->frequencyMeasure);
**

We can see the other values with a SELECT over pg_class:

template1=# SELECT relname, relhasfrequency, relfrequencytype,
relfrequency FROM pg_class WHERE relhasfrequency=true;

 relname | relhasfrequency | relfrequencytype | relfrequency
---------+-----------------+------------------+--------------
 test    | t               | REAL TIME        | 0
(1 row)

We have realized that the assignment takes just the first CHAR from
'00:00:50.00', and we tested the same assignment like this:

***
namestrcpy(&rel->rd_rel->relfrequencytype, freqDef->frequencyMeasure);
//namestrcpy(&rel->rd_rel->relfrequency, freqDef->frequencyMeasure);
***

And the result of the same query was:

template1=# SELECT relname, relhasfrequency, relfrequencytype,
relfrequency FROM pg_class WHERE relhasfrequency=true;

 relname | relhasfrequency | relfrequencytype | relfrequency
---------+-----------------+------------------+--------------
 test    | t               | '00:00:50.00'    | 0
(1 row)

So please, could anybody tell me how to fix this?

Thanks...
--
Luis D. García M.
Telf: 0414-3482018
- FACYT - UC -
- Computación -


Re: [HACKERS] Extended protocol logging

2006-11-01 Thread Dave Cramer


On 1-Nov-06, at 6:18 AM, Simon Riggs wrote:

> On Tue, 2006-10-31 at 23:51 -0500, Tom Lane wrote:
>> Dave Cramer <[EMAIL PROTECTED]> writes:
>>> These are logs from Beta 2.
>>
>> With what logging settings?  log_duration has rather different
>> behavior from what it used to do.
>
> I think it would be useful to have the log results from a test program
> in the protocol section, so interface designers know what will get
> logged from various protocol sequences.

I think some sort of examples are in order, or more consistency.

> That way JDBC people and others can interpret what their own
> interfaces should look like for 8.2

From a JDBC point of view we don't look at the logs in the API; I am
just debugging something.


--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com








Re: [HACKERS] Design Considerations for New Authentication Methods

2006-11-01 Thread Stephen Frost
* Henry B. Hotz ([EMAIL PROTECTED]) wrote:
> I've been looking at adding SASL or GSSAPI as an auth method.  I have  
> some questions about how to handle the flow of control changes.

Great!  I'd love to see that implemented, personally, so if you're
looking for help, please let me know.

> round trips and network accesses done.  I notice that all the
> authentication (pg_fe_sendauth()) is done inside PQconnectPoll(),
> which sounds like something that isn't expected to block on network
> access.

I think you're missing a bit about how the design works.
PQconnectPoll() is expected to be called multiple times until the
connection is established.  Basically, you can return something to the
user that says "connection not done yet, but I'm returning because you
said not to block.  Please call again when more data is available or
you have the opportunity to".  This is a pretty common arrangement when
non-blocking systems are implemented.  As Tom said, you should just
need to have some state stored so that you know what's going on when
you're called again.

Thanks!

Stephen




[HACKERS] IN(subselect returning few values ...)

2006-11-01 Thread Stephen Frost
All,

* Josh Berkus (josh@agliodbs.com) wrote:
> > The plain non-VALUES list form is also significantly faster than it
> > was, but I think it will only result in a bitmap indexscan plan type.
> 
> Yeah, even bitmapscans break down at 1000 values ...

In a similar vein, perhaps 8.2 fixes this but I don't recall seeing
anything where it would...

Working on 8.1 I've recently been annoyed at the need to translate a
sub-select inside an IN () clause into a fixed list of contents (the
results of the sub-select, exactly) in order to get better performance.
If the results of a sub-select are very likely to be fewer than 1000 (or
what have you), is there a reason not to translate that sub-select into
a VALUES list or IN (constants) set (i.e., a nest-loop or a bitmap
indexscan)?

This particular case was involving 9 values from a table which only had
around 250 rows total being used to find a set of records in a much,
much bigger table (60M or so, iirc).  I dislike having to hard-code
those values in the scripts I'm writing, or hack it up to implement
getting the list and then using it as a constant.

A similar case I've seen is that when using a sub-select or similar
instead of a list of constants, the 'One-Time Filter: false' doesn't
appear to ever be able to happen.  I might have overlooked something
else which does something similar, but if not, this ends up making a
query *much* more expensive when a lot of disjoint tables are involved,
most of which don't need to be considered since they're not in the
constants list.

Thanks,

Stephen




Re: [HACKERS] Design Considerations for New Authentication Methods

2006-11-01 Thread Henry B. Hotz


On Oct 31, 2006, at 8:34 PM, Tom Lane wrote:

> "Henry B. Hotz" <[EMAIL PROTECTED]> writes:
>> I notice that all the
>> authentication (pg_fe_sendauth()) is done inside PQconnectPoll(),
>> which sounds like something that isn't expected to block on network
>> access.
>
> That's right.
>
>> Is this behavior important during startup?
>
> You needn't bother to submit a patch that breaks it ;-).

In other words I can't do the easy thing.  OK.

> But I don't really see that it's such a big deal.  You just need some
> state data to keep track of what to do the next time you receive a
> message.  There's no assumption anywhere that authentication only
> involves one message exchange.

In a sense you're right.  The APIs are designed to support that.  It
means I need to add some more cases to the huge switch statement inside
PQconnectPoll(), though.

>> I haven't looked at the corresponding logic on the server side, but
>> I'd assume that it forks before we get to this point so it doesn't
>> matter.
>
> Correct, we don't need to worry about multitasking apps there.
>
> regards, tom lane


 


The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
[EMAIL PROTECTED], or [EMAIL PROTECTED]





Re: [HACKERS] Extended protocol logging

2006-11-01 Thread Simon Riggs
On Wed, 2006-11-01 at 10:06 -0500, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > On Tue, 2006-10-31 at 23:51 -0500, Tom Lane wrote:
> >> With what logging settings?  log_duration has rather different behavior
> >> from what it used to do.
> 
> > I think it would be useful to have the log results from a test program
> > in the protocol section,
> 
> The contents of the postmaster log are surely not part of the FE protocol.
> Clients can't even see the log without resorting to nonstandard hacks.

OK, can we please put the example from -hackers into the docs,
somewhere, with particular note of which protocol messages result in
which logging output?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Design Considerations for New Authentication Methods

2006-11-01 Thread Henry B. Hotz


On Nov 1, 2006, at 6:33 AM, Stephen Frost wrote:

> * Henry B. Hotz ([EMAIL PROTECTED]) wrote:
>> I've been looking at adding SASL or GSSAPI as an auth method.  I have
>> some questions about how to handle the flow of control changes.
>
> Great!  I'd love to see that implemented, personally, so if you're
> looking for help, please let me know.

Thank you.  I will!  ;-)

Do you know Java?  I'm doing this ultimately because I want the JDBC
driver to support encrypted connections with Kerberos and without
needing SSL.  As an added plus, a Windows-native client should support
it.

My main hesitation between SASL and GSSAPI is that the Windows
equivalent APIs for SASL have not received the same degree of
interoperability testing as SSPI<-->GSSAPI.  I don't have a published
example to crib from.  For general information the relevant calls are
at the bottom of .

"I don't do windows (TM)."  ;-)

>> round trips and network accesses done.  I notice that all the
>> authentication (pg_fe_sendauth()) is done inside PQconnectPoll(),
>> which sounds like something that isn't expected to block on network
>> access.
>
> I think you're missing a bit about how the design works.
> PQconnectPoll() is expected to be called multiple times until the
> connection is established.

I think I got it.  I just didn't want to get it.

I'll probably do the simple, blocking-loop client anyway as a way to
debug the server side.  Then worry about getting the clients up to
snuff.

> Basically, you can return something to the
> user that says "connection not done yet, but I'm returning because you
> said not to block.  Please call again when more data is available or
> you have the opportunity to".  This is a pretty common arrangement
> when non-blocking systems are implemented.  As Tom said, you should
> just need to have some state stored so that you know what's going on
> when you're called again.

Once I start adding connection state I can add a control for whether
data packets need to be wrapped as well.  I'm not sure how
hard the >8KB case will be to handle, though.  Probably some hooks in
the underlying library, and I hope they make it easier rather than
harder.

> Thanks!
>
> Stephen


 


The opinions expressed in this message are mine,
not those of Caltech, JPL, NASA, or the US Government.
[EMAIL PROTECTED], or [EMAIL PROTECTED]





Re: [HACKERS] [Fwd: Index Advisor]

2006-11-01 Thread Kai-Uwe Sattler

Hi Gurjeet,
I include pgsql-hackers in this discussion ...

On 01.11.2006 at 17:38, Gurjeet Singh wrote:

> Hi Kai,
>
> I am working with Simon at EnterpriseDB, and am currently working on
> porting your patch to 8.2 sources. I have done a quick hack to make it
> work on 8.2; please find the modified patch attached (remember it's a
> quick-n-dirty hack; it still needs a cleanup).
>
> The only changes (as yet) that I have done are:
>   (1) Changed the code according to the change in the linked-list
>       (List*) handling across 7.4 and 8.2.
>   (2) Added support for BitmapAnd, BitmapOr, BitmapHeapScan and
>       BitmapIndexScan plan-nodes in scan_plan().
>
> The outstanding issues, as of now, as I see them, are:
>   (1) There are left-over dependencies in pg_depend that stop the
>       table-being-analyzed from getting dropped (probably it'll affect
>       many other DDLs, I haven't tested though). I am investigating
>       this.
>
>   (2) The intermediate indexes that are created are 'real' indexes, in
>       the sense that they are actually created on disk before
>       planning/analyzing and are then dropped. I assume this since you
>       are calling index_create(), which in turn calls index_build(),
>       which in turn calls the index access method's build method,
>       which, I assume, will create the index on disk. Please point out
>       if this is not the case.

You are right - at least an empty index is created. I'm not sure if the
index appears on disk, but this was the easiest way to do it in 7.4.
However, I agree - it has performance drawbacks and raises concurrency
issues.

> This real-index build can be counterproductive where the underlying
> table is huge. We were thinking of creating actual 'virtual' indexes by
> utilizing the 'skip_build' parameter of index_create() (introduced in
> index.c:1.229). But that would entail calculating the number-of-pages,
> average record-size, and probably some more stats on our own, and then
> putting these values in the catalog.

Actually, we did exactly this - if you have only an empty index then you
have to estimate these values.




> I see that you have declared the following two functions in
> src/include/catalog/index.h:
>
> extern int2 get_attrType(Oid attrTypId, int2 attrSize, int4 attrTypMod);
>
> extern int4 estimate_indexPages(Relation heapRelation,
>                                 IndexInfo *indexInfo);
>
> Looking at these, I suppose that you also worked on some such
> calculations. If still with you, can you share this code with us?

It should be part of the patch - but let me check this.

> (3) (If you've lost track, this is the third in the list of
> outstanding issues :). I am concerned about the visibility of the
> virtual indexes. If these indexes are immediately visible to other
> sessions, then there is a strong possibility that other backends that
> are in the planning stage of a non-EXPLAIN query will pick up these
> indexes, develop their plans, and send them for execution. And you
> know what hell will break loose if that happens; there won't be any
> data in these indexes!!

Right - that's what I meant above by concurrency issues. Honestly, we
did not have enough knowledge about pgsql at the time to do this. I
suppose the right way would be to add a session or transaction id to the
virtual index and let the planner use only virtual indexes from the same
session as the query.

> One more thing: we are looking at ways to make it easier for others,
> too, to develop their own advisors. So we are looking at the
> possibility of making it a plugin-based architecture, similar to how
> edb-debugger for pl/pgsql is being developed. This will allow others
> to develop and use their own advisors for various SELECT/DML
> statements, in the least invasive way.

That's a great idea - it could be helpful for other kinds of virtual
objects too, e.g. materialized views, partitions, etc. There are
several exits for plugins: the set of indexes which should be created
virtually, the profit assignment to the individual indexes, as well as
the way the recommendation is used. For example, we have the prototype
of an online advisor which collects the recommendations continuously
and tries to adapt the current set of real indexes (at least as an
alerter for the DBA).

> Lastly, and most importantly, can we move this discussion to
> pgsql-hackers?

Done.

So, let me know if there is anything that I can do.

> Best regards,
> -- [EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com



Best,
   Kai



Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread tomas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, Oct 31, 2006 at 08:55:04PM -0800, Joshua D. Drake wrote:
> 
> >> We are not storing bytea [...]
[...]
> > Hmm, have you tried to create a functional trigram index on the
> > equivalent of "strings(bytea_column)" or something like that?

Hrm. Sorry for my impolite interruption, but... is there such a thing as
a "functional trigram index"? (This would be very cool.)

Thanks
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFSG33Bcgs9XrR2kYRAnB7AJ4l6UPK/4vhtgr7Ux2/L7VtYq6d7ACeLBZP
IMPCEj5zqhYR7b2eYPgjRRE=
=6uiR
-END PGP SIGNATURE-




Re: [HACKERS] ¿¿¿past chunk end???

2006-11-01 Thread imad

Carefully check all the pallocs you are doing in your code.
This warning is shown when you write extra bytes to memory and the
size of your data goes beyond the allocated size.

--Imad
EnterpriseDB (www.enterprisedb.com)


On 11/1/06, luis garcia <[EMAIL PROTECTED]> wrote:


Hi. We have been making some modifications to the Postgres catalog, but
there seems to be a problem with one of the fields we just added,
exactly at the time of inserting the right values into that specific field.

1. This is what It happens:

**
template1=# CREATE TABLE test() AS VALID EVENT 2 WITH FREQUENCY
'00:00:50.00' AS REAL TIME;
WARNING:  detected write past chunk end in CacheMemoryContext 0xb7212e0c
CREATE TABLE
**

The creation is possible; in fact, we have added 7 fields to the
Postgres catalog (in pg_class) and there are no problems with the other
fields' values, just with this one.


2. The field is defined this way:
 // FILE: parsenodes.h
 // STRUCTURE: FrequencyDefinition

 typedef struct FrequencyDefinition
 {
     NodeTag          type;
     bool             timeFrequency;
     FrequencyType    fttype;
     const char      *frequencyMeasure; // THIS ONE
 } FrequencyDefinition;


3. The values are inserted this way:
// FILE: relcache.c
// FUNCTION: RelationBuildLocalRelation

**
rel->rd_rel->relhasfrequency = freqDef->timeFrequency;  // Perfect assignment

if ((freqDef->fttype) == REAL_TIME)
    namestrcpy(&rel->rd_rel->relfrequencytype, (const char *) "REAL TIME");
else
    namestrcpy(&rel->rd_rel->relfrequencytype, (const char *) "HISTORIC");

namestrcpy(&rel->rd_rel->relfrequency, freqDef->frequencyMeasure);
**


We can see the other values with a SELECT over pg_class:

template1=# SELECT relname, relhasfrequency, relfrequencytype, relfrequency
FROM pg_class WHERE relhasfrequency=true;

 relname | relhasfrequency | relfrequencytype | relfrequency
---------+-----------------+------------------+--------------
 test    | t               | REAL TIME        | 0
(1 row)

We have realized that the assignment takes just the first CHAR from
'00:00:50.00', and we tested the same assignment like this:

***
namestrcpy(&rel->rd_rel->relfrequencytype,freqDef->frequencyMeasure);
//namestrcpy(&rel->rd_rel->relfrequency,freqDef->frequencyMeasure)
***

And the result of the same query was:

template1=# SELECT relname, relhasfrequency, relfrequencytype, relfrequency
FROM pg_class WHERE relhasfrequency=true;

  relname | relhasfrequency | relfrequencytype | relfrequency
 ---------+-----------------+------------------+--------------
  test    | t               | '00:00:50.00'    | 0
 (1 row)


So please, could anybody tell me how to fix this?

Thanks...
--
Luis D. García M.
Telf: 0414-3482018

- FACYT - UC -
- Computación -


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Extended protocol logging

2006-11-01 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> On Tue, 2006-10-31 at 23:51 -0500, Tom Lane wrote:
>> With what logging settings?  log_duration has rather different behavior
>> from what it used to do.

> I think it would be useful to have the log results from a test program
> in the protocol section,

The contents of the postmaster log are surely not part of the FE protocol.
Clients can't even see the log without resorting to nonstandard hacks.

What it sounds like to me is that Dave's client has got log_duration = ON
when what he should have is log_min_duration_statement = 0.  Those two
settings used to be just about redundant but as of 8.2 they have got
distinct functionality.  See this thread:
http://archives.postgresql.org/pgsql-hackers/2006-09/msg00681.php

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] ¿¿¿past chunk end???

2006-11-01 Thread Martijn van Oosterhout
On Wed, Nov 01, 2006 at 08:49:48AM -0400, luis garcia wrote:
> Hi We have been making some modifications to postgres catalog, but
> it seems to be a problem with one of the fields we just added, exactly
> in the time when inserting the right values to that specific field.

Well, it's a bit hard to tell since you don't tell what changes you
actually made to the catalog. Did you remember to update the length to
the catalog entry and update the struct and all those details?

In particular:
> template1=# SELECT relname, relhasfrequency, relfrequencytype, relfrequency
> FROM pg_class WHERE relhasfrequency=true;
> 
> relname  | relhasfrequency | relfrequencytype | relfrequency
> - - - - - -  + - - - - - - - - - - - - + - - - - - - - - - - - - - + - - -
> - - - - - -
> test | t| REAL TIME  | 0
> (1 row)

What's the type of "relfrequency" actually?

Much more info needed...

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] IN(subselect returning few values ...)

2006-11-01 Thread Tom Lane
Stephen Frost <[EMAIL PROTECTED]> writes:
> Working on 8.1 I've recently been annoyed at the need to translate a
> sub-select inside an IN () clause into a fixed list of contents (the
> results of the sub-select, exactly) in order to get better performance.

Better performance than what?  Ever since 7.4 we've converted small IN
sub-selects into plans along the lines of

regression=# explain select * from tenk1 where unique1 in (select f1 from 
int4_tbl);
QUERY PLAN
---
 Nested Loop  (cost=1.06..31.20 rows=5 width=244)
   ->  HashAggregate  (cost=1.06..1.11 rows=5 width=4)
 ->  Seq Scan on int4_tbl  (cost=0.00..1.05 rows=5 width=4)
   ->  Index Scan using tenk1_unique1 on tenk1  (cost=0.00..6.00 rows=1 
width=244)
 Index Cond: (tenk1.unique1 = "outer".f1)
(5 rows)

which looks OK to me.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [SQL] Case Preservation disregarding case

2006-11-01 Thread Chuck McDevitt
Sorry, my last mail wasn't well thought out.  Yes, the
information_schema needs the case-folded name (although it might be ok
to add additional columns to the information_schema for extra
information).

But, stepping back from all that, what is it the users want?

1)  When re-creating a CREATE TABLE statement from whatever catalog
info, they'd like the names to come back exactly as they entered them.
If I do:
 CREATE TABLE BobsTable (WeeklySales numeric(10,2),
"SomeStrangeName" int);

  They'd like to see exactly that when the CREATE TABLE gets
re-created, not what we do now:

   CREATE TABLE bobstable (weeklysales numeric(10,2),
"SomeStrangeName" int);

2)  When doing reports, they'd like the name as entered to be the title
of the column:
Select * from bobstable;  

  Would be nice if they saw this:
  WeeklySales    SomeStrangeName
  -----------    ---------------
For compatibility with existing apps and spec compliance, you'd still
want PQfname() to return the case-folded name.
But there isn't any reason you couldn't also return a "suggested title"
field (PQftitle?) which preserves the user's case.

You could also extend the concept of a PQftitle to make nicer titles for
expressions.  Instead of 
SELECT sum(WeeklySales) from BobsTable;

producing "?column?" or somesuch to use in the report, it could return a
title like "sum(WeeklySales)".

-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 31, 2006 10:38 PM
To: Chuck McDevitt
Cc: Stephan Szabo; beau hargis; pgsql-sql@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case

"Chuck McDevitt" <[EMAIL PROTECTED]> writes:
> Equivalent, yes.  But I can interpret that clause to mean I can show
> either the case folded or non-case-folded value in the information
> schema, as they are equivalent.

Well, that's an interesting bit of specs-lawyering, but I don't see
how you can defend it against these rules in SQL99 5.2:

21) For every <identifier body> IB there is exactly one
    corresponding case-normal form CNF. CNF is an <identifier body>
    derived from IB as follows.

    Let n be the number of characters in IB. For i ranging from 1
    (one) to n, the i-th character M(i) of IB is translated into the
    corresponding character or characters of CNF as follows.

    Case:

    a) If M(i) is a lower case character or a title case character
       for which an equivalent upper case sequence U is defined by
       Unicode, then let j be the number of characters in U; the
       next j characters of CNF are U.

    b) Otherwise, the next character of CNF is M(i).

22) The case-normal form of the <identifier body> of a <regular
    identifier> is used for purposes such as and including
    determination of identifier equivalence, representation in
    the Definition and Information Schemas, and representation in
    diagnostics areas.

    NOTE 44 - Any lower-case letters for which there are no upper-
    case equivalents are left in their lower-case form.

Again, obviously we are not compliant because we fold to lower rather
than upper case, but I do not see how you can read (22) as not requiring
the information schema to show the upper-cased form.  The output of
functions such as PQfname() might be considered closer to diagnostics
info than information schema, but that's covered too.

But the really serious problem with what you propose is that it would
allow two table columns with names that the system considers distinct
to show as the same string in the information schema and diagnostic
outputs.  That can't be acceptable --- it's going to break any
application that does any nontrivial analysis of what it sees there,
not to mention that it violates various primary key constraints in
the information schema specification.

regards, tom lane



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] Index ignored with "is not distinct from", 8.2 beta2

2006-11-01 Thread JEAN-PIERRE PELLETIER

I've reposted this from pgsql-performance where I got no response.

==

Hi,

I wanted to use "exp1 is not distinct from exp2", which I thought was
syntactic sugar for
exp1 is not null and exp2 is not null and exp1 = exp2 or exp1 is null and
exp2 is null

but my index is ignored with "is not distinct from".

Is this the expected behavior?

create temporary table t as select * from generate_series(1,100) t(col);
create unique index i on t(col);
analyze t;

-- These queries don't use the index
select count(*) from t where col is not distinct from 123;
select count(*) from t where not col is distinct from 123;

-- This query uses the index
select count(*) from t where col is not null and 123 is not null and col = 
123 or col is null and 123 is null;


explain analyze select count(*) from t where col is not distinct from 123;
QUERY PLAN

 Aggregate  (cost=19154.79..19154.80 rows=1 width=0) (actual time=228.200..228.202 rows=1 loops=1)
   ->  Seq Scan on t  (cost=0.00..17904.90 rows=499956 width=0) (actual time=0.042..228.133 rows=1 loops=1)
         Filter: (NOT (col IS DISTINCT FROM 123))
 Total runtime: 228.290 ms
(4 rows)
Time: 219.000 ms

explain analyze select count(*) from t where not col is distinct from 123;
QUERY PLAN

 Aggregate  (cost=19154.79..19154.80 rows=1 width=0) (actual time=235.950..235.952 rows=1 loops=1)
   ->  Seq Scan on t  (cost=0.00..17904.90 rows=499956 width=0) (actual time=0.040..235.909 rows=1 loops=1)
         Filter: (NOT (col IS DISTINCT FROM 123))
 Total runtime: 236.065 ms
(4 rows)
Time: 250.000 ms

explain analyze select count(*) from t where col is not null and 123 is not 
null and col = 123 or col is null and 123 is null;

   QUERY PLAN
---
 Aggregate  (cost=8.13..8.14 rows=1 width=0) (actual time=0.267..0.268 rows=1 loops=1)
   ->  Index Scan using i on t  (cost=0.00..8.13 rows=1 width=0) (actual time=0.237..0.241 rows=1 loops=1)
         Index Cond: (col = 123)
 Total runtime: 0.366 ms
(4 rows)
Time: 0.000 ms

I am on Windows XP Service pack 2 with PostgreSQL 8.2 beta2

Thanks,
Jean-Pierre Pelletier
e-djuster



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Writing WAL for relcache invalidation: pg_internal.init

2006-11-01 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> Enclose a patch for new WAL records for relcache invalidation.

I don't think this works.  RelationCacheInitFileInvalidate is executed
post-commit, which means that there's a window between commit and where
you propose to write the WAL entry.  A crash and restart in that
interval would leave the catalog changes committed, but not reflected
into pg_internal.init.

I think we're probably better off to just forcibly remove the init file
during post-recovery cleanup.  The easiest place to do this might be
BuildFlatFiles, which has to scan pg_database anyway ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Writing WAL for relcache invalidation:pg_internal.init

2006-11-01 Thread Simon Riggs
On Wed, 2006-11-01 at 12:05 -0500, Tom Lane wrote:

> I think we're probably better off to just forcibly remove the init file
> during post-recovery cleanup.  The easiest place to do this might be
> BuildFlatFiles, which has to scan pg_database anyway ...

Presumably not rebuilding it, since we can let that occur naturally.

I'll submit a patch tomorrow.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] IN(subselect returning few values ...)

2006-11-01 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
> Stephen Frost <[EMAIL PROTECTED]> writes:
> > Working on 8.1 I've recently been annoyed at the need to translate a
> > sub-select inside an IN () clause into a fixed list of contents (the
> > results of the sub-select, exactly) in order to get better performance.
> 
> Better performance than what?  Ever since 7.4 we've converted small IN
> sub-selects into plans along the lines of

Specifically, what I had been looking for was a change from a hashed IN
plan w/ a seq-scan on the big table to a bitmap index scan or a nested
loop index lookup (as you have below).  With the IN(constants) I had
been getting a bitmap-index scan.  I looked a bit closer though and
discovered it was thinking there would be 300+ rows returned from the
query (which would have resulted in a very much larger number of rows
being returned from the large table) instead of just 9, so I ran
analyze on the table and that seemed to fix it up (changed to a nested
loop w/ an index scan, which works nicely).

I've got autovacuum running though and that table hasn't changed in ages,
so I'm a bit confused how the stats for it were so far off.  I didn't
expect to have an analyze problem on a database that has autovacuum
running, on a table that hasn't changed in a very long time.  Wish I knew
how it'd been missed. :/  I'm running a database-wide analyze, though
that'll probably take a while considering it's about 300G.  Makes me
wonder if autovacuum needs to periodically run a check of tables which
haven't been seen to have changed but may have in important ways which
were somehow missed, not unlike how my SAN and RAID systems run monthly
consistency checks...

Sorry about the noise. :/

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ¿¿¿past chunk end???

2006-11-01 Thread luis garcia
Hi, thanks for your answers...

> Carefully check all the palloc's you are doing in your code.
> This warning is shown when you write some extra bytes to the memory
> and size of your data goes beyond the allocated size.

There's no problem in the memory allocation, just in relfrequency,
that's the "Big Problem" for me...

> Well, it's a bit hard to tell since you don't tell what changes you
> actually made to the catalog. Did you remember to update the length to
> the catalog entry and update the struct and all those details?

Well, about the changes I have made, they are working just fine. All the
structure lengths have been updated and, as I said, we added 7 new fields
to the pg_class structure and there was no problem. The problem here is
with relfrequency only.

> What's the type of "relfrequency" actually?

I'm sorry, but I forgot to show you the relfrequency definition in
pg_class. Here it is:

**
// FILE: pg_class.h
// STRUCTURE: CATALOG(pg_class,1259) BKI_BOOTSTRAP
...
bool        relvalidtime;    /* T if the table has a valid time attribute */
NameData    relvttype;       /* The type of the valid time column of the
                              * table: event, state or null (when the table
                              * isn't a TSQL2 temporal table) */
int2        relvtprecision;  /* The precision of the valid time column.
                              * By default 0 */
bool        reltranstime;    /* T if the table has a TSQL2 transaction
                              * time attribute */
bool        relhasfrequency; /* relation has time frequency */
NameData    relfrequencytype;/* table time frequency type (REAL TIME,
                              * HISTORIC) */
NameData    relfrequency;    /* table time frequency value */
...
**

These are the seven fields we just added, and as I said the only one with
problems is relfrequency.

> Much more info needed...

What else could be important there? Thanks again

-- 
Luis D. García M.
Telf: 0414-3482018

- FACYT - UC -
- Computación -


Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Gregory Maxwell

On 11/1/06, Teodor Sigaev <[EMAIL PROTECTED]> wrote:
[snip]

Brain storm method:

Develop a dictionary which returns all substring for lexeme, for example for
word foobar it will be 'foobar fooba foob foo fo oobar ooba oob oo obar oba ob
bar ba ar'. And make GIN functional index over your column (to save disk space).

[snip]

Search time in GIN depends only weakly on the number of words (opposite to
tsearch2/GiST), but insertion of a row may be slow enough.


With the right folding, the number of possible trigrams for ASCII text
is fairly small, much smaller than the number of words used in a large
corpus of text, so the GIN performance for searches should be pretty
good.

Real magic would be to teach the regex operator to transparently make
use of such an index. ;)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:
> On Tue, Oct 31, 2006 at 08:55:04PM -0800, Joshua D. Drake wrote:
> > 
> > >> We are not storing bytea [...]
> [...]
> > > Hmm, have you tried to create a functional trigram index on the
> > > equivalent of "strings(bytea_column)" or something like that?
> 
> Hrm. Sorry for my impolite interruption, but... is there such a thing as
> a "functional trigram index"? (this would be very cool).

Heh :-)  I meant an index, using the pg_trgm opclass (which indexes
trigrams; hence the "trigram" part), on a function that would extract
the text from a bytea column; instead of indexing the trigrams of the
bytea column directly.  Hence the "functional" part.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Oleg Bartunov

On Tue, 31 Oct 2006, Joshua D. Drake wrote:



> Yes we do (and can) expect to find text among the bytes. We have
> searches running, we are just running into the maximum size issues for
> certain rows.

you can use substr() to be safe, if a schema change isn't available

> Sincerely,
>
> Joshua D. Drake

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: Re: [HACKERS] ¿¿¿past chunk end???

2006-11-01 Thread Tom Lane
"luis garcia" <[EMAIL PROTECTED]> writes:
> Well about the changes I have made, they are working just fine. All the
> structures length have been updated and as I said we added 7 new fields
> to pg_class structure and there was no problem. The problem here is with
> relfrequency only.

pg_class is notoriously tricky to add fields to --- there are more
undocumented dependencies than you might think.  If you check the
archives, there was a thread only a week or so back with someone who
was one or two edits short of getting such a change to work.

But what I'm wondering in your case is whether your code thinks it can
set relfrequency and/or the preceding field relfrequencytype to null.
You can't use a C struct to address any fields beyond the first
possibly-nullable field, because the constant offsets computed by the
compiler won't work.  Also, initdb makes some assumptions derived from
this fact to decide which system-catalog columns to mark NOT NULL.
If "\d pg_class" shows not-null column markings that are at variance
with what you intend, you have a problem.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly