Re: [HACKERS] alpha4 bundled -- please verify

2010-02-20 Thread Shane Ambler

Peter Eisentraut wrote:

Alpha4 has been bundled and is available at

http://developer.postgresql.org/~petere/alpha/

Please check that it is sane.

Since I'll be away for the next few days, someone has to take it from
here: write announcement, move tarballs, send announcement.




make check passes OK for me on:
Mac OS X 10.4.11 PPC - gcc 4.0.1



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz



Re: [HACKERS] Odd historical fact about Bison

2009-07-11 Thread Shane Ambler
The real question is slow-to-upgrade OSes like HP-UX, AIX, OpenBSD and 
Solaris.  What version of Bison are they shipping with?


Mac OSX 10.4.11 - GNU Bison version 1.28



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz




Re: [HACKERS] Do we really need a 7.4.22 release now?

2008-09-20 Thread Shane Ambler

Greg Sabino Mullane wrote:


I guess I don't understand where Joe User was supposed to have gotten
the message that 7.4 was on its last legs. If anything, the fact that
it is on patchlevel 21 suggests otherwise. Us hackers and developers
shudder at seeing a 7.4 database, but there are plenty of businesses
who are still using it, and I think we owe it to them to give more
advance warning that no more patchlevels are coming along than 3
months.



The few postings I have noticed from users running 7.4 have been with a
release several patchlevels behind the newest. One of the first
suggestions is always to install the newest update.

Of the users out there that still have 7.4 servers running, what
percentage use the newest update? I am certain it's not 100%, and I
doubt it would be much more than 50%.

I would think the old rule of "don't fix what ain't broke" would be fairly
common among 7.4 users. The fact that it took 5 years to find a problem
worth fixing would indicate that it isn't a show-stopping issue that they
need fixed.

Supporting old versions is a great and noble thing but there comes a
time when it is a waste of resources because the effort goes unused.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Some newbie questions

2008-09-07 Thread Shane Ambler

M2Y wrote:

Hello,

Could you please answer the following questions from a newbie:

What is a good way to start understanding the backend (postgres) code? Is
there any documentation available especially for developers?


Most of the developer info is within comments in the code itself.
Another place to start is http://www.postgresql.org/developer/coding


What is the commit log and why is it needed?


To achieve ACID (Atomic, Consistent, Isolated, Durable).
The changes needed to complete a transaction are saved to the commit log
and flushed to disk, then the data files are changed. If the power goes
out during the data file modifications, the commit log can be used to
complete the changes without losing any data.
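
A minimal illustration (hypothetical table): the commit log record for
this transaction is flushed to disk at COMMIT, before the data files
themselves need to be updated.

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;  -- commit log (WAL) flushed here; data file pages can be written later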

Why does a replication solution need log shipping, and why can't we
just ship the transaction statements to a standby node?


Depends on what you wish to achieve. These are two ways to a similar
solution.
Log shipping is part of the core code, with plans to make the duplicate
server able to satisfy select queries.
Statement based replication is offered by other options such as Slony.

Each has advantages and disadvantages. Transaction logs are part of
normal operation and can be copied to another server in the background
without adding load or delays to the master server.

Statement based replication has the added complexity of waiting for the
slaves to duplicate the transaction and handling errors from a slave
applying the transaction. Such systems also tend to have restrictions
when it comes to replicating DDL changes - replication is implemented as
triggers fired on INSERT/UPDATE, not on CREATE/ALTER TABLE.
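
A rough sketch of the trigger-based capture described above (hypothetical
names; a real system like Slony generates this machinery for you):

CREATE TABLE repl_log (
    id        serial PRIMARY KEY,
    tablename text,
    op        text,
    rowdata   text
);

CREATE FUNCTION log_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO repl_log (tablename, op, rowdata)
    VALUES (TG_TABLE_NAME, TG_OP, NEW::text);  -- row-to-text cast needs 8.3+
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_capture
    AFTER INSERT OR UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE log_change();

Note that a CREATE or ALTER TABLE on mytable fires no trigger at all,
which is where the DDL restriction comes from.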



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-22 Thread Shane Ambler

Dave Cramer wrote:


On 21-Jul-08, at 4:28 PM, Andrew Sullivan wrote:


On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote:

pgFoundry ain't the CPAN, alas.


Maybe that's the problem that really needs solving?

One of the big Postgres features is its extensibility.  I agree 
that the extensions can sometimes be hard to find, but surely the 
answer to that is not an infinitely large source tarball?



I'd have to agree with Andrew here. Making it easy to get extensions 
would solve lots of problems.


What about starting a secondary team that would review extensions?
Projects on pgfoundry could be identified as reviewed and approved as a
type of recommendation that they are of acceptable quality to use in
production - maybe against certain versions.

What I would see is current core developers teaching a new group of
developers to do the add-on code reviews to a point where they could
continue on by themselves - one or two from core may wish to stay in
this group - with core checking in from time to time to ensure the
quality doesn't slip. Thereby giving some confidence in the use of the
add-ons that get *certified*.

A new add-on would be presented to this group and maybe voted on in one
of the lists (General or Admin?) to get acceptance into the review process.

Anyone interested in starting this?



I do agree that the main code doesn't need to contain every feature that
is available. But we do need to improve the perception of add-ons.
Hardly anyone thinks twice about adding an extension to Firefox, Perl,
GIMP or osCommerce, or even drivers to the OS, and we need to aim for a
similar mindset here.

I do think that having a list of reviewed and approved add-ons that is
easily found on the main site along with release downloads will help
along those lines.

We need to promote that PostgreSQL isn't a one-size-fits-all solution;
it is a solid product that can be customised to suit your needs.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] A new take on the foot-gun meme

2008-07-01 Thread Shane Ambler

Robert Treat wrote:


So is that a golf club gun?



Careful what you wish for 
http://www.totallyabsurd.com/12gaugegolfclub.htm




I reckon they watched Caddyshack (I think that was the one) and thought 
they could get the patent before someone actually tried selling them.


I reckon it was Rodney Dangerfield that blew the golf ball down the fairway.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Dept of ugly hacks: eliminating padding space in system indexes

2008-06-24 Thread Shane Ambler

Mark Mielke wrote:

Not that I disagree with your change, but  5 Mbytes in 4 Gbytes of RAM 
for my main PostgreSQL system that I manage seems like a drop in the 
bucket. Even if 40% of pg_class_relname and pg_proc_proname indices was 
saved - we're talking about 154 Kbytes saved on both those indices 
combined. Minor? Major? I bet I wouldn't notice unless my database 
requirements used up all RAM, and even then I'm suspecting it wouldn't 
matter except for border line cases (like all pages required for 
everything else happened to equal 4 Gbytes near exactly).


Guess the mileage will vary depending on the complexity of the db 
structure. Shorter names will also benefit more than longer ones.



The performance impact is probably going to be limited by our extensive
use of catalog caches --- once a desired row is in a backend's catcache,
it doesn't take a btree search to fetch it again.  Still, the system
indexes are probably hot enough to stay in shared buffers most of the
time, and the smaller they are the more space will be left for other
stuff, so I think there should be a distributed benefit.
  


My question is whether this is limited to system catalogs, or will it
benefit a char() index on any table? The second would make it more
worthwhile.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-02 Thread Shane Ambler

To diverge a little -


Bit of a noob question along these lines -

Do LIKE and ILIKE take the locale into consideration, allowing
case-insensitive searches in any locale setting?



I know that LIKE can use an index if you don't start the match with a
wildcard, while ILIKE doesn't seem to. Is it, or would it be, possible to
get ILIKE to use a properly configured index as well?
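
For reference, the usual workaround is an expression index, which a LIKE
on lower() can use even though ILIKE itself will not (hypothetical table
and column names):

CREATE INDEX mytable_name_lower_idx
    ON mytable (lower(name) text_pattern_ops);  -- pattern ops help in non-C locales

SELECT * FROM mytable WHERE lower(name) LIKE 'abc%';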




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-05-29 Thread Shane Ambler

David Fetter wrote:


This part is a deal-killer.  It's a giant up-hill slog to sell warm
standby to those in charge of making resources available because the
warm standby machine consumes SA time, bandwidth, power, rack space,
etc., but provides no tangible benefit, and this feature would have
exactly the same problem.

IMHO, without the ability to do read-only queries on slaves, it's not
worth doing this feature at all.


+1

I would think that a read-only WAL slave is more valuable than a 
real-time backup. (especially as the topic is about adding slaves not 
increasing the effectiveness of backups)



I also think that starting with a read-only WAL slave will ease the 
transition between delayed slave updating and real-time slave updating.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Initial max_connections for initdb on FreeBSD.

2008-05-29 Thread Shane Ambler

Greg Sabino Mullane wrote:



I've got a BSD system I work on that is jailed and has low shared
memory settings. So low that I cannot even initdb to create a test
8.3.1 database. It tries to create template1 with shared_buffers of
50 and max_connections of 13. Is there any way to tweak those during
initdb? Should there or can there be? Is there another way around
this problem that can be somewhat automated? For my testing purposes,
I don't need high settings at all (max_connections could be 2, for
example).



From the manual I get -

initdb uses the environment variables supported by libpq

PGOPTIONS sets additional run-time options for the PostgreSQL server

postgres -

-B nbuffers
Sets the number of shared buffers for use by the server processes. The
default value of this parameter is chosen automatically by initdb.
Specifying this option is equivalent to setting the shared_buffers
configuration parameter.

-N max-connections
Sets the maximum number of client connections that this server will
accept. The default value of this parameter is chosen automatically by
initdb. Specifying this option is equivalent to setting the
max_connections configuration parameter.



So I would try something like -

setenv PGOPTIONS "-N 2 -B 10"    (or: export PGOPTIONS="-N 2 -B 10", depending on shell)
initdb -D /path/to/data


--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] psql \? help display

2008-05-13 Thread Shane Ambler

Bruce Momjian wrote:

Alvaro Herrera wrote:

Bruce Momjian wrote:

I promised to review our psql \? output to see if I could improve it,
particularly the General section at the top.  Below are the results.

Are the new sections ideal, and in the best ordering?  Should \copyright
be kept in General at the top?  Should \? be listed?

Why do we have a section named Copy, Large Objects?  It would seem to
make sense to put the LO stuff on its own section.


OK, new version attached.  I moved \copy into External and relabeled
the section as just Large Object (singular?).



I would think copy would fit better with i/o - it is basically a
subset/variation of \i.

External is more for executing external code than importing data.

Yes, singular - all the others are singular. If we go plural, Variables
and maybe Connections would fit plural as well (or maybe after the
multi-connection patch)?





--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] we don't have a bugzilla

2008-04-27 Thread Shane Ambler

Andrew Sullivan wrote:

On Sat, Apr 26, 2008 at 08:54:46AM -0700, Joshua D. Drake wrote:

How would he know to search at the archives?

 * There is no archives signature at the bottom of -hackers lists


Maybe because there's a perfectly functional archive link in the mail
headers?  And because there's an RFC that tells us how such headers
are supposed to work?

A



Oh so there is.

As a lot of people use GUI apps (I do seem to recall that the mail CLI
shows the full headers), most would never see the detailed headers of an
email. I know I very rarely look at the full source of an email.
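
For reference, the header Andrew points at is the RFC 2369 list header;
on this list it looks something like:

List-Archive: <http://archives.postgresql.org/pgsql-hackers>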




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] get rid of psql welcome message

2008-04-21 Thread Shane Ambler

This idea may be taking it to the extreme but I thought I'd throw it out
there anyway, as everyone seems to want something different. (Well, there
seem to be three variants that, if all were available, would keep most
people happy.) This may be one way to please everyone.



What if the actual welcome message can be defined in the .psqlrc ?
Something along the lines of -

WELCOME_MESSAGE=Welcome to VERS_PSQL - VERS_SERVER\nSSL_INFO

(or similar with tags available that will allow predefined info to be
inserted)

I also see with this path that there could be a configure option to
specify the default welcome text. I am thinking three options would keep
most happy without adjustment in .psqlrc -
--default-welcome-message=[classic|short|oneline|custom string]

custom string would match the above setting options - classic would be
the default.


This could also give the option in .psqlrc of having -

WELCOME_MESSAGE=CLASSIC


From there you can customise the setting to what you want, even
across all machines in the office to match company policy (so to speak),
setting a default .psqlrc file for new users etc.

Along that line maybe implement a way to have default welcome message
settings per server? psql retrieves the welcome settings upon server
connection. (as in psql requests it after connection - not returned with
the initial connection request) This would be the default that is
overridden by the local .psqlrc file.


--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] get rid of psql welcome message

2008-04-17 Thread Shane Ambler

Simon Riggs wrote:

On Thu, 2008-04-17 at 09:30 -0400, Stephen Frost wrote:

* Peter Eisentraut ([EMAIL PROTECTED]) wrote:
Around http://archives.postgresql.org/pgsql-patches/2008-01/msg00089.php it 
was proposed to truncate the psql welcome screen.  What do you think about 
that?

I'd recommend an option in .psqlrc to disable it, if possible.  That
would be in line with what a lot of other splash-screen type things do.


+1



+1

I honestly don't care that I get a few lines of garbage as I start psql
- I never really look at it myself (the first dozen times I used pg it
was probably helpful to have the help commands there).
So what if you get a few lines of text as you start a program? It
scrolls off the screen with everything else, it doesn't fill up your
drive in log files, and I doubt the 250 bytes being sent across the
network for those running remotely is going to chew up anyone's
bandwidth allocation.


I do think that an rc file option (or even a ./configure option if you
want to go that far) is fine for those in the know to adjust to their
tastes - a better option than not showing it once a .psql_history exists.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Concurrent psql API

2008-04-09 Thread Shane Ambler

Tom Lane wrote:

Shane Ambler [EMAIL PROTECTED] writes:
When switching to a conn we also need a non-destructive way out if it is 
busy.


Uh, why?  Why would you switch to a connection at all, if you didn't
want its result?


What if you switch to the wrong connection and it hasn't finished? Do
you then have to wait until you get the results before you can issue
another command, or will we be able to type commands while we wait for
results?


I am thinking of what currently happens - you can't type a command while
you are waiting for a result. So if the connection you switch to is busy,
but you want to go to another connection, how do you get there?


This may tie into an 'auto new connection'. You start psql, enter a
command that will take a while, then think of something else you can do
as you wait. Do you open another shell and start psql again, or send the
working task to the background and enter another command in a new
connection?



Think of jobs in a shell: you can suspend a long-running process, send
it to the background to keep working, and go on with something else.



So I am thinking something like C-z that will allow you to switch out of 
a task that is waiting for results without having to stop it with C-c.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] psql \du and \dg commands.

2008-04-09 Thread Shane Ambler

David BOURIAUD wrote:

Hello,
I don't really know since when those commands are provided by psql, but I 
found them recently and was quite annoyed by the output given by both of 
them.


Not certain since when but I would think from a very early version.

Though I find that the \du command's output is quite accurate, I wonder
why \dg gives the same information?
I would have expected to see the group names in a left column, and the
list of users that belong to each group.
I know I can get the information by fetching rows of the pg_group system
table, but I was just wondering about this issue and wanted to see what
you here would think of it.
Thanks for any suggestions about this behavior.
Thanks for any suggestions about this behavior.


Historically, old versions had a clear distinction between groups and
users. 8.1 introduced the role as used today, replacing users and groups.


The views in pg_catalog that replace the old users and groups define a
group as a role that cannot login. This is an approximation only, as any
role can have members assigned to it as if it were a group, inheriting
that role's privileges.
And NOLOGIN may be assigned to any role for more reasons than to define
it as a group.
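
For example (hypothetical role names):

CREATE ROLE staff NOLOGIN;   -- looks like a "group": cannot login
CREATE ROLE alice LOGIN;     -- looks like a "user": can login
GRANT staff TO alice;        -- alice is a member of staff, inheriting its privileges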


The \dg and \du commands in psql need to remain, as the new version can
still connect to old servers that don't support the new roles. But with
an 8.1 or newer server psql can't reliably distinguish between a user and
a group role, so it returns the same info.



I guess there is always a chance of someone making a patch that would 
hide the two options (maybe replace them with \dr?) when connected to an 
8.1 or higher server. But I wouldn't expect it any time soon.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Concurrent psql API

2008-04-09 Thread Shane Ambler

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

Shane Ambler wrote:
So I am thinking something like C-z that will allow you to switch out of  
a task that is waiting for results without having to stop it with C-c.



I agree -- we would need to have a mode on which it is not on any
connection, to which we could switch on C-z.  If all connections are
busy, there's no way to create a new one otherwise.


That would work okay for interactive use and not at all for scripts,
which makes it kind of a nonstarter.  I'm far from convinced that the
case must be handled anyway.  If you fat-finger a SQL command the
consequences are likely to be far worse than having to wait a bit,
so why is it so critical to be able to recover from a typo in a \join
argument?


I can see that a non-connected prompt would interfere with a script, but
I would think that a prompt should always be linked to a connection. It
may work to have an unconnected prompt made available from C-z, limited
to only allowing new connections or \join commands, and limited to
interactive input.


My first thoughts were that C-z would either drop back to the previous
connection or create a new connection, based either on the initial login
or on the connection you are C-z'ing out of. That would be the tricky
decision, though, which may make a limited prompt viable.


C-z input detection may also be limited to the wait-for-query-response
loop, so that it is only available when the current connection is without
a prompt.



I do think it is useful for more than typos in the \join command. What
about a slip where you forget to \g the command? Or you start a query
that seems to be taking too long, background it, and look into what is
happening. This would be most helpful to those who ssh into a machine
and then run psql from there.




(I'm also unconvinced that there won't be severe implementation
difficulties in supporting a control-Z-like interrupt --- we don't have
any terminal signals left to use AFAIK.  And what about Windows?)


That may be so and could be the decider over whether this can be added 
or not.


Unless Windows steals the input before psql gets it I don't see there 
will be a problem there. Windows may be a factor in deciding which key 
to use for this command if it is to be uniform across platforms.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Concurrent psql API

2008-04-08 Thread Shane Ambler

Tom Lane wrote:


\connect name [ optional connect params ]
\S name

This would require choosing a name for the default session, maybe -.
Or you could use 1 if you figured that people really would prefer
numbers as IDs.


+1, with name as a string; when an empty string is passed, a numerical
sequence is used as the default.



I'm not real thrilled with overloading \S with two fundamentally
different behaviors, either.  Can't we find a different string to assign
to the listing purpose?  Maybe \S without parameter should mean to
switch to the default session.


I think it seems fine. Fits with \h and \d behaviour.



Hmm, this still seems a bit notation-heavy, doesn't it?  What if \g
takes an arg indicating which connection to issue the command on:

\c conn1
\c conn2
...
CREATE INDEX ...  \g conn1
CREATE INDEX ...  \g conn2
...
\cwait
\cwait


+1 on the \g, but I would reverse the syntax -

\g conn1 CREATE INDEX...;


Not totally sure about that one, but issuing a command on a background
connection seems appealing for scripting purposes.  It eliminates the
risk that the query response comes back before you manage to switch away
from the connection; which would be bad because it would mess up your
count of how many cwait's you need.  It seems a bit more analogous to
the use of & in shell scripts, too, where you implicitly fork away from
the async command.  (Maybe \c shouldn't make the new connection
foreground either?)


\c for a new foreground connection
\cb for a new background connection?




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Concurrent psql API

2008-04-08 Thread Shane Ambler

Tom Lane wrote:


\join connID

Switch to connection connID.  If it is busy, wait for
command completion and print the result before offering
a new command prompt.


When switching to a conn we also need a non-destructive way out if it is 
busy.




\join   (or \join - as per Alvaro)

Wait for any currently busy connection's command to finish,
then \join to it.  Error if there is no busy connection.



So what you suggest is that if you have 10 busy conns running, \join
will send you to the next conn to return a result?


On that - when listing the current conns it could be useful to have some
status info with the list, indicating whether each conn is idle or what
command it is running.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



[HACKERS] auto-vacuum conflicting with long running analyze

2008-04-05 Thread Shane Ambler
I am currently attempting to import the world street map data as 
currently available from openstreetmap.org and am wondering if I will 
come across a possible problem that is mentioned on their website, which 
appears to be relevant for 8.1



From http://wiki.openstreetmap.org/index.php/Mapnik they state -

<quote>

After several hours the import may quit with an error like:-

ANALYZE planet_osm_line;
 failed: ERROR:  deadlock detected
DETAIL:  Process 28511 waits for AccessExclusiveLock on relation 1064115 
of database 18309; blocked by process 12776.
Process 12776 waits for ShareLock on transaction 572766655; blocked by 
process 28511.


Error occurred, cleaning up

This seems to be a fault in PostgreSQL and is caused when an auto-vacuum 
is attempted during the ANALYZE. The solution is to disable all 
auto-vacuums on the database. The data is not updated after the import 
so the vacuum process does nothing useful. In the postgresql.conf file 
set the option:


autovacuum = off

Then restart the database server

# /etc/init.d/postgresql-8.1 restart

Note: In Debian/Ubuntu you also need to update 
/etc/cron.d/postgresql-common to comment out the two pg_maintenance 
tasks which ...


</quote>

I can see that an analyze will be run at the end of the import, and given
that the uncompressed xml file being imported is about 77GB, I can see a
large dataset with a long-running analyze to finish off the task.


Given that the analyze will obviously take a long time, is this scenario 
likely to happen with 8.3.1? or has it been fixed since 8.1.x?



Would this be the issue fixed in 8.1.1? -
Prevent autovacuum from crashing during ANALYZE of expression index

Or is it in various autovacuum improvements from 8.3.0 ?


As it is a long slow process, it will be a while before I can verify
for sure.




--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] auto-vacuum conflicting with long running analyze

2008-04-05 Thread Shane Ambler

Alvaro Herrera wrote:

Shane Ambler wrote:

Given that the analyze will obviously take a long time, is this scenario  
likely to happen with 8.3.1? or has it been fixed since 8.1.x?


In 8.3, autovacuum cancels itself if it sees it is conflicting with
another query.


Would this be the issue fixed in 8.1.1? -
Prevent autovacuum from crashing during ANALYZE of expression index


Unrelated.


Or is it in various autovacuum improvements from 8.3.0 ?


Yes.



Sounds good. When it has run through and I can verify that, I will get
them to update their notes.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] serial arrays?

2008-03-22 Thread Shane Ambler

Joshua D. Drake wrote:


On Fri, 21 Mar 2008 12:55:26 -0400
Tom Lane [EMAIL PROTECTED] wrote:


regression=# create table foo (f1 serial[11]);
NOTICE:  CREATE TABLE will create implicit sequence "foo_f1_seq" for
serial column "foo.f1"
CREATE TABLE
regression=# \d foo
                        Table "public.foo"
 Column |  Type   |                    Modifiers
--------+---------+--------------------------------------------------
 f1     | integer | not null default nextval('foo_f1_seq'::regclass)


Should we throw an error for this?  If not, what behavior would be
sane?


Interesting? Would be to create 11 sequences that can update each
element of the array. 


Would you increment one element at a time - the first element on the
first nextval, the second element on the next... Or would you increment
the first till it was 10, then the second till it was 10? Or would you
increment each element by one for each nextval, so each element is the
same number (using the same sequence)?


I would think the most elegant solution would be to create an
array_sequence type, which would open up a great multitude of rule
definitions for how each element is incremented. More likely a simple
syntax that ends up with a complex list of rules saved for the sequence,
which could be hard to decipher later by the next DBA to come along.


As much as I can see at least one use for this (think number plate 
sequences - 0-36 for each element) and some curiosity as a challenging 
project, I do think this would be better handled by functions designed 
specifically for the app that wants them.



Hmmm, it could be an intriguing feature, but I'm not sure it would get
much use.


CREATE SEQUENCE_ARRAY my_silly_seq AS
  integer[11] ON INCREMENT APPLY FROM ELEMENT 0,
  ELEMENT 0 FROM 0 TO 36 ON LIMIT INCREMENT ELEMENT 1 AND RESET TO 0,
  ELEMENT 1 FROM 0 TO 9 ON LIMIT INCREMENT ELEMENT 2 AND RESET TO 0,
  ...

Could there be a char[] array that can increment from 0-9 then a-z
before rolling back to 0?


Guess I got too much time on my hands... I'll go find something better 
to do now. ;-)



 Sane? None. We should throw an error.

+1 for the error



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Skytools committed without hackers discussion/review

2007-10-10 Thread Shane Ambler

Magnus Hagander wrote:


If it is this irreplacable killer feature, it should *not* be in contrib.
It should be in the core backend, and we should be discussing if we can
bend the rules for that. This is the proper forum for discussing that, so
let's bring that question to the table.


+1 there, I don't think it should go into contrib just because it was a
late entry. It really seems to be a matter of whether it gets into 8.3
or 8.4.



Our beta-1 is already fairly broken (the locale stuff on our most
downloaded platform), so perhaps we should pull that one back, put this
stuff in the backend, and try to get a beta2 out ASAP?



The question there is how long will it take to reach a decision on where
the patch belongs? (8.3, 8.4 or contrib)



Putting it in contrib just because we were too late to put it in the
backend, but it is really important for our users, just doesn't
make sense.


+1



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Skytools committed without hackers discussion/review

2007-10-09 Thread Shane Ambler

Devrim GÜNDÜZ wrote:

Hi,

On Tue, 2007-10-09 at 16:50 -0700, Joshua D. Drake wrote:

IMO, the patch is reverted, and submitted for 8.4 or pgfoundry.


You know, txid was discussed in Slony-I + Skytools lists for a
reasonably long time, and Tom also commented in that thread. I agree
that we broke the policy this time, but this does not mean the end of
the world.


If it has been discussed and planned for so long then it should have 
been considered for inclusion earlier, not just slipped under the radar. 
Even if at feature freeze it wasn't ready it could have been discussed 
whether it could be added after feature freeze if it reached an 
acceptable standard.


If Slony or Skytools need this for a new feature in their x.y release 
then it can be a patch that is included with their release or be a 
prerequisite for their version x.y and detailed in their install steps.


Then they can discuss getting the change accepted into core or contrib 
for the next pg release.



just my .02c


--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] [SQL] Why does the sequence skip a number with generate_series?

2007-10-05 Thread Shane Ambler

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

Shane Ambler wrote:

CREATE TABLE jefftest ( id serial, num int );
INSERT INTO jefftest (num) values (generate_series(1,10));
INSERT INTO jefftest (num) values (generate_series(11,20));
INSERT INTO jefftest (num) values (generate_series(21,30));



Don't use set-returning functions in scalar context.


I think what is actually happening is that the expanded targetlist is

nextval('seq'), generate_series(1,10)

On the eleventh iteration, generate_series() returns ExprEndResult to
show that it's done ... but the 11th nextval() call already happened.
If you switched the columns around, you wouldn't get the extra call.

If you think that's bad, the behavior with multiple set-returning
functions in the same targetlist is even stranger.  The whole thing
is a mess and certainly not something we would've invented if we
hadn't inherited it from Berkeley.

regards, tom lane


Would a re-write be something worth adding to the todo list?

and/or maybe add something about this to the manual?
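
For reference, a minimal way to observe the phantom call Tom describes
(behaviour of the 8.x releases under discussion):

CREATE TEMP SEQUENCE demo_seq;
SELECT nextval('demo_seq'), generate_series(1,3);
-- three rows come back, but currval('demo_seq') is now 4: the fourth
-- nextval() had already fired when generate_series() reported end-of-set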



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] [SQL] Why does the sequence skip a number with generate_series?

2007-10-04 Thread Shane Ambler

Stephan Szabo wrote:

On Tue, 2 Oct 2007, Jeff Frost wrote:


I expected these numbers to be in sync, but was surprised to see that the
sequence skips a value after every generate_series.

CREATE TABLE jefftest ( id serial, num int );
INSERT INTO jefftest (num) values (generate_series(1,10));
INSERT INTO jefftest (num) values (generate_series(11,20));
INSERT INTO jefftest (num) values (generate_series(21,30));


It seems to do what you'd expect if you do
 INSERT INTO jefftest(num) select a from generate_series(1,10) as foo(a);
 INSERT INTO jefftest(num) select a from generate_series(11,20) as foo(a);
 INSERT INTO jefftest(num) select a from generate_series(21,30) as foo(a);

I tried a function that raises a notice and called it as
 select f1(1), generate_series(1,10);
and got 11 notices so it looks like there's some kind of phantom involved.



That's interesting - it might need an answer from the core hackers.
I am posting this to pgsql-hackers to get their comments and feedback.
I wouldn't count it as a bug, but it could be regarded as an undesirable
side effect.


My guess is that what appears to happen is that the sequence is created
by incrementing as part of the insert steps, and the test to check the
end of the sequence is -

if last_inserted_number > end_sequence_number
    rollback_last_insert

This would explain the skip in sequence numbers.

My thoughts are that -

if last_inserted_number < end_sequence_number
    insert_again

would be a better way to approach this. Of course you would also need to
check that (last_insert + step_size) isn't greater than the
end_sequence_number when a step_size is given.


I haven't looked at the code so I don't know if that fits easily into 
the flow of things.


The 'as foo(a)' test would fit this, as the sequence is generated into
the equivalent of a temporary table (the same as a subselect), then used
as insert data. The rollback would be applied during the temporary table
generation, so it won't show when the data is copied across to fulfill
the insert.


Maybe the planner or the generate series function could use a temporary 
table to give the same results as select from generate_series()



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Money type todos?

2007-03-21 Thread Shane Ambler

August Zajonc wrote:

Agreed with Tom on this one. Full usage of money is beyond tagged types
etc. For example, when you earn money in another currency, it is the
time at which you earn it that describes its value. So for P&L accounts
there is generally no change in exchange rates over time and you need to
track what the rate was at time of earning. Solution is to date earnings
and have a table of exchange rates by day.


Personally I think a true money type should hold the numeric value and
optionally the currency (similar to timestamp with time zone), and have
support functions that handle the i/o conversion (text such as
'$US1,000.00' to money) as happens now, as opposed to the db designer
storing it in different columns.
But I think the data returned should be purely numeric unless a function
is used to get pretty currency formatting or to request the currency.
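
For contrast, the separate-columns design this would replace looks
something like (hypothetical table):

CREATE TABLE payments (
    amount   numeric(12,2),
    currency char(3)        -- ISO 4217 code, e.g. 'USD'
);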



For balance sheet accounts, their value at a given point in time in a
home currency is of course dependent on exchange rates, which creates the
currency gain or loss on the P&L side, the account that captures
exchange rate movements. But this is dependent on the relative
differences between the rates when every dollar was earned and current
rates.


Well the exchange rate at the time the payment is received is only the 
speculative (possible) value of a foreign currency and may not account 
for exchange fees either. This speculative value changes daily (hourly) 
and is not really relevant to the money amount recorded. The speculative 
value is only relevant at the time a report is run to show current value 
in a common currency.


If you have bank accounts in different countries then the exchange rate 
at the time of running, say a balance sheet, will give you the 
speculative value in a common currency of your foreign bank accounts.


The true value to you will only be realised when you transfer the 
foreign money to your local account and get $xx affecting your local 
account balance after exchange fees using the exchange rate on offer at 
the time you initiate the exchange.



Darcy had suggested removing the currency symbol. That is a change I'd
support. The only other nice thing would be user defined precision, but
can live without that as most currencies work under nnn.mm. Speed is
everything in these systems. For a complex general system you often can
get away with integers if you define at the app layer the handling
(including a lookup in system for format, type).


As I mentioned before, I think the returned data should be a pure
numeric value unless formatted data is requested. I also agree about the
precision setting: saving only ddd.cc is not universal enough for all
applications. Some industries work with tenths or hundredths of a cent
for their pricing and calculations, and are only rounded to a whole cent
on the final invoice when payment is due.


Not sure about America but here in Australia petrol is sold at xxx.x 
cents per litre with the total being rounded to a whole cent when 
payment is made. And our smallest coin is 5 cents so cash paying 
customers are also rounded to 5 cent increments.
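
A hedged sketch of that sub-cent precision using the existing numeric
type (hypothetical table and column names):

CREATE TABLE fuel_sales (
    litres          numeric(8,3),
    cents_per_litre numeric(6,1)    -- e.g. 137.9
);

-- rounded to a whole cent only at invoice time
SELECT round(litres * cents_per_litre / 100.0, 2) AS total_dollars
FROM fuel_sales;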



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] PostgreSQL - 'SKYLINE OF' clause added!

2007-03-11 Thread Shane Ambler

Naz Gassiep wrote:

Let us not do the same to 
SQL and implement SKYLINE on our own, only to have other DBMS vendors 
implement it in different ways and then finally when the SQL standard 
includes it they try to make some kind of average approximation of the 
implementations resulting in *none* of the DBs being compliant. Then 
we'll be between the rock of breaking backwards compatibility and the 
hard place of unwarranted standards non-compliance.


While Josh did point out the value of being in the leading group as far
as implementing new functionality goes, I feel that it has to be weighed
against the need to not strike out too aggressively, potentially
isolating ourselves with excessive non-standard syntax or behavior.


While I am convinced there is a strong use case for this functionality 
and we should definitely start looking at it, I don't see why we should 
be in a rush to get it into core. People have survived without it up to 
now, I don't think our userbase will suffer if it is implemented 6 
months after foo commercial DB implements it, at least, not as much as 
it will suffer if we start drifting away from standards compliance.


And where did most of the SQL standard come from? A lot of it copies or
is based on either the first db to implement a feature or the one to
implement the best syntax.

And how much of the standard became standard because most of the DBs
had already implemented similar features?


Some things can syntactically be expressed more than one way, while 
others are limited in ways to coherently express what you want to achieve.


If we consider this thoroughly and compile a suitable syntax that covers 
all bases it could be used as the basis of the standard definition or be 
close to what ends up in the standard.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Shane Ambler



Note to Nikhil: Make sure the new syntax doesn't prevent partitions from
being placed upon multiple tablespaces in some manner, at CREATE TABLE
time.


What if the syntax was something like -

CREATE TABLE tabname (
...
...
 ) PARTITION BY
 HASH(expr)
| RANGE(expr)
| LIST(expr)
[PARTITIONS num_partitions] /* will apply to HASH only for now*/
[PARTITION partition_name CHECK(...) [USING TABLESPACE tblspcname],
 PARTITION partition_name CHECK(...) [USING TABLESPACE tblspcname]
 ...
];


And (if we use the ALTER TABLE to add partitions)

ALTER TABLE tabname
ADD PARTITION partition_name CHECK(...)
[USING TABLESPACE tblspcname];


Of course ALTER TABLE childtable SET TABLESPACE tblspcname; should not 
cause any probs.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Auto creation of Partitions

2007-03-08 Thread Shane Ambler

Jim C. Nasby wrote:

On Thu, Mar 08, 2007 at 01:52:27PM +0530, NikhilS wrote:

I think it'd be great to make adding and removing partitions as
simple as ALTER TABLE. I don't think that DELETE should be the
mechanism to drop a partition, though. Again, DML statements
shouldn't be performing DDL.


Since partition is inheritance-based, a simple DROP or  NO INHERIT will do
the job to deal with the partition. Do we want to reinvent additional syntax
when these are around and are documented?


Well, if the syntax for adding a new partition eventually ends up as
ALTER TABLE ADD PARTITION, then it would make more sense that you remove
a partition via ALTER TABLE DROP PARTITION.


This follows on from the suggestion I made - taken along the lines of
the subject "auto creation of partitions" - where I suggested the syntax
of partition check(month of mydatecol), with a new partition created as
data was entered. With this scenario, dropping a partition when it was
empty would complement the creation of a new partition as needed.


Given that there seems to be no real support for going with auto
maintenance, where new partitions are added as needed, the auto dropping
of empty partitions would also not apply.


That leaves us with only specific add partition / drop partition
commands, and having the parent table pick up rows not matching any
partition check criteria.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] PostgreSQL - 'SKYLINE OF' clause added!

2007-03-07 Thread Shane Ambler

Tom Lane wrote:

Josh Berkus josh@agliodbs.com writes:
I think if the code is good enough, and we can avoid horrible non-standard 
syntax extensions, they should go in.   We have to defend our title as most 
advanced database and having stuff like Skyline first (before DB2 or MS) 
goes a long way for that.


Well, whether it's horrible or not is in the eye of the beholder, but
this is certainly a non-standard syntax extension.


Being non-standard should not be the only reason to reject a worthwhile 
feature. Do you really believe that the SQL standard covers every 
feature that a RDBMS could ever want to implement? Do you think that the 
current non-standard features of PostgreSQL should be removed?



My questions about whether to adopt it have more to do with
cost/benefit.  I haven't seen the patch, but it sounds like it will be
large and messy; and it's for a feature that nobody ever heard of before,
let alone one that the community has developed a consensus it wants.
I'm not interested in adopting stuff just because DB2 hasn't got it.


Partially agree but I do think it is worth looking at to see if some or 
all of the feature is worth implementing. The fact that several 
different groups have been mentioned to be working on this feature would 
indicate that it is worth considering. Maybe one of the other groups 
will have implemented it better than the first off the rank. Maybe our 
core developers can work out a better way to implement these features.


A few people on this list have said they are interested in this.


It's also worth noting that what we've got here is a large patch
developed, by students, completely outside our normal development
process; so the odds that it's going to be anywhere near acceptable are
low.  I think the last time we applied a patch that met that description
was the INTERSECT/EXCEPT patch in 1999 ... maybe you don't remember
what a fiasco that was, but I do.


True but the quals he has listed on his web pages look impressive and 
probably give him a little reason to have his work considered/looked at. 
He may just end up being a main PostgreSQL developer in the future.



Sorry to be a thrower of cold water, but I just don't see that this
comes anywhere near being something we should be eager to accept.


True, we shouldn't just say "sounds good, let's put it in", but some
indication that this feature is along the lines of what users want
suggests we should be asking -


Do we want this or a similar feature?
Is the theory behind this feature solid?
Can the same end results be gained with other existing methods?
Is the implementation offered worth considering?
Has it been developed to meet the PostgreSQL developer guidelines?
Is it reasonable to work on it to reach a level of quality/performance 
that we will be happy to include?

Can we implement this feature better ourselves?
Do we want to start this feature from scratch ourselves?



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] Auto creation of Partitions

2007-03-06 Thread Shane Ambler

NikhilS wrote:

Hi,

On 3/6/07, Peter Eisentraut [EMAIL PROTECTED] wrote:


NikhilS wrote:
 iv) Based on the PRIMARY, UNIQUE, REFERENCES information specified,
 pass it on to the children tables.

How will you maintain a primary key in such a table, considering that
indexes can't span multiple tables?



We will not (I know its a hard thing to do :) ), the intention is to use
this information from the parent and make it a property of the child table.
This will avoid the step for the user having to manually specify CREATE
INDEX and the likes on all the children tables one-by-one.


I think a way can be devised to maintain the primary key and unique
constraints.
If a search done on the parent table lets the planner rewrite the query
as a union (or union all) of all child tables that relate to the where
clause (or all child tables, if the where clause is not on the column/s
used to partition), then this concept should be able to be converted to
indexes as well: when a primary or unique index of a child table is
inserted into, each of the related child indexes is consulted to ensure
uniqueness.
This would only apply if the partitioning was not done by the primary or
unique column.



1) Whether we should use triggers/rules for step number (iii) above.
 Maybe rules is the way to go.

Since this would basically be a case of the updatable rules problem, you
should review those discussions in the past to check whether the issues
mentioned there don't interfere with that plan.


The rules mentioned here will be to specify that all the
inserts/updates/deletes should go into proper children tables instead of 
the

parent. I do not see the updateable rules problem with regards to this, but
will check out the archives for discussion on this related to partitioning.



I would think that a trigger would be a better choice as I see the need 
(or at least the possibility) for more than just a rewrite. When a value 
is inserted that is outside of a condition currently covered by an 
existing child table then a new child will need to be spawned to contain 
the new data.
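
For reference, a sketch of the manual inheritance pattern this would
automate, following the documented partitioning recipe (hypothetical
names):

CREATE TABLE measurements (mydate date, val int);

CREATE TABLE measurements_200703 (
    CHECK (mydate >= '2007-03-01' AND mydate < '2007-04-01')
) INHERITS (measurements);

CREATE RULE measurements_ins_200703 AS
    ON INSERT TO measurements
    WHERE (mydate >= '2007-03-01' AND mydate < '2007-04-01')
    DO INSTEAD INSERT INTO measurements_200703 VALUES (NEW.*);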


So say we partition by year and month of a date column? As new dates are 
added new child tables would be created each month. Or is this beyond 
the current plan and left to manual creation?


Will ALTER TABLE be extended to handle partitions? This will allow 
partitioning existing tables (maybe without blocking access?) and allow 
things like ALTER TABLE mytable ADD PARTITION (mydate within 200703)
and ALTER TABLE mytable DROP PARTITION (mydate within 199912) or would 
dropping be covered by DELETE FROM mytable where mydate = 199912 ?


Could such a syntax be devised for date columns? (month of mydate) or
similar to auto create partitions based on the year and month of a date
column? Or will we just do CHECK(mydatecol >= 1/3/07 and mydatecol <=
31/3/07) for each month of data? Also (day of mydatecol) to partition
based on year and day of year.


Another syntax possibility - range(myserialcol of 50) where new 
child tables are created every 50 rows?



Maybe I'm looking at auto-maintenance which is beyond any current planning?



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] ERROR: operator does not exist: integer !=- integer

2007-03-04 Thread Shane Ambler

Andrew - Supernews wrote:

On 2007-03-04, William ZHANG [EMAIL PROTECTED] wrote:

Here are the steps to reproduce it in CVS HEAD:
backend> select -1 !=-1;


This arguably isn't a bug, because != is not a standard SQL operator, and
therefore !=- can legitimately be defined as a single operator by the user.



I missed the first post and can't seem to search for it - so correct me 
if I am missing something.


Isn't the problem here a missing space? != is a valid operator and -1 is 
the value you are comparing to. !=-1 is not valid but != -1 is correct 
and what I assume you are looking to achieve.


The negation operator goes with the int being negated and is not part of
the comparison operator !=; the space is needed there to separate the two.
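
For example:

select -1 != -1;   -- valid: != compares -1 with -1 (result false)
select -1 !=-1;    -- error: the lexer reads "!=-" as a single, undefined operator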




--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



Re: [HACKERS] PostgreSQL - 'SKYLINE OF' clause added!

2007-03-03 Thread Shane Ambler

ranbeer makin wrote:

We at International Institute of Information Technology (IIIT) Hyderabad,
India, have extended the Postgres database
system with the skyline operation. For this work, we were guided by our
Prof. Kamalakar Karlapalem
(http://www.iiit.ac.in/~kamal/).

We have extended SQL 'SELECT' clause by an optional 'SKYLINE OF' clause in
version 8.0.3. The changes are done in parser, transformation,
planner/optimizer (a bit) and executor stages. For its execution, two novel
algorithms  - BNL (Block Nested Loop) and SFS
(Sort Filter Skyline) - are also implemented.


From what I read on your web pages it sounds interesting and may be a 
worthwhile addition to PostgreSQL. I'll have a look at it when it is 
available.


Can this piece of work contribute to PostgreSQL? If yes, then we'll send
out a detailed report of this project including changes made, issues
involved/needing to be solved, limitations, future work, and the source
code etc.


I am not the one making the choice of accepting your work, but from what
I know you would want to make your patch available so others can review
the stability/quality of your work and decide whether there is enough
demand for the feature to have it included in the main distribution,
either as part of the main code or within the contrib section.


One option you have is to start a project at pgfoundry.org so others can 
access and try your contribution. This will allow your work to be 
available and to be tested by those interested in this feature. If your 
work proves to be worthwhile and in demand it can progress from there 
into the main distribution.


You most probably want to look at porting your changes to the latest
PostgreSQL release as well.



Thanks very much.

Regards,
Nikita
Ranbeer

--
http://students.iiit.ac.in/~nikita/
http://students.iiit.ac.in/~ranbeer/




--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz



[HACKERS] Erronous sort used in query plan

2007-01-07 Thread Shane Ambler
I am putting together searches on the catalog info and came up with a
select that was rather slow. I noticed that in the explain analyze there
is a sort step on one of the left joins which I don't think belongs
there.


I found the small error in my query (using tl.oid instead of tr.oid and 
tres.oid) that caused the query to slow down and generate the sort in 
the plan, but I am not sure that the given condition should generate a 
sort step at all - and if it does, then I believe the choice should be a 
(more?) stable decision.
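
For clarity, the two joins in error and what they should have been:

left join pg_catalog.pg_type tr on tr.oid=o.oprright       -- was tl.oid=o.oprright
left join pg_catalog.pg_type tres on tres.oid=o.oprresult  -- was tl.oid=o.oprresult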


Removing one of the left join's that is in error (tr or tres) changes 
the column that is sorted, neither of which is related to the join/s 
that appear to generate the step.


With tl, tr and tres in place the sort is performed on pjoin.oid.

Removing or correcting either tr or tres the sort is changed to perform 
on olsort.oid.


Removing or correcting both tr and tres removes the sort from the plan.

Also - removing all the pg_operator joins, the sort is still there (on 
pjoin.oid), but if I remove one of the erroneous joins as well the sort 
goes. (Correcting one of the joins leaves the sort there, but removing it 
removes the sort.)


Using postgres 8.2.0 on Mac OSX 10.4.8


The full query is -

explain analyze
SELECT
o.oid as OID
, n.nspname as Schema
, o.oprname as Name
, r.rolname as Owner
, CASE WHEN o.oprkind='b' THEN 'infix(left and right)'
WHEN o.oprkind='l' THEN 'prefix (left)'
WHEN o.oprkind='r' THEN 'postfix (right)'
END as Kind
, CASE WHEN o.oprcanhash='t' THEN 'Yes'
WHEN o.oprcanhash='f' THEN 'No' END as "Supports Hash Joins"
, tl.typname as "Left Operand"
, tr.typname as "Right Operand"
, tres.typname as "Result Type"
, ocom.oprname as "Commutator Operator"
, onegate.oprname as "Negator Operator"
, olsort.oprname as "Left Sort Operator"
, orsort.oprname as "Right Sort Operator"
, oltcm.oprname as "Less Than Operator"
, ogtcm.oprname as "Greater Than Operator"
, pcode.proname as "Operator Function"
, prest.proname as "Restriction Selectivity Function"
, pjoin.proname as "Join Selectivity Function"


FROM pg_catalog.pg_operator o
left join pg_catalog.pg_namespace n ON n.oid = o.oprnamespace
left join pg_catalog.pg_roles r on r.oid=o.oprowner
left join pg_catalog.pg_type tl on tl.oid=o.oprleft
left join pg_catalog.pg_type tr on tl.oid=o.oprright
left join pg_catalog.pg_type tres on tl.oid=o.oprresult
left join pg_catalog.pg_operator ocom on ocom.oid=o.oprcom
left join pg_catalog.pg_operator onegate on onegate.oid=o.oprnegate
left join pg_catalog.pg_operator oneg on oneg.oid=o.oprnegate
left join pg_catalog.pg_operator olsort on olsort.oid=o.oprlsortop
left join pg_catalog.pg_operator orsort on orsort.oid=o.oprrsortop
left join pg_catalog.pg_operator oltcm on oltcm.oid=o.oprltcmpop
left join pg_catalog.pg_operator ogtcm on ogtcm.oid=o.oprgtcmpop
left join pg_catalog.pg_proc pcode on pcode.oid=o.oprcode
left join pg_catalog.pg_proc prest on prest.oid=o.oprrest
left join pg_catalog.pg_proc pjoin on pjoin.oid=o.oprjoin

WHERE n.nspname like 'public'



I have attached a copy of the query and plan.
--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz
explain analyze
SELECT  
o.oid as OID 
, n.nspname as Schema 
, o.oprname as Name 
, r.rolname as Owner 
, CASE WHEN o.oprkind='b' THEN 'infix(left and right)'
WHEN o.oprkind='l' THEN 'prefix (left)'
WHEN o.oprkind='r' THEN 'postfix (right)'
END as Kind
, CASE WHEN o.oprcanhash='t' THEN 'Yes' WHEN o.oprcanhash='f' THEN 'No' END as
"Supports Hash Joins"
, tl.typname as "Left Operand"
, tr.typname as "Right Operand"
, tres.typname as "Result Type"
, ocom.oprname as "Commutator Operator"
, onegate.oprname as "Negator Operator"
, olsort.oprname as "Left Sort Operator"
, orsort.oprname as "Right Sort Operator"
, oltcm.oprname as "Less Than Operator"
, ogtcm.oprname as "Greater Than Operator"
, pcode.proname as "Operator Function"
, prest.proname as "Restriction Selectivity Function"
, pjoin.proname as "Join Selectivity Function"


FROM pg_catalog.pg_operator o 
left join pg_catalog.pg_namespace n ON n.oid = o.oprnamespace 
left join pg_catalog.pg_roles r on r.oid=o.oprowner  
left join pg_catalog.pg_type tl on tl.oid=o.oprleft
left join pg_catalog.pg_type tr on tl.oid=o.oprright
left join pg_catalog.pg_type tres on tl.oid=o.oprresult
left join pg_catalog.pg_operator ocom on ocom.oid=o.oprcom
left join pg_catalog.pg_operator onegate on onegate.oid=o.oprnegate
left join pg_catalog.pg_operator oneg on oneg.oid=o.oprnegate
left join pg_catalog.pg_operator olsort on olsort.oid=o.oprlsortop
left join pg_catalog.pg_operator orsort on orsort.oid=o.oprrsortop
left join pg_catalog.pg_operator oltcm on oltcm.oid=o.oprltcmpop
left join pg_catalog.pg_operator ogtcm on ogtcm.oid=o.oprgtcmpop
left join pg_catalog.pg_proc pcode on pcode.oid=o.oprcode
left join pg_catalog.pg_proc prest on prest.oid=o.oprrest
left join pg_catalog.pg_proc pjoin on pjoin.oid=o.oprjoin


WHERE n.nspname like 'public' 


ORDER BY lower(n.nspname), lower(o.oprname);

Re: [HACKERS] Erroneous sort used in query plan

2007-01-07 Thread Shane Ambler

Tom Lane wrote:

Shane Ambler [EMAIL PROTECTED] writes:
I am putting together searches on the catalog info and came up with a 
select that was rather slow and I noticed that in the explain analyze 
there is a sort step on one of the left joins which I don't think 
belongs there.


Well, it's certainly necessary in context because it's preparing the
data for the merge join immediately above it.  The correct question
is why is the thing using a merge join here, when a hash join would be
cheaper?

I dug through this and found out that the hash join is estimated as
cheaper, right up till the last step of cost_hashjoin:

/*
 * Bias against putting larger relation on inside.  We don't want an
 * absolute prohibition, though, since larger relation might have better
 * bucketsize --- and we can't trust the size estimates unreservedly,
 * anyway.  Instead, inflate the run cost by the square root of the size
 * ratio.  (Why square root?  No real good reason, but it seems
 * reasonable...)
 *
 * Note: before 7.4 we implemented this by inflating startup cost; but if
 * there's a disable_cost component in the input paths' startup cost, that
 * unfairly penalizes the hash.  Probably it'd be better to keep track of
 * disable penalty separately from cost.
 */
if (innerbytes > outerbytes && outerbytes > 0)
run_cost *= sqrt(innerbytes / outerbytes);

In this example, the data volume from the join of everything else is
estimated as less than what needs to be fetched from pg_proc, and so
this bias kicks in, and the cost estimate roughly doubles.
Unfortunately, because it's a LEFT JOIN, we'll never consider hashjoin
in the other direction and so the hash loses out to the mergejoin.

It seems clear to me that we ought not impose a bias unless the join
type is such that both directions of hashing are feasible.  I wonder
also if the bias is too large ... but there's not really evidence for
or against that in this example.  The point is that this code implicitly
assumes both directions will be tried, and they won't.



I think that the selected sort (or at least the merge join) is incorrect - 
the sorted column (or both of them, actually) links the current record in 
pg_operator to the oid in pg_proc, so it will only ever return one row.


If one of the pg_type joins is changed, it instead sorts on the oid of 
pg_operator as the foreign table - again, a join that will only return 
one row.


I would have thought that joining on the foreign oid would indicate to the 
planner that it will only find one foreign row to link with.



I can see that the error I made created a funny (probably useless, 
actually) link that would throw things out, but I would expect it to
produce bad planning for the two joins that are in error, not for an 
unrelated one-to-one join. If a sort/merge join were created from this 
and used to satisfy one of the erroneous joins I would accept that as 
part of what I unintentionally requested, but instead it generates a 
sort/merge join on a join that links one current record to one foreign 
record and has nothing in common with the joins in error.




--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Strange pgsql crash on MacOSX

2006-12-23 Thread Shane Ambler

Shane Ambler wrote:

Tom Lane wrote:

Shane Ambler [EMAIL PROTECTED] writes:

postgres=# \q
psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
for freed object - object was probably modified after being freed, break
at szone_error to debug
psql(24931) malloc: *** set a breakpoint in szone_error to debug
Segmentation fault


I think we've seen something like this before in connection with
readline/libedit follies.  Does the crash go away if you invoke
psql with -n option?  If so, exactly which version of readline or
libedit are you using?





psql -n stops the error.



I just found out the problem.

psql_history - I had earlier tried to copy from a text file that was 
utf8 and came up with some errors; I guess these got into the history 
file and stuffed it up.


Renamed it so a new one was created and all is fine now.

--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] Strange pgsql crash on MacOSX

2006-12-22 Thread Shane Ambler
 processors, PowerPC G4  (3.2),
1.25 GHz, 2 GB
Graphics: NVIDIA GeForce4 MX, GeForce4 MX, AGP, 32 MB
Memory Module: DIMM0/J21, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM1/J22, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM2/J23, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM3/J20, 512 MB, DDR SDRAM, PC2600U-25330
AirPort: AirPort, 9.52
Network Service: Built-in Ethernet, Ethernet, en0
PCI Card: pci-bridge, pci, SLOT-3
PCI Card: firewire, ieee1394, 1x0
PCI Card: usb, usb, 1x1
PCI Card: usb, usb, 1x1
PCI Card: pci167e,225a, , 1x1
Parallel ATA Device: LITE-ON DVD SOHD-167T,
Parallel ATA Device: WDC WD1200JB-00FUA0, 111.79 GB
Parallel ATA Device: IBM-IC35L120AVVA07-0, 115.04 GB
USB Device: Apple Pro Keyboard, Mitsumi Electric, Up to 1.5 Mb/sec, 500 mA
USB Device: i350, Canon, Up to 12 Mb/sec, 500 mA
FireWire Device: unknown_device, unknown_value, Up to 400 Mb/sec



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] effective_cache_size vs units

2006-12-19 Thread Shane Ambler

Peter Eisentraut wrote:

Tom Lane wrote:

Nor do I believe that we'd ever accept a future patch that made
the distinction between kb and kB significant --- if you think
people are confused now, just imagine what would happen then.


As I said elsewhere, I'd imagine future functionality like a units-aware 
data type, which has been talked about several times, and then this 
would be really bad.




Most if not all of us here with computer knowledge (particularly at the 
programming level) know the difference between capital and lowercase 
memory/data size abbreviations.


Case insensitive size measurements don't matter if you actually know 
what the abbreviations mean.


The case where case matters ;-) is b and B (bits and Bytes for those 
that don't know the diff)


And if you don't know the difference between m and M - one is a portion 
of and the other is a multiple of. So mB would technically mean 0.001 of 
a byte. I'd like to see you allocate that!!


As is the case of many english words - the context of the usage makes a 
big difference in the interpretation.


Given that the purpose of the effective_cache_size setting (and similar) 
is to specify the amount of memory you want allocated/limited to, then 
that context allows you to assume that all unit abbreviations are 
specifying bytes/kilobytes/megabytes/gigabytes and not 
bits/kilobits/millibits/millibytes etc
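
For concreteness, the 8.2-style unit suffixes under discussion (kB, MB and 
GB being the accepted byte-unit spellings) look like this in postgresql.conf:

shared_buffers = 32MB
work_mem = 8192kB
effective_cache_size = 1GB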



As for the future - well, TB is getting more common, petabytes of 
storage has been talked about, 64bit systems can access exabytes of ram. 
Next would be zettabytes and yottabytes.


Unless we start a roman numeral system for amounts bigger than that then 
I seriously doubt that we will hit any confusion with future usage. (and 
I doubt in our lifetimes) Though I suppose with storage expansion rates 
increasing the way they have the last few years we may be using 
yottabyte hard drives on our 256bit systems with 512 zettabytes of ram 
in about 15 years ;-)


That might make it around the end of life for the 8.0 branch so maybe we 
need to consider handling these future storage needs soon?


Maybe in 40 years we will all retire with mega-yotta-byte drives in our 
pda watches?



As for units aware datatypes - what options are available will need to 
be decided at implementation time. Will we allow megabit (Mb) size 
allocations or only megabyte? I would say bits would be clearly 
specified as such (bit instead of b)




Let's skip any flame wars on this and concentrate on the humorous future 
storage sizes.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] libpq.a in a universal binary

2006-12-14 Thread Shane Ambler

Dave Page wrote:

Shane Ambler wrote:



# make distclean
# CFLAGS="-isysroot /Developer/SDKs/MacOSX10.4u.sdk -arch ppc -arch i386" \
  LDFLAGS="-Wl,-syslibroot,/Developer/SDKs/MacOSX10.4u.sdk -arch ppc -arch i386" \
  ./configure --with-openssl --prefix=/usr/local
# make all


After reading the Apple tech doc on this 
http://developer.apple.com/technotes/tn2005/tn2137.html
I am wondering if we can add Mac universal binary support to the TODO 
list.


I haven't tested this yet but from what I gather adding the 
--disable-dependency-tracking option to configure will allow the 
option of setting the CFLAGS and LDFLAGS with -arch ppc -arch i386 
and getting the universal binary build.


The PostgreSQL build doesn't use the dependency tracking (at least not 
by default) so that's not an issue. pgAdmin does on the other hand, and 
the build will fail loudly if you don't disable tracking when you 
configure.


Another possibility (the better option) would be to add an option to 
build universal binaries - such as --host=mac_universal, or would that 
be --host=darwin_universal - to automate the CFLAGS and LDFLAGS 
settings. A --with-isysroot= option should also be added to support 
non-standard and future SDK setups.


In pgAdmin we figure out what to build for based on the wxWidgets 
installation. In PostgreSQL's case, I would imagine we'd need to list 
the architectures to include in the build per the example above however 
it was automated (you might want ppc and ppc64 but not intel for example).


You could have a few options -
--host=darwin-ppc-i386
--host=darwin-ppc-ppc64-i386-amd64
etc.

Maybe this could be better handled by splitting the string after darwin- 
instead of defining many different single hosts.


If host starts with "darwin" and len(host) > 6
    arch_string[] = explode("-", ltrim(host, 7))
    for each x: CFLAGS = CFLAGS + " -arch " + arch_string[x]
else
    build for current cpu

Maybe the --host= options with all the different possibilities won't be 
taken in, but I think a TODO should at least be added (not a high 
priority) to support building universal binaries - even if it only fixes 
the current failure point, so that one make command can produce the end 
product from a partly manual configure.

And this would apply to the client libs more than the server itself.
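
Once a build does succeed, Apple's standard lipo tool will confirm what a
library actually contains, e.g.:

lipo -info /usr/local/lib/libpq.dylib

which should report something like "Architectures in the fat file: ...
ppc i386" for a successful universal build.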

FWIW, from memory the PostgreSQL build fails at the point at which it 
links all the SUBSYS.o's.




Actually, now that I have checked it on a clean source tree (8.2.0) and 
not a make/make clean'd source tree, it does appear to get to the SUBSYS.o's.

(When I make clean'd and then set CFLAGS etc it stopped at the beginning.)

As I said, I hadn't looked into it much and my first impression of trying 
this gave me misleading thoughts of what was needed.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Help me pack up postgresDB with my windows application.

2006-11-14 Thread Shane Ambler

du li wrote:

Dear hackers,

  I'm working on a Windows application in C# and use npgsql to connect to a postgres DB. I'm eager to learn how to make a single setup file which includes both the Windows application and the postgres DB. My development environment is Visual Studio 2003 and Framework 1.1.

  I don't know if there is a convenient way to pack up a postgres DB, and I'm not sure which files and registry keys I need to pack, as well as how to start up a postgres service after installation.

  I know many people like you have done a wonderful job on postgres. I just want to reduce the steps and config operations, and keep the database password set, when our users install the application.


http://pgfoundry.org/projects/pginstaller/

contains a link to the source used to build the windows binary installer 
- what you want will be in there.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] [HACKERS] Replication documentation addition

2006-10-25 Thread Shane Ambler

Bruce Momjian wrote:


OK, does that mean we mention EnterpriseDB in the section about Oracle
functions?  Why not mention MS SQL if they have a better solution?  I
just don't see where that line can clearly be drawn on what to include.
Do we mention Netezza, which is loosely based on PostgreSQL?   It just
seems very arbitrary to include commercial software.  If someone wants
to put it on a wiki, I think that would be fine because that doesn't
seem as official.


I agree that the commercial offerings shouldn't be named directly in the 
docs, but it should be mentioned that some commercial options are 
available and a starting point to find more information.


If potential new users look through the docs and it says no options 
available for what they want or consider they will need in the future 
then they go elsewhere, if they know that some options are available 
then they will look further if they want that feature.


something like:
"There are currently no open source solutions available for this option 
but there are some commercial offerings. More details of some available 
solutions can be found at postgresql.org/support/"




--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] 8.1.4 verified on Intel Mac OS 10.4.8

2006-10-18 Thread Shane Ambler

Douglas Toltzman wrote:

The subject line says it all.

I just completed a build, test, and install of PostgreSQL 8.1.4 on an 
Intel Mac with OS 10.4.8.  Every single step executed flawlessly, and 
all 98 tests passed, running the regression tests.


Just for info - there is an automated build farm that tests building 
each version on different systems. It is at http://pgbuildfarm.org
You can view the results page and see what systems have compiled each 
version of postgres and what results they got.

A machine with OSX 10.4.7 on Intel is in the farm.
If you are interested in joining the build farm and have your machine 
regularly building and sending feedback then you can find details at the 
site.


I've got a suggestion for the documentation (INSTALL file).  I may be 
the only person in the world who does everything backwards, but my first 
attempt at regression tests failed because I had already started the 
postmaster.  I know that's pretty stupid, but I had to scratch my head 
for a few minutes when it said initdb failed, and left me no clue as to 
why.  You might add a note in with the instructions for regression 
testing to shutdown the server before running the tests.  It does say, 
after all, that you can run them any time!


I have seen this gotcha several times but haven't pinpointed why. (I may 
sit down and nut it out one day)


With no postgres running there is no problem, but with a version of 
postgres running I have found the results vary.

One day I can make check and have a problem; the next it will run fine.
(Today it is working; the day before yesterday make check on 8.1.5 didn't.)
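
If you do hit it, the conservative workaround matches the doc note
suggested above - shut the running server down before the tests, along
the lines of:

pg_ctl -D /usr/local/pgsql/data stop
make check

(adjusting -D to wherever your data directory actually lives).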


p.s. I noticed I am a build behind, but I would think that 8.1.5 would give 
similar results.


As 8.1.5 was only released a couple of days ago I wouldn't say you are 
behind yet.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Mirror problems for download

2006-10-18 Thread Shane Ambler

Michael Paesold wrote:
Not being subscribed to any more appropriate list, I post this here on 
hackers.


I just wanted to download the postgresql-8.0.9 tarball. The page I got 
was this:


Choose a download mirror
Downloading: /source/v8.0.9/postgresql-8.0.9.tar.gz
We could not query the database or no mirrors could be found!
Download PostgreSQL from the primary site
Read this if you would like to host a mirror.

Of course the primary FTP site is already unavailable (530 - maximum 
number of users reached).


I get the same error for older releases, too. Can someone look into this?



You're seeing a general overload problem - with new versions just released, 
everyone is jumping on and downloading at the same time.


This indicates that the popularity of PostgreSQL is at a level where the 
current resources are reaching their limits and more mirror sites may be 
needed.


I haven't noticed any torrent files for PostgreSQL releases - maybe we 
can look into setting this up to share the load, particularly around 
release times.



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Shane Ambler

Stephen Frost wrote:

* Mark Woodward ([EMAIL PROTECTED]) wrote:

If I am asking for a specific column value, should I, technically
speaking, need to group by that column?


Technically speaking, if you're asking for a specific tuple, should you
be allowed to request an aggregation?

Only with the assumption that the value in the where clause is for a 
unique column.


If you want min(col2) and avg(col2) where col1=x you can get it without 
a group by, the same as if you put col1 > x - that is, if you want an 
aggregate of all records returned, not an aggregate per value of col1.




select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;


But back to the query: the issue is that the ycis_id value is included 
with the return values requested (a single row value alongside aggregate 
values, without being grouped) - if ycis_id is not unique you will get x 
number of returned tuples with ycis_id=15 and the same min() and avg() 
values repeated in each row.
Removing the ycis_id after the select will return the aggregate values 
you want without the group by.
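
A sketch of both working forms against the table from the original query:

-- aggregate over everything the where clause matched, no group by needed
SELECT min(tindex), avg(tindex) FROM y WHERE ycis_id = 15;

-- or keep ycis_id in the output by grouping on it
SELECT ycis_id, min(tindex), avg(tindex) FROM y WHERE ycis_id = 15
GROUP BY ycis_id;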



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Shane Ambler

Mark Woodward wrote:

Stephen Frost wrote:


select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

But back to the query the issue comes in that the ycis_id value is
included with the return values requested (a single row value with
aggregate values that isn't grouped) - if ycis_id is not unique you will
get x number of returned tuples with ycis_id=15 and the same min() and
avg() values for each row.
Removing the ycis_id after the select will return the aggregate values
you want without the group by.


I still assert that there will always only be one row to this query. This
is an aggregate query, so all the rows with ycis_id = 15 will be
aggregated. Since ycis_id is the identifying part of the query, it should
not need to be grouped.


SELECT ycis_id FROM table WHERE ycis_id=15; returns a single tuple when 
ycis_id is unique, otherwise multiple tuples - which means that SELECT 
ycis_id is technically defined as returning a multiple-row result even if 
ycis_id happens to be unique; the data in each tuple returned comes 
directly from one table row.


SELECT max(col2) FROM table WHERE ycis_id=15; returns an aggregate tuple

SELECT ycis_id FROM table WHERE ycis_id=15 GROUP BY ycis_id; returns an 
aggregate tuple (aggregated with the GROUP BY clause making the ycis_id 
after the SELECT an aggregate as well)


You can't have both a single tuple and an aggregate tuple returned in 
the one statement. If you want the column value of ycis_id in the 
results you need the group by to unify all returned results as being 
aggregates.
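
One more way to keep the column value in the result without the group by - 
since the where clause already pins it to a single value, restate it as a 
constant (using the table y from the original query):

SELECT 15 AS ycis_id, min(tindex), avg(tindex) FROM y WHERE ycis_id = 15;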



--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Syntax bug? Group by?

2006-10-17 Thread Shane Ambler

Joe Sunday wrote:

On Tue, Oct 17, 2006 at 02:41:25PM -0400, Mark Woodward wrote:


The output column ycis_id is unambiguously a single value with regards to
the query. Shouldn't PostgreSQL know this? AFAIR, I think I've used this
exact type of query before either on PostgreSQL or another system, maybe
Oracle, and it did work.


Doesn't work in Oracle 10g:

SELECT ycis_id, tindex from x where ycis_id = 15;
YCIS_ID  TINDEX
===  ==
 15  10
 15  20

SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15;
ORA-00937: not a single-group group function

SELECT ycis_id, min(tindex), avg(tindex) from x where ycis_id = 15 GROUP BY 
ycis_id;
YCIS_ID  MIN(TINDEX)  AVG(TINDEX)
===  ===  ===
 15   10   15

--Joe



MySQL reports -
Mixing of GROUP columns (MIN(),MAX(),COUNT()...) with no GROUP columns 
is illegal if there is no GROUP BY clause



I found one that actually returns the desired result - SQLite3.

sqlite> select * from test;
15|20
15|10
sqlite> select ycis_id,min(tindex),avg(tindex) from test where ycis_id=15;
15|10|15
sqlite>


--

Shane Ambler
[EMAIL PROTECTED]

Get Sheeky @ http://Sheeky.Biz

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Postgresql Caching

2006-10-16 Thread Shane Ambler

Harvell F wrote:

  Getting back to the original posting, as I remember it, the question 
was about seldom changed information.  In that case, and assuming a 
repetitive query as above, a simple query results cache that is keyed on 
the passed SQL statement string and that simply returns the previously 
cooked result set would be a really big performance win.


I believe the main point that Mark made was that the extra overhead is in 
the sql parsing and query planning - that is the part that postgres can't 
get around. Even if you set up simple tables for caching, every query 
still goes through the parser and planner and loses the benefit that 
memcached has. Or you fork those requests off before the planner and lose 
the benefits of postgres.
The main benefit of using memcached is to bypass the parsing and query 
planning.


You will find there is more to sql parsing than you might first think. It 
needs to find the components that make up the sql statement (tables, 
column names, functions) and check that they exist and can be used in the 
context of the given sql, and that the given data matches the context it 
is to be used in. It needs to check that the current user has enough 
privileges to perform the requested task. Then it locates the data, 
whether that be in the memory cache, on disk or in an integrated version 
of memcached - which also includes checks that another user hasn't locked 
the data to change it, and whether there exists more than one version of 
the data, committed and uncommitted - and then sends the results back to 
the client requesting it.


  Registering each cache entry by the tables included in the query and 
invalidating the cache during on a committed update or insert 
transaction to any of the tables would, transparently, solve the 
consistency problem.


That was part of my thinking when I made the suggestion of adding 
something like memcached into postgres.



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Postgresql Caching

2006-10-15 Thread Shane Ambler

Merlin Moncure wrote:

On 10/15/06, Anon Mous [EMAIL PROTECTED] wrote:

 Would it be possible to combine a special memcache implementation of
 memcache with a Postgresql interface wrapper?


have you seen
http://people.freebsd.org/~seanc/pgmemcache/

merlin


Now you got me thinkin ;-P

Just throwing some ideas around -

What if we could do something like

CREATE TABLESPACE myramcache LOCATION MEMORY(2GB);

CREATE TABLE mycache (
id integer, data varchar(50))
USING TABLESPACE myramcache;

INSERT INTO mycache SELECT id,data FROM myrealtable;


You could set up a table in memory to contain known popular data; you 
could also use this to create a temporary table in memory to speed up 
multiple intermediate calculations without touching disks.



Or maybe just a view for caching -

CREATE MEMORY VIEW mycacheview
USING MAX(2GB) FOR LIFE(10)
AS SELECT * FROM myrealtable;

which would cache SELECTed rows in ram for LIFE seconds before purging; 
inserts/updates to myrealtable would trigger (or could carry a trigger 
that performs) a purge of all or some of the view cache.


Or better yet maybe the memory tablespace idea could also be extended to 
allow CREATE VIEW ... USING TABLESPACE myramcache LIFE(10);


TABLESPACE LOCATION MEMORY would seem to give an opportunity for later 
expansion.


The memory tablespace idea could be expanded to work with something like 
memcached (pg_ramcache_slave ?) - allowing multiple machines to work as 
a ram cache for the server.


Something like -
CREATE MEMCACHE group1 SLAVE 192.168.0.5;
CREATE MEMCACHE group1 SLAVE 192.168.0.6 PORT 5436;
CREATE MEMCACHE group2 SLAVE 192.168.0.7;
CREATE TABLESPACE myramcache LOCATION MEMORY WITH group1 SLAVES;
CREATE TABLESPACE myramcache2 LOCATION MEMORY WITH group2 SLAVES;

Probably want to put in some limits such as only temporary tables and 
'caching' views are allowed in memory tablespace.


Apart from temp tables these could all be saved into system tables so 
they are re-created upon server startup.
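
A rough approximation of the memory tablespace idea is already possible
with a stock 8.x server by pointing a tablespace at a mounted RAM disk,
with the obvious caveat that its contents vanish on reboot. A sketch,
assuming a ramdisk mounted at /mnt/ramdisk and owned by the postgres user:

CREATE TABLESPACE myramcache LOCATION '/mnt/ramdisk/pgdata';
CREATE TABLE mycache (id integer, data varchar(50)) TABLESPACE myramcache;
INSERT INTO mycache SELECT id, data FROM myrealtable;

The difference being that none of the cache-lifetime management suggested
above comes for free.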


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Postgresql Caching

2006-10-15 Thread Shane Ambler

[EMAIL PROTECTED] wrote:

On Mon, Oct 16, 2006 at 03:08:39AM +0930, Shane Ambler wrote:


You could setup a table in memory to contain known popular data, you 
could also use this to create a temporary table in memory to speed up 
multiple intermediate calculations without touching disks.


I'm not sure what this would save. If the table is read-only, there
shouldn't be writes happening. If it's small, and frequently accessed,
it should fit in the buffer cache.


Just because it is frequently accessed doesn't mean that it is small - the 
main point is control over what is cached, and a starting point for the 
other options mentioned later.



None of this avoids the cost of query planning, or query execution.


No but you can avoid costly disk access and still have the postgres 
level of integrity and integration that memcached doesn't offer.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Postgresql Caching

2006-10-15 Thread Shane Ambler

[EMAIL PROTECTED] wrote:

As a thought experiment, I'm not seeing the benefit. I think if you
could prove a benefit, then any proof you provided could be used to
improve the already existing caching layers, and would apply equally
to read-only or read-write pages. For example, why not be able to
hint to PostgreSQL that a disk-based table should be considered a
priority to keep in RAM. That way, PostgreSQL would avoid pushing
pages from this table out.



If memcached (or pgmemcached implemented via triggers) can show a speed 
improvement using ram based caching (even with the network overhead) for 
specific data, then it stands to reason that such a ram based cache could 
be built into postgres with tighter integration that overcomes the 
issues pgmemcached has. So I threw some ideas out there to get others 
thinking along these lines, to see if we can come up with a way to 
improve on or integrate this principle.
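
As a rough illustration of the trigger-based approach mentioned above (a
sketch only - the function names are pgmemcache's as I recall them, so
check that project's docs for the exact signatures; myrealtable is assumed
to have an integer id key):

SELECT memcache_server_add('127.0.0.1:11211');

CREATE OR REPLACE FUNCTION purge_myrealtable_cache() RETURNS trigger AS $$
BEGIN
    -- invalidate the cached copy whenever the row changes
    PERFORM memcache_delete('myrealtable:' || OLD.id);
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER purge_cache AFTER UPDATE OR DELETE ON myrealtable
    FOR EACH ROW EXECUTE PROCEDURE purge_myrealtable_cache();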


My original thought was to integrate it at the sql level, letting the 
database structure define what we want cached in ram - similar to what 
happens now when using pgmemcached.
Expanding create table to specify that a table gets priority in the cache, 
or to allocate x amount of cache to table y, could be a better way than 
saying 'all of this table in ram'.


I think the main benefit of my first ideas would come from the later 
examples I gave where create memory tablespace with slaves would allow 
the use of extra machines, effectively increasing the ram available 
outside the current postgres setup.


Maybe implementing this idea as a way of extending the current postgres 
caching would be better than the memory tablespaces idea - that is, 
integrating a version of pgmemcached as an option within the current 
caching layers, configured at the config level instead of in the 
structure design. Defining tables that get priority or allocated space 
in the ram cache would still fit well with that.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Casting to money

2006-10-09 Thread Shane Ambler

Tom Lane wrote:

Dave Page dpage@vale-housing.co.uk writes:

select '$123.45'::money
ERROR: invalid input syntax for type money: $123.45
select '£123.00'::money
ERROR: invalid input syntax for type money: £123.00


So ... what locale are you trying this in?

I get the following from 8.2beta1 - looks like it doesn't like the 
double quotes (which SQL treats as identifiers, not string literals).


postgres=# select '123.45'::money;
  money
---------
 $123.45
(1 row)

postgres=# select '$123.45'::money;
  money
---------
 $123.45
(1 row)
postgres=# select "123.45"::money;
ERROR:  column "123.45" does not exist
LINE 1: select "123.45"::money;
               ^
postgres=# select "$123.45"::money;
ERROR:  column "$123.45" does not exist
LINE 1: select "$123.45"::money;
               ^

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match