Re: [HACKERS] Jsonb transform for pl/python
On Thu, 09 Nov 2017 12:26:46 + Aleksander Alekseev <a.aleks...@postgrespro.ru> wrote: > The following review has been posted through the commitfest > application: make installcheck-world: tested, failed > Implements feature: tested, passed > Spec compliant: tested, passed > Documentation:tested, passed > > Hello Anthony, > > Great job! > > I decided to take a closer look on your patch. Here are some defects > I discovered. > > > + Additional extensions are available that implement transforms > > for > > + the jsonb type for the language PL/Python. The > > + extensions for PL/Perl are called > > 1. The part regarding PL/Perl is obviously from another patch. > > 2. jsonb_plpython2u and jsonb_plpythonu are marked as relocatable, > while jsonb_plpython3u is not. Is it a mistake? Anyway if an > extension is relocatable there should be a test that checks this. > > 3. Not all json types are test-covered. Tests for 'true' :: jsonb, > '3.14' :: jsonb and 'null' :: jsonb are missing. > > 4. jsonb_plpython.c:133 - "Iterate trhrough Jsonb object." Typo, it > should be "through" or probably even "over". > > 5. It looks like you've implemented transform in two directions > Python <-> JSONB, however I see tests only for Python <- JSONB case. > > 6. Tests passed on Python 2.7.14 but failed on 3.6.2: > > > CREATE EXTENSION jsonb_plpython3u CASCADE; > > + ERROR: could not access file "$libdir/jsonb_plpython3": No such > > file or directory > > module_pathname in jsonb_plpython3u.control should be > $libdir/jsonb_plpython3u, not $libdir/jsonb_plpython3. > > Tested on Arch Linux x64, GCC 7.2.0. > > The new status of this patch is: Waiting on Author > Hello, Aleksander. Thank you for your time. The defects you have noticed were fixed. Please, find in attachments new version of the patch (it is called 0001-jsonb_plpython-extension-v2.patch). Most of changes were made to fix defects(list of the defects may be found in citation in the beginning of this message), but the algorithm of iterating through incoming jsonb was changed so that it looks tidier. -- Anthony Bykov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/contrib/Makefile b/contrib/Makefile index 8046ca4..d9d9817 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -81,9 +81,9 @@ ALWAYS_SUBDIRS += hstore_plperl endif ifeq ($(with_python),yes) -SUBDIRS += hstore_plpython ltree_plpython +SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython else -ALWAYS_SUBDIRS += hstore_plpython ltree_plpython +ALWAYS_SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython endif # Missing: diff --git a/contrib/jsonb_plpython/Makefile b/contrib/jsonb_plpython/Makefile new file mode 100644 index 000..6371d11 --- /dev/null +++ b/contrib/jsonb_plpython/Makefile @@ -0,0 +1,39 @@ +# contrib/jsonb_plpython/Makefile + +MODULE_big = jsonb_plpython$(python_majorversion)u +OBJS = jsonb_plpython.o $(WIN32RES) +PGFILEDESC = "jsonb_plpython - transform between jsonb and plpythonu" + +PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpython $(python_includespec) -DPLPYTHON_LIBNAME='"plpython$(python_majorversion)"' + +EXTENSION = jsonb_plpythonu jsonb_plpython2u jsonb_plpython3u +DATA = jsonb_plpythonu--1.0.sql jsonb_plpython2u--1.0.sql jsonb_plpython3u--1.0.sql + +REGRESS = jsonb_plpython$(python_majorversion) +REGRESS_PLPYTHON3_MANGLE := $(REGRESS) + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/jsonb_plpython +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif + +# We must link libpython explicitly +ifeq ($(PORTNAME), win32) +# ... see silliness in plpython Makefile ... +SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a)) +else +rpathdir = $(python_libdir) +SHLIB_LINK += $(python_libspec) $(python_additional_libs) +endif + +ifeq ($(python_majorversion),2) +REGRESS_OPTS += --load-extension=plpython2u +else +REGRESS_OPTS += --load-extension=plpython3u +endif diff --git a/contrib/jsonb_plpython/expected/jsonb_plpython2.out b/contrib/jsonb_plpython/expected/jsonb_plpython2.out new file mode 100644 index 000..8ad5338 --- /dev/null +++ b/contrib/jsonb_plpython/expected/jsonb_plpython2.out @@ -0,0 +1,478 @@ +CREATE EXTENSION jsonb_plpython2u CASCADE; +-- test jsonb -> python dict +CREATE FUNCTION test1(val jsonb) RETURNS int +LANGUAGE plpython2u +TRANSFORM FOR TYPE jsonb +AS $$ +assert isinstance(val, dict) +plpy.info(sorted(val.items())) +return len(val) +$$; +SELECT test1('{"a":1, "c":"NULL"}'::jsonb); +INFO: [('a', Decimal('1')), ('c', 'NULL')] + test1 +--
Re: [HACKERS] Jsonb transform for pl/python
On Sun, 29 Oct 2017 19:11:02 +0100 David Fetter <da...@fetter.org> wrote: > Thanks for your hard work! > > Should there also be one for PL/Python3U? > > Best, > David. Hi. Actually, there is one for PL/Python3U. This patch contains following extensions: jsonb_plpythonu jsonb_plpython2u jsonb_plpython3u "make install" checks which python major version was your postgresql configured with and installs corresponding extension. -- Anthony Bykov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Jsonb transform for pl/python
Hi. I've implemented jsonb transform (https://www.postgresql.org/docs/9.5/static/sql-createtransform.html) for pl/python. 1. '{"1":1}'::jsonb is transformed into dict {"1"=>1}, while '["1",2]'::jsonb is transformed into list(not tuple!) ["1", 2] 2. If there is a numeric value appear in jsonb, it will be transformed to decimal through string (Numeric->String->Decimal). Not the best solution, but as far as I understand this is usual practise in postgresql to serialize Numerics and de-serialize them. 3. Decimal is transformed into jsonb through string (Decimal->String->Numeric). An example may also be helpful to understand extension. So, as an example, function "test" transforms incoming jsonb into python, transforms it back into jsonb and returns it. create extension jsonb_plpython2u cascade; create or replace function test(val jsonb) returns jsonb transform for type jsonb language plpython2u as $$ return (val); $$; select test('{"1":1,"example": null}'::jsonb); -- Anthony Bykov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/contrib/Makefile b/contrib/Makefile index e84eb67..d6b7170 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -82,9 +82,9 @@ ALWAYS_SUBDIRS += hstore_plperl endif ifeq ($(with_python),yes) -SUBDIRS += hstore_plpython ltree_plpython +SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython else -ALWAYS_SUBDIRS += hstore_plpython ltree_plpython +ALWAYS_SUBDIRS += hstore_plpython ltree_plpython jsonb_plpython endif # Missing: diff --git a/contrib/jsonb_plpython/Makefile b/contrib/jsonb_plpython/Makefile new file mode 100644 index 000..1e34d86 --- /dev/null +++ b/contrib/jsonb_plpython/Makefile @@ -0,0 +1,39 @@ +# contrib/jsonb_plpython/Makefile + +MODULE_big = jsonb_plpython$(python_majorversion)u +OBJS = jsonb_plpython.o $(WIN32RES) +PGFILEDESC = "jsonb_plpython - transform between jsonb and plpythonu" + +PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plpython $(python_includespec) -DPLPYTHON_LIBNAME='"plpython$(python_majorversion)"' + +EXTENSION = jsonb_plpython$(python_majorversion)u +DATA = jsonb_plpython$(python_majorversion)u--1.0.sql + +REGRESS = jsonb_plpython$(python_majorversion) +REGRESS_PLPYTHON3_MANGLE := $(REGRESS) + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/jsonb_plpython +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif + +# We must link libpython explicitly +ifeq ($(PORTNAME), win32) +# ... see silliness in plpython Makefile ... +SHLIB_LINK += $(sort $(wildcard ../../src/pl/plpython/libpython*.a)) +else +rpathdir = $(python_libdir) +SHLIB_LINK += $(python_libspec) $(python_additional_libs) +endif + +ifeq ($(python_majorversion),2) +REGRESS_OPTS += --load-extension=plpython2u +else +REGRESS_OPTS += --load-extension=plpython3u +endif diff --git a/contrib/jsonb_plpython/expected/jsonb_plpython.out b/contrib/jsonb_plpython/expected/jsonb_plpython.out new file mode 100644 index 000..be104af --- /dev/null +++ b/contrib/jsonb_plpython/expected/jsonb_plpython.out @@ -0,0 +1,118 @@ +CREATE EXTENSION jsonb_plpythonu CASCADE; +-- test jsonb -> python dict +CREATE FUNCTION test1(val jsonb) RETURNS int +LANGUAGE plpythonu +TRANSFORM FOR TYPE jsonb +AS $$ +assert isinstance(val, dict) +plpy.info(sorted(val.items())) +return len(val) +$$; +SELECT test1('{"a":1, "c":"NULL"}'::jsonb); +INFO: [('a', Decimal('1')), ('c', 'NULL')] + test1 +--- + 2 +(1 row) + +-- test jsonb -> python dict +-- complex dict with dicts as value +CREATE FUNCTION test1complex(val jsonb) RETURNS int +LANGUAGE plpythonu +TRANSFORM FOR TYPE jsonb +AS $$ +assert isinstance(val, dict) +assert(val == {"d":{"d": 1}}) +return len(val) +$$; +SELECT test1complex('{"d":{"d": 1}}'::jsonb); + test1complex +-- +1 +(1 row) + +-- test jsonb[] -> python dict +-- dict with array as value +CREATE FUNCTION test1arr(val jsonb) RETURNS int +LANGUAGE plpythonu +TRANSFORM FOR TYPE jsonb +AS $$ +assert isinstance(val, dict) +assert(val == {"d": [12,1]}) +return len(val) +$$; +SELECT test1arr('{"d":[12,1]}'::jsonb); + test1arr +-- +1 +(1 row) + +-- test jsonb[] -> python list +-- simple list +CREATE FUNCTION test2arr(val jsonb) RETURNS int +LANGUAGE plpythonu +TRANSFORM FOR TYPE jsonb +AS $$ +assert isinstance(val, list) +assert(val == [12,1]) +return len(val) +$$; +SELECT test2arr('[12,1]'::jsonb); + test2arr +-- +2 +(1 row) + +-- test jsonb[] -> python list +-- array of dicts +CREATE FUNCTION test3arr(val jsonb) RETURNS int +LANGUAGE plpythonu +TRANSFORM FOR TYPE jsonb +AS $$ +assert isinstance(val, list) +assert(val == [{"a":1,"b&qu
Re: [HACKERS] Transform for pl/perl
There are some moments I should mention: 1. {"1":1}::jsonb is transformed into HV {"1"=>"1"}, while ["1","2"]::jsonb is transformed into AV ["1", "2"] 2. If there is a numeric value appear in jsonb, it will be transformed to SVnv through string (Numeric->String->SV->SVnv). Not the best solution, but as far as I understand this is usual practise in postgresql to serialize Numerics and de-serialize them. 3. SVnv is transformed into jsonb through string (SVnv->String->Numeric). An example may also be helpful to understand extension. So, as an example, function "test" transforms incoming jsonb into perl, transforms it back into jsonb and returns it. create extension jsonb_plperl cascade; create or replace function test(val jsonb) returns jsonb transform for type jsonb language plperl as $$ return $_[0]; $$; select test('{"1":1,"example": null}'::jsonb); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Transform for pl/perl
Hello. Please, check out jsonb transform (https://www.postgresql.org/docs/9.5/static/sql-createtransform.html) for pl/perl language I've implemented.diff --git a/contrib/Makefile b/contrib/Makefile index 8046ca4..53d44fe 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -75,9 +75,9 @@ ALWAYS_SUBDIRS += sepgsql endif ifeq ($(with_perl),yes) -SUBDIRS += hstore_plperl +SUBDIRS += hstore_plperl jsonb_plperl else -ALWAYS_SUBDIRS += hstore_plperl +ALWAYS_SUBDIRS += hstore_plperl jsonb_plperl endif ifeq ($(with_python),yes) diff --git a/contrib/jsonb_plperl/Makefile b/contrib/jsonb_plperl/Makefile new file mode 100644 index 000..8c427c5 --- /dev/null +++ b/contrib/jsonb_plperl/Makefile @@ -0,0 +1,40 @@ +# contrib/jsonb_plperl/Makefile + +MODULE_big = jsonb_plperl +OBJS = jsonb_plperl.o $(WIN32RES) +PGFILEDESC = "jsonb_plperl - jsonb transform for plperl" + +PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plperl + +EXTENSION = jsonb_plperlu jsonb_plperl +DATA = jsonb_plperlu--1.0.sql jsonb_plperl--1.0.sql + +REGRESS = jsonb_plperl jsonb_plperlu + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/jsonb_plperl +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif + +# We must link libperl explicitly +ifeq ($(PORTNAME), win32) +# these settings are the same as for plperl +override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment +# ... see silliness in plperl Makefile ... +SHLIB_LINK += $(sort $(wildcard ../../src/pl/plperl/libperl*.a)) +else +rpathdir = $(perl_archlibexp)/CORE +SHLIB_LINK += $(perl_embed_ldflags) +endif + +# As with plperl we need to make sure that the CORE directory is included +# last, probably because it sometimes contains some header files with names +# that clash with some of ours, or with some that we include, notably on +# Windows. +override CPPFLAGS := $(CPPFLAGS) $(perl_embed_ccflags) -I$(perl_archlibexp)/CORE diff --git a/contrib/jsonb_plperl/expected/jsonb_plperl.out b/contrib/jsonb_plperl/expected/jsonb_plperl.out new file mode 100644 index 000..7a85361 --- /dev/null +++ b/contrib/jsonb_plperl/expected/jsonb_plperl.out @@ -0,0 +1,76 @@ +CREATE EXTENSION jsonb_plperl CASCADE; +NOTICE: installing required extension "plperl" +-- test hash -> jsonb +CREATE FUNCTION testHVToJsonb() RETURNS jsonb +LANGUAGE plperl +TRANSFORM FOR TYPE jsonb +AS $$ +$val = {a => 1, b => 'boo', c => undef}; +return $val; +$$; +SELECT testHVToJsonb(); + testhvtojsonb +- + {"a": 1, "b": "boo", "c": null} +(1 row) + +-- test array -> jsonb +CREATE FUNCTION testAVToJsonb() RETURNS jsonb +LANGUAGE plperl +TRANSFORM FOR TYPE jsonb +AS $$ +$val = [{a => 1, b => 'boo', c => undef}, {d => 2}]; +return $val; +$$; +SELECT testAVToJsonb(); +testavtojsonb +- + [{"a": 1, "b": "boo", "c": null}, {"d": 2}] +(1 row) + +-- test scalar -> jsonb +CREATE FUNCTION testSVToJsonb() RETURNS jsonb +LANGUAGE plperl +TRANSFORM FOR TYPE jsonb +AS $$ +$val = 1; +return $val; +$$; +SELECT testAVToJsonb(); +testavtojsonb +- + [{"a": 1, "b": "boo", "c": null}, {"d": 2}] +(1 row) + +-- test jsonb -> scalar -> jsonb +CREATE FUNCTION testSVToJsonb2(val jsonb) RETURNS jsonb +LANGUAGE plperl +TRANSFORM FOR TYPE jsonb +AS $$ +return $_[0]; +$$; +SELECT testSVToJsonb2('1'); + testsvtojsonb2 + + 1 +(1 row) + +SELECT testSVToJsonb2('[1,2,3]'); + testsvtojsonb2 + + [1, 2, 3] +(1 row) + +SELECT testSVToJsonb2('{"1":{"2":[3,4,5]},"2":3}'); + testsvtojsonb2 +- + {"1": {"2": [3, 4, 5]}, "2": 3} +(1 row) + +DROP EXTENSION plperl CASCADE; +NOTICE: drop cascades to 5 other objects +DETAIL: drop cascades to extension jsonb_plperl +drop cascades to function testhvtojsonb() +drop cascades to function testavtojsonb() +drop cascades to function testsvtojsonb() +drop cascades to function testsvtojsonb2(jsonb) diff --git a/contrib/jsonb_plperl/expected/jsonb_plperlu.out b/contrib/jsonb_plperl/expected/jsonb_plperlu.out new file mode 100644 index 000..6d4be1c --- /dev/null +++ b/contrib/jsonb_plperl/expected/jsonb_plperlu.out @@ -0,0 +1,33 @@ +CREATE EXTENSION jsonb_plperlu CASCADE; +NOTICE: installing required extension "plperlu" +-- test jsonb -> hash +CREATE FUNCTION testJsonbToHV(val jsonb) RETURNS jsonb +LANGUAGE plperlu +TRANSFORM FOR TYPE jsonb +AS $$ +return $_[0]; +$$; +SELECT testJsonbToHV('{"aa":"bb", "cc":null, "dd":2}'::jsonb); + testjsonbtohv +--- + {"aa": "bb", "cc": null, "dd": 2} +(1 row) + +-- test jsonb -> av +CREATE FUNCTION testJsonbToAV(val jsonb) RETURNS jsonb +LANGUAGE plperlu +TRANSFORM FOR TYPE jsonb +AS $$ +return $_[0]; +$$; +SELECT testJsonbToAV('["bb",
[HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:tested, passed Hello, I've tested it (make check-world) and as far as I understand, it works fine. The new status of this patch is: Ready for Committer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation:tested, failed Hello, As far as I understand, this patch adds functionality (correct me if I'm wrong) for users. Shouldn't there be any changes in doc/src/sgml/ with the description of new functionality? Regards Anthony The new status of this patch is: Waiting on Author -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: issue: record or row variable cannot be part of multiple-item INTO list
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation:not tested I'm afraid this patch conflicts with current master branch. The new status of this patch is: Waiting on Author -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] N-grams
Hello, Today I was reading a blog post from a fellow coworker http://www.depesz.com/index.php/2010/12/11/waiting-for-9-1-knngist/ and started to mess around with the trigram contrib package for postgres and playing with some different word dictionaries for English and German. I was wanting to see how performant particular queries could be if SIGLENINT in trgm.h was adjusted to be the avg character length for a particular word dictionary http://packages.ubuntu.com/dapper/wamerican compling=# SELECT AVG(LENGTH(CAST(word AS bytea), 'UTF8')) FROM english_words; avg 8.4498980409662267 vs http://packages.ubuntu.com/dapper/wngerman compling=# SELECT AVG(LENGTH(CAST(word AS bytea), 'UTF8')) FROM words; //german avg - 11.9518056504365566 (unsurprisingly German words are on average longer than English ones) Effectly wanting to make the trigram package act more along the lines of n-gram where I am explicitly setting the N when it is built. I, am however, not very proficient in C and doubt that is the only change necessary needed to convert the trigram contrib to an n-gram as after changing SIGLENINT to 12 in trgm.h I still get trigram results for show_trgrm() . I was hoping someone familiar with it could provide a little help for me by perhaps giving me a path of action needed to change the trigram implementation to behave as an n-gram. Thanks for your time and I appreciate any advice anyone can give me. Anthony Gentile
Re: [HACKERS] Hierarchical Queries--Stalled No Longer...
So is postgresql going into the direction of WITH or CONNECT BY (or both)? I am authoring O'Reilly's SQL Cookbook and I'd like to mention it in the Hierarchical chapter to give the pg readers a heads up. Thanks and regards, Anthony Molinaro -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus Sent: Thursday, September 22, 2005 2:02 PM To: pgsql-hackers@postgresql.org; Jonah H. Harris Subject: Re: [HACKERS] Hierarchical Queries--Stalled No Longer... Jonah, Tom, Bruce, Simon, Alvaro, Josh, et al., have you guys used the patch? If so, what comments do you have as to what you'd like to see. Unfortunately, yes I have. I had to help several users who got burned by it: the patch was buggy as all-get out. For example, it wouldn't access TOAST tables, so any TEXT fields larger than 500chars got cut off; and wouldn't support user-added data types or domains. (This was a year ago, so maybe Evgen fixed these things) So it's really nice of Evgen to re-license, but the license was *not* the primary thing blocking acceptance of the patch. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Hierarchical Queries--Stalled No Longer...
Josh, Thanks man, good to know. Sorry if the question was a bit out of place on this list but I wanted to make sure I reached the right people. I love what you guys are doing and made sure postgresql was included in my book. also, while I got your ear. I bugged Simon about this earlier this year and was wondering if you guys are still planning on added the window functions added to the '03 standard? I have a ton of recipes that use them and if you guys are still planning on implementing them, I'd like to mention that as well. Thanks, Anthony -Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Thursday, September 22, 2005 2:43 PM To: Anthony Molinaro Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Hierarchical Queries--Stalled No Longer... Anthony, So is postgresql going into the direction of WITH or CONNECT BY (or both)? PostgreSQL would do ONLY with. We're not interested in Oracle-proprietary syntax. That being said, there is a CONNECT_BY() function in /contrib/tablefunc. But this would never be part of the core syntax. I am authoring O'Reilly's SQL Cookbook and I'd like to mention it in the Hierarchical chapter to give the pg readers a heads up. Keen. Please feel free to ask questions so that the PG section can be as accurate as possible. #postgresql on irc.freenode.net is quite active, and you can get any number of Postgres - SQL questions answered there. --Josh -- __Aglio Database Solutions___ Josh BerkusConsultant josh@agliodbs.comwww.agliodbs.com Ph: 415-752-2500Fax: 415-752-2387 2166 Hayes Suite 200San Francisco, CA ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Hierarchical Queries--Stalled No Longer...
Josh, Great, thanks for the update. It's a major feature implementation if we want them to be at all worthwhile agreed. SS 2005 added partial support for window functions (can't create moving windows of aggregation, ie, the portion of the syntax the standard calls the framing clause) and I didn't like that. Imho, that wasn't cool at all. Either support these functions all the way or don't; halfway is silly. Btw, some of my postgres reviewers, when they came across recipes that used the new GENERATE_SERIES function, were quite happy (in particular for pivoting so you don't need to have extra tables lying around). It's a cool addition and I've gotten positive feedback from it. So, whoever dreamt it up, nice job. :) Thanks and regards, Anthony -Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Thursday, September 22, 2005 3:05 PM To: pgsql-hackers@postgresql.org Cc: Anthony Molinaro Subject: Re: [HACKERS] Hierarchical Queries--Stalled No Longer... Anthony, also, while I got your ear. I bugged Simon about this earlier this year and was wondering if you guys are still planning on added the window functions added to the '03 standard? I have a ton of recipes that use them and if you guys are still planning on implementing them, I'd like to mention that as well. Planning, yes. Have started, no. It's a major feature implementation if we want them to be at all worthwhile; I'd like users to be able to create custom windowing aggregates, for that matter. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] No Timeout in SELECT..FOR UPDATE
Hi Guys, I have a suggestion for fixing a long-term and painful problem in PostgreSQL that is holding up many very important commercial projects, including ours! This problem has been reported numerous times: When one process has a row lock on one or more rows in a table, using SELECT...FOR UPDATE in default lock mode, another process has NO WAY of aborting from the same request, and reporting to the user that this record is already locked, reserved, or whatever you want to call it. In other words, by the time the second process has run the SELECT...FOR UPDATE statement, it's too late!! This second process is now locked forever, waiting for the first process to commit, rollback, or some other function that will release the lock. Yes, the first process will eventually release the lock by commiting or rollback, bu this is for a commercial environment with users, not processes, and the user needs to be informed about the error immediately, or within a second or so, and be given the chance to retry the update with lock, or just abort and go find another record to change. This problem is *fundamental*, and *very typical* in a commercial, accounting, or mission-critical environment. The only solution to this problem in PostgreSQL seems to be to: (1) Re-write the SELECT...FOR UPDATE SQL code, to return with an exception or error if it cannot immediately secure the lock, OR: (2) Add a TIMEOUT N clause to this statement, so that the timeout can be controlled on a per-statement basis - this is probably better. For example: [1] if I want the second process to give up within 1 millisecond, and return if the lock is not possible, then write: SELECT ... FOR UPDATE TIMEOUT 1 [0] If I want the default behaviour, (for a process that is prepared to wait forever for the record(s)), then: SELECT... FOR UPDATE TIMEOUT 0 OR, simply: SELECT... FOR UPDATE (as it is now) I hope that this suggestion will be taken seriously, since it is clear that a large number of developpers have made comments on this problem, dated all the way back to 2001 or earlier. Many thanks, Tony Rich, Richcorp Technology, Sydney, Australia. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Lauri Pietarinen [EMAIL PROTECTED] writes Anthony W. Youngman wrote: In article [EMAIL PROTECTED], Lauri Pietarinen lauri.pie [EMAIL PROTECTED] writes Anthony W. Youngman wrote: In article [EMAIL PROTECTED], Lauri Pietarinen [EMAIL PROTECTED] writes Well, if it is normalised, how easy is it for you to change the customer_id of an order? Anyway, Incredibly easy. Just update the customer_id field of the invoice record. A single change to a single row And I presume the system will automatically move all related stuff (order details etc.) into the same block as the new customer? How long will that take? What if there is no room for it there? Well, I'd view an order as an entity. As such, I would give it its own FILE, and your question doesn't make sense. But then your formula for disk head movements does not make sense either! Why not? The order is a real-world thing, and as such I would have an ORDERS file, in which each order is a single entry, with customer_id as one of its attributes. order detail is an attribute of order, so if I change customer_id it's the relational equivalent of just changing one cell in one row. The chances of me having to move the record is pretty near nil, and if I do it won't change bucket so at most it involves two frames (or disk blocks, if that's what you want to call them). But if the system did move the stuff, it would be four disk accesses - read/write to delete the old entry, read/write to save the new. As for enough room - well - it'll fall over if we have a disk full (or it might not). Not enough room here means not enought room in the block of the customer (from which you were supposed to get all data in one read, or disk head movement). That would mean that your order information would be moved perhaps to another block and result in an extra head movement, or am I right? Which I've taken in to account - if there isn't enough room in the original bucket, I need to either overflow into the next bucket which might exist, or to create it if it doesn't. Ie two head movements to delete from the first bucket, and two head movements to add to the second. And it will only fall over if I need to create a new bucket and there's no space left on the disk (or if (and this is very unlikely in this scenario) it triggers a split which again needs space and there's none left on disk). Or have you not sussed that we view order detail as an attribute of order (which is therefore stored as part of the same thing), but customer is separate from order, is stored separately, and is linked by a relationship. (Whereas order detail is NOT related to order, because they are part of the same thing :-) Well, it does result in data being stored multiple times ;-) What on earth is wrong with that? Do you know how much 160GB of disk cost's today? I could ask: does your system work in, say 4KB? That's how much memory the first computer I used (a Wang 2000) had. Probably it would not work at all. In the 50's they did amazing things with hardly any compilers and very little memory. I am referring to Whirlwind. See http://www.cedmagic.com/history/whirlwind-computer.html. Could you have done that with MV? My point? Why are we discussing restrictions to memory and CPU speed of the 70's and 80's? If an SQL DBMS uses more memory and disk, and it is available, why complain about *that*. Im not impying that you cannot complain about other matters, e.g. ease of development etc. and you might even be right. Be it as it is, I am not trying to make you abandon your MV database. As always, you're relying on hardware to help :-) You know what I think of that :-) And 160Gb of disk is only cheap if you're using IDE on a desktop PC - it costs a hell of a lot more for a laptop or SCSI for a server. And if it's embedded it maybe that the *room* is expensive, not the capacity ... And: what if I was just reading customer-data. Would the same formula apply (= (2+N)*ST*1.05)? Nope. If I understand you correctly, you want attributes that belong to the entity customer, not the entity invoice. T = ST * 1.05. (By the way, billing and/or invoice address (for example) are invoice attributes, not company attributes.) No, I want you to give me a list of all your customers. How many disk reads? T = N * 1.05 where N is the number of customers. What do you want to know about those customers? Address? Phone number*s*? Anything else? That's *all* at no extra cost. Well, no thanks. I just wanted their names this time. The relational alternative, with an index on customer_name, would be again an order of magnitune less disk reads. Well, if you let me use an index here, I'm sorry, GAME OVER! The best you can do would be a photo finish. Assuming an overhead of, say, 4 bytes per index entry, the entire index would be Size = 4 * N + sigma(name_length) + sigma(key_length) Okay, I've probably got some padding there as well, but so will you. And note I
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Anthony W. Youngman [EMAIL PROTECTED] writes Really, however you calculate it, it is an order of magnitude less than your alternative. And please don't tell me that using indexes is not fair or not in the spirit of the relational model ;-) Well, it does result in data being stored multiple times ;-) And while it maybe doesn't affect the result that much, you wanted the value? Where has that come from? What if the price changed half way through the period you're calculating? :-) You've failed to answer your own question, so maybe I could match you ... Whoops - sorry - I did notice after I wrote this that you included price in your index. But it does seem strange indexing on a composite field like that ... Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Marshall Spight [EMAIL PROTECTED] writes Bob Badour [EMAIL PROTECTED] wrote in message news:W46dnf4tbfF1DwiiU- [EMAIL PROTECTED] All physical structures will bias performance for some operations and against others. This strikes me as a succinct statement of the value of data independence. One has the option (but not the requirement) to adjust the physical structures the DBMS uses while keeping the logical model (and therefor all application code and queries, etc.) unchanged. Unless one has data independence, one does not have this option; one will be locked into a particular performance model. This is why I found the MV guy's obvious pleasure at being able to precisely describe the performance model for his DB as odd: I thought it a deficit to be able to say what it was; he thought it an asset. When you park your car, do you put the chassis on the drive, the engine in the garage, and the wheels in the front garden? You may find my approach of keeping data together strange, I just find it extremely weird that you think it is an IMPROVEMENT to disassemble what is in the real world a single thing. I'm sure you would not be happy if I tried to disassemble YOU and store your head in one place, your legs and arms in another, etc etc. Can I refer you to something called emergent complexity? A scientific theory of how the whole can be greater than the sum of its parts? Harking to something else, I can't remember who said the tuple is the fundamental unit of data. Apart from the fact that such a statement is not worth arguing with, I would compare that to the quark in physics. A strange beast that is known to exist, but can never be found in reality. And as a chemist, it is totally and utterly irrelevant to me. It pays to know it's there just in case in some strange circumstance it should be useful, but for the most part I can ignore it as just not part of my reality. Oh - and do you know why I was so pleased to describe the performance model for my db? For the same reason as I mentioned Huffman compression. It's impossible to prove that that Huffman is the most efficient algorithm, and indeed I pointed out that it isn't. It is, however, possible to prove that it is mathematically impossible for a more efficient algorithm to exist. I'm TOTALLY happy to be locked into a performance model, if I can PROVE that there are no other models that are more efficient. My ability with stats isn't good enough, but the figure bandied about is that there is room for about 5% improvement before we hit that mathematical limit. SQL has a HELL of a long way to go to catch up :-) Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Lauri Pietarinen lauri.pie [EMAIL PROTECTED] writes Anthony W. Youngman wrote: In article [EMAIL PROTECTED], Lauri Pietarinen [EMAIL PROTECTED] writes Anthony W. Youngman wrote: Fine. But MV *doesn't* *need* much of a cache. Let's assume both SQL and MV have the same amount of RAM to cache in - i.e. *not* *much*. I did say the spec said extract maximum performance from the hardware available. So what's wrong with gettng a machine with lots of memory? How much does 2G of memory for an Intel-box cost now a days? Is this some kind of new ultimate sport, trying to get along with as little memory as possible? I presume you didn't read the bit below ... what if you have SEVERAL tables, and EACH of them is a gigabyte or two in size? OK, I get your point. Using technology to get you out of a hole is fine. Assuming it will be there if you need it is not. And actually, this is one of the factors hammering the MV model :-( Technology is now powerful enough to solve a lot of problems simply by using brute force. Well, if it is normalised, how easy is it for you to change the customer_id of an order? Anyway, Incredibly easy. Just update the customer_id field of the invoice record. A single change to a single row And I presume the system will automatically move all related stuff (order details etc.) into the same block as the new customer? How long will that take? What if there is no room for it there? Well, I'd view an order as an entity. As such, I would give it its own FILE, and your question doesn't make sense. But if the system did move the stuff, it would be four disk accesses - read/write to delete the old entry, read/write to save the new. As for enough room - well - it'll fall over if we have a disk full (or it might not). if we stick to your example and even if we don't normalise using e.g. clustering features of Oracle, as Bob pointed out, we are getting at most the same number of I/O's. So, answer to your question: our formula is at least as good as yours. Except I think Bob said we could optimise to favour *certain* transactions. I think actually ANY transaction benefits. You're relying on stuff that's outwith your theory, we're relying on stuff that's inherent to our model. That certainly is not true. The theory says NOTHING about how data should be arranged on disk. You are talking about how modern SQL-databases behave. The DBMS is at liberty to do whatever it pleases with the data, even save it in a PICK database. Hey, wadda you think? Would that be a good idea? We get to keep our SQL but with the speed of PICK ;-) That would be nice ;-) But I think our two paragraphs don't connect. I was talking about MV ... We let the hardware help us out if it can. There's a big difference. If you can't get the hardware, you're stuffed. We don't need it, so while we may have a hard time of it it's nowhere near as bad for us. And again, relational separates the physical from the logical. You're being hypocritical if you call upon the physical representation to help out with the (speed of the) logical presentation. My goodness, no I'm not! Its the same as claiming that if you have a drawing for a house, you have to make that house out of paper?!? I want a list with all products with corresponding total sales, read from order detail e.g. Hammer 1$ Nail 5000$ Screw 1200$ How many disk reads (or head movements)? Actually, probably the same as you here. If we're indexed on order detail. If Hammer appears in N invoices, then T = (1+N) * ST * 1.05 for hammers, and the same for all the other products. Theory favours us, in that if a product appears X times in one invoice, that's one read for us and X for you, but hardware will probably help you more than us (that is, assuming thrashing cuts in) in that you stand a marginally higher chance of getting multiple instances of a product in any given read. So for each product you get T = (1+N) * ST * 1.05. Now, for our SQL-DBMS, presuming that we build indexes for detail and product: order_detail(product_id, qty, unit_price) = 20 bytes/row product(product_id, product_name) = 50 bytes/row With 2 disk reads I would get 8K/20 = 400 order detail rows and 8K/50 = 160 product rows Since all rows are in product_id order, no need for random disk reads so T = 1 + N/400 + P/160 (N=number of details, P=number of products) for ALL products and details. And, because of sequential prefetch, we probably would not have to wait for I/O's at all. Really, however you calculate it, it is an order of magnitude less than your alternative. And please don't tell me that using indexes is not fair or not in the spirit of the relational model ;-) Well, it does result in data being stored multiple times ;-) And while it maybe doesn't affect the result that much, you wanted the value? Where has that come from? What if the price changed half way
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Paul Vernon [EMAIL PROTECTED] writes No, I think Anthony is just saying that he doesn't believe in science/the scientific method. Or maybe he believes that engineering is not based on scientific knowledge! Actually, I *DO* believe in the Scientific Method. I just fail to see the connection between Scientific Method and Relational. The former is Science, the latter is Maths. Please tell me how I can use relational theory to predict the future. Without that, relational is unprovable, and hence unscientific. Note I didn't say relational is *incorrect* - the ideas of mathematically correct and scientifically provable are orthogonal, and have nothing to say about each other. Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Lauri Pietarinen [EMAIL PROTECTED] writes Anthony W. Youngman wrote: Well, as far as we MV'ers are concerned, performance IS a problem with the relational approach. The attitude (as far as I can tell) with relational is to hide the actual DB implementation from the programmers. So it is a design flaw that it is extremely easy for a programmer to do something stupid. And you need a DBA to try and protect the database from the programmers! As soon as a requirement for a database specifies extraction of the maximum power from the box, it OUGHT to rule out all the current relational databases. MV flattens it for it for performance. As an MV programmer, I *KNOW* that I can find any thing I'm looking for (or find out it doesn't exist) with just ONE disk seek. A relational programmer has to ask the db does this exist and hope the db is optimised to be able to return the result quickly. To quote the Pick FAQ SQL optimises the easy task of finding stuff in memory. Pick optimises the hard task of getting it into memory in the first place. So in your opinion, is the problem 1) SQL is so hard that the average programmer will not know how to use it efficiently Nope or 2) Relational (or SQL-) DBMS'es are just too slow Yes. If 2) then why don't we get a bit more concrete. Could you give an example of a query that in your experience would be too slow using a standard SQL database (e.g. Oracle, or MySQL). We could then actually try it out on some machine and compare. I suggest using the customer-order-order_detail-product database Okay. Give me a FORMULA that returns a time in seconds for your query. Let's assume I want to print a statement of how many invoices were sent to a customer, along with various details of those invoices. My invoice file is indexed by company/month, and we can reasonably assume that the time taken to produce the statement is infinitesimal compared to the time taken to retrieve the invoice data from disk. For MV T = (2 + N) * ST * 1.05 Where T is the time taken to produce the report, N is the number of invoices, and ST is the hard disk seek time. I've assumed I have to access the company details as well, hence the 2 (1 for company, 1 for the index). I've also assumed that the data isn't cached in RAM, which I think is reasonable if we assume the hardware is being stressed. If 1) I would like to hear some concrete examples. It's 2, so ... But as I understand relational theory, such a question is completely outside the scope of the theory. Seeing as it tries to divorce the database logic from the practical implementation ... And you know it's been proven that Huffman coding is the most efficient compression algorithm? (Actually, it isn't - it's been proven it can't be improved upon, which isn't the same thing...). Can you improve on the formula I've just given you? Given that if we could change the 1.05 to 1 then we can prove it can't be improved upon ... again - I've taken the liberty of assuming that a MV FILE is equivalent to an entity if we assume the relational designer has been thinking in an entity-attribute- relation sort of way. My maths isn't good enough to prove it, but I think it would be pretty easy to prove that accessing data as one and only one complete entity at a time is the most efficient way. best regards, Lauri Pietarinen Looking forward to you coming up with maths that can prove relational can even EQUAL MV :-) Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Dreaming About Redesigning SQL
, such a question is completely outside the scope of the theory. Seeing as it tries to divorce the database logic from the practical implementation ... The theory, indeed, does not say anything about buffer pools, but by decoupling logic from implementation we leave the implementor (DBMS) to do as it feels fit to do. As DBMS technology advances, we get faster systems without having to change our programs. But with MV, if our database is too large for current technology, we kick the shit out of relational for speed ... Don't forget. You've already said that, if nothing is cached, my average case exceeds your best. And my case is *already* assuming that the system is seriously stressed and struggling ... When we design databases we can decouple logical planning from performance considerations, which, you must agree, are two separate issues. And you know it's been proven that Huffman coding is the most efficient compression algorithm? (Actually, it isn't - it's been proven it can't be improved upon, which isn't the same thing...). Can you improve on the formula I've just given you? Given that if we could change the 1.05 to 1 then we can prove it can't be improved upon ... again - I've taken the liberty of assuming that a MV FILE is equivalent to an entity if we assume the relational designer has been thinking in an entity-attribute- relation sort of way. My maths isn't good enough to prove it, but I think it would be pretty easy to prove that accessing data as one and only one complete entity at a time is the most efficient way. I think that in a typical system your cache hit ratio would approach 90% so that could mean 0.1 disk seeks. That improves our performance just as much as improves yours. What happens to your response time if you just DON'T HAVE the cache available, for whatever reason? I can't find the post now :-( but is Christopher reading this? You know I compared that relational system on a twin Xeon 800, to an MV system running on a P90? Christopher made the (reasonable in the circumstances) assumption that the relational consultants must be crap, and the MV guy a guru. Actually, I'd come to exactly the OPPOSITE conclusion. My MV experience tells me that MV query was probably thrown together, by an average programmer, in 30 seconds. On the other hand, those SQL consultants had an axe to grind and a point to prove. They couldn't afford to let this old fashioned system beat them. That SQL query would have been optimised to within an inch of its life over weeks. Don't forget how proud they were to beat this MV system! Yet with hardware that was so much more powerful and a query that was heavily optimised, they had great difficulty beating a query that was thrown together in seconds by an average MV guy (or even just a luser!). Don't forget. I said I am a database *engineer*. Engineers believe in elegance, they believe in beauty. And when I look at relational, all I see is the theorists pleading power, hardware, brute force, to get them out of trouble. And then all these people, who believe in maths over reality, are surprised when I turn round and say I despise their beliefs. Note, I did NOT say I despise relational theory. I despise the belief that it is the answer to life, the database universe, and everything data related. (By the way, 6 times 9 DOES equal 42 :-) best regards, Lauri Pietarinen Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Lauri Pietarinen [EMAIL PROTECTED] writes Anthony W. Youngman wrote: Fine. But MV *doesn't* *need* much of a cache. Let's assume both SQL and MV have the same amount of RAM to cache in - i.e. *not* *much*. I did say the spec said extract maximum performance from the hardware available. So what's wrong with gettng a machine with lots of memory? How much does 2G of memory for an Intel-box cost now a days? Is this some kind of new ultimate sport, trying to get along with as little memory as possible? I presume you didn't read the bit below ... what if you have SEVERAL tables, and EACH of them is a gigabyte or two in size? If an engineer has a problem, throwing brute force at it is rarely the solution. Let's be topical (near enough) and look at the Titanic (seeing as there was this film recently). If they'd forseen the problem, they could have thrown brute force at it and doubled the thickness of the steel plate. Except she would have then sunk when they launched her, before she even had a chance to hit the iceberg. Or look at aviation - especially in the early years. They had gliders that could fly, and they had engines that could easily provide the power to get a glider airborne. The problem was, every time they increased the power of the engine they got *further* *away* from the possibility of powered flight, because the increased power came at the price of increased weight. You're welcome to live in your mathematical world where power can be gained for no cost, but that doesn't work in the real world. And the cost isn't necessarily dollars. Like in the aircraft example, the cost could be a case of sorry, technology ain't that advanced yet mate! You're assuming that you can throw hardware at the problem - fine, but that's not always possible. You might have already maxed out the ram, you might have a huge database, you might be sharing your db server with other programs (BIND really likes to chew up every available drop of ram, doesn't it :-). I'm not saying that you shouldn't throw hardware at it, but what if you can't? Except my example was an *average* case, and yours is a *best* case. Oh, and my data is still normalised - I haven't had to denormalise it! AND I haven't run an optimiser over it :-) Are you hiding your optimiser behind the curtain? ;-) Well, if you include getting optimisation for free because that's the way things work, maybe I am ;-) Well, if it is normalised, how easy is it for you to change the customer_id of an order? Anyway, Incredibly easy. Just update the customer_id field of the invoice record. A single change to a single row if we stick to your example and even if we don't normalise using e.g. clustering features of Oracle, as Bob pointed out, we are getting at most the same number of I/O's. So, answer to your question: our formula is at least as good as yours. Except I think Bob said we could optimise to favour *certain* transactions. I think actually ANY transaction benefits. You're relying on stuff that's outwith your theory, we're relying on stuff that's inherent to our model. Now, that was a *conservative* estimate, and we assumed that we did not have any rows lying around in the (global!) cache. As the size of the cache grows in proportion to the size of the total database we can assume less and less disk I/O. You're relying on the hardware to bale you out :-) We can do the same! Well why don't you? We let the hardware help us out if it can. There's a big difference. If you can't get the hardware, you're stuffed. We don't need it, so while we may have a hard time of it it's nowhere near as bad for us. And again, relational separates the physical from the logical. You're being hypocritical if you call upon the physical representation to help out with the (speed of the) logical presentation. Note also that the cache can be configured many ways, you can put different tables (or indexes) in different caches, and even change the size of the cache on the fly (you might want a bigger cache during evening and night when your batch programs are running) so you can rig your system to favour certain types of queries. I havn't even gone into the topic of using thick indexes so table access can be totally avoided (=we are reading into memory only interesting columns). Now, in your example, what if the product department comes along and wants to make a report with sales / product? What would be your formula in that case? I'm not quite sure what you're trying to do. I'll assume you want a report of all invoices which refer to a given product. Assuming I've got the relevant indices defined, I can simply read a list of invoices from the product code index, a second list of invoices from the month index, and do an intersect of the two lists. I want a list with all products with corresponding total sales, read from order detail e.g. Hammer 1$ Nail 5000$ Screw 1200$ How many disk reads
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Christopher Browne [EMAIL PROTECTED] writes How do you know it works? Without the theory and model, you really do not. And don't other databases have both theory and model? It's just that all the academics have been brainwashed into thinking this is true only for relational, so that's what they teach to everyone else, and the end result is that all research is ploughed into a model that may be (I didn't say is) bankrupt. Just like the academics were brainwashed into thinking that microkernels were the be-all and end-all - until Linus showed them by practical example that they were all idiots :-) In mathematics as well as in the analysis of computer algorithms, it is typical for someone who is trying to explain something new to try to do so in terms that allow the gentle reader to do as direct a comparison as possible between the things with which they are familiar (e.g. - in this case, relational database theory) and the things with which they are perhaps NOT familiar (e.g. - in this case, MV databases). Nobody seems to have been prepared to explain the MV model in adequate theoretical terms as to allow the gentle readers to compare the theory behind it with the other theories out there. I'm afraid that does not reflect very well on either those lauding MV or those trashing it. I think one MAJOR problem is that most (if not all) MV practitioners are not formally qualified in computing ... for example by education I'm a chemist. And I'm doing postgrad in medical science ... The trouble is - we MV'ers tend to take an engineering approach - we use it because we know it works. To quote you from another post ... When people _don't_ do that thinking differently, we are certain to see hideous performance, and that is neither a SQL issue nor a relational issue. The point is that if they are accessing a big pile of data, they have to think carefully [jumping to that different way of thinking] irrespective of what specific language(s), libraries, or other tools they are using. Well, as far as we MV'ers are concerned, performance IS a problem with the relational approach. The attitude (as far as I can tell) with relational is to hide the actual DB implementation from the programmers. So it is a design flaw that it is extremely easy for a programmer to do something stupid. And you need a DBA to try and protect the database from the programmers! As soon as a requirement for a database specifies extraction of the maximum power from the box, it OUGHT to rule out all the current relational databases. MV flattens it for it for performance. As an MV programmer, I *KNOW* that I can find any thing I'm looking for (or find out it doesn't exist) with just ONE disk seek. A relational programmer has to ask the db does this exist and hope the db is optimised to be able to return the result quickly. To quote the Pick FAQ SQL optimises the easy task of finding stuff in memory. Pick optimises the hard task of getting it into memory in the first place. Relational is all about theory and proving things mathematically correct. MV is all about engineering and getting the result. And if that means pinching all the best ideas we can find from relational, then we're engineers - of course we'll do it :-) Think different. Think Engineering, not Maths. And for $DEITY's sake stop going on about science. Unless you can use set theory to predict the future, relational has nothing to do with science ... Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Gene Wirchenko [EMAIL PROTECTED] writes [EMAIL PROTECTED] (Seun Osewa) wrote: [snip] Sometimes I wonder why its so important to model data in the rela- tional way, to think of data in form of sets of tuples rather than tables or lists or whatever. I mean, though its elegant and based on mathematical principles I would like to know why its the _right_ model to follow in designing a DBMS (or database). The way my mind sees it, should we not rather be interested in what works? How do you know it works? Without the theory and model, you really do not. And don't other databases have both theory and model? It's just that all the academics have been brainwashed into thinking this is true only for relational, so that's what they teach to everyone else, and the end result is that all research is ploughed into a model that may be (I didn't say is) bankrupt. Just like the academics were brainwashed into thinking that microkernels were the be-all and end-all - until Linus showed them by practical example that they were all idiots :-) Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Seun Osewa [EMAIL PROTECTED] writes Thanks for the links. Christopher Browne [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] 125932.news.uni-berlin.de... There are two notable 'projects' out there: 1. There's Darwen and Date's Tutorial D language, defined as part of their Third Manifesto about relational databases. 2. newSQL http://newsql.sourceforge.net/, where they are studying two syntaxes, one based on Java, and one based on a simplification (to my mind, oversimplification) of SQL. I was able to get a pdf coy of the Third Manifesto article here: http://citeseer.nj.nec.com/darwen95third.html but the details of tutorial D seem not to be a part of that article. NewSQL *might* be cool if someone found reason to use it in a DBMS. Is Darwen and Date's stuff that where they said SQL was crap. As I understand it, within about a year of designing SQL, at least one of Codd and Date said it was rubbish and tried to replace it with something better. Sometimes I wonder why its so important to model data in the rela- tional way, to think of data in form of sets of tuples rather than tables or lists or whatever. I mean, though its elegant and based on mathematical principles I would like to know why its the _right_ model to follow in designing a DBMS (or database). The way my mind sees it, should we not rather be interested in what works? I couldn't agree more (of course I would). As I like to put it, surely Occam's Razor says that stuffing the four-dimensional world into a flat- earth database can't be the optimal solution! The trouble with so many SQL advocates is that they are so convinced in the mathematical rightness of the relational model, that they forget it is a *model* and, as such, needs to be shown as relevant to the real world. That said, I always think relationally when designing databases - it helps. Look at the multi-value databases. Think relationally, you can still store your data in normal form, but you're not stuffed by all the irrelevant restrictions that relational databases tend to impose. Get a freebie copy of jBASE, UniVerse or UniData, and try them out :-) Cheers, Wol -- Anthony W. Youngman [EMAIL PROTECTED] 'Yings, yow graley yin! Suz ae rikt dheu,' said the blue man, taking the thimble. 'What *is* he?' said Magrat. 'They're gnomes,' said Nanny. The man lowered the thimble. 'Pictsies!' Carpe Jugulum, Terry Pratchett 1998 Visit the MaVerick web-site - http://www.maverick-dbms.org Open Source Pick ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Dreaming About Redesigning SQL
In article [EMAIL PROTECTED], Lee Fesperman [EMAIL PROTECTED] writes If you don't care for mathematical principles, there's always ad-hoc database models. Check out Pick, OO and XML databases. They're interested in what works and ignore elegance and mathematical principles. Mathematical principles? You mean like Euclidean Geometry and Newtonian Mechanics? They're perfectly solid, good, mathematically correct. Shame they don't actually WORK all the time in the real world. That's what I feel about relational, too ... Cheers, Wol -- Anthony W. Youngman - wol at thewolery dot demon dot co dot uk Witches are curious by definition and inquisitive by nature. She moved in. Let me through. I'm a nosey person., she said, employing both elbows. Maskerade : (c) 1995 Terry Pratchett ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] dynamic sql with parameters in interactive queries
hi postgresql supports dynamic sql with parameters in SQL function bodies, but not in interactive queries. why? when i wrote a dynamic sql with parameters, ODBC just filled the values of parameters into query string and sent it to server as a static query string. i think it's not right solution to dynamic sql with parameters. thanks ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Help Unsubscribing
How do I unsubscribe from here? Thank You, Anthony ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Is there a way to drop and restore an index?
Tom Lane wrote: See REINDEX. Thanks. -Tony ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] IRIX 6.5.12: POSIX BSD
I've been going talking with the SGI technical support about some of the errors I got when compiling Postgres 7.1.1 on SGI IRIX 6.5.12 with the MIPSPro 7.3 C compiler. I've already mentioned that somehow the compiler can't see the correct definition for strdup (I believe she thought that it was due to the POSIX declaration). There's also a problem with it not seeing the structure timeval defined. timeval is in /usr/include/sys/time.h and is declared in the following way: #if _XOPEN4UX || defined(_BSD_TYPES) || defined(_BSD_COMPAT) /* * Structure returned by gettimeofday(2) system call, * and used in other calls. * Note this is also defined in sys/resource.h */ #ifndef _TIMEVAL_T #define _TIMEVAL_T struct timeval { #if _MIPS_SZLONG == 64 __int32_t :32; #endif time_t tv_sec; /* seconds */ long tv_usec; /* and microseconds */ }; So SGI is assuming that you're declaring BSD types or compatibility. However, the tech support person said that with the compiler's POSIX declaration, this is conflicting. Basically, she says that POSIX implies generalized portability across many platforms, but BSD implies a specific type of platform. So that's where she thinks SGI is having the trouble-- two conflicting type declarations. Is this correct? -Tony ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Followup to IRIX6.5 installation of PG 7.1.1
I've been talking with SGI tech support about my problem with installing Postgres 7.1.1 on the SGI (IRIX 6.5.12 using the MIPSPro 7.3 compiler). Fortunately, one of my SGI's (an octane) built PG without any problem so this is just academic now (but probably useful for others wanting to install PG on the SGI). The other SGI (an o2) seems to lose definitions of strdup and timeval and some other structures. On the specific question of strdup, the SGI person told me this: Hi Tony, From my research I came across this: strdup is not part of ISO C, either C 89, C90, C95, or the latest, C99. As a a result, there is no strdup() prototype visible, so the compiler assumes strdup returns an int. An int cannot be transformed to a char * without a cast, so the compiler gives a diagnostic. I noticed in your code string.h is not included. The man page for strdup specifies the inclusion of this header. Please advise. Any comments? -Tony ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Followup to IRIX6.5 installation of PG 7.1.1
Tom Lane wrote: string.h is included in every Postgres source file (via c.h). Yep. That's what I expected. SGI technical support seems to think that the problem is with the POSIX flag. Have you defined any POSIX variables, such as -D_POSIX_SOURCE or included pthread.h? When you enable POSIX you will incur a lot of undefined smbols that are not POSIX compliant. You can check the symbols in the POSIX standards. I can't say that I understand this at all, but I believe she is saying that the -D_POSIX_SOURCE flag has caused some function declarations to be hidden from the compiler (?). In any case, since I have a working copy of 7.1.1 for IRIX, I'll leave all of it alone for now as it has gone past my comprehension. SGI has a freeware site with PostgreSQL 7.0.3. I'm sure that they'll figure this out when they try to build 7.1.1 for the site. -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Installation on SGI IRIX 6.5.10
Tom Lane wrote: That absolutely should NOT be necessary; there should be a proper extern declaration of strdup visible. Perhaps it should be added to include/port/irix5.h (cf port/nextstep.h). regards, tom lane Just to make sure, I tried compiling on another SGI. Everything went fine without any kludgy workarounds. It looks like somehow my compiler/OS has some problems. I'm upgrading to the latest OS version and compiler version. Thanks for the help. -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Installation on SGI IRIX 6.5.10
Tom Lane wrote: cc-1070 cc: ERROR File = xact.c, Line = 696 The indicated type is incomplete. struct timeval delay; ^ Hm. Which system header file defines struct timeval on IRIX? I'd expect time.h or sys/time.h, but maybe they keep it someplace unusual. In /usr/include/sys/time.h: #if _XOPEN4UX || defined(_BSD_TYPES) || defined(_BSD_COMPAT) /* * Structure returned by gettimeofday(2) system call, * and used in other calls. * Note this is also defined in sys/resource.h */ #ifndef _TIMEVAL_T #define _TIMEVAL_T struct timeval { #if _MIPS_SZLONG == 64 __int32_t :32; #endif time_t tv_sec; /* seconds */ long tv_usec; /* and microseconds */ }; -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Addition to: Trouble with initdb when the #define NAMEDATALEN = 51
Tom Lane wrote: G. Anthony Reina [EMAIL PROTECTED] writes: In postgres_ext.h, I changed: #define NAMEDATALEN 32 to #define NAMEDATALEN 51 Everything compiled and installed. However, the initdb started up but then just said that it failed. I have not tried that in awhile, but the last time I did, it worked fine. Are you sure you did a *complete* rebuild? I'd suggest make distclean at the top level, configure, make all, install, initdb. I did a 'gmake clean'. I'll try again today. Perhaps I'll find something that I was overlooking on Friday. -Tony ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [HACKERS] Installation on SGI IRIX 6.5.10
Tom Lane wrote: Evidently IRIX also considers strdup() to be nonstandard :-( It's hard to believe that SGI is quite this braindead. I think there is something broken about configure on your setup. Can't tell what from here --- suggest you call in some IRIX gurus. Yep. So goes SGI. I can't figure out why this error is showing up. When I looked at the man page for strdup: char *strdup (const char *s1); which is how it looks to be used in async.c. I simply added a specific type-cast: notifyName = (char *) strdup(relname); and it compiled async.c fine (of course, now I'll have to go through some of the other files that also have strdup and change them). I'm going to see if the SGI technical support considers this a bug or not. -Tony ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Installation on SGI IRIX 6.5.10
Tom Lane wrote: #if _XOPEN4UX || defined(_BSD_TYPES) || defined(_BSD_COMPAT) Next thought is that maybe none of these control symbols are defined by default --- could you look into that possibility? Perhaps some compiler switches or #defines are needed to get IRIX to allow struct timeval? regards, tom lane In xact.c, I added: #define _BSD_COMPAT 1 before #include sys/time.h It seems to get through that part of the compilation okay now. I'm not sure if that will break anything else but it seems minor. There's a new problem with async.c: cc-1515 cc: ERROR File = async.c, Line = 172 A value of type int cannot be assigned to an entity of type char *. notifyName = strdup(relname); ^ 1 error detected in the compilation of async.c. gmake[3]: *** [async.o] Error 2 gmake[3]: Leaving directory `/usr/src/postgresql-7.1.1/src/backend/commands' gmake[2]: *** [commands-recursive] Error 2 gmake[2]: Leaving directory `/usr/src/postgresql-7.1.1/src/backend' gmake[1]: *** [all] Error 2 gmake[1]: Leaving directory `/usr/src/postgresql-7.1.1/src' gmake: *** [all] Error 2 It looks like I just need to change the code to explicitly cast the variable. -Tony ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Trouble with initdb when the #define NAMEDATALEN = 51
I'm not sure if this is still needed in postgres to define the length of a variable/table name. In postgres_ext.h, I changed: #define NAMEDATALEN 32 to #define NAMEDATALEN 51 Everything compiled and installed. However, the initdb started up but then just said that it failed. I did a gmake clean and changed the 51 back to 32 and everything went through correctly (make, install, and initdb). Can anyone else verify if this is correct or even makes sense? Thanks. -Tony ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Installation on SGI IRIX 6.5.10
In addition to my RedHat 6.2 server, I'm installing Postgres 7.1.1 on an SGI O2 (IRIX 6.5.10). The configure works, but the 'gmake all' fails when it tries to compile 'xact.c': cc-1521 cc: WARNING File = /usr/include/setjmp.h, Line = 26 A nonstandard preprocessing directive is used. #ident $Revision: 1.36 $ ^ cc-1070 cc: ERROR File = xact.c, Line = 696 The indicated type is incomplete. struct timeval delay; ^ 1 error detected in the compilation of xact.c. gmake[4]: *** [xact.o] Error 2 gmake[4]: Leaving directory `/usr/src/postgresql-7.1.1/src/backend/access/transam' gmake[3]: *** [transam-recursive] Error 2 gmake[3]: Leaving directory `/usr/src/postgresql-7.1.1/src/backend/access' gmake[2]: *** [access-recursive] Error 2 gmake[2]: Leaving directory `/usr/src/postgresql-7.1.1/src/backend' gmake[1]: *** [all] Error 2 gmake[1]: Leaving directory `/usr/src/postgresql-7.1.1/src' gmake: *** [all] Error 2 o21.nsi.edu:postgres::/usr/src/postgresql-7.1.1 I'm using the SGI MIPSPro 7.1 C compiler. I haven't had any problems like this when compiling previous versions of Postgres. If necessary, I could try to get gcc instead of the MIPSPro compiler, but I wonder if the xact.c definition for timeval could be modified to pass on my machine. Thanks. -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Addition to: Trouble with initdb when the #define NAMEDATALEN = 51
Sorry, I forgot to include that I'm compiling this on RedHat 6.2, Pentium III with Postgres 7.1.1. -Tony I'm not sure if this is still needed in postgres to define the length of a variable/table name. In postgres_ext.h, I changed: #define NAMEDATALEN 32 to #define NAMEDATALEN 51 Everything compiled and installed. However, the initdb started up but then just said that it failed. I did a gmake clean and changed the 51 back to 32 and everything went through correctly (make, install, and initdb). Can anyone else verify if this is correct or even makes sense? Thanks. -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] 7.1.1
I see by the messages that 7.1.1 is in the final packaging. Anyone know when it will be released? -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] pg_dump Backup on 7.0.3 - Sanity error?
Tom Lane wrote: Most likely, you removed the user that owned ro_ellipse. Create a user with the same usesysid shown as ro_ellipse's relowner, or else change the relowner field to point at an extant user. I believe 7.1's pg_dump copes with this sort of thing more gracefully... regards, tom lane Yes. I did delete that user. Thanks Tom. That makes sense. -Tony ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
[HACKERS] pg_dump Backup on 7.0.3 - Sanity error?
I'm trying to use pg_dump to backup my tables one at a time from Postgres 7.0.3 (I'll upgrade to 7.1 in a few weeks). I'm getting a strange error that I've never encountered before. The backup call is:pg_dump db01 -t cell | gzip cell.backup.gz The error is : failed sanity check, table ro_ellipse was not found However, I wasn't even accessing table ro_ellipse. Plus, I've verified that the table does exist and appears fine (I can select data from it). I vacuumed the db and even restarted the postmaster, but I still get this weird warning. Anyone seen this before or know if this is a problem? Thanks. -Tony Postgres 7.0.3 running on RH Linux 6.2 (Zoot), Pentium III/400 MHz, 512 Meg RAM ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Is it possible to mirror the db in Postgres?
We use Postgres 7.0.3 to store data for our scientific research. We have two other labs in St. Louis, MO and Tempe, AZ. I'd like to see if there's a way for them to mirror our database. They would be able to update our database when they received new results and we would be able to update theirs. So, in effect, we'd have 3 copies of the same db. Each copy would be able to update the other. Any thoughts on if this is possible? Thanks. -Tony Reina ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] Re: Is it possible to mirror the db in Postgres?
Nathan Meyers wrote: Does the replication have to be reliable? Are you equipped to reconcile databases that have got out of sync, if not? Will the different labs ever try to update the same existing record, or insert conflicting (unique-key) records? (1) Yes, of course. (2) Willing--yes; equipped--dunno. (3) Yes, probably. Reliable WAN replication is harder. Most of the proprietary database companies will tell you they can do it, but their customers will tell you they can't. Joel Burton suggested the rserv utility. I don't know how well it would work over a wide network. -Tony ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] rtrim giving weird result
Ken Hirsch wrote: So rtrim("center_out_opto", "_opto") returns "center_ou" because "u" is not in the set {o, p, t, _} but all the characters after it are. rtrim("center_out_opto", "pot_") will produce the same thing. That seems like an odd definition (although as Tom points out, it is consistent with Oracle). Is there a way to just remove the "_opto" from the end of the string? -Tony ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] rtrim giving weird result
I'm running Postgres 7.0.3 on a RedHat Linux 6.1. For some reason, rtrim is giving me an incorrect result: db01=# SELECT tablename FROM pg_tables WHERE tablename LIKE '%_opto' AND tablename NOT LIKE 'pg%' ORDER BY tablename ASC ; tablename - center_out_opto circles_opto ellipse_opto ex_ellipse_opto figure8_opto ro_ellipse_opto (6 rows) Now I want to return the same thing only with the trailing '_opto' removed: db01=# SELECT rtrim(tablename, '_opto') FROM pg_tables WHERE tablename LIKE '%_opto' AND tablename NOT LIKE 'pg%' ORDER BY tablename ASC ; rtrim center_ou === NOTE: the trailing 't' is missing circles ellipse ex_ellipse figure8 ro_ellipse (6 rows) However, as you can see, the 'center_out' table is missing the last 't'. If I exclude the '_': db01=# SELECT rtrim(tablename, 'opto') FROM pg_tables WHERE tablename LIKE '%_opto' AND tablename NOT LIKE 'pg%' ORDER BY tablename ASC ; rtrim - center_out_ === 't' shows up again circles_ ellipse_ ex_ellipse_ figure8_ ro_ellipse_ (6 rows) The 't' is back. Is there something that I'm doing wrong with my query here? Thanks. -Tony ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] ERROR: cannot open relation center_out_analog_proc
I was trying to add a column to a table and fill it but ran into a big error. Apparently now Postgres can't open this table to vacuum or to select although it does show up when I ask psql to describe the table (i.e. db01=# /d center_out_analog_proc). I'm using Postgres 7.0.3 on a PII/400 MHz with RedHat Linux (kernel 2.2.14-5). The command that started the problem was from the script: -- Re-arranges the columns in a table -- -- Tony Reina -- Created: 6 March 2001 -- The BEGIN and COMMIT statements ensure that either all statements are done or none are done BEGIN WORK; -- ADD THE NEW COLUMN TO THE TABLE ALTER TABLE center_out_analog_proc ADD COLUMN name text; -- SELECT the columns from the table in whatever new format you wish. Place into a temporary table. SELECT subject, arm, target, rep, channel, name, cut_off_freq, quality, analog_data INTO temp_table FROM center_out_analog_proc; -- DROP THE OLD TABLE DROP TABLE center_out_analog_proc; -- MAKE THE NEW TABLE INTO THE OLD TABLE ALTER TABLE temp_table RENAME TO center_out_analog_proc; -- FILL THE NEW COLUMN WITH THE CORRECT DATA UPDATE center_out_analog_proc SET name = (SELECT name FROM center_out_analog AS a WHERE a.subject = center_out_analog_proc.subject AND a.arm = center_out_analog_proc.arm AND a.target = center_out_analog_proc.target AND a.rep = center_out_analog_proc.rep AND a.channel = center_out_analog_proc.channel); -- VACUUM THE TABLE VACUUM VERBOSE ANALYZE center_out_analog_proc; COMMIT WORK; --- When I ran this, I had an error in the UPDATE command (so the entire transaction aborted). I assumed that becuase the transaction aborted that nothing would have changed in the db. However, after this happened, I corrected the UPDATE command but ran into this error when I re-ran the script: db01=# \i alter_table_format.sql BEGIN psql:alter_table_format.sql:14: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:14: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:14: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:14: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:14: ERROR: cannot open relation center_out_analog_proc psql:alter_table_format.sql:17: NOTICE: current transaction is aborted, queries ignored until end of transaction block *ABORT STATE* psql:alter_table_format.sql:20: NOTICE: current transaction is aborted, queries ignored until end of transaction block *ABORT STATE* psql:alter_table_format.sql:26: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:26: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:26: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:26: NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory psql:alter_table_format.sql:26: NOTICE: current transaction is aborted, queries ignored until end of transaction block *ABORT STATE* psql:alter_table_format.sql:29: NOTICE: current transaction is aborted, queries ignored until end of transaction block *ABORT STATE* COMMIT When I try to vacuum the table or the database I get: NOTICE: Pages 190: Changed 0, reaped 0, Empty 0, New 0; Tup 9280: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 652, MaxLen 652; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.07s/0.14u sec. NOTICE: --Relation circles_analog_proc -- NOTICE: Pages 187: Changed 0, reaped 0, Empty 0, New 0; Tup 9140: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 652, MaxLen 652; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.08s/0.13u sec. NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory NOTICE: RelationIdBuildRelation: smgropen(center_out_analog_proc): No such file or directory NOTICE: --Relation center_out_analog_proc -- NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory ERROR: cannot open relation center_out_analog_proc db01=# select distinct monkey from center_out_analog_proc; NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory ERROR: cannot open relation center_out_analog_proc Likewise, a select gives me: db01=# select distinct arm from center_out_analog_proc; NOTICE: mdopen: couldn't open center_out_analog_proc: No such file or directory NOTICE: mdopen: couldn't open center_out_analog_proc:
Re: [HACKERS] ERROR: cannot open relation center_out_analog_proc
Tom Lane wrote: You can't roll back a DROP TABLE under pre-7.1 releases (and 7.0 has a big fat warning notice to tell you so!). The physical table file is deleted immediately by the DROP, so rolling back the system catalog changes doesn't get you back to a working table. The only way to clean up at this point is to drop the table for real. Okay, so then you are saying that even though the DROP TABLE and ALTER TABLE RENAME went through correctly, the line after that bombed out, tried to rollback the transaction, and gave me the error? I definitely missed that warning. Are there any big warnings for things that don't work so well within a transaction (BEGIN WORK; COMMIT WORK)? Thanks. Off to rebuild a table. -Tony ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] Re: Patches with vacuum fixes available for 7.0.x
Alfred, Is there a tarbar with the updated files for the vacuum patch? Or, is there some way to use the 'v.diff' file without the need to modify the files by hand? I started changing the files by hand, but realized that there is so much information that I'm bound to make a mistake in the manual update. Thanks. -Tony Reina There's three patchsets and they are available at: http://people.freebsd.org/~alfred/vacfix/ complete diff: http://people.freebsd.org/~alfred/vacfix/v.diff only lazy vacuum option to speed up index vacuums: http://people.freebsd.org/~alfred/vacfix/vlazy.tgz only lazy vacuum option to only scan from start of modified data: http://people.freebsd.org/~alfred/vacfix/mnmb.tgz
[HACKERS] Weird backup file
I backed up my database from Postgres 6.5.3 and migrated to 7.0.2 several a few months ago. For some reason, data was lost in the transition. I've finally pinned it down to the attached file (abridged to point out the problem). It looks like two things happened in the backup. First, when I move from 'G' to 'F' in the names column, I seem to lose the column called 'dsp_chan'. Second, the double quotes around the float_4 array called 'spike_hist' aren't included. I'm not sure if the double quotes are necessary, but the missing column is probably a problem. I added this column after the database was created by using 'alter table ellipse_cell_proc add column dsp_chan' and then put it in the correct position by using: SELECT name, arm, rep, cycle, hemisphere, area, cell, dsp_chan, spike_hist INTO xxx FROM ellipse_cell_proc; DROP TABLE ellipse_cell_proc; ALTER TABLE xxx RENAME TO ellipse_cell_proc; Can anyone explain what went wrong with the backup or where I erred adding the column? Thanks. -Tony \connect - postgres CREATE TABLE "ellipse_cell_proc" ( "name" text, "arm" character, "rep" int4, "cycle" int4, "hemisphere" character, "area" text, "cell" int4, "dsp_chan" text, "spike_hist" "_float4" ); COPY "ellipse_cell_proc" FROM stdin; I L 888 2 R 4 173 1 {"0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"} I L 888 3 R 4 173 1 {"0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"} I L 888 4 R 4 173 1 {"0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"} I L 888 5 R 4 173 1 {"0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"} G R 5 2 L 4 1 1 {"1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.1","1.6","43.9","29.4","36.3","30.4","14","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","2.2","7","30.3","47.5","48.5","48.2","33.9","35.6","37.8","29.6","23.3","43.7","21.3","26.6","44","39.6","35.5","35.6","46.6","41.6","43","42.1","25.7","22.7","22.7","22.1","21.3","21.9","22.2","36","26.8","29.2","30.6","21.3","22.8","24.4","9","7.6","7.6","7.6","7.6","2","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","1.4","13.3","20.4","8.1","3.9","3.9","3.9","3.9","3.9","3.9","3.9","3.9","3.9","39","10","10","10","15.9","30.5"} G R 5 3 L 4 1 1
[HACKERS] Still having problems with DIGEST
I've subscribed and un-subscribed to the HACKERS-DIGEST list several times now. Each time I seem to be getting EVERY message sent to the list rather than a DIGEST. Can someone tell me if it is still possible to get a DIGEST of the list? Is the list administrator aware of the problem? Thanks. -Tony
[HACKERS] [Fwd: Weird backup file]
I sent this e-mail last week but hadn't received any response. Given Thomas' last message about seeing responses to threads he never recalled seeing in the first place, I'm wondering whether the original message made it to the server. -Tony p.s. I still can't seem to get the "DIGEST" to work on HACKERS. Seems to be some problems with the majordomo. Here's my original message: Original Message Subject: Weird backup file Date: Fri, 17 Nov 2000 11:27:32 -0800 From: "G. Anthony Reina" [EMAIL PROTECTED] Organization: The Neurosciences Institute To: "[EMAIL PROTECTED]" [EMAIL PROTECTED],[EMAIL PROTECTED] I backed up my database from Postgres 6.5.3 and migrated to 7.0.2 several a few months ago. For some reason, data was lost in the transition. I've finally pinned it down to the attached file (abridged to point out the problem). It looks like two things happened in the backup. First, when I move from 'G' to 'F' in the names column, I seem to lose the column called 'dsp_chan'. Second, the double quotes around the float_4 array called 'spike_hist' aren't included. I'm not sure if the double quotes are necessary, but the missing column is probably a problem. I added this column after the database was created by using 'alter table ellipse_cell_proc add column dsp_chan' and then put it in the correct position by using: SELECT name, arm, rep, cycle, hemisphere, area, cell, dsp_chan, spike_hist INTO xxx FROM ellipse_cell_proc; DROP TABLE ellipse_cell_proc; ALTER TABLE xxx RENAME TO ellipse_cell_proc; Can anyone explain what went wrong with the backup or where I erred adding the column? Thanks. -Tony \connect - postgres CREATE TABLE "ellipse_cell_proc" ( "name" text, "arm" character, "rep" int4, "cycle" int4, "hemisphere" character, "area" text, "cell" int4, "dsp_chan" text, "spike_hist" "_float4" ); COPY "ellipse_cell_proc" FROM stdin; I L 888 2 R 4 173 1 {"0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0.4","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0"} I L 888 3 R 4 173 1 {"0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0",&q
Re: [HACKERS] [Fwd: Weird backup file]
Tom Lane wrote: Your procedure was fine, but ALTER TABLE RENAME was mighty flaky in pre-7.0 releases. Even in 7.0, doing it inside a transaction block is asking for trouble (that's finally fixed for 7.1, thank goodness). I suspect you got bit by an ALTER bug. I'm not sure about the exact mechanism, but I have a suspicion: it looks a lot like some blocks of the original ellipse_cell_proc table got written into the new table. I know 6.5 failed to clear old shared disk buffers during a table rename. I can't recall if it was sloppy about that during a table drop as well, but it would've taken both bugs to cause this result if I'm guessing right that that was the failure path. There are good reasons why we've been urging people to update to 7.0.* ASAP ... I'm afraid you got bit by one :-(. Sorry about that. Okay. At least the problem has been solved. It seems though that the last 2 times I've done a backup (in order to upgrade to the latest Postgres version) I've had data lost because of some error. I'm getting a little concerned about the quality of the Postgres backups. -Tony
[HACKERS] PostgreSQL 7.0.3?
I remember a post about 2 weeks back concerning a new patch that was to be introduced as 7.0.3. I haven't seen any reference to this since then. Is this still happening, or will the patch be part of 7.1? -Tony Reina