Re: [GENERAL] cache lookup failed for function 19119

2010-07-22 Thread tamanna madaan
Hi All

Thanks for your suggestions. But the problem here seems to be with the way
Slony applies triggers.

Running the queries below in the database gives the following results:

SELECT tgrelid, tgname, tgfoid FROM pg_trigger;

 tgrelid | tgname                     | tgfoid
---------+----------------------------+--------
   16470 | _clustername_denyaccess_1  |  19119
   16470 | user_defined_trig_1        |  17733
   16470 | user_defined_trig_2        |  17629
   16392 | _clustername_denyaccess_1  |  20801



msw=# SELECT relname FROM pg_class WHERE oid = 16470;
 relname
----------
 abc_pkey
(1 row)

 

msw=# SELECT relname FROM pg_class WHERE oid = 16392;
 relname
---------
 abc
(1 row)

 

The result is like this for all the replicated tables, i.e. the denyaccess
trigger is applied both to the table and to its primary-key index (table_pkey),
with different function OIDs (19119 and 20801 in this case). The function with
OID 20801 is present in the pg_proc table and its name is denyaccess(), but the
function with OID 19119 is not present in the database, and that's why we get
the error "cache lookup failed for function 19119".

Now my question is: how did the denyaccess trigger get applied to table_pkey?
The denyaccess trigger should have been applied only to the table itself, and
it is the user-defined triggers that should have been applied to table_pkey in
the slave database.

Is this a known issue? What can lead to this kind of situation?
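In case it helps anyone debugging the same symptom, here is a diagnostic sketch
(assuming only the standard catalogs shown above) that lists every trigger whose
function OID has no matching row in pg_proc:

```sql
-- triggers that reference a function OID with no matching pg_proc row
SELECT c.relname, t.tgname, t.tgfoid
FROM pg_trigger t
JOIN pg_class c ON c.oid = t.tgrelid
LEFT JOIN pg_proc p ON p.oid = t.tgfoid
WHERE p.oid IS NULL;
```

Any row this returns is a trigger that will fail with "cache lookup failed"
when it fires.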

 

Please help

 
Thanks..
Tamanna
 

-Original Message-
From: David Fetter [mailto:da...@fetter.org] 
Sent: Saturday, July 17, 2010 8:45 PM
To: Merlin Moncure
Cc: tamanna madaan; pgsql-general@postgresql.org
Subject: Re: [GENERAL] cache lookup failed for function 19119

On Thu, Jul 15, 2010 at 10:21:52AM -0400, Merlin Moncure wrote:
 On Thu, Jul 15, 2010 at 2:34 AM, tamanna madaan
 tamanna.ma...@globallogic.com wrote:
  Hi All
 
  I am using  postgres-8.1.2 .
 
  And getting this error cache lookup failed for function 19119.
 
  Can anyone please let me know what could have gone wrong.
 
How can a function go missing? And which function is it talking about? Is it
some internal Postgres function or a user-defined function?

How can I get the function name corresponding to 19119?
 
The function is either gone (it was deleted manually from pg_proc, for
example), dropped, replaced, etc., or there is some other problem. You
might be able to fix the problem by recreating (CREATE OR REPLACE) the
object that is calling the function in question (your database log should
give you some context).
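A quick way to check whether the function row still exists (a sketch; the OID
is the one from this thread):

```sql
-- does pg_proc still have a row for the OID in the error message?
SELECT oid, proname FROM pg_proc WHERE oid = 19119;
-- zero rows back means the function really is gone, which matches
-- the "cache lookup failed" error
```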
 
You are on 8.1.2, which is crazy. You need to immediately get the latest
bugfix release in the 8.1 series. You might want to consider a
dump/reload... read the release notes for the 8.1 series here:
http://www.postgresql.org/docs/8.1/static/release.html.

You might also want to note that 8.1's end of life is in November, so
start planning the upgrade to 9.0 right now.  You will likely need to
clean up some client code in order for that to work, as modern
versions of PostgreSQL don't allow some of the sloppy and dangerous
things (casting automatically to and from text, e.g.) that former
versions did.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Howard Rogers
On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote:



 Why on Earth would I want to store this sort of stuff in a bit string?!

 Because you are manipulating bits and not integers?  I guess there are
 10 kinds of people, those who like to think in binary and those who
 don't.

 I don't know about you, but I find looking at 21205 a darned sight
 easier than staring blankly at 101001011010101!!

 If the fifth bit means one thing, and the 7th bit means something
 else, quick which of the following have the fifth bit set and the 7th
 bit off:

 01001101 (base2)
 or
 77 (base 10)


 And, fundamentally,
 they mean precisely the same thing.

 Of course.  But that wasn't my point, and by my example above, one is
 much easier to figure out than the other if you're interested in bit
 twiddling.

 And the '&' function works as
 nicely with boring old decimals as it does with long-winded binaries,
 so I really don't see the point of making it more complicated than it
 needs to be - but I'm open to be enlightened on the matter!

 I fail to see how storing a binary as a binary and showing it as a
 binary makes things more complicated.  But I'm open to someone showing
 me how that's true.  At least hex or octal have direct and simple
 conversions where each hex or octal digit represents 4 or 3 bits
 respectively.  Decimal does not.

 Thanks for the second link though. I hadn't realised that PostgreSQL
 was so richly-endowed with bitwise functions. Specifically, it's got
 the bitwise XOR I was thinking it would be nice to have in these sorts
 of situations:

 ims=# select * from coloursample where colour # 10 = 0;
  recid | colour |     descript
 -------+--------+-------------------
      1 |     10 | Yellow and Orange
 (1 row)

 Not quite sure how to apply that to my more realistic example just
 yet, but I think this will be very helpful, so thank you!

 Note you can cast integer to bitstring, but there may be some odd
 behaviour for sign bits and such.  Which is again why I'd use the
 right type for the job, bit string.  But it's your project.


Quoting...

 Because you are manipulating bits and not integers?  I guess there are
 10 kinds of people, those who like to think in binary and those who
 don't.

Er, no. 21205 is not an integer. It's an encoded bit of magic.


 quick which of the following have the fifth bit set and the 7th
bit off:

01001101 (base2)
or
77 (base 10)

I'll give you that one (except that the fifth bit isn't set and the
7th bit is on!!). Now repeat for 10 million records and see how
you get on.

Really, I don't have to visually inspect a record to work this stuff
out! So what is easier on your eye for one record is completely
irrelevant as far as my code is concerned!

one is
much easier to figure out than the other if you're interested in bit
twiddling.

As I say, take an incredibly simple example and everything looks, er,
simple. Point of fact, I happen to know that the '1' bit is set in
21205 simply by looking at the last digit and spotting that it's not
even. Fat lot of good that does me when fetching 350,000 records that
happen to match 'insurance claim'.

I fail to see how storing a binary as a binary and showing it as a
binary makes things more complicated

Because it's NOT binary. It's an encoding. Whether that encoding is
displayed in binary, fluent hebrew or klingon or imaginary numbers is
really irrelevant to me. It happens that I can look at a decimal
number and work out most things (if the number is 29438, it's a fair
chance the 16384 bit is set, for example; if it's 4098, I know it's
4096 + 2. And so on.). If you're happier working with fancifully long
strings of 1s and 0s, good on you: but it doesn't alter the fact that
I'm working with encoded meanings, not binary digits and I prefer a
nice, compact display of that encoding which doesn't involve
hieroglyphics.
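For what it's worth, both styles can be tried side by side in psql. A small
sketch, using the example values from this thread (the bit(16) width is an
assumption):

```sql
-- integer bitwise AND: is the 16384 bit set in 21205?
SELECT (21205 & 16384) <> 0 AS bit_is_set;

-- the same test on bit strings, after casting the integers
SELECT (21205::bit(16) & 16384::bit(16)) <> 0::bit(16) AS bit_is_set;
```

Both queries express the identical test; which spelling reads better is
exactly the matter of taste being argued here.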

But it's your project.

Indeed.

Doesn't mean I can do it all on my own, of course. But if it's simply
a question of personal preference, I'll take mine over yours, for
projects I work on, if that's OK. No hard feelings! :-)

Regards
HJR



[GENERAL] Passing a PGconn * between two processes on Unix like systems

2010-07-22 Thread Marc Balmer
I have two Unix process-related questions:

Is there a documented way to pass around an (opened) PGconn * structure
between two processes on Unix?

When a process fork()s and both the parent and child process continue to
use a previously opened PGconn * structure, is that behaviour defined?

Thanks,
Marc Balmer



Re: [GENERAL] Finding last checkpoint time

2010-07-22 Thread Devrim GÜNDÜZ
On Tue, 2010-07-20 at 20:48 +0100, Thom Brown wrote:
 
 Or you can use pg_controldata /path/to/pgdata and look at Time of
 latest checkpoint.

Right. Thanks :)
-- 
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
PostgreSQL RPM Repository: http://yum.pgrpms.org
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




Re: [GENERAL] Passing a PGconn * between two processes on Unix like systems

2010-07-22 Thread Magnus Hagander
On Thu, Jul 22, 2010 at 08:35, Marc Balmer m...@msys.ch wrote:
 I have two Unix process-related questions:

 Is there a documented way to pass around an (opened) PGconn * structure
 between two processes on Unix?

No.

You can probably hack up something yourself but you'd have to look
inside the struct which is not part of the public API - so it'd be a
very version-dependent (even minor-version dependent!) hack.


 When a process fork()s and both the parent and child process continue to
 use a previously opened PGconn * structure, is that behaviour defined?

Yes - broken :-)

Well, the child can continue to use it *as long as the parent doesn't
use it anymore*.

And note that while it may be a good idea in general to close the
socket in the parent, you can *not* call PQfinish() on it - that'll
tell the server you're disconnecting, and the child will stop working.
In theory you could do something like close(PQsocket(conn))...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [GENERAL] locating cities within a radius of another

2010-07-22 Thread Oliver Kohll - Mailing Lists

On 21 Jul 2010, at 23:14, Joe Conway m...@joeconway.com wrote:

 If you want something simple, and not requiring PostGIS, but plpgsql
 instead, see:
 
 http://archives.postgresql.org/pgsql-sql/2003-12/msg00193.php

For completeness, the earthdistance module also provides the distance between
two lat/longs; the point <@> point operator is simple to use:
http://www.postgresql.org/docs/8.3/static/earthdistance.html

Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
www.agilebase.co.uk - software



Re: [GENERAL] Passing a PGconn * between two processes on Unix like systems

2010-07-22 Thread Alban Hertroys
On 22 Jul 2010, at 8:35, Marc Balmer wrote:

 When a process fork()s and both the parent and child process continue to
 use a previously opened PGconn * structure, is that behaviour defined?


I recall having done this successfully, but you have to take care to 
synchronise access to the connection. You can't have multiple transactions 
running in parallel on one connection.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.







[GENERAL] ECPG - Some errno definitions don't match to the manual

2010-07-22 Thread Satoshi Nagayasu
Hi all,

I'm looking into some ecpg part of the official manual,
and I have found some strange things.

I'm now investigating SQLCODE and SQLSTATE, and I have found
that some of the errno definitions don't match to the manual.

For example, the manual says that ECPG_CONVERT_BOOL could be `-207'.
However, ECPG_CONVERT_BOOL is defined as `-211' in ecpgerrno.h.

 -207 (ECPG_CONVERT_BOOL)
 
 This means the host variable is of type bool and the datum in the 
 database is neither 't' nor 'f'. (SQLSTATE 42804) 

http://www.postgresql.org/docs/9.0/static/ecpg-errors.html

 #define ECPG_NUMERIC_FORMAT   -207
 #define ECPG_CONVERT_BOOL -211

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/ecpg/include/ecpgerrno.h?rev=1.27;content-type=text%2Fx-cvsweb-markup;only_with_tag=REL9_0_STABLE

What does this mean? Is the manual not up to date?

Any suggestions?

Regards,
-- 
NAGAYASU Satoshi satoshi.nagay...@gmail.com



[GENERAL] Maximum document-size of text-search?

2010-07-22 Thread Andreas Joseph Krogh

Hi.
I'm trying to index the contents of Word documents (extracted text),
which sometimes leads to quite large documents. This results in the
following exception:
Caused by: org.postgresql.util.PSQLException: ERROR: index row requires
10376 bytes, maximum size is 8191


I have the following schema:

andreak=# \d origo_search_index
                              Table "public.origo_search_index"
          Column          |       Type        |                    Modifiers
--------------------------+-------------------+--------------------------------------------------
 id                       | integer           | not null default nextval('origo_search_index_id_seq'::regclass)
 entity_id                | integer           | not null
 entity_type              | character varying | not null
 field                    | character varying | not null
 search_value             | character varying | not null
 textsearchable_index_col | tsvector          |
Indexes:
    "origo_search_index_fts_idx" gin (textsearchable_index_col)
Triggers:
    update_search_index_tsvector_t BEFORE INSERT OR UPDATE ON origo_search_index
    FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('textsearchable_index_col',
    'pg_catalog.english', 'search_value')


I store all the text extracted from the documents in search_value and
have the built-in trigger tsvector_update_trigger maintain the
tsvector column.

Any hints on how to get around this issue to allow indexing large
documents? I don't see how indexing only the first N bytes of a
document would be of interest to anyone...


BTW: I'm using PG-9.0beta3

--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+




Re: [GENERAL] Finding last checkpoint time

2010-07-22 Thread Greg Sabino Mullane



 Or you can use pg_controldata /path/to/pgdata and look 
 at Time of latest checkpoint.

Assuming your system is using English. Otherwise, you'll 
have to build a collection of .po strings as we did for 
check_postgres.pl. Needless to say, I'd greatly prefer 
some other way to grab the information!

- -- 
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 201007220933
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: [GENERAL] locating cities within a radius of another

2010-07-22 Thread Geoffrey

Oliver Kohll - Mailing Lists wrote:


On 21 Jul 2010, at 23:14, Joe Conway m...@joeconway.com wrote:



If you want something simple, and not requiring PostGIS, but plpgsql
instead, see:

http://archives.postgresql.org/pgsql-sql/2003-12/msg00193.php


For completeness, the earthdistance module also provides the distance 
between two lat/longs, the point@point syntax is simple to use:

http://www.postgresql.org/docs/8.3/static/earthdistance.html


I did look at earthdistance before.  Revisiting it now, thanks.

So, I'm trying to figure out this syntax.  The docs say:

point <@> point - float8 - gives the distance in statute miles between
two points on the Earth's surface.

How do longitude and latitude fit into this picture?  I can't find any
other documentation or examples.

I've got the contrib modules installed, as 'select earth()' works fine.



Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll

www.agilebase.co.uk - software




--
Until later, Geoffrey

I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them.
- Thomas Jefferson



Re: [GENERAL] locating cities within a radius of another

2010-07-22 Thread Geoffrey

Oliver Kohll - Mailing Lists wrote:


On 21 Jul 2010, at 23:14, Joe Conway m...@joeconway.com wrote:



If you want something simple, and not requiring PostGIS, but plpgsql
instead, see:

http://archives.postgresql.org/pgsql-sql/2003-12/msg00193.php


For completeness, the earthdistance module also provides the distance 
between two lat/longs, the point@point syntax is simple to use:

http://www.postgresql.org/docs/8.3/static/earthdistance.html


Trying to figure out the proper usage.  My assumptions:

use ll_to_earth() to get point values to pass to 'point <@> point'

First issue, ll_to_earth() returns three values, not one.

Second issue, I tried something like:

select (ll_to_earth(46,67) <@> ll_to_earth(57,87));

I get:

ERROR:  operator does not exist: earth <@> earth
LINE 1: select (ll_to_earth(46,67) <@> ll_to_earth(57,87));

So I tried:

select (4618419.15006707 <@> 4394453.66154081);

And I get:

ERROR:  operator does not exist: numeric <@> numeric
LINE 1: select (4618419.15006707 <@> 4394453.66154081);
HINT:  No operator matches the given name and argument type(s). You
might need to add explicit type casts.


What am I missing???
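For the record, the mismatch here is one of types: ll_to_earth() produces
values of the earth type, which pair with the earth_distance() function, while
the <@> operator is defined on plain point values taken as (longitude,
latitude). A sketch, reusing the coordinates from the attempts above:

```sql
-- earth values from ll_to_earth(lat, lon) go with earth_distance() (metres)
SELECT earth_distance(ll_to_earth(46, 67), ll_to_earth(57, 87));

-- the <@> operator wants point values, written point(longitude, latitude),
-- and returns statute miles
SELECT point(67, 46) <@> point(87, 57);
```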



--
Until later, Geoffrey

I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them.
- Thomas Jefferson



Re: [GENERAL] locating cities within a radius of another

2010-07-22 Thread Geoffrey

Oliver Kohll - Mailing Lists wrote:


On 21 Jul 2010, at 23:14, Joe Conway m...@joeconway.com wrote:



If you want something simple, and not requiring PostGIS, but plpgsql
instead, see:

http://archives.postgresql.org/pgsql-sql/2003-12/msg00193.php


For completeness, the earthdistance module also provides the distance 
between two lat/longs, the point@point syntax is simple to use:

http://www.postgresql.org/docs/8.3/static/earthdistance.html


Disregard my last post.  Surely as soon as I hit send, the light went
on...  I'm looking at deriving my points for point <@> point from
ll_to_earth().




Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll

www.agilebase.co.uk - software




--
Until later, Geoffrey

I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them.
- Thomas Jefferson



[GENERAL] Getting statistics for each sql statement?

2010-07-22 Thread Stefan-Michael Guenther

Hello,

is it possible to get statistics on the usage of different SQL
statements, e.g. how many INSERT or UPDATE statements per day?

log_statement_stats doesn't seem to be the right parameter, or I haven't
found the output statistics for this command.


Any ideas or suggestions?

Thanks,

Stefan




RESOLVED: Re: [GENERAL] Maximum document-size of text-search?

2010-07-22 Thread Andreas Joseph Krogh

On 07/22/2010 03:31 PM, Andreas Joseph Krogh wrote:

Hi.
I'm trying to index the contents of Word documents (extracted text),
which sometimes leads to quite large documents. This results in the
following exception:
Caused by: org.postgresql.util.PSQLException: ERROR: index row
requires 10376 bytes, maximum size is 8191


I have the following schema:
andreak=# \d origo_search_index
   Table public.origo_search_index
  Column  |   Type
|Modifiers
--+---+- 

 id   | integer   | not null default 
nextval('origo_search_index_id_seq'::regclass)

 entity_id| integer   | not null
 entity_type  | character varying | not null
 field| character varying | not null
 search_value | character varying | not null
 textsearchable_index_col | tsvector  |

origo_search_index_fts_idx gin (textsearchable_index_col)

Triggers:
update_search_index_tsvector_t BEFORE INSERT OR UPDATE ON 
origo_search_index FOR EACH ROW EXECUTE PROCEDURE 
tsvector_update_trigger('textsearchable_index_col', 
'pg_catalog.english', 'search_value')


I store all the text extracted from the documents in search_value 
and have the built-in trigger tsvector_update_trigger update the 
tsvector-column.


Any hints on how to get around this issue to allow indexing large 
documents? I don't see how only index the first N bytes of the 
document would be of interest to anyone...


BTW: I'm using PG-9.0beta3


Never mind... I had a btree index on search_value too, which of
course caused the problem.
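For anyone else hitting this: the fix is to drop the btree index on the raw
document text. If an equality lookup on search_value is still needed, an index
over a digest stays small regardless of document size. A sketch with an
assumed index name:

```sql
-- the oversized btree index on raw document text (assumed name)
DROP INDEX IF EXISTS origo_search_index_search_value_idx;

-- equality lookups can instead use an index over a fixed-size digest
CREATE INDEX origo_search_index_search_value_md5_idx
    ON origo_search_index (md5(search_value));

-- then query with:
--   WHERE md5(search_value) = md5(?) AND search_value = ?
```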


--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+




[GENERAL] Clarification of the simple dictionary

2010-07-22 Thread Andreas Joseph Krogh
Hi. It's not clear to me whether the simple dictionary uses stopwords or
not. Does it?
Can someone please post a complete description of what the simple
dictionary does?


--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+




Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Peter Hunsberger
On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

 On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:


 
  Why on Earth would I want to store this sort of stuff in a bit string?!
 
  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.
 
  I don't know about you, but I find looking at 21205 a darn'd site
  easier than staring blankly at 101001011010101!!

<snip>lots of stuff</snip>

 
  Note you can cast integer to bitstring, but there may be some odd
  behaviour for sign bits and such.  Which is again why I'd use the
  right type for the job, bit string.  But it's your project.
 

 Quoting...

  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.

 Er, no. 21205 is not an integer. It's an encoded bit of magic.


In that case your database design is fundamentally broken.  A database
should have content fields that map to the needs of the application.
As you describe your application requirements, that is a bit string
and not an integer.  Use bit strings and your application logic is
transparent, obvious, and easy to maintain.  Use integers and you have
to resort to magic.  As you say, it's your choice, but you came here
looking for advice, and the advice you were given is very good.
--
Peter Hunsberger



[GENERAL] optimizing daily data storage in Pg

2010-07-22 Thread P Kishor
I have been struggling with this for a while now, have even gone down
a few paths but struck out, so I turn now to the community for ideas.
First, the problem: Store six daily variables for ~ 25 years for cells
in a grid.

 * Number of vars = 6
 * Number of cells ~ 13 million
 * Number of days ~ 9125 (25 * 365)

Optimize the store for two different kinds of queries:

Query one: Retrieve the value of a single var for all or a portion of
the cells for a single day. This is analogous to an image where every
pixel is the value of a single var.

Query two: Retrieve values for all the days or a duration of days for
a single var for a single cell. This is like grabbing a column out of
a table in which each row holds all the vars for a single day.

So, I set about designing the db. The grid is in a table with 13 million rows

CREATE TABLE cells (
cell_id  INTEGER,
other_data ..
)
WITH (
OIDS=FALSE
)


A single table *where every row is one day's values for one cell* looks like so

CREATE TABLE d (
yr  SMALLINT,
ydaySMALLINT,
a   SMALLINT,
b   SMALLINT,
d   SMALLINT,
e   SMALLINT,
f   SMALLINT,
g   SMALLINT,
cell_id INTEGER
)
WITH (
OIDS=FALSE
)

The data would look like so

yr  ydaya   b   c   d   e   f   g   cell_id

19801   x   x   x   x   x   x   x   1
..
1980365 x   x   x   x   x   x   x   1
...
19811   x   x   x   x   x   x   x   1
..
1981365 x   x   x   x   x   x   x   1
  ...
  ...
20051   x   x   x   x   x   x   x   1
..
2005365 x   x   x   x   x   x   x   1
..
19801   x   x   x   x   x   x   x   2
..
1980365 x   x   x   x   x   x   x   2
  ...

I could now (theoretically) conduct my queries like so:

Query 1a: Retrieve the value of a single var for all the cells for a
single day. This is analogous to an image where every pixel is the
value of a single var.

SELECT var FROM d WHERE yr = ? AND yday = ?;

I'm assuming I would need an index on yr and yday, or perhaps even a
compound index on (yr, yday).

Query 1b: Retrieve the value of a single var for a portion of the
cells for a single day. This is analogous to an image where every
pixel is the value of a single var.

SELECT var FROM d WHERE yr = ? AND yday = ? AND cell_id IN (?,?,?...);

I'm assuming I would need an index on yr and yday (or perhaps a
compound index on (yr, yday)) AND an index on cell_id.

Query 2: Retrieve values for all the days or a duration of days for a
single var for a single cell. This is like grabbing a column out of a
table in which each row holds all the vars for a single day.

SELECT var FROM d WHERE cell_id = ?;
SELECT var FROM d WHERE cell_id IN (?,?,?...);

Once again, an index on cell_id would assist in the above.

The problem: The above table would have 13 M * 9125 rows ~ 118 billion
rows. Huge indexes, slow queries, etc. In fact, major issues loading
the data in the first place. Since I am loading data in batches, I
drop the indexes (takes time), COPY data into the table (takes time),
build the indexes (takes time), experiment with improving the
performance (takes time), fail, rinse, lather, repeat. I actually
tried the above with a subset of data (around 100 M rows) and
experienced all of the above. I don't remember the query times, but
they were awful.

So, I partitioned the table into years like so

CREATE TABLE d_<yr> (
    CHECK ( yr = <yr> )
) INHERITS (d)

Hmmm... still no satisfaction. I ended up with 1 master table + 25
inherited tables. Each of the year tables now had ~ 4.75 billion rows
(13 M * 365), and the queries were still very slow.

So, I partitioned it all by years and days like so

CREATE TABLE d_<yr>_<yday> (
    CHECK ( yr = <yr> AND yday = <yday> )
) INHERITS (d)

Each table now has 13 million rows, and is reasonably fast (although
still not satisfactorily fast), but now I have 9K tables. That has its
own problems. I can't query the master table anymore as Pg tries to
lock all the tables and runs out of memory. Additionally, I can't
anymore conduct query two above. I could do something like

SELECT a FROM d_1980_1 WHERE cell_id = 1
UNION
SELECT a FROM d_1980_2 WHERE cell_id = 1
UNION
SELECT a FROM d_1980_3 WHERE cell_id = 1
UNION
SELECT a FROM d_1980_4 WHERE cell_id = 1
UNION
...

But the above is hardly optimal.
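One thing worth checking before resorting to hand-written UNIONs: with
constraint exclusion enabled, a query through the parent table that pins the
partition key should plan against only the matching child, so query one does
not need the UNION spelling at all. A sketch:

```sql
-- let the planner skip children whose CHECK constraints can't match
SET constraint_exclusion = on;

-- query one through the parent: only the d_1980_1 child should be scanned
SELECT a FROM d WHERE yr = 1980 AND yday = 1;
```

Note this does not help query two (a single cell across all days), since that
query carries no yr/yday predicate for the planner to prune on.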

Any suggestions, ideas, brainstorms would be appreciated. Perhaps Pg,
or even a RDBMS, is not the right tool for this problem, in which
case, suggestion for alternatives would be welcome as well.

Right now I am testing this on a dual Xeon dual core 3 GHz Xserve with
12 GB RAM. The PGDATA directory is located on an attached RAID that is
configured as 

[GENERAL] How to improve performance in reporting database?

2010-07-22 Thread Matthew Wilson
I have a daily job that pushes data from the production database into
the reporting database, which right now, is an exact copy.

I have a webapp that builds lots of reports for users.  Most of these
reports involve elaborate joins of lookup tables and lots of summations,
and they take too long to run, even after using everything I know to
tune the queries.

Since I know this is a read-only data, it seems like I should be able to
speed everything up dramatically if I run the queries offline and then
save the results into new tables.  Then the web app could just grab the
cached results out of these new tables and then spit them out quickly.

I've heard people talking about using materialized views for this, but
that was with Oracle.

What's the postgresql way here?
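The usual PostgreSQL answer (there is no built-in materialized view in this
era) is a cache table rebuilt by the nightly load job. A sketch with made-up
table and column names:

```sql
-- one-time setup: snapshot the expensive report query into a table
CREATE TABLE report_cache AS
SELECT customer_id, sum(amount) AS total_amount, count(*) AS order_count
FROM orders
GROUP BY customer_id;

-- nightly refresh, wrapped in a transaction so readers never see
-- a half-built cache
BEGIN;
TRUNCATE report_cache;
INSERT INTO report_cache
SELECT customer_id, sum(amount), count(*)
FROM orders
GROUP BY customer_id;
COMMIT;
```

One caveat: TRUNCATE takes an exclusive lock, so readers will block for the
duration of the refresh transaction; a rename-swap of two tables avoids that.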

More generally, any advice on running reporting databases well is
welcome.


Matt




Re: [GENERAL] Getting statistics for each sql statement?

2010-07-22 Thread Ben Chobot
On Jul 22, 2010, at 4:50 AM, Stefan-Michael Guenther wrote:

 Hello,
 
 is it possible to get statistics on the usage of different sql statements, 
 e.g. how many INSERT or UPDATE statements per day?
 
 log_statement_stats doesn't seem to be the right parameter or I haven't found 
 the output statistic for this command.
 
 Any ideas or suggestions?

The usual approach is to make a log file that holds all your queries and then 
let something like pgFouine grovel through it.
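If per-table counts are enough (rather than a full per-statement log), the
statistics collector already tracks cumulative insert/update/delete counts;
sampling this view once a day and diffing gives per-day numbers:

```sql
-- cumulative tuple-modification counters, per user table
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY n_tup_ins DESC;
```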


Re: [GENERAL] How to distribute quantity if same product is in multiple rows

2010-07-22 Thread Andrus

Tim,

Thank you.


It can be done in SQL: SUM(kogus) OVER (PARTITION BY toode ORDER BY ID)
- kogus (*1) will give you the running sum of the product up to that
row. You can then subtract that value from the delivered quantity to
calculate the delivered quantity for the current row.



 But doing so automatically is probably bad. For example,
if a user has a purchase order with one position of two
pieces and one position of four, it is very likely that when
a shipment of four pieces arrives, the latter position shall
be marked as delivered. So I would leave the decision to the
user.


If four pieces arrived, the first position of 2 pieces should be marked as 
delivered.
The second position of 4 pieces should be marked as partially delivered by 
setting the undelivered quantity of this row to 2.

How to use your suggestion for this?
How to implement this in PostgreSQL 8.1, 8.2, 8.3?

Andrus.


(*1) In PostgreSQL 9.0, you might be able to use ROWS BETWEEN
UNBOUNDED PRECEDING AND 1 PRECEDING instead (untested).

PS. If possible please use cc: with my email address in reply.




Re: [GENERAL] Clarification of the simple dictionary

2010-07-22 Thread John Gage
The easiest way to look at this is to give the simple dictionary a  
document with to_tsvector() and see if stopwords pop out.


In my experience they do.  The simple dictionary just breaks the  
document down into its space- (etc.) separated words; it doesn't  
analyze further.
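
John's suggestion can be tried directly. A minimal check (the comment shows
what the simple configuration is expected to produce: everything lower-cased,
nothing dropped):

```sql
-- Feed a short document through the 'simple' configuration: every word,
-- including 'the' and 'and', should come back lower-cased with positions.
SELECT to_tsvector('simple', 'The Quick Brown Fox and the dog');
-- expected: 'and':5 'brown':3 'dog':7 'fox':4 'quick':2 'the':1,6
```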


John


On Jul 22, 2010, at 4:15 PM, Andreas Joseph Krogh wrote:

Hi. It's not clear to me if the simple dictionary uses stopwords  
or not, does it?
Can someone please post a complete description of what the simple  
dict. does?


--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+








[GENERAL] Are identical subqueries in unioned statements nonrepeatable?

2010-07-22 Thread Derrick Rice
Hi all.  I had no luck finding a previous message or documentation related
to the effective transaction isolation of subqueries, specifically identical
subqueries in union statements.  Consider the following statement executed
without a transaction.

select true as from_one, table_one.*
from table_one
where table_one.id not in (select id from reference_table)
   union all
select false as from_one, table_two.*
from table_two
where table_two.id not in (select id from reference_table)

Is it possible for the contents of reference_table to differ from the first
select to the select on the right hand side of the union?  (e.g. because
some other transaction committed additional rows).  Or even from row
comparison to row comparison in the same select (I highly doubt that).

If it is not possible, why?  Is it because a single query always executes
with serializable (effective) isolation?  Is it because postgresql
recognizes that the query is repeated and uses a single result set in both
sides of the union?

Is this behavior an intentional part of postgresql, or a side effect
that I should not rely on?

Assumption:  I'm assuming that it's faster to union all after filtering by
the where clause than to union all and then filter by a single where clause.
The subquery for exclude_ids is very fast and the results of each of the
selects are a small fraction of the entire tables.  Doing a union first would
be expensive in comparison to doing a union of the

Thanks,

Derrick


Re: [GENERAL] Clarification of the simple dictionary

2010-07-22 Thread Andreas Joseph Krogh

On 07/22/2010 06:27 PM, John Gage wrote:
The easiest way to look at this is to give the simple dictionary a 
document with to_tsvector() and see if stopwords pop out.


In my experience they do.  In my experience, the simple dictionary 
just breaks the document down into the space etc. separated words in 
the document.  It doesn't analyze further.


That's my experience too; I just want to make sure it doesn't actually 
have any stopwords which I've missed. Trying many phrases and checking 
for stopwords doesn't really prove anything.


Can anybody confirm the simple dict. only lowercases the words and 
uniques them?


--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+




Re: [GENERAL] Clarification of the simple dictionary

2010-07-22 Thread Oleg Bartunov

Don't guess, but read docs
http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY

12.6.2. Simple Dictionary

The simple dictionary template operates by converting the input token to lower 
case and checking it against a file of stop words. If it is found in the file 
then an empty array is returned, causing the token to be discarded. If not, the 
lower-cased form of the word is returned as the normalized lexeme. 
Alternatively, the dictionary can be configured to report non-stop-words as 
unrecognized, allowing them to be passed on to the next dictionary in the list.

d=# \dFd+ simple
                          List of text search dictionaries
   Schema   |  Name  |     Template      | Init options |                        Description
------------+--------+-------------------+--------------+------------------------------------------------------------
 pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword

By default it has no Init options, so it doesn't check for stopwords.
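
This is easy to verify token by token with ts_lexize(), which runs a single
word through one dictionary (english_stem is used here only as a contrast: it
is configured with a stopword list):

```sql
-- 'simple' with no stopword file keeps everything, just lower-cased:
SELECT ts_lexize('simple', 'The');        -- {the}
-- A dictionary configured with a stopword list discards it instead:
SELECT ts_lexize('english_stem', 'The');  -- {}
```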

On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote:


On 07/22/2010 06:27 PM, John Gage wrote:
The easiest way to look at this is to give the simple dictionary a document 
with to_tsvector() and see if stopwords pop out.


In my experience they do.  In my experience, the simple dictionary just 
breaks the document down into the space etc. separated words in the 
document.  It doesn't analyze further.


That's my experience too, I just want to make sure it doesn't actually have 
any stopwords which I've missed. Trying many phrases and checking for 
stopwords isn't really proving anything.


Can anybody confirm the simple dict. only lowercases the words and 
uniques them?





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [GENERAL] Clarification of the simple dictionary

2010-07-22 Thread Andreas Joseph Krogh

On 07/22/2010 07:44 PM, Oleg Bartunov wrote:

Don't guess, but read docs
http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY 



12.6.2. Simple Dictionary

The simple dictionary template operates by converting the input token 
to lower case and checking it against a file of stop words. If it is 
found in the file then an empty array is returned, causing the token 
to be discarded. If not, the lower-cased form of the word is returned 
as the normalized lexeme. Alternatively, the dictionary can be 
configured to report non-stop-words as unrecognized, allowing them to 
be passed on to the next dictionary in the list.


d=# \dFd+ simple
                          List of text search dictionaries
   Schema   |  Name  |     Template      | Init options |                        Description
------------+--------+-------------------+--------------+------------------------------------------------------------
 pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword


By default it has no Init options, so it doesn't check for stopwords.


Guess what - I *have* read the docs, which say ...and checking it 
against a file of stop words. What was unclear to me was whether or not 
it is configured with a stopwords file by default, which I understand 
from your reply is not the case. Very good, fits my needs like a 
glove:-) It might be worth considering updating the docs to make this 
clearer?


So - can we rely on simple to remain this way forever (no Init 
options) or is it better to make a copy of it with the same properties 
as today?


It seems simple + the unaccent dict. available in 9.0 saves my day, 
thanks Mr. Bartunov.


--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
+-+
OfficeNet AS| The most difficult thing in the world is to |
Rosenholmveien 25   | know how to do a thing and to watch |
1414 Trollåsen  | somebody else doing it wrong, without   |
NORWAY  | comment.|
| |
Tlf:+47 24 15 38 90 | |
Fax:+47 24 15 38 91 | |
Mobile: +47 909  56 963 | |
+-+




[GENERAL] varchar[] or text[]

2010-07-22 Thread Armand Turpel

 Hi,
I know this issue has been discussed controversially. Some see no real 
benefit of using varchar over text. But I am wondering about using them 
as array values: is there any difference between varchar[126], 
varchar[1] and text[] other than the number of chars I can store? 
Performance, memory usage, ?




Re: [GENERAL] varchar[] or text[]

2010-07-22 Thread Peter C. Lai
This was discussed yesterday and previously. Please read the archives.

There is no positive performance reason to use varchar instead of text.
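
One way to see that the storage is identical: pg_column_size() reports the
same on-disk size for equal values, since varchar(n) is just text plus an
input-length check. A quick sketch:

```sql
-- The stored representation is identical; only the length check differs.
SELECT pg_column_size('abc'::text)         AS as_text,
       pg_column_size('abc'::varchar(126)) AS as_varchar;  -- same size
```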

On 2010-07-22 05:38:14PM +0200, Armand Turpel wrote:
   Hi,
 I know this issue was controversed discussed. Some one see no really 
 benefits of using varchar against text var. But i'm asking me what if i 
 use it as array values. Is there any difference between varchar[126], 
 varchar[1] and text[] else than the number of chars i can store in? 
 Performance, memory usage, ?
 

-- 
===
Peter C. Lai | Bard College at Simon's Rock
Systems Administrator| 84 Alford Rd.
Information Technology Svcs. | Gt. Barrington, MA 01230 USA
peter AT simons-rock.edu | (413) 528-7428
===




Re: [GENERAL] locating cities within a radius of another

2010-07-22 Thread Oliver Kohll - Mailing Lists

On 22 Jul 2010, at 12:57, Geoffrey wrote:

 For completeness, the earthdistance module also provides the distance 
 between two lat/longs; the point <@> point syntax is simple to use:
 http://www.postgresql.org/docs/8.3/static/earthdistance.html
 
 Disregard my last post. Surely as soon as I hit send, the light went on...  
 I'm looking at deriving my points for point <@> point from ll_to_earth().


I constructed mine using point(longitude, latitude), where long and lat are 
double precision, which returns a datatype of type point. ll_to_earth() looks 
like it returns a datatype of type earth, so I'm not sure if it will work. Maybe 
things have changed in a recent release; please let me know if so.

So an example would be
select point(-2.2171,56.8952) <@> point(-1.2833,51.6667) as miles; 
      miles       
------------------
 363.202864676916
(1 row)
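
The cube/earthdistance pair of functions gives the same kind of answer in
metres rather than statute miles; a sketch (note ll_to_earth takes latitude
first, the opposite order of point(long, lat)):

```sql
-- Great-circle distance in metres between the same two points,
-- using ll_to_earth(lat, lon) and earth_distance() from earthdistance.
SELECT earth_distance(ll_to_earth(56.8952, -2.2171),
                      ll_to_earth(51.6667, -1.2833)) AS metres;
```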

Regards
Oliver Kohll

oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
www.agilebase.co.uk - software





Re: [GENERAL] How to improve performance in reporting database?

2010-07-22 Thread Greg Smith

Matthew Wilson wrote:

I've heard people talking about using materialized views for this, but
that was with Oracle.
  


You can build those manually with PostgreSQL if you really want them:  
http://wiki.postgresql.org/wiki/Materialized_Views


The fundamental architecture is sound for a lot of problems in this 
area; you just have to figure out how to build them efficiently.  In 
your case, you might consider whether there's a way to update the MV 
in batches after each data import, rather than relying on triggers to 
keep the data up to date.
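
A batch-rebuilt "snapshot" materialized view can be as simple as recreating a
summary table inside the nightly import job. A minimal sketch with illustrative
table and column names (sales, region, sold_at, amount are assumptions, not
from the thread):

```sql
-- Rebuild the cached report table in one transaction after each import.
BEGIN;
DROP TABLE IF EXISTS report_sales_summary;
CREATE TABLE report_sales_summary AS
    SELECT region,
           date_trunc('day', sold_at) AS day,
           sum(amount) AS total
    FROM sales
    GROUP BY region, date_trunc('day', sold_at);
CREATE INDEX report_sales_summary_idx
    ON report_sales_summary (region, day);
COMMIT;
```

The web app then reads report_sales_summary instead of re-running the joins.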


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [GENERAL] How to improve performance in reporting database?

2010-07-22 Thread Scott Marlowe
On Thu, Jul 22, 2010 at 8:45 AM, Matthew Wilson m...@tplus1.com wrote:
 I have a daily job that pushes data from the production database into
 the reporting database, which right now, is an exact copy.

 I have a webapp that builds lots of reports for users.  Most of these
 reports involve elaborate joins of lookup tables and lots of summations,
 and they take too long to run, even after using everything I know to
 tune the queries.

 Since I know this is a read-only data, it seems like I should be able to
 speed everything up dramatically if I run the queries offline and then
 save the results into new tables.  Then the web app could just grab the
 cached results out of these new tables and then spit them out quickly.

 I've heard people talking about using materialized views for this, but
 that was with Oracle.

 What's the postgresql way here?

http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views

They're kinda roll your own, but they're not that hard to do.

 More generally, any advice on running reporting databases well is
 welcome.

Throw more drives and RAM at the problem, and use materialized views.
Also, you're often better off with fewer, faster CPUs than more, slower
ones for reporting servers (the opposite of OLTP, where the number of
cores is far more important).



Re: [GENERAL] Are identical subqueries in unioned statements nonrepeatable?

2010-07-22 Thread Alvaro Herrera
Excerpts from Derrick Rice's message of jue jul 22 12:27:31 -0400 2010:

 Is it possible for the contents of reference_table to differ from the first
 select to the select on the right hand side of the union?  (e.g. because
 some other transaction committed additional rows).

No.

 If it is not possible, why?  Is it because a single query always executes
 with serializable (effective) isolation?

Yes.  (Actually: it's because a query is always executed with a single
snapshot).

 Is it because postgresql
 recognizes that the query is repeated and uses a single result set in both
 sides of the union?

No.

 Is this behavior that is part of postgresql intentionally, or a side effect
 that I should not rely on?

It is intentional and will not be changed.



Re: [GENERAL] Insert and Retrieve unsigned char sequences using C

2010-07-22 Thread Merlin Moncure
2010/7/22 Vinícius Soares vinicius...@yahoo.com.br:
 Hey,

 thanks for your response.
 I did it:

     S8 sql[1500] = "insert into t values ( E'";
     U8 *msg;
     msg = PQescapeByteaConn(conn, pending_cmd->cmd.value,
                             sizeof(msg_cmd_t), &to_length);
     for (i = 0; i < sizeof(msg_cmd_t); i++) {
         S8 str[20] = "";
         sprintf(str, "%c", *(msg + i));
         strcat(sql, str);
     }
     strcat(sql, "');");
     PQexec(conn, sql);

 But it is very strange, because sometimes it works and other times it
 does not.
 Is that right?

That code doesn't look right: you need to make sure your 'to' buffer is
big enough: it has to be at least (2*N)+1 where N is the input size.  It
returns a size_t, not a char*, and you should be able to just sprintf
the 'to' into your query instead of copying the chars in a loop.  See the
following fragment:

#define ARGSZ 64
char my_bytea[ARGSZ];
char escaped_bytea[(2*ARGSZ)+1];
int error;
size_t nbytes;

nbytes = PQescapeStringConn(conn, escaped_bytea, my_bytea,
                            sizeof(my_bytea), &error);

if (error != 0)
    ; /* handle error */

sprintf(querybuf, "insert into foo(bytea_col) values (E'%s')", escaped_bytea);

Like I said earlier, this is just about the absolute worst way to
transfer a bytea to the server.  I had to look up the docs for
PQescapeStringConn -- I've never once used it in my entire life (or its
even more evil cousin, PQescapeString).

merlin



Re: [GENERAL] How to distribute quantity if same product is in multiple rows

2010-07-22 Thread Tim Landscheidt
(anonymous) wrote:

It can be done in SQL: SUM(kogus) OVER (PARTITION BY toode
ORDER BY ID) - kogus (*1) will give you the running sum of
the product up to that row. You can then subtract that value
from the delivered quantity to calculate the delivered quan-
tity for the current row.

  But doing so automatically is probably bad. For example,
if a user has a purchase order with one position of two
pieces and one position of four, it is very likely that when
a shipment of four pieces arrives, the latter position shall
be marked as delivered. So I would leave the decision to the
user.

 If four pieces arrived, the first position of 2 pieces should be
 marked as delivered.
 The second position of 4 pieces should be marked as partially
 delivered by setting the undelivered quantity
 of this row to 2

 How to use your suggestion for this ?

Que? You take the query above, join it in the UPDATE and
set the delivered quantity to the minimum of the ordered
quantity and taitmkogus - sumkogus.

 How to implement this is PostgreSql 8.1,8.2, 8.3 ?
 [...]

An example for calculating running sums without window
functions can be found at
URI:http://archives.postgresql.org/pgsql-sql/2001-07/msg00152.php.
I would rather use a PL/pgSQL function in this case, though.
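
On 8.4 or later, the window-function version Tim describes would look roughly
like this (order_rows is an illustrative table name; the column names follow
the thread's terms: toode = product, kogus = quantity, taitmkogus = delivered
total):

```sql
-- Running total of earlier rows for the same product; subtracting kogus
-- gives the quantity already consumed before the current row.
SELECT id, toode, kogus,
       SUM(kogus) OVER (PARTITION BY toode ORDER BY id) - kogus AS prior_qty
FROM order_rows;
-- Each row's delivered quantity, given taitmkogus pieces arrived, is then
-- LEAST(kogus, GREATEST(0, taitmkogus - prior_qty)).
```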

Tim




Re: [GENERAL] locating cities within a radius of another

2010-07-22 Thread Geoffrey

Oliver Kohll - Mailing Lists wrote:


On 22 Jul 2010, at 12:57, Geoffrey wrote:

For completeness, the earthdistance module also provides the distance 
between two lat/longs; the point <@> point syntax is simple to use:

http://www.postgresql.org/docs/8.3/static/earthdistance.html


Disregard my last post. Surely as soon as I hit send, the light went 
on...  I'm looking at deriving my points for point <@> point from 
ll_to_earth().


I constructed mine using point(longitude, latitude), where long and lat 
are double precision, which returns a datatype of type point. 
ll_to_earth() looks like it returns a datatype of type earth, so not 
sure if it will work. Maybe things have changed in a recent release, 
please let me know if so.


So an example would be
select point(-2.2171,56.8952) <@> point(-1.2833,51.6667) as miles; 
      miles       
------------------
 363.202864676916
(1 row)


Perfect, that appears to work for me as well, thanks.



Regards
Oliver Kohll

oli...@agilebase.co.uk mailto:oli...@agilebase.co.uk / +44(0)7814 
828608 / skype:okohll

www.agilebase.co.uk http://www.agilebase.co.uk - software






--
Until later, Geoffrey

I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them.
- Thomas Jefferson



Re: [GENERAL] Clarification of the simple dictionary

2010-07-22 Thread Oleg Bartunov

Andreas,

I'd create myself copy of dictionary to be independent on system changes.

Oleg
On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote:


On 07/22/2010 07:44 PM, Oleg Bartunov wrote:

Don't guess, but read docs
http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY 


12.6.2. Simple Dictionary

The simple dictionary template operates by converting the input token to 
lower case and checking it against a file of stop words. If it is found in 
the file then an empty array is returned, causing the token to be 
discarded. If not, the lower-cased form of the word is returned as the 
normalized lexeme. Alternatively, the dictionary can be configured to 
report non-stop-words as unrecognized, allowing them to be passed on to the 
next dictionary in the list.


d=# \dFd+ simple
                          List of text search dictionaries
   Schema   |  Name  |     Template      | Init options |                        Description
------------+--------+-------------------+--------------+------------------------------------------------------------
 pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword


By default it has no Init options, so it doesn't check for stopwords.


Guess what - I *have* read the docs, which say ...and checking it against a 
file of stop words. What was unclear to me was whether or not it is 
configured with a stopwords file by default, which I understand from your 
reply is not the case. Very good, fits my needs like a glove:-) It might 
be worth considering updating the docs to make this clearer?


So - can we rely on simple to remain this way forever (no Init options) or 
is it better to make a copy of it with the same properties as today?


It seems simple + the unaccent dict. available in 9.0 saves my day, thanks 
Mr. Bartunov.





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



[GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime

2010-07-22 Thread Piotr Gasidło
Hello,

I have a strange problem.

I test/optimize my queries with EXPLAIN ANALYZE. I get, for example:

Total runtime: 40.794 ms

But when I run the query without EXPLAIN ANALYZE, I get, for example:

Time: 539.252 ms

The query returns 33 rows. Why?

I do my checks with psql connected to the PostgreSQL server over a local socket. No SSL.

Using PostgreSQL 8.4.4.

-- 
Piotr Gasidło



Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime

2010-07-22 Thread Szymon Guz
On 22 July 2010 at 21:24, Piotr Gasidło
qua...@barbara.eu.org wrote:

 Hello,

 I have strange problem.

 I test/optimize my queries with EXPLAIN ANALYZE. I get for example:

 Total runtime: 40.794 ms

 But when I run query without EXPLAIN ANALYZE i get, for example:

 Time: 539.252 ms

 Query returns 33 rows. Why?

 I do checks with psql connected using socket to postgresql server. No SSL.

 Using PostgreSQL 8.4.4.


Hi,
maybe the query waits on a lock or maybe the returned rows are very big.

regards
Szymon Guz


Re: [GENERAL] Clarification of the simple dictionary

2010-07-22 Thread John Gage



By default it has no Init options, so it doesn't check for stopwords.


In the first place, this functionality is a rip-snorting home run on  
Postgres.  I congratulate Oleg who I believe is one of the authors.


In the second, I too had not read (carefully) the documentation and am  
very happy to find that I can eliminate stop words with 'simple'.   
That will be a tremendous convenience going forward.


It turns out that using 'english' and getting stemmed lexemes is  
extremely convenient too, but this functionality in 'simple' is  
excellent.


Thanks,

John





Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime

2010-07-22 Thread Piotr Gasidło
On 22 July 2010 at 21:34, Szymon Guz mabew...@gmail.com wrote:
 maybe the query waits on a lock or maybe the returned rows are very big.

So shouldn't EXPLAIN ANALYZE also be affected by waiting for a lock?

The row has width = 313, so it's not big.

I've reduced it to width = 12 (only one column, of type int4), and I get
for EXPLAIN ANALYZE:

Total runtime: 14.788 ms

And the plain SELECT returns:

Time: 456,528 ms

Or maybe Total runtime is not what I thought it is, and I should
look at the psql \timing result, which for EXPLAIN ANALYZE is nearly the
same as for SELECT:

Time: 402,675 ms

-- 
Piotr Gasidło



[GENERAL] Need some help on PG database transfer

2010-07-22 Thread Steeles
Hi all,

I am doing some tests on file-level copying for one of my databases on the
Windows platform. Here is what I want to achieve:

Copy the files within the PG tablespace folder where the PG database resides
to a target machine, and attach the copied data to the target PG database.


What I have done is:

1. Database: ABCD, OID: 12345, tablespace: e:\pg_data\; obviously,
e:\pg_data has a subfolder named 12345.
2. Create the database on the target, named , and it will create its own OID,
for example 16333; the folder 16333 will reside under the BASE folder.
3. Then stop the PG service and delete all files under ..\16333\*.*.
4. Copy all files from e:\pg_data\12345\*.* to the folder on the target server,
..\16333\.
5. Start the PG service on the target machine. It looks like it picks up most
of the tables from its source database, ABCD.

But it is missing some tables and functions compared to the source.

Is replicating data like that doable?

Does a PG tablespace contain all the data/tables in its tablespace
folder?

Please comment.

Thanks.


Re: [GENERAL] Need some help on PG database transfer

2010-07-22 Thread Scott Marlowe
On Thu, Jul 22, 2010 at 2:08 PM, Steeles stee...@gmail.com wrote:
 Hi all,

 I am doing some tests on the file level copying for one of database in
 windows platform. Here is what I want to achieve.

 copy files within PG tablespace folder where PG database resides to target
 machine. I want to attache the copied data to target PG database.

That won't work.  You have to pg_dump out the tables / tablespace and
then psql or pg_restore it to the other db.



Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Peter Hunsberger
On Thu, Jul 22, 2010 at 4:09 PM, Howard Rogers h...@diznix.com wrote:

 On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
 peter.hunsber...@gmail.com wrote:
  On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:
 
 there's a room-full of users who can look
 at code '4097' and know precisely what it means and would be mortified
 if I suddenly started displaying exactly the same meanings in what, to
 them, would look like utter gibberish.


In that case, you shouldn't be describing the column as some encoded
bit of magic here.  It clearly has some semantic meaning which
gives you a reason to want to keep it that way.  Though why your users
are dealing with the raw values as stored in the database may be
another issue to deal with: personally, I'd say store it in the way
that is easiest for your application logic to deal with, and display it
in the form that is easiest for your users to deal with.  They are often
two completely different things...
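
For what it's worth, PostgreSQL lets you move between the two representations
freely, so the stored form and the displayed form really can differ. A small
illustration using the thread's value (the 4096 mask is just an arbitrary
example flag):

```sql
-- 21205 and B'101001011010101' are the same fifteen flags; casts and
-- bitwise tests work on either representation.
SELECT 21205::bit(15)                                  AS as_bits,   -- 101001011010101
       (21205 & 4096) <> 0                             AS int_test,  -- flag 4096 is set
       (21205::bit(15) & 4096::bit(15)) <> 0::bit(15)  AS bit_test;  -- same test, bit form
```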

--
Peter Hunsberger



Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime

2010-07-22 Thread Piotr Gasidło
On 22 July 2010 at 21:46, Piotr Gasidło
qua...@barbara.eu.org wrote:
 On 22 July 2010 at 21:34, Szymon Guz mabew...@gmail.com wrote:
 (...)

Something new: the query is run against a table which has been
partitioned into a lot of small tables.
When I combine the data and put it into one table, the EXPLAIN ANALYZE
execution time matches the real SELECT timing.

On paritioned:

EXPLAIN ANALYZE SELECT ...
Total runtime: 14.790 ms
Time: 291,637 ms

On one table with data from all partitions and same indexes:

EXPLAIN ANALYZE SELECT ...

Total runtime: 16.418 ms
Time: 17,371 ms

Can someone give me a clue why EXPLAIN ANALYZE doesn't work correctly
with partitions?

-- 
Piotr Gasidło



[GENERAL] Question about SCO openserver and postgres...

2010-07-22 Thread Edmundo Robles L.
Hi!
  I have a problem with the max postgres connections on SCO 
Openserver 5.0.7, so my boss decided to buy SCO Openserver 6.0,
but this version comes in 2 editions: Starter and Enterprise.

If SCO 5.0.7 only allows 95 (-3 used by the superuser) connections to 
postgres...

Do you know how many connections to postgres I can have with 
OpenServer in the Starter or Enterprise edition?




[GENERAL] psql problem

2010-07-22 Thread Gary Fu

Hi,

System information:
- psql 8.4.4 on a client with CentOS 5.5 (64 bits)
- postgres 8.4.4 on the server with CentOS 5.5 (64 bits)
- the client is connected with vpn

I have a script to create a table, with some comments in front.  When I 
use the command 'psql -f script.sql' to load it, it hangs.  However, if 
I remove the comments, OR remove some of the columns from the table, it 
works okay.  It looks to me like psql hangs when the script file is 
large.  I tried 'psql < script.sql' and 'cat script.sql | psql' with 
the same result.


However, when I tried it on another client host (CentOS 5.5, 32 bits), I 
don't see this problem.


Any ideas or suggestions?

Thanks,
Gary



Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Howard Rogers
On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
peter.hunsber...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

 On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:


 
  Why on Earth would I want to store this sort of stuff in a bit string?!
 
  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.
 
  I don't know about you, but I find looking at 21205 a darned sight
  easier than staring blankly at 101001011010101!!

 <snip>lots of stuff</snip>

 
  Note you can cast integer to bitstring, but there may be some odd
  behaviour for sign bits and such.  Which is again why I'd use the
  right type for the job, bit string.  But it's your project.
 

 Quoting...

  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.

 Er, no. 21205 is not an integer. It's an encoded bit of magic.


 In that case your database design is fundamentally broken.  A database
 should have content fields that map to the needs of the application.
 As you describe your application requirements, that is a bit string
 and not an integer.  Use bit strings and your application logic is
 transparent, obvious and easy to maintain.  Use integers and you have
 to resort to magic.  As you say, it's your choice, but you came here
 looking for advice and the advice you were given is very good

 --
 Peter Hunsberger

Hi Peter:

It wasn't, as the original poster pointed out, 'advice' that was given
so much as personal preference. Had someone said, 'ah, but you see
storing your 15 meanings in decimal uses up 5 bytes, whereas a
bitstring only requires 15 bits, and over 10,000,000 records, the
saving of 3 bytes per record adds up...', then that would be technical
advice I could listen to, assess and make a call on.

But simply saying "your design is broken"... whoa! might well scare
the children, but doesn't really do anything for me, because I know
for a certainty that it's not broken at all.

It comes down to this: I can do Boyce-Codd normal form in my sleep
(...and falling asleep happens quite frequently when doing it, strangely
enough), and have been doing so since 1987. I'm certainly not perfect,
but I reckon I can tell from a mile away when one of my designs is
broken, as you put it -and this one isn't. I haven't even begun to
describe a scintilla of a percentage point of the design decisions
this thing has to deal with, nor the fact that it's been running quite
happily in this manner for a good couple of years... so you'll just
have to take it from me that there's a room-full of users who can look
at code '4097' and know precisely what it means and would be mortified
if I suddenly started displaying exactly the same meanings in what, to
them, would look like utter gibberish.

Unless you, or someone else, can come up with some hard, *technical*
facts as to why working with bitstring encodings of meaning is so much
better than working in decimal, we're sticking with the decimal
representation. I'll buy "you're forever doing implicit casts which
are poor performers" or "implicit casts might break in a future
release" or "it's costing you three bytes per record" ...or anything
else in that vein. But matters of transparency and ease of maintenance
are entirely subjective things (about which I sought no advice at all,
incidentally), and what works for you on those scores doesn't work for
me.
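For the record, the decimal code and the bit pattern quoted upthread name the same bits, and a flag test is the same bitwise AND under either display. A quick Python sketch (the flag position below is illustrative only, not one of the poster's actual meanings):

```python
code = 21205
# The decimal form and the binary form quoted in this thread are one value.
assert format(code, 'b') == '101001011010101'

# Testing a flag is a bitwise AND regardless of how the value is displayed.
FLAG = 1 << 4            # illustrative flag position, not an actual meaning
assert code & FLAG       # bit 4 is set in 21205
print(format(code, 'b'))   # -> 101001011010101
```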

Regards
HJR



Re: [GENERAL] optimizing daily data storage in Pg

2010-07-22 Thread Andy Colson

On 7/22/2010 9:41 AM, P Kishor wrote:

I have been struggling with this for a while now, have even gone down
a few paths but struck out, so I turn now to the community for ideas.
First, the problem: Store six daily variables for ~ 25 years for cells
in a grid.

  * Number of vars = 6
  * Number of cells ~ 13 million
  * Number of days ~ 9125 (25 * 365)

Optimize the store for two different kinds of queries:

Query one: Retrieve the value of a single var for all or a portion of
the cells for a single day. This is analogous to an image where every
pixel is the value of a single var.


  SELECT var FROM d WHERE yr = ? AND yday = ?;
  SELECT var FROM d WHERE yr = ? AND yday = ? AND cell_id IN (?,?,?...);





Query two: Retrieve values for all the days or a duration of days for
a single var for a single cell. This is like grabbing a column out of
a table in which each row holds all the vars for a single day.

  SELECT var FROM d WHERE cell_id = ?;
  SELECT var FROM d WHERE cell_id IN (?,?,?...);

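The scale described above multiplies out as follows (a back-of-envelope sketch only; leap days ignored, matching the 25 * 365 figure):

```python
cells = 13_000_000     # ~13 million grid cells
days = 25 * 365        # ~9125 days
n_vars = 6             # six daily variables

rows_wide = cells * days          # one row per cell per day, six vars per row
rows_narrow = rows_wide * n_vars  # one row per cell per day per var

print(rows_wide)    # -> 118625000000 (~119 billion rows)
print(rows_narrow)  # -> 711750000000
```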


First, I must admit to not reading your entire email.

Second, Query 1 should be fast, regardless of how you lay out the tables.

Third, Query 2 will return 13M rows?  I don't think it matters how you
lay out the tables; returning 13M rows is always going to be slow.



-Andy



Re: [GENERAL] psql problem

2010-07-22 Thread shakahsha...@gmail.com
On Jul 22, 5:03 pm, Gary Fu gary...@sigmaspace.com wrote:
 Hi,

 System information:
 - psql 8.4.4 on a client with CentOS 5.5 (64 bits)
 - postgres 8.4.4 on the server with CentOS 5.5 (64 bits)
 - the client is connected with vpn

 I have a script to create a table with some comments in front.  When I
 use the command 'psql -f script.sql' to load it, it hangs.  However, if
 I remove the comments, OR remove some of the columns from the table, it
 works okay.  It looks to me like psql will hang with a large script
 file.  I tried 'psql < script.sql' and 'cat script.sql |
 psql' with the same result.

 However, I tried it on another client host (CentOS 5.5 32 bits), and I
 don't see this problem.

 Any ideas or suggestions?

 Thanks,
 Gary

Are you sure it is hanging? Maybe piping into psql via pv
(http://www.ivarch.com/programs/pv.shtml) could give some insight as to
where it might be hanging, or if it is just going slowly, or whatever.

E.g., if your script is 240500 bytes long, the following will give a
simple progress meter, elapsed time, and ETA display:
  cat yourscript.file | pv -t -r -e -b -s 240500 | psql



Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Scott Marlowe
On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers h...@diznix.com wrote:
 On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
 peter.hunsber...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

 On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:


 
  Why on Earth would I want to store this sort of stuff in a bit string?!
 
  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.
 
  I don't know about you, but I find looking at 21205 a darned sight
  easier than staring blankly at 101001011010101!!

 <snip>lots of stuff</snip>

 
  Note you can cast integer to bitstring, but there may be some odd
  behaviour for sign bits and such.  Which is again why I'd use the
  right type for the job, bit string.  But it's your project.
 

 Quoting...

  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.

 Er, no. 21205 is not an integer. It's an encoded bit of magic.


 In that case your database design is fundamentally broken.  A database
 should have content fields that map to the needs of the application.
 As you describe your application requirements, that is a bit string
 and not an integer.  Use bit strings and your application logic is
 transparent, obvious and easy to maintain.  Use integers and you have
 to resort to magic.  As you say, it's your choice, but you came here
 looking for advice and the advice you were given is very good

 --
 Peter Hunsberger

 Hi Peter:

 It wasn't, as the original poster pointed out, 'advice' that was given
 so much as personal preference. Had someone said, 'ah, but you see
 storing your 15 meanings in decimal uses up 5 bytes, whereas a
 bitstring only requires 15 bits, and over 10,000,000 records, the
 saving of 3 bytes per record adds up...', then that would be technical
 advice I could listen to, assess and make a call on.

You do realize the first page I linked to told you that, right?  It's
not a particularly big page.  I had made the erroneous assumption
you'd read the link I posted.

 But simply saying "your design is broken"... whoa! might well scare
 the children, but doesn't really do anything for me, because I know
 for a certainty that it's not broken at all.

I asked if there was a reason you were avoiding bit strings.  Hardly a
your design is broken point.  You've now said why you are not using
the type that was designed to handle bit strings for bit strings.

I personally would store them as bit strings and change representation
for users.  There are some issues that come up if your bit strings are
long enough to get close to the last bit in an integer (also mentioned
on the links I posted that didn't get read).  But other than that it
should work fine.



Re: [GENERAL] Question about SCO openserver and postgres...

2010-07-22 Thread Scott Marlowe
On Thu, Jul 22, 2010 at 3:24 PM, Edmundo Robles L.
erob...@sensacd.com.mx wrote:
 Hi!
  I have a problem with the max postgres connections on SCO
 Openserver 5.0.7, so... my boss decided to buy the SCO Openserver 6.0,
 but this version comes in 2 editions: Starter and Enterprise.

 If SCO 5.0.7 only allows 95 (-3 used by the superuser) connections to
 postgres...

 Do you know how many connections to postgres I can have with
 OpenServer Starter Edition or Enterprise Edition?

Are you sure this isn't just a limit in max_connections in postgresql.conf?



[GENERAL] Changing pg_attribute.attislocal

2010-07-22 Thread Noah Misch
Over time, I mistakenly did something like this:

CREATE TABLE a ();
CREATE TABLE b () INHERITS(a);
ALTER TABLE b ADD col int;
ALTER TABLE a ADD col int;

where I should have left out the third statement.  Not a great loss, the only
consequence I've observed being pg_attribute.attislocal = true, so dropping
a.col will not drop b.col.  Is there a DDL way to change that, short of dropping
the column from both tables and re-adding it to the parent alone?  If not,
what's the danger of updating attislocal directly?  If it makes any difference,
I don't actually plan to drop the column anytime soon.

Thanks,
nm



Re: [GENERAL] psql problem

2010-07-22 Thread Andy Colson

On 07/22/2010 04:03 PM, Gary Fu wrote:

Hi,

System information:
- psql 8.4.4 on a client with CentOS 5.5 (64 bits)
- postgres 8.4.4 on the server with CentOS 5.5 (64 bits)
- the client is connected with vpn

I have a script to create a table with some comments in front. When I use the 
command 'psql -f script.sql' to load it, it hangs. However, if I remove the 
comments, OR remove some of the columns from the table, it works okay. It looks 
to me like psql will hang with a large script file. I tried 'psql 
< script.sql' and 'cat script.sql | psql' with the same result.

However, I tried it on another client host (CentOS 5.5 32 bits), and I don't 
see this problem.

Any ideas or suggestions?

Thanks,
Gary


Line endings?

How about a sample?

What comment style: -- /* (* # ; ' //


-Andy



Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Scott Marlowe
On Thu, Jul 22, 2010 at 10:27 PM, Howard Rogers h...@diznix.com wrote:
 On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe scott.marl...@gmail.com 
 wrote:
 On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers h...@diznix.com wrote:
 On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
 peter.hunsber...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

 On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:


 
  Why on Earth would I want to store this sort of stuff in a bit string?!
 
  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.
 
  I don't know about you, but I find looking at 21205 a darned sight
  easier than staring blankly at 101001011010101!!

 <snip>lots of stuff</snip>

 
  Note you can cast integer to bitstring, but there may be some odd
  behaviour for sign bits and such.  Which is again why I'd use the
  right type for the job, bit string.  But it's your project.
 

 Quoting...

  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.

 Er, no. 21205 is not an integer. It's an encoded bit of magic.


 In that case your database design is fundamentally broken.  A database
 should have content fields that map to the needs of the application.
 As you describe your application requirements, that is a bit string
 and not an integer.  Use bit strings and your application logic is
 transparent, obvious and easy to maintain.  Use integers and you have
 to resort to magic.  As you say, it's your choice, but you came here
 looking for advice and the advice you were given is very good

 --
 Peter Hunsberger

 Hi Peter:

 It wasn't, as the original poster pointed out, 'advice' that was given
 so much as personal preference. Had someone said, 'ah, but you see
 storing your 15 meanings in decimal uses up 5 bytes, whereas a
 bitstring only requires 15 bits, and over 10,000,000 records, the
 saving of 3 bytes per record adds up...', then that would be technical
 advice I could listen to, assess and make a call on.

 You do realize the first page I linked to told you that, right?  It's
 not a particularly big page.  I had made the erroneous assumption
 you'd read the link I posted.

 If you mean, did I read the bit in the doco where it said nothing at
 all in the 'these are great advantages' style I've just described, but
 instead makes the fairly obvious point that a bit string takes 8 bits
 to store a group of 8 bits (well, stone me!!)

Wow, I'm surprised you get any help with your attitude.  I posted a
link and asked a question and right up front got my head handed to me.

To quote:  Why on Earth would I want to store this sort of stuff in a
bit string?!

I don't know about you, but I find looking at 21205 a darned sight
easier than staring blankly at 101001011010101!!

Like I'd somehow bitten your hand when I asked my question.

 PLUS has extra overhead,
 then yes, I did read that part of your first link... and nevertheless
 concluded that, overall, there is... er, some extra overhead in
 storing bitstrings.

Well, your initial answer certainly didn't give ANY idea that you'd
read that page.

 So what precisely about that first article, which I did indeed read,
 would you have expected to lead me to the conclusion that I'd SAVE
 significant amounts of space or find some other technically-compelling
 reason for switching?

I didn't expect such.  I asked why you weren't using them, and gave
you some links to read on it.  It clearly states that bit strings use
a bit per bit, plus some overhead.  Now, I had no idea if you were
dealing with bigints and 60 bit strings or 5 bit strings.  In fact,
you did little to really describe your project and preferences in your
post.  Which is why my response was short and concise, I had little to
go on.

 My point is that there's nothing much in it, storage-wise, either way.

Well, there is the fact that bit strings can restrict the size of the
entry so you don't accidentally get an int stored that's got more bits
than your model can handle.  There's also the issue that if/when you
ever get close to the last bit in an int, the bit string may behave
oddly because of sign issues.
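The sign-bit caveat is easy to demonstrate outside the database: a signed 32-bit integer (the width of a Postgres int4) cannot hold a positive value with its top bit set. A Python sketch using struct to mimic the 4-byte signed layout:

```python
import struct

# 2**31 - 1 is the largest positive value a signed 32-bit integer can hold.
struct.pack('<i', 2**31 - 1)     # packs fine

# Setting the "last bit" (bit 31) pushes the value out of the signed range.
try:
    struct.pack('<i', 2**31)
    fits = True
except struct.error:
    fits = False
print(fits)   # -> False
```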

 So there's no compelling technical reason to switch.

I never said there was.  I simply asked a question, and got my hand bitten.

 And without a
 technically-compelling reason, the rest of the post I was referring to
 simply boiled down, as far as I could tell, to a matter of personal
 preference. No less valid for that, of course. But ultimately, not
 something that would hold much sway with me.

Sure, fine, whatever you want.  I wasn't trying to convince you either
way.  I do think using the right type for the job makes more sense,
but again, it's personal preference.

 But simply saying "your design is broken"... whoa! might well scare
 the children, but 

Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime

2010-07-22 Thread A. Kretschmer
In response to Piotr Gasidło:
 Hello,
 
 I have strange problem.
 
 I test/optimize my queries with EXPLAIN ANALYZE. I get for example:
 
 Total runtime: 40.794 ms
 
 But when I run the query without EXPLAIN ANALYZE I get, for example:
 
 Time: 539.252 ms
 
 Query returns 33 rows. Why?

Maybe cheap or virtual hardware? There are some issues with functions
like gettimeofday(); see here:

http://archives.postgresql.org/pgsql-general/2007-01/msg01653.php
(and the whole thread)

Regards, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: - Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99



Re: [GENERAL] Bitmask trickiness

2010-07-22 Thread Howard Rogers
On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers h...@diznix.com wrote:
 On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger
 peter.hunsber...@gmail.com wrote:
 On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

 On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com 
 wrote:


 
  Why on Earth would I want to store this sort of stuff in a bit string?!
 
  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.
 
  I don't know about you, but I find looking at 21205 a darned sight
  easier than staring blankly at 101001011010101!!

 <snip>lots of stuff</snip>

 
  Note you can cast integer to bitstring, but there may be some odd
  behaviour for sign bits and such.  Which is again why I'd use the
  right type for the job, bit string.  But it's your project.
 

 Quoting...

  Because you are manipulating bits and not integers?  I guess there are
  10 kinds of people, those who like to think in binary and those who
  don't.

 Er, no. 21205 is not an integer. It's an encoded bit of magic.


 In that case your database design is fundamentally broken.  A database
 should have content fields that map to the needs of the application.
 As you describe your application requirements, that is a bit string
 and not an integer.  Use bit strings and your application logic is
 transparent, obvious and easy to maintain.  Use integers and you have
 to resort to magic.  As you say, it's your choice, but you came here
 looking for advice and the advice you were given is very good

 --
 Peter Hunsberger

 Hi Peter:

 It wasn't, as the original poster pointed out, 'advice' that was given
 so much as personal preference. Had someone said, 'ah, but you see
 storing your 15 meanings in decimal uses up 5 bytes, whereas a
 bitstring only requires 15 bits, and over 10,000,000 records, the
 saving of 3 bytes per record adds up...', then that would be technical
 advice I could listen to, assess and make a call on.

 You do realize the first page I linked to told you that, right?  It's
 not a particularly big page.  I had made the erroneous assumption
 you'd read the link I posted.

If you mean, did I read the bit in the doco where it said nothing at
all in the 'these are great advantages' style I've just described, but
instead makes the fairly obvious point that a bit string takes 8 bits
to store a group of 8 bits (well, stone me!!) PLUS has extra overhead,
then yes, I did read that part of your first link... and nevertheless
concluded that, overall, there is... er, some extra overhead in
storing bitstrings.

So what precisely about that first article, which I did indeed read,
would you have expected to lead me to the conclusion that I'd SAVE
significant amounts of space or find some other technically-compelling
reason for switching?

My point is that there's nothing much in it, storage-wise, either way.
So there's no compelling technical reason to switch. And without a
technically-compelling reason, the rest of the post I was referring to
simply boiled down, as far as I could tell, to a matter of personal
preference. No less valid for that, of course. But ultimately, not
something that would hold much sway with me.

 But simply saying "your design is broken"... whoa! might well scare
 the children, but doesn't really do anything for me, because I know
 for a certainty that it's not broken at all.

 I asked if there was a reason you were avoiding bit strings.  Hardly a
 your design is broken point.

I'm getting a bit fed up of this thread now. It wasn't YOU that ever
said 'the design is broken', and I never suggested it was. That was
Peter Hunsberger, about three posts up in the thread, who wrote In
that case your database design is fundamentally broken.

If you're going to take umbrage at something, please take umbrage at
things that were actually directed at you in the first place!

 You've now said why you are not using
 the type that was designed to handle bit strings for bit strings.

 I personally would store them as bit strings and change representation
 for users.

I'm a user, too. I get to see this stuff every time I do a select
statement. At the command line. Which I use a lot.

 There are some issues that come up if your bit strings are
 long enough to get close to the last bit in an integer (also mentioned
 on the links I posted that didn't get read).

Don't make false assumptions about other people, please. You don't
know what I read or didn't read. Just because you didn't make a
compelling technical argument in favour of bitstrings doesn't mean I
didn't read the article you linked to ...that also didn't make a
compelling technical argument in favour of bitstrings.

 But other than that it
 should work fine.

Yes, I know. I've only been using this technique for five years on
Oracle! I would be very surprised indeed if it 

Re: [GENERAL] optimizing daily data storage in Pg

2010-07-22 Thread P Kishor
On Thu, Jul 22, 2010 at 4:56 PM, Andy Colson a...@squeakycode.net wrote:
 On 7/22/2010 9:41 AM, P Kishor wrote:

 I have been struggling with this for a while now, have even gone down
 a few paths but struck out, so I turn now to the community for ideas.
 First, the problem: Store six daily variables for ~ 25 years for cells
 in a grid.

  * Number of vars = 6
  * Number of cells ~ 13 million
  * Number of days ~ 9125 (25 * 365)

 Optimize the store for two different kinds of queries:

 Query one: Retrieve the value of a single var for all or a portion of
 the cells for a single day. This is analogous to an image where every
 pixel is the value of a single var.

      SELECT var FROM d WHERE yr = ? AND yday = ?;
      SELECT var FROM d WHERE yr = ? AND yday = ? AND cell_id IN (?,?,?...);



 Query two: Retrieve values for all the days or a duration of days for
 a single var for a single cell. This is like grabbing a column out of
 a table in which each row holds all the vars for a single day.

      SELECT var FROM d WHERE cell_id = ?;
      SELECT var FROM d WHERE cell_id IN (?,?,?...);



 First, I must admit to not reading your entire email.

I am not sure how to respond to your feedback given that you haven't
read the entire email. Nevertheless, thanks for writing...


 Second, Query 1 should be fast, regardless of how you lay out the tables.

It is not fast. Right now I have data for about 250,000 cells loaded.
That comes to circa 92 million rows per year. Performance is pretty
sucky.
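The per-year figure quoted above is consistent with the cell count (trivial arithmetic, one row per cell per day):

```python
cells_loaded = 250_000
rows_per_year = cells_loaded * 365
print(rows_per_year)   # -> 91250000, i.e. "circa 92 million rows per year"
```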



 Third, Query 2 will return 13M rows?  I don't think it matters how you lay out
 the tables; returning 13M rows is always going to be slow.


Yes, I understand that. In reality I will never get 13 M rows. For
display purposes, I will probably get around 10,000 rows to 50,000
rows. When more rows are needed, it will be to feed a model, so that
can be offline (without an impatient human being waiting on the other
end).

Right now, my main problem is that I have either too many rows (~4 B
rows) in a manageable number of tables (25 tables) or a manageable
number of rows (~13 M rows) in too many tables (~9000 tables).



 -Andy




-- 
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
---
Assertions are politics; backing up assertions with evidence is science
===
