Re: [GENERAL] integer only posix/regex

2009-06-16 Thread Steve Atkins


On Jun 15, 2009, at 8:21 PM, rodeored wrote:


I'm trying to search phone numbers for digits. Unfortunately, the
validation has been sloppy and the numbers are not just digits; they
also have dashes and probably other stuff.
I would like the search to ignore anything but the digits.

WHERE (a.phone1 ~* '.*626.*' OR a.phone2 ~* '.*626.*' OR a.phone2 ~*
'.*626.*')


A hideous way would be a.phone1 ~ '6[^0-9]*2[^0-9]*6'

A possibly less hideous way would be regexp_replace(a.phone1,  
'[^0-9]+', '') ~ '626'.
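
For example, the full query could then look something like this (just a
sketch: the table alias matches the original query, the 'g' flag is added so
that all non-digit characters get stripped, and I'm assuming the third
column was meant to be phone3):

WHERE regexp_replace(a.phone1, '[^0-9]+', '', 'g') ~ '626'
   OR regexp_replace(a.phone2, '[^0-9]+', '', 'g') ~ '626'
   OR regexp_replace(a.phone3, '[^0-9]+', '', 'g') ~ '626'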


Cheers,
  Steve




Re: [GENERAL] Postgres 8.2 Memory Tuning

2009-06-16 Thread CM J
Hi,

I need to measure the memory used by postgres under various loads in
my application. How do I calculate the exact memory used by postgres on
Windows as well as on Linux?

Thanks,
CM J

On Mon, Jun 15, 2009 at 5:37 PM, Massa, Harald Armin c...@ghum.de wrote:

 CMJ,

 same system. Even under moderate load in my application, each of the
 postgres processes occupies some 25 MB of memory, leading to a total postgres memory
 usage of 500+ MB. Now, how do I limit the memory used by each


 How did you measure those 25 MB and that 500+ MB of memory usage?

 I guess you are running the Windows Task Manager and looking at the default
 memory column. That is address space used, rather than memory used. So
 Postgres is using 500 MB of address space; the shared memory of those processes is
 counted against every process.

 Look at the additional columns of Task Manager to find the one giving the
 private memory of the process.

 Especially on Windows you should update to the latest 8.3.x PostgreSQL.

 Harald





 --
 GHUM Harald Massa
 persuadere et programmare
 Harald Armin Massa
 Spielberger Straße 49
 70435 Stuttgart
 0173/9409607
 no fx, no carrier pigeon
 -
 LASIK good, steroids bad?



Re: [GENERAL] Postgres 8.2 Memory Tuning

2009-06-16 Thread Eduardo Morras

At 08:56 16/06/2009, you wrote:

Hi,

I need to measure the memory used by postgres under various 
loads in my application.How to calculate the exact memory used by 
postgres for OS:Windows as well as linux ?


Download Process Explorer from Microsoft (it used to be from Sysinternals).
There you will see some cool graphs with private memory and other
useful info (page faults, I/O, etc.).



Thanks,
CM J





[GENERAL] Dynamic table

2009-06-16 Thread A B
Hi.
I have a little problem (and a suggestion for a solution) that I
wondered if anyone would care to comment on.

I have a standard table filled with customers (with a unique customer
id, names etc.) and for each customer I need to store some integer
values. The problem is that the number of integer values that should
be stored WILL change over time (it can both increase and decrease).
It will most likely grow from zero to 10-18 and then perhaps add or
remove one value per year, but basically stay at about 10-18 or so.

I must be able to add a new integer value to all customers, and remove
an integer value from all customers. Altering the table by adding and
deleting columns would theoretically solve it, but since columns are
not really dropped, just hidden away, and there is a 1600 column
limit on tables as I understand it, this would crash the application
at some point in the future, or at least waste a lot of
disk space.

Other things I must be able to do are to alter values for a specific
integer value for one customer or for all customers simultaneously. This can
be like changing all values 4 to 8 for the third integer value.
And I must be able to quickly select all the integers for a specific
customer for further processing. There will also be access to single
integer values for a specific customer. It's hard to say how much,
compared to getting all integer values for a specific customer. All
customers will be accessed equally.

As I see it I have these options.

Method A) an add-only dynamic table that reuses columns that are no
longer used.

First create a metatable
CREATE TABLE Columnspec (id integer, columnname varchar(10), name
varchar(30), defaultvalue integer, active boolean default true );
That will keep track of the dynamically added columns.

and a table for all the integer values
CREATE TABLE customerValues (customer_id integer references customer);
Initially it will be empty

and adding a new integer value called shoe size, with a default value
of 4, would mean running these two queries:
insert into Columnspec (id,columnname,name,defaultvalue) values (1,
'col1', 'shoe size', 4);
alter table customerValues add column col1 integer default 4;

removing a column would mean an operation like
update Columnspec set active=false where id=5;

then adding a new column would mean reusing an inactive column, with
operations like the following:
update Columnspec set active=true, name='hair length', defaultvalue=10
where id=5;
update customerValues set col5=10; -- to give all customers the default value.

One could perhaps alter the default value of the column called 'col5'
in the example above, and also alter the names of the columns, but that
is just cosmetic.
The only drawback I can see with this right now is that if the number
of columns decreases, it will waste space.

Method B)
The EAV way.
With a meta table
CREATE TABLE metadata (id integer primary key, name varchar(30),
defaultvalue integer);

and then the values
CREATE TABLE eav (customer_id references customer, key integer
references metadata on delete cascade , value integer,
unique(customer_id, key)  );

With c customers and k integer fields, method A requires at least c*(k+1)
integers of storage, while B requires c*k*3. A factor of about 3 doesn't sound
like that much wasted space, but of course it can make a difference.
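
For what it's worth, a sketch of the queries method B implies, using the
tables as defined above (the LEFT JOIN plus COALESCE fills in the default
value for customers that have no explicit row for a key):

-- all integer values for customer 42, one row per defined key
SELECT m.id, m.name, COALESCE(e.value, m.defaultvalue) AS value
FROM metadata m
LEFT JOIN eav e ON e.key = m.id AND e.customer_id = 42
ORDER BY m.id;

-- change all values 4 to 8 for the integer value with key 3
UPDATE eav SET value = 8 WHERE key = 3 AND value = 4;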

Method C)
Have a metadata table as usual, and then store the values in an array.
I must admit that I have not looked enough at arrays yet.
The drawback I can see right now will be the access to a specific
value in the array, but I guess PostgreSQL is fast...

Method D)
As method B), but add a view that aggregates the values into a complete
set for each customer, if that would speed anything up.

Any comments on the pros/cons with these strategies?



Re: [GENERAL] How can I interpolate psql variables in function bodies?

2009-06-16 Thread Albe Laurenz
J. Greg Davidson wrote:
 Hi dear colleagues,
 
 I'm trying to pull some platform-specific constants out of my
 code by using psql variables, e.g.:
 
 $ psql -v TRefTypeLength=4
 
 # CREATE TYPE tref (
   INTERNALLENGTH = :TRefTypeLength,
   INPUT = tref_in,
   OUTPUT = tref_out,
   PASSEDBYVALUE
 );
 
 which works fine, but when I need such a constant in a function
 it is not substituted.  A simplified example:
 
 $ psql -v foo=10
 
 # select :foo;
  ?column? 
 --
10
 (1 row)
 
 # create function foo() returns integer as 'select '(:foo) language sql;
 ERROR:  syntax error at or near "(" at character 51
 
 I'm sure that I could do something horrible by using EXECUTE inside of
 a plpgsql function, and I'm hoping that someone will have a simpler
 alternative.  For example, is there some kind of quoting mechanism I can
 use which will not impede psql from doing substitutions?

I don't think that there is a convenient way, as variables are not
substituted inside string constants.

I came up with the following:

psql -v foo=10
test=> \set funcbody `echo 'SELECT\ `:foo`echo '`
test=> create function foo() returns integer as :funcbody language sql;

But that is ugly as hell and depends on the underlying operating
system to have a Bourne-like shell.
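
Another way to push the substitution outside of psql entirely is a shell
here-document (again just a sketch, and again it assumes a Unix-like shell):

FOO=10
psql <<EOF
create function foo() returns integer as 'select $FOO' language sql;
EOF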

Yours,
Laurenz Albe



[GENERAL] ResultSet is FORWARD_ONLY.

2009-06-16 Thread S Arvind
Hi everyone,
Recently we upgraded to the JDBC4 postgres driver for 8.3. After that,
while executing rst.first(), it raises the exception below.
The statement was created correctly as
dbconnection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);

org.postgresql.util.PSQLException: Operation requires a scrollable
ResultSet, but this ResultSet is FORWARD_ONLY.
at
org.postgresql.jdbc2.AbstractJdbc2ResultSet.checkScrollable(AbstractJdbc2ResultSet.java:207)
at
org.postgresql.jdbc2.AbstractJdbc2ResultSet.first(AbstractJdbc2ResultSet.java:292)
at
org.apache.tomcat.dbcp.dbcp.DelegatingResultSet.first(DelegatingResultSet.java:326)


This statement worked with the postgres 8.1 JDBC 3 driver, but now it does not.
Please help in solving this...

-Arvind S


Re: [GENERAL] 10 TB database

2009-06-16 Thread Michelle Konzack
Hi Artur,

I am the owner of a database about war, war crimes and terrorism with more than
1.6 TByte, and I am already fscked...

On 2009-06-15 14:00:05, Artur wrote:
 Hi!

 We are thinking to create some stocks related search engine.
 It is experimental project just for fun.

 The problem is that we expect to have more than 250 GB of data every month.

I have only 500 MByte per month...

 This data would be in two tables. About 50.000.000 new rows every month.

around 123,000 new rows per month

 We want to have access to all the date mostly for generating user  
 requesting reports (aggregating).
 We would have about 10TB of data in three years.

 Do you think is it possible to build this with postgresql and have any  
 idea how to start? :)

You have to use a physical cluster like me. Searches in a database of
more than 1 TByte, even using tablespaces and table partitioning,
run into performance issues...

I have now split my database into chunks of 250 GByte using a cluster of
1U servers from Sun Microsystems.  Currently I run 8 servers with one
proxy.  Each server cost me 2,300 Euro.

Note:   On Friday I have a meeting with a Sun partner in
Germany about a bigger project... where I have
to increase the performance of my database servers.
I have to plan for 150,000 customers.

Thanks, Greetings and nice Day/Evening
Michelle Konzack
Systemadministrator
Tamay Dogan Network
Debian GNU/Linux Consultant

-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/
# Debian GNU/Linux Consultant #
http://www.tamay-dogan.net/ Michelle Konzack
http://www.can4linux.org/   c/o Vertriebsp. KabelBW
http://www.flexray4linux.org/   Blumenstrasse 2
Jabber linux4miche...@jabber.ccc.de   77694 Kehl/Germany
IRC #Debian (irc.icq.com) Tel. DE: +49 177 9351947
ICQ #328449886Tel. FR: +33  6  61925193




Re: [GENERAL] Rounding incompatibility

2009-06-16 Thread Havasvölgyi Ottó
Yes, they are both your packages from your official site. So this means that
in 8.2 and in earlier versions the rounding is not the regular one.

Best regards,
Otto

2009/6/15 Dave Page dp...@pgadmin.org

 On Mon, Jun 15, 2009 at 3:33 PM, Tom Lanet...@sss.pgh.pa.us wrote:
  =?ISO-8859-1?Q?Havasv=F6lgyi_Ott=F3?= havasvolgyi.o...@gmail.com
 writes:
  I have found the following strangeness on Windows versions:
 
  Were your two versions built different ways (perhaps with different
  compilers)?  This comes down to what the system-supplied rint() function
  does.

 If they are our packages, then 8.2 was built with mingw/msys, and 8.3 with
 VC++.


 --
 Dave Page
 EnterpriseDB UK:   http://www.enterprisedb.com



Re: [GENERAL] 10 TB database

2009-06-16 Thread Pedro Doria Meunier

Hello Arthur,

We have a database with a table growing by ~1.5M rows each month.
The overall growth for the db is ~1GB/month.
PostgreSQL 8.2.9 on x86_64 - a very modest Dell R200 with 4GB of RAM.

Although poor planning was done in the beginning (i.e. no clustering,
no partitioning...) - we weren't expecting the boom :] - that
particular server runs like clockwork with hundreds of queries per
minute, and is still doing so without any noticeable speed loss.

We're, of course, planning for load balancing at the beginning of next
year ... :)

BR,

Pedro Doria Meunier
GSM: +351 96 17 20 188
Skype: pdoriam
 



Michelle Konzack wrote:
 Hi Artur,

 I am owner of a database about War, Worcrime and Terroism with more
 then 1,6 TByte and I am already fscked...

 Am 2009-06-15 14:00:05, schrieb Artur:
 Hi!

 We are thinking to create some stocks related search engine. It
 is experimental project just for fun.

 The problem is that we expect to have more than 250 GB of data
 every month.

 I have only 500 MByte per month...

 This data would be in two tables. About 50.000.000 new rows every
 month.

 arround 123.000 new rows per month

 We want to have access to all the date mostly for generating user
  requesting reports (aggregating). We would have about 10TB of
 data in three years.

 Do you think is it possible to build this with postgresql and
 have any idea how to start? :)

 You have to use a physical cluster like me. Searches in  a
 Database  of more then 1 TByte even under using tablespace and
 tablepartitioning let you run into performance issues...

 I have now splited my Database in chunks of 250 GByte using a
 Cluster of 1U Servers from Sun Microsystems.  Currently I run  8
 servers  with  one proxy.  Each server cost me 2.300 Euro.

 Note:   On Friday I have a meeting with a  Sun  Partner  in Germany
 because a bigger project...  where  I  have to increase the
 performance of my database servers. I have to calculate with
 150.000 customers.

 Thanks, Greetings and nice Day/Evening Michelle Konzack
 Systemadministrator Tamay Dogan Network Debian GNU/Linux Consultant






Re: [GENERAL] Dynamic table

2009-06-16 Thread Sam Mason
On Tue, Jun 16, 2009 at 09:11:20AM +0200, A B wrote:
 I have a standard table filled with customers (with a unique customer
 id, names etc.) and for each customer I need to store some integer
 values. The problem is that the number of integer values that should
 be stored WILL change over time (it can both increase and decrease).
 It will most likely grow from zero to 10-18 and then perhaps add 1 or
 remove one value per year but basically be about 10-18 or so.
 
 I must be able to add a new integer value to all customers, and remove
 an integer value from all customers  Altering the table by adding and
 deleting columns would theoretically solve it, but since columns are
 not  really dropped, just hidden away, and there is a 1600 column
 limit on tables as I understand it, this would crash the application
 at some time in the future, or at least it will waste a lot of
 discspace.

 Method B)
 The EAV way.
 With a meta table
 CREATE TABLE metadata (id integer primary key, name varchar(30),
 defaultvalue integer);
 
 and then the values
 CREATE TABLE eav (customer_id references customer, key integer
 references metadata on delete cascade , value integer,
 unique(customer_id, key)  );

The way you described the problem, the EAV solution sounds like the best
match--not sure if I'd use your synthetic keys though; they will save a
bit of space on disk but queries will be much more complicated to write.
EAV style solutions are rarely good/easy to maintain when the problem
changes, so maybe you can take a step back from the problem and solve it
some other way.

The examples you gave (i.e. shoe size, hair length) would fit normal
table columns much better.  Maybe just use a set of tables for each set
of related attributes (i.e. physical attributes, personal preferences,
not sure what would fit here).  You'd also be able to use appropriate
data types/constraints, for example shoe size could have two columns one
for the measurement system (i.e. European, US, UK...) and the other for
the actual measurement.

Just had a quick flick through your previous posts; and I'd probably
stick with the multiple tables approach.  It's the most natural fit to
relational databases and until you know more about the problem (i.e.
you've experienced the data you're going to be getting and the ways it's
going to change) you can't do much better.

-- 
  Sam  http://samason.me.uk/



[GENERAL] Playing with set returning functions in SELECT list - behaviour intended?

2009-06-16 Thread Albe Laurenz
While playing around with the new features in 8.4rc1, I observe the following:

I create a simple set returning function:

CREATE OR REPLACE FUNCTION n_plicate(x anyelement, i integer) RETURNS SETOF 
anyelement LANGUAGE plpgsql AS
$$BEGIN
   FOR n IN 1..i LOOP
  RETURN NEXT x;
   END LOOP;
   RETURN;
END;$$;

test=> select n_plicate(42, 3), 41;
 n_plicate | ?column? 
---+--
42 |   41
42 |   41
42 |   41
(3 rows)

test=> select n_plicate(42, 4), n_plicate(41, 6);
 n_plicate | n_plicate 
---+---
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
42 |41
(12 rows)

So it looks like the number of result rows is the least common multiple
of the cardinalities of all columns in the select list.

Is this an artifact or is this intentional?
Should it be documented?

Then I try this:

test=> WITH dummy(a, b) AS
test-> (VALUES(42, 12), (11, 41))
test-> SELECT n_plicate(a, 2), n_plicate(b, 2)
test-> FROM dummy;
 n_plicate | n_plicate 
---+---
42 |12
42 |12
11 |41
11 |41
(4 rows)

Looks reasonable.

But this doesn't:

test=> WITH dummy(a, b) AS
test-> (VALUES(42, 12), (11, 41))
test-> SELECT n_plicate(max(a), 2), n_plicate(max(b), 2)
test-> FROM dummy;
 n_plicate | n_plicate 
---+---
42 |41
(1 row)

I had expected two result rows.
I guess it is the implicit grouping kicking in, but in an unintuitive way.


Should it be that way?

Yours,
Laurenz Albe



Re: [GENERAL] Trigger Function and backup

2009-06-16 Thread Nishkarsh

Hi Merlin, thanks for the detailed input.

As per your suggestion I will try to implement Slony-I.

I think I will need some help to do it.

I am using Postgres 8.3.7 on Windows.

I was following the Slony-I example in the pgAdmin III help. I am able
to perform steps 1-7. At step 8 (create Slony-I cluster) I am getting
a message in the interface:

Slony-I creation script not available; only join possible

On doing some research I found that some scripts need to be copied (I was not
able to find very clear instructions) or that the Slony-I path must be set. I
tried all that but was not able to move ahead.

Can you please guide me through. -(%-|

Regards
Nishkarsh

Merlin Moncure-2 wrote:
 
 On Mon, Jun 15, 2009 at 4:29 AM, Nishkarshnishkars...@rediffmail.com
 wrote:
 Hello every one,

 I am new to databases. I am using Postgres 8.2 (Migrating to 8.3.7 in few
 days) on windows platform.

 I had tried using Slony-I for replication and was not able to create a
 cluster.

 After struggling for some time i decide to implement a way around to take
 differential backup. As the tables getting changed were very less.

 Here is what i intend to do:

 - Write a trigger for each of the tables in concern
 - Some how write a function which can copy / execute the same query in
 another temp Db on the same physical system (I have no idea how to do
 that)
 - Take a backup of temp DB which will be the differential backup of DB
 (We
 need to clear temp db after backup)

 Am i going in the right direction?
 Is there any way i can implement it.
 Any help will be really of great help
 
 Generating a full trigger based replication system on your own is
 IMNSHO crazy.  Slony is the best solution to this problem (trigger
 replication with postgres) that I know of, and is probably better than
 any one person to come up with in a reasonable amount of time.
 Probably, your best course of action if you need to get things running
 right now is to give slony another go (why did you not succeed?).
 
 Hand written trigger replication is ok if you need to copy, say, a
 couple of tables or you have some other very specific requirement.  In
 particular, copying an insert to a mirror database with trigger
 function wrapping dblink is a snap (updates are more problematic, but
 doable).  Of course, you need to figure out how to deal with schema
 updates and other issues that plague replication systems such as
 volatile data in cascading triggers (just to name one).  General
 purpose trigger replication is a huge project...
 
 It sounds to me that what you really want is the 'hot standby' feature
 that unfortunately missed the cut for 8.4.  Hot standby is probably
 the easiest way to mirror a database for purposes of read only
 querying.  There are no triggers to worry about, just a few .conf
 settings and some other setup to get going (more or less, it isn't
 finalized yet).  So maybe, waiting for hot standby (or even, digging
 up a hot standby patch and trying to apply it vs. 8.4 if your
 adventurous) is the answer.
 
 Another possibility is to look at statement level replication, like
 pgpool.
 
 merlin
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general
 
 





[GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Whit Armstrong
Does anyone know why I get an unknown relation error when I query for
relation size?

kls=# select tablename, pg_size_pretty(pg_relation_size(tablename))
from pg_tables where schemaname = 'econ' order by tablename;
ERROR:  relation "series_info" does not exist
kls=#

Is there a better way to do this query to avoid this error?

Thanks,
Whit



Re: [GENERAL] Dynamic table

2009-06-16 Thread A B
 The way you described the problem the EAV solution sounds like the best
 match--not sure if I'd use your synthetic keys though, they will save a
 bit of space on disk but queries will be much more complicated to write.
I guess I'll have to build procedures for all the complicated queries
whenever I add or remove an integer value.


 EAV style solutions are rarely good/easy to maintain when the problem
 changes so maybe you can take a step back from the problem and solve it
 some other way.
That's what I keep reading about EAV :-(


 The examples you gave (i.e. shoe size, hair length) would fit normal
 table columns much better.
Sorry, shoe size was not a good example; think of it as a random
string instead of shoe size. The data/name is nothing you can relate
to in any way, or build special columns for, or treat in other ways.


 Just had a quick flick through your previous posts; and I'd probably
 stick with the multiple tables approach.  It's the most natural fit to
 relational databases and until you know more about the problem (i.e.
 you've experienced the data your going to be getting and the ways it's
 going to change) you can't do much better.

One table per integer is one way that I have not thought about. Thanks!



Re: [GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Frank Heikens
The search_path isn't correct. You're looking for the size of tables
in schema 'econ' but you don't mention this schema inside the function
pg_relation_size(). Try to use the schema names as well; it saves you a
lot of problems with assumptions.


This one should work:

SELECT
tablename,
pg_size_pretty(pg_relation_size(schemaname || '.' || tablename))
FROM
pg_tables
WHERE
schemaname = 'econ';


Kind regards,
Frank


On 16 Jun 2009, at 13:17, Whit Armstrong wrote:


Does anyone know why I get an unknown relation error when I query for
relation size?

kls=# select tablename, pg_size_pretty(pg_relation_size(tablename))
from pg_tables where schemaname = 'econ' order by tablename;
ERROR:  relation series_info does not exist
kls=#

Is there a better way to do this query to avoid this error?

Thanks,
Whit






Re: [GENERAL] Custom Fields Database Architecture

2009-06-16 Thread Greg Stark
On Mon, Jun 15, 2009 at 2:04 PM, Gnanamgna...@zoniac.com wrote:

 I also read some article which talks about the type of patterns:
 1. Meta-database
 2. Mutating
 3. Fixed
 4. LOB

 My question here is, what is the best approach to define the architecture
 for custom fields. Performance should not be compromised.

The reason there are multiple patterns is that the best approach
depends very much on the specifics of your needs.

For all David's dogma, there are use cases where EAV is the best fit.
But there are downsides, and if those downsides are a problem then one
of the other patterns may be a better fit.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Whit Armstrong
Thanks, Frank.

That works perfectly.

Cheers,
Whit


On Tue, Jun 16, 2009 at 7:36 AM, Frank Heikensfrankheik...@mac.com wrote:
 The search_path isn't correct. You're looking for the size of tables in
 schema 'econ' but you don't mention this schema inside the function
 pg_relation_size(). Try to  use the schemanames as well, saves you a lot of
 problems with assumptions.

 This one should work:

 SELECT
        tablename,
        pg_size_pretty(pg_relation_size(schemaname || '.' || tablename))
 FROM
        pg_tables
 WHERE
        schemaname = 'econ';


 Kind regard,
 Frank


 Op 16 jun 2009, om 13:17 heeft Whit Armstrong het volgende geschreven:

 Does anyone know why I get an unknown relation error when I query for
 relation size?

 kls=# select tablename, pg_size_pretty(pg_relation_size(tablename))
 from pg_tables where schemaname = 'econ' order by tablename;
 ERROR:  relation series_info does not exist
 kls=#

 Is there a better way to do this query to avoid this error?

 Thanks,
 Whit

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general





Re: [GENERAL] Dynamic table

2009-06-16 Thread Greg Stark
On Tue, Jun 16, 2009 at 12:21 PM, A Bgentosa...@gmail.com wrote:

 Just had a quick flick through your previous posts; and I'd probably
 stick with the multiple tables approach.  It's the most natural fit to
 relational databases and until you know more about the problem (i.e.
 you've experienced the data your going to be getting and the ways it's
 going to change) you can't do much better.

 One table per integer is one way that I have not thought about. Thanks!

You said you wanted to quickly select all the values for a given user.
That does not jibe well with having to use a separate table for each
integer.

I don't think it's fair to call this EAV, actually. It sounds
like the integers are a collection of things which represent the same
thing. I.e., they're all bank balances or all distances driven, just for
different time periods. Storing all objects representing the same
thing in the same column is just a regular normalized table, not EAV.

You might want to consider partitioning the table of integers by type
to facilitate dropping the old ones. But that's an administrative
trick, not a fundamental schema design decision.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [GENERAL] Dynamic table

2009-06-16 Thread A B
2009/6/16 Greg Stark gsst...@mit.edu

 On Tue, Jun 16, 2009 at 12:21 PM, A Bgentosa...@gmail.com wrote:

  I don't think think it's fair to call this EAV actually. It sounds
 like the integers are a collection of things which represent the same
 thing. Ie, they're all bank balances or all distances driven, just for
 different time periods. Storing all objects representing the same
 thing in the same column is just a regular normalized table, not EAV.

 You might want to consider partitioning the table of integers by type
 to facilitate dropping the old ones. But that's an administrative
 trick, not a fundamental schema design decision.


Unfortunately there is no type here. One can make no distinction between
the integers, and one cannot tell when they are added or dropped.


Re: [GENERAL] Dynamic table

2009-06-16 Thread A B
2009/6/16 A B gentosa...@gmail.com:

 2009/6/16 Greg Stark gsst...@mit.edu

 I don't think think it's fair to call this EAV actually. It sounds
 like the integers are a collection of things which represent the same
 thing. Ie, they're all bank balances or all distances driven, just for
 different time periods. Storing all objects representing the same
 thing in the same column is just a regular normalized table, not EAV.

 You might want to consider partitioning the table of integers by type
 to facilitate dropping the old ones. But that's an administrative
 trick, not a fundamental schema design decision.

 Unfortunatly there is no type here. One can make no distinction between
 the integers and one can not tell when they are added or dropped.


Or did I misunderstand you Greg?



Re: [GENERAL] Trigger Function and backup

2009-06-16 Thread Merlin Moncure
On Tue, Jun 16, 2009 at 6:48 AM, Nishkarshnishkars...@rediffmail.com wrote:

 Hi Merlin, thanks for the detailed input.

 As per ur suggestion i will try to implement Slony-I.

 I think i will need some help to do it.

 I am useing Postgres 8.3.7, on Windows.

 I was following the Slony-I example in the help for pgAdmin III. I am able
 to perform the steps from 1-7. Step 8 : create Slony-I cluster i am getting
 a msg in the interface

 Slony-I creation script no available; only join possible

 On doing some research i found some scripts to be copied (I was not able to
 find very clear instruction) or give slony-I path. i tried all that but was
 not able to move ahead.

 Can u plz guide me through -(%-|

Did you ask on the slony mailing list?  I haven't looked at slony for
a good while, so you'll get better help there (unless somebody else
wants to chime in).  You also might want to check out IRC.

merlin



Re: [GENERAL] Playing with set returning functions in SELECT list - behaviour intended?

2009-06-16 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 So it looks like the number of result rows is the least common multiple
 of the cardinalities of all columns in the select list.

It's always been that way.  The lack of any obviously-sane way to
handle multiple SRFs in a targetlist is exactly why the feature is
looked on with disfavor.

regards, tom lane



Re: [GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Tom Lane
Frank Heikens frankheik...@mac.com writes:
   pg_size_pretty(pg_relation_size(schemaname || '.' || tablename))

At some point you're going to wish you'd used quote_ident() here.

regards, tom lane

PS: Personally I prefer to rely on pg_relation_size(oid), but to use
that you need to be looking directly at pg_class, not at pg_tables
which doesn't expose the oid column :-(
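
For example, something along these lines (a sketch of the pg_class approach;
relkind = 'r' restricts it to ordinary tables):

SELECT n.nspname, c.relname, pg_size_pretty(pg_relation_size(c.oid))
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r' AND n.nspname = 'econ'
ORDER BY c.relname;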



Re: [GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Frank Heikens

Agreed.

Personally I wouldn't use pg_tables at all because of the missing oid.
It would be nice to have in this view, but it can't be changed because
it's a system view. pg_class would do the job.


Regards,
Frank


On 16 Jun 2009, at 16:12, Tom Lane wrote:


Frank Heikens frankheik...@mac.com writes:

pg_size_pretty(pg_relation_size(schemaname || '.' || tablename))


At some point you're going to wish you'd used quote_ident() here.

regards, tom lane

PS: Personally I prefer to rely on pg_relation_size(oid), but to use
that you need to be looking directly at pg_class, not at pg_tables
which doesn't expose the oid column :-(






Re: [GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Magnus Hagander
Actually, is there any particular reason why we can't *add* that column
to the view in a future version? We certainly shouldn't go modify it,
but adding to it should be pretty safe, no?

-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Frank Heikens wrote:
 Agreed.
 
 Personally I wouldn't use pg_tables at all because of the missing oid.
 Would be nice to have in this view, but it can't be changed because it's
 a system-view. pg_class would do the job.
 
 Regards,
 Frank
 
 
 Op 16 jun 2009, om 16:12 heeft Tom Lane het volgende geschreven:
 
 Frank Heikens frankheik...@mac.com writes:
 pg_size_pretty(pg_relation_size(schemaname || '.' || tablename))

 At some point you're going to wish you'd used quote_ident() here.

 regards, tom lane

 PS: Personally I prefer to rely on pg_relation_size(oid), but to use
 that you need to be looking directly at pg_class, not at pg_tables
 which doesn't expose the oid column :-(

 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general
 
 





Re: [GENERAL] pg_relation_size, relation does not exist

2009-06-16 Thread Frank Heikens
The same problem, missing OIDs, occurs with several other system
views as well. If you have to do some serious work, it's always
pg_class you need.

OIDs in these views would be nice, but only if all the system views
have the OIDs of the underlying objects. In the case of pg_tables you
need the OIDs of the schema and the table.


http://www.postgresql.org/docs/8.3/static/views-overview.html

Regards,
Frank


On 16 Jun 2009, at 16:52, Magnus Hagander wrote:

Actually, is there any particular reason why we can't *add* that column
to the view in a future version? We certainly shouldn't go modify it,
but adding to it should be pretty safe, no?

--
Magnus Hagander
Self: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


Frank Heikens wrote:

Agreed.

Personally I wouldn't use pg_tables at all because of the missing  
oid.
Would be nice to have in this view, but it can't be changed because  
it's

a system-view. pg_class would do the job.

Regards,
Frank


Op 16 jun 2009, om 16:12 heeft Tom Lane het volgende geschreven:


Frank Heikens frankheik...@mac.com writes:

   pg_size_pretty(pg_relation_size(schemaname || '.' || tablename))


At some point you're going to wish you'd used quote_ident() here.

   regards, tom lane

PS: Personally I prefer to rely on pg_relation_size(oid), but to use
that you need to be looking directly at pg_class, not at pg_tables
which doesn't expose the oid column :-(













Re: [GENERAL] 10 TB database

2009-06-16 Thread Todd Lieberman
 The problem is that we expect to have more than 250 GB of data every month.

Sounds like Teradata or Netezza territory.


Re: [GENERAL] Amazon EC2 | Any recent developments

2009-06-16 Thread David Kerr
On Mon, Jun 15, 2009 at 06:53:00PM -0700, Just Someone wrote:
- Hi,
- 
- I've seen both - some unknown reason for it to die (mostly related to
- the underlying hardware having issues). We also see instance failure
- from time to time with advanced notice. Just like a regular machine
- dies from time to time, so do cloud instances. I'd say it's bit more
- common on the cloud, but not by a big margin. I might see it more
- because I have hundreds of instances running.

Hi Guy,

Thanks!

So, when a cloud machine fails, does it get de-allocated/wiped out, or
is it still out there in a bad state? How do you recover your
data?

For example, in a physical system if the RAM goes bad the machine
crashes but you replace the chip, boot, perform instance recovery and
continue on with your life...

Thanks

Dave



Re: [GENERAL] Dynamic table

2009-06-16 Thread Sam Mason
On Tue, Jun 16, 2009 at 01:21:42PM +0200, A B wrote:
  The examples you gave (i.e. shoe size, hair length) would fit normal
  table columns much better.
 Sorry, shoe size was not a good example, think of it as  random
 string  instead of shoe size. The data/name is nothing you can relate
 to in any way or build special columns for or treat in other ways.

It's this decision that makes me think you've backed yourself into a
corner where EAV is your only solution.  EAV is nice because it allows
you to express arbitrary pairings between keys and values (i.e. it's a map
of arbitrary size, like a hashtable in other languages), but it has its
downsides in that all values are constrained to be of some specific type
and it doesn't allow you to exploit the power/expressiveness of databases
very easily.

  Just had a quick flick through your previous posts; and I'd probably
  stick with the multiple tables approach.  It's the most natural fit to
  relational databases and until you know more about the problem (i.e.
  you've experienced the data your going to be getting and the ways it's
  going to change) you can't do much better.
 
 One table per integer is one way that I have not thought about. Thanks!

That's not what I meant at all; I was meaning to group several related
integer values into a single table.  This would be a more natural fit
for the example names you gave.
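
A sketch of what I mean, with purely illustrative table and column names:

CREATE TABLE customer_physical (
    customer_id integer PRIMARY KEY REFERENCES customer,
    shoe_size   integer,
    hair_length integer
);

CREATE TABLE customer_preferences (
    customer_id integer PRIMARY KEY REFERENCES customer,
    newsletter  integer,
    contact_day integer
);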

Your problem currently sounds very much like an exam question; you
seem to be making decisions arbitrarily without showing any real data.
When you deal with real problems in the real world you're normally
making compromises when you model things, and hence the decisions
wouldn't be as forced as you're making them. Design is about picking and
choosing between compromises, and without knowing what the choices are
you can't design anything; this is one of the problems with tests.

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] 10 TB database

2009-06-16 Thread Martin Gainty

I would suggest Oracle 11g for a DB of 10TB or greater:
http://www.oracle.com/solutions/performance_scalability/tpc-h-10tb-11g.html

Martin Gainty 





Re: [GENERAL] 10 TB database

2009-06-16 Thread Greg Smith

On Mon, 15 Jun 2009, Artur wrote:


We are thinking to create some stocks related search engine.
It is experimental project just for fun.


I hope your idea of fun involves spending a bunch of money on hardware and 
endless tweaking to get data loading every day with appropriate 
corrections, because that's just the first round of fun on a job like 
this.


The problem is that we expect to have more than 250 GB of data every 
month. This data would be in two tables. About 50.000.000 new rows every 
month. We want to have access to all the date mostly for generating user 
requesting reports (aggregating). We would have about 10TB of data in 
three years. Do you think is it possible to build this with postgresql 
and have any idea how to start? :)


You start by figuring out what sort of business model is going to justify 
this very expensive adventure in today's market where buyers of financial 
market products are pretty rare, but that's probably not the question you 
wanted an answer to.


You can certainly build a server capable of handling this job with 
PostgreSQL here in 2009.  Get 8 cores, a stack of 24 1TB disks and a RAID 
card with a write cache, and you'll have a big enough system to handle the 
job.  Basic database design isn't too terribly difficult either.  Stock 
data is trivial to partition up into tiny pieces at the database level 
(each day can be its own 250GB partition), and any system capable of 
holding that much data is going to have a giant stack of drives spreading 
out the disk I/O too.
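
In 8.x that per-day partitioning would be done with inheritance and a CHECK
constraint on each child, roughly like this (only a sketch; the table and
column names here are made up):

CREATE TABLE tickdata (
    trade_date date    NOT NULL,
    symbol     text    NOT NULL,
    price      numeric NOT NULL
);

CREATE TABLE tickdata_2009_06_16 (
    CHECK (trade_date = DATE '2009-06-16')
) INHERITS (tickdata);

-- with constraint_exclusion = on, queries that filter on trade_date
-- only have to scan the matching child tables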


The first level of problems you'll run into are how to keep up with 
loading data every day.  The main way to get bulk data in PostgreSQL, 
COPY, isn't particularly fast, and you'll be hard pressed to keep up with 
250GB/day unless you write a custom data loader that keeps multiple cores 
going with that load.  Commercial databases have some better solutions to 
solve this problem in the base product, or easily available from third 
party sources.


The much, much bigger problem here is how exactly you're going to provide 
a user interface to this data.  You can't just give people access to the 
whole thing and let them run queries; the first person who executes 
something like select symbol,avg(price) from tickdata group by symbol 
because they want to see the average price of some stock over its lifetime 
is going to kill the whole server.  You really need to generate the 
aggregated reports ahead of time, using an approach like materialized 
views, and then only let people grab those.  It's possible to manually 
create materialized views in PostgreSQL, but that will be yet another bit 
of custom development here.
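
A manual materialized view here is essentially just a summary table that is
refreshed after each load, something along these lines (a sketch, reusing
the hypothetical tickdata table from the earlier example):

CREATE TABLE daily_symbol_summary AS
    SELECT trade_date, symbol, avg(price) AS avg_price, count(*) AS ticks
    FROM tickdata
    GROUP BY trade_date, symbol;

-- after each daily load, append just the new day's aggregates
INSERT INTO daily_symbol_summary
    SELECT trade_date, symbol, avg(price), count(*)
    FROM tickdata
    WHERE trade_date = DATE '2009-06-16'
    GROUP BY trade_date, symbol;

Users then query daily_symbol_summary instead of the raw table.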


The third level of issue is how you scale the app up if you're actually 
successful.  It's hard enough to get 250GB of daily data loaded into a 
single database and storing 10TB of data somewhere; doing the job across a 
replicated set of servers, so you can spread the queries out, is even more 
fun than that.


P.S. If you're not already familiar with how to aggregate properly over a 
trading calendar that includes holidays and spots where the market is only 
open part of the day, give up now; that's the hardest issue specific to 
this particular type of application to get right, and a lot of people 
don't realize that early enough in the design process to properly plan for 
it.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] 10 TB database

2009-06-16 Thread Greg Smith

On Mon, 15 Jun 2009, Whit Armstrong wrote:


Is anyone aware of additional examples or tutorials on partitioning?


http://www.pgcon.org/2007/schedule/events/41.en.html
http://blog.mozilla.com/webdev/2007/05/15/partitioning-fun-in-postgresql/
http://benjamin.smedbergs.us/blog/2007-05-12/when-partitioning-database-tables-explain-your-queries/

In that order really; those go from general commentary down to focusing on 
specific issues people tend to run into.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] 10 TB database

2009-06-16 Thread Greg Smith

On Tue, 16 Jun 2009, Brent Wood wrote:

For data retrieval, clustered indexes may help, but as this requires a 
physical reordering of the data on disk, it may be impractical.


This tends to be irrelevant for this sort of data because it's normally 
inserted in a fairly clustered way in the first place.  The usual way 
tables get unclustered involves random insertion and deletion, and that 
just doesn't happen for data that's being imported daily and never deleted 
afterwards; it's naturally clustered quite well.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] Amazon EC2 | Any recent developments

2009-06-16 Thread Just Someone
Hi,

 So, when a cloud machine fails does it get de-allocated/wiped out? or
 does it is it still out there in a bad state? how do you recover your
 data?

It depends. Sometimes it dies and you can't do anything with it. In
other cases you can restart it. As we store the data on EBS (which is
network storage in Amazon's cloud), the data is not tied to the
instance directly. And that's the beauty of it. If a machine dies I
can launch a replacement machine (it takes about 3-5 minutes for it to
boot and be ready), or in some cases I just have a replacement
instance waiting. I then mount the EBS volume holding the Postgres
data, let the server do the recovery if needed and I'm back online. No
need to replace a chip, go to the DC, etc...

In case the volume got corrupted (a very rare situation, as the EBS
volumes are very durable), there are snapshots I can recover from and
the WAL files I stream to another storage system (Amazon's S3). I have
some systems where I take daily tar backups of the PG directory and
ship those to a separate storage.
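
The WAL shipping part is just the standard archive_command machinery; in
postgresql.conf it looks roughly like this (a sketch only: s3-upload stands
in for whatever tool actually copies the file to S3):

archive_mode = on                                  # 8.3 and later
archive_command = 's3-upload %p my-bucket/wal/%f'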

There is no doubt you can get a much stronger machine by building your
own hardware, but the ability to easily recover, and to easily launch and
cluster, is a huge advantage of the cloud. It does take getting used
to, though. You need to think of machines as expendable, and plan for
easy failure preparation and replacement. It does make you really
prepare and test your recovery strategies. I know of too many
companies that just trust the DB to be ok. And it is most of the time,
but when a catastrophe happens, recovery is a long and risky process.

Bye,

Guy.



-- 
Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com



Re: [GENERAL] Amazon EC2 | Any recent developments

2009-06-16 Thread Greg Smith

On Tue, 16 Jun 2009, Just Someone wrote:


In case the volume got corrupted (a very rare situation, as the EBS
volumes are very durable), there are snapshots I can recover from and
the WAL files I stream to another storage system (Amazon's S3).


I wouldn't go so far as to say very durable, because the failure rate 
they aim for isn't really very high relative to what people expect when 
you use that term in a database context.  The most definitive commentary 
I've found on this is at 
http://solutions.amazonwebservices.com/connect/thread.jspa;jsessionid=96A862FA1DC393FCDD94DAF0B43CF4E7?messageID=111953 
where they say "we aim to provide an annual failure rate (AFR) of 0.1% -
0.5% for volumes"; frankly, that's garbage to most database people.  But,
as you say, when combined with an alternative backup strategy when that 
happens, the easy provisioning and such can give a reasonable system 
design for some goals.  You just have to recognize that the volumes are 
statistically pretty fragile compared to a traditional RAID configuration 
on dedicated hardware and plan accordingly.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] Amazon EC2 | Any recent developments

2009-06-16 Thread Just Someone
Hi,

On Tue, Jun 16, 2009 at 11:05 AM, Greg Smithgsm...@gregsmith.com wrote:

 You just have to recognize that the volumes are
 statistically pretty fragile compared to a traditional RAID configuration on
 dedicated hardware and plan accordingly.

I agree completely. I think the advantage is that it FORCES you to
plan for failure. Now, I know we all SHOULD plan for failure, but I
also know how many do not...

Bye,

Guy.


Family management on rails: http://www.famundo.com
My development related blog: http://devblog.famundo.com



Re: [GENERAL] 10 TB database

2009-06-16 Thread Michelle Konzack
Hi Greg,

On 2009-06-16 12:13:20, Greg Smith wrote:
 The first level of problems you'll run into are how to keep up with  
 loading data every day.  The main way to get bulk data in PostgreSQL,  
 COPY, isn't particularly fast, and you'll be hard pressed to keep up with 
 250GB/day unless you write a custom data loader that keeps multiple cores 

AFAIK he was talking about 250 GByte/month, which is around 8 GByte a
day, or roughly 300 MByte per hour

Thanks, Greetings and nice Day/Evening
Michelle Konzack
Systemadministrator
Tamay Dogan Network
Debian GNU/Linux Consultant

-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/
# Debian GNU/Linux Consultant #
http://www.tamay-dogan.net/ Michelle Konzack
http://www.can4linux.org/   c/o Vertriebsp. KabelBW
http://www.flexray4linux.org/   Blumenstrasse 2
Jabber linux4miche...@jabber.ccc.de   77694 Kehl/Germany
IRC #Debian (irc.icq.com) Tel. DE: +49 177 9351947
ICQ #328449886Tel. FR: +33  6  61925193




Re: [GENERAL] ResultSet is FORWARD_ONLY.

2009-06-16 Thread Andy Colson

S Arvind wrote:

Hi everyone,
Recently we have upgraded postgres driver jdbc4 for 8.3. After that 
while executing rst.first(), it raise the exception.

The statment was created correctly as
dbconnection..createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,ResultSet.CONCUR_READ_ONLY);

org.postgresql.util.PSQLException: Operation requires a scrollable 
ResultSet, but this ResultSet is FORWARD_ONLY.
at 
org.postgresql.jdbc2.AbstractJdbc2ResultSet.checkScrollable(AbstractJdbc2ResultSet.java:207)
at 
org.postgresql.jdbc2.AbstractJdbc2ResultSet.first(AbstractJdbc2ResultSet.java:292)
at 
org.apache.tomcat.dbcp.dbcp.DelegatingResultSet.first(DelegatingResultSet.java:326)



This statment worked in postgres 8.1 jdbc 3 driver but now its not.. 
Pleas help is solving this...


-Arvind S


I guess there are two important questions:

1) do you want a forward-only result set?

2) if yes, then can't you just remove .first()?

It's ok not to use first, right:

q.open();
while (! q.eof())
{
do something;
q.next();
}

(IANAJP)
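
In plain JDBC that looks roughly like this (a sketch; the connection and
query here are placeholders):

import java.sql.*;

Statement st = dbconnection.createStatement();   // defaults to TYPE_FORWARD_ONLY
ResultSet rs = st.executeQuery("SELECT id FROM some_table");
while (rs.next()) {        // next() moves onto the first row initially, no first() needed
    int id = rs.getInt(1);
    // do something with the row
}
rs.close();
st.close();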

-Andy



Re: [GENERAL] 10 TB database

2009-06-16 Thread Greg Smith

On Tue, 16 Jun 2009, Michelle Konzack wrote:


Am 2009-06-16 12:13:20, schrieb Greg Smith:
you'll be hard pressed to keep up with 250GB/day unless you write a 
custom data loader that keeps multiple cores


AFAIK he was talking about 250 GByte/month which are  around  8 GByte  a
day or 300 MByte per hour


Right, that was just a typo in my response, the comments reflected what he 
meant.  Note that your averages here presume you can spread that out over 
a full 24 hour period--which you often can't, as this type of data tends 
to come in a big clump after market close and needs to be loaded ASAP for 
it to be useful.


It's harder than most people would guess to sustain that sort of rate 
against real-world data (which even fails to import some days) in 
PostgreSQL without running into a bottleneck in COPY, WAL traffic, or 
database disk I/O (particularly if there's any random access stuff going 
on concurrently with the load).  Just because your RAID array can write at 
hundreds of MB/s does not mean you'll be able to sustain anywhere close to 
that during your loading.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] 10 TB database

2009-06-16 Thread Grzegorz Jaśkiewicz
2009/6/16 Martin Gainty mgai...@hotmail.com:
 would suggest Oracle 11 for DB of 10TB or greater
 http://www.oracle.com/solutions/performance_scalability/tpc-h-10tb-11g.html

You are joking, right?
Better to invest that money in paying someone from -hackers to add
the features you need, if there are any!
Or buy a heftier RAID with more disks...


-- 
GJ



Re: [GENERAL] 10 TB database

2009-06-16 Thread Scott Marlowe
2009/6/16 Grzegorz Jaśkiewicz gryz...@gmail.com:
 2009/6/16 Martin Gainty mgai...@hotmail.com:
 would suggest Oracle 11 for DB of 10TB or greater
 http://www.oracle.com/solutions/performance_scalability/tpc-h-10tb-11g.html

 You are joking, right?
 Better to invest that money in paying someone from -hackers to add
 any features required, if there are any!
 Or put it toward a heftier RAID, with more disks...

You can throw a metric ton of hardware and development at a problem
for the cost of an Oracle license.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Dynamic table

2009-06-16 Thread Erik Jones


On Jun 16, 2009, at 12:11 AM, A B wrote:


Hi.
I have a little problem (and a suggestion for a solution) that I
wondered if anyone would care to comment on.

I have a standard table filled with customers (with a unique customer
id, names etc.) and for each customer I need to store some integer
values. The problem is that the number of integer values that should
be stored WILL change over time (it can both increase and decrease).
It will most likely grow from zero to 10-18 and then perhaps add 1 or
remove one value per year but basically be about 10-18 or so.

I must be able to add a new integer value to all customers, and remove
an integer value from all customers  Altering the table by adding and
deleting columns would theoretically solve it, but since columns are
not  really dropped, just hidden away, and there is a 1600 column
limit on tables as I understand it, this would crash the application
at some time in the future, or at least it will waste a lot of
disk space.

Other things I must be able to do are to alter values for a specific
integer value for one customer or for all customers simultaneously, for
example changing the third integer value from 4 to 8 everywhere.
And I must be able to quickly select all the integers for a specific
customer for further processing. There will also be access to single
integer values for a specific customer; it's hard to say how often,
compared to getting all integer values for a customer. All
customers will be equally accessed.

As I see it I have these options.

Method C)
Have a metadata table as usual, and then store the values in an array.
I must admit that I have not looked enough at arrays yet.
The drawback I can see right now is the access to a specific
value in the array, but I guess PostgreSQL is fast...


You may want to try this.

pagila=# create table test (id serial primary key, a int[]);
NOTICE:  CREATE TABLE will create implicit sequence test_id_seq for  
serial column test.id
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index  
test_pkey for table test

CREATE TABLE
Time: 12.844 ms

-- let's populate it with some starter data
pagila=# insert into test (a) values ('{1}'), ('{2}'), ('{3}');
INSERT 0 3
Time: 2.127 ms
pagila=# select * from test;
 id |  a
----+-----
  1 | {1}
  2 | {2}
  3 | {3}
(3 rows)

Time: 2.823 ms

-- so, let's say you want to add a second integer value of 5 to all
rows (remember, postgres arrays are 1-based):

pagila=# update test set a[2] = 5;
UPDATE 3
Time: 1.157 ms
pagila=# select * from test;
 id |   a
----+-------
  1 | {1,5}
  2 | {2,5}
  3 | {3,5}
(3 rows)

Time: 0.445 ms

-- delete the first integer value for just id=1; the key thing here is
that you use UPDATE to null out an individual value in an array

pagila=# update test set a[1] = null where id = 1;
UPDATE 1
Time: 1.688 ms
pagila=# select * from test;
 id |a
----+----------
  2 | {2,5}
  3 | {3,5}
  1 | {NULL,5}
(3 rows)

Time: 0.527 ms

-- get integer value 1 for all rows
pagila=# select a[1] from test;
 a
----
  2
  3
 \N
(3 rows)

Time: 0.489 ms

-- you can even skip positions
pagila=# update test set a[5] = 10;
UPDATE 3
Time: 1.180 ms
pagila=# select * from test;
 id |   a
----+-----------------------
  2 | {2,5,NULL,NULL,10}
  3 | {3,5,NULL,NULL,10}
  1 | {NULL,5,NULL,NULL,10}
(3 rows)

Time: 0.431 ms

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] nagios -- number of postgres connections

2009-06-16 Thread Whit Armstrong
anyone know a way to get nagios to monitor the number of postgres connections?

Thanks,
Whit

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] nagios -- number of postgres connections

2009-06-16 Thread John R Pierce

Whit Armstrong wrote:

anyone know a way to get nagios to monitor the number of postgres connections?
  



a script that counts how many lines of postgres processes there are and 
subtracts a few



$ ps -C postmaster uww
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
postgres  2878  0.0  0.1 152036  1500 ?SApr05  22:48 
/usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres  3941  0.0  0.0  13056   460 ?Ss   Apr05   0:11 
postgres: logger process 
postgres  3944  0.0  8.4 152192 87196 ?Ss   Apr05   1:35 
postgres: writer process 
postgres  3945  0.0  0.0 152192   684 ?Ss   Apr05   0:10 
postgres: wal writer process 
postgres  3946  0.0  0.1 152980  1380 ?Ss   Apr05  29:13 
postgres: autovacuum launcher process
postgres  3947  0.1  0.0  13916  1020 ?Ss   Apr05 118:23 
postgres: stats collector process 



so there's one connection there, and the others are the fixed processes.
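If you want to script that counting approach, here is a rough, hypothetical
sketch that shells out to ps, counts the "postgres:" backend lines and skips
the fixed helper processes seen above. It assumes a Unix-like ps and is an
illustration only, not a polished plugin:

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class CountPgBackends {
    public static void main(String[] args) throws Exception {
        // The helper processes to exclude are the ones visible in the
        // listing above; adjust for your setup (e.g. archiver process).
        String[] fixed = { "logger process", "writer process",
                           "wal writer process", "autovacuum launcher",
                           "stats collector", "archiver process" };
        Process ps = new ProcessBuilder("ps", "ax").start();
        BufferedReader out = new BufferedReader(
                new InputStreamReader(ps.getInputStream()));
        int count = 0;
        String line;
        while ((line = out.readLine()) != null) {
            if (!line.contains("postgres:")) continue;  // only backend lines
            boolean helper = false;
            for (String f : fixed) {
                if (line.contains(f)) { helper = true; break; }
            }
            if (!helper) count++;                        // real client connection
        }
        System.out.println(count + " client connections");
    }
}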

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] nagios -- number of postgres connections

2009-06-16 Thread Scott Mead
On Tue, Jun 16, 2009 at 11:32 PM, John R Pierce pie...@hogranch.com wrote:

 Whit Armstrong wrote:

 anyone know a way to get nagios to monitor the number of postgres
 connections?



  You could also login to the database and run:

   select count(1) from pg_stat_activity;
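A hypothetical wrapper around that query, returning standard Nagios exit
codes, might look something like this (thresholds, host and credentials are
made up):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckPgConnections {
    public static void main(String[] args) {
        // Invented thresholds and connection details -- adjust to taste.
        final int warn = 80, crit = 100;
        try {
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/postgres", "nagios", "secret");
            Statement st = conn.createStatement();
            ResultSet rs = st.executeQuery("select count(1) from pg_stat_activity");
            rs.next();
            int n = rs.getInt(1);
            conn.close();

            // Standard Nagios plugin exit codes: 0=OK, 1=WARNING, 2=CRITICAL.
            if (n >= crit) {
                System.out.println("CRITICAL - " + n + " connections");
                System.exit(2);
            } else if (n >= warn) {
                System.out.println("WARNING - " + n + " connections");
                System.exit(1);
            } else {
                System.out.println("OK - " + n + " connections");
                System.exit(0);
            }
        } catch (Exception e) {
            System.out.println("UNKNOWN - " + e.getMessage());
            System.exit(3);  // 3 = UNKNOWN in Nagios terms
        }
    }
}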

--Scott


Re: [GENERAL] nagios -- number of postgres connections

2009-06-16 Thread Jeff Frost
John R Pierce wrote:
 Whit Armstrong wrote:
 anyone know a way to get nagios to monitor the number of postgres
 connections?
   


Check out the check_postgres nagios plugin:
http://bucardo.org/check_postgres/

Specifically you want the backends check:
http://bucardo.org/check_postgres/check_postgres.pl.html#backends


-- 
Jeff Frost, Owner   j...@frostconsultingllc.com
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 916-647-6411 FAX: 916-405-4032


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

