Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Greg Smith

John R Pierce wrote:
otoh, there's plenty of places where natural keys are optimal.   my 
company makes widgets, and we make damn sure our serial #s and part 
numbers are unique, and we use them as PK's for the various tables.


Sure; what I was commenting on is that you normally can't ever trust 
external sources for identifiers.  If you want to come up with your own, 
internally unique keys for things, great.  But one of the goals of using 
a natural key is often to avoid the overhead of storing both that ID and 
some made up internal number, too.  And whether the number is made up by 
the computer (the classic SERIAL or similar surrogate key), or you make 
one up yourself, it's still another chunk of data that gets stored for 
every item.  It's just one that means something more useful in your 
case.  Probably going to take up more space in the process and possibly 
be slower though--part number strings can easily end up longer than 
SERIAL-like integers.
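As a rough illustration of that size difference (the part number string is hypothetical; short text values in PostgreSQL also carry a small per-value length header, one byte in 8.3+):

```shell
# Rough byte-count comparison: a text part number vs. a SERIAL-backed integer key.
# (hypothetical part number; "+ 1" models the short-varlena header on the text value)
part_no="WGT-2011-00042-A"
echo "part number bytes: $(( ${#part_no} + 1 ))"
echo "serial (int4) bytes: 4"
```

Here the made-up part number costs roughly four times the storage of a plain int4 key, and that cost repeats in every referencing table and index.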


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Bidirectional replication

2011-05-03 Thread tushar nehete
Hi,
Is there any way to do bidirectional replication for PostgreSQL Plus Advanced
Server 8.4.5?

I tried Slony-I, but it is master-slave asynchronous replication.
Can we configure master-master replication with Slony?


Or is there any trusted tool to do it?


Regards,
Tushar


Re: [GENERAL] Seg Fault in backend after beginning to use xpath (PG 9.0, FreeBSD 8.1)

2011-05-03 Thread alan bryan
On Mon, May 2, 2011 at 10:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 alan bryan alan.br...@gmail.com writes:
 Checking out postgres.core and we see:

 (gdb) bt
 #0  0x0008f5f19afd in pthread_mutex_lock () from /lib/libthr.so.3
 #1  0x000800d22965 in xmlRMutexLock () from /usr/local/lib/libxml2.so.5
 #2  0x000800d717e1 in xmlDictReference () from 
 /usr/local/lib/libxml2.so.5
 #3  0x000800d74ba5 in xmlSAX2StartDocument ()
    from /usr/local/lib/libxml2.so.5
 #4  0x000800ccee5f in xmlParseDocument () from 
 /usr/local/lib/libxml2.so.5
 #5  0x000800ccef85 in xmlDoRead () from /usr/local/lib/libxml2.so.5
 #6  0x0076b58d in xpath ()
 #7  0x005880e4 in GetAttributeByNum ()
 #8  0x00588e91 in GetAttributeByName ()
 #9  0x005850a3 in ExecProject ()
 #10 0x0058c5e4 in ExecScan ()
 #11 0x00584a2d in ExecProcNode ()
 #12 0x0059bfc8 in ExecLimit ()
 #13 0x005848f5 in ExecProcNode ()
 #14 0x00583049 in standard_ExecutorRun ()
 #15 0x0067630d in PostgresMain ()
 #16 0x00677921 in PortalRun ()
 #17 0x00672ea4 in pg_parse_and_rewrite ()
 #18 0x00675354 in PostgresMain ()
 #19 0x00626afb in ClosePostmasterPorts ()
 #20 0x00627a8e in PostmasterMain ()
 #21 0x005bbea7 in main ()
 (gdb)

 Ideas?  Need more info?

 Well, the first thing that you should consider is rebuilding both PG and
 libxml with debug symbols enabled, so you can get a stack trace that's
 worth the electrons it's written on.  That one has enough laughers in
 the PG part to make me not trust the libxml part too much.  That would
 also help you find out what SQL command is being executed, which'd
 possibly lead to being able to create a reproducible test case.

                        regards, tom lane


Thanks Tom - I'll see what I can do.  We just removed that new code
and did it in our PHP code instead as a workaround.  I'll try to spend
some time getting a reproducible test case and come back with a better
trace if possible.

Appreciate the quick response.

--Alan



Re: [GENERAL] Bidirectional replication

2011-05-03 Thread John R Pierce

On 05/02/11 11:15 PM, tushar nehete wrote:

Hi,
Is there any way to do bidirectional replication for Postgresql Plus 
Advance Server 8.4.5?




PostgreSQL Plus Advanced Server is a commercial product sold by
EnterpriseDB; you probably should ask them.



I tried SLONY-I but its master-slave asynchronous replication.
Can we configure master-master replication by slony?


Or is there any trusted tool to do it?



In general, master-master replication is not easy to do efficiently and
correctly.  Every implementation on any database either suffers from very
poor performance due to global synchronous locking and two-phase commits,
or it suffers from data collisions, which can only be avoided with careful
application design and programming and are not easily enforced at the
database server.


AFAIK, the only Postgres replication systems that even pretend to
support master-master are things like Bucardo, which do the replication at
the SQL layer by sending all update/insert/delete commands to both
servers; under certain sequences of concurrent queries, you could
end up with different results on the two servers.
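That divergence can be illustrated with a toy sketch (not Bucardo itself; just two simulated servers applying the same two statements in a different order, as concurrency allows):

```shell
# Statement-level replication hazard: the final value depends on apply order.
# Each "server" applies the same two updates; only the arrival order differs.
apply_in_order() { local v=0 s; for s in "$@"; do v=$s; done; echo "$v"; }
a=$(apply_in_order 1 2)   # server A: set x=1, then set x=2
b=$(apply_in_order 2 1)   # server B: sees the same statements in the opposite order
echo "server A ends with x=$a, server B ends with x=$b"
```

This prints "server A ends with x=2, server B ends with x=1" -- two replicas that each faithfully executed every statement, yet disagree.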




Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Sim Zacks

On 05/03/2011 09:15 AM, tushar nehete wrote:


Hi,
Is there any way to do bidirectional replication for Postgresql Plus 
Advance Server 8.4.5?


I tried SLONY-I but its master-slave asynchronous replication.
Can we configure master-master replication by slony?


Or is there any trusted tool to do it?


Regards,
Tushar
I have heard good things about Bucardo, though I haven't tried it myself
yet. I was warned that it would be risky to have two masters with the
same tables modified on both, because of issues such as delayed sync,
race conditions and other such goodies that may corrupt the meaning of
the data.




Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Karsten Hilbert
On Tue, May 03, 2011 at 10:52:23AM +0800, Craig Ringer wrote:

 ... and that's before we get into the horror of what is someone's
 name. Which name? Which spelling? Do they even have a single canonical
 name?

- people have, at least over time, several compound names
- they have, at any one time, one active compound name
- additional spellings can be tracked as additional names
  of that individual

 Is their canonical name - if any - expressable in the character
 set used by the service? Is it even covered by Unicode?!?

- I haven't seen evidence to the contrary.
- But then, I haven't had a need to store a Klingon name.
- Yes, it's been difficult to come up with something sensible
  to store Spock's first name in the GNUmed database.

 Does it make
 any sense to split their name up into the traditional
 english-speaking-recent-western family and given name parts?

- any compound names I have come across work like this:

- group name
- individual name
- nicknames (pseudonyms, warrior names, actor names, ...)

The day-to-day usage of each part varies, though.

 Is there a single consistent way to do so for their name even if it does? etc.

Even in Japan, where the group counts for a lot more than the
individual, you can clearly split a name into a group name and an
individual name.

Karsten
-- 
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Simon Riggs
On Tue, May 3, 2011 at 7:31 AM, Sim Zacks s...@compulab.co.il wrote:

 I have heard good things about Bucardo, though I haven't tried it myself
 yet. I was warned that it would be risky to have 2 masters that have the
 same tables modified in both because of issues such as delayed sync, race
 conditions and other such goodies that may corrupt the meaning of the data.


Just to be clear and fair to Bucardo, I would add a few points.

All multi-master replication solutions that use an optimistic
mechanism require conflict-resolution cases and code. This is the
same with SQL Server, Oracle, etc.  Referring to a well-known problem
as a "race condition" seems to introduce doubt and fear into a situation
that is well understood. Bucardo does offer hooks for conflict
resolution to allow you to program around the issues.

So if I felt that multi-master replication was the right way to go for
a solution, Bucardo would be a good choice.

Just to add other info: if multi-master replication uses pessimistic
coherence, then the coherence mechanism can also be a source of
contention and/or cause the need for alternative kinds of conflict
resolution.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



[GENERAL] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread raghu ram
Respected Committers,

It may be a silly question, but out of curiosity I want to know: is there
any possible way to flush the Postgres shared memory without restarting the
cluster?

In Oracle we can flush the SGA; can we get the same feature here?

Thanks in Advance.

Regards
Raghu Ram
EnterpriseDB Corporation


Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Rick Genter

On May 2, 2011, at 10:52 PM, Craig Ringer wrote:

 SSN? What if they don't live in the US or aren't a citizen?

Non-citizens can have SSNs (they have to if they work in the US).
--
Rick Genter
rick.gen...@gmail.com




[GENERAL] Re: Seg Fault in backend after beginning to use xpath (PG 9.0, FreeBSD 8.1)

2011-05-03 Thread Ivan Voras

On 03/05/2011 07:12, alan bryan wrote:

Our developers started to use some xpath features and upon deployment
we now have an issue where PostgreSQL is seg faulting periodically.
Any ideas on what to look at next would be much appreciated.

FreeBSD 8.1
PostgreSQL 9.0.3 (also tried upgrading to 9.0.4)  built from ports
Libxml2 2.7.6 (also tried upgrading to 2.7.8)   built from ports

pgsql logs show:
May  1 17:51:13 192.168.20.100 postgres[11862]: [94-1] LOG:  server
process (PID 62112) was terminated by signal 11: Segmentation fault

syslog shows:
May  2 20:29:16 db3 kernel: pid 49956 (postgres), uid 70: exited on
signal 11 (core dumped)
May  2 21:06:37 db3 kernel: pid 39086 (postgres), uid 70: exited on
signal 10 (core dumped)

Checking out postgres.core and we see:

(gdb) bt
#0  0x0008f5f19afd in pthread_mutex_lock () from /lib/libthr.so.3
#1  0x000800d22965 in xmlRMutexLock () from /usr/local/lib/libxml2.so.5


This is unusual. There isn't any need to use pthreads here. As far as I 
can see, the normal build of libxml2 doesn't import it explicitly:



ldd /usr/local/lib/libxml2.so

/usr/local/lib/libxml2.so:
	libz.so.5 => /lib/libz.so.5 (0x800889000)
	libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x800e5)
	libm.so.5 => /lib/libm.so.5 (0x80104b000)
	libc.so.7 => /lib/libc.so.7 (0x800647000)

Judging by the mix of SIGBUS and SIGSEGV, I'd say it is likely this is 
causing you problems.


To make sure, you may want to rebuild libxml2 with WITHOUT_THREADS 
defined. You may also need to rebuild postgresql afterwards.
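To confirm whether a given libxml2 build pulls in a threads library, one can grep its ldd output (a sketch; libthr is FreeBSD's threads library and libpthread the Linux equivalent):

```shell
# check_threads reports whether an ldd listing pulls in a pthreads implementation.
check_threads() {
  if printf '%s\n' "$1" | grep -Eq 'libthr|libpthread'; then
    echo "threaded"
  else
    echo "not threaded"
  fi
}
# Real use: check_threads "$(ldd /usr/local/lib/libxml2.so)"
check_threads "libz.so.5 => /lib/libz.so.5"        # a clean, thread-free listing
check_threads "libthr.so.3 => /lib/libthr.so.3"    # the problematic dependency
```

If the rebuilt library reports "not threaded", the pthread frame should disappear from future backtraces.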






Re: [GENERAL] Bidirectional replication

2011-05-03 Thread tushar nehete
Thank you all.
I started with Bucardo; I installed ActivePerl 5.12 on my Linux (RHEL 5.5)
server.
Can you please suggest some link which describes the installation steps in
detail?


Thanks,
Tushar

On Tue, May 3, 2011 at 2:49 PM, Simon Riggs si...@2ndquadrant.com wrote:

 [snip]



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Simon Riggs
On Tue, May 3, 2011 at 11:54 AM, raghu ram raghuchenn...@gmail.com wrote:

 It may be a silly question, still out of curiosity I want to know, is there
 any possible way to flush the Postgres Shared Memory without restarting the
 cluster.
 In Oracle, we can flush the SGA, can we get the same feature here..
 Thanks in Advance.


The CHECKPOINT command will do this for you.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Raghavendra
Best to start with..

http://bucardo.org/wiki/Bucardo/Installation

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/



On Tue, May 3, 2011 at 5:34 PM, tushar nehete tpneh...@gmail.com wrote:

 [snip]





Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Raghavendra
One more point: please take into consideration the points mentioned by Simon
Riggs in your testing.

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/



On Tue, May 3, 2011 at 5:41 PM, Raghavendra 
raghavendra@enterprisedb.com wrote:

 [snip]






Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Raghavendra
On Tue, May 3, 2011 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On Tue, May 3, 2011 at 11:54 AM, raghu ram raghuchenn...@gmail.com
 wrote:

  It may be a silly question, still out of curiosity I want to know, is
 there
  any possible way to flush the Postgres Shared Memory without restarting
 the
  cluster.
  In Oracle, we can flush the SGA, can we get the same feature here..
  Thanks in Advance.


 The CHECKPOINT command will do this for you.


So this command will empty the Postgres shared memory...

Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/


Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread raghu ram
On Tue, May 3, 2011 at 6:01 PM, Raghavendra 
raghavendra@enterprisedb.com wrote:


 On Tue, May 3, 2011 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 [snip]


 The CHECKPOINT command will do this for you.




According to the PostgreSQL documentation, whenever you execute CHECKPOINT in
the database, it will flush the modified data pages present in the shared
buffers to disk.

   http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html

Does this clear the entire shared-memory cache, so that if I then execute a
fresh SQL statement, the data will be returned from disk?



--Raghu Ram


Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Merlin Moncure
On Mon, May 2, 2011 at 11:53 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 03/05/11 11:07, Greg Smith wrote:

 That doesn't mean you can't use
 them as a sort of foreign key indexing the data; it just means you can't
 make them the sole unique identifier for a particular entity, where that
 entity is a person, company, or part.

 Classic case: a database here has several tables indexed by MAC address.
 It's used for asset reporting and software inventory.

 Problem: VMs generate random MAC addresses by default. They're not
 guaranteed to be globally unique. Collisions have happened and will
 probably happen again. In this case, it wasn't a big deal, but it just
 goes to show that even the obviously globally unique isn't necessarily so.

It's precisely pathological cases like this where uniqueness
constraints are important and should be used.  By the way, we aren't
debating the use of natural keys but whether to define uniqueness
constraints.  My main gripe with surrogates is that their use often
leads directly to lazy schema design, where uniqueness constraints are
not defined, which leads to data problems exactly like the case you
described above.

In a purely surrogate-keyed table with no unique constraint on the MAC,
suppose you have two records with the same value for the address, and
there are no other interesting fields in the table or elsewhere in the
database:
*) who/what made the decision to place a second record in the table?
*) is that decision verifiable?
*) is that decision repeatable?
*) are there other routes of data entry into the database that bypass
that decision?  will there ever be?
*) what happens when the code that represents that decision has or
develops a bug?
*) why would you not want information supporting that decision in the database?
*) how do you know the tables keying to your MAC table are pointing to
the correct record?
*) what are the consequences for ad hoc queries that join directly
against the MAC?  DISTINCT?

If your data modeler made the assumption that a MAC is unique (a
mistake, obviously), at least the other tables are protected from
violations of that assumption, because the database would reject them
with an error, which is a Good Thing.  Without a uniqueness constraint
you now have ambiguous data, which is a Very Bad Thing.  Without proper
unique constraints, a generated key is effectively saying "well, I can't
figure this out right now... I'll deal with it later."  That unmanaged
complexity is now in the application and all the queries that touch the
database... you've created your own bug factory.  With a uniqueness
constraint, you have a rigorous definition of what your record
represents, and other entities in the database can rely on that
definition.

Natural key designs are good for a lot of reasons, but #1 on the list
is that they force you to deal with problems in your data model up
front, because they force you to define uniqueness.  If the MAC turns
out not to be unique, and the problem is not in fact coming from the
input data or the application, yes, you do have to correct the model,
but at least the data inside the database is clean and can be
unambiguously mapped to the new model.  I'll take schema changes over
bad data.
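The enforcement behavior being argued for can be mimicked outside SQL (a toy keyed store standing in for a UNIQUE constraint; the MAC value is made up):

```shell
# A keyed store rejects a second row with the same MAC instead of silently
# accepting an ambiguous duplicate -- the behavior a UNIQUE constraint gives you.
declare -A macs
insert_mac() {
  if [ -n "${macs[$1]+set}" ]; then
    echo "ERROR: duplicate key value \"$1\""
  else
    macs[$1]=1
    echo "INSERT \"$1\""
  fi
}
insert_mac "00:16:3e:aa:bb:cc"
insert_mac "00:16:3e:aa:bb:cc"   # second attempt is rejected, not stored
```

The duplicate never enters the store, so every later lookup by MAC still maps to exactly one record -- the error surfaces at insert time instead of as ambiguous data downstream.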

Correcting the model means figuring out what information distinguishes
identical MACs #1 and #2 and storing it in the database, because then
your data and the corresponding decisions are verifiable, repeatable,
unambiguous, etc.  Which extra field you have to add to your 'mac' table
to make it unique would depend on the circumstances, but it's certainly
a solvable problem, and when solved it would give you a more robust
database.

merlin



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Merlin Moncure
On Tue, May 3, 2011 at 8:30 AM, raghu ram raghuchenn...@gmail.com wrote:
 [snip]

 According to PostgreSQL documentation, whenever you execute CHECKPOINT in
 the database,it will flush the modified data files presented in the Shared
 Buffers retuned to the Disk.
            http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html
 Is this clears the entire shared memory cache and same time,if i execute
 fresh SQL statement, Data will be retuned from disk??

no it will not, or at least there is no guarantee it will be.  the
only way to reset the buffers in that sense is to restart the database
(and even then they might not be read from disk, because they could
sit in the o/s cache).  to force a read from the drive you'd have to
reboot the server, or at least shut it down and use a lot of memory
for some other purpose.

merlin



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread tv
 [snip]

 According to PostgreSQL documentation, whenever you execute CHECKPOINT
 in
 the database,it will flush the modified data files presented in the Shared
 Buffers retuned to the Disk.

http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html

 Is this clears the entire shared memory cache and same time,if i execute
 fresh SQL statement, Data will be retuned from disk??

No. A checkpoint means all dirty buffers are written to the data files;
it does not mean emptying the shared buffers. Checkpoints happen
regularly, so that would have an unwanted impact on performance.

And besides that, there's a page cache maintained by the OS (not sure if
you're running Linux or Windows). So even when the block does not exist in
the shared buffers, it may be in the page cache (thus not read from the
drive).

Dropping the page cache is quite simple (http://linux-mm.org/Drop_Caches);
emptying the shared buffers is not that simple - I guess the easiest way
is to restart the db.
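The Linux procedure from the linux-mm.org link can be sketched as follows (needs root against the real path; the control-file path is parameterized here so the sketch can be tried safely against a scratch file):

```shell
# Flush dirty pages, then drop the kernel page cache via the given control file.
# This clears only the OS cache; PostgreSQL's shared_buffers survive it.
drop_caches() {
  sync                # write out dirty pages first, per the linux-mm.org advice
  echo 3 > "$1"       # 3 = free pagecache + dentries + inodes
}
drop_caches /tmp/fake_drop_caches    # real use (as root): drop_caches /proc/sys/vm/drop_caches
cat /tmp/fake_drop_caches
```

Even after this, blocks may still be cached below the OS, e.g. in a RAID controller or SAN cache.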

What are you trying to achieve? Why do you need this?

Tomas




Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Cédric Villemain
2011/5/3 Merlin Moncure mmonc...@gmail.com:
 [snip]

 no it will not, or at least there is no guarantee it will be.  the
 only way to reset the buffers in that sense is to restart the database
 (and even then they might not be read from disk, because they could
 sit in the o/s cache).  to force a read from the drive you'd have to
 reboot the server, or at least shut it down and use a lot of memory
 for some other purpose.

with Linux, you can: echo 3 > /proc/sys/vm/drop_caches for the OS cache






-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Merlin Moncure
On Tue, May 3, 2011 at 8:55 AM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2011/5/3 Merlin Moncure mmonc...@gmail.com:

 no it will not, or at least there is no guarantee it will be.  the
 only way to reset the buffers in that sense is to restart the database
 (and even then they might not be read from disk, because they could
 sit in the o/s cache).  to force a read from the drive you'd have to
 reboot the server, or at least shut it down and use a lot of memory
 for some other purpose.

 with Linux, you can: echo 3 > /proc/sys/vm/drop_caches for the OS cache


yeah -- good point.  aside: does that also drop cache on the drive/raid card?

merlin



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Cédric Villemain
2011/5/3 Merlin Moncure mmonc...@gmail.com:
 On Tue, May 3, 2011 at 8:55 AM, Cédric Villemain
 cedric.villemain.deb...@gmail.com wrote:
 2011/5/3 Merlin Moncure mmonc...@gmail.com:

 no it will not, or at least there is no guarantee it will be.  the
 only way to reset the buffers in that sense is to restart the database
 (and even then they might not be read from disk, because they could
 sit in the o/s cache).  to force a read from the drive you'd have to
 reboot the server, or at least shut it down and use a lot of memory
 for some other purpose.

 with Linux, you can: echo 3 > /proc/sys/vm/drop_caches for the OS cache


 yeah -- good point.  aside: does that also drop cache on the drive/raid card?

no -- good point too! (damn! what will SAN users do... maybe EMC or
others are good enough to provide some control panel for that?)

and as I read on the link provided by Tomas, it is better to issue a
'sync' before trying to drop the cache (I do that sometimes, but PostgreSQL
flushes its writes before shutdown, so I expect the dirty pages in the OS
cache not to belong to PostgreSQL files.)


-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



Re: [GENERAL] Bidirectional replication

2011-05-03 Thread Merlin Moncure
On Tue, May 3, 2011 at 4:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, May 3, 2011 at 7:31 AM, Sim Zacks s...@compulab.co.il wrote:

 I have heard good things about Bucardo, though I haven't tried it myself
 yet. I was warned that it would be risky to have 2 masters that have the
 same tables modified in both because of issues such as delayed sync, race
 conditions and other such goodies that may corrupt the meaning of the data.


 Just to be clear and fair to Bucardo, I would add a few points.

 All multi-master replication solutions that use an optimistic
 mechanism require conflict resolution cases and code. This is the
 same with SQLServer and Oracle etc.. Referring to a well known problem
 as a race condition seems to introduce doubt and fear into a situation
 that is well understood. Bucardo does offer hooks for conflict
 resolution to allow you to program around the issues.

 So if I felt that multi-master replication was the right way to go for
 a solution, Bucardo is a good choice.

 Just to add other info: if multi-master replication uses pessimistic
 coherence, then the coherence mechanism can also be a source of
 contention and/or cause the need for alternative kinds of conflict
 resolution.

Yeah.  One nasty property that async multi-master solutions share is
that they change the definition of what 'COMMIT' means -- the database
can't guarantee the transaction is valid because not all the
supporting facts are necessarily known.  Even after libpq gives you
the green light, that transaction could fail an arbitrary length of
time later, and you can't rely on the assumption it's valid until
you've done some synchronizing with the other 'masters'.  Maybe you
don't need to rely on that assumption, so a 'fix it later, or possibly
never' methodology works well.  Those cases are unfortunately fairly
rare in the real world.

Multi-master replication, at least in implementations that don't hold
locks and withhold the transaction until you've got a guarantee it's
valid and will stay valid, is fundamentally incompatible with SQL.  I
know some people do some cool, usable things with that stuff, but the
whole concept seems awfully awkward to me.  I suppose I'm a crotchety,
cane-shaking fundamentalist, but the old-school approach of dividing
work logically and developing communication protocols is often the
best approach to take.

merlin



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Simon Riggs
On Tue, May 3, 2011 at 3:16 PM, Mark Johnson
m...@remingtondatabasesolutions.com wrote:

 The contents of this email may not be copied or forwarded in part or in
 whole without the express written consent of the author.

Pleased to meet you Mark.

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Mark Johnson

Is there a particular one of Oracle's memory clearing features you want to use 
in PostgreSQL? In Oracle you cannot flush the entire SGA without a restart, but 
you can flush three parts of the SGA using three separate commands.
1. In Oracle you can flush the redo buffer by issuing a COMMIT or by rotating 
the logs. You can force a log switch in PostgreSQL using select 
pg_switch_xlog();.
2. In Oracle when you flush the shared pool this does three things: (a) removes 
sql and pl/sql statements from the shared library cache, (b) flushes the 
dictionary cache of object info and security data, and (c) flushes the query 
result cache (11g only). I am relatively new to PostgreSQL and have not seen an 
equivalent in PostgreSQL to these things. Based on other replies it does not 
seem possible to flush the catalog cache in PostgreSQL.
3. In Oracle when you request a flush of the buffer cache it signals a 
checkpoint to ensure all dirty buffers are written out AND later it will remove 
the dirty buffers from memory. This can take anywhere from a few seconds on 
very small systems to several minutes on VLDB systems, per my observations. The 
Oracle checkpoint is fast, and the SQL prompt comes back very fast, but the 
removal of dirty buffers from memory runs in the background with a low priority 
over a long period of time. If you are planning to use alter system flush 
buffer_cache to clear memory in between tests you actually have no way to know 
when memory is clear except to wait a long time and then assume all is well 
(yes, this is also true with ASM and direct i/o to raw devices). In PostgreSQL, 
you can checkpoint manually to signal bgwriter to flush dirty pages to the 
operating system's cache and from there you will see a lazy write to disk 
(e.g., watch pdflush on linux), so immediately re-running a query will still 
get some caching benefits even though the checkpoint is complete. There are 
operating system commands that you could use for that (cat /proc/meminfo to 
see what's there, sync to write dirty pages to disk, then echo 3 > 
/proc/sys/vm/drop_caches to remove the now-clean pages, and then cat 
/proc/meminfo one more time). And, if you are using a SAN, consider array-based 
caching as well.
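A sketch of the PostgreSQL-side commands referred to above, for a test session (pg_switch_xlog() is the 8.x/9.x name; it was renamed pg_switch_wal() in PostgreSQL 10):

```sql
-- 1. Force a WAL segment switch (roughly analogous to rotating the redo log):
SELECT pg_switch_xlog();
-- 3. Flush all dirty shared buffers out to the operating system's cache:
CHECKPOINT;
```

The OS-level steps (sync, dropping the page cache) still have to be done at the shell, as described above.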
Sincerely,
Mark R. Johnson
Owner, Remington Database Solutions, LLC
Author, Oracle Database 10g: From Nuts to Soup



The contents of this email may not be copied or forwarded in part or in whole 
without the express written consent of the author.

-Original Message-
From: Raghavendra [mailto:raghavendra@enterprisedb.com]
Sent: Tuesday, May 3, 2011 08:31 AM
To: 'Simon Riggs'
Cc: 'raghu ram', 'pgsql-admin', 'pgsql-general', 'pgsql novice'
Subject: Re: [ADMIN] Can we Flush the Postgres Shared Memory ?


On Tue, May 3, 2011 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:
On Tue, May 3, 2011 at 11:54 AM, raghu ram raghuchenn...@gmail.com wrote:

 It may be a silly question, still out of curiosity I want to know, is there
 any possible way to flush the Postgres Shared Memory without restarting the
 cluster.
 In Oracle, we can flush the SGA, can we get the same feature here..
 Thanks in Advance.



The CHECKPOINT command will do this for you.


This command will empty the PSM...


Best Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/








Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Karsten Hilbert
On Tue, May 03, 2011 at 03:33:34PM +0100, Simon Riggs wrote:

  The contents of this email may not be copied or forwarded in part or in
  whole without the express written consent of the author.
 
 Pleased to meet you Mark.
 
 If you post here, the above disclaimer is not effective. Right now
 your words are being copied across the internet...

By typing / selecting a public list address, written consent
of the author can be assumed to exist implicitly ;-)

Karsten
-- 
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Simon Riggs
On Tue, May 3, 2011 at 2:30 PM, raghu ram raghuchenn...@gmail.com wrote:

 The CHECKPOINT command will do this for you.



 According to PostgreSQL documentation, whenever you execute CHECKPOINT in
 the database, it will flush the modified data files present in the Shared
 Buffers to the Disk.
            http://www.postgresql.org/docs/9.0/static/sql-checkpoint.html
 Does this clear the entire shared memory cache, and if I then execute a
 fresh SQL statement, will data be returned from disk??

No, but then you'd need to flush OS buffers and all disk caches as
well to make that effective.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Tom Lane
Karsten Hilbert karsten.hilb...@gmx.net writes:
 On Tue, May 03, 2011 at 03:33:34PM +0100, Simon Riggs wrote:
 If you post here, the above disclaimer is not effective. Right now
 your words are being copied across the internet...

 By typing / selecting a public list address written consent
 of the author can be assumed to exist implicitely ;-)

Nonetheless, corporate lawyers who insist on such disclaimers on all
email are idiots, and make their company's employees look like idiots
as well.  Every disclaimer on obviously-public mail hastens the day
when such disclaimers will have no legal force whatsoever (if indeed
there's any left to them now).

regards, tom lane



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Raymond O'Donnell

On 03/05/2011 16:08, Tom Lane wrote:

Karsten Hilbertkarsten.hilb...@gmx.net  writes:

On Tue, May 03, 2011 at 03:33:34PM +0100, Simon Riggs wrote:

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...



By typing / selecting a public list address written consent
of the author can be assumed to exist implicitely ;-)


Nonetheless, corporate lawyers who insist on such disclaimers on all
email are idiots, and make their company's employees look like idiots
as well.  Every disclaimer on obviously-public mail hastens the day
when such disclaimers will have no legal force whatsoever (if indeed
there's any left to them now).


I don't want to start a flame war, but did they ever have any legal 
force in the first place?


Ray.


--
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Peter Geoghegan
On 3 May 2011 16:49, Raymond O'Donnell r...@iol.ie wrote:
 I don't want to start a flame war, but did they every have any legal force
 in the first place?

No.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Mark Johnson

Yes, understood and agreed. My mail server adds it automatically. I can 
manually remove it prior to sending to the mailing list.
-Mark
-Original Message-
From: Simon Riggs [mailto:si...@2ndquadrant.com]
Sent: Tuesday, May 3, 2011 10:33 AM
To: 'Mark Johnson'
Cc: 'pgsql-admin', 'pgsql-general'
Subject: Re: [ADMIN] Can we Flush the Postgres Shared Memory ?

On Tue, May 3, 2011 at 3:16 PM, Mark Johnson wrote:

 The contents of this email may not be copied or forwarded in part or in
 whole without the express written consent of the author.

Pleased to meet you Mark.

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...

--
Simon Riggs   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] Switching Database Engines

2011-05-03 Thread Carlos Mennens
On Sat, Apr 30, 2011 at 4:29 AM, Greg Smith g...@2ndquadrant.com wrote:
 I wouldn't fight with this too much though.  Unless you have some really
 customized stuff in your wiki, there really is nothing wrong with the idea
 of dumping everything into XML, creating a blank PostgreSQL-backed MediaWiki
 install, then restoring into that.  That's what I always do in order to get
 a plain text backup of my server, and to migrate a wiki from one server to
 another.  There are all kinds of issues you could have left here before this
 works, trying to do a database-level export/reload--encoding, foreign key
 problems, who knows what else.  The database-agnostic export/import into XML
 avoids all of those.

Greg,

I'm with you and think that just doing an XML dump of the Wiki itself
is the best way to go. My question is when I do the XML dump as
follows:

/var/www/html/int/main/wiki/maintenance
[root@ideweb1 maintenance]# php dumpBackup.php --full  mw_wiki_2011_05_03.xml
PHP Warning:  PHP Startup: mcrypt: Unable to initialize module
Module compiled with module API=20050922, debug=0, thread-safety=0
PHPcompiled with module API=20060613, debug=0, thread-safety=0
[...]

So now I have a backup file of the Wiki on my old server running
MySQL. I have created the database and installed MediaWiki on the new
server using PostgreSQL as the backend. My question now is what are
your recommended steps in order to get the XML data imported on
MediaWiki using PostgreSQL? I know I also have to move the users since
the XML script / backup doesn't do anything in regards to the users.

Thanks again so much!



Re: [GENERAL] converting databases form SQL_ASCII to UTF8

2011-05-03 Thread Jasen Betts
On 2011-04-22, Geoffrey Myers g...@serioustechnology.com wrote:
 Vick Khera wrote:
 On Fri, Apr 22, 2011 at 11:00 AM, Geoffrey Myers 
 li...@serioustechnology.com mailto:li...@serioustechnology.com wrote:
 
 Here's our problem.  We planned on moving databases a few at a time.
 Problem is, there is a process that pushes data from one database to
 another.  If this process attempts to push data from a SQL_ASCII
 database to a new UTF8 database and it has one of these characters
 mentioned above, the process fails.
 
 
 The database's enforcement of the encoding should be the last layer that 
 does so.  Your applications should be enforcing strict utf-8 encoding 
 from start to finish.  Once this is done, and the old data already in 
 the DB is properly encoded as utf-8, then there should be no problems 
 switching on the utf-8 encoding in postgres to get that final layer of 
 verification.

 Totally agree.  Still, the question remains, why not leave it as SQL_ASCII?

perhaps you want sorted output in some locale other than 'C'?
or maybe want to take a substring in the database...

utf8 in SQL-ASCII is just a string of octets

utf8 in a utf8 database is a string of unicode characters.
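The difference shows up in character-aware functions. A quick illustration (behavior as observed in a UTF8 database):

```sql
-- In a UTF8 database these operate on characters:
SELECT length('héllo');                   -- 5 (characters)
SELECT substring('héllo' from 2 for 3);   -- 'éll'
-- In a SQL_ASCII database the same literal is just six octets, so
-- length() would report 6 and substring() could split the é in half.
```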

-- 
⚂⚃ 100% natural




Re: [GENERAL] Database recovery.

2011-05-03 Thread Jasen Betts
On 2011-03-24, Waqar Azeem waqarazeem.priv...@gmail.com wrote:
 --0015174766a0ffbf86049f35206e
 Content-Type: text/plain; charset=ISO-8859-1

 My XP is crashed and now I have to take a full backup of my postgresql 8.4

 I am used to get backup of ldf/mdf files in case of SQLServer

 Please let me know the right way of doing this for postgresql 8.4.

You need the data directory, and the service user postgres needs to
own its contents.  Consider using runas and xcopy.

-- 
⚂⚃ 100% natural




Re: [GENERAL] Bidirectional replication

2011-05-03 Thread John R Pierce

On 05/03/11 5:04 AM, tushar nehete wrote:
I started with Bucardo. I installed activeperl 5.12 on my 
Linux(RHEL5.5) server.


why ActivePerl, which is usually used by MS Windows users, rather than 
the Perl built into RHEL 5.5? (btw, 5.6 is out now, you really should run 
'yum update').






Re: [GENERAL] converting databases form SQL_ASCII to UTF8

2011-05-03 Thread Geoffrey Myers

Jasen Betts wrote:

On 2011-04-22, Geoffrey Myers g...@serioustechnology.com wrote:

Vick Khera wrote:
On Fri, Apr 22, 2011 at 11:00 AM, Geoffrey Myers 
li...@serioustechnology.com mailto:li...@serioustechnology.com wrote:


Here's our problem.  We planned on moving databases a few at a time.
Problem is, there is a process that pushes data from one database to
another.  If this process attempts to push data from a SQL_ASCII
database to a new UTF8 database and it has one of these characters
mentioned above, the process fails.


The database's enforcement of the encoding should be the last layer that 
does so.  Your applications should be enforcing strict utf-8 encoding 
from start to finish.  Once this is done, and the old data already in 
the DB is properly encoded as utf-8, then there should be no problems 
switching on the utf-8 encoding in postgres to get that final layer of 
verification.

Totally agree.  Still, the question remains, why not leave it as SQL_ASCII?


perhaps you want sorted output in some locale other than 'C'?
or maybe want to take a substring in the database...

utf8 in SQL-ASCII is just a string of octets

utf8 in a utf8 database is a string of unicode characters.



We finally have a solution in place. A bug in my code was making the 
problem bigger than it really is.  Gotta love those bugs.



--
Until later, Geoffrey

I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them.
- Thomas Jefferson



Re: [GENERAL] [ADMIN] Can we Flush the Postgres Shared Memory ?

2011-05-03 Thread Joshua D. Drake

On 5/3/2011 7:33 AM, Simon Riggs wrote:


Pleased to meet you Mark.

If you post here, the above disclaimer is not effective. Right now
your words are being copied across the internet...



I believe our community needs to move past posting replies like this. It 
isn't even relevant to the context of his question and makes us look 
like a bunch of ideological buffoons.


JD



Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Jeff Davis
On Mon, 2011-05-02 at 20:06 -0600, Rob Sargent wrote:
 Jeff Davis wrote:
  In particular, I think you are falsely assuming that a natural key must
  be generated from an outside source (or some source outside of your
  control), and is therefore not reliably unique.
 
  You can generate your own keys...

...

 My wife works (at the sql level) with shall we say records about 
 people.  Real records, real people.  Somewhere around 2 million unique 
 individuals, several million source records.  They don't all have ssn, 
 they don't all have a drivers license.  They don't all have an address, 
 many have several addresses (especially over time) and separate people 
 have at one time or another lived at the same address.  You would be 
 surprise how many bob smiths where born on the same day.  But then 
 they weren't all born in a hospital etc etc etc.  A person may present 
 on any of a birth record, a death record, a hospital record, a drivers 
 license, a medical registry, a marriage record and so on.  There simply 
 is no natural key for a human.  We won't even worry about the 
 non-uniqueness of ssn. And please don't get her started on twins. :) 
 
 
 I can only imagine that other equally complex entities are just as 
 slippery when it comes time to pinpoint the natural key.

I think you missed my point. You don't have to rely on natural keys that
come from somewhere else; you can make up your own, truly unique
identifier.
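Jeff's suggestion can be sketched in SQL (the table and column names are invented for illustration): the database generates an identifier under your own control, independent of any outside source.

```sql
-- A made-up, guaranteed-unique identifier you control:
CREATE TABLE person (
    person_id  bigserial PRIMARY KEY,  -- generated internally, reliably unique
    full_name  text NOT NULL,
    born_on    date                    -- real-world attributes need not be unique
);
```

Whether such a generated value counts as a surrogate or a natural key is exactly what the rest of the thread debates.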

Regards,
Jeff Davis




Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Rob Sargent



On 05/03/2011 12:51 PM, Jeff Davis wrote:

On Mon, 2011-05-02 at 20:06 -0600, Rob Sargent wrote:

Jeff Davis wrote:

In particular, I think you are falsely assuming that a natural key must
be generated from an outside source (or some source outside of your
control), and is therefore not reliably unique.

You can generate your own keys...


...


My wife works (at the sql level) with shall we say records about
people.  Real records, real people.  Somewhere around 2 million unique
individuals, several million source records.  They don't all have ssn,
they don't all have a drivers license.  They don't all have an address,
many have several addresses (especially over time) and separate people
have at one time or another lived at the same address.  You would be
surprised how many bob smiths were born on the same day.  But then
they weren't all born in a hospital etc etc etc.  A person may present
on any of a birth record, a death record, a hospital record, a drivers
license, a medical registry, a marriage record and so on.  There simply
is no natural key for a human.  We won't even worry about the
non-uniqueness of ssn. And please don't get her started on twins. :)


I can only imagine that other equally complex entities are just as
slippery when it comes time to pinpoint the natural key.


I think you missed my point. You don't have to rely on natural keys that
come from somewhere else; you can make up your own, truly unique
identifier.

Regards,
Jeff Davis

Sorry, but I'm confused, but that's common.  Isn't a natural key to be 
composed solely from the attributes of the entity?  As in a subset of the 
columns of the table in a third-normalish world. Isn't tacking on 
another column with a concocted id joining the pervasiveness?




Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Jeff Davis
On Tue, 2011-05-03 at 13:35 -0600, Rob Sargent wrote:
 Sorry, but I'm confused, but that's common.  Isn't a natural key to be 
 composed solely from the attributes of the entity?  As in a subset of the 
 columns of the table in a third-normalish world. Isn't tacking on 
 another column with a concocted id joining the pervasiveness?

Not in my opinion. Before cars existed, there was no driver's license
number. The DMV (as it's called in California, anyway) created it, and
it's now a key that they can trust to be unique. It's also an attribute
of the entity now, because it's printed on the cards you hand to people.

The thing that I think is a mistake is to use generated IDs like an
internal implementation detail (i.e. hide them like pointers); then at
the same time mix them into the data model.

Regards,
Jeff Davis






Re: [GENERAL] postgres segfaulting on pg_restore

2011-05-03 Thread Tom Lane
Chris Curvey ch...@chriscurvey.com writes:
 and, FWIW, here's another trace, which is NEARLY the same as the first one I
 posted, with the difference being a slightly different line number at #3.  I
 will be quiet now and leave the brain trust to ponder.  Let me know if there
 is anything else I can get for you guys.

 #0  0x006ce317 in GetMemoryChunkSpace (pointer=0x292a0e0) at
 mcxt.c:264
 #1  0x006d3d56 in writetup_index (state=0x2281670, tapenum=value
 optimized out, stup=value optimized out) at tuplesort.c:2924

It occurred to me that a simple explanation for a core dump there would
be if something had scribbled past the end of the preceding palloc
chunk.  That would tend to clobber the context link of the palloc
chunk after it, which would send GetMemoryChunkSpace off into
never-never land following a trashed pointer.

That doesn't get us very much closer to a solution, but it does suggest
that you might learn something if you try this with an assert-enabled
build (configure --enable-cassert).  Also, are there any contrib modules
or third-party add-on modules or home-brew C functions that you're using?

regards, tom lane



Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Jeff Davis
On Mon, 2011-05-02 at 23:07 -0400, Greg Smith wrote:
 I see this whole area as being similar to SQL injection.  The same way 
 that you just can't trust data input by the user to ever be secure, you 
 can't trust inputs to your database will ever be unique in the way you 
 expect them to be. 

So, don't trust them to be unique then. Make up your own unique
identifier, and use that.

Regards,
Jeff Davis




[GENERAL] Question on Wal time lines

2011-05-03 Thread dabicho
For restoring a database from wal files, if I omit a target in the
recovery.conf file, can I make it so the database continues the time line
instead of starting a new one?
Or is there a tool to pick the most recent time line from a bunch of wal
files?

Thank you.


Re: [GENERAL] Question on Wal time lines

2011-05-03 Thread John R Pierce

On 05/03/11 3:07 PM, dabicho wrote:


For restoring a database from wal files, if I omit a target on the 
recovery.conf file, can I make it so the database continues the time 
line instead of starting one?
Or is there a tool to pick the most recent time line from a bunch of 
wal files?


thankyou.



you need to playback all the wal files from when you started the base 
backup.  of course, you need that base backup, too.








Re: [GENERAL] Question on Wal time lines

2011-05-03 Thread dabicho
On May 3, 2011 5:59 p.m., John R Pierce pie...@hogranch.com wrote:

 On 05/03/11 3:07 PM, dabicho wrote:


 For restoring a database from wal files, if I omit a target on the
recovery.conf file, can I make it so the database continues the time line
instead of starting one?
 Or is there a tool to pick the most recent time line from a bunch of wal
files?

 thankyou.


 you need to playback all the wal files from when you started the base
backup.  of course, you need that base backup, too.

I did that.
I restored the database, put the recovery file in place along with previous
wal files and the last wal files, and after startup there was a new time
line.
Am I missing something?
This is postgres 9.0
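For reference, a minimal recovery.conf for this scenario might look like the sketch below (the archive path is invented). Note that PostgreSQL assigns a new timeline ID whenever an archive recovery completes, so seeing a new timeline after a successful restore is expected behavior, not a sign that WAL files were missed:

```
restore_command = 'cp /mnt/server/archive/%f %p'
# Recover along the newest timeline found in the archive:
recovery_target_timeline = 'latest'
# With no recovery_target_* settings, recovery runs to the end of the
# available WAL -- but a new timeline ID is still assigned when it finishes.
```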






Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Rob Sargent



On 05/03/2011 03:08 PM, Jeff Davis wrote:

On Tue, 2011-05-03 at 13:35 -0600, Rob Sargent wrote:

Sorry, but I'm confused, but that's common.  Isn't a natural key to be
composed solely from the attributes of the entity?  As in a subset of the
columns of the table in a third-normalish world. Isn't tacking on
another column with a concocted id joining the pervasiveness?


Not in my opinion. Before cars existed, there was no driver's license
number. The DMV (as it's called in California, anyway) created it, and
it's now a key that they can trust to be unique. It's also an attribute
of the entity now, because it's printed on the cards you hand to people.

The thing that I think is a mistake is to use generated IDs like an
internal implementation detail (i.e. hide them like pointers); then at
the same time mix them into the data model.

Regards,
Jeff Davis



Well yes it does all depend on how you model things after all. I think a 
drivers license is an attribute of a driver, not a person. So before cars, 
one still had a hard time coming up with a natural key on person.  Of 
course California's DMV only cares about California-licensed drivers, 
so they get to generate and assign the license number as an arbitrary key for 
drivers, 'cause under that we're back to person.




Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Greg Smith

Merlin Moncure wrote:

If your data modeler made the assumption that a MAC is
unique (a mistake, obviously), at least the other tables are protected
from violations of that assumption because the database would reject
them with an error, which is a Good Thing.  Without a uniqueness
constraint you now have ambiguous data which is a Very Bad Thing.
  


With a uniqueness constraint in this situation, the unexpected data--row 
with a non unique MAC--will be rejected and possibly lost when the 
insertion happens.  You say that's a good thing, plenty of people will 
say that's the worst possible thing that can happen.  When dealing with 
external data, it's often impossible to know everything you're going to 
see later at design time.  Approaching that problem with the idea that 
you're going to lose any data that doesn't fit into the original model 
is not what everyone finds reasonable behavior.


I don't think it's possible to decide in a generic way which of these is 
the better approach:  to reject unexpected data and force the problem 
back at the application immediately (commit failure), or to accept it 
because you're using a surrogate key and discover the problems down 
the line.  Both are valid approaches with a very different type of risk 
associated with them.  I think it's fair to say that real-world data is 
not always well known enough at design time to follow the idea you're 
suggesting though, and that does factor into why there is such a 
preference for surrogate keys in the industry.
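The two designs being weighed can be sketched as follows (table and column names are invented for illustration):

```sql
-- Design 1: enforce the natural key; duplicate MACs are rejected at
-- insert time (and possibly lost if the application discards the error).
CREATE TABLE device_strict (
    mac_addr macaddr PRIMARY KEY,
    label    text
);

-- Design 2: surrogate key, no uniqueness constraint; duplicates are
-- accepted now and must be detected and reconciled later.
CREATE TABLE device_lax (
    device_id serial PRIMARY KEY,
    mac_addr  macaddr NOT NULL,
    label     text
);
```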


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread Greg Smith

Jeff Davis wrote:

On Mon, 2011-05-02 at 23:07 -0400, Greg Smith wrote:
  
I see this whole area as being similar to SQL injection.  The same way 
that you just can't trust data input by the user to ever be secure, you 
can't trust that inputs to your database will ever be unique in the way you 
expect them to be. 



So, don't trust them to be unique then. Make up your own unique
identifier, and use that.
  


If you're making up your own unique identifier, that's closer to a 
surrogate key as far as I'm concerned, even though it doesn't fit the 
strict definition of that term (it doesn't have the subtle idea that 
surrogate implies meaningless).  Now, there is some value to doing 
that well, instead of just using the typical incrementing integer 
pointer approach, as you've called it.  But if it's not derived from 
external data you're storing anyway, it's not a true natural key either.
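A hedged sketch of the "make up your own unique identifier" approach Jeff 
describes, again using SQLite only for self-containment (a UUID is just one 
way to mint such a key; the names are illustrative):

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The made-up key is unique by construction, independent of any external data
# (such as a MAC) that may turn out not to be unique after all.
cur.execute("CREATE TABLE widget (id TEXT PRIMARY KEY, mac TEXT, name TEXT)")
for mac, name in [("00:11:22:33:44:55", "router-a"),
                  ("00:11:22:33:44:55", "router-b")]:
    cur.execute("INSERT INTO widget VALUES (?, ?, ?)",
                (str(uuid.uuid4()), mac, name))

# Both rows insert cleanly under distinct made-up keys.
n = cur.execute("SELECT count(DISTINCT id) FROM widget").fetchone()[0]
print(n)  # 2
```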


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [GENERAL] pervasiveness of surrogate (also called synthetic) keys

2011-05-03 Thread David Johnston


On May 3, 2011, at 22:03, Greg Smith g...@2ndquadrant.com wrote:

 Merlin Moncure wrote:
 If your data modeler made the assumption that a MAC is
 unique (a mistake, obviously), at least the other tables are protected
 from violations of that assumption, because the database would reject
 them with an error, which is a Good Thing.  Without a uniqueness
 constraint you now have ambiguous data, which is a Very Bad Thing.
  
 
 With a uniqueness constraint in this situation, the unexpected data--a row with 
 a non-unique MAC--will be rejected, and possibly lost, when the insertion 
 happens.  You say that's a good thing; plenty of people will say that's the 
 worst possible thing that can happen.  When dealing with external data, it's 
 often impossible to know everything you're going to see later at design time. 
  Approaching that problem with the idea that you're going to lose any data 
 that doesn't fit into the original model is not what everyone finds 
 reasonable behavior.
 
 I don't think it's possible to decide in a generic way which of these is the 
 better approach:  to reject unexpected data and force the problem back at the 
 application immediately (commit failure), or to accept it, because 
 you're using a surrogate key, and discover the problems down the line.  Both 
 are valid approaches with a very different type of risk associated with them. 
  I think it's fair to say that real-world data is not always well known 
 enough at design time to follow the idea you're suggesting, though, and that 
 does factor into why there is such a preference for surrogate keys in the 
 industry.
 

If you implicitly expect MAC to be unique but do not enforce it then you are 
likely to write queries that rely upon that uniqueness.  If you do enforce it 
then the constraint failure occurs anyway.

A scalar sub-query that links via the MAC will fail when the duplicate data is 
encountered, and normal queries will return too many records.  In a detail 
report the problem may be obvious, but if you are summarizing the data, the 
specific offending record is going to require some effort to find.

I guess if you are the compromising type you can loosely enforce the uniqueness 
by running a check query periodically to see if supposedly unique values have 
been duplicated.
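Such a periodic check can be a simple GROUP BY ... HAVING query; a sketch 
against an invented schema (SQLite used only to make it self-contained -- 
the same SQL runs unchanged on Postgres):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE device (mac TEXT, hostname TEXT)")  # no constraint
cur.executemany("INSERT INTO device VALUES (?, ?)",
                [("aa:bb", "h1"), ("aa:bb", "h2"), ("cc:dd", "h3")])

# Report any supposedly-unique values that have in fact been duplicated.
dupes = cur.execute(
    "SELECT mac, count(*) FROM device GROUP BY mac HAVING count(*) > 1"
).fetchall()
print(dupes)  # [('aa:bb', 2)]
```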

I agree there is no right answer - the designer needs to make trade-offs - but 
I'd rather reject new data and leave the system in a status-quo stable state 
than introduce invalid data and put the system into a state where it requires 
effort to get it functioning again.  If you accept the invalid data, the likely 
scenario when something breaks is that someone finds the offending record and 
removes it until the application and database can be fixed properly - which is 
where we are at with validation.  The common exception is where identifiers are 
reused over time and you remove the old record in order to allow the newer 
record to remain.

On a tangential course, I've started considering a setup whereby you basically 
have two identifiers for a record.  One is end-user facing and updatable, 
whereas the other is static and used in intra-table relations.  You can create 
a new record with the same user-facing id as an existing one, but the existing 
id will be replaced with its system id.  This is useful when users will be 
using the id often and it can be reasonably assumed to be unique over a 
moderate period of time (say, a year).  Invoice numbers and customer numbers 
are two common examples.  The lookup id itself may require additional fields 
in order to qualify as a primary (natural) key, but the static key wants to be 
a single field.  Often simply putting a date with the original id (and parent 
identifiers) is sufficient, due to the infrequency of updates.  The downside 
is, with string-based parent identifiers the pk value can be quite long.  I 
currently have PKs of length 40-50, but during my new design my first pass on 
a couple of tables indicated a 100-character limit.
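A rough sketch of that two-identifier layout, with invented invoice tables 
(SQLite for self-containment; in Postgres sys_id would typically be a serial 
or bigserial column):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("""CREATE TABLE invoice (
    sys_id     INTEGER PRIMARY KEY,  -- static id, used by foreign keys
    invoice_no TEXT NOT NULL,        -- user-facing id, reusable over time
    issued_on  TEXT NOT NULL,
    UNIQUE (invoice_no, issued_on)   -- lookup id needs a qualifier (a date)
)""")
cur.execute("""CREATE TABLE invoice_line (
    invoice_sys_id INTEGER REFERENCES invoice(sys_id),  -- short, single-field FK
    amount         REAL
)""")

# Same user-facing number, reused a year later; each gets its own system id.
cur.execute("INSERT INTO invoice (invoice_no, issued_on) "
            "VALUES ('INV-1001', '2010-05-03')")
cur.execute("INSERT INTO invoice (invoice_no, issued_on) "
            "VALUES ('INV-1001', '2011-05-03')")

n = cur.execute(
    "SELECT count(*) FROM invoice WHERE invoice_no = 'INV-1001'"
).fetchone()[0]
print(n)  # 2
```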

Are there any rules of thumb on the performance of a PK as a function of key 
length?  I like using varchar-based identifiers, since I tend to query tables 
directly and writing where clauses is much easier if you can avoid the joins.  
I'm likely better off creating views and querying those, but am still curious 
about any basic thoughts on having a 100+ length primary key.

David J.




[GENERAL] full_page_writes makes no difference?

2011-05-03 Thread Tian Luo

Hi guys,

No matter whether I turn full_page_writes on or off, I always
observe 8192-byte writes of log data for simple write operations
(write/update).

But according to the documentation, when this is off it can speed up
operations but may cause problems during recovery.  So I would guess it
writes less when the option is turned off.  However, this
contradicts my observations.

If I am not missing anything, I find that the writes of log data go
through function XLogWrite in source file
backend/access/transam/xlog.c.

In this file, log data are written with the following code:

from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
nbytes = npages * (Size) XLOG_BLCKSZ;
if (write(openLogFile, from, nbytes) != nbytes)
{
    ...
}

So, nbytes should always be multiples of XLOG_BLCKSZ, which in the
default case, is 8192.

My question is: if it always writes full pages whether
full_page_writes is on or off, what is the difference?

Thanks!

Regards,
- Tian




[GENERAL] Needs Suggestion

2011-05-03 Thread SUBHAM ROY
My output of explain (analyze,buffers) is something like this:

shared hit=3796624 read=46038

So what is meant by *read* here?  Does it indicate the number of disk reads?

Does *shared hit* take into account only hits in Postgres's shared buffers (as
shown by pg_buffercache), or does it also take into account the *Linux buffers*
that Postgres uses?

-- 
Thank You,
Subham Roy,
CSE IIT Bombay.