Re: [PERFORM] slony rpm help slony1-95-2.2.2-1.rhel6.x86_64

2016-07-16 Thread Jan Wieck
On Fri, Jun 3, 2016 at 7:03 PM, avi Singh <avisingh19811...@gmail.com>
wrote:

> Hi All
>  Can anyone please point me to a location from where I can get the
> slony1-95-2.2.2-1.rhel5.x86_64
> <http://yum.postgresql.org/9.5/redhat/rhel-5-x86_64/slony1-95-2.2.4-4.rhel5.x86_64.rpm>
>   rpm.
>

There should not be one, since Slony-I v2.2.2 does not compile against
PostgreSQL 9.5; 9.5 requires at least Slony-I v2.2.4. I recommend upgrading
Slony to 2.2.5 first.


Regards, Jan




> I'm upgrading the database from version 9.3 to 9.5. The rpm version we
> are currently using is slony1-93-2.2.2-1.el5.x86_64, and the one that is
> available on the postgresql website for 9.5 is slony1-95-2.2.4-4.rhel5.x86_64
> <http://yum.postgresql.org/9.5/redhat/rhel-5-x86_64/slony1-95-2.2.4-4.rhel5.x86_64.rpm>
>   which is not compatible and throws an error when I test the upgrade.
> In the past I was able to find the 2.2.2-1 version rpm for previous
> versions on the postgres website, but not this time for postgresql 9.5.
>
>
>
> Thanks
> Avi
>
>
>
>
>


-- 
Jan Wieck
Senior Postgres Architect


Re: [PERFORM] Performance degradation after successive UPDATE's

2005-12-06 Thread Jan Wieck

On 12/6/2005 4:08 AM, Assaf Yaari wrote:

Thanks Bruno,

Issuing VACUUM FULL seems to have no influence on the time.
I've added a VACUUM ANALYZE every 100 UPDATEs to my script and run the
test again (on a different record), and the time still increases.


I think he meant

- run VACUUM FULL once,
- adjust the FSM settings to the database size and turnover ratio,
- run VACUUM ANALYZE more frequently from there on.
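In other words, a minimal sketch (the table name and the FSM values are
placeholders to be sized to the actual database; changing the FSM settings
requires a postmaster restart on 8.0):

-- one-time cleanup of the already bloated table
VACUUM FULL VERBOSE counters;

-- in postgresql.conf, size the free space map so it can remember all
-- dead tuples produced between vacuum runs:
--   max_fsm_relations = 1000
--   max_fsm_pages     = 500000
-- a rough starting point for max_fsm_pages:
SELECT sum(relpages) / 2 FROM pg_class;

-- from there on, frequent plain vacuums keep the table from growing
VACUUM ANALYZE counters;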


Jan



Any other ideas?

Thanks,
Assaf. 


-Original Message-
From: Bruno Wolff III [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 05, 2005 10:36 PM

To: Assaf Yaari
Cc: pgsql-performance@postgresql.org
Subject: Re: Performance degradation after successive UPDATE's

On Mon, Dec 05, 2005 at 19:05:01 +0200,
  Assaf Yaari [EMAIL PROTECTED] wrote:
 Hi,
  
 I'm using PostgreSQL 8.0.3 on Linux RedHat WS 3.0.
  
 My application updates counters in the DB. I left a test running over
 night that increased the counter of a specific record. After the night's
 run (several hundreds of thousands of updates), I found out that the time
 spent on an UPDATE had increased to more than 1.5 seconds (at the
 beginning it was less than 10ms)! Issuing VACUUM ANALYZE and even a
 reboot didn't seem to solve the problem.


You need to be running vacuum more often to get rid of the 
deleted rows (update is essentially insert + delete). Once 
you get too many, plain vacuum won't be able to clean them up 
without raising the value you use for FSM. By now the table 
is really bloated and you probably want to use vacuum full on it.







--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



Re: [PERFORM] 15,000 tables - next step

2005-12-05 Thread Jan Wieck

On 12/4/2005 4:33 AM, Michael Riess wrote:

I will do the following:

- switch to 10k buffers on a 1GB machine, 20k buffers on a 2GB machine
- try to optimize my connection pools to remember which apps (groups of 
30 tables) were accessed, so that there is a better chance of using caches
- swap out tables which are rarely used: export the content, drop the 
table, and re-create it on the fly upon access.


I hacked pgbench a little and did some tests (finally had to figure out 
for myself if there is much of an impact with hundreds or thousands of 
tables).


The changes done to pgbench:

- Use the [-s n] value always, instead of determining the
  scaling from the DB.

- Lower the number of accounts per scaling factor to 10,000.

- Add another scaling type. Option [-a n] splits up the test
  into n schemas, each containing [-s n] branches.
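To make the [-a n] scaling concrete, this is roughly the layout it creates
(schema names and the exact column lists here are illustrative, not taken
from the patched pgbench; the classic pgbench tables are branches, tellers,
accounts and history):

-- each of the n schemas holds its own copy of the four pgbench tables,
-- so -a3000 yields 3000 * 4 = 12,000 tables in one database
CREATE SCHEMA bench_0001;
CREATE TABLE bench_0001.branches (bid int PRIMARY KEY, bbalance int,
                                  filler char(88));
CREATE TABLE bench_0001.tellers  (tid int PRIMARY KEY, bid int,
                                  tbalance int, filler char(84));
CREATE TABLE bench_0001.accounts (aid int PRIMARY KEY, bid int,
                                  abalance int, filler char(84));
CREATE TABLE bench_0001.history  (tid int, bid int, aid int, delta int,
                                  mtime timestamp, filler char(22));
-- ... and likewise for bench_0002 through bench_3000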

The tests were performed on a 667 MHz P3 with 640 MB RAM and a single IDE
disk. All tests were I/O bound. In all tests the number of clients was 5
for the default transaction test and 50 for the read-only test (option -S).
The FreeBSD kernel of the system is configured to handle up to 50,000 open
files, to fully cache directories in virtual memory and to lock all shared
memory into physical RAM.


The different scalings used were

init -a1 -s3000
run  -a1 -s300

and

init -a3000 -s1
run  -a300 -s1

The latter creates a database of 12,000 tables with 1,200 of them 
actually in use during the test. Both databases are about 4 GB in size.


The performance loss for going from -s3000 to -a3000 is about 10-15%.

The performance gain for going from 1,000 shared_buffers to 48,000 is 
roughly 70% (-a3000 test case) and 100% (-s3000 test case).


Conclusion: The right shared memory configuration easily outweighs the
loss from the increase in the number of tables, given that the kernel is
configured to be up to the task of dealing with thousands of files
accessed by that many backends.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



Re: [PERFORM] 15,000 tables - next step

2005-12-03 Thread Jan Wieck

On 12/2/2005 6:01 PM, Michael Riess wrote:


Hi,

thanks for your comments so far - I appreciate it. I'd like to narrow 
down my problem a bit:


As I said in the other thread, I estimate that only 20% of the 15,000 
tables are accessed regularly. So I don't think that vacuuming or the 
number of file handles is a problem. Have a look at this:


What makes you think that? Have you at least tried to adjust your shared
buffers, free space map settings and background writer options to values
that match your DB? How does increasing the kernel file descriptor
limit (try the current limit times 5 or 10) affect your performance?
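For reference, a quick way to check where those settings currently stand
(GUC names as of 8.0; the background writer knobs were renamed in later
releases):

SHOW shared_buffers;
SHOW max_fsm_relations;
SHOW max_fsm_pages;
SHOW bgwriter_delay;
SHOW bgwriter_percent;
SHOW bgwriter_maxpages;
-- and a rough basis for sizing max_fsm_pages to this database:
SELECT sum(relpages) / 2 FROM pg_class;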



Jan






content2=# select relpages, relname from pg_class order by relpages desc 
limit 20;

 relpages |             relname
----------+---------------------------------
    11867 | pg_attribute
    10893 | pg_attribute_relid_attnam_index
     3719 | pg_class_relname_nsp_index
     3310 | wsobjects_types
     3103 | pg_class
     2933 | wsobjects_types_fields
     2903 | wsod_133143
     2719 | pg_attribute_relid_attnum_index
     2712 | wsod_109727
     2666 | pg_toast_98845
     2601 | pg_toast_9139566
     1876 | wsod_32168
     1837 | pg_toast_138780
     1678 | pg_toast_101427
     1409 | wsobjects_types_fields_idx
     1088 | wso_log
      943 | pg_depend
      797 | pg_depend_depender_index
      737 | wsod_3100
      716 | wp_hp_zen

I don't think that postgres was designed for a situation like this, 
where a system table that should be fairly small (pg_attribute) is this 
large.





--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



Re: [PERFORM] 15,000 tables - next step

2005-12-03 Thread Jan Wieck

On 12/3/2005 11:41 AM, Michael Riess wrote:


Alvaro Herrera schrieb:

Michael Riess wrote:

Shared memory ... I currently use 1500 buffers for 50 connections, and 
performance really suffered when I used 3000 buffers. The problem is 
that it is a 1GB machine, and Apache + Tomcat need about 400MB.


Well, I'd think that's where your problem is.  Not only do you have a
(relatively speaking) small server -- you also share it with other
very-memory-hungry services!  That's not a situation I'd like to be in.
Try putting Apache and Tomcat elsewhere, and leave the bulk of the 1GB
to Postgres.


No can do. I can try to switch to a 2GB machine, but I will not use 
several machines. Not for a 5GB database. ;-)


What version of PostgreSQL are we talking about? If it is anything older 
than 8.0, you should upgrade at least to that. With 8.0 or better try 
2 shared buffers or more. It is well possible that going from 1500 
to 3000 buffers made things worse. Your buffer cache can't even hold the 
system catalog in shared memory. If those 50 backends serve all those 
500 apps at the same time, they suffer from constant catalog cache 
misses and don't find the entries in the shared buffers either.
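A quick back-of-the-envelope check for that (one shared buffer holds one
8 kB page):

-- how many pages does the system catalog alone occupy?
SELECT sum(c.relpages) AS catalog_pages
  FROM pg_class c, pg_namespace n
 WHERE c.relnamespace = n.oid
   AND n.nspname = 'pg_catalog';

SHOW shared_buffers;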



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



Re: [PERFORM] 15,000 tables

2005-12-02 Thread Jan Wieck

On 12/1/2005 2:34 PM, Michael Riess wrote:

VACUUM FULL was probably always overkill, unless "always" includes
versions prior to 7.3...


Well, we tried switching to daily VACUUM ANALYZE and weekly VACUUM FULL, 
but the database got considerably slower near the end of the week.


This indicates that you have FSM settings that are inadequate for that 
many tables and eventually the overall size of your database. Try 
setting those to


max_fsm_relations = 8
max_fsm_pages = (select sum(relpages) / 2 from pg_class)

Another thing you might be suffering from (depending on the rest of your 
architecture) is file descriptor limits. Especially if you use some sort 
of connection pooling or persistent connections like PHP, you will have 
all the backends serving multiple of your logical applications (sets of 
30 tables). If on average one backend is called for 50 different apps, 
then we are talking 50*30*4=6000 files accessed by that backend. 80/20 
rule leaves 1200 files in access per backend, thus 100 active backends 
lead to 120,000 open (virtual) file descriptors. Now add to that any 
files that a backend would have to open in order to evict an arbitrary 
dirty block.


With a large shared buffer pool and slightly more aggressive background
writer settings, you can mostly avoid regular backends having to evict
dirty blocks.


If the kernel settings allow Postgres to keep that many file descriptors 
open, you avoid directory lookups.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



Re: [PERFORM] Perfomance of views

2005-10-27 Thread Jan Wieck

On 10/27/2005 7:29 AM, Richard Huxton wrote:


Don't forget to CC the list


Svenne Krap wrote:

What do you mean exactly by pushing conditions inside?


If I have something like SELECT * FROM complicated_view WHERE foo = 7 
then the planner can look inside complicated_view and see where it can 
attach the condition foo=7, rather than running the query and applying 
the condition at the end.


Sorry, but the planner doesn't attach the condition anywhere. It is the
rewriter that takes the actual query, replaces the view's rangetable and
expression entries with the actual underlying objects and adds the view's
condition with an AND to the query's condition. Simple example:


Given a view

create view v1 as select a1, b1, c2 from t1, t2 where a1 = a2;

The statement

select * from v1 where b1 = 'foo';

will result in a parsetree equivalent to what you would get if the 
original query was


select a1, b1, c2 from t1, t2 where (b1 = 'foo') and (a1 = a2);

It is the planner's and optimizer's job to recognize where in the
execution plan it can push qualifications down into filters or even
scankeys. The planner should be able to realize that


select * from v1 where a1 = 42;

is in fact equivalent to

select a1, b1, c2 from t1, t2 where a1 = 42 and a1 = a2;

as well as

select a1, b1, c2 from t1, t2 where a1 = 42 and a1 = a2 and a2 = 42;

This very last addition of a2 = 42 because of a2 = a1 = 42 allows it 
to put a constant scankey onto the scan of t2. The 8.0 planner does 
that, so the resulting query plan for the last three selects above is 
absolutely identical.




There are cases where it is safe for the planner to do this, but it 
isn't smart enough to do so.


Example?


Jan



I don't think I will have the option of testing on the full queries, as
these take many days to write (the current ones they are replacing on
MSSQL take up more than 5 kB of query text). The current ones are
nightmares from a maintenance standpoint.


Hmm - it sounds like they would be.

Basically what the application is doing is selecting some base data from
the large table for a point in time (usually a quarter) and selecting
all matching auxiliary data from the other tables. They are made in a
time-travel-like manner with a first and last usable date.


The ways I have considered were:
1) write a big query by hand (not preferred as it gets hard to manage)


Agreed.

2) write layers of views (still not preferred as I still have to remember
to put on the right conditions everywhere)


This is what I'd probably do, but of course I don't have full 
information about your situation.


3) write layers of sql-functions (returning the right sets of rows from
the underlying tables) - which I prefer from a development angle .. it
gets very clean and I can't forget a parameter anywhere.


But I seem to remember (and I have used PGSQL in production since 7.0) 
that the planner has some problems with solution 3 (i.e. estimating the 
cost and rearranging the query), but frankly that would be the way I 
would like to go.


Well, 8.x can inline a simple sql function into a larger query, but it 
doesn't sound like that will be enough in your case. Once a function 
becomes a black box then there's not much the planner can do to figure 
out what to do.


Based on the current (non-optimal) design and hardware constraints, I
still have to make sure the query runs fairly optimally - that means the
planner must use indexes intelligently and other stuff as if it was
(well-)written using solution 1.


Well, #1,#2 are likely to be the most efficient, but you won't know for 
sure about #2 until you test it.


There are a couple of other options though:

#4 - Write a set-returning function that breaks the query into steps and 
executes each in turn. So - fetch IDs from the main table in step 1 and 
store them in a temporary table, join other tables in later steps.


#5 - Write a function that writes your big query for you and either 
returns the SQL to your application, or runs it and returns the results.


What do you think of the three solutions? And is there some resource
about the planner's capabilities for someone like me (that is, very used
to writing reasonably fast and complex SQL, can read C code, but does not
really want to dig into the source code)?


There is some stuff in the Internals section of the manuals and it 
might be worth rummaging around on http://techdocs.postgresql.org


--
   Richard Huxton
   Archonet Ltd




--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-06-06 Thread Jan Wieck

On 6/6/2005 2:12 PM, PFC wrote:




Please pardon my ignorance, but from whatever I had heard, mysql was
supposedly always faster than postgres. That's why I was so surprised!


	I heard a lot of this too, so much it seems common wisdom that postgres 
is slow... well maybe some old version was, but it's getting better at  
every release, and the 8.0 really delivers...


The harder it is to evaluate software, the less often people reevaluate 
it and the more often people just copy opinions instead of doing an 
evaluation at all.


Today there are a gazillion people out there who know that MySQL is 
faster than PostgreSQL. They don't know under what circumstances it is, 
or what the word circumstances means in this context anyway. When you 
ask them when was the last time they actually tested this you get in 
about 99% of the cases an answer anywhere between 3 years and infinity 
(for all those who never did). The remaining 1% can then be reduced to 
an insignificant minority by asking how many concurrent users their test 
simulated.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #



Re: [PERFORM] column name is LIMIT

2005-03-21 Thread Jan Wieck
On 3/14/2005 4:26 AM, Qingqing Zhou wrote:
So is it to make the SQL parser context-sensitive - say, the parser will
understand that in the statement "SELECT * from LIMIT", LIMIT is just a
table name instead of a keyword?
More or less, yes. To use a reserved keyword as an identifier (table or 
column name), it must be enclosed in double quotes. Double quotes are 
also used to make identifiers case sensitive. So

select "someval", "SOMEVAL", "someVAL" from "user";
is a valid query retrieving 3 distinct columns from the table "user".
There is a builtin function quote_ident() in PostgreSQL that is supposed 
to return a properly quoted string allowed as an identifier for whatever 
name is passed in. But it fails to do so for all lower case names that 
are reserved keywords.
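A minimal illustration (this assumes the behaviour described here; later
releases changed quote_ident() to quote reserved keywords as well):

SELECT quote_ident('SomeVal');   -- "SomeVal"  (mixed case gets quoted)
SELECT quote_ident('some val');  -- "some val" (unsafe characters too)
SELECT quote_ident('limit');     -- limit      (lower-case reserved keyword
                                 --             comes back unquoted)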

The queries Slony executes on the replicas are constructed using that
quoting function, and therefore Slony fails to build valid SQL for
replicated tables containing reserved keyword identifiers. One solution
would be to brute-force quote all identifiers in Slony ... not sure what
the side effects performance-wise would be.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] column name is LIMIT

2005-03-14 Thread Jan Wieck
On 3/14/2005 1:28 PM, Robert Treat wrote:
Yeah... how come no one told him "don't do that"?  LIMIT is an SQL
reserved word, so it's likely to cause trouble in any database you try
to use it on... I'd strongly recommend renaming that column asap. You
can see other reserved words at
http://www.postgresql.org/docs/8.0/interactive/sql-keywords-appendix.html
Robert Treat
Note also that the Slony-I replication system has problems with column 
names identical to reserved words. This is rooted in the fact that the 
quote_ident() function doesn't quote reserved words ... as it IMHO is 
supposed to do.

Jan
On Mon, 2005-03-14 at 03:55, Christopher Kings-Lynne wrote:
You will still need to use double quotes in 8.0.1...
Chris
Gourish Singbal wrote:
 Thanks a lot,
 
 we might be upgrading to 8.0.1 soon.. till then using double quotes
 should be fine.
 
 regards
 gourish
 
 On Mon, 14 Mar 2005 18:25:22 +1100, Russell Smith [EMAIL PROTECTED] wrote:
 
On Mon, 14 Mar 2005 06:14 pm, Gourish Singbal wrote:

Guys,

I am having a problem firing queries on one of the tables, which has
"limit" as a column name.

If I run an insert/select/update command on that table I get the error below.

ERROR:  syntax error at or near "limit" at character 71

select limit from limit_table WHERE limit  50 LIMIT 2;

You need to quote the field name, and make sure the case is correct.
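For example (the comparison operator here is an assumption):

SELECT "limit"
  FROM limit_table
 WHERE "limit" > 50
 LIMIT 2;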

Any help to solve the problem would be really great.

postgresql 7.4.5 and linux OS


You should probably upgrade to 7.4.7

Regards

Russell Smith.

 
 
 



--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] column name is LIMIT

2005-03-14 Thread Jan Wieck
On 3/14/2005 2:26 PM, Bryan Encina wrote:
Note also that the Slony-I replication system has problems with column
names identical to reserved words. This is rooted in the fact that the
quote_ident() function doesn't quote reserved words ... as it IMHO is
supposed to do.

Jan
Does this apply to table names as well or just columns?
Bryan
Sure does, don't try to replicate a table named "user".
Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-02-06 Thread Jan Wieck
On 1/28/2005 2:49 PM, Christopher Browne wrote:
But there's nothing wrong with the idea of using pg_dump --data-only
against a subscriber node to get you the data without putting a load
on the origin.  And then pulling the schema from the origin, which
oughtn't be terribly expensive there.
And there is a script in the current CVS head that extracts the schema 
from the origin in a clean, slony-traces-removed state.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] sort_mem affect on inserts?

2004-11-19 Thread Jan Wieck
On 11/17/2004 5:07 PM, Josh Berkus wrote:
David,
I understand that the sort_mem conf setting affects queries with order by,
etc., and the doc mentions that it is used in create index. Does sort_mem
affect the updating of indexes, i.e., can the sort_mem setting affect the
performance of inserts?
Only if the table has foreign keys whose lookup might require a large sort.
Otherwise, no.

Hmmm ... what type of foreign key lookup would that be? None of the RI 
generated queries has any order by clause.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] Free PostgreSQL Training, Philadelphia, Oct 30

2004-10-18 Thread Jan Wieck
On 10/13/2004 11:47 PM, Bruce Momjian wrote:
Josh Berkus wrote:
Aaron,
 That makes two of us. Hanging out with Tom, Bruce, and others at OSCON
 2002 was one of the most informative and fun times I've had. That and
 I could really stand to brush up on my Postgres basics
You're thinking of Jan.   Tom wasn't at OSCON.  ;-)
Ah, but he said 2002 and I think Tom was there that year.
And I wasn't, which makes it rather difficult to hang out with me.
I will however be in Malvern too, since it's just around the corner for me.
Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-18 Thread Jan Wieck
On 10/14/2004 6:36 PM, Simon Riggs wrote:
[...]
I think Jan has said this also in far fewer words, but I'll leave that to
Jan to agree/disagree...
I do agree. The total DB size has as little to do with the optimum 
shared buffer cache size as the total available RAM of the machine.

After reading your comments it appears clearer to me. All those tests
did show is the amount of highly frequently accessed data in this
database population and workload combination.

I say this: ARC in 8.0 PostgreSQL allows us to sensibly allocate as large a
shared_buffers cache as is required by the database workload, and this
should not be constrained to a small percentage of server RAM.
Right.
Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-18 Thread Jan Wieck
On 10/14/2004 8:10 PM, Christopher Browne wrote:
Quoth [EMAIL PROTECTED] (Simon Riggs):
I say this: ARC in 8.0 PostgreSQL allows us to sensibly allocate as
large a shared_buffers cache as is required by the database
workload, and this should not be constrained to a small percentage
of server RAM.
I don't think that this particularly follows from what ARC does.
The combination of ARC together with the background writer is supposed
to allow us to allocate the optimum even if that is large. The former
implementation of the LRU without a background writer would just hang the
server for a long time during a checkpoint, which is absolutely
unacceptable for any OLTP system.

Jan
What ARC does is to prevent certain conspicuous patterns of
sequential accesses from essentially trashing the contents of the
cache.
If a particular benchmark does not include conspicuous vacuums or
sequential scans on large tables, then there is little reason to
expect ARC to have a noticeable impact on performance.
It _could_ be that this implies that ARC allows you to get some use
out of a larger shared cache, as it won't get blown away by vacuums
and Seq Scans.  But it is _not_ obvious that this is a necessary
truth.
_Other_ truths we know about are:
 a) If you increase the shared cache, that means more data that is
represented in both the shared cache and the OS buffer cache,
which seems rather a waste;
 b) The larger the shared cache, the more pages there are for the
backend to rummage through before it looks to the filesystem,
and therefore the more expensive cache misses get.  Cache hits
get more expensive, too.  Searching through memory is not
costless.

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Autotuning of shared buffer size (was: Re: [HACKERS] Getting rid of AtEOXact Buffers (was Re: [Testperf-general] Re: [PERFORM] First set of OSDL Shared Memscalability results, some wierdness ...))

2004-10-18 Thread Jan Wieck
Trying to think a little out of the box, how common is it in modern 
operating systems to be able to swap out shared memory?

Maybe we're not using the ARC algorithm correctly after all. The ARC
algorithm does not consider the second-level OS buffer cache in its
design. Maybe the total size of the ARC cache directory should not be 2x
the size of what is configured as the shared buffer cache, but rather 2x 
the size of the effective cache size (in 8k pages). If we assume that 
the job of the T1 queue is better done by the OS buffers anyway (and 
this is what this discussion seems to point out), we shouldn't hold them 
in shared buffers (only very few of them and evict them ASAP). We just 
account for them and assume that the OS will have those cached that we 
find in our T1 directory. I think with the right configuration for 
effective cache size, this is a fair assumption. The T2 queue represents 
the frequently used blocks. If our implementation would cause the 
unused/used portions of the buffers not to move around, the OS will swap 
out currently unused portions of the shared buffer cache and utilize 
those as OS buffers.

To verify this theory it would be interesting to see what the ARC strategy
thinks a good T2 size is after a long DBT run with a large buffer cache.
Enabling the strategy debug message and running postmaster
with -d1 will show that. In theory, this size should be anywhere near 
the sweet spot.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] [HACKERS] Getting rid of AtEOXact Buffers (was Re: [Testperf-general]

2004-10-18 Thread Jan Wieck
On 10/17/2004 3:40 PM, [EMAIL PROTECTED] wrote:
Seeing as I've missed the last N messages... I'll just reply to this
one, rather than each of them in turn...
Tom Lane [EMAIL PROTECTED] wrote on 16.10.2004, 18:54:17:
I wrote:
 Josh Berkus  writes:
 First off, two test runs with OProfile are available at:
 http://khack.osdl.org/stp/298124/
 http://khack.osdl.org/stp/298121/
 Hmm.  The stuff above 1% in the first of these is
 Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a
 unit mask of 0x00 (No unit mask) count 10
 samples   %        app name   symbol name
 ...
 920369    2.1332   postgres   AtEOXact_Buffers
 ...
 In the second test AtEOXact_Buffers is much lower (down around 0.57
 percent) but the other suspects are similar.  Since the only difference
 in parameters is shared_buffers (36000 vs 9000), it does look like we
 are approaching the point where AtEOXact_Buffers is a problem, but so
 far it's only a 2% drag.
Yes... as soon as you first mentioned AtEOXact_Buffers, I realised I'd
seen it near the top of the oprofile results on previous tests.
Although you don't say this, I presume you're acting on the thought that
a 2% drag would soon become a much larger contention point with more
users and/or smaller transactions - since these things are highly
non-linear.
It occurs to me that given the 8.0 resource manager mechanism, we could
in fact dispense with AtEOXact_Buffers, or perhaps better turn it into a
no-op unless #ifdef USE_ASSERT_CHECKING.  We'd just get rid of the
special case for transaction termination in resowner.c and let the
resource owner be responsible for releasing locked buffers always.  The
OSDL results suggest that this won't matter much at the level of 1
or so shared buffers, but for 10 or more buffers the linear scan in
AtEOXact_Buffers is going to become a problem.
If the resource owner is always responsible for releasing locked
buffers, who releases the locks if the backend crashes? Do we need some
additional code in bgwriter (or?) to clean up buffer locks?
If the backend crashes, the postmaster (assuming a possibly corrupted 
shared memory) restarts the whole lot ... so why bother?


We could also get rid of the linear search in UnlockBuffers().  The only
thing it's for anymore is to release a BM_PIN_COUNT_WAITER flag, and
since a backend could not be doing more than one of those at a time,
we don't really need an array of flags for that, only a single variable.
This does not show in the OSDL results, which I presume means that their
test case is not exercising transaction aborts; but I think we need to
zap both routines to make the world safe for large shared_buffers
values.  (See also
http://archives.postgresql.org/pgsql-performance/2004-10/msg00218.php)
Yes, that's important. 

Any objection to doing this for 8.0?
As you say, if these issues are definitely kicking in at 10
shared_buffers - there's a good few people out there with 800Mb
shared_buffers already. 

Could I also suggest that we adopt your earlier suggestion of raising
the bgwriter parameters as a permanent measure - i.e. changing the
defaults in postgresql.conf. That way, StrategyDirtyBufferList won't
immediately show itself as a problem when using the default parameter
set. It would be a shame to remove one obstacle only to leave another
one following so close behind. [...and that also argues against an
earlier thought to introduce more fine grained values for the
bgwriter's parameters, ISTM]
I realized that StrategyDirtyBufferList currently wastes a lot of time by
first scanning over all the buffers that haven't even been hit since
its last call and weren't dirty last time either (and thus are at
the beginning of the list and can't be dirty anyway). If we had a
way to give it a smart point in the list to start scanning ...


Also, what will Vacuum delay do to the O(N) effect of
FlushRelationBuffers when called by VACUUM? Will the locks be held for
longer?
Vacuum only naps at the points where it checks for interrupts, and at 
that time it isn't supposed to hold any critical locks.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-13 Thread Jan Wieck
On 10/9/2004 7:20 AM, Kevin Brown wrote:
Christopher Browne wrote:
Increasing the number of cache buffers _is_ likely to lead to some
slowdowns:
 - Data that passes through the cache also passes through kernel
   cache, so it's recorded twice, and read twice...
Even worse, memory that's used for the PG cache is memory that's not
available to the kernel's page cache.  Even if the overall memory
Which underlines my previous statement, that a PG shared cache much
larger than the highly frequently accessed data portion of the DB is
counterproductive. Double buffering (kernel disk buffer plus shared
buffer) only makes sense for data that would otherwise cause excessive
memory copies in and out of the shared buffer. Beyond that, it only
lowers the memory available for disk buffers.

Jan
usage in the system isn't enough to cause some paging to disk, most
modern kernels will adjust the page/disk cache size dynamically to fit
the memory demands of the system, which in this case means it'll be
smaller if running programs need more memory for their own use.
This is why I sometimes wonder whether or not it would be a win to use
mmap() to access the data and index files -- doing so under a truly
modern OS would surely at the very least save a buffer copy (from the
page/disk cache to program memory) because the OS could instead
directly map the buffer cache pages to the program's memory
space.
Since PG often has to have multiple files open at the same time, and
in a production database many of those files will be rather large, PG
would have to limit the size of the mmap()ed region on 32-bit
platforms, which means that things like the order of mmap() operations
to access various parts of the file can become just as important in
the mmap()ed case as it is in the read()/write() case (if not more
so!).  I would imagine that the use of mmap() on a 64-bit platform
would be a much, much larger win because PG would most likely be able
to mmap() entire files and let the OS work out how to order disk reads
and writes.
The biggest problem as I see it is that (I think) mmap() would have to
be made to cooperate with malloc() for virtual address space.  I
suspect issues like this have already been worked out by others,
however...


--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-13 Thread Jan Wieck
On 10/14/2004 12:22 AM, Greg Stark wrote:
Jan Wieck [EMAIL PROTECTED] writes:
Which would require that shared memory is not allowed to be swapped out, and
that is allowed in Linux by default IIRC, not to completely distort the entire
test.
Well if it's getting swapped out then it's clearly not being used effectively.
Is it really that easy if 3 different cache algorithms (PG cache, kernel 
buffers and swapping) are competing for the same chips?

Jan
There are APIs to bar swapping out pages and the tests could be run without
swap. I suggested it only as an experiment though, there are lots of details
between here and having it be a good configuration for production use.

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-13 Thread Jan Wieck
On 10/13/2004 11:52 PM, Greg Stark wrote:
Jan Wieck [EMAIL PROTECTED] writes:
On 10/8/2004 10:10 PM, Christopher Browne wrote:
 [EMAIL PROTECTED] (Josh Berkus) wrote:
 I've been trying to peg the sweet spot for shared memory using
 OSDL's equipment.  With Jan's new ARC patch, I was expecting that
 the desired amount of shared_buffers to be greatly increased.  This
 has not turned out to be the case.
 That doesn't surprise me.
Neither does it surprise me.
There's been some speculation that having a large shared buffers be about 50%
of your RAM is pessimal as it guarantees the OS cache is merely doubling up on
all the buffers postgres is keeping. I wonder whether there's a second sweet
spot where the postgres cache is closer to the total amount of RAM.
Which would require that shared memory is not allowed to be swapped out, 
and that is allowed in Linux by default IIRC, not to completely distort 
the entire test.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] postgresql performance with multimedia

2004-08-24 Thread Jan Wieck
On 8/24/2004 1:08 AM, my ho wrote:
--- Jan Wieck [EMAIL PROTECTED] wrote:
On 8/17/2004 8:44 PM, my thi ho wrote:
 Hi,
 I am working on a project which explores using postgresql to store
 multimedia data. In detail, I am trying to work with the buffer
 management part of the postgres source code and trying to improve the
 performance. I had searched the web but could not find much useful
 information.

What version of PostgreSQL are you looking at? Note that the buffer
cache replacement strategy was completely changed for version 8.0, which
is currently in BETA test. A description of the algorithm can be found
in the README file in src/backend/storage/bufmgr.
OK, thanks for the information. I had a look at the 8.0
beta, but cannot start the statistics collector. (I had
posted this error message before for help, but haven't
really got any clue how to fix it.)
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  could not bind socket for statistics collector: Cannot assign requested address
LOG:  disabling statistics collector for lack of working socket
Tom Lane answered that question. The code in question does resolve
localhost with getaddrinfo() and then tries to create and bind a UDP 
socket to all returned addresses. For some reason localhost on your 
system resolves to an address that is not available for bind(2).

BTW, what I want to ask here is: does PostgreSQL have
any kind of read-ahead buffer implemented? Because it
would be useful in the multimedia case, when we always scan
the large table for continuous data.
Since there is no mechanism to control that data is stored contiguously 
in the tables, what would that be good for?

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] Performance Bottleneck

2004-08-08 Thread Jan Wieck
On 8/8/2004 8:10 AM, Jeff wrote:
On Aug 8, 2004, at 1:29 AM, Martin Foster wrote:
I am currently making use of Apache::DBI, which overrides the
DBI::disconnect call and keeps a pool of active connections for use
when need be.   Since it offloads the pooling to the webserver, it
seems more advantageous than pgpool, which while being able to run on an
external system is not adding another layer of complexity.

Apache::DBI is not the same sort of a pool as pgpool.  DB connections 
are not shared among all your apache children (A common misconception). 
  So if you have 300 apache kids you can have 300 db connections.
With pgpool connections are  shared among all of them so even though 
you have 300 kids you only have say 32 db connections.
And this is exactly where the pgpool advantage lies. Especially with the
TPC-W, Apache is serving a mix of PHP (or whatever CGI technique is
used) and static content like images. Since the 200+ Apache kids serve
any of that content at random, and the emulated browsers very much
encourage ramping up to MaxClients children by using up to 4 concurrent
image connections, one does end up with MaxClients DB connections that
are all used relatively infrequently. In contrast to that, the real
pgpool results in fewer, more active DB connections, which is better for
performance.



Anyone had any experience with both Apache::DBI and pgpool?   For my 
needs they seem to do essentially the same thing, simply that one is 
invisible to the code while the other requires adding the complexity 
of a proxy.

Both are invisible to the app.  (With pgpool it thinks it is connecting 
to a regular old PG server)

And I've been running pgpool in production for months.  It just sits 
there.  Doesn't take much to set it up or configure it.  Works like a 
champ
And it buys you an extra admin feature people like to forget about.
One can shut down one pool for one web application only. That gives you
instant single-user access to one database without shutting down the
whole webserver or tampering with the pg_hba.conf file.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] Performance Bottleneck

2004-08-07 Thread Jan Wieck
On 8/3/2004 2:05 PM, Martin Foster wrote:
I run a Perl/CGI driven website that makes extensive use of PostgreSQL 
(7.4.3) for everything from user information to formatting and display 
of specific sections of the site.   The server itself, is a dual 
processor AMD Opteron 1.4Ghz w/ 2GB Ram and 2 x 120GB hard drives 
mirrored for redundancy running under FreeBSD 5.2.1 (AMD64).

Recently loads on the site have increased during peak hours to the point
of showing a considerable loss in performance. This can be observed
when connections move from the 120 concurrent connections to PostgreSQL 
to roughly 175 or more. Essentially, the machine seems to struggle 
to keep up with continual requests and slows down respectively as 
resources are tied down.
Have you taken a look at pgpool? I know, it sounds silly to *reduce* the 
number of DB connections through a connection pool, but it can help.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] Working on huge RAM based datasets

2004-07-12 Thread Jan Wieck
On 7/12/2004 12:38 PM, Josh Berkus wrote:
Rond, Chris,
 What would be most interesting to see is whether this makes it wise to
 increase shared buffer size.  It may be more effective to bump down
 the cache a little, and bump up sort memory; hard to tell.
How do we go about scheduling tests with the OSDL folks? If they could
do 10 runs with buffers between 1k and 500k it would help us get a broad
view of the situation.
Yes.   We'll need to.   However, I'd like to wait until we're officially in 
Beta.   I'll be seeing the OSDL folks in person (PostgreSQL+OSDL BOF at Linux 
World Expo!!)  in a couple of weeks.

Don't forget to add that ARC actually needs some time to let the
algorithm adjust the queue sizes and populate the cache according to the
access pattern. You can't start a virgin postmaster and then slam on the
accelerator of your test application by launching 500 concurrent
clients out of the blue and expect that it starts off airborne.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] Working on huge RAM based datasets

2004-07-11 Thread Jan Wieck
On 7/9/2004 10:16 AM, Merlin Moncure wrote:
What is it about the buffer cache that makes it so unhappy being able to
hold everything? I don't want to be seen as a cache hit fascist, but isn't
it just better if the data is just *there*, available in the postmaster's
address space ready for each backend process to access it, rather than
expecting the Linux cache mechanism, optimised as it may be, to have to do
the caching?
The disk cache on most operating systems is optimized.  Plus, keeping
shared buffers low gives you more room to bump up the sort memory, which
will make your big queries run faster.
Plus, the situation will change dramatically with 7.5 where the disk 
cache will have less information than the PG shared buffers, which will 
become sequential scan resistant and will know that a block was pulled 
in on behalf of vacuum and not because the regular database access 
pattern required it.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] PostgreSQL on VMWare vs Windows vs CoLinux

2004-06-02 Thread Jan Wieck
Using VMware myself quite extensively, I wonder what the disk 
configuration was that you created for the VM. Were the disks
preallocated, and did you make sure that they are contiguous on the NTFS
filesystem? Did you install the VMware tools in the guest operating system?

What did you use to measure the performance?
Jan
On 6/1/2004 6:56 PM, Vitaly Belman wrote:
Hello pgsql-performance,
  I was using the native Windows PostgreSQL 7.5Dev and was advised by
  several people to use an emulated PostgreSQL instead, as it is just
  a beta.
  Well, I gave it a whirl and tried both the commercial VMWare and the
  freeware open-source CoLinux; both work under Windows and both
  emulate Linux. Here's a quick review of my experience with them, may
  someone in need learn from it.
  This might be not the best place for such a post, but since the
  subject was brought up here, I'll post it here as well. If someone
  thinks it should be posted somewhere else, let me know.
  Installation & Configuration
  
  
  VMWare:

  On the bright side, the installation went quite smoothly, VMWare
  configured all the network stuff by itself and I had no trouble
  using the net right away. On the grim side, the installation itself
  took ages, compared to the plug & play feel of CoLinux.
  Installing PostgreSQL on VMWare was quite straightforward, just as
  the PostgreSQL documentation goes.
  CoLinux:
  As I said, with CoLinux the installation itself goes very quickly.
  To get Linux running you need to download practically less than 20mb
  which include the distribution (Debian in my case) and the CoLinux
  setup. Configuring CoLinux took a bit longer than VMWare, yet, not
  long as I thought it would take. In fact, it can be very easy if you
  just follow the documentation on the CoLinux Wiki, there are some
  very easy to follow tutorials there.
  Installing PostgreSQL on CoLinux proved a little more difficult
  (again, Debian), but I posted a quick tutorial that should smooth
  the process: http://www.colinux.org/wiki/index.php/PostgreSQL.
  Performance
  ---
  This was a totally subjective test (especially since one of the
  participants is in a beta stage), yet, that's what I tested and that's
  what I needed to know.
  To make the test as fair as possible, I did an exact dump of the
  same database. I ran the SQLs (around 10) in the same order on all
  of them and repeated the test several times. I also did an EXPLAIN
  on the queries to make sure all the databases work on the query the
  same way. It wasn't a full test though, I didn't test mass select
  load, nor inserts, nor work under heavy load, nor did I try different
  types of joins. All I did was to run some heavy (in execution time)
  queries. So you should take these tests just for what they are.
  That's what I got:
  The native Windows port performed poorly, lagging
  30%-50% behind the VMWare/CoLinux solutions in execution times --
  rather sad, but not unexpected, I guess.
  CoLinux and VMWare give AROUND the same results, yet CoLinux did
  give slightly better performance (I'd say 5%-10%) but with such
  slight improvement and inconsistency I wouldn't count it as much.
  Conclusion
  --
  With all that said, VMWare is badly suited for running a database,
  while CoLinux can be run as a service (didn't try it yet though),
  VMWare always sits there, it is slow to go up, slow to go down and
  generally feels like a system hog.
  I'll go on with CoLinux for now and hope it will act as good as it
  looks.
  http://www.vmware.com/
  http://www.colinux.org/
  Thanks to Bryan and Matthew for their advice regarding the emulations.
Regards,
 Vitaly Belman
 
 ICQ: 1912453
 AIM: VitalyB1984
 MSN: [EMAIL PROTECTED]
 Yahoo!: VitalyBe


--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-22 Thread Jan Wieck
 syslog | 0
 syslog_facility| LOCAL0
 syslog_ident   | postgres
 tcpip_socket   | on
 TimeZone   | unknown
 trace_notify   | off
 transaction_isolation  | read committed
 transaction_read_only  | off
 transform_null_equals  | off
 unix_socket_directory  | unset
 unix_socket_group  | unset
 unix_socket_permissions| 511
 vacuum_mem | 65536
 virtual_host   | unset
 wal_buffers| 32
 wal_debug  | 0
 wal_sync_method| fdatasync
 zero_damaged_pages | off
(113 rows)
suggestions, doubts and comments are very welcome

regards 
__
Eduardo Cunha de Almeida
Database Administration
UFPR - CCE 
+55-41-361-3321
[EMAIL PROTECTED]
[EMAIL PROTECTED]

--- Jan Wieck [EMAIL PROTECTED] wrote:
Josh Berkus wrote:

 Folks,
 
 I've sent a polite e-mail to Mr. Gomez offering
our help.  Please, nobody 
 flame him!
 

Please keep in mind that the entire test has, other
than a similar 
database schema and query types maybe, nothing to do
with a TPC-H. I 
don't see any kind of SUT. Foreign key support on
the DB level is not 
required by any of the TPC benchmarks. But the
System Under Test, which 
is the combination of middleware application and
database together with 
all computers and network components these parts are
running on, must 
implement all the required semantics, like ACID
properties, referential 
integrity &c. One could implement a TPC-H with flat
files, it's just a 
major pain in the middleware.

A proper TPC benchmark implementation would for
example be a complete 
PHP+DB application, where the user interaction is
done by an emulated 
browser and what is measured is the http response
times, not anything 
going on between PHP and the DB. Assuming that all
requirements of the 
TPC specification are implemented by either using
available DB features, 
or including appropriate workarounds in the PHP
code, that would very 
well lead to something that can compare PHP+MySQL
vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by
timing such a system 
after a considerable rampup time, giving the DB
system a chance to 
properly populate caches and so forth. Rebooting the
machine just before 
the test is the wrong thing here and will especially
kill any advanced 
cache algorithms like ARC.

Jan

--

#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #








--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [pgsql-advocacy] [PERFORM] MySQL vs PG TPC-H benchmarks

2004-04-21 Thread Jan Wieck
Josh Berkus wrote:

Folks,

I've sent a polite e-mail to Mr. Gomez offering our help.  Please, nobody 
flame him!

Please keep in mind that the entire test has, other than a similar 
database schema and query types maybe, nothing to do with a TPC-H. I 
don't see any kind of SUT. Foreign key support on the DB level is not 
required by any of the TPC benchmarks. But the System Under Test, which 
is the combination of middleware application and database together with 
all computers and network components these parts are running on, must 
implement all the required semantics, like ACID properties, referential 
integrity &c. One could implement a TPC-H with flat files, it's just a
major pain in the middleware.

A proper TPC benchmark implementation would for example be a complete 
PHP+DB application, where the user interaction is done by an emulated 
browser and what is measured is the http response times, not anything 
going on between PHP and the DB. Assuming that all requirements of the 
TPC specification are implemented by either using available DB features, 
or including appropriate workarounds in the PHP code, that would very 
well lead to something that can compare PHP+MySQL vs. PHP+PostgreSQL.

All TPC benchmarks I have seen are performed by timing such a system 
after a considerable rampup time, giving the DB system a chance to 
properly populate caches and so forth. Rebooting the machine just before 
the test is the wrong thing here and will especially kill any advanced 
cache algorithms like ARC.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?

2004-02-06 Thread Jan Wieck
Mike Nolan wrote:
Seriously, I am tired of this kind of question. You gotta get bold
enough to stand up in a meeting like that and say "guys, you can ask me
how this compares to Oracle ... but if you're seriously asking me how
this compares to MySQL, call me again when you've done your homework."
Can they call you at the unemployment office?
It might not work with the words I used above, but the point I tried to
make is that the hardest thing you can sell is a "no". I mean, not
just saying "no", but selling it in a way that the customer will not go
with the next idiot who claims "we can do that".

If the customer has a stupid idea, like envisioning an enterprise 
solution based on ImSOL, there is no way you will be able to deliver it. 
Paying customer or not, you will fail if you bow to their strategic
decisions while knowing that the stuff they want to use just
doesn't fit.

That is absolutely not ImSOL-specific. If someone comes to me and asks
for an HA scenario with zero transaction loss during failover, we can
discuss a little whether this is really what he needs or not, but if he
needs that, the solution will be Oracle or DB2; for sure I will not claim
that PostgreSQL can do that, because it cannot.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #


Re: [PERFORM] [pgsql-advocacy] MySQL+InnoDB vs. PostgreSQL test?

2004-02-03 Thread Jan Wieck
Josh Berkus wrote:

Folks,

I've had requests from a couple of businesses to see results of informal MySQL
+InnoDB vs. PostgreSQL tests.  I know that we don't have the setup to do
full formal benchmarking, but surely someone in our community has gone
head-to-head on your own application?

Josh,

how does someone compare an Apache+PHP+MySQL thing against something 
implemented with half the stuff done in stored procedures and the entire 
business model guarded by referential integrity, custom triggers and 
whatnot?

Seriously, I am tired of this kind of question. You gotta get bold
enough to stand up in a meeting like that and say "guys, you can ask me
how this compares to Oracle ... but if you're seriously asking me how
this compares to MySQL, call me again when you've done your homework."

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #