Re: [HACKERS] inclusions WAS: Increased company involvement

2005-05-04 Thread Shridhar Daithankar
On Wednesday 04 May 2005 8:18 pm, Marc G. Fournier wrote:
 Just curious here ... but do any of the version control systems provide
 per directory user restrictions?  Where I could give CVS access to
 Joshua, for instance, just to the plphp directory?

Subversion does.

http://svnbook.red-bean.com/en/1.0/ch06s04.html#svn-ch-6-sect-4.4.2
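For example, the path-based rules go in Subversion's authz file (the one the svnbook
section above describes). A minimal sketch, with made-up group, repository and path
names:

---
[groups]
plphp-committers = joshua

[pgsql:/pgsql-server/src/pl/plphp]
@plphp-committers = rw
* = r
---

Everyone keeps read access to that directory while only the group gets write access;
all other paths stay governed by whatever rules already exist.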

 Shridhar

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Porting/platforms/buildfarm open issues

2005-01-06 Thread Shridhar Daithankar
On Thursday 06 Jan 2005 3:52 pm, Peter Eisentraut wrote:
 First, we still do not have any test with 8.0 on the following platforms:

 HP-UX

All 96 tests pass on RC3. I can rerun the tests with additional 
configure flags if required, but I cannot install anything new on the HP-UX 
machine.

$ ./configure --without-readline --without-zlib;gmake;gmake check
$ uname -a
HP-UX machine B.11.00 A 9000/785 2005950738 two-user license

FWIW, I also ran the tests on my Linux machine. They pass there as well.

$ ./configure;make;make check
$ uname -a
Linux machine 2.6.7 #1 Wed Jun 16 16:23:03 PDT 2004 i686 unknown unknown 
GNU/Linux
$ cat /etc/slackware-version
Slackware 10.0.0

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] lpthread errors while compiling 8.0beta

2004-08-09 Thread Shridhar Daithankar
On Monday 09 Aug 2004 9:18 pm, Robert Treat wrote:
 but I was pretty sure I had lpthread library on my machine (and verified
 this with some help from irc).   however Makefile.global told me

 PTHREAD_CFLAGS  = -pthread -D_REENTRANT -D_THREAD_SAFE
 -D_POSIX_PTHREAD_SEMANTICS
 PTHREAD_LIBS=

 so I added -lpthread to the PTHREAD_LIBS line and it all compiled ok and
 passed regression.   one theory of where the problem lies focused on this
 bit of config.log:

 configure:13260: checking for the pthreads library -lpthreads
 configure:13301: gcc -o conftest -O2 -fno-strict-aliasing -g  -D_GNU_SOURCE
 conftest.c -lpthreads -lz -lreadline -ltermcap -lcrypt -lresolv -lnsl -ldl
 -lm  5
 /usr/i386-slackware-linux/bin/ld: cannot find -lpthreads
 collect2: ld returned 1 exit status
 configure:13304: $? = 1
 configure: failed program was:

I noticed it on a Slackware 9.1 system as well. I didn't notice the above 
part, but I had to change Makefile.global.

My hypothesis was that LinuxThreads does not propagate the libpthread.so dependency 
correctly. (From memory, found while reading about the differences between NPTL and 
LinuxThreads; I don't remember the exact source now.)

I got a failure while linking initdb because it was linking against libpq, 
which in turn links against libpthread.so. So in order to close the 
link, the linker needs -lpthread mentioned for initdb (or in the 
global linker flags).

NPTL can take care of this situation, IIRC. 

Since LinuxThreads is rapidly becoming extinct (barring Slackware and maybe 
Debian, of course), I didn't chase the issue much. Maybe we need to document 
this.

 Shridhar

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Why frequently updated tables are an issue

2004-06-10 Thread Shridhar Daithankar
[EMAIL PROTECTED] wrote:
The session table is a different issue, but has the same problems. You
have an active website, hundreds or thousands of hits a second, and you
want to manage sessions for this site. Sessions are created, updated many
times, and deleted. Performance degrades steadily until a vacuum. Vacuum
has to be run VERY frequently. Prior to lazy vacuum, this was impossible.
Both session tables and summary tables have another thing in common: they
are not vital data, they hold transient state information. Yeah, sure,
data integrity is important, but if you lose these values, you can either
recreate them or they aren't too important.
Why put that in a database at all? Because, in the case of sessions
especially, you need to access this information for other operations. In
the case of summary tables, OLAP usually needs to join or include this
info.
PostgreSQL's behavior in these cases is poor. I don't think anyone who has
tried to use PG for this sort of thing will disagree, and yes it is
getting better. Does anyone else consider this to be a problem? If so, I'm
open for suggestions on what can be done. I've suggested a number of
things, and admittedly they have all been pretty weak ideas, but they were
potentially workable.
There is another approach that is considered infeasible as of now and hence was 
rejected. Vacuum in postgresql is tied to entire relations/objects since indexes do 
not have transaction visibility information.

It has been suggested in the past to add such visibility to the index tuple header so 
that indexes and heaps can be cleaned out of order. In that case other background 
processes such as the background writer and the soon-to-be-integrated autovacuum daemon 
could vacuum pages/buffers rather than relations. That way the most used things would 
remain clean and the cost of cleanup would stay outside the critical transaction 
processing path.

However, increasing the index footprint seems to be a tough sell. Besides, the FSM would 
need some rework to accommodate/autotune its behaviour.

I am quoting from memory, so don't flame me if I misquote it. Just adding this to 
make the picture complete. Purely from a performance point of view, it could solve 
quite a few problems, at least in theory.

 Shridhar
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Why frequently updated tables are an issue

2004-06-10 Thread Shridhar Daithankar
Glen Parker wrote:
It has been suggested in the past to add such visibility to the index
tuple header so that indexes and heaps can be cleaned out of order.
In such a case other background
It seems to me that the benefit of this wouldn't be all that impressive
*when accessing the cache*, which is the problem this discussion is about.
I doubt that. I have seen examples on the general list where people have thousands of 
dead *pages* for a few hundred live tuples. If it is a problem with cache access, 
it will spill to disk as the problem grows.

I don't think postgresql's cache is that bad. No matter how you do it, RAM is RAM. 
The problem is disk bandwidth. See past discussions about the vacuum delay patch 
and the improvement it brought.

Vacuum costs disk bandwidth and that affects performance. That remains a fact.
Disk access would occur more commonly with large tables, which I'll ignore.
Let's say total scan time for a query on a very dirty table is 100ms.  It
seems safe to assume that the scan time for the index would be *roughly*
half that of the heap.  If visibilty could be determined by looking at just
the index tuple, you'd cut your query scan time down to 50ms.  When the clean
table case is 7ms total scan time, the difference between 50 and 100 ms is
not much of an issue; either way, it's still way too high!
However increasing index footprint seems to be a tough sell.
And rightly so, IMO.
Me too, unless somebody comes up with a patch that demonstrates the improvement. 
Obviously people can live with the cost of mandatory vacuum, so this is not high 
priority. But one day it will be.

OTOH if the perceived benefit is not there, at least it would be proven that it is not 
there. I plan to do it when I find time. But again, I don't face the problem 
myself (I don't even use postgresql for anything important, for that matter) so I 
haven't bothered spending any time on it.

As long as it is not high priority, it is going to be a tough sell. That's not 
unusual.

 Shridhar
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Improving postgresql.conf

2004-06-09 Thread Shridhar Daithankar
[EMAIL PROTECTED] wrote:
I have a LOT of opinions about postgresql.conf, and frankly, I think more
comments are not where the problems lie.
If you *really* want to make configuring postgresql easier,
postgresql.conf HAS to live outside the data directory and specify where
everything is. postgresql.conf should do exactly as one would assume it
does, configure postgresql.
Right now it doesn't. Right now it just sets parameters and the -D or
PGDATA environment variable *really* configures postgresql. If you do not
know how a machine is set up, you have to look for the install. Hopefully,
the previous administrator did not have any test directories which would
confuse the search. Sorry, I'm ranting.
In an ideal world, I envision Postgresql having a default location for
postgresql.conf; in this file will be the declarations for where the data
directory is, possible included files, etc., i.e. the stuff I've been
pushing literally for years. I am not saying that the current behavior
should change in any way; what I am saying is that a more world-compatible
methodology should be possible.
Once the postgresql.conf file is out of the data directory, you have a new
paradigm from which to work. One could write a setup application, in java
or something, which creates a new postgresql.conf file, right down to
where you want the installed directory to be, and THAT is used by initdb.
The setup application can also provide context sensitive help for each of
the settings. The user may not even *know* that there is such a file as
postgresql.conf.
Well, the statement 'postgresql.conf outside the data directory' isn't going to win, 
I think.

postgresql.conf is a cluster configuration file. I remember previous discussion 
on this, and I agree that a cluster configuration file should remain in the 
cluster itself.

Let me put it in a different way. What you are asking for is a service configuration 
file. It is *not* the same as the current postgresql configuration file. It will/should 
be unique to a particular installation of postgresql, i.e. something like 
/etc/postgresql/7.4.2/service.conf

I think it makes a lot of sense then. It would allow maintaining different 
clusters, like in another thread where the OP wanted different locales/collations. 
And it would still allow multiple versions of postgresql to be installed.

I remember chasing Mandrake a couple of years ago, not to make the postgresql 
database live in /var/lib/data... but to make it configurable. It didn't go 
anywhere though.

I think it is a rather good idea to add service configuration to the default 
postgresql install. Maybe Linux distro vendors can customize it through the init 
scripts.

Also, pulling postgresql.conf out of the cluster has a drawback. All the clusters 
would have to share the same tuning parameters, which is not exactly ideal. If we 
define a services file with multiple clusters we would still give ultimate 
control to the DBA/system admin.

Just a thought..
 Shridhar


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Improving postgresql.conf

2004-06-09 Thread Shridhar Daithankar
Honza Pazdziora wrote:
On Wed, Jun 09, 2004 at 07:53:19PM +0530, Shridhar Daithankar wrote:
Well, the statement 'postgresql.conf outside data directory' isn't going to 
win I think.
One day there won't be any data directory because the data will be
on raw partitions. Then you will _have_ to have the configuration
somewhere else. Ideally, only the absolute minimum of parameters
needed to start the server ought to be in the external configuration.
Everything else may happily reside within the database storage.
Given that postgresql by design does not try to outsmart the OS in any overlapping 
functionality, I doubt we will see raw storage anytime soon, but anyway..


Let me put it in a different way. What you are asking for is a service 
configuration file. It is *not* the same as the current postgresql configuration 
file. It will/should be unique to a particular installation of postgresql, 
i.e. something like /etc/postgresql/7.4.2/service.conf
Why? The administrator may want to run a second cluster on the same
machine, share a couple of options using the include directive while
preserving separate configuration, including the location of the data
store, for things that should be different for each of these
clusters.
Well, that is easy. In the service file just say
[Cluster1]
 datapath=/data/foo
[Cluster2]
 datapath=/data/foo1
and postgresql.conf could still reside inside each cluster to provide specific 
configuration.

Then have a script which can say 'service postgresql cluster1 start'.
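Just to illustrate, here is a sketch of such a hypothetical wrapper (the service file
format and script are made up, nothing like this exists today):

---
#!/bin/sh
# pg_service <cluster> start|stop -- sketch only
CLUSTER="$1"; ACTION="$2"
CONF=/etc/postgresql/service.conf

# pull the datapath for the named [cluster] section out of the service file
DATAPATH=`awk -v c="[$CLUSTER]" '
    $0 == c              { found = 1; next }
    /^\[/                { found = 0 }
    found && /datapath=/ { sub(/^[ \t]*datapath=/, ""); print; exit }
' "$CONF"`

if [ -z "$DATAPATH" ]; then
    echo "unknown cluster: $CLUSTER" >&2
    exit 1
fi

pg_ctl -D "$DATAPATH" "$ACTION"
---

The per-cluster postgresql.conf stays in each data directory; the script only needs the
path, assuming the section names in the service file match what is passed on the
command line.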
I think it is a rather good idea to add service configuration to default 
What is that service you mention?
Postgresql as a database server is a service. A cluster is a service instance. 
A service configuration file documents all service instances and the 
parameters required for tuning and control purposes. Add the possibility of 
multiple versions of postgresql on the same box. That sums it up pretty well..

Also, pulling postgresql.conf out of the cluster has a drawback. All the 
clusters would have to share the same tuning parameters, which is not exactly 
ideal.
Why would they _have_ to? Pulling postgresql.conf out of the cluster only
means that the file resides somewhere else and in it the location of
the data directory is specified. It does not mandate that there will only
be one cluster, and it does not mean that each cluster cannot have a
completely different configuration file.
Well, I wish I had an archives link handy, but suffice to say that Tom 
has rejected this idea many times before..

 Shridhar
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-hackers-win32] [HACKERS] Failures with windows port

2004-06-08 Thread Shridhar Daithankar
Andrew Dunstan wrote:
Shridhar Daithankar wrote:
Leaving it for a full build as I am calling it a day. Will give it 
another go tomorrow morning..

What you would need to test is not #ifdefing the whole thing out - 
rather you would need to enable the fputc by removing the #ifndef and 
#endif lines.
Hmm.. good.. It worked. 1 out of 95 tests failed; join is the test that failed.
Now I can go about installing and playing with it. I should use CVS in future 
though..

And BTW, I was not running the regression in a cygwin shell. It was an msys shell 
only. I am thinking of knocking off cygwin in favour of msys. But I can't find 
CVS for msys.. I need to find a command-line cvs for daily use. WinCVS is just 
too much GUI for my taste..

Thanks again..
 Shridhar
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Failures with windows port

2004-06-07 Thread Shridhar Daithankar
Hi,
I checked out the windows port to play with. It compiled fine but 'make check' 
produced the attached regression diff.

I am using the nightly snapshot. Is it too early to look at these failures or 
did I do something wrong? I was following the usual ./configure;make procedure after 
installing mingw.

 Shridhar

parallel group (13 tests):  text varchar oid char name float4 int2 boolean int8 int4 
float8 bit numeric
 boolean  ... FAILED
 char ... FAILED
 name ... FAILED
 varchar  ... FAILED
 text ... FAILED
 int2 ... FAILED
 int4 ... FAILED
 int8 ... FAILED
 oid  ... FAILED
 float4   ... FAILED
 float8   ... FAILED
 bit  ... FAILED
 numeric  ... FAILED
test strings  ... FAILED
test numerology   ... FAILED
parallel group (20 tests):  lseg path time timetz circle comments reltime abstime 
point tinterval polygon box inet interval timestamp timestamptz date type_sanity 
oidjoins opr_sanity
 point... FAILED
 lseg ... FAILED
 box  ... FAILED
 path ... FAILED
 polygon  ... FAILED
 circle   ... FAILED
 date ... FAILED
 time ... FAILED
 timetz   ... FAILED
 timestamp... FAILED
 timestamptz  ... FAILED
 interval ... FAILED
 abstime  ... FAILED
 reltime  ... FAILED
 tinterval... FAILED
 inet ... FAILED
 comments ... FAILED
 oidjoins ... FAILED
 type_sanity  ... FAILED
 opr_sanity   ... FAILED
test geometry ... FAILED
test horology ... FAILED
test insert   ... FAILED
test create_function_1... ok
test create_type  ... FAILED
test create_table ... ok
test create_function_2... ok
test copy ... ok
parallel group (7 tests):  create_aggregate create_operator triggers constraints 
vacuum inherit create_misc
 constraints  ... FAILED
 triggers ... FAILED
 create_misc  ... ok
 create_aggregate ... ok
 create_operator  ... ok
 inherit  ... FAILED
 vacuum   ... FAILED
parallel group (2 tests):  create_view create_index
 create_index ... FAILED
 create_view  ... FAILED
test sanity_check ... FAILED
test errors   ... FAILED
test select   ... FAILED
parallel group (18 tests):  select_into select_having update transactions namespace 
case select_implicit select_distinct_on arrays union select_distinct random portals 
join aggregates hash_index btree_index subselect
 select_into  ... ok
 select_distinct  ... FAILED
 select_distinct_on   ... FAILED
 select_implicit  ... FAILED
 select_having... FAILED
 subselect... FAILED
 union... FAILED
 case ... FAILED
 join ... FAILED
 aggregates   ... FAILED
 transactions ... FAILED
 random   ... failed (ignored)
 portals  ... FAILED
 arrays   ... FAILED
 btree_index  ... FAILED
 hash_index   ... FAILED
 update   ... FAILED
 namespace... FAILED
test privileges   ... FAILED
test misc ... FAILED
parallel group (5 tests):  portals_p2 cluster foreign_key rules select_views
 select_views ... FAILED
 portals_p2   ... FAILED
 rules... FAILED
 foreign_key  ... FAILED
 cluster  ... FAILED
parallel group (14 tests):  copy2 truncate temp sequence rowtypes domain polymorphism 
rangefuncs limit plpgsql prepare conversion without_oid alter_table
 limit... FAILED
 plpgsql  ... FAILED
 copy2... FAILED
 temp ... FAILED
 domain   ... FAILED
 rangefuncs   ... FAILED
 prepare  ... FAILED
 without_oid  ... FAILED
 conversion   ... FAILED
 truncate ... FAILED
 alter_table  ... FAILED
 sequence ... FAILED
 polymorphism ... FAILED
 rowtypes ... FAILED
test stats... FAILED

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [HACKERS] Failures with windows port

2004-06-07 Thread Shridhar Daithankar
Andrew Dunstan wrote:
Shridhar Daithankar wrote:
Hi,
 I checked out the windows port to play with. It compiled fine but 
 'make check' produced the attached regression diff.

 I am using the nightly snapshot. Is it too early to look at these 
 failures or did I do something wrong? I was following the usual 
 ./configure;make procedure after installing mingw.

We need more information. In particular, we would need to know
. what version of Windows, Mingw, MSys
Windows 2000 Professional, MingW 3.1.0.1, MSys 1.0.10
. what is in your regression.diff (if you send it, you should probably 
zip it up first).
http://www.hserus.net/~shridhar/regression.diffs.gz
http://www.hserus.net/~shridhar/regression.out
HTH
 Shridhar
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [pgsql-hackers-win32] [HACKERS] Failures with windows port

2004-06-07 Thread Shridhar Daithankar
Bruce Momjian wrote:
http://www.hserus.net/~shridhar/regression.diffs.gz
http://www.hserus.net/~shridhar/regression.out
Uh, where did you get this snapshot?  How old was it?  These newline
problems were fixed perhaps 2 weeks ago.
ftp://ftp.postgresql.org/pub/dev/postgresql-snapshot.tar.gz
It is timestamped at 6/7/2004, 8:09 AM.
Anyway, I will install flex and bison and use CVS. I was just being lazy and opted 
for a snapshot..

 Shridhar
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-hackers-win32] [HACKERS] Failures with windows port

2004-06-07 Thread Shridhar Daithankar
Bruce Momjian wrote:
Shridhar Daithankar wrote:
Bruce Momjian wrote:
http://www.hserus.net/~shridhar/regression.diffs.gz
http://www.hserus.net/~shridhar/regression.out
Uh, where did you get this snapshot?  How old was it?  These newline
problems were fixed perhaps 2 weeks ago.
ftp://ftp.postgresql.org/pub/dev/postgresql-snapshot.tar.gz
It is timestamped at 6/7/2004, 8:09 AM.
Anyway, I will install flex and bison and use CVS. I was just being lazy and opted 
for a snapshot..

That is certainly new enough.  We added a MinGW workaround to fix a
problem with extra newlines coming from psql, but you might have a
configuration that doesn't need the workaround.
Do you need this line in psql/print.c?

#ifndef __MINGW32__
/* for some reason MinGW outputs an extra newline, so this suppresses it */
fputc('\n', fout);
#endif
That might be the cause of your problem.  If you have it, try removing it and recompiling.
No, it is not. A quick #if 0...#endif around it and a recompile produced 87 
failures out of 95 tests, one being ignored.

Leaving it for a full build as I am calling it a day. Will give it another go 
tomorrow morning..

 Shridhar
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-06-02 Thread Shridhar Daithankar
Hi,

Any updates/opinions? Should we convert assign hooks to perform the actual 
assignment and custom validation instead of just custom validation? It is 
clear from the README that they are for validation purposes only..

Or shall I look for some place else to perform the conversion?

Shridhar

On Tuesday 01 June 2004 18:01, Shridhar Daithankar wrote:
 On Tuesday 01 June 2004 14:12, Shridhar Daithankar wrote:
  Actually I need to find out a few more things about it. It is not as simple
  as adding an assign_hook. When I tried to initdb with the changes, it demanded
  64MB of shared buffers, which I (now) think means that NBuffers is used
  somewhere before postgresql.conf is parsed. So 8192*8000=64MB. But this is
  just guesswork. Haven't looked into it there.

 Found it. Following is the code that is causing the problem.

 guc.c:2998
 ---
 	if (conf->assign_hook)
 		if (!(*conf->assign_hook) (newval, changeVal, source))
 		{
 			ereport(elevel,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 					 errmsg("invalid value for parameter \"%s\": %d",
 							name, newval)));
 			return false;
 		}

 	if (changeVal || makeDefault)
 	{
 		if (changeVal)
 		{
 			*conf->variable = newval;
 			conf->gen.source = source;
 		}
 ---

 So even if the assign_hook is executed, the value of the variable is
 overwritten in the next step, which nullifies any factoring/change in value
 done in the assign hook.

 I find this convention in many other places in guc.c: call the assign_hook
 and then overwrite the value. So is the assign_hook called only to validate
 the value? How do I modify the value of the variable without getting
 variable-specific?

 I tried

 	if (changeVal && !(conf->assign_hook))

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-06-02 Thread Shridhar Daithankar
On Wednesday 02 June 2004 20:16, Tom Lane wrote:
 Shridhar Daithankar [EMAIL PROTECTED] writes:
  Any updates/opinions? Should we convert assign hooks to perform the actual
  assignment and custom validation instead of just custom validation? It is
  clear from the README that they are for validation purposes only..

 As it should be.  Assign hooks have no business altering the
 user-supplied value.

OK

 I remain unalterably opposed to the notion of measuring shared_buffers
 in KB, but if you think you can get such a thing in over my objections,

Are you OK with MBs? I am fine with anything.

 the way to do it is to decouple the GUC parameter from NBuffers.  The
 GUC setting is whatever it is; you can reject the value if it's too
 far out of range, but you do not editorialize upon it.  What you do is
 compute the derived value for NBuffers and assign that in the assign
 hook.

That means removing NBuffers from the declaration in the config structure and 
substituting a dummy variable for it?

If you think this is good and acceptable enough, I will proceed making the changes 
that way. Shall I take this as a guideline for the other parameters as well?
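Just to check my understanding, here is a sketch of what I think you mean (the
variable and hook names are made up, and the hook signature follows the one in my
earlier patch):

---
/* The GUC machinery sets this; NBuffers is only ever derived from it. */
static int	shared_buffers_kb = 8000;

static bool
assign_shared_buffers_kb(int newval, bool doit, GucSource source)
{
	/* reject values smaller than the old minimum of 16 buffers */
	if (newval < 16 * (BLCKSZ / 1024))
		return false;

	if (doit)
		NBuffers = (newval * 1024) / BLCKSZ;	/* derived value, rounded down */

	return true;
}
---

and in the config table the entry would point at &shared_buffers_kb with
assign_shared_buffers_kb as the hook, instead of pointing at &NBuffers directly.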

 Shridhar

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-06-02 Thread Shridhar Daithankar
On Wednesday 02 June 2004 20:59, Tom Lane wrote:
 Frank Wiles [EMAIL PROTECTED] writes:
This may be an unreasonable suggestion, but how about allowing both?
I've seen several configuration systems do the following:
 
shared_buffers = 1 ( shared_buffers in pages )
shared_buffers = 100M  ( 100 MBs of shared_buffers )
shared_buffers = 2048K ( 2MBs of shared_buffers )

 I could live with that.  I'm not sure how painful it would be to wedge
 into GUC though; and I have a feeling that it is exactly *not* what
 Shridhar would think is simpler than the present behavior ;-)

From a usability POV, I could live with that. It would mean converting the GUC from 
int to string though.
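For what it's worth, the parsing itself doesn't look too bad; a sketch (hypothetical
helper, just to gauge the effort, assuming the GUC becomes a string and BLCKSZ comes
from the usual backend headers):

---
#include <ctype.h>
#include <stdlib.h>

/* Accept "10000" (pages), "2048K" or "100M"; return a page count, or -1. */
static int
buffers_from_string(const char *val)
{
	char	   *end;
	long		n = strtol(val, &end, 10);

	if (end == val || n <= 0)
		return -1;								/* not a number */

	switch (toupper((unsigned char) *end))
	{
		case '\0':
			return (int) n;						/* plain page count */
		case 'K':
			return (int) ((n * 1024) / BLCKSZ);	/* kilobytes */
		case 'M':
			return (int) ((n * 1024 * 1024) / BLCKSZ);	/* megabytes */
		default:
			return -1;							/* unknown suffix */
	}
}
---

So with 8K blocks, "2048K" would come out as 256 buffers and "100M" as 12800.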

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-06-01 Thread Shridhar Daithankar
On Monday 31 May 2004 22:00, Tom Lane wrote:
 Shridhar Daithankar [EMAIL PROTECTED] writes:
  Right now following are measured in pages
  wal_buffers
  shared_buffers
  effective_cachesize
  while rest of the memory parameters are in kb. I thought being uniform
  would be good. Besides it will make it independent of page size as well.

 It would probably be reasonable to change effective_cache_size, since we
 really do not know what the kernel's unit of buffering is (except on
 Linux, where we *do* know that it ain't 8K ;-)).  Personally I'd opt for
 measuring it in MB not KB, though; that would be a much more convenient
 unit on modern machines.  We could easily make it a float for anyone who
 thinks they know the cache size to sub-MB accuracy.

I have no problems with MB. The only thing I want to see is a slightly more user 
friendly and consistent configuration.

Initially I thought of bytes, as oracle does, but said 'let's be a little more 
practical' and put KB..:-) MB is perfectly OK. It would just change the 
multiplier.

And I don't think specifying a float is such a good idea. It is just so 
counter-intuitive. I mean, how many people would care about a fraction of an MB 
wasted?

 As for the others, I'll side with Emerson: a foolish consistency is the
 hobgoblin of little minds.  We know very well what the unit of
 allocation of those is, and it's pages.  There's no advantage to using
 KB except making it harder to work out what's really happening.  We
 could measure max_connections in KB too if we had a mind to: there's
 a very definite shared-mem cost per connection slot.  Or the FSM
 parameters, or checkpoint_segments, or max_locks_per_transaction.
 The fact that they have quantifiable space costs doesn't mean that space
 is the most useful way to measure them.

Agreed. This is not to change things left and right. It is only to put some 
consistency in place.

 BTW, were you intending to convert KB to NBuffers by charging exactly 8K
 per buffer, or were you intending to allow for the additional shmem
 costs such as buffer headers, per-buffer LWLocks, etc?  If not the
 latter, then what are you really measuring?  For sure it's not shared
 memory size --- charging an artificial number isn't going to help anyone
 who's trying to pick shared_buffers to arrive at a particular actual
 shmem size.  But if it is the latter then it'll become even more
 impossible to tell what's really happening, and we'll be forced to
 invent some way of reading out how many buffers really got allocated.

Well, for that purpose, this is beyond what I am trying to do. As of now there 
is an int value sitting in postgresql.conf which is in page blocks. Now there 
will be a conversion before it is used, so that it is a bit more user 
friendly. The change should be skin deep so as to be low impact.

As far as putting a suffix such as K or M goes, I don't know how much work that would be. 
Does that mean we need to convert shared_buffers to a string parameter and 
parse it? [EMAIL PROTECTED] (I would gladly write a real name, but alas) 
said yesterday that there is a patch pending with Bruce for such a framework. 
I don't know what it does or how.

For simplicity, I would convert all memory parameters to either KB or MB and 
state so in postgresql.conf. No floats, no suffixes. This is my opinion of 
course. Any suggestions are always welcome..

Actually I need to find out a few more things about it. It is not as simple as 
adding an assign_hook. When I tried to initdb with the changes, it demanded 64MB 
of shared buffers, which I (now) think means that NBuffers is used somewhere before 
postgresql.conf is parsed. So 8192*8000=64MB. But this is just guesswork. 
Haven't looked into it there.

If this seems reasonably OK, then I will spend some more time on it. We would 
need quite some documentation updates then.

 So I disagree with the premise.  Measuring these things in KB is not an
 improvement.

As I said, KBs or MBs is not the issue. Not having it in terms of pagesize is.

 Shridhar


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-06-01 Thread Shridhar Daithankar
On Tuesday 01 June 2004 14:12, Shridhar Daithankar wrote:
 Actually I need to find out a few more things about it. It is not as simple
 as adding an assign_hook. When I tried to initdb with the changes, it demanded
 64MB of shared buffers, which I (now) think means that NBuffers is used
 somewhere before postgresql.conf is parsed. So 8192*8000=64MB. But this is just
 guesswork. Haven't looked into it there.

Found it. Following is the code that is causing the problem.

guc.c:2998
---
	if (conf->assign_hook)
		if (!(*conf->assign_hook) (newval, changeVal, source))
		{
			ereport(elevel,
					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
					 errmsg("invalid value for parameter \"%s\": %d",
							name, newval)));
			return false;
		}

	if (changeVal || makeDefault)
	{
		if (changeVal)
		{
			*conf->variable = newval;
			conf->gen.source = source;
		}
---

So even if the assign_hook is executed, the value of the variable is overwritten in 
the next step, which nullifies any factoring/change in value done in the assign hook.

I find this convention in many other places in guc.c: call the assign_hook and 
then overwrite the value. So is the assign_hook called only to validate the value?  
How do I modify the value of the variable without getting variable-specific?

I tried 

if (changeVal && !(conf->assign_hook))

and it worked. However, that is just for int variables. I am not sure if that 
is a design decision. What should I do?

Regards,
 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-05-31 Thread Shridhar Daithankar
On Sunday 30 May 2004 21:33, Tom Lane wrote:
 Shridhar Daithankar [EMAIL PROTECTED] writes:
  I was toying around with idea of converting all the memory related
  parameters in postgresql.conf to kilobytes for simplicity and
  uniformity.

 Why is that a good idea?

Right now the following are measured in pages
wal_buffers
shared_buffers 
effective_cache_size 

The following are in KB
work_mem 
maintenance_work_mem 
max_stack_depth 

while the rest of the memory parameters are in KB. I thought being uniform would 
be good. Besides, it will make them independent of the page size as well.

 Shridhar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Converting postgresql.conf parameters to kilobytes

2004-05-31 Thread Shridhar Daithankar
On Monday 31 May 2004 18:41, Bruce Momjian wrote:
 Shridhar Daithankar wrote:
  On Sunday 30 May 2004 21:33, Tom Lane wrote:
   Shridhar Daithankar [EMAIL PROTECTED] writes:
I was toying around with idea of converting all the memory related
parameters in postgresql.conf to kilobytes for simplicity and
uniformity.
  
   Why is that a good idea?
 
  Right now following are measured in pages
  wal_buffers
  shared_buffers
  effective_cachesize
 
  Following are in kbs
  work_mem
  maintenance_work_mem
  max_stack_depth
 
  while the rest of the memory parameters are in KB. I thought being uniform
  would be good. Besides, it will make them independent of the page size as well.

 Sounds good to me.  How are you handling cases where the value has to be
 a multiple of page size --- rounding or throwing an error?

The patch rounds it down, owing to the assignment of a possibly fractional number to 
an integer (for example, with 8 KB pages a setting of 12345 KB becomes 1543 buffers, 
i.e. 12344 KB). But I hadn't thought of this requirement, to be honest.

This seems to be better behavior than throwing an error.

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] CVS tip compiler error with --enable-thread-safety

2004-05-31 Thread Shridhar Daithankar
On Sunday 30 May 2004 19:37, Bruce Momjian wrote:
  The relevant configure messages read
  -
  checking whether pthreads work without any flags... no
  checking whether pthreads work with -Kthread... no
  checking whether pthreads work with -kthread... no
  checking for the pthreads library -llthread... no
  checking whether pthreads work with -pthread... yes
  -
 
  Here is relevant portion of src/Makefile.global
 
  -
  PTHREAD_CFLAGS  = -pthread -D_REENTRANT -D_THREAD_SAFE
  -D_POSIX_PTHREAD_SEMANTICS
  PTHREAD_LIBS=
  LIBS = -lz -lreadline -ltermcap -lcrypt -lresolv -lnsl -ldl -lm
  -
 
  It worked after I manually added -lpthread to LIBS and did a make
  clean;make

 OK, I have applied the following patch which should fix it.  Turns out I
 wasn't using the thread libs as part of library creation.

I updated from anoncvs; the patch to src/port/Makefile is there, but initdb still 
fails to compile.

I suspect this is because PTHREAD_LIBS is still blank in Makefile.global. I 
need to add -lpthread there to get things working. However, I could not figure 
out what creates PTHREAD_LIBS in Makefile.global.
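For reference, the manual workaround amounts to filling in the blank line in the 
generated src/Makefile.global, i.e. something like:

---
PTHREAD_CFLAGS  = -pthread -D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS
PTHREAD_LIBS    = -lpthread
---

(the CFLAGS line is what configure already produces; only PTHREAD_LIBS is edited by 
hand).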

Is there something else I should try?

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Extended customizing, SQL functions,

2004-05-31 Thread Shridhar Daithankar
On Saturday 29 May 2004 18:10, [EMAIL PROTECTED] wrote:
 Having internal PostgreSQL variables that are not present on disk, or
 maybe, variables that are mirrored on disk may be good.

Yes, I agree. I can see why you proposed no transactions a few posts ago. Take 
the example of a count variable. It may not have transactions, but then it is not 
expected to be very accurate anyway.

If I could declare variables which can be changed/read in a locked fashion and are 
visible to all the backends, that would be really useful. They shouldn't have 
transactions because they are not data but state. They are in the database so that 
other connections and stored procedures can see them.

Coupled with the fact that postgresql has custom data types, there is no end to 
how this could be put to use. A lot more things that now sit in the application 
layer could live inside postgresql, I imagine.

 The whole reason why I made this post was to see if other people have had
 similar issues and looked for a similar solution, and to think about if
 there is a solution that fits within PostgreSQL and how it would work.

AFAIK, there is no way of doing this in postgresql. But I would love to see it 
happen. (I wish I could work on it...:-( )

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Converting postgresql.conf parameters to kilobytes

2004-05-30 Thread Shridhar Daithankar
Hi,

I was toying around with the idea of converting all the memory-related parameters 
in postgresql.conf to kilobytes for simplicity and uniformity.

Attached is a proof-of-concept patch that converts shared_buffers to kilobytes 
using an assign_hook.

It compiled all right but I experienced strange behavior. At the time of 
initdb, it demanded 69MB of shared memory. I had to bump up SHMMAX from 32MB 
to 128MB to get initdb through. Certainly I did something wrong somewhere, but 
I don't know what and where. This is Linux 2.6.4.

The postgresql.conf is properly altered and shows 8000 (though the description 
around it is no longer in sync).

I also don't know where to put the assign_hook. I have put it in guc.c for the 
time being. The only other int hook I found was assign_max_stack_depth, which is 
in postgres.c.

Any comments/pointers?

Regards,
 Shridhar
*** src/backend/utils/misc/guc.c.orig	Sun May 30 17:14:08 2004
--- src/backend/utils/misc/guc.c	Sun May 30 18:02:10 2004
***
*** 106,111 
--- 106,112 
  static bool assign_stage_log_stats(bool newval, bool doit, GucSource source);
  static bool assign_log_stats(bool newval, bool doit, GucSource source);
  
+ static bool assign_shared_buffers(int newval, bool doit, GucSource source);
  
  /*
   * Debugging options
***
*** 967,973 
  			NULL
  		},
  		&NBuffers,
! 		1000, 16, INT_MAX, NULL, NULL
  	},
  
  	{
--- 968,974 
  			NULL
  		},
  		&NBuffers,
! 		1000, 16, INT_MAX, assign_shared_buffers, NULL
  	},
  
  	{
***
*** 5130,5133 
--- 5131,5147 
  }
  
  
+ static bool assign_shared_buffers(int newval, bool doit, GucSource source)
+ {
+ 	
+ 	if(doit)
+ 	{
+ 	 	if(BLCKSZ > 0)
+ 			NBuffers = (newval*1024)/BLCKSZ;
+ 		else
+ 			return(false);	
+ 	}
+ 	
+ 	return(true);
+ }
  #include "guc-file.c"
*** src/bin/initdb/initdb.c.orig	Sun May 30 17:26:01 2004
--- src/bin/initdb/initdb.c	Sun May 30 17:26:51 2004
***
*** 886,892 
  	snprintf(repltok, sizeof(repltok), "max_connections = %d", n_connections);
  	conflines = replace_token(conflines, "#max_connections = 100", repltok);
  
! 	snprintf(repltok, sizeof(repltok), "shared_buffers = %d", n_buffers);
  	conflines = replace_token(conflines, "#shared_buffers = 1000", repltok);
  
  	snprintf(repltok, sizeof(repltok), "lc_messages = '%s'", lc_messages);
--- 886,892 
  	snprintf(repltok, sizeof(repltok), "max_connections = %d", n_connections);
  	conflines = replace_token(conflines, "#max_connections = 100", repltok);
  
! 	snprintf(repltok, sizeof(repltok), "shared_buffers = %d", (n_buffers*BLCKSZ)/1024);
  	conflines = replace_token(conflines, "#shared_buffers = 1000", repltok);
  
  	snprintf(repltok, sizeof(repltok), "lc_messages = '%s'", lc_messages);

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] CVS tip compiler error with --enable-thread-safety

2004-05-30 Thread Shridhar Daithankar
Hi,

Platform Slackware linux 9.1/ Linux 2.6.4

I did a make distclean and ./configure --enable-thread-safety. The build 
aborted with following error messages

make[4]: Leaving directory `/home/shridhar/postgresql/pgsql/src/port'
gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations 
-DFRONTEND -I../../../src/interfaces/libpq -I../../../src/include 
-D_GNU_SOURCE   -c -o initdb.o initdb.c
rm -f exec.c  ln -s ../../../src/port/exec.c .
gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations 
-DFRONTEND -I../../../src/interfaces/libpq -I../../../src/include 
-D_GNU_SOURCE   -c -o exec.o exec.c
gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes -Wmissing-declarations 
initdb.o exec.o -L../../../src/interfaces/libpq -lpq -L../../../src/port  
-Wl,-rpath,/usr/local/pgsql/lib -lz -lreadline -ltermcap -lcrypt -lresolv 
-lnsl -ldl -lm  -lpgport -o initdb
../../../src/interfaces/libpq/libpq.so: undefined reference to 
`pthread_getspecific'
../../../src/interfaces/libpq/libpq.so: undefined reference to `pthread_once'
../../../src/interfaces/libpq/libpq.so: undefined reference to 
`pthread_key_create'
../../../src/interfaces/libpq/libpq.so: undefined reference to 
`pthread_setspecific'
collect2: ld returned 1 exit status
make[3]: *** [initdb] Error 1
make[3]: Leaving directory `/home/shridhar/postgresql/pgsql/src/bin/initdb'

The relevant configure messages read 
-
checking whether pthreads work without any flags... no
checking whether pthreads work with -Kthread... no
checking whether pthreads work with -kthread... no
checking for the pthreads library -llthread... no
checking whether pthreads work with -pthread... yes
-

Here is relevant portion of src/Makefile.global

-
PTHREAD_CFLAGS  = -pthread -D_REENTRANT -D_THREAD_SAFE 
-D_POSIX_PTHREAD_SEMANTICS
PTHREAD_LIBS= 
LIBS = -lz -lreadline -ltermcap -lcrypt -lresolv -lnsl -ldl -lm 
-

It worked after I manually added -lpthread to LIBS and did a make clean;make

Regards
 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Extended customizing, SQL functions,

2004-05-29 Thread Shridhar Daithankar
On Saturday 29 May 2004 04:38, [EMAIL PROTECTED] wrote:
 Now, I could roll my own system pretty easily, and probably will do so. It
 won't take too much, however, it would be neat if this was in PostgreSQL.

 I fully expect that people would worry about this, and I don't blame them.
 It is a *bad* idea. Like I said, I could roll my own, but I'm curious if
 anyone else sees any benefit to this feature. If it is a feature that
 people want, it would best be done from within PostgreSQL. If it is not
 something generally wanted, then I'll keep it here or try to get it on
 gborg or pgfoundry.

I agree that it could be a nice feature. But it reminds me of a quote from a C++ 
FAQ I read once.

--
*. Should I use exceptions for error handling?

Ans. The real question is: can I afford stack unwinding here...
--

The situation is similar here. When you want something in the database, one 
question to ask is: do I need MVCC here?

Of course, depending upon the application context the answer could well be yes. 
But in a lot of places, this could easily be managed in the application, and 
probably better so.

Personally, I do not think managing such information in the application is a 
hack. 

Just a thought...

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Table Spaces

2004-05-19 Thread Shridhar Daithankar
On Wednesday 19 May 2004 00:19, [EMAIL PROTECTED] wrote:
  [EMAIL PROTECTED] wrote:
  This makes me worried. That's the way we *used* to do things, but the
  sleazy IP lawyers are looking for anything with which they can create
  the
  impression of impropriety. The open source and free projects are ground
  zero for this crap.
 
  We *really* need to be careful.
 
  I assumed this tool was GPL and we just needed to avoid the GPL issue.

 I'm probably just being alarmist, but think about some IP lawyer buying up
 the entity that owns the GPL code, and suing end users of PostgreSQL.

 This is similar to what is happening in Linux land with SCO.

 The best defense is to say, nope, we didn't copy your stuff, we
 implemented it ourselves based on the documentation.

By that argument, the license of the documentation should matter too, shouldn't it?

 Shridhar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] WAL write of full pages

2004-03-15 Thread Shridhar Daithankar
Hi,

I was thinking the other way round. What if we write to WAL pages only in those 
portions which we need to modify and let the kernel do the job the way it sees fit? 
What will happen if it fails?

Bruce Momjian wrote:

Our current WAL implementation writes copies of full pages to WAL before
modifying the page on disk.  This is done to prevent partial pages from
being corrupted in case the operating system crashes during a page
write.  
Assuming a WAL page is zero at the start and a, say, 128-byte block is later written 
to it: how exactly is writing 128 bytes different from writing the entire 8K page, 
especially when we control neither the kernel/buffer cache nor the disk?

What is partial? Postgresql will always flush the entire data block to the WAL page, 
won't it? If write returns, we can assume it is written.

For example, suppose an 8k block is being written to a heap file.  
First the backend issues a write(), which copies the page into the
kernel buffer cache.  Later, the kernel sends the write request to the
drive. Even if the file system uses 8k blocks, the disk is typically
made up of 512-byte sectors, so the OS translates the 8k block into a
contiguous number of disk sectors, in this case 16.  There is no
guarantee that all 16 sectors will be written --- perhaps 8 could be
written, then the system crashes, or perhaps part of an 512-byte sector
is written, but the remainder left unchanged.  In all these cases,
restarting the system will yield corrupt heap blocks.
We are hoping to prevent WAL page corruption, which is part of file system 
corruption. Do we propose to tackle file system corruption in order to guarantee 
WAL integrity?

The WAL writes copies of full pages so that on restore, it can check
each page to make sure it hasn't been corrupted.  The system records an
LSN (log serial number) on every page.  When a page is modified, its
pre-change image is written to WAL, but not fsync'ed.  Later, if a
backend wants to write a page, it must make sure the LSN of the page is
between the LSN of the last checkpoint and the LSN of the last fsync by
a committed transactions.  Only in those cases can the page be written
because we are sure that a copy of the page is in the WAL in case there
is a partial write.
Do we have a per-page checksum? It could be in the control log, not necessarily in 
the WAL. Just asking, since I don't know.

Now, as you can imagine, these WAL page writes take up a considerable
amount of space in the WAL, and cause slowness, but no one has come up
with a way to recover from partial page writes without them.  The only way to
minimize page writes is to increase checkpoint_segments and
checkpoint_timeout so that checkpoints are less frequent, and pages have
to be written fewer times to the WAL because old copies of the pages
remain in WAL longer.
If I am not mistaken, we rely upon the WAL being consistent to ensure transaction 
recovery. We write() WAL and fsync/open/close it to make sure it goes to disk 
before the data pages. What else can we do?

I cannot see why writing an 8K block is any safer than writing just the 
changes.

I may be dead wrong but just putting my thoughts together..

 Shridhar

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-03-08 Thread Shridhar Daithankar
On Sunday 07 March 2004 20:28, Michael Meskes wrote:
 On Wed, Mar 03, 2004 at 07:40:40PM +0530, Shridhar Daithankar wrote:
  Is this fine?
   * Allow a 'connection *' pointer to be specified instead of a string to
   denote a connection.
  ...

 I personally have no problem with this as long as it does not break
 compatibility to the code we allow now.

I searched through the SQL92 standard over the weekend (Sunday and Monday.. I had a 
working Saturday..:-)) and I need to correct some of the assumptions I stated 
previously.

In ECPG we cannot dispose of connection names as strings because the standard 
expects them. Hence, if we provide a connection pointer to denote a 
connection, that would be a postgresql-only extension and should be 
documented with a warning about the potential portability problem.

With the responses so far, I believe it is OK for me to go ahead and actually try 
some coding now..:-)

Will keep things posted.

 Shridhar




---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] [OT] Repository [was Re: [PERFORM] Feature request: smarter use of conditional indexes]

2004-03-08 Thread Shridhar Daithankar
On Sunday 07 March 2004 09:16, Tom Lane wrote:
 Personally I consider -c format the only one of the three that is
 readable for reviewing purposes, so even if I weren't intending
 immediate application, I'd ask for -c before looking at the patch.
 There are some folks who consider -u format readable, but I'm not
 one of them ...

I was wondering what people use to keep track of their personal development, 
especially when they do not have CVS commit access.

I am toying with the idea of using GNU arch for personal use. It encourages 
branching, merging and having as many repository trees as possible. I haven't 
tried it in the field as yet, but if it delivers what it promises, it could be a 
great help.

I know that postgresql does not need many branches like, say, the linux kernel does, 
but having a good tool does not hurt, does it..:-)

 Just a thought..

 Shridhar

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-03-04 Thread Shridhar Daithankar
Zeugswetter Andreas SB SD wrote:
- Dispose of connection names and replace them with a pointer.


You cannot dispose of the names, you can only add something to also allow pointers.
The names are in the ESQL/C standard.
Can you point me to the standards text? I am googling for it, but nothing 
resembling an independent standard has emerged as yet..

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-03-03 Thread Shridhar Daithankar
Oh.. By all means.. Please do..

The reason I posted it was that I didn't want to work on it if core is not 
going to accept it on account of non-compliance with the spec.

Is this fine?
 * Allow a 'connection *' pointer to be specified instead of a string to denote 
a connection.

I plan to work on it whenever possible. What I would like to do is eliminate the 
locks around the name-connection mapping, as we would be directly using the 
connection instead of a name.

I think we can also add the SQLCA to the connection structure so that each 
connection gets its own SQLCA. That way ECPG is as thread-safe as the calling 
application gets.

And on the plus side we don't have to worry about platform-specific threading 
models either.
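Roughly, what I have in mind is something like this (field names are purely 
illustrative, not ecpglib's current layout):

---
struct connection
{
	char	   *name;			/* kept so the standard AT "name" syntax still works */
	PGconn	   *pg_conn;		/* the underlying libpq connection */
	struct sqlca_t sqlca;		/* per-connection SQLCA instead of a per-thread one */
};
---

EXEC SQL AT :connptr ... would then dereference the pointer directly, with no 
name list and no lock around it.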

Thoughts?

 Shridhar

Bruce Momjian wrote:

Should I add this to the TODO list?

---

Lee Kindness wrote:

Sort of related, I was thinking about adding some more thread-related
code such that if a connection wasn't explicitly specified then the
last connection SET or CONNECTed to for the current thread is used,
rather than just the last connection.
But yeah, specifying the connection by variable (be it string or
connection ptr) would be a definite step forward. Currently you cannot
write a generic function like:
int getit(char *using_connection)
{
 EXEC SQL BEGIN DECLARE SECTION;
 char *s_connection = using_connection;
 int s_it;
 EXEC SQL END DECLARE SECTION;
 EXEC SQL AT :s_connection SELECT it INTO :s_it FROM some_table;
 return( s_it );
}
which could be run concurrently by multiple threads.

L.

Shridhar Daithankar writes:
 On Friday 27 February 2004 20:54, Michael Meskes wrote:
  On Fri, Feb 27, 2004 at 04:22:33PM +0530, Shridhar Daithankar wrote:
   How about, allowing 'connection *'? If somebody puts a 'connection *'
   there it is used. If it is a string a name search is performed. Best of
   both worlds.
 
  How shall anyone put a pointer to a connection struct inside the SQL
  statement?
 
  It would help me a lot if you'd be able to give some examples.
 
 EXEC SQL BEGIN DECLARE SECTION;
  connection *connectionPtr;
 EXEC SQL END DECLARE SECTION;
 
 EXEC SQL CONNECT TO db AS connectionPtr;
 EXEC SQL AT connectionPtr SELECT 1;
 
  After all, it is a matter of parsing some code and emitting equivalent C code, 
 isn't it?
 
  Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-03-03 Thread Shridhar Daithankar
Lee Kindness wrote:

Shridhar, want to discuss this off list a bit to work through the various
options and then revert back to the list with a suitable to-do (for
discussion)?
I don't mind. Just as a summary, I am listing the discussion/proposal so far on 
this issue..

- Dispose of connection names and replace them with a pointer.
- Make the SQLCA a member of the connection structure rather than a thread-local variable.
- Update man pages and documentation.
- Update examples.
 Drop me a mail offline to take this further.

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Out of space situation and WAL log pre-allocation (was

2004-03-03 Thread Shridhar Daithankar
Tom Lane wrote:
I think also that Simon completely misunderstood my intent in saying
that this could be user-scriptable policy.  By that I meant that the
*user* could write the code to behave whichever way he liked.  Not that
we were going to go into a mad rush of feature invention and try to
support every combination we could think of.  I repeat: code that pushes
logs into a secondary area is not ours to write.  We should concentrate
on providing an API that lets users write it.  We have only limited
manpower for this project and we need to spend it on getting the core
functionality done right, not on inventing frammishes.
Hmm... I totally agree. I think the backend could just offer a shared memory 
segment and a marker message to another process to allow copying from it. Then it 
is the application's business to do things.

Of course there has to be a two-way agreement about it, but an API is a much nicer 
thing than an application.

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Avoid MVCC using exclusive lock possible?

2004-03-01 Thread Shridhar Daithankar
On Tuesday 02 March 2004 06:29, Paul Tillotson wrote:
 However, for this to be useful, your table must not have any indexes,
 views, foreign keys, sequences, triggers, etc., or else you must be
 prepared to re-create all of them using application level code.

Which isn't a big deal, is it? You can write a single function to create the entire 
object and its dependencies. It is a one-time job but can save lots of time at 
runtime.

 I imagine this would break lots of things, but it would be nice if
 instead of Shridhar's rename step (see below) one could do this:

 $table1node = query("SELECT relfilenode FROM pg_class WHERE relname =
 '$old_table';");
 $table2node = query("SELECT relfilenode FROM pg_class WHERE relname =
 '$new_table';");
 exec("UPDATE pg_class SET relfilenode = $table2node WHERE relname =
 '$old_table';");
 exec("UPDATE pg_class SET relfilenode = $table1node WHERE relname =
 '$new_table';");

 You would of course need to change the relfilenode for all of the
 toasted columns and indexes as well in the same atomic step, but it
 seems like this might be more compatible with postgresql's MVCC model
 than other ideas suggested.

I still don't understand what is not so good about rename. All the indexes 
remain there. Views need updating, I agree; I missed that last time. But what 
you are suggesting is part of rename, if not all of it.

I would always prefer to let PG handle these kinds of details. I'm not very 
comfortable with mucking around in the catalogs, especially if there is an 
alternative.

 Shridhar

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Avoid MVCC using exclusive lock possible?

2004-02-29 Thread Shridhar Daithankar
On Tuesday 24 February 2004 22:13, Stephen wrote:
 Hi,

 Recently, I ran a huge update on an Integer column affecting 100 million
 rows in my database. What happened was my disk space increased in size and
 my IO load was very high. It appears that MVCC wants to rewrite each row
 (each row was about 5kB due to a bytea column). In addition, VACUUM needs
 to run to recover space eating up even more IO bandwidth.

I am sure people have already answered the approach you suggested, so let me 
suggest a workaround for your problem.

You could run the following in a transaction (see the SQL sketch after the list):

- begin
- create another table with exactly the same structure
- write a procedure that reads from the input table and updates the values in between
- drop the original table
- rename the new table to the old name
- commit
- analyze the new table
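A sketch in SQL (table and column names are made up; the "update" here is folded 
into the copying SELECT):

---
BEGIN;
CREATE TABLE big_new AS
    SELECT id, some_int + 1 AS some_int, payload   -- apply the change while copying
    FROM big;
DROP TABLE big;
ALTER TABLE big_new RENAME TO big;
COMMIT;
ANALYZE big;
-- indexes, constraints and triggers must be re-created on the new table
---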

Except for the increased disk space, this approach has all the good things 
postgresql offers. Especially with transactional DDL it is a huge benefit. You 
certainly do save on vacuum.

If the entire table is updated then you can almost certainly get things done 
faster this way.

HTH

 Shridhar


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-02-28 Thread Shridhar Daithankar
On Friday 27 February 2004 22:24, Lee Kindness wrote:
 Sort of related, I was thinking about adding some more thread-related
 code such that if a connection wasn't explicitely specified then the
 last connection SET or CONNECTed to for the current thread is used,
 rather than just the last connection.

 But yeah, specifying the connection by variable (be it string or
 connection ptr) would be a definite step forward. Currently you cannot
 write a generic function like:

  int getit(char *using_connection)
  {
   EXEC SQL BEGIN DECLARE SECTION;
   char *s_connection = using_connection;
   int s_it;
   EXEC SQL END DECLARE SECTION;

   EXEC SQL AT :s_connection SELECT it INTO :s_it FROM some_table;
   return( s_it );
  }

 which could be run concurrently by multiple threads.

Consider another scenario. In a C++ class you want to contain a database 
connection. The class needs to make n resources thread-safe, including the 
database connection. Now each instance of the class would refer to a 
different database connection with the same code.

Doing the same with char strings is just not as clean, IMHO..
 
 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-02-27 Thread Shridhar Daithankar
Michael Meskes wrote:
On Mon, Feb 23, 2004 at 09:27:40PM +0530, Shridhar Daithankar wrote:
What I wonder is, do we really need to maintain that level of lookup? Can't we 
just say a connection is a 'struct connection *' which should be opaque and 
should not be touched or poked inside, just like PGConn.
I'm not sure I understand you correctly. The SQL standard says you can
call your statement as this:
exec sql at CONNECTION select 1;
Here CONNECTION of course is a string, the name of the connection. So,
yes, we have to maintain that list to make sure we get the right
connection.
Or what were you asking?
I am asking for CONNECTION being a variable of data type 'connection *' rather 
than 'const char *'. That would avoid name lookups.

Is that out of spec?

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-02-27 Thread Shridhar Daithankar
Zeugswetter Andreas SB SD wrote:

I am asking for CONNECTION being a variable of data type 'connection *' rather 
than 'const char *'. That would avoid name lookups.

Is that out of spec?
Yes, but the preprocessor could still add an optimization ala 'connection *' for
the hardcoded cases (exec sql set connection 'myconn1'; exec sql at 'myconn1' ...).
It needs to maintain the string list for the non hardcoded cases though.
How about allowing 'connection *'? If somebody puts a 'connection *' there, it is 
used. If it is a string, a name search is performed. Best of both worlds.

Can that be an acceptable compromise?

 Shridhar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [pgsql-www] Collaboration Tool Proposal

2004-02-27 Thread Shridhar Daithankar
On Friday 27 February 2004 19:59, Andrew Dunstan wrote:
 I believe it should not be hard to do a one-time bulk registration of
 everyone on the lists, if that was desired.

I agree. If possible we could also run a PostgreSQL registration system where we 
can track general usage of PostgreSQL on various fronts, for whatever information 
people are willing to put their names on. It could be massive advocacy 
ammo..

 Signal to Noise. It's not at all clear to me why a bug tracking system
 should have a worse signal to noise ratio than a mailing list with
 similar access rules, especially since we also provide the facility to
 log bugs through a web form directly off the postgresql.org home page.
 But even if it does, that can be managed by good triage. That should
 improve the ratio for all but those doing the triage. Personally, I'd be
 surprised if it took one knowledgable person more than 30 minutes a day
 to weed out the garbage (sorry for the mixed metaphor), and if the load
 was spread across several people it would be just a few minutes a day
 for any one of them, at a significant saving to everyone else.

Look at KDE's Bugzilla. They first make you search and then file the bug. I have 
seen duplicates drop from several thousand to a few hundred. A simple but 
effective step. For sure PostgreSQL will hardly receive that kind of bug 
flurry.

They put a direct 'report a bug' item in KDE 2.0: click on a menu item and it sends a 
bug report. As a result they had massive duplicates. Now the menu item gives 
you a URL to click on, then you go and search, etc. Very nice system.

 Email interface: it should not be beyond the wit of man to provide some
 level of email interface to any reasonable bug tracking system. Whether
 or not it is worth doing depends on the demand. Two obvious places for
 it would be 1) to allow initial logging of a bug via email, and 2)
 periodically run query 'foo' and email me the results. Getting a once a
 day digest of new bug reports might be quite nice in fact.

Logging bugs via email is a bad idea because you cannot enforce the fields. 
Would you like somebody filing a bug via mail and leaving the PostgreSQL version 
out?

Let people use web forms. It is nice enough IMHO..

 Shridhar

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-02-27 Thread Shridhar Daithankar
On Friday 27 February 2004 20:54, Michael Meskes wrote:
 On Fri, Feb 27, 2004 at 04:22:33PM +0530, Shridhar Daithankar wrote:
  How about, allowing 'connection *'? If somebody puts a 'connection *'
  there it is used. If it is a string a name search is performed. Best of
  both worlds.

 How shall anyone put a pointer to a connection struct inside the SQL
 statement?

 It would help me a lot if you'd be able to give some examples.

EXEC SQL BEGIN DECLARE SECTION;
 connection *connectionPtr;
EXEC SQL END DECLARE SECTION;

EXEC SQL CONNECT TO db AS connectionPtr;
EXEC SQL AT connectionPtr SELECT 1;

After all, it is a matter of parsing some code and emitting equivalent C code, 
isn't it?

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Tablespaces

2004-02-26 Thread Shridhar Daithankar
On Thursday 26 February 2004 15:37, Gavin Sherry wrote:
 Tying it all together:

 The catalogs pg_database, pg_namespace, and pg_class will have a 'tblspc'
 field. This will be the OID of the table space the object resides in, or 0
 (default table space). Since we can then resolve relid/relname, schema and
 database to a tablespace, there aren't too many cases when extra logic
 needs to be added to the IO framework. In fact, most of it is taken care
 of because of the abstraction of relpath().

 The creation of table spaces will need to be recorded in xlog in the same
 way that files are in heap_create() with the corresponding delete logic
 incase of ABORT.

Is 'tablespace' some sort of copyrighted term? Last I remember, the discussion was 
about location/storage.

 Just a thought..

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Sparc optimizations

2004-02-24 Thread Shridhar Daithankar
http://www.osnews.com/printer.php?news_id=6136

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Sparc optimizations

2004-02-24 Thread Shridhar Daithankar
On Tuesday 24 February 2004 21:15, scott.marlowe wrote:
 On Tue, 24 Feb 2004, Shridhar Daithankar wrote:
  http://www.osnews.com/printer.php?news_id=6136

 That page gets a please don't link to printer ready pages error and
 redirects to here:

 http://www.osnews.com/story.php?news_id=6136

My bad.. Shouldn't have been doing that..
 
 Shridhar

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] Thread safe connection-name mapping in ECPG. Is it required?

2004-02-23 Thread Shridhar Daithankar
Hi all,

I was just going through the ecpg sources checking for thread safety.

It looks like the mutex protects the connections list in connection.c. I do 
not like that from an application developer's perspective.

If I am developing an application and using multiple connections in multiple 
threads, I have to store a connection name for each connection as a C string. 
Of course, I also have to protect it across threads so that I can correctly tell 
ecpg which connection I will be talking to next.

If an application can take care of a C string, it can also take care of a 
connection structure. On top of that, it eliminates the list lookup. The 
potential performance gain could be worth it with hundreds of 
connections and a busy website/application server.

What I wonder is, do we really need to maintain that level of lookup? Can't we 
just say a connection is a 'struct connection *' which should be opaque and 
should not be touched or poked inside, just like PGconn.

Just a thought... 

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] MS SQL features for new version

2004-02-10 Thread Shridhar Daithankar
Hi all,

Just stumbled upon this. just an FYI,

http://www.microsoft.com/sql/yukon/productinfo/top30features.asp

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Shridhar Daithankar
On Thursday 05 February 2004 20:24, Tom Lane wrote:
 Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
  So Imho the target should be to have not much IO open for the checkpoint,
  so the fsync is fast enough, even if serial.

 The best we can do is push out dirty pages with write() via the bgwriter
 and hope that the kernel will see fit to write them before checkpoint
 time arrives.  I am not sure if that hope has basis in fact or if it's
 just wishful thinking.  Most likely, if it does have basis in fact it's
 because there is a standard syncer daemon forcing a sync() every thirty
 seconds.

There are other benefits of writing pages earlier even though they might not 
get synced immediately.

It would tell the kernel that this is the latest copy of the updated buffer. The 
kernel VFS should make that copy visible to every other backend as well. The buffer 
manager will fetch the updated copy from the VFS cache next time. All without 
going to disk, actually.. (within the 30-second window, of course..)

 People keep saying that the bgwriter mustn't write pages synchronously
 because it'd be bad for performance, but I think that analysis is
 faulty.  Performance of what --- the bgwriter?  Nonsense, the *point*
 of the bgwriter is to do the slow tasks.  The only argument that has
 any merit is that O_SYNC or immediate fsync will prevent us from having
 multiple writes outstanding and thus reduce the efficiency of disk
 write scheduling.  This is a valid point but there is a limit to how
 many writes we need to have in flight to keep things flowing smoothly.

Is it a valid assumption for platforms-that-postgresql-supports that a write 
call would make changes visible across processes?

 What I'm thinking now is that the bgwriter should issue frequent fsyncs
 for its writes --- not immediate, but a lot more often than once per

Frequent fsyncs, or frequent fsyncs per file descriptor written? I thought it 
was the latter.

Just a thought.

 Shridhar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Extending SET SESSION AUTHORIZATION

2004-01-31 Thread Shridhar Daithankar
Tom Lane wrote:

Bruce Momjian [EMAIL PROTECTED] writes:

Ezra Epstein wrote:

I'd like to extend SET SESSION AUTHORIZATION to support a form which takes a
password.


Uh, a password?  What purpose would that serve?


Indeed.  SET SESSION AUTH is already allowed only to superusers --- a
superuser hardly needs any additional privileges to become whoever he
wants.
It is very helpful for connection pooling/persistent connections. Say I have 10 
connections opened as superuser. I can switch the connection authorization per 
query and let the database enforce the rules and access control.

For authentication, I can keep a dummy connection.
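
For example, a rough sketch of what each pooled (superuser) connection could do per 
request; the role and table names here are purely illustrative:

SET SESSION AUTHORIZATION 'appuser';
SELECT * FROM private_stuff;        -- runs with appuser's privileges
RESET SESSION AUTHORIZATION;        -- back to the pool's superuser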

There could be multiple ways to improve this behaviour.

1. If a non-superuser attempts SET SESSION AUTHORIZATION, let him do so with a 
proper password.

2. Add password to set session authorization as suggested above.

I would prefer this, actually. In case the application is breached, with option 2 
the database is left wide open. With option 1, that may not be the case if the 
initial connection is made with a sufficiently unprivileged user. But then I need to 
cache the actual password, which is another can of worms..:-(

Additionally it would be great if libpq could just authenticate a user without 
forking a backend. I think some kind of PAM voodoo could be substituted for that, 
but having it in the libpq front end would be even better.

I did suggest this earlier as well. Just reiterating..

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Issue with Linux+Pentium SMP Context Switching

2003-12-19 Thread Shridhar Daithankar
On Saturday 20 December 2003 00:00, Josh Berkus wrote:
 In discussions with Linux kernel hackers online, they blame the way that
 PostgreSQL uses shared memory.   Whether or not they are correct, the
 effect of the issue is to harm PostgreSQL's performance and make us look
 bad on one of the major enterprise systems of choice: the multi-processor
 Xeon system.

Two suggestions..

1. Patch the Linux kernel for an HT-aware scheduler.
2. Try running the Xeons in HT-disabled mode.

See if that helps. I would say using 2.6 on it is recommended anyways.. If 
possible of course..

 Shridhar


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] ecpg tests compile failure

2003-12-18 Thread Shridhar Daithankar
On Thursday 18 December 2003 17:37, Dave Cramer wrote:
 test_thread.pgc:51: undefined reference to `pthread_create'
 undefined reference to `pthread_join'

It is not linking against the pthreads library. Do you have a -lpthread clause in 
your compilation/linking command?

 Shridhar


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] ecpg tests compile failure

2003-12-18 Thread Shridhar Daithankar
On Thursday 18 December 2003 18:03, Dave Cramer wrote:
 Shridhar,

 Obviously not, but I just did make inside the test directory, so I
 assume the Makefile needs to be fixed.

I don't think so.. Check this..

[EMAIL PROTECTED] pgsql]$ find . -name Makefile.global
./src/Makefile.global
[EMAIL PROTECTED] pgsql]$ grep THREAD_LIBS ./src/Makefile.global
THREAD_LIBS = -lpthread

If you don't have this flag defined, the tests won't be linked against the 
thread library. Check what value this flag is set to.

You did a ./configure --enable-thread-safety, right? Which platform is this 
BTW?

This is on a CVS head checkout from a couple of days ago on Linux.. All the 
tests compiled successfully.

 Shridhar


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Project status pages

2003-12-18 Thread Shridhar Daithankar
On Thursday 18 December 2003 20:16, Robert Treat wrote:
 On Wed, 2003-12-17 at 21:37, Bruce Momjian wrote:
  I have put up a list of projects being worked on and their TODO lists in
  hope that people will find it easier to work on them:
 
  http://momjian.postgresql.org/main/writings/pgsql/project

Browsing that page I noticed the tablespaces item. I have this file we prepared 
a long time back. I think it deserves a mention on that page.

Maybe I should edit it to completion and then send it. But I am posting it here for 
review by others so that any bugs are ironed out..

 Shridhar
POSTGRESQL LOCATIONS DESIGN DOCUMENT


Authors:
Christopher Kings-Lynne ([EMAIL PROTECTED])

Written:
30th June 2003

Version:
0.1

REQUIREMENTS

1. To be able to specify the physical location of tables and indexes 
   from within the PostgreSQL SQL environment.

2. Databases and schemas will be able to have a default table and 
   default index location specified.  Unless overridden at creation 
   time, a new object will inherit its location from, first, its 
   schema and second, its database. ** Is a default temporary 
   tablespace required?

3. At creation time of all tables and indexes, it should be possible 
   to specify a location.  This includes UNIQUE and PRIMARY KEY 
   declarations.

4. We should allow anonymous locations wherever we allow a locname.  
   This means that PostgreSQL will create a new location on the fly, 
   with a generated name.  This could help with backward compatibility 
   with the CREATE DATABASE location clause.

5. If you don't like 'location', then s/location/storage/g.

CATALOG CHANGES
---

A new (global) catalog will need to be created, pg_location.

It will need to have the following fields:

locoid  OID UNIQUE  NOT NULLLocation identifier
locname TEXTUNIQUE  NOT NULLLocation name
locpath TEXTNOT NULLLocation path (eg. /spaces/ind1)
locconfig   TEXT[]  GUC variables for objects
   in this location.
locacl  ACLITEM[]   Access controls for location.

New fields will be required in pg_database:

datrellocid     OID     Default table location identifier
datindlocid     OID     Default index location identifier

New fields will be required in pg_namespace:

nsprellocid     OID     Default table location identifier
nspindlocid     OID     Default index location identifier

The default location fields above can be NULL, in which case objects
default to datpath for the database.

*** Should we just remove datpath and make datrellocid NOT NULL???

PERMISSIONS
---

Locations will have a single permission - USAGE.  By default, the creator of a 
tablespace will be its owner and will have USAGE rights.

Locations can only be created and altered by a superuser.

GRAMMAR
---

For direct manipulation of locations:

CREATE LOCATION locname AS 'locpath';

DROP LOCATION locname [ CASCADE | RESTRICT ];

ALTER LOCATION locname SET variable { TO | = } { value | DEFAULT };
ALTER LOCATION locname RESET variable;
ALTER LOCATION locname RENAME TO newname;

For the specification of default locations:

CREATE DATABASE datname [ WITH ] { TABLE | INDEX } LOCATION locname;
ALTER DATABASE datname { TABLE | INDEX } LOCATION { locname | DEFAULT };

*** How do we handle existing LOCATION clause?

CREATE SCHEMA nspname { TABLE | INDEX } LOCATION locname;
ALTER SCHEMA nspname { TABLE | INDEX } LOCATION { locname | DEFAULT };

Object creation:

CREATE TABLE ... ( ... PRIMARY KEY LOCATION locname, UNIQUE LOCATION locname ... ) 
... LOCATION locname;
CREATE INDEX ... LOCATION locname;
ALTER TABLE ... ADD { UNIQUE | PRIMARY KEY } ... LOCATION locname;
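
Example usage (illustrative only: this is proposed syntax, not something that
exists today, and the paths and names are made up):

CREATE LOCATION fastdisk AS '/spaces/fast';
CREATE SCHEMA sales TABLE LOCATION fastdisk;
CREATE TABLE sales.orders (id integer PRIMARY KEY LOCATION fastdisk)
    LOCATION fastdisk;
CREATE INDEX orders_id_idx ON sales.orders (id) LOCATION fastdisk;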

DIAGNOSTICS
---

These are the result strings for location manipulation:

CREATE LOCATION
DROP LOCATION
ALTER LOCATION

EXOTIC FEATURES
---

These features are cool, but won't make it into the initial implementation
most likely:

1. The SET variable stuff above.
2. WAL location
3. Location quotas


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] PostgreSQL port to pure Java?

2003-12-11 Thread Shridhar Daithankar
Andrew Dunstan wrote:
Not to mention it would kill PostgreSQL's current   speedy performance!


Maybe, maybe not. Modern JVMs have much better performance 
characteristics than was once the case. Also, some of the things that 
Java buys you (memory management, threading, for example) might actually 
enhance performance in some circumstances. A crude port wouldn't work, 
though - it would have to be done in such a way as to leverage the 
platform's strengths, just as we leverage the strengths of writing in C.

The *big* problem would be keeping a Java port in sync with the base. 
That would make it almost impossible to do in a worthwhile way IMNSHO - 
the maintenance would be a nightmare.

It would be an excellent student exercise, though :-)
Jokes and facts aside, I can't help thinking how much better it would have been 
if postgresql were in C++. We could easily plug in multiple implementations of 
underlying subsystems without mucking much with the base code..

 Wild thought anyways..

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] tablespaces?

2003-12-09 Thread Shridhar Daithankar
Bruce Momjian wrote:

Guys, where are we on tablespaces?  Should I set up a project page or
does it need organization?  I think we need to define the command syntax
and then implementation details.  I don't think it is that hard and
certainly is possible for 7.5.


There was an off-list group communication about it. Gavin Sherry promised to set 
up a separate CVS server for it and merge his work to date.

I understand he is busy. We were looking at a couple of weeks' break before 
getting things rolling. IIRC it is almost over.

That is where it stands, as I understand it.

 Shridhar



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] *sigh*

2003-12-04 Thread Shridhar Daithankar
On Wednesday 03 December 2003 13:59, Mark Kirkwood wrote:
 How about:

 Implement a function estimated_count that can be used instead of
 count. It could use something like the algorithm in
 src/backend/commands/analyze.c to get a reasonably accurate psuedo count
 quickly.

 The advantage of this approach is that count still means (exact)count
 (for your xact snapshot anyway). Then the situation becomes:

 Want a fast count? - use estimated_count(*)
 Want an exact count - use count(*)

Something like select reltuples from pg_class where relname='foo'?
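
To spell that out, a sketch of what such an estimated_count could boil down to (it 
ignores schemas and is only as accurate as the last VACUUM/ANALYZE that refreshed 
reltuples):

CREATE FUNCTION estimated_count(text) RETURNS bigint AS
    'SELECT reltuples::bigint FROM pg_class WHERE relname = $1'
    LANGUAGE sql STABLE;

SELECT estimated_count('foo');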

 Shridhar


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Providing anonymous mmap as an option of sharing memory

2003-11-26 Thread Shridhar Daithankar
Tom Lane wrote:
Shridhar Daithankar [EMAIL PROTECTED] writes:

I was looking thr. the source and thought it would be worth to seek
opinion on this proposal.
This has been discussed and rejected before.  See the archives.
I went through this for the details.

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/doc/TODO.detail/mmap

There seem to be two objections to mmap.

1. If a backend from the last crashed postmaster still exists, then it might have 
files etc. open, and that is in general not such a good idea.

2. For replacing stdio for data and WAL files with mmap: mmap does not guarantee 
the order of I/O, which defeats WAL.

I covered only the first point in my post. IMO it is not such an unsolvable problem. 
If a postmaster crashes hard but leaves a backend running, would it clean up the pid 
file etc.? I don't think so. So if a postmaster can start in a 'pid-clean' state, 
then it is guaranteed that there are no children left around.

There were issues with Linux not supporting MAP_SHARED and MAP_ANONYMOUS 
simultaneously, but those are quite old messages, from 1998, talking of Linux 
2.0.x. I don't think it is true anymore, but I need to check.

Too bad FreeBSD's MAP_NOSYNC is not a standard; otherwise, even for point 2, it could 
have been considered.

Did I miss something?

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] 7.5 Plans

2003-11-26 Thread Shridhar Daithankar
Christopher Kings-Lynne wrote:

Hi everyone,

I'm just interested in what everyone's personal plans for 7.5 
development are?

Shridar, Gavin and myself are trying to get the tablespaces stuff off 
the ground.  Hopefully we'll have a CVS set up for us to work in at some 
point (we didn't think getting a branch and commit privs was likely). 
Changing all our $Id$ tags to $Postgres$ would make a separate CVS a lot 
easier, hint hint :)

What's everyone else wanting to work on?
By the time tablespaces CVS goes live, I plan to study postgresql buffer 
management and push mmap. (See other thread). Hopefully my home internet 
connection will be up soon so that I can work on weekends.

 Shridhar

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Providing anonymous mmap as an option of sharing memory

2003-11-26 Thread Shridhar Daithankar
Tom Lane wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:

I covered only first point in my post. IMO it is not such a unsolvable
problem.  If a postmaster crashes hard but leaves a backend running,
would it clean pid file etc? I don't think so. So if a postmaster can
start on a 'pid-clean' state, then it is guaranteed to be no childs
left around.


And that helps how?  The problem is to detect whether there are any
children left from the old postmaster, when what you have to work from
is the pid-file it left behind.
Fine. We need shared memory for that. How about using one 8K page just for 
detecting that? We don't need to base the whole shared memory model on that, right?

Maybe we can put clog in a shared memory segment, which would serve as the process 
counter, and move the shared buffers to mmap?

In any case, you're still handwaving away the very real portability
issues around mmap.  Linux is not the universe, and Linux+BSD isn't
either.
From the machines I can access here, the following have anonymous and shared mmap..

[ost] ~ uname -a
SunOS host 5.8 Generic_108528-21 sun4u sparc SUNW,Sun-Fire-880 Solaris
[host] ~ uname -a
AIX host 1 5 0001A5CA4C00
[/home/user]uname -a
HP-UX host B.11.00 A 9000/785 2005950738 two-user license
Is that enough support?

We might still have considered it, despite the negatives, if anyone had
been able to point to any actual *advantages* of mmap.  There are none.
Yes, the SysV shmem API is old and ugly and crufty, but it does what we
need it to do.


1) Per-database buffers

Postgresql does not perform well with a large number of buffers. Say an 
installation is configured for 100K buffers and has 5 databases. Now what would 
happen if each of these databases got its own 100K buffers?

mmap cannot expand shared memory without a server restart. The current 
implementation of shared memory behaves the same way.

Rather than moving it to grow shared memory as and when required, we could push 
per-database buffers to improve scalability.

I am thinking of this:

1. Introduce parameter columns in pg_database, for shared memory size (to start 
with) and the number of live connections to that database. Maybe a callback to the 
daemon postmaster to reread the configuration, if possible. (In shared memory, maybe?)

2. Provide start and stop server commands which essentially either let a 
connection happen or not.

Now if somebody modifies the buffer parameters for a database (say via ALTER 
DATABASE), all that has to happen is a disconnect and reconnect. If this is the first 
connection to the database, the parent postmaster should reread the per-database 
parameters and apply them. The same can happen with the start/stop commands.

2) No more kernel mucking required.

Recent Linux installations provide a sane enough default for SHMMAX, but I am 
sure plenty of folks would be glad to see that dependency go.

I also want to talk about mmap for file IO but not in this thread.

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] 7.5 Plans

2003-11-26 Thread Shridhar Daithankar
Tom Lane wrote:
1. You can't easily generate a clean diff of your local version against
the original imported from postgresql.org.  The changes you actually
made get buried in a mass of useless $Foo$ diff lines.  Stripping those
out is possible in theory but painful.
Is that the reason linux does not use CVS? I thought so at least.

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Providing anonymous mmap as an option of sharing memory

2003-11-25 Thread Shridhar Daithankar
Hello All,

I was looking through the source and thought it would be worthwhile to seek opinions 
on this proposal.

From what I understood so far, the core shared memory handling is done in 
pgsql/src/backend/port/sysv_shmem.c. It is linked by configure as per the 
runtime environment.

So I need to write another source file which exports the same APIs as 
above (i.e. all non-static functions in that file) but uses mmap, and that would 
do it for using anonymous mmap instead of SysV shared memory.

It might seem unnecessary to provide mmap-based shared memory, but this is just 
the first step I was thinking of.

In pgsql/src/backend/storage/ipc/shmem.c, all the shared memory allocations are 
done. I was thinking of gathering all the global variables in that file into a 
structure. The global variables would still be in place so that existing code would 
not break, but the structure would hold database-specific buffering information. 
Let's call that structure a database context.

That way we can assign different mmapped (anonymous, of course) regions per database. 
In the backend, we could just switch the database contexts, i.e. assign the global 
variables from the database context and let the backend write to the appropriate 
shared memory region. Every database would need at least two shared memory 
regions: one for operating on its own buffers and another, system-wide one where it 
could write to shared catalogs etc. It can close the shared memory regions 
belonging to other databases on startup.

Of course, buffer management alone would not cover database contexts altogether. 
WAL needs to be lumped in as well (not necessarily, though: if all WAL buffering goes 
through the system shared region, everything will still work). I don't know whether 
clog and data file handling are affected by this. If WAL goes into the database 
context, we can probably provide per-database WAL, which could go well with 
tablespaces too.

In the case of WAL per database, operations done on a shared catalog from a 
backend would need to flush both the system WAL and the database WAL to ensure such 
a transaction commits. Otherwise only flushing the database WAL would do.

This way we can provide a background writer process per database and a common 
buffer pool per database, minimising the impact of cross-database load 
significantly; e.g. a vacuum full on one database would not hog another database due 
to buffer cache pollution. (IO can still saturate, though.) This way we can push 
the hardware to its limit, which might not be possible right now in some cases.

I was looking for the reason a large number of buffers degrades performance, 
and the source code browsing spiralled into this thought. So far I haven't figured 
out any reason why a large number of buffers should degrade performance. Still 
looking for it.

Comments?

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-21 Thread Shridhar Daithankar
Matthew T. O'Connor wrote:

But we track tuples because we can compare against the count given by 
the stats system.  I don't know of a way (other than looking at the FSM, 
or contrib/pgstattuple ) to see how many dead pages exist.
I think making pg_autovacuum dependent on pgstattuple is a very good idea.

It might be a good idea to extend pgstattuple to return pages that are 
excessively contaminated and clean them ASAP. Step by step we get closer to a 
daemonized vacuum.
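
For reference, a rough sketch of the kind of check pg_autovacuum could make via 
contrib/pgstattuple (the table name is illustrative, and the exact output format 
depends on the contrib version installed):

-- reports dead-tuple statistics for the table by scanning it
SELECT pgstattuple('foo');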

 Shridhar

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] RPM building fun

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 13:09, David Fetter wrote:
 On Thu, Nov 20, 2003 at 11:45:06AM +0530, Shridhar Daithankar wrote:
  Joshua D. Drake wrote:
  Is there some way to remove this piece of sh^H^Hlegacy from the
  configure script?  Does anybody actually use info?
  
  All of GNU.
 
  Additionally it is very good resource when you use Konqueror to
  browse it as html..

 Roight.  How to tell it to get the right prefix, then?

I didn't get you.. I type info:/autoconf and it shows me the autoconf 
page.

You mean the actual info files? No idea. Never seen them in the 
wild...:-)

 Shridhar
 


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
Josh Berkus wrote:

Shridhar,
 However I do not agree with this logic entirely. It pegs the next vacuum
w.r.t current table size which is not always a good thing.


No, I think the logic's fine, it's the numbers which are wrong.   We want to 
vacuum when updates reach between 5% and 15% of total rows.   NOT when 
updates reach 110% of total rows ... that's much too late.
Well, it looks like thresholds below 1 should be the norm rather than the exception.

Hmmm ... I also think the threshold level needs to be lowered; I guess the 
purpose was to prevent continuous re-vacuuuming of small tables?  
Unfortunately, in the current implementation, the result is tha small tables 
never get vacuumed at all.

So for defaults, I would peg -V at 0.1 and -v at 100, so our default 
calculation for a table with 10,000 rows is:

100 +  ( 0.1 * 10,000 ) = 1100 rows.
I would say -V 0.2-0.4 could be great as well. The point to emphasize is that 
thresholds of less than 1 should be used.

Furthermore analyze threshold depends upon inserts+updates. I think it
should also depends upon deletes for obvious reasons.
Yes.  Vacuum threshold is counting deletes, I hope?
It does.

My comment about the frequency of vacuums vs. analyze is that currently the 
*default* is to analyze twice as often as you vacuum.Based on my 
experiece as a PG admin on a variety of databases, I believe that the default 
should be to analyze half as often as you vacuum.
OK.

I am all for experimentation. If you have real life data to play with, I
can give you some patches to play around.
I will have real data very soon .
I will submit a patch that would account for deletes in the analyze threshold. Since 
you want to delay the analyze, I would calculate the analyze count as

n = updates + inserts - deletes

rather than the current n = updates + inserts. I will also update the README with 
examples and a note on analyze frequency.

What do the statistics gather, BTW? Just the number of rows or something else as 
well? I think I will raise that on -hackers separately.

I am still wary of inverting the vacuum/analyze frequency. Do you think it is better 
to set an inverted default rather than documenting it?

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
 Shridhar Daithankar wrote:
  I will submit a patch that would account deletes in analyze threshold.
  Since you want to delay the analyze, I would calculate analyze count as

 deletes are already accounted for in the analyze threshold.

Yes, my bad. Deletes are not accounted for when initializing the analyze count, but 
they are used later.

  I am still wary of inverting vacuum analyze frequency. You think it is
  better to set inverted default rather than documenting it?

 I think inverting the vacuum and analyze frequency is wrong.

Me too. ATM the attached patch is all I can think of. Josh, is it sufficient 
for you?..:-)

Matthew, I am confused about one thing. Why would autovacuum count updates 
while checking the analyze threshold? Analyze does not change the statistics, 
right? (w.r.t. line 1072, pg_autovacuum.c). For updating statistics, only 
inserts + deletes should suffice, shouldn't they?

Other than that, I think autovacuum does everything it can.

Comments?

 Shridhar
*** README.pg_autovacuum.orig	Thu Nov 20 19:58:29 2003
--- README.pg_autovacuum	Thu Nov 20 20:26:39 2003
***
*** 141,150 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) > VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
--- 141,150 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) >= AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) >= VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
***
*** 158,163 
--- 158,186 
  and running ANALYZE more often should not substantially degrade system
  performance.
  
+ Examples:
+ 
+ Following table shows typical usage of pg_autovacuum settings.
+ These are put here so that a DBA can have some starting point while
+ tuning pg_autovacuum.
+ 
+ Vacuum is triggered by updates and deletes. So in case of vacuum,
+ last column indicates total of updates and deletes required
+ to trigger vacuum. In case of analyze, the operations would count total
+ number of inserts, updates and deletes.
+ 
+ Threshold	Scaling factor	Records		No. of Operations
+ 1,000		1		10,000		11,000
+ 1,000		2		10,000		21,000
+ 1,000		0.5		10,000		 6,000
+ 1,000		0.1		10,000		 2,000
+ 
+ Although analyze is cheaper operation compared to vacuum,
+ it might be needed less often. The default is to analyze twice as much as
+ vacuum but that might be too aggressive for some installations. It is advised that
+ such installations tune their analyze threshold separately, rather than relying upon
+ the default behaviour.
+ 
  Sleeping:
  -
  

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-20 Thread Shridhar Daithankar
On Thursday 20 November 2003 20:29, Shridhar Daithankar wrote:
 On Thursday 20 November 2003 20:00, Matthew T. O'Connor wrote:
  Shridhar Daithankar wrote:
   I will submit a patch that would account deletes in analyze threshold.
   Since you want to delay the analyze, I would calculate analyze count as
 
  deletes are already accounted for in the analyze threshold.

 Yes. My bad. Deletes are not accounted in initializing analyze count but
 later they are used.

   I am still wary of inverting vacuum analyze frequency. You think it is
   better to set inverted default rather than documenting it?
 
  I think inverting the vacuum and analyze frequency is wrong.

 Me. Too. ATM all I can think of this patch attached. Josh, is it sufficient
 for you?..:-)

Use this one. A warning added for too aggressive vacuuming. If it is OK by 
everybody, we can send it to the patches list.

 Shridhar
*** README.pg_autovacuum.orig	Thu Nov 20 19:58:29 2003
--- README.pg_autovacuum	Thu Nov 20 20:35:34 2003
***
*** 141,163 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) > VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
! vacuum_base_value + (vacuum_scaling_factor * number of tuples in the table)
! 
  AnalyzeThreshold is equal to:
! analyze_base_value + (analyze_scaling_factor * number of tuples in the table)
! 
  The AnalyzeThreshold defaults to half of the VacuumThreshold since it
  represents a much less expensive operation (approx 5%-10% of vacuum),
  and running ANALYZE more often should not substantially degrade system
  performance.
  
  Sleeping:
  -
  
--- 141,191 
  depending on the mixture of table activity (insert, update, or
  delete):
  
! - If the number of (inserts + updates + deletes) >= AnalyzeThreshold, then
only an analyze is performed.
  
! - If the number of (deletes + updates) >= VacuumThreshold, then a
vacuum analyze is performed.
  
  VacuumThreshold is equal to:
! vacuum_base_value + (vacuum_scaling_factor * number of tuples in the 
! table)
  AnalyzeThreshold is equal to:
! analyze_base_value + (analyze_scaling_factor * number of tuples in the 
! table)
  The AnalyzeThreshold defaults to half of the VacuumThreshold since it
  represents a much less expensive operation (approx 5%-10% of vacuum),
  and running ANALYZE more often should not substantially degrade system
  performance.
  
+ Examples:
+ 
+ Following table shows typical usage of pg_autovacuum settings.
+ These are put here so that a DBA can have some starting point while
+ tuning pg_autovacuum.
+ 
+ Vacuum is triggered by updates and deletes. So in case of vacuum,
+ last column indicates total of updates and deletes required
+ to trigger vacuum. In case of analyze, the operations would count total
+ number of inserts, updates and deletes.
+ 
+ Base		Scaling factor	Records		No. of Operations
+ 1,000		1		10,000		11,000
+ 1,000		2		10,000		21,000
+ 1,000		0.5		10,000		 6,000
+ 1,000		0.1		10,000		 2,000
+ 
+ Although analyze is cheaper operation compared to vacuum,
+ it might be needed less often. The default is to analyze twice as much as
+ vacuum but that might be too aggressive for some installations. It is advised 
+ that such installations tune their analyze threshold separately, rather than 
+ relying upon the default behaviour.
+ 
+ Furthermore, for aggressive vacuum/analyze behaviour, it is recommended that
+ scaling factor is set to less than 1. However too aggressive operation can affect
+ performance of normal database operations adversely. Do not apply such setting
+ to production databases without prior testing.
+ 
  Sleeping:
  -
  

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Win32 port

2003-11-19 Thread Shridhar Daithankar
Jean-Michel POURE wrote:
OK, now, some of us will complain that Win32 is not needed at a time when the 
Debian Synaptic graphical installer gives access to 13.748 packages. Win32 
sounds like an old Atari game station. Agreed. On the long-run, everyone 
will leave Win32, even my grand-mother.
Well, jokes and rants aside, the win32 port is a high priority.

The whole debate that started on advocacy was 'Is the win32 port a killer-enough 
feature?' and not 'Is the win32 port required now?'

Win32 will happen, and we will revisit this debate when there is another release 
with win32..:-)

 Shridhar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] [PERFORM] More detail on settings for pgavd?

2003-11-19 Thread Shridhar Daithankar
Josh Berkus wrote:

Shridhar,

I was looking at the -V/-v and -A/-a settings in pgavd, and really don't 
understand how the calculation works.   According to the readme, if I set -v 
to 1000 and -V to 2 (the defaults) for a table with 10,000 rows, pgavd would 
only vacuum after 21,000 rows had been updated.   This seems wrong.
No, that is correct.

It is calculated as

threshold = base + scale * number of current rows

which translates to

21,000 = 1,000 + 2 * 10,000

However I do not agree with this logic entirely. It pegs the next vacuum to the 
current table size, which is not always a good thing.

I would rather vacuum the table at 2000 updates, which is what you probably want.
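
As a rough illustration (hypothetical table name), this is the vacuum threshold the 
default settings (-v 1000, -V 2) yield for a given table size:

SELECT relname,
       1000 + 2 * reltuples AS vacuum_threshold
FROM pg_class
WHERE relname = 'foo';

So a 10,000-row table is vacuumed only after 21,000 updates/deletes, and a 
1,000,000-row table only after 2,001,000.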

Furthermore, the analyze threshold depends upon inserts + updates. I think it should 
also depend upon deletes, for obvious reasons.

Can you clear this up a little?   I'd like to tweak these settings but can't 
without being better aquainted with the calculation.
What did you expect in the above example? It is not difficult to tweak the 
pg_autovacuum calculations. For testing we can play around.

Also, you may want to reverse your default ratio for Vacuum/analyze frequency.  
True, analyze is a less expensive operation than Vacuum, but it's also needed 
less often -- only when the *distribution* of data changes.I've seen 
databases where the optimal vacuum/analyze frequency was every 10 min/once 
per day.
OK, the vacuum and analyze thresholds are calculated with the same formula as shown 
above, but with different parameters, as follows.

vacthresh = vacbase + vacscale*ntuples
anathresh = anabase + anascale*ntuples
What you are asking for is

vacthresh = vacbase*vacscale
anathresh = anabase + anascale*ntuples
Would that tilt the balance the way you want? I.e. an analyze is triggered when a 
fixed *percentage* of the table changes, but a vacuum is triggered when a fixed 
*number of rows* has changed.

I am all for experimentation. If you have real-life data to play with, I can 
give you some patches to play around with.

And BTW, this is all the brainchild of Matthew O'Connor (correct? I am not good at 
either names or spellings). The way I wrote pgavd originally, each table got its 
own separate threshold..:-). That was rather a brute-force approach.

 Shridhar





---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] RPM building fun

2003-11-19 Thread Shridhar Daithankar
Joshua D. Drake wrote:
Is there some way to remove this piece of sh^H^Hlegacy from the
configure script?  Does anybody actually use info?
All of GNU.
Additionally it is a very good resource when you use Konqueror to browse it as HTML..

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Background writer committed

2003-11-19 Thread Shridhar Daithankar
Jan Wieck wrote:

I committed the first part of the background writer process. We had a 
consensus on attempting to avoid write() calls from regular backends, 
but did no come to any conclusions what to do to force the kernel to 
actually do some IO.

Consequently, this patch is a separate process launched by postmaster, 
that periodically write()'s out some dirty buffers in LRU order. This 
causes the buffers returned for replacement (when a backend needs to 
read in a page) to be clean allways. The process does no sync(), fsync() 
or any other calls thus far. Nothing has changed in the checkpoint logic 
either.
Can we have some idea of where to tweak the sync routines for comparing results?

I mean, I would like to run pgbench with the same config all along and compare the 
performance difference between sync, fsync, fdatasync, etc.

If we could also run a real-world data test with that, it would be great as well.

 Shridhar

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Background writer process

2003-11-18 Thread Shridhar Daithankar
Bruce Momjian wrote:

Shridhar Daithankar wrote:

On Friday 14 November 2003 22:10, Jan Wieck wrote:

Shridhar Daithankar wrote:

On Friday 14 November 2003 03:05, Jan Wieck wrote:

For sure the sync() needs to be replaced by the discussed fsync() of
recently written files. And I think the algorithm how much and how often
to flush can be significantly improved. But after all, this does not
change the real checkpointing at all, and the general framework having a
separate process is what we probably want.
Having fsync for regular data files and sync for WAL segment a
comfortable compromise?  Or this is going to use fsync for all of them.
IMO, with fsync, we tell kernel that you can write this buffer. It may or
may not write it immediately, unless it is hard sync.
I think it's more the other way around. On some systems sync() might
return before all buffers are flushed to disk, while fsync() does not.
Oops.. that's bad.


Yes, one I idea I had was to do an fsync on a new file _after_ issuing
sync, hoping that this will complete after all the sync buffers are
done.

Since postgresql can afford lazy writes for data files, I think this
could work.
The whole point of a checkpoint is to know for certain that a specific
change is in the datafile, so that it is safe to throw away older WAL
segments.
I just made another posing on patches for a thread crossing win32-devel.

Essentially I said

1. Open WAL files with O_SYNC|O_DIRECT or O_SYNC(Not sure if current code does 
it. The hackery in xlog.c is not exactly trivial.)


We write WAL, then fsync, so if we write multiple blocks, we can write
them and fsync once, rather than O_SYNC every write.

2. Open data files normally and fsync them only in background writer process.

Now BGWriter process will flush everything at the time of checkpointing. It 
does not need to flush WAL because of O_SYNC(ideally but an additional fsync 
won't hurt). So it just flushes all the file descriptors touched since last 
checkpoint, which should not be much of a load because it is flushing those 
files intermittently anyways.

It could also work nicely if only background writer fsync the data files. 
Backends can either wait or proceed to other business by the time disk is 
flushed. Backends needs to wait for certain while committing and it should be 
rather small delay of syncing to disk in current process as opposed to in  
background process. 

In case of commit, BGWriter could get away with files touched in transaction
+WAL as opposed to all files touched since last checkpoint+WAL in case of 
checkpoint. I don't know how difficult that would be.

What is different in current BGwriter implementation? Use of sync()?


Well, basically we are still discussing how to do this.  Right now the
backend writer patch uses sync(), but the final version will use fsync
or O_SYNC, or maybe nothing.
The open items are whether a background process can keep the dirty
buffers cleaned fast enough to keep up with the maximum number of
backends.  We might need to use multiple processes or threads to do
this.   We certainly will have a background writer in 7.5 --- the big
question is whether _all_ write will go through it.   It certainly would
be nice if it could, and Tom thinks it can, so we are still exploring
this.
Given that fsync is blocking, the background writer has to scale up in terms of 
processes/threads and load w.r.t. disk flushing.

I would vote for threads, for the simple reason that in BGWriter threads are 
needed only to flush the files: get the fd, fsync it and get the next one. There is 
no need to make the entire process thread-safe.

Furthermore, BGWriter has to detect the disk limit. If adding threads does not 
improve fsyncing speed, it should stop adding them and wait. There is nothing to 
do when the disk is saturated.

If the background writer uses fsync, it can write and allow the buffer
to be reused and fsync later, while if we use O_SYNC, we have to wait
for the O_SYNC write to happen before reusing the buffer;  that will be
slower.
Certainly. However, a file opened with O_SYNC would not require a separate fsync. I 
suggested it only for WAL. But with WAL block grouping, as suggested in another 
post, using fsync for all files might be a good idea.

Just a thought.

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] start of transaction

2003-11-18 Thread Shridhar Daithankar
Hannu Krosing wrote:

Tom Lane kirjutas E, 17.11.2003 kell 02:08:

Neil Conway [EMAIL PROTECTED] writes:

Hmmm... I agree this behavior isn't ideal, although I can see the case
for viewing this as a mistake by the application developer: they are
assuming that they know exactly when transactions begin, which is not
a feature provided by their language interface.
Well, actually, it's a bug in the interface IMHO.  But as I said in the
last thread, it's a fairly widespread bug. 


I'm not sure that it is a client-side bug. For example Oracle seems to
_always_ have a transaction going, i.e. you can't be outside of
transaction, and you use just COMMIT to commit old _and_start_new_
transaction.
IIRC the same is true for DB2.
Actually, in Oracle a new transaction starts with the first DML after a commit. That 
does not include DDL, BTW.

And damn.. Actually I recently fixed a bug where I had to force a start of 
transaction in Pro*C immediately after a commit. Otherwise the real start of the 
transaction could be anywhere down the line, causing some weird concurrency 
issues. Rather than fiddling with Oracle support, I would hack my source code, 
especially since this is not the first Oracle bug I have worked around :-(

The fact that I couldn't control the exact transaction start was such an irritation, 
to put it mildly.. I so missed 'exec sql begin work' in ecpg..:-)

We've been taking the
position that the interface libraries should get fixed, and that's not
happening.  It's probably time to look at a server-side fix.
I hope that does not compromise the transaction control I have with libpq/ecpg etc.

And when we are talking about interface libraries, how many of them are within 
PG's control and how many are not? With languages maintained by the postgresql 
group, it should behave correctly, right? E.g. pl/perl, pl/python etc.

And for the other interface libraries, what are they exactly? PHP? Can't we just 
send them a stinker/patch to get that damn thing right (whatever wrong they are 
doing; I have kind of lost the thread on it..:-) Was it the exact time of transaction 
start vs. now()?)

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Background writer process

2003-11-18 Thread Shridhar Daithankar
Zeugswetter Andreas SB SD wrote:
1. Open WAL files with O_SYNC|O_DIRECT or O_SYNC(Not sure if 
Without grouping WAL writes that does not fly. Iff however such grouping
is implemented that should deliver optimal performance. I don't think flushing 
WAL to the OS early (before a tx commits) is necessary, since writing 8k or 256k 
to disk with one call takes nearly the same time. The WAL write would need to be 
done as soon as eighter 256k fill or a txn commits.
That means no special treatment for WAL files? If it works, great. There would be a 
single class of files to take care of w.r.t. the sync issue. Even simpler.

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Background writer process

2003-11-17 Thread Shridhar Daithankar
On Friday 14 November 2003 22:10, Jan Wieck wrote:
 Shridhar Daithankar wrote:
  On Friday 14 November 2003 03:05, Jan Wieck wrote:
  For sure the sync() needs to be replaced by the discussed fsync() of
  recently written files. And I think the algorithm how much and how often
  to flush can be significantly improved. But after all, this does not
  change the real checkpointing at all, and the general framework having a
  separate process is what we probably want.
 
  Having fsync for regular data files and sync for WAL segment a
  comfortable compramise?  Or this is going to use fsync for all of them.
 
  IMO, with fsync, we tell kernel that you can write this buffer. It may or
  may not write it immediately, unless it is hard sync.

 I think it's more the other way around. On some systems sync() might
 return before all buffers are flushed to disk, while fsync() does not.

Oops.. that's bad.

  Since postgresql can afford lazy writes for data files, I think this
  could work.

 The whole point of a checkpoint is to know for certain that a specific
 change is in the datafile, so that it is safe to throw away older WAL
 segments.

I just made another posting on patches for a thread crossing win32-devel.

Essentially I said

1. Open WAL files with O_SYNC|O_DIRECT or O_SYNC(Not sure if current code does 
it. The hackery in xlog.c is not exactly trivial.)
2. Open data files normally and fsync them only in background writer process.

Now BGWriter process will flush everything at the time of checkpointing. It 
does not need to flush WAL because of O_SYNC(ideally but an additional fsync 
won't hurt). So it just flushes all the file descriptors touched since last 
checkpoint, which should not be much of a load because it is flushing those 
files intermittently anyways.

It could also work nicely if only the background writer fsyncs the data files. 
Backends can either wait or proceed to other business by the time the disk is 
flushed. Backends need to wait for certain while committing, and the delay of 
syncing to disk in the current process, as opposed to in the background process, 
should be rather small.

In case of commit, the BGWriter could get away with the files touched in the 
transaction plus WAL, as opposed to all files touched since the last checkpoint 
plus WAL in case of a checkpoint. I don't know how difficult that would be.

What is different in the current BGWriter implementation? Use of sync()?

 Shridhar


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Background writer process

2003-11-13 Thread Shridhar Daithankar
On Friday 14 November 2003 03:05, Jan Wieck wrote:
 For sure the sync() needs to be replaced by the discussed fsync() of
 recently written files. And I think the algorithm how much and how often
 to flush can be significantly improved. But after all, this does not
 change the real checkpointing at all, and the general framework having a
 separate process is what we probably want.

Is having fsync for regular data files and sync for WAL segments a comfortable 
compromise?  Or is this going to use fsync for all of them?

IMO, with fsync, we tell the kernel that it can write this buffer. It may or may 
not write it immediately, unless it is a hard sync. 

Since postgresql can afford lazy writes for data files, I think this could 
work.

Just a thought..

 Shridhar


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] 7.4 hot backup capabilities?

2003-11-12 Thread Shridhar Daithankar
Austin Gonyou wrote:

What facilities are/will be available for hot(online) backups with the
7.4 release? PITR, something else? TIA.
pg_dump?

Did you mean hot failover or hot backup? Postgresql has done hot backup for a long time.

 Bye
  Shridhar
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] 7.4 hot backup capabilities?

2003-11-12 Thread Shridhar Daithankar
Austin Gonyou wrote:

On Wed, 2003-11-12 at 01:00, Shridhar Daithankar wrote:

Austin Gonyou wrote:


What facilities are/will be available for hot(online) backups with the
7.4 release? PITR, something else? TIA.
pg_dump?

Did you mean hot failover or hot backup? Postgresql does hot backup for a long time.


I was referring primarly to hot backup. We're migrating from Oracle to
pgsql, and we wanted to know more about this. From our current
understanding, pgsql's hot backup is more of an export, than say, a hot
backup with redo-logs type mode.(i.e. oracle rman). Is there a similar
utility? I didn't think that pg_dump did that functionality as an actual
backup. i.e. keeps track of block changes while IO is going on and
ensures stuff in process gets backed up as well.
pg_dump takes a snapshot when it starts and saves that snapshot consistently. 
Any ongoing transactions that are not yet committed are not visible to pg_dump 
anyway.

Furthermore since the snapshot is consistent, pg_dump does not need redo logs etc.

You can look into asynchronous replication if you want incremental backup. 
There is a gborg page for replication projects, 
http://gborg.postgresql.org/project/pgreplication/projdisplay.php

pg_dump can be selective but I don't think it can be incremental.

I hope I got your point correctly.

HTH

 Shridhar

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-11 Thread Shridhar Daithankar
On Tuesday 11 November 2003 18:55, Jan Wieck wrote:
 Shridhar Daithankar wrote:
  On Tuesday 11 November 2003 00:50, Neil Conway wrote:
  Jan Wieck [EMAIL PROTECTED] writes:
   We can't resize shared memory because we allocate the whole thing in
   one big hump - which causes the shmmax problem BTW. If we allocate
   that in chunks of multiple blocks, we only have to give it a total
   maximum size to get the hash tables and other stuff right from the
   beginning. But the vast majority of memory, the buffers themself, can
   be made adjustable at runtime.
 
  Yeah, writing a palloc()-style wrapper over shm has been suggested
  before (by myself among others). You could do the shm allocation in
  fixed-size blocks (say, 1 MB each), and then do our own memory
  management to allocate and release smaller chunks of shm when
  requested. I'm not sure what it really buys us, though: sure, we can
  expand the shared buffer area to some degree, but
 
  Thinking of it, it can be put as follows. Postgresql needs shared memory
  between all the backends.
 
  If the parent postmaster mmaps anonymous memory segments and shares them
  with children, postgresql wouldn't be dependent upon any kernel resourse
  aka shared memory anymore.

 And how does a newly mmap'ed segment propagate into a running backend?

It wouldn't. Just like we allocate a fixed amount of shared memory at startup 
now, we would do the same for mmapped segments. Allocate the maximum configured at 
startup. But it won't be in kernel space the way a shared memory segment 
would be.

Anyway we wouldn't be mmapping one segment per page. That might be just too 
much mmapping. We could just mmap the entire configured area and go ahead.
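
A minimal standalone sketch of the idea, assuming Linux-style anonymous shared 
mappings (whether that is portable enough is of course an open question); the size 
is made up and none of this is real postmaster code:

/*
 * Standalone sketch, not postmaster code.  The parent maps an anonymous
 * shared region (MAP_ANONYMOUS | MAP_SHARED; spelled MAP_ANON on some
 * platforms) up to a made-up "configured maximum" and then forks; the
 * child inherits the same mapping, so no SysV shared memory / SHMMAX is
 * involved.
 */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    size_t max_shared = 64 * 1024 * 1024;   /* hypothetical configured maximum */

    char *buffers = mmap(NULL, max_shared, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buffers == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    if (fork() == 0) {
        buffers[0] = 42;        /* "backend" writes into the shared region */
        _exit(0);
    }

    wait(NULL);
    printf("parent sees %d\n", buffers[0]);   /* prints 42 */
    return 0;
}

Every backend forked after the mapping exists sees the same bytes, and nothing 
here counts against SHMMAX.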

I like the possibility of isolating shared buffers per database in this 
approach. I don't know how useful it would be in practice..

 Shridhar


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] bugzilla (Was: What do you want me to do?)

2003-11-11 Thread Shridhar Daithankar
On Tuesday 11 November 2003 19:19, Peter Eisentraut wrote:
 Andrew Dunstan writes:
  Seriously, I have wondered if it might be a good idea to assemble a
  small hit team that would take some high profile open source projects
  and make sure they worked with Postgres. Bugzilla would be the most
  obvious candidate, but there are certainly others. I suspect that could
  be quite productive, though.

 Good thought, but a hit team is not the right answer, because any project
 that would have been hit in this way will just go bad again the moment
 its database layer is changed.  What would work better are consultants:
 people that hang around on the other project's mailing lists, offer advise
 on database layer modelling and implementation, do clean up tasks, check
 regularly if everything works with the PG development branch, be there
 when the developers of that other project have a question.  I've been
 doing a bit of that, and my sensation is that most developers of
 database-backed applications are dying to have people like that at their
 disposal.

So forming a new group is quite beneficial?

I think so too..  I have been planning to do that for dbmail and egroupware 
but haven't got around to it..

 Shridhar


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] bugzilla (Was: What do you want me to do?)

2003-11-11 Thread Shridhar Daithankar
Peter Eisentraut wrote:
Shridhar Daithankar writes:


So forming a new group is quite beneficial?


No, we don't need one group, we need many individuals (or possibly small
groups) to get in contact with their second favorite projects and hang out
there.
I meant let's form a group within advocacy which would consult for other projects. 
Since we would expect people to encounter similar kinds of problems, a group 
within advocacy/general might be a good idea.

Of course we need not do anything special till we have sizable projects being 
consulted. Can we at least have a web page for projects that use/prefer 
postgresql? I think I put forward such an idea of cross-linking earlier as well.

I think so too..  I have been planning to do that for dbmail and egroupware
but haven't got around it..
When I said I've been doing a bit of that, I meant the developers of
eGroupWare call me once a week with questions.  So maybe you can take
over dbmail. :-)
Sure. Check this.

---
Hello,
My name is  and I work with the postgresql group. Recently we/the postgresql 
advocacy/general group have decided to form a group of people who would help 
other projects w.r.t postgresql.

So feel free to ask me questions. I can help you to put your questions to the 
appropriate forums/persons effectively.

Regards
---
Sounds good? Or too aggressive?

 Bye
  Shridhar
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-11 Thread Shridhar Daithankar
Greg Stark wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:


If the parent postmaster mmaps anonymous memory segments and shares them with 
children, postgresql wouldn't be dependent upon any kernel resourse aka 
shared memory anymore.


Anonymous memory mappings aren't shared, at least not unless you're talking
about creating posix threads. That's just not how you create shared mappings
using mmap.
There is a way to create shared mappings using mmap, but it's exactly what you
say you don't want to do -- you use file mappings.
Using mmap postgres could allocate as much shared memory as it needs whenever
it needs it. You create a file the size of the mapping you want, you mmap it
with MAP_SHARED, then you arrange to have any other backends that want access
to it to mmap it as well.
Yes. It occurred to me in the morning. For sure, a good night's sleep helps..
I'm not sure why you say you don't want to map files. If you're afraid it will
cause lots of i/o as the system tries to flush these writes, well, in theory
that's up to the kernel to avoid. On systems where the kernel does poorly at
this there are tools like MAP_LOCK/mlock/shmfs that might trick it into doing
a better job.
I didn't have any file in my first post because I saw it as unnecessary. However 
my guess is the IO caused by such a file would not be much. How much shared buffers 
would postgresql be using anyway? 100MB? 200MB?

On the bright side, the system will automatically sync the shared buffers 
periodically. It is like taking a snapshot of shared buffers. Could be good for
debugging.

If the IO caused by such a  shared memory image is really an issue for somebody,
they can just map the file on a ramdrive.
Actually I would say that would be a good default approach: use a mmapped file over 
a RAM drive as shared buffers. Just wondering if it can be done programmatically.
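
For the programmatic part, here is a sketch assuming a tmpfs-style mount such as 
/dev/shm is available; the path and size are placeholders, not anything postgresql 
actually uses:

/*
 * Sketch only, assuming a tmpfs-style mount such as /dev/shm exists; the
 * path and size are placeholders.  The shared-buffer image lives in a file
 * there and every interested process mmap()s it with MAP_SHARED.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/dev/shm/pg_buffers";    /* file on a RAM-backed fs */
    size_t size = 64 * 1024 * 1024;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, size) != 0) {
        perror("open/ftruncate");
        return 1;
    }

    void *buffers = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buffers == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* ... use the region; other processes mapping the same file see the
     * same bytes ... */

    munmap(buffers, size);
    close(fd);
    return 0;
}

Any process that opens and mmaps the same file shares the same image, and since the 
backing file sits on a RAM-backed filesystem the kernel has no real disk file to 
flush it to (though it can still swap under memory pressure).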

Actually I've been wondering how hard it would be to avoid this whole
double-buffering issue and having postgres mmap the buffers it wants from the
data files. That would avoid the double-buffering entirely including the extra
copy and memory use. But it would be a major change to a lot of core stuff.
And it be tricky to ensure WAL buffers are written before data blocks.
Yes. I understand mmap is not adequate for the WAL and other transaction syncing 
requirements.

 Bye
  Shridhar
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-10 Thread Shridhar Daithankar
On Tuesday 11 November 2003 00:50, Neil Conway wrote:
 Jan Wieck [EMAIL PROTECTED] writes:
  We can't resize shared memory because we allocate the whole thing in
  one big hump - which causes the shmmax problem BTW. If we allocate
  that in chunks of multiple blocks, we only have to give it a total
  maximum size to get the hash tables and other stuff right from the
  beginning. But the vast majority of memory, the buffers themself, can
  be made adjustable at runtime.

 Yeah, writing a palloc()-style wrapper over shm has been suggested
 before (by myself among others). You could do the shm allocation in
 fixed-size blocks (say, 1 MB each), and then do our own memory
 management to allocate and release smaller chunks of shm when
 requested. I'm not sure what it really buys us, though: sure, we can
 expand the shared buffer area to some degree, but

Thinking of it, it can be put as follows. Postgresql needs shared memory 
between all the backends. 

If the parent postmaster mmaps anonymous memory segments and shares them with 
children, postgresql wouldn't be dependent upon any kernel resource aka 
shared memory anymore.

Furthermore the parent postmaster can allocate different anonymous mappings for 
different databases. In addition to the postgresql buffer manager overhaul, this 
would make things a lot better.

Note that I am not suggesting mmap to maintain files on disk. So I guess that 
should be OK. 

I tried searching for mmap on hackers. The threads seem to be very old. One in 
1998. With so many proposals for rewriting core stuff, does this have any 
chance?

 Just a thought.

 Shridhar


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS] Duplicating transaction information in indexes and performing in memory vacuum

2003-10-27 Thread Shridhar Daithankar
Hi,

Last week, there was a thread on whether a solely in-memory vacuum can be performed 
or not. (OK, that was part of a thread, but anyway.)

I suggested that a page be vacuumed when it is pushed out of the buffer cache. Tom 
pointed out that it can not be done, as index tuples store the heap tuple id and 
depend upon the heap tuple to find out transaction information.

I asked whether it is feasible to add transaction information to the index tuple, and 
the answer was no.

I searched the hackers archive and the following is the only thread I could come up 
with in this context.

http://archives.postgresql.org/pgsql-hackers/2000-09/msg00513.php
http://archives.postgresql.org/pgsql-hackers/2001-09/msg00409.php
The thread does not consider vacuum at all.

What are (more) reasons for not adding transaction information to the index tuple, 
in addition to the heap tuple?

Cons are bloated indexes. The index tuple size will be close to 30 bytes minimum.
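
To show where the bytes go, here is a back-of-the-envelope sketch of what such a fat 
index tuple header could look like; the names and the exact layout are hypothetical:

/*
 * Hypothetical layout only -- not an existing postgresql structure.
 * Today's index tuple header is roughly a 6-byte heap TID plus 2 bytes of
 * flags/size; the sketch below adds the two visibility fields.
 */
#include <stdint.h>

typedef uint32_t TransactionId;            /* 4 bytes, as in postgresql */

typedef struct FatIndexTupleHeader
{
    uint8_t       heap_tid[6];             /* block number + line pointer of the heap tuple */
    uint16_t      info;                    /* flags + size, as today */
    TransactionId xmin;                    /* inserting transaction */
    TransactionId xmax;                    /* deleting transaction, if any */
} FatIndexTupleHeader;                     /* 16 bytes of fields before the key data */

With a small key, the item pointer and alignment padding added on top of those 16 
bytes of header, each entry lands in the neighbourhood of the 30-byte minimum 
mentioned above.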

On the pro side of this, no more vacuum required (at least for the part of the data 
that is being used; if data isn't used, it does not need vacuum anyway) and the space 
bloat is stopped right in memory, without incurring the overhead of additional IO that 
vacuum demands.

Given the recent trend of pushing PG higher and higher in scale (from performance 
list traffic, that is), I think this could be a worthwhile addition.

So what are the cons I missed so far?

 Bye
  Shridhar
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Duplicating transaction information in indexes and

2003-10-27 Thread Shridhar Daithankar
Tom Lane wrote:
Shridhar Daithankar [EMAIL PROTECTED] writes:

What are (more) reasons for not adding transaction information to
index tuple, in addition to heap tuple?


Cons are bloated indexes. The index tuple size will be close to 30
bytes minimum.


And extra time to perform an update or delete, and extra time for
readers of the index to process and perhaps update the extra copies
of the row's state.  And atomicity concerns, since you can't possibly
update the row and all its index entries simultaneously.  I'm not
certain that the latter issue is insoluble, but it surely is a big risk.
The additional information going into the index is available while updating the 
index, I assume. So the extra time required is the IO for pushing that page to disk.

As far as updating each index row is concerned, I was under the impression that all 
relevant indexes are updated when a row is updated. Isn't that right?

On pro* side of this, no more vacuum required (at least for part of
data that is being used. If data isn't used, it does not need vacuum
anyway) and space bloat is stopped right in memory, without incurring
overhead of additional IO vacuum demands.
OK, "no more vacuum required" is marketing speak for it. It is not strictly true.

I do not believe either of those claims.  For starters, if you don't
remove a row's index entries when the row itself is removed, won't that
make index bloat a lot worse?  When exactly *will* you remove the index
entries ... and won't that process look a lot like VACUUM?
If a heap row is removed and index rows are not removed, it would not make any 
difference because the index row would contain all the information to infer that 
it is dead and can be removed.

The dead index row would be removed when the index page is fetched into the buffer 
cache and then pushed out, just like a heap tuple. It would not need the heap 
tuple(s) to clean the index page.

The index bloat would not be any worse than it is currently because, with all the 
information available in the index itself, vacuum can clean the dead index entries 
as well.

And yes, it is essentially vacuum. But with some differences.

* It will operate on buffer pages only, not on entire database objects. That makes 
it a CPU-bound operation, cheaper compared to the IO incurred otherwise. If we assume 
CPU to be cheap enough, the additional processing would not affect regular operation 
that much.
* It will operate continuously, unlike vacuum which needs a trigger. That could 
lower overall throughput a little, but it would be a much more consistent 
throughput rather than the peaks and troughs shown by the triggered vacuum approach.
* It will not clean up entire database objects but only the pages in question. So 
some bloat might be left on disk, in indexes and in heaps. But whatever gets 
used will be cleaned up. Assuming caching works normally, it will keep the 
data set clean for frequent use.
* It is out of order in the sense that index and heap will not be cleaned in sync. The 
extra information in the index is there to make sure that this can happen.

This will not really eliminate vacuum but would rather drive down the significance 
of vacuum. Right now, a write/update-heavy database will die horribly if not 
vacuumed aggressively. Hopefully the situation will be much better with such an 
approach.

 Bye
  Shridhar
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] When the last vacuum occured? It's necessary to run

2003-10-27 Thread Shridhar Daithankar
Thiago Fernandes Moesch wrote:
  It would be great for maintainance if every object had a timestamp of
the last vaccum run on it. From time to time we're working with several
databases and I can't tell wich one needs a new vacuum.
  Another important information would be the rate of disposable data in
every table (like old and delete records) it would help was to
determine if a vacuum is required.
You could use autovacuum rather than adding timestamps to fields.

HTH

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Vacuum thoughts

2003-10-20 Thread Shridhar Daithankar
Tom Lane wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:

I was thinking about it. How about vacuuming a page when it is been
pushed out of postgresql buffer cache? It is is memory so not much IO
is involved.


You keep ignoring the problem of removing index entries.  To vacuum an
individual page, you need to be willing to read in (and update) all
index pages that reference the tuples-to-be-deleted.  This is hardly
tenable when the reason for pushing the page out of buffer cache was so
that you could read in something else instead --- you don't have spare
buffer slots, and you don't want to do all that I/O (and the associated
WAL log entries) before you can read in the page you originally wanted.
The latter point is really the crux of the problem.  The point of having
the VACUUM process is to keep maintenance work out of the critical path
of foreground queries.  Anything that moves even part of that
maintenance work into the critical path is going to be a net loss.
So the problem is that an index tuple does not store transaction id information like 
a heap tuple does, and one can not deduce that an index tuple is dead unless it 
points to a dead heap tuple.

Is that right?

If an index tuple had the transaction information duplicated along with the heap 
tuple, could the two types of tuples be removed independently of each other? Would 
the above scheme of vacuum-on-page-expiry work in that case?

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Vacuum thoughts

2003-10-19 Thread Shridhar Daithankar
Gaetano Mendola wrote:
The vacuum cost is the same of a full scan table ( select count(*) ) ?
Why not do a sort of vacuum if a scan table happen ( during a simple
select that invole a full scan table for example )?
I was thinking about it. How about vacuuming a page when it is being pushed out 
of the postgresql buffer cache? It is in memory so not much IO is involved.

Could it be an approach?

 Shridhar

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Tom Lane wrote:
I was just thinking of a GUC parameter: wait N milliseconds between
pages, where N defaults to zero probably.  A user who wants to run his
vacuum as a background process could set N larger than zero.  I don't
believe we are anywhere near being able to automatically adjust the
delay based on load, and even if we could, this would ignore the point
you make above --- the user's intent has to matter as much as anything
else.
I am slightly confused here. IIRC pg_autovacuum never did a vacuum full. At 
most it does vacuum/vacuum analyse, neither of which chews disk bandwidth. And if 
pg_autovacuum is running along with the postmaster all the time, with aggressive 
polling like 5 sec, the database should not accumulate any dead tuples, nor would it 
suffer xid wraparound, as there are vacuums happening constantly.

What's left in the above scenario? As long as all the requirements for pg_autovacuum 
are met, namely setting it up, setting it up aggressively and tuning 
postgresql.conf correctly, vacuum and related problems should be a thing of the 
past, at least as far as 7.4 and onwards is concerned.

Of course an RSM implementation for vacuum would still be much needed, but right 
now it does not affect disk IO directly (except for tossing the buffer cache off 
track, that is).

What am I missing?

 Shridhar

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Mapping Oracle types to PostgreSQL types

2003-10-17 Thread Shridhar Daithankar
Jean-Michel POURE wrote:

Dear friends,

I would like to port Compiere CRM from Oracle to PostgreSQL (and release it 
for free).

At first I would like to convert the data schema. This is not difficult as 
Compiere is written using portable types like NUMBER (i,d) which can be 
replaced by NUMERIC (i,d), etc... A series of Search/Replace is sufficiant. 
There are other solutions in Contrib to connect to Oracle and export the data 
(Bruce). Don't blame me to search in another (silly) direction...
Rather than declaring them numeric, create them as integer/float of appropriate size 
and add appropriate constraints. Numeric can be slower for large data loads compared 
to native integers.

The point here is that I would like to use the CREATE TYPE or CREATE DOMAIN 
syntax to map Oracle types to PostgreSQL types. Therefore I can say Guys, 
Oracle is now mostly compatible with PostreSQL.
You can create some sql scripts which can natively migrate from oracle to 
postgresql. Contrib or gborg could host them.

So what postgresql would say is: create a database and run the scripts, and many 
of the oracle migration gotchas will be automatically taken care of.

Including such features in core postgresql is a rather hard sell to postgresql 
developers, especially when there is a rather simple workaround.

 HTH

 Shridhar

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Andrew Sullivan wrote:

On Fri, Oct 17, 2003 at 07:04:45PM +0530, Shridhar Daithankar wrote:

I am slightly confused here. IIRC pg_autovacuum never did a vacuum full. At 
the most it does vacuum /vacuum analyse, none of which chew disk bandwidth. 


The latter is false.  VACUUM FULL certainly uses _more_ disk
bandwidth than VACUUM, but it's just false that plain VACUUM doesn't
contend for disk.  And if you're already maxed, then that extra
bandwidth you cannot afford.
What part of plain vacuum takes disk bandwidth? WAL? Clog? Certainly not the data 
files themselves, right?

OK, I understand some systems can be saturated enough to feel the additional WAL/Clog 
burden, but I am genuinely curious: how much disk bandwidth is required for plain 
vacuum and what are the factors it depends upon?

 Shridhar

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Alvaro Herrera wrote:

On Fri, Oct 17, 2003 at 07:04:45PM +0530, Shridhar Daithankar wrote:


And if pg_autovacuum is running along with postmaster all the time, with 
aggressive polling like 5 sec, the database should not accumulate any dead 
tuples nor it would suffer xid wraparound as there are vacuum happening 
constantly.


The database can suffer XID wraparound anyway if there's at least one
table without updates, because the autovacuum daemon will never vacuum
it (correct me if I'm wrong).
If a table is never updated and hence not vacuumed at all, why would it be 
involved in a transaction that would have xid wraparound?

pg_autovacuum takes care of inserts/updates/deletes. If a table never 
participates in the above three and hence escapes pg_autovacuum, it also escapes 
xid wraparound, doesn't it?

 Shridhar

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Alvaro Herrera wrote:

On Fri, Oct 17, 2003 at 07:41:38PM +0530, Shridhar Daithankar wrote:

Alvaro Herrera wrote:


On Fri, Oct 17, 2003 at 07:04:45PM +0530, Shridhar Daithankar wrote:


The database can suffer XID wraparound anyway if there's at least one
table without updates, because the autovacuum daemon will never vacuum
it (correct me if I'm wrong).
If a table is never updated and hence not vacuumed at all, why would it be 
involved in a transaction that would have xid wrap around?


Because the tuples on it were involved in some insert operation at some
time (else the table would not have any tuples).  So it _has_ to be
vacuumed, else you run the risk of losing the tuples when the wraparound
happens.  (Sorry, I don't know how to explain this better.)
OK. So here is what I understand. I have a table which contains 100 rows which 
appeared there due to some insert operation. Then I vacuum it. And it sits there for 
eternity while the rest of the database approaches the singularity (the xid 
wraparound..:-) Nice term, isn't it?).

So this static table is vulnerable to xid wraparound? I doubt it.

Did I miss something?

 Shridhar

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Tom Lane wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:

What part of plain vacuum takes disk bandwidth?


Reading (and possibly rewriting) all the pages.
I was under the impression that was for shared memory pages only and not for disk pages.

OK.  I can see difference of understanding here.

Plain vacuum goes around the table/database and makes space, in shared buffers and 
on disk, reusable whenever possible but *does not* free any space.

Would it be possible to have a vacuum variant that would just shuffle through the 
shared buffers and not touch the disk at all?  pg_autovacuum could probably be ultra 
aggressive with such a shared-buffers-only scan. Is it possible or feasible?

IMO that could be a clever solution rather than throttling IO for vacuum. For 
one thing, getting that throttling right would be extremely difficult and would vary 
from site to site. If it is going to be tough to tune, then it will be 
underutilised and will lose its value rather rapidly.

 Just a thought..

 Shridhar







---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Matthew T. O'Connor wrote:

On Fri, 2003-10-17 at 10:25, Shridhar Daithankar wrote:

OK. So here is what I understand. I have a table which contains 100 rows which 
appeated there due to some insert operation. Then I vacuum it. And sit there for 
internity for rest of the database to approach the singularity(the xid 
wraparound..:-) Nice term, isn't it?).

So this static table is vulnerable to xid wraparound? I doubt.


No that table would probably be ok, because you did a vacuum on it after
the inserts.  The problem is that pg_autovacuum may choose not to do a
vacuum if you didn't cross a threshold, or someone outside of
pg_autovacuum may have done the vacuum and autovac doesn't know about
it, so it can't guarantee that all tables in the database are safe from
xid wraparound.  

One additional thing, some of this might be possible if pg_autovacuum
saved its data between restarts.  Right now it restarts with no memory
of what happened before.  
Well, the unmaintained gborg version adopted the approach of storing such info in a 
table, so that it survives a postgresql or pg_autovacuum restart, or both.

That was considered tablespace pollution back then. But personally I think it 
should be ok. If it ever goes into the catalogues, I would rather add a few columns to 
pg_class for such stats. But again, that's not my call to make.

 Shridhar

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Tom Lane wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:

Would it be possible to have a vacuum variant that would just shuffle thr. 
shared buffers and not touch disk at all?


What would be the use of that?  You couldn't predict *anything* about
the coverage.  Maybe you find all the free space in a particular table,
but most likely you don't.
In any case an I/O-free vacuum is impossible since once you have decided
to recycle a particular tuple, you don't have any option about removing
the corresponding index entries first.  So unless both the table and all
its indexes are in RAM, you will be incurring I/O.
I am just suggesting it as a variant and not a replacement for the existing vacuum 
options. Knowing that it does not do any IO, it could be triggered a lot more 
aggressively. Furthermore, if we assume pg_autovacuum is an integral part of 
database operation, right from before a single database object is created, I 
think it could cover many/most database usage patterns barring multiple indexes, 
for which the normal vacuum variants could be used.

Furthermore, when a tuple is updated, all the relevant indexes are updated, 
right? So if such a vacuum is aggressive enough, it could catch the index 
entries as well, in RAM.

Think of it like catching hens. It is easier to do in a cage than across a farm. 
So catch as many of them as you can in the cage. If they escape or spill out of the 
cage due to over-population, you have to tread the farm anyway...

 Just a thought.

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum

2003-10-17 Thread Shridhar Daithankar
Rod Taylor wrote:

On Fri, 2003-10-17 at 10:22, Tom Lane wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:

What part of plain vacuum takes disk bandwidth?
Reading (and possibly rewriting) all the pages.


Would it be possible for the backend to keep a list of the first N (N
being a large number but not significant in memory usage) pages it has
deleted tuples out of and a second list of N pages it has inserted
tuples into.
That is the RSM, the reclaimable space map. It is on the TODO.

After the transaction has completed and there is an idle period (say 1/4
second between transaction) it can pass the insert information on a
rollback and delete information on a commit to a separate backend.
This 'vacuum' backend could then prioritize garbage collection for the
pages it knows have been changed performing a single page vacuum when a
specific page has seen a high level of reported activity.
If this daemon could also get a hold of information about idleness of IO
in general the decision about what to vacuum and when may be better
(heavily hit pages during peak periods, all reports pages on medium
load). When completely idle, run through the entire system to get back
as much as possible.
I agree. This seems to be the best way of dealing with things. Of course, 
there are probably details we are missing here, but in general it's good.

 Shridhar

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] BigInt woes

2003-10-08 Thread Shridhar Daithankar
Joshua D. Drake wrote:

Hello,

 I believe that the Int8/BigInt items are known issues but I have a knew 
programmer that ran into it
over the weekend (he didn't call me when he encountered the problem, 
when he should of) and we have a
customer that burned some significant time on it as well. Will this be 
fixed in 7.4?
Well, this is not an issue actually, but the fact that postgresql is very strict 
about its data types. You need to cast explicitly even for those types which 
'seem' compatible, such as int4 and int2.

This query:

explain select bid_id, bid_time from bid where bid_id = 1

Will always sequential scan.

This query:

explain select bid_id, bid_time from bid where bid_id = '1'
Try explain select bid_id, bid_time from bid where bid_id = 1::bigint

create function bid_check(bigint) returns bool as '
declare
 in_bid_id alias for $1;
begin
 if (select count(*) from bid where bid_id = in_bid_id) = 1 then
Again try typecasting.

if (select count(*) from bid where bid_id::bigint = in_bid_id::bigint) = 1 then

I doubt in_bid_id needs to be cast that explicitly but I am sure it will be 
safe..:-)

HTH

 Shridhar



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: ADD FOREIGN KEY (was Re: [HACKERS] [GENERAL] 7.4Beta)

2003-09-29 Thread Shridhar Daithankar
On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote:
 So a db designer made a bloody mistake.
 The problem is there's no easy way to find out what's missing.
 I'd really like EXPLAIN to display all subsequent triggered queries
 also, to see the full scans caused by missing indexes.
 
  It could probably be doable for EXPLAIN ANALYZE (by actually tracing
  execution), but then you will see really _all_ queries, i.e. for a 1000
  row update you would see 1 UPDATE query and 1000 fk checks ...
 
  OTOH, you probably can get that already from logs with right logging
  parameters.

 Actually - it shouldn't be too hard to write a query that returns all
 unindexed foreign keys, surely?

Correct me if I am wrong, but I remember postgresql throwing an error that the foreign 
key field was not unique in the foreign table. Obviously it can not detect that 
without an index. Either a primary key or a unique constraint would need an 
index.

What am I missing here?


IOW, how exactly do I create foreign keys without an index?

 Shridhar


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] PL contribution guidelines?

2003-09-28 Thread Shridhar Daithankar
On Sunday 28 September 2003 11:53, mlg7 wrote:
 On Saturday 27 September 2003 19:46, Peter Eisentraut wrote:
  mlg7 writes:
   Is there a centralized list of pgsql PL's ?
 
  I'm not aware of one.
 
 http://techdocs.postgresql.org/guides/PLLanguages
 
 Josh posted it on advocacy few days back.
 
  Shridhar

 That does not work:
 : Proxy Error
 : The proxy server received an invalid response from an upstream server.
 :
 : The proxy server could not handle the request GET /guides/PLLanguages.
 :
 : Reason: Could not connect to remote machine: Operation timed out

Try google cache

http://www.google.co.in/search?q=cache:iDjd8nA-l2IJ:techdocs.postgresql.org/guides/PLLanguages+list+of+postgresql+procedural+languageshl=mrie=UTF-8

The google search term I used was "list of postgresql procedural languages". 
Look for the first techdocs link.

Yesterday, when I got the same error, I thought there was a problem with the 
international link again...:-) Hard to tell where exactly the problem is..

 Shridhar


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] 2-phase commit

2003-09-27 Thread Shridhar Daithankar
On Saturday 27 September 2003 20:17, Bruce Momjian wrote:
 Richard Huxton wrote:
 I have been thinking it might be time to start allowing external
 programs to be called when certain events occur that require
 administrative attention --- this would be a good case for that.
 Administrators could configure shell scripts to be run when the network
 connection fails or servers drop off the network, alerting them to the
 problem.  Throwing things into the server logs isn't _active_ enough.

I would say calling events from external libraries would be a good extension. 
That could allow for extending postgresql in novel ways, e.g. calling a 
log-record copy event after a WAL record is written, for near-real-time 
replication..:-)

 Shridhar


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] PL contribution guidelines?

2003-09-27 Thread Shridhar Daithankar
On Saturday 27 September 2003 19:46, Peter Eisentraut wrote:
 mlg7 writes:
  Is there a centralized list of pgsql PL's ?

 I'm not aware of one.

http://techdocs.postgresql.org/guides/PLLanguages

Josh posted it on advocacy few days back.

 Shridhar


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [ADMIN] postgres 6.2 vacuum

2003-09-26 Thread Shridhar Daithankar
Hornyak Laszlo wrote:

I think it is not that simple. How should I explain the company leaders
why I must stop the system. It may risk their bussiness success too. I can
tell them that the new db is more stable, but until the old one does the
job, it is still acceptable for them (it served the system for 5-6 years
or so). Once it crashes, it is a good reason to do the move.
Well, I am sure there are data corruption bugs fixed between 6.2 and current CVS 
head which would count as large impact in terms of numbers and severity.

If your client's business depends upon it, that is in fact a better reason to 
upgrade. If the postgresql developers tell you to upgrade, that does count as a 
recommendation.

It's not like an oracle upgrade where you have to move the OS and hardware and spend a 
large amount of money. The impact of migration is restricted to the downtime of the 
servers and cleaning up any applications that depend upon any incorrect 
behaviour supported in the past.

IMO you should move in all scenarios.

 Shridhar

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] [pgsql-www] NuSphere and PostgreSQL for windows

2003-09-26 Thread Shridhar Daithankar
Greg Stark wrote:

Shridhar Daithankar [EMAIL PROTECTED] writes:


The little pthreads programming I did on linux/freeBSD tells me that it
supports majority of features except TLS(linux2.4/linuxthreads) and per thread
signals. 


LinuxThreads is dead. NPTL is the way and the light.

It has ELF TLS which for super-fast thread-local storage, proper per-thread
signal semantics, as well as not using USR1,USR2 in ways that mess up
signal-handling. It also has kernel support for user-space mutexes (strange as
that may sound).
Well.. Linuxthreads or NPTL, they are going to conform to the pthreads standard, 
right? In fact NPTL is supposed to be more conformant than linuxthreads.

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match

