Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-05 Thread Hannu Krosing
On Sun, 2006-06-04 at 18:09, Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  Hannu Krosing [EMAIL PROTECTED] writes:
  On Sat, 2006-06-03 at 10:43, Jim Nasby wrote:
  Might also be worth adding analyze delay settings, ala  
  vacuum_cost_delay.
 
 ANALYZE already respects the vacuum delay settings.
 
  Actually we should have delay settings for all potential
  (almost-)full-scan service ops, - VACUUM, ANALYSE, CREATE INDEX, ADD
  CONSTRAINT, maybe more - so that there would be better chances of
  running those on busy databases without disastrous effects.
 
  What about UPDATE and DELETE and for that matter SELECT?
 
 This seems pretty silly.  The point of the delay stuff is to prevent
 background maintenance operations from eating an unreasonable share
 of resources compared to foreground queries.  I don't see why you'd
 put delays into queries --- if your machine is loaded, it's loaded.
 
 I think the existing features are sufficient in this line and that
 doing more is just adding complexity for complexity's sake.

Making CREATE INDEX respect delay settings will be reasonable once we
get it to run without locking the table. 

And if non-locking is doable for ADD/ALTER CONSTRAINT, then it makes
sense there too.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com





Re: [HACKERS] XLOG_BLCKSZ vs. wal_buffers table

2006-06-05 Thread Simon Riggs
On Thu, 2006-06-01 at 16:46 -0700, Mark Wong wrote:
 Simon Riggs wrote:
  On Wed, 2006-05-10 at 09:55 -0700, Mark Wong wrote:
  Simon Riggs wrote:
  Could you turn full_page_writes = off and do a few more tests? I think
  the full page writes is swamping the xlog and masking the performance we
  might see for normal small xlog writes.
  I'd try XLOG_BLCKSZ = 4096 and 8192 to start with. Thanks.
  Ok, got data for XLOG_BLCKSZ at 4096, 8192, and 32768 with
  full_page_writes = off.  The new data is at the bottom of the page:
 http://developer.osdl.org/markw/pgsql/xlog_blcksz.html
  
  What do you think is causing the step changes at 30+ and 60+ minutes on
  these tests?
 
 I took some time to start charting the sar data and to break down the 
 iostat data by tablespaces.  I've updated the web pages from the link 
 above.  Well, none of the charts helped me make any better guesses but 
 perhaps someone else will see something.
 
 When I get back from my short break, I'm planning on taking a look at 
 Tau, a project from the University of Oregon 
 (http://www.cs.uoregon.edu/research/tau/home.php) that's capable of 
 breaking down profiles per process and gathering hardware counters.  I'm 
 hoping that will shed some light on the behavior.

Sounds useful. Let me know what you find.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] bgwriter statistics

2006-06-05 Thread ITAGAKI Takahiro
On 2006-06-02 21:26, Jim Nasby wrote:

 Now that we've got a nice amount of tuneability in the bgwriter, it
 would be nice if we had as much insight into how it's actually doing.
 I'd like to propose that the following info be added to the stats
 framework to assist in tuning it:

I'm interested in your idea: you want to know what the bgwriter does.
I think there is another perspective as well: what the bgwriter *should* do.
I imagine that information about whether pages are dirty or not would be
useful for that purpose.

- dirty_pages:
The number of pages with BM_DIRTY in the buffer pool.
- replaced_dirty:
The total number of replaced pages that had BM_DIRTY set;
backends have to write such pages themselves.
- replaced_clean:
Same as above, but without BM_DIRTY.
Backends can replace them freely.

Bgwriter should boost ALL activity if dirty_pages is high,
and boost LRU activity if replaced_dirty is high.
Ideally, the parameters of the bgwriter could then be tuned almost automatically:

- LRU scans = replaced_dirty + replaced_clean
- LRU writes = replaced_dirty
- ALL scans/writes = the value that can keep dirty_pages low


However, tracking the number of dirty pages is not free. I suppose
the implementation needs to be carefully designed to avoid lock contention.
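
As a rough sketch of one way to keep that cheap (standalone code with made-up
names, not actual PostgreSQL internals): each backend could keep local counters
and fold them into a shared structure only occasionally, so the
buffer-replacement path never takes an extra lock per page.

#include <stdio.h>

typedef struct BgWriterCounters
{
	long		dirty_pages;	/* pages currently marked BM_DIRTY */
	long		replaced_dirty;	/* dirty pages a backend had to write itself */
	long		replaced_clean;	/* clean pages replaced without a write */
} BgWriterCounters;

static BgWriterCounters local;	/* per-backend, touched without any lock */
static BgWriterCounters shared;	/* stand-in for a struct in shared memory */

/* called from the buffer-replacement path; no lock taken here */
static void
count_replacement(int was_dirty)
{
	if (was_dirty)
		local.replaced_dirty++;
	else
		local.replaced_clean++;
}

/* called rarely (e.g. at transaction end), so the shared lock stays uncontended */
static void
flush_local_counters(void)
{
	/* LWLockAcquire() on the stats lock would go here in a real implementation */
	shared.replaced_dirty += local.replaced_dirty;
	shared.replaced_clean += local.replaced_clean;
	local.replaced_dirty = local.replaced_clean = 0;
	/* LWLockRelease() */
}

int
main(void)
{
	count_replacement(1);
	count_replacement(0);
	flush_local_counters();
	printf("replaced_dirty=%ld replaced_clean=%ld\n",
		   shared.replaced_dirty, shared.replaced_clean);
	return 0;
}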

Comments are welcome.

---
ITAGAKI Takahiro
NTT OSS Center




Re: [HACKERS] Faster Updates

2006-06-05 Thread Jim Nasby

On Jun 3, 2006, at 2:05 PM, Nicolai Petri wrote:


On Saturday 03 June 2006 17:27, Tom Lane wrote:

PFC [EMAIL PROTECTED] writes:

   [snip - complicated update logic proposal]
What do you think ?


Sounds enormously complicated and of very doubtful net win --- you're

[snip - ... bad idea reasoning] :)


What if every backend, while processing a transaction, collected a list of
touched records - probably with a max number of entries (GUC) collected per
transaction. Then when the transaction completes, the list of tuples is sent
to pg_autovacuum or possibly a new process that selectively goes after only
those tuples. Of course it should have some kind of logic connected so we
don't visit the tuples for vacuum unless we are quite sure no running
transactions would block adding the blocks to the FSM. We might be able to
queue up the blocks until a later time (GUC queue-max-time +
queue-size-limit) if we cannot determine that it would be safe to FSM the
blocks at the current time.

I guess this has probably been suggested before and there is probably a reason


Yup. Search the archives for 'dead space map'.
--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] More thoughts about planner's cost estimates

2006-06-05 Thread Jim Nasby

On Jun 4, 2006, at 5:09 PM, Tom Lane wrote:


Greg Stark [EMAIL PROTECTED] writes:

Hannu Krosing [EMAIL PROTECTED] writes:

On Sat, 2006-06-03 at 10:43, Jim Nasby wrote:

Might also be worth adding analyze delay settings, ala
vacuum_cost_delay.


ANALYZE already respects the vacuum delay settings.


Actually we should have delay settings for all potential
(almost-)full-scan service ops, - VACUUM, ANALYSE, CREATE INDEX, ADD
CONSTRAINT, maybe more - so that there would be better chances of
running those on busy databases without disastrous effects.



What about UPDATE and DELETE and for that matter SELECT?


This seems pretty silly.  The point of the delay stuff is to prevent
background maintenance operations from eating an unreasonable share
of resources compared to foreground queries.  I don't see why you'd
put delays into queries --- if your machine is loaded, it's loaded.


'Maintenance operations' often also means running large updates. Being
able to run those at a reduced priority would certainly be helpful in
many cases. A better way to accomplish this would be to have the OS
handle prioritized IO scheduling, but since pretty much none of them
seem to do that...

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461






Re: [HACKERS] Faster Updates

2006-06-05 Thread Jim Nasby

On Jun 3, 2006, at 10:27 AM, Tom Lane wrote:


PFC [EMAIL PROTECTED] writes:

What do you think ?


Sounds enormously complicated and of very doubtful net win --- you're
moving a lot of overhead into SELECT in order to make UPDATE cheaper,
and on top of that the restriction to same-page will limit the
usefulness quite a lot (unless we deliberately keep pages less than
full, which costs a lot in distributed extra I/O).


A lot of CPU overhead, which in many cases won't really matter.

If someone has interest in testing this to see what impact it has,  
how hard would it be to hack together enough code to test the base  
concept? I'm thinking only basic SELECT and UPDATE support, along  
with a means to leave a certain percentage of each page empty.




Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-05 Thread Jim Nasby

On Jun 4, 2006, at 8:18 AM, Andrew Dunstan wrote:

I said:



Another option would be to re-run cvs up one more time if we get any
unexpected files. It sounds like that would fix this issue on windows
machines, while still ensuring we had a clean repo to work from.


please see the new release of the buildfarm client, in which I have
followed Tom's suggestion of removing the -P flag from the checkout and
update commands - that should solve the Windows problem, as it will no
longer try to remove the directory. I hope that solves the problem -
if not I'll have a look at other solutions.


Unfortunately, this fell over first time out:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=loris&dt=2006-06-04%2012:09:33
The fix handled directories, but we got a false positive from a rename not
being immediate either, it seems. Bloody Windows!

One thought I had was to force Windows to use CVS export rather than update.
This has 2 disadvantages: it requires a complete repo fetch every run, even
if we don't need to do anything because nothing has changed, and it also
means we can't report the version numbers on files changed. Example:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=loris&dt=2006-06-04%2012:21:43
So what I'm going to try instead is a variation on Jim's suggestion above,
but instead of re-running cvs update, what we'll do is a longish sleep (say
10 or 20 secs) which should be enough time for Windows to get its act
together, and then run cvs status, which will also show us extraneous files.


What about my suggestion of running CVS a second time if we get
extraneous files the first go-round? I'm guessing there'd have to be
a sleep in there as well...

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] 'CVS-Unknown' buildfarm failures?

2006-06-05 Thread Andrew Dunstan

Jim Nasby wrote:


What about my suggestion of running CVS a second time if we get
extraneous files the first go-round? I'm guessing there'd have to be a
sleep in there as well...



The trouble with running cvs update a second time is that it will be 
just as liable to fail as the first run. So I am following your 
suggestion, but with this modification: after a sleep we will run cvs 
status which will not have the same issues, because it doesn't create 
or delete anything, and will show us any extraneous files/directories 
that might be present.


cheers

andrew






Re: [HACKERS] Connection Broken with Custom Dicts for TSearch2

2006-06-05 Thread Teodor Sigaev
Sorry, it isn't mentioned on the page, but this example code works only with
versions before 8.1. In 8.1 the dictionary interface was changed.


More precisely, in 8.1 the lexize function (in num2english, dlexize_num2english())
should return a pointer to a TSLexeme array instead of a char**.
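
For reference, here is a minimal sketch of what the 8.1-style interface looks
like (hypothetical dictionary name "mydict"; it assumes the same headers and
the same TSLexeme usage as the dict_tmpl.c attached later in this thread):

PG_FUNCTION_INFO_V1(dlexize_mydict);
Datum		dlexize_mydict(PG_FUNCTION_ARGS);

Datum
dlexize_mydict(PG_FUNCTION_ARGS)
{
	char	   *in = (char *) PG_GETARG_POINTER(1);
	char	   *txt = pnstrdup(in, PG_GETARG_INT32(2));
	TSLexeme   *res = palloc0(sizeof(TSLexeme) * 2);	/* one lexeme + terminator */

	res[0].lexeme = txt;		/* pass the input through unchanged */
	res[1].lexeme = NULL;		/* a NULL lexeme ends the array */

	PG_RETURN_POINTER(res);
}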


Rodrigo Hjort wrote:

Sorry, but I thought that it was the most appropriate list for the issue.

I was following these instructions:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html


And what happens is that the function works just once. Perhaps a 
malloc/free issue?



$ psql fuzzy

fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog 100');

to_tsvector

 'dog':9 'fox':4 'jump':5 'lazi':8 'brown':3 'quick':2 'hundred':10
(1 row)

fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog 100');

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
! \q


Regards,

Rodrigo Hjort
http://icewall.org/~hjort



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much memory

2006-06-05 Thread Jim C. Nasby
Moving to -hackers

On Mon, Jun 05, 2006 at 12:32:38AM +0200, Zoltan Boszormenyi wrote:
 I just noticed that psql's unformatted output uses too much
 memory. Is it normal? It seems that psql draws all records
 of a query off the server before it displays or writes the output.
 I would expect this only with formatted output.
 
 Problem is, I have an export that produces 500'000+ records
 which changes frequently. Several (20+) sites run this query
 nightly with different parameters and download it. The SELECTs
 that run in psql -A -t -c '...' may overlap and the query that runs
 in less than 1.5 minutes if it's the only one at the time may take
 3+ hours if ten such queries overlap. The time is mostly spent
 in swapping, all psql processes take up 300+ MB, so the 1GB
 server is brought to its knees quickly, peak swap usage is 1.8 GB.
 I watched the progress in top and the postmaster processes finished
 their work in about half an hour (that would still be acceptable)
 then the psql processes started eating up memory as they read
 the records.
 
 PostgreSQL 8.1.4 was used on RHEL3.
 
 Is there a way to convince psql to use less memory in unformatted
 mode? I know COPY will be able to use arbitrary SELECTs
 but until then I am still stuck with redirecting psql's output.
 
 The answer is to use SELECT INTO TEMP and then COPY.
 Psql will use much less memory that way. But still...

I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
largetable' > /dev/null results in psql consuming vast quantities of
memory. Why is this? ISTM this is a bug...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much memory

2006-06-05 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
 largetable' > /dev/null results in psql consuming vast quantities of
 memory. Why is this?

Is it different without the -A?

I'm reading this as just another uninformed complaint about libpq's
habit of buffering the whole query result.  It's possible that there's
a memory leak in the -A path specifically, but nothing said so far
provided any evidence for that.

regards, tom lane



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much memory

2006-06-05 Thread Jim C. Nasby
On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
  largetable' > /dev/null results in psql consuming vast quantities of
  memory. Why is this?
 
 Is it different without the -A?

Nope.

 I'm reading this as just another uninformed complaint about libpq's
 habit of buffering the whole query result.  It's possible that there's
 a memory leak in the -A path specifically, but nothing said so far
 provided any evidence for that.

Certainly seems like it. It seems like it would be good to allow for
libpq not to buffer, since there's cases where it's not needed...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Connection Broken with Custom Dicts for TSearch2

2006-06-05 Thread Teodor Sigaev



Teodor Sigaev wrote:
Sorry, it isn't mentioned on the page, but this example code works only
with versions before 8.1. In 8.1 the dictionary interface was changed.


Try the attached dict_tmpl.c.

2Oleg: place file on site, pls

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/
/*
 * num2english dictionary by Ben Chobot [EMAIL PROTECTED], based on
 * the example dictionary by Teodor Sigaev [EMAIL PROTECTED]
 *
 */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#include "postgres.h"

#include "dict.h"
#include "common.h"

#include "subinclude.h"

/* special names for values */
struct nx {
	char name[20];
	int value;
};

static struct nx num2english_numarr[] =
{
{ "zero", 0 },
{ "one", 1 },
{ "two", 2 },
{ "three", 3 },
{ "four", 4 },
{ "five", 5 },
{ "six", 6 },
{ "seven", 7 },
{ "eight", 8 },
{ "nine", 9 },
{ "ten", 10 },
{ "eleven", 11 },
{ "twelve", 12 },
{ "thirteen", 13 },
{ "fourteen", 14 },
{ "fifteen", 15 },
{ "sixteen", 16 },
{ "seventeen", 17 },
{ "eighteen", 18 },
{ "nineteen", 19 },
{ "twenty", 20 },
{ "thirty", 30 },
{ "forty", 40 },
{ "fifty", 50 },
{ "sixty", 60 },
{ "seventy", 70 },
{ "eighty", 80 },
{ "ninety", 90 },
{ "", 999 }
};

static char *num2english_denom[]=
{
"",
"thousand",
"million",
"billion",
"trillion",
"quadrillion",
"quintillion",
"sextillion",
"septillion",
"octillion",
"nonillion",
"decillion",
"undecillion",
"duodecillion",
"tredecillion",
"quattuordecillion",
"sexdecillion",
"septendecillion",
"octodecillion",
"novemdecillion",
"vigintillion"
};


static char *cvt2(int);
static char *cvt3(int);
static char *itowords(long long);


 PG_FUNCTION_INFO_V1(dinit_num2english);
 Datum dinit_num2english(PG_FUNCTION_ARGS);

 Datum 
 dinit_num2english(PG_FUNCTION_ARGS) {
	/* nothing to init */
	 
 	PG_RETURN_POINTER(NULL);
 }

PG_FUNCTION_INFO_V1(dlexize_num2english);
Datum dlexize_num2english(PG_FUNCTION_ARGS);
Datum
dlexize_num2english(PG_FUNCTION_ARGS) {
 	void* dummy = PG_GETARG_POINTER(0);
	char   *in = (char*)PG_GETARG_POINTER(1);
	char *txt = pnstrdup(in, PG_GETARG_INT32(2));
	TSLexeme	*res=0;

	char	*phrase;
	char	*cursor;
	char	*last;
	int	lexes = 1;
	int	thisLex = 0;

 	if ( *txt=='\0' ) {
		res = palloc0(sizeof(TSLexeme));
 		pfree(txt);
 	} 
	else
	{
		phrase = itowords(atoll(txt));
		if ((cursor = strchr(txt,'.')) && *(cursor+1))
		{
			char	*phrase2;
			char	*ptemp = phrase;

			phrase2 = itowords(atoll(cursor+1));
			phrase = palloc(strlen(phrase2) + strlen(ptemp) + strlen(" . ") + 1);
			sprintf(phrase, "%s . %s", ptemp, phrase2);
			pfree(ptemp);
			pfree(phrase2);
		}
		pfree(txt);

		for(cursor=phrase; *cursor; cursor++) if(*cursor == ' ') lexes++;

		res = palloc0(sizeof(TSLexeme)*(lexes +1));
		for(last=cursor=phrase; *cursor; cursor++)
		{
			if(*cursor == ' ')
			{
res[thisLex].lexeme = palloc((cursor-last+1));
memcpy(res[thisLex].lexeme,last,(cursor-last));
res[thisLex++].lexeme[cursor-last] = '\0';
/* done with this lex. */
if(*(cursor+1) == ' ') // if the next space is *also* whitespace
{
	/* We don't want it.
	   Fortunately we know we'll never get more than 2 spaces in a row. */
	cursor++;
}
last=cursor+1;
			}
		}

		/* finish up this last lex */
		res[thisLex].lexeme = palloc((cursor-last+1));
		memcpy(res[thisLex].lexeme,last,(cursor-last));
		res[thisLex++].lexeme[cursor-last] = 0;

		pfree(phrase);
		res[thisLex].lexeme = NULL;
	}

	PG_RETURN_POINTER(res);
}

/* The code below was taken from http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3556,00.html 
 and modified slightly to fit in the postgres stored proc framework. It appears to be without copyright. */

/* take a two-digit number and cvt to words. */
static char *cvt2(int val)
{
	int i=0;
	char word[80];
	char *ret = 0;

	while(num2english_numarr[++i].value <= val)
		/* nothing */;
	strcpy(word,num2english_numarr[i-1].name);
	val -= num2english_numarr[i-1].value;
	if (val > 0)
	{
		strcat(word, " ");
		strcat(word,num2english_numarr[val].name);
	}
	
	ret = palloc(strlen(word)+1);
	memcpy(ret,word,strlen(word)+1);
	return (ret);
}



/* take a 3 digit number and cvt it to words */
static char *cvt3(int val)
{
	int rem, mod;
	char word[80];
	char *ret = 0;

	word[0] = '\0';
	mod = val % 100;
	rem = val / 100;

	if ( rem > 0 )
	{
		strcat(word,num2english_numarr[rem].name);
		strcat(word, " hundred");
		if (mod > 0)
			strcat(word, " ");
	}
	if ( mod > 0 )
	{
		char *sub = cvt2(mod);
		strcat(word, sub);
		pfree(sub);
	}
	
	ret = palloc(strlen(word)+1);
	memcpy(ret,word,strlen(word)+1);
	return(ret);
}

/* here's the routine that does the rest */
static char *itowords(long long val)
{
	long long tri;	/* last three digits */
	long long place = 0;	/* which power of 10 we are on */
	int neg=0;	/* sign holder */
	char temp[255];	/* temporary string space */

	char word[255];
	char phrase[100];
	char *ret = 0;

	word[0] = '\0';

	/* check for negative int */
	if (val < 0 )
	{
		neg = 1;
		val = -val;
	}

	if ( val == 0 )
	{
		ret = palloc(5);

Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much memory

2006-06-05 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
 I'm reading this as just another uninformed complaint about libpq's
 habit of buffering the whole query result.  It's possible that there's
 a memory leak in the -A path specifically, but nothing said so far
 provided any evidence for that.

 Certainly seems like it. It seems like it would be good to allow for
 libpq not to buffer, since there's cases where it's not needed...

See past discussions.  The problem is that libpq's API says that when it
hands you back the completed query result, the command is complete and
guaranteed not to fail later.  A streaming interface could not make that
guarantee, so it's not a transparent substitution.

I wouldn't have any strong objection to providing a separate API that
operates in a streaming fashion, but defining it is something no one's
bothered to do yet.  In practice, if you have to code to a variant API,
it's not that much more trouble to use a cursor...
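
For illustration, a minimal libpq sketch of the cursor approach (the connection
string, cursor and table names are just placeholders), keeping only one batch
of rows in client memory at a time:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=test");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return 1;
	}

	/* cursors only exist inside a transaction block */
	res = PQexec(conn, "BEGIN; DECLARE c NO SCROLL CURSOR FOR SELECT * FROM largetable");
	PQclear(res);

	for (;;)
	{
		int			ntuples, i;

		res = PQexec(conn, "FETCH 1000 FROM c");
		if (PQresultStatus(res) != PGRES_TUPLES_OK)
		{
			fprintf(stderr, "FETCH failed: %s", PQerrorMessage(conn));
			PQclear(res);
			break;
		}
		ntuples = PQntuples(res);
		for (i = 0; i < ntuples; i++)
			printf("%s\n", PQgetvalue(res, i, 0));	/* first column only */
		PQclear(res);
		if (ntuples == 0)		/* no more rows */
			break;
	}

	res = PQexec(conn, "CLOSE c; COMMIT");
	PQclear(res);
	PQfinish(conn);
	return 0;
}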

regards, tom lane



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Mark Woodward
 Jim C. Nasby [EMAIL PROTECTED] writes:
 On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
 I'm reading this as just another uninformed complaint about libpq's
 habit of buffering the whole query result.  It's possible that there's
 a memory leak in the -A path specifically, but nothing said so far
 provided any evidence for that.

 Certainly seems like it. It seems like it would be good to allow for
 libpq not to buffer, since there's cases where it's not needed...

 See past discussions.  The problem is that libpq's API says that when it
 hands you back the completed query result, the command is complete and
 guaranteed not to fail later.  A streaming interface could not make that
 guarantee, so it's not a transparent substitution.

 I wouldn't have any strong objection to providing a separate API that
 operates in a streaming fashion, but defining it is something no one's
 bothered to do yet.  In practice, if you have to code to a variant API,
 it's not that much more trouble to use a cursor...


Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
this for free?



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Andrew Dunstan

Mark Woodward wrote:

Jim C. Nasby [EMAIL PROTECTED] writes:


On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
  

I'm reading this as just another uninformed complaint about libpq's
habit of buffering the whole query result.  It's possible that there's
a memory leak in the -A path specifically, but nothing said so far
provided any evidence for that.


Certainly seems like it. It seems like it would be good to allow for
libpq not to buffer, since there's cases where it's not needed...
  

See past discussions.  The problem is that libpq's API says that when it
hands you back the completed query result, the command is complete and
guaranteed not to fail later.  A streaming interface could not make that
guarantee, so it's not a transparent substitution.

I wouldn't have any strong objection to providing a separate API that
operates in a streaming fashion, but defining it is something no one's
bothered to do yet.  In practice, if you have to code to a variant API,
it's not that much more trouble to use a cursor...




Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
this for free?


  


It won't solve it in the general case for clients that expect a result
set. ISTM that "use a cursor" is a perfectly reasonable answer, though.


cheers

andrew




Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Zoltan Boszormenyi

Hi!

Tom Lane wrote:

Jim C. Nasby [EMAIL PROTECTED] writes:
  

I've been able to verify this on 8.1.4; psql -A -t -c 'SELECT * FROM
largetable'  /dev/null results in psql consuming vast quantities of
memory. Why is this?



Is it different without the -A?

I'm reading this as just another uninformed complaint about libpq's
habit of buffering the whole query result.  It's possible that there's
a memory leak in the -A path specifically, but nothing said so far
provided any evidence for that.

regards, tom lane
  


So, is libpq always buffering the result? Thanks.
I thought psql buffers only because in its formatted output mode
it has to know the widest value for all the columns.

Then the SELECT INTO TEMP ; COPY TO STDOUT solution
I found is _the_ solution.

I guess then the libpq-based ODBC driver suffers
from the same problem? It certainly explains the
performance problems I observed: the server
finishes the query, the ODBC driver (or libpq underneath)
fetches all the records and the application receives
the first record after all these. Nice.

Best regards,
Zoltán Böszörményi




Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Mark Woodward
 Mark Woodward wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:

 On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:

 I'm reading this as just another uninformed complaint about libpq's
 habit of buffering the whole query result.  It's possible that
 there's
 a memory leak in the -A path specifically, but nothing said so far
 provided any evidence for that.

 Certainly seems like it. It seems like it would be good to allow for
 libpq not to buffer, since there's cases where it's not needed...

 See past discussions.  The problem is that libpq's API says that when
 it
 hands you back the completed query result, the command is complete and
 guaranteed not to fail later.  A streaming interface could not make
 that
 guarantee, so it's not a transparent substitution.

 I wouldn't have any strong objection to providing a separate API that
 operates in a streaming fashion, but defining it is something no one's
 bothered to do yet.  In practice, if you have to code to a variant API,
 it's not that much more trouble to use a cursor...



 Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
 this for free?




 It won't solve it in the general case for clients that expect a result
 set. ISTM that use a cursor is a perfectly reasonable answer, though.

I'm not sure I agree -- surprise!

psql is often used as a command line tool and using a cursor is not
acceptable.

Granted, with an unaligned output, perhaps psql should not buffer the
WHOLE result at once, but without rewriting that behavior, a COPY from
query may be close enough.



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Andrew Dunstan

Mark Woodward wrote:





Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
this for free?



  

It won't solve it in the general case for clients that expect a result
set. ISTM that use a cursor is a perfectly reasonable answer, though.



I'm not sure I agree -- surprise!

psql is often used as a command line tool and using a cursor is not
acceptable.

Granted, with an unaligned output, perhaps psql should not buffer the
WHOLE result at once, but without rewriting that behavior, a COPY from
query may be close enough.

  


You have missed my point. Surprise!

I didn't say it wasn't OK in the psql case, I said it wasn't helpful in 
the case of *other* libpq clients.


Expecting clients generally to split and interpret COPY output is not 
reasonable, but if they want large result sets they should use a cursor.


cheers

andrew




Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Zoltan Boszormenyi

Andrew Dunstan wrote:

Mark Woodward wrote:

Jim C. Nasby [EMAIL PROTECTED] writes:
   

On Mon, Jun 05, 2006 at 11:27:30AM -0400, Tom Lane wrote:
 

I'm reading this as just another uninformed complaint about libpq's
habit of buffering the whole query result.  It's possible that 
there's

a memory leak in the -A path specifically, but nothing said so far
provided any evidence for that.


Certainly seems like it. It seems like it would be good to allow for
libpq not to buffer, since there's cases where it's not needed...
  
See past discussions.  The problem is that libpq's API says that 
when it

hands you back the completed query result, the command is complete and
guaranteed not to fail later.  A streaming interface could not make 
that

guarantee, so it's not a transparent substitution.

I wouldn't have any strong objection to providing a separate API that
operates in a streaming fashion, but defining it is something no one's
bothered to do yet.  In practice, if you have to code to a variant API,
it's not that much more trouble to use a cursor...




Wouldn't the COPY (select ...) TO STDOUT format being discussed solve
this for free? 


Yes, it would for me.

It won't solve it in the general case for clients that expect a result 
set. ISTM that use a cursor is a perfectly reasonable answer, though.


The general case cannot be applied to every particular case.
E.g. you cannot use cursors from shell scripts, and just for
producing an export file they are not too reasonable either.
Redirecting psql's output or using COPY is enough.

Best regards,
Zoltán Böszörményi




Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Neil Conway
On Mon, 2006-06-05 at 19:17 +0200, Zoltan Boszormenyi wrote:
 The general case cannot be applied for all particular cases.
 E.g. you cannot use cursors from shell scripts

This could be fixed by adding an option to psql to transparently produce
SELECT result sets via a cursor.

-Neil





[HACKERS] adding new field

2006-06-05 Thread ranbeer makin
Hi,

I have two questions to ask:

1. How to flush buffer in postgres? I'm not able to see the changes that I
make in my code. Sometimes the output comes and sometime it's not. What
might be the problem?

2. I have added a new field in RESDOM structure, made necessary
modifications in outfuncs.c, copy, read, equalfuncs.c but my prog is
behaving in a weird way, seems like some memory probs... What other
modifications need to be done to reflect the addition of this new field?

I looked up the archive, but found no results.

Best,
Ranbeer
Hyderabad, India


Re: [HACKERS] adding new field

2006-06-05 Thread Martijn van Oosterhout
On Mon, Jun 05, 2006 at 11:24:09PM +0530, ranbeer makin wrote:
 Hi,
 
 I have two questions to ask:
 
 1. How to flush buffer in postgres? I'm not able to see the changes that I
 make in my code. Sometimes the output comes and sometime it's not. What
 might be the problem?

How do you mean? What kind of buffer? If you mean when stuff in the
buffer cache hits disk, well, normally changes are xlogged first, so they
may not appear in the actual datafiles. However, a CHECKPOINT may do it.

 2. I have added a new field in RESDOM structure, made necessary
 modifications in outfuncs.c, copy, read, equalfuncs.c but my prog is
 behaving in a weird way, seems like some memory probs...What other
 modifications needs to be done to reflect the addition of this new field?

Did you remember to recompile *everything* affected? (--enable-depend
is useful for this). You also have to initdb again.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] adding new field

2006-06-05 Thread Jonah H. Harris

On 6/5/06, ranbeer makin [EMAIL PROTECTED] wrote:

1. How to flush buffer in postgres? I'm not able to see the changes that I
make in my code.


I assume you mean an output buffer?  If you're using fprintf, make
sure to do an fflush.  Otherwise, use elog(LOG, ...) or elog(DEBUG1, ...).
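
A tiny illustrative fragment (the message text and variable name are made up;
it assumes backend code that already includes postgres.h):

fprintf(stderr, "my debug: value = %d\n", myvalue);	/* hypothetical variable */
fflush(stderr);		/* without the flush, the line may never appear */

elog(LOG, "my debug: value = %d", myvalue);		/* always written to the server log */
elog(DEBUG1, "my debug: value = %d", myvalue);	/* shown only at higher debug levels */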


 2. I have added a new field in RESDOM structure, made necessary
modifications in outfuncs.c, copy, read, equalfuncs.c but my prog is
behaving in a weird way, seems like some memory probs...What other
modifications needs to be done to reflect the addition of this new field?


Example of *wierd*?

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/



Re: [HACKERS] [PERFORM] psql -A (unaligned format) eats too much

2006-06-05 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Mon, 2006-06-05 at 19:17 +0200, Zoltan Boszormenyi wrote:
 The general case cannot be applied for all particular cases.
 E.g. you cannot use cursors from shell scripts

 This could be fixed by adding an option to psql to transparently produce
 SELECT result sets via a cursor.

Note of course that such a thing would push the incomplete-result
problem further upstream.  For instance in (hypothetical --cursor
switch)
	psql --cursor -c "select ..." | myprogram
there would be no very good way for myprogram to find out that it'd
been sent an incomplete result due to an error partway through the SELECT.

regards, tom lane



Re: [HACKERS] adding new field

2006-06-05 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Mon, Jun 05, 2006 at 11:24:09PM +0530, ranbeer makin wrote:
 2. I have added a new field in RESDOM structure, made necessary
 modifications in outfuncs.c, copy, read, equalfuncs.c but my prog is
 behaving in a weird way, seems like some memory probs...What other
 modifications needs to be done to reflect the addition of this new field?

 Did you remember to recompile *everything* affected? (--enable-depend
 is useful for this). You also have to initdb again.

Also, if you're working on a patch you hope to someday contribute, you
should be starting from CVS HEAD or some reasonable approximation of it.
Resdom disappeared more than a year ago:
http://archives.postgresql.org/pgsql-committers/2005-04/msg00060.php

regards, tom lane



Re: [HACKERS] adding new field

2006-06-05 Thread Alvaro Herrera
Jonah H. Harris wrote:

  2. I have added a new field in RESDOM structure, made necessary
 modifications in outfuncs.c, copy, read, equalfuncs.c but my prog is
 behaving in a weird way, seems like some memory probs...What other
 modifications needs to be done to reflect the addition of this new field?
 
 Example of *wierd*?

http://www.m-w.com/

wierd:

The word you've entered isn't in the dictionary. Click on a spelling suggestion
below or try again using the search box to the right.

Suggestions for wierd:

 1. weird

weird:

Main Entry: 2weird
Function: adjective
1 : of, relating to, or caused by witchcraft or the supernatural : MAGICAL
2 : of strange or extraordinary character : ODD, FANTASTIC

(2weird, because 1weird is a noun)

Tom said some time ago weird is spelled weirdly.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] adding new field

2006-06-05 Thread Jonah H. Harris

On 6/5/06, Alvaro Herrera [EMAIL PROTECTED] wrote:

The word you've entered isn't in the dictionary.


Thanks Alvaro... my typing sometimes gets ahead of my in-brain spellcheck.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/



Re: [HACKERS] Connection Broken with Custom Dicts for TSearch2

2006-06-05 Thread Oleg Bartunov

On Mon, 5 Jun 2006, Teodor Sigaev wrote:




Teodor Sigaev wrote:
Sorry, it isn't mentioned on page, but this example of code working only 
with before 8.1 versions. In 8.1 interface to dictionary was changed.


Try attached dict_tmpl.c

2Oleg: place file on site, pls


done






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



[HACKERS] fillfactor using WITH syntax

2006-06-05 Thread ITAGAKI Takahiro
Hi Hackers,

I'm rewriting the fillfactor patch, per the following discussion:
http://archives.postgresql.org/pgsql-hackers/2006-03/msg00287.php
Now fillfactor can be set using WITH syntax:
  - CREATE INDEX index ON table USING btree (columns) WITH (...)
  - CREATE TABLE table (i integer PRIMARY KEY WITH (...))
  - ALTER TABLE table ADD PRIMARY KEY (columns) WITH (...)
The settings are stored in pg_class.relfillfactor and the last value will
be used on the next REINDEX. The WITH parameter is a list of DefElems, so
we can use it to pass additional parameters to index access methods.


I also added the same extension to table creation:
  - CREATE TABLE table (columns) WITH (...)
  - CREATE TABLE table WITH (...) AS SELECT/EXECUTE ...
Fill factor for tables works on INSERT, COPY, VACUUM FULL, CLUSTER, and
UPDATE to another page (it is not used when rows are updated in the same
page). It is not so useful currently, but if we optimize same-page updates,
the free-space control will do some good.
(The optimization is discussed in [HACKERS] Faster Updates,
http://archives.postgresql.org/pgsql-hackers/2006-06/msg00116.php)


Now, I want to ask you how to modify WITH parameters for existing
tables/indexes. One idea is to extend the re-organization commands:
  - REINDEX INDEX index WITH (...)
  - CLUSTER index ON table WITH (...)
  - VACUUM FULL WITH (...)
Another is to use ALTER, but it may be unclear whether the change will
be applied immediately or delayed until the next re-organization:
  - ALTER TABLE/INDEX name SET (...)


I appreciate any comments.

---
ITAGAKI Takahiro
NTT OSS Center




[HACKERS] That EXPLAIN ANALYZE patch still needs work

2006-06-05 Thread Tom Lane
Just got this rather surprising result:

regression=# \timing
Timing is on.
regression=# explain analyze select * from tenk1 a join tenk1 b on a.unique1 = 
b.unique2;
QUERY PLAN  
  
--
 Hash Join  (cost=825.00..2484.00 rows=1 width=488) (actual 
time=250.510..2557.129 rows=1 loops=1)
   Hash Cond: (a.unique1 = b.unique2)
   -  Seq Scan on tenk1 a  (cost=0.00..458.00 rows=1 width=244) (actual 
time=0.084..30.070 rows=1 loops=1)
   -  Hash  (cost=458.00..458.00 rows=1 width=244) (actual 
time=250.242..250.242 rows=1 loops=1)
 -  Seq Scan on tenk1 b  (cost=0.00..458.00 rows=1 width=244) 
(actual time=0.019..23.317 rows=1 loops=1)
 Total runtime: 847.855 ms
(6 rows)

Time: 856.179 ms
regression=# 

The Total runtime is correct AFAICT, which puts the top node's actual
time rather far out in left field.  This is pretty repeatable on my old
slow HPPA machine.  A new Xeon shows less of a discrepancy, but it's
still claiming top node actual > total, which is not right.

I realize that the sampling code can't be expected to be exactly right,
but a factor-of-three error is not acceptable.  One of us broke this :-(

regards, tom lane



Re: [HACKERS] fillfactor using WITH syntax

2006-06-05 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 Now, I want to ask you how to modify WITH parameters for existing
 tables/indexes.

I'd go with the ALTER TABLE, rather than cluttering N other commands.
There's already precedent for delayed effects of parameter alterations
(SET, ALTER SET STATISTICS, ALTER SET STORAGE, etc).  Documenting which
commands cause the new values to take effect seems sufficient.

regards, tom lane
