Re: [HACKERS] Patch to mark items as static or not used

2006-07-15 Thread Teodor Sigaev



RenameRewriteRule, and I defer to Teodor about the gist and gin
functions.  The others range somewhere between no and hell no.


I think that gistcentryinit() and extractEntriesS() should not be a 
static. 	


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [PATCHES] [patch 0/9] annual pgcrypto update

2006-07-15 Thread Marko Kreen

On 7/14/06, Tom Lane [EMAIL PROTECTED] wrote:

I wrote:
 Applied, thanks.  What I now see is that pgp-pubkey-decrypt passes on
 a 32-bit machine but dumps core on a 64-bit machine, with SIGSEGV here:

Addendum: seems it only fails without openssl.


iMath's protability checks failed.  Attached patch drops them
and expects postgres.h to give correct types.

Also enable asserts to fail earlier.

--
marko
Index: contrib/pgcrypto/imath.c
===
RCS file: /opt/cvs/pgsql/contrib/pgcrypto/imath.c,v
retrieving revision 1.1
diff -u -c -r1.1 imath.c
*** contrib/pgcrypto/imath.c	13 Jul 2006 04:15:24 -	1.1
--- contrib/pgcrypto/imath.c	15 Jul 2006 01:12:52 -
***
*** 33,39 
  #include imath.h
  
  #undef assert
! #define assert(TEST)
  #define TRACEABLE_CLAMP 0
  #define TRACEABLE_FREE 0
  
--- 33,39 
  #include imath.h
  
  #undef assert
! #define assert(TEST) Assert(TEST)
  #define TRACEABLE_CLAMP 0
  #define TRACEABLE_FREE 0
  
Index: contrib/pgcrypto/imath.h
===
RCS file: /opt/cvs/pgsql/contrib/pgcrypto/imath.h,v
retrieving revision 1.1
diff -u -c -r1.1 imath.h
*** contrib/pgcrypto/imath.h	13 Jul 2006 04:15:24 -	1.1
--- contrib/pgcrypto/imath.h	15 Jul 2006 01:15:02 -
***
*** 39,49 
  typedef unsigned int   mp_size;
  typedef intmp_result;
  #ifdef USE_LONG_LONG
! typedef unsigned int   mp_digit;
! typedef unsigned long long mp_word;
  #else
! typedef unsigned short mp_digit;
! typedef unsigned int   mp_word;
  #endif
  
  typedef struct mpz {
--- 39,53 
  typedef unsigned int   mp_size;
  typedef intmp_result;
  #ifdef USE_LONG_LONG
! typedef uint32 mp_digit;
! typedef uint64 mp_word;
! #define MP_DIGIT_MAX   0xULL
! #define MP_WORD_MAX0xULL
  #else
! typedef uint16 mp_digit;
! typedef uint32 mp_word;
! #define MP_DIGIT_MAX   0xUL
! #define MP_WORD_MAX0xUL
  #endif
  
  typedef struct mpz {
***
*** 70,90 
  #define MP_DIGIT_BIT(sizeof(mp_digit) * CHAR_BIT)
  #define MP_WORD_BIT (sizeof(mp_word) * CHAR_BIT)
  
- #ifdef USE_LONG_LONG
- #  ifndef ULONG_LONG_MAX
- #ifdef ULLONG_MAX
- #  define ULONG_LONG_MAX   ULLONG_MAX
- #else
- #  error Maximum value of unsigned long long not defined!
- #endif
- #  endif
- #  define MP_DIGIT_MAX   (ULONG_MAX * 1ULL)
- #  define MP_WORD_MAXULONG_LONG_MAX
- #else
- #  define MP_DIGIT_MAX(USHRT_MAX * 1UL)
- #  define MP_WORD_MAX (UINT_MAX * 1UL)
- #endif
- 
  #define MP_MIN_RADIX2
  #define MP_MAX_RADIX36
  
--- 74,79 

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Patch to mark items as static or not used

2006-07-15 Thread Andrew Dunstan

Neil Conway wrote:

On Sat, 2006-07-15 at 00:05 -0400, Tom Lane wrote:
  

The fundamental problem with find_static is that it hasn't got a clue
about likely future changes, nor about what we think external add-ons
might want



We could annotate the source to indicate that some functions are
deliberately intended to be externally visible, but not referenced
within the source tree, and then teach find_static to grok that
annotation.

  


I thought of that, but what if one gets missed? Is the tool worth the 
hassle?


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Patch to mark items as static or not used

2006-07-15 Thread Martijn van Oosterhout
On Sat, Jul 15, 2006 at 09:34:57AM -0400, Andrew Dunstan wrote:
 Neil Conway wrote:
 We could annotate the source to indicate that some functions are
 deliberately intended to be externally visible, but not referenced
 within the source tree, and then teach find_static to grok that
 annotation.
 
 I thought of that, but what if one gets missed? Is the tool worth the 
 hassle?

The tool is just a tool. The annotation is so that some human won't
come to the conclusion that it can be removed. Teaching the tool to
skip it is just a bonus.

Some places mark external functions with DLLEXPORT but I guess we could
invent a comment that would be machine readable.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCHES] [patch 0/9] annual pgcrypto update

2006-07-15 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 iMath's protability checks failed.  Attached patch drops them
 and expects postgres.h to give correct types.

 Also enable asserts to fail earlier.

Applied, thanks.  Looks like this should fix current buildfarm
failure on bear.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] More #ifdef fun: src/interfaces/libpq/win32.c

2006-07-15 Thread Tom Lane
src/interfaces/libpq/win32.c contains

/* Declared here to avoid pulling in all includes, which causes name 
collissions */ 
#ifdef ENABLE_NLS
extern char *
libpq_gettext(const char *msgid)
__attribute__((format_arg(1)));
#else
#define libpq_gettext(x) (x)
#endif

Now, since it has nowhere bothered to #include pg_config.h, ENABLE_NLS
will certainly not be defined, and so this code always fails to invoke
gettext.

This may not be real important since the file appears to use gettext
only here:

if (!success)
sprintf(strerrbuf, libpq_gettext(Unknown socket error 
(0x%08X/%i)), err, err);

but it's still pretty bogus.

Without a Windows machine, I'm not in a position to try making this file
properly include postgres_fe.h, or at least c.h, but someone ought to
try harder.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Still more #ifdef fun: struct sockaddr_storage

2006-07-15 Thread Tom Lane
pgcheckdefines reports

src/port/getaddrinfo.c references HAVE_STRUCT_SOCKADDR_STORAGE_SS_LEN, defined 
in src/include/libpq/pqcomm.h

I think this is an actual bug: if we have a platform that calls its
struct sockaddr_storage member __ss_len, and we are using our own
getaddrinfo code, we will fail to initialize the length member of
the returned ai_addr struct.

It would be safe to include src/include/libpq/pqcomm.h into
getaddrinfo.c, but it seems a tad ugly.  Perhaps the machinations
in pqcomm.h to normalize the sockaddr_storage names ought to be
moved into a separate header file?  If so, where to put it?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] PG_RETURN_INT64 vs PointerGetDatum ANALYZE

2006-07-15 Thread Sergey E. Koposov

Hello, Hackers

I recently discovered that some my tables with the functional indices
have the wrong pg_stats records.

wsdb=# \d q3c
  Table public.q3c
 Column |   Type   | Modifiers 
+--+---

 ipix   | bigint   |
 ra | double precision |
 dec| double precision | 
Indexes:

ipix_idx btree (ipix)
q3c_func_idx btree (q3c_ang2ipix(ra, dec)) CLUSTER

wsdb=# select * from pg_stats where tablename ~'q3c' and attname ~'pg';
 schemaname | tablename  | attname | null_frac | avg_width | n_distinct |   most_common_vals| most_common_freqs | histogram_bounds | correlation 
++-+---+---++---+---+--+-

 public | q3c_func_idx   | pg_expression_1 | 0 | 8 |
  1 | {6000}| {1}   |  |
   1

You see that the most_common_freqs=1 which is certainly not true.

Starting to dig into the problem and checking my C function  I saw that when 
my C function (returning bigint) use the  return PointerGetDatum((ipix_buf));
instead of PG_RETURN_INT64 ,the ANALYZE command produces the wrong statistics. 
But I think that's wrong, isn't it ?


So the short toy example:

CREATE OR REPLACE FUNCTION q3c_ang2ipix(double precision, double precision)
RETURNS bigint
AS '$libdir/q3c', 'pgq3c_ang2ipix'
LANGUAGE C IMMUTABLE STRICT;

C function definition:

PG_FUNCTION_INFO_V1(pgq3c_ang2ipix);
Datum pgq3c_ang2ipix(PG_FUNCTION_ARGS)
{
static q3c_ipix_t ipix_buf;
ipix_buf ++;
elog(WARNING,%lld,ipix_buf);
return PointerGetDatum((ipix_buf));
}

And I have the table q3c 
wsdb=# \d q3c

  Table public.q3c
 Column |   Type   | Modifiers 
+--+---

 ipix   | bigint   |
 ra | double precision |
 dec| double precision | 
Indexes:

ipix_idx btree (ipix)
q3c_func_idx btree (q3c_ang2ipix(ra, dec)) CLUSTER


Now I do 
wsdb=# ANALYZE q3c;

WARNING:  1
WARNING:  2
WARNING:  3
WARNING:  4

WARNING:  2998
WARNING:  2999
WARNING:  3000
ANALYZE
wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg';
 schemaname |  tablename   | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation 
+--+-+---+---++--+---+--+-

 public | q3c_func_idx | pg_expression_1 | 0 | 8 |  
1 | {3000}   | {1}   |  |   1
(1 row)

So the values of most_common_vals, most_common_freqs are wrong

But if I replace the return PointerGetDatum((ipix_buf));
by the  PG_RETURN_INT64(ipix_buf)
the analyze works fine

wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg';
 schemaname |  tablename   | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs |  histogram_bounds  | correlation 
+--+-+---+---++--+---++-

 public | q3c_func_idx | pg_expression_1 | 0 | 8 | 
-1 |  |   | 
{1,300,600,900,1200,1500,1800,2100,2400,2700,3000} |   1
(1 row)

Is it incorrect to use PointerGetDatum in those circumstances ?
Because I  always used it in that function to return bigints (to not have 
the additional overhead of PG_RETURN_INT64), and that's  the first time a 
see the bug due to that.


Thank you in advance,

Regards,
Sergey

***
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: [EMAIL PROTECTED]

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] contrib promotion?

2006-07-15 Thread Oleg Bartunov
We estimate 1 month for both us to complete these issues/todos and 
1 month extra to write documentation, so we certainly are not ready

for 8.2, most probable 8.3. We're looking for sponsorship of our work.

Oleg

On Sat, 15 Jul 2006, Teodor Sigaev wrote:


tsearch2 is functionality that definitely should be in core eventually,
but even Oleg still says it's not done.  Aside from the documentation
issue, it's not clear that we've got a stable API for it.


Issues/TODO to move tsearch2 into core (by fast look)
* memory management. Dictionaries and tsearch2 itself cache a lot
 of data in memory allocated by malloc/palloc(TopContext) and
 not all can be correctly freed by reset_tsearch() function.
* tsearch2 doesn't automatically reinit dictionary/configuration
 after configuration changes
* Also, dictionary doesn't automatically recompile its files if they
 were changed
* It will be good to store shared information (such as
 dictionary's structure) in shared memory. For example,
 ispell dict takes for compile its files about 1-5 seconds, sharing
 should help
* I suppose, at least snowball stemmers should be compiled in
 postgresql or compiled them into separate *.so libs. Sources
 of all snowball's files take ~1Mb on disk. The goal is easier
 configuration for end-user. Ispell dict's files take 1-3 Mb per
 language.
* Now database can be created with another encoding then cluster was
 initialized, but locale can't be changed. Tsearch2 depends on locale
 because of lower/upper casing and isalpha(and somr another is*). So,
 it's possible to get unworking tsearch2 configuration.


I'm going to vacation tomorrow for two week, so I havn't time to resolve that 
issues before feature freeze, sorry.




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

 http://www.postgresql.org/docs/faq



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] DISTINCT/Optimizer question

2006-07-15 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 On the other side of the coin, there's the analogy to GROUP BY that Greg
 points out --- there's some duplicated functionality there, but again it
 doesn't carry over to DISTINCT ON, AFAICS.

The equivalent query for:

SELECT DISTINCT ON (x,y) a,b,c FROM ... ORDER BY x,y,z

is:

SELECT x,y,z,first(a),first(b),first(c) FROM (
  SELECT x,y,z,a,b,c FROM ... ORDER BY x,y,z
) GROUP BY x,y

Getting the optimizer to treat this as well as DISTINCT ON would be quite a
trick. It would probably require the same machinery as getting min() and max()
to take maximum advantage of indexes in the face of a GROUP BY clause. That is
some sort of field for aggregate functions indicating what subset of the
records is sufficient for them and what path they would need for that to be
the case.

In any case I don't see how you get DISTINCT ON to work without sorting. For
min() and max() they could indicate they only need the first field if the
input is sorted and the optimizer could decide it's cheaper to pass them every
record that do the sort. For first() and last() they would tell the optimizer
they only need the first or last record with no particular path but that only
works because the rewritten query has an explicit ORDER BY clause.

That's about as far as I've thought about it. At the time I thought it would
likely be too hard for a first project. I suspect it's too hard for a second
project for that matter.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] monolithic distro (was: Re: Fwd: Three weeks left until feature freeze)

2006-07-15 Thread Josh Berkus
Lukas, all:

 So what I am suggesting is that PostgreSQL.org should push people
 towards the monolithic distro. The docs should contain everything that
 is in the monolithic distro. At conference we should say the name of the
 monolithic distro etc.

The issue I think you're ignoring is that maintaining such a distro and its 
build system for a reasonable number of platforms would require an enormous 
amount of work ... like, 3-4 full-time developers and at least a dozen 
part-time developers.   Compare the staff requirements for Debian, Red Hat or 
SuSE.  I can tell you from being the Bizgres admin for a few months that just 
trying to maintain/debug a build system that would do PostgreSQL + 
JasperReports + KETL + 4 optional modules on four platforms was easily 20-30 
hours of work, *per release*.  So this isn't something we can just vote into 
existance.

Second with endorsing or certifying projects on pgFoundry and elsewhere, 
who has the time?   To rate stuff as mature/not mature a committee of 
PostgreSQL people would have to be constantly reviewing projects, every 
single month, and probably getting into long political debates to boot.  If 
we do less, a repeat of the libpq++/libpqxx mess is inevitable.

It's very nice to throw these things out there and put them on the TODO 
list ... and if I had $100,000 in development money to throw at something, I 
might spend it that way ... but to propose them as *immediate* solutions to 
problems for 8.2 is fantasy.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pgsql-patches considered harmful

2006-07-15 Thread Josh Berkus
All,

Several of us hashed this out at the Code Sprint.  While the solution we 
arrived at doesn't completely satisfy Greg, several others would be fine with 
just having a version of pgsql-patches (pgsql-patches-lite?) that we could 
subscribe to to get the messages without the attachments.

Also, Greg pointed out the need to post periodic summaries of what 
features/patches had been committed.   So I guess I'll have to start doing 
that for PWN.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] monolithic distro

2006-07-15 Thread Lukas Smith

Josh Berkus wrote:

It's very nice to throw these things out there and put them on the TODO 
list ... and if I had $100,000 in development money to throw at something, I 
might spend it that way ... but to propose them as *immediate* solutions to 
problems for 8.2 is fantasy.


Point taken. Obviously I was not suggesting this for 8.2 .. it was more 
a suggestion for the midterm. I never installed Bizgres or Mammoth 
PostgreSQL. Maybe however there could be some lobbying from PostgreSQL 
core to better pool the resources currently directed at these two forks 
(and other similar efforts).


That being said, I am just a talker here that is hoping to instigate 
action by others and we all know talk is cheap.


regards,
Lukas


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] postgresql.conf basic analysis tool

2006-07-15 Thread Josh Berkus
Drew,

 Is there any interest in a basic perl script that would read through a
 postgresql.conf file and calculate approximate memory (and shared
 memory) usage? Also, are there any other (simple for now) things I
 should look at in the process? Asking because I'm getting annoyed with
 doing this by hand so...

I would really welcome your help with the Configurator project, which is 
intended to do this (among other things).

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG_RETURN_INT64 vs PointerGetDatum ANALYZE

2006-07-15 Thread Tom Lane
Sergey E. Koposov [EMAIL PROTECTED] writes:
 C function definition:

 PG_FUNCTION_INFO_V1(pgq3c_ang2ipix);
 Datum pgq3c_ang2ipix(PG_FUNCTION_ARGS)
 {
  static q3c_ipix_t ipix_buf;
  ipix_buf ++;
  elog(WARNING,%lld,ipix_buf);
  return PointerGetDatum((ipix_buf));
 }

This code is wrong on its face: it can't support multiple calls to the
function within a single query, because each call will damage the
previous call's result.  Try something like

select pgq3c_ang2ipix(...), pgq3c_ang2ipix(...) from ...

and you'll get bizarre behavior.

You need to return a palloc'd result, rather than returning pointers to
the same static variable on successive calls.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing wal rotation

2006-07-15 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-07-14 kell 17:39, kirjutas Simon Riggs:
 On Fri, 2006-07-14 at 12:09 -0400, Tom Lane wrote:
  Florian G. Pflug [EMAIL PROTECTED] writes:
   I've now thought about how to fix that without doing that rather crude 
   rsync-pg_xlog-hack.
   I've read through the code, and learned that wal-segments are expected to 
   have a specific size -
   thus rotating them early is not that easy.
  
  Simon was working on a patch for this at the code sprint; I think it's
  submitted to -patches already.  
 
 Slightly different patch. I'm working on this one still.

What is your approach here ?

And by any chance, do you plan to backport the standby WAL playback mode
patches to 8.0 and 8.1 series ?

  Explicitly filling the segment as you
  propose would be really bad for performance.
 
 Yes, current approach I am taking is better than that.

Another thing that was discussed was adding a function to postgres that
could be called to get current WAL file and offset, so an external
process could do async wal-copying at the time WAL is being written
instead of copying it all when it is finished. 

This could reduce the lag of data availability to only (fractions of)
seconds.

Is anyone working on it ?

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] automatic system info tool?

2006-07-15 Thread Andrew Dunstan
Someone at the conference mentioned a tool that would portably and 
reliably report system info such as architecture. If someone has details 
I would like to have them, as it would help solve the buildfarm 
personality problem properly.


cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Online index builds

2006-07-15 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-07-13 kell 01:07, kirjutas Greg Stark:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  On Wed, 2006-07-12 at 12:09 -0400, Greg Stark wrote:
   no regression tests yet.
  
  We'll need some performance tests that show that lock-hold time is
  *actually* reduced, given the shenanigans needed to get there.
 
 I'm not sure what you mean by lock-hold time. Online index builds
 effectively take *no* locks in the user-visible sense that regular index
 builds do. Other transactions can insert, update, delete continuously
 throughout the entire process.
 
 The only locks that are taken are
 
 1) a ShareUpdateExclusiveLock which blocks vacuum from running on the table
being indexed. This is taken by both phase 1 and phase 2. (Actually I had
the wrong lock in the patch I emailed in one place. Fixed in my source tree
here)
 
 2) An ExclusiveLock that is taken momentarily and immediately released. Even
if that can never be acquired due to a busy system it can eventually
proceed anyways as long as there are no long-running transactions that are
refusing to commit.
 
 That said we do need some performance tests to get an idea how long phase 2
 takes for large tables. The additional index and heap scan and tid sort could
 take a substantial amount of time though never as long as the original index
 build done in phase 1.

Maybe we can show progress indicators in status line (either
pg_stat_activity.current_query or commandline shown in ps), like 

WAITING TO START PHASE 1 - WAITING FOR TRANSACTION XXX TO COMPLETE 

or 

INSERTING INDEX ENTRY N OF M

changing every few seconds.

that could give the sysadmin some idea of what is going on without too
much verbosity on console.

there could of course be a VERBOSE mode, which acts similar to VACUUM
VERBOSE.

And why not make t possible to add a verbosity level there as well:
'CREATE INDEX ... VEBOSE 3'. And to VACUUM VERBOSE as well 

At level 3 all status changes could also be sent to client as well.

 What's worse is that in some cases the merge could potentially be doing a lot
 of retail index inserts. I have no good intuition for how long those will take
 relative to the wholesale index build method, especially since for some index
 methods like GIN retail inserts are extremely expensive.
 
 So for indexes that don't have a lot of records that need to be inserted
 individually what I expect -- and what I put in the docs -- is something under
 100% time penalty for an online index build. In fact I expect it to be more
 like 50% though it depends on how wide the original index. For ones that do
 have lots of records mutated for phase 2 all bets are off.

the only quarantee seems to be, that if there are still some resources
left, the index build will eventualy complete.

but showing progress will let the DBA to make the decision to abort the
build if he sees that it takes too long.


Another related thing - throttling
--

Did you do any work on using vacuum_cost_* GUC vars to throttle the
build process if desired ? 

I guess that the initial seqscans are probably cheap enough ( or at
least they are no worse than if someone just did select * ...
concurrently), but the index merge can probably still not be as light on
OLTP queries as desirable.

In reality on OLTP dbs even SELECT COUNT(*) can sometimes be a
mainenance operation and thus the goal may not be to complete as fast
as possible, but to be as light as possible on concurrent OLTP queries.

Eventually it would nice to have special optimiser rules for any
maintenance queries and DDL ops as defined by DBA, but making use of
vacuum_cost_* vars for generic throttling would be a good first cut ;


 
-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] automatic system info tool?

2006-07-15 Thread Josh Berkus
Andrew,

 Someone at the conference mentioned a tool that would portably and
 reliably report system info such as architecture. If someone has details
 I would like to have them, as it would help solve the buildfarm
 personality problem properly.

There's potentially two, actually.   A SF start-up is going to be 
open-sourcing a tool, soon, and Jim Nasby said that Distributed.net has code 
for this.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] More #ifdef fun: src/interfaces/libpq/win32.c

2006-07-15 Thread Hiroshi Saito

Hi Tom-san.

I think that it should be in this way.

#include postgres_fe.h
#ifdef WIN32
#include win32.h
#endif

However, I am still under try by the reason for seeing the error of now a few.

Regards,
Hiroshi Saito

- Original Message - 
From: Tom Lane 


src/interfaces/libpq/win32.c contains

/* Declared here to avoid pulling in all includes, which causes name collissions */ 
#ifdef ENABLE_NLS

extern char *
libpq_gettext(const char *msgid)
__attribute__((format_arg(1)));
#else
#define libpq_gettext(x) (x)
#endif

Now, since it has nowhere bothered to #include pg_config.h, ENABLE_NLS
will certainly not be defined, and so this code always fails to invoke
gettext.

This may not be real important since the file appears to use gettext
only here:

if (!success)
sprintf(strerrbuf, libpq_gettext(Unknown socket error (0x%08X/%i)), err, err);

but it's still pretty bogus.

Without a Windows machine, I'm not in a position to try making this file
properly include postgres_fe.h, or at least c.h, but someone ought to
try harder.

regards, tom lane



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] automatic system info tool?

2006-07-15 Thread Peter Eisentraut
Andrew Dunstan wrote:
 Someone at the conference mentioned a tool that would portably and
 reliably report system info such as architecture.

What's wrong with config.guess?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] PostmasterHandl_patch of win32

2006-07-15 Thread Tom Lane
Hiroshi Saito [EMAIL PROTECTED] writes:
 Probably, this application will be required.

This seems very messy.  The correct fix is just to put back the #include
line, no?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Online index builds

2006-07-15 Thread Greg Stark

Hannu Krosing [EMAIL PROTECTED] writes:

 Maybe we can show progress indicators in status line (either
 pg_stat_activity.current_query or commandline shown in ps), like 
 
 WAITING TO START PHASE 1 - WAITING FOR TRANSACTION XXX TO COMPLETE 
 
 or 
 
 INSERTING INDEX ENTRY N OF M
 
 changing every few seconds.

Hm. That would be very interesting. I'll say that one of the things that
impressed me very much with Postgres moving from Oracle was the focus on
usability. Progress indicators would be excellent for a lot of operations.

That said I'm not sure how much I can do here. For a substantial index we
should expect most of the time will be spent in the tuplesort. It's hard to
see how to get any sort of progress indicator out of there and as long as we
can't it's hard to see the point of getting one during the heap scan or any of
the other i/o operations.

I think it does make sense to put something in current_query indicating when
it's waiting for transactions to end and when it's past that point. That's
something the DBA should be aware of.

 And why not make t possible to add a verbosity level there as well:
 'CREATE INDEX ... VEBOSE 3'. And to VACUUM VERBOSE as well 
 
 At level 3 all status changes could also be sent to client as well.

Wouldn't you just control this with log_min_messages? It seems unnecessary to
clutter the grammar for every command with verbose options.

 Another related thing - throttling
 --
 
 Did you do any work on using vacuum_cost_* GUC vars to throttle the
 build process if desired ? 

Actually no. While there is consensus that will be necessary I'm not sure I
can do it with this patch. The problem is that most of the real heavy lifting
here is done inside tuplesort. Even aside from that most of what's left is
inside bulkdelete(*) and the code that handles regular index builds.

So I think we'll need some global thinking about what options Postgres needs
to control throttling in general. And probably someone needs to write a
separate patch that adds all the hooks to the various places in a single go.
Trying to throttle just one operation at a time when a lot of the code that
implements these operations is shared will have us running in circles.

(*) Hm. Come to think of it I wonder if the vacuum_cost parameters are already
kicking in for this phase. That would be a bit strange since it's the fastest
of the three scans.

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] PostmasterHandl_patch of win32

2006-07-15 Thread Hiroshi Saito

From: Tom Lane


Hiroshi Saito [EMAIL PROTECTED] writes:

Probably, this application will be required.


This seems very messy.  The correct fix is just to put back the #include
line, no?


@@ -18,6 +18,7 @@
#include unistd.h

#include miscadmin.h
+#include postmaster/postmaster.h
#include storage/pmsignal.h
#include storage/shmem.h

Ahh.., include by which you were committed is very good.
Thanks.!

Regards,
Hiroshi Saito


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Online index builds

2006-07-15 Thread Joshua D. Drake



That said I'm not sure how much I can do here. For a substantial index we
should expect most of the time will be spent in the tuplesort. It's hard to
see how to get any sort of progress indicator out of there and as long as we
can't it's hard to see the point of getting one during the heap scan or any of
the other i/o operations.


Well from a DBA perspective, just knowing that something productive is 
happening is useful.


When using vacuum I almost always use vacuum verbose, just so I have an 
idea of what is going on.


Joshua D. Drake
--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] pgwin32_backend_usleep seems overly cute/confusing

2006-07-15 Thread Tom Lane
Instead of this rigmarole with pg_usleep() being a macro to
pgwin32_backend_usleep, why don't we just call the function in
backend/port/win32/signal.c pg_usleep(), and put something like

#if defined(FRONTEND) || !defined(WIN32)

around the version in src/port/pgsleep.c?  We don't really want
that version present in the backend at all on Windows, ISTM.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Forcing wal rotation

2006-07-15 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 And by any chance, do you plan to backport the standby WAL playback mode
 patches to 8.0 and 8.1 series ?

That's not happening ... we do not put new features in stable branches.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Online index builds

2006-07-15 Thread Agent M
A great first step would be to add elog(INFO,...) in some standardized 
format over the wire so that clients can tell what's going on. It could 
be triggered by a GUC which is off by default.


-M

On Jul 15, 2006, at 9:10 PM, Greg Stark wrote:



Hannu Krosing [EMAIL PROTECTED] writes:


Maybe we can show progress indicators in status line (either
pg_stat_activity.current_query or commandline shown in ps), like

WAITING TO START PHASE 1 - WAITING FOR TRANSACTION XXX TO COMPLETE

or

INSERTING INDEX ENTRY N OF M

changing every few seconds.


Hm. That would be very interesting. I'll say that one of the things 
that
impressed me very much with Postgres moving from Oracle was the focus 
on
usability. Progress indicators would be excellent for a lot of 
operations.


That said I'm not sure how much I can do here. For a substantial index 
we
should expect most of the time will be spent in the tuplesort. It's 
hard to
see how to get any sort of progress indicator out of there and as long 
as we
can't it's hard to see the point of getting one during the heap scan 
or any of

the other i/o operations.

I think it does make sense to put something in current_query 
indicating when
it's waiting for transactions to end and when it's past that point. 
That's

something the DBA should be aware of.


¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
[EMAIL PROTECTED]
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match