Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-22 Thread Jeff Davis
On Thu, 2011-09-22 at 02:31 +0200, Florian Pflug wrote:
 My personal favourite would be '0', since it resembles the symbol used
 for empty sets in mathematics, and we already decided to use mathematical
 notation for ranges.
 
 If we're concerned that most of our users won't get that, then 'empty'
 would be a viable alternative I think.
 
 From a consistency POV it'd make sense to use a bracket-based syntax
 also for empty ranges. But the only available options would be '()' and '[]',
 which are too easily confused with '(,)' and '[,]' (which we already
 decided should represent the full range).

Yes, I think () is too close to (,).

Brainstorming so far:
 0   : simple, looks like the empty set symbol
 empty   : simple
 <empty>  : a little more obvious that it's special
   : visually looks empty
 -   : also looks empty
 {}  : mathematical notation, but doesn't quite fit ranges

I don't have a strong opinion. I'd be OK with any of those.
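For readers outside the thread, a hedged sketch of the thing being named: every range whose bounds collapse denotes the same empty set, so a constructor can canonicalize them all to a single output spelling. The class below is purely illustrative (not PostgreSQL code); released PostgreSQL versions ultimately settled on the spelling 'empty'.

```python
# Illustrative sketch only -- not PostgreSQL source. Any range whose
# bounds collapse (e.g. [b,b) with an exclusive end) denotes the same
# empty set, so a constructor can canonicalize them all to one spelling.
class IntRange:
    def __init__(self, lower=None, upper=None, bounds="[)"):
        # [b,b), (b,b] and (b,b) are empty; [b,b] still contains b.
        self.empty = (lower is not None and upper is not None
                      and (upper < lower
                           or (upper == lower and bounds != "[]")))
        self.lower, self.upper, self.bounds = lower, upper, bounds

    def __str__(self):
        if self.empty:
            return "empty"  # one canonical output spelling
        lo = "" if self.lower is None else str(self.lower)
        hi = "" if self.upper is None else str(self.upper)
        return f"{self.bounds[0]}{lo},{hi}{self.bounds[1]}"

print(IntRange(1, 1))              # empty
print(IntRange(1, 1, "[]"))        # [1,1]
print(IntRange(None, None, "()"))  # (,)  -- the full range
```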

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-22 Thread Heikki Linnakangas

On 22.09.2011 07:51, Tom Lane wrote:

Here's a revised version of the patch that behaves in a way that seems
reasonable to me, in particular it suppresses zero filter-count rows in
text mode.  I've not done anything yet about the documentation.


I haven't been following this closely, so sorry if this has already been 
discussed, but:


I find it a bit strange to print the number of lines filtered out. I 
think that's the only place where we would print a negative like that, 
everywhere else we print the number of lines let through a node. How 
about printing the number of lines that enter the filter, instead?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-22 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I haven't been following this closely, so sorry if this has already been 
 discussed, but:

 I find it a bit strange to print the number of lines filtered out. I 
 think that's the only place where we would print a negative like that, 
 everywhere else we print the number of lines let through a node. How 
 about printing the number of lines that enter the filter, instead?

Yeah, I thought seriously about that too.  The problem with it is that
you end up having to print that line all the time, whether or not it
adds any knowledge.  The "filter removed N rows" approach has the saving
grace that you can leave it out when no filtering is happening.  Another
point is that if you have two filters operating at a node, printing only
the starting number of rows doesn't let you disentangle which filter did
how much.

Now having said that, I could still be talked into the other way if
someone had a design that accounted for outer/semi/anti-join behavior
more clearly than this does.  I thought for a little bit that printing
the starting number of rows might offer such a solution, but on
inspection it didn't really seem to help.
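Tom's disentanglement point can be made concrete with a toy model. Everything below is assumed for illustration (the predicates and row counts are invented, not from any plan): a node applying two filters in sequence, where reporting only the rows entering the node cannot show how much each filter removed.

```python
# Toy model: a plan node with two filters applied in sequence.
# Reporting only rows entering the node (1000) hides the split of work;
# per-filter "removed" counts disentangle it.
rows_in = 1000
def join_qual(r):   return r % 2 == 0   # hypothetical first filter
def other_qual(r):  return r % 3 == 0   # hypothetical second filter

rows = list(range(rows_in))
after_first = [r for r in rows if join_qual(r)]
after_second = [r for r in after_first if other_qual(r)]

removed_by_first = rows_in - len(after_first)              # 500
removed_by_second = len(after_first) - len(after_second)   # 333
print(f"Rows Removed by Join Filter: {removed_by_first}")
print(f"Rows Removed by Filter: {removed_by_second}")
```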

regards, tom lane



Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST

2011-09-22 Thread Heikki Linnakangas

On 06.09.2011 20:34, Oleg Bartunov wrote:

Here is the latest spgist patch, which has all planned features as well as
all the overhead introduced by concurrency and recovery, so performance
measurements should be realistic now.


I'm ignoring the text suffix-tree part of this for now, because of the 
issue with non-C locales that Alexander pointed out.


Regarding the quadtree, have you compared the performance of that with 
Alexander's improved split algorithm? I ran some tests using the test 
harness I still had lying around from the fast GiST index build tests:


testname |  time   | accesses | indexsize
-+-+--+---
 points unordered auto   | 00:03:58.188866 |   378779 | 522 MB
 points ordered auto | 00:07:14.362355 |   177534 | 670 MB
 points unordered auto   | 00:02:59.130176 |46561 | 532 MB
 points ordered auto | 00:04:00.50756  |45066 | 662 MB
 points unordered spgist | 00:03:05.569259 |78871 | 394 MB
 points ordered spgist   | 00:01:46.06855  |   422104 | 417 MB
(8 rows)

These tests were with a table with 750 random points. In the 
ordered-tests, the table is sorted by x,y coordinates. 'time' is the 
time used to build the index on it, and 'accesses' is the total number 
of index blocks hit by a series of 1 bounding box queries, measured 
from pg_statio_user_indexes.idx_blks_hit + idx_blks_read.


The first two tests in the list are with a GiST index on unpatched 
PostgreSQL. The next six tests are with Alexander's double-sorting split 
patch. The last two tests are with an SP-GiST index.


It looks like the query performance with GiST using the double-sorting 
split is better than SP-GiST, although the SP-GiST index is somewhat 
smaller. The ordered case seems pathologically bad, is that some sort of 
a worst-case scenario for quadtrees?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] PostgreSQL X/Open Socket / BSD Socket Issue on HP-UX

2011-09-22 Thread MUHAMMAD ASIF

Very sorry for late reply.
You are right, the _xpg_ socket functionality is not available on older
systems; it is available in HP-UX 11.23 through patch HCO_35744. HP-UX 10.20
is a very old release (1996); I am using the latest HP-UX B.11.31 machine
and don't have access to older systems.

-D_XOPEN_SOURCE_EXTENDED makes the postgres build X/Open-Socket-enabled,
including the connectors, i.e. libpq. Now if the system-default 64-bit perl
(BSD Sockets) tries to use libpq (X/Open Sockets), it ends up with
unexpected results or errors. HP-UX does not allow mixing X/Open Socket
objects and BSD Socket objects in the same 64-bit binary; HP tried to fix
this with -D_HPUX_ALT_XOPEN_SOCKET_API on later versions of the OS. It
would be nice if postgres adopted this fix at least for the connectors
(PFA patch, a minor change in src/interfaces/libpq/Makefile), so that users
on later HP-UX boxes are not troubled by these socket issues and can connect
their applications to the database server through libpq without fear of the
X/Open vs. BSD Socket complexity. On older systems, defining
_HPUX_ALT_XOPEN_SOCKET_API should have no effect. Thanks.
( http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/xopen_networking.7.html )..
HP-UX provides two styles of Sockets API:
    - default BSD Sockets
    - X/Open Sockets
These two styles of Sockets API have the same function names but they have 
differences in semantics and argument types. For example, the optlen field in 
X/Open getsockopt() is size_t type, while BSD getsockopt() is int type. In 64 
bit mode, size_t is 64 bit and int is still 32 bit.
Linking objects compiled to X/Open Sockets specification and objects compiled 
to BSD Sockets specification in the same program using the linkage method in 
method A would erroneously resolve BSD Sockets calls to X/Open Sockets 
functions in the Xnet library. As a result, the program may result in 
application core dumps or unexpected Socket errors when it is run. These 
symptoms commonly occur when BSD Sockets accept(), getpeername(), 
getsockname(), getsockopt(), recvfrom(), sendmsg(), and recvmsg() are called.
..
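The optlen mismatch described in the man page excerpt can be demonstrated in miniature. This is an illustrative simulation (buffer offsets and values are invented): a callee writes an 8-byte size_t through a pointer where the caller reserved only a 4-byte int, clobbering the adjacent field on a little-endian 64-bit layout.

```python
# Simulation of the optlen ABI mismatch: the callee writes an 8-byte
# size_t where the caller reserved a 4-byte int. On a little-endian
# 64-bit layout the caller's adjacent field is silently clobbered by
# the high half of the wider store (here: zeroes).
import struct

buf = bytearray(8)
struct.pack_into("<i", buf, 4, -1)   # caller: adjacent 4-byte field = -1
struct.pack_into("<q", buf, 0, 16)   # callee: 8-byte size_t optlen = 16

optlen = struct.unpack_from("<i", buf, 0)[0]    # caller reads its int
neighbor = struct.unpack_from("<i", buf, 4)[0]  # adjacent field
print(optlen, neighbor)  # 16 0 -- the -1 next door has been overwritten
```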
Best Regards,
Muhammad Asif Naeem


 To: anaeem...@hotmail.com
 CC: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] PostgreSQL X/Open Socket / BSD Socket Issue on HP-UX
 Date: Tue, 20 Sep 2011 18:06:55 -0400
 From: t...@sss.pgh.pa.us

 MUHAMMAD ASIF anaeem...@hotmail.com writes:
  I faced similar issue as discussed in 
 http://postgresql.1045698.n5.nabble.com/Fwd-DBD-Pg-on-HP-UX-11-31-64bit-td3305163.html;.
  (man xopen_networking - 
  http://docstore.mik.ua/manuals/hp-ux/en/B2355-60130/xopen_networking.7.html)
  ... There are two ways to obtain X/Open Sockets functionality:
   * Method A is in compliance with X/Open compilation specification.
   * Method B slightly deviates from X/Open compilation specification. However, 
  Method B allows a program to include both objects compiled to X/Open 
  Sockets specification and objects compiled to BSD Sockets specification. ...
  PostgreSQL support X/Open Sockets. Apache web server (2.2.15, 
  /opt/hpws22/apache) and Perl (5.8.8, /opt/perl_64) are BSD Socket 
  applications that are default with the OS. I tried Method B (It provides 
  wrapper _xpg_ socket functions that allows using X/Open socket objects and 
  BSD socket objects in the same binary) to build PostgreSQL 9.1 code, I 
  LD_PRELOAD the generated libpq binary, without any other change both perl 
  and apache work fine with postgresql now,and it is easy to implement too. 
  We just need to build the source code with -D_XOPEN_SOURCE=600 
  -D_HPUX_ALT_XOPEN_SOCKET_API and link binary with libc. PFA patch. Thanks.

 AFAICT, the proposed patch will break things on at least some versions
 of HPUX. You can't just arbitrarily remove the reference to -lxnet,
 at least not without explaining to us why the existing comment about it
 is wrong. Likewise, removing -D_XOPEN_SOURCE_EXTENDED isn't
 acceptable without a whole lot more supporting evidence than you've
 provided. (I'm fairly certain that the latter will break the build on
 my old HPUX 10.20 box, for example.)

 regards, tom lane
  

hp-ux_socket.patch.v2
Description: Binary data



Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST

2011-09-22 Thread Alexander Korotkov
On Thu, Sep 22, 2011 at 2:05 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Regarding the quadtree, have you compared the performance of that with
 Alexander's improved split algorithm? I ran some tests using the test
 harness I still had lying around from the fast GiST index build tests:

testname |  time   | accesses | indexsize
 -+-+--+---
  points unordered auto   | 00:03:58.188866 |   378779 | 522 MB
  points ordered auto | 00:07:14.362355 |   177534 | 670 MB
  points unordered auto   | 00:02:59.130176 |46561 | 532 MB
  points ordered auto | 00:04:00.50756  |45066 | 662 MB
  points unordered spgist | 00:03:05.569259 |78871 | 394 MB
  points ordered spgist   | 00:01:46.06855  |   422104 | 417 MB
 (8 rows)

I assume the first two rows were produced by the new linear split
algorithm (current) and the second two rows by the double sorting split
algorithm (my patch).


 These tests were with a table with 750 random points. In the
 ordered-tests, the table is sorted by x,y coordinates. 'time' is the time
 used to build the index on it, and 'accesses' is the total number of index
 blocks hit by a series of 1 bounding box queries, measured from
 pg_statio_user_indexes.idx_blks_hit + idx_blks_read.

 The first two tests in the list are with a GiST index on unpatched
 PostgreSQL. The next six tests are with Alexander's double-sorting split
 patch. The last two tests are with an SP-GiST index.

 It looks like the query performance with GiST using the double-sorting
 split is better than SP-GiST, although the SP-GiST index is somewhat
 smaller. The ordered case seems pathologically bad, is that some sort of a
 worst-case scenario for quadtrees?

Comparing search speed by the number of page accesses is quite
reasonable among various GiST indexes. But when we're comparing
SP-GiST vs. GiST we should take into account that they have different
CPU/IO ratios. GiST scans the whole of each page it accesses, while
SP-GiST may scan only a fraction of a page, because several nodes can be
packed into a single page. Thereby it would be interesting to also
compare the CPU load of GiST vs. SP-GiST. Also, there is some hope of
reducing the number of page accesses in SP-GiST by improving the
clustering algorithm.
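Alexander's CPU/IO point can be sketched with a toy cost model. All numbers below are assumptions for illustration only (tuples per page and the SP-GiST scan fraction are invented, the page counts are taken from the table above): equal page-access counts need not imply equal CPU work.

```python
# Toy cost model (illustrative assumptions): GiST examines every entry
# on each page it touches, while SP-GiST may examine only the nodes of
# one inner tuple on a shared page, so page-access counts alone do not
# determine CPU work.
def tuples_examined(page_accesses, tuples_per_page, fraction_scanned):
    return page_accesses * tuples_per_page * fraction_scanned

TUPLES_PER_PAGE = 150          # assumed
gist = tuples_examined(45066, TUPLES_PER_PAGE, 1.0)     # whole page scanned
spgist = tuples_examined(78871, TUPLES_PER_PAGE, 0.2)   # assumed 20% of page

print(f"GiST tuples examined:    {gist:,.0f}")   # more pages-worth of CPU
print(f"SP-GiST tuples examined: {spgist:,.0f}") # despite more page hits
```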

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Double sorting split patch

2011-09-22 Thread Heikki Linnakangas

!   /*
!    * Calculate delta between penalties of join common entries to
!    * different groups.
!    */
!   for (i = 0; i < commonEntriesCount; i++)
    {
!       double      lower,
!                   upper;
!
!       box = DatumGetBoxP(entryvec->vector[commonEntries[i].index].key);
!       if (context.dim == 0)
!       {
!           lower = box->low.x;
!           upper = box->high.x;
!       }
!       else
!       {
!           lower = box->low.y;
!           upper = box->high.y;
!       }
!       commonEntries[i].delta = Abs(box_penalty(leftBox, box) -
!                                    box_penalty(rightBox, box));
    }


'lower' and 'upper' are not used for anything in the above. Is that just 
dead code that can be removed, or is there something missing that should 
be using them?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Kyotaro HORIGUCHI
Thank you for your understanding on that point.

At Wed, 21 Sep 2011 20:35:02 -0400, Robert Haas robertmh...@gmail.com wrote
 ...while Kyotaro Horiguchi clearly feels otherwise, citing the
 statistic that about 100 out of 7000 Japanese characters fail to work
 properly:
 
 http://archives.postgresql.org/pgsql-bugs/2011-07/msg00064.php
 
 That statistic seems to justify some action, but what?  Ideas:

In addition to those figures: they were based on the whole set of
characters defined in JIS X 0208, which has traditionally been used
for information interchange in Japan (though it is becoming history
now). Narrowing to the commonly-used characters (the `Jouyou Kanji',
which high school graduates in Japan are expected to know), 35 out of
2100 hit the problem.

# On the other hand, widening to JIS X 0213, which is roughly
# compatible with Unicode and defines more than 12K chars, I have
# not counted, but the additional 5k characters can be assumed to
# be less likely to fail than the chars in JIS X 0208.


 1. Adopt the patch as proposed, or something like it.
 2. Instead of installing encoding-specific character incrementing
 functions, we could try to come up with a more reliable generic
 algorithm.  Not sure exactly what, though.
 3. Come up with some way to avoid needing to do this in the first place.
 
 One random idea I have is - instead of generating >= and < clauses,
 could we define a prefix match operator - i.e. a ### b iff substr(a,
 1, length(b)) = b?  We'd need to do something about the selectivity,
 but I don't see why that would be a problem.
 
 Thoughts?

I am a newbie to PostgreSQL, but from a general point of view, I think
the most radical and clean way to fix this behavior is to build
forward matching for strings into the index machinery itself,
ignoring whatever overheads that might add. This would cover all the
failure cases this patch leaves unsolved, assuming that the `greater
string' does not need to be a `valid string' merely for searching the
btree.

Another idea I can think of is to add a new operator meaning `is this
string value smaller than the greater string of the parameter?'. Such
an operator could also defer making the `greater string' until just
before searching the btree, summing up histogram entries, or comparing
with column values. If the assumption above holds, making the greater
string can then be done regardless of the character encoding. This
seems to have a smaller impact than a prefix match operator.

# But, mmm, the more I investigate, the smaller the difference
# seems to me to be... It is beyond my knowledge for now, anyway;
# I need more study.



On the other hand, if no additional encoding-specific `character
increment functions' come out, the modification of pg_wchar_table can
be cancelled, and make_greater_string can select the `character
increment function' with `switch (GetDatabaseEncoding()) { case
PG_UTF8: ... }'.  This would also get rid of the pg_generic_charinc
tweak for libpq.
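The dispatch Kyotaro describes can be sketched in a few lines. The incrementer functions below are simplified stand-ins, not the real ones (the real make_greater_string also retries on progressively shorter prefixes); only the selection-by-encoding shape is the point.

```python
# Sketch of per-encoding incrementer selection with a generic fallback.
# The incrementers are deliberately simplified stand-ins.
def generic_charinc(b):
    # Bump the last byte; give up on overflow (the failure mode at issue).
    if b[-1] < 0xFF:
        return b[:-1] + bytes([b[-1] + 1])
    return None

def utf8_charinc(b):
    # Stand-in for an encoding-aware incrementer: step to the next code
    # point instead of blindly bumping the last byte.
    s = b.decode("utf-8")
    return (s[:-1] + chr(ord(s[-1]) + 1)).encode("utf-8")

def charinc_for(encoding):
    # The switch (GetDatabaseEncoding()) idea, as a dispatch table.
    return {"UTF8": utf8_charinc}.get(encoding, generic_charinc)

print(charinc_for("UTF8")(b"foo"))    # b'fop'
print(charinc_for("LATIN1")(b"foo"))  # b'fop'
print(generic_charinc(b"a\xff"))      # None -- where the generic one fails
```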



At Wed, 21 Sep 2011 21:49:27 -0400, Tom Lane t...@sss.pgh.pa.us wrote
 detail work; for instance, I noted an unconstrained memcpy into a 4-byte
 local buffer, as well as lots and lots of violations of PG house style.
 That's certainly all fixable but somebody will have to go through it.

Sorry for the style violations in the patch. I will fix them.


Regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Double sorting split patch

2011-09-22 Thread Alexander Korotkov
On Thu, Sep 22, 2011 at 3:22 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

  !   /*
  !    * Calculate delta between penalties of join common entries to
  !    * different groups.
  !    */
  !   for (i = 0; i < commonEntriesCount; i++)
      {
  !       double      lower,
  !                   upper;
  !
  !       box = DatumGetBoxP(entryvec->vector[commonEntries[i].index].key);
  !       if (context.dim == 0)
  !       {
  !           lower = box->low.x;
  !           upper = box->high.x;
  !       }
  !       else
  !       {
  !           lower = box->low.y;
  !           upper = box->high.y;
  !       }
  !       commonEntries[i].delta = Abs(box_penalty(leftBox, box) -
  !                                    box_penalty(rightBox, box));
      }


 'lower' and 'upper' are not used for anything in the above. Is that just
 dead code that can be removed, or is there something missing that should be
 using them?

Yes, it's just dead code.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Online base backup from the hot-standby

2011-09-22 Thread Fujii Masao
On Wed, Sep 21, 2011 at 5:34 PM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Sep 21, 2011 at 08:23, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander mag...@hagander.net wrote:
 Presumably pg_start_backup() will check this. And we'll somehow track
 this before pg_stop_backup() as well? (for such evil things such as
 the user changing FPW from on to off and then back to on again during
 a backup, which will make it look correct both during start and stop,
 but incorrect in the middle - pg_stop_backup needs to fail in that
 case as well)

 Right. As I suggested upthread, to address that problem, we need to log
 the change of FPW on the master, and then we need to check whether
 such a WAL is replayed on the standby during the backup. If it's done,
 pg_stop_backup() should emit an error.

 I somehow missed this thread completely, so I didn't catch your
 previous comments - oops, sorry. The important point being that we
 need to track if when this happens even if it has been reset to a
 valid value. So we can't just check the state of the variable at the
 beginning and at the end.

Right. Let me explain again what I'm thinking.

When FPW is changed, the master always writes the WAL record
which contains the current value of FPW. This means that the standby
can track all changes of FPW by reading WAL records.

The standby has two flags: one indicates whether FPW has always
been TRUE since the last restartpoint; the other indicates whether FPW
has always been TRUE since the last pg_start_backup(). The standby
can maintain those flags by reading the WAL records streamed from
the master.

If the former flag indicates FALSE (i.e., the WAL records which
the standby has replayed since last restartpoint might not contain
required FPW), pg_start_backup() fails. If the latter flag indicates
FALSE (i.e., the WAL records which the standby has replayed
during the backup might not contain required FPW),
pg_stop_backup() fails.

If I'm not missing something, this approach can address the problem
which you're concerned about.
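The two flags Fujii describes form a small state machine; here is a hedged sketch of it (class and method names are illustrative, not from any patch), exercised on exactly Magnus's "evil" case of FPW going off and back on mid-backup.

```python
# Sketch of the standby-side FPW tracking Fujii describes. Names are
# illustrative; the point is the two sticky flags driven by replayed
# FPW-change WAL records.
class StandbyFPWTracker:
    def __init__(self, fpw_on=True):
        self.fpw = fpw_on
        self.ok_since_restartpoint = fpw_on
        self.ok_since_backup_start = True

    def replay_fpw_change(self, new_value):
        self.fpw = new_value
        if not new_value:          # any OFF period poisons both flags
            self.ok_since_restartpoint = False
            self.ok_since_backup_start = False

    def restartpoint(self):
        self.ok_since_restartpoint = self.fpw

    def pg_start_backup(self):
        if not self.ok_since_restartpoint:
            raise RuntimeError("WAL since last restartpoint may lack full-page images")
        self.ok_since_backup_start = True

    def pg_stop_backup(self):
        if not self.ok_since_backup_start:
            raise RuntimeError("WAL replayed during backup may lack full-page images")

t = StandbyFPWTracker()
t.pg_start_backup()
t.replay_fpw_change(False)  # the evil case: FPW off ...
t.replay_fpw_change(True)   # ... and back on before pg_stop_backup
try:
    t.pg_stop_backup()
except RuntimeError as e:
    print("backup rejected:", e)
```

Checking only the variable's value at start and stop would accept this backup; the sticky flag rejects it.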

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 12:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I'm a bit perplexed as to why we can't find a non-stochastic way of doing 
 this.

 [ collations suck ]

Ugh.

 Now, having said that, I'm starting to wonder again why it's worth our
 trouble to fool with encoding-specific incrementers.  The exactness of
 the estimates seems unlikely to be improved very much by doing this.

Well, so the problem is that the frequency with which the algorithm
fails altogether seems to be disturbingly high for certain kinds of
characters.  I agree it might not be that important to get the
absolutely best next string, but it does seem important not to fail
outright.  Kyotaro Horiguchi gives the example of UTF-8 characters
ending with 0xbf.

 One random idea I have is - instead of generating >= and < clauses,
 could we define a prefix match operator - i.e. a ### b iff substr(a,
 1, length(b)) = b?  We'd need to do something about the selectivity,
 but I don't see why that would be a problem.

 The problem is that you'd need to make that a btree-indexable operator.

 Well, right.  Without that, there's not much point.  But do you think
 that's prohibitively difficult?

 The problem is that you'd just be shifting all these same issues into
 the btree index machinery, which is not any better equipped to cope with
 them, and would not be a good place to be adding overhead.

My thought was that it would avoid the need to do any character
incrementing at all.  You could just start scanning forward as if the
operator were = and then stop when you hit the first string that
doesn't have the same initial substring.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Greg Stark
On Thu, Sep 22, 2011 at 1:49 PM, Robert Haas robertmh...@gmail.com wrote:
 My thought was that it would avoid the need to do any character
 incrementing at all.  You could just start scanning forward as if the
 operator were = and then stop when you hit the first string that
 doesn't have the same initial substring.

But the whole problem is that not all the strings with the initial
substring are in a contiguous block. The best we can hope for is that
they're fairly dense within a block without too many non-matching
strings. The example with / shows how that can happen.

If you're looking for foo/% and you start with foo/ you'll find:

foo/
foo0
foo/0
foo1
foo/1
...

Even just case-insensitive collations don't put all the strings with a
common prefix in a contiguous block. If you're searching for foo%
you'll find:

foo
Foobar
foobar
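Greg's second example is easy to reproduce with any case-insensitive ordering; the sketch below uses Python's `casefold` as a stand-in for such a collation (it is not a real locale collation, just enough to show the interleaving).

```python
# Under a case-insensitive ordering (casefold as a stand-in collation),
# the strings matching the case-sensitive prefix "foo" do not form one
# contiguous run: 'Foobar' lands between 'foo' and 'foobar'.
strings = ["foo", "Foobar", "foobar", "fox", "FOP"]
ordered = sorted(strings, key=str.casefold)
print(ordered)  # ['foo', 'Foobar', 'foobar', 'FOP', 'fox']
print([s.startswith("foo") for s in ordered])
# [True, False, True, False, False] -- matches are interrupted mid-run
```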


-- 
greg



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-22 Thread Robert Haas
On Sun, Sep 18, 2011 at 5:39 AM, Kerem Kat kerem...@gmail.com wrote:
 I am new to postgresql code, I would like to start implementing easyish TODO
 items. I have read most of the development guidelines, faqs, articles by
 Greg Smith (Hacking Postgres with UDFs, Adding WHEN to triggers).
 The item I would like to implement is adding CORRESPONDING [BY
 (col1[,col2,...]])] to INTERSECT and EXCEPT operators.
 Can anyone comment on how much effort this item needs?

This seems reasonably tricky for a first project, but maybe not out of
reach if you are a skilled C hacker.  It's certainly more complicated
than my first patch:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a0b76dc662efde6e02921c2d16e06418483b7534

I guess the first question that needs to be answered here is ... what
exactly is this syntax supposed to do?  A little looking around
suggests that EXCEPT CORRESPONDING is supposed to make the
correspondence run by column names rather than by column positions,
and if you further add BY col1, ... then it restricts the comparison
to those columns.
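The semantics Robert sketches can be made concrete over rows-as-dicts. This is an illustrative model of the standard's intent, not a design for the executor: CORRESPONDING matches columns by name (the intersection of the two column sets), and CORRESPONDING BY restricts that to the listed columns.

```python
# Model of EXCEPT CORRESPONDING [BY (...)] over rows represented as
# dicts. Illustrative only; column order and duplicate handling follow
# the usual EXCEPT (set) semantics.
def except_corresponding(a_rows, b_rows, by=None):
    cols = by or sorted(set(a_rows[0]) & set(b_rows[0]))  # match by name
    project = lambda row: tuple(row[c] for c in cols)
    b_set = {project(r) for r in b_rows}
    seen, out = set(), []
    for r in a_rows:
        key = project(r)
        if key not in b_set and key not in seen:  # EXCEPT removes duplicates
            seen.add(key)
            out.append(dict(zip(cols, key)))
    return out

a = [{"id": 1, "name": "x", "extra": 9}, {"id": 2, "name": "y", "extra": 8}]
b = [{"name": "y", "id": 2}]
print(except_corresponding(a, b))               # [{'id': 1, 'name': 'x'}]
print(except_corresponding(a, b, by=["name"]))  # [{'name': 'x'}]
```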

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Online base backup from the hot-standby

2011-09-22 Thread Fujii Masao
On Wed, Sep 21, 2011 at 11:50 AM, Fujii Masao masao.fu...@gmail.com wrote:
 2011/9/13 Jun Ishiduka ishizuka@po.ntts.co.jp:

 Update patch.

 Changes:
  * set 'on' full_page_writes by user (in document)
  * read FROM: XX in backup_label (in xlog.c)
  * check status when pg_stop_backup is executed (in xlog.c)

 Thanks for updating the patch.

 Before reviewing the patch, to encourage people to comment and
 review the patch, I explain what this patch provides:

Attached is the updated version of the patch. I refactored the code, fixed
some bugs, added lots of source code comments, improved the document,
but didn't change the basic design. Please check this patch, and let's use
this patch as the base if you agree with that.

In the current patch, there is no safeguard for preventing users from
taking backup during recovery when FPW is disabled. This is unsafe.
Are you planning to implement such a safeguard?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/doc/src/sgml/backup.sgml
--- b/doc/src/sgml/backup.sgml
***************
*** 935,940 ****
  SELECT pg_stop_backup();
--- 935,999 ----
     </para>
    </sect2>
  
+   <sect2 id="backup-during-recovery">
+    <title>Making a Base Backup during Recovery</title>
+ 
+    <para>
+     It's possible to make a base backup during recovery, which allows a user
+     to take a base backup from the standby to offload the expense of
+     periodic backups from the master. The procedure is similar to that
+     during normal running.
+    <orderedlist>
+     <listitem>
+      <para>
+       Ensure that hot standby is enabled (see <xref linkend="hot-standby">
+       for more information).
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Connect to the database as a superuser and execute <function>pg_start_backup</>.
+       This performs a restartpoint if there is at least one checkpoint record
+       replayed since the last restartpoint.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Perform a file system backup.
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Copy the pg_control file from the cluster directory to the backup as follows:
+ <programlisting>
+ cp $PGDATA/global/pg_control /mnt/server/backupdir/global
+ </programlisting>
+      </para>
+     </listitem>
+     <listitem>
+      <para>
+       Again connect to the database as a superuser, and execute
+       <function>pg_stop_backup</>. This terminates backup mode, but does not
+       switch to the next WAL segment, create a backup history file, or
+       wait for all required WAL segments to be archived, unlike during
+       normal processing.
+      </para>
+     </listitem>
+    </orderedlist>
+    </para>
+ 
+    <para>
+     You cannot use the <application>pg_basebackup</> tool to take the backup
+     during recovery.
+    </para>
+    <para>
+     It's not possible to make a base backup from the server in recovery mode
+     when reading WAL written during a period when <varname>full_page_writes</>
+     was disabled. If you take a base backup from the standby,
+     <varname>full_page_writes</> must be set to true on the master.
+    </para>
+   </sect2>
+ 
    <sect2 id="backup-pitr-recovery">
     <title>Recovering Using a Continuous Archive Backup</title>
  
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 1680,1685 ****
  SET ENABLE_SEQSCAN TO OFF;
--- 1680,1693 ----
     </para>
  
     <para>
+     WAL written while <varname>full_page_writes</> is disabled does not
+     contain enough information to make a base backup during recovery
+     (see <xref linkend="backup-during-recovery">),
+     so <varname>full_page_writes</> must be enabled on the master
+     to take a backup from the standby.
+    </para>
+ 
+    <para>
      This parameter can only be set in the <filename>postgresql.conf</>
      file or on the server command line.
      The default is <literal>on</>.
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 14014,14020 ****
  SELECT set_config('log_statement_stats', 'off', false);
     <para>
      The functions shown in <xref
      linkend="functions-admin-backup-table"> assist in making on-line backups.
!     These functions cannot be executed during recovery.
     </para>
  
     <table id="functions-admin-backup-table">
--- 14014,14021 ----
     <para>
      The functions shown in <xref
      linkend="functions-admin-backup-table"> assist in making on-line backups.
!     These functions except <function>pg_start_backup</> and <function>pg_stop_backup</>
!     cannot be executed during recovery.
     </para>
  
     <table id="functions-admin-backup-table">
***************
*** 14094,14100 ****
  SELECT set_config('log_statement_stats', 'off', false);
      database cluster's data directory, performs a checkpoint,
      and then returns the backup's starting transaction log location as text.
      The user can ignore this result value, but it is
!     provided in case it is useful.
  <programlisting>
  postgres=# select 

Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 8:59 AM, Greg Stark st...@mit.edu wrote:
 On Thu, Sep 22, 2011 at 1:49 PM, Robert Haas robertmh...@gmail.com wrote:
 My thought was that it would avoid the need to do any character
 incrementing at all.  You could just start scanning forward as if the
 operator were = and then stop when you hit the first string that
 doesn't have the same initial substring.

 But the whole problem is that not all the strings with the initial
 substring are in a contiguous block. The best we can hope for is that
 they're fairly dense within a block without too many non-matching
 strings. The example with / shows how that can happen.

 If you're looking for foo/% and you start with foo/ you'll find:

 foo/
 foo0
 foo/0
 foo1
 foo/1
 ...

 Even just case-insensitive collations don't put all the strings with a
 common prefix in a contiguous block. If you're searching for foo%
 you'll find:

 foo
 Foobar
 foobar

If that were true for the sorts of indexes we're using for LIKE
queries, the existing approach wouldn't work either.   All we're doing
is translating:

 a LIKE 'foo/%'

to

a ~>=~ 'foo/%' AND a ~<~ 'foo0'

...where ~>=~ and ~<~ are just text-pattern-ops versions of >= and <
that ignore the normal collation rules and just compare bytes.

In general, if we wanted to get rid of text_pattern_ops and make all
of this work with arbitrary indexes, yeah, that would be very
difficult.
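The rewrite Robert shows relies on a make_greater_string-style helper to manufacture the exclusive upper bound. A hedged bytewise sketch (the real function works on characters, which is exactly where the encoding trouble in this thread arises):

```python
# Bytewise analogue of make_greater_string: bump the last byte of the
# prefix to get an exclusive upper bound; on 0xFF, shorten and retry.
# Under bytewise ("C"-collation-like) ordering, the LIKE matches then
# form one contiguous run between the two bounds.
def greater_string(prefix):
    b = bytearray(prefix)
    while b:
        if b[-1] < 0xFF:
            b[-1] += 1
            return bytes(b)
        b.pop()          # trailing 0xFF: drop it and bump the next byte
    return None          # e.g. b'\xff' -- no finite upper bound exists

prefix = b"foo/"
upper = greater_string(prefix)  # b'foo0': '/' is 0x2f, '0' is 0x30
data = sorted([b"foo/", b"foo0", b"foo/0", b"foo1", b"foo/1", b"fon"])
in_range = [s for s in data if prefix <= s < upper]
print(in_range)  # [b'foo/', b'foo/0', b'foo/1'] -- the LIKE 'foo/%' matches
```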

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Sep 22, 2011 at 8:59 AM, Greg Stark st...@mit.edu wrote:
 But the whole problem is that not all the strings with the initial
 substring are in a contiguous block.

 If that were true for the sorts of indexes we're using for LIKE
 queries, the existing approach wouldn't work either.

Right.  Since it's not a problem for the sorts of indexes with which we
can use LIKE, moving knowledge of LIKE into the btree machinery doesn't
buy us a darn thing, except more complexity in a place where we can ill
afford it.  The essential problem here is "when can you stop scanning,
given a pattern with this prefix?", and btree doesn't know any more
about that than make_greater_string does; it would in fact have to use
make_greater_string or something isomorphic to it.

regards, tom lane



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Robert Haas
On Wed, Sep 21, 2011 at 5:07 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 That's the sort of thing where it would be helpful to provide one or
 two URLs for cogent explanations of this.  Even if it takes repeated
 readings and meditations on the explanations for it to sink in, this
 is worth it.  (For SSI I had to read the paper many times, and then
 go read several referenced papers, before I really had my head
 around it, and I've had others say the same thing.  But having a
 link to the material gives someone a chance to *do* that.)

Hmm...

*looks around the Internet some more*

These might be a good place to start, although the first one is
somewhat Linux-kernel specific:

http://www.rdrop.com/users/paulmck/scalability/paper/ordering.2007.09.19a.pdf
http://www.rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdf

There's also a reasonably cogent explanation in the Linux kernel
itself, in Documentation/memory-barriers.txt
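Those references are dense, so here is a toy Python sketch of the publish/consume pattern they keep returning to. It is purely illustrative: pure Python cannot exhibit hardware store reordering, so threading.Event stands in for the "write barrier + flag" on the producing side and the "read barrier" on the consuming side:

```python
import threading

# A writer stores a payload, then sets a "ready" flag; the reader must not
# observe the flag without also observing the payload.  On weakly-ordered
# hardware this requires a write barrier between the two stores and a read
# barrier on the consuming side.  Event.set()/wait() already implies that
# ordering, which is why this sketch is deterministic.
data = None
ready = threading.Event()

def writer():
    global data
    data = 42        # store the payload first...
    ready.set()      # ...then publish the flag

def reader(results):
    ready.wait()     # pairs with set(): the payload store is now visible
    results.append(data)

results = []
consumer = threading.Thread(target=reader, args=(results,))
producer = threading.Thread(target=writer)
consumer.start()
producer.start()
producer.join()
consumer.join()
# results == [42]
```

In C without the Event-style synchronization, the two plain stores in writer() are exactly where a pg_write_barrier()-like primitive would be needed.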

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] new createuser option for replication role

2011-09-22 Thread Cédric Villemain
Hello

Before doing the complete review, I hit a regression with 9.1 createuser.

The command 'createuser -e -s foo' produces:
  CREATE ROLE foo SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN NOREPLICATION;

before it was:
  CREATE ROLE foo SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN;

REPLICATION was allowed by default for superusers, and the current
patch changes that default by explicitly removing the right.

I believe we should add REPLICATION only when --replication is
set, and NOREPLICATION only when --no-replication is set.
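The tri-state behaviour being proposed can be sketched in a few lines (illustrative Python, not the real createuser code; the function name and signature are invented for this example):

```python
def build_create_role(name, superuser=False, replication=None):
    """Sketch of the suggested option handling: emit REPLICATION or
    NOREPLICATION only when --replication / --no-replication was given
    (replication=True/False).  With the default (replication=None) say
    nothing, so the server applies its own default, as 9.1 did."""
    parts = ["CREATE ROLE", name]
    parts.append("SUPERUSER" if superuser else "NOSUPERUSER")
    if replication is True:
        parts.append("REPLICATION")
    elif replication is False:
        parts.append("NOREPLICATION")
    return " ".join(parts) + ";"
```

With this shape, 'createuser -e -s foo' keeps producing a statement without any replication clause, avoiding the regression.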

2011/9/11 Fujii Masao masao.fu...@gmail.com:
 On Sat, Sep 10, 2011 at 11:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 Currently createuser cannot create a role with REPLICATION privilege
 because it doesn't have any option to do that. Which sometimes annoys
 me when setting up replication. I'd like to propose to add new options
 -x (--replication) and -X (--no-replication) into createuser. -x 
 allows
 the new user to do replication, and -X disallows. The default is -X.
 Is it worth creating the patch?

 Though I'd like to use -r and -R as the option name, they have already
 been used for CREATEROLE privilege. So I'm thinking to use -x and
 -X derived from XLOG. But does anyone have better option name?

 Better solution: don't have a short form of the switch.

 That sounds better. I revised the patch so that it adds only --replication
 option to createuser.

 Regards,

 --
 Fujii Masao
 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
 NTT Open Source Software Center







-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-22 Thread Kerem Kat
I delved into the code without waiting for comments from the list, just to
learn something about PostgreSQL internals. I have finished
CORRESPONDING; CORRESPONDING BY is now being tested. I will also write
documentation and regression tests.


Yes Robert, you are correct. Using the SQL:20nn standard draft as a guide,
a brief explanation can be given as follows:

Shorter version: the column name lists are intersected.
Short version: In set operation queries (those containing
INTERSECT, EXCEPT or UNION), a CORRESPONDING clause can be used to project
the resulting columns to only those contained in both sides of the query.
There is also an addition, BY (col1, col2, ...), which projects the
columns to its own list. An example query will clarify.

SELECT 1 a, 2 b UNION CORRESPONDING SELECT 3 a;
a
--
1
3

SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(a, c) SELECT 4 a, 5 c
a   c
--
1   3
4   5
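As an editorial illustration of those semantics (a Python model with an invented function name, not the patch itself), the projection and union can be simulated like this:

```python
def corresponding_union(lcols, lrows, rcols, rrows, by=None):
    """Model of UNION CORRESPONDING [BY (...)]: project both sides onto
    the shared column names (or the BY list), keeping left-hand column
    order, then union the projected rows with duplicate elimination."""
    cols = by if by is not None else [c for c in lcols if c in rcols]
    li = [lcols.index(c) for c in cols]   # projection indexes, left side
    ri = [rcols.index(c) for c in cols]   # projection indexes, right side
    out, seen = [], set()
    for rows, idx in ((lrows, li), (rrows, ri)):
        for row in rows:
            t = tuple(row[i] for i in idx)
            if t not in seen:             # UNION removes duplicates
                seen.add(t)
                out.append(t)
    return cols, out
```

Running it on the two example queries reproduces the outputs shown above.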



On Thu, Sep 22, 2011 at 16:20, Robert Haas robertmh...@gmail.com wrote:

 On Sun, Sep 18, 2011 at 5:39 AM, Kerem Kat kerem...@gmail.com wrote:
  I am new to postgresql code, I would like to start implementing easyish
 TODO
  items. I have read most of the development guidelines, faqs, articles by
  Greg Smith (Hacking Postgres with UDFs, Adding WHEN to triggers).
  The item I would like to implement is adding CORRESPONDING [BY
  (col1[,col2,...]])] to INTERSECT and EXCEPT operators.
  Can anyone comment on how much effort this item needs?

 This seems reasonably tricky for a first project, but maybe not out of
 reach if you are a skilled C hacker.  It's certainly more complicated
 than my first patch:


 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a0b76dc662efde6e02921c2d16e06418483b7534

 I guess the first question that needs to be answered here is ... what
 exactly is this syntax supposed to do?  A little looking around
 suggests that EXCEPT CORRESPONDING is supposed to make the
 correspondence run by column names rather than by column positions,
 and if you further add BY col1, ... then it restricts the comparison
 to those columns.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-22 Thread Linas Virbalas
 2.2. pg_start_backup('backup_under_load') on the master (this will take a
 while as master is loaded up);
 
 No. if you use pg_start_backup('foo', true) it will be fast. Check the
 manual.
 
 If the server is sufficiently heavily loaded that a checkpoint takes a
 nontrivial amount of time, the OP is correct that this will be not
 fast, regardless of whether you choose to force an immediate
 checkpoint.

In order to check more cases, I have changed the procedure to force an
immediate checkpoint, i.e. pg_start_backup('backup_under_load', true). With
the same load generator running, pg_start_backup returned almost
instantaneously compared to how long it took previously.

Most importantly, after doing this change, I cannot reproduce the pg_clog
error message anymore. In other words, with immediate checkpoint hot backup
succeeds under this load!

 2.3. rsync data/global/pg_control to the standby;
 
 Why are you doing this? If ...
 
 2.4. rsync all other data/ (without pg_xlog) to the standby;
 
 you will copy it again or no? Don't understand your point.
 
 His point is that exercising the bug depends on doing the copying in a
 certain order.  Any order of copying the data theoretically ought to
 be OK, as long as it's all between starting the backup and stopping
 the backup, but apparently it isn't.

Please note that in the past I was able to reproduce the same pg_clog error
even when taking a single rsync of the whole data/ folder (i.e. without
splitting it into two steps).

 The problem could be that the minimum recovery point (step 2.3) is different
 from the end of rsync if you are under load.

Do you have ideas why does the Hot Backup operation with
pg_start_backup('backup_under_load', true) succeed while
pg_start_backup('backup_under_load') fails under the same load?

Originally, I was using pg_start_backup('backup_under_load') in order not to
clog the master server during the I/O required for the checkpoint. Of
course, now, it seems, this should be sacrificed for the sake of a
successful backup under load.

 It seems pretty clear that some relevant chunk of WAL isn't getting
 replayed, but it's not at all clear to me why not.  It seems like it
 would be useful to compare the LSN returned by pg_start_backup() with

If needed, I could do that, if I had the exact procedure... Currently,
during the start of the backup I take the following information:

pg_xlogfile_name(pg_start_backup(...))

 the location at which replay begins when you fire up the clone.

As you have seen in my original message, in the pg_log I get only the
restored WAL file names after starting up the standby. Can I tune the
postgresql.conf to include the location at which replay begins in the log?

 Could you provide us with the exact rsync version and parameters you use?

rsync -azv
version 2.6.8  protocol version 29

--
Sincerely,
Linas Virbalas
http://flyingclusters.blogspot.com/




Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Greg Stark
On Thu, Sep 22, 2011 at 2:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The essential problem here is "when can you stop scanning,
 given a pattern with this prefix?", and btree doesn't know any more
 about that than make_greater_string does; it would in fact have to use
 make_greater_string or something isomorphic to it.

Hm, as long as btree_pattern_ops is the only opclass that behaves this
way that's more or less true. But Robert's right that if btree just
stops when it finds something that doesn't match it doesn't need to
hard code any knowledge of what the next value would be. If there
were any other op classes that had this abstract property of always
putting strings with common prefixes in a contiguous block then it
would continue to work without having to know where to find the
boundaries of that contiguous block.

Just as an example, if you had a pattern_ops opclass that sorted the
string assuming it was in some other encoding like, say,  EBCDIC, then
make_greater_string would have to learn about it but Robert's model
would just work.

This isn't entirely facetious. Sorting by EBCDIC ordering would be
silly, but I vaguely recall there being some examples that wouldn't be
silly. And perhaps some collations could actually be marked as
acceptable even if they don't sort in pure ASCII ordering and
make_greater_string doesn't actually know about them.



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Sep 22, 2011 at 12:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Now, having said that, I'm starting to wonder again why it's worth our
 trouble to fool with encoding-specific incrementers.  The exactness of
 the estimates seems unlikely to be improved very much by doing this.

 Well, so the problem is that the frequency with which the algorithm
 fails altogether seems to be disturbingly high for certain kinds of
 characters.  I agree it might not be that important to get the
 absolutely best next string, but it does seem important not to fail
 outright.  Kyotaro Horiguchi gives the example of UTF-8 characters
 ending with 0xbf.

[ thinks for a bit ... ]  Yeah, it's certainly true that such a
character might be relatively small in the overall sort order.  The
assumption underlying what we're doing now is that dropping the last
character and incrementing the next-to-last one instead isn't terribly
catastrophic from an estimation accuracy standpoint.  I can see that
there are cases where that would fail to be true, but I'm not exactly
convinced that they're worse than all the other cases where we'll get
a poor estimate.

Anyway, I won't stand in the way of the patch as long as it's modified
to limit the number of values considered for any one character position
to something reasonably small.
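A simplified, printable-ASCII-only sketch of that strategy follows (purely illustrative; the real make_greater_string is backend C code that handles multibyte encodings and collation-aware comparisons). It tries at most a small number of candidate values at the last position, then drops the last character and retries one position earlier:

```python
MAX_TRIES_PER_POSITION = 4  # "reasonably small", per the discussion

def make_greater_string(s: str, is_valid=lambda ch: ch.isprintable()):
    """Return some string strictly greater than s under byte-wise order,
    or None if no candidate is found.  Single-byte toy model."""
    chars = list(s)
    while chars:
        code = ord(chars[-1])
        for step in range(1, MAX_TRIES_PER_POSITION + 1):
            if code + step > 0x7E:      # ran off the end of printable ASCII
                break
            cand = chr(code + step)
            if is_valid(cand):          # stand-in for "valid in the encoding"
                return "".join(chars[:-1]) + cand
        chars.pop()                     # give up here; increment earlier position
    return None
```

Dropping the last character when its position is exhausted is exactly the fallback whose estimation impact is debated above.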

regards, tom lane



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 On Thu, Sep 22, 2011 at 2:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The essential problem here is "when can you stop scanning,
 given a pattern with this prefix?", and btree doesn't know any more
 about that than make_greater_string does; it would in fact have to use
 make_greater_string or something isomorphic to it.

 Hm, as long as btree_pattern_ops is the only opclass that behaves this
 way that's more or less true. But Robert's right that if btree just
 stops when it finds something that doesn't match it doesn't need to
 hard code any knowledge of what the next value would be.

But you've added mechanism (and hence cycles) to btree searches,
and *you haven't actually gained anything*.  If the feature is
restricted to only work for sort orderings in which common-prefix
strings are contiguous, then it doesn't do anything we can't do
just as well with the existing mechanism.  Moreover, you'll still
need make_greater_string because of the problem of trying to extract
LIKE selectivity estimates from locale-dependent pg_statistic data.

regards, tom lane



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Peter Geoghegan
On 14 September 2011 21:29, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas robertmh...@gmail.com wrote:
 I've been thinking about this too and actually went so far as to do
 some research and put together something that I hope covers most of
 the interesting cases.  The attached patch is pretty much entirely
 untested, but reflects my present belief about how things ought to
 work.

 And, here's an updated version, with some of the more obviously broken
 things fixed.

As I've already pointed out, the comment "Won't work on Visual Studio
2003" is not accurate:

http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.71).aspx

Besides, if it's not supported, why bother mentioning it?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-22 Thread Kerem Kat
While testing I noticed that the ordering is incorrect in my implementation.
At first I thought that removing mismatched entries from ltargetlist and
rtargetlist would be enough; it wasn't, so I added rtargetlist
sorting.

SELECT 1 a, 2 b, 3 c UNION CORRESPONDING SELECT 4 b, 5 a, 6 c;
returns incorrectly:
a  b  c
1  2  3
4  5  6

Correct:
a  b  c
1  2  3
5  4  6
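What the sort needs to accomplish can be modeled in one step (illustrative Python, not the parser code; the function name is invented): each right-hand row's values must be permuted into the left-hand column order.

```python
def align_right_rows(lcols, rcols, rrows):
    """Reorder each right-hand row so its values line up with the
    left-hand column order, which is what sorting rtargetlist aims at."""
    idx = [rcols.index(c) for c in lcols]   # where each left column sits on the right
    return [tuple(row[i] for i in idx) for row in rrows]
```

For the example above, the right side's (4, 5, 6) under columns (b, a, c) becomes (5, 4, 6) under (a, b, c), the correct output.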

In analyze.c:transformSetOperationStmt, I tried to sort rtargetlist
before the forboth(ltl, ltargetlist, rtl, rtargetlist) loop, to no avail.
The sorted column names are in the correct order in rtargetlist, but the
query is executed as if rtargetlist had never been sorted.

Where does the targetlist get its column ordering? Apparently not while
the targetlist is being lappend'ed (?).


regards,

Kerem KAT



On Thu, Sep 22, 2011 at 17:03, Kerem Kat kerem...@gmail.com wrote:

 I delved into the code without waiting for comments from the list, just to
 learn something about PostgreSQL internals. I have finished
 CORRESPONDING; CORRESPONDING BY is now being tested. I will also write
 documentation and regression tests.


 Yes Robert, you are correct. Using the SQL:20nn standard draft as a
 guide, a brief explanation can be given as follows:

 Shorter version: the column name lists are intersected.
 Short version: In set operation queries (those containing
 INTERSECT, EXCEPT or UNION), a CORRESPONDING clause can be used to project
 the resulting columns to only those contained in both sides of the query.
 There is also an addition, BY (col1, col2, ...), which projects the
 columns to its own list. An example query will clarify.

 SELECT 1 a, 2 b UNION CORRESPONDING SELECT 3 a;
 a
 --
 1
 3

 SELECT 1 a, 2 b, 3 c UNION CORRESPONDING BY(a, c) SELECT 4 a, 5 c
 a   c
 --
 1   3
 4   5



 On Thu, Sep 22, 2011 at 16:20, Robert Haas robertmh...@gmail.com wrote:

 On Sun, Sep 18, 2011 at 5:39 AM, Kerem Kat kerem...@gmail.com wrote:
  I am new to postgresql code, I would like to start implementing easyish
 TODO
  items. I have read most of the development guidelines, faqs, articles by
  Greg Smith (Hacking Postgres with UDFs, Adding WHEN to triggers).
  The item I would like to implement is adding CORRESPONDING [BY
  (col1[,col2,...]])] to INTERSECT and EXCEPT operators.
  Can anyone comment on how much effort this item needs?

 This seems reasonably tricky for a first project, but maybe not out of
 reach if you are a skilled C hacker.  It's certainly more complicated
 than my first patch:


 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a0b76dc662efde6e02921c2d16e06418483b7534

 I guess the first question that needs to be answered here is ... what
 exactly is this syntax supposed to do?  A little looking around
 suggests that EXCEPT CORRESPONDING is supposed to make the
 correspondence run by column names rather than by column positions,
 and if you further add BY col1, ... then it restricts the comparison
 to those columns.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 10:53 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 As I've already pointed out, the comment "Won't work on Visual Studio
 2003" is not accurate:

 http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.71).aspx

 Besides, if it's not supported, why bother mentioning it?

I mentioned it because it took me a long time to figure out whether it
was supported or not, and I finally came to the conclusion that it
wasn't.  I stand corrected, though; I've now removed that reference.
Sorry for not jumping on it sooner; it was still vaguely on my list of
things to fix at some point, but it hadn't percolated to the top yet.

The attached version (hopefully) fixes various other things people
have complained about as well, including:

- Heikki's complaint about sometimes writing "atomic" instead of "barrier"
(which was a leftover), and
- Gurjeet's complaint that I hadn't defined the variable anywhere

I've also added a lengthy README file to the patch that attempts to
explain how barriers should be used in PostgreSQL coding.  It's
certainly not a comprehensive treatment of the topic, but hopefully
it's enough to get people oriented.  I've attempted to tailor it a bit
to PostgreSQL conventions, like talking about shared memory vs.
backend-private memory instead of assuming (as a number of other
discussions of this topic do) a thread model.  It also includes some
advice about when memory barriers shouldn't be used or won't work, and
some references for further reading.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


barrier-v3.patch
Description: Binary data



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Thom Brown
On 22 September 2011 16:18, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 22, 2011 at 10:53 AM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
 As I've already pointed out, the comment "Won't work on Visual Studio
 2003" is not accurate:

 http://msdn.microsoft.com/en-us/library/f20w0x5e(v=vs.71).aspx

 Besides, if it's not supported, why bother mentioning it?

 I mentioned it because it took me a long time to figure out whether it
 was supported or not, and I finally came to the conclusion that it
 wasn't.  I stand corrected, though; I've now removed that reference.
 Sorry for not jumping on it sooner; it was still vaguely on my list of
 things to fix at some point, but it hadn't percolated to the top yet.

 The attached version (hopefully) fixes various other things people
 have complained about as well, including:

 - Heikki's complaint about sometimes writing "atomic" instead of "barrier"
 (which was a leftover), and
 - Gurjeet's complaint that I hadn't defined the variable anywhere

 I've also added a lengthy README file to the patch that attempts to
 explain how barriers should be used in PostgreSQL coding.  It's
 certainly not a comprehensive treatment of the topic, but hopefully
 it's enough to get people oriented.  I've attempted to tailor it a bit
 to PostgreSQL conventions, like talking about shared memory vs.
 backend-private memory instead of assuming (as a number of other
 discussions of this topic do) a thread model.  It also includes some
 advice about when memory barriers shouldn't be used or won't work, and
 some references for further reading.

s/visca-versa/vice-versa/
s/laods/loads/

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Anyway, I won't stand in the way of the patch as long as it's modified
 to limit the number of values considered for any one character position
 to something reasonably small.

One thing I was thinking about is that it would be useful to have some
metric for judging how well any given algorithm that we might pick
here actually works.  For example, if we were to try all possible
three character strings in some encoding and run make_greater_string()
on each one of them, we could then measure the failure percentage.  Or
if that's too many cases to crank through then we could limit it some
way - but the point is, without some kind of test harness here, we
have no way of measuring the trade-off between spending more CPU time
and improving accuracy.  Maybe you have a better feeling for what's
reasonable there than I do, but I'm not prepared to take a stab in the
dark without benefit of some real measurements.
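A toy version of such a harness (Python, invented names, single characters only, far below the three-character scale suggested above) shows the shape of the measurement:

```python
def failure_rate(increment, charset):
    """Fraction of inputs for which the incrementer cannot produce a
    greater value: the metric being asked for, on a toy scale."""
    failures = [c for c in charset if increment(c) is None]
    return len(failures) / len(charset), failures

# Toy incrementer: bump the code point if the result stays in the charset.
printable = [chr(c) for c in range(0x20, 0x7F)]  # 95 printable ASCII chars

def bump(ch):
    nxt = chr(ord(ch) + 1)
    return nxt if nxt in printable else None

rate, failed = failure_rate(bump, printable)
# Only '~' (the last printable ASCII character) has no greater candidate.
```

A real harness would instead iterate over multibyte strings in each supported encoding and report the failure percentage per incrementer variant.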

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown t...@linux.com wrote:
 s/visca-versa/vice-versa/
 s/laods/loads/

Fixed.  v4 attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


barrier-v4.patch
Description: Binary data



Re: [HACKERS] unaccent contrib

2011-09-22 Thread Daniel Vázquez
Before 9.x, how does one unaccent full text searches?

2011/9/21 Tom Lane t...@sss.pgh.pa.us

 Euler Taveira de Oliveira eu...@timbira.com writes:
  On 21-09-2011 13:28, Daniel Vázquez wrote:
  unaccent is compatible with postgresql 8.4 (but not is in their
 contrib
  version distribution)

  No, it is not. AFAICS it is necessary to add some backend code that is
 not in 8.4.

 [ pokes at it ]  Yeah, you are right.  The version of unaccent that is
 in our source tree is a filtering dictionary, and therefore cannot
 possibly work with backends older than 9.0 (when the filtering
 dictionary feature was added).

 So I'm wondering where the OP read that it was compatible with 8.4.
 Our own documentation about it certainly does not say that.  It's
 possible that Oleg and Teodor had some prototype version, different
 from what got committed to our tree, that would work in 8.4.

regards, tom lane





-- 
Daniel Vázquez
SICONET (A Bull Group Company)
Torre Agbar. Avda. Diagonal, 211 - planta 23
08018 - Barcelona
telf: + 34 93 2272727 (Ext. 2952)
fax: + 34 93 2272728
www.bull.es - www.siconet.es
daniel.vazq...@bull.es


Re: [HACKERS] Online base backup from the hot-standby

2011-09-22 Thread Magnus Hagander
On Thu, Sep 22, 2011 at 14:13, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 5:34 PM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Sep 21, 2011 at 08:23, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander mag...@hagander.net 
 wrote:
 Presumably pg_start_backup() will check this. And we'll somehow track
 this before pg_stop_backup() as well? (for such evil things as
 the user changing FPW from on to off and then back to on again during
 a backup, which will make it look correct both during start and stop,
 but incorrect in the middle - pg_stop_backup needs to fail in that
 case as well)

 Right. As I suggested upthread, to address that problem, we need to log
 the change of FPW on the master, and then we need to check whether
 such a WAL is replayed on the standby during the backup. If it's done,
 pg_stop_backup() should emit an error.

 I somehow missed this thread completely, so I didn't catch your
 previous comments - oops, sorry. The important point being that we
 need to track if when this happens even if it has been reset to a
 valid value. So we can't just check the state of the variable at the
 beginning and at the end.

 Right. Let me explain again what I'm thinking.

 When FPW is changed, the master always writes the WAL record
 which contains the current value of FPW. This means that the standby
 can track all changes of FPW by reading WAL records.

 The standby has two flags: One indicates whether FPW has always
 been TRUE since last restartpoint. Another indicates whether FPW
 has always been TRUE since last pg_start_backup(). The standby
 can maintain those flags by reading WAL records streamed from
 the master.

 If the former flag indicates FALSE (i.e., the WAL records which
 the standby has replayed since last restartpoint might not contain
 required FPW), pg_start_backup() fails. If the latter flag indicates
 FALSE (i.e., the WAL records which the standby has replayed
 during the backup might not contain required FPW),
 pg_stop_backup() fails.

 If I'm not missing something, this approach can address the problem
 which you're concerned about.

Yeah, it sounds safe to me.

Would it make sense for pg_start_backup() to have the ability to wait
for the next restartpoint in a case like this, if we know that FPW has
been set? Instead of failing? Or maybe that's just overcomplicating
things when trying to be user-friendly.
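For what it's worth, the two flags described above can be modeled in a few lines (an editorial Python sketch of the proposed bookkeeping, not backend code; all names are invented):

```python
class StandbyFpwTracker:
    """Model of the two standby-side flags: whether full_page_writes (FPW)
    has stayed on since the last restartpoint, and since pg_start_backup().
    Driven by FPW-change WAL records replayed from the master."""
    def __init__(self):
        self.fpw = True
        self.ok_since_restartpoint = True
        self.ok_since_backup_start = True

    def replay_fpw_change(self, value: bool):
        self.fpw = value
        if not value:               # FPW went off: both windows are tainted
            self.ok_since_restartpoint = False
            self.ok_since_backup_start = False

    def restartpoint(self):
        # A new restartpoint opens a fresh window for the first flag.
        self.ok_since_restartpoint = self.fpw

    def start_backup(self):
        if not self.ok_since_restartpoint:
            raise RuntimeError("pg_start_backup: FPW was off since last restartpoint")
        self.ok_since_backup_start = True

    def stop_backup(self):
        if not self.ok_since_backup_start:
            raise RuntimeError("pg_stop_backup: FPW went off during the backup")
```

Note that turning FPW back on does not clear either flag, so the evil on-off-on sequence during a backup still makes pg_stop_backup() fail, as required.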

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Alvaro Herrera

Excerpts from Robert Haas's message of jue sep 22 12:18:47 -0300 2011:

 I've also added a lengthy README file to the patch that attempts to
 explain how barriers should be used in PostgreSQL coding.

Very enlightening, thanks.  Note a typo: "laods".

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 One thing I was thinking about is that it would be useful to have some
 metric for judging how well any given algorithm that we might pick
 here actually works.

Well, the metric that we were indirectly using earlier was the
number of characters in a given locale for which the algorithm
fails to find a greater one (excluding whichever character is last,
I guess, or you could just recognize there's always at least one).

 For example, if we were to try all possible
 three character strings in some encoding and run make_greater_string()
 on each one of them, we could then measure the failure percentage.  Or
 if that's too many cases to crank through then we could limit it some
 way -

Even in UTF8 there's only a couple million assigned code points, so for
test purposes anyway it doesn't seem like we couldn't crank through them
all.  Also, in many cases you could probably figure it out by analysis
instead of brute-force testing every case.

A more reasonable objection might be that a whole lot of those code
points are things nobody cares about, and so we need to weight the
results somehow by the actual popularity of the character.  Not sure
how to take that into account.

Another issue here is that we need to consider not just whether we find
a greater character, but how much greater it is.  This would apply to
my suggestion of incrementing the top byte without considering
lower-order bytes --- we'd be skipping quite a lot of code space for
each increment, and it's conceivable that that would be quite hurtful in
some cases.  Not sure how to account for that either.  An extreme
example here is an incrementer that just immediately returns the last
character in the sort order for any lesser input.

regards, tom lane



Re: [HACKERS] unaccent contrib

2011-09-22 Thread Euler Taveira de Oliveira

On 22-09-2011 12:39, Daniel Vázquez wrote:

Before 9.x, how does one unaccent full text searches?

Perform pre-processing (normalization) of the string *before* inserting and 
*before* searching.
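For example, that application-side normalization can be done with a few lines of Python (one possible approach using Unicode decomposition, not the contrib module itself):

```python
import unicodedata

def unaccent(text: str) -> str:
    """Strip accents by decomposing to NFD and dropping combining marks.
    A stand-in for contrib/unaccent on servers older than 9.0."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Applying the same function to stored text and to the search terms keeps the comparison accent-insensitive without any server-side support.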



--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] unaccent contrib

2011-09-22 Thread Robert Haas
2011/9/22 Daniel Vázquez daniel2d2...@gmail.com:
 Before 9.x, how does one unaccent full text searches?

It seems that Oleg has published something on his web site that
supposedly works with 8.4:

http://www.sai.msu.su/~megera/wiki/unaccent

But I'm not really sure how it works, or even where the source code
is.  I would suggest that you Google "postgresql 8.4 unaccent" or
something like that and click through the results.

Also, this mailing list is for discussions of PostgreSQL development,
so I think this discussion is quite a bit off-topic.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-22 Thread Euler Taveira de Oliveira

On 22-09-2011 11:24, Linas Virbalas wrote:

In order to check more cases, I have changed the procedure to force an
immediate checkpoint, i.e. pg_start_backup('backup_under_load', true). With
the same load generator running, pg_start_backup returned almost
instantaneously compared to how long it took previously.

Most importantly, after doing this change, I cannot reproduce the pg_clog
error message anymore. In other words, with immediate checkpoint hot backup
succeeds under this load!

Interesting. I remembered someone reporting this same problem but it was not 
reproducible by some of us.



Do you have ideas why does the Hot Backup operation with
pg_start_backup('backup_under_load', true) succeed while
pg_start_backup('backup_under_load') fails under the same load?


I don't but if you show us the output of the steps above...


If needed, I could do that, if I had the exact procedure... Currently,
during the start of the backup I take the following information:

Just show us the output of pg_start_backup and part of the standby log with 
the following message 'redo starts at' and the subsequent messages up to the 
failure.



--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, the metric that we were indirectly using earlier was the
 number of characters in a given locale for which the algorithm
 fails to find a greater one (excluding whichever character is last,
 I guess, or you could just recognize there's always at least one).

What about characters that sort differently in sequence than individually?

 For example, if we were to try all possible
 three character strings in some encoding and run make_greater_string()
 on each one of them, we could then measure the failure percentage.  Or
 if that's too many cases to crank through then we could limit it some
 way -

 Even in UTF8 there's only a couple million assigned code points, so for
 test purposes anyway it doesn't seem like we couldn't crank through them
 all.  Also, in many cases you could probably figure it out by analysis
 instead of brute-force testing every case.

 A more reasonable objection might be that a whole lot of those code
 points are things nobody cares about, and so we need to weight the
 results somehow by the actual popularity of the character.  Not sure
 how to take that into account.

I guess whether that's a problem in practice will depend somewhat on
the quality of the algorithms we're able to find.  If our best
algorithms still have a 1% failure rate, then yeah, that's an issue,
but in that case I'd suggest that our best algorithms suck and we need
to think harder about alternate solutions.  If we're talking about
failing on 5 characters out of a million we can just eyeball them.
I'm not trying to reduce this testing to something that is entirely
mechanic in every way; I'm just saying that I'm not optimistic about
my ability to judge which algorithms will work best in practice
without some kind of automated aid.

 Another issue here is that we need to consider not just whether we find
 a greater character, but how much greater it is.  This would apply to
 my suggestion of incrementing the top byte without considering
 lower-order bytes --- we'd be skipping quite a lot of code space for
 each increment, and it's conceivable that that would be quite hurtful in
 some cases.  Not sure how to account for that either.  An extreme
 example here is an incrementer that just immediately returns the last
 character in the sort order for any lesser input.

Right...  well, this is why I'm not wild about doing this by
incrementing in the first place.

But now that I think about it, what about using some
slightly-less-stupid version of that approach as a fallback strategy?
For example, we could pick, oh, say, 20 characters out of the space of
code points, about evenly distributed under whatever collations we
think are likely to be in use.  In the incrementer, we try some kind
of increment-the-bytes strategy for a while and if it doesn't pan out,
we zip through the array and try substituting each of the fallback
characters.  If more than one works, we test the survivors against
each other until we're left with just one winner.  The bound might not
be real tight, but as long as it's good enough to make the planner
pick an index scan it might not matter very much.
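A toy sketch of that fallback shape might look like the following. Everything here is hypothetical -- the function name, the candidate set, and the use of strcmp() as a stand-in for a collation-aware comparison -- it only illustrates "try a cheap increment first, then substitute candidates and keep the tightest winner":

```c
#include <string.h>

/* Hypothetical, evenly spread fallback candidates (illustrative only). */
const char fallback_chars[] = "5AMZam~";

/*
 * Try to build a string that sorts greater than "src" into "dst".
 * Returns 1 on success, 0 on failure.
 */
int
make_greater_sketch(const char *src, char *dst, size_t dstlen)
{
    size_t      len = strlen(src);
    const char *best = NULL;
    size_t      i;

    if (len == 0 || len + 1 > dstlen)
        return 0;
    memcpy(dst, src, len + 1);

    /* Cheap first try: bump the last byte while it stays printable. */
    if ((unsigned char) dst[len - 1] < 0x7e)
    {
        dst[len - 1]++;
        return 1;
    }

    /*
     * Fallback: substitute each candidate in the last position; among the
     * ones that sort above the original, keep the smallest, so the bound
     * stays as tight as possible.
     */
    for (i = 0; i < sizeof(fallback_chars) - 1; i++)
    {
        dst[len - 1] = fallback_chars[i];
        if (strcmp(dst, src) > 0 && (best == NULL || fallback_chars[i] < *best))
            best = &fallback_chars[i];
    }
    if (best == NULL)
    {
        memcpy(dst, src, len + 1);  /* restore; no candidate sorts higher */
        return 0;
    }
    dst[len - 1] = *best;
    return 1;
}
```

A real version would run the surviving candidates through the actual collation, of course; this just shows the control flow.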

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Sep 22, 2011 at 11:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, the metric that we were indirectly using earlier was the
 number of characters in a given locale for which the algorithm
 fails to find a greater one (excluding whichever character is last,
 I guess, or you could just recognize there's always at least one).

 What about characters that sort differently in sequence than individually?

Yeah, there's a whole 'nother set of issues there, but the character
incrementer is unlikely to affect that very much either way, I think.

 But now that I think about it, what about using some
 slightly-less-stupid version of that approach as a fallback strategy?
 For example, we could pick, oh, say, 20 characters out of the space of
 code points, about evenly distributed under whatever collations we
 think are likely to be in use.

Sure, if the increment the top byte strategy proves to not accomplish
that effectively.  But I'd prefer not to design a complex strategy until
it's been proven that a simpler one doesn't work.

regards, tom lane



Re: [HACKERS] new createuser option for replication role

2011-09-22 Thread Fujii Masao
On Thu, Sep 22, 2011 at 10:55 PM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 Before doing the complete review, I hit a regression with 9.1 createrole.

Thanks!

 the command ''createuser -e -s foo'' produces:
  CREATE ROLE foo SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN NOREPLICATION;

 before it was:
  CREATE ROLE foo SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN;

 The REPLICATION was allowed by default to superuser, and the current
 patch change the default to remove the right.

 I believe we should add only the REPLICATION when --replication is
 set, and NOREPLICATION when --no-replication is set.

Agreed. Attached is the updated version of the patch. It adds two options
--replication and --no-replication. If neither specified, neither REPLICATION
nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case,
replication privilege is granted to only superuser.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


createuser_replication_option_v3.patch
Description: Binary data



Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-22 Thread Martijn van Oosterhout
On Thu, Sep 22, 2011 at 02:41:12AM -0400, Tom Lane wrote:
 Yeah, I thought seriously about that too.  The problem with it is that
 you end up having to print that line all the time, whether or not it
 adds any knowledge.  The filter removed N rows approach has the saving
 grace that you can leave it out when no filtering is happening.  Another
 point is that if you have two filters operating at a node, printing only
 the starting number of rows doesn't let you disentangle which filter did
 how much.

I wonder if it would be more useful to print a percentage. If 0% is
filtered out you can still drop it but it gives a more useful output if
the number of rows is really large.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Josh Berkus

 1. citext_eq(citext,citext)
 2. citext_eq(text,citext)
 3. citext_eq(citext,text)
 
 Then the question is: does it find only #2 via polymorphic lookup, or does it 
 think that either #1 or #2 could work (because text supports an implicit cast 
 to citext, IIRC). If it's more than one it's an error. Not sure if the same 
 issue exists for operators.

Well, I just ran through the 7 potential combinations, and didn't get
any errors.  Hard to tell which function is being used, of course.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 9:53 AM, Josh Berkus wrote:

 
 Then the question is: does it find only #2 via polymorphic lookup, or does 
 it think that either #1 or #2 could work (because text supports an implicit 
 cast to citext, IIRC). If it's more than one it's an error. Not sure if the 
 same issue exists for operators.
 
 Well, I just ran through the 7 potential combinations, and didn't get
 any errors.  Hard to tell which function is being used, of course.

That's what tests are for.

David




Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Josh Berkus

 Well, I just ran through the 7 potential combinations, and didn't get
 any errors.  Hard to tell which function is being used, of course.
 
 That's what tests are for.

So, tell me how to write a test to check which function is being used.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 10:11 AM, Josh Berkus wrote:

 That's what tests are for.
 
 So, tell me how to write a test to check which function is being used.

Just write some comparisons like upthread, and see if the output is f or t. Put 
them into sql/citext.sql.

Best,

David




Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-22 Thread Robert Haas
2011/9/22 Euler Taveira de Oliveira eu...@timbira.com:
 On 22-09-2011 11:24, Linas Virbalas wrote:

 In order to check more cases, I have changed the procedure to force an
 immediate checkpoint, i.e. pg_start_backup('backup_under_load', true).
 With
 the same load generator running, pg_start_backup returned almost
 instantaneously compared to how long it took previously.

 Most importantly, after doing this change, I cannot reproduce the pg_clog
 error message anymore. In other words, with immediate checkpoint hot
 backup
 succeeds under this load!

 Interesting. I remembered someone reporting this same problem but it was not
 reproducible by some of us.

So maybe there's some action that has to happen between the time the
redo pointer is set and the time the checkpoint is WAL-logged to
tickle the bug.  Like... CLOG extension, maybe?

*grep grep grep*

OK, so ExtendCLOG() just zeroes the page in memory, writes the WAL
record, and calls it good.  All the interesting stuff is done while
holding CLogControlLock.  So, at checkpoint time, we'd better make
sure to flush those pages out to disk before writing the checkpoint
record.  Otherwise, the redo pointer might advance past the
CLOG-extension record before the corresponding page hits the disk.
That's the job of CheckPointCLOG(), which is called from
CheckPointGuts(), which is called just from CreateCheckPoint() just
after setting the redo pointer.  Now, there is some funny business
with the locking here as we're writing the dirty pages
(CheckPointCLOG() calls SimpleLruFlush()).  We release and reacquire
the control lock many times.  But I don't see how that can cause a
problem, because it's all being done after the redo pointer has
already been set.  We could end up having buffers get dirtied again
after they are flushed, but that shouldn't matter either as long as
each buffer is written out at least once.  And if the write fails we
throw an error.  So I don't see any holes there.

Anybody else have an idea?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I've also added a lengthy README file to the patch that attempts
 to explain how barriers should be used in PostgreSQL coding.  It's
 certainly not a comprehensive treatment of the topic, but
 hopefully it's enough to get people oriented.  I've attempted to
 tailor it a bit to PostgreSQL conventions, like talking about
 shared memory vs. backend-private memory instead of assuming (as a
 number of other discussions of this topic do) a thread model.  It
 also includes some advice about when memory barriers shouldn't be
 used or won't work, and some references for further reading.
 
Thanks, that seems like it's at the right level of detail to me.
 
-Kevin



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Josh Berkus
On 9/22/11 10:26 AM, David E. Wheeler wrote:
 On Sep 22, 2011, at 10:11 AM, Josh Berkus wrote:
 
 That's what tests are for.

 So, tell me how to write a test to check which function is being used.
 
 Just write some comparisons like upthread, and see if the output is f or t. 
 Put them into sql/citext.sql.

Oh, ok.  I thought you meant checking the actual function call.

Tests go in the main SQL file?  Shouldn't they have their own file?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 10:50 AM, Josh Berkus wrote:

 Just write some comparisons like upthread, and see if the output is f or t. 
 Put them into sql/citext.sql.
 
 Oh, ok.  I thought you meant checking the actual function call.
 
 Tests go in the main SQL file?  Shouldn't they have their own file?

That is the test file. The main SQL file is citext--1.0.sql. You'll actually 
need to bump the version number to 1.1, rename that file to citext--1.1.sql, 
and also add them to a citext--1.0--1.1.sql. There probably also needs to be a 
citext--unpackaged--1.1.sql file.

Best,

David




Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Alvaro Herrera

Excerpts from David E. Wheeler's message of jue sep 22 14:51:59 -0300 2011:
 On Sep 22, 2011, at 10:50 AM, Josh Berkus wrote:
 
  Just write some comparisons like upthread, and see if the output is f or 
  t. Put them into sql/citext.sql.
  
  Oh, ok.  I thought you meant checking the actual function call.
  
  Tests go in the main SQL file?  Shouldn't they have their own file?
 
 That is the test file. The main SQL file is citext--1.0.sql. You'll actually 
 need to bump the version number to 1.1, rename that file to citext--1.1.sql, 
 and also add them to a citext--1.0--1.1.sql. There probably also needs to be 
 a citext--unpackaged--1.1.sql file.

Hmm, if there's a citext--unpackaged--1.0.sql and also
citext--1.0--1.1.sql, is it really necessary to have
citext--unpackaged--1.1.sql?  Shouldn't the upgrade facility be able to
just run both scripts?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 11:07 AM, Alvaro Herrera wrote:

 That is the test file. The main SQL file is citext--1.0.sql. You'll actually 
 need to bump the version number to 1.1, rename that file to citext--1.1.sql, 
 and also add them to a citext--1.0--1.1.sql. There probably also needs to be 
 a citext--unpackaged--1.1.sql file.
 
 Hmm, if there's a citext--unpackaged--1.0.sql and also
 citext--1.0--1.1.sql, is it really necessary to have
 citext--unpackaged--1.1.sql?  Shouldn't the upgrade facility be able to
 just run both scripts?

No, because if 1.1 was installed on 8.4, you'd need the commands to move all 
its functions into the extension, not re-create them.

However, since this ships with core, it's probably not necessary, because 
theoretically no one will use it in 8.4, so the functions will never be 
unpackaged.

Best,

David




Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Kevin Grittner
David E. Wheeler da...@kineticode.com wrote:
 On Sep 22, 2011, at 11:07 AM, Alvaro Herrera wrote:
 
 Hmm, if there's a citext--unpackaged--1.0.sql and also
 citext--1.0--1.1.sql, is it really necessary to have
 citext--unpackaged--1.1.sql?  Shouldn't the upgrade facility be
 able to just run both scripts?
 
 No, because if 1.1 was installed on 8.4, you'd need the commands
 to move all its functions into the extension, not re-create them.
 
Shouldn't a version installed on 8.4 be installed as unpackaged?
Doesn't citext--unpackaged--1.0.sql contain the commands to move all
its functions into the extension?
 
-Kevin



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 11:14 AM, Kevin Grittner wrote:

 No, because if 1.1 was installed on 8.4, you'd need the commands
 to move all its functions into the extension, not re-create them.
 
 Shouldn't a version installed on 8.4 be installed as unpackaged?
 Doesn't citext--unpackaged--1.0.sql contain the commands to move all
 its functions into the extension?

It contains everything needed to move 1.0 functions into the extension. If Josh 
adds new functions they obviously would not be moved. So a new script would 
need to move them. And unpackaged--1.1 does not first run unpackaged--1.0.

Best,

David




Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 2:16 PM, David E. Wheeler da...@kineticode.com wrote:
 On Sep 22, 2011, at 11:14 AM, Kevin Grittner wrote:

 No, because if 1.1 was installed on 8.4, you'd need the commands
 to move all its functions into the extension, not re-create them.

 Shouldn't a version installed on 8.4 be installed as unpackaged?
 Doesn't citext--unpackaged--1.0.sql contain the commands to move all
 its functions into the extension?

 It contains everything needed to move 1.0 functions into the extension. If Josh 
 adds new functions they obviously would not be moved. So a new script would 
 need to move them. And unpackaged--1.1 does not first run unpackaged--1.0.

I believe the point David is trying to make is that someone might take
an 9.2 version of a contrib module and manually install it on an 8.4
server by executing the install script, perhaps with some amount of
hackery.

But I don't think we're required to support that case.  If the user
does a non-standard install, it's their job to deal with the fallout.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread David E. Wheeler
On Sep 22, 2011, at 11:23 AM, Robert Haas wrote:

 I believe the point David is trying to make is that someone might take
 an 9.2 version of a contrib module and manually install it on an 8.4
 server by executing the install script, perhaps with some amount of
 hackery.

Right.

 But I don't think we're required to support that case.  If the user
 does a non-standard install, it's their job to deal with the fallout.

Agreed; I was thinking of how one would handle this for non-core distributed 
extensions. Probably not necessary for contrib.

David




Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Josh Berkus

 But I don't think we're required to support that case.  If the user
 does a non-standard install, it's their job to deal with the fallout.

Well, I'll write the script anyway, since *I* need it.  I'm installing
this on a 9.0 database which will be later upgraded to 9.1.

However, before I write all this, I'd like to settle the question of
acceptability.  What do I need to do to make it OK to break backwards
compatibility for this?  I feel strongly that I'm correcting it to the
behavior users expect, but that's not statistically backed.

I don't want to spend several hours writing scripts so that it can be
rejected *for that reason*.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] citext operator precedence fix

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 2:36 PM, Josh Berkus j...@agliodbs.com wrote:
 But I don't think we're required to support that case.  If the user
 does a non-standard install, it's their job to deal with the fallout.

 Well, I'll write the script anyway, since *I* need it.  I'm installing
 this on a 9.0 database which will be later upgraded to 9.1.

 However, before I write all this, I'd like to settle the question of
 acceptability.  What do I need to do to make it OK to break backwards
 compatibility for this?  I feel strongly that I'm correcting it to the
 behavior users expect, but that's not statistically backed.

 I don't want to spend several hours writing scripts so that it can be
 rejected *for that reason*.

I'm OK with the proposed behavior change and I agree that it's
probably what people want, but I am awfully suspicious that those
extra casts are going to break something you haven't thought about.
It might be worth posting a rough version first just to see if I (or
someone else) can break it before you spend a lot of time on it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Double sorting split patch

2011-09-22 Thread Alexander Korotkov
On Thu, Sep 22, 2011 at 3:31 PM, Alexander Korotkov aekorot...@gmail.comwrote:

 On Thu, Sep 22, 2011 at 3:22 PM, Heikki Linnakangas 
 heikki.linnakan...@enterprisedb.com wrote:

 'lower' and 'upper' are not used for anything in the above. Is that just
 dead code that can be removed, or is there something missing that should be
 using them?

 Yes, it's just dead code.

Patch without that dead code is attached.

--
With best regards,
Alexander Korotkov.


double-sorting-split-0.3.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated

2011-09-22 Thread Pavel Stehule
note: some basic tests show about a 15% speedup

Regards

Pavel Stehule



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Jeff Davis
On Thu, 2011-09-22 at 11:31 -0400, Robert Haas wrote:
 On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown t...@linux.com wrote:
  s/visca-versa/vice-versa/
  s/laods/loads/
 
 Fixed.  v4 attached.
 

Can you please explain the more subtly part below?

+A common pattern where this actually does result in a bug is when adding items
+onto a queue.  The writer does this:
+
+q->items[q->num_items] = new_item;
+++q->num_items;
+
+The reader does this:
+
+num_items = q->num_items;
+for (i = 0; i < num_items; ++i)
+/* do something with q->items[i] */
+
+This code turns out to be unsafe, because the writer might increment
+q->num_items before it finishes storing the new item into the appropriate slot.
+More subtly, the reader might prefetch the contents of the q->items array
+before reading q->num_items.

How would the reader prefetch the contents of the items array, without
knowing how big it is?

Regards,
Jeff Davis




Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 5:45 PM, Jeff Davis pg...@j-davis.com wrote:
 +This code turns out to be unsafe, because the writer might increment
 +q->num_items before it finishes storing the new item into the appropriate slot.
 +More subtly, the reader might prefetch the contents of the q->items array
 +before reading q->num_items.

 How would the reader prefetch the contents of the items array, without
 knowing how big it is?

By guessing or (I think) just by having a stale value left over in
some CPU cache.  It's pretty mind-bending, but it's for real.

I didn't, in either the implementation or the documentation, go much
into the difference between dependency barriers and general read
barriers.  We might need to do that at some point, but for a first
version I don't think it's necessary.  But since you asked... as I
understand it, unless you're running on Alpha, you actually don't need
a barrier here at all, because all currently-used CPUs other than
alpha respect data dependencies, which means that if q->num_items is
used to compute an address to be read from memory, the CPU will ensure
that the read of that address is performed after the read of the value
used to compute the address.  At least that's my understanding.  But
Alpha won't.
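For what it's worth, the write-side/read-side pairing can be written down with standard C11 fences, which make the ordering explicit.  This is only an illustration with made-up names, not the patch's actual barrier primitives (which would play the same roles here):

```c
#include <stdatomic.h>

#define QSIZE 8

typedef struct
{
    int         items[QSIZE];
    atomic_int  num_items;
} queue;

/*
 * Writer: the release fence acts as the write barrier, guaranteeing the
 * item is visible to other processors before the incremented count is.
 */
void
queue_push(queue *q, int new_item)
{
    int n = atomic_load_explicit(&q->num_items, memory_order_relaxed);

    q->items[n] = new_item;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&q->num_items, n + 1, memory_order_relaxed);
}

/*
 * Reader: the acquire fence acts as the read barrier, preventing the CPU
 * from using items[] contents fetched before num_items was read.
 */
int
queue_sum(queue *q)
{
    int n = atomic_load_explicit(&q->num_items, memory_order_relaxed);
    int sum = 0;
    int i;

    atomic_thread_fence(memory_order_acquire);
    for (i = 0; i < n; ++i)
        sum += q->items[i];
    return sum;
}
```

Each release fence on the write side pairs with an acquire fence on the read side, which is exactly the one-write-barrier-per-read-barrier discipline described below.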

So we could try to further distinguish between read barriers where a
data dependency is present and read barriers where no data dependency
is present, and the latter type could be a no-op on all CPUs other
than Alpha.  Or we could even jettison support for Alpha altogether if
we think it's hopelessly obsolete and omit
read-barriers-with-dependency altogether.  I think that would be a bad
idea, though.  First, it's not impossible that some future CPU could
have behavior similar to Alpha, and the likelihood of such a thing is
substantially more because of the fact that the Linux kernel, which
seems to be the gold standard in this area, still supports them.  If
we don't record places where a dependency barrier would be needed and
then need to go find them later, that will be a lot more work, and a
lot more error-prone.  Second, there's a natural pairing between read
barriers and write barriers.  Generally, for every algorithm, each
write barrier on the write side should be matched by a read barrier on
the read side.  So putting them all in will make it easier to verify
code correctness.  Now, if we find down the line that some of those
read barriers are hurting our performance on, say, Itanium, or
PowerPC, then we can certainly consider distinguishing further.  But
for round one I'm voting for not worrying about it.  I think it's
going to be a lot more important to put our energy into (1) adding
barrier implementations for any platforms that aren't included in this
initial patch that we want to support, (2) making sure that all of our
implementations actually work, and (3) making sure that the algorithms
that use them are correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 note: some basic tests show about a 15% speedup

Eh that's good, but I think you need to fix the fact that it crashes...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Jeff Davis
On Thu, 2011-09-22 at 19:12 -0400, Robert Haas wrote:
 But since you asked... as I
 understand it, unless you're running on Alpha, you actually don't need
 a barrier here at all, because all currently-used CPUs other than
 alpha respect data dependencies, which means that if q->num_items is
 used to compute an address to be read from memory, the CPU will ensure
 that the read of that address is performed after the read of the value
 used to compute the address.  At least that's my understanding.  But
 Alpha won't.

I'm still trying to figure out how it's even possible to read an address
that's not computed yet. Something sounds strange about that...

I think it might have more to do with branch prediction or something
else. In your example, the address is not computed from q->num_items
directly, it's computed using i. But that branch being followed is
dependent on a comparison with q->num_items. Maybe that's the dependency
that's not respected?

Regards,
Jeff Davis




Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Robert Haas
On Thu, Sep 22, 2011 at 7:46 PM, Jeff Davis pg...@j-davis.com wrote:
 On Thu, 2011-09-22 at 19:12 -0400, Robert Haas wrote:
 But since you asked... as I
 understand it, unless you're running on Alpha, you actually don't need
 a barrier here at all, because all currently-used CPUs other than
 alpha respect data dependencies, which means that if q->num_items is
 used to compute an address to be read from memory, the CPU will ensure
 that the read of that address is performed after the read of the value
 used to compute the address.  At least that's my understanding.  But
 Alpha won't.

 I'm still trying to figure out how it's even possible to read an address
 that's not computed yet. Something sounds strange about that...

That's because it's strange.  You might have a look at
http://www.linuxjournal.com/article/8212

Basically, it seems like on Alpha, the CPU is allowed to do pretty
much anything short of entirely fabricating the value that gets
returned.

 I think it might have more to do with branch prediction or something
 else. In your example, the address is not computed from q->num_items
 directly, it's computed using i. But that branch being followed is
 dependent on a comparison with q->num_items. Maybe that's the dependency
 that's not respected?

You might be right.  I can't swear I understand exactly what goes
wrong there; in fact I'm not 100% sure that you don't need a
read-barrier on things less crazy than Alpha.  I speculate that the
problem is something like this: q->num_items is in some cache line and
all the elements of q->items are in some other cache line, and you see
that you're about to use both of those so you suck the cache lines into
memory.  But because one cache bank is busier than the other, you get
q->items first.  And between the time you get the cache line
containing q->items and the time you get the cache line containing
q->num_items, someone inserts an item into the queue, and now you're
hosed, because you have the old array contents with the new array
length.
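
The pointer-publication form of this hazard — the one usually cited for
Alpha — can be sketched with C11 atomics; the Node type here is hypothetical,
and memory_order_consume is a modern stand-in for a data-dependency read
barrier, not anything from the thread or the patch:

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct Node
{
	int			data;
} Node;

static _Atomic(Node *) head = NULL;

/*
 * Writer: fully initialize the node, then publish the pointer with a
 * release store so the initialization can't be reordered past it.
 */
void
publish(Node *n, int value)
{
	n->data = value;
	atomic_store_explicit(&head, n, memory_order_release);
}

/*
 * Reader: using the loaded pointer to compute the address of p->data
 * is a data dependency.  Every CPU except Alpha orders the two reads
 * for free; memory_order_consume names exactly that guarantee
 * (compilers typically promote it to acquire, the safe fallback).
 */
int
read_published(void)
{
	Node	   *p = atomic_load_explicit(&head, memory_order_consume);

	return p ? p->data : -1;
}
```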

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated

2011-09-22 Thread Pavel Stehule
Hello

2011/9/23 Robert Haas robertmh...@gmail.com:
 On Thu, Sep 22, 2011 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 note: some basic test shows about 15% speedup

 Eh that's good, but I think you need to fix the fact that it crashes...


I fixed the crash that Tom described. Do you know about others?

Regards

Pavel

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] Adding CORRESPONDING to Set Operations

2011-09-22 Thread Tom Lane
Kerem Kat kerem...@gmail.com writes:
 While testing I noticed that ordering is incorrect in my implementation. At
 first I thought that removing mismatched entries from ltargetlist and
 rtargetlist would be enough, but it wasn't, so I added rtargetlist
 sorting.

I don't think you can get away with changing the targetlists of the
UNION subqueries; you could break their semantics.  Consider for
instance

select distinct a, b, c from t1
union corresponding
select b, c from t2;

If you discard the A column from t1's output list then it will deliver a
different set of rows than it should, because the DISTINCT is
considering the wrong set of values.

One possible way to fix that is to introduce a level of sub-select,
as if the query had been written

select b, c from (select distinct a, b, c from t1) ss1
union
select b, c from (select b, c from t2) ss2;

However, the real problem with either type of hackery is that these
machinations will be visible in the parsed query, which means for
example that a view defined as

create view v1 as
select distinct a, b, c from t1
union corresponding
select b, c from t2;

would come out looking like the transformed version rather than the
original when it's dumped, or even just examined with tools such as
psql's \d+.  I think this is bad style.  It's certainly ugly to expose
your implementation shortcuts to the user like that, and it also can
cause problems down the road: if in the future we think of some better
way to implement CORRESPONDING, we've lost the chance to do so for any
stored views that got transformed this way.  (There are several places
in Postgres now that take such shortcuts, and all of them were mistakes
that we need to clean up someday, IMO.)

So I think that as far as the parser is concerned, you just want to
store the CORRESPONDING clause more or less as-is, and not do too much
more than verify that it's valid.  The place to actually implement it is
in the planner (see prepunion.c).  Possibly the add-a-level-of-subselect
approach will work, but you want to do that querytree transformation at
plan time not parse time.

regards, tom lane



Re: [HACKERS] [BUGS] BUG #5206: wal_sync_method in stock postgresql.conf may be wrong

2011-09-22 Thread Jaime Casanova
... moving to hackers ...

On Mon, Nov 23, 2009 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Nov 20, 2009 at 6:56 PM, Alvaro Herrera alvhe...@postgresql.org 
 wrote:
 I have two suggestions to fix this:

 1. avoid displaying any value at all as if it were the true default (this
 would perhaps make the line invalid were the user to uncomment it)

 2. change initdb so that it modifies that line too (along with
 shared_buffers etc) to put the actual default value in there, but without
 uncommenting it.

 I also have one non-suggestion:

 3. do nothing

 I like #3 or #1 better than #2.   Putting logic into initdb to edit
 the comments in the file doesn't really seem like a worthwhile use of
 time.

 I agree, it seems like more work than the problem is worth.  We could
 change the entry to something like

 #wal_sync_method = (platform-dependent) # the default is ...


and we have another one now: effective_io_concurrency. In
postgresql.conf it looks like it defaults to 1, but on Windows and
Solaris it actually defaults to 0.

 (I still think we should get rid of the commented-out settings
 altogether, but that's another argument...)

 That's another reason not to expend work here --- it still seems
 fairly likely that that might happen.


time has passed and we still have this... maybe it's time to make initdb
deal with this? or at least follow Tom's suggestion above?

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: 24x7 support and training
