Re: [HACKERS] initdb and fsync

2012-03-13 Thread Jeff Davis
On Tue, 2012-03-13 at 09:42 +0100, Andres Freund wrote:
 for recursively everything in dir:
posix_fadvise(fd, POSIX_FADV_DONTNEED);
 
 for recursively everything in dir:
fsync(fd);

Wow, that made a huge difference!

  no sync:  ~ 1.0s
  sync: ~10.0s
  fadvise+sync: ~ 1.3s

Patch attached.

Now I feel much better about it. Most people will either have fadvise, a
write cache (rightly or wrongly), or actually need the sync. Those that
have none of those can use -N.

Regards,
Jeff Davis


initdb-fsync-20120313.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb and fsync

2012-03-12 Thread Jeff Davis
On Sun, 2012-02-05 at 17:56 -0500, Noah Misch wrote:
 I meant primarily to illustrate the need to be comprehensive, not comment on
 which executable should fsync a particular file.  Bootstrap-mode backends do
 not sync anything during an initdb run on my system.  With your patch, we'll
 fsync a small handful of files and leave nearly everything else vulnerable.

Thank you for pointing that out. With that in mind, I have a new version
of the patch which just recursively fsync's the whole directory
(attached).

I also introduced a new option --nosync (-N) to disable this behavior.

The bad news is that it introduces a lot more time to initdb -- it goes
from about 1s to about 10s on my machine. I tried fsync'ing the whole
directory twice just to make sure that the second was a no-op, and
indeed it didn't make much difference (still about 10s). 

That's pretty inefficient considering that

  initdb -D data --nosync && sync

only takes a couple seconds. Clearly batching the operation is a big
help. Maybe there's some more efficient way to fsync a lot of
files/directories? Or maybe I can mitigate it by avoiding files that
don't really need to be fsync'd?

Regards,
Jeff Davis


initdb-fsync-20120312.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-22 Thread Jeff Davis
On Tue, 2012-02-14 at 19:32 -0500, Dan Ports wrote:
 On Tue, Feb 14, 2012 at 09:27:58AM -0600, Kevin Grittner wrote:
  Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
   On 14.02.2012 04:57, Dan Ports wrote:
   The easiest answer would be to just treat every prepared
   transaction found during recovery as though it had a conflict in
   and out. This is roughly a one-line change, and it's certainly
   safe.

+1.

I don't even see this as much of a problem. Prepared transactions
hanging around for arbitrary periods of time cause all kinds of problems
already. Those using them need to be careful to resolve them quickly --
and if there's a crash involved, I think it's reasonable to say they
should be resolved before continuing normal online operations.

 Hmm, it occurs to me that if we have to abort a transaction due to
 serialization failure involving a prepared transaction, we might want
 to include the prepared transaction's gid in the errdetail. 

I like this idea.

Regards,
Jeff Davis




Re: [HACKERS] initdb and fsync

2012-02-05 Thread Jeff Davis
On Sat, 2012-02-04 at 20:18 -0500, Noah Misch wrote:
 If we add fsync calls to the initdb process, they should cover the entire data
 directory tree.  This patch syncs files that initdb.c writes, but we ought to
 also sync files that bootstrap-mode backends had written.

It doesn't make sense for initdb to take responsibility to sync files
created by the backend. If there are important files that the backend
creates, it should be the backend's responsibility to fsync them (and
their parent directory, if needed). And if they are unimportant to the
backend, then there is no reason for initdb to fsync them.

 An optimization
 like the pg_flush_data() call in copy_file() may reduce the speed penalty.

That worked pretty well. It took it down about 100ms on my machine,
which closes the gap significantly.

 initdb should do these syncs by default and offer an option to disable them.

For test frameworks that run initdb often, that makes sense.

But for developers, it doesn't make sense to spend 0.5s typing an option
that saves you 0.3s. So, we'd need some more convenient way to choose
the no-fsync option, like an environment variable that developers can
set. Or maybe developers don't care about 0.3s?

Regards,
Jeff Davis




Re: [HACKERS] initdb and fsync

2012-02-04 Thread Jeff Davis
On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote:
 Yeah.  Personally I would be sad if initdb got noticeably slower, and
 I've never seen or heard of a failure that this would fix.

I worked up a patch, and it looks like it does about 6 file fsync's and
a 7th for the PGDATA directory. That degrades the time from about 1.1s
to 1.4s on my workstation.

pg_test_fsync says this about my workstation (one 8kB write):
open_datasync              117.495 ops/sec
fdatasync                  117.949 ops/sec
fsync                       25.530 ops/sec
fsync_writethrough             n/a
open_sync                   24.666 ops/sec

25 ops/sec means about 40ms per fsync, times 7 is about 280ms, so that
seems like about the right degradation for fsync. I tried with fdatasync
as well to see if it improved things, and I wasn't able to realize any
difference (not sure exactly why).

So, is it worth it? Should we make it an option that can be specified?

 I wonder whether it wouldn't be sufficient to call sync(2) at the end,
 anyway, rather than cluttering the entire initdb codebase with fsync
 calls.

It looks like there are only a few places, so I don't think clutter is
really the problem with the simple patch at this point (unless there is
a portability problem with just calling fsync).

Regards,
Jeff Davis


initdb-fsync.patch.gz
Description: GNU Zip compressed data



Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2012-01-29 Thread Jeff Davis
On Tue, 2012-01-24 at 16:07 +0400, Alexander Korotkov wrote:
 Hi!
 
 
 New version of patch is attached.

Thank you for the updates. I have a small patch attached.

The only code change I made was very minor: I changed the constants used
in the penalty function because your version used INFINITE_BOUND_PENALTY
when adding an empty range, and that didn't quite make sense to me. If
I'm mistaken you can leave it as-is.

I also attached range-gist-test.sql, which I used for a performance
test. I mix various types of ranges together in a larger table of 1.1M
tuples. And then I create a smaller table that only contains normal
ranges and empty ranges. There are two tests:
  1. Create an index on the big table
  2. Do a range join (using overlaps rather than equals) where the
smaller table is on the outer side of a nested loop join and an index
scan over the larger table on the inner.

The index creation time reduces by a small amount with the patch, from
around 16s without the patch to around 13s with the patch. The query
time, however, dropped from around 26s to around 14s! Almost 2x speedup
with the patch!

Moreover, looking at the loop timing in the explain analyze output, it
goes from about 7..24 ms per loop down to about 1.5..13 ms per loop.
That seems to indicate that the index distribution is better, with more
queries returning quickly.

So, great work Alexander! Very convincing results.

Marking ready for committer, but please apply my comment fixes at your
discretion.

Regards,
Jeff Davis

PS: the test was run on my workstation (Intel(R) Core(TM) i7-2600 CPU @
3.40GHz) with work_mem=512MB, shared_buffers=512MB, and
checkpoint_segments=32. The rest of the settings were default.




range-gist-comments.patch.gz
Description: GNU Zip compressed data

\timing on

drop table big;
drop table small;

create temp table tmp_foo(i int, ir int8range);
insert into tmp_foo select g % 100, 'empty'::int8range from generate_series(1,5) g;
insert into tmp_foo select g % 100, int8range(NULL,NULL) from generate_series(1,1) g;
insert into tmp_foo select g % 100, int8range(NULL,((random()-0.5)*g*10)::int8) from generate_series(1,2) g;
insert into tmp_foo select g % 100, int8range(((random()-0.5)*g*10)::int8,NULL) from generate_series(1,2) g;
insert into tmp_foo select g % 100,
  int8range(
(g*10 + 10*(random()-0.5))::int8,
(g*10 + 10 + 10*(random()-0.5))::int8 )
  from generate_series(1,100) g;

create table big as select * from tmp_foo order by random();
drop table tmp_foo;

create table tmp_foo(i int, ir int8range);
insert into tmp_foo select g*1000 % 100, 'empty'::int8range from generate_series(1,50) g;
insert into tmp_foo select g*1000 % 100,
  int8range(
(g*10*1000 + 10*(random()-0.5))::int8,
(g*10*1000 + 10 + 10*(random()-0.5))::int8 )
  from generate_series(1,1000) g;

create table small as select * from tmp_foo order by random();
drop table tmp_foo;

vacuum;
vacuum;
vacuum;

create index big_idx on big using gist (ir);

analyze;

set enable_bitmapscan=false;

explain analyze select sum(upper(intersection) - lower(intersection))
from (
  select small.ir * big.ir as intersection from small,big
  where small.ir && big.ir and not lower_inf(big.ir) and not upper_inf(big.ir)
) s;

set enable_bitmapscan to default;
set enable_indexscan=false;

explain analyze select sum(upper(intersection) - lower(intersection))
from (
  select small.ir * big.ir as intersection from small,big
  where small.ir && big.ir and not lower_inf(big.ir) and not upper_inf(big.ir)
) s;

set enable_indexscan to default;



Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Davis
On Sat, 2012-01-28 at 10:31 -0500, Andrew Dunstan wrote:
 I'm curious what problem we're actually solving here, though. I've run 
 the buildfarm countless thousands of times on different VMs, and five of 
 my seven current animals run in VMs, and I don't think I've ever seen a 
 failure ascribable to inadequately synced files from initdb.

I believe I have seen such a problem second hand in a situation where
the VM was known to be killed harshly (not sure if you do that
regularly).

It's a little difficult for me to _prove_ that this would have solved
the problem, and I think it was only observed once (though I could
probably reproduce it if I tried). The symptom was a log message
indicating that PG_VERSION was missing or corrupt on a system that was
previously started and online (albeit briefly for a test).

Regards,
Jeff Davis




Re: [HACKERS] initdb and fsync

2012-01-28 Thread Jeff Davis
On Sat, 2012-01-28 at 13:18 -0500, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  I'm curious what problem we're actually solving here, though. I've run 
  the buildfarm countless thousands of times on different VMs, and five of 
  my seven current animals run in VMs, and I don't think I've ever seen a 
  failure ascribable to inadequately synced files from initdb.
 
 Yeah.  Personally I would be sad if initdb got noticeably slower, and
 I've never seen or heard of a failure that this would fix.
 
 I wonder whether it wouldn't be sufficient to call sync(2) at the end,
 anyway, rather than cluttering the entire initdb codebase with fsync
 calls.

I can always add a sync call to the test, also (rather than modifying
initdb). Or, it could be an initdb option, which might be a good
compromise. I don't have a strong opinion here.

As machines get more memory and filesystems get more lazy, I wonder if
it will be a more frequent occurrence, however. On the other hand, if
filesystems are more lazy, that also increases the cost associated with
extra sync calls. I think there would be a surprise factor if
sometimes initdb had a long pause at the end and caused 10GB of data to
be written out.

Regards,
Jeff Davis




[HACKERS] initdb and fsync

2012-01-27 Thread Jeff Davis
It looks like initdb doesn't fsync all the files it creates, e.g. the
PG_VERSION file.

While it's unlikely that it would cause any real data loss, it can be
inconvenient in some testing scenarios involving VMs.

Thoughts? Would a patch to add a few fsync calls to initdb be accepted?
Is a platform-independent fsync available at initdb time?

Regards,
Jeff Davis




Re: [HACKERS] Page Checksums

2011-12-27 Thread Jeff Davis
On Mon, 2011-12-19 at 07:50 -0500, Robert Haas wrote:
 I
 think it would be regrettable if everyone had to give up 4 bytes per
 page because some people want checksums.

I can understand that some people might not want the CPU expense of
calculating CRCs; or the upgrade expense to convert to new pages; but do
you think 4 bytes out of 8192 is a real concern?

(Aside: it would be MAXALIGNed anyway, so probably more like 8 bytes.)

I was thinking we'd go in the other direction: expanding the header
would take so much effort, why not expand it a little more to give some
breathing room for the future?

Regards,
Jeff Davis






Re: [HACKERS] Page Checksums

2011-12-27 Thread Jeff Davis
On Mon, 2011-12-19 at 01:55 +, Greg Stark wrote:
 On Sun, Dec 18, 2011 at 7:51 PM, Jesper Krogh jes...@krogh.cc wrote:
  I don't know if it would be seen as a half-baked feature or similar,
  and I don't know if the hint bit problem is solvable at all, but I could
  easily imagine checksumming just skipping the hint bits entirely.
 
 That was one approach discussed. The problem is that the hint bits are
 currently in each heap tuple header which means the checksum code
 would have to know a fair bit about the structure of the page format.

Which is actually a bigger problem, because it might not be the backend
that's reading the page. It might be your backup script taking a new
base backup.

The kind of person to care about CRCs would also want the base backup
tool to verify them during the copy so that you don't overwrite your
previous (good) backup with a bad one. The more complicated we make the
verification process, the less workable that becomes.

I vote for a simple way to calculate the checksum -- fixed offsets of
each page (of course, it would need to know the page size), and a
standard checksum algorithm.
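To illustrate: with a fixed offset and a standard algorithm, an external tool needs nothing beyond the page size and the field offset to verify a page. A sketch (hypothetical offset and header layout; CRC-32 standing in for the "standard checksum algorithm"):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE       8192
#define CHECKSUM_OFFSET 8		/* hypothetical fixed offset in the header */

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320); slow but dependency-free. */
static uint32_t
crc32_buf(const unsigned char *buf, size_t len)
{
	uint32_t	crc = 0xFFFFFFFF;

	for (size_t i = 0; i < len; i++)
	{
		crc ^= buf[i];
		for (int b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (0xEDB88320 & -(crc & 1));
	}
	return ~crc;
}

/*
 * Checksum a page, treating the checksum field itself as zero, so a backup
 * tool can recompute and compare without understanding the page contents.
 */
uint32_t
page_checksum(const unsigned char *page)
{
	unsigned char copy[PAGE_SIZE];

	memcpy(copy, page, PAGE_SIZE);
	memset(copy + CHECKSUM_OFFSET, 0, sizeof(uint32_t));
	return crc32_buf(copy, PAGE_SIZE);
}
```

The point of zeroing the field before hashing is exactly the base-backup use case above: the verifier's loop is "read page, recompute, compare to stored value" with no other knowledge of the format.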

Regards,
Jeff Davis




Re: [HACKERS] Page Checksums

2011-12-27 Thread Jeff Davis
On Mon, 2011-12-19 at 22:18 +0200, Heikki Linnakangas wrote:
 Or you could just use a filesystem that does CRCs...

That just moves the problem. Correct me if I'm wrong, but I don't think
there's anything special that the filesystem can do that we can't.

The filesystems that support CRCs are more like ZFS than ext3. They do
all writes to a new location, thus fragmenting the files. That may be a
good trade-off for some people, but it's not free.

Regards,
Jeff Davis




Re: [HACKERS] Page Checksums

2011-12-27 Thread Jeff Davis
On Sun, 2011-12-25 at 22:18 +, Greg Stark wrote:
 2) The i/o system was in the process of writing out blocks and the
 system lost power or crashed as they were being written out. In this
 case there will probably only be 0 or 1 torn pages -- perhaps as many
 as the scsi queue depth if there's some weird i/o scheduling going on.

That would also depend on how many disks you have and what configuration
they're in, right?

Regards,
Jeff Davis




Re: [HACKERS] Page Checksums + Double Writes

2011-12-27 Thread Jeff Davis
On Thu, 2011-12-22 at 03:50 -0600, Kevin Grittner wrote:
 Now, on to the separate-but-related topic of double-write.  That
 absolutely requires some form of checksum or CRC to detect torn
 pages, in order for the technique to work at all.  Adding a CRC
 without double-write would work fine if you have a storage stack
 which prevents torn pages in the file system or hardware driver.  If
 you don't have that, it could create a damaged page indication after
 a hardware or OS crash, although I suspect that would be the
 exception, not the typical case.  Given all that, and the fact that
 it would be cleaner to deal with these as two separate patches, it
 seems the CRC patch should go in first.

I think it could be broken down further.

Taking a step back, there are several types of HW-induced corruption,
and checksums only catch some of them. For instance, the disk losing
data completely and just returning zeros won't be caught, because we
assume that a zero page is just fine.

From a development standpoint, I think a better approach would be:

1. Investigate if there are reasonable ways to ensure that (outside of
recovery) pages are always initialized; and therefore zero pages can be
treated as corruption.

2. Make some room in the page header for checksums and maybe some other
simple sanity information (like file and page number). It will be a big
project to sort out the pg_upgrade issues (as Tom and others have
pointed out).

3. Attack hint bits problem.

If (1) and (2) were complete, we would catch many common types of
corruption, and we'd be in a much better position to think clearly about
hint bits, double writes, etc.
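As a sketch of what a verifier gains from (1) plus the extra sanity fields in (2) — hypothetical page layout, not a proposal for the actual header:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192

/* Hypothetical per-page sanity header, as suggested in point (2). */
typedef struct
{
	uint32_t	block_no;		/* page number within the file */
	uint32_t	checksum;		/* filled in by a separate checksum pass */
} PageSanity;

/* True iff every byte of the page is zero. */
static bool
page_is_zero(const unsigned char *page)
{
	for (size_t i = 0; i < PAGE_SIZE; i++)
		if (page[i] != 0)
			return false;
	return true;
}

/*
 * With (1) and (2) in place, this flags both "disk returned zeros" and
 * "page written to the wrong place" -- neither of which a checksum alone
 * catches.
 */
bool
page_looks_sane(const unsigned char *page, uint32_t expected_block)
{
	PageSanity	hdr;

	if (page_is_zero(page))
		return false;			/* zero pages treated as corruption */
	memcpy(&hdr, page, sizeof(hdr));
	return hdr.block_no == expected_block;
}
```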

Regards,
Jeff Davis





Re: [HACKERS] Page Checksums + Double Writes

2011-12-27 Thread Jeff Davis
On Tue, 2011-12-27 at 16:43 -0600, Merlin Moncure wrote:
 On Tue, Dec 27, 2011 at 1:24 PM, Jeff Davis pg...@j-davis.com wrote:
  3. Attack hint bits problem.
 
 A large number of problems would go away if the current hint bit
 system could be replaced with something that did not require writing
 to the tuple itself.

My point was that neither the zero page problem nor the upgrade problem
are solved by addressing the hint bits problem. They can be solved
independently, and in my opinion, it seems to make sense to solve those
problems before the hint bits problem (in the context of detecting
hardware corruption).

Of course, don't let that stop you from trying to get rid of hint bits,
that has numerous potential benefits.

Regards,
Jeff Davis





Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-12-21 Thread Jeff Davis
On Wed, 2011-12-14 at 01:04 +0400, Alexander Korotkov wrote:
 Hi!
 
Thank you! Attached a few changes:

* Changed "ordinal" to "normal" for clarity (at least to me).
* Some comment cleanup.
* Changed classes_groups to be an enum of SPLIT_LEFT and SPLIT_RIGHT,
rather than using 1 and 2.
* Changed the bounds_lower and bounds_upper variables into
by_lower and by_upper to indicate that the arrays are distinguished
by sort order. It was confusing me to read it otherwise.

A few comments:

* In range_gist_picksplit, it would be nice to have a little bit more
intuitive description of what's going on with the nonEmptyCount and
nonInfCount numbers. For instance, it appears to depend on the fact that
a range must either be in nonEmptyCount or in nonInfCount. Also, can you
describe the reason you're multiplying by two and taking the absolute
value? It seems to work, but I am missing the intuition behind those
operations.

* The penalty function is fairly hard to read still. At a high level, I
think we're trying to accomplish a few things (in order from most to
least important):
  - Keep normal ranges separate.
  - Avoid broadening the class of the original predicate (e.g. turning
single-infinite into double-infinite).
  - Avoid broadening (as determined by subtype_diff) the original
predicate.
  - Favor adding ranges to narrower original predicates.

Do you agree? If so, perhaps we can attack those more directly and it
might be a little more readable.

Additionally, the arbitrary numbers might become a problem. Can we
choose better constants there? They would still be arbitrary when
compared with real numbers derived from subtype_diff, but maybe we can
still do better than what's there.

* Regarding the leftover common entries that can go to either side:
what is the delta measure trying to accomplish? When I work through
some examples, it seems to favor putting larger common ranges on the
left (small delta) and smaller common ranges on the right (smaller
delta). Why is that good? Or did I misread the code?

Intuitively, I would think that we'd want to push the ranges with lower
upper bounds to the left and higher lower bounds to the right -- in
other words, recurse. Obviously, we'd need to make sure it terminated at
some point, but splitting the common entries does seem like a smaller
version of the original problem. Thoughts?

Thank you for the helpful comments! It took me a while to work through
the logic, but I would have been lost completely without the comments
around the double sorting split.

Regards,
Jeff Davis
*** a/src/backend/utils/adt/rangetypes_gist.c
--- b/src/backend/utils/adt/rangetypes_gist.c
***
*** 39,45 
  	((RangeType *) DatumGetPointer(datumCopy(PointerGetDatum(r), \
  			 false, -1)))
  
! /* Minimum accepted ratio of split */
  #define LIMIT_RATIO 0.3
  
  /* Helper macros to place an entry in the left or right group */
--- 39,49 
  	((RangeType *) DatumGetPointer(datumCopy(PointerGetDatum(r), \
  			 false, -1)))
  
! /*
!  * Minimum accepted ratio of split for items of the same class. If the items
!  * are of different classes, it will separate along those lines regardless of
!  * the ratio.
!  */
  #define LIMIT_RATIO 0.3
  
  /* Helper macros to place an entry in the left or right group */
***
*** 66,72 
   * GiST. Each unique combination of properties is a class. CLS_EMPTY cannot be
   * combined with anything else.
   */
! #define CLS_ORDINAL			0 /* Ordinal ranges (no bits set) */
  #define CLS_LOWER_INF		1 /* Lower bound is infinity */
  #define CLS_UPPER_INF		2 /* Upper bound is infinity */
  #define CLS_CONTAIN_EMPTY	4 /* Contains underlying empty ranges */
--- 70,76 
   * GiST. Each unique combination of properties is a class. CLS_EMPTY cannot be
   * combined with anything else.
   */
! #define CLS_NORMAL			0 /* Normal ranges (no bits set) */
  #define CLS_LOWER_INF		1 /* Lower bound is infinity */
  #define CLS_UPPER_INF		2 /* Upper bound is infinity */
  #define CLS_CONTAIN_EMPTY	4 /* Contains underlying empty ranges */
***
*** 76,81 
--- 80,102 
  			   * of properties. CLS_EMPTY doesn't combine with
  			   * anything else, so it's only 2^3 + 1. */
  
+ /*
+  * Auxiliary structure for picksplit based on single sorting.
+  */
+ typedef struct
+ {
+ 	int	index;
+ 	RangeBound			bound;
+ 	TypeCacheEntry	   *typcache;
+ } PickSplitSortItem;
+ 
+ /* place on left or right side of split? */
+ typedef enum
+ {
+ 	SPLIT_LEFT = 0, /* makes initialization to SPLIT_LEFT easier */
+ 	SPLIT_RIGHT
+ } SplitLR;
+ 
  static RangeType *range_super_union(TypeCacheEntry *typcache, RangeType *r1,
  	RangeType *r2);
  static bool range_gist_consistent_int(FmgrInfo *flinfo,
***
*** 97,103  static int sort_item_cmp(const void *a, const void *b);
  static void range_gist_class_split(TypeCacheEntry *typcache,
     GistEntryVector *entryvec,
     GIST_SPLITVEC *v

Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-12-21 Thread Jeff Davis
On Tue, 2011-12-20 at 13:22 +0400, Alexander Korotkov wrote:
 Hi!
 
 
 Studying this question a little more, I found that the current approach to
 range indexing can be dramatically inefficient in some cases. It's not
 because of the penalty or split implementation, but because of the approach
 itself. Mapping intervals to two-dimensional space produces much better
 results in the case of highly-overlapping ranges and @>, <@ operators
 with low selectivity. 
 
Thank you for testing this. I agree that your approach is much better
especially dealing with widely varying range sizes, etc. My approach
really only tackled the simple (and hopefully common) case when the
ranges are about the same size.
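To make the idea concrete: the two-dimensional mapping treats a range as the point (lower, upper), so an overlaps predicate becomes an axis-aligned region query, which the index answers well even when ranges overlap heavily. A toy sketch (illustration only, with float bounds and closed ranges; not the patch):

```c
#include <stdbool.h>

typedef struct
{
	double		lower;
	double		upper;
} Range;

typedef struct
{
	double		x;
	double		y;
} Point2D;

/* Map a range to a point in the (lower, upper) plane. */
static Point2D
range_to_point(Range r)
{
	Point2D		p = {r.lower, r.upper};

	return p;
}

/*
 * r overlaps q  iff  r.lower <= q.upper && q.lower <= r.upper.
 * In the plane, the point (r.lower, r.upper) must fall in the region
 * x <= q.upper, y >= q.lower -- a rectangle query an R-tree-style index
 * (like GiST) can prune efficiently.
 */
bool
overlaps(Range r, Range q)
{
	Point2D		p = range_to_point(r);

	return p.x <= q.upper && p.y >= q.lower;
}
```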

Regards,
Jeff Davis





Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-12-12 Thread Jeff Davis
On Fri, 2011-12-02 at 15:48 +0400, Alexander Korotkov wrote:
 Rebased with head.
 
Thank you. I have attached a patch that's mostly just cleanup to this
one.

Comments:

* You use the term "ordinal range" quite a lot, which I haven't heard
before. Is that a mathematical term, or do you mean something more like
"ordinary"?

* There's a lot of code for range_gist_penalty. Rather than having
special cases for all combinations of properties in the new and original,
is it possible to use something a little simpler? Maybe just start the
penalty at zero, and add something for each property of the predicate
range that must be changed. The penalties added might vary, e.g., if the
original range has an infinite lower bound, changing it to have an
infinite upper bound might be a higher penalty.

* It looks like LIMIT_RATIO is not always considered. Should it be?

* You defined get/set_range_contain_empty, but didn't use them. I think
this was a merge error, but I removed them. So now there are no changes
in rangetypes.c.

Regards,
Jeff Davis


range_gist_cleanup.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Why so few built-in range types?

2011-11-30 Thread Jeff Davis
On Tue, 2011-11-29 at 12:01 -0500, Tom Lane wrote:
 One thing that bothered me while looking at the range types patch is
 that it seemed you'd been mighty conservative about creating built-in
 range types.

During development, I didn't want to juggle the OIDs for too many range
types. That was really the only reason.

 In particular, I don't understand why there's not a
 standard float8range type; that seems like a pretty common case.
 I'd have also expected to see a standard textrange type.  What was
 the rationale for leaving these out?

A built-in textrange type would have to have collation C, right? Do
you think that would be useful to enough people?

One that I'd like to see is an IP address type, but that's complicated
because inet and cidr support netmasks.

Regards,
Jeff Davis




Re: [HACKERS] GiST range-contained-by searches versus empty ranges

2011-11-29 Thread Jeff Davis
On Sat, 2011-11-26 at 19:26 -0500, Tom Lane wrote:
 I'm inclined to propose that we should add some logic to say that
 merging a new item into an existing one is forbidden if the penalty
 function returns plus-infinity for the case.  If all existing items on a
 page return infinity, a new item must be added to the page (possibly
 causing a page split) instead of inserting into any existing one.
 (Of course, gistpenalty() should be fixed to return infinity, not just a
 randomly chosen large value, for the null-and-not-null case.)

This seems less important now that you've committed the flag for
contains empty ranges.

However, it still sounds like a useful improvement to me.

Regards,
Jeff Davis




Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-11-25 Thread Jeff Davis
On Wed, 2011-11-09 at 20:24 +0400, Alexander Korotkov wrote:
 New version of GiST for range types patch is here. This version seems
 to be complete and ready for review.
 
There's been some significant change in rangetypes_gist.c, can you
please rebase this patch?

I like the patch conceptually, though I'm still working through the
details.

Regards,
Jeff Davis





Re: [HACKERS] Obstacles to user-defined range canonicalization functions

2011-11-24 Thread Jeff Davis
On Wed, 2011-11-23 at 22:33 -0500, Tom Lane wrote:
 * The underlying range_serialize function is only exposed at the C
 level.  If you try to write something in, say, plpgsql then you are
 going to end up going through range_constructorN or range_in to produce
 your result value, and those call the type's canonical function.
 Infinite recursion, here we come.

That seems solvable, unless I'm missing something.

 * The only way to create a canonicalization function in advance of
 declaring the range type is to declare it against a shell type.  But the
 PL languages all reject creating PL functions that take or return a
 shell type.  Maybe we could relax that, but it's nervous-making, and
 anyway the first problem still remains.

That seems a little more challenging.

 One possibility that just came to me is to decree that every discrete
 range type has to be based on an underlying continuous range type (with
 all the same properties except no canonicalization function), and then
 the discrete range's canonicalization function could be declared to take
 and return the underlying range type instead of the discrete type
 itself.  Haven't worked through the details though.

An interesting approach. I wonder if there would be a reason to tie such
types together for a reason other than just the canonical function?
Would you have to define everything in terms of the continuous range, or
could it be a constraint hierarchy; e.g. a step size 100 is based on a
step size of 10 which is based on numeric?
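As an illustration of what a step-size canonical function in such a hierarchy would do (hypothetical integer grid, positive bounds assumed; not a proposed API):

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct
{
	int64_t		lower;			/* inclusive */
	int64_t		upper;			/* exclusive: half-open [lower, upper) */
	bool		empty;
} DiscreteRange;

/* Ceiling division for nonnegative a, positive b. */
static int64_t
ceil_div(int64_t a, int64_t b)
{
	return (a + b - 1) / b;
}

/*
 * Canonicalize [lower, upper) onto a grid of the given step: the result
 * contains exactly the multiples of "step" the input contained.  A step of
 * 100 refines a step of 10, which refines plain integers -- the constraint
 * hierarchy from the message above.
 */
DiscreteRange
canonicalize_step(DiscreteRange r, int64_t step)
{
	DiscreteRange out = {0, 0, true};
	int64_t		lo = ceil_div(r.lower, step) * step;	/* first grid point >= lower */

	if (r.empty || lo >= r.upper)
		return out;				/* no grid points in range: empty */
	out.lower = lo;
	out.upper = (((r.upper - 1) / step) + 1) * step;	/* just past last grid point */
	out.empty = false;
	return out;
}
```

For example, [5, 27) with step 10 contains the grid points 10 and 20, so its canonical form is [10, 30); [5, 9) contains none and canonicalizes to empty.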

Regards,
Jeff Davis




Re: [HACKERS] Singleton range constructors versus functional coercion notation

2011-11-22 Thread Jeff Davis
On Tue, 2011-11-22 at 09:07 -0500, Robert Haas wrote:
 I honestly don't know what function names people will pick, and I
 don't care.  Someone might like singleton(x), which would be
 impractical as a built-in because there could be more than one range
 type over the same base type, but if the user defines the function
 they can pick what's convenient for them.  If they use singletons
 exceedingly frequently they might even want something really short,
 like just(x) or s(x).  Or they might say daterange1(x), along the
 lines you suggested earlier.

For that matter, they might pick daterange(x), as I picked earlier, and
run into the same problems.

It's a little strange that we allow people to define one-argument
functions with the same name as a type, if such functions are so confusing.

This isn't intended as an argument in either direction, just an
observation.

Regards,
Jeff Davis




Re: [HACKERS] Singleton range constructors versus functional coercion notation

2011-11-20 Thread Jeff Davis
On Sat, 2011-11-19 at 15:57 -0500, Tom Lane wrote:
  I'm hesitant to remove them because the alternative is significantly
  more verbose:
numrange(1.0, 1.0, '[]');
 
 Right.  The question is, does the case occur in practice often enough
 to justify a shorter notation?  I'm not sure.

Well, if there were a good shorter notation, then probably so. But it
doesn't look like we have a good idea, so I'm fine with dropping it.

 One thing I've been thinking for a bit is that for discrete ranges,
 I find the '[)' canonical form to be surprising/confusing.  If we
 were to fix range_adjacent along the lines suggested by Florian,
 would it be practical to go over to '[]' as the canonical form?
 One good thing about that approach is that canonicalization wouldn't
 have to involve generating values that might overflow.

I think we had that discussion before, and Florian brought up some good
points (both then and in a reply now). Also, Robert wasn't convinced
that '[]' is necessarily better for discrete ranges:

http://archives.postgresql.org/pgsql-hackers/2011-10/msg00598.php

Regards,
Jeff Davis





Re: [HACKERS] range_adjacent and discrete ranges

2011-11-19 Thread Jeff Davis
On Fri, 2011-11-18 at 14:47 -0500, Tom Lane wrote:
 Yeah, probably not.  However, I don't like the idea of
 '(3,4)'::int4range throwing an error, as it currently does, because it
 seems to require the application to have quite a lot of knowledge of the
 range semantics to avoid having errors sprung on it.

OK, then let's make '(3,4)'::int4range the empty range. (3,3) might be
OK as well (for any range type), because at least it's consistent.

The one that I find strange is [3,3), but I think that needs to work for
the range_adjacent idea to work. Seeing it as useful in the context of
range_adjacent might mean that it's useful elsewhere, too, so now I'm
leaning toward supporting [3,3) as an empty range.
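
To spell out why [3,3) and (3,4) both denote the empty set over the
integers, here is a minimal sketch (plain Python with a hypothetical
helper, not the actual range-type code):

```python
def int_range_is_empty(lower, upper, lower_inc, upper_inc):
    """True if the integer range contains no values, checked by
    normalizing the bounds to canonical [) form."""
    lo = lower if lower_inc else lower + 1     # '(' excludes the bound
    hi = upper + 1 if upper_inc else upper     # ']' includes the bound
    return lo >= hi

assert int_range_is_empty(3, 3, True, False)     # [3,3) -> empty
assert int_range_is_empty(3, 4, False, False)    # (3,4) -> empty
assert not int_range_is_empty(3, 3, True, True)  # [3,3] contains 3
```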

Regards,
Jeff Davis




Re: [HACKERS] Singleton range constructors versus functional coercion notation

2011-11-19 Thread Jeff Davis
On Sat, 2011-11-19 at 12:27 -0500, Tom Lane wrote:
 The singleton range constructors don't work terribly well.
...

 I don't immediately see a solution that's better than dropping the
 single-argument range constructors.

We could change the name, I suppose, but that seems awkward. I'm
hesitant to remove them because the alternative is significantly more
verbose:

  numrange(1.0, 1.0, '[]');

But I don't have any particularly good ideas to save them, either.

Regarding the zero-argument (empty) constructors, I'd be fine removing
them. They don't seem to cause problems, but the utility is also very
minor.

Regards,
Jeff Davis




Re: [HACKERS] Are range_before and range_after commutator operators?

2011-11-18 Thread Jeff Davis
On Thu, 2011-11-17 at 17:10 -0500, Tom Lane wrote:
 Applied, thanks.  These comments aren't quite what I'd hoped for though.
 What I'm lacking is the conceptual context, ie, why is a
 less-equal-greater primitive for bounds a good thing?  It seems like
 when you consider the four possible directional (lower/upper)
 combinations, the same result from range_cmp_bounds means something
 different in each case, and I find that confusing.  I wonder whether
 functions defined along set-theoretic lines (this bound is strictly
 weaker or stronger than this other one, or these bounds together define
 an empty or singleton or non-singleton set) might be more transparent.

I think it comes down to how helpful it is to define higher-level
functions in terms of range_cmp_bounds versus some other function (or
set of functions).

range_cmp_bounds seemed to work out fairly well for most of the
operators, and it was the best that I came up with. The nice thing about
it is that it can compare a lower bound to another lower bound, or to an
upper bound. Then again, perhaps I tried too hard to unify those
concepts, and it just led to complexity?

I'm open to suggestion.
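
As a rough illustration of the primitive (a simplified sketch in plain
Python, not the actual range_cmp_bounds code): at equal values, an
exclusive lower bound sorts just above the value and an exclusive upper
bound just below it, which is what lets one function compare any
combination of bounds:

```python
def cmp_bounds(val1, inc1, lower1, val2, inc2, lower2):
    """Three-way compare of two range bounds.  Each bound is a value,
    an inclusive flag, and a flag saying whether it is a lower bound."""
    if val1 != val2:
        return -1 if val1 < val2 else 1
    # Equal values: map each bound to a small offset.
    # inclusive -> 0; exclusive lower -> +1; exclusive upper -> -1.
    off1 = 0 if inc1 else (1 if lower1 else -1)
    off2 = 0 if inc2 else (1 if lower2 else -1)
    return (off1 > off2) - (off1 < off2)

# The lower bound "(5" sorts after the upper bound "5)":
assert cmp_bounds(5, False, True, 5, False, False) > 0
# Inclusive bounds at the same value compare equal:
assert cmp_bounds(5, True, True, 5, True, False) == 0
```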

Regards,
Jeff Davis





[HACKERS] range_adjacent and discrete ranges

2011-11-18 Thread Jeff Davis
While thinking about range_cmp_bounds, I started to think that the way
range_adjacent works is wrong.

range_adjacent() depends on the bounds of two ranges to match up, such
that the boundary values are equal, but one is exclusive and the other
inclusive, and one is a lower bound and the other an upper bound.

That makes perfect sense for continuous ranges because that is the only
way they can possibly be adjacent. It also works for the discrete ranges
as defined so far, because they use a canonical function that translates
the values to [) form. But if someone were to write a canonical function
that translates the ranges to [] or () form, range_adjacent would be
useless.

There are a few approaches to solving it:

1. Make the canonical function accept a format argument much like the
constructor, and have an output format stored in the catalog that would
be passed in under most circumstances. However, range_adjacent could
pass in its own form that would be useful for its purposes.

2. Replace the canonical function with inc and dec functions, or
some variation thereof. We'd still need some kind of output format to
match the current behavior, otherwise every range would always be output
in [) form (I don't necessarily object to that, but it would be a
behavior difference for user-defined range types).

3. Remove range_adjacent.

4. Do nothing, and document that the canonical function should translate
to either [) or (] if you want range_adjacent to work.
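
The adjacency test in question amounts to something like this sketch
(plain Python, not the actual C code); the last case shows why a '[]'
canonical form defeats it:

```python
def bounds_adjacent(upper_val, upper_inc, lower_val, lower_inc):
    """True when one range's upper bound meets another's lower bound:
    same value, exactly one of the two bounds inclusive."""
    return upper_val == lower_val and upper_inc != lower_inc

# [1,3) and [3,5): upper bound "3)" exclusive meets lower "[3" inclusive.
assert bounds_adjacent(3, False, 3, True)
# [1,3] and [3,5] both contain 3, so they overlap rather than touch.
assert not bounds_adjacent(3, True, 3, True)
# With a '[]' canonical form, adjacent integer ranges like [1,3] and
# [4,6] never satisfy this test -- the problem described above.
assert not bounds_adjacent(3, True, 4, True)
```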

Thoughts?

Regards,
Jeff Davis




Re: [HACKERS] range_adjacent and discrete ranges

2011-11-18 Thread Jeff Davis
On Fri, 2011-11-18 at 10:33 -0500, Tom Lane wrote:
 regression=# select int4range(4,4,'(]');
 ERROR:  range lower bound must be less than or equal to range upper bound
 regression=# select int4range(4,4,'()');
 ERROR:  range lower bound must be less than or equal to range upper bound
 
 Would it be better for them to silently transform such cases to empty?

That had crossed my mind, but I read the first as saying that it
includes 4 and doesn't include 4, which is a little confusing.

But I wouldn't object to making them return empty ranges. Seeing that we
removed some other errors in favor of returning something, it might be a
little more consistent to return empty when possible.

I wouldn't like to extend that to int4range(4,3), however. When the
upper bound is less than the lower bound, it's almost certainly a
mistake, and the user should be informed.

By the way, what does this have to do with canonical functions? This
seems more like a constructor issue, and there is already a
zero-argument constructor to make empty ranges.

Regards,
Jeff Davis




Re: [HACKERS] range_adjacent and discrete ranges

2011-11-18 Thread Jeff Davis
On Fri, 2011-11-18 at 13:32 +0100, Florian Pflug wrote:
 That information, however, *is* already contained in the canonical
 functions, because those functions know that (2,3) is empty as an integer
 range, but non-empty as a float range.

Very good point. Thank you.

Regards,
Jeff Davis




Re: [HACKERS] Are range_before and range_after commutator operators?

2011-11-17 Thread Jeff Davis
On Wed, 2011-11-16 at 17:53 -0500, Tom Lane wrote:
 I noticed that << and >> are not marked as commutator operators,
 though a naive view of their semantics suggests they should be.
 However, I realized that there might be edge cases I wasn't thinking
 about, so I went looking in the patch to try to confirm this.  And
 I found neither a single line of documentation about it, nor a single
 comment in that hairy little nest of unobvious tests that calls itself
 range_cmp_bounds.  I am of the opinion that that routine not only
 requires a comment, but very possibly a comment longer than the routine
 itself.  What's more, if it's this complicated to code, surely it would
 be a good idea for the user-facing documentation to explain exactly
 what we think before/after mean?
 
 In general, the level of commenting in the rangetypes code seems far short
 of what I'd consider acceptable for Postgres code.  I plan to fix some
 of that myself, but I do not wish to reverse-engineer what the heck
 range_cmp_bounds thinks it's doing.

Yikes! While commenting the code, it turns out that I missed the case
where the values match and they are both exclusive; but one is upper and
the other lower. Worse than that, there were apparently some bogus test
results that expected the wrong output. Mea culpa.

Patch attached. I'll do another pass over some of the comments in other
parts of the code.

And to your original question, it seems that << and >> should be
commutators. Perhaps I had a reason earlier, but it is escaping me now.
What edge cases did you have in mind? 

Regards,
Jeff Davis


cmp-bounds-2017.gz
Description: GNU Zip compressed data



Re: [HACKERS] declarations of range-vs-element <@ and @>

2011-11-17 Thread Jeff Davis
On Wed, 2011-11-16 at 16:41 -0500, Tom Lane wrote:
 But what surprises me about this example is that I'd have expected the
 heuristic assume the unknown is of the same type as the other input
 to resolve it.  Looking more closely, I see that we apply that heuristic
 in such a way that it works only for exact operator matches, not for
 matches requiring coercion (including polymorphic-type matches).  This
 seems a bit weird.  I propose adding a step to func_select_candidate
 that tries to resolve things that way, ie, if all the known-type inputs
 have the same type, then try assuming that the unknown-type ones are of
 that type, and see if that leads to a unique match.  There actually is a
 comment in there that claims we do that, but the code it's attached to
 is really doing something else that involves preferred types within
 type categories...
 
 Thoughts?

That sounds reasonable to me.

Regards,
Jeff Davis




Re: [HACKERS] Cause of intermittent rangetypes regression test failures

2011-11-14 Thread Jeff Davis
On Mon, 2011-11-14 at 08:11 -0500, Tom Lane wrote:
 It needs to return FALSE, actually.  After further reading I realized
 that you have that behavior hard-wired into the range GiST routines,
 and it's silly to make the stand-alone versions of the function act
 differently.

Good point. That makes sense to me.

Regards,
Jeff Davis




Re: [HACKERS] Cause of intermittent rangetypes regression test failures

2011-11-13 Thread Jeff Davis
On Sun, 2011-11-13 at 15:38 -0500, Tom Lane wrote:
 If the table has been analyzed, then the
 most_common_values array for column ir will consist of 
   {empty}
 which is entirely correct since that value accounts for 16% of the
 table.  And then, when mcv_selectivity tries to estimate the selectivity
 of the << condition, it applies range_before to the empty range along
 with the int4range(100,500) value, and range_before spits up.
 
 I think this demonstrates that the current definition of range_before is
 broken.  It is not reasonable for it to throw an error on a perfectly
 valid input ... at least, not unless you'd like to mark it VOLATILE so
 that the planner will not risk calling it.
 
 What shall we have it do instead?

We could have it return NULL, I suppose. I was worried that that would
lead to confusion between NULL and the empty range, but it might be
better than marking it VOLATILE.

Thoughts, other ideas?

Regards,
Jeff Davis




Re: [HACKERS] btree gist known problems

2011-11-07 Thread Jeff Davis
On Mon, 2011-11-07 at 10:18 -0500, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
  I'm looking for known problem areas in btree_gist. I see:
  http://archives.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us
 
  With Range Types, I'm anticipating that btree_gist will become more
  important, so I'd like to know what bugs are holding it back.
 
  So, anything else come to mind? Or does btree_gist just need a good
  review? Or have the problems been fixed?
 
 The thing I was complaining about in the referenced message is specific
 to the inet opclass; it's unfair to make it sound like it's a generic
 problem with btree_gist.  And no, it's not been fixed AFAIK.

Sorry, I thought I remembered a few other comments but couldn't find
them in the archives. I must be mistaken.

Regards,
Jeff Davis




[HACKERS] btree gist known problems

2011-11-06 Thread Jeff Davis
I'm looking for known problem areas in btree_gist. I see:

http://archives.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us

With Range Types, I'm anticipating that btree_gist will become more
important, so I'd like to know what bugs are holding it back.

So, anything else come to mind? Or does btree_gist just need a good
review? Or have the problems been fixed?

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-11-04 Thread Jeff Davis
On Thu, 2011-11-03 at 10:37 +0200, Heikki Linnakangas wrote:
 Looking at the most recent patch, I don't actually see any GiST support 
 for the empty and non-empty operators (!? and ?). I don't see how those 
 could be accelerated with GiST, anyway; I think if you want to use an 
 index for those operators, you might as well create a partial or 
 functional index on empty(x).
 
 So I'm actually inclined to remove not only the nonempty function, but 
 also the ? and !? operators. They don't seem very useful, and ? and !? 
 don't feel very intuitive to me, anyway. I'll just leave the empty(x) 
 function.

That's fine with me. At best, they were only marginally useful.

Regards,
Jeff Davis




Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-11-04 Thread Jeff Davis
On Wed, 2011-11-02 at 21:29 +0200, Heikki Linnakangas wrote:
  +   else if (lower1.infinite || upper1.infinite)
  +   length1 = 1.0/0.0;
 
 That seems wrong. I take it that the point is to set length1 to infinity?

I reworked this in commit (on my private repo, of course):
6197fbffb00f729feba8082136801cdef5ac850e

For the archives, it's essentially taking the difference on the left
side of the range, and the difference on the right side of the range,
and adding them together. There are just a lot of special cases for
infinite boundaries, empty ranges, and the lack of a subtype_diff
function.

I think it's a little closer to what Alexander intended, which I think
is an improvement. It should now be able to recognize that expanding
[10,) into [0,) has a penalty of 10.
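
For the archives again, the idea reduces to roughly the following sketch
(plain Python; the handling of infinities here is illustrative, not the
committed code):

```python
INF = float('inf')

def penalty(orig_lower, orig_upper, new_lower, new_upper):
    """GiST-style penalty: how much the range grows on each side."""
    left = orig_lower - new_lower    # growth on the left side
    right = new_upper - orig_upper   # growth on the right side
    # inf - inf yields NaN, which arises when an infinite bound stays
    # infinite: treat that as no growth on that side.
    if left != left:
        left = 0.0
    if right != right:
        right = 0.0
    return left + right

# Expanding [10, inf) into [0, inf) has a penalty of 10:
assert penalty(10, INF, 0, INF) == 10.0
# Turning a finite bound into an infinite one costs infinity:
assert penalty(10, 20, 10, INF) == INF
```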

 PS. I note the docs still refer to subtype_float. I'll fix that before 
 committing.

Thank you. The only change I found strange was the test that used \c to
reconnect; but I can't say that my solution was any better.

Regards,
Jeff Davis





Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-11-04 Thread Jeff Davis
On Wed, 2011-11-02 at 22:59 +0200, Heikki Linnakangas wrote:
 This seems to be coming from the selectivity estimation function. The 
 selectivity function for @> is scalargtsel, which is usually used for 
 scalar > and >=. That doesn't seem right. But what do we store in the 
 statistics for range types in the first place, and what would be the 
 right thing to do for selectivity estimation?

I'll have to think more about that, and it depends on the operator. It
seems like an easier problem for "contains a point" than for "contains
another range" or "overlaps with another range".

Right now I don't have a very good answer, and even for the "contains a
point" case I'll have to think about the representation in pg_statistic.

Regards,
Jeff Davis




Re: [HACKERS]

2011-10-31 Thread Jeff Davis
On Sat, 2011-10-29 at 21:12 +0200, Erik Rijkers wrote:
 Would it be possible to remove the double quotes in the daterange display 
 of BC dates?
 
 select '[0001-10-29 BC,2011-10-29)'::daterange;
           daterange           
 ------------------------------
  ["0001-10-29 BC",2011-10-29)
 (1 row)

It accepts values without quotes, but on output it quotes them similar
to a record type.

Try:

  create table foo(d date);
  select '(0001-10-29 BC)'::foo;

The spaces are the only reason it's being quoted there. I think it's
best to be fairly consistent, and it was suggested that I model the
input parsing after the record parsing.

Regards,
Jeff Davis





[HACKERS] strange code in array_in

2011-10-29 Thread Jeff Davis
In array_in(), I see the following code:

  my_extra->element_type = ~element_type;

It seems like it was explicitly changed from InvalidOid to
~element_type. At first I thought it was a mistake, but then I thought
maybe it was to ensure that the next branch was taken even if
element_type == InvalidOid. But the rest of the lookups will surely fail
on InvalidOid, and it seems like it would be easy to test for InvalidOid
at the beginning if we wanted to fail.

Can someone please explain, and perhaps include a comment indicating
what's going on? Or is it just too early and I missed something simple?

Regards,
Jeff Davis






Re: [HACKERS] strange code in array_in

2011-10-29 Thread Jeff Davis
On Sat, 2011-10-29 at 15:13 -0400, Tom Lane wrote:
 What other lookups?

I just meant anything after that point in the function would surely fail
(get_type_io_data).

 array_out, and I believe a bunch of other places, use the same trick.

OK. In retrospect it is a very simple trick, but at the time I was
slightly confused by it. I guess I just haven't seen that idiom before.
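
For anyone else who hasn't seen it, the idiom can be illustrated outside
of C (a toy Python sketch; the cache structure is hypothetical):

```python
InvalidOid = 0

class Cache:
    """One-item cache whose key is initialized to the bitwise complement
    of the expected key, so the first comparison is guaranteed to miss --
    even when the key itself is 0 (InvalidOid)."""

    def __init__(self, element_type):
        # ~x != x for every integer x, so this can never look valid.
        self.element_type = ~element_type

    def lookup(self, element_type):
        if self.element_type != element_type:
            # First call (or the type changed): do the expensive lookup.
            self.element_type = element_type
            return "looked up"
        return "cached"

c = Cache(InvalidOid)
assert c.lookup(InvalidOid) == "looked up"  # no false cache hit on 0
assert c.lookup(InvalidOid) == "cached"
```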

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-26 Thread Jeff Davis
On Wed, 2011-10-26 at 12:19 -0400, Robert Haas wrote:
  1. In session A: BEGIN; SELECT * FROM foo WHERE id = 1; COMMIT;
The row has xmin = 123456, and it is cached as committed in the one-item
  cache by TransactionLogFetch.
  2. A lot of time passes. Everything is frozen, and XID wrap-around happens.
  (Session A is idle but not in a transaction, so it doesn't inhibit
  freezing.)
  3. In session B: BEGIN; INSERT INTO foo (id) VALUES (2); ROLLBACK;
By coincidence, this transaction was assigned XID 123456.
  4. In session A: SELECT * FROM foo WHERE id = 2;
The one-item cache still says that 123456 committed, so we return the
  tuple inserted by the aborted transaction. Oops.

Yes, that's the scenario I was talking about.

 Oh, hmm.  That sucks.

It didn't seem like a major concern a while ago:
http://archives.postgresql.org/message-id/28107.1291264...@sss.pgh.pa.us
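
The failure mode is easy to demonstrate with a toy model of the one-item
cache (plain Python, purely illustrative):

```python
clog = {}              # authoritative status, keyed by xid
cached = (None, None)  # one-item cache: (xid, status)

def did_commit(xid):
    """Return a transaction's status, consulting the one-item cache
    first -- the fast path that skips the clog lookup."""
    global cached
    if cached[0] == xid:
        return cached[1]
    status = clog[xid]
    cached = (xid, status)
    return status

clog[123456] = "committed"
assert did_commit(123456) == "committed"   # step 1: answer gets cached

clog[123456] = "aborted"   # wrap-around reassigns xid 123456; it aborts
assert did_commit(123456) == "committed"   # step 4: stale answer. Oops.
```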

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-25 Thread Jeff Davis
On Mon, 2011-10-24 at 13:15 +0300, Heikki Linnakangas wrote:
 Hmm, I don't think that's safe. After Oid wraparound, a range type oid 
 might get reused for some other range type, and the cache would return 
 stale values. Extremely unlikely to happen by accident, but could be 
 exploited by an attacker.
 

Any ideas on how to remedy that? I don't have another plan for making it
perform well. Plugging it into the cache invalidation mechanism seems
like overkill, but I suppose that would solve the problem.

Aren't there a few other cases like this floating around the code? I
know the single-xid cache is potentially vulnerable to xid wraparound
for the same reason.

Regards,
Jeff Davis




Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-19 Thread Jeff Davis
On Wed, 2011-10-19 at 11:44 +0300, Peter Eisentraut wrote:
 On Tue, 2011-10-18 at 21:47 -0700, Jeff Davis wrote:
  On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:
   Presumably because Jeff doesn't have that particular locale installed.
   locale -a would clarify that.
  
  $ locale -a |grep -i tr
  tr_CY.utf8
  tr_TR.utf8
  
  So, yes, I only have the UTF8 version. I didn't realize they were
  different -- do you happen to know what package I need for just plain
  tr_TR?
 
 dpkg-reconfigure locales

Did that, and still:

# locale -a|grep -i tr
tr_CY.utf8
tr_TR.utf8

 apt-get install locales-all

# aptitude install locales-all
No candidate version found for locales-all
No candidate version found for locales-all
No packages will be installed, upgraded, or removed.
0 packages upgraded, 0 newly installed, 0 to remove and 80 not upgraded.
Need to get 0 B of archives. After unpacking 0 B will be used.
 
Regards,
Jeff Davis





Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-19 Thread Jeff Davis
On Wed, 2011-10-19 at 10:10 -0700, Jeff Davis wrote:
  dpkg-reconfigure locales

I had to manually do

# locale-gen tr_TR

to make it generate tr_TR.ISO-8859-9, and now it works.

I'm not sure what we should do, exactly, but I expect that others who
attempt to run the test on ubuntu (and maybe debian) might get confused.
I'd be fine with leaving it alone though, too.

Regards,
Jeff Davis




Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-18 Thread Jeff Davis
On Sun, 2011-10-16 at 16:00 -0400, Tom Lane wrote:
 Jeff Davis pg...@j-davis.com writes:
  On master, I see a minor test error (at least on my machine) as well as
  a diff. Patch attached.
 
 Hmm, yeah, I forgot to fix this regression test when I added that DETAIL
 line.  However, I don't see the need for fooling with the lc_time value?
 
   regards, tom lane

Here is the diff that I'm seeing on master right now with:

  make -s check EXTRA_TESTS=collate.linux.utf8

If I qualify it as tr_TR.UTF-8 it works. Perhaps I have something
misconfigured on my system (Ubuntu 11.10)? I just installed:
  language-pack-de
  language-pack-tr
  language-pack-sv

in an attempt to make the test work, and it all works except for that
lc_time setting.

Regards,
Jeff Davis
*** /home/jdavis/wd/git/postgresql/src/test/regress/expected/collate.linux.utf8.out	2011-10-18 00:47:06.817223853 -0700
--- /home/jdavis/wd/git/postgresql/src/test/regress/results/collate.linux.utf8.out	2011-10-18 01:02:06.509206748 -0700
***
*** 396,411 
  
  -- to_char
  SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
     to_char   
  -------------
!  01 NIS 2010
  (1 row)
  
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
     to_char   
  -------------
!  01 NİS 2010
  (1 row)
  
  -- backwards parsing
--- 396,412 
  
  -- to_char
  SET lc_time TO 'tr_TR';
+ ERROR:  invalid value for parameter lc_time: tr_TR
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
     to_char   
  -------------
!  01 APR 2010
  (1 row)
  
  SELECT to_char(date '2010-04-01', 'DD TMMON YYYY' COLLATE "tr_TR");
     to_char   
  -------------
!  01 APR 2010
  (1 row)
  
  -- backwards parsing

==




[HACKERS] new compiler warnings

2011-10-18 Thread Jeff Davis
I'm not sure if these can/should be fixed or not, but here are the
compiler warnings I'm getting on gcc and clang on ubuntu 11.10 with -O2.
The gcc ones are mostly new.

 GCC 

$ gcc --version
gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is
NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.

warnings generated:

execQual.c: In function ‘GetAttributeByNum’:
execQual.c:1112:11: warning: the comparison will always evaluate as
‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress]
execQual.c: In function ‘GetAttributeByName’:
execQual.c:1173:11: warning: the comparison will always evaluate as
‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress]
execQual.c: In function ‘ExecEvalFieldSelect’:
execQual.c:3922:11: warning: the comparison will always evaluate as
‘true’ for the address of ‘tmptup’ will never be NULL [-Waddress]
In file included from gram.y:12962:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16257:23: warning: unused variable ‘yyg’ [-Wunused-variable]
elog.c: In function ‘write_pipe_chunks’:
elog.c:2479:8: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
elog.c:2488:7: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
elog.c: In function ‘write_console’:
elog.c:1797:7: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
tuplesort.c: In function ‘comparetup_heap’:
tuplesort.c:2751:12: warning: the comparison will always evaluate as
‘true’ for the address of ‘ltup’ will never be NULL [-Waddress]
tuplesort.c:2752:12: warning: the comparison will always evaluate as
‘true’ for the address of ‘rtup’ will never be NULL [-Waddress]
tuplesort.c: In function ‘copytup_heap’:
tuplesort.c:2783:17: warning: the comparison will always evaluate as
‘true’ for the address of ‘htup’ will never be NULL [-Waddress]
tuplesort.c: In function ‘readtup_heap’:
tuplesort.c:2835:17: warning: the comparison will always evaluate as
‘true’ for the address of ‘htup’ will never be NULL [-Waddress]
fe-connect.c: In function ‘PQconndefaults’:
fe-connect.c:832:6: warning: the comparison will always evaluate as
‘false’ for the address of ‘errorBuf’ will never be NULL [-Waddress]
fe-connect.c: In function ‘PQconninfoParse’:
fe-connect.c:3970:6: warning: the comparison will always evaluate as
‘false’ for the address of ‘errorBuf’ will never be NULL [-Waddress]
common.c: In function ‘handle_sigint’:
common.c:247:4: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
common.c:250:4: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
common.c:251:4: warning: ignoring return value of ‘write’, declared with
attribute warn_unused_result [-Wunused-result]
In file included from mainloop.c:425:0:
psqlscan.l: In function ‘evaluate_backtick’:
psqlscan.l:1677:6: warning: the comparison will always evaluate as
‘false’ for the address of ‘cmd_output’ will never be NULL [-Waddress]

 CLANG 

$ clang --version
clang version 2.9 (tags/RELEASE_29/final)
Target: x86_64-pc-linux-gnu
Thread model: posix

A lot of warnings of the form:

ruleutils.c:698:11: warning: array index of '1' indexes past the end of
an array (that contains 1 elements) [-Warray-bounds]
value = fastgetattr(ht_trig, Anum_pg_trigger_tgargs,
^~~~
In file included from ruleutils.c:23:
In file included from ../../../../src/include/catalog/indexing.h:18:
../../../../src/include/access/htup.h:808:3: note: instantiated from:
att_isnull((attnum)-1, (tup)->t_data->t_bits) ?
\
^
In file included from ruleutils.c:23:
In file included from ../../../../src/include/catalog/indexing.h:18:
In file included from ../../../../src/include/access/htup.h:18:
../../../../src/include/access/tupmacs.h:21:34: note: instantiated from:
#define att_isnull(ATT, BITS) (!((BITS)[(ATT) >> 3] & (1 << ((ATT) &
0x07))))
 ^  ~~
In file included from ruleutils.c:23:
In file included from ../../../../src/include/catalog/indexing.h:18:
../../../../src/include/access/htup.h:153:9: note: array 't_bits'
declared here
bits8   t_bits[1];  /* bitmap of NULLs --
VARIABLE LENGTH */

==

Regards,
Jeff Davis




Re: [HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-18 Thread Jeff Davis
On Tue, 2011-10-18 at 22:25 +0300, Peter Eisentraut wrote:
 Presumably because Jeff doesn't have that particular locale installed.
 locale -a would clarify that.

$ locale -a |grep -i tr
tr_CY.utf8
tr_TR.utf8

So, yes, I only have the UTF8 version. I didn't realize they were
different -- do you happen to know what package I need for just plain
tr_TR?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-10-17 Thread Jeff Davis
On Sun, 2011-10-16 at 14:43 -0700, Jeff Davis wrote:
 On Fri, 2011-10-07 at 12:54 +0400, Alexander Korotkov wrote:
 
  The first thing caught my eye in existing GiST code is idea of
  subtype_float. float8 has limited precision and can't respresent, for
  example, varlena values good enough. Even if we have large int8 value
  we can loose lower bits, but data distribution can be so that these
  bits are valuable. Wouldn't it better to have function like
  subtype_diff_float which returns difference between two values of
  subtype as an float? Using of such function could make penalty more
  sensible to even small difference between values, and accordingly more
  relevant.
  
 I started implementing subtype_diff, and I noticed that it requires
 defining an extra function for each range type. Previously, the numeric
 types could just use a cast, which was convenient for user-defined range
 types.
 
 If you have any other ideas to make that cleaner, please let me know.
 Otherwise I'll just finish implementing subtype_diff.

I'm beginning to think that we should just allow the user to specify
their own gist_penalty function. Specifying just the subtype_diff
doesn't save much time, and it can only be limiting. Additionally, it's
harder for users to understand the purpose of the function.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] (patch) regression diffs on collate.linux.utf8 test

2011-10-16 Thread Jeff Davis
On master, I see a minor test error (at least on my machine) as well as
a diff. Patch attached.

Regards,
Jeff Davis
*** a/src/test/regress/expected/collate.linux.utf8.out
--- b/src/test/regress/expected/collate.linux.utf8.out
***
*** 395,401  SELECT relname FROM pg_class WHERE relname ~* '^abc';
  (0 rows)
  
  -- to_char
! SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
 to_char   
  -
--- 395,401 
  (0 rows)
  
  -- to_char
! SET lc_time TO 'tr_TR.UTF-8';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
 to_char   
  -
***
*** 967,972  CREATE COLLATION test3 (lc_collate = 'en_US.utf8'); -- fail, need lc_ctype
--- 967,973 
  ERROR:  parameter lc_ctype must be specified
  CREATE COLLATION testx (locale = 'nonsense'); -- fail
  ERROR:  could not create locale "nonsense": No such file or directory
+ DETAIL:  The operating system could not find any locale data for the locale name "nonsense".
  CREATE COLLATION test4 FROM nonsense;
  ERROR:  collation "nonsense" for encoding "UTF8" does not exist
  CREATE COLLATION test5 FROM test0;
*** a/src/test/regress/sql/collate.linux.utf8.sql
--- b/src/test/regress/sql/collate.linux.utf8.sql
***
*** 146,152  SELECT relname FROM pg_class WHERE relname ~* '^abc';
  
  -- to_char
  
! SET lc_time TO 'tr_TR';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE tr_TR);
  
--- 146,152 
  
  -- to_char
  
! SET lc_time TO 'tr_TR.UTF-8';
  SELECT to_char(date '2010-04-01', 'DD TMMON ');
  SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE tr_TR);
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-10-16 Thread Jeff Davis
On Fri, 2011-10-07 at 12:54 +0400, Alexander Korotkov wrote:

 The first thing caught my eye in existing GiST code is idea of
 subtype_float. float8 has limited precision and can't respresent, for
 example, varlena values good enough. Even if we have large int8 value
 we can loose lower bits, but data distribution can be so that these
 bits are valuable. Wouldn't it better to have function like
 subtype_diff_float which returns difference between two values of
 subtype as an float? Using of such function could make penalty more
 sensible to even small difference between values, and accordingly more
 relevant.
 
I started implementing subtype_diff, and I noticed that it requires
defining an extra function for each range type. Previously, the numeric
types could just use a cast, which was convenient for user-defined range
types.

If you have any other ideas to make that cleaner, please let me know.
Otherwise I'll just finish implementing subtype_diff.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COUNT(*) and index-only scans

2011-10-12 Thread Jeff Davis
On Tue, 2011-10-11 at 13:22 -0400, Robert Haas wrote:
 The real issue is that the costing estimates need to be accurate, and
 that's where the rubber hits the road.  Otherwise, even if we pick the
 right way to scan the table, we may do silly things up the line when
 we go to start constructing the join order.  I think we need to beef
 up ANALYZE to gather statistics on the fraction of the pages that are
 marked all-visible, or maybe VACUUM should gather that information.
 The trouble is that if we VACUUM and then ANALYZE, we'll often get
 back a value very close to 100%, but then the real value may diminish
 quite a bit before the next auto-analyze fires.  I think if we can
 figure out what to do about that problem we'll be well on our way...

Can you send stats messages to keep track when you unset a bit in the
VM? That might allow it to be more up-to-date.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Jeff Davis
On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote:
 The cure seems worse than the disease.  What is so bad about '[]'?

OK, so we stick with the 3-argument form. Do we have a default for the
third argument, or do we scrap it to avoid confusion?

There were some fairly strong objections to using '[]' as the default or
having the default vary between types. So, the only real option
remaining, if we do have a default, is '[)'.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Jeff Davis
On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
  Certainly not the end of the world, but is the convenience of being
  able to write somerange(a, b) instead of somerange(a, b, '[)')
  really worth it? I kind of doubt that...
 
 You're making a persuasive argument for the latter based solely on the
 clarity.  If people see that 3rd element in the DDL, or need to
 provide it, it's *very* obvious what's going on.

That was how I originally thought, but we're also providing built-in
range types like tsrange and daterange. I could see how if the former
excluded the endpoint and the latter included it, it could be confusing.

We could go back to having different constructor names for different
inclusivity; e.g. int4range_cc(1,10). That at least removes the
awkwardness of typing (and seeing) '[]'.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Jeff Davis
On Tue, 2011-10-11 at 12:40 -0400, Robert Haas wrote:
 I think using '[)' is fine.  At some level, this is just a question of
 expectations.  If you expect that int4range(1,4) will create a range
 that includes 4, well, you're wrong.  Once you get used to it, it will
 seem normal, and you'll know that you need to write
 int4range(1,4,'[]') if that's what you want.  As long as the system is
 designed around a set of consistent and well-thought-out principles,
 people will get used to the details.  I don't see that the idea of a
 half-open range over a discrete-valued type is particularly confusing
 - we use them all the time in programming, when we make the end
 pointer point to the byte following the end of the array, rather than
 the last element - but even if it is, overall design consistency
 trumps what someone may find to be the absolutely perfect behavior in
 some particular case.  And saving typing is nearly always good -
 unless it creates a LOT more confusion than I think this will.

That sounds very reasonable to me.

Tom made an observation about '[1,INT_MAX]' throwing an error because
canonicalization would try to increment INT_MAX. But I'm not
particularly disturbed by it. If you want a bigger range, use int8range
or numrange -- the same advice we give to people who want unsigned
types. Or, for people who really need the entire range of signed int4
exactly, they can easily make their own range type that canonicalizes to
'[]'.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-10 Thread Jeff Davis
On Mon, 2011-10-10 at 14:27 +0100, Thom Brown wrote:
 I don't know if this has already been discussed, but can you explain
 the following:
 
 postgres=# select '[1,8]'::int4range;
  int4range
 ---
  [1,9)
 (1 row)
 
 It seems unintuitive to represent a discrete range using an exclusive
 upper bound.  While I agree that the value itself is correct, it's
 representation looks odd.  Is it necessary?

The canonicalize function (specified at type creation time) allows you
to specify the canonical output representation. So, I can change the
canonical form for discrete ranges to use '[]' notation if we think
that's more expected.

But then int4range(1,8) would still mean int4range(1,8,'[)') and
therefore '[1,7]'. I used to have a default_flags parameter that could
also be specified at type creation time that would control the default
third parameter (the parameter that controls inclusivity) of the
constructor. However, I removed the default_flags parameter because,
as Florian pointed out, it's better to have a consistent output from the
constructor.

I'm open to suggestions, including potentially bringing back
default_flags.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-10 Thread Jeff Davis
On Mon, 2011-10-10 at 12:53 -0400, Tom Lane wrote:
  The canonicalize function (specified at type creation time) allows you
  to specify the canonical output representation. So, I can change the
  canonical form for discrete ranges to use '[]' notation if we think
  that's more expected.
 
 What if I write '[1,INT_MAX]'::int4range?  The open-parenthesis form will
 fail with an integer overflow.  I suppose you could canonicalize it to
 an unbounded range, but that seems unnecessarily surprising.

So, are you suggesting that I canonicalize to '[]' then? That seems
reasonable to me, but there's still some slight awkwardness because
int4range(1,10) would be '[1,9]'.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-10 Thread Jeff Davis
On Mon, 2011-10-10 at 19:22 +0200, Florian Pflug wrote:
 I still think we should strive for consistency here, so let's also make
 '[]' the default flags for the range constructors.

For continuous ranges I don't think that's a good idea. Closed-open is a
very widely-accepted convention and there are good reasons for it -- for
instance, it's good for specifying contiguous-but-non-overlapping
ranges.

So, I think we either need to standardize on '[)' or allow different
default_flags for different types. Or, always specify the inclusivity in
the constructor (hopefully in a convenient way).

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-10 Thread Jeff Davis
On Mon, 2011-10-10 at 18:39 +0100, Thom Brown wrote:
  So the default boundaries should be '[]' as opposed to '[)' as it is
 now.

Would that vary between range types? In other words, do I bring back
default_flags?

If not, I think a lot of people will object. The most common use-case
for range types are for continuous ranges like timestamps. And (as I
pointed out in reply to Florian) there are good reasons to use the '[)'
convention for those cases.

Regards,
Jeff Davis





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-10 Thread Jeff Davis
On Tue, 2011-10-11 at 03:14 +0200, Florian Pflug wrote:
 Maybe ranges over discrete types are slightly more likely to be closed,
 and ranges over continuous types slightly more likely to be open. Still,
 I very much doubt that the skew in the distribution is large enough to
 warrant the confusion and possibility of subtle bugs we introduce by making
 the semantics of a range type's constructor depend on the definition of the
 range and/or base type.

I think you persuaded me on the consistency aspect.

I'm wondering whether to do away with the default argument entirely, and
just force the user to always specify it during construction. It seems
like a shame that such pain is caused over the syntax, because in a
perfect world it wouldn't be a bother to specify it at all. I even
considered using prefix/postfix operators to try to make it nicer, but
it seems like every idea I had was just short of practical. Maybe a few
extra characters at the end aren't so bad.

I'd like to hear from some potential users though to see if anyone
recoils at the common case.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-10-08 Thread Jeff Davis
On Fri, 2011-10-07 at 12:54 +0400, Alexander Korotkov wrote:

 The first thing caught my eye in existing GiST code is idea of
 subtype_float. float8 has limited precision and can't respresent, for
 example, varlena values good enough. Even if we have large int8 value
 we can loose lower bits, but data distribution can be so that these
 bits are valuable. Wouldn't it better to have function like
 subtype_diff_float which returns difference between two values of
 subtype as an float? Using of such function could make penalty more
 sensible to even small difference between values, and accordingly more
 relevant.

The reason I did it that way is for unbounded ranges. With
subtype_diff_float, it's difficult for the GiST code to differentiate
between [10,+inf) and [100000,+inf), because infinity minus anything is
infinity. But when inserting the range [100,200), the penalty for the
first one should be zero and the second one should have some positive
penalty, right?

Maybe we can still use subtype_diff_float by calling it on various pairs
of bounds to come up with a reasonable cost?

I'm open to suggestion. I'd just like to make sure that unbounded ranges
are a part of the consideration.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2011-10-08 Thread Jeff Davis
On Sat, 2011-10-08 at 18:43 +0400, Alexander Korotkov wrote:

 I meant that penalty can be determined as sum of difference of old and
 new bounds of range, i.e. penalty = subtype_diff_float(new_lower,
 old_lower) + subtype_diff_float(old_upper, new_upper). 
 When we insert [100,200) into [10,+inf), union([100,200), [10,+inf))
 = [10,+inf), so penalty =  subtype_diff_float(10,10)
 +  subtype_diff_float(+inf, +inf) = 0 + 0 = 0.
 When we insert [100,200) into [100000,+inf), union([100,200),
 [100000,+inf)) = [100,+inf), so penalty = subtype_diff_float(100,100000)
 + subtype_diff_float(+inf, +inf) = 99900 + 0 = 99900.
 
OK, I like that. I will make the change.

 But, there are still the problem, when we'are inserting open interval
 when there is no such open intervals yet. For example, we're going to
 insert [0,+inf), while root page contains [0,10), [10,20), [20,30).
 Each penalty will be infinity, while it seems to be better to insert
 it into [0,10). But, it seems to me to be general limitation of
 current GiST interface, when we have to express penalty in a single
 float.

That seems like an acceptable limitation. I don't think my solution
handles it any better.

Regards,
Jeff Davis

 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-08 Thread Jeff Davis
On Sat, 2011-10-08 at 12:44 -0700, Jeff Janes wrote:
 When I apply this to head, make check fails with:
 
   create type textrange_en_us as range(subtype=text, collation="en_US");
 + ERROR:  collation "en_US" for encoding "SQL_ASCII" does not exist
 
 Is this a problem?  e.g. will it break the build-farm?
 
 make check LANG=en_US does pass

Thank you for pointing that out. I think I need to remove those before
commit, but I just wanted them in there now to exercise that part of the
code.

Is there a better way to test collations like that?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-06 Thread Jeff Davis
On Thu, 2011-10-06 at 23:26 +0400, Alexander Korotkov wrote:
 Hi, Jeff!
 
 
 Heikki has recently commited my patch about picksplit for GiST on
 points and boxes:
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7f3bd86843e5aad84585a57d3f6b80db3c609916
 I would like to try this picksplit method on ranges. I believe that it
 might be much more efficient on highly overlapping ranges than current
 picksplit. Range types patch isn't commited, yet. So, what way of work
 publishing is prefered in this case? Add-on patch, separate branch in
 git or something else?

Sounds great! Thank you.

The repo is available here:

http://git.postgresql.org/gitweb/?p=users/jdavis/postgres.git;a=log;h=refs/heads/rangetypes

I'd prefer to include it in the initial patch. If the current GiST code
is going to be replaced, then there's not much sense reviewing/testing
it.

You may need to consider unbounded and empty ranges specially. I made an
attempt to do so in the current GiST code, and you might want to take a
look at that first. I'm not particularly attached to my approach, but we
should do something reasonable with unbounded and empty ranges.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-02 Thread Jeff Davis
On Sun, 2011-10-02 at 11:32 +0200, Florian Pflug wrote:
 Looking at the patch, I noticed that it's possible to specify the default
 boundaries ([], [), (] or ()) per individual float type with the
 DEFAULT_FLAGS clause of CREATE TYPE .. AS RANGE. I wonder if that doesn't
 do more harm then good - it makes it impossible to deduce the meaning of
 e.g. numericrange(1.0, 2.0) without looking up the definition of numericrange.
 
 I suggest we pick one set of default boundaries, ideally '[)' since that
 is what all the built-in canonization functions produce, and stick with it.

That sounds reasonable to me. Unless someone objects, I'll make the
change in the next patch.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - symmetric

2011-09-25 Thread Jeff Davis
On Sat, 2011-09-24 at 10:49 -0400, Bruce Momjian wrote:
  I'll add that it would also cause a little confusion with inclusivity.
  What if you do: '[5,2)'::int4range? Is that really '[2,5)' or '(2,5]'?
 
 Reminder:  BETWEEN supports the SYMMETRIC keyword, so there is
 a precedent for this.

And I don't see it as valuable enough to justify changing the grammar.

Also, that still leaves confusion about inclusivity when applied to
range types. 

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-22 Thread Jeff Davis
On Thu, 2011-09-22 at 02:31 +0200, Florian Pflug wrote:
 My personal favourite would be '0', since it resembles the symbol used
 for empty sets in mathematics, and we already decided to use mathematical
 notation for ranges.
 
 If we're concerned that most of our users won't get that, then 'empty'
 would be a viable alternative I think.
 
 From a consistency POV it'd make sense to use a bracket-based syntax
 also for empty ranges. But the only available options would be '()' and '[]',
 which are too easily confused with '(,)' and '[,]' (which we already
 decided should represent the full range).

Yes, I think () is too close to (,).

Brainstorming so far:
 0       : simple, looks like the empty set symbol
 empty   : simple
 <empty> : a little more obvious that it's special
 ''      : visually looks empty
 -       : also looks empty
 {}      : mathematical notation, but doesn't quite fit ranges

I don't have a strong opinion. I'd be OK with any of those.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Jeff Davis
On Thu, 2011-09-22 at 11:31 -0400, Robert Haas wrote:
 On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown t...@linux.com wrote:
  s/visca-versa/vice-versa/
  s/laods/loads/
 
 Fixed.  v4 attached.
 

Can you please explain the more subtly part below?

+A common pattern where this actually does result in a bug is when
adding items
+onto a queue.  The writer does this:
+
+q->items[q->num_items] = new_item;
+++q->num_items;
+
+The reader does this:
+
+num_items = q->num_items;
+for (i = 0; i < num_items; ++i)
+/* do something with q->items[i] */
+
+This code turns out to be unsafe, because the writer might increment
+q->num_items before it finishes storing the new item into the
appropriate slot.
+More subtly, the reader might prefetch the contents of the q->items
array
+before reading q->num_items.

How would the reader prefetch the contents of the items array, without
knowing how big it is?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-22 Thread Jeff Davis
On Thu, 2011-09-22 at 19:12 -0400, Robert Haas wrote:
 But since you asked... as I
 understand it, unless you're running on Alpha, you actually don't need
 a barrier here at all, because all currently-used CPUs other than
 alpha respect data dependencies, which means that if q-num_items is
 used to compute an address to be read from memory, the CPU will ensure
 that the read of that address is performed after the read of the value
 used to compute the address.  At least that's my understanding.  But
 Alpha won't.

I'm still trying to figure out how it's even possible to read an address
that's not computed yet. Something sounds strange about that...

I think it might have more to do with branch prediction or something
else. In your example, the address is not computed from q->num_items
directly, it's computed using i. But that branch being followed is
dependent on a comparison with q->num_items. Maybe that's the dependency
that's not respected?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
 No, but more similar the format are the easier it gets to at least factor
 the hairy parts of such a parser into a common subroutine. Assume that we
 don't support NULL as an alias for INF. What would then be the result of
 
   '[A,NULL)'::textrange?

I think that the range input should *parse* NULL in a similar way, but
reject it. So, to make it the range between two definite strings, you'd
do:

  '[A,"NULL")'::textrange

which would be equal to textrange('A','NULL','[)'). Without the quotes,
it would detect the NULL, and give an error. Open to suggestion here,
though.

 Presumably, it'd be the same as textrange('A','NULL','[)'). Which think
 is a bit surprising, since '[A,NULL]'::text[] produces ARRAY['A',NULL],
 *NOT* ARRAY['A','NULL'].
 
 BTW, we currently represent infinity for floating point values as
 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make
 
   int4range(0,NULL,'[)')::text
 
 return 
 
   '[0,Infinity)'?

I'm open to that, if you think it's an improvement I'll do it (but we
should probably pick one identifiable string and stick with it). What
I'd like to avoid is adding to the NULL/infinity confusion.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
 What I really
 care about is that we don't talk ourselves into needing a zillion
 constructor functions.  Making things work with a single constructor
 function seems to me to simplify life quite a bit, and allowing there
 seems essential for that.

I think we pretty much all agree on that. However, you did see the note
about the difficulty of using default parameters in built-in functions,
right?

I ultimately ended up with 4 constructors, each with the same name but
0, 1, 2, and 3 parameters. Suggestions welcome.

 (I am also vaguely wondering what happens if you have a text
 range: is ("nubile", null) ambiguous?)

There are a few ways to handle that. I would lean toward parsing the
NULL as a special keyword, and then rejecting it (does it matter if it's
upper case?).

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Wed, 2011-09-21 at 17:20 +0200, Florian Pflug wrote:
 Hm, so we'd have
 
   '(X,)' for range(X, NULL, '()'),
   '(,X)' for range(NULL, X, '()') and
   '(,)' for range(NULL, NULL, '()').
 
 We'd then have the choice of either declaring
 
   '(X,]' to mean '(X,)',
   '[,X)' to mean '(,X)' and
   '[,]' to mean '(,)'
 
 or to forbid the use of '[' and ']' for unspecified bounds.

Right now, I just canonicalize it to round brackets if infinite. Seems
pointless to reject it, but I can if someone thinks it's better.

 (Leaving out the ',' in the case of only one bound as in my reply to Robert's
 mail somewhere else in this thread doesn't actually work, since it'd be 
 ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').)
 
 One nice property is that, apart from the different brackets used, this
 representation is identical to the one used by records while still avoiding
 the infinity vs. NULL confusion.

OK, I like that. Slightly strange to require quoting empty strings, but
not stranger than the alternatives.

While we're at it, any suggestions on the text representation of an
empty range?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote:
 I've thought about this some more, and came to realize that the question
 here really is whether
 
   floatrange(0, 'Infinity'::float, '[)')
 
 and
 
   floatrange(0, NULL, '[)')
 
 are the same thing or not.

The unbounded side of a range is never equal to a value in the data
type's domain, so no, it's not the same.

I think that we pretty much settled on just using an empty string for
infinity in the other thread, right? So that makes this a non-issue.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-19 Thread Jeff Davis
On Mon, 2011-09-19 at 17:23 +0200, Florian Pflug wrote:
 The one reason I can see in favour of supporting N-U-L-L there is 
 compatibility with arrays.

But arrays actually do store and produce NULLs; ranges don't.

  I've recently had the questionable pleasure
 of writing PHP functions to parse and emit our textual representations of
 arrays, records, dates and timestamps. After that experience, I feel that
 the number of similar-yet-slightly-different textual input output format
 for non-primitive types is already excessive, and any further additions
 should be modeled after some existing ones.

I'm not clear on how accepting NULL would really save effort. With
ranges, the brackets have an actual meaning (inclusivity), and empty
ranges have no brackets at all. So I don't think it's going to be easy
to write one function to parse everything.

What about binary formats?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-19 Thread Jeff Davis
On Mon, 2011-09-19 at 11:00 -0500, Kevin Grittner wrote:
 On a practical level, our shop is already effectively doing this. 
 We have several tables where part of the primary key is effective
 date and there is a null capable expiration date -- with a NULL
 meaning that no expiration date has been set.  It would be nice to
 be able to have a generated column function which used these two
 dates to build a range for exclusion constraints and such.

Agreed, that's a good convenience argument for accepting NULL boundaries
in the constructors.

Underneath though, we don't use NULL semantics (because they don't make
sense for ranges -- in fact, avoiding the need to constantly
special-case NULLs is one of the reasons to use range types). So, we
want to avoid confusion where possible.

Regards,
Jeff Davis




Re: [HACKERS] Range Types - symmetric

2011-09-13 Thread Jeff Davis
On Tue, 2011-09-13 at 12:34 -0400, Christopher Browne wrote:
  select int4range(5,2);
  ERROR:  range lower bound must be less than or equal to range upper bound
 
  Of course, I won't argue this is a bug, but I was wondering if it wouldn't 
  be handy to allow a
  'symmetric' mode in range construction, where, if the first of the pair is 
  higher than the second,
  they are automatically swapped, similar to SYMMETRIC in the BETWEEN clause.

...

 If you have a computation that gets a backwards range, then it is
 more than possible that what you've got isn't an error of getting the
 range backwards, but rather the error that your data is
 overconstraining, and that you don't actually have a legitimate range.

Agreed. On balance, it's just as likely that you miss an error as save a
few keystrokes.

I'll add that it would also cause a little confusion with inclusivity.
What if you do: '[5,2)'::int4range? Is that really '[2,5)' or '(2,5]'?

Regards,
Jeff Davis




[HACKERS] collation, arrays, and ranges

2011-09-12 Thread Jeff Davis
My interpretation of collation for range types is different than that
for arrays, so I'm presenting it here in case someone has an objection.

An array type has the same typcollation as its element type. This makes
sense, because comparisons between arrays are affected by the COLLATE
clause.

Comparison between ranges should not be affected by the COLLATE clause
(as we discussed). So, I chose to represent that as a separate
rngcollation and leave the typcollation 0. In other words, collation is
a concept internal to that range type and fixed at type definition time.
Range types are affected by their internal collation, but don't take
part in the logic that passes collation through the type system.

Comments?

Regards,
Jeff Davis






Re: [HACKERS] collation, arrays, and ranges

2011-09-10 Thread Jeff Davis
On Sat, 2011-09-10 at 13:21 -0400, Tom Lane wrote:
  So, I chose to represent that as a separate
  rngcollation and leave the typcollation 0. In other words, collation is
  a concept internal to that range type and fixed at type definition time.
  Range types are affected by their internal collation, but don't take
  part in the logic that passes collation through the type system.
 
 Should I read that as saying you want to add yet another column to
 pg_type?  I'd prefer not to do that.  Seems to me we could still store
 the value in typcollation, but just interpret the column a bit
 differently depending on typtype.

I added the column to pg_range (rngcollation), which seemed a little
less invasive than either of the other options (either adding a new
column to pg_type or overloading the existing one).

I was worried about having the same column in pg_type mean two different
things -- every caller of get_typcollation would need to be careful.

Regards,
Jeff Davis




[HACKERS] casting between range types

2011-08-31 Thread Jeff Davis
At one point, the question of casting between range types came up. At
first, this seemed like a fairly reasonable suggestion, but now I don't
think I like the semantics.

A normal cast changes between essentially equivalent values in different
domains. For instance 3 as an int4 is equivalent to 3.0 as a numeric.

However, if we take the simple approach with range types and cast the
bounds, we end up with some weird situations.

First, a range is really a set. So if we take '[1,10)'::int4range and
cast that to numrange, we end up moving from a set of exactly 9 elements
to a set of an infinite number of elements. Going the other way is
probably worse.
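A toy Python model (assumptions for illustration, not PostgreSQL source) of why casting the bounds changes the set itself:

```python
# The set a range denotes depends on the element type, so casting
# '[1,10)' from int4range to numrange changes the set, not just its
# representation.

def int_range_members(lo, hi):
    # '[lo,hi)' over int4: a finite set of discrete elements
    return set(range(lo, hi))

def num_range_contains(lo, hi, x):
    # '[lo,hi)' over numeric: infinitely many members, so we can
    # only test membership
    return lo <= x < hi

members = int_range_members(1, 10)
print(len(members))                    # 9: the integers 1..9
print(num_range_contains(1, 10, 9.5))  # True, yet 9.5 is not in members
```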

Sometimes casts are a bit lossy and I suppose we could write that off.

But things get weirder when the total order is different (e.g. different
text collations). Then you end up with a completely different set of
values, which doesn't sound like a cast to me at all.

So, I'm leaning toward just not providing any casts from one range type
to another.

Thoughts?

Regards
Jeff Davis




Re: [HACKERS] casting between range types

2011-08-31 Thread Jeff Davis
On Wed, 2011-08-31 at 09:20 +0300, Heikki Linnakangas wrote:
 On 31.08.2011 09:14, Jeff Davis wrote:
  First, a range is really a set. So if we take '[1,10)'::int4range and
  cast that to numrange, we end up moving from a set of exactly 9 elements
  to a set of an infinite number of elements. Going the other way is
  probably worse.

...

 Can you only provide casts that make sense, like between int4 and 
 numeric range types, and leave out the ones that don't?

There are certainly some casts that make sense, like int4range to
int8range. Do you think int4range to numrange also makes sense?

Regards,
Jeff Davis




Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Jeff Davis
On Fri, 2011-08-26 at 12:46 -0400, Robert Haas wrote:
 --sections='predata data'
 --sections='postdata'
 --sections='index'

Agreed. After command line options reach a certain level of complexity,
I think it's worth looking for a more general way to express them.

Regards,
Jeff Davis




[HACKERS] Range Types

2011-08-23 Thread Jeff Davis
Attached is the latest version of the Range Types patch. I will get it
into better shape before the commitfest, but wanted to put up a draft in
case anyone had comments on the TODO items.

Changes:

  * Uses BTree opclass rather than compare function.
  * Collation specified at type definition time.
  * Various fixes.

TODO:

  * Should the catalog hold the opclass or the opfamily? This doesn't
affect much, but I wasn't sure which to actually store in the catalog.

  * Use Robert Haas' suggestion for auto-generating constructors with
the same name as the range type, e.g. int8range(1,10,'[]'), where the
third argument defaults to '[)'. This allows better type inference for
constructors, especially when there are multiple range types over the
same base type (and collation is a common case of this). I believe this
was the best idea after significant discussion:
http://archives.postgresql.org/pgsql-hackers/2011-06/msg02046.php
http://archives.postgresql.org/pgsql-hackers/2011-07/msg00210.php

  * Send/recv functions

  * cleanup

  * documentation updates
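The constructor shape proposed in the TODO above could be sketched roughly as follows (a hypothetical Python stand-in, not the actual implementation):

```python
# A per-type constructor named after the range type, whose third
# argument gives the inclusivity flags and defaults to '[)'.
def int8range(lower, upper, flags="[)"):
    lower_inc = flags[0] == "["
    upper_inc = flags[1] == "]"
    return (lower, lower_inc, upper, upper_inc)

print(int8range(1, 10))        # (1, True, 10, False), i.e. '[1,10)'
print(int8range(1, 10, "[]"))  # (1, True, 10, True),  i.e. '[1,10]'
```

Because each range type gets its own named constructor, the type of the result is unambiguous even when several range types share a base type.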

Regards,
Jeff Davis


rangetypes-20110822.gz
Description: GNU Zip compressed data



Re: [HACKERS] skip WAL on COPY patch

2011-08-23 Thread Jeff Davis
On Tue, 2011-08-23 at 15:05 -0400, Tom Lane wrote:
 Steve Singer ssin...@ca.afilias.info writes:
  The attached patch adds an option to the COPY command to skip writing 
  WAL when the following conditions are all met:
 
  1) The table is empty (zero size on disk)
  2) The copy command can obtain an access exclusive lock on the table 
 without blocking.
  3) The WAL isn't needed for replication
 
 Exposing this as a user-visible option seems a seriously bad idea.

In that particular way, I agree. But it might be useful if there were a
more general declarative option like BULKLOAD. We might then use that
information for a number of optimizations that make sense for large
loads.

Regards,
Jeff Davis




Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jeff Davis
On Tue, 2011-08-16 at 11:01 -0500, Jim Nasby wrote:
 Also, an invalid transaction seems to be the result of least
 surprise... if you cared enough to begin a transaction, you're going
 to expect that either everything between that and the COMMIT succeeds
 or fails, not something in-between.

Agreed.

Perhaps we need a new utility command to set the snapshot to make the
error handling a little more obvious?

Regards,
Jeff Davis




Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jeff Davis
On Tue, 2011-08-16 at 20:35 -0400, Tom Lane wrote:
 I'm not convinced by the above argument, because it requires that
 you pretend there's a significant difference between syntax errors and
 run time errors (whatever those are).

After a syntax error like COMMMIT the transaction will remain inside
the failed transaction block, but an error during COMMIT (e.g. deferred
constraint check failure) will exit the transaction block.

 I think we'd be far better off to maintain the position that a failed
 BEGIN does not start a transaction, under any circumstances.  To do
 that, we cannot have this new option attached to the BEGIN, which is a
 good thing anyway IMO from a standards compatibility point of view.
 It'd be better to make it a separate utility statement.

+1 for a utility statement. Much clearer from the user's standpoint what
kind of errors they might expect, and whether the session will remain in
a transaction block.

Regards,
Jeff Davis




Re: [HACKERS] synchronized snapshots

2011-08-16 Thread Jeff Davis
On Tue, 2011-08-16 at 21:08 -0400, Robert Haas wrote: 
 attaching it to BEGIN feels natural to me.

My only objection is that people have different expectations about
whether the session will remain in a transaction block when they
encounter an error. So, it's hard to make this work without surprising
about half the users.

And there are some fairly significant consequences to users who guess
that they will remain in a transaction block in case of an error; or who
are just careless and don't consider that an error may occur.

 If we did add another toplevel command, what would we call it?

SET TRANSACTION SNAPSHOT perhaps?

Regards,
Jeff Davis




Re: [HACKERS] Extra check in 9.0 exclusion constraint unintended consequences

2011-08-12 Thread Jeff Davis
On Fri, 2011-08-12 at 14:58 -0400, Robert Haas wrote:
 Having thought about this a bit further, I'm coming around to the view
 that if it isn't worth adding this in master, it's not worth adding at
 all.  I just don't think it's going to get any visibility as a
 back-branch only doc patch.

Fine with me.

Regards,
Jeff Davis





Re: [HACKERS] Extra check in 9.0 exclusion constraint unintended consequences

2011-08-11 Thread Jeff Davis
On Thu, 2011-08-11 at 11:58 -0400, Robert Haas wrote:
 I'm OK with adding a note either to the 9.0 docs only (which means it
 might be missed by a 9.0 user who only looks at the current docs) or
 with adding a note to all versions mentioning the difference in
 behavior with 9.0, but I'm not really sure which way to go with it.
 Or we could just not do anything at all.  Anyone else have an opinion?

It seems to be somewhat of a burden to carry a version-specific note
indefinitely... more clutter than helpful. So I'd vote for just changing
the 9.0 docs.

Or, we could add the 9.0-specific note to 9.0 and 9.1 docs, but leave it
out of 'master'. That way it sticks around for a while but we don't have
to remember to remove it later.

Regards,
Jeff Davis





Re: [HACKERS] augmenting MultiXacts to improve foreign keys

2011-08-09 Thread Jeff Davis
On Tue, 2011-08-09 at 13:01 -0400, Alvaro Herrera wrote:
 Note that the KEY UPDATE lock would be an internal option, not exposed
 to SQL.  I think we already have enough extensions in this area.  We are
 forced to expose KEY SHARE because RI triggers get to it via SPI, but I
 would be happy to avoid doing it if I knew how.

Right now, FKs aren't really very special; they are mostly just
syntactic sugar (right?). This proposal would turn FKs into special
internal mechanisms, and I don't see the benefit in doing so.

[ I didn't read through the previous threads yet, so perhaps this was
already discussed. ]

Regards,
Jeff Davis





Re: [HACKERS] Ignore lost+found when checking if a directory is empty

2011-08-09 Thread Jeff Davis
On Tue, 2011-08-09 at 14:52 -0400, Brian Pitts wrote:
 When an ext2, ext3, or ext4 filesystem is mounted directly on the
 PGDATA directory, initdb will refuse to run because it sees the
 lost+found directory that mke2fs created and assumes the PGDATA
 directory is already in use for something other than PostgreSQL.
 Attached is a patch against master which will cause a directory that
 contains only lost+found to still be treated as empty.
 
 This was previously proposed in 2001; see
 http://archives.postgresql.org/pgsql-hackers/2001-03/msg01194.php

In the referenced discussion (10 years ago), Tom seemed OK with it and
Peter did not seem to like it much.

I think I agree with Peter here that it's not a very good idea, and I
don't see a big upside. With tablespaces it seems to make a little bit
more sense, but I'd still lean away from that idea.

Regards,
Jeff Davis






Re: [HACKERS] Ignore lost+found when checking if a directory is empty

2011-08-09 Thread Jeff Davis
On Tue, 2011-08-09 at 14:52 -0400, Brian Pitts wrote:
 Attached is a patch against master which will cause a directory that
 contains only lost+found to still be treated as empty.

Please add this to the September commitfest at:
https://commitfest.postgresql.org/

Regards,
Jeff Davis




Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-06 Thread Jeff Davis
On Fri, 2011-08-05 at 23:16 -0400, Bruce Momjian wrote:
 Well, if the table is created in the same transaction (which is the only
 case under consideration), no other sessions can write to the table so
 you are just writing the entire table on commit, rather than to the WAL.

The transaction can still write to many tables that way, and that could
mean many fsyncs.

Also, there may be a bunch of other transactions trying to write to the
WAL that have to wait while your transaction seeks out to fsync the
data file and then seeks back to the WAL.

Regards,
Jeff Davis





Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-05 Thread Jeff Davis
On Thu, 2011-08-04 at 18:07 -0400, Bruce Momjian wrote:
 I am confused how generating WAL traffic that is larger than the heap
 file we are fsync'ing can possibly be slower.  Are you just throwing out
 an idea to try to make me prove it?

That's worded in a slightly confusing way, but here is the trade-off:

1. If you are using WAL, then regardless of what your transaction does,
only the WAL needs to be fsync'd at commit time. Conveniently, that's
being written sequentially, so it's a single fairly cheap fsync; and all
the data page changes are deferred, collected together, and fsync'd at
checkpoint time (rather than commit time). The cost is that you
double-write the data.

2. If you aren't using WAL, you need to fsync every data file the
transaction touched, and those writes are probably not localized with
other activity. Also, the _entire_ data file needs to be sync'd, so if
many other transactions have made changes all over that file, it may
require _many_ seeks to accomplish the one fsync. The benefit is that
you don't double-write the data.

So, fundamentally, WAL is (in the OLTP case, where a transaction is much
shorter than a checkpoint interval) a big performance _win_, because it
allows us to do nice sequential writing in a single place for all
activities of all transactions; and defer all those random writes to
data pages until the next checkpoint. So we shouldn't treat WAL like a
cost burden that we want to avoid in every case we can.

But in the data load case (where many checkpoints may happen during a
single transaction anyway), it happens that avoiding WAL is a
performance win, because the seeks are not the dominant cost.

Regards,
Jeff Davis




Re: [HACKERS] lazy vxid locks, v3

2011-08-04 Thread Jeff Davis
On Mon, 2011-08-01 at 12:12 -0400, Robert Haas wrote:
 I guess you could look at it that way.  It just seemed like the obvious
 way to write the code: we do LockRefindAndRelease() only if we have a
 fast-path lock that someone else has pushed into the main table.

OK, looks good to me. Marked ready for committer.

Regards,
Jeff Davis




Re: [HACKERS] Transient plans versus the SPI API

2011-08-04 Thread Jeff Davis
On Tue, 2011-08-02 at 16:47 -0400, Tom Lane wrote:
 The most straightforward way to reimplement things within spi.c would be
 to redefine SPI_prepare as just doing the parse-and-rewrite steps, with
 planning always postponed to SPI_execute.  In the case where you just
 prepare and then execute a SPIPlan, this would come out the same or
 better, since we'd still just do one planning cycle, but the planner could
 be given the actual parameter values to use.  However, if you SPI_prepare,
 SPI_saveplan, and then SPI_execute many times, you might come out behind.
 This is of course the same tradeoff we are going to impose at the SQL level
 anyway, but I wonder whether there needs to be a control knob available to
 C code to retain the old plan-once-and-always-use-that-plan approach.

Would there ultimately be a difference between the way SPI_prepare and
PQprepare work? It seems like the needs would be about the same, so I
think we should be consistent.

Also, I assume that SPI_execute and PQexecParams would always force a
custom plan, as they always have, right?

A control knob sounds limited. For instance, what if the application
knows that some parameters will be constant over the time that the plan
is saved? It would be nice to be able to bind some parameters to come up
with a generic (but less generic) plan, and then execute it many times.
Right now that can only be done by inlining such constants in the SQL,
which is what we want to avoid.

I'm a little bothered by prepare sometimes planning and sometimes not
(and, by implication, execute_plan sometimes planning and sometimes
not). It seems cleaner to just separate the steps into parse+rewrite,
bind parameters, plan (with whatever parameters are present, giving a
more generic plan when some aren't specified), and execute (which would
require you to specify any parameters not bound yet). Maybe we don't
need to expose all of those steps (although maybe we do), but it would
be nice if the API we do offer resembles those steps.
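Those steps might look roughly like this minimal sketch (invented names; this is not the real SPI interface):

```python
# Staged pipeline: parse+rewrite, bind some parameters, plan against
# whatever is bound so far, then execute with the rest.
def parse_rewrite(sql):
    return {"sql": sql, "bound": {}}

def bind(stmt, **params):
    return {"sql": stmt["sql"], "bound": {**stmt["bound"], **params}}

def plan(stmt):
    # the plan is specialized on the constants bound so far; unbound
    # parameters leave it more generic
    return {"stmt": stmt, "specialized_on": sorted(stmt["bound"])}

def execute(p, **params):
    values = {**p["stmt"]["bound"], **params}
    return ("run", p["specialized_on"], values)

stmt = bind(parse_rewrite("SELECT * FROM t WHERE a = $1 AND b = $2"), a=1)
p = plan(stmt)          # generic in b, specialized on a
print(execute(p, b=2))  # ('run', ['a'], {'a': 1, 'b': 2})
```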

Regards,
Jeff Davis




Re: [HACKERS] Transient plans versus the SPI API

2011-08-04 Thread Jeff Davis
On Wed, 2011-08-03 at 12:19 -0400, Tom Lane wrote:
 Of course we could address the worst cases by providing some mechanism
 to tell the plancache code always use a generic plan for this query
 or always use a custom plan.  I'm not entirely thrilled with that,
 because it's effectively a planner hint and has got the same problems
 as all planner hints, namely that users are likely to get it wrong.

I'm not entirely convinced by that. It's fairly challenging for a human
to choose a good plan for a moderately complex SQL query, and its much
more likely that the plan will become a bad one over time. But, in many
cases, a developer knows if they simply don't care about planning time,
and are willing to always replan.

Also, we have a fairly reasonable model for planning SQL queries, but
I'm not sure that the model for determining whether to replan a SQL
query is quite as clear. Simon brought up some useful points along these
lines.

Regards,
Jeff Davis




Re: [HACKERS] Transient plans versus the SPI API

2011-08-04 Thread Jeff Davis
On Wed, 2011-08-03 at 13:07 -0400, Robert Haas wrote:
 A little OT here, but (as I think Simon said elsewhere) I think we
 really ought to be considering the table statistics when deciding
 whether or not to replan.  It seems to me that the overwhelmingly
 common case where this is going to come up is when (some subset of)
 the MCVs require a different plan than run-of-the-mill values.  It
 would be nice to somehow work that out.

That blurs the line a little bit. It sounds like this might be described
as incremental planning, and perhaps that's a good way to think about
it.

Regards,
Jeff Davis




Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-04 Thread Jeff Davis
On Thu, 2011-08-04 at 17:46 -0400, Bruce Momjian wrote:
 Right.  I brought up SELECT INTO because you could make the argument
 that INSERT ... SELECT is not a utility command like the other ones and
 therefore can't be done easily, but CREATE TABLE AS is internal SELECT
 INTO and implemented in execMain.c, which I think is where INSERT ...
 SELECT would also be implemented.

The above statement is a little confusing, so let me start from the
beginning:

How could we avoid WAL logging for INSERT ... SELECT?

We can do it for CREATE TABLE AS because nobody can even *see* the
table unless our transaction commits. Therefore we don't need to
bother logging it. The same can be said for SELECT INTO.

INSERT ... SELECT is just an insert. It needs just as much logging as
inserting tuples any other way. For instance, it will potentially share
pages with other inserts, and it must properly record all such page
modifications so that the pages can be restored to a consistent state.

Regards,
Jeff Davis





