Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 01:30:19PM -0500, Justin Pryzby wrote:
> On Tue, Oct 24, 2017 at 01:27:14PM -0500, Kenneth Marshall wrote:
> > On Tue, Oct 24, 2017 at 01:14:53PM -0500, Justin Pryzby wrote:
> > > Note:
> > > I run a script which does various combinations of ANALYZE/VACUUM (FULL/ANALYZE)
> > > following the upgrade, and a script runs nightly with REINDEX and pg_repack
> > > (and a couple of CLUSTER), so you should assume that any combination of those
> > > maintenance commands have been run.
> > >
> > > In our reindex/repack log I found the first error due to duplicates:
> > > Tue Oct 24 01:27:53 MDT 2017: sites: sites_idx(repack non-partitioned)...
> > > WARNING: Error creating index "public"."index_61764": ERROR: could not create unique index "index_61764"
> > > DETAIL: Key (site_office, site_location)=(CRCLMT-DOEMS0, 1120) is duplicated.
> > > WARNING: Skipping index swapping for "sites", since no new indexes built
> > > WARNING: repack failed for "sites_idx"
> > > reindex: warning, dropping invalid/unswapped index: index_61764
> >
> > Hi Justin,
> >
> > This sounds like a pg_repack bug and not a PostgreSQL bug. What version are
> > you running?
>
> Really? pg_repack "found" and was victim to the duplicate keys, and rolled
> back its work. The CSV logs clearly show that our application INSERTed rows
> which are duplicates.
>
> [pryzbyj@database ~]$ rpm -qa pg_repack10
> pg_repack10-1.4.2-1.rhel6.x86_64
>
> Justin

Hi Justin,

I just dealt with a similar problem with pg_repack and a PostgreSQL 9.5 DB,
the exact same error. It seemed to be caused by a tuple visibility issue that
allowed the "working" unique index to be built, even though a duplicate row
existed. Then the next pg_repack would fail with the error you got. In our
case I needed the locality of reference to keep the DB performance acceptable,
and it was not a critical issue if there was a duplicate. We would remove the
duplicates if we had a failure. We never had a problem with the non-pg_repack
scenarios.

Regards,
Ken

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unique index violation after pg_upgrade to PG10
On Tue, Oct 24, 2017 at 01:14:53PM -0500, Justin Pryzby wrote:
> I upgraded another instance to PG10 yesterday and this AM found unique key
> violations.
>
> Our application is SELECTing FROM sites WHERE site_location=$1, and if it
> doesn't find one, INSERTs one (I know that's racy and not ideal). We ended up
> with duplicate sites, despite a unique index. We removed the duplicate rows
> and reindexed fine. This is just a heads up with all the detail I can fit in a
> mail (but there's more if needed).
> ...
> Note:
> I run a script which does various combinations of ANALYZE/VACUUM (FULL/ANALYZE)
> following the upgrade, and a script runs nightly with REINDEX and pg_repack
> (and a couple of CLUSTER), so you should assume that any combination of those
> maintenance commands have been run.
>
> In our reindex/repack log I found the first error due to duplicates:
> Tue Oct 24 01:27:53 MDT 2017: sites: sites_idx(repack non-partitioned)...
> WARNING: Error creating index "public"."index_61764": ERROR: could not create unique index "index_61764"
> DETAIL: Key (site_office, site_location)=(CRCLMT-DOEMS0, 1120) is duplicated.
> WARNING: Skipping index swapping for "sites", since no new indexes built
> WARNING: repack failed for "sites_idx"
> reindex: warning, dropping invalid/unswapped index: index_61764

Hi Justin,

This sounds like a pg_repack bug and not a PostgreSQL bug. What version are
you running?

Regards,
Ken
Re: [HACKERS] Hash Functions
On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote:
> Robert Haas writes:
> > Attached is a quick sketch of how this could perhaps be done (ignoring
> > for the moment the relatively-boring opclass pushups). It introduces
> > a new function hash_any_extended which differs from hash_any() in that
> > (a) it combines both b and c into the result and (b) it accepts a seed
> > which is mixed into the initial state if it's non-zero.
> >
> > Comments?
>
> Hm. Despite the comment at lines 302-304, I'm not sure that we ought
> to do this simply by using "b" as the high order bits. AFAICS that
> exposes little or no additional randomness; in particular it seems
> unlikely to meet Jenkins' original design goal that "every 1-bit and
> 2-bit delta achieves avalanche". There might be some simple way to
> extend the existing code to produce a mostly-independent set of 32 more
> bits, but I wonder if we wouldn't be better advised to just keep Jenkins'
> code as-is and use some other method entirely for producing the
> 32 new result bits.
>
> ... In fact, on perusing the linked-to page
> http://burtleburtle.net/bob/hash/doobs.html
> Bob says specifically that taking b and c from this hash does not
> produce a fully random 64-bit result. He has a new one that does,
> lookup3.c, but probably he hasn't tried to make that bit-compatible
> with the 1997 algorithm.

Hi,

The updated hash functions that we currently use are based on Bob Jenkins'
lookup3.c, and using b as the high-order bits is pretty darn good. I had
lobbied to present the 64-bit b+c hash in the original work for similar
reasons. We are definitely not using a lookup2.c version from 1997.

Regards,
Ken
Re: [HACKERS] WIP: Data at rest encryption
On Wed, Jun 14, 2017 at 12:04:26PM +0300, Aleksander Alekseev wrote:
> Hi Ants,
>
> On Tue, Jun 13, 2017 at 09:07:49AM -0400, Peter Eisentraut wrote:
> > On 6/12/17 17:11, Ants Aasma wrote:
> > > I'm curious if the community thinks this is a feature worth having?
> > > Even considering that security experts would classify this kind of
> > > encryption as a checkbox feature.
> >
> > File system encryption already exists and is well-tested. I don't see
> > any big advantages in re-implementing all of this one level up. You
> > would have to touch every single place in PostgreSQL backend and tool
> > code where a file is being read or written. Yikes.
>
> I appreciate your work, but unfortunately I must agree with Peter.
>
> On Linux you can configure full disk encryption using LUKS / dm-crypt
> in like 5 minutes [1]. On FreeBSD you can do the same using geli [2].
> In my personal opinion PostgreSQL is already complicated enough. A few
> companies that hired system administrators that are too lazy to read
> two or three man pages is not a reason to re-implement file system
> encryption (or compression, or mirroring if that matters) in any open
> source RDBMS.

Hi Aleksander,

While I agree that configuring full disk encryption is not technically
difficult, it requires much more privileged access to the system and
basically requires the support of a system administrator. In addition, if
a volume is not available for encryption, PostgreSQL support for encryption
would still allow its data to be encrypted and, as others have mentioned,
can be enabled by the DBA alone.

Regards,
Ken
Re: [HACKERS] Hash Functions
On Fri, May 12, 2017 at 02:23:14PM -0400, Robert Haas wrote:
> What about integers? I think we're already assuming two's-complement
> arithmetic, which I think means that the only problem with making the
> hash values portable for integers is big-endian vs. little-endian.
> That sounds solvable-ish.

xxHash produces identical hashes independent of big-endian versus
little-endian architecture.

https://github.com/Cyan4973/xxHash

Regards,
Ken
Re: [HACKERS] CTE inlining
On Wed, May 03, 2017 at 02:33:05PM -0300, Alvaro Herrera wrote:
> David Fetter wrote:
> > When we add a "temporary" GUC, we're taking on a gigantic burden.
> > Either we support it forever somehow, or we put it on a deprecation
> > schedule immediately and expect to be answering questions about it for
> > years after it's been removed.
> >
> > -1 for the GUC.
>
> Absolutely.
>
> So ISTM we have three choices:
>
> 1) We switch unmarked CTEs to inlineable by default in pg11. What seems
> likely to happen for a user that upgrades to pg11 is that 5 out of 10
> CTE-using queries are going to become faster than with pg10, and they
> are going to be happy; 4 out of 10 are going to see no difference, but
> they didn't have to do anything about it; and the remaining query is
> going to become slower, either indistinguishably so (in which case they
> don't care and they remain happy because of the other improvements) or
> notably so, in which case they can easily figure where to add the
> MATERIALIZED option and regain the original performance.
>
> 2) Unmarked CTEs continue to be an optimization barrier, but we add
> "WITH INLINED" so that they're inlineable. Some users may wonder about
> it and waste a lot of time trying to figure out which CTEs to add it to.
> They see a benefit in half the queries, which makes them happy, but they
> are angry that they had to waste all that time on the other queries.
>
> 3) We don't do anything, because we all agree that GUCs are not
> suitable. No progress. No anger, but nobody is happy either.

+1 for option 1. I just finished rewriting a well-written CTE query to
avoid the optimization fence and get reasonable performance.

Regards,
Ken
Re: [HACKERS] bytea_output output of base64
On Thu, Feb 23, 2017 at 03:52:46PM -0800, David Fetter wrote:
> On Thu, Feb 23, 2017 at 05:55:37PM -0500, Bruce Momjian wrote:
> > On Thu, Feb 23, 2017 at 04:08:58PM -0500, Tom Lane wrote:
> > > Bruce Momjian writes:
> > > > Is there a reason we don't support base64 as a bytea_output output
> > > > option, except that no one has implemented it?
> > >
> > > How about "we already have one too many bytea output formats"?
> > > I don't think forcing code to try to support still another one
> > > is a great thing ... especially not if it couldn't be reliably
> > > distinguished from the hex format.
> >
> > Is there a reason we chose hex over base64?
>
> Whether there was or not, there's not a compelling reason now to break
> people's software. When people want compression, methods a LOT more
> effective than base64 are common. Gzip, for example.
>
> Best,
> David.

First, hex encoding is very simple to perform. Second, most applications
have routines to handle it trivially. And third, base64 encoding has
padding constraints that can complicate its processing. As David suggests,
if you want compact, run it through lz4/gzip/lzop... for a much better
size return.

Regards,
Ken
Re: [HACKERS] Hash Indexes
On Mon, Sep 19, 2016 at 12:14:26PM +0530, Amit Kapila wrote:
> On Mon, Sep 19, 2016 at 11:20 AM, Mark Kirkwood wrote:
> > On 17/09/16 06:38, Andres Freund wrote:
> > > On 2016-09-16 09:12:22 -0700, Jeff Janes wrote:
> > > > On Thu, Sep 15, 2016 at 7:23 AM, Andres Freund wrote:
> > > > > One earlier question about this is whether that is actually a
> > > > > worthwhile goal. Are the speed and space benefits big enough in the
> > > > > general case? Could those benefits not be achieved in a more
> > > > > maintainable manner by adding a layer that uses a btree over
> > > > > hash(columns), and adds appropriate rechecks after heap scans?
> > > > >
> > > > > Note that I'm not saying that hash indexes are not worthwhile, I'm
> > > > > just doubtful that question has been explored sufficiently.
> > > >
> > > > I think that exploring it well requires good code. If the code is good,
> > > > why not commit it?
> > >
> > > Because getting there requires a lot of effort, debugging it afterwards
> > > would take effort, and maintaining it would also take a fair amount?
> > > Adding code isn't free.
> > >
> > > I'm rather unenthused about having a hash index implementation that's
> > > mildly better in some corner cases, but otherwise doesn't have much
> > > benefit. That'll mean we'll have to step up our user education a lot,
> > > and we'll have to maintain something for little benefit.
> >
> > While I see the point of what you are saying here, I recall all previous
> > discussions about hash indexes tended to go a bit like this:
> >
> > - until WAL logging of hash indexes is written it is not worthwhile trying
> >   to make improvements to them
> > - WAL logging will be a lot of work, patches 1st please
> >
> > Now someone has done that work, and we seem to be objecting that because
> > they are not improved then the patches are (maybe) not worthwhile.
>
> I think saying hash indexes are not improved after the proposed set of
> patches is an understatement. The read performance has improved by
> more than 80% as compared to HEAD [1] (refer data in Mithun's mail).
> Also, tests by Mithun and Jesper have indicated that in multiple
> workloads, they are better than BTREE by 30~60% (in fact Jesper
> mentioned that he is seeing 40~60% benefit on a production database,
> Jesper correct me if I am wrong.). I agree that when the index column is
> updated they are much worse than btree as of now, but no work has been
> done to improve it and I am sure that it can be improved for those cases
> as well.
>
> In general, I thought the tests done till now are sufficient to prove
> the importance of the work, but if Andres and others still have doubts
> and they want to test some specific cases, then sure we can do more
> performance benchmarking.
>
> Mark, thanks for supporting the case for improving Hash Indexes.
>
> [1] - https://www.postgresql.org/message-id/CAD__OugX0aOa7qopz3d-nbBAoVmvSmdFJOX4mv5tFRpijqH47A%40mail.gmail.com
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

+1

Throughout the years, I have seen benchmarks that demonstrated the
performance advantages of even the initial hash index (without WAL) over
the btree-over-hash variant. It is pretty hard to dismiss the O(1) versus
O(log(n)) difference. There are classes of problems for which a hash index
is the best solution. Lack of WAL has hamstrung development in those areas
for years.

Regards,
Ken
Re: [HACKERS] README of hash index
On Fri, Sep 16, 2016 at 04:50:53PM +0530, Amit Kapila wrote:
> Currently the README of the hash module contains algorithms written in
> the form below.
>
> The insertion algorithm is rather similar:
>
>     pin meta page and take buffer content lock in shared mode
>     loop:
>         compute bucket number for target hash key
>         release meta page buffer content lock
>         if (correct bucket page is already locked)
>             break
>         release any existing bucket page lock (if a concurrent split happened)
>         take heavyweight bucket lock in shared mode
>         retake meta page buffer content lock in shared mode
>     -- (so far same as reader)
>     release pin on metapage
>     ...
>
> I have mostly updated them in the patches I have proposed to improve
> hash indexes. However, each time I try to update them, I find that it
> is easier to follow the code than to read and understand the existing
> algorithm written in the above form from the README.
>
> Do others find it useful to maintain the algorithms in the above form?

Hi Amit,

Speaking for myself, I think it does help to have a text description of the
algorithm to use as a guide while trying to understand the code. For
beginners (me), it is not always obvious what a section of code is doing.

Regards,
Ken
[HACKERS] Hash index with larger hashes
Hello Developers,

I have been following the recent discussions on increasing the size of the
hash function used in PostgreSQL and the work to provide WAL logging and
performance improvements for hash indexes. I know it was mentioned when we
moved to the new hashing functions, but the existing functions do provide an
additional 32 bits of hash. We currently do not use them, but they are
already calculated.

It had occurred to me that one way to decrease the space used to store the
hash value would be to use information about the page number to determine
the actual value. For example, a hash index of 65k pages (540MB) would get
two additional bytes of hash with no associated storage cost. Also, if you
subdivided the hash page into, say, 128 sub-pages you would get the extra
2 bytes of hash in a 4MB index. In this way, the bigger the hash index is,
the more bits you get for free. Just wanted to throw it out there.

Regards,
Ken
Re: [HACKERS] pg_basebackup wish list
On Tue, Jul 12, 2016 at 11:06:39AM -0700, Jeff Janes wrote:
> On Tue, Jul 12, 2016 at 10:48 AM, Peter Eisentraut wrote:
> > On 7/12/16 12:53 PM, Jeff Janes wrote:
> > > The --help message for pg_basebackup says:
> > >
> > >     -Z, --compress=0-9     compress tar output with given compression level
> > >
> > > But -Z0 is then rejected as 'invalid compression level "0"'. The real
> > > docs do say 1-9, only the --help message has this bug. Trivial patch
> > > attached.
> >
> > pg_dump --help and man page say it supports 0..9. Maybe we should make
> > that more consistent.
>
> pg_dump actually does support -Z0, though. Well, sort of. It outputs
> plain text. Rather than plain text wrapped in some kind of dummy gzip
> header, which is what I had naively expected.
>
> Is that what -Z0 in pg_basebackup should do as well, just output
> uncompressed tar data, and not add the ".gz" to the "base.tar" file
> name?
>
> Cheers,
>
> Jeff

Hi,

Yes, please support the no-compression option. It can be useful in
situations where the bottleneck is the compression itself (quite easily
done with zlib-based options; another plug for a higher-performance
option).

Regards,
Ken
Re: EOL for 8.2 (was Re: [HACKERS] Formatting Curmudgeons WAS: MMAP Buffers)
On Thu, Apr 21, 2011 at 06:04:09PM +0100, Dave Page wrote:
> On Thu, Apr 21, 2011 at 5:59 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > [ man, this thread has totally outlived its title, could we change
> >   that?  I'll start with this subtopic ]
> >
> > Robert Haas <robertmh...@gmail.com> writes:
> > > In fact, I've been wondering if we shouldn't consider extending the
> > > support window for 8.2 past the currently-planned December 2011.
> > > There seem to be quite a lot of people running that release precisely
> > > because the casting changes in 8.3 were so painful, and I think the
> > > incremental effort on our part to extend support for another year
> > > would be reasonably small.  I guess the brunt of the work would
> > > actually fall on the packagers.  It looks like we've done 5 point
> > > releases of 8.2.x in the last year, so presumably if we did decide to
> > > extend the EOL date by a year or so that's about how much incremental
> > > effort would be needed.
> >
> > I agree that the incremental effort would not be so large, but what
> > makes you think that the situation will change given another year?
> > My expectation is that'd just mean people will do nothing about
> > migrating for a year longer.
> >
> > More generally: it took a lot of argument to establish the current EOL
> > policy, and bending it the first time anyone feels any actual pain will
> > pretty much destroy the whole concept.
>
> It would also make at least one packager very unhappy as the 8.2 Windows
> build is by far the hardest and most time consuming to do and I happen
> to know he's been counting the days until it goes.
>
> More generally, keeping it for longer means we might end up supporting 6
> major releases at once. That may not be so much work on a day to day
> basis, but it adds up to a lot at release times, which was one of the
> reasons why we agreed on the 5 year support window.
>
> --
> Dave Page
> Blog: http://pgsnake.blogspot.com
> Twitter: @pgsnake
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

+1 for cutting the cord on 8.2. People still using it will need to use the
last release available, upgrade, or consult to have a back-port/build made.

Regards,
Ken
Re: [HACKERS] Correctly producing array literals for prepared statements
On Wed, Feb 23, 2011 at 09:34:06AM -0600, Merlin Moncure wrote:
> On Tue, Feb 22, 2011 at 10:16 PM, Peter Geoghegan <peter.geoghega...@gmail.com> wrote:
> > I'm investigating the possibility of developing a utility function for
> > our C++ client library, libpqxx, that produces array literals that can
> > be used in prepared statements. This problem appears to be a bit of a
> > tar pit, so I'm hoping that someone can help me out. My goal is to
> > produce a template function that accepts arbitrarily nested standard
> > library containers, that contain at the most nested level
> > constants/literals of some type that can be fed into a stream, such as
> > an int or a std::string.
> >
> > I'm aware that I cannot assume that types are delimited by a single
> > quote, even for built-in types. I thought that I would put the onus on
> > the client to specify the correct delimiter, by checking pg_type
> > themselves if necessary, but default to ','. Is this a reasonable
> > approach?
> >
> > Escaping/quoting individual elements seems tricky. I have produced a
> > generic and superficially well-behaved implementation by using double
> > quotes for constants. However, I have now opened the door to malicious
> > parties injecting multiple array elements where only one is allowed,
> > or causing malformed array literal errors by simply including a double
> > quote of their own. It's not clear where the responsibility should
> > rest for escaping constants/ensuring that constants don't contain
> > double quotes. Can someone suggest a better approach? I can't very
> > well use single quotes, because they are escaped/doubled up when we
> > pass the array literal to something similar to PQexecPrepared(), and
> > they shouldn't be -- strings end up looking like this: 'has errant
> > single quotes on either side'.
>
> You can send nested arrays safely. You just have to be very formal about
> escaping *everything* both as you get it and as it goes into the
> container. This is what postgres does on the backend as it sends arrays
> out the door in text. It might be instructive to see what the server
> does in terms of escaping. Note that the way this works it's not
> impossible to see 128+ consecutive backslashes when dealing with arrays
> of composites.
>
> > Since Postgres only supports encodings that are ASCII supersets, I
> > don't believe that I have to consider encoding -- only my clients do.
> > Can someone please point me in the direction of an established client
> > library/driver where all corner cases are covered, or at least enough
> > of them to produce a net gain in usefulness? There may well be
> > additional subtleties that have not occurred to me.
>
> yes: libpqtypes. it manages everything in binary. i've been thinking for
> a while that libpqtypes could be wrapped with variadic templates or
> other c++ trickery. Because libpqtypes does everything in binary, it
> completely sidesteps all the escaping nastiness.
>
> merlin

Avoiding the escaping by using binary parameter transmission is the best
method. Shameless plug: libpqtypes is great! I hope that it can eventually
be included in the core distribution. It is not uncommon to get an "It's
an add-on package???" reaction, and avoidance of pieces outside of the
standard distribution regardless of their value.

Regards,
Ken
Re: [HACKERS] Correctly producing array literals for prepared statements
On Wed, Feb 23, 2011 at 03:34:45PM -0500, Andrew Chernow wrote:
> On 2/23/2011 3:06 PM, Peter Geoghegan wrote:
> > On 23 February 2011 15:34, Merlin Moncure <mmonc...@gmail.com> wrote:
> > > You can send nested arrays safely. You just have to be very formal
> > > about escaping *everything* both as you get it and as it goes into
> > > the container. This is what postgres does on the backend as it sends
> > > arrays out the door in text. It might be instructive to see what the
> > > server does in terms of escaping. Note that the way this works it's
> > > not impossible to see 128+ consecutive backslashes when dealing with
> > > arrays of composites.
> >
> > Sounds tedious.
>
> It is tedious, which is one reason why libpqtypes went binary. There are
> some compelling performance reasons as well that affect both client and
> server. libpqtypes was originally developed to serve a very particular
> need and wasn't aiming to be general purpose. That came about along the
> way trying to solve the problem.
>
> Personally, PQexec is dead to me, as well as text results from a C/C++
> app. I see no advantage over libpqtypes in that context. Unless I am
> missing your ultimate goal, you'd probably get what you want by wrapping
> libpqtypes.

The performance is one of the big reasons to use binary parameters.
Converting/packing/transmitting/unpacking/converting uses a lot of CPU
resources on both the server and the client, in addition to the larger
communication resources needed by the text-based methods.

Ken
Re: [HACKERS] LIKE, CHAR(), and trailing spaces
On Wed, Feb 02, 2011 at 07:48:38PM -0500, Bruce Momjian wrote:
> Brendan Jurd wrote:
> > On 3 February 2011 10:54, Bruce Momjian <br...@momjian.us> wrote:
> > > It seems LIKE is considering the trailing CHAR(10) field spaces as
> > > significant, even though our documentation says:
> > > -- snip --
> > >
> > > It says trailing spaces are not significant for character comparisons
> > > --- the real question is whether LIKE is a comparison.  Obviously '='
> > > is a comparison, but the system does not treat LIKE as a comparison
> > > in terms of trailing spaces.  Is that desired behavior?
> >
> > Interesting. I would have to say that from the user point of view,
> > LIKE is definitely a comparison, and if the rest of the operators on
> > bpchar ignore whitespace then LIKE ought to as well.
> >
> > Is the situation the same for regex matches (~ operators)?
>
> Yes, I think so:
>
> test=> SELECT 'a'::char(10) ~ 'a$';
>  ?column?
> ----------
>  f
> (1 row)
>
> test=> SELECT 'a'::char(10) ~ 'a *$';
>  ?column?
> ----------
>  t
> (1 row)
>
> --
> Bruce Momjian <br...@momjian.us>  http://momjian.us
> EnterpriseDB                      http://enterprisedb.com

In my mind LIKE/~ are pattern matching operators and not simple comparison
operators. PostgreSQL is doing the right thing in restricting the somewhat
bizarre treatment of trailing spaces to the '=' comparison. I can only
imagine what would be needed to allow exceptions to the pattern matching
syntax to let you actually work with and match the trailing spaces
otherwise.

+10 for leaving the behavior as is.

Regards,
Ken
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
On Thu, Dec 23, 2010 at 02:33:12AM -0500, Jie Li wrote:
> Hi,
>
> Here is the test table,
>
> postgres=# \d big_wf
>      Table "public.big_wf"
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  age    | integer |
>  id     | integer |
>
> postgres=# \dt+ big_wf
>                       List of relations
>  Schema |  Name  | Type  |  Owner   |  Size  | Description
> --------+--------+-------+----------+--------+-------------
>  public | big_wf | table | workshop | 142 MB |
>
> The first query sorting on one column:
>
> postgres=# explain analyze select * from big_wf order by age;
>  Sort  (cost=565525.45..575775.45 rows=410 width=8)
>        (actual time=11228.155..16427.149 rows=410 loops=1)
>    Sort Key: age
>    Sort Method: external sort  Disk: 72112kB
>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=410 width=8)
>        (actual time=6.196..4797.620 rows=410 loops=1)
>  Total runtime: 19530.452 ms
> (5 rows)
>
> The second query sorting on two columns:
>
> postgres=# explain analyze select * from big_wf order by age,id;
>  Sort  (cost=565525.45..575775.45 rows=410 width=8)
>        (actual time=37544.779..48206.702 rows=410 loops=1)
>    Sort Key: age, id
>    Sort Method: external merge  Disk: 72048kB
>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=410 width=8)
>        (actual time=6.796..5518.663 rows=410 loops=1)
>  Total runtime: 51258.000 ms
> (5 rows)
>
> The version is 9.0.1 and the work_mem is 20MB. One special thing is, the
> first column (age) of all the tuples has the same value, so the second
> column (id) is always needed for comparison. While the first sort takes
> only about 6 seconds, the second one takes over 30 seconds. Is this much
> more than expected? Is there any possible optimization?
>
> Thanks,
> Li Jie

Hi Li,

If I understand your description, in the first query the sort does not
actually have to do anything because the column values for age are all
degenerate. In the second query, you actually need to sort the values,
which is why it takes longer. If the first column values are the same,
then simply sorting by id alone would be faster. You could also bump up
work_mem for the query to perform the sort in memory.

Regards,
Ken
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
On Thu, Dec 23, 2010 at 10:19:46PM +0800, Li Jie wrote:
> Hi Ken,
>
> Thanks for your tips! Yes it is the case, and I ran another query
> sorting on the second column, whose values are random.
>
> postgres=# explain analyze select * from big_wf order by id;
>  Sort  (cost=565525.45..575775.45 rows=410 width=8)
>        (actual time=25681.875..36458.824 rows=410 loops=1)
>    Sort Key: id
>    Sort Method: external merge  Disk: 72048kB
>    ->  Seq Scan on big_wf  (cost=0.00..59142.00 rows=410 width=8)
>        (actual time=8.595..5569.500 rows=410 loops=1)
>
> Now the sorting takes about 20 seconds, so it seems reasonable compared
> to 30 seconds, right? But one thing I'm confused about is, why is the
> additional comparison really so expensive? Does it incur additional I/O?
> From the cost model, it seems not -- all the costs are the same
> (575775.45).
>
> Thanks,
> Li Jie

In the first query, the cost is basically the I/O cost to read the table
from disk. The actual sort does not do anything since the sort values are
the same. In the second query, the sort has to swap things in memory/disk
to get them in the correct order for the result. This actually takes CPU
and possibly additional I/O, which is why it is slower. In the case of
sorting by just the id column, the size of the sorted values is smaller,
which would need fewer batches to complete the sort since the sort is
bigger than work_mem.

Cheers,
Ken
Re: [HACKERS] Why is sorting on two columns so slower than sorting on one column?
On Thu, Dec 23, 2010 at 10:42:26PM +0800, Li Jie wrote:

----- Original Message -----
From: Kenneth Marshall k...@rice.edu
To: Li Jie jay23j...@gmail.com
Cc: pgsql-hackers pgsql-hackers@postgresql.org
Sent: Thursday, December 23, 2010 10:30 PM
Subject: Re: [HACKERS] Why is sorting on two columns so much slower than sorting on one column?

In the case of sorting by just the id column, the size of the sorted values is smaller, which would need fewer batches to complete the sort, since the sort is bigger than work_mem.

Cheers,
Ken

Hi Ken,

Thanks for your analysis. But in the last query that sorts on id, since the query selects all the columns for output, the actual sorted size is the same, and the only difference is the comparison cost. The query sorting on two columns needs to do twice the comparison. Am I right?
Thanks,
Li Jie

I think you are right. Sorry for the confusion.

Ken
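Li Jie's point — the sorted rows are the same size either way, and the extra cost is in the comparisons — can be made concrete by counting element comparisons during a sort. A toy model (a hypothetical Counted wrapper, not the server's tuplesort):

```python
import random

calls = 0  # global count of element-level comparisons

class Counted:
    """Wraps a value and counts comparisons made against it."""
    def __init__(self, v):
        self.v = v
    def __eq__(self, other):
        global calls
        calls += 1
        return self.v == other.v
    def __lt__(self, other):
        global calls
        calls += 1
        return self.v < other.v

random.seed(1)
ids = [random.randrange(10 ** 6) for _ in range(2000)]

calls = 0
sorted([(Counted(i),) for i in ids])                  # ORDER BY id
single = calls

calls = 0
sorted([(Counted(30), Counted(i)) for i in ids])      # ORDER BY age, id
double = calls

# The constant leading column forces an extra (always-equal) comparison
# before every id comparison, so the two-column sort does more work
# even though the output ordering is identical.
assert double > single
```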
Re: [HACKERS] [FeatureRequest] Base Convert Function
On Tue, Dec 21, 2010 at 11:28:17PM +0200, Pavel Golub wrote:

Hello, Pavel. You wrote:

PS Hello
PS On 21 December 2010 21:11, Tom Mudruňka to...@mudrunka.cz wrote:

Thx for your answers :-) Well... I know that I can write my own plugin and I am familiar with C, so this is not the problem, but I think that such a feature should be implemented directly in PgSQL, because there are already functions for converting to/from base 16, so why not make this more flexible and generalize it to any other radix? It's quite simple to do, and I don't see any reason why 16 should be there and 8, 32 or 36 shouldn't :-)

PS * It isn't a typical or frequent request,
PS * There are no hard barriers to a custom implementation,
PS * You can use plperl or plpython based solutions,
PS * It's not part of ANSI SQL

But MySQL has such a function. What's wrong with us? ;)

You are not really helping to make a good case... :)

Ken
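The generalization being requested — to_hex extended to any radix — really is a few lines in any language, which supports the "it's quite simple to do" claim. A hypothetical sketch (Python here, not the proposed SQL function):

```python
DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'

def to_base(n, base):
    """Render integer n in the given radix, 2..36 (to_hex, generalized)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError('base must be between 2 and 36')
    if n == 0:
        return '0'
    sign, n = ('-', -n) if n < 0 else ('', n)
    out = []
    while n:
        n, r = divmod(n, base)      # peel off the least significant digit
        out.append(DIGITS[r])
    return sign + ''.join(reversed(out))
```

For example, to_base(255, 16) gives 'ff', matching to_hex, while the same code handles base 8, 32, or 36.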
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote:

David Fetter da...@fetter.org writes:
On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote:
I think you mean Unicode is not a superset of all character sets. I've heard this before but never found what's missing. [citation needed]?

Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings.

[citation needed]? Exactly what characters are missing, and why would the Unicode people have chosen to leave them out? It's not like they've not heard of those encodings, I'm sure.

regards, tom lane

Here is an interesting description of some of the gotchas: http://en.wikipedia.org/wiki/Windows-1252

Regards,
Ken
Re: [HACKERS] Extensions, patch v20 (bitrot fixes)
On Mon, Dec 20, 2010 at 03:08:48PM -0500, Tom Lane wrote:

Kenneth Marshall k...@rice.edu writes:
On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote:
[citation needed]? Exactly what characters are missing, and why would the Unicode people have chosen to leave them out? It's not like they've not heard of those encodings, I'm sure.

Here is an interesting description of some of the gotchas: http://en.wikipedia.org/wiki/Windows-1252

Well, it's interesting, but I see no glyphs on that page that lack Unicode assignments.

regards, tom lane

You are correct. I misread the text.

Regards,
Ken
Re: [HACKERS] Default mode for shutdown
On Wed, Dec 15, 2010 at 09:39:12AM -0500, Tom Lane wrote:

Magnus Hagander mag...@hagander.net writes:
I'm sure this has been up before, but hey, let's take it another round. Why don't we change the default shutdown mode for pg_ctl from smart to fast? I've never come across a single use case where smart is what people *want*...

Really? Personally I'm quite happy with that default.

regards, tom lane

+1. I think the default is perfect. Even if fast is the mode people want, it should be requested explicitly each time, to verify that a more destructive shutdown is intended. If it is really an issue, a script or shell alias can be defined to perform the more aggressive shutdown.

Regards,
Ken
Re: [HACKERS] Why percent_rank is so slower than rank?
On Thu, Dec 09, 2010 at 05:18:57PM -0500, Tom Lane wrote:

I wrote:
We're throwing away one tuple at a time as we advance forward through the tuplestore, and moving 10+ tuple pointers each time. Ugh. This code was all right when written, because (IIRC) the mergejoin case was actually the only caller. But it's not all right for WindowAgg's less-predictable usage patterns.

I thought for a bit about changing things around so that the first-used tuple slot isn't necessarily state->memtuples[0], but just like the comment says, that complicates a lot of other logic. And there isn't any easy place to reclaim the wasted slots later. What seems like the best bet is to put in a heuristic to make tuplestore_trim simply not do anything until nremove reaches some reasonably large amount, perhaps 10% of the number of stored tuples. This wastes up to 10% of the allotted memory, but that seems tolerable.

On reflection I think just not doing anything isn't a very good idea. The problem with that is that a mis-coded caller could try to fetch tuples that it had already told the tuplestore could be trimmed away; and this would work, most of the time, until you got unlucky and the trim operation had actually deleted them. I think it's pretty important for bug-catching purposes that the tuplestore enforce that those tuples are not available anymore. Hence the attached patch, which combines the two ideas by recycling tuples immediately but not sliding the pointer array until a reasonable amount of movement has occurred. This fixes the complained-of performance problem AFAICT.

I'm not sure whether or not to back-patch this into 9.0 and 8.4. The code in tuplestore.c hasn't changed at all since 8.4, so there's not much risk of cross-version bugs, but if I did miss anything we could be shipping a buggy version next week. Thoughts?

regards, tom lane

+1 for back-patching.
Ken
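Tom's combined approach — recycle each trimmed tuple immediately (so stale reads fail fast) but pay the O(n) pointer-array slide only once enough slots accumulate — can be modeled roughly like this (a toy simplification of tuplestore.c, with invented names):

```python
class LazyTrimStore:
    """Toy model: free removed entries at once, slide the array lazily."""
    TRIM_FRACTION = 0.10  # slide only when >=10% of slots are reclaimable

    def __init__(self, tuples):
        self.memtuples = list(tuples)
        self.base = 0      # logical index of memtuples[0]
        self.freed = 0     # leading slots recycled but not yet slid away

    def trim(self, upto):
        """Logically discard all tuples with index < upto."""
        # Recycle tuples immediately, so a miscoded caller is caught...
        for i in range(self.base + self.freed, upto):
            self.memtuples[i - self.base] = None
        self.freed = max(self.freed, upto - self.base)
        # ...but pay the O(n) slide only once enough has accumulated.
        if self.freed >= self.TRIM_FRACTION * len(self.memtuples):
            del self.memtuples[:self.freed]
            self.base += self.freed
            self.freed = 0

    def get(self, i):
        if i < self.base:
            raise LookupError('tuple was trimmed')
        t = self.memtuples[i - self.base]
        if t is None:
            raise LookupError('tuple was trimmed')  # bug-catching behavior
        return t

store = LazyTrimStore(range(100))
store.trim(5)                 # 5 < 10%: recycled, but no slide yet
assert store.get(50) == 50
store.trim(20)                # 20 >= 10%: now the array slides
assert store.get(20) == 20
```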
Re: [HACKERS] Suggesting a libpq addition
On Mon, Dec 06, 2010 at 10:14:55AM -0500, Merlin Moncure wrote:

On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer m...@msys.ch wrote:
On 06.12.10 15:37, Merlin Moncure wrote:
On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander mag...@hagander.net wrote:
On Sun, Dec 5, 2010 at 10:22, Marc Balmer m...@msys.ch wrote:

I am suggesting adding a function to libpq:

PGresult *PQvexec(PGconn *conn, const char *fmt, ...);

It behaves similar to PQexec, but it allows for printf style varargs and

How is that not a horrible idea, compared to using PQexecParams()? You have to remember to do all your escaping and things manually, whereas PQexecParams() does it automatically.

It's only horrible if you stick to printf style formatting and you are using string techniques to inject parameters into the query. Non-parameterized queries should obviously be discouraged. However, it's entirely possible to wrap the parameterized interfaces with a vararg interface (I should know, because we did exactly that) :-). This gives you the best of both worlds: easy coding without sacrificing safety. You might not remember the libpqtypes proposal, but libpq was specifically extended with callbacks so that libpqtypes could exist, after the community determined that libpqtypes was too big a change to the libpq library. I think ultimately this should be revisited, with libpqtypes going in core or something even richer... I've been thinking for a while that postgres types should be abstracted out of the backend into a library that both client and server depend on.

With libpqtypes, we decided to use postgres style format markers:

select PQexecf(conn, "select %int4 + %int8", an_int, a_bigint);

Everything is schema qualified, so that user types are supported (of course, this requires implementing handling on the client). Data is routed through the binary protocol, with all the byte swapping etc. handled by the library. No escaping necessary.
We also added full support for arrays and composites, which are a nightmare to deal with over straight libpq, and various other niceties like thread-safe error handling.

That would be a *HUGE* piece of software compared to the relatively small thing I am suggesting...

well, it's already written. All you would have to do is compile it.

As for escaping (or not escaping) of string arguments, that can be seen as a bug or a feature. I do not want automatic escaping of string arguments in all cases, e.g. I might construct an SQL statement with dynamic parts like "WHERE xy" or "AND a = b".

libpqtypes doesn't escape at all. It uses the internal parameterized interfaces that don't require it. For particular types, like bytea and timestamps, this is much faster because we use the binary wire format. Less load on the client and the server.

hypothetical example:

    filter = "WHERE name like 'Balmer%'";
    if (sort == SORT_DESC)
        sort = "ORDER BY name DESCENDING";
    PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort);

So what I am aiming at right now is a PQvexec() function that basically has printf() like semantics, but adds an additional token to the format string (printf uses %s and %b to produce strings). I am thinking of adding %S and %B, which produce strings that are escaped. That would be a small function, and reasonably safe. Or rather, the safety is in the hands of the programmer.

What you are suggesting doesn't provide a lot of value over sprintf-ing the query first, then exec-ing it. You can do what you are suggesting yourself, wrapping PQexec. A hypothetical wrapper would be implemented something like:

    va_list ap;
    char buf[BUFSZ];

    va_start(ap, query);
    vsnprintf(buf, BUFSZ, query, ap);
    va_end(ap);
    return PQexec(conn, buf);

This is a bad idea (security, escaping, performance)... we wrote a faster, safer way to do it, with richer type support. Or you can do it yourself.

merlin

I have used the libpqtypes library and it is very easy to use.
+1 for adding it or something like it to the PostgreSQL core. I have people who will try to roll their own because it does not come with the core. While it is a hoot to see what reinventing the wheel produces, it is also prone to mistakes.

Regards,
Ken
Re: [HACKERS] Report: Linux huge pages with Postgres
On Sat, Nov 27, 2010 at 02:27:12PM -0500, Tom Lane wrote:

We've gotten a few inquiries about whether Postgres can use huge pages under Linux. In principle that should be more efficient for large shmem regions, since fewer TLB entries are needed to support the address space. I spent a bit of time today looking into what that would take. My testing was done with current Fedora 13, kernel version 2.6.34.7-61.fc13.x86_64 --- it's possible some of these details vary across other kernel versions.

You can test this with fairly minimal code changes, as illustrated in the attached not-production-grade patch. To select huge pages we have to include SHM_HUGETLB in the flags for shmget(), and we have to be prepared for failure (due to permissions or lack of allocated hugepages). I made the code just fall back to a normal shmget on failure. A bigger problem is that the shmem request size must be a multiple of the system's hugepage size, which is *not* a constant even though the test patch just uses 2MB as the assumed value. For a production-grade patch we'd have to scrounge the active value out of someplace in the /proc filesystem (ick).

I would expect that you can just iterate through the size possibilities pretty quickly and use the first one that works -- no /proc groveling.

In addition to the code changes there are a couple of sysadmin requirements to make huge pages available to Postgres:

1. You have to configure the Postgres user as a member of the group that's permitted to allocate hugepage shared memory. I did this:

    sudo sh -c "id -g postgres > /proc/sys/vm/hugetlb_shm_group"

For production use you'd need to put this in the PG initscript, probably, to ensure it gets re-set after every reboot and before PG is started.

Since it would take advantage of them automatically, this would be just a normal DBA/admin task.

2. You have to manually allocate some huge pages --- there doesn't seem to be any setting that says just give them out on demand.
I did this:

    sudo sh -c "echo 600 > /proc/sys/vm/nr_hugepages"

which gave me a bit over 1GB of space reserved as huge pages. Again, this'd have to be done over again at each system boot.

Same.

For testing purposes, I figured that what I wanted to stress was postgres process swapping and shmem access. I built current git HEAD with --enable-debug and no other options, and tested with these non-default settings:

    shared_buffers        1GB
    checkpoint_segments   50
    fsync                 off

(fsync intentionally off since I'm not trying to measure disk speed). The test machine has two dual-core Nehalem CPUs. Test case is pgbench at -s 25; I ran several iterations of "pgbench -c 10 -T 60 bench" in each configuration. And the bottom line is: if there's any performance benefit at all, it's on the order of 1%. The best result I got was about 3200 TPS with hugepages, and about 3160 without. The noise in these numbers is more than 1%, though. This is discouraging; it certainly doesn't make me want to expend the effort to develop a production patch. However, perhaps someone else can try to show a greater benefit under some other test conditions.

regards, tom lane

I would not really expect to see much benefit in the region that the normal TLB page size would cover with the typical number of TLB entries. 1GB of shared buffers would not be enough to cause TLB thrashing with most processors. Bump it to 8-32GB or more, and if the queries use up TLB entries with local work_mem, you should see more value in the patch.

Regards,
Ken
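The "scrounging" Tom mentions is just reading the Hugepagesize line out of /proc/meminfo. A sketch of that parsing (Python for illustration; a production patch would do this in C, and the sample text below is invented):

```python
SAMPLE_MEMINFO = """\
MemTotal:       16326496 kB
HugePages_Total:     600
Hugepagesize:       2048 kB
"""

def hugepage_size_kb(meminfo_text):
    """Return the active huge page size in kB, or None if not reported."""
    for line in meminfo_text.splitlines():
        if line.startswith('Hugepagesize:'):
            return int(line.split()[1])   # field layout: name, value, unit
    return None

# On a real Linux system: hugepage_size_kb(open('/proc/meminfo').read())
assert hugepage_size_kb(SAMPLE_MEMINFO) == 2048
```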
Re: [HACKERS] unlogged tables
On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert is probably going to object that he wanted to prevent any fsyncing for unlogged tables, but the discussion over in pgsql-general is crystal clear that people do NOT want to lose unlogged data over a clean shutdown and restart. If all it takes to do that is to refrain from lobotomizing the checkpoint logic for unlogged tables, I say we should refrain.

I think that's absolutely a bad idea.

The customer is always right, and I think we are hearing loud and clear what the customers want. Please let's not go out of our way to create a feature that isn't what they want.

regards, tom lane

I would be fine with only having a safe shutdown with unlogged tables and skipping the checkpoint I/O at all other times.

Cheers,
Ken
Re: [HACKERS] Protecting against unexpected zero-pages: proposal
On Tue, Nov 09, 2010 at 02:05:57PM -0500, Robert Haas wrote:

On Tue, Nov 9, 2010 at 12:31 PM, Greg Stark gsst...@mit.edu wrote:
On Tue, Nov 9, 2010 at 5:06 PM, Aidan Van Dyk ai...@highrise.ca wrote:
So, for getting checksums, we have to offer up a few things:
1) zero-copy writes; we need to buffer the write to get a consistent checksum (or lock the buffer tight)
2) saving hint bits on an otherwise unchanged page. We either need to just not write that page, and lose the work the hint bits did, or do a full-page WAL of it, so the torn-page checksum is fixed

Actually the consensus the last go-around on this topic was to segregate the hint bits into a single area of the page and skip them in the checksum. That way we don't have to do any of the above. It's just that that's a lot of work.

And it still allows silent data corruption, because bogusly clearing a hint bit is, at the moment, harmless, but bogusly setting one is not. I really have to wonder how other products handle this. PostgreSQL isn't the only database product that uses MVCC - not by a long shot - and the problem of detecting whether an XID is visible to the current snapshot can't be ours alone. So what do other people do about this? Either they don't cache the information about whether the XID is committed in-page (in which case, are they just slower, or do they have some other means of avoiding the performance hit?) or they cache it in the page (in which case, they either WAL-log it or they don't checksum it). I mean, there aren't any other options, are there?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

That would imply that we need to have a CRC for just the hint-bit section, or some type of ECC calculation that can detect bad hint bits independent of the CRC for the rest of the page.

Regards,
Ken
Re: [HACKERS] Hash support for arrays
On Thu, Nov 04, 2010 at 10:00:40AM +0000, Dean Rasheed wrote:

On 3 November 2010 09:24, Nicolas Barbier nicolas.barb...@gmail.com wrote:
2010/11/2 Kenneth Marshall k...@rice.edu:
Given that our hash implementation mixes the input data well (It does. I tested it.), then a simple rotate-and-xor method is all that should be needed to maintain all of the needed information. The original hash function has done the heavy lifting in this case.

Even with the perfect hash function for the elements, certain combinations of elements could still lead to massive collisions. E.g., if repeated values are typical in the input data we are talking about, then the rotate-and-xor method would still lead to collisions between any arrays of the same values of certain lengths, regardless of the value. In Tom's implementation, as he mentioned before, those problematical lengths would be multiples of 32 (e.g., an array of 32 1s would collide with an array of 32 2s would collide with an array of 32 3s, etc.).

Yeah, rotate-and-xor is a pretty weak hashing algorithm, since any array of 32 identical elements will hash to either 0 or -1. Similarly, various permutations or multiples of that array length will cause it to perform badly. The multiply-by-m algorithm doesn't have that weakness, provided m is chosen carefully. There are a couple of qualities a good algorithm should possess:

1). The bits from the individual element hash values should be distributed evenly, so that no 2 different hash values would result in the same contribution to the final value. This is easy to achieve - just make sure that m is odd.

2). The way that each element's hash value bits are distributed should be different from the way that every other element's hash value bits are distributed. m=31 achieves this pretty well, although there are plenty of other equally valid choices.
Regards,
Dean

Hi Dean,

In my comment yesterday, I included a simple function that would allow us to leverage our current hash function's mixing to scramble the bits effectively and retain the maximum amount of information in the hash.

Regards,
Ken
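Dean's "0 or -1" observation is easy to verify: after 32 rotate-by-one steps, each bit of the result is the XOR of all 32 bits of the repeated element hash, i.e. its parity. A quick check of both combining schemes (hypothetical helper names, 32-bit arithmetic):

```python
MASK = 0xFFFFFFFF

def combine_rotxor(hashes):
    """Rotate the running value left by 1, then XOR in each element hash."""
    h = 0
    for x in hashes:
        h = (((h << 1) & MASK) | (h >> 31)) ^ x
    return h

def combine_mul31(hashes):
    """h = h*31 + x, the Java-style multiply-by-m scheme under discussion."""
    h = 0
    for x in hashes:
        h = (h * 31 + x) & MASK
    return h

# 32 copies of *any* value collapse to 0 or 0xFFFFFFFF under rotate-and-xor:
assert combine_rotxor([5] * 32) == 0       # popcount(5) is even -> all zeros
assert combine_rotxor([7] * 32) == MASK    # popcount(7) is odd  -> all ones
# ...while the multiplier keeps distinct repeated values apart:
assert combine_mul31([5] * 32) != combine_mul31([7] * 32)
```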
Re: [HACKERS] Hash support for arrays
On Wed, Nov 03, 2010 at 10:24:16AM +0100, Nicolas Barbier wrote:

2010/11/2 Kenneth Marshall k...@rice.edu:
Given that our hash implementation mixes the input data well (It does. I tested it.), then a simple rotate-and-xor method is all that should be needed to maintain all of the needed information. The original hash function has done the heavy lifting in this case.

Even with the perfect hash function for the elements, certain combinations of elements could still lead to massive collisions. E.g., if repeated values are typical in the input data we are talking about, then the rotate-and-xor method would still lead to collisions between any arrays of the same values of certain lengths, regardless of the value. In Tom's implementation, as he mentioned before, those problematical lengths would be multiples of 32 (e.g., an array of 32 1s would collide with an array of 32 2s would collide with an array of 32 3s, etc.).

Nicolas

True. I just took another look at our defined hash functions, and it looks like we can make a simple variant of hash_uint32() that we can use as a stream checksum. The only thing missing is the ability to pass in the current 32-bit hash value as a starting seed to add the next 32-bit value. Something like this should work:

Datum
hash_uint32(uint32 k, uint32 initval)
{
    register uint32 a, b, c;

    a = b = c = 0x9e3779b9 + (uint32) sizeof(uint32) + 3923095 + initval;
    a += k;

    final(a, b, c);

    /* report the result */
    return UInt32GetDatum(c);
}

Then if you pass in the current value as the initval, it should mix each additional 32-bit hash value well.

Regards,
Ken
Re: [HACKERS] Hash support for arrays
On Tue, Nov 02, 2010 at 04:42:19PM -0400, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Really? I think "I don't understand when this fails" isn't obviously better than being able to predict when it fails ...

Isn't that the whole point of hash functions? Collisions are inevitable, but you want them to be unpredictable. If you want a hash function with predictable collision behavior, just hash the first element. It'll be highly predictable AND wicked fast.

That seems like a rather poor straw man, since it suffers from exactly the defect I'm complaining about, namely failing to consider all the input values equally.

I'd be happier about this approach if there were some actual theory behind it ... maybe there's some analysis out there, but the one link that was given was just about entirely unconvincing.

I think it's from Knuth, though unfortunately I don't have a copy to check. There are probably better algorithms out there, but this one's pretty simple.

I don't see anything in Knuth suggesting a multiplier of 31. His recommendation for a multiplier, if you're going to use multiplicative hashing, is wordsize/phi (phi being the golden ratio) ... and he also wants you to keep the high-order, not the low-order, bits of the product. However, this is largely beside the point, because that theory, as well as the Java code you're arguing from, has to do with the initial hashing of a raw sequence of input items, not with combining some existing hash values. The rotate-and-xor method I suggested for that is borrowed exactly from section 6.4 of Knuth (page 512, in the first edition of volume 3).

regards, tom lane

Given that our hash implementation mixes the input data well (It does. I tested it.), then a simple rotate-and-xor method is all that should be needed to maintain all of the needed information. The original hash function has done the heavy lifting in this case.
Regards,
Ken
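Tom's description of Knuth's multiplicative scheme — multiply by wordsize/phi and keep the high-order bits — in executable form. Note that the multiplier works out to the same 0x9e3779b9 golden-ratio constant that appears in the hash_uint32() sketch earlier in this thread (illustrative code, not PostgreSQL's):

```python
# Knuth multiplicative hashing: h(k) = top bits of (k * A) mod 2^32,
# with A = floor(2^32 / phi), phi being the golden ratio.
PHI = (1 + 5 ** 0.5) / 2
A = int(2 ** 32 / PHI)              # 2654435769 == 0x9e3779b9

def mult_hash(k, bits):
    """Hash k into a table of 2**bits slots, keeping the HIGH-order bits."""
    return ((k * A) & 0xFFFFFFFF) >> (32 - bits)

assert A == 0x9e3779b9
# Consecutive keys scatter instead of landing in consecutive slots:
assert mult_hash(1, 8) != mult_hash(2, 8)
```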
Re: [HACKERS] PostgreSQL and HugePage
On Wed, Oct 20, 2010 at 10:10:00AM -0400, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
On Tue, Oct 19, 2010 at 11:30 PM, daveg da...@sonic.net wrote:
On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
Heh - provided you specify SHM_HUGETLB in the relevant call, that is :-)

I had a patch for this against 8.3 that I could update if there is any interest. I suspect it is helpful.

I think it would be a good feature. Of course, we would need appropriate documentation, and some benchmarks showing that it really works.

I believe that for the equivalent Solaris option, we just automatically enable it when available. So there'd be no need for user documentation. However, I definitely *would* like to see some benchmarks proving that the change actually does something useful. I've always harbored the suspicion that this is just a knob to satisfy people who need knobs to frob.

regards, tom lane

Oracle apparently uses hugepages if they are available, by first trying the call with the SHM_HUGETLB option; if it fails, they reissue the call without it. This article mentions some of the benefits of larger page sizes with large shared memory regions: http://appcrawler.com/wordpress/?p=686

Regards,
Ken
Re: [HACKERS] Why do we have a database specification in .pgpass?
On Thu, Oct 14, 2010 at 06:09:35AM +0200, Dennis Björklund wrote:

We have a database specification in .pgpass:

hostname:port:database:username:password

What is the purpose of 'database', since username/password combinations are global, not per database? I would like to document its purpose. There is the GUC parameter db_user_namespace. Just for that you could probably use den...@foo as the username instead, so maybe that's not the purpose. But I can't think of any other reason.

/Dennis

This allows the same user to save, and automatically use, different passwords for each separate database.

Cheers,
Ken
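Ken's per-database use of the field might look like this in practice — a hypothetical .pgpass fragment (invented hostname and passwords; the `*` wildcard matches any database):

```
# hostname:port:database:username:password
db.example.com:5432:sales:appuser:sales-secret
db.example.com:5432:hr:appuser:hr-secret
db.example.com:5432:*:appuser:fallback-secret
```

libpq uses the first line whose fields all match the connection being attempted, so the more specific per-database lines must precede the wildcard.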
Re: [HACKERS] Another Modest Proposal: Platforms
On Wed, Sep 22, 2010 at 01:17:54PM -0700, David Fetter wrote:

Folks,

While it's interesting to note, in an historical sense, that a platform most recently updated when 1999 was still in the future, I think it's time we did a little pruning. We can start by supporting only platforms git runs on, this being the first in what I'd picture as a set of base requirements. What say?

Cheers,
David.

Given the amount of trouble I had getting git onto a Solaris 8 system, I am not too keen on this definition of "platform". PostgreSQL runs very well on the same system, along with SVN and CVS.

Cheers,
Ken
Re: [HACKERS] Why is time with timezone 12 bytes?
On Wed, Sep 22, 2010 at 10:54:53PM +0100, Thom Brown wrote:

On 22 September 2010 22:01, Josh Berkus j...@agliodbs.com wrote:
All,

I was just checking on our year-2027 compliance, and happened to notice that time with time zone takes up 12 bytes. This seems peculiar, given that timestamp with time zone is only 8 bytes, and by my count we only need 5 for the time with microsecond precision. What's up with that? Also, what is the real range of our 8-byte *integer* timestamp?

The time is 8 bytes (1,000,000 microseconds * 60 minutes * 24 hours = 1,440,000,000 microseconds = 31 bits = 8 bytes).

31 bits = approx. 4 bytes at 8 bits/byte, not 8 bytes.

The timezone displacement takes up to 12 bits, meaning 3 bytes (1460 + 1459 = 2919 = 12 bits = 3 bytes). So that's 11 bytes. Not sure where the extra 1 byte comes from.

This would yield 7 bytes.

Ken
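The arithmetic is easier to audit written out. Note the thread's 1,440,000,000 figure drops a factor of 60 (seconds per minute); and as I understand the on-disk layout, the 12 bytes come from the field widths, an int64 time plus an int32 zone offset, not from the minimum bit counts (a check, with the struct layout being my assumption):

```python
# Microseconds in a day: the time-of-day field must span this range.
us_per_day = 1_000_000 * 60 * 60 * 24
assert us_per_day == 86_400_000_000
assert us_per_day.bit_length() == 37      # 37 bits of payload, i.e. 5 bytes

# The zone-displacement range quoted in the thread:
zone_values = 1460 + 1459
assert zone_values == 2919
assert zone_values.bit_length() == 12     # fits in 12 bits, i.e. 2 bytes

# But the stored fields are a 64-bit time plus a 32-bit zone offset,
# so the datum is 8 + 4 = 12 bytes, matching what Josh observed.
assert 8 + 4 == 12
```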
Re: [HACKERS] beta3 the open items list
On Sun, Jun 20, 2010 at 03:01:04PM -0500, Kevin Grittner wrote:

Joshua D. Drake wrote:
Can someone tell me what we are going to do about firewalls that impose their own rules outside of the control of the DBA?

Has anyone actually seen a firewall configured for something so stupid as to allow *almost* all the various packets involved in using a TCP connection, but which suppressed just keepalive packets? That seems to be what you're suggesting is the risk; it's an outlandish enough suggestion that I think the burden of proof is on you to show that it happens often enough to make this a worthless change.

-Kevin

I have seen this sort of behavior, but in every case it has been the result of a myopic view of firewall/iptables solutions to perceived attacks. While I do agree that having a heartbeat within the replication process is worthwhile, it should be 9.1 material at best. For 9.0, such ill-behaved environments will need much more interaction by the DBA, with monitoring and triage of problems as they arise.

Regards,
Ken

P.S. My favorite example of odd behavior was preemptively dropping TCP packets in one direction only, at a single port. Many, many odd things happen when the kernel does not know that the packet would never make it to its destination. Services would sometimes run for weeks without a problem, depending on when the port ended up being used, invariably at night or on the weekend.
Re: [HACKERS] BYTEA / DBD::Pg change in 9.0 beta
On Wed, May 19, 2010 at 10:54:01AM -0400, Robert Haas wrote:

On Wed, May 19, 2010 at 10:17 AM, Stefan Kaltenbrunner ste...@kaltenbrunner.cc wrote:
On 05/19/2010 08:13 AM, Tom Lane wrote:
Bernd Helmle maili...@oopsware.de writes:
--On 18 May 2010 23:20:26 +0200 Jesper Krogh jes...@krogh.cc wrote:
May I ask what the reason is for breaking the compatibility?

Efficiency, if I am allowed to call it that. The new hex representation should be more efficient to retrieve and to handle than the old one. I think bytea_output was set to hex for testing purposes in the first place, but I am not sure whether there was a consensus to leave it there finally.

Yeah, we intentionally set it that way initially to help find stuff that needs to be updated (as DBD::Pg evidently does). It's still TBD whether 9.0.0 will ship with that default or not.

given how much faster the new format is (or rather how slow the old one was), and the number of people I have seen complaining about why bytea is so slow, I would like to see it stay turned on by default. However, this also depends on how quickly database driver developers can adapt.

I would favor waiting a release to turn it on by default, precisely to give driver developers time to adapt.

Changing something like that within the minor release arc is not a good idea. It would be better to have it on by default; if the driver developers are not ready for it, they can treat that as a setting they will need to change when going to 9.0. I would be very upset to have a minor upgrade break my database. At least the major upgrades have more testing.

Regards,
Ken
Re: [HACKERS] BYTEA / DBD::Pg change in 9.0 beta
On Tue, May 18, 2010 at 03:26:17PM -0600, Alex Hunsaker wrote: On Tue, May 18, 2010 at 15:20, Jesper Krogh jes...@krogh.cc wrote: On 2010-05-18 23:12, Alex Hunsaker wrote: set bytea_output 'escape'; That was it. Knowing what the problem was, I had no problem finding it in the release notes. May I ask what the reason is for breaking the compatibility? There were a couple IIRC, the big ones being speed and size. I'd look at the archives for more. I imagine at some point DBD::Pg will handle this transparently. I also imagine Greg would happily accept patches :-) Yes, the new format is much faster, more space efficient, and uses less CPU to do the encoding. The older format caused the COPY for bytea to be CPU limited in many more situations. Regards, Ken
Re: [HACKERS] Generating Lots of PKs with nextval(): A Feature Proposal
Hi Peter, All you need to do is define your own sequence with an increment of 500. Look at: http://www.postgresql.org/docs/8.4/static/sql-createsequence.html Regards, Ken On Fri, May 14, 2010 at 02:56:18PM -0400, Peter Crabtree wrote: Recently, in preparation for migrating an application to postgres, I got to this part of the manual (which is *excellent* so far, by the way): http://www.postgresql.org/docs/8.4/interactive/functions-sequence.html A quick check with the folks on #postgresql confirmed my understanding, which was that the locking semantics of setval() and nextval() make this unsafe: SELECT setval('my_seq', nextval('my_seq') + 500); Now, I was reminded that I could simply do this: SELECT nextval('my_seq') FROM generate_series(1, 500); But of course then I would have no guarantee that I would get a contiguous block of ids, which means if I'm using this to do a mass insert of records which refer to each others' ids (example: storing a directed, linear graph), I either have to do a correlated update on the client side, after transferring the keys (consider the cost of doing this for a few million records - 4 MB in keys per million records, for, in extreme cases, 12 MB of data to be inserted -- 33% overhead in the worst case, presuming symmetric bandwidth), or I have to insert into a temporary table, then have the db backend do the update, then insert from there to the real table. Both are imperfect options in terms of performance and complexity. Thus, before I start work on it, I propose an extension to the current nextval(): SELECT nextval('my_seq', 500); This would increment the my_seq sequence by its interval * 500, and return the first valid key. 
This both makes client code that needs a bunch of PKs simpler to implement, and saves in performance, since the client can just replace all its PKs (presuming they're currently a contiguous block from 1 to n) with my_starting_pk + current_pk, so this:

 pk | next_node
----+-----------
  0 |         1
  1 |         2
  2 |         0

can be easily updated like this:

SELECT nextval('my_seq', (SELECT count(*) FROM my_table));
UPDATE my_table SET pk = currval('my_seq') + pk, next_node = currval('my_seq') + next_node;

to something like this:

   pk   | next_node
--------+-----------
 521650 |    521651
 521651 |    521652
 521652 |    521650

This is a net gain of performance and ease of implementation in many cases where a large number of ids from a sequence are needed -- with a small added benefit of the keys being guaranteed to be contiguous. I don't see any technical problems with this; postgres already can pre-allocate more than one key, but the number is semi-static (the CACHE parameter to CREATE SEQUENCE). This might break existing user code if they've defined a nextval(regclass, integer), but I don't see any way to Finally, I've checked sequence.c -- this looks pretty straightforward to implement, but I figured checking with this list was wise before starting work. Apologies if I've been overly wordy. Peter
[HACKERS] construct_array() use with PQexec with binary data
Dear PostgreSQL development community, I am working on adapting a regular PQexec() call to use binary transmission of the parameters. One of the parameters is an array of BIGINT. Looking in include/utils/array.h, it appears that construct_array() will do exactly what I need to get an array to pass in with the PQexec() call. Is there a library that includes that functionality? Or do I need to cobble it together from the various pieces of code? Thank you for any help. Regards, Ken
Re: [HACKERS] Can we still trust plperl?
On Thu, Mar 11, 2010 at 09:31:46AM -0500, Andrew Dunstan wrote: Last night my attention was drawn to this: http://search.cpan.org/~timb/PostgreSQL-PLPerl-Injector-1.002/lib/PostgreSQL/PLPerl/Injector.pm I'm wondering if we can reasonably continue to support plperl as a trusted language, or at least redefine what trusted actually means. Does it mean "can't do untrusted operations" or does it mean "can't do untrusted operations unless the DBA and/or possibly the user decide to subvert the mechanism"? To me, the latter doesn't sound much like it's worth having. Is it? There are a few places where plperl has an advantage over plpgsql, e.g. code that uses lots of regexes and use of variables to access records dynamically, so losing it might be a bit of a pain. Of course, there would still be plperlu, with the downside that the functions have to be installed by a superuser. One of my PGExperts colleagues told me his reaction was "Well, I might just as well use plperlu", and that pretty well sums up my reaction. Of course, another thing is that it might spur either building of some of the missing stuff into plpgsql, or addition of another language that is both safe and which supports them, like say PL/JavaScript. Thoughts? cheers andrew The DBA can do whatever he wants to subvert the system, up to installing hacked versions of any other trusted language, so I do not see much of a distinction. We already provide many other foot-guns that may be used by the DBA. pl/perl is very useful as a trusted language, but I am certainly for fleshing out the features in other pl-s. Regards, Ken
Re: [HACKERS] pgbouncer + psql 9.0a4
Hi Garick, Add an ignore_startup_parameters entry to your pgbouncer.ini file with application_name. Cheers, Ken On Fri, Feb 26, 2010 at 11:26:23AM -0500, Garick Hamlin wrote: I was just trying out 9.0a4 and I noticed that I can't connect to pgbouncer with psql from 9.0a4 as a result of the set application_name changes to psql. I imagine this is really mostly pgbouncer's problem, but I couldn't figure out from the psql man page whether it was possible to make psql not set application_name. If not, at least from a user's point of view, psql 9.0a4 looks incompatible with existing versions of pgbouncer, at least in this configuration that works with older versions of psql. ... or, is that a crazy way to look at it? Garick Details are below.. My setup (Solaris 10 / sparc ) [ psql 9.0a4 ] -> [ pgbouncer 1.3.1 ] -> [ postgres 9.0a4 ] $ /usr/local/var/postgres/9.0a4/bin/psql -h 127.0.0.1 -p 6543 psql: ERROR: Unknown startup parameter ..pgbouncer's log.. 2010-02-25 21:56:29.721 6979 WARNING C-71d48: (nodb)/(nouser)@127.0.0.1:54000 unsupported startup parameter: application_name=psql 2010-02-25 21:56:29.721 6979 LOG C-71d48: (nodb)/(nouser)@127.0.0.1:54000 closing because: Unknown startup parameter (age=0) 2010-02-25 21:56:29.722 6979 LOG C-71d48: (nodb)/(nouser)@127.0.0.1:54000 Pooler Error: Unknown startup parameter
Re: [HACKERS] Avoiding bad prepared-statement plans.
On Thu, Feb 18, 2010 at 08:31:05PM -0600, David Christensen wrote: On Feb 18, 2010, at 2:19 PM, Pierre C wrote: What about catching the error in the application and INSERT'ing into the current preprepare.relation table? The aim would be to do that in dev or in pre-prod environments, then copy the table content in production. Yep, but it's a bit awkward and time-consuming, and not quite suited to ORM-generated requests since you've got to generate all the plan names, when the SQL query itself would be the most convenient unique identifier... A cool hack would be something like that : pg_execute( SELECT ..., arguments... ) By inserting a hook which calls a user-specified function on a non-existing plan instead of raising an error, this could work. However, this wouldn't work as-is since the plan name must be <= NAMEDATALEN, but you get the idea ;) How about the SHA1 hash of the query? Hey, it works for git... :-) Regards, David -- David Christensen End Point Corporation da...@endpoint.com Hi David, Not to beat our own drum, but we already include a hashing function that can be used for this purpose and is much faster than SHA-1. We would want to use all 64 bits for this use instead of just the 32 bits we currently use for the internal DB hashing. Here is an article comparing the Jenkins' Hash (the one we use) and SHA-1: http://home.comcast.net/~bretm/hash/ Regards, Ken
Re: [HACKERS] Union test case broken in make check?
Without an order by, the order is not defined. The answers are the same but the test gives a false failure because of the lack of ordering. Regards, Ken On Thu, Nov 19, 2009 at 07:54:30PM -0500, Emmanuel Cecchet wrote: Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Emmanuel Cecchet wrote: Is it just me or the union test case fails in CVS head? The buildfarm is pretty much all green: http://www.pgbuildfarm.org/cgi-bin/show_status.pl So it looks like it's you :-) When in doubt, try make distclean and a full rebuild before assuming you've got a problem worth tracking down ... Well, I did: 1. make distclean 2. configure with CFLAGS=-O0 --enable-cassert --enable-debug --without-perl --without-python --without-tcl --without-openssl 3. make (everything normal) 4. make check And it still fails for me. I am attaching my regression.diffs if someone thinks it is worth tracking down ... Emmanuel *** /home/manu/workspace/PG-HEAD/src/test/regress/expected/union.out 2009-02-09 16:18:28.0 -0500 --- /home/manu/workspace/PG-HEAD/src/test/regress/results/union.out 2009-11-19 19:37:32.0 -0500 *** *** 198,208 WHERE f1 BETWEEN 0 AND 100; five --- --1004.3 - -34.84 - -1.2345678901234e-200 0 123456 (5 rows) SELECT CAST(f1 AS char(4)) AS three FROM VARCHAR_TBL --- 198,208 WHERE f1 BETWEEN 0 AND 100; five --- 0 123456 + -34.84 + -1.2345678901234e-200 +-1004.3 (5 rows) SELECT CAST(f1 AS char(4)) AS three FROM VARCHAR_TBL *** *** 263,278 SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl; q2 -- - 4567890123456789 123 (2 rows) SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl; q2 -- 4567890123456789 4567890123456789 - 123 (3 rows) SELECT q2 FROM int8_tbl EXCEPT SELECT q1 FROM int8_tbl ORDER BY 1; --- 263,278 SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl; q2 -- 123 + 4567890123456789 (2 rows) SELECT q2 FROM int8_tbl INTERSECT ALL SELECT q1 FROM int8_tbl; q2 -- + 123 4567890123456789 4567890123456789 (3 rows) SELECT q2 FROM int8_tbl EXCEPT SELECT q1 
FROM int8_tbl ORDER BY 1; *** *** 305,320 SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl; q1 -- - 4567890123456789 123 (2 rows) SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl; q1 -- 4567890123456789 4567890123456789 - 123 (3 rows) -- --- 305,320 SELECT q1 FROM int8_tbl EXCEPT ALL SELECT q2 FROM int8_tbl; q1 -- 123 + 4567890123456789 (2 rows) SELECT q1 FROM int8_tbl EXCEPT ALL SELECT DISTINCT q2 FROM int8_tbl; q1 -- + 123 4567890123456789 4567890123456789 (3 rows) -- *** *** 341,348 SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl; q1 --- - 4567890123456789 123 456 4567890123456789 123 --- 341,348 SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl; q1 --- 123 + 4567890123456789 456 4567890123456789 123 *** *** 353,367 SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl))); q1 -- - 4567890123456789 123 (2 rows) (((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl; q1 --- - 4567890123456789 123 456 4567890123456789 123 --- 353,367 SELECT q1 FROM int8_tbl INTERSECT (((SELECT q2 FROM int8_tbl UNION ALL SELECT q2 FROM int8_tbl))); q1 -- 123 + 4567890123456789 (2 rows) (((SELECT q1 FROM int8_tbl INTERSECT SELECT q2 FROM int8_tbl))) UNION ALL SELECT q2 FROM int8_tbl; q1 --- 123 + 4567890123456789
Re: [HACKERS] [PATCH] tsearch parser inefficiency if text includes urls or emails - new version
On Sun, Nov 08, 2009 at 05:00:53PM +0100, Andres Freund wrote: On Sunday 01 November 2009 16:19:43 Andres Freund wrote: While playing around/evaluating tsearch I noticed that to_tsvector is obscenely slow for some files. After some profiling I found that this is due to using a separate TSParser in p_ishost/p_isURLPath in wparser_def.c. If a multibyte encoding is in use, TParserInit copies the whole remaining input and converts it to wchar_t or pg_wchar - for every email or protocol prefixed url in the document. Which obviously is bad. I solved the issue by having a separate TParserCopyInit/TParserCopyClose which reuses the already converted strings of the original TParser - only at different offsets. Another approach would be to get rid of the separate parser invocations - requiring a bunch of additional states. This seemed more complex to me, so I wanted to get some feedback first. Without patch: andres=# SELECT to_tsvector('english', document) FROM document WHERE filename = '/usr/share/doc/libdrm-nouveau1/changelog'; ... (1 row) Time: 5835.676 ms With patch: andres=# SELECT to_tsvector('english', document) FROM document WHERE filename = '/usr/share/doc/libdrm-nouveau1/changelog'; ... (1 row) Time: 395.341 ms I'll clean up the patch if it seems like a sensible solution... As nobody commented, here is a corrected (stupid thinko) and cleaned up version. Does anyone care to comment, or am I the only one thinking this is an issue? Andres +1 As a user of tsearch, I can certainly appreciate the speed-up in parsing -- more CPU for everyone else. Regards, Ken
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Wed, Oct 28, 2009 at 03:31:17PM -0400, Tom Lane wrote: Hannu Krosing ha...@2ndquadrant.com writes: I had never checked the docs for hash functions, but I had assumed that internal functions are prefixed by pg_ and anything else is public, free-to-use functionality. Sure, it's free to use. It's not free to assume that we promise never to change it. Changing hash functions also makes in-place upgrades a lot harder, as they can't be done incrementally anymore for tables which use hash indexes. Hash indexes are so far from being production-grade that this argument is not significant. regards, tom lane In addition, the change from 8.3 to 8.4 to store only the hash and not the value in the index means that a reindex would be required in any event. Cheers, Ken
Re: [HACKERS] Proposal: String key space for advisory locks
On Mon, Oct 26, 2009 at 06:35:13PM -0700, Christophe Pettus wrote: On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote: Hmmm, hashtext() returns int32. Can you reduce the collision issue if we had hashtext64()? That would certainly reduce the chance of a collision considerably, assuming the right algorithm. -- -- Christophe Pettus x...@thebuild.com The current hash function can already support generating a 64-bit hash value by using both the b and c values. The function is called hashlittle2 and has this comment in the original Bob Jenkins 2006 code:

/*
 * hashlittle2: return 2 32-bit hash values
 *
 * This is identical to hashlittle(), except it returns two 32-bit hash
 * values instead of just one. This is good enough for hash table
 * lookup with 2^^64 buckets, or if you want a second hash if you're not
 * happy with the first, or if you want a probably-unique 64-bit ID for
 * the key. *pc is better mixed than *pb, so use *pc first. If you want
 * a 64-bit value do something like *pc + (((uint64_t)*pb)<<32).
 */

This should be a simple change. It would be worth running it by the developer community to figure out how to add this functionality in a way that will make 64-bit hashes available easily to other code in the DB, perhaps even using them in very large hash indexes. Regards, Ken
Re: [HACKERS] Rejecting weak passwords
On Thu, Oct 01, 2009 at 03:54:37PM +0200, Magnus Hagander wrote: On Thu, Oct 1, 2009 at 15:26, Albe Laurenz laurenz.a...@wien.gv.at wrote: Andrew Dunstan wrote: So here's the patch. I don't think there is documentation required; correct me if I am wrong. How will people know how to use it, or that it's even there without at least a note in the docs somewhere? I'd prefer to have an example as a contrib module, as well as docs. Quite apart from anything else, how the heck would we test it without such a thing? I was not sure because no other hooks were documented anywhere else than in the code. I could add a paragraph in the auth-password section of client-auth.sgml. Or is there a better place? I could easily write a simple contrib that adds a check for username = password if there is interest. I think it's better to have an actually *useful* contrib module for it, if there is one. Meaning perhaps something that links to that cracklib thing mentioned upthread. +1 for a sample module that will allow cracklib to drop in. Cheers, Ken
Re: [HACKERS] Rejecting weak passwords
On Thu, Oct 01, 2009 at 01:07:04PM -0400, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: On Thu, Oct 1, 2009 at 17:24, Tom Lane t...@sss.pgh.pa.us wrote: I agree with the subsequent comments suggesting a sample module that actually does something useful --- although if it's going to link to external code like cracklib, it probably is going to have to be on pgfoundry not in contrib. Why is that? we have plenty of other things in contrib that rely on external code, for example the uuid, xml or ssl stuff. Well, maybe. I was concerned about availability, portability, license compatibility, and so on. The bar's a lot lower for pgfoundry projects on all those points ... regards, tom lane It has been a while since I last used cracklib, but the interface is generic enough that the sample we would ship in contrib could be trivially adapted to use cracklib. The version we include could just have the (username = password) check or something similar. Regards, Ken
Re: [HACKERS] Another try at reducing repeated detoast work for PostGIS
On Sat, Aug 22, 2009 at 12:39:41PM -0400, Tom Lane wrote: Mark Cave-Ayland mark.cave-ayl...@siriusit.co.uk writes: So in conclusion, I think that patch looks good and that the extra time I was seeing was due to RECHECK being applied to the operator, and not the time being spent within the index scan itself. Thanks, I appreciate the followup. I plan to go ahead and apply the patch to HEAD --- it doesn't conflict with Heikki's pending patch AFAICS, and no one has suggested an alternative that seems likely to get implemented soon. I am a bit tempted to apply it to 8.4 as well; otherwise the PostGIS people are likely to start cluttering their code with this add-a-dummy-function workaround, which would be unproductive in the long run. Comments? regards, tom lane +1 for applying it to 8.4 as well. Cheers, Ken
Re: [HACKERS] Fixing geometic calculation
On Fri, Aug 07, 2009 at 11:29:47PM +1000, Paul Matthews wrote: Let us consider the ordering of real numbers in postgres. As you can see from the results below it has clearly returned the correct results. select( 1. = 1.0002 ); = f select( 1. < 1.0002 ); = t select( 1. > 1.0002 ); = f Imagine the situation however where postgres returned the following values to simple numerical inequalities. In such a case postgresql would be clearly defective and unfit for purpose. select( 1.00 = 1.01 ); = f select( 1.00 < 1.01 ); = f select( 1.00 > 1.01 ); = f If such a situation is unacceptable for the real number line, then in what way can it be acceptable for the real number plane. select( point(1.0,0) ~= point(1.1,0) ); = f select( point(1.0,0) << point(1.1,0) ); = f select( point(1.0,0) >> point(1.1,0) ); = f select( point(1.0,0) <-> point(1.1,0) ); = 1.000655e-05 We have two points with a finite separation in the x axis. Postgres thinks they are not the same point, nor one left of the other, nor to the right. This is clearly both a physical and logical impossibility. The cause of this is the ill-conceived FP* macros. They seem to represent a solution to a problem that simply does not exist. The first effect of these macros is to reduce the accuracy of all geometric comparisons from double precision to less than single precision. The following program correctly prints the correct answer. Whereas, as we have seen above, postgres falls in a heap. int main() { float f = 1.0; float g = 1.1; if( f==g ) { printf( "f=g\n" ); } if( f<g ) { printf( "f<g\n" ); } if( f>g ) { printf( "f>g\n" ); } return 0; } The second effect is to take operations that would have worked correctly even in single precision, and to cause them to produce nonsensical results. For example points that can be both inside and outside a polygon at the same time.
Simple analysis of the postgres source code shows that the only places where the FPzero, FPeq, FPne, FPlt, FPle, FPgt and FPge macros are defined and used are in the src/backend/utils/adt/geo_ops.c and src/include/utils/geo_decls.h files. What is the justification for these macros? Why do they only affect geometric calculations, and not all numeric calculations? Why should these macros not be abandoned? Does anyone have any objections to me: 1) removing these macros, or at least disabling EPSILON by default. 2) adding in the obviously missing operators (ie: box @> point) Hi Paul, Floating point calculations always have a bit of inaccuracy because at the very minimum some values do not have exact floating point representations, and the results can be implementation dependent. I think disabling EPSILON by default is a bad idea. In my work with numeric methods, we actually calculated EPSILON for the system we were using at runtime. Maybe postgresql could do the same on startup. Regards, Ken
Re: [HACKERS] Fixing geometic calculation
On Fri, Aug 07, 2009 at 09:12:34AM -0500, Kenneth Marshall wrote: On Fri, Aug 07, 2009 at 11:29:47PM +1000, Paul Matthews wrote: Let us consider the ordering of real numbers in postgres. As you can see from the results below it has clearly returned the correct results. select( 1. = 1.0002 ); = f select( 1. < 1.0002 ); = t select( 1. > 1.0002 ); = f Imagine the situation however where postgres returned the following values to simple numerical inequalities. In such a case postgresql would be clearly defective and unfit for purpose. select( 1.00 = 1.01 ); = f select( 1.00 < 1.01 ); = f select( 1.00 > 1.01 ); = f If such a situation is unacceptable for the real number line, then in what way can it be acceptable for the real number plane. select( point(1.0,0) ~= point(1.1,0) ); = f select( point(1.0,0) << point(1.1,0) ); = f select( point(1.0,0) >> point(1.1,0) ); = f select( point(1.0,0) <-> point(1.1,0) ); = 1.000655e-05 We have two points with a finite separation in the x axis. Postgres thinks they are not the same point, nor one left of the other, nor to the right. This is clearly both a physical and logical impossibility. Actually, quantum theory will allow this to happen. :) The cause of this is the ill-conceived FP* macros. They seem to represent a solution to a problem that simply does not exist. The first effect of these macros is to reduce the accuracy of all geometric comparisons from double precision to less than single precision. The following program correctly prints the correct answer. Whereas, as we have seen above, postgres falls in a heap. int main() { float f = 1.0; float g = 1.1; if( f==g ) { printf( "f=g\n" ); } if( f<g ) { printf( "f<g\n" ); } if( f>g ) { printf( "f>g\n" ); } return 0; } The second effect is to take operations that would have worked correctly even in single precision, and to cause them to produce nonsensical results. For example points that can be both inside and outside a polygon at the same time.
Simple analysis of the postgres source code shows that the only places where the FPzero, FPeq, FPne, FPlt, FPle, FPgt and FPge macros are defined and used are in the src/backend/utils/adt/geo_ops.c and src/include/utils/geo_decls.h files. What is the justification for these macros? Why do they only affect geometric calculations, and not all numeric calculations? Why should these macros not be abandoned? Does anyone have any objections to me: 1) removing these macros, or at least disabling EPSILON by default. 2) adding in the obviously missing operators (ie: box @> point) Hi Paul, Floating point calculations always have a bit of inaccuracy because at the very minimum some values do not have exact floating point representations, and the results can be implementation dependent. I think disabling EPSILON by default is a bad idea. In my work with numeric methods, we actually calculated EPSILON for the system we were using at runtime. Maybe postgresql could do the same on startup. Regards, Ken
Re: [HACKERS] Fixing geometic calculation
On Fri, Aug 07, 2009 at 04:16:56PM +0100, Sam Mason wrote: On Fri, Aug 07, 2009 at 09:49:41AM -0500, Kenneth Marshall wrote: On Fri, Aug 07, 2009 at 09:12:34AM -0500, Kenneth Marshall wrote: On Fri, Aug 07, 2009 at 11:29:47PM +1000, Paul Matthews wrote: We have two points with a finite separation in the x axis. Postgres thinks they are not the same point, nor one left of the other, nor to the right. This is clearly both a physical and logical impossibility. Actually, quantum theory will allow this to happen. :) I'm not a physicist, but I don't think it does. QM defines the probability distribution within which the particle will be found. Once you've actually observed both points you will know their physical relation--you'll also have given them energy, so next time you look they'll be somewhere else, but the act of observation causes the above distribution to be collapsed. This sidesteps the whole issue of the fact that points in PG are defined in euclidean space and do indeed have a definite location and can be compared at all times---they don't arbitrarily go jumping off millions of miles away or being annihilated by their anti-particle just because it's possible. I would agree with Paul that EPSILON is a hack and probably should be removed. However it will cause user-visible changes so it's not quite as simple as that to change. I don't have anything really very useful to add apart from saying that maybe the default should be the other way around? -- Sam http://samason.me.uk/ It was definitely a tongue-in-cheek response since QT is not really a topic for this mailing list. However, removing EPSILON completely is not a good idea for the exact reason it was included originally. Floating point numbers are approximations and since their precision is necessarily limited this fact must be included in any calculation using them. I do agree that hard-coding it to a value that does not reflect the reality of the calculation is not good.
It would be better to have a GUC to allow it to be specified than to have it be zero. Maybe one setting would allow the system to calculate the appropriate value for EPSILON based on the hardware. One way to address the ambiguity is to define for yourself what it means if a point is both inside and outside, i.e. in this case the point is always defined to be inside, or the point is always defined to be outside. Regards, Ken
Re: [HACKERS] Re: Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 04:27:39PM +0100, Greg Stark wrote: On Thu, Jul 16, 2009 at 4:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, I do observe that this seems a sufficient counterexample against the theory that we can just remove the collapse limits and let GEQO save us on very complex queries. On my machine, the example query takes about 22 seconds to plan using CVS HEAD w/ all default settings. If I set both collapse_limit variables to very high values (I used 999), it takes ... um ... not sure; I gave up waiting after half an hour. What's the point of GEQO if it doesn't guarantee to produce the optimal plan and *also* doesn't guarantee to produce some plan, any plan, within some reasonable amount of time? Either we need to fix that or else I don't see what it's buying us over our regular planner, which also might not produce a plan within a reasonable amount of time, but at least if it does it'll be the right plan. I do agree that we should have an actual time-limit cap for GEQO that would have it return the best plan found so far when the limit is reached. Then you can at least bound your planning time. Regards, Ken
Re: [HACKERS] Review remove {join,from}_collapse_limit, add enable_join_ordering
On Thu, Jul 16, 2009 at 06:49:08PM +0200, Andres Freund wrote: On Thursday 16 July 2009 17:59:58 Tom Lane wrote: Andres Freund and...@anarazel.de writes: The default settings currently make it relatively hard to trigger geqo at all. Yes, and that was intentional. One of the implications of what we're discussing here is that geqo would get used a lot more for typical complex queries (if there is any such thing as a typical one). So it's fully to be expected that the fallout would be pressure to improve geqo in various ways. Given that we are at the start of the development cycle, that prospect doesn't scare me --- there's plenty of time to fix whatever needs fixing. However, I am leaning to the feeling that I don't want to be putting people in a position where they have no alternative but to use geqo. So adjusting rather than removing the collapse limits is seeming like a good idea. Hm. I see a somewhat more fundamental problem with geqo: I tried several queries, and I found not a single one where the whole genetic process did any significant improvements to the 'worth'. It seems that always the best variant out of the pool is either the path chosen in the end, or at least the cost difference is _really_ low. Andres Hi Andres, From some of my reading of the literature on join order optimization via random sampling, such as what would establish the initial GEQO pool, there is a very good possibility of having a pretty good plan in the first pool, especially for our larger initial pool sizes of 100-1000. And in fact, the final plan has a good chance of being of approximately the same cost as a member of the initial pool. Uniform sampling alone can give you a close-to-optimum plan 80% of the time with an initial sample size of 100, and using biased sampling raises that to 99% or better. Regards, Ken
Re: [HACKERS] WIP: Deferrable unique constraints
On Tue, Jul 14, 2009 at 09:56:48AM -0700, Jeff Davis wrote:
On Sun, 2009-07-12 at 14:14 +0100, Dean Rasheed wrote: Here is an updated version of this patch which should apply to HEAD, with updated docs, regression tests, pg_dump and psql \d. It works well for small numbers of temporary uniqueness violations, and at least as well (in fact about twice as fast) as deferred FK checks for large numbers of deferred checks.

I took a brief look at this. You're extending the index AM, and that might not be necessary. It might be fine, but usually there is a lot of discussion around changing important APIs, so it might be worth looking at alternatives. With the patch I'm working on for generalized index constraints, there would be no need to extend the index AM. However, I don't expect my mechanism to replace the existing unique btree constraints, because I would expect the existing unique constraints to be faster (I haven't tested yet, though).

Perhaps we could instead use the TRY/CATCH mechanism. It's generally difficult to figure out from the code exactly what happened, but in this case we have the error code ERRCODE_UNIQUE_VIOLATION. So, just check for that error code rather than passing back a boolean. You might want to change the signature of _bt_check_unique() so that it doesn't have to raise the error inside, and you can raise the error from _bt_doinsert(). The only problem there is telling the btree AM whether to do a fake or a real insert. Perhaps you can just do that with careful use of a global variable? Sure, all of this is a little ugly, but we've already acknowledged that there is some ugliness around the existing unique constraint and the btree code that supports it (for one, the btree AM accesses the heap).

I am looking at adding unique support to hash indexes for 8.5, and they will definitely need to visit the heap.
Regards, Ken

I propose trying to improve performance and scalability for large numbers of deferred checks in a separate patch. Would it be possible to just check how long the list of potential conflicts is growing, and if it gets too big, just replace them all with a bulk check event?

Regards, Jeff Davis
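The shape Jeff describes, having the uniqueness check pass back a result so the caller decides whether to raise immediately or only record a potential conflict for a deferred re-check, can be sketched as a toy model (hypothetical names and Python stand-ins, not the actual btree code):

```python
# Toy model: check_unique() returns a boolean instead of raising
# internally (like the proposed _bt_check_unique() signature change),
# and insert() decides whether to error at once or defer the check.

class UniqueViolation(Exception):
    pass

class ToyUniqueIndex:
    def __init__(self):
        self.keys = set()
        self.pending = []              # potential conflicts, re-checked later

    def check_unique(self, key):
        """Report a conflict as a return value, not an error."""
        return key not in self.keys

    def insert(self, key, deferred=False):
        if not self.check_unique(key):
            if deferred:
                self.pending.append(key)   # temporary violation, tolerated
            else:
                raise UniqueViolation(key) # immediate error, as today
        self.keys.add(key)

idx = ToyUniqueIndex()
idx.insert("x")
idx.insert("x", deferred=True)         # no error yet
assert idx.pending == ["x"]            # conflict queued for commit time
```

Jeff's "bulk check event" idea would then amount to discarding `pending` once it grows too large and rescanning the index once at commit instead.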
Re: [HACKERS] WIP: Deferrable unique constraints
On Tue, Jul 14, 2009 at 12:13:33PM -0700, Jeff Davis wrote:
On Tue, 2009-07-14 at 13:29 -0500, Kenneth Marshall wrote: I am looking at adding unique support to hash indexes for 8.5 and they will definitely need to visit the heap.

Have you seen this patch? http://archives.postgresql.org/message-id/1246840119.19547.126.ca...@jdavis This patch will support unique constraints for hash indexes as well. There may still be a use-case for specialized hash index unique constraints, similar to btree, but please follow the work to make sure that no work is wasted. Also, I don't see a problem with using the same hacks in the hash index code as are used in the btree index code. If you see a better way, or if you think index AM changes would be useful to you as well, you should probably open that discussion. I was trying to provide an alternative to an index AM API change, because I thought there might be some resistance to that. However, if there are multiple index AMs that can make use of it, there is a stronger case for an API change.

Regards, Jeff Davis

I will take a look at that patch. My thought was to use the same process as the btree support for unique indexes since it has been well tested and optimized.

Thanks, Ken
Re: [HACKERS] *_collapse_limit, geqo_threshold
On Sat, Jul 11, 2009 at 12:23:59PM -0400, Tom Lane wrote:
Andres Freund and...@anarazel.de writes: The only question I have is whether random_r or similar is available on enough platforms... Has anybody an idea about this? On most unixoid systems one could just wrap erand48() if random_r is not available. Windows?

random_r() isn't in the Single Unix Spec AFAICS, and I also don't find it on HPUX 10.20, so I'd vote against depending on it. What I do see in SUS is initstate() and setstate() which could be used to control the random() function: http://www.opengroup.org/onlinepubs/007908799/xsh/initstate.html It would also work to leave random() for use by the core code and have GEQO depend on something from the drand48() family: http://www.opengroup.org/onlinepubs/007908799/xsh/drand48.html Probably drand48() is less random than random(), but for the limited purposes of GEQO I doubt we care very much.

Ugh, tracking down problems caused by a poor random number generator is difficult. Poor randomness often causes weird results from algorithms that were designed around the assumption of truly random input.

So far as I can find in a quick google search, neither of these families of functions exists on Windows :-(. So I think maybe the best approach is the second one --- we could implement a port/ module that provides a version of whichever drand48 function we need.

I think that having a port/ module for a random number generator is a good idea. There are a number of good, fast random number generators to choose from.

Cheers, Ken

On reflection I think the best user API is probably a geqo_seed GUC in the range 0 to 1, and have GEQO always reset its seed to that value at the start of a planning cycle. This ensures plan stability, and if you need to experiment with alternative plans you can change to different seed values. The no-reset behavior doesn't seem to have much real-world usefulness, because even if you chance to get a good plan, you have no way to reproduce it...
regards, tom lane
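The drand48() family Tom points at is just a 48-bit linear congruential generator, so a private port/ implementation is tiny. A sketch in Python (the multiplier and increment are the standard drand48 constants; the class name is made up here) shows why a private, seedable generator gives the plan stability a geqo_seed GUC would want, without disturbing any other user of random():

```python
# Minimal erand48-style generator with its own private state:
# X(n+1) = (a * X(n) + c) mod 2^48, output scaled to [0, 1).

class Rand48:
    A = 0x5DEECE66D       # standard drand48 multiplier
    C = 0xB               # standard drand48 increment
    MASK = (1 << 48) - 1  # keep 48 bits of state

    def __init__(self, seed):
        self.state = seed & self.MASK

    def next(self):
        """Return a double in [0, 1), like erand48()."""
        self.state = (self.A * self.state + self.C) & self.MASK
        return self.state / float(1 << 48)

# Resetting the seed at the start of each "planning cycle"
# reproduces the identical sequence, hence the identical plan.
g1, g2 = Rand48(2009), Rand48(2009)
seq1 = [g1.next() for _ in range(5)]
seq2 = [g2.next() for _ in range(5)]
assert seq1 == seq2
assert all(0.0 <= x < 1.0 for x in seq1)
```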
Re: [HACKERS] *_collapse_limit, geqo_threshold
Hi,

When I was first familiarizing myself with PostgreSQL, I took a walk through its documentation on GEQO and similar processes in the literature. One big advantage of GEQO is that you can trade off planning time for plan optimization. I do agree that it should be updated, but there were definite cases in the literature where an exhaustive search could take orders of magnitude more planning time than the difference in execution time between the competing plans.

My two cents, Ken

On Wed, Jul 08, 2009 at 09:43:12AM -0400, Noah Misch wrote: On Tue, Jul 07, 2009 at 09:31:14AM -0500, Kevin Grittner wrote: I don't remember any clear resolution to the wild variations in plan time mentioned here: http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php I think it would be prudent to try to figure out why small changes in the query caused the large changes in the plan times Andres was seeing. Has anyone else ever seen such behavior? Can we get examples? (It should be enough to get the statistics and the schema, since this is about planning time, not run time.)

With joins between statistically indistinguishable columns, I see planning times change by a factor of ~4 for each join added or removed (postgres 8.3). Varying join_collapse_limit in the neighborhood of the actual number of joins has a similar effect. See attachment with annotated timings. The example uses a single table joined to itself, but using distinct tables with identical contents yields the same figures. The exponential factor seems smaller for real queries. I have a query of sixteen joins that takes 71s to plan deterministically; it looks like this: SELECT 1 FROM fact JOIN dim0 ...
JOIN dim6
JOIN t t0 ON fact.key = t.key AND t.x = MCV0
LEFT JOIN t t1 ON fact.key = t.key AND t.x = MCV1
JOIN t t2 ON fact.key = t.key AND t.x = MCV2
LEFT JOIN t t3 ON fact.key = t.key AND t.x = NON-MCV0
LEFT JOIN t t4 ON fact.key = t.key AND t.x = NON-MCV1
LEFT JOIN t t5 ON fact.key = t.key AND t.x = NON-MCV2
LEFT JOIN t t6 ON fact.key = t.key AND t.x = NON-MCV3
LEFT JOIN t t7 ON fact.key = t.key AND t.x = NON-MCV4

For the real query, removing one join drops plan time to 26s, and removing two drops the time to 11s. I don't have a good theory for the multiplier changing from 4 for the trivial demonstration to ~2.5 for this real query. Re-enabling geqo drops plan time to .5s. These tests used default_statistics_target = 1000, but dropping that to 100 does not change anything dramatically.

I guess the question is whether there is anyone who has had a contrary experience. (There must have been some benchmarks to justify adding geqo at some point?)

I have queries with a few more joins (19-21), and I cancelled attempts to plan them deterministically after 600+ seconds and 10+ GiB of memory usage. Even with geqo_effort = 10, they plan within 5-15s with good results. All that being said, I've never encountered a situation where a value other than 1 or inf for *_collapse_limit appeared optimal.
nm

SET geqo = off;
SET join_collapse_limit = 100;
CREATE TEMP TABLE t AS SELECT * FROM generate_series(1, 1000) f(n);
ANALYZE t;

--- Vary join count

-- 242.4s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11 NATURAL JOIN t t12;
-- 31.2s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10 NATURAL JOIN t t11;
-- 8.1s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 2.0s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09;
-- 0.5s
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08;

--- Vary join_collapse_limit

-- 8.1s
SET join_collapse_limit = 100;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL JOIN t t10;
-- 8.0s
SET join_collapse_limit = 11;
EXPLAIN SELECT 1 FROM t t00 NATURAL JOIN t t01 NATURAL JOIN t t02 NATURAL JOIN t t03 NATURAL JOIN t t04 NATURAL JOIN t t05 NATURAL JOIN t t06 NATURAL JOIN t t07 NATURAL JOIN t t08 NATURAL JOIN t t09 NATURAL
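As a rough back-of-envelope for the growth Noah measures (my own estimate, not the planner's actual accounting): dynamic programming over join orders examines on the order of 3^n (left input, right input) subset combinations for n relations, since each relation is in the left input, the right input, or neither. Each added relation therefore multiplies the work by roughly 3x, the same ballpark as the ~4x per join in the timings above.

```python
# Upper bound on System R style DP work: each of the n relations is
# in the left subset, the right subset, or neither, giving 3^n
# (subset, disjoint-subset) pairs to consider.

def dp_pairs(n):
    return 3 ** n

for n in range(8, 14):
    growth = dp_pairs(n) // dp_pairs(n - 1)
    print(f"{n:2d} relations: ~{dp_pairs(n):>9,d} subset pairs (x{growth})")

assert dp_pairs(13) // dp_pairs(12) == 3
```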
Re: [HACKERS] *_collapse_limit, geqo_threshold
On Wed, Jul 08, 2009 at 04:13:11PM -0500, Kevin Grittner wrote:
Tom Lane t...@sss.pgh.pa.us wrote: It occurs to me that one way to make GEQO less scary would be to take out the nondeterminism by resetting its random number generator for each query. You might get a good plan or an awful one, but at least it'd be the same one each time. DBAs like predictability.

+1 The biggest reason that I've tended to avoid geqo is that I would never know when it might do something really stupid with a query one time out of some large number, leading to mysterious complaints which could eat a lot of time. For a moment it seemed logical to suggest a session GUC for the seed, so if you got a bad plan you could keep rolling the dice until you got one you liked; but my right-brain kept sending shivers down my spine to suggest just how uncomfortable it was with that idea. -Kevin

+1 I like the idea of a session GUC for the random number seed. If we can come up with a way to prune the search space more aggressively, GEQO (or a GEQO successor) will be much less prone to generating a bad plan. I also think that a session variable would make it easier to test GEQO by removing the nondeterminism.

Ken
Re: [HACKERS] *_collapse_limit, geqo_threshold
On Wed, Jul 08, 2009 at 05:46:02PM -0400, Tom Lane wrote:
Kevin Grittner kevin.gritt...@wicourts.gov writes: For a moment it seemed logical to suggest a session GUC for the seed, so if you got a bad plan you could keep rolling the dice until you got one you liked; but my right-brain kept sending shivers down my spine to suggest just how uncomfortable it was with that idea.

If memory serves, we actually had exactly that at some point. But I think the reason it got taken out was that it interfered with the behavior of the random() function for everything else. We'd have to give GEQO its own private random number generator.

regards, tom lane

A separate random number generator for GEQO makes a lot of sense.

Cheers, Ken
Re: [HACKERS] 8.4 open item: copy performance regression?
Yes, you are right. I thought that they were absolute function counts. The data makes more sense now.

Regards, Ken

On Thu, Jun 18, 2009 at 07:03:34PM -0500, Kevin Grittner wrote: Kenneth Marshall k...@rice.edu wrote: What is not clear from Stefan's function listing is how the 8.4 server could issue 33% more XLogInsert() and CopyReadLine() calls than the 8.3.7 server using the same input file.

I thought those were profiling numbers -- the number of times a timer checked what was executing and found it in that method. Which suggests that those two methods are probably slower now than in 8.3.7, at least in some environments. -Kevin
Re: [HACKERS] 8.4 open item: copy performance regression?
On Fri, Jun 19, 2009 at 07:49:31PM +0200, Stefan Kaltenbrunner wrote:
Tom Lane wrote: Just eyeing the code ... another thing we changed since 8.3 is to enable posix_fadvise() calls for WAL. Any of the complaints want to try diking out this bit of code (near line 2580 in src/backend/access/transam/xlog.c)?

#if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
	if (!XLogArchivingActive() &&
		(get_sync_bit(sync_method) & PG_O_DIRECT) == 0)
		(void) posix_fadvise(openLogFile, 0, 0, POSIX_FADV_DONTNEED);
#endif

ok after a bit of bisecting I'm happy to announce the winner of the contest: http://archives.postgresql.org/pgsql-committers/2008-11/msg00054.php this patch causes a 25-30% performance regression for WAL logged copy, however in the WAL bypass case (maybe that was what got tested?) it results in a 20% performance increase. The raw numbers using the upthread posted minimal postgresql.conf are:

post patch/wal logged: 4min10s/4min19s/4min12s
post patch/wal bypass: 1m55s/1m58s/2m00s
prepatch/wal logged: 2m55s/3m00s/2m59s
prepatch/wal bypass: 2m22s/2m18s/2m20s

Stefan

Great! Maybe just increasing the size of the BULKWRITE ring, possibly as a function of the shared memory setting, is all that is needed. 256kB is the currently coded ring_size in storage/buffer/freelist.c.

Ken
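The ring Ken refers to can be pictured with a toy model (the sizes match the 256kB figure above, but the code is an illustration, not the real freelist.c): a bulk write cycles through a small fixed set of buffer slots, so no matter how many blocks a COPY writes, it only ever dirties ring-many shared buffers.

```python
# Toy buffer-ring model: 256kB of 8kB blocks = 32 slots reused
# round-robin, so a bulk load does not flood shared_buffers.

BLCKSZ = 8192
RING_BYTES = 256 * 1024            # the currently coded ring size
nbuffers = RING_BYTES // BLCKSZ    # = 32 buffers in the ring

used = set()
for block_no in range(100_000):    # a large bulk COPY
    slot = block_no % nbuffers     # reuse ring slots in order
    used.add(slot)

print(f"{len(used)} buffers touched for 100,000 blocks")
assert len(used) == nbuffers == 32
```

A larger ring (Ken's suggestion) would simply raise `nbuffers`, trading shared-buffer pollution for fewer forced writes of still-hot ring buffers.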
Re: [HACKERS] 8.4 open item: copy performance regression?
On Thu, Jun 18, 2009 at 05:20:08PM -0400, Tom Lane wrote:
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: Any objections if I add: http://archives.postgresql.org/pgsql-performance/2009-06/msg00215.php to the (currently empty) list of open items for 8.4?

I am unable to duplicate any slowdown on this test case. AFAICT 8.4 and 8.3 branch tip are about the same speed; if anything 8.4 is faster. Testing on x86_64 Fedora 10 ... regards, tom lane

What is not clear from Stefan's function listing is how the 8.4 server could issue 33% more XLogInsert() and CopyReadLine() calls than the 8.3.7 server using the same input file. That would account for the slowdown, but not why it is happening.

Cheers, Ken
Re: [HACKERS] dot to be considered as a word delimiter?
On Mon, Jun 01, 2009 at 08:22:23PM -0500, Kevin Grittner wrote:
Sushant Sinha sushant...@gmail.com wrote: I think that dot should be considered as a word delimiter, because when a dot is not followed by a space, most of the time it is an error in typing. Besides, there are not many valid English words that have a dot in between.

It's not treating it as an English word, but as a host name.

select ts_debug('english', 'Mr.J.Sai Deepak');
                                ts_debug
---------------------------------------------------------------------------
 (host,Host,Mr.J.Sai,{simple},simple,{mr.j.sai})
 (blank,"Space symbols"," ",{},,)
 (asciiword,"Word, all ASCII",Deepak,{english_stem},english_stem,{deepak})
(3 rows)

You could run it through a dictionary which would deal with host tokens differently. Just be aware of what you'll be doing to www.google.com if you run into it. I hope this helps. -Kevin

In our uses for full text indexing, it is much more important to be able to find host names and URLs than to find mistyped names. My two cents.

Cheers, Ken
Re: [HACKERS] dot to be considered as a word delimiter?
On Tue, Jun 02, 2009 at 04:40:51PM -0400, Sushant Sinha wrote:
Fair enough. I agree that there is a valid need for returning such tokens as a host. But I think there is definitely a need to break it down into individual words as well. This will help in cases when a document is missing a space in between the words. So what we can do is: return the entire compound word as Host and also break it down into individual words. I can put up a patch for this if you guys agree. Returning multiple tokens for the same word is a feature of the text search parser, as explained in the documentation here: http://www.postgresql.org/docs/8.3/static/textsearch-parsers.html Thanks, Sushant.

+1

Ken

On Tue, Jun 2, 2009 at 8:47 AM, Kenneth Marshall k...@rice.edu wrote: [full quote of the previous message trimmed]
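Sushant's proposal, emitting the dotted token whole as a host token and also emitting its components as plain words, can be sketched like this (a hypothetical illustration, not the actual tsearch parser, whose token machinery is far more elaborate):

```python
import re

# Return both the compound "host" token and its dot-separated parts,
# so searches match either form.

def tokenize(text):
    tokens = []
    for word in text.split():
        if "." in word.strip("."):         # interior dot: host-like
            tokens.append(("host", word.lower()))
            tokens.extend(("word", part.lower())
                          for part in re.split(r"\.+", word) if part)
        else:
            tokens.append(("word", word.strip(".").lower()))
    return tokens

toks = tokenize("Mr.J.Sai Deepak")
assert ("host", "mr.j.sai") in toks        # whole token, as today
assert ("word", "sai") in toks             # plus its components
assert ("word", "deepak") in toks
```

This is exactly the multiple-tokens-per-word behavior the parser documentation describes, applied to the host token type.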
Re: [HACKERS] [PATCH] cleanup hashindex for pg_migrator hashindex compat mode (for 8.4)
On Sat, May 23, 2009 at 02:52:49PM -0400, Zdenek Kotala wrote:
I forgot to fix contrib. Updated patch attached.

Zdenek

Zdenek Kotala píše v pá 22. 05. 2009 v 16:23 -0400: The attached patch cleans up the hash index headers to allow compiling the hash AM for the 8.3 version. It helps improve pg_migrator with the capability to migrate a database with hash indexes without reindexing. I discussed this patch a year ago with Alvaro when we tried to clean up the include bloat problem. It should also reduce the number of includes. The main point is that hash functions for datatypes are now in the related data files in the utils/adt directory. hash_any() and hash_uint32() are now in utils/hashfunc.c. It would be nice to have this in 8.4 because it allows testing the index migration functionality. Thanks, Zdenek

How does that work with the updated hash functions without a reindex?

Regards, Ken
Re: [HACKERS] a few crazy ideas about hash joins
On Fri, Apr 03, 2009 at 08:03:33AM -0400, Robert Haas wrote:
On Fri, Apr 3, 2009 at 1:48 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Robert Haas wrote: While investigating some performance problems recently I've had cause to think about the way PostgreSQL uses hash joins. So here are a few thoughts. Some of these have been brought up before.

1. When the hash is not expected to spill to disk, it preserves the pathkeys of the outer side of the join. If the optimizer were allowed to assume that, it could produce significantly more efficient query plans in some cases. The problem is what to do if we start executing the query and find out that we have more stuff to hash than we expect, such that we need multiple batches? Now the results won't be sorted. I think we could handle this as follows: Don't count on the hash join to preserve pathkeys unless it helps, and only rely on it when it seems as if the hash table will still fit even if it turns out to be, say, three times as big as expected. But if you are counting on the hash join to preserve pathkeys, then pass that information to the executor. When the executor is asked to perform a hash join, it will first hash the inner side of the relation. At that point, we know whether we've successfully gotten everything into a single batch, or not. If we have, perform the join normally. If the worst has happened and we've gone multi-batch, then perform the join and sort the output before returning it. The performance will suck, but at least you'll get the right answer.

Previous in-passing reference to this idea here: http://archives.postgresql.org/pgsql-hackers/2008-09/msg00806.php Hmm, instead of sorting the output if the worst happens, a final merge step as in a merge sort would be enough.

Yeah - good point.

2. Consider building the hash table lazily.
I often see the query planner pick a hash join over a nested inner indexscan because it thinks that it'll save enough time making hash probes rather than index probes to justify the time spent building the hash table up front. But sometimes the relation that's being hashed has a couple thousand rows, only a tiny fraction of which will ever be retrieved from the hash table. We can predict when this is going to happen because n_distinct for the outer column will be much less than the size of the inner rel. In that case, we could consider starting with an empty hash table that effectively acts as a cache. Each time a value is probed, we look it up in the hash table. If there's no entry, we use an index scan to find the matching rows and insert them into the hash table. Negative results must also be cached.

Yeah, that would be quite nice. One problem is that our ndistinct estimates are not very accurate.

Well, the right solution to that problem is to fix our ndistinct estimates. :-)

Also, as seen in the hash index build performance patch, it would be better to set the initial hash size bigger than needed to avoid the inter-page shuffle if the guess is wrong.

Ken
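Robert's lazy hash table idea, including the requirement that negative results be cached, can be sketched as a toy model (the index_scan function is a hypothetical stand-in for the inner-side index probe; this is an illustration, not executor code):

```python
# The hash table starts empty and acts as a cache in front of an
# index. Misses fall back to an index scan and are cached, including
# empty ("negative") results, so a key is never scanned twice.

inner = {"a": [1, 2], "b": [3]}     # pretend indexed inner relation
scans = 0

def index_scan(key):
    global scans
    scans += 1                      # count the expensive probes
    return inner.get(key, [])

cache = {}

def probe(key):
    if key not in cache:            # cache miss: do the index scan
        cache[key] = index_scan(key)
    return cache[key]

assert probe("a") == [1, 2]
assert probe("zzz") == []           # negative result, now cached
assert probe("zzz") == []           # served from cache, no new scan
assert scans == 2                   # only two real index scans
```

With a skewed outer side, most probes repeat a few hot keys, so only a tiny fraction of the inner relation is ever fetched, which is the case where building the full hash table up front is wasted work.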
Re: [HACKERS] improving concurrent transactin commit rate
On Wed, Mar 25, 2009 at 03:58:06PM +0000, Sam Mason wrote:
On Wed, Mar 25, 2009 at 02:38:45PM +0000, Greg Stark wrote: Sam Mason s...@samason.me.uk writes: Why does it top out so much though? It goes up nicely to around ten clients (I tested with 8 and 12) and then tops out and levels off. The log is chugging along at around 2MB/s which is well above where they are for a single client, but it still seems as though things could go further.

Well 2MB/s sounds about right actually:

You have: 8kB / ( 1|7200|2min)
You want: MB/s
	* 1.92
	/ 0.5208

I'd need more explanation (or other pointers) to follow what you mean there. I've actually got a 15k disk, but it shouldn't matter much. 2MB/s seems to be consistent across any number of clients (specifically 1 to 48 here).

Heikki looked at this a while back and we concluded that the existing algorithm will only get 1/2 the optimal rate unless you have twice as many sessions as you ought to need to saturate the log i/o.

I'm writing to a 15k disk which gives me 250 rotations per second. In the case of a single client I'm getting about 220 transactions per second. That seems reasonable. When I have two clients this stays at about 220 transactions per second. Also reasonable; they end up serialising after each other. Three clients: I get about 320 tps. This appears to be consistent with 1.5*220 and would imply that there's always a spare client behind the lock that gets committed for free. Four clients: I get 430 tps, which would mean the queueing is all good. Below I've calculated the (mean) transactions per second over a series of runs, the value I'd expect to get (i.e. clients/2 times the single-client rate), and then the ratio of the two.
clients     tps      calc   ratio
      1   221.5
      2   223.8     220.0    102%
      3   323.5     330.0     98%
      4   427.7     440.0     97%
      6   647.4     660.0     98%
      8   799.7     880.0     91%
     12   946.0    1320.0     72%
     18  1020.6    1980.0     52%
     24  1089.2    2640.0     41%
     32  1116.6    3520.0     32%
     48  1141.8    5280.0     22%

As you can see the ratio between the tps I'm seeing and expecting drops off significantly after 18 clients, with the trend starting somewhere around seven clients. I don't understand why this would be happening.

My highly involved and complicated benchmarking is a shell script that does:

#!/bin/bash
nclients=$1
ittrs=$2

function gensql {
    echo "INSERT INTO bm (c,v) VALUES ('$1','0');"
    for (( i = 1; i < $ittrs; i++ )); do
        echo "UPDATE bm SET v = '$i' WHERE c = '$1';"
    done
    echo "DELETE FROM bm WHERE c = '$1';"
}

for (( c = 0; c < $nclients; c++ )); do
    gensql $c | psql -Xq -f - &
done
for (( c = 0; c < $nclients; c++ )); do
    wait
done

I'm running time test.sh 8 1000 and recording the time; tps = nclients * ittrs / time, where the time is the wall clock time expired. I'm repeating measurements four times, and the error bars in my SVG from before were the standard deviation of the runs. Something (the HOT code?) keeps the number of dead tuples consistent so I don't think this would be confounding things. But improvements would be appreciated.

-- Sam http://samason.me.uk/

Are you sure that you are able to actually drive the load at the high end of the test regime? You may need to use multiple clients to simulate the load effectively.

Cheers, Ken
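The rotation arithmetic and the ratio column above can be checked directly (a quick sanity calculation on Sam's figures, not part of his harness):

```python
# A 15k rpm disk gives 250 rotations per second. One 8kB WAL page
# flushed per rotation is about 2 MB/s of log, and a single client
# committing once per rotation caps out near 250 tps, close to the
# measured 221.5.
rps = 15000 / 60
assert rps == 250
assert rps * 8 == 2000                  # kB/s, i.e. ~2 MB/s

# Recompute the ratio column under the "1/2 the optimal rate" theory:
# expected tps = clients/2 * 220 single-client tps = clients * 110.
measured = {2: 223.8, 3: 323.5, 4: 427.7, 8: 799.7, 48: 1141.8}
for clients, tps in measured.items():
    print(clients, f"{100 * tps / (clients * 110):.0f}%")
```

These reproduce the 102%, 98%, 97%, 91%, and 22% entries in the table, so the puzzle is purely why the measured tps stops tracking clients/2 past roughly seven clients.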
Re: [HACKERS] improving concurrent transactin commit rate
On Wed, Mar 25, 2009 at 05:56:02PM +0000, Sam Mason wrote:
On Wed, Mar 25, 2009 at 12:01:57PM -0500, Kenneth Marshall wrote: On Wed, Mar 25, 2009 at 03:58:06PM +0000, Sam Mason wrote: [benchmark script quoted above] Are you sure that you are able to actually drive the load at the high end of the test regime? You may need to use multiple clients to simulate the load effectively.

Notice that the code is putting things into the background and then waiting for them to finish, so there will be multiple clients. Or maybe I'm misunderstanding what you mean. I've just tried modifying the code to write the generated SQL out to a set of files first, and this speeds things up by about 6% (the 48 client case goes from taking ~42 seconds to ~39 seconds), indicating that everything is probably OK with the test harness. Also note that this 6% improvement will be linear and across the board, and hence should just appear as slightly reduced performance for my system. As I'm not really interested in absolute performance and more in how the system scales as load increases, this will negate the effect even further.

-- Sam http://samason.me.uk/

I did notice how your test harness was designed. It just seemed that the process contention on your load generation system may bottleneck as the number of clients increases, and that may be the cause of your fall-off, or a contributor. You could test it by generating the load from independent boxes and seeing how the performance falls off as you add additional load clients+boxes.
My two cents, Ken
Re: [HACKERS] Problem with accessing Oracle from plperlu function when using remote pg client.
On Mon, Mar 16, 2009 at 03:16:07PM +0100, Tomasz Olszak wrote:
Greetings to All! I've tried to find a solution to my problem on other pg mailing lists, but without much effect. I have a table A in PG. There is also a table A in Oracle. I want to import a specific row from Oracle to PG, so I created a plperlu function:

CREATE OR REPLACE FUNCTION import.ora_a_row(a_id numeric) RETURNS dok_za AS $BODY$
-- in the PL body:
-- create_connection;
-- select all columns on oracle from table a where id = a_id;
-- returning tuple;
$BODY$ LANGUAGE 'plperlu' VOLATILE;

Then I can use such a function in PL/pgSQL:

DECLARE: var A%ROWTYPE; BEGIN; ... select * into var from import.ora_a_row(100); END; ...

As you see, it's very, very convenient. And it works, but only when I run select * from import.ora_a_row(100); from psql on the PostgreSQL server (local client). When I try to run that select from pgadmin or from a remote machine, I get a TNS error: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach) at line 20. I've tried different PostgreSQL versions, different perls, and different DBI Oracle packages, so I think the pg or perl versions are not the cause (and of course the environment variables are probably set, etc.). The Oracle base directory is about 1.6 gigabytes, so I think it's the full client (not instant). When I used pgAdmin 1.6 on the PostgreSQL server and left the host edit line blank (specifying only a pgport), it worked too. But when I wrote localhost as the host it didn't work (the same with connecting psql -h localhost -U user database). Has anybody encountered this kind of problem, or maybe it's a bug in plperlu? I'll be grateful for any help. Regards, Tomasz

This looks like an environment variable problem. The server does not run with the same set of settings as your psql program. I think that it will work once you get those issues ironed out.
Good luck, Ken
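Ken's diagnosis can be made concrete with a small hypothetical helper (the variable names below are the usual Oracle client ones, but which ones your installation actually needs is an assumption): the backend that runs the plperlu function inherits the postmaster's startup environment, not your login shell's, so the Oracle client inside it may be unable to resolve the TNS name even though a local psql session can.

```python
# Hypothetical check: report which Oracle client variables are
# missing from a given environment mapping. Pass os.environ on the
# server versus your shell to compare the two.

REQUIRED = ("ORACLE_HOME", "TNS_ADMIN")

def missing_oracle_env(env):
    return [var for var in REQUIRED if var not in env]

# Your interactive shell may be fine while the server's startup
# environment (e.g. as set before pg_ctl start) is not:
shell_env = {"ORACLE_HOME": "/opt/oracle",
             "TNS_ADMIN": "/opt/oracle/network/admin"}
server_env = {"PGDATA": "/var/lib/pgsql/data"}

assert missing_oracle_env(shell_env) == []
assert missing_oracle_env(server_env) == ["ORACLE_HOME", "TNS_ADMIN"]
```

The usual fix along these lines is to export the Oracle client variables in whatever script starts the postmaster, so every backend sees them.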
Re: [HACKERS] Problem with accessing Oracle from plperlu function when using remote pg client.
On Mon, Mar 16, 2009 at 02:30:28PM -0400, Jonah H. Harris wrote:
On Mon, Mar 16, 2009 at 2:26 PM, Jonah H. Harris jonah.har...@gmail.com wrote: On Mon, Mar 16, 2009 at 2:04 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: We already have one; it's called update_process_title. I have it turned off, and I still see the remote IP/port in the process list.

Ahh, this is why: init_ps_display(): set_ps_display(initial_str, true); Perhaps it should obey the configuration setting as well? -- Jonah H. Harris, Senior DBA myYearbook.com

What about having the GUC support off, on, and a format string to use, to fix this problem?

Ken
Re: [HACKERS] libxml incompatibility
This looks like a problem caused by two different libxml versions: the one used for the perl XML::LibXML wrappers and the one used to build PostgreSQL. They really need to be the same. Does it still segfault if they are identical?

Regards, Ken

On Fri, Mar 06, 2009 at 04:14:04PM -0300, Alvaro Herrera wrote:

Hi, It seems that if you load libxml into a backend for whatever reason (say you create a table with a column of type xml) and then create a plperlu function that uses XML::LibXML, we get a segmentation fault. This sequence reproduces the problem for me in 8.3:

create table xmlcrash (a xml);
insert into xmlcrash values ('<a/>');
create function xmlcrash() returns void language plperlu as $$ use XML::LibXML; $$;

The problem is reported as:

TRAP: BadArgument("!(((context) != ((void *) 0) && (((Node *) ((context)))->type) == T_AllocSetContext))", Archivo: "/pgsql/source/83_rel/src/backend/utils/mmgr/mcxt.c", Línea: 507)

-- Alvaro Herrera  http://www.CommandPrompt.com/  The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] libxml incompatibility
On Fri, Mar 06, 2009 at 02:58:30PM -0500, Andrew Dunstan wrote:

Alvaro Herrera wrote: Hi, It seems that if you load libxml into a backend for whatever reason (say you create a table with a column of type xml) and then create a plperlu function that uses XML::LibXML, we get a segmentation fault.

Yes, I discovered this a few weeks ago. It looks like libxml is not reentrant, so for perl you need to use some other XML library. Very annoying.

cheers, andrew

Ugh! That is worse than a simple library link incompatibility.

Ken
Re: [HACKERS] libxml incompatibility
On Fri, Mar 06, 2009 at 05:23:45PM -0300, Alvaro Herrera wrote:

Kenneth Marshall wrote: This looks like a problem caused by two different libxml versions: the one used for the perl XML::LibXML wrappers and the one used to build PostgreSQL. They really need to be the same. Does it still segfault if they are identical?

Unlikely, because AFAICT there's a single libxml installed on my system.

Yes, I saw Andrew's comment, and I have had that problem myself with Apache/PHP and perl with libxml. A simple library mismatch would at least be easy to resolve. :)

Regards, Ken
Re: [HACKERS] pg_restore --multi-thread
On Fri, Feb 20, 2009 at 09:22:58AM -0800, Joshua D. Drake wrote:

On Fri, 2009-02-20 at 09:33 -0500, Andrew Dunstan wrote: The short answer is that we don't know yet. There is anecdotal evidence that the number of CPUs on the server is a good place to start, but we should be honest enough to say that this is a new feature and we are still gathering information about its performance. If you want to give some advice, then I think the best advice is to try a variety of settings to see what works best for you, and if you have a good set of figures, report it back to us.

There has been some fairly heavy testing and research that caused the patch in the first place. The thread is here: http://archives.postgresql.org/pgsql-hackers/2008-02/msg00695.php It is a long thread. The end result was that the fastest restore time for 220G was achieved with 24 threads on an 8-core box. It came in at 3.5 hours. http://archives.postgresql.org/pgsql-hackers/2008-02/msg01092.php It is important to point out that this was a machine with 50 spindles, which is where your bottleneck is going to be immediately after solving the CPU-bound nature of the problem. So although the CPU question is easily answered, the IO is not. IO is extremely variable in its performance.

Sincerely, Joshua D. Drake

I also ran some tests against a more modest system that was still showing a performance improvement at (number-of-cores * 2): http://archives.postgresql.org/pgsql-hackers/2008-11/msg01399.php I think that a good starting point for any user should be the number of cores, given these two data points.

Cheers, Ken
Re: [HACKERS] WIP: hooking parser
On Thu, Feb 19, 2009 at 06:29:25PM +0000, Sam Mason wrote:

On Wed, Feb 18, 2009 at 10:30:12AM -0500, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: I'd be quite interested to support some kind of hook to deal with this Oracle null issue. It would be a great help for porting projects. However, doing this properly is probably more complex and needs further thought. I'd suggest writing a type of regression test first for Oracle null behavior and then evaluating any kind of hook or hack against that.

AFAIK, the Oracle behavior is just about entirely unrelated to the parser --- it's a matter of runtime comparison behavior. It is certainly *not* restricted to literal NULL/'' constants, which is the only case that a parser hack can deal with.

How about introducing a varchar2 type as in Oracle? It would be a bit of a fiddle going through all the operators and functions, making sure that versions existed to cast things back again, but it seems possible. I'm not sure how fragile user code would be with it, though; I'm mainly worried about it trying to convert things back to TEXT automatically and the resulting change in semantics. Any ideas about good ways to go?

Could you define a type/domain for varchar2, mapping it to varchar? There does not seem to be anything else that needs to be done.

Cheers, Ken
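Ken's domain suggestion can be sketched in a couple of lines. This is illustrative only: note Tom's caveat above that Oracle's NULL-vs-'' behavior is runtime comparison semantics, which a name alias does not change, and that domains do not accept type modifiers, so a length-constrained use like varchar2(10) would need its own domain:

```sql
-- minimal sketch of a varchar2 alias as a domain over varchar
CREATE DOMAIN varchar2 AS varchar;

CREATE TABLE t (name varchar2);
INSERT INTO t VALUES ('abc');  -- behaves exactly like varchar
```

Since a domain inherits all of varchar's operators and casts, no further operator work is needed; what it does not do is make '' compare equal to NULL.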
Re: [HACKERS] [PERFORM] GIST versus GIN indexes for intarrays
On Fri, Feb 13, 2009 at 04:12:53PM +0300, Teodor Sigaev wrote:

The short-term workaround for Rusty is probably to create his GIN index using the intarray-provided gin__int_ops opclass. But it

Right

seems to me that we ought to get rid of intarray's @> and <@ operators and have the module depend on the core anyarray operators, just as we have already done for = and <>. Comments?

Agree, will do. Although the built-in anyarray operators have ~N^2 behaviour, while intarray's version is only N*log(N).

Is there a way to have the built-in anyarray operators be N*log(N)?

Ken
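Teodor's complexity point can be made concrete: a containment check that compares every element pair is ~N^2, while sorting both inputs and doing a single merge pass is N*log(N). A language-agnostic sketch of the two strategies (illustrative Python, not the actual C implementations in core or contrib/intarray):

```python
import random

def contains_quadratic(a, b):
    # naive strategy: for each element of b, scan all of a -> O(len(a) * len(b))
    return all(any(x == y for y in a) for x in b)

def contains_sorted(a, b):
    # merge strategy: sort both sides, then a single linear pass -> O(n log n)
    a, b = sorted(a), sorted(b)
    i = 0
    for x in b:
        while i < len(a) and a[i] < x:
            i += 1
        if i == len(a) or a[i] != x:
            return False
    return True

a = random.sample(range(10000), 500)
b = random.sample(a, 100)           # b drawn from a, so both checks succeed
print(contains_quadratic(a, b), contains_sorted(a, b))  # True True
```

For arrays of a few hundred elements the difference is already measurable, which is presumably why intarray took the sorting route.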
Re: [HACKERS] note on hash indexes
I had submitted the documentation change as part of my hash function patch, but it was removed as not relevant. (It wasn't really.) I would basically remove the first sentence:

Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For this reason, hash index use is presently discouraged.

Ken

On Wed, Feb 04, 2009 at 01:22:23PM -0300, Alvaro Herrera wrote:

Hi, indices.sgml contains this paragraph about hash indexes:

Note: Testing has shown PostgreSQL's hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For these reasons, hash index use is presently discouraged.

However, it seems to me that hash indexes are much improved in 8.4, so maybe this needs to be reworded. I'm not sure to what point they have been improved though.

-- Alvaro Herrera  http://www.CommandPrompt.com/  PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] note on hash indexes
On Wed, Feb 04, 2009 at 11:22:44PM +0100, Zdenek Kotala wrote:

The main speed improvement is for the varchar datatype. I think it should be mentioned here as well. IIRC, times are similar to B-Tree for the integer datatype.

Zdenek

Kenneth Marshall wrote on Wed, 04.02.2009 at 13:57 -0600: I had submitted the documentation change as part of my hash function patch, but it was removed as not relevant. (It wasn't really.) I would basically remove the first sentence: "Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For this reason, hash index use is presently discouraged." Ken

On Wed, Feb 04, 2009 at 01:22:23PM -0300, Alvaro Herrera wrote: Hi, indices.sgml contains this paragraph about hash indexes: "Note: Testing has shown PostgreSQL's hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For these reasons, hash index use is presently discouraged." However, it seems to me that hash indexes are much improved in 8.4, so maybe this needs to be reworded. I'm not sure to what point they have been improved though. -- Alvaro Herrera  http://www.CommandPrompt.com/  PostgreSQL Replication, Consulting, Custom Development, 24x7 support

The speed improvement applies particularly to any datatype that is larger than an integer (typically 4 bytes). Also, the size of fields that can be indexed efficiently is much, much larger than the ~2K limit for Btree. And even 32-bit quantities may be indexed more efficiently than with Btrees for large indexes, due to the O(1) probe behavior: Btrees typically need to cache/probe the upper levels of the tree to locate the tuple. I have held off on extensive benchmarking until WAL has been implemented.
Regards, Ken
Re: [HACKERS] pg_upgrade project status
On Tue, Jan 27, 2009 at 11:39:50AM -0300, Alvaro Herrera wrote:

Heikki Linnakangas wrote: Andrew Dunstan wrote: Zdenek Kotala wrote:

2) pg_upgrade.sh http://archives.postgresql.org/pgsql-hackers/2008-12/msg00248.php Pg_upgrade.sh is a shell script for catalog conversion. It works for the 8.3-8.4 upgrade. It will be useful while we do not have a better solution. The disadvantage is that it is a korn shell script. The idea is to rewrite it in PERL, which is more portable, but I'm not a PERL expert and currently there is no workable solution.

I have had a very brief look at this. Translation to perl doesn't look difficult. I'll see what I can do during the next week or so.

We don't require perl for any other feature, do we? Seems like a pretty onerous requirement for Windows in particular.

We do use perl in the build scripts, but that's only required if you want to compile from source. I think it's fairly easy to install Perl on Windows, actually.

It doesn't sound too onerous a requirement if you want in-place upgrade; actually it looks a very reasonable one. Much more reasonable than Korn shell in any case (or any shell for that matter; I think anything is going to be more of a potentially painful platform dependency than Perl).

-- Alvaro Herrera  http://www.CommandPrompt.com/  PostgreSQL Replication, Consulting, Custom Development, 24x7 support

+1 I agree with Alvaro. Perl is a breeze to install on Windows with ActiveState, and using shell code to perform this task adds a huge platform dependency to the code. Perl is a known and well-defined quantity for scripting.

Cheers, Ken
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Sun, Jan 25, 2009 at 10:27:03PM -0600, Kenneth Marshall wrote:

On Sat, Jan 10, 2009 at 01:36:25PM -0500, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: I ran 5 times on both old and new code, eliminating the top and bottom and taking the average of the remaining 3, and I got a 6.9% performance improvement with the new code.

The question that has been carefully evaded throughout the discussion of this patch is whether the randomness of the hash result is decreased, and if so what is that likely to cost us in performance of the various hash-dependent algorithms. I would still like to have an answer to that before we make a change to gain marginal performance improvement in the hash function itself (which is already shown to be barely measurable in the total context of a hash-dependent operation...)

regards, tom lane

Dear Hackers and Reviewers,

In response to Tom's questioning the randomness of hash_any when the mixing functions are split into two, the first used when sequentially processing the input and the second for the final mix, I have generated a more detailed analysis of the two hash_any functions. First, one change to the 11/2008 patch: the keylen is added to a, b and c initially, so we do not need to add it later on. This is the micro-diff:

--- hashfunc.c_TWOMIX	2009-01-22 14:07:34.0 -0600
+++ hashfunc.c_TWOMIX2	2009-01-22 14:17:32.0 -0600
@@ -336,7 +336,6 @@
 	/* handle the last 11 bytes */
 	k = (const unsigned char *) ka;
-	c += keylen;
 #ifdef WORDS_BIGENDIAN
 	switch (len)
 	{
@@ -439,7 +438,6 @@
 	}
 	/* handle the last 11 bytes */
-	c += keylen;
 #ifdef WORDS_BIGENDIAN
 	switch (len)		/* all the case statements fall through */

The remainder of this document will use the runs from my initial results, broken out using various power-of-two bucket sizes to simulate our actual use in PostgreSQL as the number of items hashed increases and we use more and more bits of our hash to identify the appropriate bucket.
I have run each test twice, once with our current hash_any() with the single mix() function and then a second time using my patch from the November commitfest plus the patch above to produce a new hash_any() with two separate mixing functions mix() and final(). For each run I have generated a sequence of unique inputs, up to the number of buckets, hashed them with the hash functions (both old and new), then I calculate the expected number of collisions p(n) using the Poisson formula for each number of buckets, where the number of buckets are 2**16, 2**18, 2**20, 2**22, 2**24, and 2**26. For my initial run, I used a string consisting of the letter 'a' followed by the integer representation of the numbers from 0 to (number of buckets - 1):

1) auint32 (i.e. a1, a0002...)

Number of buckets: 65536
Total number of items: 65536
Expected number with n items:  24109 24109 12054 4018 1004 200 33 4
Actual number mix():           24044 24172 12078 4036 980 186 30 10
Actual number mix()/final():   24027 24232 12060 3972 1001 207 31 5 1

Number of buckets: 262144
Total number of items: 262144
Expected number with n items:  96437 96437 48218 16072 4018 803 133 19 2
Actual number mix():           96224 96730 48240 15951 4094 744 143 17 1
Actual number mix()/final():   96335 96646 48071 16128 4018 801 122 21 2

Number of buckets: 1048576
Total number of items: 1048576
Expected number with n items:  385749 385749 192874 64291 16072 3214 535 76 9
Actual number mix():           385716 385596 193243 64115 16053 3285 478 77 12 1
Actual number mix()/final():   385955 385016 193789 63768 16259 3190 511 79 8 1

Number of buckets: 4194304
Total number of items: 4194304
Expected number with n items:  1542998 1542998 771499 257166 64291 12858 2143 306 38
Actual number mix():           1542536 1543489 771351 25 63830 12847 2123 326 19 5 1
Actual number mix()/final():   1541828 1544429 772072 256178 64579 12774 2129 288 22 5

Number of buckets: 16777216
Total number of items: 16777216
Expected number with n items:  6171992 6171992 3085996 1028665 257166 51433 8572 1224 153
Actual number mix():           6170866 6174079 3085912 1027140 257808 51385 8638 1219 146 23
Actual number mix()/final():   6172058 6171991 3086279 1027916 257535 51465 8554 1243 149 23 3

Number of buckets: 67108864
Total number of items: 67108864
Expected number with n items:  24687971 24687971 12343985 4114661 1028665 205733 34288 4898 612
Actual number mix():           24686110 24690897 12344232 4113515 1028232 205682 34546 4942 629 72 7
Actual number mix()/final():   24708515 24669248 12333034 4114796 1034256 208424 34888 5023 598 77 5

Here
Re: [HACKERS] pg_upgrade project status
On Tue, Jan 27, 2009 at 04:23:18PM +0100, Harald Armin Massa wrote:

I think it's fairly easy to install Perl on Windows, actually. It doesn't sound too onerous a requirement if you want in-place upgrade; actually it looks a very reasonable one. Much more reasonable than Korn shell in any case (or any shell for that matter; I think anything is going to be more of a potentially painful platform dependency than Perl).

May I humbly recommend a rewrite in Python? That should be as difficult/easy as PERL, AND there is a very robust py2exe implementation, which allows creating a single .exe file which contains everything. Python is present on all Linux systems; Windows users are totally comfortable with .exe files.

Harald

Great idea; perl2exe is available as well and will allow the continued use of perl for our internal scripting language.

Cheers, Ken
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Sat, Jan 10, 2009 at 01:36:25PM -0500, Tom Lane wrote:

Jeff Davis pg...@j-davis.com writes: I ran 5 times on both old and new code, eliminating the top and bottom and taking the average of the remaining 3, and I got a 6.9% performance improvement with the new code.

The question that has been carefully evaded throughout the discussion of this patch is whether the randomness of the hash result is decreased, and if so what is that likely to cost us in performance of the various hash-dependent algorithms. I would still like to have an answer to that before we make a change to gain marginal performance improvement in the hash function itself (which is already shown to be barely measurable in the total context of a hash-dependent operation...)

regards, tom lane

Dear Hackers and Reviewers,

In response to Tom's questioning the randomness of hash_any when the mixing functions are split into two, the first used when sequentially processing the input and the second for the final mix, I have generated a more detailed analysis of the two hash_any functions. First, one change to the 11/2008 patch: the keylen is added to a, b and c initially, so we do not need to add it later on. This is the micro-diff:

--- hashfunc.c_TWOMIX	2009-01-22 14:07:34.0 -0600
+++ hashfunc.c_TWOMIX2	2009-01-22 14:17:32.0 -0600
@@ -336,7 +336,6 @@
 	/* handle the last 11 bytes */
 	k = (const unsigned char *) ka;
-	c += keylen;
 #ifdef WORDS_BIGENDIAN
 	switch (len)
 	{
@@ -439,7 +438,6 @@
 	}
 	/* handle the last 11 bytes */
-	c += keylen;
 #ifdef WORDS_BIGENDIAN
 	switch (len)		/* all the case statements fall through */

The remainder of this document will use the runs from my initial results, broken out using various power-of-two bucket sizes to simulate our actual use in PostgreSQL as the number of items hashed increases and we use more and more bits of our hash to identify the appropriate bucket.
I have run each test twice, once with our current hash_any() with the single mix() function and then a second time using my patch from the November commitfest plus the patch above to produce a new hash_any() with two separate mixing functions mix() and final(). For each run I have generated a sequence of unique inputs, up to the number of buckets, hashed them with the hash functions (both old and new), then I calculate the expected number of collisions p(n) using the Poisson formula for each number of buckets, where the number of buckets are 2**16, 2**18, 2**20, 2**22, 2**24, and 2**26. For my initial run, I used a string consisting of the letter 'a' followed by the integer representation of the numbers from 0 to (number of buckets - 1):

1) auint32 (i.e. a1, a0002...)

Number of buckets: 65536
Total number of items: 65536
Expected number with n items:  24109 24109 12054 4018 1004 200 33 4
Actual number mix():           24044 24172 12078 4036 980 186 30 10
Actual number mix()/final():   24027 24232 12060 3972 1001 207 31 5 1

Number of buckets: 262144
Total number of items: 262144
Expected number with n items:  96437 96437 48218 16072 4018 803 133 19 2
Actual number mix():           96224 96730 48240 15951 4094 744 143 17 1
Actual number mix()/final():   96335 96646 48071 16128 4018 801 122 21 2

Number of buckets: 1048576
Total number of items: 1048576
Expected number with n items:  385749 385749 192874 64291 16072 3214 535 76 9
Actual number mix():           385716 385596 193243 64115 16053 3285 478 77 12 1
Actual number mix()/final():   385955 385016 193789 63768 16259 3190 511 79 8 1

Number of buckets: 4194304
Total number of items: 4194304
Expected number with n items:  1542998 1542998 771499 257166 64291 12858 2143 306 38
Actual number mix():           1542536 1543489 771351 25 63830 12847 2123 326 19 5 1
Actual number mix()/final():   1541828 1544429 772072 256178 64579 12774 2129 288 22 5

Number of buckets: 16777216
Total number of items: 16777216
Expected number with n items:  6171992 6171992 3085996 1028665 257166 51433 8572 1224 153
Actual number mix():           6170866 6174079 3085912 1027140 257808 51385 8638 1219 146 23
Actual number mix()/final():   6172058 6171991 3086279 1027916 257535 51465 8554 1243 149 23 3

Number of buckets: 67108864
Total number of items: 67108864
Expected number with n items:  24687971 24687971 12343985 4114661 1028665 205733 34288 4898 612
Actual number mix():           24686110 24690897 12344232 4113515 1028232 205682 34546 4942 629 72 7
Actual number mix()/final():   24708515 24669248 12333034 4114796 1034256 208424 34888 5023 598 77 5

Here is a second run with number of items = (number of buckets)/2:

Number of buckets: 65536
Total number of items: 32768
Expected number with n items:  39749 19874 4968 828 103 10
Actual
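The "Expected number with n items" rows above come from the Poisson approximation: with n items hashed uniformly into m buckets, the load is lambda = n/m, and the expected number of buckets holding exactly k items is m * e^(-lambda) * lambda^k / k!. A short sketch (my own illustration, not Ken's actual test harness) that reproduces the expected rows, including the half-full second run:

```python
import math

def expected_bucket_counts(n_items, n_buckets, max_k):
    # expected number of buckets containing exactly k items, for k = 0..max_k,
    # truncated to integers as in the tables above
    lam = n_items / n_buckets
    return [int(n_buckets * math.exp(-lam) * lam ** k / math.factorial(k))
            for k in range(max_k + 1)]

# full run: 65536 items into 65536 buckets (lambda = 1)
print(expected_bucket_counts(65536, 65536, 7))
# -> [24109, 24109, 12054, 4018, 1004, 200, 33, 4]

# second run: 32768 items into 65536 buckets (lambda = 0.5)
print(expected_bucket_counts(32768, 65536, 5))
# -> [39749, 19874, 4968, 828, 103, 10]
```

Comparing these expectations against the "Actual" rows is the whole test: if the observed occupancy histogram tracks the Poisson prediction, the hash output is behaving as if uniform.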
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Fri, Jan 09, 2009 at 02:00:39PM -0800, Jeff Davis wrote:

On Fri, 2009-01-09 at 14:29 -0600, Kenneth Marshall wrote: Jeff, Thanks for the review. I would not really expect any differences in hash index build times other than normal noise variances. The most definitive benchmark that I have seen was done with my original patch submission in March, posted by Luke of Greenplum: "We just applied this and saw a 5 percent speedup on a hash aggregation query with four columns in a 'group by' clause run against a single TPC-H table." I wonder if they would be willing to re-run their test? Thanks again.

Separating mix() and final() should have some performance benefit, right?

Regards, Jeff Davis

Yes, it does, but the results can be swamped by other latencies in the code path. Tests such as Tom's benchmark of the underlying functions are needed to isolate the timings effectively, or a benchmark like Greenplum's that will benefit from a more efficient function.

Ken
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Sat, Jan 10, 2009 at 10:56:15AM -0800, Jeff Davis wrote:

On Sat, 2009-01-10 at 13:36 -0500, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: I ran 5 times on both old and new code, eliminating the top and bottom and taking the average of the remaining 3, and I got a 6.9% performance improvement with the new code.

The question that has been carefully evaded throughout the discussion of this patch is whether the randomness of the hash result is decreased, and if so what is that likely to cost us in performance of the various hash-dependent algorithms. I would still like to have an answer to that before we make a change to gain marginal performance improvement in the hash function itself (which is already shown to be barely measurable in the total context of a hash-dependent operation...)

In: http://archives.postgresql.org/message-id/20081104202655.gp18...@it.is.rice.edu Ken showed that the number of hash collisions is the same in the old and the new code for his test data. Not a direct measurement of randomness, but it's some kind of indication. I'm not an authority on either hash functions or statistics, so I'll have to defer to Ken or someone else to actually prove that the randomness is maintained.

Regards, Jeff Davis

First, while I am not an expert by any means, basic statistics will give you the probability of a collision when packing N items into M buckets chosen at random. In all of my tests, I used 1.6M items into 2^32 buckets. If the bit mixing is complete and random, the number of collisions should be close to the same no matter what arrangement of bits is used for inputs. I think my test cases indicate quite clearly that the new hash function is as independent of bit-order as the older functions, in that the number of bucket collisions is basically the same for all layouts. I have the test harness and can test any other input that you would like me to.
Second, the author of the basic hash function (old and new) has performed more extensive testing and has shown that the new functions are better at randomizing bits than his original function. I will try to run a micro-benchmark of my original patch from March and of the incremental approach that resulted in my November patch tomorrow.

Cheers, Ken
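Ken's statistical claim is easy to check: for n distinct keys hashed uniformly into m buckets, the expected number of colliding insertions is n - m*(1 - (1 - 1/m)^n), which is approximately n^2/2m. For the cracklib runs (n = 1648379 words, m = 2^32 hash values) that predicts roughly 316 collisions, right in the middle of the observed 287-350 range in the tables. A quick sketch of the arithmetic (my own illustration, not part of the thread):

```python
def expected_collisions(n, m):
    # expected occupied buckets is m*(1 - (1 - 1/m)**n); every insertion
    # beyond the first into an occupied bucket is a collision
    occupied = m * (1 - (1 - 1 / m) ** n)
    return n - occupied

print(round(expected_collisions(1648379, 2 ** 32)))  # ~316, matching the tables
```

So collision counts near 300-350 are exactly what a uniformly random 32-bit hash should produce on this input size; a biased hash would show substantially more.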
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Sat, Jan 10, 2009 at 01:57:27PM -0500, Gregory Stark wrote:

Tom Lane t...@sss.pgh.pa.us writes: Jeff Davis pg...@j-davis.com writes: I ran 5 times on both old and new code, eliminating the top and bottom and taking the average of the remaining 3, and I got a 6.9% performance improvement with the new code.

The question that has been carefully evaded throughout the discussion of this patch is whether the randomness of the hash result is decreased,

In fairness that doesn't seem to be the case. The original patch was posted with such an analysis using cracklib and raw binary data: http://article.gmane.org/gmane.comp.db.postgresql.devel.general/105675

marginal performance improvement in the hash function itself (which is already shown to be barely measurable in the total context of a hash-dependent operation...)

If it's a 6% gain in the speed of Hash Join or HashAggregate it would be very interesting. But I gather it's a 6% speedup in the time spent actually in the hash function. Is that really where much of our time is going? If it's 10% of the total time to execute one of these nodes then we're talking about a 0.6% optimization...

The Greenplum test did show a 5% increase in performance with the replacement functions in March.

Regards, Ken
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Fri, Jan 09, 2009 at 12:04:15PM -0800, Jeff Davis wrote:

On Mon, 2008-12-22 at 13:47 -0600, Kenneth Marshall wrote: Dear PostgreSQL developers, I am re-sending this to keep this last change to the internal hash function on the radar.

Hi Ken,

A few comments:

1. New patch with very minor changes attached.
2. I reverted the change you made to indices.sgml. We still don't use WAL for hash indexes, and in my opinion we should continue to discourage their use until we do use WAL. We can add back in the comment that hash indexes are suitable for large keys if we have some results to show that.
3. There was a regression test failure in union.sql because the ordering of the results was different. I updated the regression test.
4. Hash functions affect a lot more than hash indexes, so I ran through a variety of tests that use a HashAggregate plan. Test setup and results are attached. These results show no difference between the old and the new code (about 0.1% better).
5. The hash index build time shows some improvement. The new code won in every instance in which there were a lot of duplicates in the table (100 distinct values, 50K of each) by around 5%. The new code appeared to be the same or slightly worse in the case of hash index builds with few duplicates (100 distinct values, 5 of each). The difference was about 1% worse, which is probably just noise.

Note: I'm no expert on hash functions. Take all of my tests with a grain of salt. I would feel a little better if I saw at least one test that showed better performance of the new code on a reasonable-looking distribution of data. The hash index build that you showed only took a second or two -- it would be nice to see a test that lasted at least a minute.

Regards, Jeff Davis

Jeff, Thanks for the review. I would not really expect any differences in hash index build times other than normal noise variances.
The most definitive benchmark that I have seen was done with my original patch submission in March, posted by Luke of Greenplum: "We just applied this and saw a 5 percent speedup on a hash aggregation query with four columns in a 'group by' clause run against a single TPC-H table." I wonder if they would be willing to re-run their test? Thanks again.

Regards, Ken
Re: [HACKERS] float8 strtod weirdness
On Wed, Jan 07, 2009 at 08:12:44PM +0530, Nikhil Sontakke wrote:

Hi, Consider the following with the latest CVS sources:

postgres=# create table temp(val float4);
CREATE TABLE
postgres=# insert into temp values (415.1);
INSERT 0 1
postgres=# select * from temp where val = 415.1;
 val
-----
(0 rows)

!? The reason seems to be that 415.1 ends up being treated as a numeric and is converted into float8 (why not float4? - it could have helped to use the float4eq function then). The float8in function uses strtod, which converts '415.1' into 415.10000000000002, causing the subsequent comparison to fail. I guess there are ample cases of float/strtod weirdness around? Needless to mention, I was mighty surprised on seeing the output for the first time around :)

Casting to float4 works as expected:

postgres=# select * from rel where x = 415.1::float4;
   x
-------
 415.1
(1 row)

Regards, Nikhils -- http://www.enterprisedb.com

The traditional approach to an equality test with floating point is to do the check plus-or-minus some value epsilon. Otherwise, such seemingly bizarre behavior results.

Cheers, Ken
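The underlying issue is that 415.1 has no exact binary representation: parsed as a float8 it is ~415.10000000000002, while the float4 column stores ~415.100006, so the widened comparison can never be equal. A sketch of both the failure and Ken's epsilon workaround (Python, using struct to emulate float4 rounding; not PostgreSQL code):

```python
import struct

def to_float4(x):
    # round a double (Python float / PostgreSQL float8) to single precision
    return struct.unpack('f', struct.pack('f', x))[0]

literal = 415.1             # the query literal, parsed as a float8
stored = to_float4(415.1)   # what the float4 column actually holds

print(stored == literal)             # False: 415.100006... != 415.100000...
print(abs(stored - literal) < 1e-4)  # True: epsilon comparison succeeds
```

The explicit cast in Nikhil's second query works for the same reason: rounding the literal to float4 makes both sides identical bit patterns, so plain equality holds again.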
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
Dear PostgreSQL developers,

I am re-sending this to keep this last change to the internal hash function on the radar.

Ken

Sorry about the delay for this update to the new hash index implementation. I was trying to get the WAL logging in place and forgot to post the actual patch. The WAL for hash indexes will need to wait for 8.5, but I did want to add back in the piece of the Bob Jenkins 2006 hash function that was stripped out of the initial patch on application, due to concerns about the randomness of the resulting hash values. Here is a re-post of my initial findings comparing the old/new Jenkins hash from lookup2 and lookup3. I have added a third column containing the results for the hash_any() resulting from the attached patch, as well as a simple timing test for a DB reindex both before and after patching. Also attached is a simple documentation patch updating the note attached to the hash index description.

Regards,
Ken

Hi,

I have finally had a chance to do some investigation on the performance of the old hash mix() function versus the updated mix()/final() in the new hash function. Here is a table of my current results for both the old and the new hash function. In this case "cracklib" refers to the cracklib-dict containing 1648379 unique words, massaged in various ways to generate input strings for the hash functions. The result is the number of collisions in the hash values generated:

hash input                              old    new    newv2
----------                              ---    ---    -----
cracklib                                338    316    338
cracklib x 2 (i.e. clibclib)            305    319    300
cracklib x 3 (clibclibclib)             323    329    315
cracklib x 10                           302    310    329
cracklib x 100                          350    335    298
cracklib x 1000                         314    309    315
cracklib x 100 truncated to char(100)   311    327    320
uint32 from 1-1648379                   309    319    347
(uint32 1-1948379)*256                  309    314    304
(uint32 1-1948379)*16                   310    314    324
auint32 (i.e. a1,a0002...)              320    321    312
uint32uint32 (i.e. uint64)              321    287    309

The different result columns are: old = Jenkins 1996 hash function (lookup2.c), new = Jenkins 2006 hash function (lookup3.c), and newv2 = adaptation of the current hash_any() to incorporate the separate mix()/final() functions. As you can see from the results, splitting mix() and final() apart does not result in any perceptible loss of randomness in the hash assignment.

I also ran a crude timing test: a reindex of the following database:

CREATE TABLE dict (word text);
CREATE INDEX wordhash ON dict USING hash (word);
INSERT INTO dict (word) VALUES ('s;lhkjdpyoijxfg;lktjgh;sdlhkjo');
INSERT INTO dict (SELECT MAX(word)||MAX(word) FROM dict);
... (21 times)
REINDEX TABLE ...

The average time to reindex the table using our current hash_any() without the separate mix()/final() was 1696 ms, versus 1482 ms with the separate mix()/final() stages, giving almost 13% better performance for this stupid metric.

--- indices.sgml	2008-10-13 14:40:06.000000000 -0500
+++ indices.sgml.NEW	2008-11-04 12:42:35.000000000 -0600
@@ -190,13 +190,11 @@
   <note>
    <para>
-    Testing has shown <productname>PostgreSQL</productname>'s hash
-    indexes to perform no better than B-tree indexes, and the
-    index size and build time for hash indexes is much worse.
-    Furthermore, hash index operations are not presently WAL-logged,
+    <productname>PostgreSQL</productname>'s hash indexes provide
+    the fast O(1) lookups, even for very large objects.
+    Hash index operations are not presently WAL-logged,
     so hash indexes might need to be rebuilt with <command>REINDEX</>
-    after a database crash.
-    For these reasons, hash index use is presently discouraged.
+    after a database crash.
    </para>
   </note>

--- hashfunc.c.ORIG	2008-09-03 13:07:14.000000000 -0500
+++ hashfunc.c.NEW	2008-11-04 08:36:16.000000000 -0600
@@ -200,39 +200,94 @@
  * hash function, see http://burtleburtle.net/bob/hash/doobs.html,
  * or Bob's article in Dr. Dobb's Journal, Sept. 1997.
  *
- * In the current code, we have adopted an idea from Bob's 2006 update
- * of his hash function, which is to fetch the data a word at a time when
- * it is suitably aligned.  This makes for a useful speedup, at the cost
- * of having to maintain four code paths (aligned vs unaligned, and
- * little-endian vs big-endian).  Note that we have NOT adopted his newer
- * mix() function, which is faster but may sacrifice some randomness.
+ * In the current code, we have adopted Bob's 2006 update of his hash,
+ * which fetches the data a word at a time when it is suitably aligned.
+ * This makes for a useful speedup, at the cost of having to maintain
+ * four code paths (aligned vs unaligned, and little-endian vs big-endian).
+ * It also uses two separate mixing functions, mix() and final().
Re: [HACKERS] Preventing index scans for non-recoverable index AMs
On Wed, Dec 17, 2008 at 05:42:41PM -0500, Jaime Casanova wrote:

> On Wed, Dec 17, 2008 at 4:26 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> Hot Standby won't work with hash indexes because they are
>> non-recoverable. We have a number of ways of dealing with this:
>
> i don't see a reason for reinventing the wheel; we don't have WAL for
> hash indexes because it makes them slower without any benefit at all...
> now there will be one...
>
> --
> Atentamente,
> Jaime Casanova
> Soporte y capacitación de PostgreSQL
> Asesoría y desarrollo de sistemas
> Guayaquil - Ecuador
> Cel. +59387171157

I think having your index survive a server power outage or other crash is a very good thing. Rebuilding a hash index for the case for which it is preferred (large, large tables) would be excruciating.

Ken
Re: [HACKERS] Preventing index scans for non-recoverable index AMs
On Wed, Dec 17, 2008 at 10:58:11PM +0000, Simon Riggs wrote:

> On Wed, 2008-12-17 at 16:47 -0600, Kenneth Marshall wrote:
>> I think having your index survive a server power outage or other crash
>> is a very good thing. Rebuilding a hash index for the case for which
>> it is preferred (large, large tables) would be excruciating.
>
> Completely agree. We may be outta time to make it happen.

I agree. I was working on adding the WAL and ran up against the deadline. A rushed hash WAL implementation would be worse than the other alternatives. I plan on picking it back up after 8.4 is out the door.

Regards,
Ken
Re: [HACKERS] Preventing index scans for non-recoverable index AMs
On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:

> On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <k...@rice.edu> wrote:
>> Rebuilding a hash index for the case for which it is preferred (large,
>> large tables) would be excruciating.
>
> there's such a situation?

As of 8.4, yes.

Ken
Re: [HACKERS] Preventing index scans for non-recoverable index AMs
On Wed, Dec 17, 2008 at 05:10:40PM -0600, Kenneth Marshall wrote:

> On Wed, Dec 17, 2008 at 06:07:41PM -0500, Jaime Casanova wrote:
>> On Wed, Dec 17, 2008 at 5:47 PM, Kenneth Marshall <k...@rice.edu> wrote:
>>> Rebuilding a hash index for the case for which it is preferred
>>> (large, large tables) would be excruciating.
>>
>> there's such a situation?
>
> As of 8.4, yes.

In addition, hash indexes can index items larger than the 1/3 page limit of btree indexes.

Ken
Re: [HACKERS] cvs head initdb hangs on unixware
Would it be reasonable to turn off optimization for this file?

Ken

On Tue, Dec 09, 2008 at 05:47:47PM +0100, [EMAIL PROTECTED] wrote:

> On Tue, 9 Dec 2008, Tom Lane wrote:
>
>> [EMAIL PROTECTED] writes:
>>> Guess what! With the fprintf .. descending node... in place,
>>> everything goes well. The optimizer definitely does something weird
>>> along the definition/assignment of leftok/rightok.
>>
>> Hmm, so the problem is in that second loop. The trick is to pick some
>> reasonably non-ugly code change that makes the problem go away. The
>> first thing I'd try is to get rid of the overly cute optimization
>>
>>     int rightnodeno = leftnodeno + 1;
>>
>> and make it just read
>>
>>     int rightnodeno = rightchild(nodeno);
>>
>> If that doesn't work, we might try refactoring the code enough to get
>> rid of the goto, but that looks a little bit tedious.
>>
>> regards, tom lane
>
> I tried that, and also moving the leftok/rightok declarations outside
> the loop and refactoring the leftok/rightok assignment code. Nothing
> worked!
>
> Regards,
> --
> Olivier PRENANT              Tel: +33-5-61-50-97-00 (Work)
> 15, Chemin des Monges             +33-5-61-50-97-01 (Fax)
> 31190 AUTERIVE                    +33-6-07-63-80-64 (GSM)
> FRANCE                       Email: [EMAIL PROTECTED]
> --
> Make your life a dream, make your dream a reality. (St Exupery)
Re: [HACKERS] WIP parallel restore patch
Okay, I have had a chance to run some timing benchmarks. Here are my results for the parallel pg_restore patch:

Ken

--
Server settings:

max_connections = 100           # (change requires restart)
shared_buffers = 256MB          # min 128kB
work_mem = 128MB                # min 64kB
maintenance_work_mem = 256MB    # min 1MB
fsync = on                      # turns forced synchronization on or off
synchronous_commit = off        # immediate fsync at commit
full_page_writes = on           # recover from partial page writes
checkpoint_segments = 10        # in logfile segments, min 1, 16MB each
autovacuum = on                 # Enable autovacuum subprocess?  'on'

The total final database size is 6.5GB. Here are the timings for the different run parallelism from 1 to 8 on a 4-core AMD box:

-bash-3.00$ time pg_restore -U postgres -p 5435 -d rttest /tmp/rtout.pz
...
real    19m3.175s
user    1m2.968s
sys     0m8.202s

improvement - 0%

-bash-3.00$ time pg_restore -U postgres -p 5435 -m 2 -d rttest /tmp/rtout.pz
...
real    12m55.680s
user    1m12.440s
sys     0m8.343s

improvement - 32%

-bash-3.00$ time pg_restore -U postgres -p 5435 -m 4 -d rttest /tmp/rtout.pz
...
real    9m45.056s
user    1m1.892s
sys     0m8.980s

improvement - 49%

The system only has 4 cores, but here are the results with -m 8:

-bash-3.00$ time pg_restore -U postgres -p 5435 -m 8 -d rttest /tmp/rtout.pz
...
real    8m15.320s
user    0m55.206s
sys     0m8.678s

improvement - 53%
Re: [HACKERS] WIP parallel restore patch
On Thu, Nov 20, 2008 at 02:26:14PM -0500, Andrew Dunstan wrote:

> Kenneth Marshall wrote:
>> Okay, I have had a chance to run some timing benchmarks. Here are my
>> results for the parallel pg_restore patch:
>> [...]
>
> Interesting. Can you try with two changes? Turn fsync off, and use the
> --truncate-before-load switch.
>
> In general, though, this is fairly much in line with other experience,
> i.e. we can get up to about n/2 times speedup with n cores.
>
> thanks
>
> andrew

Okay, here is the same test run with:

fsync = off
--truncate-before-load

Cheers,
Ken

-bash-3.00$ time pg_restore -U postgres -p 5435 --truncate-before-load -d rttest /tmp/rtout.pz
...
real    16m25.031s
user    1m3.707s
sys     0m8.776s

improvement - 0%

-bash-3.00$ time pg_restore -U postgres -p 5435 -m 2 --truncate-before-load -d rttest /tmp/rtout.pz
...
real    10m26.730s
user    0m48.782s
sys     0m7.214s

improvement - 36%

-bash-3.00$ time pg_restore -U postgres -p 5435 -m 4 --truncate-before-load -d rttest /tmp/rtout.pz
...
real    8m5.061s
user    0m48.657s
sys     0m7.602s

improvement - 51%

-bash-3.00$ time pg_restore -U postgres -p 5435 -m 8 --truncate-before-load -d rttest /tmp/rtout.pz
...
real    6m18.787s
user    0m45.361s
sys     0m7.811s

improvement - 62%
Re: [RRR] [HACKERS] Tests citext casts
On Fri, Nov 07, 2008 at 10:15:17AM -0800, David E. Wheeler wrote:

> On Nov 5, 2008, at 12:34 PM, Kenneth Marshall wrote:
>> I am using the anonymous CVS repository, it returns the following
>> information in pg_catalog.pg_settings:
>
> What is lc_collate set to?
>
>     % show lc_collate;
>
> FWIW, I just ran the tests myself and all passed, with and without the
> patch (using en_US.UTF-8). I think that the regression tests generally
> expect to be run with the C locale, though en_US generally works fine,
> too, given that ASCII ordering has the same semantics.
>
> Best,
> David

David,

Thank you for the pointers. lc_collate was set to en_US.UTF-8. I re-ran initdb with the --no-locale option, and the tests then passed successfully. Thank you for the reminder that the regression tests need to run against a C-locale database.

Regards,
Ken
Re: [HACKERS] [RRR] Tests citext casts - reviewed
The patch for the citext tests applied cleanly to the module, and the patched files resulted in a clean make installcheck run for the citext module. My previous problem was the result of not testing with a C-locale database. This patch is ready to be applied.

Regards,
Ken Marshall
[HACKERS] Tests citext casts
I installed and ran the citext tests both with and without the patch and had failures both times. The patch applied cleanly, and the make; make install completed without errors. I have attached the two regression.diffs files, one without the patch applied and the other with the patch.

Regards,
Ken Marshall

*** /opt/local/pg8/pgsql/contrib/citext/expected/citext_1.out	Wed Nov 5 08:33:40 2008
--- /opt/local/pg8/pgsql/contrib/citext/results/citext.out	Wed Nov 5 08:34:44 2008
***************
*** 122,146 ****
  SELECT 'B'::citext < 'a'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  SELECT 'B'::citext <= 'a'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext > 'B'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext >= 'B'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  -- Test implicit casting. citext casts to varchar, but not vice-versa.
--- 122,146 ----
  SELECT 'B'::citext < 'a'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  SELECT 'B'::citext <= 'a'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext > 'B'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext >= 'B'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  -- Test implicit casting. citext casts to varchar, but not vice-versa.
***************
*** 159,183 ****
  SELECT 'B'::citext < 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  SELECT 'B'::citext <= 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext > 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext >= 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  -- A couple of longer examlpes to ensure that we don't get any issues with bad
--- 159,183 ----
  SELECT 'B'::citext < 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  SELECT 'B'::citext <= 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext > 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext >= 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  -- A couple of longer examlpes to ensure that we don't get any issues with bad
==========================================================

*** /opt/local/pg8/pgsql/contrib/citext/expected/citext_1.out	Wed Nov 5 08:27:57 2008
--- /opt/local/pg8/pgsql/contrib/citext/results/citext.out	Wed Nov 5 08:28:10 2008
***************
*** 122,146 ****
  SELECT 'B'::citext < 'a'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  SELECT 'B'::citext <= 'a'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext > 'B'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext >= 'B'::text AS t;  -- text wins.
   t
  ---
!  t
  (1 row)

  -- Test implicit casting. citext casts to varchar, but not vice-versa.
--- 122,146 ----
  SELECT 'B'::citext < 'a'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  SELECT 'B'::citext <= 'a'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext > 'B'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext >= 'B'::text AS t;  -- text wins.
   t
  ---
!  f
  (1 row)

  -- Test implicit casting. citext casts to varchar, but not vice-versa.
***************
*** 159,183 ****
  SELECT 'B'::citext < 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  SELECT 'B'::citext <= 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext > 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  SELECT 'a'::citext >= 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  t
  (1 row)

  -- A couple of longer examlpes to ensure that we don't get any issues with bad
--- 159,183 ----
  SELECT 'B'::citext < 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  SELECT 'B'::citext <= 'a'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext > 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  SELECT 'a'::citext >= 'B'::varchar AS t;  -- varchar wins.
   t
  ---
!  f
  (1 row)

  -- A couple of longer examlpes to ensure that we don't get any issues with bad
***************
*** 740,769 ****
  t
  (1 row)

- SELECT 'f'::char::citext = 'f' AS t;
-  t
- ---
-  t
- (1 row)
-
- SELECT 'f'::citext::char = 'f'::char AS t;
-  t
- ---
-  t
- (1 row)
-
- SELECT 'f'::char::citext = 'f' AS t;
-  t
- ---
-  t
- (1 row)
-
- SELECT 'f'::citext::char = 'f'::char AS t;
-  t
- ---
-  t
- (1 row)
-
  SELECT 'foo'::citext::bytea = 'foo'::bytea AS t;
   t
  ---
--- 740,745 ----
==========================================================
Re: [RRR] [HACKERS] Tests citext casts
On Wed, Nov 05, 2008 at 09:04:04AM -0800, David E. Wheeler wrote:

> On Nov 5, 2008, at 6:40 AM, Kenneth Marshall wrote:
>> I installed and ran the citext tests both with and without the patch
>> and had failures both times. The patch applied cleanly and the
>> make; make install completed without errors. I have attached the two
>> regression.diffs files, one without the patch applied and the other
>> with the patch.
>
> What patch was it you applied? And is this CVS HEAD that you're
> testing? What locale/collation is your database configured with?
>
> Thanks,
> David

David,

I am using the anonymous CVS repository; it returns the following information in pg_catalog.pg_settings:

postgres=# select name, setting, short_desc from pg_catalog.pg_settings where name like 'server_%';
        name        | setting  |                     short_desc
--------------------+----------+-----------------------------------------------------
 server_encoding    | UTF8     | Sets the server (database) character set encoding.
 server_version     | 8.4devel | Shows the server version.
 server_version_num | 80400    | Shows the server version as an integer.
(3 rows)

The patch that I used is the one linked from the 2008-11 commitfest wiki, which points to:

http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

Cheers,
Ken