Re: [HACKERS] Hash Functions

2017-09-08 Thread amul sul
On Fri, Sep 1, 2017 at 8:01 AM, Robert Haas wrote: > On Thu, Aug 31, 2017 at 8:40 AM, amul sul wrote: > > Fixed in the attached version. > > I fixed these up a bit and committed them. Thanks. > > I think this takes care of adding not only the

Re: [HACKERS] Hash Functions

2017-08-31 Thread Tom Lane
Robert Haas writes: > On Thu, Aug 31, 2017 at 10:55 PM, Tom Lane wrote: >> ALTER OPERATOR FAMILY ADD FUNCTION ... ? >> >> That would result in the functions being considered "loose" in the >> family rather than bound into an operator class. I think

Re: [HACKERS] Hash Functions

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 10:55 PM, Tom Lane wrote: > Robert Haas writes: >> I think this takes care of adding not only the infrastructure but >> support for all the core data types, but I'm not quite sure how to >> handle upgrading types in contrib. It

Re: [HACKERS] Hash Functions

2017-08-31 Thread Tom Lane
Robert Haas writes: > I think this takes care of adding not only the infrastructure but > support for all the core data types, but I'm not quite sure how to > handle upgrading types in contrib. It looks like citext, hstore, and > several data types provided by isn have

Re: [HACKERS] Hash Functions

2017-08-31 Thread Robert Haas
On Thu, Aug 31, 2017 at 8:40 AM, amul sul wrote: > Fixed in the attached version. I fixed these up a bit and committed them. Thanks. I think this takes care of adding not only the infrastructure but support for all the core data types, but I'm not quite sure how to handle

Re: [HACKERS] Hash Functions

2017-08-31 Thread amul sul
On Wed, Aug 30, 2017 at 9:05 PM, Robert Haas wrote: > On Wed, Aug 30, 2017 at 10:43 AM, amul sul wrote: > > Thanks for the suggestion, I have updated 0002-patch accordingly. > > Using this I found some strange behaviours as follows: > > > > 1) standard

Re: [HACKERS] Hash Functions

2017-08-30 Thread Robert Haas
On Wed, Aug 30, 2017 at 10:43 AM, amul sul wrote: > Thanks for the suggestion, I have updated 0002-patch accordingly. > Using this I found some strange behaviours as follows: > > 1) standard and extended0 output for the jsonb_hash case is not the same. > 2) standard and extended0

Re: [HACKERS] Hash Functions

2017-08-30 Thread amul sul
On Tue, Aug 29, 2017 at 11:48 PM, Robert Haas wrote: > On Tue, Aug 22, 2017 at 8:14 AM, amul sul wrote: > > Attaching patch 0002 for the reviewer's testing. > > I think that this 0002 is not something we can think of committing > because there's no

Re: [HACKERS] Hash Functions

2017-08-29 Thread Robert Haas
On Tue, Aug 22, 2017 at 8:14 AM, amul sul wrote: > Attaching patch 0002 for the reviewer's testing. I think that this 0002 is not something we can think of committing because there's no guarantee that hash functions will return the same results on all platforms. However, what

Re: [HACKERS] Hash Functions

2017-08-29 Thread amul sul
On Tue, Aug 22, 2017 at 5:44 PM, amul sul wrote: > On Fri, Aug 18, 2017 at 11:01 PM, Robert Haas > wrote: > >> On Fri, Aug 18, 2017 at 1:12 PM, amul sul wrote: >> > I have a small query, what if I want a cache entry with extended

Re: [HACKERS] Hash Functions

2017-08-22 Thread amul sul
On Fri, Aug 18, 2017 at 11:01 PM, Robert Haas wrote: > On Fri, Aug 18, 2017 at 1:12 PM, amul sul wrote: > > I have a small query, what if I want a cache entry with extended hash > > function instead of the standard one, I might require that while adding > >

Re: [HACKERS] Hash Functions

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 1:12 PM, amul sul wrote: > I have a small query, what if I want a cache entry with extended hash > function instead of the standard one, I might require that while adding > hash_array_extended function? Do you think we need to extend > lookup_type_cache() as

Re: [HACKERS] Hash Functions

2017-08-18 Thread amul sul
On Fri, Aug 18, 2017 at 8:49 AM, Robert Haas wrote: > On Wed, Aug 16, 2017 at 5:34 PM, Robert Haas > wrote: > > Attached is a quick sketch of how this could perhaps be done (ignoring > > for the moment the relatively-boring opclass pushups). > >

Re: [HACKERS] Hash Functions

2017-08-17 Thread Robert Haas
On Wed, Aug 16, 2017 at 5:34 PM, Robert Haas wrote: > Attached is a quick sketch of how this could perhaps be done (ignoring > for the moment the relatively-boring opclass pushups). Here it is with some relatively-boring opclass pushups added. I just did the int4 bit; the

Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Kenneth Marshall writes: > On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote: >> ... In fact, on perusing the linked-to page >> http://burtleburtle.net/bob/hash/doobs.html >> Bob says specifically that taking b and c from this hash does not >> produce a fully random 64-bit

Re: [HACKERS] Hash Functions

2017-08-16 Thread Kenneth Marshall
On Wed, Aug 16, 2017 at 05:58:41PM -0400, Tom Lane wrote: > Robert Haas writes: > > Attached is a quick sketch of how this could perhaps be done (ignoring > > for the moment the relatively-boring opclass pushups). It introduces > > a new function hash_any_extended which

Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Robert Haas writes: > Attached is a quick sketch of how this could perhaps be done (ignoring > for the moment the relatively-boring opclass pushups). It introduces > a new function hash_any_extended which differs from hash_any() in that > (a) it combines both b and c into

Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:38 PM, Tom Lane wrote: > Robert Haas writes: >> After some further thought, I propose the following approach to the >> issues raised on this thread: > >> 1. Allow hash functions to have a second, optional support function, >>

Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Wed, Aug 16, 2017 at 12:38 PM, Tom Lane wrote: > Robert Haas writes: >> After some further thought, I propose the following approach to the >> issues raised on this thread: > >> 1. Allow hash functions to have a second, optional support function, >>

Re: [HACKERS] Hash Functions

2017-08-16 Thread Tom Lane
Robert Haas writes: > After some further thought, I propose the following approach to the > issues raised on this thread: > 1. Allow hash functions to have a second, optional support function, > similar to what we did for btree opclasses in >

Re: [HACKERS] Hash Functions

2017-08-16 Thread Robert Haas
On Thu, Aug 3, 2017 at 6:47 PM, Robert Haas wrote: > That seems pretty lame, although it's sufficient to solve the > immediate problem, and I have to admit to a certain predilection for > things that solve the immediate problem without creating lots of > additional work.

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 6:08 PM, Andres Freund wrote: >> That's another way to go, but it requires inventing a way to thread >> the IV through the hash opclass interface. > > Only if we really want to do it really well :P. Using a hash_combine() > like > > /* > * Combine two

Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
On 2017-08-03 17:57:37 -0400, Robert Haas wrote: > On Thu, Aug 3, 2017 at 5:50 PM, Andres Freund wrote: > > On 2017-08-03 17:43:44 -0400, Robert Haas wrote: > >> For me, the basic point here is that we need a set of hash functions > >> for hash partitioning that are different

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:50 PM, Andres Freund wrote: > On 2017-08-03 17:43:44 -0400, Robert Haas wrote: >> For me, the basic point here is that we need a set of hash functions >> for hash partitioning that are different than what we use for hash >> indexes and hash joins --

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Aug 3, 2017 at 5:32 PM, Andres Freund wrote: >> Do you have any feeling for which of those endianness-independent hash >> functions might be a reasonable choice for us? > > Not a strong / very informed one, TBH. > > I'm not convinced it's worth trying to achieve this

Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
Hi, On 2017-08-03 17:43:44 -0400, Robert Haas wrote: > For me, the basic point here is that we need a set of hash functions > for hash partitioning that are different than what we use for hash > indexes and hash joins -- otherwise when we hash partition a table and > create hash indexes on each

Re: [HACKERS] Hash Functions

2017-08-03 Thread Andres Freund
Hi, On 2017-08-03 17:09:41 -0400, Robert Haas wrote: > On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund wrote: > > Just to clarify: I don't think it's a problem to do so for integers and > > most other simple scalar types. There are plenty of hash algorithms that are > > endianness

Re: [HACKERS] Hash Functions

2017-08-03 Thread Robert Haas
On Thu, Jun 1, 2017 at 2:25 PM, Andres Freund wrote: > Just to clarify: I don't think it's a problem to do so for integers and > most other simple scalar types. There are plenty of hash algorithms that are > endianness independent, and the rest is just a bit of care. Do you have

Re: [HACKERS] Hash Functions

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 10:19 AM, Joe Conway wrote: >> Yeah, that's not crazy. I find it a bit surprising in terms of the >> semantics, though. SET >> when_i_try_to_insert_into_a_specific_partition_i_dont_really_mean_it = >> true? > > Maybe > SET partition_tuple_retry =

Re: [HACKERS] Hash Functions

2017-06-02 Thread Joe Conway
On 06/02/2017 05:47 AM, Robert Haas wrote: > On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis wrote: >> 2. I basically see two approaches to solve the problem: >> (a) Tom suggested at PGCon that we could have a GUC that >> automatically causes inserts to the partition to be

Re: [HACKERS] Hash Functions

2017-06-02 Thread Robert Haas
On Fri, Jun 2, 2017 at 1:24 AM, Jeff Davis wrote: > 1. For range partitioning, I think it's "yes, a little". As you point > out, there are already some weird edge cases -- the main way range > partitioning would make the problem worse is simply by having more > users. I agree.

Re: [HACKERS] Hash Functions

2017-06-01 Thread Jeff Davis
On Thu, Jun 1, 2017 at 11:25 AM, Andres Freund wrote: > Secondly, I think that's to a significant degree caused by > the fact that in practice people way more often partition on types like > int4/int8/date/timestamp/uuid rather than text - there's rarely good > reasons to do

Re: [HACKERS] Hash Functions

2017-06-01 Thread Jeff Davis
On Thu, Jun 1, 2017 at 10:59 AM, Robert Haas wrote: > 1. Are the new problems worse than the old ones? > > 2. What could we do about it? Exactly the right questions. 1. For range partitioning, I think it's "yes, a little". As you point out, there are already some weird

Re: [HACKERS] Hash Functions

2017-06-01 Thread Joe Conway
On 06/01/2017 11:25 AM, Andres Freund wrote: > On 2017-06-01 13:59:42 -0400, Robert Haas wrote: >> My personal guess is that most people will prefer the fast >> hash functions over the ones that solve their potential future >> migration problems, but, hey, options are good. > > I'm pretty sure

Re: [HACKERS] Hash Functions

2017-06-01 Thread Andres Freund
On 2017-06-01 13:59:42 -0400, Robert Haas wrote: > I'm not actually aware of an instance where this has bitten anyone, > even though it seems like it certainly could have and maybe should've > gotten somebody at some point. Has anyone else? Two comments: First, citus has been doing

Re: [HACKERS] Hash Functions

2017-06-01 Thread Robert Haas
On Fri, May 12, 2017 at 1:35 PM, Joe Conway wrote: >> That's a good point, but the flip side is that, if we don't have >> such a rule, a pg_dump of a hash-partitioned table on one >> architecture might fail to restore on another architecture. Today, I >> believe that, while

Re: [HACKERS] Hash Functions

2017-05-19 Thread Robert Haas
On Fri, May 19, 2017 at 2:36 AM, Jeff Davis wrote: > I could agree to something like that. Let's explore some of the challenges > there and potential solutions: > > 1. Dump/reload of hash partitioned data. > > Falling back to restore-through-the-root seems like a reasonable

[HACKERS] Hash Functions

2017-05-19 Thread Jeff Davis
On Thursday, May 18, 2017, Robert Haas wrote: > My experience with this area has led > me to give up on the idea of complete uniformity as impractical, and > instead look at it from the perspective of "what do we absolutely have > to ban in order for this to be sane?". I

Re: [HACKERS] Hash Functions

2017-05-18 Thread Robert Haas
On Thu, May 18, 2017 at 1:53 AM, Jeff Davis wrote: > For instance, it makes little sense to have individual check > constraints, indexes, permissions, etc. on a hash-partitioned table. > It doesn't mean that we should necessarily forbid them, but it should > make us question

Re: [HACKERS] Hash Functions

2017-05-18 Thread Jeff Davis
On Wed, May 17, 2017 at 11:35 AM, Tom Lane wrote: > I think the question is whether we are going to make a distinction between > logical partitions (where the data division rule makes some sense to the > user) and physical partitions (where it needn't). I think it might be >

Re: [HACKERS] Hash Functions

2017-05-17 Thread Jeff Davis
On Wed, May 17, 2017 at 12:10 PM, Robert Haas wrote: > 1. To handle dump-and-reload the way partitioning does today, hash > functions would need to be portable across encodings. > 2. That's impractically difficult. > 3. So let's always load data through the top-parent. >

Re: [HACKERS] Hash Functions

2017-05-17 Thread Robert Haas
On Wed, May 17, 2017 at 2:35 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, May 16, 2017 at 4:25 PM, Jeff Davis wrote: >>> Why can't hash partitions be stored in tables the same way as we do TOAST? >>> That should take care of

Re: [HACKERS] Hash Functions

2017-05-17 Thread Tom Lane
Robert Haas writes: > On Tue, May 16, 2017 at 4:25 PM, Jeff Davis wrote: >> Why can't hash partitions be stored in tables the same way as we do TOAST? >> That should take care of the naming problem. > Hmm, yeah, something like that could be done, but

Re: [HACKERS] Hash Functions

2017-05-17 Thread Robert Haas
On Tue, May 16, 2017 at 4:25 PM, Jeff Davis wrote: > Why can't hash partitions be stored in tables the same way as we do TOAST? > That should take care of the naming problem. Hmm, yeah, something like that could be done, but every place where you are currently allowed to refer

Re: [HACKERS] Hash Functions

2017-05-17 Thread Ashutosh Bapat
On Tue, May 16, 2017 at 8:40 PM, Jeff Davis wrote: > On Mon, May 15, 2017 at 1:04 PM, David Fetter wrote: >> As the discussion has devolved here, it appears that there are, at >> least conceptually, two fundamentally different classes of partition: >> public,

Re: [HACKERS] Hash Functions

2017-05-16 Thread Amit Langote
On 2017/05/17 5:25, Jeff Davis wrote: > On Tuesday, May 16, 2017, Robert Haas wrote: >> I don't really find this a very practical design. If the table >> partitions are spread across different relfilenodes, then those >> relfilenodes have to have separate pg_class entries

Re: [HACKERS] Hash Functions

2017-05-16 Thread Peter Eisentraut
On 5/16/17 11:10, Jeff Davis wrote: > I concur at this point. I originally thought hash functions might be > made portable, but I think Tom and Andres showed that to be too > problematic -- the issue with different encodings is the real killer. I think it would be OK that if you want to move a

Re: [HACKERS] Hash Functions

2017-05-16 Thread Jeff Davis
On Tuesday, May 16, 2017, Robert Haas wrote: > I don't really find this a very practical design. If the table > partitions are spread across different relfilenodes, then those > relfilenodes have to have separate pg_class entries and separate > indexes, and those indexes

Re: [HACKERS] Hash Functions

2017-05-16 Thread David Fetter
On Tue, May 16, 2017 at 08:10:39AM -0700, Jeff Davis wrote: > On Mon, May 15, 2017 at 1:04 PM, David Fetter wrote: > > As the discussion has devolved here, it appears that there are, at > > least conceptually, two fundamentally different classes of partition: > > public, which

Re: [HACKERS] Hash Functions

2017-05-16 Thread Robert Haas
On Tue, May 16, 2017 at 11:10 AM, Jeff Davis wrote: > With hash partitioning: > * User only specifies number of partitions of the parent table; does > not specify individual partition properties (modulus, etc.) > * Dump/reload goes through the parent table (though we may

Re: [HACKERS] Hash Functions

2017-05-16 Thread Jeff Davis
On Mon, May 15, 2017 at 1:04 PM, David Fetter wrote: > As the discussion has devolved here, it appears that there are, at > least conceptually, two fundamentally different classes of partition: > public, which is to say meaningful to DB clients, and "private", used > for

Re: [HACKERS] Hash Functions

2017-05-15 Thread David Fetter
On Mon, May 15, 2017 at 03:26:02PM -0400, Robert Haas wrote: > On Sun, May 14, 2017 at 9:35 PM, Andres Freund wrote: > > On 2017-05-14 21:22:58 -0400, Robert Haas wrote: > >> but wanting a CHECK constraint that applies to only one partition > >> seems pretty reasonable (e.g.

Re: [HACKERS] Hash Functions

2017-05-15 Thread Mark Dilger
> On May 15, 2017, at 7:48 AM, Jeff Davis wrote: > > On Sun, May 14, 2017 at 6:22 PM, Robert Haas wrote: >> You'd have to prohibit a heck of a lot more than that in order for >> this to work 100% reliably. You'd have to prohibit CHECK constraints, >>

Re: [HACKERS] Hash Functions

2017-05-15 Thread Robert Haas
On Sun, May 14, 2017 at 9:35 PM, Andres Freund wrote: > On 2017-05-14 21:22:58 -0400, Robert Haas wrote: >> but wanting a CHECK constraint that applies to only one partition >> seems pretty reasonable (e.g. CHECK that records for older years are >> all in the 'inactive' state,

Re: [HACKERS] Hash Functions

2017-05-15 Thread David Fetter
On Mon, May 15, 2017 at 07:48:14AM -0700, Jeff Davis wrote: > This would mean we need to reload through the root as Andres and > others suggested, One refinement of this would be to traverse the partition tree, stopping at the first place where the next branch has hash partitions, or at any rate

Re: [HACKERS] Hash Functions

2017-05-15 Thread Bruce Momjian
On Mon, May 15, 2017 at 07:32:30AM -0700, Jeff Davis wrote: > On Sun, May 14, 2017 at 8:00 PM, Bruce Momjian wrote: > > Do we even know that floats are precise enough to determine the > > partition. For example, if you have 6.1, is it possible for > > that to be

Re: [HACKERS] Hash Functions

2017-05-15 Thread Jeff Davis
On Sun, May 14, 2017 at 6:22 PM, Robert Haas wrote: > You'd have to prohibit a heck of a lot more than that in order for > this to work 100% reliably. You'd have to prohibit CHECK constraints, > triggers, rules, RLS policies, and UNIQUE indexes, at the least. You > might

Re: [HACKERS] Hash Functions

2017-05-15 Thread Jeff Davis
On Sun, May 14, 2017 at 8:00 PM, Bruce Momjian wrote: > Do we even know that floats are precise enough to determine the > partition. For example, if you have 6.1, is it possible for > that to be 5.999 on some systems? Are IEEE systems all the same for > these

Re: [HACKERS] Hash Functions

2017-05-14 Thread Bruce Momjian
On Sun, May 14, 2017 at 01:06:03PM -0700, Andres Freund wrote: > On 2017-05-14 15:59:09 -0400, Greg Stark wrote: > > Personally while I would like to avoid code that actively crashes or > > fails basic tests on Vax > > I personally vote for simply refusing to run/compile on non-IEEE > platforms,

Re: [HACKERS] Hash Functions

2017-05-14 Thread Andres Freund
Hi, On 2017-05-14 21:22:58 -0400, Robert Haas wrote: > but wanting a CHECK constraint that applies to only one partition > seems pretty reasonable (e.g. CHECK that records for older years are > all in the 'inactive' state, or whatever). On a hash-partitioned table? > Now that's not to say that

Re: [HACKERS] Hash Functions

2017-05-14 Thread Robert Haas
On Sun, May 14, 2017 at 6:29 PM, Andres Freund wrote: > On 2017-05-14 18:25:08 -0400, Tom Lane wrote: >> It may well be that we can get away with saying "we're not going >> to make it simple to move hash-partitioned tables with float >> partition keys between architectures

Re: [HACKERS] Hash Functions

2017-05-14 Thread Peter Geoghegan
On Sun, May 14, 2017 at 3:30 PM, Tom Lane wrote: > I agree that the Far Eastern systems that can't easily be replaced > by Unicode are that way mostly because they're a mess. But I'm > still of the opinion that locking ourselves into Unicode is a choice > we might regret, far

Re: [HACKERS] Hash Functions

2017-05-14 Thread Thomas Munro
On Mon, May 15, 2017 at 10:08 AM, Thomas Munro wrote: > [2] > https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.cbcux01/flotcop.htm#flotcop Though looking more closely I see that the default is IEEE in 64 bit builds, which seems like a

Re: [HACKERS] Hash Functions

2017-05-14 Thread Tom Lane
Peter Geoghegan writes: > The express goal of the Unicode consortium is to replace all existing > encodings with Unicode. My personal opinion is that a Unicode > monoculture would be a good thing, provided reasonable differences can > be accommodated. Can't help remembering Randall

Re: [HACKERS] Hash Functions

2017-05-14 Thread Andres Freund
On 2017-05-14 18:25:08 -0400, Tom Lane wrote: > It may well be that we can get away with saying "we're not going > to make it simple to move hash-partitioned tables with float > partition keys between architectures with different float > representations". But there's a whole lot of daylight

Re: [HACKERS] Hash Functions

2017-05-14 Thread Tom Lane
Andres Freund writes: > On 2017-05-14 15:59:09 -0400, Greg Stark wrote: >> Personally while I would like to avoid code that actively crashes or >> fails basic tests on Vax > I personally vote for simply refusing to run/compile on non-IEEE > platforms, including VAX. The

Re: [HACKERS] Hash Functions

2017-05-14 Thread Thomas Munro
On Mon, May 15, 2017 at 7:59 AM, Greg Stark wrote: > On 13 May 2017 at 10:29, Robert Haas wrote: >> - Floats. There may be different representations in use on different >> hardware, which could be a problem. Tom didn't answer my question >> about whether

Re: [HACKERS] Hash Functions

2017-05-14 Thread Peter Geoghegan
On Sat, May 13, 2017 at 9:11 PM, Robert Haas wrote: > The latter is > generally false already. Maybe LATIN1 -> UTF8 is no-fail, but what > about UTF8 -> LATIN1 or SJIS -> anything? Based on previous mailing > list discussions, I'm under the impression that it is sometimes

Re: [HACKERS] Hash Functions

2017-05-14 Thread Andres Freund
On 2017-05-14 15:59:09 -0400, Greg Stark wrote: > Personally while I would like to avoid code that actively crashes or > fails basic tests on Vax I personally vote for simply refusing to run/compile on non-IEEE platforms, including VAX. The benefit of even trying to get that right, not to speak

Re: [HACKERS] Hash Functions

2017-05-14 Thread Greg Stark
On 13 May 2017 at 10:29, Robert Haas wrote: > - Floats. There may be different representations in use on different > hardware, which could be a problem. Tom didn't answer my question > about whether any even-vaguely-modern hardware is still using non-IEEE > floats, which

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 11:47 PM, Andres Freund wrote: > It'll be differently sized on different platforms. So everyone will have to > write hash functions that look at each member individually, rather than > hashing the entire struct at once. And for each member you'll

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 1:57 PM, Tom Lane wrote: > Basically, this is simply saying that you're willing to ignore the > hard cases, which reduces the problem to one of documenting the > portability limitations. You might as well not even bother with > worrying about the

Re: [HACKERS] Hash Functions

2017-05-13 Thread Andres Freund
On May 13, 2017 8:44:22 PM PDT, Robert Haas wrote: >On Sat, May 13, 2017 at 7:08 PM, Andres Freund >wrote: >> I seriously doubt that's true. A lot of more complex types have >> internal alignment padding and such. > >True, but I believe we require

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 7:08 PM, Andres Freund wrote: > I seriously doubt that's true. A lot of more complex types have > internal alignment padding and such. True, but I believe we require those padding bytes to be zero. If we didn't, then hstore_hash would be broken

Re: [HACKERS] Hash Functions

2017-05-13 Thread Andres Freund
On 2017-05-13 10:29:09 -0400, Robert Haas wrote: > On Sat, May 13, 2017 at 12:52 AM, Amit Kapila wrote: > > Can we think of defining separate portable hash functions which can be > > used for the purpose of hash partitioning? > > I think that would be a good idea. I

Re: [HACKERS] Hash Functions

2017-05-13 Thread Jeff Davis
On Fri, May 12, 2017 at 12:38 PM, Robert Haas wrote: > That is a good question. I think it basically amounts to this > question: is hash partitioning useful, and if so, for what? Two words: parallel query. To get parallelism, one of the best approaches is dividing the

Re: [HACKERS] Hash Functions

2017-05-13 Thread Jeff Davis
On Fri, May 12, 2017 at 11:45 AM, Tom Lane wrote: > Forget hash partitioning. There's no law saying that that's a good > idea and we have to have it. With a different set of constraints, > maybe we could do it, but I think the existing design decisions have > basically

Re: [HACKERS] Hash Functions

2017-05-13 Thread Tom Lane
Robert Haas writes: > On Sat, May 13, 2017 at 12:52 AM, Amit Kapila wrote: >> Can we think of defining separate portable hash functions which can be >> used for the purpose of hash partitioning? > I think that would be a good idea. I think it

Re: [HACKERS] Hash Functions

2017-05-13 Thread Jeff Davis
On Fri, May 12, 2017 at 10:34 AM, Tom Lane wrote: > Maintaining such a property for float8 (and the types that depend on it) > might be possible if you believe that nobody ever uses anything but IEEE > floats, but we've never allowed that as a hard assumption before. This is

Re: [HACKERS] Hash Functions

2017-05-13 Thread Robert Haas
On Sat, May 13, 2017 at 12:52 AM, Amit Kapila wrote: > Can we think of defining separate portable hash functions which can be > used for the purpose of hash partitioning? I think that would be a good idea. I think it shouldn't even be that hard. By data type: -

Re: [HACKERS] Hash Functions

2017-05-12 Thread Amit Kapila
On Sat, May 13, 2017 at 1:08 AM, Robert Haas wrote: > On Fri, May 12, 2017 at 2:45 PM, Tom Lane wrote: > > Maybe a shorter argument for hash partitioning is that not one but two > different people proposed patches for it within months of the initial >

Re: [HACKERS] Hash Functions

2017-05-12 Thread Andres Freund
On 2017-05-12 21:56:30 -0400, Robert Haas wrote: > Cheap isn't free, though. It's got a double-digit percentage overhead > rather than a large-multiple-of-the-runtime overhead as triggers do, > but people still won't want to pay it unnecessarily, I think. That should be partially addressable with

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 7:36 PM, David Fetter wrote: > On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote: >> On 5/12/17 18:13, Alvaro Herrera wrote: >> > I think for logical replication the tuple should appear as being in the >> > parent table, not the partition.

Re: [HACKERS] Hash Functions

2017-05-12 Thread David Fetter
On Fri, May 12, 2017 at 06:38:55PM -0400, Peter Eisentraut wrote: > On 5/12/17 18:13, Alvaro Herrera wrote: > > I think for logical replication the tuple should appear as being in the > > parent table, not the partition. No? > > Logical replication replicates base table to base table. How those

Re: [HACKERS] Hash Functions

2017-05-12 Thread Peter Eisentraut
On 5/12/17 18:13, Alvaro Herrera wrote: > I think for logical replication the tuple should appear as being in the > parent table, not the partition. No? Logical replication replicates base table to base table. How those tables are tied together into a partitioned table or an inheritance tree is

Re: [HACKERS] Hash Functions

2017-05-12 Thread Alvaro Herrera
Peter Eisentraut wrote: > On 5/12/17 14:23, Robert Haas wrote: > > One alternative would be to change the way that we dump and restore > > the data. Instead of dumping the data with the individual partitions, > > dump it all out for the parent and let tuple routing sort it out at > > restore

Re: [HACKERS] Hash Functions

2017-05-12 Thread Peter Eisentraut
On 5/12/17 14:23, Robert Haas wrote: > One alternative would be to change the way that we dump and restore > the data. Instead of dumping the data with the individual partitions, > dump it all out for the parent and let tuple routing sort it out at > restore time. I think this could be a pg_dump
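
The restore-through-the-parent idea quoted above can be illustrated with a small sketch. This is not pg_dump or PostgreSQL code: `route`, `load_through_parent`, and the use of `zlib.crc32` with a seed as a stand-in for a platform-specific hash function are all assumptions made for illustration. The point is only that if rows are reloaded through the parent, each row is re-hashed on the destination, so the dump does not need the source and destination hash functions to agree.

```python
import zlib


def route(key: bytes, n_partitions: int, seed: int) -> int:
    # Hypothetical tuple routing: crc32 with a seed stands in for
    # whatever hash function a given platform happens to use.
    return zlib.crc32(key, seed) % n_partitions


def load_through_parent(rows, n_partitions, seed):
    # Insert every row via the parent; routing picks the partition.
    partitions = [[] for _ in range(n_partitions)]
    for key in rows:
        partitions[route(key, n_partitions, seed)].append(key)
    return partitions


keys = [str(i).encode() for i in range(100)]

# "Source" system: one hash function (seed 0).
old = load_through_parent(keys, 4, seed=0)

# Dump all rows for the parent, ignoring which partition held them.
dumped = [k for part in old for k in part]

# "Destination" system: a different hash function (seed 12345).
new = load_through_parent(dumped, 4, seed=12345)
```

Every row ends up in the partition that the destination's own hash assigns to it, at the cost of routing each row again at restore time.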

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 2:45 PM, Tom Lane wrote: > Yeah, that isn't really appetizing at all. If we were doing physical > partitioning below the user-visible level, we could make it fly. > But the existing design makes the partition boundaries user-visible > which means we

Re: [HACKERS] Hash Functions

2017-05-12 Thread Kenneth Marshall
On Fri, May 12, 2017 at 02:23:14PM -0400, Robert Haas wrote: > > What about integers? I think we're already assuming two's-complement > arithmetic, which I think means that the only problem with making the > hash values portable for integers is big-endian vs. little-endian. > That sounds
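
A minimal sketch of the endianness point made above, under stated assumptions: it uses FNV-1a as a stand-in hash (PostgreSQL's own hash functions are different), and the names `fnv1a_32` and `portable_hash_int32` are hypothetical. The idea is that hashing an integer's bytes in one fixed order, rather than in the host's native order, gives the same hash value on big-endian and little-endian machines.

```python
import struct


def fnv1a_32(data: bytes) -> int:
    # 32-bit FNV-1a over a byte string (illustrative hash only).
    h = 0x811C9DC5
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def portable_hash_int32(value: int) -> int:
    # "<i" always yields a little-endian two's-complement layout,
    # whatever the host byte order is.
    return fnv1a_32(struct.pack("<i", value))


# Hashing the two possible in-memory layouts directly gives
# different results, which is the portability problem.
little = fnv1a_32(struct.pack("<i", 1))
big = fnv1a_32(struct.pack(">i", 1))
```

Here `portable_hash_int32(1)` equals `little` on any host, while a hash over the native byte layout would equal `big` on a big-endian machine.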

Re: [HACKERS] Hash Functions

2017-05-12 Thread Tom Lane
Robert Haas writes: > On Fri, May 12, 2017 at 1:34 PM, Tom Lane wrote: >> I'd vote that it's not, which means that this whole approach to hash >> partitioning is unworkable. I agree with Andres that demanding hash >> functions produce

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 1:34 PM, Tom Lane wrote: > I'd vote that it's not, which means that this whole approach to hash > partitioning is unworkable. I agree with Andres that demanding hash > functions produce architecture-independent values will not fly. If we can't produce

Re: [HACKERS] Hash Functions

2017-05-12 Thread Joe Conway
On 05/12/2017 10:17 AM, Robert Haas wrote: > On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: >> Given that a lot of data types have an architecture-dependent >> representation, it seems somewhat unrealistic and expensive to have >> a hard rule to keep them architecture agnostic. And if

Re: [HACKERS] Hash Functions

2017-05-12 Thread Tom Lane
Robert Haas writes: > On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: >> Given that a lot of data types have an architecture-dependent representation, >> it seems somewhat unrealistic and expensive to have a hard rule to keep them >> architecture

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 1:12 PM, Andres Freund wrote: > Given that a lot of data types have an architecture-dependent representation, > it seems somewhat unrealistic and expensive to have a hard rule to keep them > architecture agnostic. And if that's not guaranteed, then

Re: [HACKERS] Hash Functions

2017-05-12 Thread Andres Freund
On May 12, 2017 10:05:56 AM PDT, Robert Haas wrote: >On Fri, May 12, 2017 at 12:08 AM, Jeff Davis wrote: >> 1. The hash functions as they exist today aren't portable -- they can >> return different results on different machines. That means using these

Re: [HACKERS] Hash Functions

2017-05-12 Thread Robert Haas
On Fri, May 12, 2017 at 12:08 AM, Jeff Davis wrote: > 1. The hash functions as they exist today aren't portable -- they can > return different results on different machines. That means using these > functions for hash partitioning would yield different contents for the > same

[HACKERS] Hash Functions

2017-05-11 Thread Jeff Davis
https://www.postgresql.org/message-id/camp0ubeo3fzzefie1vmc1ajkkrpxlnzqooaseu6o-c+...@mail.gmail.com In that thread, I pointed out some important considerations for the hash functions themselves. This is a follow-up, after I looked more carefully. 1. The hash functions as they exist today