Re: [HACKERS] Hash Functions

Robert Haas Tue, 16 May 2017 08:29:37 -0700

On Tue, May 16, 2017 at 11:10 AM, Jeff Davis <[email protected]> wrote:
> With hash partitioning:
> * User only specifies number of partitions of the parent table; does
> not specify individual partition properties (modulus, etc.)
> * Dump/reload goes through the parent table (though we may provide
> options so pg_dump/restore can optimize this)
> * We could provide syntax to adjust the number of partitions, which
> would be expensive but still useful sometimes.
> * All DDL should be on the parent table, including check constraints,
> FKs, unique constraints, exclusion constraints, indexes, etc.
>   - Unique and exclusion constraints would only be permitted if the
> keys are a superset of the partition keys.
>   - FKs would only be permitted if the two table's partition schemes
> match and the keys are members of the same hash opfamily (this could
> be relaxed slightly, but it gets a little confusing if so)
> * No attach/detach of partitions
> * All partitions have the same permissions
> * Individual partitions would only be individually-addressable for
> maintenance (like reindex and vacuum), but not for arbitrary queries
>   - perhaps also COPY for bulk loading/dumping, in case we get clients
> smart enough to do their own hashing.


I don't really find this a very practical design.  If the table
partitions are spread across different relfilenodes, then those
relfilenodes have to have separate pg_class entries and separate
indexes, and those indexes also need to have separate pg_class
entries.  Otherwise, nothing works.  And if they do have separate
pg_class entries, then the partitions have to have their own names,
and likewise for their indexes, and a dump-and-reload has to preserve
those names.  If it doesn't, and those objects get new system-assigned
names after the dump-and-reload, then dump restoration can fail when a
system-assigned name collides with an existing name that is first
mentioned later in the dump.

If we had the ability to have anonymous pg_class entries -- relations
that have no names -- then maybe it would be possible to make
something like what you're talking about work.  But that does not seem
easy to do.  There's a unique index on (relname, relnamespace) for
good reason, and we can't make it partial on a system catalog.  We
could make the relname column allow nulls, but that would add overhead
to any code that needs to access the relation name, and there's a fair
amount of that.

Similarly, if we had the ability to associate multiple relfilenodes
with a single relation, and if index entries could point to
<which-relfilenode, block, offset> rather than just <block, offset>,
then we could also make this work.  But either of those things would
require significant re-engineering and would have downsides in other
cases.

If Java has portable hash functions, why can't we?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hash Functions

Reply via email to