On 09/04/2014 11:51 AM, Robert Collins wrote:
On 4 September 2014 19:53, Nejc Saje <[email protected]> wrote:

I used the terms that are used in the original caching use case, as
described in [1]; they are used in the pypi lib as well [2]. With the
correct approach there aren't actually any partitions; 'replicas'
denotes the number of times you hash a node onto the ring. As for
nodes & keys, what's your suggestion?

So - we should change the Ironic terms then, I suspect (but let's check
with Deva, who wrote the original code, where he got them from).

The parameters we need to create a ring are:
  - how many fallback positions we use for data (currently referred to
as replicas)
  - how many times we hash the servers hosting data into the ring
(currently inferred via the hash_partition_exponent / server count)
  - the servers

and then we probe data items as we go.
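
To make those inputs concrete, here is a minimal, illustrative sketch of
a ring built from them - it is not Ironic's or Ceilometer's actual code,
and 'hashes_per_host' is just a made-up name for the second parameter
(the fallback/replica walk is sketched further down in this mail):

import bisect
import hashlib


def _hash(value):
    # Map any stringifiable value to a position on a 2**32-sized ring.
    return int(hashlib.md5(str(value).encode('utf-8')).hexdigest(), 16) % (2 ** 32)


class HashRing(object):
    def __init__(self, hosts, hashes_per_host=16):
        # Each host is hashed onto the ring 'hashes_per_host' times.
        self._ring = {}
        for host in hosts:
            for i in range(hashes_per_host):
                self._ring[_hash('%s-%d' % (host, i))] = host
        self._positions = sorted(self._ring)

    def get_host(self, key):
        # Probe: take the first host position at or after the key's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect(self._positions, _hash(key))
        return self._ring[self._positions[idx % len(self._positions)]]


ring = HashRing(['conductor-1', 'conductor-2', 'conductor-3'])
print(ring.get_host('some-node-uuid'))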

The original paper isn't
http://www.martinbroadhurst.com/Consistent-Hash-Ring.html -
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.147.1879 is
referenced by it, and that paper doesn't include the term 'replica
count' at all. In other systems like Cassandra, 'replicas' generally
refers to how many servers end up holding a copy of the data; Martin
Broadhurst's paper uses 'replica' in quite a different sense. I much
prefer the Ironic use, which says how many servers will be operating
on the data: it's externally relevant.

It doesn't contain that term precisely, but it does talk about replicating the buckets. What about using a descriptive name for this parameter, like 'distribution_quality', where the higher the value, the more even the distribution (and the higher the memory usage)?
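
Something like the following shows the effect (rough and hypothetical -
made-up bucket/key names, with 'hashes_per_bucket' standing in for that
parameter): the more times each bucket is hashed onto the ring, the
narrower the gap between the least- and most-loaded bucket, at the cost
of a bigger in-memory ring.

import bisect
import collections
import hashlib


def _hash(value):
    return int(hashlib.md5(str(value).encode('utf-8')).hexdigest(), 16) % (2 ** 32)


def key_spread(buckets, keys, hashes_per_bucket):
    # Build a ring with 'hashes_per_bucket' positions per bucket and count
    # how many keys land on each bucket.
    ring = {}
    for bucket in buckets:
        for i in range(hashes_per_bucket):
            ring[_hash('%s-%d' % (bucket, i))] = bucket
    positions = sorted(ring)
    counts = collections.Counter()
    for key in keys:
        idx = bisect.bisect(positions, _hash(key))
        counts[ring[positions[idx % len(positions)]]] += 1
    return min(counts.values()), max(counts.values())


buckets = ['host-%d' % i for i in range(5)]
keys = ['key-%d' % i for i in range(10000)]
for n in (1, 16, 256):
    # Prints (fewest keys on any bucket, most keys on any bucket) per setting.
    print(n, key_spread(buckets, keys, n))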


I've no objection to talking about keys, but 'node' is an API object in
Ironic, so I'd rather we talk about hosts - or make it something
clearly not 'node', like 'bucket' (which the 1997 paper talks about in
describing consistent hash functions).

So proposal:
  - key - a stringifiable thing to be mapped to buckets
What about using the term 'item' from the original paper as well?

  - bucket - a worker/store that wants keys mapped to it
  - replicas - number of buckets a single key wants to be mapped to
Can we keep this as an Ironic-internal parameter? It doesn't really affect the hash ring: if you want multiple buckets for your item, you just continue along the ring and keep returning new buckets (see the sketch after this list). Check out how the pypi lib does it: https://github.com/Doist/hash_ring/blob/master/hash_ring/ring.py#L119

  - partitions - number of total divisions of the hash space (power of
2 required)
I don't think there are any divisions of the hash space in the correct implementation, are there? I think that in the current Ironic implementation this just tweaks the distribution quality, like the 'replicas' parameter in the Ceilometer implementation.
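
Here's a rough sketch of what I mean (illustrative only - it is not a
copy of the pypi lib's code, and the names are made up): the ring itself
never needs a 'replicas' or 'partitions' parameter; the caller just keeps
walking clockwise and takes as many distinct buckets as it wants.

import bisect
import hashlib
import itertools


def _hash(value):
    return int(hashlib.md5(str(value).encode('utf-8')).hexdigest(), 16) % (2 ** 32)


def build_ring(buckets, hashes_per_bucket=16):
    ring = {}
    for bucket in buckets:
        for i in range(hashes_per_bucket):
            ring[_hash('%s-%d' % (bucket, i))] = bucket
    return ring, sorted(ring)


def iter_distinct_buckets(ring, positions, key):
    # Walk clockwise from the key's hash, yielding each distinct bucket once.
    seen = set()
    start = bisect.bisect(positions, _hash(key))
    for i in range(len(positions)):
        bucket = ring[positions[(start + i) % len(positions)]]
        if bucket not in seen:
            seen.add(bucket)
            yield bucket


ring, positions = build_ring(['host-a', 'host-b', 'host-c'])
# Take two distinct buckets for one key; 'replicas' stays a caller-side limit.
print(list(itertools.islice(iter_distinct_buckets(ring, positions, 'node-1'), 2)))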

Cheers,
Nejc


I've opened a bug[3], so you can add a Closes-Bug to your patch.

Thanks!

-Rob



