Re: OPHF

Jonathan ellis Sat, 04 Apr 2009 21:25:01 -0700

Note that Avinash's ophf is 1-1 up to the max key length. But thishas its own problems. (Mainly, that keys cannot be arbitrarily long,and hash computation gets more expensive proportional to max keylength.)


-Jonathan


On Apr 4, 2009, at 10:32 PM, Zhu Han <[email protected]> wrote:

Avinash,

(For cassandra, the "key" mentioned in following words is the nameof therow. The "token" mentioned in following words, is the output of thehash

function. )

For example, if you use 160bit md5 hash value as the space, it'sfinitetheoretically, because you know the number of tokens which can becontainedahead of time, although the number is pretty large. For finite hashspace,

the mapping from key to token is many to one instead of one to one.

For OHPF, a lot of keys could be mapped to the same token rangebecause ofthe non-uniform distribution just as you pointed out. The loadbalancing can

solve the problem. However, if too may keys mapped to *the same token*
which is determined by the hash function, how can the load balance be
carried out? When the hash space is finite, it's possible that a large

number of keys are mapped to the same token. However, if the hashfunctionis uniformly distributed, just as MD5 or SHA1, the probability ofthe worse

case is pretty rare, at least this case  cannot be produced by human

purposely, so that's not a problem. But for OPHF, if the clientspick upthe name of key purposely, e.g. for DOS attack, the worst case canoccurpractically. When the attacker knows the algorithm of the OPHF, hecan

produce arbitrary number of keys mapped to the same token.

If the hash space is infinite, or it's as large as the possiblespace ofkeys, this would not be a problem, because the key to token mappingis one

to one instead of many to one .

best regards,
hanzhu


On Sat, Apr 4, 2009 at 11:59 PM, Avinash Lakshman <
[email protected]> wrote:

Not sure what you mean by infinite. OPHF or not any hash function'srange

is

practically infinite for MD5 is infinite hash range. With OPHF youwill

have
a lot of skew. There are only two ways to deal with it:

(1) You know the space of of all your keys ahead of time and haveyour

system exploit that.
(2) Dynamically load balance until your system reaches steady state.

We are always sorting lexicographically and trying to do both ofthe above.


Avinash

On Sat, Apr 4, 2009 at 12:18 AM, Zhu Han <[email protected]>wrote:

Avinash,

I support the proposal of Jonathan.

Theoretically,  if the system wants to support range query over

consistent

hash(DHT), the hash space should be infinite. Otherwise, a largenumber
of
keys could be mapped to the same token and it's impossible to doload
balance by moving these keys around to other physical servers.

Especially,

the churn of OPHF could be very high because the user would like to

create

a
lot of keys with the same long prefix for some applications. If the

system

takes the string as the token and sorts them in *lexicographicorder, thehash space would be infinite, so that the churn can be solved byload
balacing under any circumstances.

*best regards,
hanzhu


On Thu, Apr 2, 2009 at 11:04 PM, Avinash Lakshman <
[email protected]> wrote:
So how are you coming up with the tokens here? What do you mean by
string[0]
and string[last]? Are they the keys that belong to the system? Ifso

then

that is not a mechanism to reason about the range of your hash

function.

In

this scheme does it mean you will need to sort the tokens too using
collation scheme used by the external partitioner while identifying

which

key goes to which node. Also could you provide me with the patch

number.

I
need to go over that this weekend and make sure if it does notaffect

the

(bootstrap/load balance) logic. My fear is that these changeshave long

reaching effects and I need to make sure that all these pieces will
continue
to reside in harmony. Also your range query test did it run in a
distributed
environment or in a single box environment?
Avinash

On Wed, Apr 1, 2009 at 3:34 PM, Jonathan Ellis <[email protected]>

wrote:

So with a String ring you just sort the strings (conceptually) and

you

wrap around from strings[last] to strings[0]. The range isentirely
defined by what tokens you have.
It's nice to be able to test against a real application. Minewon't
run without range queries...  (So it is still running against my
github code where I first wrote range queries, but that uses theOPHF
approach you have and so that is where I found the problems.)

-Jonathan

On Wed, Apr 1, 2009 at 4:27 PM, Avinash Lakshman
<[email protected]> wrote:
I will look into this. Another point about P2P rings is something
about
the
ability to reason about the system. In any hash fn we hve anotion

of

range

and how the ring wraps around. For eg. in random hash such as MD5

we

say
0-2^128 -1 and then the wrap around. Similarly for the hash fnused
here
one
could define the same. Today we perform usual sort of strings and

the

hash

should respect that sort order. I guess you are saying the usual

sort

of

string does not respect collation. I will look into this and I

insist

this

should be very doable. Changes to the implementation of the core

classes

albeit how simple they are very very scary unless they are

completely

tested

too in a distributed setting. Over here we do not have detailed

test

code,
but we test by directing a % of the site traffic to a testcluster
before
we
sign off on anything.

Avinash
On Wed, Apr 1, 2009 at 3:00 PM, Jonathan Ellis<[email protected]>
wrote:
Say for instance you have inserted keys ['a', 'b', '--a', '--b']

and

you are going to do range queries on them.  The correct

collation-aware sort is ['a', '--a', 'b', '--b']. Butordering by

char value gives ['--a', '--b', 'a', 'b'].

Attached is a test program illustrating this.  The assert will

fail

because the hash ordering is not the same as the collator's.

-Jonathan

On Wed, Apr 1, 2009 at 3:47 PM, Avinash Lakshman
<[email protected]> wrote:

In fact what would you want is hash to respect oorder based on

how

the

strings are sorted. For the example you have it does. So am I

missing

something?

Avinash

On Wed, Apr 1, 2009 at 2:43 PM, Jonathan Ellis <

[email protected]

wrote:

For that aspect no difference between a String ring based on

compareTo

and a BigInteger one.  The only difference (and it is an

important

one
for the reasons I gave!) is how the compare works. But forthe

p2p

aspect it does not matter.

On Wed, Apr 1, 2009 at 3:40 PM, Avinash Lakshman
<[email protected]> wrote:

Doing what you are suggesting scares the hell out of me for a

couple

of

reasons -  All work in P2P be it random/OPHF does the token

handling

the

way

it is setup. I cannot try something that has not been well

explored

in

academia. I insist this must be doable. I am going to think

about

this

more.


Avinash

On Wed, Apr 1, 2009 at 2:31 PM, Jonathan Ellis <

[email protected]>

wrote:

Avinash already commited his new order-preserving hash

function

and I

missed it.  It's in OrderPreservingHashPartitioner.  It

takes

the

approach that Todd and I discussed back in January: turn the

string

into a base-Char.MAX_VALUE number.
(

http://groups.google.com/group/cassandra-dev/browse_thread/thread/6bda8518466210e7/f53b79c19010a9ed

).
I chatted with Avinash a little on IM but we didn't finish,

so

I'm

picking it up here.

There are two problems with this approach.  One is that the

hashes

will only be order-preserving for a subset of unicode (all

of

UCS-2,

but not all of UTF-16; see

http://en.wikipedia.org/wiki/UTF-16/UCS-2

).

The other is that this only gives you a naive ordering by

code

point

value, which for unicode is not what you want and even for

ascii

sometimes you want another collation.  (see
http://www.unicode.org/reports/tr10/ and

http://java.sun.com/javase/6/docs/api/java/text/Collator.html

).


Say for instance you have inserted keys ['a', 'b', '--a',

'--b']

and

you are going to do range queries on them.  The correct
collation-aware sort is ['a', '--a', 'b', '--b'].  But

ordering

by

char value gives ['--a', '--b', 'a', 'b'].

Switching to a more flexibile system like the one I wrote

for

CASSANDRA-3 will let use use Token<BigInteger> for random

distribution

or Token<String> for order-preserving, with user-defined

collation.

don't see a way to get this kind of flexibility from an

approach

that

insists on turning everything into BigInteger.

-Jonathan

On Mon, Mar 30, 2009 at 2:16 PM, Jonathan Ellis <

[email protected]>

wrote:

Avinash,

You mentioned that you have a new order-preserving hash

function

that

you think will be more generally useful.  Can you post it?

thanks,

-Jonathan

Re: OPHF

Reply via email to