Basically, I am trying to translate the following Python data structure into Cython:
source_index = int
target_index = int
phrase_count = float
phrase_prob = float
phrase_table = {}
subdict = {}
subdict[sindex] = [count, prob]
phrase_table[tindex] = subdict
So that a call to:
phrase_table[tindex][sindex] gives the count and prob.
Sorry for the confusion; in my previous mail I described this as:
Key : int ---> value : hashtable { key: int ---> value: list(float, float) }
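For reference, the pure-Python structure described above can be sketched as follows. This is only an illustration: the helper name add_phrase and the sample indices and values are made up, and a tuple is used for the (count, prob) pair.

```python
# Nested dict: target index -> {source index -> (count, prob)}
phrase_table = {}

def add_phrase(table, tindex, sindex, count, prob):
    """Store (count, prob) under table[tindex][sindex].

    The sub-dict for tindex is created on first use.
    """
    table.setdefault(tindex, {})[sindex] = (count, prob)

# Hypothetical sample entries.
add_phrase(phrase_table, 7, 3, 12.0, 0.25)
add_phrase(phrase_table, 7, 4, 2.0, 0.05)

# A lookup by target index, then source index, gives the pair back.
count, prob = phrase_table[7][3]
```

Both entries for target index 7 end up in the same sub-dict, so phrase_table[7] plays the role of "subdict".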
My two main concerns with using this hashtable are:
- Can I refer to the "subdict" hashtable through the original void
*HashTableValue? And how do I cast it?
- Can I store two floats in a void *HashTableValue?
In addition, I have started to create what I want, but I am still having
some difficulties. Attached is my .pyx file.
Some questions I have are:
- void_star_to_hashtable is obviously wrong, but why exactly?
cdef c_HashTable void_star_to_hashtable(void* v):
    cdef void** b = [v]
    cdef c_HashTable* a = <c_HashTable*>b
    return a[0]
- Line 86:
sub_dict = <c_HashTable*>void_star_to_hashtable(
    hash_table_lookup(self._base, int_to_void_star(tindex)))
Why do I need a cast here?
-----Original Message-----
From: Robert Bradshaw [mailto:[email protected]]
Sent: Wednesday, 9 September 2009 18:48
To: [email protected]
Subject: Re: [Cython] FW: cython and hash tables / dictionary
On Sep 9, 2009, at 7:18 AM, Sanne Korzec wrote:
> Ok, I have played around with this hash table and understand most of
> the basics...
>
> I now would like to create the datastructure I need for my project.
>
> Basically what I need is two linked hash tables like this
>
> Key : int ---> value : hashtable { key: int ---> value: list(float, float) }
>
> And
>
> Key : int ---> value : hashtable { key: int ---> value: float }
I'm not quite sure exactly what your notation means here. You need a
hashtable from ints to floats, and another one from ints to pairs of
floats?
> Since this hash table works with void pointers for key and value only,
> I am starting to wonder if I should change the .c and .h files myself.
> I think I would prefer not to.
The way this hashtable is intended to be used is that you malloc some
room for your keys/values, and then pass the pointers into the table
itself. Of course this is a bit of overhead, both in terms of runtime
(all the malloc/free calls) and manual labor.
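The "allocate the value, then pass a pointer to it" pattern can be tried out from plain Python with ctypes before committing to Cython. This is a minimal sketch under assumptions: the CountProb struct, its field names, and the two helper functions are hypothetical, ctypes manages the memory instead of malloc/free, and the C hash table itself is not involved.

```python
import ctypes

class CountProb(ctypes.Structure):
    # Hypothetical value record: the two floats stored per entry.
    _fields_ = [("count", ctypes.c_double), ("prob", ctypes.c_double)]

def to_void_star(record):
    """Return the record's address as an untyped pointer (like void*)."""
    return ctypes.cast(ctypes.pointer(record), ctypes.c_void_p)

def from_void_star(raw):
    """Cast an untyped pointer back to the record it points at."""
    return ctypes.cast(raw, ctypes.POINTER(CountProb)).contents

record = CountProb(12.0, 0.25)   # must stay alive while the pointer is in use
raw = to_void_star(record)       # what would be handed to the table as a value
restored = from_void_star(raw)   # what a lookup would cast the value back to
restored.prob = 0.5              # writes through to the same memory as record
```

The point of the sketch is that one pointer-sized value can refer to a record holding both floats, as long as the record outlives every pointer to it.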
> Does anybody have a suggestion what would be wise? Preference goes to
> quick implementation, not total optimization.
First, I'd see how fast Python hashtables work for you. That might be
good enough. You could look into creating a special PairOfFloats cdef
class to try to cut down on the list/tuple overhead. If that doesn't
work, the next thing would be to use the structure above (manually
malloc-ing room for all the stuff, though if your ints fit into a void*
you could use the same casting trick). Failing that, you could write
your own custom hash table.
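A pure-Python stand-in for the pair class idea might look like the sketch below. The attribute names count and prob are assumptions; in Cython the same shape would be written as a cdef class with two cdef double attributes, which is not shown here.

```python
class PairOfFloats:
    """Pure-Python stand-in for a small class holding two floats."""

    # __slots__ avoids a per-instance __dict__, keeping instances compact.
    __slots__ = ("count", "prob")

    def __init__(self, count, prob):
        self.count = float(count)
        self.prob = float(prob)

# Hypothetical entry: target index 7, source index 3.
phrase_table = {7: {3: PairOfFloats(12, 0.25)}}

entry = phrase_table[7][3]
entry.count += 1.0  # updated in place; no new list or tuple is created
```

Measuring this against the plain list or tuple version on real data would show whether the extra class is worth it.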
As has been discovered, Python hashtables are pretty good, so expect
at most a 10x (?) improvement writing your own.
- Robert
Attachment: cy_hash.pyx
