On Tue, Sep 3, 2013 at 1:42 AM, Xavier Hernandez <[email protected]>wrote:
> Al 03/09/13 09:33, En/na Anand Avati ha escrit: > > On Mon, Sep 2, 2013 at 7:24 AM, Xavier Hernandez <[email protected]>wrote: > >> Hi, >> >> dict_t structures are widely used in glusterfs. I've some ideas that >> could improve its performance. >> >> * On delete operations, return the current value if it exists. >> >> This is very useful when we want to get a value and remove it from the >> dictionary. This way it can be done accessing and locking the dict_t only >> once (and it is atomic). >> > > Makes sense. > > > >> * On add operations, return the previous value if it existed. >> >> This avoids to use a lookup and a conditional add (and it is atomic). >> > > Do you mean dict_set()? If so, how do you propose we differentiate > between "failure" and "previous value did not exist"? Do you propose > setting the previous value into a pointer to pointer, and retain the return > value as is today? > > Yes, I'm thinking to something similar to dict_set() (by the way, I would > remove the dict_add() function). > dict_add() is used in unserialization routines where dict_set() for a big set of keys guaranteed not to repeat is very expensive (unserializing would otherwise have a quadratic function as its asymptote). What is the reason you intend to remove it? > What you propose would be the simplest solution right now. However I think > it would be interesting to change the return value to an error code (this > would supply more detailed information in case of failure and we could use > EEXIST to know if the value already existed. In fact I think it would be > interesting to progressively change the -1 return code of many functions by > an error code). The pointer to pointer argument could be NULL if the > previous value is not needed. > > Of course this would change the function signature, breaking a lot of > existing code. Another possibility could be to create a dict_replace() > function, and possibly make it to fail if the value didn't exist. > It is best we do not change the meaning of existing APIs, and just add new APIs instead. The new API can be: int dict_replace (dict_t *dict, const char *key, data_t *newval, data_t **oldval); .. and leave dict_set() as is. > > > >> * Always return the data_pair_t structure instead of data_t or the data >> itself. >> >> This can be useful to avoid future lookups or other operations on the >> same element. Macros can be created to simplify writing code to access the >> actual value. >> > > The use case is not clear. A more concrete example will help.. > > Having a data_pair_t could help to navigate from an existing element > (getting next or previous. This is really interesting if dict where > implemented using a sorted structure like a trie since it would allow to > process a set of similar entries very fast, like the trusted.afr.<brick> > values for example) or removing or replacing it without needing another > lookup (a more detailed analysis would be needed to see how to handle race > conditions). > > By the way, is really the dict_t structure used concurrently ? I haven't > analyzed all the code deeply, but it seems to me that every dict_t is only > accessed from a single place at once. > There have been instances of dict_t getting used concurrently, when used as xdata and in xattrop (by AFR). There have been bugs in the past with concurrent dict access. > > >> * Use a trie instead of a hash. >> >> A trie structure is a bit more complex than a hash, but only processes >> the key once and does not need to compute the hash. A test implementation I >> made with a trie shows a significant improvement in dictionary operations. >> > > There is already an implementation of trie in libglusterfs/src/trie.c. > Though it does not compact (collapse) single-child nodes upwards into the > parent. In any case, let's avoid having two implementations of tries. > > I know. The current implementation wastes a lot of memory because it uses > an array of 256 pointers, and in some places it needs to traverse the > array. Not a b¡g deal, but if it is made many times it could be noticeable. > In my test I used a trie with 4 child pointers (with collapsing > single-child nodes) that runs a bit faster than the 256 implementation and > uses much less memory. I tried with 2, 4, 16 and 256 childs per node, and 4 > seems to be the best (at least for dictionary structures) though there are > very little difference between 4 and 16 in terms of speed. > The 256 child pointers give you constant time lookup for the next level child with just an offset indirection. With smaller fan-out, do you search through the list? Can you show an example of this? Collapsing single child node upwards is badly needed though. > I agree that it is not good to maintain two implementations of the same > thing. Maybe we could change the trie implementation. It should be > transparent. > Yes, I believe the current API can accommodate such internal changes. > > > * Implement dict_foreach() as a macro (similar to kernel's >> list_for_each()). >> >> This gives more control and avoids the need of helper functions. >> > > This makes sense too, but there are quite a few users of dict_foreach in > the existing style. Moving them all over might be a pain. > > Maybe we could create a differently named macro to implement this feature > and allow the developers to slowly change it. The old implementation could > be flagged as deprecated and use the new one for new code. Old code will > have enough time to change it until eventually the old implementation is > removed. > > If we make important changes to the dict_t structure, we could replace > current functions by macros that use the new implementation but simulates > the old behavior. > Sounds OK. > > > >> Additionally, I think it's possible to redefine structures to reduce the >> number of allocations and pointers used for each element (actual data, >> data_t, data_pair_t and key). >> > > This is highly desirable. There was some effort from Amar in the past ( > http://review.gluster.org/3910) but it has been in need of attention for > some time. It would be intersting to know if you were thinking along > similar lines? > > Yes, it is quite similar though I should analyze it more deeply. I > would also try to remove some unused/unneeded fields that are used in very > few places, add complexity and can be replaced easily, like extra_free and > extra_stdfree in dict_t for example. > > Thanks, Avati
_______________________________________________ Gluster-devel mailing list [email protected] https://lists.nongnu.org/mailman/listinfo/gluster-devel
