Re: [PATCH v5 02/14] add a hashtable implementation that supports O(1) removal

2013-12-18 Thread Karsten Blees
On 14.12.2013 03:04, Jonathan Nieder wrote:
 Hi,
 
 Karsten Blees wrote:
 
  test-hashmap.c  | 340 
 
 
 Here come two small tweaks on top (meant for squashing in or applying
 to the series, whichever is more convenient).
 
 Thanks,
 Jonathan Nieder (2):
   Add test-hashmap to .gitignore
   Drop unnecessary #includes from test-hashmap
 
  .gitignore | 1 +
  test-hashmap.c | 3 +--
  2 files changed, 2 insertions(+), 2 deletions(-)
 
Thanks, these two make perfect sense.

I'm too damn slow again (merged to next two days ago), but FWIW:
Acked-by: Karsten Blees bl...@dcon.de
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 02/14] add a hashtable implementation that supports O(1) removal

2013-12-13 Thread Jonathan Nieder
Hi,

Karsten Blees wrote:

  test-hashmap.c  | 340 
 

Here come two small tweaks on top (meant for squashing in or applying
to the series, whichever is more convenient).

Thanks,
Jonathan Nieder (2):
  Add test-hashmap to .gitignore
  Drop unnecessary #includes from test-hashmap

 .gitignore | 1 +
 test-hashmap.c | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)


[PATCH v5 02/14] add a hashtable implementation that supports O(1) removal

2013-11-14 Thread Karsten Blees
The existing hashtable implementation (in hash.[ch]) uses open addressing
(i.e. it resolves hash collisions by distributing entries across the table).
Thus, removal is difficult to implement with less than O(n) complexity.
Resolving collisions of entries with identical hashes (e.g. via chaining)
is left to the client code.

Add a hashtable implementation that supports O(1) removal and is slightly
easier to use due to built-in entry chaining.
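
To illustrate why chaining makes removal cheap, here is a minimal sketch
of a chained table. It is not the code from hashmap.c: `struct node`,
`struct table`, `table_add` and `table_remove` are hypothetical names, and
the table size is assumed to be a power of two. Removal only walks the one
bucket the entry hashes to, so it takes expected constant time as long as
chains stay short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chained entry: illustrative only, not the patch's types. */
struct node {
	struct node *next;
	unsigned int hash;
};

struct table {
	struct node **buckets;
	size_t nbuckets; /* assumed to be a power of two */
};

/* Push the entry onto the front of its bucket's chain. */
static void table_add(struct table *t, struct node *n)
{
	struct node **head;

	assert(t->nbuckets && !(t->nbuckets & (t->nbuckets - 1)));
	head = &t->buckets[n->hash & (t->nbuckets - 1)];
	n->next = *head;
	*head = n;
}

/*
 * Unlink the entry from its chain. Only the entry's own bucket is
 * traversed; no other part of the table is touched.
 * Returns 1 if the entry was removed, 0 if it was not in the table.
 */
static int table_remove(struct table *t, struct node *n)
{
	struct node **link = &t->buckets[n->hash & (t->nbuckets - 1)];

	while (*link && *link != n)
		link = &(*link)->next;
	if (!*link)
		return 0;
	*link = n->next;
	n->next = NULL;
	return 1;
}
```

With open addressing, by contrast, an entry may live in a slot that belongs
to a different hash value, so deleting it can disturb later probes.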

It supports all basic operations: init, free, get, add, remove and iteration.

Also includes ready-to-use hash functions based on the public domain FNV-1
algorithm (http://www.isthe.com/chongo/tech/comp/fnv).
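
For readers unfamiliar with FNV-1, the 32-bit variant can be sketched as
follows. This is an illustration of the algorithm, not the patch's code:
`fnv1_mem` and `fnv1_str` are hypothetical names, and the case-insensitive
variants are omitted. The constants are the standard 32-bit FNV offset
basis and prime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FNV32_BASE  UINT32_C(0x811c9dc5) /* 32-bit FNV offset basis */
#define FNV32_PRIME UINT32_C(0x01000193) /* 32-bit FNV prime */

/* FNV-1 over a byte range: multiply by the prime, then XOR in the byte. */
static uint32_t fnv1_mem(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t hash = FNV32_BASE;

	assert(buf != NULL || len == 0);
	while (len--)
		hash = (hash * FNV32_PRIME) ^ *p++;
	return hash;
}

/* Same hash over a 0-terminated string (the terminator is not hashed). */
static uint32_t fnv1_str(const char *s)
{
	assert(s != NULL);
	return fnv1_mem(s, strlen(s));
}
```

The empty string hashes to the offset basis, 0x811c9dc5, and "a" hashes to
0x050c5d7e. FNV-1a would swap the order of the multiply and the XOR.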

The per-entry data structure (hashmap_entry) is piggybacked in front of
the client's data structure to save memory. See test-hashmap.c for usage
examples.
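
The piggybacking idea can be sketched like this. `struct entry_header`
is a stand-in for hashmap_entry (its `next` and `hash` members follow the
API documentation in this patch), while `struct named_entry` and
`from_header` are hypothetical client-side names. Because the header is
the first member, a pointer to it can be converted back to a pointer to
the enclosing structure without a separate allocation or back pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-entry header; not the patch's definition. */
struct entry_header {
	struct entry_header *next;
	unsigned int hash;
};

/* Hypothetical client structure with the header piggybacked in front. */
struct named_entry {
	struct entry_header ent; /* must be the first member */
	int len;                 /* int-sized member can fill the tail padding
	                            after `hash` on typical 64-bit ABIs */
	const char *name;
};

/* Recover the client structure from a pointer to its embedded header. */
static struct named_entry *from_header(struct entry_header *e)
{
	assert(e != NULL);
	return (struct named_entry *)e;
}
```

The cast is valid C because a pointer to a structure and a pointer to its
first member convert to each other.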

The hashtable is resized by a factor of four when 80% full. With these
settings, average memory consumption is about 2/3 of hash.[ch], and
insertion is about twice as fast due to less frequent resizing.
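
The growth policy described above can be sketched as follows. The names
(`GROW_AT_PERCENT`, `GROW_BITS`, `needs_grow`, `next_tablesize`) are
hypothetical, and whether the 80% threshold is inclusive is an assumption;
hashmap.c may compute it differently.

```c
#include <assert.h>
#include <stddef.h>

#define GROW_AT_PERCENT 80 /* grow once the table is more than 80% full */
#define GROW_BITS 2        /* 1 << 2 == 4: grow by a factor of four */

/* Nonzero if `entries` exceeds 80% of `tablesize` (overflow is ignored). */
static int needs_grow(size_t entries, size_t tablesize)
{
	assert(tablesize > 0);
	return entries * 100 > tablesize * GROW_AT_PERCENT;
}

/* Table size to use after an insertion brought the count to `entries`. */
static size_t next_tablesize(size_t entries, size_t tablesize)
{
	return needs_grow(entries, tablesize) ? tablesize << GROW_BITS
					      : tablesize;
}
```

For a table of 64 buckets this keeps the size until the 52nd entry, then
jumps to 256 buckets, so a growing map rehashes half as often as one that
doubles.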

Lookups are also slightly faster, because entries are strictly confined to
their bucket (i.e. no data of other buckets needs to be traversed).
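
A bucket-confined lookup might look like the sketch below. It mirrors the
comparison convention documented in this patch (the compare function
returns 0 for equal entries, and a NULL compare function means equal hash
codes are enough), but `struct node`, `cmp_fn`, `key_cmp` and
`bucket_find` are hypothetical names and the table size is assumed to be a
power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chained entry with a string key; illustrative only. */
struct node {
	struct node *next;
	unsigned int hash;
	const char *key;
};

/* Returns 0 if the two entries are equal, like the patch's cmp function. */
typedef int (*cmp_fn)(const struct node *a, const struct node *b);

static int key_cmp(const struct node *a, const struct node *b)
{
	return strcmp(a->key, b->key);
}

/* Walk only the chain of the bucket that `probe` hashes to. */
static struct node *bucket_find(struct node *const *buckets, size_t nbuckets,
				const struct node *probe, cmp_fn cmp)
{
	struct node *n;

	assert(nbuckets && !(nbuckets & (nbuckets - 1)));
	for (n = buckets[probe->hash & (nbuckets - 1)]; n; n = n->next)
		if (n->hash == probe->hash && (!cmp || !cmp(n, probe)))
			return n;
	return NULL;
}
```

Comparing the stored hash first avoids calling the compare function for
entries that merely share a bucket.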

Signed-off-by: Karsten Blees bl...@dcon.de
Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/technical/api-hashmap.txt | 235 ++
 Makefile                                |   3 +
 hashmap.c                               | 228 +
 hashmap.h                               |  71 +++
 t/t0011-hashmap.sh                      | 240 ++
 test-hashmap.c                          | 340 
 6 files changed, 1117 insertions(+)
 create mode 100644 Documentation/technical/api-hashmap.txt
 create mode 100644 hashmap.c
 create mode 100644 hashmap.h
 create mode 100755 t/t0011-hashmap.sh
 create mode 100644 test-hashmap.c

diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt
new file mode 100644
index 000..b2280f1
--- /dev/null
+++ b/Documentation/technical/api-hashmap.txt
@@ -0,0 +1,235 @@
+hashmap API
+===========
+
+The hashmap API is a generic implementation of hash-based key-value mappings.
+
+Data Structures
+---------------
+
+`struct hashmap`::
+
+   The hash table structure.
++
+The `size` member keeps track of the total number of entries. The `cmpfn`
+member is a function used to compare two entries for equality. The `table` and
+`tablesize` members store the hash table and its size, respectively.
+
+`struct hashmap_entry`::
+
+   An opaque structure representing an entry in the hash table, which must
+   be used as first member of user data structures. Ideally it should be
+   followed by an int-sized member to prevent unused memory on 64-bit
+   systems due to alignment.
++
+The `hash` member is the entry's hash code and the `next` member points to the
+next entry in case of collisions (i.e. if multiple entries map to the same
+bucket).
+
+`struct hashmap_iter`::
+
+   An iterator structure, to be used with hashmap_iter_* functions.
+
+Types
+-----
+
+`int (*hashmap_cmp_fn)(const void *entry, const void *entry_or_key, const void *keydata)`::
+
+   User-supplied function to test two hashmap entries for equality. Shall
+   return 0 if the entries are equal.
++
+This function is always called with non-NULL `entry` / `entry_or_key`
+parameters that have the same hash code. When looking up an entry, the `key`
+and `keydata` parameters to hashmap_get and hashmap_remove are always passed
+as second and third argument, respectively. Otherwise, `keydata` is NULL.
+
+Functions
+---------
+
+`unsigned int strhash(const char *buf)`::
+`unsigned int strihash(const char *buf)`::
+`unsigned int memhash(const void *buf, size_t len)`::
+`unsigned int memihash(const void *buf, size_t len)`::
+
+   Ready-to-use hash functions for strings, using the FNV-1 algorithm (see
+   http://www.isthe.com/chongo/tech/comp/fnv).
++
+`strhash` and `strihash` take 0-terminated strings, while `memhash` and
+`memihash` operate on arbitrary-length memory.
++
+`strihash` and `memihash` are case insensitive versions.
+
+`void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`::
+
+   Initializes a hashmap structure.
++
+`map` is the hashmap to initialize.
++
+The `equals_function` can be specified to compare two entries for equality.
+If NULL, entries are considered equal if their hash codes are equal.
++
+If the total number of entries is known in advance, the `initial_size`
+parameter may be used to preallocate a sufficiently large table and thus
+prevent expensive resizing. If 0, the table is dynamically resized.
+
+`void hashmap_free(struct hashmap *map, int free_entries)`::
+
+   Frees a hashmap structure and allocated memory.
++
+`map` is the hashmap to free.
++
+If `free_entries` is true, each hashmap_entry in the map is freed as well
+(using stdlib's free()).
+
+`void