Re: [PATCH 08/16] ewah: compressed bitmap implementation
Vicent Marti tan...@gmail.com writes: The library is re-licensed under the GPLv2 with the permission of Daniel Lemire, the original author. This says GPLv2, but the license blurbs all say or (at your option) any later version. IANAL, does this cause any problems? If so, can they be GPLv2-only instead? Makefile |6 + ewah/bitmap.c | 229 + ewah/ewah_bitmap.c | 703 ewah/ewah_io.c | 199 +++ ewah/ewah_rlw.c| 124 + ewah/ewok.h| 194 +++ ewah/ewok_rlw.h| 114 + Can we have a Documentation/technical/api-ewah.txt? (Maybe if you insert all the comments I ask for in the below, it's not necessary, but it would still be nice to have some central place where the formats are documented.) [...] +struct ewah_bitmap *bitmap_to_ewah(struct bitmap *bitmap) +{ + struct ewah_bitmap *ewah = ewah_new(); + size_t i, running_empty_words = 0; + eword_t last_word = 0; + + for (i = 0; i bitmap-word_alloc; ++i) { + if (bitmap-words[i] == 0) { + running_empty_words++; + continue; + } + + if (last_word != 0) { + ewah_add(ewah, last_word); + } There are a lot of noisy braces -- like in this instance -- if you apply the git style to the files in ewah/. I assume we'll give the directory its own style, so that it should always use braces even on one-line blocks. [...] + ewah_add(ewah, last_word); + return ewah; +} + +struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah) +{ + struct bitmap *bitmap = bitmap_new(); + struct ewah_iterator it; + eword_t blowup; + size_t i = 0; + + ewah_iterator_init(it, ewah); + + while (ewah_iterator_next(blowup, it)) { + if (i = bitmap-word_alloc) { + bitmap-word_alloc *= 1.5; Any reason that this uses a scale factor of 1.5, while the bitmap_set operation above uses 2? + bitmap-words = ewah_realloc( + bitmap-words, bitmap-word_alloc * sizeof(eword_t)); + } [...] + +void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data) +{ [...] + for (offset = 0; offset BITS_IN_WORD; ++offset) { + if ((word offset) == 0) + break; + + offset += __builtin_ctzll(word offset); Here and in the rest, you use __builtin_* within the code. This needs to be either in a separate helper that reimplements the function in terms of C if it is not available (i.e. you don't use GCC). (Alternatively, the whole series could be conditional on some HAVE_GCC_BUILTINS macro. I'd think that would be a bad tradeoff though.) + callback(pos + offset, data); + } + pos += BITS_IN_WORD; + } + } +} [...] diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c [...] +void ewah_free(struct ewah_bitmap *bitmap) +{ + if (bitmap-alloc_size) + free(bitmap-buffer); + + free(bitmap); +} Maybe first if (!bitmap) return, so that it behaves like other free()s? diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c [...] +int ewah_serialize_native(struct ewah_bitmap *self, int fd) +{ + uint32_t write32; + size_t to_write = self-buffer_size * 8; + + /* 32 bit -- bit size fr the map */ You cutpasted the typo (for) throughout the file :-) [...] + /** 32 bit -- number of compressed 64-bit words */ + write32 = (uint32_t)self-buffer_size; + if (write(fd, write32, 4) != 4) + return -1; + + if (write(fd, self-buffer, to_write) != to_write) + return -1; Shouldn't you use our neat write_in_full() and read_in_full() helpers, throughout the file? [...] diff --git a/ewah/ewok.h b/ewah/ewok.h [...] +#ifndef __EWOK_BITMAP_C__ +#define __EWOK_BITMAP_C__ _H_? +#ifndef ewah_malloc +#define ewah_malloc malloc +#endif +#ifndef ewah_realloc +#define ewah_realloc realloc +#endif +#ifndef ewah_calloc +#define ewah_calloc calloc +#endif I see you later #define them to the corresponding x*alloc version in pack-bitmap.h. Good. + +typedef uint64_t eword_t; I assume this isn't ifdef'd to help 32bit platforms because the on-disk format depends on it? +#define BITS_IN_WORD (sizeof(eword_t) * 8) + +struct ewah_bitmap { + eword_t *buffer; + size_t buffer_size; + size_t alloc_size; + size_t bit_size; + eword_t *rlw; +}; + +typedef void (*ewah_callback)(size_t pos, void *); + +struct ewah_bitmap *ewah_pool_new(void); +void ewah_pool_free(struct ewah_bitmap *bitmap); How do the pool versions differ from the non-pool versions below? I would have expected
Re: [PATCH 08/16] ewah: compressed bitmap implementation
Junio C Hamano gits...@pobox.com writes: Vicent Marti tan...@gmail.com writes: The library is re-licensed under the GPLv2 with the permission of Daniel Lemire, the original author. The source code for the C version can be found on GitHub: https://github.com/vmg/libewok The original Java implementation can also be found on GitHub: https://github.com/lemire/javaewah --- Please make sure that all patches are properly signed off. Makefile |6 + ewah/bitmap.c | 229 + ewah/ewah_bitmap.c | 703 ewah/ewah_io.c | 199 +++ ewah/ewah_rlw.c| 124 + ewah/ewok.h| 194 +++ ewah/ewok_rlw.h| 114 + This is lovely. A few comments after an initial quick scan-through. - The code and the headers are well commented, which is good. - What's __builtin_popcountll() doing there in a presumably generic codepath? - Two variants of bitmap are given different and easy to understand type names (vanilla one is bitmap, the clever one is ewah_bitmap), but at many places, a pointer to ewah_bitmap is simply called bitmap or bitmap_i without ewah anywhere, which waas confusing to read. Especially, the NAND operation for bitmap takes two bitmaps, while OR takes one bitmap and ewah_bitmap. That is fine as long as the combination is convenient for callers, but I wished the ewah variables be called with ewah somewhere in their names. - I compile with -Werror -Wdeclaration-after-statement; some places seem to trigger it. - Some extern declarations in *.c sources were irritating; shouldn't they be declared in *.h file and included? - There are some instances of if (condition) stmt; on a single line; looked irritating. - bool is not a C type we use (and not a particularly good type in C++, either). One more. - Use of unnecessary float (e.g. oldval *= 1.5) were moderately annoying. That is it for now. I am looking forward to read through the users of the library ;-) Thanks for working on this. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 08/16] ewah: compressed bitmap implementation
Vicent Marti tan...@gmail.com writes: The library is re-licensed under the GPLv2 with the permission of Daniel Lemire, the original author. The source code for the C version can be found on GitHub: https://github.com/vmg/libewok The original Java implementation can also be found on GitHub: https://github.com/lemire/javaewah --- Please make sure that all patches are properly signed off. Makefile |6 + ewah/bitmap.c | 229 + ewah/ewah_bitmap.c | 703 ewah/ewah_io.c | 199 +++ ewah/ewah_rlw.c| 124 + ewah/ewok.h| 194 +++ ewah/ewok_rlw.h| 114 + This is lovely. A few comments after an initial quick scan-through. - The code and the headers are well commented, which is good. - What's __builtin_popcountll() doing there in a presumably generic codepath? - Two variants of bitmap are given different and easy to understand type names (vanilla one is bitmap, the clever one is ewah_bitmap), but at many places, a pointer to ewah_bitmap is simply called bitmap or bitmap_i without ewah anywhere, which waas confusing to read. Especially, the NAND operation for bitmap takes two bitmaps, while OR takes one bitmap and ewah_bitmap. That is fine as long as the combination is convenient for callers, but I wished the ewah variables be called with ewah somewhere in their names. - I compile with -Werror -Wdeclaration-after-statement; some places seem to trigger it. - Some extern declarations in *.c sources were irritating; shouldn't they be declared in *.h file and included? - There are some instances of if (condition) stmt; on a single line; looked irritating. - bool is not a C type we use (and not a particularly good type in C++, either). That is it for now. I am looking forward to read through the users of the library ;-) Thanks for working on this. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html