I know I still have a lot of holes to plug, but this was more interesting because we could see some encouraging numbers. Unfortunately the result is disappointing. Maybe I did it in a stupid way and need to start over with a totally different approach.
"rev-list --objects" on v2 takes 4 secs, v4 with the current walker 11s and the new walker 16s (worst!). perf's top functions with v2 are

    23,51%  git  libz.so.1.2.7  [.] inflate
    16,66%  git  git            [.] lookup_object
    11,46%  git  libz.so.1.2.7  [.] inflate_fast
     6,89%  git  libc-2.16.so   [.] __memcpy_ssse3_back
     4,19%  git  libz.so.1.2.7  [.] inflate_table
     4,15%  git  git            [.] find_pack_entry_one
     3,84%  git  git            [.] decode_tree_entry

and with the new walker

    58,61%  git  git            [.] decode_entries
    18,66%  git  git            [.] decode_varint
     9,73%  git  git            [.] use_pack
     3,31%  git  git            [.] nth_packed_object_offset
     1,73%  git  git            [.] process_tree
     1,66%  git  git            [.] pv4_lookup_blob
     1,09%  git  git            [.] get_pathref
     1,03%  git  libc-2.16.so   [.] __memcpy_ssse3_back
     0,90%  git  libz.so.1.2.7  [.] inflate
     0,50%  git  libz.so.1.2.7  [.] inflate_table

It's no surprise that lookup_object is no longer hot. The closest thing is pv4_lookup_blob. nth_packed_object_offset is getting hotter as it's used extensively by decode_entries.

And decode_entries is getting too hot. This function is now called for each tree entry of every tree, and it does a get_tree_offset_cache() lookup on every call (ironically, we try hard to avoid the hash lookup in lookup_object).

The only bit I haven't done is checking whether a tree has already been examined and, if so, not bothering with copy sequences that refer to it. That should cut down the number of decode_entries calls, but I'm not sure by how much, because there's no relation between the tree traversal order and how copy sequences are made.

Maybe we could make an exception and allow the tree walker to pass pv4_tree_cache* directly to decode_entries so it does not need to do the first lookup every time. Suggestions?
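To make the last suggestion concrete, here is a toy sketch of the difference between looking up the cache slot on every per-entry call and resolving it once in the walker. This is not the code from the series: struct tree_cache, lookup_tree_cache(), decode_entry_cached() and both walk_*() functions are made-up stand-ins, and the actual decoding is elided. It only counts how many table lookups each shape costs.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 64	/* hypothetical table size */

/* Hypothetical stand-in for a cached, partially parsed v4 tree. */
struct tree_cache {
	size_t offset;		/* pack offset of the tree */
	unsigned long decoded;	/* entries decoded through this slot */
};

static struct tree_cache cache[CACHE_SLOTS];
static unsigned long nr_lookups;	/* how many times we hit the table */

/* Toy direct-mapped lookup keyed by pack offset; every call is counted. */
static struct tree_cache *lookup_tree_cache(size_t offset)
{
	struct tree_cache *c = &cache[offset % CACHE_SLOTS];

	nr_lookups++;
	if (c->offset != offset) {	/* miss: (re)fill the slot */
		c->offset = offset;
		c->decoded = 0;
	}
	return c;
}

/* Decode one entry from an already resolved slot (real decoding elided). */
static unsigned decode_entry_cached(struct tree_cache *c, unsigned idx)
{
	assert(c != NULL);
	c->decoded++;
	return idx;
}

/* Per-entry lookup: the walker only knows the offset. Returns lookups done. */
unsigned long walk_by_offset(size_t offset, unsigned nr_entries)
{
	unsigned i;

	nr_lookups = 0;
	for (i = 0; i < nr_entries; i++)
		decode_entry_cached(lookup_tree_cache(offset), i);
	return nr_lookups;
}

/* Handle passing: resolve the slot once, then reuse the pointer. */
unsigned long walk_by_handle(size_t offset, unsigned nr_entries)
{
	struct tree_cache *c;
	unsigned i;

	nr_lookups = 0;
	c = lookup_tree_cache(offset);
	for (i = 0; i < nr_entries; i++)
		decode_entry_cached(c, i);
	return nr_lookups;
}
```

For a tree with 1000 entries, walk_by_offset(4096, 1000) performs 1000 lookups while walk_by_handle(4096, 1000) performs one. Whether that translates into a real win depends on how much of decode_entries' time is actually the lookup, which the profile above does not break down.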
Nguyễn Thái Ngọc Duy (9):
  sha1_file: provide real packed type in object_info_extended
  pack v4: move v2 tree entry generation code out of decode_entries
  pv4_tree_desc: introduce new struct for pack v4 tree walker
  pv4_tree_desc: use struct tree_desc from pv4_tree_desc
  pv4_tree_desc: allow decode_entries to return v4 trees, one at a time
  pv4_tree_desc: complete interface
  pv4_tree_desc: don't bother looking for v4 trees if no v4 packs are present
  pv4_tree_desc: avoid lookup_object() when possible
  list-object.c: take "advantage" of new pv4_tree_desc interface

 cache.h        |   3 +-
 list-objects.c |  38 +++++----
 packv4-parse.c | 263 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 packv4-parse.h |  48 +++++++++++
 sha1_file.c    |   9 +-
 streaming.c    |   9 +-
 6 files changed, 300 insertions(+), 70 deletions(-)

-- 
184.108.40.206.gc99314b