Re: [PATCH v2] pack-objects: fix performance issues on packing large deltas

2018-07-26 Thread Johannes Sixt

Am 22.07.2018 um 10:04 schrieb Nguyễn Thái Ngọc Duy:

+   if (size < pack->oe_delta_size_limit) {
+   e->delta_size_ = size;
+   e->delta_size_valid = 1;
+   } else {
+   packing_data_lock(pack);
+   if (!pack->delta_size)
+   ALLOC_ARRAY(pack->delta_size, pack->nr_alloc);
+   packing_data_unlock(pack);
+
+   pack->delta_size[e - pack->objects] = size;


My first thought was that this is wrong (falling prey to the same 
mistake as the double-checked locking pattern). But after thinking twice 
over it again, I think that this unprotected access of pack->delta_size 
is thread-safe.


Of course, I'm assuming that different threads never assign to the same 
array index.



+   e->delta_size_valid = 0;
+   }
  }


-- Hannes


Re: [PATCH v2] pack-objects: fix performance issues on packing large deltas

2018-07-23 Thread Jeff King
On Mon, Jul 23, 2018 at 08:49:59PM +0200, Duy Nguyen wrote:

> On Mon, Jul 23, 2018 at 8:38 PM Duy Nguyen  wrote:
> > I will have to study the thread dispatch code more to have a better
> > answer, unfortunately.
> 
> Well.. I thought I would need this weekend for this, but a quick look
> and ll_find_deltas() suggests that what we're doing is safe. At least
> you and Jeff are way to familiar with the delta window concept in
> pack-objects. So in multithread mode, we have a big list of all
> objects, then the list is cut in N sublists and each is owned entirely
> by one thread. Each thread then can slide a window over this sublist
> to search for the best delta.
> 
> Because of this partitioning, if trg_entry is being processed now, it
> will not be accessed by any other thread. It's owned by this thread
> and will be accessed again as src_entry when the next entry becomes
> the delta target in the same thread. As long as we don't access
> objects of another thread (and my v1 violates this) we should be ok.

Yes, that matches my knowledge of how this all works. And if it didn't,
I think the code would have been racy even _before_ your patches.

The only thing that this pack->delta_size approach is changing is that
managing that array needs to happen under lock, because it touches the
whole list.

And as long as we back-fill it from any arbitrary e->delta_size_, that
means that touching e->delta_size_ needs to be done under lock. That's
why I think keeping the individual valid flag in each entry makes the
most sense. Then whenever we overflow a particular e->delta_size_, we
don't have to care about anybody else's size.

Which I think is what your v2 patch is doing.

-Peff


Re: [PATCH v2] pack-objects: fix performance issues on packing large deltas

2018-07-23 Thread Duy Nguyen
On Mon, Jul 23, 2018 at 8:38 PM Duy Nguyen  wrote:
> I will have to study the thread dispatch code more to have a better
> answer, unfortunately.

Well.. I thought I would need this weekend for this, but a quick look
and ll_find_deltas() suggests that what we're doing is safe. At least
you and Jeff are way to familiar with the delta window concept in
pack-objects. So in multithread mode, we have a big list of all
objects, then the list is cut in N sublists and each is owned entirely
by one thread. Each thread then can slide a window over this sublist
to search for the best delta.

Because of this partitioning, if trg_entry is being processed now, it
will not be accessed by any other thread. It's owned by this thread
and will be accessed again as src_entry when the next entry becomes
the delta target in the same thread. As long as we don't access
objects of another thread (and my v1 violates this) we should be ok.

At the end of ll_find_deltas() though, we have the "stealing" stuff,
but it's protected by progress_lock() already, outside try_delta() so
we're sure nobody is updating any object_entry when the stealing
happens.

I will double check again this weekend just to be sure.
-- 
Duy


Re: [PATCH v2] pack-objects: fix performance issues on packing large deltas

2018-07-23 Thread Duy Nguyen
On Mon, Jul 23, 2018 at 8:04 PM Junio C Hamano  wrote:
>
> Nguyễn Thái Ngọc Duy   writes:
>
> > Access to e->delta_size_ (and by extension
> > pack->delta_size[e - pack->objects]) is unprotected as before, the
> > thread scheduler in pack-objects must make sure "e" is never updated
> > by two different threads.
>
> OK.  Do we need to worry about "e" (e.g. "e->delta_size_valid")
> being accessed while/before it is set by another thread?

I left some details out because I think it's getting to a gray area.
We already do this

if (!DELTA(trg_entry)) {
max_size = trg_size/2 - the_hash_algo->rawsz;
ref_depth = 1;
} else {
max_size = DELTA_SIZE(trg_entry);
...
SET_DELTA(trg_entry, src_entry);
SET_DELTA_SIZE(trg_entry, delta_size);

if the bottom half is running on one thread and stopped in the middle
while the top half is running in another thread, we have a problem.
But perhaps max_size is not that big a deal because incorrect max_size
may produce a bad pack but can't corrupt it.

I will have to study the thread dispatch code more to have a better
answer, unfortunately.
--
Duy


Re: [PATCH v2] pack-objects: fix performance issues on packing large deltas

2018-07-23 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> Access to e->delta_size_ (and by extension
> pack->delta_size[e - pack->objects]) is unprotected as before, the
> thread scheduler in pack-objects must make sure "e" is never updated
> by two different threads.

OK.  Do we need to worry about "e" (e.g. "e->delta_size_valid")
being accessed while/before it is set by another thread?

oe_delta_size() makes unprotected accesses to .delta_size_ and
pack->delta_size[e - pack->objects], so we apparently do not, and
oe_set_delta_size() only protects the allocation call and does not
prevent a reader in oe_delta_size() from first reading the _valid
field, noticing that it is 0 as initialized, and goes on to read the
pack->delta_size[] slot for the entry, while the writer is setting
the size to .delta_size_ field and flipping _valid bit, without ever
storing the size in the pack->delta_size[] array.

> @@ -130,6 +131,7 @@ struct packing_data {
>   uint32_t index_size;
>  
>   unsigned int *in_pack_pos;
> + unsigned long *delta_size;
>  
>   /*
>* Only one of these can be non-NULL and they have different
> @@ -140,10 +142,29 @@ struct packing_data {
>   struct packed_git **in_pack_by_idx;
>   struct packed_git **in_pack;
>  
> +#ifndef NO_PTHREADS
> + pthread_mutex_t lock;

I am wondering if we want the variable to say what data it is
protecting from simultaneous accesses, or leave it as generic so
that any new caller that wants to lock any (new) thing that is
associated with a packing_data structure can grab it for other
purposes.  The design of this patch clearly is the latter, which is
OK for now, I think.

> @@ -332,18 +353,34 @@ static inline unsigned long oe_delta_size(struct 
> packing_data *pack,
>  {
>   if (e->delta_size_valid)
>   return e->delta_size_;
> - return oe_size(pack, e);
> +
> + /*
> +  * pack->detla_size[] can't be NULL because oe_set_delta_size()
> +  * must have been called when a new delta is saved with
> +  * oe_set_delta().
> +  * If oe_delta() returns NULL (i.e. default state, which means
> +  * delta_size_valid is also false), then the caller must never
> +  * call oe_delta_size().
> +  */
> + return pack->delta_size[e - pack->objects];
>  }
>  
>  static inline void oe_set_delta_size(struct packing_data *pack,
>struct object_entry *e,
>unsigned long size)
>  {
> - e->delta_size_ = size;
> - e->delta_size_valid = e->delta_size_ == size;
> - if (!e->delta_size_valid && size != oe_size(pack, e))
> - BUG("this can only happen in check_object() "
> - "where delta size is the same as entry size");
> + if (size < pack->oe_delta_size_limit) {
> + e->delta_size_ = size;
> + e->delta_size_valid = 1;
> + } else {
> + packing_data_lock(pack);
> + if (!pack->delta_size)
> + ALLOC_ARRAY(pack->delta_size, pack->nr_alloc);
> + packing_data_unlock(pack);
> +
> + pack->delta_size[e - pack->objects] = size;
> + e->delta_size_valid = 0;
> + }
>  }