Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-14 Thread Jeff King
On Tue, Aug 13, 2019 at 10:00:44PM +0200, Johannes Schindelin wrote:

> > Jeff King  writes:
> >
> > > I think it might be worth just eliminating the whole idea.
> >
> > I kinda like the simplification ;-) An even thinner wrapper that
> > calls malloc() and dies if it gets NULL, without any "try-to-free"
> > logic.
> 
> This is one of those instances where I wish we would have some reliable
> data rather than having to guess whether it is a good idea or not.

I wish we did, too. If you have an idea how to collect such data, I'm
all ears.

In the absence of that, I've made an argument that it's probably the
right thing to do, and we can see if cooking it over a release cycle
introduces any complaints. That's far from perfect (in particular, I
wouldn't be surprised if very few 32-bit users test non-releases), but I
don't have other ideas.

-Peff


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-13 Thread Johannes Schindelin
Hi,

On Mon, 12 Aug 2019, Junio C Hamano wrote:

> Jeff King  writes:
>
> > I think it might be worth just eliminating the whole idea.
>
> I kinda like the simplification ;-) An even thinner wrapper that
> calls malloc() and dies if it gets NULL, without any "try-to-free"
> logic.

This is one of those instances where I wish we would have some reliable
data rather than having to guess whether it is a good idea or not.

Ciao,
DScho


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-12 Thread Jeff King
On Mon, Aug 12, 2019 at 01:04:55PM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I think it might be worth just eliminating the whole idea.
> 
> I kinda like the simplification ;-) An even thinner wrapper that
> calls malloc() and dies if it gets NULL, without any "try-to-free"
> logic.

Here it is as a patch. Note that my earlier argument around window size
is a bit questionable; the real impact there is core.packedGitLimit,
which I've clarified below.

-- >8 --
Subject: [PATCH] packfile: drop release_pack_memory()

Long ago, in 97bfeb34df (Release pack windows before reporting out of
memory., 2006-12-24), we taught xmalloc() and friends to try unmapping
pack windows when malloc() failed. It's unlikely that his helps a lot in
practice, and it has some downsides. First, the downsides:

  1. It makes xmalloc() not thread-safe. We've worked around this in
 pack-objects.c, which installs its own locking version of the
 try_to_free_routine(). But other threaded code doesn't.

  2. It makes the system as a whole harder to reason about. Functions
 which allocate heap memory under the hood may have farther-reaching
 effects than expected.

That might be worth the tradeoff if there's a benefit. But in practice,
it seems unlikely. We're generally dealing with mmap'd files, so the OS
is going to do a much better job at responding to memory pressure by
dropping individual pages (the exception is systems with NO_MMAP, but
even there the OS can probably respond just as well with swapping).

So the only thing we're really freeing is address space. On 64-bit
systems, we have plenty of that to go around. On 32-bit systems, it
could possibly help. But around the same time we made two other changes:
77ccc5bbd1 (Introduce new config option for mmap limit., 2006-12-23) and
60bb8b1453 (Fully activate the sliding window pack access., 2006-12-23).
Together that means that a 32-bit system should have no more than 256MB
total of packed-git mmaps at one time, split between a few 32MB windows.
It's unlikely we have any address space problems since then, but we
don't have any data since the features were all added at the same time.

Likewise, xmmap() will try to free memory. At first glance, it seems
like we'd need this (when we try to mmap a new window, we might need to
close an old one to save address space on a 32-bit system). But we're
saved again by core.packedGitLimit: if we're going to exceed our 256MB
limit, we'll close an existing window before we even call mmap().

So it seems unlikely that this feature is actually doing anything
useful. And while we don't have reports of it harming anything (probably
because it rarely if ever kicks in), it would be nice to simplify the
system overall. This patch drops the whole try_to_free system from
xmalloc(), as well as the manual pack memory release in xmmap().

Signed-off-by: Jeff King 
---
 builtin/pack-objects.c | 11 
 git-compat-util.h  |  3 --
 packfile.c | 18 
 sha1-file.c|  8 ++
 trace.c|  2 --
 wrapper.c  | 63 +-
 6 files changed, 15 insertions(+), 90 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 76ce906946..93e92876aa 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2342,15 +2342,6 @@ static void find_deltas(struct object_entry **list, 
unsigned *list_size,
free(array);
 }
 
-static void try_to_free_from_threads(size_t size)
-{
-   packing_data_lock(&to_pack);
-   release_pack_memory(size);
-   packing_data_unlock(&to_pack);
-}
-
-static try_to_free_t old_try_to_free_routine;
-
 /*
  * The main object list is split into smaller lists, each is handed to
  * one worker.
@@ -2391,12 +2382,10 @@ static void init_threaded_search(void)
pthread_mutex_init(&cache_mutex, NULL);
pthread_mutex_init(&progress_mutex, NULL);
pthread_cond_init(&progress_cond, NULL);
-   old_try_to_free_routine = 
set_try_to_free_routine(try_to_free_from_threads);
 }
 
 static void cleanup_threaded_search(void)
 {
-   set_try_to_free_routine(old_try_to_free_routine);
pthread_cond_destroy(&progress_cond);
pthread_mutex_destroy(&cache_mutex);
pthread_mutex_destroy(&progress_mutex);
diff --git a/git-compat-util.h b/git-compat-util.h
index 83be89de0a..f0d13e4e28 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -818,9 +818,6 @@ const char *inet_ntop(int af, const void *src, char *dst, 
size_t size);
 int git_atexit(void (*handler)(void));
 #endif
 
-typedef void (*try_to_free_t)(size_t);
-try_to_free_t set_try_to_free_routine(try_to_free_t);
-
 static inline size_t st_add(size_t a, size_t b)
 {
if (unsigned_add_overflows(a, b))
diff --git a/packfile.c b/packfile.c
index fc43a6c52c..d98ac22876 100644
--- a/packfile.c
+++ b/packfile.c
@@ -287,13 +287,6 @@ static int unuse_one_window(struct packed_git *current)
return 

Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-12 Thread Junio C Hamano
Jeff King  writes:

> I think it might be worth just eliminating the whole idea.

I kinda like the simplification ;-) An even thinner wrapper that
calls malloc() and dies if it gets NULL, without any "try-to-free"
logic.


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-12 Thread Jeff King
On Sun, Aug 11, 2019 at 12:08:53PM -0700, Carlo Arenas wrote:

> there is also the risk that xmalloc might not be sufficiently thread
> safe (ex: when it triggers unuse_one_window() through the use of a
> try_to_free_routine in packfile.c but we could mitigate the risk for
> now by doing it only first #ifdef NED (I promise to take it out before
> it even gets to next, assuming Junio would trust me enough after the
> last blow up to even get it into pu)

Yes, our xmalloc() is very much not thread-safe, and I suspect
multithreaded bits that don't override try_to_free_routine, like
git-grep, are probably buggy right now (assuming they xmalloc() outside
of the big global lock). This has been the case for ages (see
[1] from 2011, for example) and I don't think anybody has ever
complained. Which leads me to believe that try_to_free_routine() rarely
actually kicks in.

I've long suspected that the whole try_to_free_pack_memory thing is more
trouble than it's worth these days. We're dealing with mmap'd files, so
the OS is going to do a much better job at responding to memory pressure
by dropping individual pages.

So the only thing we're really freeing is address space. On 64-bit
systems, we have plenty of that to go around. On 32-bit systems, it
might. But we introduced it[2] at the same time as sliding windows[3],
so it's not clear how much it helps (keep in mind that even without this
we'd still free old pack windows to make room for more mmap'd packs).

And as a cost, xmalloc() isn't thread safe, and is a lot harder to
reason about (e.g., something simple like strbuf_grow() might trigger
quite a bit more work than you'd expect).

I think it might be worth just eliminating the whole idea.

[1] https://public-inbox.org/git/20110831015936.gb2...@sigill.intra.peff.net/
[2] 97bfeb34df (Release pack windows before reporting out of memory., 
2006-12-24)
[3] 60bb8b1453 (Fully activate the sliding window pack access., 2006-12-23)

-Peff


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-12 Thread Johannes Schindelin
Hi Carlo,

On Sun, 11 Aug 2019, Carlo Arenas wrote:

> On Sun, Aug 11, 2019 at 4:20 AM Johannes Schindelin
>  wrote:
> > On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:
> > > cURL is very strict about its allocator being thread safe and so that 
> > > might
> > > be an issue to look for.
> >
> > Looks good to me.
>
> it is not ready yet for using though, at minimum I have to fix the #ifdef
> so there is no risky of breaking someone's build because they happen
> to still be using and old enough cURL (ex: 7.10)
>
> there is also the risk that xmalloc might not be sufficiently thread
> safe (ex: when it triggers unuse_one_window() through the use of a
> try_to_free_routine in packfile.c but we could mitigate the risk for
> now by doing it only first #ifdef NED (I promise to take it out before
> it even gets to next, assuming Junio would trust me enough after the
> last blow up to even get it into pu)

I was under the impression that we _already_ use `xmalloc()` in plenty
of multi-threaded scenarios...

And yes, nedmalloc meets the demand: it is thread-safe. All allocators
to be used by Git have to be.

> alternatively, getting some feedback from people that know that code
> better like Shawn, since I have to admit I haven't read it enough to
> be fully confident it is safe (although I believe it was enough, or I
> wouldn't had sent it for feedback otherwise)

If you are referring to Shawn Pearce (and I assume you do, because the
other Shawn I found in the shortlog only worked briefly on `git clean`
and `git p4`), I have the sad duty to inform you that he passed away.

> > Still, it is a good idea to ask libcurl to use the same allocator as Git.
>
> and it should have a positive performance impact, which should be nice
> to look for.

I actually do not expect any performance impact, given that
communicating via HTTP(S) is partially a waiting game, in any case, the
network will be the bottleneck, not allocating memory.

Ciao,
Dscho


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-11 Thread Carlo Arenas
On Sun, Aug 11, 2019 at 4:20 AM Johannes Schindelin
 wrote:
> On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:
> > cURL is very strict about its allocator being thread safe and so that might
> > be an issue to look for.
>
> Looks good to me.

it is not ready yet for using though, at minimum I have to fix the #ifdef
so there is no risky of breaking someone's build because they happen
to still be using and old enough cURL (ex: 7.10)

there is also the risk that xmalloc might not be sufficiently thread
safe (ex: when it triggers unuse_one_window() through the use of a
try_to_free_routine in packfile.c but we could mitigate the risk for
now by doing it only first #ifdef NED (I promise to take it out before
it even gets to next, assuming Junio would trust me enough after the
last blow up to even get it into pu)

alternatively, getting some feedback from people that know that code
better like Shawn, since I have to admit I haven't read it enough to
be fully confident it is safe (although I believe it was enough, or I
wouldn't had sent it for feedback otherwise)

> Please note that this did not cause problems in Git for Windows.
> Probably because we leave it to cURL to deallocate whatever it
> allocated.

unlike PCRE2 there is no layering violation to trigger

> Still, it is a good idea to ask libcurl to use the same allocator as Git.

and it should have a positive performance impact, which should be nice
to look for.

Carlo


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-11 Thread Johannes Schindelin
Hi,


On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:

> 4a30976e28 ([PATCH] support older versions of libcurl, 2005-07-28) added
> support for conditionally initializing cURL but when f0ed8226c9
> (Add custom memory allocator to MinGW and MacOS builds, 2009-05-31) that
> support wasn't updated to make sure cURL will use the same allocator than
> git if compiled with USE_NED_ALLOCATOR=YesPlease (usually done in Windows)
>
> tested in macOS 10.14.6 with the system provided cURL (7.54.0)
> and latest (7.65.3) and while the API used should be added starting around
> 7.12.0 (mid 2014). couldn't get a release that old to build and therefore
> the current mismatch is unlikely to be found while testing because of that.
>
> cURL is very strict about its allocator being thread safe and so that might
> be an issue to look for.

Looks good to me.

Please note that this did not cause problems in Git for Windows.
Probably because we leave it to cURL to deallocate whatever it
allocated.

Still, it is a good idea to ask libcurl to use the same allocator as Git.

Thanks,
Dscho

> Signed-off-by: Carlo Marcelo Arenas Belón 
> ---
>  http.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/http.h b/http.h
> index b429f1cf04..59ec4cbd30 100644
> --- a/http.h
> +++ b/http.h
> @@ -27,6 +27,9 @@
>  #endif
>  #if LIBCURL_VERSION_NUM < 0x070800
>  #define curl_global_init(a) do { /* nothing */ } while (0)
> +#else
> +#define curl_global_init(a) curl_global_init_mem(a, xmalloc, free, \
> + xrealloc, xstrdup, xcalloc)
>  #endif
>
>  #if (LIBCURL_VERSION_NUM < 0x070c04) || (LIBCURL_VERSION_NUM == 0x071000)
> --
> 2.23.0.rc2
>
>


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-10 Thread Carlo Arenas
On Sat, Aug 10, 2019 at 3:17 PM Daniel Stenberg  wrote:
>
> On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:
>
> > tested in macOS 10.14.6 with the system provided cURL (7.54.0)
> > and latest (7.65.3) and while the API used should be added starting around
> > 7.12.0 (mid 2014).
>
> Let me just gently point out that 7.12.0 was relased mid *2004*, see
> https://curl.haxx.se/docs/releases.html

Indeed; just a typo but relevant to get right for the discussion,
thanks for the gentle clarification

Carlo


Re: [RFC PATCH] http: use xmalloc with cURL

2019-08-10 Thread Daniel Stenberg

On Sat, 10 Aug 2019, Carlo Marcelo Arenas Belón wrote:


tested in macOS 10.14.6 with the system provided cURL (7.54.0)
and latest (7.65.3) and while the API used should be added starting around
7.12.0 (mid 2014).


Let me just gently point out that 7.12.0 was relased mid *2004*, see 
https://curl.haxx.se/docs/releases.html


--

 / daniel.haxx.se