Lars Schneider <larsxschnei...@gmail.com> wrote:
> Hi,
> 
> Git always performs a clean/smudge filter on files in sequential order.
> Sometimes a filter operation can take a noticeable amount of time. 
> This blocks the entire Git process.

I have the same problem in many places which aren't git :>

> I would like to give a filter process the possibility to answer Git with
> "I got your request, I am processing it, ask me for the result later!".
> 
> I see the following way to realize this:
> 
> In unpack-trees.c:check_updates() [1] we loop through the cache 
> entries and "ask me later" could be an acceptable return value of the 
> checkout_entry() call. The loop could run until all entries returned
> success or error.
> 
> The filter machinery is triggered in various other places in Git and
> all places that want to support "ask me later" would need to be patched 
> accordingly.

That all sounds reasonable.
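
A minimal sketch of the retry loop described above, in Python rather than
git's C. The names `Status`, `DELAYED` and `checkout_all` are hypothetical and
do not exist in git; `checkout_entry` here is just a callable passed in by the
caller, standing in for the real function:

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    ERROR = "error"
    DELAYED = "delayed"  # hypothetical "ask me later" answer from the filter


def checkout_all(entries, checkout_entry):
    """Call checkout_entry(entry) until every entry returned OK or ERROR.

    Returns a dict mapping each entry to its final Status.
    """
    results = {}
    pending = list(entries)
    while pending:
        still_pending = []
        for entry in pending:
            status = checkout_entry(entry)
            if status is Status.DELAYED:
                # The filter is still working on this one; retry next pass.
                still_pending.append(entry)
            else:
                results[entry] = status
        pending = still_pending
    return results
```

As written this spins while entries are delayed; a real implementation would
presumably wait on the filter between passes instead of re-asking immediately.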

The filter itself would need to be aware of parallelism
if it lives for multiple objects, right?

> Do you think this could be a viable approach?

It'd probably require a bit of work, but yes, I think it's viable.

We already do this with curl_multi requests for parallel
fetching from dumb HTTP servers, but that's driven by curl
internals operating with a select/poll loop.

Perhaps the curl API could be a good example for doing this.
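
To illustrate the general shape of a select/poll-driven loop (this is not
curl's API and not git code), here is a small Python sketch using the standard
`selectors` module. The function name `drain_all` and the use of sockets as
stand-ins for filter processes are assumptions made for the example:

```python
import selectors
import socket


def drain_all(socks):
    """Read every socket in `socks` (name -> socket) to EOF in one thread,
    servicing whichever one is readable instead of blocking on any single one.

    Returns a dict mapping name -> bytes received.
    """
    sel = selectors.DefaultSelector()
    buffers = {}
    for name, sock in socks.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=name)
        buffers[name] = bytearray()

    # Loop until every source has been unregistered (i.e. reached EOF).
    while sel.get_map():
        for key, _events in sel.select():
            try:
                chunk = key.fileobj.recv(4096)
            except BlockingIOError:
                continue  # spurious wakeup; try again on the next pass
            if chunk:
                buffers[key.data] += chunk
            else:
                sel.unregister(key.fileobj)  # EOF: this source is done

    sel.close()
    return {name: bytes(buf) for name, buf in buffers.items()}
```

Each source only needs a buffer as its state here; a real filter protocol
would carry more per-request state, but the loop structure stays the same.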

> Do you see a better way?

Nope, I prefer non-blocking state machines to threads for
debuggability and determinism.

Anyway, I plan on doing something similar (in Perl) with the
synchronous parts of public-inbox, which rely on "cat-file --batch",
at some point... (my rotational disks are sloooooooow :<)
