Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-04-09 Thread Lars Schneider

> On 27 Feb 2017, at 23:11, Jakub Narębski  wrote:
> 
> On 27.02.2017 at 11:32, Lars Schneider wrote:
>> 
>>> On 27 Feb 2017, at 10:58, Jeff King  wrote:
>>> 
>>> On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:
>>> 
 +If the request cannot be fulfilled within a reasonable amount of time
 +then the filter can respond with a "delayed" status and a flush packet.
 +Git will perform the same request at a later point in time, again. The
 +filter can delay a response multiple times for a single request.
 +
 +packet:  git< status=delayed
 +packet:  git< 
 +
> 
> Is it something that happens instead of the filter process sending the contents

Correct! I'll clarify this in v3!


>> 
>> I completely agree - I need to change that. However, the goal of the v2
>> iteration was to get the "convert" interface in an acceptable state.
>> That's what I intended to say in the patch comment section:
>> 
>>"Please ignore all changes behind async_convert_to_working_tree() and 
>> async_filter_finish() for now as I plan to change the implementation 
>> as soon as the interface is in an acceptable state."
> 
> I think it is more important to start with a good abstraction and a
> proposal for the protocol, rather than getting bogged down in
> implementation details that may change as the idea for the protocol
> extension evolves.

I'll send out v3 shortly as proposal for a complete solution.


>>> I think it would be much more efficient to do something like:
>>> 
>>> [Git issues a request and gives it an opaque index id]
>>> git> command=smudge
>>> git> pathname=foo
>>> git> index=0
>>> git> 
>>> git> CONTENT
>>> git> 
>>> 
>>> [The data isn't ready yet, so the filter tells us so...]
>>> git< status=delayed
>>> git< 
> 
> So is it only a replacement for "status=success" + contents or
> "status=abort", that is, upfront before sending any part of the file?

Yes.


> Or, as one might assume from the placement of the paragraph with
> "status=delayed", is it about replacing the empty list for success, or
> "status=error", after sending some part (maybe empty) of a file,
> that is:

No. As this would complicate things, I don't want to support it
(and I clarified that in the docs in v3).


> If it would not be an undue burden on the filter driver process, we might
> require it to say where to continue (in bytes), e.g.
> 
>git< from=16426
> 
> That should, of course, go below index/pathname line.

This would make the protocol even more complicated. That's why I don't
want to support splitting the response.


>>> git< index=0
> 
> Or a filter driver could have used pathname as an index, that is
> 
>git< pathname=path/testfile.dat

In v3 I've used an index to help Git find the right cache entry
quickly.
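
Roughly, the idea is the following (a sketch with made-up names, not the
actual v3 code):

    /*
     * Git-side bookkeeping for delayed items, sketched in C; assumes Git's
     * internal headers (for struct cache_entry, atoi, NULL).
     */
    struct delayed_entry {
        struct cache_entry *ce;   /* the entry being checked out */
        int status;               /* e.g. DELAYED, SUCCESS, ERROR */
    };

    static struct delayed_entry delayed[1024];
    static int delayed_nr;

    /*
     * The filter echoes the "index" value back verbatim, so finding the
     * matching entry is a plain array lookup instead of a path search.
     */
    static struct delayed_entry *lookup_delayed(const char *index_token)
    {
        int idx = atoi(index_token);
        if (idx < 0 || idx >= delayed_nr)
            return NULL;
        return &delayed[idx];
    }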


> 
>>> git< 
>>> git< CONTENT
>>> git< 
>>> 
>>> From Git's side, the loop is something like:
>>> 
>>> while (delayed_items > 0) {
>>> /* issue a wait, and get back the status/index pair */
>>> status = send_wait();
>>> delayed_items--;
> 
> This looks like my 'event loop' proposal[1][2], see below.

I implemented something similar in v3.


>> That could work! I had something like that in mind:
>> 
>> I teach Git a new command "list_completed" or similar. The filter
>> blocks this call until at least one item is ready for Git. 
>> Then the filter responds with a list of paths that identify the
>> "ready items". Then Git asks for these ready items just with the
>> path and not with any content. Could that work? Wouldn't the path
>> be "unique" to identify a blob per filter run?
> 
> Why, in the "drain" phase, is it still Git that needs to ask the filter
> for contents, one file after another?  Wouldn't it be easier and simpler
> for the filter to finish sending the contents, and then signal that it
> has finished?
> 
> To summarize my earlier emails, the current proposal looks to me like
> a "busy loop" solution, that is[2]:

In v3 the implementation still uses a kind of busy loop (I expect the
filter to block if there is nothing ready yet). An event loop would
complicate the protocol as the filter would need to initiate an action.
Right now only Git initiates actions.
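
For example, the retry for a delayed item in v3 looks roughly like this
(a sketch, not the literal v3 protocol; "0000" denotes a flush packet):

    [item 0 was answered with "status=delayed" earlier; Git asks again]
    git> command=smudge
    git> pathname=foo
    git> index=0
    git> 0000
    git> 0000    # empty content section; the blob is not sent again
    [the filter blocks here until it has something ready...]
    git< status=success
    git< 0000
    git< CONTENT
    git< 0000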


> Footnotes:
> --
> a) We don't send the Git-side contents of the blob again, do we?
>    So we need some protocol extension / new understanding anyway,
>    for example that we don't send the contents if we request a path again.

Correct - v3 doesn't send the content again.


> Also, one thing that needs to be solved, assuming that the proposed
> extension allows partial data sent from the filter to be delayed and
> continued later, is that Git needs to keep this partial response in its
> buffer; this is because of the precedence of gitattributes application:

As mentioned above, I don't want to support partial data as it
complicates things and is of no use for my Git LFS use case.


Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-04-09 Thread Lars Schneider

> On 27 Feb 2017, at 11:53, Jeff King  wrote:
> 
> On Mon, Feb 27, 2017 at 11:32:47AM +0100, Lars Schneider wrote:
> 
>> ...
> 
>>> From Git's side, the loop is something like:
>>> 
>>> while (delayed_items > 0) {
>>> /* issue a wait, and get back the status/index pair */
>>> status = send_wait();
>>> delayed_items--;
>>> 
>>> /*
>>>  * use "index" to find the right item in our list of files;
>>>  * the format can be opaque to the filter, so we could index
>>>  * it however we like. But probably numeric indices in an array
>>>  * are the simplest.
>>>  */
>>> assert(index > 0 && index < nr_items);
>>> item[index].status = status;
>>> if (status == SUCCESS)
>>> read_content(&item[index]);
>>> }
>>> 
>>> and the filter side just attaches the "index" string to whatever its
>>> internal queue structure is, and feeds it back verbatim when processing
>>> that item finishes.
>> 
>> That could work! I had something like that in mind:
>> 
>> I teach Git a new command "list_completed" or similar. The filter
>> blocks this call until at least one item is ready for Git. 
>> Then the filter responds with a list of paths that identify the
>> "ready items". Then Git asks for these ready items just with the
>> path and not with any content. Could that work? Wouldn't the path
>> be "unique" to identify a blob per filter run?
> 
> I think that could work, though I think there are a few minor downsides
> compared to what I wrote above:
> 
>  - if you respond with "these items are ready", and then make Git ask
>for each again, it's an extra round-trip for each set of ready
>items. You could just say "an item is ready; here it is" in a single
>response. For a local pipe the latency is probably negligible,
>though.

It is true that the extra round-trip is not strictly necessary, but I think
it simplifies the protocol and the code, as I can reuse the convert
machinery as is.
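
As a sketch, the extra round-trip could look like this on the wire (using
the tentative "list_completed" name from above; "0000" denotes a flush
packet, and the exact framing is illustrative only):

    [Git asks which delayed items are ready; the filter blocks until
     at least one of them is]
    git> command=list_completed
    git> 0000
    git< status=success
    git< 0000
    git< pathname=path/testfile.dat
    git< 0000

    [Git then re-requests each listed item through the usual smudge code
     path, identified by path only]
    git> command=smudge
    git> pathname=path/testfile.dat
    git> 0000
    git> 0000    # empty content section; the path alone identifies the item
    git< status=success
    git< 0000
    git< SMUDGED_CONTENT
    git< 0000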


>  - using paths as the index would probably work, but it means Git has
>to use the path to find the "struct checkout_entry" again. Which
>might mean a hashmap (though if you have them all in a sorted list,
>I guess you could also do a binary search).

Agreed. I changed my implementation to use an index following your
suggestion.

>  - Using an explicit index communicates to the filter not only what the
>index is, but also that Git is prepared to accept a delayed response
>for the item. For backwards compatibility, the filter would probably
>advertise "I have the 'delayed' capability", and then Git could
>choose to use it or not on a per-item basis. Realistically it would
>not change from item to item, but rather operation to operation. So
>that means we can easily convert the call-sites in Git to the async
>approach incrementally. As each one is converted, it turns on the
>flag that causes the filter code to send the "index" tag.

Agreed. I changed the implementation accordingly and will send out the
patches shortly.
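
On the wire, such a negotiation could look roughly like this (the key
names are illustrative, not final; "0000" denotes a flush packet):

    [during the handshake the filter announces that it can delay items]
    git> capability=clean
    git> capability=smudge
    git> capability=delay
    git> 0000
    git< capability=clean
    git< capability=smudge
    git< capability=delay
    git< 0000

    [per item, converted call-sites tell the filter that a delayed
     response is acceptable]
    git> command=smudge
    git> pathname=foo
    git> can-delay=1
    git> 0000
    git> CONTENT
    git> 0000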

Thanks,
Lars


Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-02-27 Thread Jakub Narębski
On 27.02.2017 at 11:32, Lars Schneider wrote:
> 
>> On 27 Feb 2017, at 10:58, Jeff King  wrote:
>>
>> On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:
>>
>>> +If the request cannot be fulfilled within a reasonable amount of time
>>> +then the filter can respond with a "delayed" status and a flush packet.
>>> +Git will perform the same request at a later point in time, again. The
>>> +filter can delay a response multiple times for a single request.
>>> +
>>> +packet:  git< status=delayed
>>> +packet:  git< 
>>> +

Is it something that happens instead of the filter process sending the
contents of the file, or is it something that happens after sending some
part of the contents (maybe empty), instead of the empty list that keeps
"status=success" unchanged, or instead of "status=error" if there was a
problem processing the file?

>>> +
>>
>> So Git just asks for the same content again? I see two issues with that:
>>
>>  1. Does git have to feed the blob content again? That can be expensive
>> to access or to keep around in memory.
>>
>>  2. What happens when the item isn't ready on the second request? I can
>> think of a few options:
>>
>>   a. The filter immediately says "nope, still delayed". But then
>>  Git ends up busy-looping with "is this one ready yet?"
>>
>>   b. The filter blocks until the item is ready. But then if other
>>items _are_ ready, Git cannot work on processing them. We lose
>>parallelism.
>>
>>   c. You could do a hybrid: block until _some_ item is ready, and
>>  then issue "delayed" responses for everything that isn't
>>ready. Then if you assume that Git is looping over and over
>>through the set of objects, it will either block or pick up
>>_something_ on each loop.
>>
>>But it makes a quadratic number of requests in the worst case.
>>E.g., imagine you have N items and the last one is available
>>first, then the second-to-last, and so on. You'll ask N times,
>>then N-1, then N-2, and so on.

The current solution is a 'busy loop' one that I wrote about[1][2],
see below.

> 
> I completely agree - I need to change that. However, the goal of the v2
> iteration was to get the "convert" interface in an acceptable state.
> That's what I intended to say in the patch comment section:
> 
> "Please ignore all changes behind async_convert_to_working_tree() and 
>  async_filter_finish() for now as I plan to change the implementation 
>  as soon as the interface is in an acceptable state."

I think it is more important to start with a good abstraction and a
proposal for the protocol, rather than getting bogged down in
implementation details that may change as the idea for the protocol
extension evolves.

>>
>> I think it would be much more efficient to do something like:
>>
>>  [Git issues a request and gives it an opaque index id]
>>  git> command=smudge
>>  git> pathname=foo
>>  git> index=0
>>  git> 
>>  git> CONTENT
>>  git> 
>>
>>  [The data isn't ready yet, so the filter tells us so...]
>>  git< status=delayed
>>  git< 

So is it only a replacement for "status=success" + contents or
"status=abort", that is, upfront before sending any part of the file?

Or, as one might assume from the placement of the paragraph with
"status=delayed", is it about replacing the empty list for success, or
"status=error", after sending some part (maybe empty) of a file,
that is:

[filter driver says that it can process contents]
git< status=success
git< 
git< PARTIAL_SMUDGED_CONTENT (maybe empty)
[there was some delay, for example one of shards is slow]
git< 
git< status=delayed
git< 

>>
>>  [Git may make other requests, that are either served or delayed]
>>  git> command=smudge
>>  git> pathname=foo
>>  git> index=1
>>  git> 
>>  git< status=success
>>  git< 
>>  git< CONTENT
>>  git< 
>>
>>  [Now Git has processed all of the items, and each one either has its
>>   final status, or has been marked as delayed. So we ask for a delayed
>>   item]
>>  git> command=wait
>>  git> 

In my proposal[2] I have called this "command=continue"... but at this
point it is bikeshedding.  I think "command=wait" (or "await" ;-))
might be better.

>>
>>  [Some time may pass if nothing is ready. But eventually we get...]
>>  git< status=success

Or

git< status=resumed

If it would not be an undue burden on the filter driver process, we might
require it to say where to continue (in bytes), e.g.

git< from=16426

That should, of course, go below index/pathname line.

>>  git< index=0

Or a filter driver could have used pathname as an index, that is

git< pathname=path/testfile.dat

>>  git< 
>>  git< CONTENT
>>  git< 
>>
>> From Git's side, the loop is something like:
>>
>>  while (delayed_items > 0) {
>>  /* issue a wait, and get back the status/index pair */

Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-02-27 Thread Jeff King
On Mon, Feb 27, 2017 at 11:32:47AM +0100, Lars Schneider wrote:

> I completely agree - I need to change that. However, the goal of the v2
> iteration was to get the "convert" interface in an acceptable state.
> That's what I intended to say in the patch comment section:
> 
> "Please ignore all changes behind async_convert_to_working_tree() and 
>  async_filter_finish() for now as I plan to change the implementation 
>  as soon as the interface is in an acceptable state."

Ah, sorry, I missed that. I would think the underlying approach would
influence the interface to some degree. But as long as the interface
is sufficiently abstract, I think it gives you enough flexibility.

> > From Git's side, the loop is something like:
> > 
> >  while (delayed_items > 0) {
> > /* issue a wait, and get back the status/index pair */
> > status = send_wait();
> > delayed_items--;
> > 
> > /*
> >  * use "index" to find the right item in our list of files;
> >  * the format can be opaque to the filter, so we could index
> >  * it however we like. But probably numeric indices in an array
> >  * are the simplest.
> >  */
> > assert(index > 0 && index < nr_items);
> > item[index].status = status;
> > if (status == SUCCESS)
> > read_content(&item[index]);
> >  }
> > 
> > and the filter side just attaches the "index" string to whatever its
> > internal queue structure is, and feeds it back verbatim when processing
> > that item finishes.
> 
> That could work! I had something like that in mind:
> 
> I teach Git a new command "list_completed" or similar. The filter
> blocks this call until at least one item is ready for Git. 
> Then the filter responds with a list of paths that identify the
> "ready items". Then Git asks for these ready items just with the
> path and not with any content. Could that work? Wouldn't the path
> be "unique" to identify a blob per filter run?

I think that could work, though I think there are a few minor downsides
compared to what I wrote above:

  - if you respond with "these items are ready", and then make Git ask
for each again, it's an extra round-trip for each set of ready
items. You could just say "an item is ready; here it is" in a single
response. For a local pipe the latency is probably negligible,
though.

  - using paths as the index would probably work, but it means Git has
to use the path to find the "struct checkout_entry" again. Which
might mean a hashmap (though if you have them all in a sorted list,
I guess you could also do a binary search).

  - Using an explicit index communicates to the filter not only what the
index is, but also that Git is prepared to accept a delayed response
for the item. For backwards compatibility, the filter would probably
advertise "I have the 'delayed' capability", and then Git could
choose to use it or not on a per-item basis. Realistically it would
not change from item to item, but rather operation to operation. So
that means we can easily convert the call-sites in Git to the async
approach incrementally. As each one is converted, it turns on the
flag that causes the filter code to send the "index" tag.

-Peff


Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-02-27 Thread Lars Schneider

> On 27 Feb 2017, at 10:58, Jeff King  wrote:
> 
> On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:
> 
>> +If the request cannot be fulfilled within a reasonable amount of time
>> +then the filter can respond with a "delayed" status and a flush packet.
>> +Git will perform the same request at a later point in time, again. The
>> +filter can delay a response multiple times for a single request.
>> +
>> +packet:  git< status=delayed
>> +packet:  git< 
>> +
>> +
> 
> So Git just asks for the same content again? I see two issues with that:
> 
>  1. Does git have to feed the blob content again? That can be expensive
> to access or to keep around in memory.
> 
>  2. What happens when the item isn't ready on the second request? I can
> think of a few options:
> 
>   a. The filter immediately says "nope, still delayed". But then
>  Git ends up busy-looping with "is this one ready yet?"
> 
>   b. The filter blocks until the item is ready. But then if other
> items _are_ ready, Git cannot work on processing them. We lose
> parallelism.
> 
>   c. You could do a hybrid: block until _some_ item is ready, and
>  then issue "delayed" responses for everything that isn't
> ready. Then if you assume that Git is looping over and over
> through the set of objects, it will either block or pick up
> _something_ on each loop.
> 
> But it makes a quadratic number of requests in the worst case.
> E.g., imagine you have N items and the last one is available
> first, then the second-to-last, and so on. You'll ask N times,
> then N-1, then N-2, and so on.

I completely agree - I need to change that. However, the goal of the v2
iteration was to get the "convert" interface in an acceptable state.
That's what I intended to say in the patch comment section:

"Please ignore all changes behind async_convert_to_working_tree() and 
 async_filter_finish() for now as I plan to change the implementation 
 as soon as the interface is in an acceptable state."

> 
> I think it would be much more efficient to do something like:
> 
>  [Git issues a request and gives it an opaque index id]
>  git> command=smudge
>  git> pathname=foo
>  git> index=0
>  git> 
>  git> CONTENT
>  git> 
> 
>  [The data isn't ready yet, so the filter tells us so...]
>  git< status=delayed
>  git< 
> 
>  [Git may make other requests, that are either served or delayed]
>  git> command=smudge
>  git> pathname=foo
>  git> index=1
>  git> 
>  git< status=success
>  git< 
>  git< CONTENT
>  git< 
> 
>  [Now Git has processed all of the items, and each one either has its
>   final status, or has been marked as delayed. So we ask for a delayed
>   item]
>  git> command=wait
>  git> 
> 
>  [Some time may pass if nothing is ready. But eventually we get...]
>  git< status=success
>  git< index=0
>  git< 
>  git< CONTENT
>  git< 
> 
> From Git's side, the loop is something like:
> 
>  while (delayed_items > 0) {
>   /* issue a wait, and get back the status/index pair */
>   status = send_wait();
>   delayed_items--;
> 
>   /*
>* use "index" to find the right item in our list of files;
>* the format can be opaque to the filter, so we could index
>* it however we like. But probably numeric indices in an array
>* are the simplest.
>*/
>   assert(index > 0 && index < nr_items);
>   item[index].status = status;
>   if (status == SUCCESS)
>   read_content(&item[index]);
>  }
> 
> and the filter side just attaches the "index" string to whatever its
> internal queue structure is, and feeds it back verbatim when processing
> that item finishes.

That could work! I had something like that in mind:

I teach Git a new command "list_completed" or similar. The filter
blocks this call until at least one item is ready for Git. 
Then the filter responds with a list of paths that identify the
"ready items". Then Git asks for these ready items just with the
path and not with any content. Could that work? Wouldn't the path
be "unique" to identify a blob per filter run?

Thanks,
Lars


Re: [PATCH v2] convert: add "status=delayed" to filter process protocol

2017-02-27 Thread Jeff King
On Sun, Feb 26, 2017 at 07:48:16PM +0100, Lars Schneider wrote:

> +If the request cannot be fulfilled within a reasonable amount of time
> +then the filter can respond with a "delayed" status and a flush packet.
> +Git will perform the same request at a later point in time, again. The
> +filter can delay a response multiple times for a single request.
> +
> +packet:  git< status=delayed
> +packet:  git< 
> +
> +

So Git just asks for the same content again? I see two issues with that:

  1. Does git have to feed the blob content again? That can be expensive
 to access or to keep around in memory.

  2. What happens when the item isn't ready on the second request? I can
 think of a few options:

   a. The filter immediately says "nope, still delayed". But then
  Git ends up busy-looping with "is this one ready yet?"

   b. The filter blocks until the item is ready. But then if other
  items _are_ ready, Git cannot work on processing them. We lose
  parallelism.

   c. You could do a hybrid: block until _some_ item is ready, and
  then issue "delayed" responses for everything that isn't
  ready. Then if you assume that Git is looping over and over
  through the set of objects, it will either block or pick up
  _something_ on each loop.

  But it makes a quadratic number of requests in the worst case.
  E.g., imagine you have N items and the last one is available
  first, then the second-to-last, and so on. You'll ask N times,
  then N-1, then N-2, and so on.
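
  (In that pattern the total is N + (N-1) + ... + 1 = N(N+1)/2 requests,
  i.e. quadratic in the number of delayed items.)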

I think it would be much more efficient to do something like:

  [Git issues a request and gives it an opaque index id]
  git> command=smudge
  git> pathname=foo
  git> index=0
  git> 
  git> CONTENT
  git> 

  [The data isn't ready yet, so the filter tells us so...]
  git< status=delayed
  git< 

  [Git may make other requests, that are either served or delayed]
  git> command=smudge
  git> pathname=foo
  git> index=1
  git> 
  git< status=success
  git< 
  git< CONTENT
  git< 

  [Now Git has processed all of the items, and each one either has its
   final status, or has been marked as delayed. So we ask for a delayed
   item]
  git> command=wait
  git> 

  [Some time may pass if nothing is ready. But eventually we get...]
  git< status=success
  git< index=0
  git< 
  git< CONTENT
  git< 

From Git's side, the loop is something like:

  while (delayed_items > 0) {
/* issue a wait, and get back the status/index pair */
status = send_wait();
delayed_items--;

/*
 * use "index" to find the right item in our list of files;
 * the format can be opaque to the filter, so we could index
 * it however we like. But probably numeric indices in an array
 * are the simplest.
 */
assert(index > 0 && index < nr_items);
item[index].status = status;
if (status == SUCCESS)
read_content(&item[index]);
  }

and the filter side just attaches the "index" string to whatever its
internal queue structure is, and feeds it back verbatim when processing
that item finishes.

-Peff
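
A filter written along these lines only has to carry that opaque tag
through its queue untouched; for example (a sketch in C with made-up
names, not part of the patch):

    #include <stdlib.h>
    #include <string.h>

    /* One queued smudge job inside the filter process. */
    struct filter_job {
        char *index;        /* opaque tag from Git, echoed back verbatim */
        char *pathname;
        struct filter_job *next;
        /* ... whatever else the filter needs to produce the content ... */
    };

    static struct filter_job *queue_job(const char *index, const char *pathname)
    {
        struct filter_job *job = calloc(1, sizeof(*job));
        if (!job)
            return NULL;
        job->index = strdup(index);       /* keep the tag, do not parse it */
        job->pathname = strdup(pathname);
        return job;
    }

    /*
     * When the job finishes, the response simply includes
     * "index=<job->index>" ahead of the content, and Git maps it back to
     * the right item on its side.
     */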


[PATCH v2] convert: add "status=delayed" to filter process protocol

2017-02-26 Thread Lars Schneider
Some `clean` / `smudge` filters might require a significant amount of
time to process a single blob. During this process the Git checkout
operation is blocked and Git needs to wait until the filter is done to
continue with the checkout.

Teach the filter process protocol (introduced in edcc858) to accept the
status "delayed" as response to a filter request. Upon this response Git
continues with the checkout operation and asks the filter to process the
blob again after all other blobs have been processed.

Git has multiple code paths that check out a blob. Support delayed
checkouts only in `clone` (in unpack-trees.c) and `checkout` operations.

Signed-off-by: Lars Schneider 
---

Hi,

in v1 Junio criticized the "convert.h" interface of this patch [1].
After talking to Peff I think I understand Junio's point and I would
like to get your feedback on the new approach here. Please ignore all
changes behind async_convert_to_working_tree() and async_filter_finish()
for now as I plan to change the implementation as soon as the interface
is in an acceptable state.

The new interface also addresses Torsten's feedback and leaves
convert_to_working_tree() as is [2].

I also use '>' for numeric comparisons in Perl as suggested by Eric [3].

Please note that I rebased the patch onto v2.12 as v1 did not apply
cleanly on master anymore.

Thanks,
Lars

[1] http://public-inbox.org/git/xmqqa8b115ll@gitster.mtv.corp.google.com/
[2] http://public-inbox.org/git/20170108201415.GA3569@tb-raspi/
[3] http://public-inbox.org/git/20170108204517.GA13779@starla/


RFC: http://public-inbox.org/git/d10f7c47-14e8-465b-8b7a-a09a1b28a...@gmail.com/
 v1: http://public-inbox.org/git/20170108191736.47359-1-larsxschnei...@gmail.com/


Notes:
Base Ref: v2.12.0
Web-Diff: https://github.com/larsxschneider/git/commit/13d5b37021
Checkout: git fetch https://github.com/larsxschneider/git filter-process/delay-v2 && git checkout 13d5b37021

 Documentation/gitattributes.txt |  9 ++
 builtin/checkout.c  |  1 +
 cache.h |  1 +
 convert.c   | 68 +
 convert.h   | 13 
 entry.c | 29 +++---
 t/t0021-conversion.sh   | 53 
 t/t0021/rot13-filter.pl | 19 
 unpack-trees.c  |  1 +
 9 files changed, 176 insertions(+), 18 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index e0b66c1220..f6bad8db40 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -473,6 +473,15 @@ packet:  git<   # empty content!
 packet:  git<   # empty list, keep "status=success" unchanged!
 

+If the request cannot be fulfilled within a reasonable amount of time
+then the filter can respond with a "delayed" status and a flush packet.
+Git will perform the same request at a later point in time, again. The
+filter can delay a response multiple times for a single request.
+
+packet:  git< status=delayed
+packet:  git< 
+
+
 In case the filter cannot or does not want to process the content,
 it is expected to respond with an "error" status.
 
diff --git a/builtin/checkout.c b/builtin/checkout.c
index f174f50303..742e8742cd 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -369,6 +369,7 @@ static int checkout_paths(const struct checkout_opts *opts,
pos = skip_same_name(ce, pos) - 1;
}
}
+   errs |= checkout_delayed_entries(&state);

if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
die(_("unable to write new index file"));
diff --git a/cache.h b/cache.h
index 61fc86e6d7..66dde99a79 100644
--- a/cache.h
+++ b/cache.h
@@ -1434,6 +1434,7 @@ struct checkout {

 #define TEMPORARY_FILENAME_LENGTH 25
 extern int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath);
+extern int checkout_delayed_entries(const struct checkout *state);

 struct cache_def {
struct strbuf path;
diff --git a/convert.c b/convert.c
index 4e17e45ed2..24d29f5c53 100644
--- a/convert.c
+++ b/convert.c
@@ -4,6 +4,7 @@
 #include "quote.h"
 #include "sigchain.h"
 #include "pkt-line.h"
+#include "list.h"

 /*
  * convert.c - convert a file when checking it out and checking it in.
@@ -38,6 +39,13 @@ struct text_stat {
unsigned printable, nonprintable;
 };

+static LIST_HEAD(delayed_item_queue_head);
+
+struct delayed_item {
+   void* item;
+   struct list_head node;
+};
+
 static void gather_stats(const char *buf, unsigned long size, struct text_stat *stats)
 {
unsigned long i;
@@ -672,7 +680,7 @@ static struct cmd2process *start_multi_file_filter(struct 
hashmap *hashmap, cons
 }

 static int