Re: [PATCH v3] sha1_file: introduce close_one_pack() to close packs on fd pressure

2013-08-02 Thread Junio C Hamano
Brandon Casey <draf...@gmail.com> writes:

 +/*
 + * The LRU pack is the one with the oldest MRU window, preferring packs
 + * with no used windows, or the oldest mtime if it has no windows allocated.
 + */
 +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
 +{
 +	struct pack_window *w, *this_mru_w;
 +	int has_windows_inuse = 0;
 +
 +	/*
 +	 * Reject this pack if it has windows and the previously selected
 +	 * one does not.  If this pack does not have windows, reject
 +	 * it if the pack file is newer than the previously selected one.
 +	 */
 +	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
 +		return;
 +
 +	for (w = this_mru_w = p->windows; w; w = w->next) {
 +		/*
 +		 * Reject this pack if any of its windows are in use,
 +		 * but the previously selected pack did not have any
 +		 * inuse windows.  Otherwise, record that this pack
 +		 * has windows in use.
 +		 */
 +		if (w->inuse_cnt) {
 +			if (*accept_windows_inuse)
 +				has_windows_inuse = 1;
 +			else
 +				return;
 +		}
 +
 +		if (w->last_used > this_mru_w->last_used)
 +			this_mru_w = w;
 +
 +		/*
 +		 * Reject this pack if it has windows that have been
 +		 * used more recently than the previously selected pack.
 +		 * If the previously selected pack had windows inuse and
 +		 * we have not encountered a window in this pack that is
 +		 * inuse, skip this check since we prefer a pack with no
 +		 * inuse windows to one that has inuse windows.
 +		 */
 +		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
 +		    this_mru_w->last_used > (*mru_w)->last_used)
 +			return;

The *accept_windows_inuse == has_windows_inuse part is hard to
grok, together with the fact that this statement is evaluated for
each and every w, even though it is about this_mru_w and that
variable is not updated in every iteration of the loop.  Can you
clarify/simplify this part of the code a bit more?

For example, would the above be equivalent to this?

	if (w->last_used < this_mru_w->last_used)
		continue;

	this_mru_w = w;
	if (has_windows_inuse && *mru_w &&
	    w->last_used > (*mru_w)->last_used)
		return;

That is, if we already know a more recently used window in this
pack, we do not have to do anything to maintain mru_w.  Otherwise,
remember that this window is the most recently used one in this
pack, and if it is newer than the newest one from the pack we are
going to pick, we refrain from picking this pack.

But we do not reject ourselves if we haven't seen a window that is
in use (yet).

 +	}
 +
 +	/*
 +	 * Select this pack.
 +	 */
 +	*mru_w = this_mru_w;
 +	*lru_p = p;
 +	*accept_windows_inuse = has_windows_inuse;
 +}
 +
 +static int close_one_pack(void)
 +{
 +	struct packed_git *p, *lru_p = NULL;
 +	struct pack_window *mru_w = NULL;
 +	int accept_windows_inuse = 1;
 +
 +	for (p = packed_git; p; p = p->next) {
 +		if (p->pack_fd == -1)
 +			continue;
 +		find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse);
 +	}
 +
 +	if (lru_p) {
 +		close(lru_p->pack_fd);
 +		pack_open_fds--;
 +		lru_p->pack_fd = -1;
 +		return 1;
 +	}
 +
 +	return 0;
 +}
 +
  void unuse_pack(struct pack_window **w_cursor)
  {
  	struct pack_window *w = *w_cursor;
 @@ -768,7 +845,7 @@ static int open_packed_git_1(struct packed_git *p)
  			pack_max_fds = 1;
  	}
  
 -	while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1))
 +	while (pack_max_fds <= pack_open_fds && close_one_pack())
  		; /* nothing */
  
  	p->pack_fd = git_open_noatime(p->pack_name);


Re: [PATCH v3] sha1_file: introduce close_one_pack() to close packs on fd pressure

2013-08-02 Thread Brandon Casey
On Fri, Aug 2, 2013 at 9:26 AM, Junio C Hamano <gits...@pobox.com> wrote:
 Brandon Casey <draf...@gmail.com> writes:

 +/*
 + * The LRU pack is the one with the oldest MRU window, preferring packs
 + * with no used windows, or the oldest mtime if it has no windows allocated.
 + */
 +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
 +{
 +	struct pack_window *w, *this_mru_w;
 +	int has_windows_inuse = 0;
 +
 +	/*
 +	 * Reject this pack if it has windows and the previously selected
 +	 * one does not.  If this pack does not have windows, reject
 +	 * it if the pack file is newer than the previously selected one.
 +	 */
 +	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
 +		return;
 +
 +	for (w = this_mru_w = p->windows; w; w = w->next) {
 +		/*
 +		 * Reject this pack if any of its windows are in use,
 +		 * but the previously selected pack did not have any
 +		 * inuse windows.  Otherwise, record that this pack
 +		 * has windows in use.
 +		 */
 +		if (w->inuse_cnt) {
 +			if (*accept_windows_inuse)
 +				has_windows_inuse = 1;
 +			else
 +				return;
 +		}
 +
 +		if (w->last_used > this_mru_w->last_used)
 +			this_mru_w = w;
 +
 +		/*
 +		 * Reject this pack if it has windows that have been
 +		 * used more recently than the previously selected pack.
 +		 * If the previously selected pack had windows inuse and
 +		 * we have not encountered a window in this pack that is
 +		 * inuse, skip this check since we prefer a pack with no
 +		 * inuse windows to one that has inuse windows.
 +		 */
 +		if (*mru_w && *accept_windows_inuse == has_windows_inuse &&
 +		    this_mru_w->last_used > (*mru_w)->last_used)
 +			return;

 The *accept_windows_inuse == has_windows_inuse part is hard to
 grok, together with the fact that this statement is evaluated for
 each and every w, even though it is about this_mru_w and that
 variable is not updated in every iteration of the loop.  Can you
 clarify/simplify this part of the code a bit more?

 For example, would the above be equivalent to this?

 	if (w->last_used < this_mru_w->last_used)
 		continue;

 	this_mru_w = w;
 	if (has_windows_inuse && *mru_w &&
 	    w->last_used > (*mru_w)->last_used)
 		return;

 That is, if we already know a more recently used window in this
 pack, we do not have to do anything to maintain mru_w.  Otherwise,
 remember that this window is the most recently used one in this
 pack, and if it is newer than the newest one from the pack we are
 going to pick, we refrain from picking this pack.

 But we do not reject ourselves if we haven't seen a window that is
 in use (yet).

No, that wouldn't be the same.  The function of *accept_windows_inuse
== has_windows_inuse, and the testing of this_mru_w rather than w on
every iteration of the loop, is too subtle.  I tried to draw attention
to it in the comment, but I agree it's not enough.

The case that your example would not catch is when the new pack's mru
window has already been found, but has_windows_inuse is not set until
later.  When has_windows_inuse is later set, we need to test
this_mru_w regardless of whether we have just assigned it.

For example, if *mru_w points to a window with last_used == 11 and
*accept_windows_inuse == 1, and p->windows looks like this:

   last_used  in_use
   12         0
   10         1

Then the first time through the loop, this_mru_w would be set to the
first window, with last_used equal to 12.  The if statement that tests
this_mru_w->last_used > (*mru_w)->last_used would be skipped since
has_windows_inuse would still be 0.  The second time through the loop,
this_mru_w would _not_ be reset, but has_windows_inuse _would_ be set.
This time, we would want to enter the last if statement so that we can
reject the pack.
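
The trace above can be checked mechanically.  The following is a
minimal sketch, not code from the patch: struct win is a cut-down,
hypothetical stand-in for struct pack_window, the early
"*lru_p && !*mru_w" check is left out, and the two helpers return 1
for "reject" instead of returning void.  example_rejected() builds the
two-window list from the table and runs either variant of the loop.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pack_window: only the fields the
 * loop inspects are modelled. */
struct win {
	unsigned int last_used;
	unsigned int inuse_cnt;
	struct win *next;
};

/* Loop body as posted in the patch.  Returns 1 if the pack is rejected. */
static int rejects_as_posted(struct win *windows, const struct win *prev_mru,
			     int accept_windows_inuse)
{
	struct win *w, *this_mru_w;
	int has_windows_inuse = 0;

	for (w = this_mru_w = windows; w; w = w->next) {
		if (w->inuse_cnt) {
			if (accept_windows_inuse)
				has_windows_inuse = 1;
			else
				return 1;
		}
		if (w->last_used > this_mru_w->last_used)
			this_mru_w = w;
		/* Evaluated on every iteration, against this_mru_w. */
		if (prev_mru && accept_windows_inuse == has_windows_inuse &&
		    this_mru_w->last_used > prev_mru->last_used)
			return 1;
	}
	return 0;
}

/* Loop body with the suggested simplification.  Returns 1 on reject. */
static int rejects_as_suggested(struct win *windows, const struct win *prev_mru,
				int accept_windows_inuse)
{
	struct win *w, *this_mru_w;
	int has_windows_inuse = 0;

	for (w = this_mru_w = windows; w; w = w->next) {
		if (w->inuse_cnt) {
			if (accept_windows_inuse)
				has_windows_inuse = 1;
			else
				return 1;
		}
		if (w->last_used < this_mru_w->last_used)
			continue;	/* the in-use window is never compared */
		this_mru_w = w;
		if (has_windows_inuse && prev_mru &&
		    w->last_used > prev_mru->last_used)
			return 1;
	}
	return 0;
}

/* The example from the table: windows (12, not in use) -> (10, in use),
 * previously selected MRU window with last_used == 11, and
 * accept_windows_inuse == 1. */
int example_rejected(int use_suggested)
{
	struct win second = { 10, 1, NULL };
	struct win first = { 12, 0, &second };
	struct win prev_mru = { 11, 1, NULL };

	return use_suggested ? rejects_as_suggested(&first, &prev_mru, 1)
			     : rejects_as_posted(&first, &prev_mru, 1);
}
```

With these inputs example_rejected(0) reports a rejection, while
example_rejected(1) lets the pack through, which is the difference
described above.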

I'll try to rework this loop or add comments to clarify.

-Brandon



[PATCH v3] sha1_file: introduce close_one_pack() to close packs on fd pressure

2013-08-01 Thread Brandon Casey
When the number of open packs exceeds pack_max_fds, unuse_one_window()
is called repeatedly to attempt to release the least-recently-used
pack windows, which, as a side-effect, will also close a pack file
after closing its last open window.  If a pack file has been opened,
but no windows have been allocated into it, it will never be selected
by unuse_one_window() and hence its file descriptor will not be
closed.  When this happens, git may exceed the number of file
descriptors permitted by the system.

This latter situation can occur in show-ref or receive-pack during ref
advertisement.  During ref advertisement, receive-pack will iterate
over every ref in the repository and advertise it to the client after
ensuring that the ref exists in the local repository.  If the ref is
located inside a pack, then the pack is opened to ensure that it
exists, but since the object is not actually read from the pack, no
mmap windows are allocated.  When the number of open packs exceeds
pack_max_fds, unuse_one_window() will not be able to find any windows to
free and will not be able to close any packs.  Once the per-process
file descriptor limit is exceeded, receive-pack will produce a warning,
not an error, for each pack it cannot open, and will then most likely
fail to spawn rev-list or index-pack with an error like:

   error: cannot create standard input pipe for rev-list: Too many open files
   error: Could not run 'git rev-list'

This may also occur during upload-pack when refs are packed (in the
packed-refs file) and the number of packs that must be opened to
verify that these packed refs exist exceeds the file descriptor
limit.  If the refs are loose, then upload-pack will read each ref
from the object database (if the object is in a pack, allocating one
or more mmap windows for it) in order to peel tags and advertise the
underlying object.  But when the refs are packed and peeled,
upload-pack will use the peeled sha1 in the packed-refs file and
will not need to read from the pack files, so no mmap windows will
be allocated and just like with receive-pack, unuse_one_window()
will never select these opened packs to close.

When we have file descriptor pressure, we just need to find an open
pack to close.  We can leave the existing mmap windows open.  If
additional windows need to be mapped into the pack file, it will be
reopened when necessary.  If the pack file has been rewritten in the
meantime, open_packed_git_1() should notice when it compares the file
size or the pack's sha1 checksum to what was previously read from the
pack index, and reject it.
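
Leaving the windows mapped after closing the descriptor relies on
POSIX mmap() semantics: a mapping holds its own reference to the file,
and close() on the descriptor does not drop it.  The following is a
standalone sketch of that behaviour, not code from this patch; the
function name, the temporary file path and its "PACK" contents are
made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if a read-only mapping is still readable after the file
 * descriptor it was created from has been closed, 0 if its contents
 * differ, and -1 if the temporary file could not be set up.
 */
int mapping_survives_close(void)
{
	char path[] = "/tmp/mmap-demo-XXXXXX";
	const char msg[] = "PACK";
	size_t len = sizeof(msg) - 1;
	char *map;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* no name left; open references keep the data */

	if (write(fd, msg, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	close(fd);	/* give the descriptor back, keep the mapping */

	ok = !memcmp(map, msg, len);
	munmap(map, len);
	return ok;
}
```

The mapping stays readable until munmap(), so closing pack_fd frees a
descriptor without invalidating windows that are already mapped.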

Let's introduce a new function close_one_pack() designed specifically
for this purpose to search for and close the least-recently-used pack,
where LRU is defined as (in order of preference):

   * pack with oldest mtime and no allocated mmap windows
   * pack with the least-recently-used windows, i.e. the pack
     with the oldest most-recently-used window, where none of
     the windows are in use
   * pack with the least-recently-used windows
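
The order of preference above can be read as a three-tier ranking.
The following is a simplified sketch of that ranking only, not the
find_lru_pack() in this patch: struct pack_summary and every function
name here are hypothetical, each pack is reduced to a few precomputed
numbers, and ties keep the earlier candidate, which may differ from
what the real loop does.

```c
#include <stddef.h>

/* Hypothetical per-pack summary; the real code walks p->windows. */
struct pack_summary {
	long mtime;			/* pack file mtime */
	int nr_windows;			/* allocated mmap windows */
	int nr_inuse;			/* windows with a nonzero use count */
	unsigned int mru_last_used;	/* last_used of the pack's MRU window */
};

/* Lower tier means "close this one first". */
static int tier(const struct pack_summary *p)
{
	if (!p->nr_windows)
		return 0;	/* no windows allocated */
	if (!p->nr_inuse)
		return 1;	/* windows, none in use */
	return 2;		/* at least one window in use */
}

/* Nonzero if a is a better candidate to close than b. */
static int better_to_close(const struct pack_summary *a,
			   const struct pack_summary *b)
{
	int ta = tier(a), tb = tier(b);

	if (ta != tb)
		return ta < tb;
	if (ta == 0)
		return a->mtime < b->mtime;		/* oldest mtime */
	return a->mru_last_used < b->mru_last_used;	/* oldest MRU window */
}

/* Returns the index of the pack to close, or -1 if there is none. */
int pick_pack_to_close(const struct pack_summary *packs, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++)
		if (best < 0 || better_to_close(&packs[i], &packs[best]))
			best = (int)i;
	return best;
}

/* Four made-up packs; the windowless pack with the oldest mtime wins. */
int example_pick(void)
{
	const struct pack_summary packs[] = {
		{ 300, 2, 0, 50 },	/* windows, none in use */
		{ 200, 0, 0, 0 },	/* no windows, newer */
		{ 100, 0, 0, 0 },	/* no windows, older */
		{ 50, 1, 1, 10 },	/* a window in use */
	};

	return pick_pack_to_close(packs, sizeof(packs) / sizeof(packs[0]));
}
```

Here example_pick() selects index 2: both windowless packs outrank the
others, and of those two it has the older mtime.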

Signed-off-by: Brandon Casey <draf...@gmail.com>
---

Here's the version that leaves the mmap windows open after closing
the pack file descriptor.

-Brandon

 sha1_file.c | 79 -
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 40b2329..263cf71 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -673,6 +673,83 @@ void close_pack_windows(struct packed_git *p)
 	}
 }
 
+/*
+ * The LRU pack is the one with the oldest MRU window, preferring packs
+ * with no used windows, or the oldest mtime if it has no windows allocated.
+ */
+static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse)
+{
+	struct pack_window *w, *this_mru_w;
+	int has_windows_inuse = 0;
+
+	/*
+	 * Reject this pack if it has windows and the previously selected
+	 * one does not.  If this pack does not have windows, reject
+	 * it if the pack file is newer than the previously selected one.
+	 */
+	if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime))
+		return;
+
+	for (w = this_mru_w = p->windows; w; w = w->next) {
+		/*
+		 * Reject this pack if any of its windows are in use,
+		 * but the previously selected pack did not have any
+		 * inuse windows.  Otherwise, record that this pack
+		 * has windows in use.
+		 */
+		if (w->inuse_cnt) {
+			if (*accept_windows_inuse)
+				has_windows_inuse = 1;
+			else
+				return;
+		}
+
+		if (w->last_used > this_mru_w->last_used)
+			this_mru_w = w;
+
+		/*
+		 * Reject this pack if it has windows that have been
+