Re: [PATCH v3 1/2] [GSoC] dir_iterator: iterate over dir after its contents

2017-03-26 Thread Michael Haggerty
On 03/25/2017 07:12 PM, Daniel Ferreira wrote:
> Create an option for the dir_iterator API to iterate over a directory
> path only after having iterated through its contents. This feature was
> predicted, although not implemented by 0fe5043 ("dir_iterator: new API
> for iterating over a directory tree", 2016-06-18).
> 
> This is useful for recursively removing a directory and calling rmdir()
> on a directory only after all of its contents have been wiped.
> 
> An "options" member has been added to the dir_iterator struct. It
> contains the "iterate_dirs_after_files" flag, that enables the feature
> when set to 1. Default behavior continues to be iterating over directory
> paths before its contents.
> 
> Two inline functions have been added to dir_iterator's code to avoid
> code repetition inside dir_iterator_advance() and make code more clear.
> 
> No particular functions or wrappers for setting the options struct's
> fields have been added to avoid either breaking the current dir_iterator
> API or over-engineering an extremely simple option architecture.
> 
> Signed-off-by: Daniel Ferreira 
> ---
>  dir-iterator.c | 100 
> -
>  dir-iterator.h |   7 
>  2 files changed, 84 insertions(+), 23 deletions(-)
> 
> [...]
> diff --git a/dir-iterator.h b/dir-iterator.h
> index 27739e6..4304913 100644
> --- a/dir-iterator.h
> +++ b/dir-iterator.h
> @@ -38,7 +38,14 @@
>   * dir_iterator_advance() again.
>   */
> 
> +struct dir_iterator_options {
> + unsigned iterate_dirs_after_files : 1;
> +};
> +
>  struct dir_iterator {
> + /* Options for dir_iterator */
> + struct dir_iterator_options options;
> +
>   /* The current path: */
>   struct strbuf path;

Another thing I noticed: the name of this option,
`iterate_dirs_after_files`, is a little bit misleading. If I understand
correctly, it doesn't make the iteration process files before
directories within a single directory; rather, it ensures that
subdirectories and their contents are processed before the containing
directory. Therefore, a better name might be something like "depth_first".

I should mention that I like the overall idea to add this new feature
and use it to simplify `remove_subtree()`.

Michael



Re: [PATCH v3 1/2] [GSoC] dir_iterator: iterate over dir after its contents

2017-03-26 Thread Michael Haggerty
On 03/25/2017 07:12 PM, Daniel Ferreira wrote:
> Create an option for the dir_iterator API to iterate over a directory
> path only after having iterated through its contents. This feature was
> predicted, although not implemented by 0fe5043 ("dir_iterator: new API
> for iterating over a directory tree", 2016-06-18).
> 
> This is useful for recursively removing a directory and calling rmdir()
> on a directory only after all of its contents have been wiped.
> 
> An "options" member has been added to the dir_iterator struct. It
> contains the "iterate_dirs_after_files" flag, that enables the feature
> when set to 1. Default behavior continues to be iterating over directory
> paths before its contents.
> 
> Two inline functions have been added to dir_iterator's code to avoid
> code repetition inside dir_iterator_advance() and make code more clear.
> 
> No particular functions or wrappers for setting the options struct's
> fields have been added to avoid either breaking the current dir_iterator
> API or over-engineering an extremely simple option architecture.

This patch would be easier to read if it were split into two: one
extracting the new functions and changing old code to use them, and a
second adding the new functionality. As one patch, is is hard to see
quickly which changes have what purpose.

I also suggest adding a new `unsigned int flags` parameter to
`dir_iterator_begin`. I think that's more natural, because it doesn't
make sense to change the iteration order during an iteration. It's not
much of a problem to change the API given that all callers are in the
same codebase. If you were to forget to convert any callers (or if a
different in-flight patch series were to add a new caller using the old
call style), the compiler would complain, and the problem would be
obvious and easy to fix.

I didn't actually read the patch carefully yet because I don't have time
this evening to seek out the interesting parts in the long diff.

Michael

> [...]



[PATCH v3 1/2] [GSoC] dir_iterator: iterate over dir after its contents

2017-03-25 Thread Daniel Ferreira
Create an option for the dir_iterator API to iterate over a directory
path only after having iterated through its contents. This feature was
predicted, although not implemented by 0fe5043 ("dir_iterator: new API
for iterating over a directory tree", 2016-06-18).

This is useful for recursively removing a directory and calling rmdir()
on a directory only after all of its contents have been wiped.

An "options" member has been added to the dir_iterator struct. It
contains the "iterate_dirs_after_files" flag, that enables the feature
when set to 1. Default behavior continues to be iterating over directory
paths before its contents.

Two inline functions have been added to dir_iterator's code to avoid
code repetition inside dir_iterator_advance() and make code more clear.

No particular functions or wrappers for setting the options struct's
fields have been added to avoid either breaking the current dir_iterator
API or over-engineering an extremely simple option architecture.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c | 100 -
 dir-iterator.h |   7 
 2 files changed, 84 insertions(+), 23 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index 34182a9..833d56a 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -50,6 +50,43 @@ struct dir_iterator_int {
struct dir_iterator_level *levels;
 };

+static inline void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   level->dir_state = DIR_STATE_RECURSE;
+   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
+  iter->levels_alloc);
+   level = >levels[iter->levels_nr++];
+   level->initialized = 0;
+}
+
+static inline int pop_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   return --iter->levels_nr;
+}
+
+static inline int set_iterator_data(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   if (lstat(iter->base.path.buf, >base.st) < 0) {
+   if (errno != ENOENT)
+   warning("error reading path '%s': %s",
+   iter->base.path.buf,
+   strerror(errno));
+   return -1;
+   }
+
+   /*
+* We have to set these each time because
+* the path strbuf might have been realloc()ed.
+*/
+   iter->base.relative_path =
+   iter->base.path.buf + iter->levels[0].prefix_len;
+   iter->base.basename =
+   iter->base.path.buf + level->prefix_len;
+   level->dir_state = DIR_STATE_ITER;
+
+   return 0;
+}
+
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
struct dir_iterator_int *iter =
@@ -77,18 +114,16 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
}

level->initialized = 1;
-   } else if (S_ISDIR(iter->base.st.st_mode)) {
+   } else if (S_ISDIR(iter->base.st.st_mode) &&
+   !iter->base.options.iterate_dirs_after_files) {
if (level->dir_state == DIR_STATE_ITER) {
/*
 * The directory was just iterated
 * over; now prepare to iterate into
-* it.
+* it (unless an option is set for us
+* to do otherwise).
 */
-   level->dir_state = DIR_STATE_RECURSE;
-   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
-  iter->levels_alloc);
-   level = >levels[iter->levels_nr++];
-   level->initialized = 0;
+   push_dir_level(iter, level);
continue;
} else {
/*
@@ -104,7 +139,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * This level is exhausted (or wasn't opened
 * successfully); pop up a level.
 */
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter, level) == 0)
return dir_iterator_abort(dir_iterator);

continue;
@@ -120,16 +155,33 @@ int dir_iterator_advance(struct dir_iterator 
*dir_iterator)
de = readdir(level->dir);

if (!de) {
-   /* This level is exhausted; pop up a level. */
+   /* This level is exhausted  */
if (errno) {
warning("error reading directory %s: 
%s",