Re: check-attr doesn't respect recursive definitions

2013-04-06 Thread Jan Larres
Duy Nguyen  wrote:
> On Sat, Mar 30, 2013 at 8:45 PM, Jan Larres  wrote:
>> I would expect the last command to also report 'set'. I've also tried
>> other patterns like 'foo/' and 'foo*', but it didn't make any
>> difference.
>
> Try "foo/**". You need 1.8.2 though.

That indeed works exactly the way I would like! Thanks a lot!

Jan

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-04 Thread Duy Nguyen
On Sat, Mar 30, 2013 at 8:45 PM, Jan Larres  wrote:
> I would expect the last command to also report 'set'. I've also tried
> other patterns like 'foo/' and 'foo*', but it didn't make any
> difference.

Try "foo/**". You need 1.8.2 though.
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-03 Thread Jan Larres
Thanks for the clarifications. Just a quick comment about the summary:

Jeff King  wrote:
> Yeah, I had the same thought. So you would have to either:
>
>   1. Hook the feature into git-archive, which knows about how it
>  recurses, and can report the correct set of paths.
>
> or
>
>   2. Tell check-attr (or some post-processor) to apply the attribute to
>  elements below the path (or possibly to prune out such paths). This
>  is not the same as recursive application, because you cannot negate
>  it (i.e., you actually find out the final attrs for both "foo" and
>  "foo/bar", and then say "the attr for 'foo' overrides the attr for
>  'foo/bar'".
>
> I posted a patch for (1), but it felt not-very-general. But (2) also
> feels gross and not very general. Even though it could in theory be used
> for things besides git-archive, it is really just applying git-archive's
> pruning rule, which other programs likely don't care about.

I actually think the first approach is not such a bad idea, it would
make "archive" more of a general-purpose archiving tool/helper for the
repository rather than just something for the special cases of tars and
zips. But I guess that's somewhat subjective.

Cheers,
Jan

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Jeff King
On Tue, Apr 02, 2013 at 10:16:55AM -0700, Junio C Hamano wrote:

> > Yes, but as I explained later, the meaning of "apply an attribute to
> > dir" in such cases is always equivalent to "apply attribute
> > recursively to dir/*". So I do not think we are violating that rule to
> > recursively apply all attributes.
> 
> I think this is where we disagree.  Attribute does not recursively
> apply in general.  It was _never_ designed to (and that is the
> authoritative answer, as I was who designed it in Apr 2007 ;-)).

My argument was not that they apply recursively in general, but that we
followed an "as if" rule where what happens in the gitignore and
export-ignore cases are indistinguishable from recursive application.
But...

> It is not even true to say that "archive" applies export-ignore to
> the directory recursively, with or without the recent change.  Would
> it allow everything but dir/file to be excluded and still dir/file
> to be included in the archive if you have a .gitattribute file like
> this?
> 
>   dir/ export-ignore
> dir/file !export-ignore
> 
> I do not think so.

This is a perfect counter-example. The current behavior _is_
distinguishable from recursive application, and my "as if" above is not
correct.

Thanks for the clarification.

> > But let's take a step back. I think Jan is trying to do a very
> > reasonable thing: come up with the same set of paths that git-archive
> > would. What's the best way to solve that?
> 
> Because the attribute does not recursively apply in general, and it
> is entirely up to the application and a particular attribute key to
> decide how the key is applied in the context of the application,
> "check-attr" by itself cannot know.  You need to know how "archive"
> treats "export-ignore" attribute and then use "check-attr" with that
> knowledge.

Yeah, I had the same thought. So you would have to either:

  1. Hook the feature into git-archive, which knows about how it
 recurses, and can report the correct set of paths.

or

  2. Tell check-attr (or some post-processor) to apply the attribute to
 elements below the path (or possibly to prune out such paths). This
 is not the same as recursive application, because you cannot negate
 it (i.e., you actually find out the final attrs for both "foo" and
 "foo/bar", and then say "the attr for 'foo' overrides the attr for
 'foo/bar'".

I posted a patch for (1), but it felt not-very-general. But (2) also
feels gross and not very general. Even though it could in theory be used
for things besides git-archive, it is really just applying git-archive's
pruning rule, which other programs likely don't care about.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Apr 02, 2013 at 09:43:30AM -0700, Junio C Hamano wrote:
>
>> > In some systems, yes, but git does not have any notion of "doc/" as an
>> > item (after all, we track content in files, not directories), so I do
>> > not see what it means to specify a directory except to say "everything
>> > under it has this property".
>> 
>> That was true back when gitattributes (and ignore) was defined to
>> apply only to the paths we track.  But export-ignore abuses the
>> attrtibute system, allows a directory to be specified in the match
>> pattern, and we declared that is a kosher use by the patch that
>> caused 1.8.1.X regression, no?  So "Git does not have any notion of
>> "doc/' as an item" is no longer true, I think.
>
> Yes, but as I explained later, the meaning of "apply an attribute to
> dir" in such cases is always equivalent to "apply attribute
> recursively to dir/*". So I do not think we are violating that rule to
> recursively apply all attributes.

I think this is where we disagree.  Attribute does not recursively
apply in general.  It was _never_ designed to (and that is the
authoritative answer, as I was who designed it in Apr 2007 ;-)).

It is not even true to say that "archive" applies export-ignore to
the directory recursively, with or without the recent change.  Would
it allow everything but dir/file to be excluded and still dir/file
to be included in the archive if you have a .gitattribute file like
this?

dir/ export-ignore
dir/file !export-ignore

I do not think so.

If we _were_ living in an alternate universe where we did not have
the .gitignore file and instead expressed what it does with an
"ignore" attribute, then having "dir" in the top-level .gitignore
file and "!file" in the .gitignore file in "dir/" directory may be
equivalent to having

dir/ ignore
dir/file -ignore

in your .gitattribute, and we would want to ignore dir/other while
including dir/file in the "may be interesting" paths.

The point is, yes, we could choose to define individual attribute
keys to apply recursively, but in general attributes were designed
not to recurse, and no existing attribute recurses.

> But let's take a step back. I think Jan is trying to do a very
> reasonable thing: come up with the same set of paths that git-archive
> would. What's the best way to solve that?

Because the attribute does not recursively apply in general, and it
is entirely up to the application and a particular attribute key to
decide how the key is applied in the context of the application,
"check-attr" by itself cannot know.  You need to know how "archive"
treats "export-ignore" attribute and then use "check-attr" with that
knowledge.

> Using:
>
>   git ls-tree --name-only -zrt HEAD |
>   git check-attr --stdin -z export-ignore
>
> means we can find out that "foo/" is ignored.

Yes, and after learning that, we need to reject everything foo/*
using the knowledge that "git archive" cuts all subpaths of a path
that has "export-ignore" attribute is attached to.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Jeff King
On Tue, Apr 02, 2013 at 12:51:28PM -0400, Jeff King wrote:

> But let's take a step back. I think Jan is trying to do a very
> reasonable thing: come up with the same set of paths that git-archive
> would. What's the best way to solve that? Recursive application of
> attributes is one way, but is there another way we could help with
> solving that?
> 
> Using:
> 
>   git ls-tree --name-only -zrt HEAD |
>   git check-attr --stdin -z export-ignore
> 
> means we can find out that "foo/" is ignored. But he would have to
> manually post-process the output to see that "foo/bar" is below "foo".
> Not impossible, but I just wonder if git can be more helpful in figuring
> this out.

One way to solve the problem is something like the patch below, which
allows "git archive --format=lstree" to give an lstree-like listing, but
with export-ignore attributes applied.

It feels weirdly specific, though, like there should be a more general
solution.

---
 Makefile |  1 +
 archive-lstree.c | 43 +++
 archive.c|  1 +
 archive.h|  1 +
 quote.c  |  4 ++--
 quote.h  |  1 +
 6 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index e9749ca..5e63d72 100644
--- a/Makefile
+++ b/Makefile
@@ -772,6 +772,7 @@ LIB_OBJS += archive.o
 LIB_OBJS += alias.o
 LIB_OBJS += alloc.o
 LIB_OBJS += archive.o
+LIB_OBJS += archive-lstree.o
 LIB_OBJS += archive-tar.o
 LIB_OBJS += archive-zip.o
 LIB_OBJS += argv-array.o
diff --git a/archive-lstree.c b/archive-lstree.c
new file mode 100644
index 000..5ca1bbc
--- /dev/null
+++ b/archive-lstree.c
@@ -0,0 +1,43 @@
+#include "cache.h"
+#include "archive.h"
+#include "quote.h"
+#include "commit.h"
+#include "blob.h"
+
+static int write_lstree_entry(struct archiver_args *args,
+ const unsigned char *sha1,
+ const char *path, size_t pathlen,
+ unsigned int mode)
+{
+   const char *type;
+
+   if (S_ISDIR(mode))
+   return 0;
+   else if (S_ISGITLINK(mode))
+   type = commit_type;
+   else
+   type = blob_type;
+
+   printf("%06o %s %s\t", mode, type, sha1_to_hex(sha1));
+   quote_c_style_counted(path, pathlen, NULL, stdout, 0);
+   printf("\n");
+
+   return 0;
+}
+
+static int write_lstree_archive(const struct archiver *ar,
+   struct archiver_args *args)
+{
+   return write_archive_entries(args, write_lstree_entry);
+}
+
+static struct archiver lstree_archiver = {
+   "lstree",
+   write_lstree_archive,
+   ARCHIVER_REMOTE
+};
+
+void init_lstree_archiver(void)
+{
+   register_archiver(&lstree_archiver);
+}
diff --git a/archive.c b/archive.c
index 3f00da6..4966db7 100644
--- a/archive.c
+++ b/archive.c
@@ -420,6 +420,7 @@ int write_archive(int argc, const char **argv, const char 
*prefix,
git_config(git_default_config, NULL);
init_tar_archiver();
init_zip_archiver();
+   init_lstree_archiver();
 
argc = parse_archive_args(argc, argv, &ar, &args, name_hint, remote);
if (nongit) {
diff --git a/archive.h b/archive.h
index 895afcd..1428fc4 100644
--- a/archive.h
+++ b/archive.h
@@ -27,6 +27,7 @@ extern void init_zip_archiver(void);
 
 extern void init_tar_archiver(void);
 extern void init_zip_archiver(void);
+extern void init_lstree_archiver(void);
 
 typedef int (*write_archive_entry_fn_t)(struct archiver_args *args,
const unsigned char *sha1,
diff --git a/quote.c b/quote.c
index 911229f..da6b7e4 100644
--- a/quote.c
+++ b/quote.c
@@ -206,8 +206,8 @@ static size_t next_quote_pos(const char *s, ssize_t maxlen)
  * of name, enclosed with double quotes if asked and needed only.
  * Return value is the same as in (1).
  */
-static size_t quote_c_style_counted(const char *name, ssize_t maxlen,
-struct strbuf *sb, FILE *fp, int no_dq)
+size_t quote_c_style_counted(const char *name, ssize_t maxlen,
+struct strbuf *sb, FILE *fp, int no_dq)
 {
 #undef EMIT
 #define EMIT(c) \
diff --git a/quote.h b/quote.h
index 133155a..f2b8acb 100644
--- a/quote.h
+++ b/quote.h
@@ -55,6 +55,7 @@ extern size_t quote_c_style(const char *name, struct strbuf 
*, FILE *, int no_dq
 
 extern int unquote_c_style(struct strbuf *, const char *quoted, const char 
**endp);
 extern size_t quote_c_style(const char *name, struct strbuf *, FILE *, int 
no_dq);
+extern size_t quote_c_style_counted(const char *name, ssize_t len, struct 
strbuf *, FILE *, int no_dq);
 extern void quote_two_c_style(struct strbuf *, const char *, const char *, 
int);
 
 extern void write_name_quoted(const char *name, FILE *, int terminator);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  ht

Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Jeff King
On Tue, Apr 02, 2013 at 09:43:30AM -0700, Junio C Hamano wrote:

> > In some systems, yes, but git does not have any notion of "doc/" as an
> > item (after all, we track content in files, not directories), so I do
> > not see what it means to specify a directory except to say "everything
> > under it has this property".
> 
> That was true back when gitattributes (and ignore) was defined to
> apply only to the paths we track.  But export-ignore abuses the
> attrtibute system, allows a directory to be specified in the match
> pattern, and we declared that is a kosher use by the patch that
> caused 1.8.1.X regression, no?  So "Git does not have any notion of
> "doc/' as an item" is no longer true, I think.

Yes, but as I explained later, the meaning of "apply an attribute to
dir" in such cases is always equivalent to "apply attribute
recursively to dir/*". So I do not think we are violating that rule to
recursively apply all attributes.

But let's take a step back. I think Jan is trying to do a very
reasonable thing: come up with the same set of paths that git-archive
would. What's the best way to solve that? Recursive application of
attributes is one way, but is there another way we could help with
solving that?

Using:

  git ls-tree --name-only -zrt HEAD |
  git check-attr --stdin -z export-ignore

means we can find out that "foo/" is ignored. But he would have to
manually post-process the output to see that "foo/bar" is below "foo".
Not impossible, but I just wonder if git can be more helpful in figuring
this out.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Apr 02, 2013 at 09:11:02AM -0700, Junio C Hamano wrote:
>
>> Jeff King  writes:
>> 
>> > Yes, it is the expected behavior, though I cannot offhand think of
>> > anything that would break if we did apply it recursively.
>> 
>> Conceptually that breaks our brain.  "All files in doc/ directories
>> are text" and "doc/ directory is text" are two different things, no?
>
> In some systems, yes, but git does not have any notion of "doc/" as an
> item (after all, we track content in files, not directories), so I do
> not see what it means to specify a directory except to say "everything
> under it has this property".

That was true back when gitattributes (and ignore) was defined to
apply only to the paths we track.  But export-ignore abuses the
attrtibute system, allows a directory to be specified in the match
pattern, and we declared that is a kosher use by the patch that
caused 1.8.1.X regression, no?  So "Git does not have any notion of
"doc/' as an item" is no longer true, I think.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Jeff King
On Tue, Apr 02, 2013 at 09:11:02AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Yes, it is the expected behavior, though I cannot offhand think of
> > anything that would break if we did apply it recursively.
> 
> Conceptually that breaks our brain.  "All files in doc/ directories
> are text" and "doc/ directory is text" are two different things, no?

In some systems, yes, but git does not have any notion of "doc/" as an
item (after all, we track content in files, not directories), so I do
not see what it means to specify a directory except to say "everything
under it has this property".

We already treat "foo/" in gitignore this way (i.e., "everything under
foo is ignored"). And we treat export-ignore the same way in
git-archive.

I agree that saying "Documentation diff=text" is not something I would
expect people to do, but what _should_ it do? Recursively applying under
the directory seems a sane option. I would not want to paint us into a
corner, but I cannot think of any other reasonable thing to do with it
(which is why I phrased it as "I cannot think of anything that would
break", and not "this is a great idea"; counterpoints are welcome).

I am sympathetic to Jan's issue, though. If git were consistent about
applying attributes only to paths, then the repo owner would have
written "dir/* export-ignore", not "dir". But it is not, and git-archive
does ask for attributes on directories, but without providing a good
mechanism for doing those attribute lookups from external programs.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Junio C Hamano
Jeff King  writes:

> Yes, it is the expected behavior, though I cannot offhand think of
> anything that would break if we did apply it recursively.

Conceptually that breaks our brain.  "All files in doc/ directories
are text" and "doc/ directory is text" are two different things, no?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Jeff King
On Tue, Apr 02, 2013 at 10:31:30AM -0400, Jeff King wrote:

> On Sat, Mar 30, 2013 at 09:45:51AM +, Jan Larres wrote:
> 
> > I am trying to write a custom archiving script that checks the
> > export-ignore attribute to know which files from an ls-files output it
> > should skip. Through this I noticed that for files in directories for
> > which the export-ignore (or any other) attribute is set, check-attr
> > still reports 'unspecified'. More precisely:
> > 
> > $ git init test
> > Initialized empty Git repository in /home/jan/test/.git/
> > $ cd test
> > $ mkdir foo
> > $ touch foo/bar
> > $ echo "foo export-ignore" > .gitattributes
> > $ git check-attr export-ignore foo
> > foo: export-ignore: set
> > $ git check-attr export-ignore foo/bar
> > foo/bar: export-ignore: unspecified
> > 
> > I would expect the last command to also report 'set'. I've also tried
> > other patterns like 'foo/' and 'foo*', but it didn't make any
> > difference. Is this expected behaviour? It does make checking the
> > attributes of single files somewhat more difficult.
> 
> Yes, it is the expected behavior, though I cannot offhand think of
> anything that would break if we did apply it recursively.

Naively, such a patch might look something like this:

diff --git a/attr.c b/attr.c
index de170ff..ac04188 100644
--- a/attr.c
+++ b/attr.c
@@ -791,8 +791,18 @@ static void collect_all_attrs(const char *path)
check_all_attr[i].value = ATTR__UNKNOWN;
 
rem = attr_nr;
-   for (stk = attr_stack; 0 < rem && stk; stk = stk->prev)
+   for (stk = attr_stack; 0 < rem && stk; stk = stk->prev) {
+   last_slash = path;
+   for (cp = path; *cp; cp++) {
+   if (*cp == '/') {
+   rem = fill(path, cp - path + 1,
+  last_slash - path,
+  stk, rem);
+   last_slash = cp;
+   }
+   }
rem = fill(path, pathlen, basename_offset, stk, rem);
+   }
 }
 
 int git_check_attr(const char *path, int num, struct git_attr_check *check)

which is based on current "next" (because it has some other related
fixes for handling directories). It seems to work for me, but I suspect
we could do it more optimally. If you have N files in "foo/", this will
check "foo/" against the attribute stack N times, and we should be able
to amortize that work.

Hmm. Actually, thinking on it more, I'm not sure this is even going to
give the right answer anyway, as it will check "foo" against the
"foo/.gitattributes", which is not right. I think we'd need to collect
attributes as we walk the stack in prepare_attr_stack.

So take this as a "this is not how to do it" patch. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-attr doesn't respect recursive definitions

2013-04-02 Thread Jeff King
On Sat, Mar 30, 2013 at 09:45:51AM +, Jan Larres wrote:

> I am trying to write a custom archiving script that checks the
> export-ignore attribute to know which files from an ls-files output it
> should skip. Through this I noticed that for files in directories for
> which the export-ignore (or any other) attribute is set, check-attr
> still reports 'unspecified'. More precisely:
> 
> $ git init test
> Initialized empty Git repository in /home/jan/test/.git/
> $ cd test
> $ mkdir foo
> $ touch foo/bar
> $ echo "foo export-ignore" > .gitattributes
> $ git check-attr export-ignore foo
> foo: export-ignore: set
> $ git check-attr export-ignore foo/bar
> foo/bar: export-ignore: unspecified
> 
> I would expect the last command to also report 'set'. I've also tried
> other patterns like 'foo/' and 'foo*', but it didn't make any
> difference. Is this expected behaviour? It does make checking the
> attributes of single files somewhat more difficult.

Yes, it is the expected behavior, though I cannot offhand think of
anything that would break if we did apply it recursively.

> git-archive ignores the directory as expected, but unfortunately it
> doesn't have an option to just list the files it would archive instead
> of actually archiving them.

Yes, git-archive feeds the directories into the attribute machinery as
it traverses the tree, so it actually checks for attributes of "foo"
before recursing. You can do the same, but I agree it is quite a bit
more annoying than just piping "ls-files -z" into "check-attr --stdin
-z".

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


check-attr doesn't respect recursive definitions

2013-03-30 Thread Jan Larres
Hi,

I am trying to write a custom archiving script that checks the
export-ignore attribute to know which files from an ls-files output it
should skip. Through this I noticed that for files in directories for
which the export-ignore (or any other) attribute is set, check-attr
still reports 'unspecified'. More precisely:

$ git init test
Initialized empty Git repository in /home/jan/test/.git/
$ cd test
$ mkdir foo
$ touch foo/bar
$ echo "foo export-ignore" > .gitattributes
$ git check-attr export-ignore foo
foo: export-ignore: set
$ git check-attr export-ignore foo/bar
foo/bar: export-ignore: unspecified

I would expect the last command to also report 'set'. I've also tried
other patterns like 'foo/' and 'foo*', but it didn't make any
difference. Is this expected behaviour? It does make checking the
attributes of single files somewhat more difficult.

git-archive ignores the directory as expected, but unfortunately it
doesn't have an option to just list the files it would archive instead
of actually archiving them.

This is with git version 1.7.10.4.

-Jan

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html