Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-30 Thread Florian Achleitner
Sorry for reacting so late, I didn't read the list carefully in the last weeks 
and my gmail filter somehow didn't trigger on that.

On Tuesday 02 October 2012 16:20:22 Junio C Hamano wrote:
> * fa/remote-svn (2012-09-19) 16 commits
>  - Add a test script for remote-svn
>  - remote-svn: add marks-file regeneration
>  - Add a svnrdump-simulator replaying a dump file for testing
>  - remote-svn: add incremental import
>  - remote-svn: Activate import/export-marks for fast-import
>  - Create a note for every imported commit containing svn metadata
>  - vcs-svn: add fast_export_note to create notes
>  - Allow reading svn dumps from files via file:// urls
>  - remote-svn, vcs-svn: Enable fetching to private refs
>  - When debug==1, start fast-import with "--stats" instead of "--quiet"
>  - Add documentation for the 'bidi-import' capability of remote-helpers
>  - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability
>  - Add argv_array_detach and argv_array_free_detached
>  - Add svndump_init_fd to allow reading dumps from arbitrary FDs
>  - Add git-remote-testsvn to Makefile
>  - Implement a remote helper for svn in C
>  (this branch is used by fa/vcs-svn.)
> 
>  A GSoC project.
>  Waiting for comments from mentors and stakeholders.

From my point of view, this is rather complete. It got eight review cycles on 
the list.
Note that the remote helper can only fetch, pushing is not possible at all.

> 
> 
> * fa/vcs-svn (2012-09-19) 4 commits
>  - vcs-svn: remove repo_tree
>  - vcs-svn/svndump: rewrite handle_node(), begin|end_revision()
>  - vcs-svn/svndump: restructure node_ctx, rev_ctx handling
>  - svndump: move struct definitions to .h
>  (this branch uses fa/remote-svn.)
> 
>  A GSoC project.
>  Waiting for comments from mentors and stakeholders.

This is the result of what I did when I wanted to start implementing branch 
detection. I found that the existing code is not suitable and restructured it.

The main goal is to separate svn revision parsing from git commit creation, 
because to create a commit you need to know which branch to create it on, 
while to find out which branch is the right one you need to read the 
complete svn revision first to see which dirs are changed and how.
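
The two-phase idea described above can be illustrated with a minimal sketch. It is not the vcs-svn code: the Revision class, branch_for() and plan_commits() are hypothetical names, and the trunk/branches layout is an assumption made only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    """Hypothetical container filled while a whole svn revision is parsed."""
    number: int
    log: str
    changed_paths: list = field(default_factory=list)


def branch_for(path):
    """Assumed layout: trunk/... and branches/<name>/...; anything else is None."""
    parts = path.split("/")
    if parts[0] == "trunk":
        return "trunk"
    if parts[0] == "branches" and len(parts) > 1:
        return parts[1]
    return None


def parse_revision(number, log, node_paths):
    """Phase 1: only record what changed; no commit is created here."""
    rev = Revision(number, log)
    rev.changed_paths.extend(node_paths)
    return rev


def plan_commits(rev):
    """Phase 2: with the full revision known, group the paths by branch."""
    by_branch = {}
    for path in rev.changed_paths:
        by_branch.setdefault(branch_for(path), []).append(path)
    return by_branch
```

Only after plan_commits() has run does the caller know how many commits a revision needs and on which branches, which is why the parsing step cannot emit commits as it goes.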

It is rather invasive and it doesn't make sense without using it later on.
So I'm not surprised that you may not like it.
Anyways it passes all existing tests (that doesn't mean it's good of course 
;))

Florian
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-05 Thread Andreas Schwab
Matthieu Moy  writes:

> Andreas Schwab  writes:
>
>> Junio C Hamano  writes:
>>
>>> When we require "x/**/y", I think we still want it to match "x/y".
>>
>> FWIW, in bash (+extglob), ksh and zsh it doesn't.
>
> You're right about bash, but I see the opposite for zsh and ksh:
>
> zsh$ echo x/**/y
> x/y x/z/y
>
> ksh$ echo x/**/y
> x/y x/z/y

Looks like this is different between filename expansion and case pattern
matching (I only tested the latter).
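
A similar split between filename expansion and plain pattern matching can be seen in Python's standard library. This is only an analogy and says nothing about how bash, ksh or zsh are implemented; the compare() helper is a hypothetical name, and the files x/y and x/z/y are created in a temporary directory just for the demonstration.

```python
import fnmatch
import glob
import os
import tempfile


def compare(pattern="x/**/y"):
    """Return (paths found by expansion, paths accepted by string matching)."""
    candidates = ("x/y", "x/z/y")
    with tempfile.TemporaryDirectory() as root:
        for rel in candidates:
            full = os.path.join(root, rel)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            open(full, "w").close()
        # Filename expansion: with recursive=True, "**" may match zero directories.
        expanded = sorted(
            os.path.relpath(p, root).replace(os.sep, "/")
            for p in glob.glob(os.path.join(root, pattern), recursive=True)
        )
    # String matching: each "*" is an ordinary wildcard, so both slashes are required.
    matched = [p for p in candidates if fnmatch.fnmatchcase(p, pattern)]
    return expanded, matched
```

Here the expansion side returns both x/y and x/z/y, while the string-matching side accepts only x/z/y.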

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-05 Thread Nguyen Thai Ngoc Duy
On Thu, Oct 04, 2012 at 09:39:02AM -0700, Junio C Hamano wrote:
> Assuming that we do want to match "x/y" with "x/**/y", I suspect
> that "'**' matches anything including a slash" would not give us
> that semantics. Is it something we can easily fix in the wildmatch
> code?

Something like this may suffice. Lightly tested with "git add -n".
Reading the code, I think we can even distinguish "match zero or more
directories" and "match one or more directories" with "/**/" and maybe
"/***/". Right now **, ***, ... are the same. So are /**/, /***/,
//...

-- 8< --
diff --git a/wildmatch.c b/wildmatch.c
index f153f8a..81eadc8 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -98,8 +98,12 @@ static int dowild(const uchar *p, const uchar *text,
         continue;
     case '*':
         if (*++p == '*') {
+            int slashstarstar = p[-2] == '/';
             while (*++p == '*') {}
             special = TRUE;
+            if (slashstarstar && *p == '/' &&
+                dowild(p + 1, text, a, force_lower_case) == TRUE)
+                return TRUE;
         } else
             special = FALSE;
         if (*p == '\0') {
-- 8< --
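
The "match zero or more directories" reading of "/**/" can also be sketched independently of wildmatch. This is not a port of the patch above: match_segments() and wild() are hypothetical helpers that work on whole path components, and a "**" component is assumed to stand for zero or more components.

```python
from fnmatch import fnmatchcase


def match_segments(pat, path):
    """pat and path are lists of components; '**' matches zero or more of them."""
    if not pat:
        return not path
    if pat[0] == "**":
        # Try letting '**' swallow 0, 1, 2, ... leading components.
        return any(match_segments(pat[1:], path[i:])
                   for i in range(len(path) + 1))
    return (bool(path)
            and fnmatchcase(path[0], pat[0])
            and match_segments(pat[1:], path[1:]))


def wild(pattern, path):
    return match_segments(pattern.split("/"), path.split("/"))
```

With this definition wild("x/**/y", "x/y") and wild("x/**/y", "x/a/b/y") both hold, while wild("x/**/y", "xy") does not.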
-- 
Duy


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-05 Thread Matthieu Moy
Andreas Schwab  writes:

> Junio C Hamano  writes:
>
>> When we require "x/**/y", I think we still want it to match "x/y".
>
> FWIW, in bash (+extglob), ksh and zsh it doesn't.

You're right about bash, but I see the opposite for zsh and ksh:

zsh$ echo x/**/y
x/y x/z/y

ksh$ echo x/**/y
x/y x/z/y

(I didn't check the doc to see whether this is configurable, but I set
HOME=/ when launching the shell to disable my own configuration)

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-05 Thread Andreas Schwab
Junio C Hamano  writes:

> When we require "x/**/y", I think we still want it to match "x/y".

FWIW, in bash (+extglob), ksh and zsh it doesn't.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread Junio C Hamano
Michael Haggerty  writes:

> On 10/03/2012 08:17 PM, Junio C Hamano wrote:
>> 
>> What is the semantics of ** in the first place?  Is it described to
>> a reasonable level of detail in the documentation updates?  For
>> example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b",
>> "a/bfoo/c"?  Does "x**y" match "xy", "xay", "xa/by", "x/a/y"?
>> 
>> I am guessing that the only sensible definition is that "**"
>> requires anything that comes before it (if exists) is at a proper
>> hierarchy boundary, and anything matches it is also at a proper
>> hierarchy boundary, so "x**y" matches "x/a/y" and not "xy", "xay",
>> nor "xa/by" in the above example.  If "x**y" can match "xy" or "xay"
>> (or "**foo" can match "afoo"), it would be unreasonable to say it
>> implies the pattern is anchored at any level, no?
>
> Given that there is no obvious interpretation for what a construct like
> "x**y" would mean, and many plausible guesses (most of which sound
> rather useless), I suggest that we forbid it.  This will make the
> feature easier to explain and make .gitignore files that use it easier
> to understand.
>
> I think that 98% of the usefulness of "**" would be in constructs where
> it replaces a proper part of the pathname, like "**/SOMETHING" or
> "SOMETHING/**/SOMETHING"...

I think it is a good way to go in the longer term, if we all agree
that "**" matching anything does not give us a useful semantics
[*1*].

Is it something we can easily get by simple patch into the wildmatch
code?  I'd hate to see us parsing the input and validating it before
passing it to the library, as we will surely botch the quoting or
something while doing so.

When we require "x/**/y", I think we still want it to match "x/y".
Do people agree, or are there good reasons to require at least one
level between x and y for such a pattern?  Assuming that we do want
to match "x/y" with "x/**/y", I suspect that "'**' matches anything
including a slash" would not give us that semantics. Is it something
we can easily fix in the wildmatch code?


[Footnote]

*1* The message you are responding to was written in a somewhat
provocative way on purpose so that people who like the way rsync
matches "**" can vocally object. I would like to see arguments from
the both sides to see if it makes sense.


Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2))

2012-10-04 Thread Junio C Hamano
Stephen Bash  writes:

> I seemed to have missed the GSoC wrap up conversation... (links happily
> accepted).

I also seem to have missed such conversation, if anything like that
happened.  It certainly would have been nice for the mentors and the
student for each project to give us a two-to-three-paragraphs
summary.

As GSoC is a Google event and not the Git community one, I wouldn't
*demand* a concluding write-up, but it still took considerable
community resource, so...



Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread Junio C Hamano
David Michael Barr  writes:

> On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: 
>> 
>> * fa/remote-svn (2012-09-19) 16 commits
>> ...
>> 
>> A GSoC project.
>> Waiting for comments from mentors and stakeholders.
>
> I have reviewed this topic and am happy with the design and implementation.
> I support this topic for inclusion.
>
> Acked-by: David Michael Barr 
>> 
>> * fa/vcs-svn (2012-09-19) 4 commits
>> ...
>
> This follow-on topic I'm not so sure on, some of the design
> decisions make me uncomfortable and I need some convincing before
> I can get behind this topic.

Thanks for the feedback.


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread Michael Haggerty
On 10/04/2012 01:46 PM, Nguyen Thai Ngoc Duy wrote:
> On Thu, Oct 4, 2012 at 4:34 PM, Michael Haggerty  wrote:
>> As for the implementation, it is quite easy to textually convert a glob
>> pattern, including "**" parts, into a regexp.
> 
> Or we could introduce regexp syntax as an alternative and let users
> choose (and pay associated price).

It seems like overkill to me.  For filenames, globs are usually adequate.

>> _filename_char_pattern = r'[^/]'
>> _glob_patterns = [
>>     ('?', _filename_char_pattern),
>>     ('/**', r'(/.+)?'),
>>     ('**/', r'(.+/)?'),
>>     ('*', _filename_char_pattern + r'*'),
>>     ]
> 
> I don't fully understand the rest (never been a big fan of python) but
> what about bracket expressions like [!abc] and [:alnum:]?

You're right; I forgot that the code that I posted doesn't support brackets.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2))

2012-10-04 Thread Stephen Bash
----- Original Message -----
> From: "Jonathan Nieder" 
> Sent: Thursday, October 4, 2012 4:30:01 AM
> Subject: Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; 
> Tue, 2))
> 
> > > * fa/remote-svn (2012-09-19) 16 commits
> > > - Add a test script for remote-svn
> > > - remote-svn: add marks-file regeneration
> > > - Add a svnrdump-simulator replaying a dump file for testing
> > > - remote-svn: add incremental import
> > > - remote-svn: Activate import/export-marks for fast-import
> > > - Create a note for every imported commit containing svn metadata
> > > - vcs-svn: add fast_export_note to create notes
> > > - Allow reading svn dumps from files via file:// urls
> > > - remote-svn, vcs-svn: Enable fetching to private refs
> > > - When debug==1, start fast-import with "--stats" instead of
> > > "--quiet"
> > > - Add documentation for the 'bidi-import' capability of
> > > remote-helpers
> > > - Connect fast-import to the remote-helper via pipe, adding
> > > 'bidi-import' capability
> > > - Add argv_array_detach and argv_array_free_detached
> > > - Add svndump_init_fd to allow reading dumps from arbitrary FDs
> > > - Add git-remote-testsvn to Makefile
> > > - Implement a remote helper for svn in C
> > > (this branch is used by fa/vcs-svn.)
> > >
> > > A GSoC project.
> > > Waiting for comments from mentors and stakeholders.
> >
> > I have reviewed this topic and am happy with the design and
> > implementation.  I support this topic for inclusion.
> 
> Thanks!  I'll try moving the tests to the first patch and trying it
> and hopefully send out a branch to pull tomorrow.
> 
> If I don't send anything tomorrow, that's probably a sign that I never
> will, so since I like the goal of the series I guess it would be a
> kind of implied ack.

I seem to have missed the GSoC wrap-up conversation... (links happily
accepted).  Looking at the big picture (as much as I can remember) it
seems to me the missing pieces now are branch mapping (lots of hard
work), and possibly parts (all?) of the "push to SVN" functionality?

Thoughts?

Thanks,
Stephen


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread Nguyen Thai Ngoc Duy
On Thu, Oct 4, 2012 at 4:34 PM, Michael Haggerty  wrote:
> Given that there is no obvious interpretation for what a construct like
> "x**y" would mean, and many plausible guesses (most of which sound
> rather useless), I suggest that we forbid it.  This will make the
> feature easier to explain and make .gitignore files that use it easier
> to understand.

Yep, sounds like a good short term plan.

> As for the implementation, it is quite easy to textually convert a glob
> pattern, including "**" parts, into a regexp.

Or we could introduce regexp syntax as an alternative and let users
choose (and pay the associated price). Patterns starting with // never
match anything (we don't normalize paths in .gitignore). Any pattern
starting with "//regex:" is followed by a regex. Reject all other //
patterns for future use.
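
The prefix scheme proposed here could be sketched as a small classifier. The classify() function and its return labels are hypothetical and exist only to illustrate the three cases named above.

```python
REGEX_PREFIX = "//regex:"


def classify(line):
    """Sort a pattern line into one of the three cases of the proposal."""
    if line.startswith(REGEX_PREFIX):
        # Everything after the prefix is taken verbatim as the regex.
        return ("regex", line[len(REGEX_PREFIX):])
    if line.startswith("//"):
        # Any other '//' pattern is rejected and kept free for future use.
        return ("reserved", line)
    return ("glob", line)
```

Because a pattern starting with "//" cannot match anything useful as a glob, claiming that prefix would not change the meaning of existing working patterns.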

> _filename_char_pattern = r'[^/]'
> _glob_patterns = [
>     ('?', _filename_char_pattern),
>     ('/**', r'(/.+)?'),
>     ('**/', r'(.+/)?'),
>     ('*', _filename_char_pattern + r'*'),
>     ]

I don't fully understand the rest (never been a big fan of python) but
what about bracket expressions like [!abc] and [:alnum:]?
-- 
Duy


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread Michael Haggerty
On 10/03/2012 08:17 PM, Junio C Hamano wrote:
> Nguyen Thai Ngoc Duy  writes:
> 
>> There's an interesting case: "**foo". According to our rules, that
>> pattern does not contain slashes therefore is basename match. But some
>> might find that confusing because "**" can match slashes,...
> 
> By "our rules", if you mean "if a pattern has slash, it is anchored",
> that obviously needs to be updated with this series, if "**" is meant
> to match multiple hierarchies.
> 
>> I think the latter makes more sense. When users put "**" they expect
>> to match some slashes. But that may call for a refactoring in
>> path_matches() in attr.c. Putting strstr(pattern, "**") in that
>> matching function may increase overhead unnecessarily.
>>
>> The third option is just die() and let users decide either "*foo",
>> "**/foo" or "/**foo", never "**foo".
> 
> For the double-star at the beginning, you should just turn it into "**/"
> if it is not followed by a slash internally, I think.
> 
> What is the semantics of ** in the first place?  Is it described to
> a reasonable level of detail in the documentation updates?  For
> example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b",
> "a/bfoo/c"?  Does "x**y" match "xy", "xay", "xa/by", "x/a/y"?
> 
> I am guessing that the only sensible definition is that "**"
> requires anything that comes before it (if exists) is at a proper
> hierarchy boundary, and anything matches it is also at a proper
> hierarchy boundary, so "x**y" matches "x/a/y" and not "xy", "xay",
> nor "xa/by" in the above example.  If "x**y" can match "xy" or "xay"
> (or "**foo" can match "afoo"), it would be unreasonable to say it
> implies the pattern is anchored at any level, no?

Given that there is no obvious interpretation for what a construct like
"x**y" would mean, and many plausible guesses (most of which sound
rather useless), I suggest that we forbid it.  This will make the
feature easier to explain and make .gitignore files that use it easier
to understand.

I think that 98% of the usefulness of "**" would be in constructs where
it replaces a proper part of the pathname, like "**/SOMETHING" or
"SOMETHING/**/SOMETHING"; in other words, where its use matches the
regexp "(^|/)\*\*/".  In these constructs the only ambiguity is whether
"**/" matches regexp

"([^/]+/)+"

or

"([^/]+/)*"

(e.g., whether "foo/**/bar" matches "foo/bar").  I personally prefer the
second, because the first behavior can be had using the second
interpretation by using "SOMETHING/*/**/SOMETHING", whereas the second
behavior cannot be implemented in terms of the first in a single line of
the .gitignore file.
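
The argument above can be checked with the two candidate regexps directly. This is a minimal sketch: the sample paths are made up, and "*" is translated as "[^/]*" as an assumption about how a single star would be handled.

```python
import re

ONE_OR_MORE = r"(?:[^/]+/)+"    # first interpretation of "**/"
ZERO_OR_MORE = r"(?:[^/]+/)*"   # second interpretation of "**/"

paths = ["foo/bar", "foo/a/bar", "foo/a/b/bar"]


def matching(regexp):
    """Return the sample paths that the regexp matches in full."""
    return [p for p in paths if re.fullmatch(regexp, p)]


# "foo/**/bar" under each interpretation.
first = matching("foo/" + ONE_OR_MORE + "bar")
second = matching("foo/" + ZERO_OR_MORE + "bar")

# "foo/*/**/bar" under the second interpretation.
emulated = matching("foo/[^/]*/" + ZERO_OR_MORE + "bar")
```

On these paths, emulated equals first, while second additionally contains "foo/bar". That is the asymmetry: the zero-or-more reading can express the one-or-more reading, but not the other way round.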

Optionally, one might also like to support "SOMETHING/**" or "**" alone
in the obvious ways.
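
Forbidding every other use of "**" could be sketched as a validity check. The double_star_ok() helper is hypothetical; it accepts "**" only when it forms a whole path component, which covers "**/SOMETHING", "SOMETHING/**/SOMETHING", "SOMETHING/**" and "**" alone.

```python
import re

# "**" at the start or right after a slash, followed by a slash or the end.
_WHOLE_COMPONENT = re.compile(r"(?:^|(?<=/))\*\*(?=/|$)")


def double_star_ok(pattern):
    """True if every '**' in the pattern is a complete path component."""
    # Drop the allowed occurrences; any '**' left over is a forbidden use.
    return "**" not in _WHOLE_COMPONENT.sub("", pattern)
```

A pattern such as "x**y" or "**foo" fails this check, so it could be rejected with an error instead of being given a surprising meaning.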

As for the implementation, it is quite easy to textually convert a glob
pattern, including "**" parts, into a regexp.  I happen to have written
some Python code that does this for another project (see below).  An
obvious optimization would be to read any literal parts of the path off
the beginning of the glob pattern and only use regexps for the tail
part.  Would a regexp-based implementation be too slow?

Michael

import os
import re

_filename_char_pattern = r'[^/]'
_glob_patterns = [
    ('?', _filename_char_pattern),
    ('/**', r'(/.+)?'),
    ('**/', r'(.+/)?'),
    ('*', _filename_char_pattern + r'*'),
    ]


def glob_to_regexp(pattern):
    pattern = os.path.normpath(pattern) # remove trivial redundancies

    if pattern == '**':
        # This case has to be handled separately because it doesn't
        # involve a '/' character adjacent to the '**' pattern.  (Such
        # slashes otherwise have to be considered part of the pattern
        # to handle the matching of zero path components.)
        return re.compile(
            r'^' + _filename_char_pattern + r'(.+' +
            _filename_char_pattern + r')?$'
            )

    regexp = [r'^']
    i = 0
    while i < len(pattern):
        for (s, r) in _glob_patterns:
            if pattern.startswith(s, i):
                regexp.append(r)
                i += len(s)
                break
        else:
            # AFAIK it's a normal character.  Escape it and add it to
            # pattern.
            regexp.append(re.escape(pattern[i]))
            i += 1

    regexp.append(r'$')

    return re.compile(''.join(regexp))
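
The literal-prefix optimization mentioned above could look roughly like the following sketch. It is self-contained and does not reuse the code above: split_literal_prefix(), tail_to_regexp() and make_matcher() are hypothetical names, and the tail translator deliberately handles only "**/", "*" and "?".

```python
import re


def split_literal_prefix(pattern):
    """Split off leading components that contain no glob characters."""
    literal = []
    for part in pattern.split("/")[:-1]:   # never consume the last component
        if any(c in part for c in "*?["):
            break
        literal.append(part)
    prefix = "".join(p + "/" for p in literal)
    return prefix, pattern[len(prefix):]


def tail_to_regexp(tail):
    """Translate the remaining glob into a regexp (limited syntax only)."""
    out, i = ["^"], 0
    while i < len(tail):
        if tail.startswith("**/", i):
            out.append(r"(.+/)?")
            i += 3
        elif tail[i] == "*":
            out.append(r"[^/]*")
            i += 1
        elif tail[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(tail[i]))
            i += 1
    out.append("$")
    return re.compile("".join(out))


def make_matcher(pattern):
    prefix, tail = split_literal_prefix(pattern)
    tail_re = tail_to_regexp(tail)

    def matches(path):
        # Cheap string comparison first; the regexp only sees the tail.
        return (path.startswith(prefix)
                and tail_re.match(path[len(prefix):]) is not None)

    return matches
```

Paths outside the literal prefix are rejected by the startswith() test alone, so the regexp engine is only involved for paths that already sit in the right directory.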




-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/


Re: fa/remote-svn (Re: What's cooking in git.git (Oct 2012, #01; Tue, 2))

2012-10-04 Thread Jonathan Nieder
David Michael Barr wrote:
> On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: 

>> * fa/remote-svn (2012-09-19) 16 commits
>> - Add a test script for remote-svn
>> - remote-svn: add marks-file regeneration
>> - Add a svnrdump-simulator replaying a dump file for testing
>> - remote-svn: add incremental import
>> - remote-svn: Activate import/export-marks for fast-import
>> - Create a note for every imported commit containing svn metadata
>> - vcs-svn: add fast_export_note to create notes
>> - Allow reading svn dumps from files via file:// urls
>> - remote-svn, vcs-svn: Enable fetching to private refs
>> - When debug==1, start fast-import with "--stats" instead of "--quiet"
>> - Add documentation for the 'bidi-import' capability of remote-helpers
>> - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' 
>> capability
>> - Add argv_array_detach and argv_array_free_detached
>> - Add svndump_init_fd to allow reading dumps from arbitrary FDs
>> - Add git-remote-testsvn to Makefile
>> - Implement a remote helper for svn in C
>> (this branch is used by fa/vcs-svn.)
>>
>> A GSoC project.
>> Waiting for comments from mentors and stakeholders.
>
> I have reviewed this topic and am happy with the design and implementation.
> I support this topic for inclusion.

Thanks!  I'll try moving the tests to the first patch and trying it
and hopefully send out a branch to pull tomorrow.

If I don't send anything tomorrow, that's probably a sign that I never
will, so since I like the goal of the series I guess it would be a
kind of implied ack.

Jonathan


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread David Michael Barr

On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: 
> 
> * fa/remote-svn (2012-09-19) 16 commits
> - Add a test script for remote-svn
> - remote-svn: add marks-file regeneration
> - Add a svnrdump-simulator replaying a dump file for testing
> - remote-svn: add incremental import
> - remote-svn: Activate import/export-marks for fast-import
> - Create a note for every imported commit containing svn metadata
> - vcs-svn: add fast_export_note to create notes
> - Allow reading svn dumps from files via file:// urls
> - remote-svn, vcs-svn: Enable fetching to private refs
> - When debug==1, start fast-import with "--stats" instead of "--quiet"
> - Add documentation for the 'bidi-import' capability of remote-helpers
> - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' 
> capability
> - Add argv_array_detach and argv_array_free_detached
> - Add svndump_init_fd to allow reading dumps from arbitrary FDs
> - Add git-remote-testsvn to Makefile
> - Implement a remote helper for svn in C
> (this branch is used by fa/vcs-svn.)
> 
> A GSoC project.
> Waiting for comments from mentors and stakeholders.

I have reviewed this topic and am happy with the design and implementation.
I support this topic for inclusion.

Acked-by: David Michael Barr 
> 
> * fa/vcs-svn (2012-09-19) 4 commits
> - vcs-svn: remove repo_tree
> - vcs-svn/svndump: rewrite handle_node(), begin|end_revision()
> - vcs-svn/svndump: restructure node_ctx, rev_ctx handling
> - svndump: move struct definitions to .h
> (this branch uses fa/remote-svn.)
> 
> A GSoC project.
> Waiting for comments from mentors and stakeholders.

This follow-on topic I'm not so sure on, some of the design decisions make me 
uncomfortable and I need some convincing before I can get behind this topic. 

--
David Michael Barr



Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-03 Thread Junio C Hamano
Nguyen Thai Ngoc Duy  writes:

>> I am guessing that the only sensible definition is that "**"
>> requires anything that comes before it (if exists) is at a proper
>> hierarchy boundary, and anything matches it is also at a proper
>> hierarchy boundary, so "x**y" matches "x/a/y"
>
> and "x/y" too? (As opposed to "x/**/y" which does not)

Yeah, x**y would match x/y under that "sensible" semantics.

>> and not "xy", "xay",
>> nor "xa/by" in the above example.  If "x**y" can match "xy" or "xay"
>> (or "**foo" can match "afoo"), it would be unreasonable to say it
>> implies the pattern is anchored at any level, no?
>
> Yeah. That makes things easier to reason, though not exactly what we're 
> having.

It sounds like "x**y" with the code you imported would match
"xy" and "xa/b/cy", and I cannot think of a concise and good way to
describe what it does to the end users.

"matches anything including '/'" is not a useful description for the
purpose of allowing the user to intuitively understand why "x**y" is
anchored at the level (or is not anchored and can appear anywhere).
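
The difference between the two readings of a bare "**" can be made concrete with a small translator. This is a sketch under stated assumptions: to_regexp() and matches() are hypothetical names, only bare forms such as "x**y" and "**foo" are handled (not "x/**/y" or a trailing "**"), and "*" is assumed to stay within one component.

```python
import re


def to_regexp(pattern, boundary):
    """Translate a glob containing a bare '**' into a regexp string.

    boundary=False: '**' matches anything, including '/'.
    boundary=True:  '**' stands for a '/' plus zero or more whole
                    components (no leading '/' at the pattern start).
    """
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            if not boundary:
                out.append(r".*")
            elif i == 0:
                out.append(r"(?:[^/]+/)*")
            else:
                out.append(r"/(?:[^/]+/)*")
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def matches(pattern, path, boundary):
    return re.fullmatch(to_regexp(pattern, boundary), path) is not None
```

With boundary=False, "x**y" accepts "xy" and "xa/b/cy"; with boundary=True it accepts "x/a/y" and "x/y" but rejects "xy", "xay" and "xa/by".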

Perhaps the wildmatch code may not be what we want X-<.




Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-03 Thread Nguyen Thai Ngoc Duy
On Thu, Oct 4, 2012 at 1:17 AM, Junio C Hamano  wrote:
> For the double-star at the beginning, you should just turn it into "**/"
> if it is not followed by a slash internally, I think.
>
> What is the semantics of ** in the first place?  Is it described to
> a reasonable level of detail in the documentation updates?  For
> example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b",
> "a/bfoo/c"?  Does "x**y" match "xy", "xay", "xa/by", "x/a/y"?

It's basically what rsync describes: use ’**’ to match anything,
including slashes.

Reading rsync's man page again, I notice I missed two other rules related to **:

 - If the pattern contains a / (not counting a trailing /) or a "**",
then it is matched against the full pathname, including any leading
directories.  If the pattern doesn't contain a / or a "**", then
it is matched only against the final component of the filename.
(Remember that the algorithm is applied recursively so "full filename"
can actually be any portion of a path from the starting directory on
down.)

 - A trailing "dir_name/***" will match both the directory (as if
"dir_name/" had been specified) and everything in the directory (as if
"dir_name/**" had been specified).  This behavior was added in version
2.6.7.

From what you wrote, I think we'll go with the first rule. The second
rule looks irrelevant to what git's doing.
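
The first rule quoted above boils down to choosing what the pattern is compared against. A minimal sketch, with hypothetical helper names and no claim about how rsync or git implement it:

```python
import posixpath


def uses_full_pathname(pattern):
    """A '/' (other than a trailing one) or a '**' selects the full path."""
    return "/" in pattern.rstrip("/") or "**" in pattern


def match_subject(pattern, path):
    """Return the string the pattern should be matched against."""
    if uses_full_pathname(pattern):
        return path
    return posixpath.basename(path)
```

Under this rule "**foo" is compared against the whole path even though it contains no slash, which is exactly the case that the basename rule would otherwise get wrong.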

> I am guessing that the only sensible definition is that "**"
> requires anything that comes before it (if exists) is at a proper
> hierarchy boundary, and anything matches it is also at a proper
> hierarchy boundary, so "x**y" matches "x/a/y"

and "x/y" too? (As opposed to "x/**/y" which does not)

> and not "xy", "xay",
> nor "xa/by" in the above example.  If "x**y" can match "xy" or "xay"
> (or "**foo" can match "afoo"), it would be unreasonable to say it
> implies the pattern is anchored at any level, no?

Yeah. That makes things easier to reason, though not exactly what we're having.
-- 
Duy


Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-03 Thread Junio C Hamano
Nguyen Thai Ngoc Duy  writes:

> There's an interesting case: "**foo". According to our rules, that
> pattern does not contain slashes therefore is basename match. But some
> might find that confusing because "**" can match slashes,...

By "our rules", if you mean "if a pattern has slash, it is anchored",
that obviously needs to be updated with this series, if "**" is meant
to match multiple hierarchies.

> I think the latter makes more sense. When users put "**" they expect
> to match some slashes. But that may call for a refactoring in
> path_matches() in attr.c. Putting strstr(pattern, "**") in that
> matching function may increase overhead unnecessarily.
>
> The third option is just die() and let users decide either "*foo",
> "**/foo" or "/**foo", never "**foo".

For the double-star at the beginning, you should just turn it into "**/"
if it is not followed by a slash internally, I think.

What is the semantics of ** in the first place?  Is it described to
a reasonable level of detail in the documentation updates?  For
example does "**foo" match "afoo", "a/b/foo", "a/bfoo", "a/foo/b",
"a/bfoo/c"?  Does "x**y" match "xy", "xay", "xa/by", "x/a/y"?

I am guessing that the only sensible definition is that "**"
requires anything that comes before it (if exists) is at a proper
hierarchy boundary, and anything matches it is also at a proper
hierarchy boundary, so "x**y" matches "x/a/y" and not "xy", "xay",
nor "xa/by" in the above example.  If "x**y" can match "xy" or "xay"
(or "**foo" can match "afoo"), it would be unreasonable to say it
implies the pattern is anchored at any level, no?




Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-03 Thread Nguyen Thai Ngoc Duy
On Wed, Oct 3, 2012 at 6:20 AM, Junio C Hamano  wrote:
> * nd/wildmatch (2012-09-27) 5 commits
>  - Support "**" in .gitignore and .gitattributes patterns using wildmatch()
>  - Integrate wildmatch to git
>  - compat/wildmatch: fix case-insensitive matching
>  - compat/wildmatch: remove static variable force_lower_case
>  - Import wildmatch from rsync
>
>  Allows pathname patterns in .gitignore and .gitattributes files
>  with double-asterisks "foo/**/bar" to match any number of directory
>  hierarchies.
>
>  It was pointed out that some symbols that do not have to be global
>  are left global. I think this reroll fixed most of them.
>
>  Will merge to 'next'.

Just a recent finding, in case you want to postpone the merge.

There's an interesting case: "**foo". According to our rules, that
pattern does not contain slashes therefore is basename match. But some
might find that confusing because "**" can match slashes, as opposed
to ordinary wildcards which cannot. So we could either go with our
rules and consider "**" just like "*" in this case (do we need to
clarify the documentation?), or redefine it so that the presence of "**"
implies FNM_PATHNAME.

I think the latter makes more sense. When users put "**" they expect
to match some slashes. But that may call for a refactoring in
path_matches() in attr.c. Putting strstr(pattern, "**") in that
matching function may increase overhead unnecessarily.

The third option is just die() and let users decide either "*foo",
"**/foo" or "/**foo", never "**foo".
-- 
Duy