Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-04 Thread Sergey Organov
Hi Igor,

Igor Djordjevic  writes:

[...]

> Now, not to get misinterpreted to pick sides in "(re)create" vs 
> "rebase" merge commit discussion, I just think these two (should) have 
> a different purpose, and actually having both inside interactive rebase 
> is what we should be aiming for.

Yes, if the user has an existing merge that he intends to throw away by
re-merging from scratch, he should be given a way to do it during a
history editing session, no argument there.

What I argue against is making this mode of operation the default one,
let alone the only one.

> And that's what I think is important to understand before any further 
> discussion - _(re)creating_ existing merge commits is not the same as 
> _rebasing_ them, even though the former can sometimes be used to 
> achieve the latter.

Yes, indeed. Sometimes creating a new merge in place of the original does
the job of rebasing the original, but only by pure accident.

-- Sergey


Re: [RFC] Rebasing merges: a journey to the ultimate solution (Road Clear)

2018-03-04 Thread Sergey Organov
Hi Phillip and Igor,

Igor Djordjevic  writes:
> Hi Phillip,
>
> On 02/03/2018 12:31, Phillip Wood wrote:
>> 
>> > Thinking about it overnight, I now suspect that original proposal had a
>> > mistake in the final merge step. I think that what you did is a way to
>> > fix it, and I want to try to figure what exactly was wrong in the
>> > original proposal and to find simpler way of doing it right.
>> >
>> > The likely solution is to use original UM as a merge-base for final
>> > 3-way merge of U1' and U2', but I'm not sure yet. Sounds pretty natural
>> > though, as that's exactly UM from which both U1' and U2' have diverged
>> > due to rebasing and other history editing.
>> 
>> Hi Sergey, I've been following this discussion from the sidelines,
>> though I haven't had time to study all the posts in this thread in
>> detail. I wonder if it would be helpful to think of rebasing a merge as
>> merging the changes in the parents due to the rebase back into the
>> original merge. So for a merge M with parents A B C that are rebased to
>> A' B' C' the rebased merge M' would be constructed by (ignoring shell
>> quoting issues)
>> 
>> git checkout --detach M
>> git merge-recursive A -- M A'
>> tree=$(git write-tree)
>> git merge-recursive B -- $tree B'
>> tree=$(git write-tree)
>> git merge-recursive C -- $tree C'
>> tree=$(git write-tree)
>> M'=$(git log --pretty=%B -1 M | git commit-tree -pA' -pB' -pC')
>> 
>> This should pull in all the changes from the parents while preserving
>> any evil conflict resolution in the original merge. It superficially
>> reminds me of incremental merging [1] but it's so long since I looked at
>> that I'm not sure if there are any significant similarities.
>> 
>> [1] https://github.com/mhagger/git-imerge
>
> Interesting, from quick test[3], this seems to produce the same 
> result as that other test I previously provided[2], where temporary 
> commits U1' and U2' are finally merged with original M as a base :)

Looks like a sound approach, and it would be interesting to know whether
these two methods do in fact always give the same result. If we look
closely at the (now fixed) original approach, it also just gathers the
changes in the merge parents into U1' and U2', then merges those changes
back into the original M (=U1=U2=UM).

Overall, this one looks like another implementation of essentially the
same method, and it confirms that we are all thinking in the right
direction here.

>
> Just that this looks like even more straight-forward approach...?
>
> The only thing I wonder about here is how would we check if the 
> "rebased" merge M' was "clean", or should we stop for user amendment? 
> With that other approach Sergey described, we have U1'==U2' to test
> with.

That's an advantage of the original, yes.

-- Sergey


Re: [PATCH] git.el: handle default excludesfile properly

2018-03-04 Thread Junio C Hamano
Dorab Patel  writes:

> Looking deeper into how the function git-get-exclude-files is used, I
> see that it is only being called from git-run-ls-files-with-excludes.
> So, perhaps, a better (or additional) fix might be to add the
> parameter "--exclude-standard" in the call to git-run-ls-files from
> within git-run-ls-files-with-excludes. And remove the need for
> git-get-exclude-files altogether.

It is absolutely the right thing to depend on --exclude-standard, I
would think, so that we do not have to worry about details like XDG
paths and such.  Thanks for working that out between both of you.

Having said that, I am sorry to say that I am not sure if the copy
we have is the one to be patched, so I would appreciate it if Alexandre
(cc'ed) could clarify the situation.  The last change done to our copy
of the script is from 2012, and I do not know which of these is the
case: Alexandre is still taking care of it here, but the script is so
perfect that there was no need to update it for the past 5 years and so
we haven't seen an update; or the canonical copy is now being
maintained elsewhere and we only have a stale copy; or something else.

Thanks.


Re: [RFC PATCH] color: respect the $NO_COLOR convention

2018-03-04 Thread Junio C Hamano
"brian m. carlson"  writes:

> As a note, turning off color can improve accessibility for some people.
> I have a co-worker who has deuteranomaly and virtually all colored text
> at the terminal poses readability problems.  It would be beneficial if
> he could just set NO_COLOR=1 in his environment and have everything just
> work.
>
> For this reason, I'm in favor of taking this patch, assuming it comes
> with tests.

Oh, I agree 100% that the world would be a better place if there were
already an established way to turn off all colors, instead of having
to run around setting tool-specific configuration like LS_COLORS
etc. for the 42 different tools one uses in one's daily life.  I
just am not getting the feeling that this no-color.org effort is the
one.  We already have a way specific to our project (i.e.
configuration variables), so if we adopt NO_COLOR but other people
do not universally support it (and they support something else),
we'd end up having to maintain, forever, yet another knob that only a
handful of projects understand, and that is where my reluctance
comes from.
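
For illustration only, honoring the convention in a script could look
roughly like the sketch below. It assumes that any non-empty NO_COLOR
value means "no color"; the helper names use_color and say_error are
made up for this example and are not part of Git or of the proposed
patch.

```shell
# Hypothetical helper: succeed (exit 0) when colored output is wanted.
# Assumption: any non-empty NO_COLOR value disables color.
use_color() {
	[ -z "${NO_COLOR:-}" ]
}

# Print a message in red, but only when color is wanted.
say_error() {
	if use_color
	then
		printf '\033[31m%s\033[0m\n' "$1"
	else
		printf '%s\n' "$1"
	fi
}
```

With NO_COLOR=1 exported in the environment, say_error prints the plain
message with no escape sequences around it.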


Re: [PATCH 2/2] git-svn: allow empty email-address in authors-prog and authors-file

2018-03-04 Thread Eric Sunshine
On Sun, Mar 4, 2018 at 6:22 AM, Andreas Heiduk  wrote:
> The email address in --authors-file and --authors-prog can be empty but
> git-svn translated it into a synthetic email address in the form
> $USERNAME@$REPO_UUID. Now git-svn behaves like git-commit: If the email
> is explicitly set to the empty string, the commit does not contain
> an email address.

Falling back to "$name@$uuid" was clearly an intentional choice, so
this seems like a rather significant change of behavior. How likely is
it that users or scripts relying upon the existing behavior will break
with this change? If the likelihood is high, should this behavior be
opt-in?

Doesn't such a behavior change deserve documentation (and possibly tests)?

> Signed-off-by: Andreas Heiduk 
> ---
> diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
> @@ -1482,7 +1482,6 @@ sub call_authors_prog {
> }
> if ($author =~ /^\s*(.+?)\s*<(.*)>\s*$/) {
> my ($name, $email) = ($1, $2);
> -   $email = undef if length $2 == 0;
> return [$name, $email];

Mental note: existing behavior intentionally makes $email undefined if
not present in $author; revised behavior leaves it defined.

> } else {
> die "Author: $orig_author: $::_authors_prog returned "
> @@ -2020,8 +2019,8 @@ sub make_log_entry {
> remove_username($full_url);
> $log_entry{metadata} = "$full_url\@$r $uuid";
> $log_entry{svm_revision} = $r;
> -   $email ||= "$author\@$uuid";
> -   $commit_email ||= "$author\@$uuid";
> +   $email //= "$author\@$uuid";
> +   $commit_email //= "$author\@$uuid";

With the revised behavior (above), $email is unconditionally defined,
so these //= expressions will _never_ assign "$author\@$uuid" to
$email. Am I reading that correctly? If so, then isn't this now just
dead code? Wouldn't it be clearer to remove these lines altogether?

I see from reading the code that there is an "if (!defined $email)"
earlier in the function which becomes misleading with this change. I'd
have expected the patch to modify that, as well.

Also, the Perl codebase in this project is still at 5.8, whereas the
// operator (and //=) didn't become available until Perl 5.10.
(However, there has lately been some talk[1] about bumping the minimum
Perl version to 5.10.)

[1]: https://public-inbox.org/git/20171223174400.26668-1-ava...@gmail.com/
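
The difference between `||=` and `//=` discussed above has a close
analogue in shell parameter expansion, sketched here purely as an
illustration (this is not git-svn's code, and the variable values are
invented). `${email:-...}` falls back when the variable is unset or
empty, roughly like `||=` (which additionally treats "0" as false in
Perl), while `${email-...}` falls back only when the variable is unset,
like `//=`.

```shell
# Illustration only; the names and values below are hypothetical.
author=jdoe
uuid=1234-abcd

# Roughly like Perl's `||=`: fall back when email is unset OR empty.
fallback_if_false() {
	printf '%s\n' "${email:-${author}@${uuid}}"
}

# Like Perl's `//=`: fall back only when email is unset.
fallback_if_undefined() {
	printf '%s\n' "${email-${author}@${uuid}}"
}
```

With email set to the empty string, the first function prints the
synthetic address and the second prints an empty line, which is the
behavior change the patch is after.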

> } elsif ($self->use_svnsync_props) {
> my $full_url = canonicalize_url(
> add_path_to_url( $self->svnsync->{url}, $self->path )
> @@ -2029,15 +2028,15 @@ sub make_log_entry {
> remove_username($full_url);
> my $uuid = $self->svnsync->{uuid};
> $log_entry{metadata} = "$full_url\@$rev $uuid";
> -   $email ||= "$author\@$uuid";
> -   $commit_email ||= "$author\@$uuid";
> +   $email //= "$author\@$uuid";
> +   $commit_email //= "$author\@$uuid";
> } else {
> my $url = $self->metadata_url;
> remove_username($url);
> my $uuid = $self->rewrite_uuid || $self->ra->get_uuid;
> $log_entry{metadata} = "$url\@$rev " . $uuid;
> -   $email ||= "$author\@" . $uuid;
> -   $commit_email ||= "$author\@" . $uuid;
> +   $email //= "$author\@" . $uuid;
> +   $commit_email //= "$author\@" . $uuid;
> }
> $log_entry{name} = $name;
> $log_entry{email} = $email;


Re: [PATCH 1/2] git-svn: search --authors-prog in PATH too

2018-03-04 Thread Eric Sunshine
On Sun, Mar 4, 2018 at 6:22 AM, Andreas Heiduk  wrote:
> In 36db1eddf9 ("git-svn: add --authors-prog option", 2009-05-14) the path
> to authors-prog was made absolute because git-svn changes the current
> directoy in some situations. This makes sense if the program is part of

s/directoy/directory/

> the repository but prevents searching via $PATH.
>
> The old behaviour is still retained, but if the file does not exists, then
> authors-prog is search in $PATH as any other command.

s/search/searched for/

> Signed-off-by: Andreas Heiduk 
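
The lookup order described in the commit message can be sketched in
shell as follows. This is only an approximation of the idea, not the
Perl code from the patch; resolve_authors_prog is a hypothetical name.

```shell
# Hypothetical helper: print the command to run for a given
# authors-prog value. Assumption: a value that names an existing file
# is made absolute (the old behaviour); anything else is returned
# unchanged so that it is resolved via $PATH when it is executed.
resolve_authors_prog() {
	prog=$1
	if [ -f "$prog" ]
	then
		case $prog in
		/*) printf '%s\n' "$prog" ;;
		*) printf '%s\n' "$(pwd)/$prog" ;;
		esac
	else
		printf '%s\n' "$prog"
	fi
}
```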


Re: [RFC PATCH] color: respect the $NO_COLOR convention

2018-03-04 Thread brian m. carlson
On Thu, Mar 01, 2018 at 11:06:45AM -0800, Junio C Hamano wrote:
> Leah Neukirchen  writes:
> 
> > You are right in calling this out an emerging new thing, but the
> > second list of that page proves that it will be useful to settle on a
> > common configuration, and my hope is by getting a few popular projects
> > on board, others will soon follow.  It certainly is easy to implement,
> > and rather unintrusive.  Users which don't know about this feature are
> > completely unaffected.
> 
> There certainly is chicken-and-egg problem there.  Even though I
> personally prefer not to see overuse of colors, I am not sure if
> we the Git community as a whole would want to be involved until it
> gets mainstream.

As a note, turning off color can improve accessibility for some people.
I have a co-worker who has deuteranomaly and virtually all colored text
at the terminal poses readability problems.  It would be beneficial if
he could just set NO_COLOR=1 in his environment and have everything just
work.

For this reason, I'm in favor of taking this patch, assuming it comes
with tests.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204




[PATCH v9 7/8] convert: add tracing for 'working-tree-encoding' attribute

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.
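
Independently of the trace facility added by this patch, the raw bytes
of a working tree file can be inspected with standard tools when
debugging an encoding problem. The sketch below is one way to do that;
dump_bytes is a hypothetical helper name, not something this series
adds.

```shell
# Hypothetical helper: print the bytes of a file as hex so that a BOM
# or an unexpected encoding is easy to spot.
dump_bytes() {
	od -An -v -tx1 "$1"
}
```

For a UTF-16LE file with a BOM, the dump would start with `ff fe`. With
this patch applied, setting GIT_TRACE_WORKING_TREE_ENCODING=1 (as the
test script in the diff does) should show similar information from
inside Git.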

Signed-off-by: Lars Schneider 
---
 convert.c                        | 25 +
 t/t0028-working-tree-encoding.sh |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/convert.c b/convert.c
index 9647b06679..eec34a94b9 100644
--- a/convert.c
+++ b/convert.c
@@ -313,6 +313,29 @@ static int validate_encoding(const char *path, const char *enc,
return 0;
 }
 
+static void trace_encoding(const char *context, const char *path,
+  const char *encoding, const char *buf, size_t len)
+{
+   static struct trace_key coe = TRACE_KEY_INIT(WORKING_TREE_ENCODING);
+   struct strbuf trace = STRBUF_INIT;
+   int i;
+
+   strbuf_addf(&trace, "%s (%s, considered %s):\n", context, path, encoding);
+   for (i = 0; i < len && buf; ++i) {
+   strbuf_addf(
+   &trace,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   i,
+   (unsigned char) buf[i],
+   (buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
+   ((i+1) % 8 && (i+1) < len ? ' ' : '\n')
+   );
+   }
+   strbuf_addchars(&trace, '\n', 1);
+
+   trace_strbuf(&coe, &trace);
+   strbuf_release(&trace);
+}
+
 static const char *default_encoding = "UTF-8";
 
 static int encode_to_git(const char *path, const char *src, size_t src_len,
@@ -341,6 +364,7 @@ static int encode_to_git(const char *path, const char *src, size_t src_len,
if (validate_encoding(path, enc, src, src_len, die_on_error))
return 0;
 
+   trace_encoding("source", path, enc, src, src_len);
dst = reencode_string_len(src, src_len, default_encoding, enc,
  &dst_len);
if (!dst) {
@@ -358,6 +382,7 @@ static int encode_to_git(const char *path, const char *src, size_t src_len,
return 0;
}
}
+   trace_encoding("destination", path, default_encoding, dst, dst_len);
 
strbuf_attach(buf, dst, dst_len, dst_len + 1);
return 1;
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index 5c7e36a164..bdc487b44f 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -4,6 +4,8 @@ test_description='working-tree-encoding conversion via gitattributes'
 
 . ./test-lib.sh
 
+GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING
+
 test_expect_success 'setup test files' '
git config core.eol lf &&
 
-- 
2.16.2



[PATCH v9 6/8] convert: check for detectable errors in UTF encodings

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.
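
As a rough illustration of the kind of check involved (the functions in
the patch are written in C and may differ in detail), a BOM at the start
of a file can be detected from the shell as sketched below. detect_bom
is a hypothetical helper, not part of this series.

```shell
# Hypothetical helper: print which Unicode BOM, if any, starts a file.
# UTF-32 is checked before UTF-16 because the UTF-32LE BOM
# (ff fe 00 00) begins with the UTF-16LE BOM (ff fe). A UTF-16LE file
# whose first character after the BOM is U+0000 would therefore be
# reported as UTF-32LE; this sketch accepts that ambiguity.
detect_bom() {
	first=$(od -An -v -tx1 -N4 "$1" | tr -d ' \n')
	case $first in
	0000feff) echo UTF-32BE ;;
	fffe0000) echo UTF-32LE ;;
	feff*) echo UTF-16BE ;;
	fffe*) echo UTF-16LE ;;
	*) echo none ;;
	esac
}
```

Under the rule in this patch, a file declared as UTF-16BE or UTF-16LE
for which such a check reports a BOM would be rejected.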

Signed-off-by: Lars Schneider 
---
 convert.c                        | 50 +++
 t/t0028-working-tree-encoding.sh | 57 
 2 files changed, 107 insertions(+)

diff --git a/convert.c b/convert.c
index 2f864df258..9647b06679 100644
--- a/convert.c
+++ b/convert.c
@@ -266,6 +266,53 @@ static int will_convert_lf_to_crlf(size_t len, struct text_stat *stats,
 
 }
 
+static int validate_encoding(const char *path, const char *enc,
+ const char *data, size_t len, int die_on_error)
+{
+   if (!memcmp("UTF-", enc, 4)) {
+   /*
+* Check for detectable errors in UTF encodings
+*/
+   if (has_prohibited_utf_bom(enc, data, len)) {
+   const char *error_msg = _(
+   "BOM is prohibited in '%s' if encoded as %s");
+   /*
+* This advice is shown for UTF-??BE and UTF-??LE
+* encodings. We truncate the encoding name to 6
+* chars with %.6s to cut off the last two "byte
+* order" characters.
+*/
+   const char *advise_msg = _(
+   "The file '%s' contains a byte order "
+   "mark (BOM). Please use %.6s as "
+   "working-tree-encoding.");
+   advise(advise_msg, path, enc);
+   if (die_on_error)
+   die(error_msg, path, enc);
+   else {
+   return error(error_msg, path, enc);
+   }
+
+   } else if (is_missing_required_utf_bom(enc, data, len)) {
+   const char *error_msg = _(
+   "BOM is required in '%s' if encoded as %s");
+   const char *advise_msg = _(
+   "The file '%s' is missing a byte order "
+   "mark (BOM). Please use %sBE or %sLE "
+   "(depending on the byte order) as "
+   "working-tree-encoding.");
+   advise(advise_msg, path, enc, enc);
+   if (die_on_error)
+   die(error_msg, path, enc);
+   else {
+   return error(error_msg, path, enc);
+   }
+   }
+
+   }
+   return 0;
+}
+
 static const char *default_encoding = "UTF-8";
 
 static int encode_to_git(const char *path, const char *src, size_t src_len,
@@ -291,6 +338,9 @@ static int encode_to_git(const char *path, const char *src, size_t src_len,
if (!buf && !src)
return 1;
 
+   if (validate_encoding(path, enc, src, src_len, die_on_error))
+   return 0;
+
dst = reencode_string_len(src, src_len, default_encoding, enc,
  &dst_len);
if (!dst) {
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index 71e8e3700b..5c7e36a164 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -62,6 +62,46 @@ test_expect_success 'check $GIT_DIR/info/attributes support' '
 
 for i in 16 32
 do
+   test_expect_success "check prohibited UTF-${i} BOM" '
+   test_when_finished "git reset --hard HEAD" &&
+
+   echo "*.utf${i}be text working-tree-encoding=utf-${i}be" >>.gitattributes &&
+   echo "*.utf${i}le text working-tree-encoding=utf-${i}le" >>.gitattributes &&
+
+   # Here we add a UTF-16 (resp. UTF-32) files with BOM (big/little-endian)
+   # but we tell Git to treat it as UTF-16BE/UTF-16LE (resp. UTF-32).
+   # In these cases the BOM is prohibited.
+   cp bebom.utf${i}be.raw bebom.utf${i}be &&
+   test_must_fail git add bebom.utf${i}be 2>err.out &&
+   test_i18ngrep "fatal: BOM is prohibited .* UTF-${i}BE" err.out &&
+
+   cp lebom.utf${i}le.raw lebom.utf${i}be &&
+   test_must_fail git add lebom.utf${i}be 2>err.out &&
+   test_i18ngrep "fatal: BOM is prohibited .* UTF-${i}BE" err.out &&
+
+   cp bebom.utf${i}be.raw bebom.utf${i}le &&
+   test_must_fail git add bebom.utf${i}le 2>err.out &&
+   test_i18ngrep "fatal: BOM is prohibited .* UTF-${i}LE" err.out &&
+
+   cp lebom.utf${i}le.raw lebom.utf${i}le &&
+   test_must_fail git add lebom.utf${i}le 2>err.out &&
+   test_i18ngrep "fatal: BOM is prohibited .* UTF-${i}LE" err.out
+   '
+

[PATCH v9 8/8] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-04 Thread lars . schneider
From: Lars Schneider 

UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are known to have round trip issues [1].
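
To make the notion of a round trip concrete, the sketch below converts a
file to UTF-8 and back with the iconv command line tool and compares the
result with the original bytes. It only illustrates the idea; the check
in this patch is done in C inside Git, and roundtrips_cleanly is a
hypothetical helper name.

```shell
# Hypothetical helper: succeed when converting file $2 from encoding $1
# to UTF-8 and back yields exactly the original bytes.
# Assumption: an iconv binary that knows encoding $1 is installed.
roundtrips_cleanly() {
	enc=$1
	file=$2
	iconv -f "$enc" -t UTF-8 "$file" |
	iconv -f UTF-8 -t "$enc" |
	cmp -s - "$file"
}
```

A call such as `roundtrips_cleanly SHIFT-JIS foo.txt` would then report,
via its exit status, whether that particular file survives the
conversion unchanged.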

Add 'core.checkRoundtripEncoding', which contains a comma separated
list of encodings, to define for what encodings Git should check the
conversion round trip if they are used in the 'working-tree-encoding'
attribute.

Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'.
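
Matching an encoding against such a list needs to compare whole entries,
not substrings (otherwise "UTF-1" would match a list containing
"UTF-16"). A minimal shell sketch of that matching rule follows; it
assumes entries are separated by commas and/or spaces and compared
case-insensitively, and in_roundtrip_list is a hypothetical name.

```shell
# Hypothetical helper: succeed when encoding $1 appears as a whole
# entry in the comma and/or space separated list $2.
# Assumption: entries contain no shell glob characters.
in_roundtrip_list() {
	want=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
	list=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]' | tr ',' ' ')
	for enc in $list
	do
		[ "$enc" = "$want" ] && return 0
	done
	return 1
}
```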

[1] https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode

Signed-off-by: Lars Schneider 
---
 Documentation/config.txt |  6 
 Documentation/gitattributes.txt  |  8 +
 config.c |  5 +++
 convert.c| 78 
 convert.h|  1 +
 environment.c|  1 +
 t/t0028-working-tree-encoding.sh | 39 
 7 files changed, 138 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e25b2c92b..d7a56054a5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -530,6 +530,12 @@ core.autocrlf::
This variable can be set to 'input',
in which case no output conversion is performed.
 
+core.checkRoundtripEncoding::
+   A comma separated list of encodings that Git performs UTF-8 round
+   trip checks on if they are used in an `working-tree-encoding`
+   attribute (see linkgit:gitattributes[5]). The default value is
+   `SHIFT-JIS`.
+
 core.symlinks::
If false, symbolic links are checked out as small plain files that
contain the link text. linkgit:git-update-index[1] and
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 31a4f92840..aa3deae392 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -312,6 +312,14 @@ number of pitfalls:
   internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
   That operation will fail and cause an error.
 
+- Reencoding content to non-UTF encodings can cause errors as the
+  conversion might not be UTF-8 round trip safe. If you suspect your
+  encoding to not be round trip safe, then add it to
+  `core.checkRoundtripEncoding` to make Git check the round trip
+  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
+  set) is known to have round trip issues with UTF-8 and is checked by
+  default.
+
 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').
 
diff --git a/config.c b/config.c
index 1f003fbb90..d0ada9fcd4 100644
--- a/config.c
+++ b/config.c
@@ -1172,6 +1172,11 @@ static int git_default_core_config(const char *var, const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.checkroundtripencoding")) {
+   check_roundtrip_encoding = xstrdup(value);
+   return 0;
+   }
+
if (!strcmp(var, "core.notesref")) {
notes_ref_name = xstrdup(value);
return 0;
diff --git a/convert.c b/convert.c
index eec34a94b9..6cbb2b2618 100644
--- a/convert.c
+++ b/convert.c
@@ -336,6 +336,43 @@ static void trace_encoding(const char *context, const char *path,
strbuf_release(&trace);
 }
 
+static int check_roundtrip(const char* enc_name)
+{
+   /*
+* check_roundtrip_encoding contains a string of space and/or
+* comma separated encodings (eg. "UTF-16, ASCII, CP1125").
+* Search for the given encoding in that string.
+*/
+   const char *found = strcasestr(check_roundtrip_encoding, enc_name);
+   const char *next;
+   int len;
+   if (!found)
+   return 0;
+   next = found + strlen(enc_name);
+   len = strlen(check_roundtrip_encoding);
+   return (found && (
+   /*
+* check that the found encoding is at the
+* beginning of check_roundtrip_encoding or
+* that it is prefixed with a space or comma
+*/
+   found == check_roundtrip_encoding || (
+   found > check_roundtrip_encoding &&
+   (*(found-1) == ' ' || *(found-1) == ',')
+   )
+   ) && (
+   /*
+* check that the found encoding is at the
+* end of check_roundtrip_encoding or
+* that it is suffixed with a space or comma
+*/
+   next == check_roundtrip_encoding + len || (
+   next < check_roundtrip_encoding + len &&
+

[PATCH v9 5/8] convert: add 'working-tree-encoding' attribute

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as most Git web front ends do not
visualize the content.

Add an attribute to tell Git what encoding the user has defined for a
given file. If the content is added to the index, then Git converts the
content to a canonical UTF-8 representation. On checkout Git will
reverse the conversion.

Signed-off-by: Lars Schneider 
---
 Documentation/gitattributes.txt  |  80 +++
 convert.c| 114 -
 convert.h|   1 +
 sha1_file.c  |   2 +-
 t/t0028-working-tree-encoding.sh | 135 +++
 5 files changed, 330 insertions(+), 2 deletions(-)
 create mode 100755 t/t0028-working-tree-encoding.sh

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 30687de81a..31a4f92840 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -272,6 +272,86 @@ few exceptions.  Even though...
   catch potential problems early, safety triggers.
 
 
+`working-tree-encoding`
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Git recognizes files encoded in ASCII or one of its supersets (e.g.
+UTF-8, ISO-8859-1, ...) as text files. Files encoded in certain other
+encodings (e.g. UTF-16) are interpreted as binary and consequently
+built-in Git text processing tools (e.g. 'git diff') as well as most Git
+web front ends do not visualize the contents of these files by default.
+
+In these cases you can tell Git the encoding of a file in the working
+directory with the `working-tree-encoding` attribute. If a file with this
+attribute is added to Git, then Git reencodes the content from the
+specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
+content in its internal data structure (called "the index"). On checkout
+the content is reencoded back to the specified encoding.
+
+Please note that using the `working-tree-encoding` attribute may have a
+number of pitfalls:
+
+- Alternative Git implementations (e.g. JGit or libgit2) and older Git
+  versions (as of March 2018) do not support the `working-tree-encoding`
+  attribute. If you decide to use the `working-tree-encoding` attribute
+  in your repository, then it is strongly recommended to ensure that all
+  clients working with the repository support it.
+
+  For example, Microsoft Visual Studio resources files (`*.rc`) or
+  PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16.
+  If you declare `*.ps1` as files as UTF-16 and you add `foo.ps1` with
+  a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
+  stored as UTF-8 internally. A client without `working-tree-encoding`
+  support will checkout `foo.ps1` as UTF-8 encoded file. This will
+  typically cause trouble for the users of this file.
+
+  If a Git client, that does not support the `working-tree-encoding`
+  attribute, adds a new file `bar.ps1`, then `bar.ps1` will be
+  stored "as-is" internally (in this example probably as UTF-16).
+  A client with `working-tree-encoding` support will interpret the
+  internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
+  That operation will fail and cause an error.
+
+- Reencoding content requires resources that might slow down certain
+  Git operations (e.g 'git checkout' or 'git add').
+
+Use the `working-tree-encoding` attribute only if you cannot store a file
+in UTF-8 encoding and if you want Git to be able to process the content
+as text.
+
+As an example, use the following attributes if your '*.ps1' files are
+UTF-16 encoded with byte order mark (BOM) and you want Git to perform
+automatic line ending conversion based on your platform.
+
+------------------------
+*.ps1  text working-tree-encoding=UTF-16
+------------------------
+
+Use the following attributes if your '*.ps1' files are UTF-16 little
+endian encoded without BOM and you want Git to use Windows line endings
+in the working directory. Please note, it is highly recommended to
+explicitly define the line endings with `eol` if the `working-tree-encoding`
+attribute is used to avoid ambiguity.
+
+------------------------
+*.ps1  text working-tree-encoding=UTF-16LE eol=CRLF
+------------------------
+
+You can get a list of all available encodings on your platform with the
+following command:
+
+------------------------
+iconv --list
+------------------------
+
+If you do not know the encoding of a file, then you can use the `file`
+command to guess the encoding:
+
+------------------------
+file foo.ps1
+------------------------
+
+
 `ident`
 ^^^^^^^
 
diff --git a/convert.c b/convert.c
index b976eb968c..2f864df258 100644
--- a/convert.c
+++ b/convert.c
@@ -7,6 +7,7 

[PATCH v9 0/8] convert: add support for different encodings

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Hi,

Patches 1-4 and 7 are preparation and helper functions.
Patches 5, 6, and 8 are the actual change.

This series depends on Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is
already in master.

Changes since v8:

* move UTF BOM error checks into a new dedicated function
  validate_encoding() and into a dedicated commit (6)
* remove unnecessary encoding struct (became a plain char*)
* fail early and do not try to reencode the content in case of a
  validation error
* return early if roundtrip encoding was not found (avoid undefined
  pointer arithmetic)
* fix wrong argument order in encode_to_worktree() error message
* use test_when_finished to clean up tests
* move UTF-16/32 BOM test file generation into "setup test"
* reduce code duplication in tests
* improve documentation:
  - use *.rc and *.ps1 as examples as they are usually UTF-16 encoded
* fix comment: s/advise/advice/

Thanks a lot Eric for your great review! I think I fixed everything you
objected to, with one exception. You noticed that the current code only
checks for BOMs corresponding to the declared size (16 or 32 bits) [1].
I understand your point of view and I agree that any BOM in these cases
is *most likely* an error. However, according to the spec it might
still be valid. The comments on my related question on StackOverflow
seem to support that view [2]. Therefore, I would like to leave it as
it is in this series. If it turns out to be a problem in practice, then
I am happy to change it later. OK for you?


Thanks,
Lars

[1] https://public-inbox.org/git/df6f3855-efe7-4c13-aa53-819aae0de...@gmail.com/
[2] https://stackoverflow.com/questions/49038872/is-a-utf-32be-bom-valid-in-an-utf-16le-declared-data-stream


  RFC: https://public-inbox.org/git/bdb9b884-6d17-4be3-a83c-f67e2afa2...@gmail.com/
   v1: https://public-inbox.org/git/20171211155023.1405-1-lars.schnei...@autodesk.com/
   v2: https://public-inbox.org/git/2017122915.39680-1-lars.schnei...@autodesk.com/
   v3: https://public-inbox.org/git/20180106004808.77513-1-lars.schnei...@autodesk.com/
   v4: https://public-inbox.org/git/20180120152418.52859-1-lars.schnei...@autodesk.com/
   v5: https://public-inbox.org/git/20180129201855.9182-1-tbo...@web.de/
   v6: https://public-inbox.org/git/20180209132830.55385-1-lars.schnei...@autodesk.com/
   v7: https://public-inbox.org/git/20180215152711.158-1-lars.schnei...@autodesk.com/
   v8: https://public-inbox.org/git/20180224162801.98860-1-lars.schnei...@autodesk.com/


Base Ref:
Web-Diff: https://github.com/larsxschneider/git/commit/fdf0d63188
Checkout: git fetch https://github.com/larsxschneider/git encoding-v9 && git checkout fdf0d63188


### Interdiff (v8..v9):

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 11315054f4..aa3deae392 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -297,14 +297,16 @@ number of pitfalls:
   in your repository, then it is strongly recommended to ensure that all
   clients working with the repository support it.

-  If you declare `*.proj` files as UTF-16 and you add `foo.proj` with an
-  `working-tree-encoding` enabled Git client, then `foo.proj` will be
+  For example, Microsoft Visual Studio resources files (`*.rc`) or
+  PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16.
+  If you declare `*.ps1` as files as UTF-16 and you add `foo.ps1` with
+  a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
   stored as UTF-8 internally. A client without `working-tree-encoding`
-  support will checkout `foo.proj` as UTF-8 encoded file. This will
+  support will checkout `foo.ps1` as UTF-8 encoded file. This will
   typically cause trouble for the users of this file.

   If a Git client, that does not support the `working-tree-encoding`
-  attribute, adds a new file `bar.proj`, then `bar.proj` will be
+  attribute, adds a new file `bar.ps1`, then `bar.ps1` will be
   stored "as-is" internally (in this example probably as UTF-16).
   A client with `working-tree-encoding` support will interpret the
   internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
@@ -325,22 +327,22 @@ Use the `working-tree-encoding` attribute only if you cannot store a file
 in UTF-8 encoding and if you want Git to be able to process the content
 as text.

-As an example, use the following attributes if your '*.proj' files are
+As an example, use the following attributes if your '*.ps1' files are
 UTF-16 encoded with byte order mark (BOM) and you want Git to perform
 automatic line ending conversion based on your platform.

 
-*.proj text working-tree-encoding=UTF-16
+*.ps1  text working-tree-encoding=UTF-16
 

-Use the following attributes if your '*.proj' files are UTF-16 little
+Use the following attributes if your '*.ps1' files are UTF-16 little
 

[PATCH v9 1/8] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.

Remove the unnecessary assignment.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index 1df674e919..55b7daeb35 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -781,7 +781,6 @@ char *xstrdup_tolower(const char *string)
    result = xmallocz(len);
    for (i = 0; i < len; i++)
        result[i] = tolower(string[i]);
-   result[i] = '\0';
    return result;
 }
 
-- 
2.16.2
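
Editor's note: the reasoning in this patch can be illustrated with a minimal, self-contained sketch. It assumes only what the commit message states, namely that the allocator already stores a NUL one byte past the requested size. `sketch_mallocz()` and `sketch_strdup_tolower()` are hypothetical stand-ins written for this note, not Git's actual `xmallocz()` or `xstrdup_tolower()`.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for xmallocz(): allocate size + 1 bytes and
 * store a terminating NUL at index `size`. Overflow checking of
 * size + 1 is omitted in this sketch.
 */
static char *sketch_mallocz(size_t size)
{
	char *p = malloc(size + 1);

	if (!p)
		abort();
	p[size] = '\0';
	return p;
}

/*
 * Lower-case copy of `string`. The loop writes indexes 0..len-1 only,
 * so the NUL placed by the allocator at index len is left intact and
 * no explicit assignment after the loop is needed.
 */
static char *sketch_strdup_tolower(const char *string)
{
	size_t len = strlen(string), i;
	char *result = sketch_mallocz(len);

	for (i = 0; i < len; i++)
		result[i] = (char)tolower((unsigned char)string[i]);
	return result;
}
```

The cast to `unsigned char` is needed for the standard `tolower()` in portable C; it is a choice of this sketch and says nothing about how Git's own character helpers behave.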



[PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-04 Thread lars . schneider
From: Lars Schneider 

If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.

The Unicode standard instructs to assume big-endian if there is no BOM
for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard used
in HTML5 recommends to assume little-endian to "deal with deployed
content" [3]. Strictly requiring a BOM seems to be the safest option
for content in Git.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#gen6
[2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
 Section 3.10, D98, page 132
[3] https://encoding.spec.whatwg.org/#utf-16le

Signed-off-by: Lars Schneider 
---
 utf8.c | 13 +++++++++++++
 utf8.h | 19 +++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/utf8.c b/utf8.c
index 914881cd1f..5113d26e56 100644
--- a/utf8.c
+++ b/utf8.c
@@ -562,6 +562,19 @@ int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
    );
 }
 
+int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
+{
+   return (
+  !strcmp(enc, "UTF-16") &&
+  !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
+has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
+   ) || (
+  !strcmp(enc, "UTF-32") &&
+  !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
+has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
+   );
+}
+
 /*
  * Returns first character length in bytes for multi-byte `text` according to
  * `encoding`.
diff --git a/utf8.h b/utf8.h
index 0db1db4519..cce654a64a 100644
--- a/utf8.h
+++ b/utf8.h
@@ -79,4 +79,23 @@ void strbuf_utf8_align(struct strbuf *buf, align_type position, unsigned int wid
  */
 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len);
 
+/*
+ * If the endianness is not defined in the encoding name, then we
+ * require a BOM. The function returns true if a required BOM is missing.
+ *
+ * The Unicode standard instructs to assume big-endian if there is no
+ * BOM for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard
+ * used in HTML5 recommends to assume little-endian to "deal with
+ * deployed content" [3].
+ *
+ * Therefore, strictly requiring a BOM seems to be the safest option for
+ * content in Git.
+ *
+ * [1] http://unicode.org/faq/utf_bom.html#gen6
+ * [2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
+ * Section 3.10, D98, page 132
+ * [3] https://encoding.spec.whatwg.org/#utf-16le
+ */
+int is_missing_required_utf_bom(const char *enc, const char *data, size_t len);
+
 #endif
-- 
2.16.2
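
Editor's note: the rule described above (an encoding name that leaves the byte order open must be accompanied by a BOM) can be tried out in isolation with the following sketch. The names `starts_with_bytes()` and `sketch_missing_bom()` are hypothetical and were chosen for this note. Unlike the patch, the BOM bytes are spelled as string literals, which is merely a convenience here.

```c
#include <stddef.h>
#include <string.h>

/* True if `data` (of length `len`) begins with the `bom_len` bytes of `bom`. */
static int starts_with_bytes(const char *data, size_t len,
			     const char *bom, size_t bom_len)
{
	return len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Returns 1 if `enc` does not specify endianness ("UTF-16" or "UTF-32")
 * and `data` does not start with a matching big- or little-endian BOM.
 * Any other encoding name yields 0, since no BOM is required for it.
 */
static int sketch_missing_bom(const char *enc, const char *data, size_t len)
{
	if (!strcmp(enc, "UTF-16"))
		return !(starts_with_bytes(data, len, "\xFE\xFF", 2) ||
			 starts_with_bytes(data, len, "\xFF\xFE", 2));
	if (!strcmp(enc, "UTF-32"))
		return !(starts_with_bytes(data, len, "\x00\x00\xFE\xFF", 4) ||
			 starts_with_bytes(data, len, "\xFF\xFE\x00\x00", 4));
	return 0;
}
```

As in the patch, the comparison of the encoding name is case-sensitive in this sketch, so a caller would have to normalize the name first if it wants "utf-16" to be treated the same way.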



[PATCH v9 3/8] utf8: add function to detect prohibited UTF-16/32 BOM

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#bom10

Signed-off-by: Lars Schneider 
---
 utf8.c | 24 ++++++++++++++++++++++++
 utf8.h |  9 +++++++++
 2 files changed, 33 insertions(+)

diff --git a/utf8.c b/utf8.c
index 2c27ce0137..914881cd1f 100644
--- a/utf8.c
+++ b/utf8.c
@@ -538,6 +538,30 @@ char *reencode_string_len(const char *in, int insz,
 }
 #endif
 
+static int has_bom_prefix(const char *data, size_t len,
+ const char *bom, size_t bom_len)
+{
+   return (len >= bom_len) && !memcmp(data, bom, bom_len);
+}
+
+static const char utf16_be_bom[] = {0xFE, 0xFF};
+static const char utf16_le_bom[] = {0xFF, 0xFE};
+static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
+static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+
+int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
+{
+   return (
+ (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE")) &&
+ (has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
+  has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
+   ) || (
+ (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE")) &&
+ (has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
+  has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
+   );
+}
+
 /*
  * Returns first character length in bytes for multi-byte `text` according to
  * `encoding`.
diff --git a/utf8.h b/utf8.h
index 6bbcf31a83..0db1db4519 100644
--- a/utf8.h
+++ b/utf8.h
@@ -70,4 +70,13 @@ typedef enum {
 void strbuf_utf8_align(struct strbuf *buf, align_type position, unsigned int width,
   const char *s);
 
+/*
+ * If a data stream is declared as UTF-16BE or UTF-16LE, then a UTF-16
+ * BOM must not be used [1]. The same applies for the UTF-32 equivalents.
+ * The function returns true if this rule is violated.
+ *
+ * [1] http://unicode.org/faq/utf_bom.html#bom10
+ */
+int has_prohibited_utf_bom(const char *enc, const char *data, size_t len);
+
 #endif
-- 
2.16.2
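
Editor's note: for readers who want to experiment with the opposite rule (an encoding name that already fixes the byte order must not carry a BOM), here is a standalone sketch. `has_prefix()` and `sketch_prohibited_bom()` are hypothetical names. The sketch declares the BOM tables as `unsigned char`, which is an assumption of this note made so that 0xFE and 0xFF are in range for the element type; the patch itself uses `const char`.

```c
#include <stddef.h>
#include <string.h>

static const unsigned char utf16_be[] = {0xFE, 0xFF};
static const unsigned char utf16_le[] = {0xFF, 0xFE};
static const unsigned char utf32_be[] = {0x00, 0x00, 0xFE, 0xFF};
static const unsigned char utf32_le[] = {0xFF, 0xFE, 0x00, 0x00};

/* True if the buffer is at least bom_len bytes long and starts with bom. */
static int has_prefix(const void *data, size_t len,
		      const unsigned char *bom, size_t bom_len)
{
	return len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Returns 1 if `enc` names an explicit byte order (UTF-16BE/LE or
 * UTF-32BE/LE) and `data` nevertheless starts with a BOM of that family.
 */
static int sketch_prohibited_bom(const char *enc, const void *data, size_t len)
{
	if (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE"))
		return has_prefix(data, len, utf16_be, sizeof(utf16_be)) ||
		       has_prefix(data, len, utf16_le, sizeof(utf16_le));
	if (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE"))
		return has_prefix(data, len, utf32_be, sizeof(utf32_be)) ||
		       has_prefix(data, len, utf32_le, sizeof(utf32_le));
	return 0;
}
```

The length check in `has_prefix()` matters: without it, a buffer shorter than the BOM would be read past its end by `memcmp()`.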



[PATCH v9 2/8] strbuf: add xstrdup_toupper()

2018-03-04 Thread lars . schneider
From: Lars Schneider 

Create a copy of an existing string and make all characters upper case.
Similar to xstrdup_tolower().

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 12 ++++++++++++
 strbuf.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 55b7daeb35..b635f0bdc4 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -784,6 +784,18 @@ char *xstrdup_tolower(const char *string)
return result;
 }
 
+char *xstrdup_toupper(const char *string)
+{
+   char *result;
+   size_t len, i;
+
+   len = strlen(string);
+   result = xmallocz(len);
+   for (i = 0; i < len; i++)
+       result[i] = toupper(string[i]);
+   return result;
+}
+
 char *xstrvfmt(const char *fmt, va_list ap)
 {
struct strbuf buf = STRBUF_INIT;
diff --git a/strbuf.h b/strbuf.h
index 14c8c10d66..df7ced53ed 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -607,6 +607,7 @@ __attribute__((format (printf,2,3)))
 extern int fprintf_ln(FILE *fp, const char *fmt, ...);
 
 char *xstrdup_tolower(const char *);
+char *xstrdup_toupper(const char *);
 
 /**
  * Create a newly allocated string using printf format. You can do this easily
-- 
2.16.2



Re: [PATCH v8 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-04 Thread Eric Sunshine
On Sun, Mar 4, 2018 at 2:08 PM, Lars Schneider  wrote:
>> On 25 Feb 2018, at 20:50, Eric Sunshine  wrote:
>> On Sat, Feb 24, 2018 at 11:28 AM,   wrote:
>>> +   if (!re_src || src_len != re_src_len ||
>>> +   memcmp(src, re_src, src_len)) {
>>> +   const char* msg = _("encoding '%s' from %s to %s and "
>>> +   "back is not the same");
>>> +   die(msg, path, enc->name, default_encoding);
>>
>> Last two arguments need to be swapped.
>
> Hm. Are you sure? I think it is correct as it is. We are in encode_to_git()
> here and that means we encode *to* "default encoding", no?

Okay. I guess I was just looking at the most recent
reencode_string_len() -- and maybe overlooked the "and back" -- and
was thinking that this error message applied directly to it, but I see
your point about the error saying something about encode_to_git()
overall, in which case I agree with you.

>>> +   test_config core.checkRoundtripEncoding "garbage" &&
>>> +   ! GIT_TRACE=1 git add .gitattributes roundtrip.shift 2>&1 >/dev/null |
>>> +   grep "Checking roundtrip encoding for SHIFT-JIS" &&
>>> +   test_unconfig core.checkRoundtripEncoding &&
>>
>> The "unconfig" won't take place if the test fails. Instead of
>> test_config/test_unconfig, you could use '-c' to set the config
>> transiently for the git-add operation:
>>
>>! GIT_TRACE=1 git -c core.checkRoundtripEncoding=garbage add ...
>
> Agreed. Although test_config (in t/test-lib-functions.sh) automatically
> unsets itself after the test is over.

Yep, so you could get by with that alone. The test_unconfig() simply
isn't needed.


Re: [PATCH v8 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-04 Thread Lars Schneider

> On 25 Feb 2018, at 20:50, Eric Sunshine  wrote:
> 
> On Sat, Feb 24, 2018 at 11:28 AM,   wrote:
>> UTF supports lossless conversion round tripping and conversions between
>> UTF and other encodings are mostly round trip safe as Unicode aims to be
>> a superset of all other character encodings. However, certain encodings
>> (e.g. SHIFT-JIS) are known to have round trip issues [1].
>> 
>> Add 'core.checkRoundtripEncoding', which contains a comma separated
>> list of encodings, to define for what encodings Git should check the
>> conversion round trip if they are used in the 'working-tree-encoding'
>> attribute.
>> 
>> Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'.
>> 
>> [1] https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>> diff --git a/convert.c b/convert.c
>> @@ -289,6 +289,39 @@ static void trace_encoding(const char *context, const char *path,
>> +static int check_roundtrip(const char* enc_name)
>> +{
>> +   /*
>> +* check_roundtrip_encoding contains a string of space and/or
>> +* comma separated encodings (eg. "UTF-16, ASCII, CP1125").
>> +* Search for the given encoding in that string.
>> +*/
>> +   const char *found = strcasestr(check_roundtrip_encoding, enc_name);
>> +   const char *next = found + strlen(enc_name);
> 
> Is this pointer arithmetic undefined behavior (according to the C
> standard) if NULL is returned by strcasestr()? If so, how comfortable
> are we with this? Perhaps if you add an 'if' into the mix, you can
> avoid it:
> 
>if (found) {
>const char *next = found + strlen(enc_name);
>return ...long complicated expression...;
>}
>return false;

OK. I've fixed it this way:

if (!found)
return 0;
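
Editor's note: the lookup discussed here (is a given encoding name one of the entries in a comma- and/or space-separated list?) can also be written without any pointer arithmetic on a possibly-NULL search result. The sketch below is an alternative technique, not the code from the patch: it walks the list entry by entry instead of calling `strcasestr()`. The function name `sketch_in_encoding_list()` is hypothetical, and `strncasecmp()` is assumed to be available from POSIX `<strings.h>`.

```c
#include <string.h>
#include <strings.h>	/* strncasecmp() is POSIX, not ISO C */

/*
 * Returns 1 if `enc_name` appears as a whole entry in `list`, where
 * entries are separated by commas and/or whitespace and are compared
 * case-insensitively; returns 0 otherwise.
 */
static int sketch_in_encoding_list(const char *list, const char *enc_name)
{
	static const char delims[] = ", \t\n";
	size_t name_len = strlen(enc_name);

	if (!name_len)
		return 0;
	while (*list) {
		size_t tok_len;

		list += strspn(list, delims);		/* skip separators */
		tok_len = strcspn(list, delims);	/* length of this entry */
		if (tok_len == name_len &&
		    !strncasecmp(list, enc_name, name_len))
			return 1;
		list += tok_len;
	}
	return 0;
}
```

Because whole entries are compared, a name such as "UTF-16" does not match inside "UTF-16LE", and every entry in the list is considered rather than only the first substring hit.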


[...]
>> +
>> +   if (!re_src || src_len != re_src_len ||
>> +   memcmp(src, re_src, src_len)) {
>> +   const char* msg = _("encoding '%s' from %s to %s and "
>> +   "back is not the same");
>> +   die(msg, path, enc->name, default_encoding);
> 
> Last two arguments need to be swapped.

Hm. Are you sure? I think it is correct as it is. We are in encode_to_git()
here and that means we encode *to* "default encoding", no?


>> +   }
>> +
>> +   free(re_src);
>> +   }
>> +
>>strbuf_attach(buf, dst, dst_len, dst_len + 1);
>>return 1;
>> }
>> diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
>> @@ -225,4 +225,45 @@ test_expect_success 'error if encoding garbage is already in Git' '
>> +test_expect_success 'check roundtrip encoding' '
>> +   text="hallo there!\nroundtrip test here!" &&
>> +   printf "$text" | iconv -f UTF-8 -t SHIFT-JIS >roundtrip.shift &&
>> +   printf "$text" | iconv -f UTF-8 -t UTF-16 >roundtrip.utf16 &&
>> +   echo "*.shift text working-tree-encoding=SHIFT-JIS" >>.gitattributes &&
>> +
>> +   # SHIFT-JIS encoded files are round-trip checked by default...
>> +   GIT_TRACE=1 git add .gitattributes roundtrip.shift 2>&1 >/dev/null |
>> +   grep "Checking roundtrip encoding for SHIFT-JIS" &&
> 
> Why redirect to /dev/null? If something does go wrong somewhere, the
> more output available for debugging the problem, the better, so
> throwing it away unnecessarily seems contraindicated.

OK!


>> +   git reset &&
>> +
>> +   # ... unless we overwrite the Git config!
>> +   test_config core.checkRoundtripEncoding "garbage" &&
>> +   ! GIT_TRACE=1 git add .gitattributes roundtrip.shift 2>&1 >/dev/null |
>> +   grep "Checking roundtrip encoding for SHIFT-JIS" &&
>> +   test_unconfig core.checkRoundtripEncoding &&
> 
> The "unconfig" won't take place if the test fails. Instead of
> test_config/test_unconfig, you could use '-c' to set the config
> transiently for the git-add operation:
> 
>! GIT_TRACE=1 git -c core.checkRoundtripEncoding=garbage add ...

Agreed. Although test_config (in t/test-lib-functions.sh) automatically 
unsets itself after the test is over. 


>> +   git reset &&
>> +
>> +   # UTF-16 encoded files should not be round-trip checked by default...
>> +   ! GIT_TRACE=1 git add roundtrip.utf16 2>&1 >/dev/null |
>> +   grep "Checking roundtrip encoding for UTF-16" &&
>> +   git reset &&
>> +
>> +   # ... unless we tell Git to check it!
>> +   test_config_global core.checkRoundtripEncoding "UTF-16, UTF-32" &&
>> +   GIT_TRACE=1 git add roundtrip.utf16 2>&1 >/dev/null |
>> +   grep "Checking roundtrip encoding for UTF-16" &&
>> +   git reset &&
>> +
>> +   # ... unless we tell Git to check it!
>> +   # (here we also check that the casing of the 

[Bug] git log --show-signature print extra carriage return ^M

2018-03-04 Thread Larry Hunter
There is a bug using "git log --show-signature" in my installation

git 2.16.2.windows.1
gpg (GnuPG) 2.2.4
libgcrypt 1.8.2

that prints (with colors) an extra ^M (carriage return?) at the end of
the gpg lines. As an example, the output of "git log --show-signature
HEAD" looks like:

$ git log --show-signature HEAD
commit 46c490188ebd216f20c454ee61108e51b481844e (HEAD -> master)
gpg: Signature made 03/04/18 16:53:06 ora solare Europa occidentale^M
gpg:using RSA key ...^M
gpg: Good signature from "..." [ultimate]^M
Author: ... <...>
Date:   Sun Mar 4 16:53:06 2018 +0100
...

To help find a fix, I tested the command "git verify-commit HEAD" that
prints (without colors) the same lines without extra ^M characters.

$ git verify-commit HEAD
gpg: Signature made 03/04/18 16:53:06 ora solare Europa occidentale
gpg:using RSA key ...
gpg: Good signature from "..." [ultimate]

Thanks,
Larry


[PATCH 1/2] git-svn: search --authors-prog in PATH too

2018-03-04 Thread Andreas Heiduk
In 36db1eddf9 ("git-svn: add --authors-prog option", 2009-05-14) the path
to authors-prog was made absolute because git-svn changes the current
directory in some situations. This makes sense if the program is part of
the repository but prevents searching via $PATH.

The old behaviour is still retained, but if the file does not exist, then
authors-prog is searched for in $PATH like any other command.

Signed-off-by: Andreas Heiduk 
---
 Documentation/git-svn.txt | 5 +++++
 git-svn.perl              | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-svn.txt b/Documentation/git-svn.txt
index 636e09048e..b858374649 100644
--- a/Documentation/git-svn.txt
+++ b/Documentation/git-svn.txt
@@ -657,6 +657,11 @@ config key: svn.authorsfile
expected to return a single line of the form "Name <email>",
which will be treated as if included in the authors file.
 +
+Due to historical reasons a relative 'filename' is first searched
+relative to the current directory for 'init' and 'clone' and relative
+to the root of the working tree for 'fetch'. If 'filename' is
+not found, it is searched like any other command in '$PATH'.
++
 [verse]
 config key: svn.authorsProg
 
diff --git a/git-svn.perl b/git-svn.perl
index a6b6c3e40c..050f2a36f4 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -374,7 +374,8 @@ version() if $_version;
 usage(1) unless defined $cmd;
 load_authors() if $_authors;
 if (defined $_authors_prog) {
-   $_authors_prog = "'" . File::Spec->rel2abs($_authors_prog) . "'";
+   my $abs_file = File::Spec->rel2abs($_authors_prog);
+   $_authors_prog = "'" . $abs_file . "'" if -x $abs_file;
 }
 
 unless ($cmd =~ /^(?:clone|init|multi-init|commit-diff)$/) {
-- 
2.16.2



[PATCH 2/2] git-svn: allow empty email-address in authors-prog and authors-file

2018-03-04 Thread Andreas Heiduk
The email address in --authors-file and --authors-prog can be empty but
git-svn translated it into a synthetic email address in the form
$USERNAME@$REPO_UUID. Now git-svn behaves like git-commit: If the email
is explicitly set to the empty string, the commit does not contain
an email address.

Signed-off-by: Andreas Heiduk 
---
 perl/Git/SVN.pm | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index bc4eed3d75..b0a340b294 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -1482,7 +1482,6 @@ sub call_authors_prog {
}
if ($author =~ /^\s*(.+?)\s*<(.*)>\s*$/) {
my ($name, $email) = ($1, $2);
-   $email = undef if length $2 == 0;
return [$name, $email];
} else {
die "Author: $orig_author: $::_authors_prog returned "
@@ -2020,8 +2019,8 @@ sub make_log_entry {
remove_username($full_url);
$log_entry{metadata} = "$full_url\@$r $uuid";
$log_entry{svm_revision} = $r;
-   $email ||= "$author\@$uuid";
-   $commit_email ||= "$author\@$uuid";
+   $email //= "$author\@$uuid";
+   $commit_email //= "$author\@$uuid";
} elsif ($self->use_svnsync_props) {
my $full_url = canonicalize_url(
add_path_to_url( $self->svnsync->{url}, $self->path )
@@ -2029,15 +2028,15 @@ sub make_log_entry {
remove_username($full_url);
my $uuid = $self->svnsync->{uuid};
$log_entry{metadata} = "$full_url\@$rev $uuid";
-   $email ||= "$author\@$uuid";
-   $commit_email ||= "$author\@$uuid";
+   $email //= "$author\@$uuid";
+   $commit_email //= "$author\@$uuid";
} else {
my $url = $self->metadata_url;
remove_username($url);
my $uuid = $self->rewrite_uuid || $self->ra->get_uuid;
$log_entry{metadata} = "$url\@$rev " . $uuid;
-   $email ||= "$author\@" . $uuid;
-   $commit_email ||= "$author\@" . $uuid;
+   $email //= "$author\@" . $uuid;
+   $commit_email //= "$author\@" . $uuid;
}
$log_entry{name} = $name;
$log_entry{email} = $email;
-- 
2.16.2
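
Editor's note: the behavioural change in this patch rests on the difference between Perl's `||=` (assign when the left side is false, which includes the empty string) and `//=` (assign only when the left side is undefined). For readers less familiar with Perl, the following C analogue may help. It models "undefined" as a NULL pointer and "explicitly empty" as `""`. Both function names are hypothetical, and the sketch ignores that Perl also treats the string "0" as false.

```c
#include <stddef.h>

/*
 * Analogue of `$email ||= $fallback`: both "no value" (NULL) and an
 * empty string are replaced by the fallback.
 */
static const char *sketch_or_assign(const char *email, const char *fallback)
{
	return (email && *email) ? email : fallback;
}

/*
 * Analogue of `$email //= $fallback`: only "no value" (NULL) is
 * replaced; an explicitly empty string is kept as it is.
 */
static const char *sketch_defined_or_assign(const char *email,
					    const char *fallback)
{
	return email ? email : fallback;
}
```

With the second form an author entry with an empty address keeps its empty email, which matches the behaviour the commit message describes; with the first form the synthetic `$USERNAME@$REPO_UUID` address would be substituted.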



Re: git stash push -u always warns "pathspec '...' did not match any files"

2018-03-04 Thread Marc Strapetz

On 03.03.2018 16:46, Thomas Gummerer wrote:

On 03/03, Marc Strapetz wrote:

Reproducible in a test repository with following steps:

$ touch untracked
$ git stash push -u -- untracked
Saved working directory and index state WIP on master: 0096475 init
fatal: pathspec 'untracked' did not match any files
error: unrecognized input

The file is stashed correctly, though.

Tested with Git 2.16.2 on Linux and Windows.


Thanks for the bug report and the reproduction recipe.  The following
patch should fix it:


Thanks, I can confirm that the misleading warning message is fixed.

What I've noticed now is that when using -u option, Git won't warn if 
the pathspec is actually not matching a file. Also, an empty stash may 
be created. For example:


$ git stash push -u -- nonexisting
Saved working directory and index state WIP on master: 171081d initial import


I would probably expect to see an error message as for:

$ git stash push -- nonexisting
error: pathspec 'nonexisting' did not match any file(s) known to git.
Did you forget to 'git add'?

That said, this is no problem for us, because I know that the paths I'm 
providing to "git stash push" do exist. I just wanted to point out.


-Marc



Re: [PATCH v7 0/7] convert: add support for different encodings

2018-03-04 Thread Torsten Bögershausen
On 2018-02-28 14:21, Jeff King wrote:
> On Wed, Feb 28, 2018 at 09:20:05AM +0100, Torsten Bögershausen wrote:
> 
>>>   2. auto-detect utf-16 (your patch)
>>>  - Just Works for existing repositories storing utf-16
>>>
>>>  - carries some risk of kicking in when people would like it not to
>>>(e.g., when they really do want a binary patch that can be
>>>applied).
>>
>> The binary patch is still supported, but that detail may need some more 
>> explanation
>> in the commit message. Please see  t4066-diff-encoding.sh
> 
> Yeah, but if you don't have binary-patches enabled we'd generate a bogus
> patch. Which, granted, without that you wouldn't be able to apply the
> patch either. But somehow it feels funny to me to generate something
> that _looks_ like a patch but you can't actually apply.
> 
> I also think we'd want a plan for this to be used consistently in other
> diff-like tools. E.g., "git blame" uses textconv for the starting file
> content, and it would be nice for this to kick in then, too. Ditto for
> things like grep, pickaxe, etc.
> 
> I have some patches that reuse some of the textconv infrastructure for
> this, which should mostly make it "just work" everywhere. They need a
> little more polishing before I post them, but you can take a look at:
> 
>   https://github.com/peff/git.git jk/textconv-utf16
> 
> if you want.
> 
> -Peff
> 

Thanks for your work (I actually found some time to take a look)

I am looking at the code to put 2 or 3 things on top of it:
- test case(s)
- documentation
- teach diff to add a line "b is converted to UTF-8 from UTF-16"
- teach apply to read and understand the encoding line and throw
  in a reencode_string_len() call, like your patch does

This would keep "git diff | git apply" happy.
All in all the changes do not look too invasive, at least from my point of view.





[PATCH] http.c: shell command evaluation for extraheader

2018-03-04 Thread Colin Arnott
The http.extraHeader config parameter currently only supports storing
constant values. There are two main use cases where this fails:

  0. Sensitive payloads: frequently this config parameter is used to pass
 authentication credentials in place of or in addition to the
 Authorization header, however since this value is required to be in
 the clear this can create security issues.

  1. Mutating headers: some headers, especially new authentication
 schemes, leverage short lived tokens that change over time.

There do exist solutions with current tools for these use cases, however
none are optimal:

  0. Shell alias: by aliasing over git with a call to git that includes the
 config directive and evaluates the header value inline, you can
 fake the desired mutability:
   `alias git='git -c http.extraHeader="$(gpg -d crypt.gpg)"'`
 This presents two problems:
 a. aliasing over commands can be confusing to new users, since git
config information is stored in shell configs
 b. this solution scales only to your shell, not all shells

  1. Global hook: you could implement a hook that writes the config
 entry before fetch / pull actions, so that it is up to date, but
 this does nothing to secure it.

  2. git-credential-helper: the credential helper interface already
 supports shelling out to arbitrary binaries or scripts, however
 this interface can only be used to populate the Authorization
 header.

The optimal solution involves extending the current implementation of
http.extraHeader parsing to allow for arbitrary shell command execution.
There seem to be two paradigms for such features:

  0. Overloading with '!' prefixes: seen in alias.* and credential.helper

  1. New "Cmd" suffix parameters: seen in sendemail.toCmd sendemail.ccCmd

While the latter may be more clear without documentation, the addition
of a new config parameter seemed more complex for the codebase. As such,
new documentation is included.

Several edge cases came up during implementation and the following
design decisions were made:

  0. Stdin and stderr for the child_process are exposed to the user:
 this allows commands that print status information via stderr, and
 accept input to function. The use case considered is text input for
 decryption and error handling that is out of scope for git.

  1. Failure to exec: if either the file does not exist, or any other
 exec related failure occurs, no error is presented to the user,
 and the header is not included

  2. Non-zero return code: if the child_process returns a non-zero
 value, no error is presented to the user, the return value is
 consumed, and the header is not included in the request.

  3. Headers starting with the '!' character require a shell command to
 create: because no escaping syntax was implemented, the following
 is required for such headers: "!printf '!magic: abra'"

Signed-off-by: Colin Arnott 
---
 Documentation/config.txt    |  7 +++++++
 http.c                      | 20 ++++++++++++++++++++
 t/t5551-http-fetch-smart.sh |  6 ++++--
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10..4b2171d60 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1918,6 +1918,13 @@ http.extraHeader::
more than one such entry exists, all of them are added as extra
headers.  To allow overriding the settings inherited from the system
config, an empty value will reset the extra headers to the empty list.
+   If the value is prefixed with an exclamation point, it will
+   be treated as a shell command.  For example, defining
+   "http.extraHeader = !gpg -d < secure_header.gpg", will pass the
+   decrypted header, if the command does not exec cleanly or has a
+   non-zero return value, no header will be added.  Note that shell
+   commands will be executed from the top-level directory of a
+   repository, which may not necessarily be the current directory.
 
 http.cookieFile::
The pathname of a file containing previously stored cookie lines,
diff --git a/http.c b/http.c
index 31755023a..11103df41 100644
--- a/http.c
+++ b/http.c
@@ -380,6 +380,26 @@ static int http_options(const char *var, const char *value, void *cb)
} else if (!*value) {
curl_slist_free_all(extra_http_headers);
extra_http_headers = NULL;
+   } else if (value[0] == '!') {
+   struct child_process cp = CHILD_PROCESS_INIT;
+   cp.git_cmd = 0;
+   cp.in = 0;
+   cp.out = -1;
+   cp.err = 0;
+   cp.use_shell = 1;
+   argv_array_push(&cp.args, value + 1);
+   if (!start_command(&cp)) {
+   struct