Re: [PATCH v4 1/3] update-unicode.sh: automatically download newer definition files
On Sat, Dec 03, 2016 at 10:00:47PM +0100, Beat Bolli wrote:
> Checking just for the unicode data files' existence is not sufficient;
> we should also download them if a newer version exists on the Unicode
> consortium's servers. Option -N of wget does this nicely for us.
>
> Reviewed-by: Torsten Boegershausen

Minor remark (not sure if this motivates a v5; maybe Junio can fix it locally?): s/oe/ö/

Beside this: thanks again (and I learned about the -N option of wget).
Re: Git v2.11.0 breaks max depth nested alternates
On Sat, Dec 03, 2016 at 04:24:02PM -0800, Kyle J. McKay wrote:
> When the incoming quarantine takes place the current objects directory
> is demoted to an alternate thereby increasing its depth (and any
> alternates it references) by one and causing any object store that was
> previously at the maximum nesting depth to be ignored courtesy of the
> above hard-coded maximum depth.
>
> If the incoming push happens to need access to some of those objects
> to perhaps "--fix-thin" its pack it will crash and burn.

Yep, that makes sense. I didn't really worry about this because the existing "5" is totally arbitrary, and meant to be so high that nobody reaches it (it's just there to break cycles).

So I do think this is worth dealing with, but I'm also curious why you're hitting the depth-5 limit. I'm guessing it has to do with hosting a hierarchy of related repos. But is your system then always in danger of busting the 5-limit if people create too deep a repository hierarchy? Specifically, I'm wondering if it would be sufficient to just bump it to 6. Or 100.

Of course any static bump runs into the funny case where a repo _usually_ works, but fails when pushed to. Which is kind of nasty and unintuitive. Your patch fixes that, and we can leave the idea of bumping the static depth number as an orthogonal issue (one that, personally, I do not care much about either way).

> diff --git a/common-main.c b/common-main.c
> index c654f955..9f747491 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -37,5 +37,8 @@ int main(int argc, const char **argv)
>
>  	restore_sigpipe_to_default();
>
> +	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
> +		alt_odb_max_depth++;
> +
>  	return cmd_main(argc, argv);

After reading your problem description, my initial thought was to increment the counter when we allocate the tmp-objdir, and decrement when it is destroyed (because the parent receive-pack process adds it to its alternates, too). But:

  1. Receive-pack doesn't care; it adds the tmp-objdir as an alternate, rather than adding it as its main object dir and bumping down the main one.

  2. There would have to be some way of communicating to sub-processes that they should bump their max-depth by one.

You've basically used the quarantine-path variable as the inter-process flag for (2). Which feels a little funny, because its value is unrelated to the alt-odb setup. But it is a reliable signal, so there's a certain elegance. It's probably the best option, given that the alternative is a specific variable to say "hey, bump your max-alt-odb-depth by one". That's pretty ugly, too. :)

-Peff
[PATCH v2] tag, branch, for-each-ref: add --ignore-case for sorting and filtering
This option makes sorting ignore case, which is great when you have branches named bug-12-do-something, Bug-12-do-some-more and BUG-12-do-what and want to group them together. Sorting externally may not be an option, because we lose the coloring and column layout of git-branch and git-tag.

The same could be said for filtering, but it's probably less important because you can always go with the ugly pattern [bB][uU][gG]-* if you're desperate.

You can't have case-sensitive filtering and case-insensitive sorting (or the other way around) with this, though. For branch and tag, that should be no problem. for-each-ref, as plumbing, might want finer control. But we can always add --{filter,sort}-ignore-case when there is a need for it.

Signed-off-by: Nguyễn Thái Ngọc Duy
---
Changes are in tests only:

diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index fad79e8..52283df 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -208,6 +208,13 @@ test_expect_success 'sort branches, ignore case' '
 	test_commit initial &&
 	git branch branch-one &&
 	git branch BRANCH-two &&
+	git branch --list | awk "{print \$NF}" >actual &&
+	cat >expected <<-\EOF &&
+	BRANCH-two
+	branch-one
+	master
+	EOF
+	test_cmp expected actual &&
 	git branch --list -i | awk "{print \$NF}" >actual &&
 	cat >expected <<-\EOF &&
 	branch-one
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2d9cae3..07869b0 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -34,6 +34,13 @@ test_expect_success 'sort tags, ignore case' '
 	test_commit initial &&
 	git tag tag-one &&
 	git tag TAG-two &&
+	git tag -l >actual &&
+	cat >expected <<-\EOF &&
+	TAG-two
+	initial
+	tag-one
+	EOF
+	test_cmp expected actual &&
 	git tag -l -i >actual &&
 	cat >expected <<-\EOF &&
 	initial
@@ -98,8 +105,8 @@ test_expect_success 'listing all tags if one exists should output that tag' '
 test_expect_success 'listing a tag using a matching pattern should succeed' \
 	'git tag -l mytag'

-test_expect_success 'listing a tag using a matching pattern should succeed' \
-	'git tag -l --ignore-case MYTAG'
+test_expect_success 'listing a tag with --ignore-case' \
+	'test $(git tag -l --ignore-case MYTAG) = mytag'

 test_expect_success \
 	'listing a tag using a matching pattern should output that tag' \

 Documentation/git-branch.txt       |  4
 Documentation/git-for-each-ref.txt |  3 +++
 Documentation/git-tag.txt          |  4
 builtin/branch.c                   | 23 ++-
 builtin/for-each-ref.c             |  5 -
 builtin/tag.c                      |  4
 ref-filter.c                       | 28 +---
 ref-filter.h                       |  2 ++
 t/t3203-branch-output.sh           | 29 +
 t/t7004-tag.sh                     | 27 +++
 10 files changed, 112 insertions(+), 17 deletions(-)

diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
index 1fe7344..5516a47 100644
--- a/Documentation/git-branch.txt
+++ b/Documentation/git-branch.txt
@@ -118,6 +118,10 @@ OPTIONS
 	default to color output.
 	Same as `--color=never`.

+-i::
+--ignore-case::
+	Sorting and filtering branches are case insensitive.
+
 --column[=<options>]::
 --no-column::
 	Display branch listing in columns. See configuration variable
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index f57e69b..6d22974 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -79,6 +79,9 @@ OPTIONS
 	Only list refs which contain the specified commit (HEAD if not
 	specified).

+--ignore-case::
+	Sorting and filtering refs are case insensitive.
+
 FIELD NAMES
 -----------
diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt
index 80019c5..76cfe40 100644
--- a/Documentation/git-tag.txt
+++ b/Documentation/git-tag.txt
@@ -108,6 +108,10 @@ OPTIONS
 	variable if it exists, or lexicographic order otherwise. See
 	linkgit:git-config[1].

+-i::
+--ignore-case::
+	Sorting and filtering tags are case insensitive.
+
 --column[=<options>]::
 --no-column::
 	Display tag listing in columns. See configuration variable
diff --git a/builtin/branch.c b/builtin/branch.c
index 60cc5c8..36e0a21 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -512,15 +512,6 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sorting)
 	if (filter->verbose)
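Once the patch above is in (any git that ships the `--ignore-case` option will do), the grouping effect described in the commit message is easy to see in a throwaway repo. A minimal session, using the branch names from the commit message:

```shell
# Throwaway repo; the -c identity flags avoid touching any global config.
cd "$(mktemp -d)"
git init -q demo && cd demo
git -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial
git branch bug-12-do-something
git branch Bug-12-do-some-more
git branch BUG-12-do-what
# Case-sensitive globbing misses two of the three branches...
git branch --list "bug-12*"
# ...while --ignore-case matches and sorts all of them together.
git branch --list --ignore-case "bug-12*"
```

The same flag works for `git tag -l` and `git for-each-ref`, since all three share the ref-filter code the patch touches.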
Re: git reset --hard should not irretrievably destroy new files
On Sat, Dec 3, 2016 at 6:11 PM, Christian Couder wrote:
> On Sat, Dec 3, 2016 at 6:04 AM, Julian de Bhal wrote:
>> but I'd be nearly as happy if a
>> commit was added to the reflog when the reset happens (I can probably make
>> that happen with some configuration now that I've been bitten).
>
> Not sure if this has been proposed. Perhaps it would be simpler to
> just output the sha1, and maybe the filenames too, of the blobs that
> are no more referenced from the trees, somewhere (in a bloblog?).

Yeah, after doing a bit more reading around the issue, this seems like a smaller part of destroying local changes with a hard reset, and I'm one of the lucky ones where it is recoverable.

Has anyone discussed having `git reset --hard` create objects for the current state of anything it's about to destroy, specifically so they end up in the --lost-found? I think this is what you're suggesting, only without checking for references, so that tree & blob objects exist that make any hard reset reversible.

Cheers,
Jules

P.S. Thank you for such a warm welcome while I blunder through unfamiliar protocols.
Git v2.11.0 breaks max depth nested alternates
The recent addition of pre-receive quarantining breaks nested alternates that are already at the maximum alternates nesting depth.

In the file sha1_file.c, in the function link_alt_odb_entries, we have this:

	if (depth > 5) {
		error("%s: ignoring alternate object stores, nesting too deep.",
				relative_base);
		return;
	}

When the incoming quarantine takes place, the current objects directory is demoted to an alternate, thereby increasing its depth (and that of any alternates it references) by one and causing any object store that was previously at the maximum nesting depth to be ignored, courtesy of the above hard-coded maximum depth.

If the incoming push happens to need access to some of those objects, perhaps to "--fix-thin" its pack, it will crash and burn.

Originally I was not going to include a patch to fix this, but simply suggest that the expeditious fix is to just allow one additional alternates nesting depth level during quarantine operations. However, it was so simple, I have included the patch below. :)

I have verified that, where a push with Git v2.10.2 succeeds and a push with Git v2.11.0 to the same repository fails because of this problem, the patch below does indeed correct the issue and allow the push to succeed.

Cheers,
Kyle

-- 8< --
Subject: [PATCH] receive-pack: increase max alternates depth during quarantine

Ever since 722ff7f876 (receive-pack: quarantine objects until pre-receive accepts, 2016-10-03, v2.11.0), Git has been quarantining objects and packs received during an incoming push into a separate objects directory and using the alternates mechanism to make them available until they are either accepted and moved into the main objects directory or rejected and discarded.

Unfortunately this has the side effect of increasing the alternates nesting depth by one for all pre-existing alternates. If a repository is already at the maximum alternates nesting depth, this quarantining operation can temporarily push it over, making the incoming push fail.

To prevent the failure, we simply increase the allowed alternates nesting depth by one whenever a quarantine operation is in effect.

Signed-off-by: Kyle J. McKay
---
Notes:
    Some alternates nesting depth background:

    If base/fork0/fork1/fork2/fork3/fork4/fork5 represents seven git
    repositories where base.git has no alternates, fork0.git has
    base.git as an alternate, fork1.git has fork0.git as an alternate
    and so on, where fork5.git has only fork4.git as an alternate, then
    fork5.git is at the maximum allowed depth of 5.

    git fsck --strict --full works without complaint on fork5.git.
    However, in base/fork0/fork1/fork2/fork3/fork4/fork5/fork6, an
    fsck --strict --full of fork6.git will generate complaints and any
    objects/packs present in base.git will be ignored.

 cache.h       | 1 +
 common-main.c | 3 +++
 environment.c | 1 +
 sha1_file.c   | 2 +-
 4 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index a50a61a1..25c17c29 100644
--- a/cache.h
+++ b/cache.h
@@ -676,6 +676,7 @@ extern size_t packed_git_limit;
 extern size_t delta_base_cache_limit;
 extern unsigned long big_file_threshold;
 extern unsigned long pack_size_limit_cfg;
+extern int alt_odb_max_depth;

 /*
  * Accessors for the core.sharedrepository config which lazy-load the value
diff --git a/common-main.c b/common-main.c
index c654f955..9f747491 100644
--- a/common-main.c
+++ b/common-main.c
@@ -37,5 +37,8 @@ int main(int argc, const char **argv)

 	restore_sigpipe_to_default();

+	if (getenv(GIT_QUARANTINE_ENVIRONMENT))
+		alt_odb_max_depth++;
+
 	return cmd_main(argc, argv);
 }
diff --git a/environment.c b/environment.c
index 0935ec69..32e11f70 100644
--- a/environment.c
+++ b/environment.c
@@ -64,6 +64,7 @@ int merge_log_config = -1;
 int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
 unsigned long pack_size_limit_cfg;
 enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY;
+int alt_odb_max_depth = 5;

 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
diff --git a/sha1_file.c b/sha1_file.c
index 9c86d192..15b8432e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -337,7 +337,7 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
 	int i;
 	struct strbuf objdirbuf = STRBUF_INIT;

-	if (depth > 5) {
+	if (depth > alt_odb_max_depth) {
 		error("%s: ignoring alternate object stores, nesting too deep.",
 				relative_base);
 		return;
---
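The seven-repository chain described in the notes above can be reproduced with a few lines of shell, which makes it easy to experiment with the nesting limit (throwaway paths; assumes a stock git in $PATH):

```shell
# Build base.git plus fork0.git..fork5.git, each fork referencing its
# predecessor's object store via objects/info/alternates, so that
# fork5.git ends up at the maximum allowed nesting depth of 5.
set -e
cd "$(mktemp -d)"
git init -q --bare base.git
prev=base.git
for i in 0 1 2 3 4 5; do
	git init -q --bare "fork$i.git"
	# Absolute path to the parent's object store, one link per fork.
	echo "$PWD/$prev/objects" >"fork$i.git/objects/info/alternates"
	prev="fork$i.git"
done
# Still within the limit: fsck runs without a "nesting too deep" error.
git -C fork5.git fsck --strict --full
```

Adding one more fork (or pushing into fork5.git with a quarantine-enabled receive-pack) is what tips the chain over the hard-coded limit.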
Re: git reset --hard should not irretrievably destroy new files
On Sat, Dec 3, 2016 at 5:49 PM, Johannes Sixt wrote:
> Am 03.12.2016 um 06:04 schrieb Julian de Bhal:
>>
>> If you `git add new_file; git reset --hard`, new_file is gone forever.
>
> AFAIC, this is a feature ;-) I occasionally use it to remove a file when I
> already have git-gui in front of me. Then it's often less convenient to type
> the path in a shell, or to pointy-click around in a file browser.

Yeah, I'm conscious that it would be a change in behaviour and would almost certainly break things in the wild. On the other hand, `rm` deletes perfectly well, but there's no good way to recover the lost files after the fact. You can take some precautions after you've been bitten, but git usually means never saying "you should have".

>> git add new_file
>> [...]
>> git reset --hard # decided copy from backed up diff
>> # boom. new_file is gone forever
>
> ... it is not. The file is still among the dangling blobs in the repository
> until you clean it up with 'git gc'. Use 'git fsck --lost-found':
>
>     --lost-found
>
>         Write dangling objects into .git/lost-found/commit/ or
>         .git/lost-found/other/, depending on type. If the object is a
>         blob, the contents are written into the file, rather than its
>         object name.
>
> -- Hannes

Thank you so much! Super glad to be wrong here.

Cheers,
Jules
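To make the recovery path concrete, here is a minimal session along the lines Hannes describes (throwaway repo; the file name and contents are made up for illustration):

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git -c user.name=j -c user.email=j@example.com \
	commit -q --allow-empty -m initial
echo "precious work" >new_file
git add new_file       # the blob is now stored in .git/objects
git reset -q --hard    # working-tree copy and index entry are gone...
git fsck --lost-found  # ...but the dangling blob is still in the repo
# The blob contents are written out under .git/lost-found/other/
cat .git/lost-found/other/*
```

This only works until `git gc` prunes the dangling blob, so the recovery window is finite.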
[PATCH v4 3/3] unicode_width.h: update the tables to Unicode 9.0
Rerunning the update-unicode.sh that we fixed in the two previous commits produces these new tables.

Signed-off-by: Beat Bolli
---
 unicode_width.h | 131 +---
 1 file changed, 107 insertions(+), 24 deletions(-)

diff --git a/unicode_width.h b/unicode_width.h
index 47cdd23..02207be 100644
--- a/unicode_width.h
+++ b/unicode_width.h
@@ -25,7 +25,7 @@ static const struct interval zero_width[] = {
 { 0x0825, 0x0827 },
 { 0x0829, 0x082D },
 { 0x0859, 0x085B },
-{ 0x08E4, 0x0902 },
+{ 0x08D4, 0x0902 },
 { 0x093A, 0x093A },
 { 0x093C, 0x093C },
 { 0x0941, 0x0948 },
@@ -120,6 +120,7 @@ static const struct interval zero_width[] = {
 { 0x17C9, 0x17D3 },
 { 0x17DD, 0x17DD },
 { 0x180B, 0x180E },
+{ 0x1885, 0x1886 },
 { 0x18A9, 0x18A9 },
 { 0x1920, 0x1922 },
 { 0x1927, 0x1928 },
@@ -158,7 +159,7 @@ static const struct interval zero_width[] = {
 { 0x1CF4, 0x1CF4 },
 { 0x1CF8, 0x1CF9 },
 { 0x1DC0, 0x1DF5 },
-{ 0x1DFC, 0x1DFF },
+{ 0x1DFB, 0x1DFF },
 { 0x200B, 0x200F },
 { 0x202A, 0x202E },
 { 0x2060, 0x2064 },
@@ -171,13 +172,13 @@ static const struct interval zero_width[] = {
 { 0x3099, 0x309A },
 { 0xA66F, 0xA672 },
 { 0xA674, 0xA67D },
-{ 0xA69F, 0xA69F },
+{ 0xA69E, 0xA69F },
 { 0xA6F0, 0xA6F1 },
 { 0xA802, 0xA802 },
 { 0xA806, 0xA806 },
 { 0xA80B, 0xA80B },
 { 0xA825, 0xA826 },
-{ 0xA8C4, 0xA8C4 },
+{ 0xA8C4, 0xA8C5 },
 { 0xA8E0, 0xA8F1 },
 { 0xA926, 0xA92D },
 { 0xA947, 0xA951 },
@@ -204,7 +205,7 @@ static const struct interval zero_width[] = {
 { 0xABED, 0xABED },
 { 0xFB1E, 0xFB1E },
 { 0xFE00, 0xFE0F },
-{ 0xFE20, 0xFE2D },
+{ 0xFE20, 0xFE2F },
 { 0xFEFF, 0xFEFF },
 { 0xFFF9, 0xFFFB },
 { 0x101FD, 0x101FD },
@@ -228,16 +229,21 @@ static const struct interval zero_width[] = {
 { 0x11173, 0x11173 },
 { 0x11180, 0x11181 },
 { 0x111B6, 0x111BE },
+{ 0x111CA, 0x111CC },
 { 0x1122F, 0x11231 },
 { 0x11234, 0x11234 },
 { 0x11236, 0x11237 },
+{ 0x1123E, 0x1123E },
 { 0x112DF, 0x112DF },
 { 0x112E3, 0x112EA },
-{ 0x11301, 0x11301 },
+{ 0x11300, 0x11301 },
 { 0x1133C, 0x1133C },
 { 0x11340, 0x11340 },
 { 0x11366, 0x1136C },
 { 0x11370, 0x11374 },
+{ 0x11438, 0x1143F },
+{ 0x11442, 0x11444 },
+{ 0x11446, 0x11446 },
 { 0x114B3, 0x114B8 },
 { 0x114BA, 0x114BA },
 { 0x114BF, 0x114C0 },
@@ -245,6 +251,7 @@ static const struct interval zero_width[] = {
 { 0x115B2, 0x115B5 },
 { 0x115BC, 0x115BD },
 { 0x115BF, 0x115C0 },
+{ 0x115DC, 0x115DD },
 { 0x11633, 0x1163A },
 { 0x1163D, 0x1163D },
 { 0x1163F, 0x11640 },
@@ -252,6 +259,16 @@ static const struct interval zero_width[] = {
 { 0x116AD, 0x116AD },
 { 0x116B0, 0x116B5 },
 { 0x116B7, 0x116B7 },
+{ 0x1171D, 0x1171F },
+{ 0x11722, 0x11725 },
+{ 0x11727, 0x1172B },
+{ 0x11C30, 0x11C36 },
+{ 0x11C38, 0x11C3D },
+{ 0x11C3F, 0x11C3F },
+{ 0x11C92, 0x11CA7 },
+{ 0x11CAA, 0x11CB0 },
+{ 0x11CB2, 0x11CB3 },
+{ 0x11CB5, 0x11CB6 },
 { 0x16AF0, 0x16AF4 },
 { 0x16B30, 0x16B36 },
 { 0x16F8F, 0x16F92 },
@@ -262,31 +279,59 @@ static const struct interval zero_width[] = {
 { 0x1D185, 0x1D18B },
 { 0x1D1AA, 0x1D1AD },
 { 0x1D242, 0x1D244 },
+{ 0x1DA00, 0x1DA36 },
+{ 0x1DA3B, 0x1DA6C },
+{ 0x1DA75, 0x1DA75 },
+{ 0x1DA84, 0x1DA84 },
+{ 0x1DA9B, 0x1DA9F },
+{ 0x1DAA1, 0x1DAAF },
+{ 0x1E000, 0x1E006 },
+{ 0x1E008, 0x1E018 },
+{ 0x1E01B, 0x1E021 },
+{ 0x1E023, 0x1E024 },
+{ 0x1E026, 0x1E02A },
 { 0x1E8D0, 0x1E8D6 },
+{ 0x1E944, 0x1E94A },
 { 0xE0001, 0xE0001 },
 { 0xE0020, 0xE007F },
 { 0xE0100, 0xE01EF }
 };

 static const struct interval double_width[] = {
-{ /* plane */ 0x0, 0x1C },
-{ /* plane */ 0x1C, 0x21 },
-{ /* plane */ 0x21, 0x22 },
-{ /* plane */ 0x22, 0x23 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
 { 0x1100, 0x115F },
+{ 0x231A, 0x231B },
 { 0x2329, 0x232A },
+{ 0x23E9, 0x23EC },
+{ 0x23F0, 0x23F0 },
+{ 0x23F3, 0x23F3 },
+{ 0x25FD, 0x25FE },
+{ 0x2614, 0x2615 },
+{ 0x2648, 0x2653 },
+{ 0x267F, 0x267F },
+{ 0x2693, 0x2693 },
+{ 0x26A1, 0x26A1 },
+{ 0x26AA, 0x26AB },
+{ 0x26BD, 0x26BE },
+{ 0x26C4, 0x26C5 },
+{ 0x26CE, 0x26CE },
+{ 0x26D4, 0x26D4 },
+{ 0x26EA, 0x26EA },
+{ 0x26F2, 0x26F3 },
+{ 0x26F5, 0x26F5 },
+{ 0x26FA, 0x26FA },
+{ 0x26FD, 0x26FD },
+{ 0x2705, 0x2705 },
+{ 0x270A, 0x270B },
+{ 0x2728, 0x2728 },
+{ 0x274C, 0x274C },
+{ 0x274E, 0x274E },
+{ 0x2753, 0x2755 },
+{ 0x2757, 0x2757 },
+{ 0x2795, 0x2797 },
+{ 0x27B0, 0x27B0 },
+{ 0x27BF, 0x27BF },
+{ 0x2B1B, 0x2B1C },
+{ 0x2B50, 0x2B50 },
+{ 0x2B55, 0x2B55 },
 { 0x2E80, 0x2E99 },
 { 0x2E9B, 0x2EF3 },
 { 0x2F00, 0x2FD5 },
@@ -313,11 +358,49 @@ static const struct interval double_width[] = {
 { 0xFE68, 0xFE6B },
 { 0xFF01, 0xFF60 },
 { 0xFFE0, 0xFFE6 },
+{ 0x16FE0, 0x16FE0 },
+{ 0x17000, 0x187EC },
+{ 0x18800, 0x18AF2 },
 { 0x1B000, 0x1B001 },
+{ 0x1F004,
[PATCH v4 2/3] update-unicode.sh: strip the plane offsets from the double_width[] table
The function bisearch() in utf8.c does a pure binary search in double_width[]. It does not care about the 17 plane offsets which unicode/uniset/uniset prepends. Leaving the plane offsets in the table may cause wrong results.

Filter out the plane offsets in update-unicode.sh.

Reviewed-by: Torsten Bögershausen
Signed-off-by: Beat Bolli
---
 update_unicode.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 3c84270..4c1ec8d 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -30,7 +30,7 @@ fi &&
 		grep -v plane)
 };
 static const struct interval double_width[] = {
-	$(uniset/uniset --32 eaw:F,W)
+	$(uniset/uniset --32 eaw:F,W | grep -v plane)
 };
 EOF
 )
-- 
2.7.2
[PATCH v4 1/3] update-unicode.sh: automatically download newer definition files
Checking just for the unicode data files' existence is not sufficient; we should also download them if a newer version exists on the Unicode consortium's servers. Option -N of wget does this nicely for us.

Reviewed-by: Torsten Boegershausen
Signed-off-by: Beat Bolli
---
Diff to v3:
- change the Cc: into Reviewed-by: on Torsten's request
- include the old reroll diffs

Diff to v2:
- reorder the commits: fix all of update-unicode.sh first, then regenerate unicode_width.h only once

Diff to v1:
- reword the commit message
- add Torsten's Cc:

 update_unicode.sh | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 27af77c..3c84270 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -10,12 +10,8 @@ if ! test -d unicode; then
 	mkdir unicode
 fi &&
 ( cd unicode &&
-	if ! test -f UnicodeData.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
-	fi &&
-	if ! test -f EastAsianWidth.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
-	fi &&
+	wget -N http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt \
+		http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt &&
 	if ! test -d uniset; then
 		git clone https://github.com/depp/uniset.git
 	fi &&
-- 
2.7.2
[PATCH] docs: warn about possible '=' in clean/smudge filter process values
From: Lars Schneider

A pathname value in a clean/smudge filter process "key=value" pair can contain the '=' character (introduced in edcc858). Make the user aware of this issue in the docs, add a corresponding test case, and fix the issue in the filter process value parser of the example implementation in contrib.

Signed-off-by: Lars Schneider
---
 Documentation/gitattributes.txt        |  4 +++-
 contrib/long-running-filter/example.pl |  8 ++--
 t/t0021-conversion.sh                  | 20 ++--
 t/t0021/rot13-filter.pl                |  8 ++--
 4 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 976243a63e..e0b66c1220 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -435,7 +435,9 @@ to filter relative to the repository root.
 Right after the flush packet Git sends the content split in zero or more
 pkt-line packets and a flush packet to terminate content. Please note, that
 the filter must not send any response before it received the content and the
-final flush packet.
+final flush packet. Also note that the "value" of a "key=value" pair
+can contain the "=" character whereas the key would never contain
+that character.

 packet: git> command=smudge
 packet: git> pathname=path/testfile.dat
diff --git a/contrib/long-running-filter/example.pl b/contrib/long-running-filter/example.pl
index 39457055a5..a677569ddd 100755
--- a/contrib/long-running-filter/example.pl
+++ b/contrib/long-running-filter/example.pl
@@ -81,8 +81,12 @@ packet_txt_write("capability=smudge");
 packet_flush();

 while (1) {
-	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
-	my ($pathname) = packet_txt_read() =~ /^pathname=([^=]+)$/;
+	my ($command) = packet_txt_read() =~ /^command=(.+)$/;
+	my ($pathname) = packet_txt_read() =~ /^pathname=(.+)$/;
+
+	if ( $pathname eq "" ) {
+		die "bad pathname '$pathname'";
+	}

 	packet_bin_read();
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 4ea534e9fa..f3a0df2add 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,7 +93,7 @@ test_expect_success setup '
 	git checkout -- test test.t test.i &&

 	echo "content-test2" >test2.o &&
-	echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x.o"
+	echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x=.o"
 '

 script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -359,12 +359,12 @@ test_expect_success PERL 'required process filter should filter data' '
 		cp "$TEST_ROOT/test.o" test.r &&
 		cp "$TEST_ROOT/test2.o" test2.r &&
 		mkdir testsubdir &&
-		cp "$TEST_ROOT/test3 '\''sq'\'',\$x.o" "testsubdir/test3 '\''sq'\'',\$x.r" &&
+		cp "$TEST_ROOT/test3 '\''sq'\'',\$x=.o" "testsubdir/test3 '\''sq'\'',\$x=.r" &&
 		>test4-empty.r &&

 		S=$(file_size test.r) &&
 		S2=$(file_size test2.r) &&
-		S3=$(file_size "testsubdir/test3 '\''sq'\'',\$x.r") &&
+		S3=$(file_size "testsubdir/test3 '\''sq'\'',\$x=.r") &&

 		filter_git add . &&
 		cat >expected.log <<-EOF &&
@@ -373,7 +373,7 @@ test_expect_success PERL 'required process filter should filter data' '
 			IN: clean test.r $S [OK] -- OUT: $S . [OK]
 			IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
-			IN: clean testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			STOP
 		EOF
 		test_cmp_count expected.log rot13-filter.log &&
@@ -385,23 +385,23 @@ test_expect_success PERL 'required process filter should filter data' '
 			IN: clean test.r $S [OK] -- OUT: $S . [OK]
 			IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
-			IN: clean testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			IN: clean test.r $S [OK] -- OUT: $S . [OK]
 			IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: clean test4-empty.r 0 [OK] -- OUT: 0 [OK]
-			IN: clean testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			STOP
 		EOF
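The parsing rule the patch establishes (split a "key=value" pair only on the first '=', so the value may itself contain '=') is the same trick POSIX shell parameter expansion gives you. A minimal sketch with a hypothetical pathname:

```shell
# Split a "key=value" packet on the FIRST '=' only, so a value such
# as "path/test=file.dat" keeps its embedded '=' intact.
line="pathname=path/test=file.dat"
key=${line%%=*}    # strip the longest suffix starting at the first '='
value=${line#*=}   # strip the shortest prefix ending at the first '='
echo "$key"
echo "$value"
```

This mirrors the Perl change above, where `/^pathname=([^=]+)$/` (which rejects any '=' in the value) becomes `/^pathname=(.+)$/` (which anchors only on the first '=').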
Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
> On 30 Nov 2016, at 22:04, Christian Couder wrote:
>
> Goal
> ~~~~
>
> Git can store its objects only in the form of loose objects in
> separate files or packed objects in a pack file.
>
> To be able to better handle some kind of objects, for example big
> blobs, it would be nice if Git could store its objects in other object
> databases (ODB).

This is a great goal. I really hope we can use that to solve the pain points in the current Git <--> GitLFS integration! Thanks for working on this!

Minor nit: I feel the term "other" could be more expressive. Plus "database" might confuse people. What do you think about "External Object Storage" or something similar?

> Design
> ~~~~~~
>
> - "<command> have": the command should output the sha1, size and
>   type of all the objects the external ODB contains, one object per
>   line.

This looks impractical. If a repo has 10k external files with 100 versions each, then you need to read/transfer 1M hashes (this is not made up; I am working with Git repos that contain >>10k files in GitLFS). Wouldn't it be better if Git collected all the hashes that it currently needs and then asked the external ODBs whether they have them?

> - "<command> get <sha1>": the command should then read from the
>   external ODB the content of the object corresponding to <sha1> and
>   output it on stdout.
>
> - "<command> put <sha1> <size> <type>": the command should then read
>   from stdin an object and store it in the external ODB.

Based on my experience with Git clean/smudge filters, I think this kind of single-shot protocol will be a performance bottleneck as soon as people store more than 1000 files in the external ODB. Maybe you can reuse my "filter process protocol" (edcc858) here?

> * Transfer
>
> To transfer information about the blobs stored in the external ODB,
> some special refs, called "odb refs", similar to replace refs, are
> used.
>
> For now there should be one odb ref per blob. Each ref name should be
> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> in the external odb named <odbname>.
>
> These odb refs should all point to a blob that should be stored in the
> Git repository and contain information about the blob stored in the
> external odb. This information can be specific to the external odb.
> The repos can then share this information using commands like:
>
> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

The "odb ref" would point to a blob, and the blob could contain anything, right? E.g. it could contain an existing GitLFS pointer:

    version https://git-lfs.github.com/spec/v1
    oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
    size 12345

> Design discussion about performance
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Yeah, it is not efficient to fork/exec a command to just read or write
> one object to or from the external ODB. Batch calls and/or using a
> daemon and/or RPC should be used instead to be able to store regular
> objects in an external ODB. But for now the external ODB would be all
> about really big files, where the cost of a fork+exec should not
> matter much. If we later want to extend usage of external ODBs, yeah
> we will probably need to design other mechanisms.

I think we should leverage the learnings from GitLFS as much as possible. My learnings are:

(1) Fork/exec per object won't work. People have lots and lots of content that is not suited for Git (e.g. integration test data, images, ...).

(2) We need a good UI. I think it would be great if the average user did not even need to know about the ODB. Moving files explicitly with a "put" command seems impractical to me. GitLFS tracks files via filename, and that has a number of drawbacks, too. Do you see a way to define a customizable metric such as "move all files to ODB X that are gzip compressed larger than Y"?

> Future work
> ~~~~~~~~~~~
>
> I think that the odb refs don't prevent a regular fetch or push from
> wanting to send the objects that are managed by an external odb. So I
> am interested in suggestions about this problem. I will take a look at
> previous discussions and how other mechanisms (shallow clone, bundle
> v3, ...) handle this.

If the ODB configuration is stored in the Git repo, similar to .gitmodules, then every client that clones the ODB references would be able to resolve them, right?

Cheers,
Lars
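For readers who want to play with the proposed "have / get / put" verbs, a toy helper backed by a plain directory is enough to see the shape of the interface. Everything below (the function name, the directory layout, the sha1) is made up for illustration; the real protocol is still an RFC:

```shell
# ext_odb: toy external-ODB helper. "have" prints one
# "<sha1> <size> <type>" line per stored object, "get" streams an
# object to stdout, and "put" reads one from stdin.
ODB_DIR=$(mktemp -d)
ext_odb() {
	case "$1" in
	have)
		cat "$ODB_DIR"/*.meta 2>/dev/null ;;
	get)
		cat "$ODB_DIR/$2" ;;
	put)
		cat >"$ODB_DIR/$2" &&
		echo "$2 $3 $4" >"$ODB_DIR/$2.meta" ;;
	esac
}
# Store a blob under a (fake) sha1, then list and read it back.
printf 'hello' | ext_odb put 1234abcd 5 blob
ext_odb have
ext_odb get 1234abcd
```

Even this toy makes the batching concern above concrete: "have" must enumerate every stored object, so its output grows with the total history of external content, not with what the current operation needs.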
Re: [PATCH v3 1/3] update-unicode.sh: automatically download newer definition files
On 03.12.16 17:40, Torsten Bögershausen wrote:
> On Sat, Dec 03, 2016 at 02:19:31PM +0100, Beat Bolli wrote:
>> Checking just for the unicode data files' existence is not sufficient;
>> we should also download them if a newer version exists on the Unicode
>> consortium's servers. Option -N of wget does this nicely for us.
>>
>> Cc: Torsten Bögershausen
>
> The V3 series makes perfect sense, thanks for cleaning up my mess.

Yeah, it took me three tries, too :-)

> (And can we remove the Cc: line, or replace it with Reviewed-by?)

If you prefer, sure. Do you have any other comments?

Beat
Re: [PATCH v3 1/3] update-unicode.sh: automatically download newer definition files
On Sat, Dec 03, 2016 at 02:19:31PM +0100, Beat Bolli wrote:
> Checking just for the unicode data files' existence is not sufficient;
> we should also download them if a newer version exists on the Unicode
> consortium's servers. Option -N of wget does this nicely for us.
>
> Cc: Torsten Bögershausen

The V3 series makes perfect sense, thanks for cleaning up my mess.
(And can we remove the Cc: line, or replace it with Reviewed-by?)
Re: [PATCH] commit: make --only --allow-empty work without paths
On Sat, Dec 03, 2016 at 07:59:49AM +0100, Andreas Krey wrote:

> > OK. I'm not sure why you would want to create an empty commit in such a
> > case.
>
> User: Ok tool, make me a pullreq.
>
> Tool: But you haven't mentioned any issue
>       in your commit messages. Which are they?
>
> User: Ok, that would be A-123.
>
> Tool: git commit --allow-empty -m 'FIX: A-123'

OK. I think "tool" is slightly funny here, but I get that this is part of how the real world works. Thanks for illustrating.

> > Yes, I think --run is a misfeature (I actually had to look it up, as I
> ...
> > implicit. If a single test script is annoyingly long to run, I'd argue
>
> It wasn't about runtime but about output. I would have
> liked to see only the output of my still-failing test;
> a 'stop after test X' would be helpful there.

You can do --verbose-only=<pattern>, but if the test is failing, I typically use "-v -i". That makes everything verbose, and then stops at the failing test, so you can see the output easily.

-Peff
[PATCH v3 1/3] update-unicode.sh: automatically download newer definition files
Checking just for the unicode data files' existence is not sufficient; we should also download them if a newer version exists on the Unicode consortium's servers. Option -N of wget does this nicely for us. Cc: Torsten Bögershausen Signed-off-by: Beat Bolli --- Diff to v2: - reorder the commits: fix all of update-unicode.sh first, then regenerate unicode_width.h only once update_unicode.sh | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/update_unicode.sh b/update_unicode.sh index 27af77c..3c84270 100755 --- a/update_unicode.sh +++ b/update_unicode.sh @@ -10,12 +10,8 @@ if ! test -d unicode; then mkdir unicode fi && ( cd unicode && - if ! test -f UnicodeData.txt; then - wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt - fi && - if ! test -f EastAsianWidth.txt; then - wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt - fi && + wget -N http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt \ + http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt && if ! test -d uniset; then git clone https://github.com/depp/uniset.git fi && -- 2.7.2
[PATCH v3 2/3] update-unicode.sh: strip the plane offsets from the double_width[] table
The function bisearch() in utf8.c does a pure binary search in double_width. It does not care about the 17 plane offsets which unicode/uniset/uniset prepends. Leaving the plane offsets in the table may cause wrong results. Filter out the plane offsets in update-unicode.sh. Cc: Torsten Bögershausen Signed-off-by: Beat Bolli --- update_unicode.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/update_unicode.sh b/update_unicode.sh index 3c84270..4c1ec8d 100755 --- a/update_unicode.sh +++ b/update_unicode.sh @@ -30,7 +30,7 @@ fi && grep -v plane) }; static const struct interval double_width[] = { - $(uniset/uniset --32 eaw:F,W) + $(uniset/uniset --32 eaw:F,W | grep -v plane) }; EOF ) -- 2.7.2
[PATCH v3 3/3] unicode_width.h: update the tables to Unicode 9.0
Rerunning update-unicode.sh that we fixed in the two previous commits produces these new tables. Signed-off-by: Beat Bolli--- unicode_width.h | 131 +--- 1 file changed, 107 insertions(+), 24 deletions(-) diff --git a/unicode_width.h b/unicode_width.h index 47cdd23..02207be 100644 --- a/unicode_width.h +++ b/unicode_width.h @@ -25,7 +25,7 @@ static const struct interval zero_width[] = { { 0x0825, 0x0827 }, { 0x0829, 0x082D }, { 0x0859, 0x085B }, -{ 0x08E4, 0x0902 }, +{ 0x08D4, 0x0902 }, { 0x093A, 0x093A }, { 0x093C, 0x093C }, { 0x0941, 0x0948 }, @@ -120,6 +120,7 @@ static const struct interval zero_width[] = { { 0x17C9, 0x17D3 }, { 0x17DD, 0x17DD }, { 0x180B, 0x180E }, +{ 0x1885, 0x1886 }, { 0x18A9, 0x18A9 }, { 0x1920, 0x1922 }, { 0x1927, 0x1928 }, @@ -158,7 +159,7 @@ static const struct interval zero_width[] = { { 0x1CF4, 0x1CF4 }, { 0x1CF8, 0x1CF9 }, { 0x1DC0, 0x1DF5 }, -{ 0x1DFC, 0x1DFF }, +{ 0x1DFB, 0x1DFF }, { 0x200B, 0x200F }, { 0x202A, 0x202E }, { 0x2060, 0x2064 }, @@ -171,13 +172,13 @@ static const struct interval zero_width[] = { { 0x3099, 0x309A }, { 0xA66F, 0xA672 }, { 0xA674, 0xA67D }, -{ 0xA69F, 0xA69F }, +{ 0xA69E, 0xA69F }, { 0xA6F0, 0xA6F1 }, { 0xA802, 0xA802 }, { 0xA806, 0xA806 }, { 0xA80B, 0xA80B }, { 0xA825, 0xA826 }, -{ 0xA8C4, 0xA8C4 }, +{ 0xA8C4, 0xA8C5 }, { 0xA8E0, 0xA8F1 }, { 0xA926, 0xA92D }, { 0xA947, 0xA951 }, @@ -204,7 +205,7 @@ static const struct interval zero_width[] = { { 0xABED, 0xABED }, { 0xFB1E, 0xFB1E }, { 0xFE00, 0xFE0F }, -{ 0xFE20, 0xFE2D }, +{ 0xFE20, 0xFE2F }, { 0xFEFF, 0xFEFF }, { 0xFFF9, 0xFFFB }, { 0x101FD, 0x101FD }, @@ -228,16 +229,21 @@ static const struct interval zero_width[] = { { 0x11173, 0x11173 }, { 0x11180, 0x11181 }, { 0x111B6, 0x111BE }, +{ 0x111CA, 0x111CC }, { 0x1122F, 0x11231 }, { 0x11234, 0x11234 }, { 0x11236, 0x11237 }, +{ 0x1123E, 0x1123E }, { 0x112DF, 0x112DF }, { 0x112E3, 0x112EA }, -{ 0x11301, 0x11301 }, +{ 0x11300, 0x11301 }, { 0x1133C, 0x1133C }, { 0x11340, 0x11340 }, { 0x11366, 0x1136C }, { 
0x11370, 0x11374 }, +{ 0x11438, 0x1143F }, +{ 0x11442, 0x11444 }, +{ 0x11446, 0x11446 }, { 0x114B3, 0x114B8 }, { 0x114BA, 0x114BA }, { 0x114BF, 0x114C0 }, @@ -245,6 +251,7 @@ static const struct interval zero_width[] = { { 0x115B2, 0x115B5 }, { 0x115BC, 0x115BD }, { 0x115BF, 0x115C0 }, +{ 0x115DC, 0x115DD }, { 0x11633, 0x1163A }, { 0x1163D, 0x1163D }, { 0x1163F, 0x11640 }, @@ -252,6 +259,16 @@ static const struct interval zero_width[] = { { 0x116AD, 0x116AD }, { 0x116B0, 0x116B5 }, { 0x116B7, 0x116B7 }, +{ 0x1171D, 0x1171F }, +{ 0x11722, 0x11725 }, +{ 0x11727, 0x1172B }, +{ 0x11C30, 0x11C36 }, +{ 0x11C38, 0x11C3D }, +{ 0x11C3F, 0x11C3F }, +{ 0x11C92, 0x11CA7 }, +{ 0x11CAA, 0x11CB0 }, +{ 0x11CB2, 0x11CB3 }, +{ 0x11CB5, 0x11CB6 }, { 0x16AF0, 0x16AF4 }, { 0x16B30, 0x16B36 }, { 0x16F8F, 0x16F92 }, @@ -262,31 +279,59 @@ static const struct interval zero_width[] = { { 0x1D185, 0x1D18B }, { 0x1D1AA, 0x1D1AD }, { 0x1D242, 0x1D244 }, +{ 0x1DA00, 0x1DA36 }, +{ 0x1DA3B, 0x1DA6C }, +{ 0x1DA75, 0x1DA75 }, +{ 0x1DA84, 0x1DA84 }, +{ 0x1DA9B, 0x1DA9F }, +{ 0x1DAA1, 0x1DAAF }, +{ 0x1E000, 0x1E006 }, +{ 0x1E008, 0x1E018 }, +{ 0x1E01B, 0x1E021 }, +{ 0x1E023, 0x1E024 }, +{ 0x1E026, 0x1E02A }, { 0x1E8D0, 0x1E8D6 }, +{ 0x1E944, 0x1E94A }, { 0xE0001, 0xE0001 }, { 0xE0020, 0xE007F }, { 0xE0100, 0xE01EF } }; static const struct interval double_width[] = { -{ /* plane */ 0x0, 0x1C }, -{ /* plane */ 0x1C, 0x21 }, -{ /* plane */ 0x21, 0x22 }, -{ /* plane */ 0x22, 0x23 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, { 0x1100, 0x115F }, +{ 0x231A, 0x231B }, { 0x2329, 0x232A }, +{ 0x23E9, 0x23EC }, +{ 0x23F0, 0x23F0 }, +{ 0x23F3, 0x23F3 }, +{ 0x25FD, 0x25FE }, +{ 0x2614, 0x2615 }, +{ 0x2648, 
0x2653 }, +{ 0x267F, 0x267F }, +{ 0x2693, 0x2693 }, +{ 0x26A1, 0x26A1 }, +{ 0x26AA, 0x26AB }, +{ 0x26BD, 0x26BE }, +{ 0x26C4, 0x26C5 }, +{ 0x26CE, 0x26CE }, +{ 0x26D4, 0x26D4 }, +{ 0x26EA, 0x26EA }, +{ 0x26F2, 0x26F3 }, +{ 0x26F5, 0x26F5 }, +{ 0x26FA, 0x26FA }, +{ 0x26FD, 0x26FD }, +{ 0x2705, 0x2705 }, +{ 0x270A, 0x270B }, +{ 0x2728, 0x2728 }, +{ 0x274C, 0x274C }, +{ 0x274E, 0x274E }, +{ 0x2753, 0x2755 }, +{ 0x2757, 0x2757 }, +{ 0x2795, 0x2797 }, +{ 0x27B0, 0x27B0 }, +{ 0x27BF, 0x27BF }, +{ 0x2B1B, 0x2B1C }, +{ 0x2B50, 0x2B50 }, +{ 0x2B55, 0x2B55 }, { 0x2E80, 0x2E99 }, { 0x2E9B, 0x2EF3 }, { 0x2F00, 0x2FD5 }, @@ -313,11 +358,49 @@ static const struct interval double_width[] = { { 0xFE68, 0xFE6B }, { 0xFF01, 0xFF60 }, { 0xFFE0, 0xFFE6 }, +{ 0x16FE0, 0x16FE0 }, +{ 0x17000, 0x187EC }, +{ 0x18800, 0x18AF2 }, { 0x1B000, 0x1B001 }, +{ 0x1F004,
[PATCH v2 1/3] update-unicode.sh: automatically download newer definition files
Checking just for the unicode data files' existence is not sufficient; we should also download them if a newer version exists on the Unicode consortium's servers. Option -N of wget does this nicely for us. Cc: Torsten Bögershausen Signed-off-by: Beat Bolli --- Diff to v1: - reword the commit message - add Torsten's Cc: update_unicode.sh | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/update_unicode.sh b/update_unicode.sh index 27af77c..3c84270 100755 --- a/update_unicode.sh +++ b/update_unicode.sh @@ -10,12 +10,8 @@ if ! test -d unicode; then mkdir unicode fi && ( cd unicode && - if ! test -f UnicodeData.txt; then - wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt - fi && - if ! test -f EastAsianWidth.txt; then - wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt - fi && + wget -N http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt \ + http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt && if ! test -d uniset; then git clone https://github.com/depp/uniset.git fi && -- 2.7.2
[PATCH v2 3/3] unicode_width.h: fix the double_width[] table
The function bisearch() in utf8.c does a pure binary search in double_width. It does not care about the 17 plane offsets which unicode/uniset/uniset prepends. Leaving the plane offsets in the table may cause wrong results. Filter out the plane offsets in update-unicode.sh and regenerate the table. Cc: Torsten Bögershausen Signed-off-by: Beat Bolli --- Diff to v1: - add Torsten's Cc: unicode_width.h | 17 - update_unicode.sh | 2 +- 2 files changed, 1 insertion(+), 18 deletions(-) diff --git a/unicode_width.h b/unicode_width.h index 73b5fd6..02207be 100644 --- a/unicode_width.h +++ b/unicode_width.h @@ -297,23 +297,6 @@ static const struct interval zero_width[] = { { 0xE0100, 0xE01EF } }; static const struct interval double_width[] = { -{ /* plane */ 0x0, 0x3D }, -{ /* plane */ 0x3D, 0x68 }, -{ /* plane */ 0x68, 0x69 }, -{ /* plane */ 0x69, 0x6A }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, { 0x1100, 0x115F }, { 0x231A, 0x231B }, { 0x2329, 0x232A }, diff --git a/update_unicode.sh b/update_unicode.sh index 3c84270..4c1ec8d 100755 --- a/update_unicode.sh +++ b/update_unicode.sh @@ -30,7 +30,7 @@ fi && grep -v plane) }; static const struct interval double_width[] = { - $(uniset/uniset --32 eaw:F,W) + $(uniset/uniset --32 eaw:F,W | grep -v plane) }; EOF ) -- 2.7.2
[PATCH 3/3] unicode_width.h: fix the double_width[] table
The function bisearch() in utf8.c does a pure binary search in double_width. It does not care about the 17 plane offsets which unicode/uniset/uniset prepends. Leaving the plane offsets in the table may cause wrong results. Filter out the plane offsets in update-unicode.sh and regenerate the table. Signed-off-by: Beat Bolli --- unicode_width.h | 17 - update_unicode.sh | 2 +- 2 files changed, 1 insertion(+), 18 deletions(-) diff --git a/unicode_width.h b/unicode_width.h index 73b5fd6..02207be 100644 --- a/unicode_width.h +++ b/unicode_width.h @@ -297,23 +297,6 @@ static const struct interval zero_width[] = { { 0xE0100, 0xE01EF } }; static const struct interval double_width[] = { -{ /* plane */ 0x0, 0x3D }, -{ /* plane */ 0x3D, 0x68 }, -{ /* plane */ 0x68, 0x69 }, -{ /* plane */ 0x69, 0x6A }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, -{ /* plane */ 0x0, 0x0 }, { 0x1100, 0x115F }, { 0x231A, 0x231B }, { 0x2329, 0x232A }, diff --git a/update_unicode.sh b/update_unicode.sh index 3c84270..4c1ec8d 100755 --- a/update_unicode.sh +++ b/update_unicode.sh @@ -30,7 +30,7 @@ fi && grep -v plane) }; static const struct interval double_width[] = { - $(uniset/uniset --32 eaw:F,W) + $(uniset/uniset --32 eaw:F,W | grep -v plane) }; EOF ) -- 2.7.2
Re: git reset --hard should not irretrievably destroy new files
On Sat, Dec 3, 2016 at 6:04 AM, Julian de Bhal wrote: > If you `git add new_file; git reset --hard`, new_file is gone forever. > > This is totally what git says it will do on the box, but it caught me out. Yeah, you are not the first one, and probably not the last unfortunately, to be caught by it, see for example the last discussion about it: https://public-inbox.org/git/loom.20160523t023140-...@post.gmane.org/ which itself refers to this previous discussion: https://public-inbox.org/git/CANWD=rx-meis4cnzdwr2wwkshz2zu8-l31urkwbzrjsbcjx...@mail.gmail.com/ > It might seem a little less stupid if I explain what I was doing: I was > breaking apart a chunk of work into smaller changes: > > git commit -a -m 'tmp' # You feel pretty safe now, right? > git checkout -b backup/my-stuff # Not necessary, just a convenience > git checkout - > git reset HEAD^ # mixed > git add new_file > git add -p # also not necessary, but distracting > git reset --hard # decided copy from backed up diff > # boom. new_file is gone forever > > > Now, again, this is totally what git says it's going to do, and that was > pretty stupid, but that file is gone for good, and it feels bad. Yeah, I agree that it feels bad even if there are often ways to get back your data as you can see from the links in Yotam's email above. > Everything that was committed is safe, and the other untracked files in > my local directory are also fine, but that particular file is > permanently destroyed. This is the first time I've lost something since I > discovered the reflog a year or two ago. > > The behaviour that would make the most sense to me (personally) would be > for a hard reset to unstage new files, This has already been proposed last time... > but I'd be nearly as happy if a > commit was added to the reflog when the reset happens (I can probably make > that happen with some configuration now that I've been bitten). Not sure if this has been proposed.
Perhaps it would be simpler to just output the sha1 (and maybe the filenames too) of the blobs that are no longer referenced from any tree, somewhere (in a bloblog?). > If there's support for this idea but no-one is keen to write the code, let > me know and I could have a crack at it. Not sure if your report and your offer will make us more likely to agree to do something, but thanks for trying!
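For what it's worth, "gone forever" here usually means "gone until pruned": the blob that `git add` created stays in the object database even after the hard reset, and `git fsck --lost-found` can dig it out. A minimal sketch of that recovery path (repository and file names are illustrative):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m init

echo 'precious work' >new_file
git add new_file          # creates a blob object in .git/objects
git reset --hard -q       # removes new_file from the index AND the worktree

# The blob is now dangling but not deleted; fsck copies dangling
# blobs out to .git/lost-found/other/, named by their sha1.
git fsck --lost-found >/dev/null
cat .git/lost-found/other/*   # prints: precious work
```

Note that the reflog is no help in this situation because the blob was never reachable from any commit, which is exactly the gap a separate "bloblog" would fill.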