date:20150405

On Thu, Apr 02, 2015 at 03:05:35PM -0600, Alex Henrie wrote:
 This partially reverts commit 8d849957d81fc0480a52570d66cc3c2a688ecb1b.

... and brings back the bug that 8d849957d81f solves, as far as I can
see.  If that's not the case then you need to explain that in the
patch description.

Paul.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] gitk: Rearrange window title to be more conventional.

On Mon, Mar 23, 2015 at 10:18:16AM -0400, Marc Branchaud wrote:
 Signed-off-by: Marc Branchaud marcn...@xiplink.com

Thanks, applied.

Paul.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH 2/2] gitk: Show the rev(s) the user specified on the command line in the window title.

On Tue, Jan 06, 2015 at 12:52:00PM -0500, Marc Branchaud wrote:
 Signed-off-by: Marc Branchaud marcn...@xiplink.com
 ---
 
 I often open multiple gitk windows in the same working directory to examine
 other branches or refs in the repo.  This change allows me to distinguish
 which window is showing what.
 
 This is an RFC because it doesn't behave great with views.  I don't use views
 at all, so this doesn't bother me.  I'm not too sure what should be displayed
 if the user changes views -- probably the view name, but I'm not sure how to
 get a that in the code.

The view name is in $viewname($curview).  If that is Command line
you probably want to show some selected command line arguments.

Using $vrevs($curview) seems about right, though it will contain
--gitk-symmetric-diff-marker in some situations, which is an
internal thing that we don't want to show externally.  You should
probably filter it out.

The patch will need a proper description before I can apply it, of
course.

Paul.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] gitk: Fix bad English grammar Matches none Commit Info

On Thu, Apr 02, 2015 at 03:05:06PM -0600, Alex Henrie wrote:
 Signed-off-by: Alex Henrie alexhenri...@gmail.com
 ---
  gitk | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/gitk b/gitk
 index 9a2daf3..30fcd30 100755
 --- a/gitk
 +++ b/gitk
 @@ -4066,7 +4066,7 @@ set known_view_options {
  {committer t15  .  --committer=*  {mc Committer:}}
  {loginfo   t15  .. --grep=*   {mc Commit Message:}}
  {allmatch  b.. --all-match{mc Matches all Commit Info 
 criteria}}
 -{igrep b.. --invert-grep  {mc Matches none Commit Info 
 criteria}}
 +{igrep b.. --invert-grep  {mc Matches no Commit Info 
 criteria}}

Thanks, applied.

Paul.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] gitk: sv.po: Update Swedish translation (305t0f0u)

On Fri, Mar 27, 2015 at 10:34:25AM +0100, Peter Krefting wrote:
 Please find attached (for text encoding reasons) an update to the Swedish
 translation for gitk.

Thanks, applied.

Paul.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/6] address packed-refs speed regressions

2015-04-05 Thread René Scharfe


Am 05.04.2015 um 20:52 schrieb Jeff King:

On Sun, Apr 05, 2015 at 03:41:39PM +0200, René Scharfe wrote:

I wonder if pluggable reference backends could help here.  Storing refs
in a database table indexed by refname should simplify things.


...this. I think that effort might be better spent on a ref storage
format that's more efficient, simpler (with respect to subtle races and
such), and could provide other features (e.g., transactional atomicity).


Such as a DBMS? :-)  Leaving storage details to SQLite or whatever 
sounds attractive to me because I'm lazy.



The big plus side of packed-refs improvements is that they just work
without worrying about compatibility issues. But ref storage is local,
so I'm not sure how big a deal that is in practice.


Adding a dependency is a big step, admittedly, so native improvements 
might be a better fit.  There's a chance that we'd run into issues 
already solved by specialized database engines long ago, though.



Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping
the packed refs file?  What numbers do you get with the following patch?


It's about 9% faster than my series + the fgets optimization I posted
(or about 25% than using getc).  Which is certainly nice, but I was
really hoping to just make strbuf_getline faster for all callers, rather
than introducing special code for one call-site.  Certainly we could
generalize the technique (i.e., a struct with the mmap data), but then I
feel we are somewhat reinventing stdio. Which is maybe a good thing,
because stdio has a lot of rough edges (as seen here), but it does feel
a bit like NIH syndrome.


Forgot to say: I like your changes.  But if strbuf_getline can only be 
made fast enough beyond that by duplicating stdio buffering then I feel 
it's better to take a different way.  E.g. dropping the requirement to 
handle NUL chars and basing it on fgets as Junio suggested in his reply 
to patch 3 sounds good.


In any case, the packed refs file seems special enough to receive 
special treatment.  Using mmap would make the most sense if we could 
also avoid copying lines to a strbuf for parsing, though.


René
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH V3] t9814: Guarantee only one source exists in git-p4 copy tests

2015-04-05 Thread Vitor Antunes

By using a tree with multiple identical files and allowing copy detection to
choose any one of them, the check in the test is unnecessarily complex.  We can
simplify by:

* Modify source file (file2) before copying the file.
* Check that only file2 is the source in the output of p4 filelog.
* Remove all case statements and replace them with simple tests to check
  that source is file2.

Signed-off-by: Vitor Antunes vitor@gmail.com
Acked-by: Luke Diamand l...@diamand.org
---
 t/t9814-git-p4-rename.sh |   46 +++---
 1 file changed, 31 insertions(+), 15 deletions(-)

diff --git a/t/t9814-git-p4-rename.sh b/t/t9814-git-p4-rename.sh
index 8b9c295..99bb71b 100755
--- a/t/t9814-git-p4-rename.sh
+++ b/t/t9814-git-p4-rename.sh
@@ -132,6 +132,9 @@ test_expect_success 'detect copies' '
cd $git 
git config git-p4.skipSubmitEdit true 
 
+   echo file8 file2 
+   git commit -a -m Differentiate file2 
+   git p4 submit 
cp file2 file8 
git add file8 
git commit -a -m Copy file2 to file8 
@@ -140,6 +143,10 @@ test_expect_success 'detect copies' '
p4 filelog //depot/file8 
p4 filelog //depot/file8 | test_must_fail grep -q branch from 

 
+   echo file9 file2 
+   git commit -a -m Differentiate file2 
+   git p4 submit 
+
cp file2 file9 
git add file9 
git commit -a -m Copy file2 to file9 
@@ -149,28 +156,39 @@ test_expect_success 'detect copies' '
p4 filelog //depot/file9 
p4 filelog //depot/file9 | test_must_fail grep -q branch from 

 
+   echo file10 file2 
+   git commit -a -m Differentiate file2 
+   git p4 submit 
+
echo file2 file2 
cp file2 file10 
git add file2 file10 
git commit -a -m Modify and copy file2 to file10 
git diff-tree -r -C HEAD 
+   src=$(git diff-tree -r -C HEAD | sed 1d | sed 2d | cut -f2) 
+   test $src = file2 
git p4 submit 
p4 filelog //depot/file10 
-   p4 filelog //depot/file10 | grep -q branch from //depot/file 

+   p4 filelog //depot/file10 | grep -q branch from //depot/file2 

+
+   echo file11 file2 
+   git commit -a -m Differentiate file2 
+   git p4 submit 
 
cp file2 file11 
git add file11 
git commit -a -m Copy file2 to file11 
git diff-tree -r -C --find-copies-harder HEAD 
src=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | 
cut -f2) 
-   case $src in
-   file2 | file10) : ;; # happy
-   *) false ;; # not
-   esac 
+   test $src = file2 
git config git-p4.detectCopiesHarder true 
git p4 submit 
p4 filelog //depot/file11 
-   p4 filelog //depot/file11 | grep -q branch from //depot/file 

+   p4 filelog //depot/file11 | grep -q branch from //depot/file2 

+
+   echo file12 file2 
+   git commit -a -m Differentiate file2 
+   git p4 submit 
 
cp file2 file12 
echo some text file12 
@@ -180,15 +198,16 @@ test_expect_success 'detect copies' '
level=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d 
| cut -f1 | cut -d  -f5 | sed s/C0*//) 
test -n $level  test $level -gt 0  test $level -lt 98 

src=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | 
cut -f2) 
-   case $src in
-   file10 | file11) : ;; # happy
-   *) false ;; # not
-   esac 
+   test $src = file2 
git config git-p4.detectCopies $(($level + 2)) 
git p4 submit 
p4 filelog //depot/file12 
p4 filelog //depot/file12 | test_must_fail grep -q branch 
from 
 
+   echo file13 file2 
+   git commit -a -m Differentiate file2 
+   git p4 submit 
+
cp file2 file13 
echo different text file13 
git add file13 
@@ -197,14 +216,11 @@ test_expect_success 'detect copies' '
level=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d 
| cut -f1 | cut -d  -f5 | sed s/C0*//) 
test -n $level  test $level -gt 2  test $level -lt 
100 
src=$(git diff-tree -r -C --find-copies-harder HEAD | sed 1d | 
cut -f2) 
-   case $src in
-   file10 | file11 | file12) : ;; # happy
-   *) false ;; # not
-   esac 
+   test $src = file2 
git config

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

Jeff King p...@peff.net writes:

 On Sun, Apr 05, 2015 at 01:27:32AM -0400, Jeff King wrote:

 On Sun, Apr 05, 2015 at 12:56:14AM -0400, Jeff King wrote:
 
  The big downside is that our input strings are no longer NUL-clean
  (reading foo\0bar\n would yield just foo. I doubt that matters in
  the real world, but it does fail a few of the tests (e.g., t7008 tries
  to read a list of patterns which includes NUL, and we silently truncate
  the pattern rather than read in the NUL and barf).
 
 So there is this trick:
 
 diff --git a/strbuf.c b/strbuf.c
 index f319d8d..5ceebb7 100644
 --- a/strbuf.c
 +++ b/strbuf.c
 @@ -445,12 +445,13 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, 
 int term)
  strbuf_reset(sb);
  
  if (term == '\n') {
 +long pos = ftell(fp);
  strbuf_grow(sb, 256);
  if (!fgets(sb-buf, sb-alloc - 1, fp)) {
  strbuf_release(sb);
  return EOF;
  }
 -sb-len = strlen(sb-buf);
 +sb-len = ftell(fp) - pos;
  if (sb-buf[sb-len - 1] == '\n')
  return 0;
  }
 
 but much to my surprise it actually runs slower than the strlen version!


The later loop you have oops, the thing turns out to be longer than
we thought, so let's do byte-by-byte is protected with locking, but
this part is not, and that suggests me that the ftell-fgets-ftell
sequence we see here may have its own locking cost built-in in the
stdio library, too, perhaps?

 It also has a 32-bit overflow issue. There's fgetpos() as an
 alternative, but fpos_t is an opaque type, and we might not be able to
 do arithmetic on it (for that matter, I am not sure if arithmetic is
 strictly guaranteed on ftell() results). POSIX gives us ftello(), which
 returns an off_t. That would probably be fine.

 Actually, scratch that idea. ftell() always returns 0 on a non-seekable
 file, so we can't use it in the general case. And that probably explains
 the performance difference, too, if it is not keeping its own counter
 and relies on lseek(fileno(fp)) or similar.

Looked so promising, though ;-) X-.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: rev-list pretty format behavior

Oliver Runge oliver.ru...@gmail.com writes:

 I'm using git version 2.4.0-rc1. The same behavior exists in 2.1.0.

 Trying the same with rev-list results in:
 git rev-list --pretty=format:%h ... HEAD~3...HEAD
 commit 826aed50cbb072d8f159e4c8ba0f9bd3df21a234
 826aed5 ...
 commit 915e44c6357f3bd9d5fa498a201872c4367302d3
 915e44c ...
 commit 067178ed8a7822e6bc88ad606b707fc33658e6fc
 067178e ...

This is very much the designed behaviour, I would think.  IIRC, the
user-format support of rev-list was designed so that the scripts
can customize the output from rev-list -v, which was how scripts
were expected to read various pieces of information for each commit
originally.  And the 40-hex commit object name and/or a line that
begins with commit ... when a user format is used are meant to
serve as stable record separator (in that sense, having %H or %h in
the userformat given to rev-list is redundant) when these scripts
are reading output from rev-list.

A new option to tell rev-list that I am designing an output that
is a-line-per-commit with the userformat and do not need the default
record separator or I will arrange record separator myself would
be an acceptable thing to add, provided if many scripts yet to be
written would benefit from such a feature, though.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Why can't I stash submodule changes?

2015-04-05 Thread Shane da Silva

I’m having trouble understanding why I cannot stash changes to a submodule.

When adding a submodule to a repository (`git submodule add
./sub-repo`), I can then run `git stash` and `git stash pop` with
expected results—the submodule disappears and reappears in the working
tree.

However, when I try stashing an update to a submodule, `git stash`
reports “No local changes to save”. The following shell script
illustrates this behavior:


# Create repo
mkdir test-repo
cd test-repo
git init
git commit --allow-empty -m Initial commit

# Create submodule
mkdir sub-repo
cd sub-repo
git init
git commit --allow-empty -m Initial commit
cd -

# Add submodule
git submodule add ./sub-repo
git commit -m Add submodule

# Modify submodule
cd sub-repo
touch foo
git add foo
git commit -m Submodule changed
cd -

# Stash submodule change
git stash # ---Displays No local changes to save”


I’m trying to wrap my head around why this is the current behavior, as
I suspect this is intentional but it seems unexpected. If anyone can
shed any light on this, I would really appreciate it!

Thanks,

Shane
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 4/6] strbuf: add an optimized 1-character strbuf_grow

On Sun, Apr 05, 2015 at 10:13:21PM -0400, Eric Sunshine wrote:

  -   strbuf_grow(sb, 1);
  +   strbuf_grow_ch(sb);
 
 strbuf_grow_ch() seems overly special-case. What about instead taking
 advantage of inline strbuf_avail() to do something like this?
 
 if (!strbuf_avail())
 strbuf_grow(sb, 1);

Thanks, I somehow missed that function (despite it being a few line
above the one I added!).

I agree that strbuf_avail is a much better generic interface, and it
turns out to be just as fast (actually, a tiny bit faster in my tests).
I'll use that in the re-roll.

 (Minor tangent: The 1 is still slightly magical and potentially
 confusing for someone who doesn't know that the buffer is grown
 aggressively, so changing it to a larger number might make it more
 obvious to the casual reader that the buffer is in fact not being
 grown on every iteration.)

I agree this is slightly confusing (and I had to double-check how
strbuf_grow worked while writing this series). OTOH, this is not so much
about the 1 here as about how strbufs work. We care about the
amortized asymptotic cost. strbuf_add() has the same issue; we add more
bytes in each chunk, but we would still want to make sure that there is
a sub-linear relationship between the number of adds and the number of
allocations).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v2] gitk: Remove mc parameter from proc show_error

2015-04-05 Thread Alex Henrie

This is a better fix for 8d849957d81fc0480a52570d66cc3c2a688ecb1b.

All that was required to fix the original issue was to remove the extra
mc call, i.e. change [mc Sorry, gitk cannot run...] to simply
Sorry, gitk cannot run... Changing the signature of proc show_error
was unnecessary and introduced two new bugs: It made OK untranslatable
and mc translatable when the opposite should be true.

This new fix makes the string OK translatable and the string mc not
translatable, while leaving the string Sorry, gitk cannot run... not
translatable. It will take effect the next time `make update-po` is run.

Signed-off-by: Alex Henrie alexhenri...@gmail.com
---
 gitk | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gitk b/gitk
index 30fcd30..096389f 100755
--- a/gitk
+++ b/gitk
@@ -1894,13 +1894,13 @@ proc make_transient {window origin} {
 }
 }
 
-proc show_error {w top msg {mc mc}} {
+proc show_error {w top msg} {
 global NS
 if {![info exists NS]} {set NS }
 if {[wm state $top] eq withdrawn} { wm deiconify $top }
 message $w.m -text $msg -justify center -aspect 400
 pack $w.m -side top -fill x -padx 20 -pady 20
-${NS}::button $w.ok -default active -text [$mc OK] -command destroy $top
+${NS}::button $w.ok -default active -text [mc OK] -command destroy $top
 pack $w.ok -side bottom -fill x
 bind $top Visibility grab $top; focus $top
 bind $top Key-Return destroy $top
@@ -12011,7 +12011,7 @@ proc get_path_encoding {path} {
 # First check that Tcl/Tk is recent enough
 if {[catch {package require Tk 8.4} err]} {
 show_error {} . Sorry, gitk cannot run with this version of Tcl/Tk.\n\
-Gitk requires at least Tcl/Tk 8.4. list
+Gitk requires at least Tcl/Tk 8.4.
 exit 1
 }
 
-- 
2.3.5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 3/5] Documentation: add git-log --merges= option and log.merges config. var

Koosha Khajehmoogahi koo...@posteo.de writes:

 diff --git a/Documentation/rev-list-options.txt 
 b/Documentation/rev-list-options.txt
 index f620ee4..0bb2390 100644
 --- a/Documentation/rev-list-options.txt
 +++ b/Documentation/rev-list-options.txt
 @@ -96,12 +96,24 @@ if it is part of the log message.
  --remove-empty::
   Stop when a given path disappears from the tree.
  
 +--merges={show|hide|only}::
 ++
 +--
 +`show`: show both merge and non-merge commits
 +
 +`hide`: only show non-merge commits; same as `--max-parents=1`
 +
 +`only`: only show merge commits; same as `--min-parents=2`
 +
 +If `--merges=` is not specified, default value is `show`.
 +--
 ++
 +

I am not sure if the default value is `show` is something we would
even want to mention like this.  It does not tell the whole story
and may even confuse the users, who did

git log --merge
git log --max-parent=...

but did not say any --merges=something.

I think the importat point we want to teach users is that this is an
option to use when you want to limit what is output (and by default,
we show all but nothing else in the manpage says we hide things,
so...).  And it would be beneficial to highlight that 'show' is only
there to defeat an unusual log.merges setting in users' config.

Also the formatting of this part looks rather unusual.  I would have
expected that these three items to be listed as a true AsciiDoc
enumeration, not three hand-crafted enumration-looking separate
paragraphs.

Taking both points together, we may want to do something more like
this, perhaps?

--merges={show|hide|only}::

Limit the output by type of commits.

`hide`;;
Hide merge commits from the output.

`only`;;
Hide non-merge commits from the output (i.e showing
only merge commits).

`show`;;
Do not hide either merge or non-merge commits.  This
is primarily useful when the user has non-standard
setting of `log.merges` configuration variable that
needs to be overriden from the command line.


Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ANNOUNCE] Git Merge Contributors Summit, April 8th, Paris

2015-04-05 Thread Thomas Ferris Nicolaisen

On Tue, Feb 24, 2015 at 11:09 PM, Jeff King p...@peff.net wrote:
 I wanted to make one more announcement about this, since a few more
 details have been posted at:

   http://git-merge.com/

 since my last announcement. Specifically, I wanted to call attention to
 the contributor's summit on the 8th. Basically, there will be a space
 that can hold up to 50 people, it's open only to git (and JGit and
 libgit2) devs, and there isn't a planned agenda. So I want to:

   1. Encourage developers to come. You might meet some folks in person
  you've worked with online. And you can see how beautiful we all
  are.

   2. Get people thinking about what they would like to talk about.  In
  past GitTogethers, it's been a mix of people with prepared things
  to talk about, group discussions of areas, and general kibitzing.
  We can be spontaneous on the day of the event, but if you have a
  topic you want to bring up, you may want to give it some thought
  beforehand.

 If you are a git dev and want to come, please RSVP to Chris Kelly
 amateurhu...@github.com who is organizing the event. If you would like
 to come, but finances make it hard (either for travel, or for the
 conference fee), please talk to me off-list, and we may be able to help.

 If you have questions, please feel free to ask me, and I'll try to get
 answers from the GitHub folks who are organizing the event.


I'll be arriving around 11 am on the 8th, if anyone wants to record
something for the GitMinutes podcast [1]. Send me an email directly,
or just walk up to me at the conference and say hi! I'll hopefully be
hanging around the contributor's summit area with some microphones,
but I've been unable to get any feedback from GitHub about whether
this is OK, so.. I guess we'll just wing it when I get there.

[1] http://www.gitminutes.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/2] git-p4: Improve client path detection

2015-04-05 Thread Vitor Antunes

Luke Diamand l...@diamand.org wrote on Sun, 05 Apr 2015 20:27:11 +0100
 On 28/03/15 12:28, Vitor Antunes wrote:
  I'm adding a test case for a scenario I was confronted with when using 
  branch
  detection and a client view specification. It is possible that the 
  implemented
  fix may not cover all possible scenarios, but there is no regression in the
  available tests.

 Vitor, one thing I wondered about with this part of the change:

 -if entry[depotFile] == depotPath:
 +if entry[depotFile].find(depotPath) = 0:

 Does this mean that if 'p4 where' produces multiple lines of output that
 this will get confused, as it's just going to search for an instance of
 depotPath.

The reason why I introduced that was because in the test case I implemented (and
which reflects a scenario I am confronted with in my workplace) the branches
have a base directory that is removed in the client view mapping.
As such, we will have a situation where depotPath is //depot/branch1/ while
runninng p4 where will result in //depot/branch1/base/. To overcome this I
used find() instead of a direct comparison. Now that I think about that, I could
probably have used the simpler `if depotPath in entry[depotFile]`...

 The example in the Perforce man page for 'p4 where' would trigger this
 for example:

 http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_where.html

 -//a/b/file.txt //client/a/b/file.txt //home/user/root/a/b/file.txt
 //a/b/file.txt //client/b/file.txt /home/user/root/b/file.txt

These are examples where a simple comparison as was implemented would work.

 As an experiment, I hacked git-p4 to always use p4Where rather than
 getClientRoot(), which I would have thought ought to work, but while
 most of the tests passed, Pete's client-spec torture tests failed.

That was exactly my first approach and got to the same conclusion. I would have
investigated it further but since I haven't had much free time to invest in
solving this problem I decided to implement an intermediary solution that would
not introduce any regressions.

Vitor
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/6] address packed-refs speed regressions

2015-04-05 Thread René Scharfe


Am 05.04.2015 um 20:59 schrieb Jeff King:

Still, the numbers are promising. Here's are comparisons
against for-each-ref on torvalds/linux, which has a 218M
packed-refs file:

   $ time git for-each-ref \
   --format='%(objectname) %(refname)' \
   refs/remotes/2325298/ |
   wc -c
   44139

   real0m1.649s
   user0m1.332s
   sys 0m0.304s

   $ time ~peff/git-quick-list refs/remotes/2325298/ | wc -c
   44139

   real0m0.012s
   user0m0.004s
   sys 0m0.004s


Sweet numbers. :-P

I'm not familiar with refs.c, but its sheer size alone suggests that it 
won't be easy to integrate this prototype code there. :-/


René


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 4/6] strbuf: add an optimized 1-character strbuf_grow

2015-04-05 Thread Eric Sunshine

On Sat, Apr 4, 2015 at 9:11 PM, Jeff King p...@peff.net wrote:
 We have to call strbuf_grow anytime we are going to add data
 to a strbuf. In most cases, it's a noop (since we grow the
 buffer aggressively), and the cost of the function call and
 size check is dwarfed by the actual buffer operation.

 For a tight loop of single-character additions, though, this
 overhead is noticeable. Furthermore, the single-character
 case is much easier to check; since the extra parameter is
 1, we can do it without worrying about overflow.

 This patch adds a simple inline function for checking
 single-character growth. For the growth case, it just calls
 into the regular strbuf_grow(). This is redundant, as
 strbuf_grow will check again whether we need to grow. But it
 keeps our inline code simple, and most calls will not need
 to grow, so it's OK to treat this as a rare slow path.

 We apply the new function to strbuf_getwholeline. [...]

 Signed-off-by: Jeff King p...@peff.net
 ---
 diff --git a/strbuf.c b/strbuf.c
 index af2bad4..2facd5f 100644
 --- a/strbuf.c
 +++ b/strbuf.c
 @@ -445,7 +445,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
 term)
 strbuf_reset(sb);
 flockfile(fp);
 while ((ch = getc_unlocked(fp)) != EOF) {
 -   strbuf_grow(sb, 1);
 +   strbuf_grow_ch(sb);

strbuf_grow_ch() seems overly special-case. What about instead taking
advantage of inline strbuf_avail() to do something like this?

if (!strbuf_avail())
strbuf_grow(sb, 1);

(Minor tangent: The 1 is still slightly magical and potentially
confusing for someone who doesn't know that the buffer is grown
aggressively, so changing it to a larger number might make it more
obvious to the casual reader that the buffer is in fact not being
grown on every iteration.)

 sb-buf[sb-len++] = ch;
 if (ch == term)
 break;
 diff --git a/strbuf.h b/strbuf.h
 index 1883494..ef41151 100644
 --- a/strbuf.h
 +++ b/strbuf.h
 @@ -137,6 +137,15 @@ static inline size_t strbuf_avail(const struct strbuf 
 *sb)
   */
  extern void strbuf_grow(struct strbuf *, size_t);

 +/*
 + * An optimized version of strbuf_grow() for a single character.
 + */
 +static inline void strbuf_grow_ch(struct strbuf *sb)
 +{
 +   if (!sb-alloc || sb-alloc - 1 = sb-len)
 +   strbuf_grow(sb, 1);
 +}
 +
  /**
   * Set the length of the buffer to a given value. This function does *not*
   * allocate new memory, so you should not perform a `strbuf_setlen()` to a
 --
 2.4.0.rc0.363.gf9f328b
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/6] address packed-refs speed regressions

On Mon, Apr 06, 2015 at 12:39:15AM +0200, René Scharfe wrote:

 ...this. I think that effort might be better spent on a ref storage
 format that's more efficient, simpler (with respect to subtle races and
 such), and could provide other features (e.g., transactional atomicity).
 
 Such as a DBMS? :-)  Leaving storage details to SQLite or whatever sounds
 attractive to me because I'm lazy.

Exactly. Though I think some folks were worried about the extra
dependency (e.g., I think SQLite is hard for JGit, because there's no
pure-java implementation, which makes Eclipse unhappy).

With pluggable backends we can make something like a SQLite backend
optional. I.e., use it if you want the benefits and can accept the
portability downsides. But that also risks fracturing the community, and
people on the old format being left behind.

 Forgot to say: I like your changes.  But if strbuf_getline can only be made
 fast enough beyond that by duplicating stdio buffering then I feel it's
 better to take a different way.  E.g. dropping the requirement to handle NUL
 chars and basing it on fgets as Junio suggested in his reply to patch 3
 sounds good.

Yeah, though we probably need to either audit the callers, or provide a
flag for each caller to turn on the speed-over-NULs behavior. I'll look
into that, but it may not be this week, as I'll be traveling starting
tomorrow.

 In any case, the packed refs file seems special enough to receive special
 treatment.  Using mmap would make the most sense if we could also avoid
 copying lines to a strbuf for parsing, though.

I had a similar thought. Below is hacky patch, on top of your mmap
patch, that does this. It does shave off another 300ms (around 5%).

I think we may be getting into a useless area of micro-optimizing here,
though.  The results are noticeable on this ridiculous repository, but
probably not so much on real ones. The low-hanging fruit (e.g., dropping
time in half by using getc_unlocked) seems to provide the most bang for
the buck.

---
diff --git a/refs.c b/refs.c
index 144255f..708b49b 100644
--- a/refs.c
+++ b/refs.c
@@ -334,27 +334,40 @@ static int refname_is_safe(const char *refname)
return 1;
 }
 
-static struct ref_entry *create_ref_entry(const char *refname,
- const unsigned char *sha1, int flag,
- int check_name)
+static struct ref_entry *create_ref_entry_len(const char *refname, size_t len,
+ const unsigned char *sha1, int 
flag,
+ int check_name)
 {
-   int len;
struct ref_entry *ref;
 
+   /*
+* allocate before checking, since the check functions require
+* a NUL-terminated refname. And since we die() anyway if
+* the check fails, the overhead of the extra malloc is negligible
+*/
+   ref = xmalloc(sizeof(struct ref_entry) + len + 1);
+   hashcpy(ref-u.value.sha1, sha1);
+   hashclr(ref-u.value.peeled);
+   memcpy(ref-name, refname, len);
+   ref-name[len] = '\0';
+   ref-flag = flag;
+
if (check_name 
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
die(Reference has invalid format: '%s', refname);
if (!check_name  !refname_is_safe(refname))
die(Reference has invalid name: '%s', refname);
-   len = strlen(refname) + 1;
-   ref = xmalloc(sizeof(struct ref_entry) + len);
-   hashcpy(ref-u.value.sha1, sha1);
-   hashclr(ref-u.value.peeled);
-   memcpy(ref-name, refname, len);
-   ref-flag = flag;
return ref;
 }
 
+static struct ref_entry *create_ref_entry(const char *refname,
+const unsigned char *sha1, int flag,
+int check_name)
+{
+   return create_ref_entry_len(refname, strlen(refname), sha1, flag,
+   check_name);
+}
+
 static void clear_ref_dir(struct ref_dir *dir);
 
 static void free_ref_entry(struct ref_entry *entry)
@@ -1095,7 +1108,9 @@ static const char PACKED_REFS_HEADER[] =
  * Return a pointer to the refname within the line (null-terminated),
  * or NULL if there was a problem.
  */
-static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1)
+static const char *parse_ref_line(const char *line, int len,
+ unsigned char *sha1,
+ size_t *refname_len)
 {
const char *ref;
 
@@ -1107,22 +1122,22 @@ static const char *parse_ref_line(struct strbuf *line, 
unsigned char *sha1)
 *  +1 (space in between hex and name)
 *  +1 (newline at the end of the line)
 */
-   if (line-len = 42)
+   if (len = 42)
return NULL;
 
-   if (get_sha1_hex(line-buf, sha1)  0)
+   if (get_sha1_hex(line, sha1)  0)
return NULL;
-   if (!isspace(line-buf[40]))

Re: gitk drawing bug

On Fri, Apr 03, 2015 at 09:28:00PM -0400, Martin d'Anjou wrote:
On 15-04-03 07:05 PM, Alex Henrie wrote:
2015-02-18 12:27 GMT-07:00 Martin d'Anjou martin.danjo...@gmail.com:
It appears I have uncovered inconsistent behaviour in gitk. Looks like
a bug. I have a picture here:
https://docs.google.com/document/d/19TTzGD94B9EEIrVU5mRMjfJFvF5Ar3MlPblRJfP5OdQ/edit?usp=sharing

Essentially, when I hit shift-F5, it sometimes draw the history
differently (still valid, but drawn differently). There is no change
in the repository between the shift-F5 keystrokes.

That's not a bug, it's a consequence of the fact that gitk is designed
to be fast. It only lays out as much of the graph is visible plus a
little more, not the whole graph, and it doesn't use any global
analysis. The reason for that is speed. Gitk is usable on a
repository with half a million commits, such as the linux kernel, and
to achieve that we can't afford to do wait until we have all the
commits read in and then do some computation over the whole topology;
it all has to be done incrementally. Also, the underlying git log
sometimes gives gitk a parent commit before one of its children, and
when that happens the topology has to be modified and thus the graph
does too, if any topology that has already been drawn gets modified.

As long as the graph correctly shows the relationships between
commits, it has achieved its purpose. If you (or anyone) can come up
with improvements that make it look nicer, that's great, and I'll
consider them as long as they don't slow down gitk on large
repositories to any noticeable extent.

Paul.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

How to send a warning message from git hosting server?

2015-04-05 Thread Yi, EungJun

Hello. I am serving a git hosting service for my company.

Sometimes I want to send a warning message to users who use my
service; e.g. the service will be shutdown tomorrow for a while
temporary.

I know it is possible to a remote message by hooks or HTTP body if an
error occured. But it seems that there is no hooks for git-fetch and
git does not print HTTP body if there is no error.

I want a way to response a remote message when a client send any kind
of request. Is it possible?

-EungJun
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

04.04.15

2015-04-05 Thread John Ryder

Need a Soft, Business, Home or Private Loan? Then Email Us Back as we would be 
glad to assist.

Regards,

John Ryder

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/6] address packed-refs speed regressions

2015-04-05 Thread René Scharfe

Am 05.04.2015 um 03:06 schrieb Jeff King:
 As I've mentioned before, I have some repositories with rather large
 numbers of refs. The worst one has ~13 million refs, for a 1.6GB
 packed-refs file. So I was saddened by this:
 
$ time git.v2.0.0 rev-parse refs/heads/foo /dev/null 21
real0m6.840s
user0m6.404s
sys 0m0.440s
 
$ time git.v2.4.0-rc1 rev-parse refs/heads/foo /dev/null 21
real0m19.432s
user0m18.996s
sys 0m0.456s
 
 The command isn't important; what I'm really measuring is loading the
 packed-refs file. And yes, of course this repository is absolutely
 ridiculous. But the slowdowns here are linear with the number of refs.
 So _every_ git command got a little bit slower, even in less crazy
 repositories. We just didn't notice it as much.
 
 Here are the numbers after this series:
 
real0m8.539s
user0m8.052s
sys 0m0.496s
 
 Much better, but I'm frustrated that they are still 20% slower than the
 original.
 
 The main culprits seem to be d0f810f (which introduced some extra
 expensive code for each ref) and my 10c497a, which switched from fgets()
 to strbuf_getwholeline. It turns out that strbuf_getwholeline is really
 slow.

10c497a changed read_packed_refs(), which reads *all* packed refs.
Each is checked for validity.  That sounds expensive if the goal is
just to look up a single (non-existing) ref.

Would it help to defer any checks until a ref is actually accessed?
Can a binary search be used instead of reading the whole file?

I wonder if pluggable reference backends could help here.  Storing refs
in a database table indexed by refname should simplify things.

Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping
the packed refs file?  What numbers do you get with the following patch?

---
 refs.c | 36 
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/refs.c b/refs.c
index 47e4e53..144255f 100644
--- a/refs.c
+++ b/refs.c
@@ -1153,16 +1153,35 @@ static const char *parse_ref_line(struct strbuf *line, 
unsigned char *sha1)
  *  compatibility with older clients, but we do not require it
  *  (i.e., peeled is a no-op if fully-peeled is set).
  */
-static void read_packed_refs(FILE *f, struct ref_dir *dir)
+static void read_packed_refs(int fd, struct ref_dir *dir)
 {
struct ref_entry *last = NULL;
struct strbuf line = STRBUF_INIT;
enum { PEELED_NONE, PEELED_TAGS, PEELED_FULLY } peeled = PEELED_NONE;
+   struct stat st;
+   void *map;
+   size_t mapsz, len;
+   const char *p;
+
+   fstat(fd, st);
+   mapsz = xsize_t(st.st_size);
+   if (!mapsz)
+   return;
+   map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0);
 
-   while (strbuf_getwholeline(line, f, '\n') != EOF) {
+   for (p = map, len = mapsz; len; ) {
unsigned char sha1[20];
const char *refname;
const char *traits;
+   const char *nl;
+   size_t linelen;
+
+   nl = memchr(p, '\n', len);
+   linelen = nl ? nl - p + 1 : len;
+   strbuf_reset(line);
+   strbuf_add(line, p, linelen);
+   p += linelen;
+   len -= linelen;
 
if (skip_prefix(line.buf, # pack-refs with:, traits)) {
if (strstr(traits,  fully-peeled ))
@@ -1204,6 +1223,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
}
 
strbuf_release(line);
+   munmap(map, mapsz);
 }
 
 /*
@@ -1224,16 +1244,16 @@ static struct packed_ref_cache 
*get_packed_ref_cache(struct ref_cache *refs)
clear_packed_ref_cache(refs);
 
if (!refs-packed) {
-   FILE *f;
+   int fd;
 
refs-packed = xcalloc(1, sizeof(*refs-packed));
acquire_packed_ref_cache(refs-packed);
refs-packed-root = create_dir_entry(refs, , 0, 0);
-   f = fopen(packed_refs_file, r);
-   if (f) {
-   stat_validity_update(refs-packed-validity, 
fileno(f));
-   read_packed_refs(f, get_ref_dir(refs-packed-root));
-   fclose(f);
+   fd = open(packed_refs_file, O_RDONLY);
+   if (fd = 0) {
+   stat_validity_update(refs-packed-validity, fd);
+   read_packed_refs(fd, get_ref_dir(refs-packed-root));
+   close(fd);
}
}
return refs-packed;
-- 
2.3.5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: how do I ignore a directory for diff

2015-04-05 Thread Lionel CHAZALLON

Hello John,

Thanks for the answer. I am also using some GUI client (smartgit). Is there any 
way to make this part of the repo attributes / configuration so that my git GUI 
would use it ? 

Lionel.

 Le 5 avr. 2015 à 14:17, John Keeping j...@keeping.me.uk a écrit :
 
 On Sun, Apr 05, 2015 at 11:31:54AM +, LongChair . wrote:
 I have been looking into ignoring a subdirectory of my tree for
 diffing with upstream.  I'll explain the situation below :
 
 My tree is a fork of an upstream repo.  There is a specific directory
 in my tree lets call it foo/bar that i would like to ignore for diff.
 This directory includes only files that i added to my repo and is
 therefore irrelevant for diffing (i know all files in there have been
 added and are not in upstream). Having there in the diff is just
 making a lot of files to appear and that is confusing to see what is
 changed from upstream.
 
 I have read the docs and found a way mentioning that i should add a
 line to .gitattributes with : foo/bar/* -diff
 
 But this still lists the files in there when i'm diffing.
 
 Is there any way to achieve this ? i cant find any clear explanation
 in the docs.
 
 Since git-diff takes a pathspec you can use the exclude magic to exclude
 certain directories like this:
 
   git diff upstream -- ':(top)' ':(exclude)foo/bar'
 
 or equivalently:
 
   git diff upstream -- :/ ':!foo/bar'
 
 The documentation for the pathspec syntax is in git-glossary(7).

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-05 Thread Duy Nguyen

On Sun, Apr 5, 2015 at 11:56 AM, Jeff King p...@peff.net wrote:
 So we'd have to either:

   1. Decide that doesn't matter.

   2. Have callers specify a damn the NULs, I want it fast flag.

2+. Avoid FILE* interface and go with syscalls for reading
packed-refs? If mmaping the entire file could be a problem for some
platform because it's too large, we have code for reading (with
bufferring) from fd somewhere, e.g. index-pack.

   3. Find some alternative that is more robust than fgets, and faster
  than getc. I don't think there is anything in stdio, but I am not
  above dropping in a faster non-portable call if it is available,
  and then falling back to the current code otherwise.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

how do I ignore a directory for diff

2015-04-05 Thread LongChair .

Hello,

I have been looking into ignoring a subdirectory of my tree for diffing with 
upstream.
I'll explain the situation below :

My tree is a fork of an upstream repo.
There is a specific directory in my tree lets call it foo/bar that i would like 
to ignore for diff. This directory includes only files that i added to my repo 
and is therefore irrelevant for diffing (i know all files in there have been 
added and are not in upstream). Having there in the diff is just making a lot 
of files to appear and that is confusing to see what is changed from upstream.

I have read the docs and found a way mentioning that i should add a line to 
.gitattributes with :
foo/bar/* -diff

But this still lists the files in there when i'm diffing.

Is there any way to achieve this ? i cant find any clear explanation in the 
docs.

Any help would be greatly appreciated :)

Lionel.

  --
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: how do I ignore a directory for diff

2015-04-05 Thread John Keeping

On Sun, Apr 05, 2015 at 11:31:54AM +, LongChair . wrote:
 I have been looking into ignoring a subdirectory of my tree for
 diffing with upstream.  I'll explain the situation below :
 
 My tree is a fork of an upstream repo.  There is a specific directory
 in my tree lets call it foo/bar that i would like to ignore for diff.
 This directory includes only files that i added to my repo and is
 therefore irrelevant for diffing (i know all files in there have been
 added and are not in upstream). Having there in the diff is just
 making a lot of files to appear and that is confusing to see what is
 changed from upstream.
 
 I have read the docs and found a way mentioning that i should add a
 line to .gitattributes with : foo/bar/* -diff
 
 But this still lists the files in there when i'm diffing.
 
 Is there any way to achieve this ? i cant find any clear explanation
 in the docs.

Since git-diff takes a pathspec you can use the exclude magic to exclude
certain directories like this:

git diff upstream -- ':(top)' ':(exclude)foo/bar'

or equivalently:

git diff upstream -- :/ ':!foo/bar'

The documentation for the pathspec syntax is in git-glossary(7).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: how do I ignore a directory for diff

2015-04-05 Thread John Keeping

On Sun, Apr 05, 2015 at 04:19:50PM +0200, Lionel CHAZALLON wrote:
  Le 5 avr. 2015 à 14:17, John Keeping j...@keeping.me.uk a écrit :
  
  On Sun, Apr 05, 2015 at 11:31:54AM +, LongChair . wrote:
  I have been looking into ignoring a subdirectory of my tree for
  diffing with upstream.  I'll explain the situation below :
  
  My tree is a fork of an upstream repo.  There is a specific directory
  in my tree lets call it foo/bar that i would like to ignore for diff.
  This directory includes only files that i added to my repo and is
  therefore irrelevant for diffing (i know all files in there have been
  added and are not in upstream). Having there in the diff is just
  making a lot of files to appear and that is confusing to see what is
  changed from upstream.
  
  I have read the docs and found a way mentioning that i should add a
  line to .gitattributes with : foo/bar/* -diff
  
  But this still lists the files in there when i'm diffing.
  
  Is there any way to achieve this ? i cant find any clear explanation
  in the docs.
  
  Since git-diff takes a pathspec you can use the exclude magic to exclude
  certain directories like this:
  
  git diff upstream -- ':(top)' ':(exclude)foo/bar'
  
  or equivalently:
  
  git diff upstream -- :/ ':!foo/bar'
  
  The documentation for the pathspec syntax is in git-glossary(7).
 
 Thanks for the answer. I am also using some GUI client (smartgit). Is
 there any way to make this part of the repo attributes / configuration
 so that my git GUI would use it ? 

I think you'll have to file a feature request with SmartGit if you want
support for this in their UI.

The standard way to set this up would be to create an alias that does
what you want, such as:

git config alias.d 'diff -- :/ :!foo/bar'

and use git d instead of git diff, but there is no way for other
programs to inherit that.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Feature request: Update remote url if it is redirected.

2015-04-05 Thread Yi, EungJun

Some git hosting services, like Github, a url to a git repository can
be changed by changing the name of the repository by the owner. If
someone tries to get the repository with the old url, usually the
hosting service serves the request with the repository indiciated by
the new url. It is very helpful for the users who don't know the url
has been changed.

In such case, I think it is recommended to update the user's remote
url with the new one because the old url possibily does not work if
the hosting service does not keep the old url anymore, or worse, works
incorrectly if someone makes a repository with the old url.

To fix the potential problem, it would be nice if Git client asks the
user, while fetching objects from a remote url, to update the url if
it is changed. If the client and the server is talking in HTTP, it is
easy to know the url is changed if the server responses with 301 Moved
Permantely or 308 Permanet Redirect.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type

2015-04-05 Thread karthik nayak




On 04/05/2015 01:16 PM, Junio C Hamano wrote:

karthik nayak karthik@gmail.com writes:


So, it makes me wonder what guarantee we have that this does not
dereference a NULL here.


As per my code, oi-typename is only pointing to something when oi-typep
is ( As oi-typename is currently only used in cat-file.c).
But what you're saying also is true, there is no other guarantee, as a user may
set oi-typename to point to a struct strbuf and leave out oi-typep.

  if (oi-typename  oi-typep)
  strbuf_addstr(oi-typename, typename(*oi-typep));

This should suffice. Do you want me to re-roll this?


I'd rather avoid the thinking along the lines of at this moment,
there happens to be only one caller that asks for typename and the
caller also sets typep, so we will be safe as long as we make sure
the caller passed typep before giving him typename back.

Somebody else may write new code that wants to learn the typename,
forgets to set typep, calls into this codepath, and ends up
scratching his head wondering why the typename string is returned to
him.  Surely the code may not crash at the new code you wrote, but
you are not helping him.



Yes! I do agree with you, If you read my previous reply, I did mention 
what you said about not having a guarantee that typep and typename are 
both set and another user might have a bug with this.




If it semantically does not make sense to ask for the typename
without asking for the type code, then we can and should make that
as a new calling convention _all_ callers must follow.

In other words, I think it is better to have

if (oi-typename  !oi-typep)
die(BUG);

somewhere near the beginning of the callchain that takes oi; that
will ensure all callers understand the rule.



Yes! this is a better approach as it will enforce that typep must be set 
when typename is set.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type

karthik nayak karthik@gmail.com writes:

 So, it makes me wonder what guarantee we have that this does not
 dereference a NULL here.

 As per my code, oi-typename is only pointing to something when oi-typep
 is ( As oi-typename is currently only used in cat-file.c).
 But what you're saying also is true, there is no other guarantee, as a user 
 may
 set oi-typename to point to a struct strbuf and leave out oi-typep.

  if (oi-typename  oi-typep)
  strbuf_addstr(oi-typename, typename(*oi-typep));

 This should suffice. Do you want me to re-roll this?

I'd rather avoid the thinking along the lines of at this moment,
there happens to be only one caller that asks for typename and the
caller also sets typep, so we will be safe as long as we make sure
the caller passed typep before giving him typename back.

Somebody else may write new code that wants to learn the typename,
forgets to set typep, calls into this codepath, and ends up
scratching his head wondering why the typename string is returned to
him.  Surely the code may not crash at the new code you wrote, but
you are not helping him.

If it semantically does not make sense to ask for the typename
without asking for the type code, then we can and should make that
as a new calling convention _all_ callers must follow.

In other words, I think it is better to have

if (oi-typename  !oi-typep)
die(BUG);

somewhere near the beginning of the callchain that takes oi; that
will ensure all callers understand the rule.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/6] address packed-refs speed regressions

On Sun, Apr 05, 2015 at 02:52:59PM -0400, Jeff King wrote:

 Right now we parse all of the packed-refs file into an in-memory cache,
 and then do single lookups from that cache. Doing an mmap() and a binary
 search is way faster (and costs less memory) for doing individual
 lookups. It relies on the list being sorted. This is generally true, but
 not something we currently rely on (however, it would be easy to add a
 sorted flag to top of the file and have the readers fall back when the
 flag is missing). I've played with a patch to do this (it's not entirely
 trivial, because you jump into the middle of a line, and then have to
 walk backwards to find the start of the record).
 
 For traversals, it's more complicated. Obviously if you are traversing
 all refs, you have to read the whole thing anyway. If you are traversing
 a subset of the refs, you can binary-search the start of the subset, and
 then walk forward. But that's where it gets tricky with the current
 code.

In case you are curious, here is my proof-of-concept for the packed-refs
binary search. You'll note that it's a separate program, and not
integrated into refs.c. I wrote this last August, and after trying to
integrate it into refs.c, I found the ref_cache problems I described,
and I haven't touched it since.

I also seem to have saved the patch for stuffing it into refs.c, but I
am not sure if it even compiles (I wrote only horrible wip in the
commit message ;) ).

-- 8 --
Subject: [PATCH] add git-quick-list

This is a proof of concept for binary-searching the
packed-refs file in order to traverse an ordered subset of
it. Note that it _only_ reads the packed-refs file
currently. To really compare to for-each-ref, it would need
to also walk the loose ref area for its prefix. On a
mostly-packed repository that shouldn't make a big speed
difference, though.

And of course we don't _really_ want a separate command here
at all. This should be part of refs.c, and everyone who
calls for_each_ref should benefit from it.

Still, the numbers are promising. Here's are comparisons
against for-each-ref on torvalds/linux, which has a 218M
packed-refs file:

  $ time git for-each-ref \
  --format='%(objectname) %(refname)' \
  refs/remotes/2325298/ |
  wc -c
  44139

  real0m1.649s
  user0m1.332s
  sys 0m0.304s

  $ time ~peff/git-quick-list refs/remotes/2325298/ | wc -c
  44139

  real0m0.012s
  user0m0.004s
  sys 0m0.004s
---
 Makefile |   1 +
 quick-list.c | 174 +++
 2 files changed, 175 insertions(+)
 create mode 100644 quick-list.c

diff --git a/Makefile b/Makefile
index 2457065..aa32598 100644
--- a/Makefile
+++ b/Makefile
@@ -541,6 +541,7 @@ PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
 PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
+PROGRAM_OBJS += quick-list.o
 
 # Binary suffix, set to .exe for Windows builds
 X =
diff --git a/quick-list.c b/quick-list.c
new file mode 100644
index 000..e423f1f
--- /dev/null
+++ b/quick-list.c
@@ -0,0 +1,174 @@
+#include cache.h
+#include refs.h
+
+struct packed_refs_iterator {
+   const char *start;
+   const char *end;
+
+   const char *cur;
+   const char *ref;
+   const char *eol;
+   const char *next;
+};
+
+static void iterator_init(struct packed_refs_iterator *pos,
+ const char *buf, size_t len)
+{
+   pos-start = buf;
+   pos-end = buf + len;
+
+   /* skip past header line */
+   if (pos-start  pos-end  *pos-start == '#') {
+   while (pos-start  pos-end  *pos-start != '\n')
+   pos-start++;
+   if (pos-start  pos-end)
+   pos-start++;
+   }
+}
+
+static int iterator_cmp(const char *key, struct packed_refs_iterator *pos)
+{
+   const char *ref = pos-ref;
+   for (; *key  ref  pos-eol; key++, ref++)
+   if (*key != *ref)
+   return (unsigned char)*key - (unsigned char)*ref;
+   return ref == pos-eol ? *key ? 1 : 0 : -1;
+}
+
+static const char *find_eol(const char *p, const char *end)
+{
+   p = memchr(p, '\n', end - p);
+   return p ? p : end;
+}
+
+static void parse_line(struct packed_refs_iterator *pos, const char *p)
+{
+   pos-cur = p;
+   if (pos-end - p  41)
+   die(truncated packed-refs file);
+   p += 41;
+
+   pos-ref = p;
+   pos-eol = p = find_eol(p, pos-end);
+
+   /* skip newline, and then past any peel records */
+   if (p  pos-end)
+   p++;
+   while (p  pos-end  *p == '^') {
+   p = find_eol(p, pos-end);
+   if (p  pos-end)
+   p++;
+   }
+   pos-next = p;
+}
+
+static void iterator_next(struct packed_refs_iterator *pos)
+{
+   if (pos-next  pos-end)
+   parse_line(pos, pos-next);
+   else
+   pos-cur = NULL;
+}
+
+static void

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

On Sun, Apr 05, 2015 at 09:36:04PM +0700, Duy Nguyen wrote:

 On Sun, Apr 5, 2015 at 11:56 AM, Jeff King p...@peff.net wrote:
  So we'd have to either:
 
1. Decide that doesn't matter.
 
2. Have callers specify a damn the NULs, I want it fast flag.
 
 2+. Avoid FILE* interface and go with syscalls for reading
 packed-refs? If mmaping the entire file could be a problem for some
 platform because it's too large, we have code for reading (with
 bufferring) from fd somewhere, e.g. index-pack.

There's strbuf_getwholeline_fd, but it's horrifically inefficient (one
syscall per character). But the other option is to implement your own
buffering, and we're generally better off letting stdio do that for us
(the exception here is that stdio does not have a good NUL-safe read
until X function).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type

2015-04-05 Thread Karthik Nayak

Update sha1_loose_object_info() to optionally allow it to read
from a loose object file of unknown/bogus type; as the function
usually returns the type of the object it read in the form of enum
for known types, add an optional typename field to receive the
name of the type in textual form and a flag to indicate the reading
of a loose object file of unknown/bogus type.

Add parse_sha1_header_extended() which acts as a wrapper around
parse_sha1_header() allowing more information to be obtained.

Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of
unknown/corrupt objects which have a unknown sha1 header size to
a strbuf structure. This was written by Junio C Hamano but tested
by me.

Helped-by: Junio C Hamano gits...@pobox.com
Helped-by: Eric Sunshine sunsh...@sunshineco.com
Signed-off-by: Karthik Nayak karthik@gmail.com
---
 cache.h |   2 ++
 sha1_file.c | 114 
 2 files changed, 94 insertions(+), 22 deletions(-)

diff --git a/cache.h b/cache.h
index c47323e..ea6faf0 100644
--- a/cache.h
+++ b/cache.h
@@ -881,6 +881,7 @@ extern int is_ntfs_dotgit(const char *name);
 
 /* object replacement */
 #define LOOKUP_REPLACE_OBJECT 1
+#define LOOKUP_LITERALLY 2
 extern void *read_sha1_file_extended(const unsigned char *sha1, enum 
object_type *type, unsigned long *size, unsigned flag);
 static inline void *read_sha1_file(const unsigned char *sha1, enum object_type 
*type, unsigned long *size)
 {
@@ -1349,6 +1350,7 @@ struct object_info {
unsigned long *sizep;
unsigned long *disk_sizep;
unsigned char *delta_base_sha1;
+   struct strbuf *typename;
 
/* Response */
enum {
diff --git a/sha1_file.c b/sha1_file.c
index 980ce6b..ac8fffc 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1564,6 +1564,34 @@ int unpack_sha1_header(git_zstream *stream, unsigned 
char *map, unsigned long ma
return git_inflate(stream, 0);
 }
 
+static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char 
*map,
+   unsigned long mapsize,
+   struct strbuf *header)
+{
+   unsigned char buffer[32], *cp;
+   unsigned long bufsiz = sizeof(buffer);
+   int status;
+
+   status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz);
+
+   if (status) {
+   strbuf_add(header, buffer, stream-next_out - buffer);
+   return status;
+   }
+
+   do {
+   status = git_inflate(stream, 0);
+   strbuf_add(header, buffer, stream-next_out - buffer);
+   for (cp = buffer; cp  stream-next_out; cp++)
+   if (!*cp)
+   /* Found the NUL at the end of the header */
+   return 0;
+   stream-next_out = buffer;
+   stream-avail_out = bufsiz;
+   } while (status != Z_STREAM_END);
+   return -1;
+}
+
 static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long 
size, const unsigned char *sha1)
 {
int bytes = strlen(buffer) + 1;
@@ -1614,27 +1642,24 @@ static void *unpack_sha1_rest(git_zstream *stream, void 
*buffer, unsigned long s
  * too permissive for what we want to check. So do an anal
  * object header parse by hand.
  */
-int parse_sha1_header(const char *hdr, unsigned long *sizep)
+int parse_sha1_header_extended(const char *hdr, struct object_info *oi,
+  unsigned int flags)
 {
-   char type[10];
-   int i;
+   struct strbuf typename = STRBUF_INIT;
unsigned long size;
+   int type;
 
/*
 * The type can be at most ten bytes (including the
 * terminating '\0' that we add), and is followed by
 * a space.
 */
-   i = 0;
for (;;) {
char c = *hdr++;
if (c == ' ')
break;
-   type[i++] = c;
-   if (i = sizeof(type))
-   return -1;
+   strbuf_addch(typename, c);
}
-   type[i] = 0;
 
/*
 * The length must follow immediately, and be in canonical
@@ -1652,12 +1677,39 @@ int parse_sha1_header(const char *hdr, unsigned long 
*sizep)
size = size * 10 + c;
}
}
-   *sizep = size;
+
+   type = type_from_string_gently(typename.buf, typename.len, 1);
+   if (oi-sizep)
+   *oi-sizep = size;
+   if (oi-typename)
+   strbuf_addbuf(oi-typename, typename);
+   strbuf_release(typename);
 
/*
+* Set type to 0 if its an unknown object and
+* we're obtaining the type using '--literally'
+* option.
+*/
+   if ((flags  LOOKUP_LITERALLY)  (type == -1))
+   type = 0;
+   else if (type == -1)
+   die(invalid object type);
+   if (oi-typep)
+   *oi-typep =

Re: [PATCH 0/6] address packed-refs speed regressions

On Sun, Apr 05, 2015 at 03:41:39PM +0200, René Scharfe wrote:

  The main culprits seem to be d0f810f (which introduced some extra
  expensive code for each ref) and my 10c497a, which switched from fgets()
  to strbuf_getwholeline. It turns out that strbuf_getwholeline is really
  slow.
 
 10c497a changed read_packed_refs(), which reads *all* packed refs.
 Each is checked for validity.  That sounds expensive if the goal is
 just to look up a single (non-existing) ref.
 
 Would it help to defer any checks until a ref is actually accessed?
 Can a binary search be used instead of reading the whole file?

Yes, but addressing that is much more invasive.

Right now we parse all of the packed-refs file into an in-memory cache,
and then do single lookups from that cache. Doing an mmap() and a binary
search is way faster (and costs less memory) for doing individual
lookups. It relies on the list being sorted. This is generally true, but
not something we currently rely on (however, it would be easy to add a
sorted flag to top of the file and have the readers fall back when the
flag is missing). I've played with a patch to do this (it's not entirely
trivial, because you jump into the middle of a line, and then have to
walk backwards to find the start of the record).

For traversals, it's more complicated. Obviously if you are traversing
all refs, you have to read the whole thing anyway. If you are traversing
a subset of the refs, you can binary-search the start of the subset, and
then walk forward. But that's where it gets tricky with the current
code.

The ref_cache code expects to fill in from outer to inner. So if you
have refs/foo, you should also have filled in all of refs/ (but not
necessarily refs/bar). This matches the way we traverse loose ref
directories; we opendir refs/, find out that it has foo and bar,
and the descend into foo, and so forth. But reading a subset of the
packed-ref file is inside out. You fill in all of refs/foo, but you
have no idea what else is in refs/.

So going in that direction would involve some surgery to the ref_cache
code. It might even involve throwing it out entirely (i.e., just mmap
the packed-refs file and look through it directly, without any kind of
in-memory cache; we don't tend to do more than one ref-iteration per
program anyway, so I'm not sure the caching is buying us much anyway).
My big concern there would be that there are a lot of subtle race issues
between packed and loose refs, and the current state is the result of a
lot of tweaking. I'd be worried that a heavy rewrite there would risk
introducing subtle and rare corruptions.

Plus it would be a lot of work, which leads me to...

 I wonder if pluggable reference backends could help here.  Storing refs
 in a database table indexed by refname should simplify things.

...this. I think that effort might be better spent on a ref storage
format that's more efficient, simpler (with respect to subtle races and
such), and could provide other features (e.g., transactional atomicity).

The big plus side of packed-refs improvements is that they just work
without worrying about compatibility issues. But ref storage is local,
so I'm not sure how big a deal that is in practice.

 Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping
 the packed refs file?  What numbers do you get with the following patch?

It's about 9% faster than my series + the fgets optimization I posted
(or about 25% than using getc).  Which is certainly nice, but I was
really hoping to just make strbuf_getline faster for all callers, rather
than introducing special code for one call-site. Certainly we could
generalize the technique (i.e., a struct with the mmap data), but then I
feel we are somewhat reinventing stdio. Which is maybe a good thing,
because stdio has a lot of rough edges (as seen here), but it does feel
a bit like NIH syndrome.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/2] git-p4: Improve client path detection

2015-04-05 Thread Luke Diamand


On 28/03/15 12:28, Vitor Antunes wrote:

I'm adding a test case for a scenario I was confronted with when using branch
detection and a client view specification. It is possible that the implemented
fix may not cover all possible scenarios, but there is no regression in the
available tests.


Vitor, one thing I wondered about with this part of the change:

-if entry[depotFile] == depotPath:
+if entry[depotFile].find(depotPath) = 0:

Does this mean that if 'p4 where' produces multiple lines of output that 
this will get confused, as it's just going to search for an instance of 
depotPath.


The example in the Perforce man page for 'p4 where' would trigger this 
for example:


http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_where.html

-//a/b/file.txt //client/a/b/file.txt //home/user/root/a/b/file.txt
//a/b/file.txt //client/b/file.txt /home/user/root/b/file.txt

As an experiment, I hacked git-p4 to always use p4Where rather than 
getClientRoot(), which I would have thought ought to work, but while 
most of the tests passed, Pete's client-spec torture tests failed.


Luke




Vitor Antunes (2):
   git-p4: Check branch detection and client view together
   git-p4: Improve client path detection when branches are used

  git-p4.py|   11 --
  t/t9801-git-p4-branch.sh |   98 ++
  2 files changed, 105 insertions(+), 4 deletions(-)



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 1/4] sha1_file.c: support reading from a loose object of unknown type

karthik nayak karthik@gmail.com writes:

 On 04/05/2015 01:16 PM, Junio C Hamano wrote:

 If it semantically does not make sense to ask for the typename
 without asking for the type code, then we can and should make that
 as a new calling convention _all_ callers must follow.

 In other words, I think it is better to have

  if (oi-typename  !oi-typep)
  die(BUG);

 somewhere near the beginning of the callchain that takes oi; that
 will ensure all callers understand the rule.

 Yes! this is a better approach as it will enforce that typep must be
 set when typename is set.

Not so fast ;-)

The key phrase in what I wrote above is If it does not make sense,
and I am not yet convinced if that is the case or not.  If we are to
change the calling convention for the callers, the reason why it
does not make sense to ask only for typename but not typep must be
explained in the log message.

And if it turns out that the answer to that question is it is valid
to ask only for typename, then it would be wrong to force everybody
who wants to learn the stringified typename to supply typep.  And in
such a case it might be better to do something like this instead (on
top of your patch I am responding to):

 sha1_file.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/sha1_file.c b/sha1_file.c
index f4055dd..26fbb7c 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2639,6 +2639,7 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
struct cached_object *co;
struct pack_entry e;
int rtype;
+   enum object_type real_type;
const unsigned char *real = lookup_replace_object_extended(sha1, flags);
 
co = find_cached_object(real);
@@ -2670,9 +2671,18 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
return -1;
}
 
+   /*
+* packed_object_info() does not follow the delta chain to
+* find out the real type, unless it is given oi-typep.
+*/
+   if (oi-typename  !oi-typep)
+   oi-typep = real_type;
+
rtype = packed_object_info(e.p, e.offset, oi);
if (rtype  0) {
mark_bad_packed_object(e.p, real);
+   if (oi-typep == real_type)
+   oi-typep = NULL;
return sha1_object_info_extended(real, oi, 0);
} else if (in_delta_base_cache(e.p, e.offset)) {
oi-whence = OI_DBCACHED;
@@ -2686,6 +2696,8 @@ int sha1_object_info_extended(const unsigned char *sha1, 
struct object_info *oi,
if (oi-typename)
strbuf_addstr(oi-typename, typename(*oi-typep));
 
+   if (oi-typep == real_type)
+   oi-typep = NULL;
return 0;
 }
 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

Jeff King p...@peff.net writes:

 So we'd have to either:

   1. Decide that doesn't matter.

   2. Have callers specify a damn the NULs, I want it fast flag.

The callers that used to call fgets and then later rewritten to
strbuf_getwholeline(), either of the above obviously should be OK,
and because the whole reason why we added strbuf_getline() interface
was to avoid having to repeatedly call fgets() and concatenate the
result if the fixed-size buffer we would give it is too small, I'd
say the callers that want to read lines terminated by LF and have NUL
as part of payload would be a tiny minority.

It depends on what we would find out after auditing all callers of
this function, but I would not be surprised if we decided #1 (i.e.
this is about a _line_; what are you doing by having a NUL on
it?), and the safest would be to do the inverse of #2, i.e. make it
fast by default and make oddball callers that care about NULs to
pass a flag.

Thanks for working on this.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: How to send a warning message from git hosting server?