Re: Subtree in Git

2013-03-02 Thread David Michael Barr
On Sat, Mar 2, 2013 at 9:05 AM, Paul Campbell pcampb...@kemitix.net wrote:
 On Fri, Mar 1, 2013 at 2:28 AM, Kindjal kind...@gmail.com wrote:
 David Michael Barr b at rr-dav.id.au writes:

 From a quick survey, it appears there are no more than 55 patches
 squashed into the submitted patch.
 As I have an interest in git-subtree for maintaining the out-of-tree
 version of vcs-svn/ and a desire to improve my rebase-fu, I am tempted
 to make some sense of the organic growth that happened on GitHub.
 It doesn't appear that anyone else is willing to do this, so I doubt
 there will be any duplication of effort.


 What is the status of the work on git-subtree described in this thread?
 It looks like it's stalled.


 I hadn't been aware of that patch. Reading the thread David Michael
 Barr was going to try picking the patch apart into sensible chunks.


Sorry for not updating the thread. I did end up moving onto other things.
I quickly realised that the reason for globbing all the patches together
was that the individual patches were not well contained.
That is, there were single patches with multiple unrelated changes, and
multiple patches changing the same things in different directions.
To me this means that the first step is to curate the history.

 If this work still needs doing, I'd like to volunteer.

You're most welcome. Sorry again for abandoning the thread.

--
David Michael Barr
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: remote-testsvn: Hangs at revision

2012-12-05 Thread David Michael Barr
On Wednesday, 5 December 2012 at 5:20 PM, Ramkumar Ramachandra wrote:
 Hi,
 
 I tried out the testsvn remote helper on a simple Subversion
 repository, but it seems to hang at Revision 8 indefinitely without
 any indication of progress. I'm currently digging in to see what went
 wrong. The repository I'm cloning is:
 
 $ git clone testsvn::http://python-lastfm.googlecode.com/svn/trunk/
I attempted to clone the same repository and was able to fetch 152 revisions.
So the issue Ram saw might have been transient.


I did however receive a warning at the end of the clone:

warning: remote HEAD refers to nonexistent ref, unable to checkout. 

--
David Michael Barr



Re: [RFC] pack-objects: compression level for non-blobs

2012-11-26 Thread David Michael Barr
 Add config pack.graphcompression similar to pack.compression.
 Applies to non-blob objects and if unspecified falls back to pack.compression.
 
 We may identify objects compressed with level 0 by their leading bytes.
 Use this to force recompression when the source and target levels mismatch.
 Limit its application to when the config pack.graphcompression is set.
 
 Signed-off-by: David Michael Barr b...@rr-dav.id.au
 ---
 builtin/pack-objects.c | 49 +
 1 file changed, 45 insertions(+), 4 deletions(-)
 
 I started working on this just before taking a vacation,
 so it's been a little while coming.
 
 The intent is to allow selective recompression of pack data.
 For small objects/deltas the overhead of deflate is significant.
 This may improve read performance for the object graph.
 
 I ran some unscientific experiments with the chromium repository.
 With pack.graphcompression = 0, there was a 2.7% increase in pack size.
 I saw a 35% improvement with cold caches and 43% otherwise on git log --raw.

I neglected to mention that this is a WIP. I get failures with certain 
repositories: 

fatal: delta size changed

--
David Michael Barr

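If this RFC lands, the proposed knob would be configured alongside the existing `pack.compression`. A minimal sketch of the intended usage (note that `pack.graphcompression` only takes effect with the patch applied; stock `git config` will store the key but ignore it):

```shell
# throwaway repo just to demonstrate the proposed configuration
repo=$(mktemp -d)
git init -q "$repo"
# proposed by this RFC: commits/trees/tags stored at deflate level 0
git -C "$repo" config pack.graphcompression 0
# blobs keep using the existing knob
git -C "$repo" config pack.compression 9
git -C "$repo" config pack.graphcompression   # prints: 0
```

A full repack (`git -C "$repo" repack -adf`) would then rewrite the pack with the split levels.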


[RFC] pack-objects: compression level for non-blobs

2012-11-25 Thread David Michael Barr
Add config pack.graphcompression similar to pack.compression.
Applies to non-blob objects and if unspecified falls back to pack.compression.

We may identify objects compressed with level 0 by their leading bytes.
Use this to force recompression when the source and target levels mismatch.
Limit its application to when the config pack.graphcompression is set.

Signed-off-by: David Michael Barr b...@rr-dav.id.au
---
 builtin/pack-objects.c | 49 +
 1 file changed, 45 insertions(+), 4 deletions(-)

 I started working on this just before taking a vacation,
 so it's been a little while coming.

 The intent is to allow selective recompression of pack data.
 For small objects/deltas the overhead of deflate is significant.
 This may improve read performance for the object graph.

 I ran some unscientific experiments with the chromium repository.
 With pack.graphcompression = 0, there was a 2.7% increase in pack size.
 I saw a 35% improvement with cold caches and 43% otherwise on git log --raw.

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..9518daf 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -40,6 +40,7 @@ struct object_entry {
 	unsigned long z_delta_size;	/* delta data size (compressed) */
 	unsigned int hash;	/* name hint hash */
 	enum object_type type;
+	enum object_type actual_type;
 	enum object_type in_pack_type;	/* could be delta */
 	unsigned char in_pack_header_size;
 	unsigned char preferred_base; /* we do not pack this, but is available
@@ -81,6 +82,8 @@ static int num_preferred_base;
 static struct progress *progress_state;
 static int pack_compression_level = Z_DEFAULT_COMPRESSION;
 static int pack_compression_seen;
+static int pack_graph_compression_level = Z_DEFAULT_COMPRESSION;
+static int pack_graph_compression_seen;
 
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
@@ -125,14 +128,14 @@ static void *get_delta(struct object_entry *entry)
 	return delta_buf;
 }
 
-static unsigned long do_compress(void **pptr, unsigned long size)
+static unsigned long do_compress(void **pptr, unsigned long size, int level)
 {
 	git_zstream stream;
 	void *in, *out;
 	unsigned long maxsize;
 
 	memset(&stream, 0, sizeof(stream));
-	git_deflate_init(&stream, pack_compression_level);
+	git_deflate_init(&stream, level);
 	maxsize = git_deflate_bound(&stream, size);
 
 	in = *pptr;
@@ -191,6 +194,18 @@ static unsigned long write_large_blob_data(struct git_istream *st, struct sha1fi
 	return olen;
 }
 
+static int check_pack_compressed(struct packed_git *p,
+		struct pack_window **w_curs,
+		off_t offset)
+{
+	unsigned long avail;
+	int compressed = 0;
+	unsigned char *in = use_pack(p, w_curs, offset, &avail);
+	if (avail >= 3)
+		compressed = !!(in[2] & 0x6);
+	return compressed;
+}
+
 /*
  * we are going to reuse the existing object data as is.  make
  * sure it is not corrupt.
@@ -240,6 +255,8 @@ static void copy_pack_data(struct sha1file *f,
 	}
 }
 
+#define compression_level(type) ((type) && (type) != OBJ_BLOB ? pack_graph_compression_level : pack_compression_level)
+
 /* Return 0 if we will bust the pack-size limit */
 static unsigned long write_no_reuse_object(struct sha1file *f, struct object_entry *entry,
 					   unsigned long limit, int usable_delta)
@@ -286,7 +303,7 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
 	else if (entry->z_delta_size)
 		datalen = entry->z_delta_size;
 	else
-		datalen = do_compress(&buf, size, compression_level(entry->actual_type));
 
 	/*
 	 * The object header is a byte of 'type' followed by zero or
@@ -379,6 +396,13 @@ static unsigned long write_reuse_object(struct sha1file *f, struct object_entry
 	offset += entry->in_pack_header_size;
 	datalen -= entry->in_pack_header_size;
 
+	if (!pack_to_stdout &&
+	    pack_graph_compression_seen &&
+	    check_pack_compressed(p, &w_curs, offset) != !!compression_level(entry->actual_type)) {
+		unuse_pack(&w_curs);
+		return write_no_reuse_object(f, entry, limit, usable_delta);
+	}
+
 	if (!pack_to_stdout && p->index_version == 1 &&
 	    check_pack_inflate(p, &w_curs, offset, datalen, entry->size)) {
 		error("corrupt packed object for %s", sha1_to_hex(entry->idx.sha1));
@@ -955,6 +979,8 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 	memset(entry, 0, sizeof(*entry));
 	hashcpy(entry->idx.sha1, sha1);
 	entry->hash = hash;
+	if (pack_graph_compression_seen)
+		entry->actual_type
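The detection trick from the cover letter ("identify objects compressed with level 0 by their leading bytes") works because zlib emits stored deflate blocks at level 0: byte 2 of a zlib stream is the first deflate block header, and its BTYPE bits (mask 0x6, the same mask `check_pack_compressed()` uses) are zero only for stored blocks. A sketch of the idea using loose objects, which are zlib streams too:

```shell
set -e
probe() {  # probe <level>: print the BTYPE bits of a loose object's first deflate block
	repo=$(mktemp -d)
	git init -q "$repo"
	git -C "$repo" config core.compression "$1"
	oid=$(seq 1 100 | git -C "$repo" hash-object -w --stdin)
	dir=$(printf %s "$oid" | cut -c1-2)
	file=$(printf %s "$oid" | cut -c3-)
	# skip the 2-byte zlib header, read the first deflate block header byte
	byte=$(od -An -tu1 -j2 -N1 "$repo/.git/objects/$dir/$file" | tr -d ' ')
	echo $(( byte & 6 ))
}
probe 0   # prints: 0  (stored block -- would be recompressed by the patch)
probe 9   # non-zero (huffman-coded block)
```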

Re: Subtree in Git

2012-10-26 Thread David Michael Barr
On Saturday, 27 October 2012 at 12:10 AM, Herman van Rink wrote:
 On 10/22/2012 04:41 PM, d...@cray.com wrote:
  Herman van Rink r...@initfour.nl writes:
  
   On 10/21/2012 08:32 AM, Junio C Hamano wrote:
    Herman van Rink r...@initfour.nl writes:

 Junio, Could you please consider merging the single commit from my
 subtree-updates branch? 
 https://github.com/helmo/git/tree/subtree-updates


In general, in areas like contrib/ where there is a volunteer area
maintainer, unless the change is something ultra-urgent (e.g. serious
security fix) and the area maintainer is unavailable, I'm really
reluctant to bypass and take a single patch that adds many things
that are independent from each other.
   
   
   Who do you see as volunteer area maintainer for contrib/subtree?
   My best guess would be Dave. And he already indicated earlier in the
   thread to be ok with the combined patch as long as you are ok with it.
  
  
  Let's be clear. Junio owns the project so what he says goes, no
  question. I provided some review feedback which I thought would help
  the patches get in more easily. We really shouldn't be adding multiple
  features in one patch. This is easily separated into multiple patches.
  
  Then there is the issue of testcases. We should NOT have git-subtree go
  back to the pre-merge _ad_hoc_ test environment. We should use what the
  upstream project uses. That will make mainlining this much easier in
  the future.
  
  If Junio is ok with overriding my decisions here, that's fine. But I
  really don't understand why you are so hesitant to rework the patches
  when it should be relatively easy. Certainly easier than convincing me
  they are in good shape currently. :)
 
 
 
 If it's so easy to rework these patches then please do so yourself.
 It's been ages since I've worked on this so I would also have to
 re-discover everything.

From a quick survey, it appears there are no more than 55 patches
squashed into the submitted patch.
As I have an interest in git-subtree for maintaining the out-of-tree
version of vcs-svn/ and a desire to improve my rebase-fu, I am tempted
to make some sense of the organic growth that happened on GitHub.
It doesn't appear that anyone else is willing to do this, so I doubt
there will be any duplication of effort.



--
David Michael Barr



Re: What's cooking in git.git (Oct 2012, #01; Tue, 2)

2012-10-04 Thread David Michael Barr

On Wednesday, 3 October 2012 at 9:20 AM, Junio C Hamano wrote: 
 
 * fa/remote-svn (2012-09-19) 16 commits
 - Add a test script for remote-svn
 - remote-svn: add marks-file regeneration
 - Add a svnrdump-simulator replaying a dump file for testing
 - remote-svn: add incremental import
 - remote-svn: Activate import/export-marks for fast-import
 - Create a note for every imported commit containing svn metadata
 - vcs-svn: add fast_export_note to create notes
 - Allow reading svn dumps from files via file:// urls
 - remote-svn, vcs-svn: Enable fetching to private refs
 - When debug==1, start fast-import with --stats instead of --quiet
 - Add documentation for the 'bidi-import' capability of remote-helpers
 - Connect fast-import to the remote-helper via pipe, adding 'bidi-import' 
 capability
 - Add argv_array_detach and argv_array_free_detached
 - Add svndump_init_fd to allow reading dumps from arbitrary FDs
 - Add git-remote-testsvn to Makefile
 - Implement a remote helper for svn in C
 (this branch is used by fa/vcs-svn.)
 
 A GSoC project.
 Waiting for comments from mentors and stakeholders.

I have reviewed this topic and am happy with the design and implementation.
I support this topic for inclusion.

Acked-by: David Michael Barr b...@rr-dav.id.au
 
 * fa/vcs-svn (2012-09-19) 4 commits
 - vcs-svn: remove repo_tree
 - vcs-svn/svndump: rewrite handle_node(), begin|end_revision()
 - vcs-svn/svndump: restructure node_ctx, rev_ctx handling
 - svndump: move struct definitions to .h
 (this branch uses fa/remote-svn.)
 
 A GSoC project.
 Waiting for comments from mentors and stakeholders.

I'm not so sure about this follow-on topic: some of the design decisions make
me uncomfortable, and I need some convincing before I can get behind it.

--
David Michael Barr



Re: Using bitmaps to accelerate fetch and clone

2012-09-27 Thread David Michael Barr
Hi all,

On Fri, Sep 28, 2012 at 3:20 AM, Jeff King p...@peff.net wrote:
 On Thu, Sep 27, 2012 at 07:17:42PM +0700, Nguyen Thai Ngoc Duy wrote:

  Operation                   Index V2                Index VE003
  Clone                       37530ms (524.06 MiB)      82ms (524.06 MiB)
  Fetch (1 commit back)          75ms                  107ms
  Fetch (10 commits back)       456ms (269.51 KiB)     341ms (265.19 KiB)
  Fetch (100 commits back)      449ms (269.91 KiB)     337ms (267.28 KiB)
  Fetch (1000 commits back)    2229ms ( 14.75 MiB)     189ms ( 14.42 MiB)
  Fetch (10000 commits back)   2177ms ( 16.30 MiB)     254ms ( 15.88 MiB)
  Fetch (100000 commits back) 14340ms (185.83 MiB)    1655ms (189.39 MiB)

 Beautiful. And curious, why do 100-1000 and 10000-100000 have such
 big leaps in time (V2)?

 Agreed. I'm very excited about these numbers.

+1

 Definitely :-). I have shown my interest in this topic before. So I
 should probably say that I'm going to work on this on C Git, but
 slllwwwly. As this benefits the server side greatly, perhaps a
 GitHubber ;-) might want to work on this on C Git, for GitHub itself
 of course, and, as a side effect, make the rest of us happy?

 Yeah, GitHub is definitely interested in this. I may take a shot at it,
 but I know David Barr (cc'd) is also interested in such things.

Yeah, I'm definitely interested, I love this stuff.

--
David Michael Barr


Re: Failing svn imports from apache.org

2012-09-17 Thread David Michael Barr
Hi Enrico,
Repositories as old and large as ASF are the reason I created svn-fe.  
git-svn is known to choke on these repositories.
If you have plenty of bandwidth, it might well be faster to:
* Grab an ASF archive (16GB)
* Use svn-fe to import the entire tree into git.
* Use a simple script to extract the standard layout into a new repo.
* Use git-svn to keep the new repo up-to-date.
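The import step is just a pipe into git fast-import, run inside a fresh repository: `svn-fe <asf.dump | git fast-import` (svn-fe lives in contrib/svn-fe and has to be built separately). As a self-contained sketch of the pipeline's shape, here is a hand-written minimal fast-import stream standing in for svn-fe's output; the revision and path are made up for illustration:

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
# svn-fe would emit a stream like this for each SVN revision in the dump
git fast-import --quiet <<'EOF'
blob
mark :1
data 6
hello
commit refs/heads/master
committer Importer <importer@example.com> 1234567890 +0000
data 7
svn r1
M 100644 :1 trunk/README
EOF
git show master:trunk/README   # prints: hello
```

From there, extracting the standard layout is a matter of splitting trunk/ (and each branches/* path) into its own history.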

--  
David Michael Barr


On Saturday, 15 September 2012 at 8:07 PM, Enrico Weigelt wrote:

  
   Does anyone have an idea, what might be wrong here / how to fix it
   ?
   
   
   
  Here: git svn --version
  git-svn version 1.7.12.592.g41e7905 (svn 1.6.18)
   
  What's yours?
  
 1.7.9.5 (ubuntu precise)
  
  I'm getting
   
  Initialized empty Git repository in /tmp/discovery/.git/
  Using higher level of URL:
  http://svn.apache.org/repos/asf/commons/proper/discovery =
  http://svn.apache.org/repos/asf
  W: Ignoring error from SVN, path probably does not exist: (160013):
  Dateisystem hat keinen Eintrag: File not found: revision 100, path
  '/commons/proper/discovery'
  W: Do not be alarmed at the above message git-svn is just searching
  aggressively for old history.
  This may take a while on large repositories
   
  and then it checks the revisions. I didn't want to wait for
  r1301705...
   
  Does your git svn abort earlier or after checking all revs?
  
 It also scanned through thousands of revisions and then failed:
  
 W: Do not be alarmed at the above message git-svn is just searching 
 aggressively for old history.
 This may take a while on large repositories
 mkdir .git: No such file or directory at /usr/lib/git-core/git-svn line 3669
  
  
 cu
 --  
 Mit freundlichen Grüßen / Kind regards  
  
 Enrico Weigelt  
 VNC - Virtual Network Consult GmbH  
 Head Of Development  
  
 Pariser Platz 4a, D-10117 Berlin
 Tel.: +49 (30) 3464615-20
 Fax: +49 (30) 3464615-59
  
 enrico.weig...@vnc.biz; www.vnc.de





Re: [RFC 1/5] GSOC: prepare svndump for branch detection

2012-08-18 Thread David Michael Barr
On Sat, Aug 18, 2012 at 6:40 AM, Florian Achleitner
florian.achleitner.2.6...@gmail.com wrote:
 Hi!

 This patch series should prepare vcs-svn/svndump.* for branch
 detection. When starting with this feature I found that the existing
 functions are not yet appropriate for it.
 This rewrites the node-handling part of svndump.c, and it is very
 invasive. The logic in handle_node is not simple; I hope that I
 have understood every case the existing code tries to address.
 At least it doesn't break any existing testcase.

 The series applies on top of:
 [PATCH/RFC v4 16/16] Add a test script for remote-svn.
 I could also rebase it onto master if you think it makes sense.

 Florian

  [RFC 1/5] vcs-svn: Add sha1 calculaton to fast_export and

This change makes me uncomfortable.
We are doubling up on hashing with fast-import.
This introduces git-specific logic into vcs-svn.

  [RFC 2/5] svndump: move struct definitions to .h.
  [RFC 3/5] vcs-svn/svndump: restructure node_ctx, rev_ctx handling
  [RFC 4/5] vcs-svn/svndump: rewrite handle_node(),
  [RFC 5/5] vcs-svn: remove repo_tree

I haven't read the rest of the series yet but I expect
it is less controversial than the first patch.

--
David Michael Barr


Re: [PATCH/RFC v3 00/16] GSOC remote-svn

2012-08-14 Thread David Michael Barr
On Wed, Aug 15, 2012 at 5:13 AM, Florian Achleitner
florian.achleitner.2.6...@gmail.com wrote:
 Hi.

 Version 3 of this series adds the 'bidi-import' capability, as suggested
 by Jonathan.
 Diff details are attached to the patches.
 04 and 05 are completely new.

 [PATCH/RFC v3 01/16] Implement a remote helper for svn in C.
 [PATCH/RFC v3 02/16] Integrate remote-svn into svn-fe/Makefile.
 [PATCH/RFC v3 03/16] Add svndump_init_fd to allow reading dumps from
 [PATCH/RFC v3 04/16] Connect fast-import to the remote-helper via
 [PATCH/RFC v3 05/16] Add documentation for the 'bidi-import'
 [PATCH/RFC v3 06/16] remote-svn, vcs-svn: Enable fetching to private
 [PATCH/RFC v3 07/16] Add a symlink 'git-remote-svn' in base dir.
 [PATCH/RFC v3 08/16] Allow reading svn dumps from files via file://
 [PATCH/RFC v3 09/16] vcs-svn: add fast_export_note to create notes
 [PATCH/RFC v3 10/16] Create a note for every imported commit
 [PATCH/RFC v3 11/16] When debug==1, start fast-import with --stats
 [PATCH/RFC v3 12/16] remote-svn: add incremental import.
 [PATCH/RFC v3 13/16] Add a svnrdump-simulator replaying a dump file
 [PATCH/RFC v3 14/16] transport-helper: add import|export-marks to
 [PATCH/RFC v3 15/16] remote-svn: add marks-file regeneration.
 [PATCH/RFC v3 16/16] Add a test script for remote-svn.

Thank you Florian, this series was a great read. My apologies for the
limited interaction over the course of summer. You have done well and
engaged with the community to produce this result.

Thank you Jonathan for the persistent reviews. No doubt they have
contributed to the quality of the series.

Thank you Junio for your dedication to reviewing the traffic on this
mailing list.

I will no longer be reachable on this address after Friday.

I hope to make future contributions with the identity:
David Michael Barr b...@rr-dav.id.au
This will be my persistent address.

--
David Barr


Re: [GIT PULL] vcs-svn housekeeping

2012-07-06 Thread David Michael Barr
On Sat, Jul 7, 2012 at 3:10 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 Hi Junio,

 The following changes since commit 58ebd9865d2bb9d42842fbac5a1c4eae49e92859:

   vcs-svn/svndiff.c: squelch false unused warning from gcc (2012-01-27 
 11:58:56 -0800)

 are available at:

   git://repo.or.cz/git/jrn.git svn-fe

 The first three commits duplicate changes that are already in master
 but were committed independently on the svn-fe branch last February.
 The rest are David's db/vcs-svn series which aims to address various
 nits noticed when merging the code back into svn-dump-fast-export:
 unnecessary use of git-specific functions (prefixcmp, memmem) and
 warnings reported by clang.

 Some of the patches had to change a little since v2 of db/vcs-svn, so
 I'll be replying with a copy of the patches for reference.

 David has looked the branch over and acked and tested it.

 Thoughts welcome, as usual.  I think these are ready for pulling into
 master.  Sorry to be so slow at this.

 David Barr (7):
   vcs-svn: drop no-op reset methods
   vcs-svn: avoid self-assignment in dummy initialization of pre_off
   vcs-svn: simplify cleanup in apply_one_window
   vcs-svn: use constcmp instead of prefixcmp
   vcs-svn: use strstr instead of memmem
   vcs-svn: suppress signed/unsigned comparison warnings
   vcs-svn: suppress a signed/unsigned comparison warning

 Jonathan Nieder (4):
   vcs-svn: allow import of > 4GiB files
   vcs-svn: suppress -Wtype-limits warning
   vcs-svn: suppress a signed/unsigned comparison warning
   vcs-svn: allow 64-bit Prop-Content-Length

 Ramsay Allan Jones (1):
   vcs-svn: rename check_overflow and its arguments for clarity

Thank you Jonathan for doing this. The result of collaborating on a
series is definitely gorgeous. I do wish I could absorb your flair for
polish.

--
David Barr