Re: Need some help on patching buildin-files // was: Looking for feedback and help with a git-mirror for local usage

2015-06-14 Thread David Aguilar
On Fri, Jun 12, 2015 at 12:52:44PM +0200, Bernd Naumann wrote:
 Hello again,
 
 After digging through the code I may have found a clue where to start, but I
 would still appreciate some help from a developer, because I have never
 learned to write C. (Some basics at school, which happened over a
 decade ago.)
 
 Currently I have questions on:
 
 * How to patch clone: would cmd_clone() be a good place? Or are there
 other calls which might be better? I am thinking of inserting the check
 for whether a mirror will be set up or just updated right after dest_exists.

If you'd still like to modify git clone itself, then the
cmd_clone entry point is certainly the place to start.
I would suggest exploring other alternatives, though.


Is it possible to use a caching HTTP proxy, so that git clone
goes through a local caching proxy?  I haven't tried this myself,
so maybe it's not even possible, but that seems like a natural
http-ish solution.


Another idea is to use Git's URL rewriting feature.  If your
clone URLs all follow a similar pattern then they can
automatically be rewritten to point to some other URL.

e.g. in ~/.gitconfig:

[url "file:///home/git/mirror/github.com/"]
    insteadOf = https://github.com/

This will make git clone from /home/git/mirror/github.com/
whenever it sees https://github.com/ URLs.

This is not perfect because it ends up cloning from your local
copies rather than setting up the references via --mirror, but
at least it avoids hitting the network.  You'll need to
periodically update your local mirrors, though.

If you prefer to keep ~/.gitconfig pristine then you could do it
in a wrapper script by injecting e.g. the -c config flags,

git \
-c url.file://foo/bar/.insteadOf=https://github.com/ \
clone ...
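
A wrapper of that kind could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `with_mirror_rewrite`, the path `/usr/bin/git`, and the mirror location are hypothetical and do not come from the thread.

```python
REAL_GIT = "/usr/bin/git"  # assumption: location of the real git binary


def with_mirror_rewrite(git_args, mirror_base, upstream_prefix):
    """Return an argv that runs git with a one-off insteadOf rewrite.

    Equivalent to:
        git -c url.<mirror_base>.insteadOf=<upstream_prefix> <git_args...>
    """
    rewrite = f"url.{mirror_base}.insteadOf={upstream_prefix}"
    return [REAL_GIT, "-c", rewrite, *git_args]
```

A wrapper script would pass `sys.argv[1:]` as `git_args` and then hand the result to `os.execv(REAL_GIT, argv)`, so that the exit status of the real git passes through unchanged.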

 [...snip...]
  
  I often build, for example, 'openwrt' with various build scripts which
  depend heavily on a fresh or clean environment and they fetch many
  sources via `git clone`, which sometimes results in over 100 MB of
  traffic just for one build. /* Later needed .tar.gz source archives
  are stored in a symlinked download directory which has been supported by
  'openwrt/.config' for a few months... to reduce network traffic.
  */

Why does a rebuild delete existing Git repositories?
That seems like a bad practice, and shouldn't be needed.
If possible, it would be worth improving the build scripts.

For example, a clone can be made pristine by doing
git reset --hard && git clean -fdx.  Deleting a repository
just so that it can be re-cloned is very wasteful.
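
As a rough sketch of what "reuse instead of re-clone" could look like inside a build script, the following Python helper picks the commands to run. The name `refresh_commands` is hypothetical, and the check for a `.git` directory assumes a normal (non-bare) clone.

```python
import os


def refresh_commands(url, dest):
    """Return the git commands that leave `dest` as a pristine checkout."""
    if os.path.isdir(os.path.join(dest, ".git")):
        # Existing clone: discard local changes, then remove untracked
        # and ignored files.
        return [
            ["git", "-C", dest, "reset", "--hard"],
            ["git", "-C", dest, "clean", "-fdx"],
        ]
    # No clone yet: fall back to a normal clone.
    return [["git", "clone", url, dest]]
```

Each returned command can be run with `subprocess.run(cmd, check=True)`; only the very first build then needs the network for the clone.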

  My connection to the internet is not the fastest in the world and
  sometimes unstable, so I wanted to have some kind of local bare
  repository mirror, which is possible with `git clone --mirror`.
  
  I can later clone from these repositories by calling `git
  clone --reference /path/to.git url`, but I do not wish to edit
  all the build scripts and Makefiles.

Maybe it'd be possible to make just the git clone part of the
build scripts configurable?

That'd make it really easy to inject a wrapper script that scans
the arguments and injects the needed --mirror arguments, in the
case that the above options won't work.


  So I wrote a git wrapper script (`$HOME/bin/git`), which checks if
  `git` was called with 'clone', and if so, it first
  clones the repository as a mirror and then clones from that local
  mirror. If the mirror already exists, it is only updated
  (`git remote update`). This works for now.
  
  [...snip...]
  
  Ok, so far, so good, but the implementation of the current
  shell prototype looks way too hacky [0] and I have found some edge
  cases in which my script will fail: the script depends on the
  fact that the last, or at least the second-to-last, argument is a
  valid git URL, but the following is a valid call, too:
  
  `git --no-pager \ clone g...@github.com:openwrt/packages.git 
  openwrt-packages --depth 1`
  
  But this is not valid:
  
  `git clone https://github.com/openwrt/packages.git --reference 
  packages.git packages-2` or `git clone --verbose 
  https://github.com/openwrt/packages.git packages-2 --reference 
  packages.git`
  
  
  I found out that git-clone itself can also only guess
  which argument is the URL and which is not.

Another option is to rewrite the wrapper script in a better language.
For example, Python's argparse module can handle the above cases
with minimal fuss.
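
A minimal sketch of that idea, assuming a hypothetical helper named `find_clone_target`: it uses `argparse`'s intermixed parsing so that options may appear before, between, or after the URL and directory. The list of value-taking options is deliberately incomplete, and treating everything before the first `clone` as global options is a simplification.

```python
import argparse

# Clone options that consume a value. Assumption: this list is incomplete;
# extend it to match the git version in use.
VALUE_OPTIONS = [
    ("--reference",), ("--depth",), ("-b", "--branch"), ("-o", "--origin"),
    ("-c", "--config"), ("--template",), ("--separate-git-dir",),
]


def find_clone_target(argv):
    """Return (url, directory) for a `git clone` argv, or None otherwise."""
    if "clone" not in argv:
        return None
    # Simplification: everything up to the first "clone" is treated as
    # global git options and skipped.
    clone_args = argv[argv.index("clone") + 1:]

    parser = argparse.ArgumentParser(add_help=False)
    for names in VALUE_OPTIONS:
        parser.add_argument(*names)
    parser.add_argument("url", nargs="?")
    parser.add_argument("directory", nargs="?")

    # Unknown flags such as --verbose are returned in `extras` and ignored.
    ns, extras = parser.parse_known_intermixed_args(clone_args)
    if ns.url is None:
        return None
    return ns.url, ns.directory
```

For instance, `find_clone_target(["clone", "--verbose", url, "packages-2", "--reference", "packages.git"])` yields `url` and `"packages-2"` regardless of where `--reference` is placed. An unknown option that takes a value would still confuse it, which is why the option list needs to track git's own.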

Anyways, as I said before, the root problem is really the build
scripts.  I bet modifying the build scripts to reuse existing
git repositories is easier than modifying git clone.

cheers,
-- 
David


Re: Need some help on patching buildin-files // was: Looking for feedback and help with a git-mirror for local usage

2015-06-12 Thread Bernd Naumann

Hello again,

After digging through the code I may have found a clue where to start, but I
would still appreciate some help from a developer, because I have never
learned to write C. (Some basics at school, which happened over a
decade ago.)

Currently I have questions on:

* How to patch clone: would cmd_clone() be a good place? Or are there
other calls which might be better? I am thinking of inserting the check
for whether a mirror will be set up or just updated right after dest_exists.

* Is it correct that a new config key is simply specified via a config
file or by cmd_init_db(), so that later a check on that value is enough?
Would the section 'user' be a good place for this key, or is it
something that would get its own new section?

* Have I missed a relevant file?

git/git.c
git/builtin/clone.c
git/builtin/fetch.c
git/builtin/push.c
git/builtin/remote.c
along with the translation and Documentation, of course.


If you have any comments on that, please share them with me, and if
you are interested in helping me get this implemented, I would
appreciate that :)

Sincere regards,
Bernd


On 06/11/2015 10:44 PM, Bernd Naumann wrote:
 Hello,
 
 I have come up with an idea # Yep I know, exactly that kind of
 e-mail everyone wants to read ;) and I'm currently working on a
 shell prototype to deal with the following situation and problem and
 need some feedback/advice:
 
 
 I often build, for example, 'openwrt' with various build scripts which
 depend heavily on a fresh or clean environment and they fetch many
 sources via `git clone`, which sometimes results in over 100 MB of
 traffic just for one build. /* Later needed .tar.gz source archives
 are stored in a symlinked download directory which has been supported by
 'openwrt/.config' for a few months... to reduce network traffic.
 */
 
 My connection to the internet is not the fastest in the world and
 sometimes unstable, so I wanted to have some kind of local bare
 repository mirror, which is possible with `git clone --mirror`.
 
 I can later clone from these repositories by calling `git
 clone --reference /path/to.git url`, but I do not wish to edit
 all the build scripts and Makefiles.
 
 
 So I wrote a git wrapper script (`$HOME/bin/git`), which checks if
 `git` was called with 'clone', and if so, it first
 clones the repository as a mirror and then clones from that local
 mirror. If the mirror already exists, it is only updated
 (`git remote update`). This works for now.
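
The create-or-update logic described above could be sketched in Python like this. It is not the shell prototype from the thread: the function name `mirror_clone_commands` and its parameters are hypothetical, and the presence of the mirror directory is taken as the only sign that the mirror exists.

```python
import os


def mirror_clone_commands(url, mirror_dir, dest):
    """Return the git commands for a clone that goes through a local mirror."""
    if os.path.isdir(mirror_dir):
        # Mirror already present: only refresh it.
        prepare = ["git", "-C", mirror_dir, "remote", "update"]
    else:
        # First use: create a bare mirror of the upstream repository.
        prepare = ["git", "clone", "--mirror", url, mirror_dir]
    # Clone the working copy from the local mirror, not from the network.
    return [prepare, ["git", "clone", mirror_dir, dest]]
```

Running the two returned commands in order with `subprocess.run(cmd, check=True)` reproduces the behaviour of the wrapper for a single repository.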
 
 /* To be able to have multiple identically named repositories, the
 script builds paths like:
 
 ~/var/cache/gitmirror $ find . -name '*.git'
 
 ./github.com/openwrt-management/packages.git
 ./github.com/openwrt/packages.git
 ./github.com/openwrt-routing/packages.git
 ./nbd.name/packages.git
 ./git.openwrt.org/packages.git
 ./git.openwrt.org/openwrt.git
 
 It strips the scheme from the URL and replaces : with / in
 case a port is specified or an svn link is provided. The remainder
 should be a valid Linux file and directory structure, if I guess
 correctly!? */
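
The path mapping described above can be sketched as a small Python function. The name `mirror_relpath` is hypothetical, and dropping a leading `user@` is an added assumption (it is not stated in the message) so that SSH and HTTPS URLs for the same host end up in the same directory.

```python
import re


def mirror_relpath(url):
    """Map a repository URL to a relative path inside the mirror cache."""
    # Drop a leading scheme such as "https://" or "git://".
    path = re.sub(r"^[A-Za-z][A-Za-z0-9+.-]*://", "", url)
    # Assumption: drop a leading "user@" part, if any.
    path = re.sub(r"^[^/@]+@", "", path)
    # Replace ":" (port or scp-style separator) with "/".
    return path.replace(":", "/")
```

With this mapping, `https://github.com/openwrt/packages.git` becomes `github.com/openwrt/packages.git`, matching the layout listed above.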
 
 Ok, so far, so good, but the implementation of the current
 shell prototype looks way too hacky [0] and I have found some edge
 cases in which my script will fail: the script depends on the
 fact that the last, or at least the second-to-last, argument is a
 valid git URL, but the following is a valid call, too:
 
 `git --no-pager \ clone g...@github.com:openwrt/packages.git 
 openwrt-packages --depth 1`
 
 But this is not valid:
 
 `git clone https://github.com/openwrt/packages.git --reference 
 packages.git packages-2` or `git clone --verbose 
 https://github.com/openwrt/packages.git packages-2 --reference 
 packages.git`
 
 
 I found out that git-clone itself can also only guess
 which argument is the URL and which is not.
 
 
 
 However, now I'm looking for a way to write something like a
 submodule for git which will check for a *new* git-config value like
 user.mirror (or something...) which points to a directory that
 will be used to clone from. In case of 'fetch', 'pull' or
 'remote update' the mirror would be updated first, and then the
 current working directory would be updated from that mirror. (And in
 case of 'push' the mirror would be updated from the working dir,
 of course.)
 
 
 I would like to hear some thoughts on that, and how I could start to
 build this submodule, or whether someone more talented than I am is
 willing to spend some time on that. If requested/wished I could
 send a link to the shell prototype.
 
 
 [0] For some reason I have to do ugly things like `$( eval exec
 /usr/bin/git clone --mirror $REPO_URL ) 2>&1 >/dev/null` because
 otherwise, in the case of just `eval exec`, the script stops after
 execution, and without `eval exec` arguments with spaces will be
 interpreted as separate arguments, which is no good, because of
 failing .
 
 
 Thanks for your time! Yours faithfully, Bernd