Re: [PATCH] Get commits from remote repositories by HTTP
Hello!

> This adds a program to download a commit, the trees, and the blobs in them
> from a remote repository using HTTP. It skips anything you already have.

Is it really necessary to write your own HTTP downloader? If so, is it
necessary to forget basic stuff like the Host: header? ;-)

If you feel that it should be optimized for speed, then at least use
persistent connections.

> +    if (memcmp(target, "http://", 7))
> +        return -1;

Can crash if the string is too short.

> +    entry = gethostbyname(name);
> +    memcpy(&sockad.sin_addr.s_addr,
> +           &((struct in_addr *)entry->h_addr)->s_addr, 4);

Can crash if the host doesn't exist or if you feed it a URL containing
a port number.

> +static int get_connection()

(void)

> +    local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666);

What if it fails?

                                Have a nice fortnight
--
Martin `MJ' Mares [EMAIL PROTECTED] http://atrey.karlin.mff.cuni.cz/~mj/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
A student who changes the course of history is probably taking an exam.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Get commits from remote repositories by HTTP
On Sat, 16 Apr 2005, Tony Luck wrote:

> On 4/16/05, Daniel Barkalow [EMAIL PROTECTED] wrote:
> > +buffer = read_sha1_file(sha1, type, &size);
>
> You never free this buffer.

Ideally, this should all be rearranged to share the code with read-tree,
and it should be fixed in common.

> It would also be nice if you saved tree objects in some temporary file
> and did not install them until after you had fetched all the blobs and
> trees that this tree references. Then if your connection is interrupted
> you can just restart it.

It looks over everything relevant, even if it doesn't need to download
anything, so it should work to continue if it stops in between.

        -Daniel
*This .sig left intentionally blank*
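Tony's suggestion of staging objects and installing them only once
everything they reference has arrived can be sketched in shell. This is an
illustration under stated assumptions, not code from the patch: the function
name install_staged and both directory arguments are made up here, and the
staging directory is assumed to mirror the xx/yyyy... layout of .git/objects.

```shell
# install_staged STAGE_DIR OBJECT_DIR   (hypothetical helper, not from the patch)
# Meant to be called only after every download into STAGE_DIR has succeeded,
# so an interrupted fetch never installs a tree whose blobs are missing.
install_staged() {
    stage=$1
    dest=$2
    ( cd "$stage" && find . -type f -print ) | while read -r f
    do
        f=${f#./}
        # Objects are named by their content hash, so one that is already
        # installed never needs to be replaced.
        if [ -f "$dest/$f" ]
        then
            continue
        fi
        mkdir -p "$dest/$(dirname "$f")"
        mv "$stage/$f" "$dest/$f"
    done
}
```

Because nothing reaches OBJECT_DIR until the final move, a restart after an
interruption can simply discard or reuse the staging directory.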
Re: [PATCH] Get commits from remote repositories by HTTP
Tony Luck wrote:

> Otherwise this looks really nice. I was going to script something
> similar using wget ... but that would have made zillions of separate
> connections. Not so kind to the server.

How about building a file list and doing a batch download via
'wget -i /tmp/foo'? A quick test (on my ancient wget-1.7) indicates that it
reuses connections when successive URLs point to the same server.

Writing yet another http client does seem a bit pointless, what with wget
and curl available. The real win lies in creating the smarts to get the
minimum number of files.

--Adam
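A rough sketch of the batch idea: turn a list of repository-relative paths
into absolute URLs, then hand the whole list to a single wget process. The
function names write_url_list and fetch_batch are invented for this
illustration, and the flags that control where wget stores the files are
left out on purpose.

```shell
# write_url_list BASE_URL   (hypothetical helper)
# Reads repository-relative paths such as objects/ab/cdef... on stdin and
# prints one absolute URL per line, suitable for 'wget -i'.
write_url_list() {
    base=${1%/}                 # drop one trailing slash, if present
    while read -r path
    do
        [ -n "$path" ] || continue          # skip blank lines
        printf '%s/%s\n' "$base" "${path#/}"
    done
}

# fetch_batch LIST_FILE   (hypothetical helper)
# One wget invocation for the whole list, so the connection can be reused
# when successive URLs point to the same server.
fetch_batch() {
    wget -q -i "$1"
}
```

For example, 'write_url_list "$REMOTE" < paths > /tmp/foo' followed by
'fetch_batch /tmp/foo' replaces one wget process per object with one
process per list.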
Re: [PATCH] Get commits from remote repositories by HTTP
On Sun, 17 Apr 2005, Martin Mares wrote:

> Hello!
>
> > This adds a program to download a commit, the trees, and the blobs in
> > them from a remote repository using HTTP. It skips anything you already
> > have.
>
> Is it really necessary to write your own HTTP downloader? If so, is it
> necessary to forget basic stuff like the Host: header? ;-)

I wanted to get something hacked quickly; can you suggest a good one to use?

> If you feel that it should be optimized for speed, then at least use
> persistent connections.

That's the next step.

        -Daniel
*This .sig left intentionally blank*
Re: [PATCH] Get commits from remote repositories by HTTP
On Sat, 16 Apr 2005, Adam Kropelin wrote:

> Tony Luck wrote:
> > Otherwise this looks really nice. I was going to script something
> > similar using wget ... but that would have made zillions of separate
> > connections. Not so kind to the server.
>
> How about building a file list and doing a batch download via
> 'wget -i /tmp/foo'? A quick test (on my ancient wget-1.7) indicates that
> it reuses connections when successive URLs point to the same server.

You need to look at some of the files before you know what other files to
get. You could do it in waves, but that would be excessively complicated to
code and not the most efficient anyway.

        -Daniel
*This .sig left intentionally blank*
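For illustration, the "waves" approach might look like the sketch below:
each wave is fetched as one batch, then the fetched objects are inspected to
build the next wave. Everything here is an assumption made for the example.
fetch_in_waves is a made-up name, and fetch_batch and list_children are
hypothetical hooks the caller has to define (for instance one 'wget -i' per
wave, and a cat-file/ls-tree wrapper that prints referenced ids). mktemp -d
is also assumed to be available.

```shell
# fetch_in_waves ROOT_ID   (hypothetical sketch)
# Caller-supplied hooks, not defined here:
#   fetch_batch FILE    download every object id listed in FILE
#   list_children ID    print the ids referenced by object ID, one per line
fetch_in_waves() {
    work=$(mktemp -d) || return 1
    printf '%s\n' "$1" > "$work/wave"
    : > "$work/seen"
    while [ -s "$work/wave" ]
    do
        # One batch download per wave.
        fetch_batch "$work/wave" || { rm -rf "$work"; return 1; }
        cat "$work/wave" >> "$work/seen"
        # Only now can the next wave be computed: look inside what arrived.
        while read -r id
        do
            list_children "$id"
        done < "$work/wave" | sort -u > "$work/candidates"
        # Drop ids that were already fetched in an earlier wave.
        sort -u "$work/seen" > "$work/seen.sorted"
        comm -23 "$work/candidates" "$work/seen.sorted" > "$work/wave"
    done
    rm -rf "$work"
}
```

The number of waves equals the depth of the object graph, which gives a feel
for the bookkeeping Daniel describes as excessively complicated compared
with a straightforward recursive walk.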
Re: [PATCH] Get commits from remote repositories by HTTP
Daniel Barkalow wrote:

> On Sat, 16 Apr 2005, Adam Kropelin wrote:
> > How about building a file list and doing a batch download via
> > 'wget -i /tmp/foo'? A quick test (on my ancient wget-1.7) indicates
> > that it reuses connections when successive URLs point to the same
> > server.
>
> You need to look at some of the files before you know what other files
> to get. You could do it in waves, but that would be excessively
> complicated to code and not the most efficient anyway.

Ah, yes. Makes sense. How about libcurl or another http client library,
then? Minimizing dependencies on external libraries is good, but writing a
really robust http client is a tricky business. (Not that you aren't up to
it; I just wonder if it's the best way to spend your time.)

--Adam
Re: [PATCH] Get commits from remote repositories by HTTP
> How about building a file list and doing a batch download via
> 'wget -i /tmp/foo'? A quick test (on my ancient wget-1.7) indicates that
> it reuses connections when successive URLs point to the same server.

Here's a script that does just that. So there is a burst of individual
wget commands to get HEAD, the top commit object, and all the tree objects.
Then just one to get all the missing blobs.

Subsequent runs will do far less work as many of the tree objects will not
have changed, so we don't descend into any tree that we already have.

-Tony

Not a patch ... it is a whole file. I called it git-wget, but it might
also want to be called git-pulltop.

Signed-off-by: Tony Luck [EMAIL PROTECTED]

-- script starts here --
#!/bin/sh

# Copyright (C) 2005 Tony Luck

REMOTE=http://www.kernel.org/pub/linux/kernel/people/torvalds/linux-2.6.git/

rm -rf .gittmp
# set up a temp git repository so that we can use cat-file and ls-tree on the
# objects we pull without installing them into our tree. This allows us to
# restart if the download is interrupted
mkdir .gittmp
cd .gittmp
init-db

wget -q $REMOTE/HEAD

if cmp -s ../.git/HEAD HEAD
then
    echo Already have HEAD = `cat ../.git/HEAD`
    cd ..
    rm -rf .gittmp
    exit 0
fi

sha1=`cat HEAD`
sha1file=${sha1:0:2}/${sha1:2}

if [ -f ../.git/objects/$sha1file ]
then
    echo Already have most recent commit. Update HEAD to $sha1
    cd ..
    rm -rf .gittmp
    exit 0
fi

wget -q $REMOTE/objects/$sha1file -O .git/objects/$sha1file

treesha1=`cat-file commit $sha1 | (read tag tree ; echo $tree)`

get_tree()
{
    treesha1file=${1:0:2}/${1:2}
    if [ -f ../.git/objects/$treesha1file ]
    then
        return
    fi
    wget -q $REMOTE/objects/$treesha1file -O .git/objects/$treesha1file
    ls-tree $1 | while read mode tag sha1 name
    do
        subsha1file=${sha1:0:2}/${sha1:2}
        if [ -f ../.git/objects/$subsha1file ]
        then
            continue
        fi
        if [ $mode = 4 ]
        then
            get_tree $sha1 `expr $2 + 1`
        else
            echo objects/$subsha1file >> needbloblist
        fi
    done
}

# get all the tree objects to our .gittmp area, and create list of needed blobs
get_tree $treesha1

# now get the blobs
cd ../.git
if [ -s ../.gittmp/needbloblist ]
then
    wget -q -r -nH --cut-dirs=6 --base=$REMOTE -i ../.gittmp/needbloblist
fi

# Now we have the blobs, move the trees and commit from .gittmp
cd ../.gittmp/.git/objects
find ?? -type f -print | while read f
do
    mv $f ../../../.git/objects/$f
done

# update HEAD
cd ../..
mv HEAD ../.git
cd ..
rm -rf .gittmp
-- script ends here --
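A portability note on the script: the ${sha1:0:2} substring expansion is a
bash extension rather than POSIX sh, so the path computation may fail where
/bin/sh is not bash. Below is a sketch of a portable replacement. The name
object_path is made up for this example and is not part of the script.

```shell
# object_path SHA1   (hypothetical helper)
# Prints the path of an object relative to .git/objects: the first two hex
# digits name the directory and the remaining digits name the file.
object_path() {
    prefix=$(printf '%s\n' "$1" | cut -c1-2)
    rest=$(printf '%s\n' "$1" | cut -c3-)
    printf '%s/%s\n' "$prefix" "$rest"
}
```

With this in place, a line such as sha1file=`object_path $sha1` could stand
in for sha1file=${sha1:0:2}/${sha1:2}.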