RFE: git-patch-id should handle patches without leading "diff"
Hi, all:

Every now and again I come across a patch sent to LKML without a leading
"diff a/foo b/foo" -- usually produced by quilt. E.g.:

https://lore.kernel.org/lkml/20181125185004.151077...@linutronix.de/

I am guessing quilt does not bother including the leading "diff a/foo
b/foo" because it's redundant with the next two lines, however this
remains a valid patch recognized by git-am. If you pipe that patch via
git-patch-id, it produces nothing, but if I put in the leading "diff",
like so:

diff a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c

then it properly returns "fb3ae17451bc619e3d7f0dd647dfba2b9ce8992e".

Can we please teach git-patch-id to work without the leading "diff a/foo
b/foo", same as git-am?

Best,
-K
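The behavior is easy to reproduce in a throwaway repo (a sketch; whether
git-patch-id emits anything for the header-less patch depends on your git
version, which is the point of the report):

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
echo one > foo && git add foo
git -c user.email=k@example.com -c user.name=K commit -qm init
echo two >> foo
git diff > with-diff.patch
# drop the leading "diff --git" line, mimicking a quilt-style patch
grep -v '^diff ' with-diff.patch > without-diff.patch
id_with=$(git patch-id < with-diff.patch)
id_without=$(git patch-id < without-diff.patch || true)
echo "with leading diff line:    $id_with"
echo "without leading diff line: $id_without"
```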
Re: insteadOf and git-request-pull output
On Thu, Nov 15, 2018 at 07:54:32PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > I think that if we use the "principle of least surprise," insteadOf
> > rules shouldn't be applied for git-request-pull URLs.
>
> I haven't used request-pull so I don't have much of an opinion on this,
> but do you think the same applies to 'git remote get-url '?
>
> I.e. should it also show the original unmunged URL, or the munged one
> as it does now?

I don't know, maybe both? As opposed to git-request-pull, this is not
exposing the insteadOf URL to someone other than the person who set it
up, so even if it does return the munged URL, it wouldn't be unexpected.

-K
insteadOf and git-request-pull output
Hi, all:

Looks like setting url.insteadOf rules alters the output of
git-request-pull. I'm not sure that's the intended use of insteadOf,
which is supposed to replace URLs for local use, not to expose them
publicly (but I may be wrong). E.g.:

$ git request-pull HEAD^ git://foo.example.com/example | grep example
git://foo.example.com/example
$ git config url.ssh://bar.insteadOf git://foo
$ git request-pull HEAD^ git://foo.example.com/example | grep example
ssh://bar.example.com/example

I think that if we use the "principle of least surprise," insteadOf
rules shouldn't be applied for git-request-pull URLs.

Best,
-K
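The munging itself can be reproduced in a sandbox without any server (a
sketch; all URLs are made up, and `git remote get-url` is used here since
it applies the same rewrite):

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git remote add origin git://foo.example.com/example
# local-use rewrite rule: replace the git:// prefix with ssh://
git config url.ssh://bar.example.com/.insteadOf git://foo.example.com/
url=$(git remote get-url origin)
echo "$url"
```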
Re: Generate more revenue from Git
Michal:

This is strictly a development list. If you would like to discuss any
and all monetization features, please feel free to reach out to me via
email.

Regards,
-K

On Thu, May 17, 2018 at 04:45:18PM +0300, Michal Sapozhnikov wrote:
> Hi, I would like to schedule a quick call this week. What's the best
> way to schedule a 15 minute call?
>
> Thanks,
> --
> Michal Sapozhnikov | Business Manager, Luminati SDK | +972-50-2826778 |
> Skype: live:michals_43 http://luminati.io/sdk
>
> On 10-May-18 14:04, 7d (by eremind) wrote:
> > Hi, I am writing with the hope of talking to the appropriate person
> > who handles the app's monetization. If it makes sense to have a call,
> > let me know how your schedule looks.
> >
> > Best Regards,
Re: worktrees vs. alternates
On 05/16/18 15:37, Jeff King wrote:
> Yes, that's pretty close to what we do at GitHub. Before doing any
> repacking in the mother repo, we actually do the equivalent of:
>
>   git fetch --prune ../$id.git +refs/*:refs/remotes/$id/*
>   git repack -Adl
>
> from each child to pick up any new objects to de-duplicate (our
> "mother" repos are not real repos at all, but just big shared-object
> stores).

Yes, I keep thinking of doing the same, too -- instead of using
torvalds/linux.git for alternates, have an internal repo where objects
from all forks are stored. This conversation may finally give me the
shove I've been needing to poke at this. :)

Is your delta-islands patch heading into upstream, or is that something
that's going to remain external?

> I say "equivalent" because those commands can actually be a bit slow.
> So we do some hacky tricks like directly moving objects in the
> filesystem.
>
> In theory the fetch means that it's safe to actually prune in the
> mother repo, but in practice there are still races. They don't come up
> often, but if you have enough repositories, they do eventually. :)

I feel like a whitepaper on "how we deal with bajillions of forks at
GitHub" would be nice. :) I was previously told that it's unlikely such
a paper could be written due to so many custom-built things at GH, but I
would be very happy if that turned out not to be the case.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: worktrees vs. alternates
On 05/16/18 15:23, Jeff King wrote:
> I implemented "repack -k", which keeps all objects and just rolls them
> into the new pack (along with any currently-loose unreachable objects).
> Aside from corner cases (e.g., where somebody accidentally added a 20GB
> file to an otherwise 100MB-repo and then rolled it back), it usually
> doesn't significantly affect the repository size.

Hmm... I should read manpages more often! :) So, do you suggest that
this is a better approach:

- mother repos: "git repack -adk"
- child repos: "git repack -Adl" (followed by prune)

Currently, we do "-Adl" regardless, but we already track whether a repo
is being used for alternates anywhere (so we don't prune it) and can do
different flags if that improves performance.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: worktrees vs. alternates
On 05/16/18 15:03, Martin Fick wrote:
>> I'm undecided about that. On the one hand this does create lots of
>> small files and inevitably causes (some) performance degradation. On
>> the other hand, I don't want to keep useless objects in the pack,
>> because that would also cause performance degradation for people
>> cloning the "mother repo." If my assumptions on any of that are
>> incorrect, I'm happy to learn more.
>
> My suggestion is to use science, not logic or hearsay. :)
> i.e. test it!

I think the answer will be "it depends." In many of our cases the repos
that need those loose objects are rarely accessed -- usually because
they are forks with older data (hence why they need objects that are no
longer used by the mother repo). Therefore, the performance impact of
occasionally touching a handful of loose objects will be fairly
negligible. This is especially true on non-spinning media where seek
times are low anyway. Having slimmer packs for the mother repo would be
more beneficial in this case.

On the other hand, if the "child repo" is frequently used, then the
impact of needing a bunch of loose objects would be greater.

For the sake of simplicity, I think I'll leave things as they are --
it's cheaper to fix this via reducing seek times than by applying
complicated logic trying to optimize on a per-repo basis.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: worktrees vs. alternates
On 05/16/18 14:26, Martin Fick wrote:
> If you are going to keep the unreferenced objects around forever, it
> might be better to keep them around in packed form?

I'm undecided about that. On the one hand this does create lots of small
files and inevitably causes (some) performance degradation. On the other
hand, I don't want to keep useless objects in the pack, because that
would also cause performance degradation for people cloning the "mother
repo." If my assumptions on any of that are incorrect, I'm happy to
learn more.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: worktrees vs. alternates
On 05/16/18 14:02, Ævar Arnfjörð Bjarmason wrote:
> On Wed, May 16 2018, Konstantin Ryabitsev wrote:
>
>> Maybe git-repack can be told to only borrow parent objects if they are
>> in packs. Anything not in packs should be hardlinked into the child
>> repo. That's my wishful think for the day. :)
>
> Can you elaborate on how this would help?
>
> We're just going to create loose objects on interactive "git commit",
> presumably you're not adding someone's working copy as the alternate.

The loose objects I'm thinking of are those that are generated when we
do "git repack -Ad" -- this takes all unreachable objects and loosens
them (see man git-repack for more info). Normally, these would be pruned
after a certain period, but we're deliberately keeping them around
forever just in case another repo relies on them via alternates.

I want those repos to "claim" these loose objects via hardlinks, such
that we can run git-prune on the mother repo instead of dragging all the
unreachable objects on forever just in case.

> Otherwise if it's just being pushed to all those pushes are going to be
> in packs, and the packs may contain e.g. pushes for the "pu" branch or
> whatever, which are objects that'll go away.

There are lots of cases where unreachable objects in one repo would
never become unreachable in another -- for example, if the author had
stopped updating it.

Hope this helps.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: worktrees vs. alternates
On 05/16/18 13:14, Martin Fick wrote:
> On Wednesday, May 16, 2018 10:58:19 AM Konstantin Ryabitsev wrote:
>>
>> 1. Find every repo mentioning the parent repository in their
>>    alternates
>> 2. Repack them without the -l switch (which copies all the borrowed
>>    objects into those repos)
>> 3. Once all child repos have been repacked this way, prune the parent
>>    repo (it's safe now)
>
> This is probably only true if the repos are in read-only mode? I
> suspect this is still racy on a busy server with no downtime.

We don't actually do this anywhere. :) It's a feature I keep hoping to
add one day to grokmirror, but keep putting off because of various
considerations. As you can imagine, if we have 300 forks of linux.git
all using torvalds/linux.git as their alternates, then repacking them
all without -l would balloon our disk usage 300-fold. At this time it's
just cheaper to keep a bunch of loose objects around forever at the cost
of decreased performance.

Maybe git-repack can be told to only borrow parent objects if they are
in packs. Anything not in packs should be hardlinked into the child
repo. That's my wishful think for the day. :)

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: worktrees vs. alternates
On Wed, May 16, 2018 at 05:34:34PM +0200, Ævar Arnfjörð Bjarmason wrote:
> I may have missed some edge case, but I believe this entire workaround
> isn't needed if you guarantee that the parent repo doesn't contain any
> objects that will get un-referenced.

You can't guarantee that, because the parent repo can have its history
rewritten either via a forced push, or via a rebase. Obviously, this
won't happen in something like torvalds/linux.git, which is why it's
pretty safe to alternate off of that repo for us, but codeaurora.org
repos aren't always strictly-ff (e.g. because they may rebase themselves
based on what is in upstream AOSP repos) -- so objects in them may
become unreferenced and pruned away, corrupting any repos using them for
alternates.

>> I'm very interested in GVFS, because it would certainly make my life
>> easier maintaining source.codeaurora.org, which is many thousands of
>> repos that are mostly forks of the same stuff. However, GVFS appears
>> to only exist for Windows (hint-hint, nudge-nudge). :)
>
> This should make you happy:
> https://arstechnica.com/gadgets/2017/11/microsoft-and-github-team-up-to-take-git-virtual-file-system-to-macos-linux/
>
> But I don't know what the current status is or where it can be
> followed.

Very good to know, thanks!

-K
Re: worktrees vs. alternates
On 05/16/18 09:02, Derrick Stolee wrote:
> This is the biggest difference. You cannot have the same ref checked
> out in multiple worktrees, as they both may edit that ref. The
> alternates allow you to share data in a "read only" fashion. If you
> have one repo that is the "base" repo that manages that objects dir,
> then that is probably a good way to reduce the duplication. I'm not
> familiar with what happens when a "child" repo does 'git gc' or 'git
> repack', will it delete the local objects that it sees exist in the
> alternate?

The parent repo does not keep track of any other repositories that may
be using it for alternates, which is why you basically:

1. never run auto-gc in the parent repo
2. repack it manually using -Ad to keep loose objects that other repos
   may be borrowing (but we don't know if they are)
3. never prune the parent repo, because this may delete objects other
   repos are borrowing

Very infrequently you may consider this extra set of maintenance steps:

1. Find every repo mentioning the parent repository in their alternates
2. Repack them without the -l switch (which copies all the borrowed
   objects into those repos)
3. Once all child repos have been repacked this way, prune the parent
   repo (it's safe now)
4. Repack child repos again, this time with the -l flag, to get your
   savings back.

I would heartily love a way to teach git-repack to recognize when an
object it's borrowing from the parent repo is in danger of being pruned.
The cheapest way of doing this would probably be to hardlink loose
objects into its own objects directory, and only consider "safe" those
objects that are part of the parent repository's pack. This should make
alternates a lot safer, just in case git-prune happens to run by
accident.

> GVFS uses alternates in this same way: we create a drive-wide "shared
> object cache" that GVFS manages. We put our prefetch packs filled with
> commits and trees in there, and any loose objects that are downloaded
> via the object virtualization are placed as loose objects in the
> alternate. We also store the multi-pack-index and commit-graph in that
> alternate. This means that the only objects in each src dir are those
> created by the developer doing their normal work.

I'm very interested in GVFS, because it would certainly make my life
easier maintaining source.codeaurora.org, which is many thousands of
repos that are mostly forks of the same stuff. However, GVFS appears to
only exist for Windows (hint-hint, nudge-nudge). :)

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
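The "dissociate and prune" maintenance sequence described in this
message can be sketched on a toy parent/child pair like this (a sketch
with hypothetical paths; a real deployment would iterate over every
child repo found via its alternates file):

```shell
set -e
cd "$(mktemp -d)"
git init -q parent
(cd parent && echo a > f && git add f \
   && git -c user.email=k@example.com -c user.name=K commit -qm init)
# the child borrows the parent's objects via .git/objects/info/alternates
git clone -q --shared parent child
# repack the child *without* -l, which copies the borrowed objects in
git -C child repack -a -d
rm child/.git/objects/info/alternates
# pruning (or here, even deleting) the parent is now safe
rm -rf parent
git -C child fsck --full
echo "child survived removal of its parent"
```

The final step in the sequence would be a `git repack -adl` in the child
once a new alternate is in place, to reclaim the duplicated space.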
Re: main url for linking to git source?
On Tue, May 08, 2018 at 01:51:30AM +, brian m. carlson wrote:
> On Mon, May 07, 2018 at 11:15:46AM -0700, Stefan Beller wrote:
>> There I would try to mirror Junio's list of "public repositories"
>> https://git-blame.blogspot.com/p/git-public-repositories.html
>> without officially endorsing one over another.
>
> I think I would also prefer a list of available repositories over a
> hard-coded choice. It may be that some places (say, Australia) have
> better bandwidth to one over the other, and users will be able to have
> a better experience with certain mirrors.
>
> While I'm sympathetic to the idea of referring to kernel.org because
> it's open-source and non-profit, users outside of North America are
> likely to have a less stellar experience with its mirrors, since
> they're all in North America.

I'm a bit worried that I'll come across as some kind of annoying pedant,
but git.kernel.org is actually 6 different systems available in the US,
Europe, Hong Kong, and Australia. :) We use geodns to map users to the
nearest server (I know, GeoDNS is not the best, but it's what we have
for free).

-K
Re: main url for linking to git source?
On 05/07/18 07:38, Johannes Schindelin wrote:
>> The git-scm.com site currently links to https://github.com/git/git
>> for the (non-tarball) source code. Somebody raised the question[1] of
>> whether it should point to kernel.org instead.
>>
>> Do people find one interface more or less pleasing than the other? Do
>> we want to prefer kernel.org as more "official" or less commercial?
>
> I don't really care about "official" vs "commercial", as kernel.org is
> also run by a business, so it is all "commercial" to me.

Kernel.org is a registered US non-profit organization, managed by a
non-profit industry consortium (The Linux Foundation). The entire stack
behind kernel.org is free software, excepting any firmware blobs on the
physical hardware.

I'm not trying to influence anyone's opinion of where the links should
be pointing, but it's important to point out that kernel.org and GitHub
serve different purposes:

- kernel.org provides free-as-in-liberty archive hosting on a platform
  that is not locked into any vendor.
- github.com provides an integrated development infrastructure that is
  fully closed-source, excepting the protocols.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: Is offloading to GPU a worthwhile feature?
On 04/08/18 09:59, Jakub Narebski wrote:
>> This is an entirely idle pondering kind of question, but I wanted to
>> ask. I recently discovered that some edge providers are starting to
>> offer systems with GPU cards in them -- primarily for clients that
>> need to provide streaming video content, I guess. As someone who
>> needs to run a distributed network of edge nodes for a fairly popular
>> git server, I wondered if git could at all benefit from utilizing a
>> GPU card for something like delta calculations or compression
>> offload, or if the benefits would be negligible.
>
> The problem is that you need to transfer the data from the main memory
> (host memory), geared towards low latency thanks to the cache
> hierarchy, to the GPU memory (device memory), geared towards bandwidth
> and parallel access, and back again. So to make sense, the time for
> copying the data plus the time to perform the calculations on the GPU
> must be less than the time to perform the calculations on the CPU
> (with multi-threading) -- and not all kinds of computations can be
> sped up on a GPU; you need a fine-grained, massively data-parallel
> task.

Would something like this be well-suited for tasks like routine fsck,
repacking and bitmap generation? That's the kind of workload I was
imagining it would be best suited for.

> Also you would need to keep the non-GPU and GPGPU code in sync. Some
> parts of the code do not change much, and there are also solutions to
> generate dual code from one source.
>
> Still, it might be a good idea.

I'm still totally the wrong person to be implementing this, but I do
have access to Packet.net's edge systems, which carry powerful GPUs for
projects that might need them for video streaming services. It seems a
shame to have them sitting idle if I can offload some of the RAM- and
CPU-hungry tasks like repacking to run there.

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Re: The most efficient way to test if repositories share the same objects
On 03/22/18 17:44, Junio C Hamano wrote:
> Wouldn't it be more efficient to avoid doing so one-by-one? That is,
> wouldn't
>
>   rev-list --max-parents=0 --all
>
> be a bit faster than
>
>   for-each-ref |
>   while read object type refname
>   do
>       rev-list --max-parents=0 $refname
>   done
>
> I wonder?

Yeah, you're right -- I forgot that we can pass --all. The check takes
30 seconds, which is a lot better than 12 hours. :) It's a bit heavy
still, but the msm kernel repos are among the heaviest outliers, so let
me try running with this.

Thanks for the suggestion!

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
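Junio's suggestion can be turned into a small relatedness check between
two repositories (a sketch using throwaway repos; here "a" and "b" share
history while "c" does not):

```shell
set -e
cd "$(mktemp -d)"
git init -q a
(cd a && echo x > f && git add f \
   && git -c user.email=k@example.com -c user.name=K commit -qm root)
git clone -q a b             # same history as a
git init -q c
(cd c && echo y > g && git add g \
   && git -c user.email=k@example.com -c user.name=K commit -qm other)
# root commits of every ref, sorted so comm(1) can intersect them
roots() { git -C "$1" rev-list --max-parents=0 --all | sort -u; }
roots a > ra; roots b > rb; roots c > rc
shared_ab=$(comm -12 ra rb | wc -l)
shared_ac=$(comm -12 ra rc | wc -l)
echo "roots shared by a and b: $shared_ab"
echo "roots shared by a and c: $shared_ac"
```

A nonzero intersection means the repos can meaningfully share objects
via alternates; an empty one means there is no hope of savings.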
Re: The most efficient way to test if repositories share the same objects
On 03/22/18 15:35, Junio C Hamano wrote:
> I am not sure how Konstantin defines "the most efficient", but if it
> is "with the smallest number of bits exchanged between the
> repositories", then the answer would probably be to find the root
> commit(s) in each repository and see if they share any common root(s).
> If they don't, then there is no hope of sharing objects between them,
> of course.

Hmm... so, this is a cool idea that I'd like to use, but there are two
annoying gotchas:

1. I cannot assume that refs/heads/master is meaningful -- my problem
   is actually with something like
   https://source.codeaurora.org/quic/la/kernel/msm-3.18 -- you will
   find that master is actually unborn and there are 7700 other heads
   (don't get me started on that unless you're buying me a lot of
   drinks).

2. Even if there is a HEAD I know I can use, it's pretty slow on large
   repos (e.g. linux.git):

   $ time git rev-list --max-parents=0 HEAD
   a101ad945113be3d7f283a181810d76897f0a0d6
   cd26f1bd6bf3c73cc5afe848677b430ab342a909
   be0e5c097fc206b863ce9fe6b3cfd6974b0110f4
   1da177e4c3f41524e886b7f1b8a0c1fc7321cac2

   real    0m6.311s
   user    0m6.153s
   sys     0m0.110s

   If I try to do this for each of the 7700 heads, this will take
   roughly 12 hours.

My current strategy has been pretty much:

git -C msm-3.10.git show-ref --tags -s | sort -u > /tmp/refs1
git -C msm-3.18.git show-ref --tags -s | sort -u > /tmp/refs2

and then checking whether the intersection of these matches at least
half of the refs in either repo:

#!/usr/bin/env python
import numpy

refs1 = numpy.array(open('/tmp/refs1').readlines())
refs2 = numpy.array(open('/tmp/refs2').readlines())

in_common = len(numpy.intersect1d(refs1, refs2))
if in_common > len(refs1)/2 or in_common > len(refs2)/2:
    print('Lots of shared refs')
else:
    print('None or too few shared refs')

This works well enough at least for those repos with lots of shared
tags, but will miss potentially large repos that have only heads, which
may be pointing at commits that aren't necessarily the same between the
two repos.

Thanks for your help!

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
The most efficient way to test if repositories share the same objects
Hi, all:

What is the most efficient way to test if repoA and repoB share common
commits? My goal is to automatically figure out if repoB can benefit
from setting alternates to repoA and repacking. I currently do it by
comparing the output of "show-ref --tags -s", but that does not work for
repos without tags.

Best,
--
Konstantin Ryabitsev
Is offloading to GPU a worthwhile feature?
Hi, all:

This is an entirely idle pondering kind of question, but I wanted to
ask. I recently discovered that some edge providers are starting to
offer systems with GPU cards in them -- primarily for clients that need
to provide streaming video content, I guess. As someone who needs to run
a distributed network of edge nodes for a fairly popular git server, I
wondered if git could at all benefit from utilizing a GPU card for
something like delta calculations or compression offload, or if the
benefits would be negligible.

I realize this would be silly amounts of work. But, if it's worth it,
perhaps we can benefit from all the GPU computation libs written for
cryptocoin mining and use them for something good. :)

Best,
--
Konstantin Ryabitsev
Re: Repacking a repository uses up all available disk space
On Sun, Jun 12, 2016 at 05:38:04PM -0400, Jeff King wrote:
> > - When attempting to repack, creates millions of files and
> >   eventually eats up all available disk space
>
> That means these objects fall into the unreachable category. Git will
> prune unreachable loose objects after a grace period based on the
> filesystem mtime of the objects; the default is 2 weeks.
>
> For unreachable packed objects, their mtime is jumbled in with the
> rest of the objects in the packfile. So Git's strategy is to "eject"
> such objects from the packfiles into individual loose objects, and let
> them "age out" of the grace period individually.
>
> Generally this works just fine, but there are corner cases where you
> might have a very large number of such objects, and the loose storage
> is much more expensive than the packed (e.g., because each object is
> stored individually, not as a delta).
>
> It sounds like this is the case you're running into.
>
> The solution is to lower the grace period time, with something like:
>
>   git gc --prune=5.minutes.ago
>
> or even:
>
>   git gc --prune=now

You are correct, this solves the problem, but I'm curious. The usual
maintenance for these repositories is a regular run of:

- git fsck --full
- git repack -Adl -b --pack-kept-objects
- git pack-refs --all
- git prune

The reason it's split into repack + prune instead of just gc is that we
use alternates to save on disk space and try not to prune repos that are
used as alternates by other repos, in order to avoid potential
corruption. Is there something I'm not doing that I should be doing to
avoid this problem?

Thanks for your help.

Regards,
--
Konstantin Ryabitsev
Linux Foundation Collab Projects
Montréal, Québec
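The "eject to loose, then age out" behavior Jeff describes can be seen
in miniature (a sketch; the reflogs are expired manually so the
rolled-back commit is truly unreachable, which is what triggers the
loosening):

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
echo a > f && git add f && git -c user.email=k@example.com -c user.name=K commit -qm one
echo b > f && git add f && git -c user.email=k@example.com -c user.name=K commit -qm two
git reset -q --hard HEAD^             # commit "two" becomes unreachable
git reflog expire --expire=now --all
git repack -Ad                        # -A keeps unreachable objects as loose files
before=$(git count-objects | awk '{print $1}')
git prune --expire=now                # the effect of the gc --prune=now advice
after=$(git count-objects | awk '{print $1}')
echo "loose objects: $before before prune, $after after"
```

In a pathological repo the "before" number is what balloons into
millions of files; lowering the grace period collapses it immediately.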
Repacking a repository uses up all available disk space
Hello:

I have a problematic repository that:

- Takes up 9GB on disk
- Passes 'git fsck --full' with no errors
- When cloned with --mirror, takes up 38M on the target system
- When attempting to repack, creates millions of files and eventually
  eats up all available disk space

Repacking the result of 'git clone --mirror' shows no problem, so it's
got to be something really weird with that particular instance of the
repository.

If anyone is interested in poking at this particular problem to figure
out what causes the repack process to eat up all available disk space,
you can find a tarball of the problematic repository here:

http://mricon.com/misc/src.git.tar.xz (warning: 6.6GB)

You can clone the non-problematic version of this repository from
git://codeaurora.org/quic/chrome4sdp/breakpad/breakpad/src.git

Best,
--
Konstantin Ryabitsev
Linux Foundation Collab Projects
Montréal, Québec
Re: Resumable git clone?
On Wed, Mar 02, 2016 at 12:41:20AM -0800, Junio C Hamano wrote:
> Josh Triplett <j...@joshtriplett.org> writes:
>
> > If you clone a repository, and the connection drops, the next attempt
> > will have to start from scratch. This can add significant time and
> > expense if you're on a low-bandwidth or metered connection trying to
> > clone something like Linux.
>
> For this particular issue, your friendly k.org administrator already
> has a solution. Torvalds/linux.git is made into a bundle weekly with
>
>   $ git bundle create clone.bundle --all
>
> and the result placed on the k.org CDN. So low-bandwidth cloners can
> grab it over resumable http, clone from the bundle, and then fill in
> the most recent part by fetching from k.org.

I finally got around to documenting this here:
https://kernel.org/cloning-linux-from-a-bundle.html

> The tooling to allow this kind of "bundle" (and possibly other forms
> of "CDN offload" material) to be transparently used by "git clone" was
> the proposal by Shawn Pearce mentioned elsewhere in this thread.

To reiterate, I believe that would be an awesome feature.

Regards,
--
Konstantin Ryabitsev
Linux Foundation Collab Projects
Montréal, Québec
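The end-to-end flow boils down to something like this (a sketch on a toy
repo; HEAD is included in the bundle so clone can pick a branch, and a
real client would then repoint origin at the upstream server and fetch
to catch up):

```shell
set -e
cd "$(mktemp -d)"
git init -q src
(cd src && echo x > f && git add f \
   && git -c user.email=k@example.com -c user.name=K commit -qm init)
# publisher side: snapshot everything into one resumable file
git -C src bundle create ../clone.bundle HEAD --all
# client side: a bundle is a perfectly good clone source
git clone -q clone.bundle linux
last=$(git -C linux log --oneline -1)
echo "$last"
```

Because the bundle is a single static file, any resumable transport
(plain http, rsync, even a USB stick) can deliver it, and only the delta
since the bundle was cut needs to come over the git protocol.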
Re: [PATCH v2 3/3] http-backend: spool ref negotiation requests to buffer
On 20 May 2015 at 03:37, Jeff King <p...@peff.net> wrote:
> +     /* partial read from read_in_full means we hit EOF */
> +     len += cnt;
> +     if (len < alloc) {
> +             *out = buf;
> +             warning("request size was %lu", (unsigned long)len);
> +             return len;
> +     }

Jeff:

This patch appears to work well -- the only complaint I have is that I
now have "warning: request size was NNN" all over my error logs. :) Is
it supposed to convey an actual warning message, or is it merely a debug
statement?

Best,
--
Konstantin Ryabitsev
Sr. Systems Administrator
Linux Foundation Collab Projects
541-224-6067
Montréal, Québec

--
To unsubscribe from this list: send the line "unsubscribe git" in the
body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Sources for 3.18-rc1 not uploaded
On 20/10/14 02:28 PM, Junio C Hamano wrote:
> I have to wonder why 10f343ea (archive: honor tar.umask even for pax
> headers, 2014-08-03) is a problem but an earlier change v1.8.1.1~8^2
> (archive-tar: split long paths more carefully, 2013-01-05), which also
> should have broken bit-for-bit compatibility, went unnoticed, though.
>
> What I am getting at is that correcting past mistakes in the output
> should not be forbidden unconditionally with a complaint like this.

I think Greg actually ran into that one, and uses a separate 1.7 git
tree for this reason. I can update our servers to git 2.1 (which most of
them already have), which should help with previous incompatibilities --
but not the future ones, obviously. :)

-K
Re: Sources for 3.18-rc1 not uploaded
On 20/10/14 06:28 PM, brian m. carlson wrote:
>> Junio, quite frankly, I don't think that that fix was a good idea.
>> I'd suggest having a *separate* umask for the pax headers, so that we
>> do not break this long-lasting stability of git archive output in
>> ways that are unfixable and not compatible.
>>
>> kernel.org has relied (for a *long* time) on being able to just
>> upload the signature of the resulting tar-file, because both sides
>> can generate the same tar-file bit-for-bit.
>
> It sounds like kernel.org has a bug, then. Perhaps that's the
> appropriate place to fix the issue.

It's not a bug, it's a feature (TM). KUP relies on git-archive's ability
to create identical tar archives across platforms and versions. The
benefit is that Linus or Greg can create a detached PGP signature
against a tarball created from "git archive [tag]" on their system, and
just tell kup to create the same archive remotely, thus saving them the
trouble of uploading 80MB each time they cut a release. With their
frequent travel to places where upload bandwidth is both slow and
unreliable, this ability to not have to upload hundreds of MBs each time
they cut a release is very handy and certainly helps keep kernel
releases on schedule.

So, while it's fair to point out that git-archive was never intended to
always create bit-for-bit identical output, it would be *very nice* if
this remained the case, as at least one large-ish deployment (us) finds
it really handy.

-K
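The property kup depends on is easy to check on any tag (a sketch;
determinism holds for repeated runs of a given git version, which is
exactly why 10f343ea changing the byte stream between versions was
painful):

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
echo x > f && git add f
GIT_AUTHOR_DATE='2014-10-20T00:00:00Z' GIT_COMMITTER_DATE='2014-10-20T00:00:00Z' \
  git -c user.email=k@example.com -c user.name=K commit -qm release
git tag v1.0
# tar entry mtimes come from the commit, so repeated runs are byte-identical
h1=$(git archive --format=tar v1.0 | sha256sum | cut -d' ' -f1)
h2=$(git archive --format=tar v1.0 | sha256sum | cut -d' ' -f1)
echo "$h1"
echo "$h2"
```

This is what lets a detached signature made against a locally generated
tarball verify against one regenerated on the server -- as long as both
ends run git versions with the same tar-writing code.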
Re: git archve --format=tar output changed from 1.8.1 to 1.8.2.1
On 31/01/13 12:41 PM, Greg KH wrote:
> Ugh, uploading a 431MB file over a flaky wireless connection (I end up
> doing lots of kernel releases while traveling) would be a horrible
> change. I'd rather just keep using the same older version of git that
> kernel.org is running instead.

Well, we do accept compressed archives, so you would be uploading about
80MB instead of 431MB, but that would still be a problem for anyone
releasing large tarballs over unreliable connections. I know you
routinely do 2-3 releases at once, so that would still mean uploading
120-180MB.

I don't have immediate statistics on how many people release using "kup
--tar", but I know that at least you and Linus rely on that exclusively.

Regards,
--
Konstantin Ryabitsev
Systems Administrator
Linux Foundation, kernel.org
Montréal, Québec