Re: Spring cleaning: Reducing Number Footprint of HG Repos
> Additionally I've been setting up a host named hg-archive.mozilla.org with a lower SLA to shelve repositories that have not been touched in many, many years. Deleting this old code from hg.m.o, even if it's available elsewhere, is an unpopular thing to do, so it's unsurprising I didn't receive much buy-in when I proposed it.

Cool.

> I think the real reason for this evaluation and push into other services is that it is perceived that the user repositories don't add much value, especially when you consider all of the features that could be happening from them such as triggering CI jobs based on these, and self-service collaboration.

I use them all the time to keep my personal patch queues synced across multiple machines (or even on one), and to collaborate with others on my team. I think if we advertised more how useful they are for this sort of thing it would help people work more efficiently. (And maybe tweaked the initial setup procedure just a smidge to reduce the "now edit this nonexistent file" sort of stuff.)

> Yes, the user-repo deletion is a feature and it is currently broken. It's been a corner case of the migration to local disk, and a fix has yet to be coded up. Please ping me if you're trying to remove a repository until I can fix this.

Add an option for "delete a repo", and have it say "please email bkero", for now?

Thanks for the work on hg!

--
Randell Jesup, Mozilla Corp
remove news for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
Re: Spring cleaning: Reducing Number Footprint of HG Repos
Wow, okay. A lot to address here.

The main driver for migrating user repositories to external services was that we were (and still are) crunched for disk space after restructuring our Mercurial infrastructure to use local disks. We did this for several reasons:

* An internal quote for remaining on NFS for hosting (even with just the 300GB used) would have cost us a low six-digit figure
* Mercurial devs originally said that some of our clone corruption problems may have come from NFS faking transaction atomicity (see bug 974094)
* The NFS approach did not allow us to expand to multiple datacenters (especially cost-effectively)

The 300GB limit in this case comes from repurposing the old hgweb-serving hosts to use their local disks instead of an NFS mount. These hosts came with two 300GB disks paired in a RAID-1 configuration. If this is simply a matter of disk space, we can agree to reconfigure these hosts as RAID-0 instead. The reliability should never matter since these are simply clones of the original canonical source. This is what I was spending a considerable amount of time doing.

Additionally I've been setting up a host named hg-archive.mozilla.org with a lower SLA to shelve repositories that have not been touched in many, many years. Deleting this old code from hg.m.o, even if it's available elsewhere, is an unpopular thing to do, so it's unsurprising I didn't receive much buy-in when I proposed it.

I think the real reason for this evaluation and push into other services is that it is perceived that the user repositories don't add much value, especially when you consider all of the features that could be happening from them such as triggering CI jobs based on these, and self-service collaboration.

Yes, the user-repo deletion is a feature and it is currently broken. It's been a corner case of the migration to local disk, and a fix has yet to be coded up. Please ping me if you're trying to remove a repository until I can fix this.
As for project repositories, these should totally be self-service and automated. The human-as-an-API approach to these means it is often too much work for developers to request one for simple or short collaborative projects.

Sadly, I am the only person doing Mercurial development work at Mozilla. If, as gps said, people are willing to help out with some of the development, I would be happy to test and deploy whatever changes are proposed. The code for the infrastructure is available at https://github.com/bkero/puppet-module-hg. Feel free to spin up a VM and try to improve things.

Ben
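To make the RAID-1 versus RAID-0 trade-off from this message concrete, here is a minimal sketch of the capacity arithmetic. It assumes the two 300GB disks per host described above and the 260GB usage figure quoted elsewhere in this thread. The function names are illustrative only and are not part of any Mozilla tooling.

```python
def usable_capacity_gb(disk_gb: float, disks: int, level: str) -> float:
    """Return the usable capacity of a simple array of identical disks.

    Only the two levels discussed in the thread are modeled:
    RAID-1 mirrors every disk, RAID-0 stripes across all of them.
    """
    if level == "raid1":
        # Every disk holds the same data, so capacity equals one disk.
        return disk_gb
    if level == "raid0":
        # Data is striped with no redundancy, so capacities add up.
        return disk_gb * disks
    raise ValueError(f"unsupported RAID level: {level}")


def headroom_gb(used_gb: float, disk_gb: float, disks: int, level: str) -> float:
    """Free space left after subtracting current usage."""
    return usable_capacity_gb(disk_gb, disks, level) - used_gb
```

With two 300GB disks and 260GB in use, `headroom_gb(260, 300, 2, "raid1")` gives 40GB, while `headroom_gb(260, 300, 2, "raid0")` gives 340GB. The cost is that a single disk failure in RAID-0 loses the whole array, which is acceptable only because these hosts hold clones of a canonical source.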
Re: Spring cleaning: Reducing Number Footprint of HG Repos
Hi,

Mozilla Manifesto principle #8 states:

8. Transparent community-based processes promote participation, accountability and trust.

Decision making, afaik, is a process. So...

Taras Glek wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.
>
> Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.

I'd like to question the above.

> I would like to make user repos ...

Was this decision arrived at by yourself, or through a transparent process with your releng team? And if it was transparent, where was it discussed? Would I be privy to these discussions? Or is this decision similar to the DT issue? So, in the name of transparency, how exactly did you come to this decision?

Reading your message, I understood the possible issues (read: Why?):

1) Resource: Time
2) Resource: Disk space
3) Resource: maintenance

Is the machine/VM/whatever that holds the user repos and/or non-user repos in any way tied to the CI systems? i.e. does the CI system also contain the user/non-user repos?

Also, are you sure that these are not 'mission-critical' repos (user repos and non-user repos)? The word 'seems' implies you're not sure.

Don't get me wrong. You have every right to make these decisions. I know (with 100% certainty) that this decision affects a few community projects. I'm not saying it isn't technically 'feasible' to move repos away from Mozilla's systems. It is technically do-able. Feasibility is project-dependent. What I'm not 100% certain of is whether it is the 'right' thing to do.

> Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space.
This covers #2 in the list. Disk space. From your post to gps, I quote:

> The fact that repos keep growing means that we'll have to do this migration again soon. We are at 260gb/300gb.

I can see why this might lead you to make your decision; but is this the only alternative? I mean, 300GB? How much is 1TB in the US? AIUI, hosting user and non-user repos doesn't take that much processing power, and the minimum HD size you can get nowadays is 500GB. Why not migrate to a 1TB drive? How long would that last? How long did 300GB last?

> *Non-User Repos*
> There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2.

How exactly did you come to the conclusion that 'there should be less than a dozen of those'? I'm really curious. Did you go through each of the non-user repos (as you did with the user repos) and decide which ones fit your criteria as 'mission-critical'? Which 'dozen' (or fewer) repos are you talking about?

> This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us.

Pardon me, but is this the right approach? We're talking about a lot of project branches here. 'Start experimenting' isn't something that goes well with already established processes/systems. Moving them isn't a technical issue. (We've established that it's technically do-able.) It's a systemic issue. Moving a project, say A, to a different system (3rd party or otherwise) requires changes to the underlying systems/processes that depend on that repo being where it is. So those need to be changed. Then the processes/systems are checked for errors. If it doesn't work, move the project branch elsewhere. Another set of changes. Do-able? Sure.
I'm not saying it isn't do-able. Is it necessarily the right thing to do?

> *What my hg repo needs X/Y that 3rd-party services do not provide?*
> If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla.
>
> *Why Not Move Everything to Github/Bitbucket/etc?*
> Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg.
>
> I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs). Additionally, as with Github there is no availability guarantee.

Umm. Haven't you already given reasons why moving everything to bitbucket isn't a good idea? (No availability guaranteed would
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 27/03/14 00:53, Taras Glek wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.

I think that if you truly intend to go ahead with this, the news will need way, way wider circulation than mozilla.dev.platform. I have some useful software stored in a user repo, and I only happen to read this group. It will also need much more lead time than a month.

I'm also somewhat surprised that this has been proposed without any previous attempt to measure the impact of doing it. Or has such work been done, but the results not published? How often are all these repos pulled from or pushed to? Could we achieve many of the gains by getting people to clean up after themselves, rather than eliminating the capability entirely?

I don't think you're suggesting this, but just to be clear: I'm against storing our repo of record for anything of long-term importance on any system other than our own. And yes, I know about B2G.

Gerv
Re: Spring cleaning: Reducing Number Footprint of HG Repos
> Also, if you are using a COW filesystem, initial clones should be nearly free and you'd only pay the extra copy cost for changesets added afterwards. This could help dramatically with mozilla-central clones.
>
> Out of curiosity, is there open source software for a shared Git object store?

git. git also has a wide array of interesting backends (e.g. swift) to choose from, etc. It's slightly less painful to host than hg. Yet I still don't see a compelling reason to roll our own poor imitation of github/bitbucket.

Re busted self-serve deletes mentioned in another email: https://bugzilla.mozilla.org/show_bug.cgi?id=983085

Taras
Re: Spring cleaning: Reducing Number Footprint of HG Repos
Want to move to github?

(0) sudo apt-get install python-setuptools
(1) sudo easy_install hg-git
(2) add |hggit =| under [extensions] in your .hgrc file
(3) Go to GitHub.com and create your new repo.
(4) cd hg_repo
(5) hg bookmark -r default master
(6) hg push git+ssh://g...@github.com/you/name of your repo you created in step 3

On Wednesday, March 26, 2014, Taras Glek tg...@mozilla.com wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.
>
> Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.
>
> We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space.
>
> *Non-User Repos*
> There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2.
>
> This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us.
>
> *What my hg repo needs X/Y that 3rd-party services do not provide?*
> If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla.
>
> *Why Not Move Everything to Github/Bitbucket/etc?*
> Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg.
>
> I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs). Additionally, as with Github there is no availability guarantee.
>
> Hosting arbitrary Moz-related hg repositories does not make strategic sense. We should do the absolute minimum (eg http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better.
>
> Taras
>
> ps. Footprint stats:
>
> *Largest User Repos Out Of ~130GB*
> 1.1G  dmt.alexandre_gmail.com
> 1.1G  jblandy_mozilla.com
> 1.1G  jparsons_mozilla.com
> 1.2G  bugzilla_standard8.plus.com
> 1.2G  mbrubeck_mozilla.com
> 1.2G  mrbkap_mozilla.com
> 1.3G  dcamp_campd.org
> 1.3G  jst_mozilla.com
> 1.4G  blassey_mozilla.com
> 1.4G  gszorc_mozilla.com
> 1.4G  iacobcatalin_gmail.com
> 1.5G  cpearce_mozilla.com
> 1.5G  hurley_mozilla.com
> 1.6G  bsmedberg_mozilla.com
> 1.6G  dglastonbury_mozilla.com
> 1.6G  dtc-moz_scieneer.com
> 1.6G  jlund_mozilla.com
> 1.6G  sarentz_mozilla.com
> 1.6G  sbruno_mozilla.com
> 1.7G  mshal_mozilla.com
> 1.9G  mhammond_skippinet.com.au
> 2.1G  lwagner_mozilla.com
> 2.4G  armenzg_mozilla.com
> 2.4G  dougt_mozilla.com
> 2.5G  bschouten_mozilla.com
> 2.7G  hwine_mozilla.com
> 2.8G  eakhgari_mozilla.com
> 2.8G  mozilla_kewis.ch

--
// mobile
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Wed, Mar 26, 2014 at 11:58:48PM -0700, Doug Turner wrote:
> Want to move to github?
>
> (0) sudo apt-get install python-setuptools
> (1) sudo easy_install hg-git
> (2) add |hggit =| under [extensions] in your .hgrc file
> (3) Go to GitHub.com and create your new repo.
> (4) cd hg_repo
> (5) hg bookmark -r default master
> (6) hg push git+ssh://g...@github.com/you/name of your repo you created in step 3

I don't know the current state of the github backend, but it used to be recommended to start from a fork rather than push something fresh, especially when it's as massive as mozilla-central.

Mike
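Step (2) of the recipe quoted above is a manual edit of .hgrc. As a rough illustration only, the sketch below shows one way that edit could be scripted with Python's standard configparser. The function name `enable_hggit` is hypothetical, and the approach assumes a simple .hgrc without directives such as %include, which configparser does not understand. Note also that rewriting a file this way drops its comments.

```python
import configparser
import io


def enable_hggit(hgrc_text: str) -> str:
    """Return hgrc_text with `hggit =` present under [extensions].

    This is a sketch for plain INI-style .hgrc content only.
    """
    # Disable interpolation so '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)

    if not parser.has_section("extensions"):
        parser.add_section("extensions")
    # An empty value tells Mercurial to load the extension from its
    # default location on the Python path.
    parser.set("extensions", "hggit", "")

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Calling the function twice on the same text leaves a single `hggit` entry, so it is safe to rerun. For anything beyond a trivial .hgrc, editing the file by hand as described in step (2) remains the safer route.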
Re: Spring cleaning: Reducing Number Footprint of HG Repos
For talos development we allow pointing at a user-specific repo instead of the master one. This has greatly reduced the time to bring up new tests. This could easily be hosted elsewhere, but we chose to restrict it to user repos as a security measure. You have to have cleared some form of basic authentication to use user repos, and now if someone wants to see how their talos modifications run on talos they can do that without checking them in.

A change like this will require us to either remove this functionality, make it less secure, or create busy work whenever someone new wants to point to a custom talos repository.

-Joel
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 14-03-26 07:53 PM, Taras Glek wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.
>
> Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.
>
> We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space.
>
> *Non-User Repos*
> There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2.

First of all, I applaud this and it's important to get it done.

However, we need to review what is used within the releng system and the security implications of using non-Mozilla hosting for repos. Our infra also allows the try server to test talos repositories under hg.m.o/users/blah. We should also get security sign-off for a different type of hosting of those repos.

We're putting an etherpad together with repos important to releng systems.

cheers,
Armen
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/27/2014 1:11 AM, Mike Hommey wrote:
> On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote:
>> On 3/26/14, 4:53 PM, Taras Glek wrote:
>>> *User Repos*
>>> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.
>>>
>>> Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.
>>
>> How much time do we spend operating user repositories? I follow the repos bugzilla components and most of the requests I see have little if anything to do with user repositories. And I reckon that's because user repositories are self-service.
>
> Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in Taras's list, but after looking, I figured I had an old m-c copy with old patches on top of it.

Prior to the hg migration to local disk there was (well, technically still is):

  ssh hg.mozilla.org edit repo

which allowed you to delete it. We even had/have this info on MDN.

The bug that exists today is that the deletion does not propagate out to the local-storage webheads.

~Justin Wood (Callek)
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/26/2014 9:15 PM, Taras Glek wrote:
>> Bobby Holley mailto:bobbyhol...@gmail.com
>> Wednesday, March 26, 2014 17:27
>> I don't understand what the overhead is. We don't run CI on user repos. It's effectively just ssh:// + disk space, right? That seems totally negligible.
>
> Human overhead in keeping infra running could be spent making our infra better elsewhere.
>
>> Also, project branches are pretty useful for teams working together on large projects that aren't ready to land in m-c. We only use them when we need them, so why would we shut them down?
>
> I'm not suggesting killing it. My suggestion is that project branch experience would likely be better when not hosted by mozilla. It would still trigger our c-i systems.

Except when you consider that the disposable project branches require Level 2 commit privileges, and that to commit to our repos you need to have signed the committer agreement, which grants some legal recourse if malice is done.

These project branches run on non-try machines, which have elevated rights compared with what try has, and can do much, much more harm if there is malice here.

I for one would not be happy from a security standpoint if we allowed bitbucket-hosted repos to execute arbitrary code this way.

~Justin Wood (Callek)
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/27/14, 12:53 AM, Taras Glek wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.
>
> Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.
>
> We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space.

I think it's utterly sad that we're giving up on hosting, instead of fixing it.

I have several things in my user repos that only run on our hg server, mostly because none of the other repo hosters send correct mimetypes for raw files. In particular this affects dashboards I created to share aggregated bugzilla data etc.

I'm also sad that we're removing the ability for contributors to share their mozilla-central clones, at least in large parts of the world. Pushing a full clone to some random server just isn't working for large parts of the world. And all that while the opportunity for us to help you on the data consumption is just broken. Sad.

Note, strategically, I think that mozilla needs to support developing on the web, and the github editor isn't it. It'll be web-based IDEs, which require good tooling and hosting to be on the same infrastructure.

Axel

> *Non-User Repos*
> There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2.
>
> This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us.
>
> *What my hg repo needs X/Y that 3rd-party services do not provide?*
> If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla.
>
> *Why Not Move Everything to Github/Bitbucket/etc?*
> Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg.
>
> I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs). Additionally, as with Github there is no availability guarantee.
>
> Hosting arbitrary Moz-related hg repositories does not make strategic sense. We should do the absolute minimum (eg http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better.
>
> Taras
>
> ps.
> Footprint stats:
>
> *Largest User Repos Out Of ~130GB*
> 1.1G  dmt.alexandre_gmail.com
> 1.1G  jblandy_mozilla.com
> 1.1G  jparsons_mozilla.com
> 1.2G  bugzilla_standard8.plus.com
> 1.2G  mbrubeck_mozilla.com
> 1.2G  mrbkap_mozilla.com
> 1.3G  dcamp_campd.org
> 1.3G  jst_mozilla.com
> 1.4G  blassey_mozilla.com
> 1.4G  gszorc_mozilla.com
> 1.4G  iacobcatalin_gmail.com
> 1.5G  cpearce_mozilla.com
> 1.5G  hurley_mozilla.com
> 1.6G  bsmedberg_mozilla.com
> 1.6G  dglastonbury_mozilla.com
> 1.6G  dtc-moz_scieneer.com
> 1.6G  jlund_mozilla.com
> 1.6G  sarentz_mozilla.com
> 1.6G  sbruno_mozilla.com
> 1.7G  mshal_mozilla.com
> 1.9G  mhammond_skippinet.com.au
> 2.1G  lwagner_mozilla.com
> 2.4G  armenzg_mozilla.com
> 2.4G  dougt_mozilla.com
> 2.5G  bschouten_mozilla.com
> 2.7G  hwine_mozilla.com
> 2.8G  eakhgari_mozilla.com
> 2.8G  mozilla_kewis.ch
> 2.9G  rcampbell_mozilla.com
> 3.1G  bhearsum_mozilla.com
> 3.1G  rjesup_wgate.com
> 3.2G  agal_mozilla.com
> 3.3G  axel_mozilla.com
> 3.3G  prepr-ffxbld
> 4.2G  jford_mozilla.com
> 4.3G  mgervasini_mozilla.com
> 4.6G  lsblakk_mozilla.com
> 5.0G  bsmith_mozilla.com
> 5.5G  nthomas_mozilla.com
> 5.8G  coop_mozilla.com
> 6.5G  jhopkins_mozilla.com
> 7.7G  raliiev_mozilla.com
> 9.2G  catlee_mozilla.com
> 13G   stage-ffxbld
>
> *Space Usage by Non-user repos ~100GB*
> 24K   integration/gaia-1_4
> 28K   addon-sdk
> 28K   projects/collusion
> 32K   integration/gaia-1_1_0
> 32K   projects/emscripten
> 32K   projects/Moz2D
> 32K   releases/mozilla-b2g18_v1_1_0
> 144K  projects/addon-sdk-jetperf-tests
> 268K  ipccode
> 452K  testpilot-l10n
> 500K  releases/firefox-hotfixes
> 700K  projects/python-nss
> 896K  schema-validation
> 1.2M  projects/mccoy
> 1.4M  pyxpcom
> 2.4M  platform-model
> 2.4M
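For anyone who wants to total these figures, here is a minimal sketch of a parser for the listing. It assumes the entries are `du -h` style sizes with K, M and G suffixes in powers of 1024 (the thread does not say how the numbers were produced), and it tolerates the size being fused to the name, as happens in some archived copies of this message. The names `parse_footprint` and `total_gb` are made up for this example.

```python
import re

# Conversion factors from each suffix to gigabytes (powers of 1024).
_TO_GB = {"K": 1 / (1024 * 1024), "M": 1 / 1024, "G": 1.0}

# A size token must start a word: digits, optional decimals, a unit,
# then the repo name, with or without whitespace in between.
_ENTRY = re.compile(r"(?<!\S)(\d+(?:\.\d+)?)([KMG])\s*(\S+)")


def parse_footprint(text: str) -> list[tuple[str, float]]:
    """Return (repo_name, size_in_gb) pairs found in the listing."""
    return [
        (name, float(number) * _TO_GB[unit])
        for number, unit, name in _ENTRY.findall(text)
    ]


def total_gb(text: str) -> float:
    """Sum the sizes of all entries, in gigabytes."""
    return sum(size for _, size in parse_footprint(text))
```

Feeding it only the user-repo entries gives the combined size of the largest repos, which can then be compared with the ~130GB total stated in the heading. Section headings such as "~130GB" are skipped because the size token there does not start a word, and quote markers should be stripped from the lines first.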
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/27/2014 2:58 AM, Doug Turner wrote:
> Want to move to github?
>
> (0) sudo apt-get install python-setuptools
> (1) sudo easy_install hg-git
> (2) add |hggit =| under [extensions] in your .hgrc file
> (3) Go to GitHub.com and create your new repo.
> (4) cd hg_repo
> (5) hg bookmark -r default master
> (6) hg push git+ssh://g...@github.com/you/name of your repo you created in step 3

hg-git can't run on Windows without a very, very custom and difficult-to-set-up hg. Specifically, this is because hg uses py2exe, which strips out EVERY unused python library.

And even running hg in a virtualenv is hard, because you get a MUCH slower hg due to the lack of compiled code.

I have not tested hg-git on Windows any further since I encountered the two issues above.

~Justin Wood (Callek)
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/27/2014 1:58 AM, Doug Turner wrote:
> Want to move to github?
>
> (0) sudo apt-get install python-setuptools
> (1) sudo easy_install hg-git
> (2) add |hggit =| under [extensions] in your .hgrc file
> (3) Go to GitHub.com and create your new repo.
> (4) cd hg_repo
> (5) hg bookmark -r default master
> (6) hg push git+ssh://g...@github.com/you/name of your repo you created

It's worth noting that hg-git is having some performance issues with github right now. A basic clone of a 1MB repository takes well over a minute before it starts doing anything.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 14-03-26 08:27 PM, Bobby Holley wrote:
> I don't understand what the overhead is. We don't run CI on user repos. It's effectively just ssh:// + disk space, right? That seems totally negligible.

FTR, from an operations standpoint, it is never "just". Never. If it was *just* we wouldn't even be having this conversation. Trust me.

regards,
Armen
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 27/03/2014 13:43, Justin Wood (Callek) wrote:
> On 3/27/2014 2:58 AM, Doug Turner wrote:
>> Want to move to github?
>>
>> (0) sudo apt-get install python-setuptools
>> (1) sudo easy_install hg-git
>> (2) add |hggit =| under [extensions] in your .hgrc file
>> (3) Go to GitHub.com and create your new repo.
>> (4) cd hg_repo
>> (5) hg bookmark -r default master
>> (6) hg push git+ssh://g...@github.com/you/name of your repo you created in step 3
>
> hg-git can't run without a very very custom and difficult-to-setup hg on windows. Specifically because hg uses py2exe which strips out EVERY unused python library.
>
> And even doing hg in a virtualenv is hard because you get a MUCH slower hg due to no compiled code.
>
> I have never further tested hg-git on windows after I encountered the two issues above.
>
> ~Justin Wood (Callek)

IME tortoisehg ships a much happier-making hg (than mozilla-build) that has a bunch of python libs you want. I've never used hg-git, however, so I don't know if it has enough of what you need.

~ Gijs
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 27/03/14 14:17, Armen Zambrano G. wrote:
> On 14-03-26 08:27 PM, Bobby Holley wrote:
>> I don't understand what the overhead is. We don't run CI on user repos. It's effectively just ssh:// + disk space, right? That seems totally negligible.
>
> FTR from an operations standpoint, it is never just. Never. If it was *just* we wouldn't even be having this conversation. Trust me.

To be fair, there are also considerable costs associated with outsourcing VCS hosting, mostly associated with integrating the external hosting with other systems that need to work with the repository. For example, W3C's web-platform-tests testsuite is hosted on GitHub, and as a result we have spent a non-trivial amount of effort on integration with a system for ensuring contributors agree to a CLA, a code review tool, synchronization of HEAD with a web server, and various other things. This might be less effort than doing all the hosting at the W3C (although the reason we did it was purely that GitHub is familiar to potential contributors), but of course it will all have to be thrown away if we want to move providers in the future.
Re: Spring cleaning: Reducing Number Footprint of HG Repos
What are mission-critical repos, since you just put everything in the same list?

If we start removing project branches to be put on outsourced VCS, we remove any sheriff support for that project branch since, as has been pointed out many times, we don't have access to the server-side commit hooks and can't close the tree. This may (I want to use *will* but don't have the data to prove it) impact engineering productivity. We have this situation with Gaia, which has its canonical repo on Github. Sheriffs can land checkin-needed but can't close the tree. The way the B2G people do it is to remove everyone from the repo and then re-add them (or that's how they used to do it), which then spams you with "you are now getting notifications for repository X", which is annoying.

The other thing we need to worry about is the constant DDoS of Github[1]. We have seen that when there is a massive one it will take down their site for hours, impacting engineering productivity again since people can't pull or push. I couldn't find similar reports on bitbucket, but it can happen to any third party we may use.

David

[1] https://github.com/blog/1796-denial-of-service-attacks

On 26/03/2014 23:53, Taras Glek wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.
>
> Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.
>
> We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space.
>
> *Non-User Repos*
> There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally.
I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2. This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us. *What my hg repo needs X/Y that 3rd-party services do not provide?* If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla. *Why Not Move Everything to Github/Bitbucket/etc?* Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg. I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs).Additionally, as with Github there is no availability guarantee. Hosting arbitrary Moz-related hg repositories does not make strategic sense. We should do the absolute minimum(eg http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better. Taras ps. 
Footprint stats:

*Largest User Repos Out Of ~130GB*
1.1G dmt.alexandre_gmail.com
1.1G jblandy_mozilla.com
1.1G jparsons_mozilla.com
1.2G bugzilla_standard8.plus.com
1.2G mbrubeck_mozilla.com
1.2G mrbkap_mozilla.com
1.3G dcamp_campd.org
1.3G jst_mozilla.com
1.4G blassey_mozilla.com
1.4G gszorc_mozilla.com
1.4G iacobcatalin_gmail.com
1.5G cpearce_mozilla.com
1.5G hurley_mozilla.com
1.6G bsmedberg_mozilla.com
1.6G dglastonbury_mozilla.com
1.6G dtc-moz_scieneer.com
1.6G jlund_mozilla.com
1.6G sarentz_mozilla.com
1.6G sbruno_mozilla.com
1.7G mshal_mozilla.com
1.9G mhammond_skippinet.com.au
2.1G lwagner_mozilla.com
2.4G armenzg_mozilla.com
2.4G dougt_mozilla.com
2.5G bschouten_mozilla.com
2.7G hwine_mozilla.com
2.8G eakhgari_mozilla.com
2.8G mozilla_kewis.ch
2.9G rcampbell_mozilla.com
3.1G bhearsum_mozilla.com
3.1G rjesup_wgate.com
3.2G agal_mozilla.com
3.3G axel_mozilla.com
3.3G prepr-ffxbld
4.2G jford_mozilla.com
4.3G mgervasini_mozilla.com
4.6G lsblakk_mozilla.com
5.0G bsmith_mozilla.com
5.5G nthomas_mozilla.com
5.8G coop_mozilla.com
6.5G jhopkins_mozilla.com
7.7G raliiev_mozilla.com
9.2G catlee_mozilla.com
13G stage-ffxbld

*Space Usage by Non-user repos ~100GB*
24K integration/gaia-1_4
28K addon-sdk
28K projects/collusion
32K integration/gaia-1_1_0
Re: Spring cleaning: Reducing Number Footprint of HG Repos
Taras Glek wrote:

*User Repos* TLDR: I would like to make user repos read-only by April 30th.

When that happens, I will stop running any custom crash reports and dashboards that the stability program depends on, at least until further notice. I do not want to run a non-Mozilla-hosted repo for Mozilla work stuff.

KaiRo
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/27/14, 6:37 AM, Justin Wood (Callek) wrote: On 3/26/2014 9:15 PM, Taras Glek wrote: Bobby Holley mailto:bobbyhol...@gmail.com Wednesday, March 26, 2014 17:27 I don't understand what the overhead is. We don't run CI on user repos. It's effectively just ssh:// + disk space, right? That seems totally negligible. Human overhead in keeping infra running could be spent making our infra better elsewhere. Also, project branches are pretty useful for teams working together on large projects that aren't ready to land in m-c. We only use them when we need them, so why would we shut them down? I'm not suggesting killing it. My suggestion is that project branch experience would likely be better when not hosted by mozilla. It would still trigger our c-i systems. Except when you consider the disposable project branches get Level 2 commit privs needed, and that to commit to our repos you need to have signed the committer agreement, which grants some legal recompense if malice is done. These project branches run on non try based machines which have elevated rights vs what try does, and can do much much more harm if there is malice here. I for one would not be happy from a sec standpoint if we allowed bitbucket-hosted repos to execute arbitrary code this way. The security concern should be on the scheduling front, not where the code is hosted. If a repo push incurs automation activity, we have established trust that anyone who can push to that repo can be trusted. If we don't have this automatic scheduling on push, no trust is established and there is no security concern. If a user is able to schedule automation manually (say by calling a web API), we trust the user isn't doing something nefarious. Since the scheduling API requires authentication, there shouldn't be a new security concern here. 
Even if there is an increased security concern over MITM or silent repo modification by a 3rd party, these concerns can be mitigated through proper security settings (our Mercurial clients in automation aren't currently validating x509 fingerprints) and by moving our automation jobs to execute in containers, which I believe is already in the works. That leaves us pretty much with kernel vulnerabilities (that can escape from containers), which we should be protecting ourselves against anyway. This problem is little different from what <insert cloud hosting service provider here> deals with.
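For readers who have not dealt with fingerprint pinning, here is a minimal sketch of the check being described. It is not what Mozilla automation runs. It assumes the colon-separated SHA-1 form that Mercurial's [hostfingerprints] configuration section used at the time; the function names and the idea of passing a pinned value in by hand are illustrative only.

```python
import hashlib
import hmac
import ssl


def sha1_fingerprint(der_cert: bytes) -> str:
    """Return the SHA-1 digest of a DER-encoded certificate as colon-separated hex."""
    digest = hashlib.sha1(der_cert).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches_pinned(der_cert: bytes, pinned: str) -> bool:
    """Compare a certificate's fingerprint with a pinned value, ignoring case."""
    return hmac.compare_digest(sha1_fingerprint(der_cert), pinned.strip().lower())


def fetch_der_cert(host: str, port: int = 443) -> bytes:
    """Fetch the server certificate (needs network access) and convert it to DER."""
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)
```

A client would call fetch_der_cert() for the repository host and refuse to clone or pull when matches_pinned() returns False. SHA-1 appears here only because it matches the configuration format of the period; a stronger digest would be the obvious choice today.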
Re: Spring cleaning: Reducing Number Footprint of HG Repos
*User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st.

As mentioned, too fast.

Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles.

If we're spending any significant time or money running these, let's solve that instead - I really don't think much time or money *should* be needed to run low-priority repos with non-mission-critical availability requirements.

These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.

Some issues I raised privately before this post went public, but I don't see addressed here:

* Security implications

Any dev who works on security bugs (and most do at one point or another, or might) who puts a patch queue on an external host is proxying to that host all security assurances. This makes that external hosting a tempting target for people who want to find 0-days. I'd like to say this is an excessive amount of paranoia, but given both the lucrative market for 0-days and NSA's interest in 0-days (and ability to compel or buy silence from companies or employees at companies), I no longer think this is excessive. :-( I'm less worried about silent changes to the repos to slip stuff in (though it's possible) than someone silently cataloging possible 0-day targets in repos associated with devs, especially ones marked as referring to bugs that aren't visible.

* cleanup

Per previous comments, I wasn't aware I could get rid of a user repo in any easy way (and it may actually be busted right now). Likely 50% of what's in user repos (or more) is dusty stuff that people could simply delete. I have one large and one medium repo I need to keep and some patch queues (most of which are deletable now). Anything else can go. But there's no trivial way for me to see what I have and delete them. A simple 1/month nag mail listing your private repos and their sizes would help.

* side note: my repo names are tied to the email address in my key. It's dead. I'd change my key to the new permanent email address, but I worry I might lose all my user repos.

* Backup/data-integrity/availability

Already mentioned was availability guarantees or lack thereof. We'd need to back up these external repos (and find them somehow).

Taras commented to me that we use expensive storage solutions for user repos (similar to primary repos). IMHO that's not needed: user repos need lower-SLA gear, I'd imagine - redundancy, but they could probably just live in a RAID-1+1 array with consumer drives with very high reliability (two RAID-1 arrays in a RAID-1 configuration) - you'd need a 3-drive simultaneous failure to have to fall back on backups. Hell, a single RAID-1 is probably good enough, so long as it's backed up frequently.

Taras mentioned that this is time not spent doing other things; my response: I imagine you can buy a RAID drive and just drop it in in place of $$$-expensive-drive. But yes, this requires some thought/planning/etc time for them, while moving to random-VCS-storage requires some time by N devs - the net result may be more time than if we keep it in-house. Plus time by IT to set up remote backup and negotiate something with random-VCS to let that happen. If we're dropping backup and just relying on the service, there are some additional concerns.

--
Randell Jesup, Mozilla Corp
remove news for personal email
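The monthly nag mail suggested above could start from something like the following rough sketch. It assumes a layout with one directory per user under a common root and treats any directory containing .hg as a repository; that layout, the function names, and the report format are all assumptions for illustration, not a description of how hg.mozilla.org is organised.

```python
import os


def dir_size(path):
    """Total size in bytes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def find_user_repos(users_root):
    """Map each user directory to (repo path, size) pairs, largest first."""
    report = {}
    for user in sorted(os.listdir(users_root)):
        user_dir = os.path.join(users_root, user)
        if not os.path.isdir(user_dir):
            continue
        repos = []
        for root, dirs, _files in os.walk(user_dir):
            if ".hg" in dirs:
                repos.append((os.path.relpath(root, user_dir), dir_size(root)))
                dirs[:] = []  # do not descend into a repository
        if repos:
            report[user] = sorted(repos, key=lambda item: item[1], reverse=True)
    return report


def format_nag(user, repos):
    """Build the plain-text body of a reminder listing a user's repositories."""
    lines = ["Repositories under users/%s:" % user]
    for path, size in repos:
        lines.append("  %8.1f MB  %s" % (size / 1e6, path))
    return "\n".join(lines)
```

Sending the mail is left out on purpose: the step that matters for the cleanup argument is the listing itself, which lets each owner see what they have and how much space it takes.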
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 03/27/2014 10:10 AM, Joshua Cranmer wrote:

It's worth noting that hg-git is having some performance issues with github right now. A basic clone of a 1MB repository takes well over a minute before it starts doing anything.

When I was converting my repositories last night I found that although the push to github from hg(-git) was hanging, it in fact had completed all of its work already. After a second or two you could control-C, re-push, and it would say there was nothing to do. If you checked on github, the commits would in fact be all there, and they would be there before the second push attempt or hitting control-C.

Obviously, if you are pushing something huge like a clone of mozilla-central, you may need to legitimately wait a long time. But for clones of mozilla-central it's probably most advisable and polite to fork the gecko-dev repo and either do a light-weight import of any branches using git fast-import or the fancy tooling used to produce gecko-dev in the first place. A very cursory exploration shows http://repo.or.cz/w/fast-export.git provides fast-export from hg for fast-import to git, but it's probably best to read the blog-posts for the gecko-dev conversion instead.

Andrew
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/27/2014 12:11 PM, Andrew Sutherland wrote:

On 03/27/2014 10:10 AM, Joshua Cranmer wrote:

It's worth noting that hg-git is having some performance issues with github right now. A basic clone of a 1MB repository takes well over a minute before it starts doing anything.

When I was converting my repositories last night I found that although the push to github from hg(-git) was hanging, it in fact had completed all of its work already. After a second or two you could control-C, re-push, and it would say there was nothing to do. If you checked on github, the commits would in fact be all there, and they would be there before the second push attempt or hitting control-C.

I saw that too, but the clone/pull is a different error, reported here: https://bitbucket.org/durin42/hg-git/issue/90/stuck-clone-over-git-ssh-to-bitbucketorg. Note that the hang happens here well before the work is done.

--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist
Spring cleaning: Reducing Number Footprint of HG Repos
*User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628so we can free some disk space. *Non-User Repos* There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2. This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us. *What my hg repo needs X/Y that 3rd-party services do not provide?* If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla. *Why Not Move Everything to Github/Bitbucket/etc?* Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg. I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs).Additionally, as with Github there is no availability guarantee. Hosting arbitrary Moz-related hg repositories does not make strategic sense. 
We should do the absolute minimum (e.g. http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better.

Taras

ps. Footprint stats:

*Largest User Repos Out Of ~130GB*
1.1G dmt.alexandre_gmail.com
1.1G jblandy_mozilla.com
1.1G jparsons_mozilla.com
1.2G bugzilla_standard8.plus.com
1.2G mbrubeck_mozilla.com
1.2G mrbkap_mozilla.com
1.3G dcamp_campd.org
1.3G jst_mozilla.com
1.4G blassey_mozilla.com
1.4G gszorc_mozilla.com
1.4G iacobcatalin_gmail.com
1.5G cpearce_mozilla.com
1.5G hurley_mozilla.com
1.6G bsmedberg_mozilla.com
1.6G dglastonbury_mozilla.com
1.6G dtc-moz_scieneer.com
1.6G jlund_mozilla.com
1.6G sarentz_mozilla.com
1.6G sbruno_mozilla.com
1.7G mshal_mozilla.com
1.9G mhammond_skippinet.com.au
2.1G lwagner_mozilla.com
2.4G armenzg_mozilla.com
2.4G dougt_mozilla.com
2.5G bschouten_mozilla.com
2.7G hwine_mozilla.com
2.8G eakhgari_mozilla.com
2.8G mozilla_kewis.ch
2.9G rcampbell_mozilla.com
3.1G bhearsum_mozilla.com
3.1G rjesup_wgate.com
3.2G agal_mozilla.com
3.3G axel_mozilla.com
3.3G prepr-ffxbld
4.2G jford_mozilla.com
4.3G mgervasini_mozilla.com
4.6G lsblakk_mozilla.com
5.0G bsmith_mozilla.com
5.5G nthomas_mozilla.com
5.8G coop_mozilla.com
6.5G jhopkins_mozilla.com
7.7G raliiev_mozilla.com
9.2G catlee_mozilla.com
13G stage-ffxbld

*Space Usage by Non-user repos ~100GB*
24K integration/gaia-1_4
28K addon-sdk
28K projects/collusion
32K integration/gaia-1_1_0
32K projects/emscripten
32K projects/Moz2D
32K releases/mozilla-b2g18_v1_1_0
144K projects/addon-sdk-jetperf-tests
268K ipccode
452K testpilot-l10n
500K releases/firefox-hotfixes
700K projects/python-nss
896K schema-validation
1.2M projects/mccoy
1.4M pyxpcom
2.4M platform-model
2.4M xforms
2.6M releases/mobile-1.1
2.6M venkman
2.8M www
2.9M releases/mobile-5.0
3.1M penelope
3.3M releases/mobile-2.0
3.5M tbbuild
3.7M hgcustom
3.9M releases/mobile-6.0
4.6M chatzilla
5.3M graphs
5.4M projects/kraken
6.4M projects/ldap-sdks
6.7M dom-inspector
6.7M projects/htmlparser
7.0M weave-l10n
13M mobile-browser
14M integration/gaia-ui-tests
14M projects/jss
19M projects/addon-sdk-release
20M projects/addon-sdk-beta
25M projects/nspr
25M releases/comm-1.9.2
28M rewriting-and-analysis
30M camino
30M releases/comm-esr24
30M releases/comm-miramar
31M projects/addon-sdk
35M releases/comm-1.9.1
37M releases/comm-esr17
43M gaia-l10n
44M releases/comm-esr10
48M l10n
48M qa
51M releases/gaia-l10n
52M automation
53M projects/2007-configure-rewrite
59M
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Wednesday 2014-03-26 16:53 -0700, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628so we can free some disk space. This seems like a pretty disruptive change -- it involves breaking links to the places lots of little pieces of our infrastructure live. It also means that we're not in control of our own data in a way that's often useful to us -- having access to our history is often very important for understanding the present (such as understanding why code is the way it is). If we don't have reliable archiving of our history, those of us who think it's important will end up spreading that work around and probably being less efficient at it. (For example, I try to save dev-platform threads that I think are important locally because I don't trust the Google Groups archive to be permanent.) It also makes it harder to find Mozilla-related things. For example, many of us publish version-controlled patch queues as user repositories. If I'm reviewing a patch queue and want to apply the queue, I occasionally look around at see if that user has published the patch queue as a user repository so that I can apply it. If there's no longer a standard place for them to be published, I'll end up either sorting out the patch order manually or waiting 24 hours for somebody in another timezone to wake up and tell me where it is. *Non-User Repos* There are too many non-user repos. 
I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2.

The goal of project branches is so that teams can collaborate on a project that needs continuous integration testing during its development. Are we not using it for that?

Hosting arbitrary Moz-related hg repositories does not make strategic sense. We should do the absolute minimum (e.g. http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better.

I think it makes sense if individual developers are going to end up spending more time/resources working around the fact that we don't do it than it would take to continue doing it. I don't have data one way or another, but I think it's a real possibility.

-David

--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla https://www.mozilla.org/ 𝄂
Before I built a wall I'd ask to know
What I was walling in or walling out,
And to whom I was like to give offense.
- Robert Frost, Mending Wall (1914)
Re: Spring cleaning: Reducing Number Footprint of HG Repos
I don't understand what the overhead is. We don't run CI on user repos. It's effectively just ssh:// + disk space, right? That seems totally negligible. Also, project branches are pretty useful for teams working together on large projects that aren't ready to land in m-c. We only use them when we need them, so why would we shut them down? On Wed, Mar 26, 2014 at 9:11 PM, L. David Baron dba...@dbaron.org wrote: On Wednesday 2014-03-26 16:53 -0700, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628so we can free some disk space. This seems like a pretty disruptive change -- it involves breaking links to the places lots of little pieces of our infrastructure live. It also means that we're not in control of our own data in a way that's often useful to us -- having access to our history is often very important for understanding the present (such as understanding why code is the way it is). If we don't have reliable archiving of our history, those of us who think it's important will end up spreading that work around and probably being less efficient at it. (For example, I try to save dev-platform threads that I think are important locally because I don't trust the Google Groups archive to be permanent.) It also makes it harder to find Mozilla-related things. For example, many of us publish version-controlled patch queues as user repositories. 
If I'm reviewing a patch queue and want to apply the queue, I occasionally look around to see if that user has published the patch queue as a user repository so that I can apply it. If there's no longer a standard place for them to be published, I'll end up either sorting out the patch order manually or waiting 24 hours for somebody in another timezone to wake up and tell me where it is.

*Non-User Repos* There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2.

The goal of project branches is so that teams can collaborate on a project that needs continuous integration testing during its development. Are we not using it for that?

Hosting arbitrary Moz-related hg repositories does not make strategic sense. We should do the absolute minimum (e.g. http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better.

I think it makes sense if individual developers are going to end up spending more time/resources working around the fact that we don't do it than it would take to continue doing it. I don't have data one way or another, but I think it's a real possibility.

-David

--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla https://www.mozilla.org/ 𝄂
Before I built a wall I'd ask to know
What I was walling in or walling out,
And to whom I was like to give offense.
- Robert Frost, Mending Wall (1914)
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/26/14, 4:53 PM, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. How much time do we spend operating user repositories? I follow the repos bugzilla components and most of the requests I see have little if anything to do with user repositories. And I reckon that's because user repositories are self-service. Are user repositories more than just disk space and seldom CPU usage and page cache eviction? We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628so we can free some disk space. *Non-User Repos* There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2. What about making non-user repos more self-service? (They currently require bugs for everything AFAICT.) This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us. I would *really* like the ability to trigger automation on any repo, regardless of its URL. Moving project branches elsewhere might make this happen, so +1. *What my hg repo needs X/Y that 3rd-party services do not provide?* If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla. 
*Why Not Move Everything to Github/Bitbucket/etc?* Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg. I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs).Additionally, as with Github there is no availability guarantee. A lot of it has to do with lack of hooks. Without pre-push hooks on Bitbucket or Github, there will be footgunning. The counter argument is just back out bad commits. But excessive backouts can be problematic (see our Firefox tree management and ask Jesse about bisecting impact). There is also the issue with size. Remember when GitHub disabled our mirror without notice because it became too large and became a performance problem? I can only speculate what Bitbucket will do when 1000 new 1.5+ GB clones of the Firefox repo show up. Have we asked them? In the case of Mercurial, we'll want to someday deploy Facebook's remotefilelog extension to enable shallow clones (drastically reducing clone time in the process - a game changer for new contributors who can't download 1+ GB of repo data). We may also want to deploy a bundle lookaside extension that automatically uses a bundle for initial clones. Obviously we can do these things for repos on hg.mozilla.org. But what about the user clones on Bitbucket? We may run into compatibility problems. Hosting arbitrary Moz-related hg repositories does not make strategic sense. We should do the absolute minimum(eg http://bke.ro/?p=380) required to keep Firefox shipping smoothly and focus our efforts on making Firefox better. Strategic, no. Necessary because we have no better alternative, quite possibly. 
If this boils down to maintaining the code behind hg.mozilla.org/git.mozilla.org, I and others have offered to help. I've volunteered to improve the self-service capabilities of user repos, for example. But, the code is in some private IT repository and it's difficult to get your hands on initially, to test, and deploy. Whatever the outcome of this proposal is, I hope that roadblock can be eliminated.
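To illustrate what is lost without server-side hooks, here is a minimal sketch of a tree-closure check written as an in-process Mercurial hook. It is not the hook that hg.mozilla.org runs. The flag-file path and function names are hypothetical, and a real deployment would more likely consult a tree-status service than a file on disk. The sketch relies on two Mercurial conventions: Python hooks receive ui, repo and hooktype as keyword arguments, and a true return value from a pretxnchangegroup hook aborts the incoming push.

```python
import os

# Hypothetical location of a flag file that sheriffs create to close the tree.
CLOSED_FLAG = "/var/hg/tree-closed"


def tree_is_closed(flag_path=CLOSED_FLAG):
    """The tree counts as closed while the flag file exists."""
    return os.path.exists(flag_path)


def reject_when_closed(ui, repo, hooktype=None, flag_path=CLOSED_FLAG, **kwargs):
    """pretxnchangegroup hook: returning True makes Mercurial abort the push."""
    if tree_is_closed(flag_path):
        ui.warn(b"Tree is closed; push rejected.\n")
        return True
    return False
```

On a self-hosted server this would be wired up in the repository's hgrc under [hooks], for example as pretxnchangegroup.treeclosure = python:/path/to/hookfile.py:reject_when_closed (path illustrative). A hosting service that only offers post-receive POSTs has no equivalent place to run such a check before the changesets are accepted.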
Re: Spring cleaning: Reducing Number Footprint of HG Repos
Gregory Szorc mailto:g...@mozilla.com Wednesday, March 26, 2014 17:40

On 3/26/14, 4:53 PM, Taras Glek wrote:

*User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos, seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet.

How much time do we spend operating user repositories? I follow the repos bugzilla components and most of the requests I see have little if anything to do with user repositories. And I reckon that's because user repositories are self-service. Are user repositories more than just disk space and seldom CPU usage and page cache eviction?

Some significant portion of Ben's time was spent on user repos during the multi-week webhead migration (http://bke.ro/?p=380). Bugs like https://bugzilla.mozilla.org/show_bug.cgi?id=983085 cropped up. The fact that repos keep growing means that we'll have to do this migration again soon. We are at 260gb/300gb. As long as our footprint is 40gb we can't migrate to fast, cheap, cheerful AWS nodes. We have to do something complex or expensive instead... which means more devop time. As long as our footprint keeps growing we'll keep revisiting this problem. B2G guys seem to prefer github already.

We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space.

*Non-User Repos* There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those.
I would like to stop hosting non-mission-critical repositories by end of Q2. What about making non-user repos more self-service? (They currently require bugs for everything AFAICT.) This requires more devop time. Handling bugs wastes time on both sides. Self-serve is the biggest advantage of breaking the habit and moving to a hosted service. This is a soft target. I don't have a concrete plan here. I'd like to start experimenting with moving project branches elsewhere and see where that takes us. I would *really* like the ability to trigger automation on any repo, regardless of its URL. Moving project branches elsewhere might make this happen, so +1. Right. I'm all for increased productivity through self-serve. Having a weird ad-hoc system does not seem like a win. *What my hg repo needs X/Y that 3rd-party services do not provide?* If you have a good reason to use a feature not supported by github/bitbucket, we should continue hosting your repo at Mozilla. *Why Not Move Everything to Github/Bitbucket/etc?* Mozilla prefers to keep repositories public by-default. This does not fit Github's business model which is built around private repos. Github's free service does not provide any availability guarantee. There is also a problem of github not supporting hg. I'm not completely sure why we can't move everything to bitbucket. Some of it is to do with anecdotal evidence of robustness problems. Some of it is lack of hooks (sans post-receive POSTs).Additionally, as with Github there is no availability guarantee. A lot of it has to do with lack of hooks. Without pre-push hooks on Bitbucket or Github, there will be footgunning. The counter argument is just back out bad commits. But excessive backouts can be problematic (see our Firefox tree management and ask Jesse about bisecting impact). No. Counterargument is: stop using hg like cvs. Have a staging repo and automation to transfer changesets as part of a c-i process. There is also the issue with size. 
Remember when GitHub disabled our mirror without notice because it became too large and became a performance problem? I can only speculate what Bitbucket will do when 1000 new 1.5+ GB clones of the Firefox repo show up. Have we asked them? We are asking. Part of the reason mc got pulled is heavy traffic on that repo. User project repos should generate a lot less load. It's more of a lets try this. In the case of Mercurial, we'll want to someday deploy Facebook's remotefilelog extension to enable shallow clones (drastically reducing clone time in the process - a game changer for new contributors who can't download 1+ GB of repo data). We may also want to deploy a bundle lookaside extension that automatically uses a bundle for initial clones. Obviously we can do these things for repos on hg.mozilla.org. But what about the user clones on Bitbucket? We may run into compatibility problems. There are lots of potential solutions here. I would like to deploy hg proxies(something like
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Wed, Mar 26, 2014 at 04:53:27PM -0700, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos; it seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. We are happy to help move specific hg repos to bitbucket. Once you have migrated your repository, please comment in https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk space. *Non-User Repos* There are too many non-user repos. I'm not convinced we should host ash, oak, other project branches internally. I think we should focus on mission-critical repos only. There should be less than a dozen of those. I would like to stop hosting non-mission-critical repositories by end of Q2. What about nspr, nss, comm-central, venkman, chatzilla, etc.? Mike
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote: On 3/26/14, 4:53 PM, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos; it seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. How much time do we spend operating user repositories? I follow the repos bugzilla components and most of the requests I see have little if anything to do with user repositories. And I reckon that's because user repositories are self-service. Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in Taras's list, but after looking, I figured I had an old m-c copy with old patches on top of it. Also note that, for lack of something better than mercurial's share, we sadly have to waste plenty of disk space for each copy of a mercurial repo. If mercurial's share was more like git's object alternates, that would be much less dramatic. (BTW, I don't think it would be extremely difficult to implement) Mike
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Thursday 2014-03-27 14:11 +0900, Mike Hommey wrote: Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in They're just as easy to remove as to create: https://developer.mozilla.org/en-US/docs/Creating_Mercurial_User_Repositories#Editing_your_personal_repository -David -- 𝄞 L. David Baron http://dbaron.org/ 𝄂 𝄢 Mozilla https://www.mozilla.org/ 𝄂 Before I built a wall I'd ask to know What I was walling in or walling out, And to whom I was like to give offense. - Robert Frost, Mending Wall (1914)
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On 3/26/14, 10:11 PM, Mike Hommey wrote: On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote: On 3/26/14, 4:53 PM, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos; it seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. How much time do we spend operating user repositories? I follow the repos bugzilla components and most of the requests I see have little if anything to do with user repositories. And I reckon that's because user repositories are self-service. Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in Taras's list, but after looking, I figured I had an old m-c copy with old patches on top of it. That sounds like a bug in the self-service feature! Also note that, for lack of something better than mercurial's share, we sadly have to waste plenty of disk space for each copy of a mercurial repo. If mercurial's share was more like git's object alternates, that would be much less dramatic. (BTW, I don't think it would be extremely difficult to implement) It's 2014: why are we worrying about disk space values less than 10 TB? More seriously though, it's not extremely difficult to implement a custom storage backend for Mercurial. remotefilelog does it. It's only a matter of time before someone hooks up SQL, S3, Neo4j, etc. to make server-side scaling more efficient. Also, if you are using a COW filesystem, initial clones should be nearly free and you'd only pay the extra copy cost for changesets added afterwards. This could help dramatically with mozilla-central clones. 
Out of curiosity, is there open source software for a shared Git object store?
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Wed, Mar 26, 2014 at 10:32:07PM -0700, L. David Baron wrote: On Thursday 2014-03-27 14:11 +0900, Mike Hommey wrote: Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in They're just as easy to remove as to create: https://developer.mozilla.org/en-US/docs/Creating_Mercurial_User_Repositories#Editing_your_personal_repository Doh. That's what you get from reading the outline and not associating Edit with Delete. Mike
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Thu, Mar 27, 2014 at 02:42:13PM +0900, Mike Hommey wrote: On Wed, Mar 26, 2014 at 10:32:07PM -0700, L. David Baron wrote: On Thursday 2014-03-27 14:11 +0900, Mike Hommey wrote: Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in They're just as easy to remove as to create: https://developer.mozilla.org/en-US/docs/Creating_Mercurial_User_Repositories#Editing_your_personal_repository Doh. That's what you get from reading the outline and not associating Edit with Delete. Interestingly, I just deleted that old repo, and guess what? I can still clone it, and it's still available on hgweb, while the operation itself took a while, suggesting it did, in fact, delete something. Mike
Re: Spring cleaning: Reducing Number Footprint of HG Repos
On Wed, Mar 26, 2014 at 10:40:39PM -0700, Gregory Szorc wrote: On 3/26/14, 10:11 PM, Mike Hommey wrote: On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote: On 3/26/14, 4:53 PM, Taras Glek wrote: *User Repos* TLDR: I would like to make user repos read-only by April 30th. We should archive them by May 31st. Time spent operating user repositories could be spent reducing our end-to-end continuous integration cycles. These do not seem like mission-critical repos; it seems like developers would be better off hosting these on bitbucket or github. Using a 3rd-party host has obvious benefits for collaboration self-service that our existing system will never meet. How much time do we spend operating user repositories? I follow the repos bugzilla components and most of the requests I see have little if anything to do with user repositories. And I reckon that's because user repositories are self-service. Note that while user repositories are self-service on the creation side, there is no obvious way to self-service a user repo removal. I'm not in Taras's list, but after looking, I figured I had an old m-c copy with old patches on top of it. That sounds like a bug in the self-service feature! Also note that, for lack of something better than mercurial's share, we sadly have to waste plenty of disk space for each copy of a mercurial repo. If mercurial's share was more like git's object alternates, that would be much less dramatic. (BTW, I don't think it would be extremely difficult to implement) It's 2014: why are we worrying about disk space values less than 10 TB? More seriously though, it's not extremely difficult to implement a custom storage backend for Mercurial. remotefilelog does it. It's only a matter of time before someone hooks up SQL, S3, Neo4j, etc. to make server-side scaling more efficient. That doesn't even need sql, s3, or whatever. Just that a shared clone have local filelogs. 
Also, if you are using a COW filesystem, initial clones should be nearly free and you'd only pay the extra copy cost for changesets added afterwards. This could help dramatically with mozilla-central clones. Out of curiosity, is there open source software for a shared Git object store? git. Mike
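For readers unfamiliar with the "object alternates" idea mentioned in this exchange: in git, a repository can list other object directories to consult when an object is not found locally, so a clone only stores what it adds itself. The following is a toy Python model of that lookup, not git's actual implementation. The class name `ObjectStore`, its methods, and the sample data are all hypothetical and exist only for illustration.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store with alternates (illustrative only, not git's code)."""

    def __init__(self, alternates=()):
        self.objects = {}                    # object id -> bytes held locally
        self.alternates = list(alternates)   # other stores consulted on a local miss

    @staticmethod
    def object_id(data: bytes) -> str:
        # Git names a blob by the SHA-1 of a short header plus the content.
        header = b"blob " + str(len(data)).encode() + b"\0"
        return hashlib.sha1(header + data).hexdigest()

    def has(self, oid: str) -> bool:
        return oid in self.objects or any(alt.has(oid) for alt in self.alternates)

    def write(self, data: bytes) -> str:
        oid = self.object_id(data)
        if not self.has(oid):                # skip objects an alternate already holds
            self.objects[oid] = data
        return oid

    def read(self, oid: str) -> bytes:
        if oid in self.objects:
            return self.objects[oid]
        for alt in self.alternates:
            if alt.has(oid):
                return alt.read(oid)
        raise KeyError(oid)

    def local_bytes(self) -> int:
        return sum(len(d) for d in self.objects.values())


# A shared store holding the "upstream" history (made-up sample data).
shared = ObjectStore()
history = [b"revision %d of some file\n" % i for i in range(1000)]
ids = [shared.write(blob) for blob in history]

# A user clone that borrows from the shared store and adds one local patch.
user_clone = ObjectStore(alternates=[shared])
for blob in history:
    user_clone.write(blob)
user_clone.write(b"a local patch\n")

print(shared.local_bytes(), user_clone.local_bytes())
```

In this model the user clone pays disk space only for the patch it adds, which is the property the thread contrasts with full per-user copies of a large repository.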