Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-04-18 Thread Randell Jesup
Additionally I've been setting up a host named hg-archive.mozilla.org with
a lower SLA to shelve repositories that have not been touched in many many
years. Deleting this old code from hg.m.o, even if it's available elsewhere,
is an unpopular thing to do, so it's unsurprising I didn't receive much
buy-in when I proposed it.

Cool.

I think the real reason for this evaluation and push into other services is
that it is perceived that the user repositories don't add much value,
especially when you consider all of the features that could be happening
from them such as triggering CI jobs based on these, and self-service
collaboration.

I use them all the time to keep my personal patch queues synced across
multiple machines (or even on one), and to collaborate with others on my
team.  I think if we advertised more how useful they are for this sort
of thing it would help people work more efficiently.  (And maybe tweaked
the initial setup procedure just a smidge to reduce the "now edit this
non-existent file" sort of stuff.)

Yes, the user-repo deletion is a feature and it is currently broken. It's
been a corner-case of the migration to local disk, and a fix has yet to be
coded up. Please ping me if you're trying to remove a repository until I
can fix this.

Add an option for "delete a repo", and have it say "please email bkero"
for now?

Thanks for the work on hg!

-- 
Randell Jesup, Mozilla Corp
remove news for personal email
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-04-01 Thread ben . kero
Wow, okay. A lot to address here.

The primary push to migrate user repositories to external services came 
when we were (and still are) crunched for disk space after restructuring 
our Mercurial infrastructure to use local disks.

We did this for several reasons:
* An internal quote for remaining on NFS for hosting (even with just the 300GB 
used) would have cost us a low six-digit figure
* Mercurial devs originally said that some of our clone corruption problems may 
have come from NFS faking transaction atomicity (see bug 974094)
* This approach did not allow us to expand to multiple datacenters (especially 
cost-effectively)

The 300GB limit in this case comes from repurposing the old hgweb-serving hosts 
to use their local disks instead of an NFS mount. These hosts came with two 
300GB disks paired in RAID-1 configuration. If this is simply a matter of disk 
space we can agree to reconfigure these hosts as RAID-0 instead. The 
reliability should never matter since these are simply clones of the original 
canonical source. This is what I was spending a considerable amount of time 
doing.
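
To make the RAID trade-off above concrete, here is a minimal sketch of the capacity arithmetic, assuming two identical 300GB disks as described. The function name is hypothetical, and the figures are nominal (filesystem overhead is ignored).

```python
def usable_capacity_gb(disk_gb: int, disks: int, level: str) -> int:
    """Nominal usable capacity of an array of identical disks.

    Only the two layouts mentioned in the thread are modelled:
    RAID-1 (mirroring) and RAID-0 (striping).
    """
    if disks < 2:
        raise ValueError("an array needs at least two disks")
    if level == "raid1":
        # Every disk holds a full copy, so capacity equals one disk.
        return disk_gb
    if level == "raid0":
        # Data is striped across all disks with no redundancy.
        return disk_gb * disks
    raise ValueError(f"unsupported RAID level: {level}")


# The hgweb hosts described above: two 300GB disks each.
print(usable_capacity_gb(300, 2, "raid1"))
print(usable_capacity_gb(300, 2, "raid0"))
```

Moving to RAID-0 doubles the nominal space at the cost of losing the array if either disk fails, which is why it only makes sense for hosts that hold re-creatable clones.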

Additionally I've been setting up a host named hg-archive.mozilla.org with a 
lower SLA to shelve repositories that have not been touched in many many years. 
Deleting this old code from hg.m.o, even if it's available elsewhere, is an 
unpopular thing to do, so it's unsurprising I didn't receive much buy-in when I 
proposed it.

I think the real reason for this evaluation and push into other services is 
that it is perceived that the user repositories don't add much value, 
especially when you consider all of the features that could be happening from 
them such as triggering CI jobs based on these, and self-service collaboration.

Yes, the user-repo deletion is a feature and it is currently broken. It's been 
a corner-case of the migration to local disk, and a fix has yet to be coded up. 
Please ping me if you're trying to remove a repository until I can fix this.

As for project repositories, these should totally be self-service and 
automated. The human-as-an-API approach to these means it is often too much 
work for developers to request one for simple or short collaborative projects. 

Sadly for Mercurial development at Mozilla it is just me for the development 
work. If, as gps said, people are willing to help out with some of the 
development I would be happy to test and deploy whatever changes are proposed. 
The code for the infrastructure is available at 
https://github.com/bkero/puppet-module-hg. Feel free to spin up a VM and try to 
improve things.

Ben


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-28 Thread E Wong

Hi,

Mozilla Manifesto principle #8 states:

8. Transparent community-based processes promote participation, 
accountability and trust.


Decision making, afaik, is a process.

So...

Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.

Time spent operating user repositories could be spent reducing our
end-to-end continuous integration cycles. These do not seem like
mission-critical repos, seems like developers would be better off
hosting these on bitbucket or github. Using a 3rd-party host has obvious
benefits for collaboration & self-service that our existing system will
never meet.


I'd like to question the above.  "I would like to make user repos ..."
Was this decision arrived at by yourself, or through a transparent
process with your releng team?  And if it was transparent,
where was it discussed?  Would I be privy to these discussions?
Or is this decision similar to the DT issue?

So, in the name of transparency, how exactly did you come about in
deciding this?

Reading your message, I understood the possible issues (read: Why?):

1) Resource: Time
2) Resource: Disk space
3) Resource: maintenance

Is the machine/vm/whatever that holds the user repos and/or
non-user repos in any way tied to the CI systems? i.e., does the
CI system also contain the user/non-user repos?

Also, are you sure that these are not 'mission-critical'
repos (user-repos and non-user repos)?  The word 'seems'
implies you're not sure.

Don't get me wrong.  You have every right to make these
decisions.  I know (with 100% certainty) that
this decision affects a few community projects.

I'm not saying it isn't technically 'feasible' to move repos away
from Mozilla's systems.  It is technically do-able. Feasibility
is project-dependent.  What I'm not 100% certain of is whether it is the 
'right' thing to do.




Once you have migrated your repository, please comment in
https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some
disk space.


This covers #2 in the list.  Disk space. From your post to gps, I
quote:

The fact that repos keep growing means that we'll have to do this 
migration again soon. We are at 260gb/300gb.


I can see why this might lead you to make your decision; but is
this the only alternative?  I mean 300GB?  How much is 1TB in
the US?  AIUI, hosting user and non-user repos doesn't take that
much processing power and the minimum HD size you can get
nowadays is 500GB.  Why not migrate it to a 1TB drive?
How long would that last?  How long did 300GB last?
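
The "how long would that last" question can be sketched as simple arithmetic. The 260GB used and 300GB capacity figures come from the quoted post; the monthly growth rate below is purely a placeholder assumption, since the thread gives no measured figure, and the function names are hypothetical.

```python
def disk_headroom(used_gb: float, capacity_gb: float) -> tuple[float, float]:
    """Return (free space in GB, percentage of capacity in use)."""
    free_gb = capacity_gb - used_gb
    percent_used = 100.0 * used_gb / capacity_gb
    return free_gb, percent_used


def months_until_full(used_gb: float, capacity_gb: float,
                      growth_gb_per_month: float) -> float:
    """Months until the disk fills, assuming constant linear growth."""
    if growth_gb_per_month <= 0:
        raise ValueError("growth rate must be positive")
    return max(capacity_gb - used_gb, 0.0) / growth_gb_per_month


free_gb, percent_used = disk_headroom(260, 300)
print(f"{free_gb:.0f}GB free, {percent_used:.1f}% used")

# ASSUMPTION: 10GB/month is an illustrative growth rate, not a measurement.
for capacity in (300, 1000):
    print(capacity, months_until_full(260, capacity, 10))
```

With a real growth rate taken from monitoring data, the same calculation would show how much time a larger drive actually buys.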



*Non-User Repos*
There  are too many non-user repos. I'm not convinced we should host
ash, oak, other project branches internally. I think we should focus on
mission-critical repos only. There should be less than a dozen of those.
I would like to stop hosting non-mission-critical repositories by end of
Q2.


How exactly did you come to the conclusion that 'there should be less
than a dozen of those'?  I'm really curious. Did you go through each
non-user repo (as you did with the user repos) and decide which ones
fit your criteria as 'mission-critical'?

Which 'dozen' (or fewer) repos are you talking about?



This is a soft target. I don't have a concrete plan here. I'd like to
start experimenting with moving project branches elsewhere and see where
that takes us.


Pardon me, but is this the right approach?  We're talking about a lot
of project branches here. 'Start experimenting' isn't something that
would go well with already established processes/systems. Moving them
isn't a technical issue. (We've established that it's technically
do-able.)  It's a systematic issue.  Moving a project, say A, to a
different system (3rd party or otherwise) requires some changes to the
underlying systems/processes that require that repo to be where it is.
So those need to be changed.  Then the processes/systems are
checked for errors.  If it doesn't work, move the project branch
elsewhere.  Another set of changes.  Do-able?  Sure.  I'm not
saying it isn't do-able.  Is it necessarily the right thing
to do?



*What my hg repo needs X/Y that 3rd-party services do not provide?*
If you have a good reason to use a feature not supported by
github/bitbucket, we should continue hosting your repo at Mozilla.

*Why Not Move Everything to Github/Bitbucket/etc?*
Mozilla prefers to keep repositories public by-default. This does not
fit Github's business model which is built around private repos.
Github's free service does not provide any availability guarantee.
There is also a problem of github not supporting hg.

I'm not completely sure why we can't move everything to bitbucket. Some
of it is to do with anecdotal evidence of robustness problems. Some of
it is lack of hooks (sans post-receive POSTs). Additionally, as with
Github there is no availability guarantee.



Umm.  Haven't you already given reasons why moving everything to
bitbucket isn't a good idea?  (No availability guaranteed would

Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Gervase Markham

On 27/03/14 00:53, Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.


I think that if you truly intend to go ahead with this, the news will 
need way, way wider circulation than mozilla.dev.platform. I have some 
useful software stored in a user repo, and I only happen to read this 
group. It will also need much more lead time than a month.


I'm also somewhat surprised that this has been proposed without any 
previous attempt to measure the impact of doing it. Or has such work 
been done, but the results not published? How often are all these repos 
pulled from or pushed to? Could we achieve many of the gains by getting 
people to clean up after themselves, rather than eliminating the 
capability entirely?


I don't think you're suggesting this, but just to be clear: I'm against 
storing our repo of record for anything of long-term importance on any 
system other than our own. And yes, I know about B2G.


Gerv



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Taras Glek



Also, if you are using a COW filesystem, initial clones should be nearly
free and you'd only pay the extra copy cost for changesets added afterwards.
This could help dramatically with mozilla-central clones.

Out of curiosity, is there open source software for a shared Git object
store?


git.
git also has a wide array of interesting backends (e.g. swift) to choose 
from, etc. It's slightly less painful to host than hg. Yet I still don't 
see a compelling reason to roll our own poor imitation of github/bitbucket.

re busted self-serve deletes in another email: 
https://bugzilla.mozilla.org/show_bug.cgi?id=983085


Taras





Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Doug Turner
Want to move to github?

(0) sudo apt-get install python-setuptools
(1) sudo easy_install hg-git
(2) add |hggit =| under [extensions] in your .hgrc file
(3) Go to GitHub.com and create your new repo.
(4) cd hg_repo
(5) hg bookmark -r default master
(6) hg push git+ssh://g...@github.com/you/name of your repo you created
in step 3
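
Step (2) above can also be scripted. The following is a minimal sketch using Python's standard `configparser`; the helper name `enable_extension` is hypothetical. It assumes a simple .hgrc: `configparser` drops comments when it rewrites the file and will not parse Mercurial-specific directives such as `%include`, so treat this as an illustration rather than a safe general-purpose editor.

```python
import configparser
import io


def enable_extension(hgrc_text: str, name: str = "hggit") -> str:
    """Return hgrc_text with `name =` present under [extensions]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(hgrc_text)
    if not parser.has_section("extensions"):
        parser.add_section("extensions")
    if not parser.has_option("extensions", name):
        # An empty value tells Mercurial to load the extension by name.
        parser.set("extensions", name, "")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Example input; the username is a made-up placeholder.
print(enable_extension("[ui]\nusername = Jane Doe <jane@example.com>\n"))
```

Editing the file by hand, as in the original step, remains the simplest option for a one-off migration.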


On Wednesday, March 26, 2014, Taras Glek tg...@mozilla.com wrote:

 *User Repos*
 TLDR: I would like to make user repos read-only by April 30th. We should
 archive them by May 31st.

 Time spent operating user repositories could be spent reducing our
 end-to-end continuous integration cycles. These do not seem like
 mission-critical repos, seems like developers would be better off hosting
 these on bitbucket or github. Using a 3rd-party host has obvious benefits
 for collaboration & self-service that our existing system will never meet.

 We are happy to help move specific hg repos to bitbucket.

 Once you have migrated your repository, please comment in
 https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some
 disk space.

 *Non-User Repos*
 There  are too many non-user repos. I'm not convinced we should host ash,
 oak, other project branches internally. I think we should focus on
  mission-critical repos only. There should be less than a dozen of those. I
 would like to stop hosting non-mission-critical repositories by end of Q2.

 This is a soft target. I don't have a concrete plan here. I'd like to
 start experimenting with moving project branches elsewhere and see where
 that takes us.

 *What my hg repo needs X/Y that 3rd-party services do not provide?*
 If you have a good reason to use a feature not supported by
 github/bitbucket, we should continue hosting your repo at Mozilla.

 *Why Not Move Everything to Github/Bitbucket/etc?*
 Mozilla prefers to keep repositories public by-default. This does not fit
 Github's business model which is built around private repos. Github's free
 service does not provide any availability guarantee. There is also a
 problem of github not supporting hg.

 I'm not completely sure why we can't move everything to bitbucket. Some of
 it is to do with anecdotal evidence of robustness problems. Some of it is
 lack of hooks (sans post-receive POSTs). Additionally, as with Github there
 is no availability guarantee.

 Hosting arbitrary Moz-related hg repositories does not make strategic
 sense. We should do the absolute minimum(eg http://bke.ro/?p=380)
 required to keep Firefox shipping smoothly and focus our efforts on making
 Firefox better.


 Taras


 ps. Footprint stats:

 *Largest User Repos Out Of ~130GB*
 1.1G dmt.alexandre_gmail.com
 1.1G jblandy_mozilla.com
 1.1G jparsons_mozilla.com
 1.2G bugzilla_standard8.plus.com
 1.2G mbrubeck_mozilla.com
 1.2G mrbkap_mozilla.com
 1.3G dcamp_campd.org
 1.3G jst_mozilla.com
 1.4G blassey_mozilla.com
 1.4G gszorc_mozilla.com
 1.4G iacobcatalin_gmail.com
 1.5G cpearce_mozilla.com
 1.5G hurley_mozilla.com
 1.6G bsmedberg_mozilla.com
 1.6G dglastonbury_mozilla.com
 1.6G dtc-moz_scieneer.com
 1.6G jlund_mozilla.com
 1.6G sarentz_mozilla.com
 1.6G sbruno_mozilla.com
 1.7G mshal_mozilla.com
 1.9G mhammond_skippinet.com.au
 2.1G lwagner_mozilla.com
 2.4G armenzg_mozilla.com
 2.4G dougt_mozilla.com
 2.5G bschouten_mozilla.com
 2.7G hwine_mozilla.com
 2.8G eakhgari_mozilla.com
 2.8G mozilla_kewis.ch



-- 
// mobile


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Mike Hommey
On Wed, Mar 26, 2014 at 11:58:48PM -0700, Doug Turner wrote:
 Want to move to github?
 
 (0) sudo apt-get install python-setuptools
 (1) sudo easy_install hg-git
 (2) add |hggit =| under [extensions] in your .hgrc file
 (3) Go to GitHub.com and create your new repo.
 (4) cd hg_repo
 (5) hg bookmark -r default master
 (6) hg push git+ssh://g...@github.com/you/name of your repo you created
 in step 3

I don't know the state of the github backend, but it used to be recommended
to start from a fork rather than push something fresh, especially when it's
as massive as mozilla-central.

Mike


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread jmaher
For talos development we allow pointing at a user-specific repo instead of the 
master one.  This has greatly reduced the time to bring up new tests.  This 
could easily be hosted elsewhere, but we chose to restrict it to user repos as 
a security measure.  You have to have cleared some form of basic authentication 
to use user repos, and now if someone wants to see how their talos modifications 
run on talos they can do that without checking them in.

A change like this will require us to either remove this functionality, make it 
less secure, or create busy work whenever someone new wants to point to a 
custom talos repository.
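
The kind of restriction described above can be sketched as a URL allowlist check. This is not the actual talos code: the function name, the accepted schemes, and the expected path layout (`/users/<account>/<repo>`) are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# ASSUMPTION: user repos live under this host and path prefix.
ALLOWED_HOST = "hg.mozilla.org"
ALLOWED_PREFIX = "/users/"


def is_allowed_talos_repo(url: str) -> bool:
    """Accept only repository URLs under hg.mozilla.org/users/."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Compare the whole hostname so look-alike domains are rejected.
    if parts.hostname != ALLOWED_HOST:
        return False
    segments = [s for s in parts.path.split("/") if s]
    if ".." in segments:
        return False  # refuse path traversal out of /users/
    # Require /users/<account>/<repo> at minimum.
    return parts.path.startswith(ALLOWED_PREFIX) and len(segments) >= 3


# The account name below is a made-up placeholder.
print(is_allowed_talos_repo("https://hg.mozilla.org/users/someone_example.com/talos"))
print(is_allowed_talos_repo("https://bitbucket.org/someone/talos"))
```

If custom repos were allowed on third-party hosts, a check like this would have to be replaced by some other way of establishing who controls the repository.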

-Joel


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Armen Zambrano G.
On 14-03-26 07:53 PM, Taras Glek wrote:
 *User Repos*
 TLDR: I would like to make user repos read-only by April 30th. We should
 archive them by May 31st.
 
 Time spent operating user repositories could be spent reducing our
 end-to-end continuous integration cycles. These do not seem like
 mission-critical repos, seems like developers would be better off
 hosting these on bitbucket or github. Using a 3rd-party host has obvious
 benefits for collaboration & self-service that our existing system will
 never meet.
 
 We are happy to help move specific hg repos to bitbucket.
 
 Once you have migrated your repository, please comment in
 https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some
 disk space.
 
 *Non-User Repos*
 There  are too many non-user repos. I'm not convinced we should host
 ash, oak, other project branches internally. I think we should focus on 
 mission-critical repos only. There should be less than a dozen of those.
 I would like to stop hosting non-mission-critical repositories by end of
 Q2.

First of all, I applaud this and it's important to get it done. However,
we need to review what is used within the releng system and the security
implications of using non-mozilla hosting for repos.

Our infra also allows on the try server to test talos repositories under
hg.m.o/users/blah. We should also get security sign-off for a different
type of hosting of those repos.

We're putting an etherpad together with repos important to releng systems.

cheers,
Armen




Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Justin Wood (Callek)

On 3/27/2014 1:11 AM, Mike Hommey wrote:

On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote:

On 3/26/14, 4:53 PM, Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.

Time spent operating user repositories could be spent reducing our
end-to-end continuous integration cycles. These do not seem like
mission-critical repos, seems like developers would be better off
hosting these on bitbucket or github. Using a 3rd-party host has obvious
benefits for collaboration & self-service that our existing system will
never meet.


How much time do we spend operating user repositories? I follow the repos
bugzilla components and most of the requests I see have little if anything
to do with user repositories. And I reckon that's because user repositories
are self-service.


Note that while user repositories are self-service on the creation side,
there is no obvious way to self-service a user repo removal. I'm not in
Taras's list, but after looking, I figured I had an old m-c copy with
old patches on top of it.


Prior to the hg migration to local disk there was (well technically 
still is):


ssh hg.mozilla.org edit repo

which allowed you to delete it. We even had/have this info on MDN. The 
bug that exists today is that the deletion does not propagate out to the 
local-storage webheads.


~Justin Wood (Callek)



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Justin Wood (Callek)

On 3/26/2014 9:15 PM, Taras Glek wrote:




Bobby Holley mailto:bobbyhol...@gmail.com
Wednesday, March 26, 2014 17:27
I don't understand what the overhead is. We don't run CI on user
repos. It's effectively just ssh:// + disk space, right? That seems
totally negligible.

Human overhead in keeping infra running could be spent making our infra
better elsewhere.


Also, project branches are pretty useful for teams working together on
large projects that aren't ready to land in m-c. We only use them when
we need them, so why would we shut them down?

I'm not suggesting killing it. My suggestion is that project branch
experience would likely be better when not hosted by mozilla. It would
still trigger our c-i systems.


Except when you consider that the disposable project branches require Level 2 
commit privs, and that to commit to our repos you need to have 
signed the committer agreement, which grants some legal recourse if 
malice is done.


These project branches run on non-try-based machines which have 
elevated rights vs what try does, and can do much much more harm if 
there is malice here.


I for one would not be happy from a sec standpoint if we allowed 
bitbucket-hosted repos to execute arbitrary code this way.


~Justin Wood (Callek)



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Axel Hecht

On 3/27/14, 12:53 AM, Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.

Time spent operating user repositories could be spent reducing our
end-to-end continuous integration cycles. These do not seem like
mission-critical repos, seems like developers would be better off
hosting these on bitbucket or github. Using a 3rd-party host has obvious
benefits for collaboration & self-service that our existing system will
never meet.

We are happy to help move specific hg repos to bitbucket.

Once you have migrated your repository, please comment in
https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some
disk space.


I think it's utterly sad-making that we're giving up on hosting, instead 
of fixing it.


I have several things in my user repos that only run on our hg server, 
mostly because all other repo hosts don't send correct mimetypes for 
raw files. In particular this affects dashboards I created to share 
aggregated bugzilla data etc.


I'm also sad that we're removing the ability for contributors to share 
their mozilla-central clones, at least in large parts of the world. 
Pushing a full clone to some random server just isn't working for large 
parts of the world.


And all that while the opportunity for us to help you on the data 
consumption is just broken.


Sad.

Note, strategically, I think that mozilla needs to support developing on 
the web, and the github editor isn't it. It'll be web-based IDEs, which 
require good tooling and hosting to be on the same infrastructure.


Axel



*Non-User Repos*
There  are too many non-user repos. I'm not convinced we should host
ash, oak, other project branches internally. I think we should focus on
mission-critical repos only. There should be less than a dozen of those.
I would like to stop hosting non-mission-critical repositories by end of
Q2.

This is a soft target. I don't have a concrete plan here. I'd like to
start experimenting with moving project branches elsewhere and see where
that takes us.

*What my hg repo needs X/Y that 3rd-party services do not provide?*
If you have a good reason to use a feature not supported by
github/bitbucket, we should continue hosting your repo at Mozilla.

*Why Not Move Everything to Github/Bitbucket/etc?*
Mozilla prefers to keep repositories public by-default. This does not
fit Github's business model which is built around private repos.
Github's free service does not provide any availability guarantee.
There is also a problem of github not supporting hg.

I'm not completely sure why we can't move everything to bitbucket. Some
of it is to do with anecdotal evidence of robustness problems. Some of
it is lack of hooks (sans post-receive POSTs). Additionally, as with
Github there is no availability guarantee.

Hosting arbitrary Moz-related hg repositories does not make strategic
sense. We should do the absolute minimum(eg http://bke.ro/?p=380)
required to keep Firefox shipping smoothly and focus our efforts on
making Firefox better.


Taras


ps. Footprint stats:

*Largest User Repos Out Of ~130GB*
1.1G dmt.alexandre_gmail.com
1.1G jblandy_mozilla.com
1.1G jparsons_mozilla.com
1.2G bugzilla_standard8.plus.com
1.2G mbrubeck_mozilla.com
1.2G mrbkap_mozilla.com
1.3G dcamp_campd.org
1.3G jst_mozilla.com
1.4G blassey_mozilla.com
1.4G gszorc_mozilla.com
1.4G iacobcatalin_gmail.com
1.5G cpearce_mozilla.com
1.5G hurley_mozilla.com
1.6G bsmedberg_mozilla.com
1.6G dglastonbury_mozilla.com
1.6G dtc-moz_scieneer.com
1.6G jlund_mozilla.com
1.6G sarentz_mozilla.com
1.6G sbruno_mozilla.com
1.7G mshal_mozilla.com
1.9G mhammond_skippinet.com.au
2.1G lwagner_mozilla.com
2.4G armenzg_mozilla.com
2.4G dougt_mozilla.com
2.5G bschouten_mozilla.com
2.7G hwine_mozilla.com
2.8G eakhgari_mozilla.com
2.8G mozilla_kewis.ch
2.9G rcampbell_mozilla.com
3.1G bhearsum_mozilla.com
3.1G rjesup_wgate.com
3.2G agal_mozilla.com
3.3G axel_mozilla.com
3.3G prepr-ffxbld
4.2G jford_mozilla.com
4.3G mgervasini_mozilla.com
4.6G lsblakk_mozilla.com
5.0G bsmith_mozilla.com
5.5G nthomas_mozilla.com
5.8G coop_mozilla.com
6.5G jhopkins_mozilla.com
7.7G raliiev_mozilla.com
9.2G catlee_mozilla.com
13G stage-ffxbld
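
A listing like the one above can be totalled with a few lines of code. This is a minimal sketch, assuming the sizes follow the `du -h` convention of K/M/G suffixes as powers of 1024; the function names are hypothetical, and the parser accepts entries with or without a space between the size and the name.

```python
import re

# ASSUMPTION: suffixes are binary multiples, as printed by `du -h`.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)([KMG])\s*(\S.*)$")


def parse_line(line: str):
    """Parse '<size><unit> <name>' into (name, bytes), or None if malformed."""
    match = _LINE.match(line)
    if match is None:
        return None
    size, unit, name = match.groups()
    return name.strip(), float(size) * _UNITS[unit]


def total_gib(lines) -> float:
    """Sum the sizes of all parseable lines, in GiB."""
    parsed = (parse_line(line) for line in lines)
    return sum(entry[1] for entry in parsed if entry is not None) / _UNITS["G"]


# The three largest entries from the listing above.
sample = [
    "7.7G raliiev_mozilla.com",
    "9.2G catlee_mozilla.com",
    "13G stage-ffxbld",
]
print(f"{total_gib(sample):.1f} GiB")
```

Because `du -h` rounds each entry, a total computed this way is only approximate; running `du` with a byte-accurate option on the server would give exact figures.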

*Space Usage by Non-user repos ~100GB*
24K integration/gaia-1_4
28K addon-sdk
28K projects/collusion
32K integration/gaia-1_1_0
32K projects/emscripten
32K projects/Moz2D
32K releases/mozilla-b2g18_v1_1_0
144K projects/addon-sdk-jetperf-tests
268K ipccode
452K testpilot-l10n
500K releases/firefox-hotfixes
700K projects/python-nss
896K schema-validation
1.2M projects/mccoy
1.4M pyxpcom
2.4M platform-model
2.4M  

Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Justin Wood (Callek)

On 3/27/2014 2:58 AM, Doug Turner wrote:

Want to move to github?

(0) sudo apt-get install python-setuptools
(1) sudo easy_install hg-git
(2) add |hggit =| under [extensions] in your .hgrc file
(3) Go to GitHub.com and create your new repo.
(4) cd hg_repo
(5) hg bookmark -r default master
(6) hg push git+ssh://g...@github.com/you/name of your repo you created
in step 3



hg-git can't run without a very very custom and difficult-to-setup hg on 
windows.


Specifically because hg uses py2exe which strips out EVERY unused python 
library. And even doing hg in a virtualenv is hard because you get a 
MUCH slower hg due to no compiled code.


I have never further tested hg-git on windows after I encountered the 
two issues above.


~Justin Wood (Callek)




Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Joshua Cranmer 

On 3/27/2014 1:58 AM, Doug Turner wrote:

Want to move to github?

(0) sudo apt-get install python-setuptools
(1) sudo easy_install hg-git
(2) add |hggit =| under [extensions] in your .hgrc file
(3) Go to GitHub.com and create your new repo.
(4) cd hg_repo
(5) hg bookmark -r default master
(6) hg push git+ssh://g...@github.com/you/name of your repo you created


It's worth noting that hg-git is having some performance issues with 
github right now. A basic clone of a 1MB repository takes well over a 
minute before it starts doing anything.


--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Armen Zambrano G.
On 14-03-26 08:27 PM, Bobby Holley wrote:
 I don't understand what the overhead is. We don't run CI on user repos.
 It's effectively just ssh:// + disk space, right? That seems totally
 negligible.
 
FTR from an operations standpoint, it is never "just". Never.
If it was *just* we wouldn't even be having this conversation. Trust me.

regards,
Armen


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Gijs Kruitbosch

On 27/03/2014 13:43, Justin Wood (Callek) wrote:

On 3/27/2014 2:58 AM, Doug Turner wrote:

Want to move to github?

(0) sudo apt-get install python-setuptools
(1) sudo easy_install hg-git
(2) add |hggit =| under [extensions] in your .hgrc file
(3) Go to GitHub.com and create your new repo.
(4) cd hg_repo
(5) hg bookmark -r default master
(6) hg push git+ssh://g...@github.com/you/name of your repo you created
in step 3



hg-git can't run without a very very custom and difficult-to-setup hg on
windows.

Specifically because hg uses py2exe which strips out EVERY unused python
library. And even doing hg in a virtualenv is hard because you get a
MUCH slower hg due to no compiled code.

I have never further tested hg-git on windows after I encountered the
two issues above.

~Justin Wood (Callek)


IME tortoisehg ships a much happier-making hg (than mozilla-build) that 
has a bunch of python libs you want. I've never used hg-git, however, so 
I don't know if it has enough of what you need.


~ Gijs


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread James Graham

On 27/03/14 14:17, Armen Zambrano G. wrote:

On 14-03-26 08:27 PM, Bobby Holley wrote:

I don't understand what the overhead is. We don't run CI on user repos.
It's effectively just ssh:// + disk space, right? That seems totally
negligible.


FTR from an operations standpoint, it is never "just". Never.
If it was *just* we wouldn't even be having this conversation. Trust me.


To be fair there are also considerable costs associated with outsourcing 
VCS hosting, mostly associated with integrating the external hosting 
with other systems that need to work with the repository. For example 
W3C's web-platform-tests testsuite is being hosted on GitHub and as a 
result we have spent a non-trivial amount of effort on integration with 
a system for ensuring contributors agree to a CLA, a code review tool, 
synchronization of HEAD with a web server and various other things. This 
might be less effort than doing all the hosting at the W3C (although the 
reason we did it was purely that GitHub is familiar to potential 
contributors), but of course it will all have to be thrown away if we 
want to move providers in the future.





Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread David Burns


What are mission critical repos since you just put everything in the 
same list?


If we start removing project branches to be put on outsourced VCS we 
remove any sheriff support for that project branch since, as has been 
pointed out many times, we don't have access to the server side commit 
hooks and can't close the tree. This may (I want to use *want* but don't 
have the data to prove it) impact engineering productivity. We have this 
situation with Gaia which has its canonical repo on Github. Sheriffs can 
land checkin-needed but can't close the tree. The way the B2G people do 
it is to remove everyone from the repo and then re-add (or that's how 
they used to do it) which then spams you with "you are now getting 
notifications for repository X" which is annoying.


The other thing we need to worry about is the constant DDoS of 
GitHub[1]. We have seen that when there is a massive one it takes 
down their site for hours, impacting engineering productivity again since 
people can't pull or push. I couldn't find similar reports on Bitbucket 
but it can happen to any third party we may use.


David

[1] https://github.com/blog/1796-denial-of-service-attacks


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Robert Kaiser

Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th.


When that happens, I will stop running any custom crash reports and 
dashboards that the stability program depends on, at least until further 
notice. I do not want to run a non-Mozilla-hosted repo for Mozilla work 
stuff.


KaiRo


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Gregory Szorc

On 3/27/14, 6:37 AM, Justin Wood (Callek) wrote:

On 3/26/2014 9:15 PM, Taras Glek wrote:




Bobby Holley mailto:bobbyhol...@gmail.com
Wednesday, March 26, 2014 17:27
I don't understand what the overhead is. We don't run CI on user
repos. It's effectively just ssh:// + disk space, right? That seems
totally negligible.

Human overhead in keeping infra running could be spent making our infra
better elsewhere.


Also, project branches are pretty useful for teams working together on
large projects that aren't ready to land in m-c. We only use them when
we need them, so why would we shut them down?

I'm not suggesting killing it. My suggestion is that project branch
experience would likely be better when not hosted by mozilla. It would
still trigger our c-i systems.


Except when you consider that the disposable project branches require
Level 2 commit privs, and that to commit to our repos you need to have
signed the committer agreement, which grants some legal recourse if
malice is done.

These project branches run on non-try-based machines which have
elevated rights vs what try does, and can do much much more harm if
there is malice here.

I for one would not be happy from a sec standpoint if we allowed
bitbucket-hosted repos to execute arbitrary code this way.


The security concern should be on the scheduling front, not where the 
code is hosted.


If a repo push incurs automation activity, we have established trust 
that anyone who can push to that repo can be trusted. If we don't have 
this automatic scheduling on push, no trust is established and there is 
no security concern.


If a user is able to schedule automation manually (say by calling a web 
API), we trust the user isn't doing something nefarious. Since the 
scheduling API requires authentication, there shouldn't be a new 
security concern here.


Even if there is an increased security concern over MITM or silent repo 
modification by 3rd party, these concerns can be mitigated through 
proper security settings (our Mercurial clients in automation aren't 
currently validating x509 fingerprints) and moving our automation jobs 
to execute in containers, which I believe is already in the works. That 
leaves us pretty much with kernel vulnerabilities (that can escape from 
containers), which we should be protecting ourselves against anyway.
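
As a rough illustration of the fingerprint validation mentioned above, here is a minimal Python sketch. It assumes the DER-encoded server certificate has already been fetched (for example with the standard library's `ssl.get_server_certificate` followed by `ssl.PEM_cert_to_DER_cert`), and that the pinned value is distributed through some trusted channel. The function names are hypothetical and this is not how Mercurial implements its own check.

```python
import hashlib


def colon_hex_fingerprint(der_cert: bytes, algorithm: str = "sha256") -> str:
    """Return the certificate digest as colon-separated lowercase hex pairs."""
    digest = hashlib.new(algorithm, der_cert).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches_pinned(der_cert: bytes, pinned: str, algorithm: str = "sha256") -> bool:
    """Compare a certificate against a pinned fingerprint, ignoring case."""
    return colon_hex_fingerprint(der_cert, algorithm) == pinned.strip().lower()
```

An automation client would refuse to clone or pull when `matches_pinned` returns False, which is the same idea as pinning a host in Mercurial's `hostfingerprints` configuration section.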


This problem is little different from what <insert cloud hosting service 
provider here> deals with.



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Randell Jesup
*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.

As mentioned, too fast.

Time  spent operating user repositories could be spent reducing our
end-to-end continuous  integration cycles.

If we're spending any significant time or money running these, let's
solve that instead - I really don't think much time or money *should* be
needed to run low-priority repos with non-mission-critical availability
requirements.

These do not seem like
mission-critical repos, seems like developers would be better off hosting
these on bitbucket or github. Using a 3rd-party host has obvious benefits
for collaboration & self-service that our existing system will never meet.

Some issues I raised privately before this post went public, but I don't
see addressed here:

* Security implications

Any dev who works on security bugs (and most do at one point or another,
or might) who puts a patch queue on an external host is proxying to that
host all security assurances.  This makes that external hosting a
tempting target for people who want to find 0-days.  

I'd like to say this is an excessive amount of paranoia, but given both
the lucrative market for 0-days and NSA's interest in 0-days (and
ability to compel or buy silence from companies or employees at
companies), I no longer think this is excessive.  :-(

I'm less worried about silent changes to the repos to slip stuff in
(though it's possible) than someone silently cataloging possible 0-day
targets in repos associated with devs, especially ones marked as
referring to bugs that aren't visible.

* cleanup

Per previous comments, I wasn't aware I could get rid of a user repo in
any easy way (and it may actually be busted right now).

Likely 50% of what's in user repos (or more) is dusty stuff that people
could simply delete.  I have one large and one medium repo I need to
keep and some patch queues (most of which are deletable now).  Anything
else can go.  But there's no trivial way for me to see what I have and
delete them.  A simple 1/month nag mail listing your private repos and
their sizes would help.
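
A minimal sketch of what such a nag mail could be built from, in Python. It assumes each user has one directory on the server whose immediate children are the repositories (identified by a `.hg` subdirectory). That layout, and both function names, are assumptions for illustration rather than a description of how hg.mozilla.org is actually organized.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files below path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def build_nag_mail(user_dir: str) -> str:
    """Build a plain-text mail body listing a user's repos, largest first."""
    repos = []
    for entry in sorted(os.listdir(user_dir)):
        repo_path = os.path.join(user_dir, entry)
        # Only directories containing .hg are treated as repositories.
        if os.path.isdir(os.path.join(repo_path, ".hg")):
            repos.append((dir_size_bytes(repo_path), entry))
    repos.sort(reverse=True)
    lines = ["Your user repositories, largest first:", ""]
    for size, name in repos:
        lines.append(f"{size / 1e6:10.1f} MB  {name}")
    lines += ["", "Please delete any you no longer need."]
    return "\n".join(lines)
```

A monthly cron job could call `build_nag_mail` once per user directory and hand the result to whatever mail system is already in place.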

* side note: my repo names are tied to the email address in my key.
  It's dead.  I'd change my key to the new permanent email address, but
  I worry I might lose all my user repos.

* Backup/data-integrity/availability

Already mentioned was availability guarantees or lack thereof.

We'd need to back up these external repos (and find them somehow).
Taras commented to me that we use expensive storage solutions for user
repos (similar to primary repos).  IMHO that's not needed:

User repos need lower-SLA gear, I'd imagine - redundancy, but could
probably just live in a RAID-1+1 array with consumer drives with very
high reliability (two RAID-1 arrays in a RAID-1 configuration) - you'd
need a 3-drive simultaneous failure to have to fall back on backups.

Hell, a single RAID-1 is probably good enough, so long as it's backed up
frequently.

Taras mentioned that this is time not spent doing other things; my
response:

I imagine you can buy a RAID drive and just drop it in in place of
$$$-expensive-drive. But yes this requires some thought/planning/etc
time for them; while moving to random-VCS-storage requires some time by
N devs - net result may be more time than if we keep it inhouse.  Plus
time by IT to set up remote backup and negotiate something with
random-VCS to let that happen. If we're dropping backup and just relying
on the service, there are some additional concerns.

-- 
Randell Jesup, Mozilla Corp
remove news for personal email


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Andrew Sutherland

On 03/27/2014 10:10 AM, Joshua Cranmer  wrote:
It's worth noting that hg-git is having some performance issues with 
github right now. A basic clone of a 1MB repository takes well over a 
minute before it starts doing anything.


When I was converting my repositories last night I found that although 
the push to github from hg(-git) was hanging, it in fact had completed 
all of its work already.  After a second or two you could control-C, 
re-push, and it would say there was nothing to do. If you checked on 
github, the commits would in fact be all there, and they would be there 
before the second push attempt or hitting control-C.


Obviously, if you are pushing something huge like a clone of 
mozilla-central, you may need to legitimately wait a long time.  But for 
clones of mozilla-central it's probably most advisable and polite to 
fork the gecko-dev repo and either do a light-weight import of any 
branches using git fast-import or the fancy tooling used to produce 
gecko-dev in the first place.  A very cursory exploration shows 
http://repo.or.cz/w/fast-export.git provides fast-export from hg for 
fast-import to git, but it's probably best to read the blog-posts for 
the gecko-dev conversion instead.


Andrew



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-27 Thread Joshua Cranmer 

On 3/27/2014 12:11 PM, Andrew Sutherland wrote:

On 03/27/2014 10:10 AM, Joshua Cranmer  wrote:
It's worth noting that hg-git is having some performance issues with 
github right now. A basic clone of a 1MB repository takes well over a 
minute before it starts doing anything.


When I was converting my repositories last night I found that although 
the push to github from hg(-git) was hanging, it in fact had completed 
all of its work already.  After a second or two you could control-C, 
re-push, and it would say there was nothing to do. If you checked on 
github, the commits would in fact be all there, and they would be 
there before the second push attempt or hitting control-C.


I saw that too, but the clone/pull is a different error, reported here: 
https://bitbucket.org/durin42/hg-git/issue/90/stuck-clone-over-git-ssh-to-bitbucketorg. 
Note that the hang happens here well before the work is done.


--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist



Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Taras Glek

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should 
archive them by May 31st.


Time spent operating user repositories could be spent reducing our 
end-to-end continuous integration cycles. These do not seem like 
mission-critical repos; it seems like developers would be better off 
hosting these on Bitbucket or GitHub. Using a 3rd-party host has obvious 
benefits for collaboration & self-service that our existing system will 
never meet.


We are happy to help move specific hg repos to bitbucket.

Once you have migrated your repository, please comment in 
https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some 
disk space.


*Non-User Repos*
There are too many non-user repos. I'm not convinced we should host 
ash, oak, other project branches internally. I think we should focus on 
mission-critical repos only. There should be fewer than a dozen of those. 
I would like to stop hosting non-mission-critical repositories by end of Q2.


This is a soft target. I don't have a concrete plan here. I'd like to 
start experimenting with moving project branches elsewhere and see where 
that takes us.


*What if my hg repo needs X/Y that 3rd-party services do not provide?*
If you have a good reason to use a feature not supported by 
github/bitbucket, we should continue hosting your repo at Mozilla.


*Why Not Move Everything to Github/Bitbucket/etc?*
Mozilla prefers to keep repositories public by default. This does not 
fit GitHub's business model, which is built around private repos. 
GitHub's free service does not provide any availability guarantee. 
There is also the problem of GitHub not supporting hg.


I'm not completely sure why we can't move everything to Bitbucket. Some 
of it is to do with anecdotal evidence of robustness problems. Some of 
it is lack of hooks (sans post-receive POSTs). Additionally, as with 
GitHub, there is no availability guarantee.


Hosting arbitrary Moz-related hg repositories does not make strategic 
sense. We should do the absolute minimum (e.g. http://bke.ro/?p=380) 
required to keep Firefox shipping smoothly and focus our efforts on 
making Firefox better.



Taras


ps. Footprint stats:

*Largest User Repos Out Of ~130GB*
1.1G  dmt.alexandre_gmail.com
1.1G  jblandy_mozilla.com
1.1G  jparsons_mozilla.com
1.2G  bugzilla_standard8.plus.com
1.2G  mbrubeck_mozilla.com
1.2G  mrbkap_mozilla.com
1.3G  dcamp_campd.org
1.3G  jst_mozilla.com
1.4G  blassey_mozilla.com
1.4G  gszorc_mozilla.com
1.4G  iacobcatalin_gmail.com
1.5G  cpearce_mozilla.com
1.5G  hurley_mozilla.com
1.6G  bsmedberg_mozilla.com
1.6G  dglastonbury_mozilla.com
1.6G  dtc-moz_scieneer.com
1.6G  jlund_mozilla.com
1.6G  sarentz_mozilla.com
1.6G  sbruno_mozilla.com
1.7G  mshal_mozilla.com
1.9G  mhammond_skippinet.com.au
2.1G  lwagner_mozilla.com
2.4G  armenzg_mozilla.com
2.4G  dougt_mozilla.com
2.5G  bschouten_mozilla.com
2.7G  hwine_mozilla.com
2.8G  eakhgari_mozilla.com
2.8G  mozilla_kewis.ch
2.9G  rcampbell_mozilla.com
3.1G  bhearsum_mozilla.com
3.1G  rjesup_wgate.com
3.2G  agal_mozilla.com
3.3G  axel_mozilla.com
3.3G  prepr-ffxbld
4.2G  jford_mozilla.com
4.3G  mgervasini_mozilla.com
4.6G  lsblakk_mozilla.com
5.0G  bsmith_mozilla.com
5.5G  nthomas_mozilla.com
5.8G  coop_mozilla.com
6.5G  jhopkins_mozilla.com
7.7G  raliiev_mozilla.com
9.2G  catlee_mozilla.com
13G   stage-ffxbld
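
For anyone who wants to total or re-sort a listing like the one above, here is a small Python sketch. It assumes each line is a size with a K/M/G suffix followed by a repository name, with or without whitespace in between, and that the suffixes are binary multiples. Both are guesses about how the listing was produced, and the helper names are made up for illustration.

```python
import re

# Assumed binary multiples for the K/M/G suffixes.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)([KMG])\s*(\S+)\s*$")


def parse_size_line(line: str):
    """Return (name, size_in_bytes), or None if the line does not parse."""
    match = LINE.match(line)
    if match is None:
        return None
    size, unit, name = match.groups()
    return name, float(size) * UNITS[unit]


def total_gib(lines) -> float:
    """Sum all parseable lines and return the total in GiB."""
    parsed = [p for p in map(parse_size_line, lines) if p is not None]
    return sum(size for _name, size in parsed) / UNITS["G"]
```

Lines with no repository name (such as a truncated final entry) are skipped rather than guessed at.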

*Space Usage by Non-user repos ~100GB*
24K   integration/gaia-1_4
28K   addon-sdk
28K   projects/collusion
32K   integration/gaia-1_1_0
32K   projects/emscripten
32K   projects/Moz2D
32K   releases/mozilla-b2g18_v1_1_0
144K  projects/addon-sdk-jetperf-tests
268K  ipccode
452K  testpilot-l10n
500K  releases/firefox-hotfixes
700K  projects/python-nss
896K  schema-validation
1.2M  projects/mccoy
1.4M  pyxpcom
2.4M  platform-model
2.4M  xforms
2.6M  releases/mobile-1.1
2.6M  venkman
2.8M  www
2.9M  releases/mobile-5.0
3.1M  penelope
3.3M  releases/mobile-2.0
3.5M  tbbuild
3.7M  hgcustom
3.9M  releases/mobile-6.0
4.6M  chatzilla
5.3M  graphs
5.4M  projects/kraken
6.4M  projects/ldap-sdks
6.7M  dom-inspector
6.7M  projects/htmlparser
7.0M  weave-l10n
13M   mobile-browser
14M   integration/gaia-ui-tests
14M   projects/jss
19M   projects/addon-sdk-release
20M   projects/addon-sdk-beta
25M   projects/nspr
25M   releases/comm-1.9.2
28M   rewriting-and-analysis
30M   camino
30M   releases/comm-esr24
30M   releases/comm-miramar
31M   projects/addon-sdk
35M   releases/comm-1.9.1
37M   releases/comm-esr17
43M   gaia-l10n
44M   releases/comm-esr10
48M   l10n
48M   qa
51M   releases/gaia-l10n
52M   automation
53M   projects/2007-configure-rewrite
59M 

Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread L. David Baron
On Wednesday 2014-03-26 16:53 -0700, Taras Glek wrote:
 *User Repos*
 TLDR: I would like to make user repos read-only by April 30th. We
 should archive them by May 31st.
 
 Time spent operating user repositories could be spent reducing our
 end-to-end continuous integration cycles. These do not seem like
 mission-critical repos; it seems like developers would be better off
 hosting these on Bitbucket or GitHub. Using a 3rd-party host has
 obvious benefits for collaboration & self-service that our existing
 system will never meet.
 
 We are happy to help move specific hg repos to bitbucket.
 
 Once you have migrated your repository, please comment in
 https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free
 some disk space.

This seems like a pretty disruptive change -- it involves breaking
links to the places lots of little pieces of our infrastructure
live.

It also means that we're not in control of our own data in a way
that's often useful to us -- having access to our history is often
very important for understanding the present (such as understanding
why code is the way it is).  If we don't have reliable archiving of
our history, those of us who think it's important will end up
spreading that work around and probably being less efficient at it.
(For example, I try to save dev-platform threads that I think are
important locally because I don't trust the Google Groups archive to
be permanent.)

It also makes it harder to find Mozilla-related things.  For
example, many of us publish version-controlled patch queues as user
repositories.  If I'm reviewing a patch queue and want to apply the
queue, I occasionally look around to see if that user has published
the patch queue as a user repository so that I can apply it.  If
there's no longer a standard place for them to be published, I'll
end up either sorting out the patch order manually or waiting 24
hours for somebody in another timezone to wake up and tell me where
it is.

 *Non-User Repos*
 There  are too many non-user repos. I'm not convinced we should host
 ash, oak, other project branches internally. I think we should focus
 on  mission-critical repos only. There should be less than a dozen
 of those. I would like to stop hosting non-mission-critical
 repositories by end of Q2.

The goal of project branches is so that teams can collaborate on a
project that needs continuous integration testing during its
development.  Are we not using it for that?

 Hosting arbitrary Moz-related hg repositories does not make
 strategic sense. We should do the absolute minimum (e.g.
 http://bke.ro/?p=380) required to keep Firefox shipping smoothly and
 focus our efforts on making Firefox better.

I think it makes sense if individual developers are going to end up
spending more time/resources working around the fact that we don't
do it than it would take to continue doing it.  I don't have data
one way or another, but I think it's a real possibility.

-David

-- 
𝄞   L. David Baron http://dbaron.org/   𝄂
𝄢   Mozilla  https://www.mozilla.org/   𝄂
 Before I built a wall I'd ask to know
 What I was walling in or walling out,
 And to whom I was like to give offense.
   - Robert Frost, Mending Wall (1914)




Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Bobby Holley
I don't understand what the overhead is. We don't run CI on user repos.
It's effectively just ssh:// + disk space, right? That seems totally
negligible.

Also, project branches are pretty useful for teams working together on
large projects that aren't ready to land in m-c. We only use them when we
need them, so why would we shut them down?





Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Gregory Szorc

On 3/26/14, 4:53 PM, Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.

Time spent operating user repositories could be spent reducing our
end-to-end continuous integration cycles. These do not seem like
mission-critical repos; it seems like developers would be better off
hosting these on Bitbucket or GitHub. Using a 3rd-party host has obvious
benefits for collaboration & self-service that our existing system will
never meet.


How much time do we spend operating user repositories? I follow the 
repos bugzilla components and most of the requests I see have little if 
anything to do with user repositories. And I reckon that's because user 
repositories are self-service. Are user repositories more than just disk 
space, occasional CPU usage, and page cache eviction?



We are happy to help move specific hg repos to bitbucket.

Once you have migrated your repository, please comment in
https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some
disk space.

*Non-User Repos*
There  are too many non-user repos. I'm not convinced we should host
ash, oak, other project branches internally. I think we should focus on
mission-critical repos only. There should be less than a dozen of those.
I would like to stop hosting non-mission-critical repositories by end of
Q2.


What about making non-user repos more self-service? (They currently 
require bugs for everything AFAICT.)



This is a soft target. I don't have a concrete plan here. I'd like to
start experimenting with moving project branches elsewhere and see where
that takes us.


I would *really* like the ability to trigger automation on any repo, 
regardless of its URL. Moving project branches elsewhere might make this 
happen, so +1.



*What if my hg repo needs X/Y that 3rd-party services do not provide?*
If you have a good reason to use a feature not supported by
github/bitbucket, we should continue hosting your repo at Mozilla.

*Why Not Move Everything to Github/Bitbucket/etc?*
Mozilla prefers to keep repositories public by default. This does not
fit GitHub's business model, which is built around private repos.
GitHub's free service does not provide any availability guarantee.
There is also the problem of GitHub not supporting hg.

I'm not completely sure why we can't move everything to Bitbucket. Some
of it is to do with anecdotal evidence of robustness problems. Some of
it is lack of hooks (sans post-receive POSTs). Additionally, as with
GitHub, there is no availability guarantee.


A lot of it has to do with lack of hooks. Without pre-push hooks on 
Bitbucket or GitHub, there will be footgunning. The counterargument is 
"just back out bad commits". But excessive backouts can be problematic 
(see our Firefox tree management and ask Jesse about bisecting impact).
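
To make the hook point concrete, here is a rough Python sketch of the kind of server-side check that only works when we control the server: rejecting pushes while a tree is closed. It follows Mercurial's documented convention for in-process Python hooks (called with `ui` and `repo` keyword arguments, where a true return value makes the hook fail and the transaction roll back). The flag-file path and the function name are hypothetical and not taken from the real hg.mozilla.org hooks.

```python
import os

# Hypothetical flag file; a sheriff would create it to close the tree.
CLOSED_FLAG = "/var/hg/tree-closed"


def reject_if_closed(ui, repo, flag_path=CLOSED_FLAG, **kwargs):
    """Return True (reject the incoming push) while the flag file exists."""
    if os.path.exists(flag_path):
        ui.warn(b"tree is closed; push rejected\n")
        return True
    return False
```

On a self-hosted server this would be registered as a `pretxnchangegroup` entry in the `[hooks]` section of the repository's hgrc. A hosted service that only offers post-receive POSTs has no equivalent point at which to refuse the push.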


There is also the issue with size. Remember when GitHub disabled our 
mirror without notice because it became too large and became a 
performance problem? I can only speculate what Bitbucket will do when 
1000 new 1.5+ GB clones of the Firefox repo show up. Have we asked them?


In the case of Mercurial, we'll want to someday deploy Facebook's 
remotefilelog extension to enable shallow clones (drastically reducing 
clone time in the process - a game changer for new contributors who 
can't download 1+ GB of repo data). We may also want to deploy a bundle 
lookaside extension that automatically uses a bundle for initial 
clones. Obviously we can do these things for repos on hg.mozilla.org. 
But what about the user clones on Bitbucket? We may run into 
compatibility problems.



Hosting arbitrary Moz-related hg repositories does not make strategic
sense. We should do the absolute minimum (e.g. http://bke.ro/?p=380)
required to keep Firefox shipping smoothly and focus our efforts on
making Firefox better.


Strategic, no. Necessary because we have no better alternative, quite 
possibly.


If this boils down to maintaining the code behind 
hg.mozilla.org/git.mozilla.org, I and others have offered to help. I've 
volunteered to improve the self-service capabilities of user repos, for 
example. But, the code is in some private IT repository and it's 
difficult to get your hands on initially, to test, and deploy. Whatever 
the outcome of this proposal is, I hope that roadblock can be eliminated.



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Taras Glek




Gregory Szorc mailto:g...@mozilla.com
Wednesday, March 26, 2014 17:40
On 3/26/14, 4:53 PM, Taras Glek wrote:

*User Repos*
TLDR: I would like to make user repos read-only by April 30th. We should
archive them by May 31st.

Time spent operating user repositories could be spent reducing our
end-to-end continuous integration cycles. These do not seem like
mission-critical repos; it seems like developers would be better off
hosting these on Bitbucket or GitHub. Using a 3rd-party host has obvious
benefits for collaboration & self-service that our existing system will
never meet.


How much time do we spend operating user repositories? I follow the 
repos bugzilla components and most of the requests I see have little 
if anything to do with user repositories. And I reckon that's because 
user repositories are self-service. Are user repositories more than 
just disk space and seldom CPU usage and page cache eviction?

Some significant portion of Ben's time was spent on user repos during the 
multi-week webhead migration (http://bke.ro/?p=380).


Bugs like https://bugzilla.mozilla.org/show_bug.cgi?id=983085 cropped up.

The fact that repos keep growing means that we'll have to do this 
migration again soon. We are at 260gb/300gb.


As long as our footprint is >40gb we can't migrate to fast, cheap & 
cheerful AWS nodes. We have to do something complex or expensive 
instead... which means more devop time. As long as our footprint keeps 
growing we'll keep revisiting this problem.


B2G guys seem to prefer github already.



We are happy to help move specific hg repos to bitbucket.

Once you have migrated your repository, please comment in
https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some
disk space.

*Non-User Repos*
There  are too many non-user repos. I'm not convinced we should host
ash, oak, other project branches internally. I think we should focus on
mission-critical repos only. There should be less than a dozen of those.
I would like to stop hosting non-mission-critical repositories by end of
Q2.


> What about making non-user repos more self-service? (They currently
> require bugs for everything AFAICT.)

This requires more devop time. Handling bugs wastes time on both sides.
Self-serve is the biggest advantage of breaking the habit and moving to
a hosted service.



>> This is a soft target. I don't have a concrete plan here. I'd like to
>> start experimenting with moving project branches elsewhere and see where
>> that takes us.


> I would *really* like the ability to trigger automation on any repo,
> regardless of its URL. Moving project branches elsewhere might make
> this happen, so +1.

Right. I'm all for increased productivity through self-serve. Having a
weird ad-hoc system does not seem like a win.



>> *What my hg repo needs X/Y that 3rd-party services do not provide?*
>> If you have a good reason to use a feature not supported by
>> github/bitbucket, we should continue hosting your repo at Mozilla.
>>
>> *Why Not Move Everything to Github/Bitbucket/etc?*
>> Mozilla prefers to keep repositories public by-default. This does not
>> fit Github's business model which is built around private repos.
>> Github's free service does not provide any availability guarantee.
>> There is also a problem of github not supporting hg.
>>
>> I'm not completely sure why we can't move everything to bitbucket. Some
>> of it is to do with anecdotal evidence of robustness problems. Some of
>> it is lack of hooks (sans post-receive POSTs). Additionally, as with
>> Github there is no availability guarantee.


> A lot of it has to do with lack of hooks. Without pre-push hooks on
> Bitbucket or Github, there will be footgunning. The counter argument
> is "just back out bad commits." But excessive backouts can be
> problematic (see our Firefox tree management and ask Jesse about
> bisecting impact).

No. The counterargument is: stop using hg like cvs. Have a staging repo and
automation to transfer changesets as part of a c-i process.
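
One way to picture the staging-repo idea is a gate that only forwards changesets that pass automation. The Python sketch below is purely illustrative: the changeset IDs and the `passes_ci` callback are hypothetical stand-ins, and it does not talk to hg or any real CI system.

```python
from typing import Callable, Iterable, List


def promote_passing(staged: Iterable[str],
                    passes_ci: Callable[[str], bool]) -> List[str]:
    """Return the leading run of staged changesets that pass CI.

    Promotion stops at the first failure so nothing is transferred on top
    of a changeset that never reached the main repo.
    """
    promoted = []
    for changeset in staged:
        if not passes_ci(changeset):
            break
        promoted.append(changeset)
    return promoted


# Hypothetical IDs; "c3" stands in for a commit that breaks the build.
staged = ["c1", "c2", "c3", "c4"]
print(promote_passing(staged, lambda cs: cs != "c3"))
```

With a gate like this, a bad commit stays in staging instead of needing a backout on the main repo.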


> There is also the issue with size. Remember when GitHub disabled our
> mirror without notice because it became too large and became a
> performance problem? I can only speculate what Bitbucket will do when
> 1000 new 1.5+ GB clones of the Firefox repo show up. Have we asked them?

We are asking. Part of the reason mc got pulled is heavy traffic on that
repo. User & project repos should generate a lot less load. It's more of
a "let's try this".


> In the case of Mercurial, we'll want to someday deploy Facebook's
> remotefilelog extension to enable shallow clones (drastically
> reducing clone time in the process - a game changer for new
> contributors who can't download 1+ GB of repo data). We may also want
> to deploy a bundle lookaside extension that automatically uses a
> bundle for initial clones. Obviously we can do these things for repos
> on hg.mozilla.org. But what about the user clones on Bitbucket? We
> may run into compatibility problems.

There are lots of potential solutions here. I would like to deploy hg
proxies(something like

Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Mike Hommey
On Wed, Mar 26, 2014 at 04:53:27PM -0700, Taras Glek wrote:
> *User Repos*
> TLDR: I would like to make user repos read-only by April 30th. We should
> archive them by May 31st.
>
> Time spent operating user repositories could be spent reducing our
> end-to-end continuous integration cycles. These do not seem like
> mission-critical repos, seems like developers would be better off hosting
> these on bitbucket or github. Using a 3rd-party host has obvious benefits
> for collaboration & self-service that our existing system will never meet.
>
> We are happy to help move specific hg repos to bitbucket.
>
> Once you have migrated your repository, please comment in
> https://bugzilla.mozilla.org/show_bug.cgi?id=988628 so we can free some disk
> space.
>
> *Non-User Repos*
> There are too many non-user repos. I'm not convinced we should host ash,
> oak, other project branches internally. I think we should focus on
> mission-critical repos only. There should be less than a dozen of those. I
> would like to stop hosting non-mission-critical repositories by end of Q2.

What about nspr, nss, comm-central, venkman, chatzilla, etc.?

Mike


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Mike Hommey
On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote:
> On 3/26/14, 4:53 PM, Taras Glek wrote:
>> *User Repos*
>> TLDR: I would like to make user repos read-only by April 30th. We should
>> archive them by May 31st.
>>
>> Time spent operating user repositories could be spent reducing our
>> end-to-end continuous integration cycles. These do not seem like
>> mission-critical repos, seems like developers would be better off
>> hosting these on bitbucket or github. Using a 3rd-party host has obvious
>> benefits for collaboration & self-service that our existing system will
>> never meet.
>
> How much time do we spend operating user repositories? I follow the repos
> bugzilla components and most of the requests I see have little if anything
> to do with user repositories. And I reckon that's because user repositories
> are self-service.

Note that while user repositories are self-service on the creation side,
there is no obvious way to self-service a user repo removal. I'm not in
Taras's list, but after looking, I figured I had an old m-c copy with
old patches on top of it.

Also note that, for lack of something better than mercurial's share, we
sadly have to waste plenty of disk space for each copy of a mercurial
repo. If mercurial's share was more like git's object alternates, that
would be much less dramatic. (BTW, I don't think it would be extremely
difficult to implement.)
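
For readers unfamiliar with git's object alternates: the mechanism is just a text file, objects/info/alternates, listing other object directories the repository may borrow objects from (git clone --reference sets one up). Below is a minimal Python sketch of writing that file by hand. All paths are hypothetical, and the code only touches the filesystem; it does not invoke git.

```python
import tempfile
from pathlib import Path


def add_alternate(repo_git_dir: Path, shared_objects_dir: Path) -> Path:
    """Register shared_objects_dir in repo_git_dir/objects/info/alternates."""
    info_dir = repo_git_dir / "objects" / "info"
    info_dir.mkdir(parents=True, exist_ok=True)
    alternates = info_dir / "alternates"
    entry = str(shared_objects_dir.resolve())
    # Keep existing entries and avoid adding the same path twice.
    existing = alternates.read_text().splitlines() if alternates.exists() else []
    if entry not in existing:
        existing.append(entry)
    alternates.write_text("\n".join(existing) + "\n")
    return alternates


# Demonstration in a throwaway directory with made-up repository names.
with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp) / "mozilla-central.git" / "objects"
    shared.mkdir(parents=True)
    user_repo = Path(tmp) / "user-clone" / ".git"
    print(add_alternate(user_repo, shared).read_text())
```

Each user clone then stores only the objects it does not share with the base repository, which is the property the paragraph above wishes mercurial's share had.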

Mike


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread L. David Baron
On Thursday 2014-03-27 14:11 +0900, Mike Hommey wrote:
> Note that while user repositories are self-service on the creation side,
> there is no obvious way to self-service a user repo removal. I'm not in

They're just as easy to remove as to create:
https://developer.mozilla.org/en-US/docs/Creating_Mercurial_User_Repositories#Editing_your_personal_repository

-David

-- 
𝄞   L. David Baron http://dbaron.org/   𝄂
𝄢   Mozilla  https://www.mozilla.org/   𝄂
 Before I built a wall I'd ask to know
 What I was walling in or walling out,
 And to whom I was like to give offense.
   - Robert Frost, Mending Wall (1914)




Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Gregory Szorc

On 3/26/14, 10:11 PM, Mike Hommey wrote:

> On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote:
>> On 3/26/14, 4:53 PM, Taras Glek wrote:
>>> *User Repos*
>>> TLDR: I would like to make user repos read-only by April 30th. We should
>>> archive them by May 31st.
>>>
>>> Time spent operating user repositories could be spent reducing our
>>> end-to-end continuous integration cycles. These do not seem like
>>> mission-critical repos, seems like developers would be better off
>>> hosting these on bitbucket or github. Using a 3rd-party host has obvious
>>> benefits for collaboration & self-service that our existing system will
>>> never meet.
>>
>> How much time do we spend operating user repositories? I follow the repos
>> bugzilla components and most of the requests I see have little if anything
>> to do with user repositories. And I reckon that's because user repositories
>> are self-service.
>
> Note that while user repositories are self-service on the creation side,
> there is no obvious way to self-service a user repo removal. I'm not in
> Taras's list, but after looking, I figured I had an old m-c copy with
> old patches on top of it.


That sounds like a bug in the self-service feature!


> Also note that the lack of something better than mercurial's share, we
> sadly have to waste plenty of disk space for each copy of a mercurial
> repo. If mercurial's share was more like git's object alternates, that
> would be much less dramatic. (BTW, I don't think it would be extremely
> difficult to implement)


It's 2014: why are we worrying about disk space values less than 10 TB?

More seriously though, it's not extremely difficult to implement a 
custom storage backend for Mercurial. remotefilelog does it. It's only a 
matter of time before someone hooks up SQL, S3, Neo4j, etc to make 
server-side scaling more efficient.


Also, if you are using a COW filesystem, initial clones should be nearly 
free and you'd only pay the extra copy cost for changesets added 
afterwards. This could help dramatically with mozilla-central clones.
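
A back-of-the-envelope Python sketch of why copy-on-write helps, under idealised assumptions (perfect block sharing, a fixed amount of new data per clone). The 1500 MB base size and the 1000 clones echo figures mentioned elsewhere in this thread; the 10 MB of per-clone changes is a hypothetical number.

```python
def full_copy_footprint_mb(base_mb: int, clones: int, delta_mb: int) -> int:
    """Every clone stores the whole base repo plus its own changes."""
    return clones * (base_mb + delta_mb)


def cow_footprint_mb(base_mb: int, clones: int, delta_mb: int) -> int:
    """Clones share one copy of the base; only their own changes cost space."""
    return base_mb + clones * delta_mb


# 1500 MB base and 1000 clones echo the thread; 10 MB per clone is made up.
print(full_copy_footprint_mb(1500, 1000, 10))
print(cow_footprint_mb(1500, 1000, 10))
```

Real savings depend on the filesystem and on how much of each clone's data is rewritten rather than appended, so treat this as an upper bound on the benefit.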


Out of curiosity, is there open source software for a shared Git object 
store?



Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Mike Hommey
On Wed, Mar 26, 2014 at 10:32:07PM -0700, L. David Baron wrote:
> On Thursday 2014-03-27 14:11 +0900, Mike Hommey wrote:
> > Note that while user repositories are self-service on the creation side,
> > there is no obvious way to self-service a user repo removal. I'm not in
>
> They're just as easy to remove as to create:
> https://developer.mozilla.org/en-US/docs/Creating_Mercurial_User_Repositories#Editing_your_personal_repository

Doh. That's what you get from reading the outline and not associating
Edit with Delete.

Mike


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Mike Hommey
On Thu, Mar 27, 2014 at 02:42:13PM +0900, Mike Hommey wrote:
> On Wed, Mar 26, 2014 at 10:32:07PM -0700, L. David Baron wrote:
> > On Thursday 2014-03-27 14:11 +0900, Mike Hommey wrote:
> > > Note that while user repositories are self-service on the creation side,
> > > there is no obvious way to self-service a user repo removal. I'm not in
> >
> > They're just as easy to remove as to create:
> > https://developer.mozilla.org/en-US/docs/Creating_Mercurial_User_Repositories#Editing_your_personal_repository
>
> Doh. That's what you get from reading the outline and not associating
> Edit with Delete.

Interestingly, I just deleted that old repo, and guess what? I can still
clone it, and it's still available on hgweb, while the operation itself
took a while, suggesting it did, in fact, delete something.

Mike


Re: Spring cleaning: Reducing Number Footprint of HG Repos

2014-03-26 Thread Mike Hommey
On Wed, Mar 26, 2014 at 10:40:39PM -0700, Gregory Szorc wrote:
> On 3/26/14, 10:11 PM, Mike Hommey wrote:
>> On Wed, Mar 26, 2014 at 05:40:36PM -0700, Gregory Szorc wrote:
>>> On 3/26/14, 4:53 PM, Taras Glek wrote:
>>>> *User Repos*
>>>> TLDR: I would like to make user repos read-only by April 30th. We should
>>>> archive them by May 31st.
>>>>
>>>> Time spent operating user repositories could be spent reducing our
>>>> end-to-end continuous integration cycles. These do not seem like
>>>> mission-critical repos, seems like developers would be better off
>>>> hosting these on bitbucket or github. Using a 3rd-party host has obvious
>>>> benefits for collaboration & self-service that our existing system will
>>>> never meet.
>>>
>>> How much time do we spend operating user repositories? I follow the repos
>>> bugzilla components and most of the requests I see have little if anything
>>> to do with user repositories. And I reckon that's because user repositories
>>> are self-service.
>>
>> Note that while user repositories are self-service on the creation side,
>> there is no obvious way to self-service a user repo removal. I'm not in
>> Taras's list, but after looking, I figured I had an old m-c copy with
>> old patches on top of it.
>
> That sounds like a bug in the self-service feature!
>
>> Also note that the lack of something better than mercurial's share, we
>> sadly have to waste plenty of disk space for each copy of a mercurial
>> repo. If mercurial's share was more like git's object alternates, that
>> would be much less dramatic. (BTW, I don't think it would be extremely
>> difficult to implement)
>
> It's 2014: why are we worrying about disk space values less than 10 TB?
>
> More seriously though, it's not extremely difficult to implement a custom
> storage backend for Mercurial. remotefilelog does it. It's only a matter of
> time before someone hooks up SQL, S3, Neo4j, etc to make server-side scaling
> more efficient.

That doesn't even need sql, s3, or whatever. It just needs a shared clone
to have local filelogs.

> Also, if you are using a COW filesystem, initial clones should be nearly
> free and you'd only pay the extra copy cost for changesets added afterwards.
> This could help dramatically with mozilla-central clones.
>
> Out of curiosity, is there open source software for a shared Git object
> store?

git.

Mike