============================================
#fedora-meeting: Infrastructure (2013-02-21)
============================================


Meeting started by nirik at 19:00:01 UTC. The full logs are available at
http://meetbot.fedoraproject.org/fedora-meeting/2013-02-21/infrastructure.2013-02-21-19.00.log.html
.



Meeting summary
---------------
* welcome y'all  (nirik, 19:00:01)

* New folks introductions and Apprentice tasks.  (nirik, 19:02:15)
  * new easyfix tasks welcome, team members are encouraged to try and
    file tickets for them.  (nirik, 19:05:28)

* Applications status / discussion  (nirik, 19:06:17)
  * pingou has vastly simplified the pkgdb db.  (nirik, 19:07:42)
  * new pkgdb-cli pushed out as well as copr-cli  (nirik, 19:08:16)
  * fas release being tested in staging, for 2013-02-28 release to prod.
    (nirik, 19:08:57)
  * askbot is now sending fedmsg's.  (nirik, 19:11:56)
  * more fas-openid testing welcome. Has worked for those folks that
    have tried it so far.  (nirik, 19:15:29)
  * fedocal ready for 1.0 tag and review process.  (nirik, 19:16:16)
  * LINK: http://elections-dev.cloud.fedoraproject.org/   (abadger1999,
    19:16:30)
  * testing on new elections version welcome:
    http://elections-dev.cloud.fedoraproject.org/ (make account in
    fakefas)  (nirik, 19:17:04)
  * will try out an f18 server for mm3 staging testing and feel out an
    updates policy, etc. Possibly using snapshots more.  (nirik,
    19:33:27)
  * will look at moving fas-openid to prod as soon as is feasible.
    (nirik, 19:33:46)
  * feedback on github reviews of all commits welcome.  (nirik,
    19:39:04)
  * mirrormanager update to 1.4 soon.  (nirik, 19:39:11)

* Sysadmin status / discussion  (nirik, 19:43:00)
  * smooge got our bnfs01 server's disks working again.  (nirik,
    19:43:56)
  * nagios adjustments in progress  (nirik, 19:44:30)
  * arm boxes will get new net friday hopefully  (nirik, 19:45:07)
  * mass reboot next wed (tentative) for rhel 6.4 upgrades.  (nirik,
    19:47:52)

* Private Cloud status update / discussion  (nirik, 19:52:50)
  * euca cloudlet limping along after upgrade.  (nirik, 19:55:11)
  * work ongoing to bring the openstack cloudlet up to more production
    readiness  (nirik, 19:55:26)
  * please see skvidal if you want to get involved in our private cloud
    setup  (nirik, 20:01:29)

* Upcoming Tasks/Items  (nirik, 20:01:33)
  * 2013-02-28 end of 4th quarter  (nirik, 20:01:44)
  * 2013-03-01 nag fi-apprentices  (nirik, 20:01:44)
  * 2013-03-07 remove inactive apprentices.  (nirik, 20:01:44)
  * 2013-03-19 to 2013-03-26 - koji update  (nirik, 20:01:44)
  * 2013-03-29 - spring holiday.  (nirik, 20:01:44)
  * 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze  (nirik,
    20:01:46)
  * 2013-04-16 F19 alpha release  (nirik, 20:01:48)
  * 2013-05-07 to 2013-05-21 BETA infrastructure freeze  (nirik,
    20:01:50)
  * 2013-05-21 F19 beta release  (nirik, 20:01:52)
  * 2013-05-31 end of 1st quarter  (nirik, 20:01:54)
  * 2013-06-11 to 2013-06-25 FINAL infrastructure freeze.  (nirik,
    20:01:56)
  * 2013-06-25 F19 FINAL release  (nirik, 20:01:58)

* Open Floor  (nirik, 20:02:49)

Meeting ended at 20:04:14 UTC.




Action Items
------------





Action Items, by person
-----------------------
* **UNASSIGNED**
  * (none)




People Present (lines said)
---------------------------
* nirik (143)
* skvidal (99)
* abadger1999 (47)
* pingou (24)
* abompard (15)
* smooge (10)
* mdomsch (10)
* threebean (6)
* zodbot (5)
* SmootherFrOgZ (4)
* cyberworm54 (4)
* lmacken (2)
* maayke (1)
* ricky (0)
* dgilmore (0)
* CodeBlock (0)
--
19:00:01 <nirik> #startmeeting Infrastructure (2013-02-21)
19:00:01 <zodbot> Meeting started Thu Feb 21 19:00:01 2013 UTC.  The chair is 
nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:01 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link 
#topic.
19:00:01 <nirik> #meetingname infrastructure
19:00:01 <zodbot> The meeting name has been set to 'infrastructure'
19:00:01 <nirik> #topic welcome y'all
19:00:01 <nirik> #chair smooge skvidal CodeBlock ricky nirik abadger1999 
lmacken dgilmore mdomsch threebean
19:00:01 <zodbot> Current chairs: CodeBlock abadger1999 dgilmore lmacken 
mdomsch nirik ricky skvidal smooge threebean
19:00:13 * skvidal is here
19:00:15 <nirik> hello everyone. who's around for an infrastructure meeting?
19:00:15 <smooge> not guilty
19:00:23 * cyberworm54 is here
19:00:25 * lmacken 
19:00:26 * threebean is kinda here
19:00:28 * maayke is here
19:00:33 * abadger1999 here
19:00:40 * pingou here
19:00:52 * SmootherFrOgZ here
19:02:08 <nirik> ok, I guess lets go ahead and dive in...
19:02:15 <nirik> #topic New folks introductions and Apprentice tasks.
19:02:30 <nirik> any new folks like to introduce themselves? or apprentices 
with questions or comments?
19:03:04 <cyberworm54> Hi I am an apprentice and hopefully I can learn and 
contribute as much as I can
19:03:31 <nirik> welcome (back) cyberworm54
19:03:57 <cyberworm54> Thanks!
19:04:01 <nirik> to digress a bit... do folks think our apprentice setup is 
working well? or is there anything we can do to improve it?
19:04:20 <nirik> I think the biggest problem is new people getting up to speed 
and finding things they can work on.
19:04:52 <skvidal> nirik: also - we have a fair amount more code-related tasks 
than general admin tasks that newcomers can get into
19:04:56 <nirik> we are also low on new easyfix tickets, particularly in the 
sysadmin side.
19:05:02 <nirik> yeah.
19:05:14 <cyberworm54> it is a bit ...confusing but once you get to the docs 
and actually read it you have a start point
19:05:28 <nirik> #info new easyfix tasks welcome, team members are encouraged 
to try and file tickets for them.
19:06:06 <nirik> ok, moving on then I guess.
19:06:17 <nirik> #topic Applications status / discussion
19:06:27 <nirik> any application / development news this week or upcoming?
19:06:46 <pingou> I've been doing some cleanup on the pkgdb db schema
19:06:49 <pingou> before: http://ambre.pingoured.fr/public/pkgdb.png
19:06:57 <pingou> after: http://ambre.pingoured.fr/public/pkgdb2.png
19:07:25 <pingou> that's with the help of abadger1999 :)
19:07:29 <nirik> wow. nice!
19:07:29 <lmacken> nice ☺
19:07:42 <nirik> #info pingou has vastly simplified the pkgdb db.
19:07:46 * abadger1999 just reviews and makes suggestions to what pingou writes 
;-)
19:07:54 <pingou> pushed a new version of pkgdb-cli (waiting to arrive in 
testing) and pushed upstream a new version of copr-cli
19:08:16 <nirik> #info new pkgdb-cli pushed out as well as copr-cli
19:08:19 <abadger1999> New fas release is finally out the door.  Planning to 
upgrade production on Feb 28.
19:08:29 <pingou> abadger1999 and I have started to think about pkgdb2 
basically, schema update is the first step
19:08:56 <abadger1999> pkgdb -- yeah, and pkgdb2 api is probably going to be 
the second step
19:08:57 <nirik> #info fas release being tested in staging, for 2013-02-28 
release to prod.
19:09:19 <abadger1999> as a note for admins -- the fas release that introduced 
fedmsg introduced a bug that you should know about
19:09:40 <SmootherFrOgZ> btw, there's a bunch of locale fixes in the new fas 
release
19:09:41 <abadger1999> email verification when people change their email 
address was broken.
19:09:50 <nirik> thats the one we have in prod, but we have hotfixed it right?
19:10:00 <SmootherFrOgZ> would be good to test fas with different languages
19:10:32 <nirik> cool.
19:10:39 <abadger1999> it would change the email when the user first entered 
the updated email in the form instead of waiting for them to confirm that they 
received the verification email.
19:10:45 <nirik> I saw in stg that it also has the 'no longer accept just 
yubikey for password' in.
19:11:37 <threebean> askbot got fedmsg hooks in production this week.  there 
are some new bugs to chase down regarding invalid sigs and busted links..
19:11:41 <nirik> any other application news? oh...
19:11:56 <nirik> #info askbot is now sending fedmsg's.
19:11:58 <threebean> Latest status -> http://www.fedmsg.com/en/latest/status/
19:12:08 <skvidal> fedmsg.com? wow
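
For readers following along, an application-side fedmsg hook like the askbot one 
boils down to a call to fedmsg.publish() from inside the app. A minimal sketch, 
assuming a configured fedmsg environment; the topic and message fields below are 
illustrative, not askbot's actual schema:

    # Minimal sketch of emitting a fedmsg from an application hook.
    # Assumes fedmsg endpoints are configured in /etc/fedmsg.d/; the topic and
    # message body here are made up for illustration.
    import fedmsg

    def on_question_saved(question):
        fedmsg.publish(
            topic='question.new',  # becomes org.fedoraproject.<env>.askbot.question.new
            modname='askbot',
            msg={'title': question['title'], 'author': question['author']},
        )
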
19:12:25 <nirik> Has anyone had a chance to test patrick's fas-openid dev 
instance? any feedback for him?
19:12:26 <abadger1999> nirik: Hmm... looks like production isn't hotfixed.
19:12:30 <skvidal> threebean: what's the status on fedmsg emitters from outside 
of the vpn?
19:12:35 <abadger1999> nirik: but next fas release will have the fix.
19:12:40 <nirik> abadger1999: :( I thought we did. ok.
19:12:47 <threebean> skvidal: no material progress yet, but I've been thinking 
it over.
19:12:50 <abadger1999> Can we wait until Thursday?
19:13:01 <skvidal> threebean: okay thanks
19:13:04 <threebean> skvidal: I have some janitorial work to do.. then that's 
next on my list.
19:13:21 <skvidal> threebean: that's the limiting factor for adding notices 
from coprs, I think
19:13:29 <nirik> abadger1999: I suppose
19:14:12 * threebean nods
19:14:18 <abadger1999> I've used fas-openid but not tested it heavily.  It has 
worked and looks nice.  puiterwijk has a flask-fas-openid auth plugin that he's 
tested and converted fedocal, IIRC, to use it.
19:14:41 <nirik> yeah, it's worked for me for a small set of sites I tested.
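
As a rough illustration of what converting an app to the flask-fas-openid plugin 
involves, here is a sketch that assumes the plugin mirrors the existing flask_fas 
extension interface; the import names, config key, and decorator below are 
assumptions, not confirmed API:

    # Sketch of protecting a Flask view with FAS OpenID auth, assuming
    # python-fedora's flask_fas_openid mirrors the flask_fas interface.
    from flask import Flask, g
    from flask_fas_openid import FAS, fas_login_required  # names assumed

    app = Flask(__name__)
    app.config['FAS_OPENID_ENDPOINT'] = 'https://id.fedoraproject.org/'  # assumed setting
    fas = FAS(app)

    @app.route('/admin')
    @fas_login_required
    def admin():
        # g.fas_user is populated by the extension after a successful login
        return 'hello, %s' % g.fas_user.username
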
19:15:22 <pingou> speaking of fedocal, I need to tag 0.1.0 and put it up for 
review
19:15:29 <nirik> #info more fas-openid testing welcome. Has worked for those 
folks that have tried it so far.
19:15:41 <pingou> the current feature requests will have to wait for the next 
release...
19:15:57 <nirik> pingou: yeah. Will be good to get it setup. :)
19:16:15 <abadger1999> Oh, fchiulli has a new version of elections that's ready 
for some light testing
19:16:16 <nirik> #info fedocal ready for 1.0 tag and review process.
19:16:24 <pingou> abadger1999: oh cool!
19:16:30 <abadger1999> http://elections-dev.cloud.fedoraproject.org/
19:16:31 <nirik> abadger1999: cool. Is there an instance up?
19:16:34 <nirik> nice.
19:16:44 <skvidal> nirik: should be
19:16:48 <abadger1999> You need to make an account on fakefas in order to try 
it out.
19:17:04 <nirik> #info testing on new elections version welcome: 
http://elections-dev.cloud.fedoraproject.org/ (make account in fakefas)
19:17:06 <abadger1999> Please do try it out.
19:17:06 <skvidal> abadger1999: is elections switching to fas-openid, too?
19:17:24 <pingou> abadger1999: and the code is ?
19:17:45 <abadger1999> skvidal: I believe it is using flask-fas right now 
because flask-fas-openid isn't in a released python-fedora yet.
19:18:03 <abadger1999> pingou: https://github.com/fedora-infra/elections
19:18:14 <skvidal> abadger1999: got it
19:18:19 <pingou> abadger1999: great
19:18:20 <skvidal> abadger1999: thx
19:18:36 <abadger1999> np
19:18:47 <nirik> I have one more application type thing to discuss... dunno if 
abompard is still awake, but we should discuss mailman3. ;)
19:18:51 <abadger1999> I am all for moving more things over to the 
flask-fas-openid plugin though.
19:19:15 * nirik is too.
19:19:33 <nirik> anyhow, we are looking at setting up a mailman3 staging to do 
some more testing and shake things out.
19:19:41 <nirik> however, mailman3 needs python 2.7
19:19:43 <abompard> nirik: yeah
19:20:06 <nirik> so, it seems: a) rhel6 + a bunch of python rpms we build and 
maintain against python 2.7
19:20:12 <nirik> or b) fedora 18 instance
19:20:30 <smooge> abadger1999, congrats on election stuff
19:20:38 <abompard> yes, and MM3 really does not work on python 2.6, sadly
19:20:47 * pingou question: which one will be out first: EL7 or MM3? :-p
19:20:55 <nirik> we are starting to have more fedora in our infra (for example, 
the arm builders are all f18)
19:21:09 <nirik> so, we might want to come up with some policy/process around 
them. Like when to do updates, etc.
19:21:09 <abadger1999> smooge: thanks.  It was all fchiulli though :-)  I told 
him he can be the new owner of the code too :-)
19:21:13 <abompard> I've already rebuilt an application for a non-system 
python, and it's not much fun
19:21:33 <smooge> bwahahahah
19:21:33 <abompard> as in non-scriptable
19:21:58 <nirik> yeah, it's pain either way...
19:21:59 * abadger1999 thinks fedora boxes are going to be preferable to 
non-system python.
19:22:07 <pingou> +1
19:22:09 <skvidal> nirik: an idea
19:22:11 <nirik> I'm leaning that way as well.
19:22:16 <abompard> by the way, Debian has a strange but nifty packaging policy 
for python packages that makes them work with all the installed versions of python
19:22:21 <smooge> I think we should make a bunch of servers rawhide
19:22:40 <skvidal> abompard: I assume the db /data for mm3 is all separate from 
where it needs to run, right
19:22:46 <abadger1999> abompard: yeah -- I've looked at the policy but not the 
implementation.  But every time I've run it by dmalcolm, he's said he doesn't 
like it.
19:23:04 <abadger1999> abompard: i think some of that might be because he has 
looked at the implementation :-)
19:23:05 <abompard> abadger1999: understandably, it's symlink-based
19:23:17 <abompard> skvidal: yeah, to some extent
19:23:23 <skvidal> nirik: I wonder if we could have 2 instances - talking to 
the same db - so we could update f18 to latest - run mm3 on it in r/o mode - to 
make sure it is working
19:23:27 <abompard> skvidal: it has local spool directories
19:23:30 <skvidal> nirik: then just pass the ip over to the other one
19:23:40 <nirik> in the past we have been shy of fedora instances because of 
the massive updates flow I think, as well as possible bugs around those 
updates. I think it's gotten much better in the last few years (I like to think 
due to the updates policy, but hard to say)
19:23:59 <skvidal> nirik: which is why I was thinking we don't do updates to 
the RUNNING instance
19:24:07 <skvidal> we just swap out the instance that is in use/has that ip
19:24:08 <abadger1999> ... or less contributors?   /me ducks and runs
19:24:16 <nirik> :)
19:24:22 <skvidal> nirik: so we test the 'install'
19:24:24 <nirik> skvidal: right, so an extra level of staging?
19:24:31 <skvidal> nirik: one level, really
19:24:32 <abompard> skvidal: I don't know how MM3 will handle a read-only DB
19:24:37 <skvidal> prod and staging
19:25:02 <nirik> well, right now we are talking about a staging instance only, 
but yeah, I see what you mean. we could do something along those lines.
19:25:17 <nirik> I also think for some use cases it's not as likely to break...
19:25:36 <nirik> ie, for mailman, postfix and mailman and httpd all need to 
work, but it doesn't need super leaf nodes right?
19:25:39 <skvidal> abompard: understood
19:26:02 <skvidal> nirik: anyway - just an idea
19:26:04 <skvidal> nirik: ooo - actually
19:26:13 <skvidal> nirik: I just had a second idea that you will either hate or 
love
19:26:14 <nirik> whereas for something like a pyramid app, it would be a much 
more complex stack
19:26:16 <skvidal> nirik: snapshots
19:26:16 <abompard> skvidal: we may get bugs because of that, not because of 
the upgrade
19:26:30 <skvidal> nirik: we snapshot the running instance in the cloud
19:26:32 <nirik> yeah, we could do that too.
19:26:32 <skvidal> nirik: upgrade it
19:26:36 <skvidal> and if it dies - roll it out
19:27:04 <abompard> for the moment it will only be low-traffic lists anyway
19:27:22 <abompard> and I must check that, but if MM is not running, I think 
postfix keeps the message
19:27:30 <abadger1999> skvidal: how would that work in terms of data?  would we 
keep the db and local spool directory separate from the snapshots?
19:27:33 <abompard> and re-delivers when MM starts
19:27:34 <skvidal> abompard: yes
19:27:35 <nirik> FWIW, I run f18 servers at home here, and they have been 
pretty darn stable. (as they were when f17... earlier releases had more 
breakage from my standpoint)
19:27:41 <skvidal> err
19:27:41 <skvidal> abadger1999: yes
19:27:44 <abadger1999> Cool.
19:28:11 <skvidal> abadger1999: no reason we can't have a mm3-db server in the 
cloud :)
19:28:12 * abadger1999 kinda likes that.  although possibly he just doesn't 
know all the corner cases there :-)
19:28:16 <nirik> yeah. I'm sure we could do something with snapshots.
19:28:21 <skvidal> anyway - just an idea
19:28:23 <skvidal> nothing in stone
19:28:27 <nirik> yeah.
19:29:06 <nirik> also, for updates, we may just do them on the same schedule as 
rhel ones, unless something security comes up in an exposed part... ie, just 
look at the httpd, etc not the entire machine.
19:29:42 <nirik> anyhow, all to be determined, we can feel out a policy.
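
To make the snapshot idea above a little more concrete, here is a rough sketch of 
"snapshot the running instance, then upgrade it" against python-novaclient. The 
credentials, instance name, and client version are assumptions, and the real 
workflow could equally go through euca2ools or the EC2 API:

    # Rough sketch: snapshot a cloud instance before updating it, so a broken
    # update can be rolled back. Auth comes from the usual OS_* environment
    # variables; the server and image names are placeholders.
    import os
    from novaclient.v1_1 import client  # novaclient API of this era (assumed)

    nova = client.Client(os.environ['OS_USERNAME'],
                         os.environ['OS_PASSWORD'],
                         os.environ['OS_TENANT_NAME'],
                         os.environ['OS_AUTH_URL'])

    server = nova.servers.find(name='lists01.stg')  # hypothetical instance name
    image_id = server.create_image('lists01.stg-pre-update')

    # ...apply updates on the instance; if it breaks, rebuild from the snapshot:
    # server.rebuild(nova.images.get(image_id))
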
19:29:49 <nirik> anything else on the applications side?
19:29:56 <abadger1999> I have two more
19:30:00 <abadger1999> Do we have a schedule for getting fas-openid into 
production?
19:30:28 <nirik> abadger1999: I think it's ready for stg for sure now... but 
not sure when prod...
19:30:58 <nirik> I'm fine with rolling it out as fast as we are comfortable 
with.
19:31:03 <nirik> I'd like to see it get more use. ;)
19:31:04 <abadger1999> I think we're coming along great.  But if we're going to 
start migrating apps to use fas-openid/telling people to use it when developing 
their apps (like elections), then we need to have a plan for getting it into 
prod
19:31:09 <abadger1999> <nod>
19:31:19 <abadger1999> nirik: it's set up to replace the current fas urls?
19:31:34 <nirik> abadger1999: not fully sure on that. I think so...
19:31:36 * abadger1999 was wondering if we could deploy it and just not 
announce it for a few weeks
19:31:46 <nirik> thats a thought.
19:32:22 <abadger1999> alright -- I guess let's talk about this more on Friday 
after our classroom session with puiterwijk :-)
19:32:26 <nirik> Oddly I have noticed that for things like askbot you get two 
different "users" with different urls.
19:32:28 <nirik> yeah
19:33:05 <abadger1999> Other thing is for all the devs here, how's the "review 
all changes" idea working out?
19:33:27 <nirik> #info will try out an f18 server for mm3 staging testing and 
feel out an updates policy, etc. Possibly using snapshots more.
19:33:39 <abadger1999> I've liked how it works with pingou, puiterwijk, and 
SmootherFrogZ for fas, python-fedora, and packagedb.
19:33:46 <nirik> #info will look at moving fas-openid to prod as soon as is 
feasible.
19:33:55 <skvidal> abompard: how much space do you need on the mm server itself 
- if you are not storing the db there?
19:33:59 <abadger1999> lmacken: Is it working okay for bodhi and such too?
19:34:07 <abadger1999> anything that's falling through the cracks?
19:34:14 <abompard> skvidal: I need to check that
19:34:21 <nirik> skvidal: if we are doing this as a real staging, we might want 
to just make a real 'lists01.stg.phx2' virthost instead of cloud?
19:34:26 <pingou> abompard: I definitely like it
19:34:53 <abadger1999> Do we want to say that certain things are okay to push 
without review?  (making a release would be a candidate...I was going to 
suggest documentation earlier but pingou found a number of problems with my 
documentation patch :-)
19:34:53 <pingou> abadger1999: ^ :)
19:34:54 <skvidal> nirik: okay - I didn't know if we wanted to be cloud-er-fic 
about it or not
19:35:01 <skvidal> nirik: thx
19:35:31 <nirik> skvidal: yeah, I'm open to either, but I think right now until 
we have less fog in our clouds, a real one might be better for this... but 
either way
19:35:53 <skvidal> nirik: well - with attached persistent volumes - using one 
of the qcow imgs is non-harmful
19:35:55 <nirik> abadger1999: I like seeing the extra review. I've not done 
much reviewing myself. ;)
19:36:06 <abompard> skvidal: not much, a few hundred MB
19:36:08 <skvidal> nirik: but I agree about fog
19:36:17 * abadger1999 notes that threebean is in another meeting but said he 
still likes the idea but hasn't done it consistently all the time.  So more 
experimentation with it needed.
19:36:43 * abadger1999 liked that nb reviewed a documentation update the other 
day :-)
19:37:02 <pingou> I think it can bring us new contributors
19:37:21 <pingou> some of them are easyfix
19:37:31 <pingou> others are bigger and then might need more experienced 
reviewers
19:37:57 <nirik> yeah
19:38:21 <nirik> welcome mdomsch
19:38:41 <abadger1999> Yeah.  I agree.  it's nice to have someone else's eyes 
on the bigger fixes even if they're relatively new too, though.  It's better 
than before where I would have committed it without any review at all.
19:38:49 <mdomsch> better late than never
19:38:51 <nirik> that reminds me, mdomsch was going to look at updating mm in 
prod to 1.4 on friday... if not then, then sometime soon. ;)
19:39:04 <nirik> #info feedback on github reviews of all commits welcome.
19:39:11 <mdomsch> anyone have any grief with doing a major MM upgrade tomorrow 
afternoon?
19:39:11 <nirik> #info mirrormanager update to 1.4 soon.
19:39:47 <abadger1999> mdomsch: If you're around in case it goes sideways it 
would be very nice.
19:39:51 <mdomsch> everything I know I've broken, I've fixed.  Now it's time to 
test in production. :-)
19:39:52 <nirik> I think it should be fine. We can be somewhat paranoid and not 
touch one of the apps so we have an easy fallback.
19:40:11 <abadger1999> get the fixes in that you've had pending and get us onto 
a single codebase for development.
19:40:13 <mdomsch> k
19:40:25 <nirik> (until we are sure the others are all working right I mean)
19:40:31 <mdomsch> right
19:40:34 <mdomsch> so bapp02, then app01
19:40:47 * nirik nods.
19:40:48 <mdomsch> and I'll stop the automatic push from bapp02 to app*
19:40:58 <nirik> sounds good.
19:41:00 <mdomsch> until we're comfortable.  Worst case, we have slightly stale 
data for a few hours
19:41:21 * nirik nods.
19:41:28 <abadger1999> instead of "if you're around"  it would've been clearer 
for me to say "as long as you're around" :-)
19:41:43 <nirik> mdomsch: you've picked up all the hotfixes into 1.4 right?
19:41:46 <mdomsch> abadger1999: naturally; I'm not around nearly as much
19:41:56 <abadger1999> Yeah.  we miss you ;-)
19:42:16 <nirik> abadger1999: +1 :)
19:42:40 <nirik> anyhow, any other application news? or shall we move on?
19:43:00 <nirik> #topic Sysadmin status / discussion
19:43:06 <mdomsch> nirik: yes I pulled them all in while at FUDCon
19:43:17 <nirik> lets see... this week smooge was out at phx2 for a whirlwind 
tour.
19:43:22 <nirik> mdomsch: cool.
19:43:45 <nirik> #info smooge got out bnfs01 server's disks working again.
19:43:51 <nirik> #undo
19:43:51 <zodbot> Removing item from minutes: <MeetBot.items.Info object at 
0x281d8c50>
19:43:56 <nirik> #info smooge got our bnfs01 server's disks working again.
19:44:09 <smooge> kind of sort of
19:44:19 <nirik> I've been tweaking nagios of late... hopefully making it 
better.
19:44:30 <nirik> #info nagios adjustments in progress
19:44:56 <nirik> We should have net for the rest of the arm boxes friday.
19:45:07 <nirik> #info arm boxes will get new net friday hopefully
19:45:14 <skvidal> I had a discussion with the author of pynag this morning
19:45:49 <nirik> cool. Worth using for a tool for us to runtime manage nagios?
19:45:50 <skvidal> if we have people willing to spend some time - we could 
easily build a query tool/cli-tool for nagios downtimes/acknowledgements/etc
19:46:07 <nirik> that would be quite handy, IMHO
19:46:12 <skvidal> nirik: it needs some code to make it work - but I think the 
basic functionality  is available
19:46:41 <nirik> for some things the ansible nagios module would do, but for 
others it would be nice to have a command line.
19:47:15 <nirik> I'd like to look at doing a mass reboot next wed or so... 
upgrade everything to rhel 6.4.
19:47:17 <SmootherFrOgZ> skvidal: interesting!
19:47:37 <nirik> Might do staging today/tomorrow to let it soak there and see 
if any of our stuff breaks. ;)
19:47:52 <nirik> #info mass reboot next wed (tentative) for rhel 6.4 upgrades.
19:47:59 <skvidal> nirik: right - I'd like to be able to enhance the ansible 
nagios module to be more idempotent and 'proper'
19:48:04 <skvidal> nirik: pynag _could_ do that
19:48:17 <nirik> yeah, it looks very bare bones right now.
19:48:35 <nirik> in particular we could use a 'downtime for host and all 
dependent hosts' type thing
19:48:52 <skvidal> nirik: we could also use a 'give me the state of this host'
19:48:58 <skvidal> without having to go to the webpage
19:49:14 <skvidal> according to palli (a pynag developer) it can read status.dat
19:49:15 <skvidal> from nagios
19:49:18 <smooge> I am looking at lldpd for our PHX2 systems 
http://vincentbernat.github.com/lldpd/ Mainly to better get an idea of where 
things are
19:49:20 <skvidal> to determine ACTUAL state
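
As a sketch of the kind of helper being discussed here: scheduling a downtime is 
a matter of writing a SCHEDULE_HOST_DOWNTIME external command into Nagios' 
command file, and current state can be read out of status.dat, which is what 
pynag wraps. The paths below are common defaults, not necessarily ours:

    # Minimal downtime/state helper for Nagios using the external command file
    # and status.dat. Paths are typical defaults, not our actual config; pynag
    # would replace the hand-rolled status.dat reading.
    import time

    CMD_FILE = '/var/spool/nagios/cmd/nagios.cmd'
    STATUS_DAT = '/var/log/nagios/status.dat'

    def schedule_host_downtime(host, minutes, author, comment):
        now = int(time.time())
        duration = minutes * 60
        cmd = ('[%d] SCHEDULE_HOST_DOWNTIME;%s;%d;%d;1;0;%d;%s;%s\n'
               % (now, host, now, now + duration, duration, author, comment))
        with open(CMD_FILE, 'w') as pipe:  # nagios.cmd is a FIFO
            pipe.write(cmd)

    def host_state(host):
        """Return current_state (0 = OK) for a host, or None if not found."""
        for block in open(STATUS_DAT).read().split('hoststatus {')[1:]:
            if ('host_name=%s\n' % host) in block:
                for line in block.splitlines():
                    line = line.strip()
                    if line.startswith('current_state='):
                        return int(line.split('=', 1)[1])
        return None
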
19:49:26 <nirik> finally in the sysadmin world, I'd really like to poke ansible 
more and get it to where we can use it for more hosts. Keep getting 
sidetracked, but it will happen! :)
19:51:04 <nirik> smooge: another thing we could look at there is 
http://linux-ha.org/source-doc/assimilation/html/index.html (it uses lldpd type 
stuff). They are about to have their first release... so very early days.
19:51:33 <smooge> ah cool
19:51:38 <smooge> will look at that also
19:51:55 <nirik> oh, on nagios, I set an option: soft_state_dependencies=1
19:52:22 <nirik> this hopefully will help us not get the flurry of notices when 
a machine is dropping on and off the net, or has too high a load to answer, 
then answers again.
19:52:50 <nirik> #topic Private Cloud status update / discussion
19:53:01 <nirik> skvidal: want to share your pain where we are with cloudlets? 
:)
19:53:08 <skvidal> sure
19:53:23 <skvidal> last week I did the euca upgrade and the wheels came right 
off
19:53:29 <skvidal> and then it plunged over a cliff
19:53:31 <skvidal> into a volcano
19:53:41 <pingou> sounds like a lot of fun
19:53:42 <skvidal> where it was eaten by a volcano monster
19:53:54 <smooge> who was riding a yak
19:53:56 <skvidal> anyway the euca instance is limping along at the moment with 
not-occasional failures :(
19:54:04 <skvidal> smooge: and the yak had to be shaven
19:54:17 <pingou> brought back some pictures >
19:54:19 <pingou> ?
19:54:21 <skvidal> so...
19:54:35 <skvidal> I've been working on porting our imgs/amis/etc over to 
openstack
19:54:44 <skvidal> and getting things more production-y in the openstack 
instance -
19:54:58 <skvidal> I got ssl working around the ec2 api for openstack
19:55:11 <nirik> #info euca cloudlet limping along after upgrade.
19:55:12 <skvidal> working on ssl'ing the other items
19:55:18 <skvidal> for the past couple of days
19:55:26 <nirik> #info work ongoing to bring the openstack cloudlet up to more 
production readiness
19:55:28 <skvidal> I've been in a fist fight with openstack and qcow images
19:55:33 <skvidal> and resizing disks
19:55:47 <skvidal> I just got confirmation from someone that what we want to do 
is just not possible at the moment :)
19:55:54 <nirik> lovely. ;(
19:56:10 <skvidal> nirik: not until we get the initramdisk to resize the 
partitions :(
19:56:17 <skvidal> so - I'm punting on this
19:56:24 <skvidal> I just put in a new ami and kernel/ramdisk combo
19:56:29 <skvidal> that's rhel6.4 latest
19:56:30 <smooge> sometimes that is best
19:56:35 <nirik> yeah. I think that could work, but needs some time to get 
working right. Hopefully by the cloud-utils maintainer. ;)
19:56:38 <skvidal> and since it is an AMI  it resizes the disks
19:56:50 <skvidal> what it DOES NOT DO is follow the kernel on the disk - it 
uses the one(s) in the cloud
19:56:54 <skvidal> which is suck
19:57:00 <skvidal> but at least it is known/obvious suck
19:57:08 <nirik> but it should also get us moving past it for now.
19:57:11 <skvidal> I've also just built a new qcow from rhel6.4
19:57:27 <skvidal> so for systems that don't need to be on-the-fly made - we 
can spin them up
19:57:31 <skvidal> growpart the partition
19:57:33 <skvidal> reboot
19:57:35 <skvidal> resize
19:57:37 <skvidal> and go
19:57:47 <skvidal> and i'm working on a playbook to handle all of the above for 
you
19:57:51 <skvidal> and, yes, it makes me cry inside
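
For reference, the grow-the-root-disk sequence described above, sketched as a 
small Python wrapper; the device and partition names are assumptions and the 
filesystem is assumed to be ext4 (the eventual version would live in an ansible 
playbook):

    # Two-phase resize: growpart (from cloud-utils) extends the partition, a
    # reboot makes the kernel re-read the partition table, then resize2fs grows
    # the (assumed ext4) filesystem. Device names are assumptions.
    import subprocess

    def grow_partition(disk='/dev/vda', partition=1):
        subprocess.check_call(['growpart', disk, str(partition)])

    def grow_filesystem(device='/dev/vda1'):
        # run after the post-growpart reboot
        subprocess.check_call(['resize2fs', device])
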
19:58:08 <nirik> ;(
19:58:12 <skvidal> that's where we are at the moment
19:58:26 <skvidal> I am making new keys/accounts/tenants/whatever
19:58:35 <skvidal> for our lockbox 'admin' user
19:58:40 <skvidal> for making persistent instances
19:58:53 * nirik nods.
19:58:57 <skvidal> the next step is to start making use of the resource tags in 
openstack
19:59:02 <skvidal> so we can more easily track all this shit
19:59:15 <skvidal> also I have to make a bunch of volumes and rsync over all 
the data from the euca volumes :(
19:59:30 <skvidal> I fully expect that last part to be a giant example of 
suffering
19:59:46 <nirik> yeah. we should probably move one set of instances first and 
sort out if there's any doom
19:59:57 <skvidal> if I sound kinda 'bleah' there's a reason
20:00:02 <skvidal> nirik: I thought I'd start with the fartboard
20:00:07 <nirik> heh. ok
20:00:37 <skvidal> nirik: also - now that we have instance tags - it should be 
doable to write a simple 'start me up' script using ansible to spin out the 
instances
20:00:40 <skvidal> and KNOW where they are
20:00:48 <nirik> ok, we are running over time... let me quickly do upcoming and 
open floor. ;)
20:00:52 <skvidal> sorry
20:00:55 <skvidal> thx
20:00:57 <nirik> thats fine. ;) all good info
20:01:04 <skvidal> one last thing
20:01:08 <skvidal> if anyone wants to get involved
20:01:08 <skvidal> ping me
20:01:29 <nirik> #info please see skvidal if you want to get involved in our 
private cloud setup
20:01:33 <nirik> #topic Upcoming Tasks/Items
20:01:42 <nirik> (big paste)
20:01:44 <nirik> #info 2013-02-28 end of 4th quarter
20:01:44 <nirik> #info 2013-03-01 nag fi-apprentices
20:01:44 <nirik> #info 2013-03-07 remove inactive apprentices.
20:01:44 <nirik> #info 2013-03-19 to 2013-03-26 - koji update
20:01:44 <nirik> #info 2013-03-29 - spring holiday.
20:01:46 <nirik> #info 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze
20:01:48 <nirik> #info 2013-04-16 F19 alpha release
20:01:50 <nirik> #info 2013-05-07 to 2013-05-21 BETA infrastructure freeze
20:01:52 <nirik> #info 2013-05-21 F19 beta release
20:01:54 <nirik> #info 2013-05-31 end of 1st quarter
20:01:56 <nirik> #info 2013-06-11 to 2013-06-25 FINAL infrastructure freeze.
20:01:58 <nirik> #info 2013-06-25 F19 FINAL release
20:02:00 <nirik> anything people want to schedule/note etc?
20:02:07 <nirik> I'll add the fas update and the mass reboot.
20:02:20 <abadger1999> Sounds good.
20:02:49 <nirik> #topic Open Floor
20:02:54 <nirik> Anyone have items for open floor?
20:03:32 <pingou> I have a series of blog posts 'Fedora-Infra: Did you know?' 
coming, like once a week for the coming 4 weeks
20:03:32 <nirik> ok.
20:03:42 <skvidal> pingou: wow
20:03:46 <nirik> pingou: awesome. More blog posts would be great.
20:03:49 <pingou> short stuff, speaking about some cool features/ideas
20:03:52 <skvidal> pingou: looking forward to seeing those
20:04:10 <nirik> Thanks for coming everyone. Do continue over on our regular 
channels. :)
20:04:14 <nirik> #endmeeting
