Re: Bug 1642609: new model config defaults

2016-12-08 Thread Michael Foord
I created bug 1648426 to track discussion of which model config options 
(if indeed any) should propagate by default.


https://bugs.launchpad.net/juju/+bug/1648426

Michael


On 07/12/16 21:37, Michael Foord wrote:

Hey all,

I spent far longer than was reasonable working out why OIL were unable 
to deploy workloads with juju 2.0.2 from the proposed stream.


https://bugs.launchpad.net/juju/+bug/1642609

The repro of the bug involved bootstrapping a xenial controller, 
creating several new models and deploying bundles into them. The xenial 
machines would provision but the trusty machines would fail.


The cause of the problem is actually by design, although I would argue 
it is still insane and needs fixing. The agent-stream (proposed) is a 
model-config option that is not propagated to new models, so for new 
models the default stream is "released". The xenial agent is cached in 
the controller, so new models can provision xenial machines from the 
cached agent. When trying to provision a trusty machine the new model 
looks in its agent-stream, released, which does not have 2.0.2 tools 
and thus fails.


There are three current workarounds:

* If we promote 2.0.2 from proposed to released, this specific problem 
goes away...


* After adding a new model you can set the agent-stream in the 
model-config


* Bootstrapping with "--model-default agent-stream=proposed" allegedly 
does propagate config options to new models
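
For concreteness, the second and third workarounds look something like 
this (the model name is illustrative):

# after adding a new model, point it at the proposed stream
juju add-model my-model
juju model-config -m my-model agent-stream=proposed

# or make it the default for every model at bootstrap time
# (with the usual cloud/controller arguments on the bootstrap line)
juju bootstrap --model-default agent-stream=proposed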


I am strongly of the opinion that at the very least a newly created 
model should be capable of deploying workloads, which means that at 
least a subset of model-config options should be propagated by default 
to new models. This means at least agent-stream, agent-metadata-url, 
the proxy settings, etc.


All the best,

Michael Foord







Bug 1642609: new model config defaults

2016-12-07 Thread Michael Foord

Hey all,

I spent far longer than was reasonable working out why OIL were unable 
to deploy workloads with juju 2.0.2 from the proposed stream.


https://bugs.launchpad.net/juju/+bug/1642609

The repro of the bug involved bootstrapping a xenial controller, 
creating several new models and deploying bundles into them. The xenial 
machines would provision but the trusty machines would fail.


The cause of the problem is actually by design, although I would argue 
it is still insane and needs fixing. The agent-stream (proposed) is a 
model-config option that is not propagated to new models, so for new 
models the default stream is "released". The xenial agent is cached in 
the controller, so new models can provision xenial machines from the 
cached agent. When trying to provision a trusty machine the new model 
looks in its agent-stream, released, which does not have 2.0.2 tools 
and thus fails.


There are three current workarounds:

* If we promote 2.0.2 from proposed to released, this specific problem 
goes away...


* After adding a new model you can set the agent-stream in the model-config

* Bootstrapping with "--model-default agent-stream=proposed" allegedly 
does propagate config options to new models


I am strongly of the opinion that at the very least a newly created 
model should be capable of deploying workloads, which means that at 
least a subset of model-config options should be propagated by default 
to new models. This means at least agent-stream, agent-metadata-url, 
the proxy settings, etc.


All the best,

Michael Foord





Re: Github Reviews vs Reviewboard

2016-10-14 Thread Michael Foord

0


On 13/10/16 23:44, Menno Smits wrote:
We've been trialling Github Reviews for some time now and it's time to 
decide whether we stick with it or go back to Reviewboard.


We're going to have a vote. If you have an opinion on the issue please 
reply to this email with a +1, 0 or -1, optionally followed by any 
further thoughts.


  * +1 means you prefer Github Reviews
  * -1 means you prefer Reviewboard
  * 0 means you don't mind.

If you don't mind which review system we use there's no need to reply 
unless you want to voice some opinions.


The voting period starts *now* and ends by *EOD next Friday (October 21)*.

As a refresher, here are the concerns raised for each option.

*Github Reviews*

  * Comments disrupt the flow of the code and can't be minimised,
hindering readability.
  * Comments can't be marked as done making it hard to see what's
still to be taken care of.
  * There's no way to distinguish between a problem and a comment.
  * There's no summary of issues raised. You need to scroll through
the often busy discussion page.
  * There's no indication of which PRs have been reviewed from the
pull request index page nor is it possible to see which PRs have
been approved or otherwise.
  * It's hard to see when a review has been updated.

*Reviewboard*

  * Another piece of infrastructure for us to maintain
  * Higher barrier to entry for newcomers and outside contributors
  * Occasionally misses Github pull requests (likely a problem with
our integration so is fixable)
  * Poor handling of deleted and renamed files
  * Falls over with very large diffs
  * 1990's looks :)
  * May make future integration of tools which work with Github into
our process more difficult (e.g. static analysis or automated
review tools)

There has been talk of evaluating other review tools such as Gerrit 
and that may still happen. For now, let's decide between the two 
options we have recent experience with.


- Menno






Re: hacking upload tools for development

2016-04-19 Thread Michael Foord



On 14/04/16 19:38, Aaron Bentley wrote:


Hi there.

I've done a lot of work with simplestreams lately, and we've got some
decent tools for generating them quickly and easily.  I'd be happy to
work with someone from core to develop a tool to generate streams that
you can use in place of --upload-tools.


That sounds ideal.

Michael
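
For anyone who wants to approximate that today, here's a rough sketch 
using the existing juju metadata plugin (the tarball name and paths are 
illustrative and must match the jujud you actually built):

# package a locally built jujud as a tools tarball and generate metadata
go install github.com/juju/juju/...
mkdir -p ~/local-streams/tools/releases
tar czf ~/local-streams/tools/releases/juju-2.0-beta4-xenial-amd64.tgz \
    -C $GOPATH/bin jujud
juju metadata generate-tools -d ~/local-streams

# bootstrap against the local streams rather than using --upload-tools
juju bootstrap --metadata-source ~/local-streams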


Aaron

On 2016-04-14 09:52 AM, Nate Finch wrote:

I wrote a wiki page about how to hack bootstrap --upload-tools so
that it doesn't stop the controller from going through the normal
processes of requesting tools from streams etc.  This can be handy
when that's the part of the code you want to debug, but you need to
upload a custom binary with extra debugging information and/or
changes to that code.  It turns out that the changes are very
minimal, they're just hard to find and spread out in a few
different files.

Feel free to make corrections if there's anything I've missed or
could do better:
https://github.com/juju/juju/wiki/Hacking-upload-tools









Re: kvm instance creation

2015-06-03 Thread Michael Foord



On 03/06/15 09:15, Robie Basak wrote:

Hi Michael,

(uvtool author here)

On Tue, Jun 02, 2015 at 09:30:15AM +0100, Michael Foord wrote:

In order to specify a MAC address for a KVM image we need to create a
libvirt domain xml template to pass as the template argument to uvt-kvm.

Right.


Ideally we'd like to specify as *little as possible* in the template and
have the generated libvirt xml (i.e. the created image) be identical to what
uvt-kvm would have created for us without the template. The only change
being that the image will have a network interface with the MAC address we
specify.

Attached are two xml files. juju-bare-metal.xml is the *generated* xml
(using master) for a kvm image on MAAS bare metal (functionally the same as
the xml we generate for a kvm image in a kvm image - I checked).
template.xml is the minimal template I found that would cause uvt-kvm to
generate the same image.

What you've done here is correct and the best way to do it right now - I
presume you will end up generating the XML here instead of letting
uvtool do it?


We have a minimal template, specifying only what we have to, and let 
uvtool generate the rest from defaults and command line parameters.



Note that in addition to picking up the (usually default) template XML
file, uvtool also does some manipulation of the XML tree in response to
the other options specified (--disk, --memory, some networking options,
etc) as well as some mandatory manipulations (defining the name, for
example).

Yes, we'll still be using the command line parameters for those things. 
We'd like to hardcode as little as possible in the XML for the reasons 
you describe.



One catch is that if things change in the future, Juju (by hardcoding
the XML default) will be stuck in the past. For example, I think there
are some changes in the pipeline where supporting different
architectures such as ppc64el and arm64 require slightly different
optimal XML definitions.

How about I add an option to allow you to set the MAC address from the
command line? This would only apply to >= Wily, but older releases
aren't set to change anyway. For future releases, if you used that
option, then you'd get what you need but also be able to stick to
default XML handling.

The catch is that you'll need to implement it both ways - but the
implementation to add the command line option would presumably be pretty
small.

How does that sound?


That would be great! We're happy to use XML generation as a fallback for 
older versions of uvtool.


At some point in the not too far future we'll need to support multiple 
NICs (each with a MAC address and bridged to a different network on the 
host), so if you could bear that use case in mind that would also be 
great...


Many thanks,

Michael Foord


Robie






kvm instance creation

2015-06-02 Thread Michael Foord

Hey folks,

Just wanting to share the knowledge in case I get hit by a bus (and 
provide an opportunity for those of you who know more than me to point 
out my howling errors) :-)


With the new MAAS (1.8) devices API, to allocate an IP address for a 
container we need to know the MAC address of the container. To set up 
routing for the container we need to know the IP address before we 
create the container. So we need to know (i.e. generate) the MAC address 
before container creation.


In order to specify a MAC address for a KVM image we need to create a 
libvirt domain xml template to pass as the template argument to uvt-kvm.


References:

http://manpages.ubuntu.com/manpages/trusty/man1/uvt-kvm.1.html
https://libvirt.org/formatdomain.html#elementsNICS
https://help.ubuntu.com/lts/serverguide/cloud-images-and-uvtool.html

Ideally we'd like to specify as *little as possible* in the template and 
have the generated libvirt xml (i.e. the created image) be identical to 
what uvt-kvm would have created for us without the template. The only 
change being that the image will have a network interface with the MAC 
address we specify.


Attached are two xml files. juju-bare-metal.xml is the *generated* xml 
(using master) for a kvm image on MAAS bare metal (functionally the same 
as the xml we generate for a kvm image in a kvm image - I checked). 
template.xml is the minimal template I found that would cause uvt-kvm 
to generate the same image.
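
For reference, the template gets fed to uvt-kvm at creation time roughly 
like this (hostname, release and sizes here are illustrative):

uvt-simplestreams-libvirt sync release=trusty arch=amd64
uvt-kvm create juju-machine-0-kvm-0 release=trusty arch=amd64 \
    --template template.xml --memory 512 --cpu 1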


We already have code support (need to check it works) for specifying a 
NIC with MAC address for lxc.


All the best,

Michael Foord

<domain type='kvm'>
  <name>juju-machine-0-kvm-0</name>
  <uuid>2fec39e2-30ad-4bbf-b8be-1b6b678fc957</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-trusty'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/uvtool/libvirt/images/juju-machine-0-kvm-0.qcow'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/uvtool/libvirt/images/juju-machine-0-kvm-0-ds.qcow'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='bridge'>
      <mac address='52:54:00:91:16:bd'/>
      <source bridge='juju-br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='stdio'>
      <target port='0'/>
    </serial>
    <console type='stdio'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
</domain>


<domain type='kvm'>
  <name>new-machine</name>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <devices>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <serial type='stdio'>
      <target port='0'/>
    </serial>
    <console type='stdio'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <interface type='network'>
      <mac address='52:54:00:7a:ef:cf'/>
      <model type='virtio'/>
      <source network='maas'/>
    </interface>
  </devices>
</domain>



Re: Is simplestreams spam worth having in the Log

2015-04-01 Thread Michael Foord



On 01/04/15 11:47, John Meinel wrote:
I've been noticing lately that every time a test fails it ends up 
having a *lot* of lines about failing to find simplestreams headers. 
(This last test failure had about 200 long lines of that, and only 6 
lines of actual failure message that were useful.)


Now I think there are a few things to look at here:

1) The lines about looking for tools double up and occur 9 times. Why 
are we repeating the search for tools 9 times in 
TestUpgradeCharmDir? Maybe it's genuine, but it sure feels like we're 
doing work over and over again that could be done once.


2) We still default to reporting every failed index.json lookup, and 
*not* reporting the one that succeeded. Now these are at DEBUG level, 
but I have the feeling their utility is low enough that we should 
actually switch them to TRACE and *start* logging the one we 
successfully found at DEBUG level.


Thoughts?



Oh god, please reduce log spam where you can! Trawling through logs for 
actual failure reasons is the bane of my life (and probably everyone 
else's)!


Michael


John
=:-






Re: Please, no more types called State

2015-03-12 Thread Michael Foord



On 12/03/15 05:01, David Cheney wrote:

lucky(~/src/github.com/juju/juju) % pt -i type\ State\ | wc -l

23

Thank you.


When I was new to Juju, the fact that we had a central State, core to 
the Juju model, yet umpteen other types also called State - so when you 
saw a State you had no idea what it actually was, and when someone 
mentioned State you couldn't be sure what they meant - was a significant 
part of the learning curve.


Perhaps a better solution would have been a better name for the core State.

Michael



Dave






Manual bootstrap to kvm (for testing behind a proxy etc)

2015-02-11 Thread Michael Foord

Hey all,

I've been working on fixing the problem(s) with deploying juju behind a 
proxy [1]. This involved creating kvm instances (using virt-manager), 
firewalling them off to only have access to the network through a proxy 
(squid running on the host) and bootstrapping with the manual provider. 
This is a generally useful technique for testing bootstrap (etc.) 
without using the local provider, which is a special snowflake in many 
ways and can't always be used for testing.


There's nothing new or complex here, but it's a nice technique. I'll 
also describe the firewall rules needed for simulating a machine behind 
a proxy. Useful if you ever need to test this scenario.


First of all create a new kvm instance from 14.04 server, and pre-select 
openssh to be installed. You shouldn't need to install anything else.


If you're going to be running behind a proxy then clone the kvm instance 
(probably a good technique anyway) and use the clone. This is because 
you can't reprovision a machine with the manual provider when it's 
behind a proxy [2].


If you want to run behind a proxy then install squid3 on your host and 
edit the squid.conf to allow access from the local network (or from 
everywhere). The default squid port is 3128.
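
For example, something along these lines on the host (the subnet is 
whatever your kvm network uses):

sudo apt-get install squid3
# in /etc/squid3/squid.conf, allow your local network, e.g.:
#   acl localnet src 192.168.178.0/24
#   http_access allow localnet
sudo service squid3 restart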


iptables rules for the kvm instance are easiest to setup with ufw, which 
should be installed by default. Run the following commands as root:


ufw enable
ufw default deny outgoing
ufw allow out 22
ufw allow in 22
ufw allow out 17070
ufw allow in 17070
ufw allow out 67/udp
ufw allow in 67/udp
ufw allow out 3128/tcp
ufw allow in 3128/tcp
ufw allow out 53/udp

This permits ssh access, the apiserver, dns and dhcp, plus access to the 
squid proxy, but blocks everything else.
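
You can sanity-check the resulting rule set with:

sudo ufw status verbose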


You can then edit environments.yaml as normal for the manual provider 
(run ip addr in the kvm instance to get the IP address of course):


manual:
    type: manual
    bootstrap-host: 192.168.178.190
    bootstrap-user: username
    #http-proxy: http://192.168.178.103:3128/
    #https-proxy: http://192.168.178.103:3128/

Followed by:

juju switch manual
juju bootstrap --upload-tools

This will just work...

If you also wish to deploy units to a separate machine (you can deploy 
to the state server instance with --to 0 of course) you'll need 
another kvm instance, using the form:


juju deploy wordpress --to ssh:user@<ip addr>


Warm regards,

Michael Foord

[1] https://bugs.launchpad.net/juju-core/+bug/1403225
[2] https://bugs.launchpad.net/juju-core/+bug/1418139


Re: reviewboard-github integration

2014-10-21 Thread Michael Foord


On 20/10/14 22:38, Eric Snow wrote:

This should be resolved now.  I've verified it works for me.  If it
still impacts anyone, just let me know.


I still have the issue I'm afraid. No reviewer set, no diff.

http://reviews.vapour.ws/r/211/

Michael



-eric

On Mon, Oct 20, 2014 at 7:34 PM, Eric Snow eric.s...@canonical.com wrote:

Yeah, this is the same issue that Ian brought up.  I'm looking into
it.  Sorry for the pain.

-eric

On Mon, Oct 20, 2014 at 5:31 PM, Dimiter Naydenov
dimiter.nayde...@canonical.com wrote:


Hey Eric,

Today I tried proposing a PR and the RB issue (#202) was created, but
it didn't have Reviewers field set (as described below), it wasn't
published (due to the former), but MOST importantly didn't have a diff
uploaded. After fiddling around with rbt I managed to do:
$ rbt diff > ~/patch
(while on the proposed feature branch)

And then went to the RB issue page and manually uploaded the generated
diff and published it.

So most definitely the hook generating RB issues has to upload the
diff as well :)

It's coming together, keep up the good work!

Cheers,
Dimiter

On 20.10.2014 16:53, Eric Snow wrote:

On Mon, Oct 20, 2014 at 6:06 AM, Ian Booth
ian.bo...@canonical.com wrote:

Hey Eric

This is awesome, thank you.

I did run into a gotcha - I created a PR and then looked at the
Incoming review queue and there was nothing new there. I then
clicked on All in the Outgoing review queue and saw that the
review was unpublished. I then went to publish it and it
complained at least one reviewer was needed. So I had to fill in
juju-team and all was good.

1. Can we make it so that the review is published automatically?
2. Can we pre-fill juju-team as the reviewer?

Good catch.  The two are actually related.  The review is
published, but that fails because no reviewer got set.  I'll get
that fixed.

-eric



- --
Dimiter Naydenov dimiter.nayde...@canonical.com
juju-core team






Re: Agent API Versioning and Upgrades

2014-09-17 Thread Michael Foord


On 17/09/14 10:15, Dimiter Naydenov wrote:


Hi all,

TL;DR When introducing a new agent API facade version used by a
worker, which requires an upgrade step (schema changes, migration), I
propose to not to keep the old worker code (using the older API facade
version only) after the upgrade step is in place. In other words,
trust our upgrade logic to do the right thing.


This sounds sane and the right approach to me. Especially the "find 
and fix bugs in upgrade" approach, rather than trying to defensively 
code around bugs that may or may not exist (at the cost of keeping 
around a lot more old code we don't really want).


Michael


While doing the port ranges work I encountered a common problem, which
I'll try to explain below.

As our agents and workers evolve we introduce new agent API facade
versions to enable that. Unlike the client API, where we have to support
every version since the release of trusty, the older agent API
versions can be deprecated sooner. Why? Because we have synchronized
upgrades which run before any workers have a chance to connect to the
API server and request a specific facade version.

I'm specifically talking about http://reviews.vapour.ws/r/33/, which
introduces a new FirewallerAPIV1 and refactors the existing code as
FirewallerAPIBase (for embedding and sharing common code between V0
and V1) and FirewallerAPIV0. The apiserver side has separate test
suites for V0 and V1, and a base suite containing common tests for
both versions. The reason for introducing V1 is because the firewaller
worker will start watching opened port ranges on machines, not units,
as soon as the APIV1 lands and the worker code is changed. For this to
happen though, there are some schema changes (adding a new openedPorts
collection) and migration of existing data (moving individual opened
ports from the units document to the new collection as port ranges on
the unit's assigned machine), which can be implemented as an upgrade
step. Once the upgrade step is in place, due to the way upgrades work
now (in both HA and non-HA scenarios), we can guarantee that:
  1. The (new) worker using APIV1 will only start after the upgrades
are done (on all state servers)
  2. Even if not all state servers have synchronized some time after the
upgrade, it's possible for the worker to connect to an apiserver which
is not yet fully upgraded to support APIV1. The worker tries to
connect, requests version 1, does not get it and terminates,
triggering a restart and hopefully connecting to another apiserver
which supports APIV1 (or keeps restarting until it does, but it should
be for a relatively short time).

So, once there's an upgrade step in place, I see no point in keeping
the older worker code (using APIV0 only) around. The only reason for
keeping the old code is if we happen to connect to an apiserver which
does not support APIV1 (therefore we can't use the new worker code
which only works with APIV1) and have to fall back to the old code.

If we trust our upgrade process works correctly (and file bugs when it
doesn't), chances are after the upgrade we WILL be able to connect to
an upgraded apiserver supporting V1. If not, we'll bounce the worker a
few times until it does.

I hope all that made sense and appreciate comments.

Cheers,
- -- 
Dimiter Naydenov dimiter.nayde...@canonical.com

juju-core team






Re: Unit Tests & Integration Tests

2014-09-12 Thread Michael Foord


On 12/09/14 06:05, Ian Booth wrote:


On 12/09/14 01:59, roger peppe wrote:

On 11 September 2014 16:29, Matthew Williams
matthew.willi...@canonical.com wrote:

Hi Folks,

There seems to be a general push in the direction of having more mocking in
unit tests. Obviously this is generally a good thing but there is still
value in having integration tests that test a number of packages together.
That's the subject of this mail - I'd like to start discussing how we want
to do this. Some ideas to get the ball rolling:

Personally, I don't believe this is obviously a good thing.
The less mocking, the better, in my view, because it gives
better assurance that the code will actually work in practice.

Mocking also implies that you know exactly what the
code is doing internally - this means that tests written
using mocking are less useful as regression tests, as
they will often need to be changed when the implementation
changes.


Let's assume that the term stub was meant to be used instead of mocking. Well
written unit tests do not involve dependencies outside of the code being tested,
and to achieve this, stubs are typically used. As others have stated already in
this thread, unit tests are meant to be fast. Our Juju unit tests are in many
cases not unit tests at all - they involve bringing up the whole stack,
including mongo in replicaset mode for goodness sake, all to test a single
component. This approach is flawed and goes against what would be considered as
best practice by most software engineers. I hope we can all agree on that point.


I agree. I tend to see the need for stubs (I dislike Martin Fowler's 
terminology and prefer the term mock - as it really is by common 
parlance just a mock object) as a failure of the code. Just sometimes a 
necessary failure.


Code, as you say, should be written as much as possible in decoupled 
units that can be tested in isolation. This is why test-first is 
helpful, because it makes you think about "how am I going to test this 
unit?" before you write it - and you're less likely to code in 
hard-to-test dependencies.


Where dependencies are impossible to avoid, typically at the boundaries 
of layers, stubs can be useful to isolate units - but the need for them 
often indicates excessive coupling.




To bring up but one of many concrete examples - we have a set of Juju CLI
commands which use a Juju client API layer to talk to an API service running on
the state server. We unit test Juju commands by starting a full state server
and ensuring the whole system behaves as expected, end to end. This is
expensive, slow, and unnecessary. What we should be doing here is stubbing out
the client API layer and validating that:
1. the command passes the correct parameters to the correct API call
2. the command responds the correct way when results are returned

Anything more than that is unnecessary and wasteful. Yes, we do need end-end
integration tests as well, but these are in addition to, not in place of, unit
tests. And integration tests tend to be fewer in number, and run less frequently
than, unit tests; the unit tests have already covered all the detailed
functionality and edge cases; the integration tests conform the moving pieces
mesh together as expected.

As per other recent threads to juju-dev, we have already started to introduce
infrastructure to allow us to start unit testing various Juju components the
correct way, starting with the commands, the API client layer, and the API
server layer. Hopefully we will also get to the point where we can unit test
core business logic like adding and placing machines, deploying units etc,
without having to have a state server and mongo. But that's a way off given we
first need to unpick the persistence logic from our business logic and address
cross pollination between our architectural layers.



+1

Being able to test business logic without having to start a state server 
and mongo will make our tests so much faster and more reliable. The 
more we can do this *without* stubs the better, but I'm sure that's not 
entirely possible.


All the best,

Michael



Simulating a slow disk with nbd

2014-09-04 Thread Michael Foord

Hey all,

I've been diagnosing some replicaset issues that particularly show up on 
systems with slow disks, particularly our CI infrastructure. To simulate 
a slow disk I've been using nbd (Network block device [1]), with 
trickle. This provides a remote mounted disk (actually local served 
over the loopback) that is rate limited. I then run the tests inside an 
lxc container with its filesystem on the rate limited disk.


Getting this working was tricky, so I thought I'd share the information. 
In particular I couldn't get nbd-server to read a configuration file, so 
I force it to skip loading the config file and specify the parameters at 
the command line. This is deprecated, but works fine.


The sequence of commands to create the nbd drive and create / start the 
lxc container are as follows:


sudo apt-get install trickle nbd-server nbd-client

# create a 10GB file to act as the disk.
dd if=/dev/zero of=/path/to/some/file bs=1024 count=10000000
# create a file system on it
mke2fs /path/to/some/file

# start nbd-server under trickle
trickle -d 2000 -u 2000 nbd-server -C   1234 /path/to/some/file

# start the client
sudo nbd-client localhost 1234 /dev/nbd0

# create mount point
sudo mkdir /mnt/nbd
# mount the nbd device
sudo mount /dev/nbd0 /mnt/nbd

# create the lxc container
sudo lxc-create -t ubuntu -n nbd --dir=/mnt/nbd

# start the container
sudo lxc-start --name nbd


The -d 2000 -u 2000 parameters to trickle tell it to rate limit access 
to nbd-server to 2000 kB/s. Adjust this appropriately for a 
faster/slower system.


Something worth noting is that when shutting down it matters what order 
you do things. If you shut down things in the wrong order you can end up 
with an lxc container that you can't restart or a device that you can't 
unmount / remount.


The right order to (un)do things is:

Shut down lxc container
Unmount device
Kill nbd-client / server
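
In command form that's roughly (matching the names used above):

sudo lxc-stop --name nbd
sudo umount /mnt/nbd
sudo nbd-client -d /dev/nbd0
sudo pkill nbd-server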

I hope this is helpful to anyone else who may want to simulate a system 
with slow i/o performance.


All the best,

Michael Foord

[1] http://nbd.sourceforge.net/



Re: Juju induction sprint summary

2014-07-15 Thread Michael Foord


On 14/07/14 09:43, Ian Booth wrote:

Hi all

So last week we had a Juju induction sprint for Tanzanite and Moonstone teams to
welcome Eric and Katherine to the Juju fold. Following is a summary of some key
outcomes from the sprint that are relevant to others working on Juju (we also
did other stuff not generally applicable for this email). Some items will
interest some folks, while others may not quite be so relevant to you, so scan
the topics to see what you find interesting.

* Architectural overview - and a cool new tool

The sprint started with an architectural overview of the Juju moving parts and
how they interacted to deploy and maintain a Juju environment. Katherine noted
that our in-tree documentation has lots of text and no diagrams. She pointed out
a great tool for easily putting together UML diagrams using a simple text based
syntax - Plant UML http://plantuml.sourceforge.net. Check it out, it's pretty
cool. We'll be adding a diagram or two to the in-tree docs to show how it works.

* Code review (replacement for Github's native code review)

We are going to use Review Board. When we first looked at it before the sprint,
a major show stopper was lack of an auth plugin which worked with Github. Eric
has stepped up and written the necessary plugin. We'll have something deployed
this week or early next week, once some more tooling to finish the Github
integration is done. The key features:
- Login with Github button on main login screen
- pull requests automatically imported to Review Board and added to review queue
- diffs can be uploaded to Review Board as WIP and submitted to Github when
finalised

* Fixing the Juju state.State mess

state is a mess of layering violations and intermingled concerns. The result is
slow and fragile unit tests, scalability issues, hard to understand code, code
which is difficult to extend and refactor (to name a few issues).

The correct layering should be something like:
* remote service interface (aka apiserver)
* juju services for managing machines, services, units etc
* juju domain model
* model persistence (aka state)

The persistence layer above is all that should be in the state package. The plan
is to incrementally extract Juju service business logic out of state and pull it
up into a services layer. The first to be done is the machine placement and
deployment logic. Wayne has a WIP branch for this. The benefit of this work
can't be overstated, and the sprint allowed both teams to be able to work
together to understand the direction and intent of the work.

* Mongo 2.6 support

The work to port Juju to Mongo 2.6 is pretty much complete. The newer Mongo
version offers a number of bug fixes and  improvements over the 2.4 series, and
we need to be able to run with an up-to-date version.

* Providers don't need to have a storage implementation (almost)

A significant chunk of old code which was to support agents connecting directly
to mongo was removed (along with the necessary refactoring). This then allowed
the Environ interface to drop the StateInfo() method and instead implement a
method which returns the state server instances (not committed yet but close).
The next step is to remove the Storage() interface from Environ and make storage
an internal implementation detail which is not mandatory, so long as providers
have a way to figure out their state servers (this can be done using tagging for
example).

* Juju 1.20.1 release (aka juju/mongo issues)

A number of issues with how Juju and mongo interact became apparent when
replicasets were used for HA. Unfortunately Juju 1.20 shipped with these issues
unfixed. Part of the sprint was spent working on some urgent fixes to ship a bug
fix 1.20.1 release. There's still an outstanding mongo session issue that needs
to be fixed this week for a 1.20.2 release. Michael is working on it. The tl;dr
is that we are holding onto sessions and not refreshing, which means that the
underlying socket can time out and Juju loses connection to mongo.


The specific bug here is to do with i/o timeout errors:

https://bugs.launchpad.net/juju-core/+bug/1307434

It looks very likely that the cause of this is due to session timeout / 
connection problems caused by us using a single global session for all 
communication with mongo.


This global session permeates everywhere. Everything that has an 
mgo.Collection holds an indirect reference to this session and uses it. 
This includes watchers, state.State and the transaction runner we use 
for executing transactions.


mgo has socket pooling built into it, but to use that we need to be 
copying (and closing) sessions rather than executing queries off mgo 
collections or using the session directly.


Unpicking this is a fair amount of work. The current issue I have is 
that when I copy the session I immediately get auth errors. Caused (I 
assume) by us changing the connection credentials after we create the 
master session. So whenever we change credentials we also need to 

Re: critical regression blocks 1.19.3 release

2014-05-29 Thread Michael Foord


On 28/05/14 21:54, Curtis Hovey-Canonical wrote:

I don't think we can do a  release this week, or if we do, the release
will be from a version that works from last week.

The current blocker is
 lp:juju-core r2803 or lp:juju-core r2804 broke local precise
 deployments. Juju cannot bootstrap.
 https://bugs.launchpad.net/juju-core/+bug/1324255

CI has stumbled from one regression to the next this week. Each fix is
followed by another feature that breaks CI. Since we intend to switch
to github in a matter of hours, CI has little time to bless a recent
revision for release.

As I have been doing analysis of the problems, I have not had time to
update CI to build tarballs from git. I will now work on that since I
don't believe there is anything I can do to convince CI that juju
trunk is releasable.


It's highly likely that this is caused by the change to have local 
provider use replica sets. I'll set up a precise VM and see if I can 
confirm. If so we should back out that change. *sigh* (and sorry if it 
is indeed the case).


Michael









Re: critical regression blocks 1.19.3 release

2014-05-29 Thread Michael Foord


On 28/05/14 21:54, Curtis Hovey-Canonical wrote:

I don't think we can do a  release this week, or if we do, the release
will be from a version that works from last week.

The current blocker is
 lp:juju-core r2803 or lp:juju-core r2804 broke local precise
 deployments. Juju cannot bootstrap.
 https://bugs.launchpad.net/juju-core/+bug/1324255

CI has stumbled from one regression to the next this week. Each fix is
followed by another feature that breaks CI. Since we intend to switch
to github in a matter of hours, CI has little time to bless a recent
revision for release.

As I have been doing analysis of the problems, I have not had time to
update CI to build tarballs from git. I will now work on that since I
don't believe there is anything I can do to convince CI that juju
trunk is releasable.



Hmmm... both Andrew Wilkins and I have tried trunk with precise and had 
it bootstrap fine. I'll dig in and see if I can work out what is causing 
the CI failures.


Michael



Replica sets re-enabled for local provider

2014-05-28 Thread Michael Foord

Hey all,

When we switched to replica sets we explicitly disabled them for the 
local provider because it caused mongo to fail to start for some 
people. It turns out that along the path of enabling write-majority 
for mongo it would be convenient if the local provider was less of a 
special snowflake and also used replica sets. It also turns out that the 
problem with mongo not starting *appears* to have gone away.


We've now re-enabled the use of replica sets for the local provider [1]. 
Please try it out, especially if you had difficulties with this before, 
and report back if there are any problems.


All the best,

Michael Foord

[1] https://code.launchpad.net/~mfoord/juju-core/local-replset/+merge/221117
