Re: PROPOSAL: stop recording 'executing update-status hook'

2017-05-22 Thread Stuart Bishop
On 22 May 2017 at 14:36, Tim Penhey <tim.pen...@canonical.com> wrote:
> On 20/05/17 19:48, Merlijn Sebrechts wrote:
>>
>> On May 20, 2017 09:05, "John Meinel" <j...@arbash-meinel.com> wrote:
>>
>> I would actually prefer if it shows up in 'juju status' but that we
>> suppress it from 'juju status-log' by default.
>>
>>
>> This is still very strange behavior. Why should this be default? Just pipe
>> the output of juju status through grep and exclude update-status if that is
>> really what you want.
>>
>> However, I would even argue that this isn't what you want in most
>> use-cases.  "update-status" isn't seen as a special hook in charms.reactive.
>> Anything can happen in that hook if the conditions are right. Ignoring
>> update-status will have unforeseen consequences...
>
>
> Hmm... there are (at least) two problems here.
>
> Firstly, update-status *should* be a special case hook, and it shouldn't
> take long.
>
> The purpose of the update-status hook was to provide a regular beat for the
> charm to report on the workload status. Really it shouldn't be doing other
> things.
>
> The fact that it is a periodic execution rather than being executed in
> response to model changes is the reason it isn't fitting so well into the
> regular status and status history updates.
>
> The changes to the workload status would still be shown in the history of
> the workload status, and the workload status is shown in the status output.
>
> One way to limit the execution of the update-status hook call would be to
> put a hard timeout on it enforced by the agent.
>
> Thoughts?

Unfortunately update-status got wired into charms.reactive like all
the other standard hooks, and just means 'do whatever still needs to
be done'. I think it's too late to add timeouts or restrictions. But I
do think special casing it in the status history is needed. Anything
important will still end up in there due to workload status changes.
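For context, the update-status hook in a typical reactive charm is nothing
more than a re-entry point into the dispatcher, which is why it ends up
meaning 'do whatever still needs to be done'. A minimal sketch (the real
generated hook stubs also adjust sys.path):

#!/usr/bin/env python3
# hooks/update-status in a reactive charm: re-run the reactive bus, so any
# @when handler whose preconditions are met fires, not just status reporting.
from charms.reactive import main
main()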

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Juju Leader Election and Application-Specific Leadership

2017-04-06 Thread Stuart Bishop
On 6 April 2017 at 00:26, Dmitrii Shcherbakov
<dmitrii.shcherba...@canonical.com> wrote:

> https://jujucharms.com/docs/2.1/reference-charm-hooks#leader-elected
> "leader-elected is run at least once to signify that Juju decided this
> unit is the leader. Authors can use this hook to take action if their
> protocols for leadership, consensus, raft, or quorum require one unit
> to assert leadership. If the election process is done internally to
> the service, other code should be used to signal the leader to Juju.
> For more information read the charm leadership document."
>
> This doc says
> "If the election process is done internally to the service, other code
> should be used to signal the leader to Juju.".
>
> However, I don't see any hook tools to assert leadership to Juju from
> a charm based upon application-specific leadership information
> http://paste.ubuntu.com/24319908/
>
> So, as far as I understand, there is no manual way to designate a
> leader and the doc is wrong.
>
> Does anyone know if it is supposed to be that way and if this has not
> been implemented for a reason?

I agree with your reading, and think the documentation is wrong.  If
the election process is done internally to the service, there is no
way (and no need) to signal the internal 'leader' to Juju.

I also put 'leader' in quotes because if your service maintains its
own master, you should not call it 'leader' to avoid confusion with
the Juju leader.

For example, the lead unit in a PostgreSQL service appoints one of the
units as master. The master remains the master until the operator runs
the 'switchover' action on the lead unit, or the master unit is
destroyed, causing the lead unit to start the failover process. At no
point does Juju care which unit is 'master'. It's communicated to the
end user using the workload status. It's simple enough to do and works
well.
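A rough sketch of the pattern, heavily simplified from the real charm (the
function names here are invented for illustration):

from charmhelpers.core import hookenv

def appoint_master():
    # Only the Juju leader decides; the decision is shared via leadership
    # settings rather than any Juju notion of 'master'.
    if hookenv.is_leader() and not hookenv.leader_get('master'):
        hookenv.leader_set(master=hookenv.local_unit())

def report_role():
    # The operator learns which unit is master from the workload status.
    if hookenv.leader_get('master') == hookenv.local_unit():
        hookenv.status_set('active', 'Live master')
    else:
        hookenv.status_set(
            'active', 'Live standby of {}'.format(hookenv.leader_get('master')))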


-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Juju 2.1.0, and Conjure-up, are here!

2017-02-23 Thread Stuart Bishop
On 23 February 2017 at 23:20, Simon Davy <simon.d...@canonical.com> wrote:

> One thing that seems to have landed in 2.1, which is worth noting IMO, is
> the local juju lxd image aliases.
>
> tl;dr: juju 2.1 now looks for the lxd image alias juju/$series/$arch in the
> local lxd server, and uses that if it finds it.
>
> This is amazing. I can now build a local nightly image[1] that pre-installs
> and pre-downloads a whole set of packages[2], and my local lxd units don't
> have to install them when they spin up. Between layer-basic and Canonical
> IS' basenode, for us that's about 111 packages that I don't need to install
> on every machine in my 10 node bundle. Took my install hook times from 5min+
> each to <1min, and probably halves my initial deploy time, on average.

Ooh, thanks for highlighting this! I've needed this feature for a long
time for exactly the same reasons.


> [2] my current nightly cron:
> https://gist.github.com/bloodearnest/3474741411c4fdd6c2bb64d08dc75040

/me starts stealing

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: lxd and constraints

2017-01-13 Thread Stuart Bishop
On 13 January 2017 at 02:20, Nate Finch <nate.fi...@canonical.com> wrote:

> I'm implementing constraints for lxd containers and provider... and
> stumbled on an impedance mismatch that I don't know how to handle.
>


> I'm not really sure how to resolve this problem.  Maybe it's not a
> problem.  Maybe constraints just have a different meaning for containers?
> You have to specify the machine number you're deploying to for any
> deployment past the first anyway, so you're already manually choosing the
> machine, at which point, constraints don't really make sense anyway.
>

I don't think Juju can handle this cleanly. Either constraints have different
meanings with different cloud providers, or the lxd provider needs to treat
constraints as minimums (along with any other providers that behave this way).

If you decide constraints need to consistently mean minimum, then I'd argue
it is best to not pass them to current-gen lxd at all. Enforcing that
containers are restricted to the minimum viable resources declared in a
bundle does not seem helpful, and Juju does not have enough information to
choose suitable maximums (and if it did, would not know if they would
remain suitable tomorrow).

-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: Opaque automatic hook retries from API

2017-01-06 Thread Stuart Bishop
On 6 January 2017 at 01:39, Casey Marshall <casey.marsh...@canonical.com>
wrote:

> On Thu, Jan 5, 2017 at 3:33 AM, Adam Collard <adam.coll...@canonical.com>
> wrote:
>
>> Hi,
>>
>> The automatic hook retries[0] that landed as part of 2.0 (are documented
>> as) run indefinitely[1] - this causes problems as an API user:
>>
>> Imagine you are driving Juju using the API, and when you perform an
>> operation (e.g. set the configuration of a service, or reboot the unit, or
>> add a relation..) - you want to show the status of that operation.
>>
>> Prior to the automatic retries, you simply perform your operation, and
>> watch the delta streams for the corresponding change to the unit - the
>> success or otherwise of the operation is reflected in the unit
>> agent-status/workload-status pair.
>>
>> Now, with retries, if you see a unit in the error state, you can't
>> accurately reflect the status of the operation, since the unit will
>> undoubtedly retry the hook again. Maybe it succeeds, maybe it fails again.
>> How can one say after receiving the first delta of a unit error if the
>> operation succeeded or failed?
>>
>> With no visibility up front on the retry strategy that Juju will perform
>> (e.g. something representing the exponential backoff and a fixed number of
>> retries before Juju admits defeat) it is impossible to say at any point in
>> the delta stream what the result of a failed-at-least-once operation is.
>>
>
> I think the retry strategy is great -- it leverages the immutability we
> expect hooks to provide, to deliver a robust result over unreliable
> substrates -- and all substrates are unreliable where there's
> internetworking involved!
>
> However I see your point about the retry strategy muddling status. I've
> noticed this sometimes when watching openstack or k8s bundles "shake out"
> the errors as they come up. I don't think this is always a charm quality
> issue, it's maybe because we're trying to show two different things with
> status?
>

Errors being 'shaken out' are almost always unhandled race conditions. I
find destroy-service/remove-application particularly problematic,
because the doomed units don't know they are being destroyed but rather are
informed about departing one relation at a time (which is inherently racy,
because the units the doomed service is related to will process their
relation-departed hooks almost immediately and stop talking to the doomed
service, while the doomed service still thinks it can access their
resources as it falls apart one piece at a time).

I'm becoming more and more a believer that we can't reasonably avoid these
errors, and instead should assume that they will happen and treat them as
perfectly normal. We can stick to writing nice idempotent handlers, which
become simpler because failures can just be allowed to bubble up. We can use
simpler protocols (e.g. removing all the handshaking the PostgreSQL
interface does to try to avoid races with authorization). And, going back to
Adam's point, hooks could be retried a few times with some sort of backoff
before even being reported as a failure to the end user. One of the reasons
test suites are currently flaky is that there are race conditions we have no
reasonable way of solving, such as a database restarting itself while a hook
on another unit is attempting to use it. Even though I currently bootstrap
test environments with the retry behaviour off, I'm thinking of changing that.


What if Juju made a clearer distinction between result-state ("what I'm
> doing most recently or last attempted to do") vs. goal-state ("what I'm
> trying to get done") in the status? Would that help?
>

Isn't the goal state just the failed hook? I would certainly like to see
the list of hooks queued to run on each unit though if that is what you
mean (not in the default tabular status, but in the json status dump).



>> Can retries be limited to a small number, with a backoff algorithm
>> explicitly documented and stuck to by Juju, with the retry attempt number
>> included in the delta stream?
>>
>
This sounds like a good idea. The limit could even be dynamic, with a retry
attempted every time a unit it is related to successfully runs a hook,
until the environment is quiescent.



-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: A (Very) Minimal Charm

2016-12-16 Thread Stuart Bishop
On 16 December 2016 at 22:33, Katherine Cox-Buday <
katherine.cox-bu...@canonical.com> wrote:

> Tim Penhey <tim.pen...@canonical.com> writes:
>
> > Make sure you also run on LXD with a decent delay to the APT archive.
>
> Open question: is there any reason we shouldn't expect charm authors to
> take a hard-right towards charms with snaps embedded as resources? I know
> one of our long-standing conceptual problems is consistency across units
> which snaps solves nicely.
>

https://github.com/stub42/layer-snap is how I'm expecting things to go.
There is already one charm in the ~charmers review queue using it and I'm
aware of several more in various stages of development.

More work is needed though. In particular, Juju storage is inaccessible to
snaps, because there is no way to reach it from inside the containment.

(But none of this is a reason to not optimize Juju unit provisioning times,
since we will still need an environment setup capable of running the charms
so they can install the snaps for some time yet).

-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: Leadership Election Tools

2016-12-14 Thread Stuart Bishop
On 14 December 2016 at 00:39, Matthew Williams <
matthew.willi...@canonical.com> wrote:

> Hey Folks,
>
> Let's say I'm a charm author that wants to test leadership election in my
> charm. Are there any tools available that will let me force leadership
> election in juju so that I can test how my charm handles it? I was looking
> at the docs here: https://jujucharms.com/docs/stable/developer-leadership
> but couldn't see anything
>

I don't think there is any supported way of doing this.

If you don't mind an unsupported hack though, use 'juju ssh' to shut down
the unit's jujud, wait 30 seconds for the lease to expire, and you should
have a new leader. 'juju ssh' again to restart the jujud, 'juju wait' for
the hooks to clear, and failover is done. 'juju run' will hang if you use
it to shut down jujud, so don't do that.

juju ssh ubuntu/0 'sudo systemctl stop jujud-unit-ubuntu-0.service'
sleep 30
juju ssh ubuntu/0 'sudo systemctl start jujud-unit-ubuntu-0.service'
juju wait

Ideally, you may be able to structure things so that it doesn't matter
which unit is leader. If all state relating to leadership decisions is
stored in the leadership settings, and if you avoid using @hook, then it
doesn't matter which unit makes the decisions. Worst case is that *no* unit
is leader when hooks are run, and decisions get deferred until
leader-elected runs.

(Interesting race condition for the day: It is possible for all units in a
service to run their upgrade-charm hook and for none of them to be leader
at the time, so @hook('upgrade-charm') code guarded by is-leader may never
run. And reactive handlers have no concept of priority and might kick in
rather late for upgrade steps, requiring more creative use of reactive
states to guard 'new' code from running too soon. Not specific to
upgrade-charm hooks either, so avoid using @hook and leadership together)
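What I mean by guarding with reactive states instead of @hook is roughly
this (a sketch only; it assumes the leadership layer's 'leadership.is_leader'
state, and 'mycharm.upgrade-pending' is an invented state name):

from charms.reactive import hook, when, set_state, remove_state

@hook('upgrade-charm')
def note_upgrade():
    # Only record that the upgrade happened; do no leader-only work here.
    set_state('mycharm.upgrade-pending')

@when('mycharm.upgrade-pending', 'leadership.is_leader')
def leader_upgrade_steps():
    # Runs once a leader exists, even if that is well after upgrade-charm ran.
    # ... leader-only upgrade steps go here ...
    remove_state('mycharm.upgrade-pending')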


-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: A (Very) Minimal Charm

2016-12-01 Thread Stuart Bishop
On 1 December 2016 at 19:53, Marco Ceppi <marco.ce...@canonical.com> wrote:

> On Thu, Dec 1, 2016 at 5:00 AM Adam Collard <adam.coll...@canonical.com>
> wrote:
>
>> On Thu, 1 Dec 2016 at 04:02 Nate Finch <nate.fi...@canonical.com> wrote:
>>
>> On IRC, someone was lamenting the fact that the Ubuntu charm takes longer
>> to deploy now, because it has been updated to exercise more of Juju's
>> features.  My response was - just make a minimal charm, it's easy.  And
>> then of course, I had to figure out how minimal you can get.  Here it is:
>>
>> It's just a directory with a metadata.yaml in it with these contents:
>>
>> name: min
>> summary: nope
>> description: nope
>> series:
>>   - xenial
>>
>> (obviously you can set the series to whatever you want)
>> No other files or directories are needed.
>>
>>
>> This is neat, but doesn't detract from the bloat in the ubuntu charm.
>>
>
> I'm happy to work through changes to the Ubuntu charm to decrease "bloat".
>
>
>> IMHO the bloat in the ubuntu charm isn't from support for Juju features,
>> but the switch to reactive plus conflicts in layer-base wanting to a)
>> support lots of toolchains to allow layers above it to be slimmer and b) be
>> a suitable base for "just deploy me" ubuntu.
>>
>
> But it is to support the reactive framework, where we utilize newer Juju
> features, like status and application-version to make the charm rich
> despite its minimal goal set. Honestly, a handful of cached wheelhouses
> and some apt packages don't strike me as bloat, but I do want to make sure
> the Ubuntu charm works for those using it. So,
>
> What's the real problem with the Ubuntu charm today?
> How does it not achieve its goal of providing a relatively blank Ubuntu
> machine? What are people using the Ubuntu charm for?
>
> Other than demos, hacks/workarounds, and testing, I'm not clear on the
> purpose an Ubuntu charm in a model serves.
>

The cs:ubuntu charm gets used in production to attach subordinates to. For
example, we install cs:ubuntu onto our controller nodes so we can install
subordinates like cs:ntp, cs:nrpe, cs:~telegraf-charmers/telegraf and
others. It's also used in test suites for these sorts of subordinates.

The 'problem' is that, like all reactive charms, the first thing it does is
pull down approximately 160MB of packages and install them (installing pip
pulls in build-essential, or at least a big chunk of it). It's very
noticeable when working locally, and maybe in CI environments.

If I knew how to solve this for all reactive charms, I would have suggested
it already. It could be fixed in cs:ubuntu by making it non-reactive, if
people think it is worth it (it's not like it actually needs any reactive
features; a minimal metadata.yaml and an install or start hook to set the
status is all it needs).
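That non-reactive version really could be this small (a sketch; status-set
is the only hook tool it needs):

#!/usr/bin/env python3
# hooks/start for a hypothetical non-reactive cs:ubuntu: nothing to install,
# just report that the machine is ready for subordinates.
import subprocess
subprocess.check_call(['status-set', 'active', 'ready'])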

Maybe reactive is entrenched enough as the new world order that we can get
specific cloud images spun for it, where a pile of packages are
preinstalled so we don't need to wait for cloud-init or the charm to
install them. We might be able to lower deployment times from minutes to
seconds, since often this step is the main time sink.

-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: List plugins installed?

2016-09-30 Thread Stuart Bishop
On 30 September 2016 at 04:47, Nate Finch <nate.fi...@canonical.com> wrote:

> Seem alike the easiest thing to do is have a designated plugin directory
> and have juju install  copy the binary/script there.  Then
> we're only running plugins the user has specifically asked to install.
>

This does not work if the plugin has dependencies, such as the Python
standard library or external tools such as git or graphviz. Nothing running
inside the snap containment can access stuff outside of the containment.

I think this needs a more complex solution designed with the snappy team. As
far as I can tell it's either going to need a small daemon
running outside of containment and a way of passing messages to it (such as
how a snap can open a web page in a browser running outside of
containment), or having plugins distributed as snaps and somehow allowing
the juju snap to call executables in these plugin snaps.

(which is going to take time, so I guess we need to keep the existing
mechanism going a while longer and the snap in devmode)

-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: Juju and snappy implementation spike - feedback please

2016-08-09 Thread Stuart Bishop
On 9 August 2016 at 19:08, Ian Booth <ian.bo...@canonical.com> wrote:

> I personally like the idea that the snap could use a juju-home interface to
> allow access to the standard ~/.local/share/juju directory; thus allowing
> a snap
> and regular Juju to be used interchangeably (at least initially). This will
> allow thw use case "hey, try my juju snap and you can use your existing
> settings" But, isn't it verboten for snaps to access dot directories in
> user
> home in any way, regardless of what any interface says? We could provide an
> import tool to copy from ~/.local/share/juju to ~/snap/blah...
>
> But in the other case, using a personal snap and sharing settings with the
> official Juju snap - do we know what the official snappy story is around
> this
> scenario? I can't imagine this is the first time it's come up?
>


The big difference to me is that $SNAP_USER_DATA will roll back if the snap
is rolled back. I'm not sure what happens if the snap is removed and
reinstalled. Given end users should no longer need to be messing around
with the dotfiles, I think the rollback behaviour is what should drive your
decision. Is it nice behaviour? Or will it mess things up because rollback
will cause things to get out of sync with the deployments?


-- 
Stuart Bishop <stuart.bis...@canonical.com>


Re: Quick win - juju check

2016-05-26 Thread Stuart Bishop
On 24 May 2016 at 11:14, Tim Penhey <tim.pen...@canonical.com> wrote:

> We talked quite a bit in Vancouver about quick wins. Things we could get
> into Juju that are simple to write that add quick value.

For trivial, quick wins consider:

'juju do --wait', from
https://bugs.launchpad.net/juju-core/+bug/1445066 (hey, you filed that
bug).

Adding a common option for the *-set and other hook environment tools
to get their data from stdin, rather than the command line, from
https://bugs.launchpad.net/juju-core/+bug/1274460
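For illustration, what the stdin option buys you (a sketch; it assumes
'relation-set --file -' reads YAML from stdin, and leans on JSON being a
subset of YAML):

import json
import subprocess

def relation_set_via_stdin(settings, relation_id=None):
    # Pass settings on stdin instead of the command line, avoiding both
    # argument length limits and leaking values via process listings.
    cmd = ['relation-set']
    if relation_id is not None:
        cmd += ['-r', relation_id]
    cmd += ['--file', '-']
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    proc.communicate(json.dumps(settings).encode())
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)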

My favourite is as always 'juju wait', but that might not turn out to
be trivial.

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Planning for Juju 2.2 (16.10 timeframe)

2016-04-28 Thread Stuart Bishop
On 9 March 2016 at 06:51, Mark Shuttleworth <m...@ubuntu.com> wrote:
> Hi folks
>
> We're starting to think about the next development cycle, and gathering
> priorities and requests from users of Juju. I'm writing to outline some
> current topics and also to invite requests or thoughts on relative
> priorities - feel free to reply on-list or to me privately.

Another item I'd like to see is distribution upgrades. We now have a
lot of systems deployed with Trusty that will need to be upgraded to
Xenial not too far in the future. For many services you would just
bring up a new service with a new name and cut over, but this is
impractical for other services such as database shards deployed on
MaaS-provisioned hardware. Handling upgrades may be as simple as
allowing operators (or a charm action) to perform the necessary
dist-upgrade one unit at a time and having the controller notice and
cope when the unit's jujud is bounced. Not all units would be running
the same distribution release at the same time, and I'm assuming the
service is running a multi-series charm here that supports both
releases (so we don't need to worry about how to handle upgrade-charm
hooks, at least for now)

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: New juju in ubuntu

2016-04-07 Thread Stuart Bishop
On 7 April 2016 at 16:46, roger peppe <roger.pe...@canonical.com> wrote:
> On 7 April 2016 at 10:17, Stuart Bishop <stuart.bis...@canonical.com> wrote:
>> On 7 April 2016 at 16:03, roger peppe <roger.pe...@canonical.com> wrote:
>>> On 7 April 2016 at 09:38, Tim Penhey <tim.pen...@canonical.com> wrote:
>>>> We could probably set an environment variable for the plugin called
>>>> JUJU_BIN that is the juju that invoked it.
>>>>
>>>> Wouldn't be too hard.
>>>
>>> How does that stop old plugins failing because the new juju is trying
>>> to use them?
>>>
>>> An alternative possibility: name all new plugins with the prefix "juju2-" 
>>> rather
>>> than "juju".
>>
>> I've opened https://bugs.launchpad.net/juju-core/+bug/1567296 to track this.
>>
>> Prepending the $PATH is not hard either - just override the
>> environment in the exec() call.
>>
>> The nicest approach may be to not use 'juju1', 'juju2' and 'juju' but
>> instead just 'juju'. It would be a thin wrapper that sets the $PATH
>> and invokes the correct binary based on some configuration such as an
>> environment variable. This would fix plugins, and lots of other stuff
>> that are about to break too such as deployment scripts, test suites
>> etc.
>
> There are actually two problems here. One is the fact that plugins
> use the Juju binary. For that, setting the PATH might well be the right thing.
>
> But there's also a problem with other plugins that use the Juju API
> directly (they might be written in Go, for example) and therefore
> implicitly assume the that they're talking to a juju 1 or juju 2 environment.
> Since local configuration files have changed and the API has changed, it's
> important that a plugin written for Go 1 won't be invoked by a juju 2
> binary.

If juju 2.x changed the plugin prefix from juju- to juju2-, that would
also solve the issue of juju 2.x specific plugins showing up in juju
1.x's command line help and vice versa.

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: New juju in ubuntu

2016-04-07 Thread Stuart Bishop
On 7 April 2016 at 16:03, roger peppe <roger.pe...@canonical.com> wrote:
> On 7 April 2016 at 09:38, Tim Penhey <tim.pen...@canonical.com> wrote:
>> We could probably set an environment variable for the plugin called
>> JUJU_BIN that is the juju that invoked it.
>>
>> Wouldn't be too hard.
>
> How does that stop old plugins failing because the new juju is trying
> to use them?
>
> An alternative possibility: name all new plugins with the prefix "juju2-" 
> rather
> than "juju".

I've opened https://bugs.launchpad.net/juju-core/+bug/1567296 to track this.

Prepending the $PATH is not hard either - just override the
environment in the exec() call.

The nicest approach may be to not use 'juju1', 'juju2' and 'juju' but
instead just 'juju'. It would be a thin wrapper that sets the $PATH
and invokes the correct binary based on some configuration such as an
environment variable. This would fix plugins, and lots of other stuff
that are about to break too such as deployment scripts, test suites
etc.
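A sketch of that wrapper (the selector variable name and the 2.0 binary
location are invented; /var/lib/juju-1.25/bin is the path mentioned earlier
in this thread):

#!/usr/bin/env python3
# Hypothetical 'juju' wrapper: pick a version via an environment variable,
# put its bin directory first on $PATH so plugins find the same binary,
# then exec the real client.
import os
import sys

want = os.environ.get('JUJU_DEFAULT_VERSION', '2')
bindir = '/var/lib/juju-1.25/bin' if want == '1' else '/usr/lib/juju-2.0/bin'
os.environ['PATH'] = bindir + os.pathsep + os.environ.get('PATH', '')
real = os.path.join(bindir, 'juju')
os.execv(real, [real] + sys.argv[1:])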

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: New juju in ubuntu

2016-04-06 Thread Stuart Bishop
On 7 April 2016 at 03:55, Marco Ceppi <marco.ce...@canonical.com> wrote:
>
> On Wed, Apr 6, 2016 at 10:07 AM Stuart Bishop <stuart.bis...@canonical.com>
> wrote:
>>
>> On 5 April 2016 at 23:35, Martin Packman <martin.pack...@canonical.com>
>> wrote:
>>
>> > The challenge here is we want Juju 2.0 and all the new functionality
>> > to be the default on release, but not break our existing users who
>> > have working Juju 1.X environments and no deployment upgrade path yet.
>> > So, versions 1 and 2 have to be co-installable, and when upgrading to
>> > xenial users should get the new version without their existing working
>> > juju being removed.
>> >
>> > There are several ways to accomplish that, but based on feedback from
>> > the release team, we switched from using update-alternatives to having
>> > 'juju' on xenial always be 2.0, and exposing the 1.X client via a
>> > 'juju-1' binary wrapper. Existing scripts can either be changed to use
>> > the new name, or add the version-specific binaries directory
>> > '/var/lib/juju-1.25/bin' to the path.
>>
>> How do our plugins know what version of juju is in play? Can they
>> assume that the 'juju' binary found on the path is the juju that
>> invoked the plugin, or is there some other way to tell using
>> environment variables or such? Or will all the juju plugins just fail
>> if they are invoked from the non-default juju version?
>
>
> You can invoke `juju version` from within the plugin and parse the output.
> That's what I've been doing when I need to distinguish functionality.

That seems fine if you are invoking the plugin from the default
unnumbered 'juju'. But running 'juju2 wait' will mean that juju-wait
will be executing juju 1.x commands and fail. And conversely running
'juju1 wait' will invoke juju 2.x and probably fail.

I think the plugin API needs to be extended to allow multiple juju
versions to coexist. An environment variable would do the trick, but
would require every plugin to be fixed. Altering $PATH so 'juju' runs
the correct juju would allow existing plugins to keep working (the bulk
of them will run unmodified with both juju1 and juju2, since the CLI is
similar enough).

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: New juju in ubuntu

2016-04-06 Thread Stuart Bishop
On 5 April 2016 at 23:35, Martin Packman <martin.pack...@canonical.com> wrote:

> The challenge here is we want Juju 2.0 and all the new functionality
> to be the default on release, but not break our existing users who
> have working Juju 1.X environments and no deployment upgrade path yet.
> So, versions 1 and 2 have to be co-installable, and when upgrading to
> xenial users should get the new version without their existing working
> juju being removed.
>
> There are several ways to accomplish that, but based on feedback from
> the release team, we switched from using update-alternatives to having
> 'juju' on xenial always be 2.0, and exposing the 1.X client via a
> 'juju-1' binary wrapper. Existing scripts can either be changed to use
> the new name, or add the version-specific binaries directory
> '/var/lib/juju-1.25/bin' to the path.

How do our plugins know what version of juju is in play? Can they
assume that the 'juju' binary found on the path is the juju that
invoked the plugin, or is there some other way to tell using
environment variables or such? Or will all the juju plugins just fail
if they are invoked from the non-default juju version?

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Planning for Juju 2.2 (16.10 timeframe)

2016-04-02 Thread Stuart Bishop
On 1 April 2016 at 20:50, Mark Shuttleworth <m...@ubuntu.com> wrote:
> On 19/03/16 01:02, Stuart Bishop wrote:
>> On 9 March 2016 at 10:51, Mark Shuttleworth <m...@ubuntu.com> wrote:
>>
>>> Operational concerns
>> I still want 'juju-wait' as a supported, builtin command rather than
>> as a fragile plugin I maintain and as code embedded in Amulet that the
>> ecosystem team maintain. A thoughtless change to Juju's status
>> reporting would break all our CI systems.
>
> Hmm.. I would have thought that would be a lot more reasonable now we
> have status well in hand. However, the charms need to support status for
> it to be meaningful to the average operator, and we haven't yet made
> good status support a requirement for charm promulgation in the store.
>
> I'll put this on the list to discuss.


It is easier with Juju 1.24+. You check the status. If all units are
idle, you wait about 15 seconds and check again. If all units are
still idle and the timestamps haven't changed, the environment is
probably idle. And for some (all?) versions of Juju, you also need to
ssh into the units and ensure that one of the units in each service
thinks it is the leader as it can take some time for a new leader to
be elected.

Which means 'juju wait' as a plugin takes quite a while to run and
only gives a probable result, whereas if this information about the
environment were exposed it could be instantaneous and correct.
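The polling described above boils down to something like this (a sketch of
the idea, not the actual plugin; it assumes the Juju 2.x JSON status layout
and skips the leadership check):

import json
import subprocess
import time

def agent_states():
    # Map unit name -> (agent state, since timestamp).
    status = json.loads(subprocess.check_output(
        ['juju', 'status', '--format=json']).decode())
    states = {}
    for app in status.get('applications', {}).values():
        for name, unit in (app.get('units') or {}).items():
            agent = unit.get('juju-status', {})
            states[name] = (agent.get('current'), agent.get('since'))
    return states

def wait_until_probably_idle():
    while True:
        first = agent_states()
        if first and all(state == 'idle' for state, _ in first.values()):
            time.sleep(15)
            if agent_states() == first:   # still idle, timestamps unchanged
                return
        else:
            time.sleep(5)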

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Planning for Juju 2.2 (16.10 timeframe)

2016-03-18 Thread Stuart Bishop
sk space, but means you could
migrate a 10 unit Cassandra cluster to a new 5 unit Cassandra cluster.
(the charm doesn't actually do this yet, this is just speculation on
how it could be done). I imagine other services such as OpenStack
Swift would be in the same boat.

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Units & resources: are units homogeneous?

2016-02-16 Thread Stuart Bishop
On 17 February 2016 at 01:20, Katherine Cox-Buday
<katherine.cox-bu...@canonical.com> wrote:

> My understanding is that it's a goal to make the management of units more
> consistent, and making the units more homogeneous would support this, but
> I'm wondering from a workload perspective if this is also true? One example
> I could think of to support the discussion is a unit being elected leader
> and thus taking a different path through it's workflow than the other units.
> When it comes to resources, maybe this means it pulls a different sub-set of
> the declared resources, or maybe doesn't pull resources at all (e.g. it's
> coordinating the rest of the units or something).

While I have charms where units have distinct roles (one master,
multiple standbys, and the juju leader making decisions), they can be
treated as homogeneous since they need to be able to fail over from
one role to another. The only use case I can think of where different
resources might be pulled down on different units is deploying a new
service with data restored from a backup. The master would be the only
unit to pull down this resource (the backup) on deployment, and the
standbys would replicate it from the master.

And now I think of it, can I stream resources? I don't want to
provision a machine with 8TB of storage just so I can restore a 4TB
dump. Maybe this is just a terrible example, since I probably couldn't
be bothered uploading the 4TB dump in the first place, and would
instead setup tunnels and pipes to stream it into a 'juju run'
command. An abuse of Juju resources better suited to Juju blob
storage?

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Automatic retries of hooks

2016-01-20 Thread Stuart Bishop
On 20 January 2016 at 17:46, William Reade <william.re...@canonical.com> wrote:

> On Wed, Jan 20, 2016 at 8:46 AM, Stuart Bishop <stuart.bis...@canonical.com>
> wrote:

>> It happens naturally if you structure your charm to have a single hook
>> that does everything that needs to be done, rather than trying to
>> craft individual hooks to deal with specific events.
>
> Independent of everything else, *this* should *excellent* advice for
> speeding up your deployments. Have you already been writing charms like
> this? I'd love to hear your experiences; and, in particular, if you've
> noticed any improvement in deployment speed. The theoretically achievable
> speedup is vast, but the hook runner wasn't written with this approach in
> mind; we might need to make a couple of small tweaks [0] to get the best out
> of the approach.

The PostgreSQL charm has now existed in three forms: traditional,
services framework, and now the reactive framework. Using the services
framework, deployment speed was slower than traditional. You ended up
with one very long string of steps, many of which were unnecessary. I
felt it was easier to maintain and understand, but the logs were noisier
and it was slower. The reactive framework is much faster deployment-wise
than all other versions, as you can easily have only the necessary steps
triggered for the current state. The execution thread is harder to
follow, since there isn't really one, but it still seems very
maintainable and understandable. There is less code than in the other
versions. It does drive you to create separate handlers for each hook,
but the advice is to keep hooks to the absolute bare minimum, adjusting
the charm's state based on the event and putting all the actual logic in
the state-driven handlers.


-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Automatic retries of hooks

2016-01-19 Thread Stuart Bishop
On 20 January 2016 at 13:17, John Meinel <j...@arbash-meinel.com> wrote:

> There are classes of failures that a charm hook itself cannot handle. The
> specific one Bogdan was working with is the fact that the machine itself is
> getting restarted while the charm is in the middle of processing a hook.
> There isn't any way the hook itself can handle that, unless you could raise
> a very specific error that indicates you should be retried (so as it notices
> its about to die, it raises the try-me-again error).
>
> Hooks are supposed to be idempotent regardless, aren't they? So while we
> paper over transient bugs in them, doesn't it make the system more resilient
> overall?

The new update-status hook could be used to recover, as it is called
automatically at regular intervals. If the reboot really was random,
you would need to clear the error status first. But if it is triggered
by the charm, it is just a case of 'reboot(now+30s);
status_set('waiting', 'Waiting for reboot'); sys.exit(0)' and waiting
for the update-status hook to kick in.
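In charm code that looks something like this (a sketch; the delay and the
status message are arbitrary):

import subprocess
import sys
from charmhelpers.core import hookenv

def reboot_and_resume_later():
    hookenv.status_set('waiting', 'Waiting for reboot')
    subprocess.check_call(['shutdown', '-r', '+1'])   # reboot in a minute
    sys.exit(0)   # exit the hook cleanly; update-status carries on afterwards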

It happens naturally if you structure your charm to have a single hook
that does everything that needs to be done, rather than trying to
craft individual hooks to deal with specific events.



-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Making logging to MongoDB the default

2015-10-22 Thread Stuart Bishop
On 22 October 2015 at 22:17, Nate Finch <nate.fi...@canonical.com> wrote:

> IMO, all-machines.log is a bad idea anyway (it duplicates what's in the log
> files already, and makes it very likely that the state machines will run out
> of disk space, since they're potentially aggregating hundreds or thousands
> of machines' logs, not to mention adding a lot of network overhead). I'd be
> happy to see it go away.  However, I am not convinced that dropping text
> file logs in general is a good idea, so I'd love to hear what we're gaining
> by putting logs in Mongo.

I'm looking forward to having access to them in a structured format so
I can generate logs, reports and displays the way I like rather than
dealing with the hard-to-parse strings in the text logs. 'juju
debug-logs [--from ts] [--until ts] [-F] [--format=json]' would keep
me quite happy, and I can filter, format, interleave and colorize the
output to my heart's content. I can even generate all-machines.log if I
feel like a headache ;)

-- 
Stuart Bishop <stuart.bis...@canonical.com>



Re: Use case for: min-version

2015-08-12 Thread Stuart Bishop
On 12 August 2015 at 05:02, Jeff Pihach jeff.pih...@canonical.com wrote:
 Version checking for features can be dangerous because a command's output or
 availability may change in the future and now your charm also needs a
 max-version, or version-range etc. A more robust solution could be
 something along the lines of a feature-supported query which would return
 whether that command is indeed supported in the active environment with the
 necessary syntax.

max-version should be very rare. To need it, you need to have both a
backwards incompatible change in Juju and a charm supporting such a
wide range of Juju versions that you need the multiple codepaths.
Still, I imagine it will happen, and hookenv.has_juju_version is easily
updated to support a range.

If you want your feature-supported API, now that the version number is
exposed in charm-helpers you can easily add such a matrix of feature
flags. It just needs someone interested enough to maintain the list as
it grows and grows over time. I personally think it is impractical,
for exactly the problems you describe. A flag like 'leadership' isn't
very useful. I'm interested in leadership as it behaves in 1.23, or
leadership as implemented in 1.25 with the leader-deposed hook, or
leadership as implemented in 1.24.4 with the HA stability fixes, or
storage as of 1.25 when I can upgrade a service previously using the
block storage broker, or status as of 1.28 when we added more failure
states, or relation-set as of 1.23.3 when the --file argument was
fixed to accept input from stdin. Or most practically, I'm interested
in Juju 1.24 stable because I know I'm using features that did not
exist in 1.23 stable and that is the version I'm running tests with.

And now I think of it, it also makes testing easier (and thus
hopefully improves quality). If you are testing code guarded by both
the leadership and status feature flags, you have 4 code paths to
test. If you are testing code guarded by has_version_1.24, you only
have 2 code paths to test. And you would save time and effort, since
we all know that all versions of Juju implementing unit status also
implement basic leadership. Juju is developed and releases features on
a single trunk, where as for cross browser compatibility you are
supporting a matrix of features enabled or not on dozens of different
branches (one for each browser).

pw = host.genpw()
if feature('leadership'):
    leader_set(dict(password=pw))
else:
    relation_set(password=pw, relid=hookenv.get_peer_relid())
status_set('blocked', 'Connect to {} using password {} to complete '
           'setup'.format(url, pw))
if feature('status'):
    raise SystemExit(0)
else:
    raise SystemExit(1)


I think it is best to add the feature flags you want in your own
charm, using the version number exposed by charm-helpers, rather than
coarse feature flags exposed by charm-helpers or juju that don't
necessarily align with your charm's actual requirements.

As for graceful fallback, it's great when you can do it. Both Marco and
I use the same example - status_set. Under 1.23 or earlier, it uses
juju-log. Under 1.24 or higher, it uses status-set. However, if you
look at my original sample code you see that it isn't enough because
you still need to decide what to do next based on the behaviour that
was hidden from you. The graceful fallback practically requires you to
sniff the version if you want to block your units properly.

def block_and_exit(msg):
    hookenv.status_set('blocked', msg)
    if hookenv.has_juju_version('1.24'):
        raise SystemExit(0)  # blocked state for modern juju
    raise SystemExit(1)  # error state for older juju

-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Use case for: min-version

2015-08-12 Thread Stuart Bishop
On 12 August 2015 at 03:56, Tim Penhey tim.pen...@canonical.com wrote:
 It would be trivial for the Juju version to be exported in the hook
 context as environment variables.

 Perhaps something like this:

 JUJU_VERSION=1.24.4
 JUJU_VERSION_MAJOR=1
 JUJU_VERSION_MINOR=24
 JUJU_VERSION_PATCH=4
 # tag for 'alpha' 'beta'
 JUJU_VERSION_TAG=

 Thoughts?

Whatever :-)

An environment variable seems the obvious way to communicate the
information. Give me a version string like 1.24.4 and I'm happy. The
trick is documenting and sticking to the format, so I know if one day
you might throw 1.26.1-alpha6 at me.

So do it if it really is trivial, or if not, try not to break the
workaround in charm-helpers (parsing the output of
'/var/lib/juju/tools/machine-*/jujud version').
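That workaround amounts to this (a sketch; error handling omitted):

import glob
import subprocess

def jujud_version():
    # Ask a machine agent's jujud binary directly, since the hook
    # environment doesn't expose the version yet.
    jujud = sorted(glob.glob('/var/lib/juju/tools/machine-*/jujud'))[0]
    return subprocess.check_output([jujud, 'version']).decode().strip()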


-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Use case for: min-version

2015-08-11 Thread Stuart Bishop
On 11 August 2015 at 03:32, Matt Bruzek matthew.bru...@canonical.com wrote:

 We wrote a charm that needed election logic, so we used the new Juju feature
 is_leader.  A user was interested in using a bundle that contained this
 charm and it failed on them.  It was hard to track down the cause of the
 problem.  It appears they were using an earlier version of Juju that is
 available from universe and only the PPA had the more current version.

 Read more about the problem here:
 https://bugs.launchpad.net/charms/+source/etcd/+bug/1483380

 I heard the min-version feature discussed at previous Cloud sprints but to
 my knowledge we do not have it implemented yet.  The idea was a charm could
 specify in metadata.yaml what min-version of Juju they support.

 There are a lot of new features that juju-core are cranking out (and that is
 *awesome*)!  We have already run into this problem with a real user, and
 will have the problem in the future.

 Can we reopen the discussion of min-version?  Or some other method of
 preventing this kind of problem in the future?

charmhelpers already supports this with
charmhelpers.core.hookenv.has_juju_version, thanks to Curtis who
described a reliable way of accessing it.

Adding it to the official hook environment is
https://bugs.launchpad.net/juju-core/+bug/1455368

It is particularly useful for:

hookenv.status_set('blocked', I'm in a right pickle. Help!)
if hookenv.has_juju_version('1.24'):
raise SystemExit(0)  # Blocked state, 1.24+
raise SytemExit(1)  # Error state, 1.23

I've also got version checks in charmhelpers.coordinator, which
requires leadership.

-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Send Juju logs to different database?

2015-05-06 Thread Stuart Bishop
On 6 May 2015 at 04:57, Menno Smits menno.sm...@canonical.com wrote:

 It is more likely that Juju will grow the ability to send logs to external
 log services using the syslog protocol (and perhaps others). You could use
 this to log to your own log aggregator or database. This feature has been
 discussed but hasn't been planned in any detail yet (pull requests would be
 most welcome!).

syslog seems a bad fit, as the logs are now structured data and I'd
like to keep it that way. I guess people want it as an option, but I'd
consider it the legacy option here.

My own use case would be to make a more readable debug-logs, rather
than attempting to parse the debug-logs output ;) Hmm... I may be able
to do this already via the Juju API.

-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Juju devel 1.23-beta2 is released

2015-03-30 Thread Stuart Bishop
On 26 March 2015 at 23:58, Curtis Hovey-Canonical cur...@canonical.com wrote:

 ### Service Leader Elections

 Services running in an environment bootstrapped with the
 leader-election feature flag now have access to three new hook tools:

 is-leader - returns true only if the executing unit is guaranteed
 service leadership for the next 30s
 leader-get - as relation-get; accessible only within the service
 leader-set - as relation-set; will fail if not executed on leader

 ...and two new hooks:

 leader-elected - runs when the unit takes service leadership
 leader-settings-changed - runs when another unit runs leader-set

 When a unit starts up, it will always run either leader-elected or
 leader-settings-changed as soon as possible, delaying doing so only
 to run the install hook; complete any queued or in-flight operation; or
 resolve a hook or upgrade error.

Looking forward to this.

Looking at the specifics, I'm interested in how the leader unit
performs long-running operations with the 30s lease. Let's say unit 0
is the leader, and decides that it is an appropriate time for unit 0
to run a repair operation that might take a few hours.

As the wording currently stands, no hooks are triggered on the leader
when the leader calls leader-set. This gives the leader no alternative
but to perform the long-running operation in the same hook, and risk
another leader being elected and making a conflicting decision.

I think that the leader-settings-changed hook needs to be called
whenever *any* unit runs leader-set (including the current unit if it
is the leader), rather than only when a different unit runs
leader-set. This way, the leader can make its decisions and exit
whatever hook triggered it within its 30s lease, and all units can
perform their long running tasks in the leader-settings-changed hook.

Alternatively, it could kick off an asynchronous task, but those don't exist yet.

Or would I need to run 'is-leader' in a thread every 30s to keep the
lease renewed?
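If that is the answer, the charm side might look like this (a sketch; it
assumes 'is-leader --format=json' prints true/false):

import subprocess
import threading

def is_leader():
    out = subprocess.check_output(['is-leader', '--format=json']).strip()
    return out == b'true'

abort = threading.Event()

def leadership_watchdog():
    # Re-check well inside the 30s lease; tell the long-running task to
    # abandon its work if leadership is lost part way through.
    while not abort.wait(25):
        if not is_leader():
            abort.set()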

-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Juju devel 1.23-beta2 is released

2015-03-30 Thread Stuart Bishop
On 30 March 2015 at 18:14, John Meinel j...@arbash-meinel.com wrote:
 I believe the Juju agent itself is renewing the lease every 30s. It
 probably wouldn't hurt for the charm to check that it is still the master
 periodically if it is going to be running for an hour, since it might lose
 connection without otherwise realizing.

Oh, that will work too I think.

I'd be testing this myself, but I'm having lxc issues and don't want
to compound it with juju beta issues ;)

-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Feature Request: -about-to-depart hook

2015-02-03 Thread Stuart Bishop
On 3 February 2015 at 21:23, Stuart Bishop stuart.bis...@canonical.com wrote:
 On 28 January 2015 at 21:03, Mario Splivalo

 I'm not sure if this is possible... Once the unit left relation juju is
 no longer aware of it so there is no way of knowing if -broken completed
 with success or not. Or am I wrong here?

 Hooks have no way of telling, but juju could in the same way that you
 can tell by running 'juju status'. If the unit is still running, it
 might still run the -broken hook. Once the unit is destroyed, we know
 it will never run the -broken hook.

While typing up https://bugs.launchpad.net/juju-core/+bug/1417874 I
realized that your proposed solution of a pre-departure hook is the
only one that can work. Once -departed hooks start firing, both the
doomed unit and the leader have already lost the access needed to
decommission the departing node.

I'm going to need to tear out the decommissioning code from my charm
(that started failing my tests once I tightened security), and
document the manual decommissioning process. Unless someone can come
up with a better way forward with current juju.

-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Feature Request: -about-to-depart hook

2015-02-03 Thread Stuart Bishop
On 28 January 2015 at 21:03, Mario Splivalo
mario.spliv...@canonical.com wrote:
 On 01/27/2015 09:52 AM, Stuart Bishop wrote:
 Ignoring the, most likely, wrong nomenclature of the proposed hook, what
 are your opinions on the matter?

 I've been working on similar issues.

 When the peer relation-departed hook is fired, the unit running it
 knows that $REMOTE_UNIT is leaving the cluster. $REMOTE_UNIT may not
 be alive - we may be removing a failed unit from the service.
 $REMOTE_UNIT may be alive but uncontactable - some form of network
 partition has occurred.

 $REMOTE_UNIT doesn't have to be the one leaving the cluster. If I have
 3-unit cluster (mongodb/0, mongodb/1, mongodb/2), and I 'juju remove
 mongodb/1), the relation-departed hook will fire on all three units.
 Moreover, it will fire twice on mongodb/1. So, from mongodb/2's
 perspective, $REMOTE_UNIT is indeed pointing to mongodb/1, which is, in
 this case, leaving the relation. But if we observe the same scenario on
 mongodb/1, $REMOTE_UNIT there will point to mongodb/0. But that unit is
 NOT leaving the cluster. There is no way to know if the hook is running
 on the unit that's leaving or is it running on the unit that's staying.

I see, and have also struck the same problem with the Cassandra charm.
It is impossible to have juju decommission a node.

My relation-departed hook must reset the firewall rules, since the
replication connection is unauthenticated and we cannot leave it open.
This means I cannot decommission the departing unit in the
relation-broken hook, as the remaining nodes refuse to talk to it and
it has no way of redistributing its data.

And I can't decommission the departing node in the relation-departed
hook, because as you correctly say, it is impossible to know which
unit is actually leaving the cluster and which are remaining.


 But, if that takes place in relation-departed, there is no way of
 knowing if you need to do a stepdown, because you don't know if you're
 the unit being removed, or is it the remote unit being removed.
 Therefore the logic for removing nodes had to go to relation-broken.
 But, as you explained, if the unit goes down catastrophically the
 relation-broken will never be executed and I have a cluster that needs
 manual intervention to clean up.

Leadership might provide a workaround, as the service is guaranteed
to have exactly one leader. If a unit is running the relation-departed
hook and it is the leader, it knows it is not the one leaving the
cluster (or it would no longer be leader) and it can perform the
decommissioning.

But that is a messy workaround. Given we have both struck nearly
exactly the same problem, I'd surmise the same issue will occur in
pretty much all similar systems (Swift, Redis, mysql, ...) and we need
a better solution.

I've also heard rumours of a goal state, which may provide units
enough context to know what is happening. I don't know the details of
this though.


 I'm not sure if this is possible... Once the unit left relation juju is
 no longer aware of it so there is no way of knowing if -broken completed
 with success or not. Or am I wrong here?

Hooks have no way of telling, but juju could in the same way that you
can tell by running 'juju status'. If the unit is still running, it
might still run the -broken hook. Once the unit is destroyed, we know
it will never run the -broken hook.


-- 
Stuart Bishop stuart.bis...@canonical.com



Re: Feature Request: -about-to-depart hook

2015-01-27 Thread Stuart Bishop
On 26 January 2015 at 20:54, Mario Splivalo
mario.spliv...@canonical.com wrote:
 Hello!

 Currently juju provides relation-departed hook, which will fire on all
 units that are part of relation, and relation-broken hook, which will
 fire on unit that just departed the relation.

 The problem arises when we have a multi-unit service peered. Consider
 MongoDB charm where we usually have replicaset formed with three or more
 units:
 When a unit is destroyed (with 'juju remove-unit') first the relation-departed
 hook will fire between the departing unit and all the 'staying' units.
 Then, on the departed unit the relation-broken hook is fired. But, if we
 need to do some work on the departing unit before it leaves the
 relation, there is no way to do so. When 'relation-departed' hook is
 called there is no way of telling (if we make observation from within
 the hook) if we are running on unit that is departing, or on unit that
 is 'staying' within the relation.

 A '-before-departed' hook would, I think, solve. First a
 '-before-departed' hook will be fired on the departing unit. Then
 '-departed' hook will fire against departing and staying units. And,
 lastly, as it is now, the -broken hook will fire.

 Ignoring the, most likely, wrong nomenclature of the proposed hook, what
 are your opinions on the matter?

I've been working on similar issues.

When the peer relation-departed hook is fired, the unit running it
knows that $REMOTE_UNIT is leaving the cluster. $REMOTE_UNIT may not
be alive - we may be removing a failed unit from the service.
$REMOTE_UNIT may be alive but uncontactable - some form of network
partition has occurred.

When the peer relation-broken hook is fired, the unit running it knows
that it is leaving the cluster and decommissions itself. However, this
hook may never be run if the unit has failed. Or it may be impossible
to complete successfully (eg. corrupted filesystem).

I agree that this is not rich enough to remove units robustly. The
peer relation-departed hooks are not particularly useful to me, as
they cannot know in advance if the relation-broken hook will complete
successfully. It is the peer relation-broken hook that is responsible
for properly decoupling the unit from the service, and this works fine
if the unit is healthy. The problem is of course if the departing unit
*has* failed, because no subsequent hooks are called to repair the
damaged cluster.

As a concrete example, to remove a cassandra node from a cluster:
 - First, run 'nodetool decommission' on the departing node. This
streams its partitions to the remaining nodes.
 - Second, if 'nodetool decommission' failed or could not be run, run
'nodetool removenode' on one of the other nodes. This removes the
failed node from the ring, and the remaining nodes will rebalance and
rebuild using redundant copies of the data. Data may be lost if stored
with a replication factor of 1 or if updates only waited for an
acknowledgement from 1 node.
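
As a rough sketch only (the nodetool commands are real Cassandra tooling,
but the host ID lookup and error handling are omitted), the two paths boil
down to:

    import subprocess


    def decommission_departing_node():
        # Preferred path, run on the departing node itself: stream its
        # partitions to the remaining nodes before leaving the ring.
        subprocess.check_call(['nodetool', 'decommission'])


    def remove_failed_node(host_id):
        # Fallback path, run on a surviving node when the departing node
        # is dead or unreachable. The survivors rebuild from redundant
        # copies, so data held at replication factor 1 may be lost.
        subprocess.check_call(['nodetool', 'removenode', host_id])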

An extra hook as you suggest would help me to solve this issue. But
what would also solve my issue is juju leadership (currently in
development). When the lead unit runs its peer relation-departed hook,
it connects to the departing unit and runs the decommissioning process
on its behalf. If it is unable to connect, it assumes the node is
failed and cleans up. It can even notify the remaining non-leader
units that the departing unit has been removed from the cluster, giving
them a chance to update their configuration if necessary. You can't
really do this without the leadership feature, as you can't coordinate
which of the remaining units is responsible for decommissioning the
departing unit (and they would trip over each other if they all
attempted to decommission the departing node).

The edge case in my approach is of course if the departing unit is
live, but for some reason the leader cannot connect to it. Maybe your
inter DC links have gone down. However, there are similar issues with
the extra hook. If your -before-departed hook fails to run, how long
should juju wait until it gives up and triggers the -departed hooks?

Perhaps what is needed here is instead an extra hook run on the
remaining units if the -broken hook could not be run successfully?
Let's call it relation-failed. It could be fired when we know the VM is
gone and the -broken hook was not successfully run.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: juju min version feature

2014-12-16 Thread Stuart Bishop
On 16 December 2014 at 21:36, William Reade william.re...@canonical.com wrote:
 On Tue, Dec 16, 2014 at 6:36 AM, Stuart Bishop
 stuart.bis...@canonical.com wrote:
 I think we need the required juju version even if we also allow people
 to specify features. swift-storage could specify that it needs 'the
 version of juju that configures lxc to allow loopback mounts', which
 is a bug fix rather than a feature. Providing a feature flag for every
 bug fix that a charm may depend on is impractical.

 1) If you're developing for 1.20, then I think the compatible-1.20
 flag mentioned above should work as you desire, until juju changes to
 the point where some feature is actively incompatible. (As stated
 above, I'm expecting there will be some degree of tuning the charm
 environment to the declared flags regardless.)

 2) Expand on the impracticality a bit please? I imagine that when
 we're talking about bugfixes of the sort you describe, the proportion
 of charms that care about a given one will be small; tracking them all
 may be somewhat *tedious* for the developers, but I don't see it being
 especially difficult or risky -- and AFAICS it need not impact any
 charm developers other than those who need that specific flag.

 ...not that I'm really keen to define a flag for every bugfix :-/. Do
 you have a rough idea of how often you've wanted min-version so far?

Practically, as a charm developer I'll be developing and testing using
juju-stable (1.20.14) and would tag my charms as minversion 1.20.14
(or tagged compatible-1.20 if you prefer). I will not give a moment's
thought to old versions of juju unless I have a special need for
backporting my work.

For example, I've recently implemented rolling restarts in a charm
(and will push this to charmhelpers). For now, it uses the peer
relation to coordinate things. IIRC, in older versions of juju you
could not use the peer relationship unless there was at least one
other peer and the relationship had been joined. That has since been
fixed, and I can rely on using the peer relationship even if I only
have a single unit, reducing my code paths. I'm relying on this
behaviour, yet have no idea if this changed in 1.16 or 1.18 or 1.20.
Most developers will never even know the behaviour was different in
the past, since they are developing in the present. Developers can't
track what they are not aware of.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: juju min version feature

2014-12-15 Thread Stuart Bishop
On 16 December 2014 at 00:13, John Meinel j...@arbash-meinel.com wrote:
 Can't we just as easily provide tools to find out what version of Juju
 provides a particular feature? Certainly a CLI of:
  $ juju supported-features
   leader-election
   container-addressibility

 Or even possibly something that talks to something like the charm-store:
  $ juju known-features
  leader-election: juju >= 2.2
  container-addressibility: juju >= 2.0

 I'm personally on the side of having charm *authors* talk about the features
 they want. Because then in juju-world we can enable/disable specific
 features based on them being requested, which makes charm authors get the
 features they need right. (e.g., if the charm doesn't say it needs
 leader-election, then it doesn't get leader tools exposed.)

 min-version, otoh, leads to people just setting it to the thing they are
 using, and doesn't give Juju a way to smartly enable/disable functionality.
 It also suffers when we want to drop a feature that didn't turn out
 quite like what we thought it would.

On the flip side, I could state that I am developing my charm for juju
1.20 and not care what features I'm using. If someone deploys my charm
with juju 2.1, then juju could do so by deploying the charm in a 1.20
compatible environment. Juju devs can forge ahead and make backwards
incompatible changes to hook tools and the meanings of environment
variables by providing a compatibility layer.

I do think it is useful to encode the required juju version in the
charm. We also need versioning on interfaces (charms need to make
backwards incompatible changes to interfaces), and better support for
multiple charm series (1.0, 1.1, 2.0 vs 'trusty', 'precise'), but that
is all future spec work.

I think we need the required juju version even if we also allow people
to specify features. swift-storage could specify that it needs 'the
version of juju that configures lxc to allow loopback mounts', which
is a bug fix rather than a feature. Providing a feature flag for every
bug fix that a charm may depend on is impractical.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Feature Request: show running relations in 'juju status'

2014-11-23 Thread Stuart Bishop
On 19 November 2014 at 19:59, William Reade william.re...@canonical.com wrote:
 On Tue, Nov 18, 2014 at 9:37 AM, Stuart Bishop
 stuart.bis...@canonical.com wrote:
 Ok. If there is a goal state, and I am able to wait until the goal
 state is the actual state, then my needs (and amulet and juju-deployer
 needs) will be met. It does seem a rather lengthy and long winded way
 of getting there though. The question I have always needed juju to
 answer is 'are there any hooks running or are there any hooks queued
 to run?'. I've always assumed that juju must already know this (or it
 would be unable to function), but refuses to communicate this single
 bit of information in any way.

 Juju as a system actually doesn't know this. Unit idleness is known
 only by the unit agents themselves, and only implicitly at that -- if
 we're blocking in a particular select clause then we're (probably!)
 idle, and that's it. I agree that exposing idleness would be good, and
 I'm doing some of the preliminary work necessary right now, but it's
 not my current focus: it's just a happy side-effect of what needs to
 be done for leader election.

Ok. I was thinking of a central system tracking the unit states and
firing hooks, but it seems the units are much more independent,
tracking their own state and making their own decisions.


 That would work too. If all units are in idle state, then the system
 has reached a steady state and my question is answered.

 Sort of. It's steady for now, but will not necessarily still be steady
 by the time you're reacted to it -- even if you're the only
 administrator, imagine a cron job that uses juju-run and triggers a
 wave of relation traffic across the system.

Your example is actually a steady state in my mind, in much the same
way a biological system may be in a steady state despite having a
heartbeat. But yes, you can construct some pathological cases where my
heuristic is not good enough to detect when the system has reached an
equilibrium. I am perfectly fine with reporting that the system *was*
in a steady state rather than *is* in a steady state. If your system
is chaotic enough where the difference matters, I think you are better
off fixing it rather than forging ahead attempting to reliably test
and deploy a chaotic system.
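
For what it's worth, the check I have in mind is just a polling loop over
'juju status'; a rough sketch, assuming a juju whose JSON status reports a
per-unit agent status with a 'current' value of 'idle' (that is how it
eventually surfaced in juju 2.x; older releases only expose agent-state):

    import json
    import subprocess
    import time


    def all_agents_idle():
        # Assumes the juju 2.x JSON status layout: applications -> units,
        # each with a 'juju-status' block reading 'idle' when the unit has
        # no hooks running or queued.
        status = json.loads(subprocess.check_output(
            ['juju', 'status', '--format=json']).decode())
        for app in status.get('applications', {}).values():
            for unit in app.get('units', {}).values():
                if unit.get('juju-status', {}).get('current') != 'idle':
                    return False
        return True


    def wait_for_steady_state(poll_seconds=10):
        # Poll until every agent reports idle. This only tells you the
        # model *was* steady at the moment of the check.
        while not all_agents_idle():
            time.sleep(poll_seconds)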

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Feature Request: show running relations in 'juju status'

2014-11-18 Thread Stuart Bishop
On 18 November 2014 12:23, Ian Booth ian.bo...@canonical.com wrote:

 On 17/11/14 15:47, Stuart Bishop wrote:
 On 17 November 2014 07:13, Ian Booth ian.bo...@canonical.com wrote:

 The new Juju Status work planned for this cycle will hopefully address
 the main concern about knowing when a deployed charm is fully ready to
 do the work for which it was installed, ie the current situation
 whereby a unit is marked as Started but is not ready. Charms are able
 to mark themselves as Busy and also set a status message to indicate
 they are churning and not ready to run. Charms can also indicate that
 they are Blocked and require manual intervention (eg a service needs a
 database and no relation has been established yet to provide the
 database), or Waiting (the database on which the service relies is busy
 but will resolve automatically when the database is available again).

 As long as the 'ready' state is managed by juju and not the unit, I'll
 stand happily corrected :-) The focus I'd seen had been on the unit
 declaring its own status, and there is no way for a unit to know that it
 is ready because it has no way of knowing that, for example, there are
 another 10 peer units being provisioned that will need to be related.


 You are correct that the initial scope of work is more about the unit,
 and less about the deployment as a whole. There are plans though to
 address the issue. We're throwing around the concept of a goal state,
 which is conceptually akin to looking forward in time to be able to
 inform units what relations they will expect to participate in and what
 units will be deployed. There'd likely be something like a
 relation-goals hook tool (to complement relation-list and
 relation-ids), as well as hook(s) for when the goal state changes.
 There's ongoing work in the uniter by William to get the architecture
 right so this work can be considered. There's still a lot of value in
 the current Juju Status work, but as you point out, it's not the full
 story.

Ok. If there is a goal state, and I am able to wait until the goal
state is the actual state, then my needs (and amulet and juju-deployer
needs) will be met. It does seem a rather lengthy and long winded way
of getting there though. The question I have always needed juju to
answer is 'are there any hooks running or are there any hooks queued
to run?'. I've always assumed that juju must already know this (or it
would be unable to function), but refuses to communicate this single
bit of information in any way.


 So although there are not currently plans to show the number of running
 hooks in the first phase of this work, mechanisms are being provided to
 allow charm authors to better communicate the state of their charms to
 give much clearer and more accurate feedback as to 1) when a charm is
 fully ready to do work, 2) if a charm is not ready to do work, why not.

 A charm declaring itself ready is part of the picture. What is more
 important is when the system is ready. You don't want to start pumping
 requests through your 'ready' webserver, only to have it torn away as
 a new block device is mounted on your database when its storage-joined
 hook is invoked and returned to 'ready' state again once the
 storage-changed hook has completed successfully.


 Also being thrown around is the concept of a new agent-state called Idle,
 which would be used when there are no pending hooks to run. There are plans as

That would work too. If all units are in idle state, then the system
has reached a steady state and my question is answered.


 well for the next phase of the Juju status work to allow collaborating
 services to notify when they are busy, and mark relationships as down.
 So if the database had its storage-attached hook invoked, it would mark
 itself as Busy, mark its relation to the webserver as Down, thus
 allowing the webserver to put itself into Waiting. Or, if we are
 talking about the initial install phase, the database would not
 initially mark itself as Running until its declared storage
 requirements were met, so the webserver would go from Installing to
 Waiting and then to Running once the database became Running.

I'm not entirely sure how useful this feature is, given the inherent
race conditions with serialized hooks. Right now, you need to write
charms that gracefully cope with dependent services that have gone
down without notice. With this feature, you will need to write charms
that gracefully cope with dependent services that have gone down but
the notification hasn't reached you yet, or where the outage is for
non-juju reasons, like a network partition. The window of time waiting
for hooks to bubble through could easily be minutes when you have a
simple chain of services (eg. postgresql -> pgbouncer -> django ->
haproxy -> apache seems common enough).

Your example with storage is particularly interesting, as I was just
dealing with this yesterday in my rewrite of the Cassandra charm. The
existing

Re: Feature Request: show running relations in 'juju status'

2014-11-16 Thread Stuart Bishop
On 17 November 2014 07:13, Ian Booth ian.bo...@canonical.com wrote:

 The new Juju Status work planned for this cycle will hopefully address
 the main concern about knowing when a deployed charm is fully ready to
 do the work for which it was installed, ie the current situation
 whereby a unit is marked as Started but is not ready. Charms are able
 to mark themselves as Busy and also set a status message to indicate
 they are churning and not ready to run. Charms can also indicate that
 they are Blocked and require manual intervention (eg a service needs a
 database and no relation has been established yet to provide the
 database), or Waiting (the database on which the service relies is busy
 but will resolve automatically when the database is available again).

As long as the 'ready' state is managed by juju and not the unit, I'll
stand happily corrected :-) The focus I'd seen had been on the unit
declaring its own status, and there is no way for a unit to know that it
is ready because it has no way of knowing that, for example, there are
another 10 peer units being provisioned that will need to be related.


 So although there are not currently plans to show the number of running
 hooks in the first phase of this work, mechanisms are being provided to
 allow charm authors to better communicate the state of their charms to
 give much clearer and more accurate feedback as to 1) when a charm is
 fully ready to do work, 2) if a charm is not ready to do work, why not.

A charm declaring itself ready is part of the picture. What is more
important is when the system is ready. You don't want to start pumping
requests through your 'ready' webserver, only to have it torn away as
a new block device is mounted on your database when its storage-joined
hook is invoked and returned to 'ready' state again once the
storage-changed hook has completed successfully.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Feature Request: show running relations in 'juju status'

2014-11-14 Thread Stuart Bishop
On 14 November 2014 22:31, Mario Splivalo mario.spliv...@canonical.com wrote:
 Hello, good people!

 How hard would it be to implement 'showing running relations in juju
 status'?

 Currently there is no easy (if any) way of knowing the state of the
 deployment. When one does 'juju add-relation' the relation hooks are
 run, but there is no feedback on whether the hooks are still running or
 everything is done. Only if there is a hook error would you see it in
 'juju status'. One can tail the logs and assume that when there is no
 activity for some amount of time, everything deployed as it should.

 Having juju status display the number of running hooks would greatly help in
 troubleshooting deployments.

This has been my most wanted feature for well over a year, and at the
moment is covered by
https://bugs.launchpad.net/juju-core/+bug/1254766. Unfortunately, I
don't think the work has been scheduled and I don't think the latest
round of updates to 'juju status' cover it.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Using subdocument _id fields for multi-environment support

2014-10-01 Thread Stuart Bishop
On 1 October 2014 11:25, Menno Smits menno.sm...@canonical.com wrote:

 MongoDB allows the _id field to be a subdocument so Tim asked me to
 experiment with this to see if it might be a cleaner way to approach the
 multi-environment conversion before we update any more collections. The code
 for these experiments can be found here:
 https://gist.github.com/mjs/2959bb3e90a8d4e7db50 (I've included the output
 as a comment on the gist).

 What I've found suggests that using a subdocument for the _id is a better
 way forward. This approach means that each field value is only stored once
 so there's no chance of the document key being out of sync with other fields
 and there's no unnecessary redundancy in the amount of data being stored.
 The fields in the _id subdocument are easy to access individually and can be
 queried separately if required. It is also possible to create indexes on
 specific fields in the _id subdocument if necessary for performance reasons.

Using a subdocument for the _id is taught and recommended in the
MongoDB courseware. In particular, the index is more useful to the
query planner. If the fields are separate, then mongodb will end up
querying by unit name and then filtering the results by environment
(but that won't matter much in this case).
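
For illustration only (juju itself talks to mongo through the Go mgo
driver, and the database and collection names here are just examples), a
pymongo snippet showing the shape of a subdocument _id and how its fields
can be queried and indexed individually:

    from pymongo import MongoClient

    client = MongoClient()
    units = client.juju.units  # illustrative database/collection names

    # The document key is a subdocument combining environment and unit
    # name, so neither value is duplicated outside the _id.
    units.insert_one({
        '_id': {'env': 'env-uuid-1234', 'unit': 'wordpress/0'},
        'charm': 'cs:trusty/wordpress-42',
    })

    # Individual fields of the _id can be queried with dot notation...
    doc = units.find_one({'_id.env': 'env-uuid-1234',
                          '_id.unit': 'wordpress/0'})

    # ...and indexed separately if a particular access path needs it.
    units.create_index('_id.env')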

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Using subdocument _id fields for multi-environment support

2014-10-01 Thread Stuart Bishop
On 1 October 2014 19:31, Kapil Thangavelu
kapil.thangav...@canonical.com wrote:

 every _id seem like clear wins for subdoc _ids. Although i'm curious what
 effect this data struct has on mongo resource reqs at scale vs the compound
 string, as mongo tries to keep _id sets in mem; when it doesn't fit in mem,
 perf becomes unpredictable (aka bad) as there's two ios per doc fetch (id
 and doc) and extra io on insert to verify uniqueness.

I think it is the index that needs to be kept in RAM, rather than the
actual _id, so it will be a win here. Instead of having 3 indexes to
keep in RAM to stop performance sucking (_id, unit, environment), we
now just have a single fatter one.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: logrotate configuration seems wrong

2014-09-15 Thread Stuart Bishop
On 15 September 2014 12:38, John Meinel j...@arbash-meinel.com wrote:

 7) copytruncate seems the wrong setting for interacting with rsyslog. I
 believe rsyslog is already aware that the file needs to be rotated, and thus

It is only aware if you send it a HUP signal.

 it shouldn't be trying to write to the same file handle (and thus we don't
 need to truncate in place). I'm not 100% sure on the interactions here, but
 copytruncate seems to have an inherent likelihood of dropping data (while
 you are copying, if any data gets written then you'll miss those last few
 bytes when you go to truncate, right?)

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Juju Actions - Use Cases

2014-09-10 Thread Stuart Bishop
On 10 September 2014 11:23, Tim Penhey tim.pen...@canonical.com wrote:
 On 10/09/14 06:59, John Weldon wrote:
 We're looking for use cases for Juju Actions, mostly to make sure we
 expose the right API.

 I'm hoping for a few different use cases from the Juju Web UI folks, but
 I'd appreciate input from anyone wanting to use Juju Actions in their
 charms too.

 I've started a document with some example use cases to prime the pump:
 please contribute to this document and don't feel constrained to the
 style or layout I adopted for the examples.


 If you have any interest or investment in using or publishing Actions
 for Juju please review and contribute!

 Google Docs Link
 https://docs.google.com/document/d/1uYffkkGA1njQ1oego_h8BYBMrlGpmN_lwsnrOZFxE9Q/edit?usp=sharing

 I'd love to see explicit backup/restore actions for the postgresql charm.

For the PostgreSQL charm, off the top of my head:

backup-start
- May be a logical backup or filesystem level backup, so perhaps 2 actions
- May take hours or days.
- Scheduled in cron, in addition to on demand.
- Should it return immediately, or emit status while the backup progresses?
- Can backups be streamed back to the user? If not, the charm has
to support many storage options.

backup-cancel
- Cancel a running backup
- The charm might need to cancel a backup, eg. if failover has
been triggered.

backup-status
- Status of running backups

backup-recover
- Destroy the master database, rebuilding using the backup
- May take hours or days; multi-terabyte databases are not uncommon.
- Can the backup be streamed from the user? If not, the charm has
to support many storage options.
- If backup is filesystem level, optionally recover to a specific
point in time.
- Does not require the location of the backup, as the default would be the
automatic backups.
- Only makes sense running on a unit in the master service if
using cascading replica services
- Recovery does not have to happen on the master unit in the
master service. If recovery is done on a hot standby unit in the
master service, that hot standby will be promoted to master when it
completes.
- Once recovery is complete, all hot standbys need to be rebuilt
from the master

failover
- Promote a specific unit to be the master.

rebuild
- Rebuild a hot standby unit from the master unit.
- This may rarely need to be done by an end user, eg. if a unit
has desynchronized during an extended netsplit and the data required
to catch up is no longer available.
- More likely, this action will be invoked by the backup-recover action
- Most likely, this action will be invoked by the
peer-relation-joined and slave-relation-joined hooks, allowing the
rebuild to be done asynchronously rather than the current situation
where the hooks may take hours or days to complete.
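
To give the API discussion something concrete, a hypothetical backup-start
implementation might look roughly like the sketch below, using the
action-get/action-set/action-fail hook tools from the design doc; the
parameter names and run_backup() are assumptions, and the parameters would
need matching entries in the charm's actions.yaml:

    #!/usr/bin/env python
    # Hypothetical 'backup-start' action for the PostgreSQL charm.
    import subprocess


    def action_get(key):
        # Fetch an action parameter supplied by the operator.
        return subprocess.check_output(['action-get', key]).decode().strip()


    def run_backup(kind, destination):
        # Stand-in for the real backup machinery (pg_dump, or a
        # filesystem-level base backup); may run for hours or days.
        pass


    def main():
        kind = action_get('type')            # e.g. 'logical' or 'filesystem'
        destination = action_get('destination')
        try:
            run_backup(kind, destination)
        except Exception as exc:
            subprocess.check_call(['action-fail', str(exc)])
            return
        subprocess.check_call(['action-set', 'outcome=success'])


    if __name__ == '__main__':
        main()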


For the pgbouncer charm, matching the main pgbouncer actions:

stop
  - Stop the pgbouncer daemon.
  - Big hammer if the 'disable, kill, pause, resume, enable' dance
is not your style.

start
  - Start the daemon.

disable [db]
- Disable new client connections to a given database

kill [db]
- Immediately drop all client and server connections on a
given database.

enable [db]
- Reenable a database after a 'disable'

pause [db]
- Disconnect from a database, first waiting for queries to complete.

resume [db]
- Resume after a previous 'pause'.


The storage and storage-subordinate charms could have some interesting
use cases, although these might end up being swallowed by juju-core
rather than becoming actions. At the moment the storage-subordinate
informs the charm when the requested filesystem mount is ready, and it
is the host charm's responsibility to shut down daemons, move
datafiles to the new mount, and restart. If there were standard
actions to stop and start the system, then the subordinate could do
everything and the only burden placed on the host charm is advertising
a path that contains all of its data files. Perhaps these start/stop
actions already exist in the form of the start/stop hooks.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Juju Actions - Use Cases

2014-09-10 Thread Stuart Bishop
On 10 September 2014 19:49, Richard Harding rick.hard...@canonical.com wrote:

 I think most of the use cases presented so far line up with ours. One I
 want to call out as interesting and I hadn't thought about is killing a
 long running action in progress. The example of a database backup. I don't
 see anything along those lines in the current api doc. You can cancel
 something from the queue, but can you cancel something running?

I don't think this one impacts the design. The cancel action can kill
the process being run by the backup action easily enough, and that
still meets my use case.

Oh... I'll add one more to the list while I'm here

reset-secrets
  - Causes all generated passwords and secrets to be regenerated.
  - Likely will cause a micro outage as clients will get disconnected,
so it is on demand rather than done automatically every few hours.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: A beginner's adventure in Charm authoring

2014-09-04 Thread Stuart Bishop
On 4 September 2014 14:26, John Meinel j...@arbash-meinel.com wrote:

 Deploying a local charm is needlessly complex. Why do I need to create a
 special directory structure, move my code under there, set --repository and
 write local:foo and even then it has to go scanning through the directory,
 looking for a charm with the right name in the metadata.yaml.  Why can't I
 just say deploy the charm in this directory? e.g.   juju deploy
 --local=path  Bam, done.

 At the very least we need to know what OS Series the charm is targeting.

 Which is currently only inferred from the path. I don't particularly like
 it, and I think the code that searches your whole repository and then picks
 the best one is bad, as it confuses people far more often than it is
 helpful.
 (If you have $REPO, and have $REPO/precise/charm and
 $REPO/precise/charm-backup but the 'revision' number in charm-backup is
 higher for whatever reason, juju deploy --repository=$REPO charm will
 actually deploy charm-backup)

 I'm certainly for deploy the charm in this directory as long as we can
 sort out a good way to determine the series.

The only sane way I see is for the charm to declare what series it
supports, probably in its metadata.yaml. In practice, we regularly
deploy branches targeted at precise to trusty and vice versa, because
one branch supports both series and the branch for the other series is
just an unmaintained atavism. I think forcing a 1:1 mapping between a
branch and a series is not useful to anyone, and the series component
in the charm URL just causes confusion.

Well... it might have one use. Versioning. It gives you a way of
breaking backwards compatibility with old versions of your charm. So
for instance, the major rewrite of the Cassandra charm won't be able
to upgrade-charm from the old version, so instead we hope to push it
to trusty and leave the precise branch to rot in peace. Not ideal, but
the only way of doing charm versioning at the moment.

In fact, now that I think about it, the release in the URL *is* the major
version (series) of the charm. It is just unfortunate that the
possible charm versions have been hardwired to the Ubuntu releases,
because the Ubuntu release is much less important than the release of
the software I'm charming.

I think we could decouple this, allowing arbitrary supported series in
a charm and gaining a sane charm versioning concept, by redoing the
Launchpad model and changing charms from being sourcepackages on a
distribution called 'charms' to instead being products with product
series. I could then switch product-series whenever my charm changes
the set of supported Ubuntu releases, or when upgrade-charm stops
working without manual steps.


-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: A beginner's adventure in Charm authoring

2014-09-04 Thread Stuart Bishop
On 4 September 2014 16:30, John Meinel j...@arbash-meinel.com wrote:
 ...



 The only sane way I see is for the charm to declare what series it
 supports, probably in its metadata.yaml. In practice, we regularly
 deploy branches targetted to precise to trusty and vice versa because
 one branch supports both series and the branch on the other series
 just an unmaintained atavism. I think forcing a 1:1 mapping between a
 branch and a series is not useful to anyone, and the series component
 in the charm URL just causes confusion.

 So how do we decide what image to bring up to install your charm on? If it
 supports multiple OS series, then you still need a place/syntax/something to
 disambiguate what you actually want us to do. (I'm not saying that being
 directly in the URL is the ideal place, but we do need to consider how we
 interact with the system.)

I imagine the list in config.yaml would be in recommended order. That
order would be used if the series was not explicitly specified in the
constraints when deploying the service.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: First customer pain point pull request - default-hook

2014-08-22 Thread Stuart Bishop
On 22 August 2014 10:43, Marco Ceppi marco.ce...@canonical.com wrote:
 So there is already a JUJU_HOOK_NAME environment variable. So that is easy
 enough. I'm not sure what the issue is with having a default-hook file that
 is executed when juju can't find that hook name.

 I don't want to make it an all-or-nothing solution where you either have one
 file or a hook per file; there doesn't seem to be any real advantage to that.
 For example, my default-hook might be written in a language not on the
 cloud image, so now I need an install hook which installs that interpreter.

Looking at the charms I am writing now, I have install, start, stop
and do-everything-else. The peer relation-broken hook is possibly the other
one that will need to be special, to ensure a unit being destroyed doesn't
stomp on active resources being used by the remaining peers.
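
For what it's worth, the kind of single-file charm this enables might look
something like the following sketch, leaning on the existing JUJU_HOOK_NAME
variable and the proposed default-hook fallback (the handler bodies are
placeholders):

    #!/usr/bin/env python
    # Hypothetical 'default-hook': juju would fall back to this file when
    # no hook of the triggering name exists; JUJU_HOOK_NAME says which
    # event actually fired.
    import os


    def install():
        pass  # install packages, lay down initial configuration


    def start():
        pass  # bring the service up


    def stop():
        pass  # shut the service down cleanly


    def everything_else(hook):
        pass  # idempotent 'make it so' logic for config and relation changes


    if __name__ == '__main__':
        hook = os.environ['JUJU_HOOK_NAME']
        handlers = {'install': install, 'start': start, 'stop': stop}
        handlers.get(hook, lambda: everything_else(hook))()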


 I'm a plus one to a fallback of default-hook when a hook isn't found, and a
 +1 to the already existing environment variable.

I'm +0. The symlinks are a dead chicken that needs to be sacrificed,
but it is all explicit. I can imagine problems with default-hook too,
such as a typo causing your default-hook to be called instead of your
desired hook.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Intentionally introducing failures into Juju

2014-08-14 Thread Stuart Bishop
On 14 August 2014 07:31, Menno Smits menno.sm...@canonical.com wrote:
 I like the idea being able to trigger failures using the juju command line.

 I'm undecided about how the need to fail should be stored. An obvious
 location would be in a new collection managed by state, or even as a field
 on existing state objects and documents. The downside of this approach is
 that a connection to state will then need to be available from where-ever we
 would like failures to be triggered - this isn't always possible or
 convenient.

 Another approach would be to have juju inject-failure drop files in some
 location (along the lines of what I've already implemented) using SSH. This
 has the advantage of making the failure checks easy to perform from anywhere
 with the disadvantage of making it more difficult to manage existing
 failures. There would also be some added complexity when creating failure
 files for about-to-be-created entities (e.g. the juju deploy
 --inject-failure case).

 Do you have any thoughts on this?


Further to just injecting failures, I'm interested in controlling when,
and in what order, hooks run. A sort of manual mode, which could be
driven by a test harness such as Amulet. Perhaps all hooks in the
queue are initially held, and I can unhold them one at a time.

This would let me test the odd edge cases, such as peers departing
peer relations during handshaking, or what happens when a new client
unit is added and its relation-changed hook manages to run before the
relation-joined hooks at the server end.

If you could do this, you could inject your failures by actually
breaking your units using juju run or juju ssh. Deploy your units, run
the install hooks, juju ssh in and break one of the units (rm -rf /,
whatever), run the peer relation hooks, confirm that the service is
still usable despite the failed unit.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Implement system reboot via juju hooks

2014-08-11 Thread Stuart Bishop
On 11 August 2014 18:20, William Reade william.re...@canonical.com wrote:

 I'd like to explore your use cases a bit more to see if we can find a clean
 solution to your problems that doesn't go too far down the (2) road that I'm
 nervous about. (The try-again-later mechanism is much smaller and cleaner
 and I think we can accommodate that one pretty easily, fwiw -- but what are
 the other problems you want to solve?)

Memory-related settings in PostgreSQL will only take effect when the
database is bounced. I need to avoid bouncing the primary database:
 1) when backups are in progress.
 2) when a hot standby unit is being rebuilt from the primary.

Being able to have a hook abort and be retried later would let me
avoid blocking.

A locking service would be useful too for units to signal certain
operations (with locks automatically released when the hooks that took
them exit). The in-progress update to the Cassandra charm has
convoluted logic in its peer relation hooks to do rolling restarts of
all the nodes, and I imagine MongoDB, Swift and many others have the
same issue to solve.
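
For reference, the peer-relation coordination I mean boils down to
something like this heavily simplified sketch (hook tools called via
subprocess; ordering restarts by plain unit-name comparison is a deliberate
simplification):

    import json
    import subprocess


    def peer_rid(relname='cluster'):
        # First relation id of the peer relation, e.g. 'cluster:0'.
        out = subprocess.check_output(['relation-ids', '--format=json', relname])
        return json.loads(out.decode())[0]


    def peers(rid):
        out = subprocess.check_output(['relation-list', '--format=json', '-r', rid])
        return json.loads(out.decode())


    def peer_flag(rid, unit, key):
        out = subprocess.check_output(
            ['relation-get', '--format=json', '-r', rid, key, unit]).decode().strip()
        return json.loads(out) if out else None


    def request_restart(rid):
        # Publish our intent; peers see it in their next -changed hook.
        subprocess.check_call(['relation-set', '-r', rid, 'restart-request=pending'])


    def may_restart(rid, my_unit):
        # Only the lowest-named unit with an outstanding request restarts,
        # so at most one node is down at a time.
        waiting = [u for u in peers(rid) if peer_flag(rid, u, 'restart-request')]
        return all(my_unit <= u for u in waiting)


    def clear_restart(rid):
        # Setting an empty value removes the flag from our relation settings.
        subprocess.check_call(['relation-set', '-r', rid, 'restart-request='])

The peer relation-changed hook then calls may_restart() and, if it returns
True, bounces the service and clears its own flag; juju re-runs the hook as
peers update their settings, so every unit eventually gets a turn.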

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Implement system reboot via juju hooks

2014-08-09 Thread Stuart Bishop
On 8 August 2014 19:58, Gabriel Samfira gsamf...@cloudbasesolutions.com wrote:
 Hello folks!

 I would like to start work on implementing reboots via juju hooks. I
 have outlined in a google docs document a few thoughts regarding why
 this is needed and some implementation details I would like to discuss
 before starting.

 You may find the doc here:

 http://goo.gl/tGoIuM

 Any thoughts/suggestions are welcome.

 Gabriel

I don't think this should be restricted to server reboots. The
framework is generally useful.

I have hooks that need to bounce the primary service so config changes
can take effect. They can't do that if a long running operation is
currently in progress, eg. a backup or a replica node is being built.
Currently, I need to block the hook until such time as I can proceed.
I think this would be cleaner if I could instead return a particular
error code from my hook, stating that it is partially complete and
requesting it to be rescheduled.

So it would be nice if requesting a reboot and requesting a hook to be
rescheduled are independent things.

I had wondered if juju-run should allow arbitrary things to be run in
a hook context later.

juju-run --after hook /sbin/reboot # queue the reboot command to be
run after this hook completes.
juju-run --after hook config-changed  # queue the config-changed hook
to be run after this hook completes, and after any previously queued
commands
juju-run --after tomorrow report-status # Run the report-status
command sometime after 24 hours.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Mongo experts - help need please

2014-07-25 Thread Stuart Bishop
On 25 July 2014 12:05, Gustavo Niemeyer gustavo.nieme...@canonical.com wrote:
 On Fri, Jul 25, 2014 at 1:02 AM, Ian Booth ian.bo...@canonical.com wrote:
 We've transitioned to using Session.Copy() to address the situation
 whereby Juju would create a mongo collection instance and then continue
 to make db calls against that collection without realising the
 underlying socket may have become disconnected. This resulted in Juju
 components failing, logging i/o timeout errors talking to mongo, even
 though mongo itself was still up and running.

 Sounds sane, as I indicated in previous discussions about the topic in
 these last two weeks and also about a year ago when we covered that.
 Serializing every single request to a concurrent server via a single
 database connection seems like a pretty bad idea for anything but
 simplistic servers.

 As an aside - I'm wondering whether the mgo driver shouldn't
 transparently catch an i/o error associated with a dead socket and
 retry using a fresh connection rather than imposing that responsibility
 on the caller?

 The evidence so far indicates that this will likely not happen. The
 current design was purposefully put in place so that harsh connection
 errors are not swept under the rug, and this seems to be working well
 so far. I'd rather not have juju proceeding over a harsh problem such
 as a master re-election midway through the execution of an algorithm
 without any indication that the failure has happened, let alone
 silently retry operations that in most cases are not idempotent.

 That said, the goal is of course not to make the developer's life
 miserable. All the driver wants is an acknowledgement that the error
 was perceived and taken care of. This is done trivially by calling:

 session.Refresh()

 Done. The driver will happily drop the error notice, and proceed with
 further operations, blocking if waiting for a re-election to take
 place is necessary.

The bug Ian cites and is trying to work around has sessions failing
with an i/o error after some time (I'm guessing resource starvation in
MongoDB or TCP networking issues). session.Copy() is pulling things
from a pool, so it might be handing out sessions doomed to fail with
exactly the same issue. The connections in the pool could even be
perfectly functional when they went in, with no way at the Go level of
knowing they have failed without trying them.

If this is the case, then Ian would need to handle the failure by
ensuring the failed connection does not go back in the pool and
grabbing a new one (the deferred Close() will return it, I think). And
repeating until it works, or until the pool has been exhausted and we
know Mongo is actually down rather than just having a polluted pool.

-- 
Stuart Bishop stuart.bis...@canonical.com

-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev