Re: [openstack-dev] [Openstack] [TROVE] Manual Installation Again

2014-03-06 Thread Mark Kirkwood

On 07/03/14 18:03, Mark Kirkwood wrote:



The error concerns the action given to trove-manage in the Prepare
Database section:

$ trove-manage --config-file=PathToTroveConf image_update mysql
 `nova --os-username trove --os-password trove --os-tenant-name trove
 --os-auth-url http://KeystoneIp:5000/v2.0 image-list | awk
'/trove-image/ {print $2}'`

This should probably be:

$ trove-manage --config-file=PathToTroveConf datastore_version_update
mysql mysql-5.5 mysql
 `nova --os-username trove --os-password trove --os-tenant-name trove
 --os-auth-url http://KeystoneIp:5000/v2.0 image-list | awk
'/trove-image/ {print $2}'` 1

...which is a bit of a mouthful - might be better to break it into 2 steps.




...and I got it wrong too - forgot the package arg, sorry:

$ trove-manage --config-file=PathToTroveConf datastore_version_update 
mysql mysql-5.5 mysql

`nova --os-username trove --os-password trove --os-tenant-name trove
--os-auth-url http://KeystoneIp:5000/v2.0 image-list | awk 
'/trove-image/ {print $2}'` mysql-server-5.5 1


Especially in the light of the above I think a less confusing 
presentation would be:


$ nova --os-username trove --os-password trove --os-tenant-name trove
--os-auth-url http://KeystoneIp:5000/v2.0 image-list | awk 
'/trove-image/ {print $2}'

<image uuid>

$ trove-manage --config-file=PathToTroveConf datastore_version_update 
mysql mysql-5.5 mysql <image uuid> mysql-server-5.5 1



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [DevStack] neutron config not working

2014-06-24 Thread Mark Kirkwood

On 25/06/14 10:59, Rob Crittenden wrote:

Before I get punted onto the operators list, I post this here because
this is the default config and I'd expect the defaults to just work.

Running devstack inside a VM with a single NIC configured and this in
localrc:

disable_service n-net
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
enable_service neutron
Q_USE_DEBUG_COMMAND=True

Results in a successful install but no DHCP address assigned to hosts I
launch and other oddities like no CIDR in nova net-list output.

Is this still the default way to set things up for single node? It is
according to https://wiki.openstack.org/wiki/NeutronDevstack




That does look ok: I have an essentially equivalent local.conf:

...
ENABLED_SERVICES+=,-n-net
ENABLED_SERVICES+=,q-svc,q-agt,q-dhcp,q-l3,q-meta,q-metering,tempest

I don't have 'neutron' specifically enabled... not sure if/why that 
might make any difference though. However, instance launching and IP address 
assignment seem to work ok.


However I *have* seen the issue of instances not getting IP addresses in 
single-host setups, and it is often due to the use of virtio with bridges 
(which is the default, I think). Try:


nova.conf:
...
libvirt_use_virtio_for_bridges=False
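
If you're driving this from devstack, one way to apply it (a sketch, assuming 
the usual post-config mechanism in local.conf) is:

[[post-config|$NOVA_CONF]]
[DEFAULT]
libvirt_use_virtio_for_bridges = False

...and then re-stack (or restart nova-compute) so it gets picked up. Note that 
on newer nova releases this option may instead live under the [libvirt] group 
as use_virtio_for_bridges.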


Regards

Mark



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack] [Trove] Trove instance got stuck in BUILD state

2014-07-07 Thread Mark Kirkwood

On 08/07/14 00:40, Amrith Kumar wrote:




I think it is totally ludicrous (and to all the technical writers who work on 
OpenStack, downright offensive) to say the “docs are useless”. Not only have I 
been able to install and successfully operate an OpenStack installation by 
(largely) following the documentation, but “trove-integration” and “redstack” 
are useful for developers. I would, however, highly doubt that a production 
deployment of Trove would use ‘redstack’.



Syed, maybe you need to download a guest image for Trove, or maybe there is 
something else amiss with your setup. Happy to catch up with you on IRC and 
help you with that. Optionally, email me and I’ll give you a hand.





It is a bit harsh, to be sure. However critical areas are light/thin or 
not covered at all - and this is bound to generate a bit of frustration 
for folk wanting to use this feature.


In particular:

- guest image preparation
- guest file injection (/etc/guest_info) and the nova interaction (see the sketch after this list)
- DNS requirements for the guest image (the guest must be able to resolve its own hostname)
- swift backup config authorization
- the api_extensions_path setting and how critical that is
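
(For the guest_info point: a purely hypothetical sketch of the file that 
trove's taskmanager has nova inject into the guest - the exact keys and the 
uuid vary by release and instance, so treat this as illustrative only:)

$ cat /etc/guest_info
[DEFAULT]
guest_id = 11111111-2222-3333-4444-555555555555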

There are probably more that I have forgotten (repressed perhaps...)!

Regards

Mark


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack] [Trove] Trove instance got stuck in BUILD state

2014-07-07 Thread Mark Kirkwood

On 08/07/14 17:08, Denis Makogon wrote:

Mark, there is also no documentation about service tuning (no description
of the service-related options; the sample configs in the Trove repo are not enough).
So, I think we should extend your list of significant things to document.


Right - I guess most of the tuning/config parameters could be better 
documented too (I do recall seeing this mentioned for one of the Trove 
meetings).


One other thing I recall is:

- mysql security install/setup in guest (mysql root password).

I had to struggle through all of these - and it took a lot of time, 
because essentially the only viable way to debug each issue was:


- check in an equivalent devstack build
- read devstack setup code

or (if the issue was present in devstack as well)

- read the trove code and insert debug logging as appropriate

...and while this was a very interesting exercise, it was not a fast one!

Cheers

Mark

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack] [Trove] Trove instance got stuck in BUILD state

2014-07-07 Thread Mark Kirkwood

On 08/07/14 17:08, Denis Makogon wrote:

Mark, there are also no documentation about service tuning(no description
of service related options, sample configs in Trove repo is not enough).
So, I think we should extend your list of significant things to document.


...and in case it might be helpful: here are my notes for installing 
openstack/trove on Ubuntu 14.04 using the Ubuntu packages, and debugging 
it (ahem). Some of the issues (virtio on bridges) are caused by it all being 
on one node, but hopefully it is generally useful (I cover how to 
build the guest image and get backups etc. going). I make no claim for it 
being the best/only way to force the beast into being :-) but I think it 
works (for the logic devotees - sufficient but maybe not necessary)!


Regards

Mark



README-OPENSTACK.gz
Description: application/gzip
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Trove] Postgresql Anyone working on DBaaS?

2014-07-10 Thread Mark Kirkwood
Where I work we make use of Postgresql for most of our database needs. 
It would be nice to be able to offer a Postgresql flavor within the 
Trove framework. Is anyone working on adding it in?


If no one else is, then I might look at doing it; if there are folks 
working on it, let me know if I can help with any part thereof.


Regards

Mark

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Trove] Postgresql Anyone working on DBaaS?

2014-07-10 Thread Mark Kirkwood

Thanks Denis, will do!

On 10/07/14 19:23, Denis Makogon wrote:

Hello Mark.

There are several patches for Postgresql hanging in the review queue.

Here's useful link:

https://blueprints.launchpad.net/trove/+spec/postgresql-support

Contact point: Kevin Conway (you can ping him in IRC and ask if he needs
any help)


Best regards,
Denis Makogon


On Thu, Jul 10, 2014 at 9:24 AM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:

Where I work we make use of Postgresql for most of our database
needs. It would be nice to be able to offer a Postgresql flavor
within the Trove framework. Is anyone working on adding it in?

If no one else is, then I might look at doing it; if there are folks
working on it, let me know if I can help with any part thereof.

Regards

Mark

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Trove] Backup/restore namespace config move has leftovers

2014-07-31 Thread Mark Kirkwood

In my latest devstack pull I notice that

backup_namespace
restore_namespace

have moved from the default conf group to per datastore (commit 
61935d3). However they still appear in the common_opts section of 
trove/common/cfg.py
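
(For context, the per-datastore form now looks roughly like the following in 
trove.conf - a sketch only; the strategy paths shown are the usual mysql 
defaults and may differ in your tree:)

[mysql]
backup_namespace = trove.guestagent.strategies.backup.mysql_impl
restore_namespace = trove.guestagent.strategies.restore.mysql_impl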


This seems like an oversight - or is there something I'm missing?

Cheers

Mark

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Trove] Backup/restore namespace config move has leftovers

2014-08-01 Thread Mark Kirkwood

On 01/08/14 21:35, Denis Makogon wrote:


I'd suggest to file a bug
report and fix given issue.




Done.

https://bugs.launchpad.net/trove/+bug/1351545


I also took the opportunity to check if all the currently defined 
datastores had backup/restore_namespace set - they didn't, so I noted 
that too (I'm guessing they now actually *need* to have something set to 
avoid breakage)...


regards

Mark

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Trove] Backup/restore namespace config move has leftovers

2014-08-02 Thread Mark Kirkwood

On 02/08/14 18:24, Denis Makogon wrote:


Mark, we don't have to add backup/restore namespace options to datastores that 
don't support the backup/restore feature.
You should take a look at how the backup procedure is executed on the Trove API 
service side, see
https://github.com/openstack/trove/blob/master/trove/backup/models.py
(the method called _validate_can_perform_action).

If you have any other questions, feel free to catch me on IRC 
(denis_makogon).



Thanks Denis - I did wonder if it was an optional specification! Doh! 
However, while I'm a bit ignorant wrt redis and cassandra, I do have a 
bit to do with mongo and that certainly *does* support backup/restore...


Cheers

Mark


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Murano] Working in devstack?

2014-04-15 Thread Mark Kirkwood

Hi all,

There is some interest here in making use of Murano for Samba ADDC as a 
service... so we've been (attempting) to get it up and running in 
devstack. In the process I've managed to get myself confused about the 
correct instructions for doing this:


- the docs suggest 
http://murano-docs.github.io/latest/getting-started/content/ch04s03.html

- the project provides murano-deployment/devstack-scripts/README.rst

...which are markedly different approaches (it *looks* like the project 
README.rst is out of date, as the scripts it runs try to get 
heat-horizon from repos that do not exist anymore). If this approach is 
no longer workable, we should really remove (or correct) these 
instructions.


So following 
http://murano-docs.github.io/latest/getting-started/content/ch04s03.html 
inside a Ubuntu 12.04 VM I stumbled into a few bugs:


- several missing deps in various murano-*/requirements.txt (see attached)
- typo in /murano-api/setup.sh (db_sync vs db-sync)

Fixing these seems to get most things going (some tabs crash with 
missing tables, so I need to dig a bit more to find where they get 
created...).


Any pointers/etc would be much appreciated!

Cheers

Mark

Getting Murano to run in devstack
=================================

$ git clone https://github.com/openstack-dev/devstack.git
$ cd devstack
$ cat local.conf
[[local|localrc]]
ADMIN_PASSWORD=swordfish
MYSQL_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
SERVICE_TOKEN=servicetoken

$ ./stack.sh
$ ./unstack.sh
$ sudo apt-get purge apache2    # make sure default site does not linger

$ mv files/images ~ # save images!

$ cd
$ git clone https://github.com/stackforge/murano-deployment.git
$ cp -r murano-deployment/devstack-integration/* devstack/
$ cd devstack 
$ cp single-node.local.conf local.conf
$ vi local.conf # set HOST_IP
$ ./stack.sh


Bugs
====

1/ Several missing pip requirements:

This necessitates re-running ./stack.sh several times and correcting
the errors below one by one (sigh, there is hopefully a better way - 
one possible shortcut is sketched after the list below)

$ tail /opt/stack/murano-api/requirements.txt
yaql

$ tail /opt/stack/murano-conductor/requirements.txt
jsonpath
celery

$ tail /opt/stack/murano-repository/test-requirements.txt
flask-testing

$ tail /opt/stack/murano-dashboard/requirements.txt
bunch
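
(A possible shortcut - untested, but since the missing deps are known up front 
you could install them in one go before the first run, rather than re-running 
stack.sh for each one:)

$ sudo pip install yaql jsonpath celery flask-testing bunch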

2/ Missing directory

$ mkdir /etc/tgt


3/ Missing tables

No tables in murano db's
(error is install script)

$ vi /opt/stack/murano-api/setup.sh # db_sync -> db-sync
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Murano] Working in devstack?

2014-04-15 Thread Mark Kirkwood
Thanks...yes, I'd not realized that running ./stack.sh again would 
unpatch my murano-api/setup.sh! Rerunning the db setup as you suggested 
gives me the tables. I didn't do it in as tidy a manner as yours however :-)


$ export OS_USERNAME=admin
$ export OS_PASSWORD=swordfish
$ export OS_TENANT_NAME=demo
$ export OS_AUTH_URL=http://host:5000/v2.0/
$ murano-manage --config-file /etc/murano/murano-api.conf db-sync

Cheers

Mark

On 16/04/14 11:23, Georgy Okrokvertskhov wrote:

Hi Mark,

Thank you for a detailed report. As I know Murano team is working on fixing
devstack scripts.

As for DB setup it should be done by a command: tox -evenv -- murano-manage
--config-file etc/murano/murano-api.conf db-sync

It works in my testing environment.

Thanks
Georgy


On Tue, Apr 15, 2014 at 4:11 PM, Mark Kirkwood 
mark.kirkw...@catalyst.net.nz wrote:


Hi all,

There is some interest here in making use of Murano for Samba ADDC as a
service... so we've been (attempting) to get it up and running in devstack.
In the process I've managed to get myself confused about the correct
instructions for doing this:

- the docs suggest http://murano-docs.github.io/latest/getting-started/
content/ch04s03.html
- the project provides murano-deployment/devstack-scripts/README.rst)

...which are markedly different approaches (it *looks* like the project
README.rst is out of date, as the scripts it runs try to get
heat-horizon from repos that do not exist anymore). If this approach is no
longer workable, we should really remove (or correct) these instructions.

So following http://murano-docs.github.io/latest/getting-started/
content/ch04s03.html inside a Ubuntu 12.04 VM I stumbled into a few bugs:

- several missing deps in various murano-*/requirements.txt (see attached)
- typo in /murano-api/setup.sh (db_sync vs db-sync)

Fixing these seems to get most things going (some tabs crash with missing
tables, so I need to dig a bit more to find where they get created...).

Any pointers/etc would be much appreciated!

Cheers

Mark


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ceilometer] MySQL performance and Mongodb backend maturity question

2014-09-24 Thread Mark Kirkwood

On 25/09/14 15:37, Qiming Teng wrote:

Hi,

Some weeks ago, I checked my then latest devstack install and I learned
this: event support in Ceilometer is only available for sqlalchemy
backend; mongodb backend was still under development.  I have been using
MySQL during the past weeks and now I think I'm trapped by a performance
problem of MySQL.

One or two Nova servers were launched and remain idle for about 10 days.
Now I'm seeing a lot of data accumulated in db and I wanted to cleanse
it manually.  Here is what I got:

mysql> select count(*) from metadata_text;
+----------+
| count(*) |
+----------+
| 25249913 |
+----------+
1 row in set (3.83 sec)

mysql> delete from metadata_text limit 1000;
Query OK, 1000 rows affected (0.02 sec)

mysql> delete from metadata_text limit 10000;
Query OK, 10000 rows affected (0.39 sec)

mysql> delete from metadata_text limit 100000;
Query OK, 100000 rows affected (2.31 sec)

mysql> delete from metadata_text limit 1000000;
Query OK, 1000000 rows affected (25.32 sec)

mysql> delete from metadata_text limit 2000000;
Query OK, 2000000 rows affected (1 min 16.17 sec)

mysql> delete from metadata_text limit 4000000;
Query OK, 4000000 rows affected (7 min 40.40 sec)

There were 25M records in one table.  The deletion time is reaching an
unacceptable level (7 minutes for 4M records) and it was not increasing
in a linear way.  Maybe DB experts can show me how to optimize this?



Writes on bigger datasets will take non-linear time when the (possibly 
default?) configs are outgrown. For instance (assuming metadata_text is 
an innodb table), take a look at:


- innodb_log_buffer_size
- innodb_log_file_size (warning: read the manual carefully before 
changing this)

- innodb_buffer_pool_size

Also index maintenance can get to be a limiting factor. I'm not sure if 
mysql will use the sort buffer to help with this, but maybe try increasing

- sort_buffer_size

(just for the session doing the delete) and see if it helps.
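
For instance, something along these lines - a sketch only, under the assumption 
that metadata_text is InnoDB; the sizes are illustrative, not recommendations 
(size the buffer pool to your available RAM, and follow the documented procedure 
before changing the log file size):

$ cat /etc/mysql/conf.d/tuning.cnf
[mysqld]
innodb_buffer_pool_size = 2G
innodb_log_buffer_size  = 64M
innodb_log_file_size    = 512M

...and for the delete session itself:

mysql> set session sort_buffer_size = 67108864;  -- 64MB, this session only
mysql> delete from metadata_text limit 1000000;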

There are many (way too many) other parameters to tweak, but the above 
ones are probably the best to start with.


Cheers

Mark

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [trove] Confused about nova_proxy_admin_* settings

2015-01-20 Thread Mark Kirkwood
I've been looking at how the 3 nova_proxy_admin_* settings are used. I'm 
coming to the conclusion that I'm confused:


E.g I note that a standard devstack (stable/juno branch) with trove 
enabled sets these as follows:


nova_proxy_admin_pass =
nova_proxy_admin_tenant_name = trove
nova_proxy_admin_user = radmin


However there is no 'radmin' user (or role) created in keystone, so the 
settings above cannot possibly work (if they were needed/used). Some 
experimentation involving removing these three settings from all of the 
trove config files seems to support the idea that they are in fact not 
needed, which has me somewhat puzzled.


If someone could shed some light here that would be awesome!

Thanks

Mark


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [trove] Confused about nova_proxy_admin_* settings

2015-01-21 Thread Mark Kirkwood

Thanks Nikhil,

I figured I must have missed something that actually used the proxy - I 
didn't have metering enabled in devstack. I'll enable it and (I guess) 
it should fail until I give it correct proxy admin credentials...


Cheers

Mark

On 22/01/15 05:55, Nikhil Manchanda wrote:

Hi Mark:

It's been a little while since I looked at this last, but as I recall these
values seem
to be used and needed only by the trove taskmanager. If you have support
for
metering messages turned on, this account gets used to look up instance
details
when sending periodic metering messages to an AMQP exchange.

These aren't needed on the guest-agent, and that's probably the reason why a
non-existing value of radmin seems to work. The fact that this exists in
the guest
configuration as well is probably a bug and should be cleaned up.

Cheers,
Nikhil


On Tue, Jan 20, 2015 at 4:05 PM, Mark Kirkwood 
mark.kirkw...@catalyst.net.nz wrote:


I've been looking at how the 3 nova_proxy_admin_* settings are used. I'm
coming to the conclusion that I'm confused:

E.g I note that a standard devstack (stable/juno branch) with trove
enabled sets these as follows:

nova_proxy_admin_pass =
nova_proxy_admin_tenant_name = trove
nova_proxy_admin_user = radmin


However there is no 'radmin' user (or role) created in keystone, so the
settings above cannot possibly work (if they were needed/used). Some
experimentation involving removing these three settings from all of the
trove config files seems to support the idea that they are in fact not
needed, which has me somewhat puzzled.

If someone could shed some light here that would be awesome!

Thanks

Mark


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [trove] confused about trove-guestagent need nova's auth info

2015-01-11 Thread Mark Kirkwood

On 18/12/14 14:30, 乔建 wrote:

When using trove, we need to configure nova’s user information in the
configuration file of trove-guestagent, such as

- nova_proxy_admin_user

- nova_proxy_admin_pass

- nova_proxy_admin_tenant_name

Is it necessary? In a public cloud environment, it will lead to serious
security risks.

I traced the code, and noticed that the auth data mentioned above is
packaged in a context object, then passed to the trove-conductor via the
message queue.

Would it be more suitable for the trove-conductor to get the corresponding
information from its own conf file?



Yes - all good points. Experimenting with devstack Juno branch, it seems 
you can happily remove these three settings.


However the guest agent does seem to need the rabbit host and password, 
which is probably undesirable for the same reasons that you mentioned above.


Regards

Mark


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [trove] confused about trove-guestagent need nova's auth info

2015-01-11 Thread Mark Kirkwood

On 11/01/15 22:25, Denis Makogon wrote:



Guest agent doesn't need configuration options described above. IIRC,
only taskmanager needs them.


Right - so we need to update the default config files and doco - as they 
have them in there.



About passing auth data. What are those benefits of changing the way in
which auth data is shipped? If you still think of security risks - you
may use SSL protocol that is available in most of messaging services.



I'm guessing the original poster was thinking along these lines: breaking 
into the image gives the tenant access to privileged passwords. Whether 
the communication is over SSL or not is another (interesting) factor.


Regards

Mark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] who is the ptl of trove?

2015-05-11 Thread Mark Kirkwood

On 09/05/15 02:28, Monty Taylor wrote:

On 05/08/2015 03:45 AM, Nikhil Manchanda wrote:


Comments and answers inline.

Li Tianqing writes:


[...]



1) Why do we put the trove VM into the user's tenant, not trove's
tenant? The user can log in to that VM, and that VM must connect to
rabbitmq. It is quite insecure.
What about putting the VM into the trove tenant?


While the default configuration of Trove in devstack puts Trove guest
VMs into the users' respective tenants, it's possible to configure Trove
to create VMs in a single Trove tenant. You would do this by
overriding the default novaclient class in Trove's remote.py with one
that creates all Trove VMs in a particular tenant whose user credentials
you will need to supply. In fact, most production instances of Trove do
something like this.


Might I suggest that if this is how people regularly deploy, that such a
class be included in trove proper, and that a config option be provided
like use_tenant='name_of_tenant_to_use' that would trigger the use of
the overridden novaclient class?

I think asking an operator as a standard practice to override code in
remote.py is a bad pattern.


2) Why is there no trove mgmt CLI, when the mgmt API is in the code?
Has it disappeared forever?


The reason for this is because the old legacy Trove client was rewritten
to be in line with the rest of the openstack clients. The new client
has bindings for the management API, but we didn't complete the work on
writing the shell pieces for it. There is currently an effort to
support Trove calls in the openstackclient, and we're looking to
support the management client calls as part of this as well. If this is
something that you're passionate about, we sure could use help landing
this in Liberty.


3) The trove-guest-agent is in the VM; it is connected to the taskmanager
via rabbitmq. We designed it that way, but is there some best practice
for doing this? How do you make the VM connect to both the VM network
and the management network?


Most deployments of Trove that I am familiar with set up a separate
RabbitMQ server in cloud that is used by Trove. It is not recommended to
use the same infrastructure RabbitMQ server for Trove for security
reasons. Also most deployments of Trove set up a private (neutron)
network that the RabbitMQ server and guests are connected to, and all
RPC messages are sent over this network.


This sounds like a great chunk of information to potentially go into
deployer docs.




I'd like to +1 this.

It is misleading that the standard documentation (and the devstack 
setup) describes a configuration that is unsafe/unwise to use in 
production. This is surely unusual to say the least! Normally when test 
or dev setups use unsafe configurations the relevant docs clearly state 
this - and describe how it should actually be done.


In addition the fact that several extended question threads were 
required to extract this vital information is ...disappointing, and does 
not display the right spirit for an open source project in my opinion!


Regards

Mark




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Mysql db connection leaking?

2015-04-16 Thread Mark Kirkwood

On 17/04/15 09:20, Qiming Teng wrote:


Wondering if there is something misconfigured in my devstack
environment, which was reinstalled on RHEL7 about 10 days ago.
I'm often running into mysql connections problem as shown below:

$ mysql
ERROR 1040 (HY000): Too many connections

When I try dump the mysql connection list, I'm getting the followng
result after a 'systemctl restart mariadb.service':

$ mysqladmin processlist | grep nova | wc -l
125

Most of the connections are at Sleep status:

$ mysqladmin processlist | grep nova | grep Sleep | wc -l
123

As for the workload, I'm currently only running two VMs in a multi-host
devstack environment.

So, my questions:

   - Why do we have so many mysql connections from nova?
   - Is it possible this is caused some misconfigurations?
   - 125 connections in such a toy setup is insane, any hints on nailing
 down the connections to the responsible nova components?



Can you show us the full listing for

mysqladmin processlist

as 125 seems high for a toy setup? How many nova instances are running? 
I have a (single node) devstack setup with 2 nova instances running and 
I have 9 mysql connections to the nova db.
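
One quick way to see which component/database owns all the sleeping connections 
(assuming the information_schema.processlist view is available, as it is on 
stock MySQL/MariaDB):

$ mysql -e "select user, db, command, count(*) from information_schema.processlist 
    group by user, db, command order by count(*) desc;"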


Regards

Mark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Debian already using Python 3.5: please gate on that

2015-06-20 Thread Mark Kirkwood

On 21/06/15 01:32, Doug Hellmann wrote:

Excerpts from Dave Walker's message of 2015-06-20 14:05:48 +0100:

On 20 Jun 2015 1:05 pm, Doug Hellmann d...@doughellmann.com wrote:
SNIP



Whether we want to support 3.4 and 3.5, or just 3.4 and then just 3.5
is an ecosystem question IMO, not an upstream one. 3.4 and 3.5 are
very similar when you consider the feature set crossover with 2.7.


Right, and IIRC that's why we said at the summit that we would rather
not take the resources (in terms of people and CI servers) to run tests
against both, yet. Let's get one project fully working on one version of
python 3, and then we can start thinking about whether we need to
test multiple versions.


Having information available, even if there is no immediate intent to
support it, seems like something we should constantly be driving for.  It
should certainly be non-voting, but if resources are limited - perhaps a
periodic job, rather than gate?


OTOH, if Canonical doesn't release a version of 3.4 that removes the
core dump bug soon, I will support moving fully to 3.5 or another test
platform, because that bug is causing us trouble in Oslo still.


s/Canonical/ubuntu

Can you link to the bug? I did a quick search, but couldn't find it quickly.



Hmm, https://bugs.launchpad.net/ubuntu/+source/python3.4/+bug/1367907 is
marked as released but somehow it's still being triggered for
oslo.messaging.



FWIW http://packages.ubuntu.com/search?keywords=python3.4 suggests that 
Trusty is still using python 3.4.0, and you don't get 3.4.3 until Vivid, 
which is a bit sad - definitely worth quizzing the Ubuntu folks about 
what the plan is there.


Regards

Mark


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [puppet] [Swift] Multiple proxy recipes will create out of sync rings

2015-06-11 Thread Mark Kirkwood

I've been looking at using puppet-swift to deploy a swift cluster.

Firstly - without 
http://git.openstack.org/cgit/stackforge/puppet-swift/tree/tests/site.pp 
I would have struggled a great deal more to get up and running, so a big 
thank you for a nice worked example of how to do multiple nodes!


However I have stumbled upon a problem - with respect to creating 
multiple proxy nodes. There are some recipes around that follow on from 
the site.pp above and explicitly build >1 proxy (e.g. 
https://github.com/CiscoSystems/puppet-openstack-ha/blob/folsom_ha/examples/swift-nodes.pp)


Now the problem is - each proxy node does a ring builder create, so ends 
up with *different* builder (and therefore) ring files. This is not 
good, as the end result is a cluster with all storage nodes and *one* 
proxy with the same set of ring files, and *all* other proxies with 
*different* ring (and builder) files.


I have used logic similar to the attached to work around this, i.e only 
create rings if we are the 'ring server', otherwise get 'em via rsync.


Thoughts?

Regards

Mark
  # create the ring if we are the ring server
  if $ipaddress_eth0 == ringserver_local_net_ip {
    class { 'swift::ringbuilder':
      # the part power should be determined by assuming 100 partitions per drive
      part_power     => '18',
      replicas       => '2',
      min_part_hours => 1,
      require        => Class['swift'],
    }

    # sets up an rsync db that can be used to sync the ring DB
    class { 'swift::ringserver':
      local_net_ip => $ipaddress_eth0,
    }

    # exports rsync gets that can be used to sync the ring files
    @@swift::ringsync { ['account', 'object', 'container']:
      ring_server => $ipaddress_eth0,
    }
  } else {
    # collect resources for synchronizing the ring databases
    Swift::Ringsync<<| |>>
  }

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [puppet] [Swift] Multiple proxy recipes will create out of sync rings

2015-06-12 Thread Mark Kirkwood

On 12/06/15 17:27, Mark Kirkwood wrote:

I've looking at using puppet-swift to deploy a swift cluster.

Firstly - without
http://git.openstack.org/cgit/stackforge/puppet-swift/tree/tests/site.pp
I would have struggled a great deal more to get up and running, so a big
thank you for a nice worked example of how to do multiple nodes!

However I have stumbled upon a problem - with respect to creating
multiple proxy nodes. There are some recipes around that follow on from
the site.pp above and explicitly build >1 proxy (e.g.
https://github.com/CiscoSystems/puppet-openstack-ha/blob/folsom_ha/examples/swift-nodes.pp)


Now the problem is - each proxy node does a ring builder create, so ends
up with *different* builder (and therefore) ring files. This is not
good, as the end result is a cluster with all storage nodes and *one*
proxy with the same set of ring files, and *all* other proxies with
*different* ring (and builder) files.

I have used logic similar to the attached to work around this, i.e only
create rings if we are the 'ring server', otherwise get 'em via rsync.

Thoughts?



I should have noted that the previously mentioned site.pp introduced the 
idea of a ringmaster or ringserver host, and I made use of that, i.e. only 
one proxy actually builds the rings.


Also I see in my effort to provide a simple fragment I left off a $ in 
front of some variables, and I should probably have defined the 
ringserver ip to make it clear what was happening, e.g:



$ringserver_local_net_ip = '192.168.5.200'
if $ipaddress_eth0 == $ringserver_local_net_ip {


Cheers

Mark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [puppet] [Swift] Multiple proxy recipes will create out of sync rings

2015-06-14 Thread Mark Kirkwood
With respect to using a seed - the facility to supply one to the 
rebalance operation has recently been added to puppet-swift master 
branch (commit b8b4434), however the seed parameter is not available to 
any of the usual calling methods (this looks to be deliberate from the 
commit message), so is not immediately useful without surgery :-)
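
For reference, the underlying command that the module wraps takes the seed as 
an optional trailing argument, so outside puppet you can get reproducible rings 
with something like (same seed, same devices added in the same order on every 
node):

$ swift-ring-builder object.builder rebalance 42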


Regards

Mark

On 13/06/15 18:05, Mark Kirkwood wrote:

 From what I can see, the ring gets created and rebalanced in
puppet-swift/manifest/ringbuilder.pp i.e calling:

   class { '::swift::ringbuilder':
     # the part power should be determined by assuming 100 partitions per drive
     part_power     => '18',
     replicas       => '3',
     min_part_hours => 1,
     require        => Class['swift'],
   }

*not* when each device is added.

Yeah, using a seed is probably a good solution too. For the moment I'm
using the idea of one proxy being a 'ring server/master' which achieves
the same thing (identical rings everywhere). However I'll have a look at
using a seed, as this may simplify the code and also the operational
procedure needed to replace said 'master' if it fails (i.e to avoid
accidentally creating a new ring when you really don't need to...)

Regards,

Mark

On 12/06/15 23:10, McCabe, Donagh wrote:

I skimmed the code, but since I'm not familiar with the environment, I
could not find where swift-ring-builder rebalance is invoked. I'm
guessing that each time you add a device to a ring, a rebalance is
also done. Leaving aside how inefficient that is, the key thing is
that the rebalance command has an optional seed parameter. Unless
you explicitly set the seed (to the same value on all nodes, obviously), you
won't get the same ring on all nodes. You also need to make sure you
add the same set of drives and in the same order.

Regards,
Donagh
-Original Message-
From: Mark Kirkwood [mailto:mark.kirkw...@catalyst.net.nz]
Sent: 12 June 2015 06:28
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [puppet] [Swift] Multiple proxy recipes will
create out of sync rings

I've looking at using puppet-swift to deploy a swift cluster.

Firstly - without
http://git.openstack.org/cgit/stackforge/puppet-swift/tree/tests/site.pp
I would have struggled a great deal more to get up and running, so a
big thank you for a nice worked example of how to do multiple nodes!

However I have stumbled upon a problem - with respect to creating
multiple proxy nodes. There are some recipes around that follow on
from the site.pp above and explicitly build >1 proxy (e.g.
https://github.com/CiscoSystems/puppet-openstack-ha/blob/folsom_ha/examples/swift-nodes.pp)


Now the problem is - each proxy node does a ring builder create, so
ends up with *different* builder (and therefore) ring files. This is
not good, as the end result is a cluster with all storage nodes and
*one* proxy with the same set of ring files, and *all* other proxies with
*different* ring (and builder) files.

I have used logic similar to the attached to work around this, i.e
only create rings if we are the 'ring server', otherwise get 'em via
rsync.

Thoughts?

Regards

Mark
__

OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [puppet] [Swift] Multiple proxy recipes will create out of sync rings

2015-06-13 Thread Mark Kirkwood
From what I can see, the ring gets created and rebalanced in 
puppet-swift/manifest/ringbuilder.pp i.e calling:

  class { '::swift::ringbuilder':
    # the part power should be determined by assuming 100 partitions per drive
    part_power     => '18',
    replicas       => '3',
    min_part_hours => 1,
    require        => Class['swift'],
  }

*not* when each device is added.

Yeah, using a seed is probably a good solution too. For the moment I'm 
using the idea of one proxy being a 'ring server/master' which achieves 
the same thing (identical rings everywhere). However I'll have a look at 
using a seed, as this may simplify the code and also the operational 
procedure needed to replace said 'master' if it fails (i.e to avoid 
accidentally creating a new ring when you really don't need to...)


Regards,

Mark

On 12/06/15 23:10, McCabe, Donagh wrote:

I skimmed the code, but since I'm not familiar with the environment, I could not find where 
swift-ring-builder rebalance is invoked. I'm guessing that each time you add a device 
to a ring, a rebalance is also done. Leaving aside how inefficient that is, the key thing is that 
the rebalance command has an optional seed parameter. Unless you explicitly set the 
seed (to the same value on all nodes, obviously), you won't get the same ring on all nodes. You also need 
to make sure you add the same set of drives and in the same order.

Regards,
Donagh
-Original Message-
From: Mark Kirkwood [mailto:mark.kirkw...@catalyst.net.nz]
Sent: 12 June 2015 06:28
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [puppet] [Swift] Multiple proxy recipes will create 
out of sync rings

I've looking at using puppet-swift to deploy a swift cluster.

Firstly - without
http://git.openstack.org/cgit/stackforge/puppet-swift/tree/tests/site.pp
I would have struggled a great deal more to get up and running, so a big thank 
you for a nice worked example of how to do multiple nodes!

However I have stumbled upon a problem - with respect to creating multiple proxy 
nodes. There are some recipes around that follow on from the site.pp above and 
explicitly build >1 proxy (e.g.
https://github.com/CiscoSystems/puppet-openstack-ha/blob/folsom_ha/examples/swift-nodes.pp)

Now the problem is - each proxy node does a ring builder create, so ends up 
with *different* builder (and therefore) ring files. This is not good, as the 
end result is a cluster with all storage nodes and *one* proxy with the same 
set of ring files, and *all* other proxies with
*different* ring (and builder) files.

I have used logic similar to the attached to work around this, i.e only create 
rings if we are the 'ring server', otherwise get 'em via rsync.

Thoughts?

Regards

Mark
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [puppet] [swift] Storage service startup should await ring creation

2015-08-14 Thread Mark Kirkwood

On 10/07/15 12:43, Mark Kirkwood wrote:

Hi,

I am using puppet-swift to deploy a swift multi node cluster (Icehouse),
following the setup in supplied tests/site.pp. I am running into two
issues that seem to be related to the subject above:

1/ Errors when the storage replication services try to start before the
ring files exist. e.g:

Error: Could not start Service[swift-object-replicator]: Execution of
'/sbin/start swift-object-replicator' returned 1: start: Job failed to
start
Wrapped exception:
Execution of '/sbin/start swift-object-replicator' returned 1: start:
Job failed to start
Error:
/Stage[main]/Swift::Storage::Object/Swift::Storage::Generic[object]/Service[swift-object-replicator]/ensure:
change from stopped to running failed: Could not start
Service[swift-object-replicator]: Execution of '/sbin/start
swift-object-replicator' returned 1: start: Job failed to start

Now these will be fixed the *next* time I do a puppet run (provided I've
performed a run on the appropriate proxy/ringmaster). However the
failing services make scripted testing difficult as we have to put in
logic to the effect don't worry about errors the 1st time.

2/ Container and object stats not updated without full restart of services

This one is a bit more subtle - everything works but 'swift stat' always
shows zero objects and bytes for every container. The only way to fix
this is to stop and start all services on each storage node.

Again this complicates scripted builds as there is the need to go and
stop + start all the swift storage services! Not to mention an extra
little quirk for ops to remember at zero dark 30 oclock...

I've made a patch that prevents these services starting until the ring
files exist (actually for now it just checks the object ring) - see
attached.

Now while I'm not entirely sure that this is the best way to solve the
issue (custom fact that changes the service start flag)...I *do* think
that making the storage services await the ring existence *is* a needed
change, so any thoughts on better ways to do this are appreciated.

Also note that this change *does* require one more puppet run on the
storage nodes:
- one to create the storage servers config and drives
- one to get the ring from the proxy/ringmaster
- one to start the services



I decided to work around these rather than trying to battle in my patch 
to the swift module.


For 1/ I'm trapping the return code for the 1st puppet run and handling 
errors there... and not doing anything for any subsequent run, as there 
shouldn't be any errors thereafter. Seems to work ok.


For 2/ I'm inserting an exec in our driving puppet code to just stop and 
start (not restart as that does not solve it...growl) *all* the services 
on a storage node. e.g (see tests/site.pp in the swift module for context):


  # collect resources for synchronizing the ring databases
  Swift::Ringsync<<| |>>

  # stop and start all swift services
  # this is a workaround due to the services starting before the ring
  # is synced which stops stats (and maybe other things) working.
  # We can't just restart as this does *not* achieve the same result.
  exec { 'stop-start-services':
    provider  => shell,
    command   => 'for x in `initctl list|grep swift|awk \'{print $1}\'`;do stop $x;start $x;done;exit 0',
    path      => '/usr/bin:/usr/sbin:/bin:/sbin',
    logoutput => true,
  }

  # make sure we stop and start all services after syncing ring
  Swift::Ringsync<<| |>>
  -> Exec['stop-start-services']


Ok, so it is a pretty big hammer, but leaves all of the services in a 
known good state, so I'm happy.


Regards

Mark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [swift] [ceilometer] Ceilometer log dir permissions bust swift proxy

2015-07-15 Thread Mark Kirkwood

On 14/07/15 11:05, Mark Kirkwood wrote:

On 13/07/15 06:44, Emilien Macchi wrote:




Yeah, I guess I should raise a bug with Ubuntu so it (maybe) gets sorted
out. However in the meantime we may have to work around it by amending
the dir permissions in puppet (the alternative - rolling our own
packages is undesirable in general). However I note there are also
python version mistakes in the ceilometer requires.txt which need
patching (sigh).



For completeness, with reference to that last point the latest 
(2015.1.5) releases for ceilometer on Ubuntu 14.04 seem to have 
corrected the requires.txt (which is great).


Regards

Mark


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [swift] [ceilometer] Ceilometer log dir permissions bust swift proxy

2015-07-13 Thread Mark Kirkwood

On 13/07/15 06:44, Emilien Macchi wrote:


I think this is something that could be fixed in the packaging scripts.
I can see you're using Puppet to deploy OpenStack, and fwiw, we are
stopping managing permissions in Puppet because of packaging overlap.
From now on, we totally rely on packaging permissions. We are in the
process of dropping all the code that hardcodes POSIX user/group/permissions
in our manifests.



Yeah, I guess I should raise a bug with Ubuntu so it (maybe) gets sorted 
out. However in the meantime we may have to work around it by amending 
the dir permissions in puppet (the alternative - rolling our own 
packages is undesirable in general). However I note there are also 
python version mistakes in the ceilometer requires.txt which need 
patching (sigh).


Regards

Mark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [swift] [ceilometer] Ceilometer log dir permissions bust swift proxy

2015-07-10 Thread Mark Kirkwood

Hi,

I'm deploying a swift 1.13 cluster on Ubuntu 14.04 and enabling 
ceilometer in the proxy pipeline results in it not working.


The cause appears to be the log directory perms (note I am running the 
proxy under Apache):


[Fri Jul 10 05:12:15.126214 2015] [:error] [pid 6844:tid 
140048779998976] [remote 192.168.5.1:21419] IOError: [Errno 13] 
Permission denied: '/var/log/ceilometer/proxy-server.wsgi.log'


Sure enough:

$ ls -la /var/log/ceilometer/
total 16
drwxr-x---  2 ceilometer adm    4096 Jul 10 02:55 .

Looking at the swift user:

$ id swift
uid=50144(swift) gid=50145(swift) 
groups=50145(swift),4(adm),106(puppet),115(ceilometer)


This looks like https://bugs.launchpad.net/swift/+bug/1269473 but the 
code I have includes the fix for that. In fact it looks like the 
directory permissions are just being set wrong (indeed chmod'ing them to 
be 770 fixes this).
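
(The interim workaround here is a one-liner - though the proper fix belongs in 
packaging or puppet:)

$ sudo chmod 770 /var/log/ceilometer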


Am I missing something? I don't see how this can possibly work unless 
the directory allows group write.


Regards

Mark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [puppet] [swift] Storage service startup should await ring creation

2015-07-09 Thread Mark Kirkwood

Hi,

I am using puppet-swift to deploy a swift multi-node cluster (Icehouse), 
following the setup in the supplied tests/site.pp. I am running into two 
issues that seem to be related to the subject above:


1/ Errors when the storage replication services try to start before the 
ring files exist. e.g:


Error: Could not start Service[swift-object-replicator]: Execution of 
'/sbin/start swift-object-replicator' returned 1: start: Job failed to start

Wrapped exception:
Execution of '/sbin/start swift-object-replicator' returned 1: start: 
Job failed to start
Error: 
/Stage[main]/Swift::Storage::Object/Swift::Storage::Generic[object]/Service[swift-object-replicator]/ensure: 
change from stopped to running failed: Could not start 
Service[swift-object-replicator]: Execution of '/sbin/start 
swift-object-replicator' returned 1: start: Job failed to start


Now these will be fixed the *next* time I do a puppet run (provided I've 
performed a run on the appropriate proxy/ringmaster). However the 
failing services make scripted testing difficult as we have to put in 
logic to the effect of "don't worry about errors the 1st time".


2/ Container and object stats not updated without full restart of services

This one is a bit more subtle - everything works but 'swift stat' always 
shows zero objects and bytes for every container. The only way to fix 
this is to stop and start all services on each storage node.


Again this complicates scripted builds as there is the need to go and 
stop + start all the swift storage services! Not to mention an extra 
little quirk for ops to remember at zero dark 30 o'clock...


I've made a patch that prevents these services starting until the ring 
files exist (actually for now it just checks the object ring) - see 
attached.


Now while I'm not entirely sure that this is the best way to solve the 
issue (custom fact that changes the service start flag)...I *do* think 
that making the storage services await the ring existence *is* a needed 
change, so any thoughts on better ways to do this are appreciated.


Also note that this change *does* require one more puppet run on the 
storage nodes:

- one to create the storage servers config and drives
- one to get the ring from the proxy/ringmaster
- one to start the services

Regards

Mark
From 1583d68eedbeaecbacb5a29258343b9e980ce4a4 Mon Sep 17 00:00:00 2001
From: Mark Kirkwood mark.kirkw...@catalyst.net.nz
Date: Fri, 10 Jul 2015 11:18:16 +1200
Subject: [PATCH] Make storage service startup wait for the ring

This solves two problems:

1/ Errors when the replication services fail to start when the
ring files are absent.

2/ Container and object stats not updated without full restart of services

While neither of these issues prevent a working deployment, they play
havoc with fully automated deployment for ci etc (and make arcane
knowledge necessary for ops).

This change *does* require one more puppet run on the storage nodes:
- one to create the storage servers config and drives
- one to get the ring from the proxy/ringmaster
- one to start the services

(amend: make sure the custom fact goes in the correct dir)
Change-Id: I168c26aed5f6dfc337ea8bc5f863e15f6e86d4a2
---
 modules/swift/lib/facter/ringfact.rb |  5 +
 modules/swift/manifests/storage/account.pp   | 14 ++
 modules/swift/manifests/storage/container.pp | 17 -
 modules/swift/manifests/storage/object.pp| 15 +++
 4 files changed, 38 insertions(+), 13 deletions(-)
 create mode 100644 modules/swift/lib/facter/ringfact.rb

diff --git a/modules/swift/lib/facter/ringfact.rb b/modules/swift/lib/facter/ringfact.rb
new file mode 100644
index 000..4816cb8
--- /dev/null
+++ b/modules/swift/lib/facter/ringfact.rb
@@ -0,0 +1,5 @@
+Facter.add("ringexists") do
+  setcode do
+    File.exists?("/etc/swift/object.ring.gz") ? true : false
+  end
+end
diff --git a/modules/swift/manifests/storage/account.pp b/modules/swift/manifests/storage/account.pp
index a4398c3..1719776 100644
--- a/modules/swift/manifests/storage/account.pp
+++ b/modules/swift/manifests/storage/account.pp
@@ -18,16 +18,22 @@ class swift::storage::account(
   $enabled        = true,
   $package_ensure = 'present'
 ) {
+  if $::ringexists == false {
+    $service_enabled = false
+  } else {
+    $service_enabled = $enabled
+  }
+
   swift::storage::generic { 'account':
-    enabled        => $enabled,
+    enabled        => $service_enabled,
     manage_service => $manage_service,
     package_ensure => $package_ensure,
   }
 
   include swift::params
 
   if $manage_service {
-    if $enabled {
+    if $service_enabled {
       $service_ensure = 'running'
     } else {
       $service_ensure = 'stopped'
@@ -37,7 +43,7 @@ class swift::storage::account(
   service { 'swift-account-reaper':
     ensure    => $service_ensure,
     name      => $::swift::params::account_reaper_service_name,
-    enable    => $enabled,
+    enable    => $service_enabled,
     provider  => $::swift::params

[openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Mark Kirkwood

After looking at:

https://www.youtube.com/watch?v=9YHvYkcse-k

I have a question (that follows on from Bruno's) about using erasure 
coding with geo replication.


Now the example given to show why you could/should not use erasure 
coding with geo replication is somewhat flawed as it is immediately 
clear that you cannot set:


- num_data_frags > num_devices (or nodes) in a region

and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to 
me like if you have:


- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.
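
For concreteness, the sort of policy involved looks something like the 
following (a sketch only - the ec_type and fragment counts here are 
illustrative, with option names as per the Swift erasure coding docs):

$ cat /etc/swift/swift.conf
...
[storage-policy:2]
name = ec-geo
policy_type = erasure_coding
ec_type = jerasure_rs_vand
ec_num_data_fragments = 4
ec_num_parity_fragments = 4
ec_object_segment_size = 1048576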

So my real question is - it looks like it *is* possible to use erasure 
coding in geo replicated situations - however I may well be missing 
something significant, so I'd love some clarification here [1]!


Cheers

Mark

[1] Reduction in disk usage and net traffic looks attractive

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Mark Kirkwood

On 15/02/16 23:29, Kota TSUYUZAKI wrote:

Hello Mark,

AFAIK, there are a few reasons for that; we are still working on erasure code 
+ geo replication.


and expect to survive a region outage...

With that I mind I did some experiments (Liberty swift) and it looks to me like 
if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.


Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable) 
still has a problem where it cannot decode the original data when all the fed 
fragments are parity frags[1]. (i.e. if you set num_parity_frags = num_data_frags 
and then num_parity_frags come into the proxy for a GET request, it will fail 
at decoding.) The problem was already resolved in PyECLib/liberasurecode on the 
master branch, and current swift master has the PyECLib>=1.0.7 dependency, so if 
you are thinking of using the newest Swift, it might not be a problem.



Ah right, in my testing I always took down my "1st" region...which will 
have had data fragments therein. For interest I'll try to provoke a 
situation where I have all parity ones to assemble (and see what happens).




From the Swift perspective, I think we need more tests/discussion for geo 
replication around write/read affinity[2], which is the geo replication 
machinery in Swift itself, and around performance.

For write/read affinity, we deliberately did not consider affinity control 
(to keep the implementation simple) until EC landed in Swift master[3], so I 
think it's now time to work out how we can use affinity control with EC, but 
that's not done yet.
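
(For reference, the affinity controls mentioned here are proxy-server.conf 
options - a minimal sketch with illustrative region values, preferring region 
1 for reads and constraining initial writes to region 1:)

  [app:proxy-server]
  use = egg:swift#proxy
  sorting_method = affinity
  read_affinity = r1=100, r2=200
  write_affinity = r1
  write_affinity_node_count = 2 * replicas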

On the performance side, in my experiments more parity fragments cause quite 
significant performance degradation[4]. To prevent that degradation, I am 
working on a spec which makes duplicated copies of the data/parity fragments 
and spreads them out across the geo regions.

To summarize, we've not done the work yet, but we welcome discussion and 
contributions for EC + geo replication anytime, IMO.

Thanks,
Kota

1: 
https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
2: 
http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
3: 
http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
4: 
https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html





Excellent - thank you for a very comprehensive answer.

Regards

Mark





Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Mark Kirkwood

On 16/02/16 17:10, Mark Kirkwood wrote:

On 15/02/16 23:29, Kota TSUYUZAKI wrote:

Hello Mark,

AFAIK, there are a few reasons why this is still a work in progress for
erasure code + geo replication.


and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks
to me like if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.


Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable)
still has a problem where it cannot decode the original data when all of
the fragments fed to it are parity frags[1] (i.e. if you set
num_parity_frags = num_data_frags and only parity fragments reach the
proxy for a GET request, it will fail at decoding). The problem is already
resolved in PyECLib/liberasurecode on the master branch, and current Swift
master has the PyECLib>=1.0.7 dependency, so if you intend to use the
newest Swift it may not be an issue.



Ah right, in my testing I always took down my "1st" region...which will
have had data fragments therein. For interest I'll try to provoke a
situation where I have all parity ones to assemble (and see what happens).




So I tried this out - it still works fine. Checking the pyeclib version, I see 
Ubuntu 15.10 is giving me:


- Swift 2.5.0
- pyeclib 1.0.8
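
(One way to check the same thing on other nodes - assuming Ubuntu packages, or 
pip for a source install:)

  $ dpkg -l | egrep 'python-(pyeclib|swift)'
  $ pip show pyeclib | grep -i ^version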

Hmmm - Canonical deliberately upping the version of pyeclib (shock)? 
Interesting... anyway, that explains why I cannot get it to fail. However, all 
your other points are noted, and again thanks!


Regards

Mark




Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-04-19 Thread Mark Kirkwood

Hi,

Has the release of 2.7 significantly changed the assessment here?

Thanks

Mark

On 15/02/16 23:29, Kota TSUYUZAKI wrote:

Hello Mark,

AFAIK, there are a few reasons why this is still a work in progress for 
erasure code + geo replication.


and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to me like 
if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.


Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable) still 
has a problem where it cannot decode the original data when all of the 
fragments fed to it are parity frags[1] (i.e. if you set 
num_parity_frags = num_data_frags and only parity fragments reach the proxy 
for a GET request, it will fail at decoding). The problem is already resolved 
in PyECLib/liberasurecode on the master branch, and current Swift master has 
the PyECLib>=1.0.7 dependency, so if you intend to use the newest Swift it may 
not be an issue.

From the Swift perspective, I think we need more tests/discussion for geo 
replication around write/read affinity[2], which is the geo replication 
machinery in Swift itself, and around performance.

For write/read affinity, we deliberately did not consider affinity control 
(to keep the implementation simple) until EC landed in Swift master[3], so I 
think it's now time to work out how we can use affinity control with EC, but 
that's not done yet.

On the performance side, in my experiments more parity fragments cause quite 
significant performance degradation[4]. To prevent that degradation, I am 
working on a spec which makes duplicated copies of the data/parity fragments 
and spreads them out across the geo regions.

To summarize, we've not done the work yet, but we welcome discussion and 
contributions for EC + geo replication anytime, IMO.

Thanks,
Kota

1: 
https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
2: 
http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
3: 
http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
4: 
https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html



(2016/02/15 18:00), Mark Kirkwood wrote:

After looking at:

https://www.youtube.com/watch?v=9YHvYkcse-k

I have a question (that follows on from Bruno's) about using erasure coding 
with geo replication.

Now the example given to show why you could/should not use erasure coding with 
geo replication is somewhat flawed as it is immediately clear that you cannot 
set:

- num_data_frags > num_devices (or nodes) in a region

and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to me like 
if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.

So my real question is - it looks like it *is* possible to use erasure coding 
in geo replicated situations - however I may well be missing something 
significant, so I'd love some clarification here [1]!

Cheers

Mark

[1] Reduction in disk usage and net traffic looks attractive











Re: [openstack-dev] [trove][all][tc] A proposal to rearchitect Trove

2017-06-20 Thread Mark Kirkwood

On 21/06/17 02:08, Jay Pipes wrote:


On 06/20/2017 09:42 AM, Doug Hellmann wrote:

Does "service VM" need to be a first-class thing?  Akanda creates
them, using a service user. The VMs are tied to a "router" which
is the billable resource that the user understands and interacts with
through the API.


Frankly, I believe all of these types of services should be built as 
applications that run on OpenStack (or other) infrastructure. In other 
words, they should not be part of the infrastructure itself.


There's really no need for a user of a DBaaS to have access to the 
host or hosts the DB is running on. If the user really wanted that, 
they would just spin up a VM/baremetal server and install the thing 
themselves.




Yes, I think this area is where some hard thinking would be rewarded. I 
recall when I first met Trove, in my mind I expected to be 'carving off 
a piece of database'...and was a bit surprised to discover that it 
(essentially) leveraged Nova VM + OS + DB (no criticism intended - just 
saying I was surprised). Of course after delving into how it worked I 
realized that it did make sense to make use of the various Nova things 
(schedulers etc) *but* now that we are thinking about re-architecting 
(plus more options exist now), it would make sense to revisit this area.


Best wishes

Mark



Re: [openstack-dev] [osc][swift] Setting storage policy for a container possible via the client?

2018-04-26 Thread Mark Kirkwood



On 20/04/18 04:54, Dean Troyer wrote:

On Thu, Apr 19, 2018 at 7:51 AM, Doug Hellmann  wrote:

Excerpts from Mark Kirkwood's message of 2018-04-19 16:47:58 +1200:

Swift has had storage policies for a while now. These are enabled by
setting the 'X-Storage-Policy' header on a container.

It looks to me like this is not possible using openstack-client (even in
master branch) - while there is a 'set' operation for containers this
will *only* set  'Meta-*' type headers.

It seems to me that adding this would be highly desirable. Is it in the
pipeline? If not I might see how much interest there is at my end for
adding such - as (famous last words) it looks pretty straightforward to do.

I can't imagine why we wouldn't want to implement that and I'm not
aware of anyone working on it. If you're interested and have time,
please do work on the patch(es).

The primary thing that hinders Swift work like this is that OSC does not
use swiftclient, as it wasn't a standalone thing yet when I wrote that
bit (lifting much of the actual API code from swiftclient). We decided a
while ago not to add that dependency, and to drop the OSC-specific object
code and use the SDK when we start using the SDK for everything else,
after there is an SDK 1.0 release.

Moving forward on this today using either OSC's api.object code or the
SDK would be fine, with the same SDK caveat we have with Neutron: since
the SDK isn't 1.0 we may have to play catch-up and maintain multiple SDK
release compatibilities (which has happened at least twice).


Ok, understood. I've uploaded a small patch that adds policy specification 
to 'container create' and adds some policy details display to 'container 
show' and 'object store account show' [1]. It uses the existing api design, 
but tries to get the display to look a little like what the swift cli 
provides (particularly for the account info).
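
(For illustration only - the exact option and field names are whatever the 
patch under review settles on, so treat these as indicative:)

  $ openstack container create --storage-policy ec-geo mycontainer   # option name illustrative
  $ openstack container show mycontainer
  $ openstack object store account show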


regards
Mark
[1] Gerrit topic is objectstorepolicies




[openstack-dev] [osc][swift] Setting storage policy for a container possible via the client?

2018-04-18 Thread Mark Kirkwood
Swift has had storage policies for a while now. These are enabled by 
setting the 'X-Storage-Policy' header on a container.


It looks to me like this is not possible using openstack-client (even in 
master branch) - while there is a 'set' operation for containers this 
will *only* set  'Meta-*' type headers.
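
(To illustrate the gap: today a property set via osc becomes a 'Meta-*' 
header, while the policy header has to be applied out-of-band, e.g. with the 
swift client when the container is first created - a rough sketch:)

  $ openstack container set --property foo=bar mycontainer      # becomes X-Container-Meta-Foo
  $ swift post --header 'X-Storage-Policy: ec-geo' mycontainer  # policy can only be set at creation
  $ swift stat mycontainer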


It seems to me that adding this would be highly desirable. Is it in the 
pipeline? If not I might see how much interest there is at my end for 
adding such - as (famous last words) it looks pretty straightforward to do.


regards

Mark

