Re: [openstack-dev] [Nova] Handling soft delete for instance rows in a new cells database

2014-11-25 Thread Ahmed RAHAL

Hi,

On 2014-11-24 17:20, Michael Still wrote:

Heya,

This is a new database, so it's our big chance to get this right. So,
ideas welcome...

Some initial proposals:

  - we do what we do in the current nova database -- we have a deleted
column, and we set it to true when we delete the instance.

  - we have shadow tables and we move deleted rows to a shadow table.

  - something else super clever I haven't thought of.


Some random thoughts that came to mind ...

1/ as far as I remember, you rarely want to delete a row
- it's usually a heavy DB operation (well, was back then)
- it's destructive (but we may want that)
- it creates fragmentation (less of a problem depending on db engine)
- it can break foreign key relations if not done the proper way

2/ updating a row to 'deleted=1'
- gives an opportunity to set a useful deletion timestamp
I would even argue that a non-NULL deleted_at field would suffice to
declare a row 'deleted'. I know, explicit is better than implicit ...

- the update operation is not destructive
- an admin/DBA can decide when and how to purge/archive rows

3/ moving the row at deletion
- you want to avoid additional steps to complete an operation, thus 
avoid creating a new record while deleting one
- even if you wrap things into a transaction, not being able to create a 
row somewhere can make your delete transaction fail
- if I were to archive all deleted rows, at scale I'd probably move them 
to another db server altogether



Now, I for one would keep the current mark-as-deleted model.

I however perfectly get the problem of massive churn with instance 
creation/deletion.
So, let's be crazy: why not have a config option with the values
'on_delete=mark_delete', 'on_delete=purge' or 'on_delete=archive' and
let the admin choose? (Is that feasible?)
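
To illustrate the three modes, here is a minimal sketch. It assumes a
hypothetical 'on_delete' setting and a made-up two-table SQLite schema;
none of the names below come from nova's actual schema or code.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema, for illustration only (not nova's real tables).
SCHEMA = """
CREATE TABLE instances (uuid TEXT PRIMARY KEY, deleted_at TEXT);
CREATE TABLE shadow_instances (uuid TEXT PRIMARY KEY, deleted_at TEXT);
"""


def make_db():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def delete_instance(conn, uuid, on_delete="mark_delete"):
    """Delete one instance row according to the (assumed) on_delete policy."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction: commit on success, roll back on error
        if on_delete == "mark_delete":
            # Soft delete: a non-NULL deleted_at marks the row as deleted.
            conn.execute(
                "UPDATE instances SET deleted_at = ? "
                "WHERE uuid = ? AND deleted_at IS NULL",
                (now, uuid),
            )
        elif on_delete == "purge":
            # Hard delete: the row is gone for good.
            conn.execute("DELETE FROM instances WHERE uuid = ?", (uuid,))
        elif on_delete == "archive":
            # Move the row: if the insert fails, the delete is rolled back too.
            conn.execute(
                "INSERT INTO shadow_instances (uuid, deleted_at) "
                "SELECT uuid, ? FROM instances WHERE uuid = ?",
                (now, uuid),
            )
            conn.execute("DELETE FROM instances WHERE uuid = ?", (uuid,))
        else:
            raise ValueError("unknown on_delete policy: %r" % on_delete)
```

Note that the 'archive' branch is the only one that needs two statements
to succeed, which is the failure mode mentioned in point 3/ above.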


This would come in especially handy if the admin decides the global cells
database does not need to keep track of deleted instances, the cell-local
nova database being the absolute reference for that.


HTH,

Ahmed.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Log / error message format best practices standards

2014-06-26 Thread Ahmed RAHAL

Hi,

On 2014-06-26 12:14, boden wrote:

We were recently having a discussion over here in trove regarding a
standardized format to use for log and error messages - obviously
consistency is ideal (within and across projects). As this discussion
involves the broader dev community, bringing this topic to the list for
feedback...

[...]

For in-line values (#a above) I find single quotes the most consumable
as they are a clear indication the value came from code and moreover
provide a clear set of delimiters around the value. However to date
unquoted appears to be the most widely used.


+1


For appended values (#b above) I find a delimiter such as ':' most
consumable as it provides a clear boundary between the message and
value. Using ':' seems fairly common today, but you'll find other
formatting throughout the code.


+1
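
To make both conventions concrete, here is a small sketch. The message
texts and function names are made up for illustration and are not taken
from trove or any other project.

```python
def inline_value(name):
    # (#a) in-line value: single quotes delimit the value inside the message
    return "Instance '%s' could not be found" % name


def appended_value(reason):
    # (#b) appended value: ':' separates the message from the value
    return "Failed to resize instance: %s" % reason
```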
--

Ahmed



Re: [openstack-dev] [nova] should we have a stale data indication in nova list/show?

2014-06-25 Thread Ahmed RAHAL

On 2014-06-25 14:26, Day, Phil wrote:

-----Original Message-----
From: Sean Dague [mailto:s...@dague.net]
Sent: 25 June 2014 11:49
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] should we have a stale data indication in
nova list/show?



+1 that the state shouldn't be changed.

What if we exposed the last updated time to users and allowed them to
decide whether it's significant or not?



This would just indicate the timestamp of the last operation.
There is already a field in nova show called 'updated' that gives some
indication of this. I honestly do not know what updates that field, but
if anything, this existing field could/should be used.



Ahmed.




Re: [openstack-dev] [nova] should we have a stale data indication in nova list/show?

2014-06-24 Thread Ahmed RAHAL

On 2014-06-24 17:38, Joe Gordon wrote:


On Jun 24, 2014 2:31 PM, Russell Bryant rbry...@redhat.com wrote:



  There be dragons here.  Just because Nova doesn't see the node reporting
  in, doesn't mean the VMs aren't actually still running.  I think this
  needs to be left to logic outside of Nova.
 
  For example, if your deployment monitoring really does think the host is
  down, you want to make sure it's *completely* dead before taking further
  action such as evacuating the host.  You certainly don't want to risk
  having the VM running on two different hosts.  This is just a business I
  don't think Nova should be getting in to.

I agree nova shouldn't take any actions. But I don't think leaving an
instance as 'active' is right either.  I was thinking move instance to
error state (maybe an unknown state would be more accurate) and let the
user deal with it, versus just letting the user deal with everything.
Since nova knows something *may* be wrong shouldn't we convey that to
the user (I'm not 100% sure we should myself).


I have seen compute nodes go down from a management perspective (say,
nova-compute disappeared) while the VMs were just fine. Reporting a
state change may therefore be misleading. The 'unknown' state would fit,
but nothing lets us presume the VMs are non-functional or impacted.


As far as an operator is concerned, a compute node not responding is
reason enough to check the situation.


To follow up on other comments related to customer feedback, there
are many reasons a customer may think his VM is down, so showing him
'useful information' will in some cases only trigger more anxiety.
Besides, people will start hammering the API to check 'state' instead of
using proper monitoring.

But, state is already reported if the customer shuts down a VM, so ...

Currently, compute node state reporting is done by the nova-compute
process itself, reporting back with a timestamp to the database
(through conductor, if I recall correctly). It's more like a watchdog
than a reporting system.
For VMs (assuming we find it useful) the same kind of process could
occur: nova-compute reporting back all states with timestamps for all
the VMs it hosts. This should then be optional, as I already sense
scaling/performance issues here (ceilometer, anyone?).
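
As a rough sketch of that watchdog idea: each reporter writes a
timestamp, and a reader only decides whether the last report is too old.
The 60-second threshold and all names below are assumptions made for
illustration, not nova's actual code or configuration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold after which a report is considered stale.
STALE_AFTER = timedelta(seconds=60)


def is_stale(last_report, now=None, stale_after=STALE_AFTER):
    """Return True when the last heartbeat is older than the threshold."""
    now = now or datetime.now(timezone.utc)
    return now - last_report > stale_after


def stale_reporters(reports, now=None):
    """reports maps a reporter name (node or VM) to its last heartbeat."""
    return sorted(name for name, ts in reports.items() if is_stale(ts, now))
```

A stale entry only says that the report is old. It says nothing about
whether the VMs behind it are actually down.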


Finally, assuming the customer had access to this 'unknown' state
information, what would he be able to do with it? Usually he has no
lever to 'evacuate' or 'recover' the VM. All he could do is spawn
another instance to replace the lost one. But only if the VM really is
currently unavailable, which is information he must get from other sources.


So, I see how state reporting could be useful information, but I am
not sure that nova Status is the right place for it.


Ahmed.



Re: [openstack-dev] [nova] should we have a stale data indication in nova list/show?

2014-06-24 Thread Ahmed RAHAL

Hi,

On 2014-06-24 20:12, Joe Gordon wrote:


Finally, assuming the customer had access to this 'unknown' state
information, what would he be able to do with it? Usually he has no
lever to 'evacuate' or 'recover' the VM. All he could do is spawn
another instance to replace the lost one. But only if the VM really
is currently unavailable, which is information he must get from other sources.


If I were a user, and my instance went to an 'UNKNOWN' state, I would
check if it's still operating, and if not delete it and start another
instance.


If I were a user and polled nova list/show on a regular basis just in
case the management plane reported a failure, I should have no
expectation whatsoever. If service availability is my concern, I should
monitor the service, nothing else. From there, once the service has
failed, I can imagine checking whether VM management is telling me
something. However, if my service is down and I no longer have access to
the VM ... simple case: destroy and respawn.


My point is that we should not make the nova state an expected source of
truth regarding service availability in the VM, as there is no way to
tell such a thing. If my VM is being DDoSed, nova would still say
everything is fine, while my service is really down. In that situation,
console access would help me determine whether VM management is wrong in
stating everything is ok or whether there is another root cause.
Similarly, should nova show a state change if load in the VM is through
the roof and the service is not responsive? Or if the OOM killer is
killing all my processes because of a memory shortage?


As stated before, providing such state information is misleading.
There are cases where node unavailability does not disrupt the service,
so the state would raise a false alarm, while the opposite (everything
is ok) is not at all indicative of a healthy service.


Maybe I am overlooking a use case here where the user of the service
absolutely needs to know about a potential problem with his hosting
platform.


Ahmed.

--
=
Ahmed Rahal ara...@iweb.com / iWeb Technologies
Spécialiste de l'Architecture TI
/ IT Architecture Specialist
=



Re: [openstack-dev] [nova] locked instances and snaphot

2014-06-17 Thread Ahmed RAHAL

Hi there,

On 2014-06-16 15:28, melanie witt wrote:

Hi all,


[...]


During the patch review, a reviewer raised a concern about the
purpose of instance locking and whether prevention of snapshot while
an instance is locked is appropriate. From what we understand,
instance lock is meant to prevent unwanted modification of an
instance. Is snapshotting considered a logical modification of an
instance? That is, if an instance is locked to a user, they take a
snapshot, create another instance using that snapshot, and modify the
instance, have they essentially modified the original locked
instance?

I wanted to get input from the ML on whether it makes sense to
disallow snapshot when an instance is locked.


Beyond 'preventing accidental change to the instance', locking could be
seen as 'preventing any operation' on the instance.
If I, as a user, lock an instance, it really only prevents me from
accidentally deleting the VM. As I can unlock whenever I need to, there
seems to be no other use case (chmod-like).
If I, as an admin, lock an instance, I am preventing operations on a VM
and am preventing an ordinary user from overriding the lock.
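
The two cases can be modelled with a toy sketch. The Instance class and
the lock/unlock/delete helpers below are hypothetical and only mirror
the reasoning above; they are not nova's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    name: str
    locked_by: Optional[str] = None  # None, "owner" or "admin"


def lock(instance, is_admin):
    # Remember who locked the instance so unlock can check authority.
    instance.locked_by = "admin" if is_admin else "owner"


def unlock(instance, is_admin):
    # An ordinary user cannot override a lock set by an admin.
    if instance.locked_by == "admin" and not is_admin:
        raise PermissionError("'%s' was locked by an admin" % instance.name)
    instance.locked_by = None


def delete(instance, is_admin):
    # While locked, the operation is refused for ordinary users.
    if instance.locked_by is not None and not is_admin:
        raise PermissionError("'%s' is locked" % instance.name)
    return "deleted"
```

With a user lock, the owner can always unlock and then delete, so the
lock only guards against accidents. With an admin lock, the user is
stuck until the admin lifts it.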


This is a form of authority enforcement that maybe should prevent even
snapshots from being taken of that VM. The thing is that, AFAIK, nothing
enforces this beyond the limits of nova, so cloning/snapshotting
cinder volumes will still be feasible.
Enforcing it only in nova as a kind of 'security feature' may become
misleading.


The more I think about it, the more I come to think that locking is just
there to avoid mistakes, not deliberate misbehaviour.


--

Ahmed
