On Jan 18, 2010, at 4:00 PM, Scott Lawrence wrote:

> On Mon, 2010-01-18 at 14:47 -0500, Marden P. Marshall wrote:
>> On Jan 18, 2010, at 1:39 PM, Scott Lawrence wrote:
>>> Actually, I don't think that realtime and fine-grained access to most
>>> configuration data is a good goal.  While on the surface it may seem so,
>>> I think it actually creates some serious management and system
>>> architecture challenges that are best avoided.
>>> 
>>> One of the core architectural principles that has guided most sipXecs
>>> service development is that configuration data for services is both:
>>> 
>>>    1. Distributed to the system on which the service runs so that the
>>>       service is, to the greatest extent possible, autonomous.
>>> 
>>>    2. Stored on that system so that a service restart or even system
>>>       reboot does not require communication with the central
>>>       management database to resume service.
>> 
>> As I stated before, mission-critical applications should maintain a
>> local configuration cache, making them autonomous in the event that
>> the database is unreachable.  Another approach is to use database
>> replication when deploying distributed systems; a more involved
>> strategy, but one that brings other benefits to the table.
> 
> Isn't maintaining such a cache exactly as much work as interpreting a
> separate configuration file or files?  Adding database replication also
> adds complexity and dependencies (and most likely increases our use of
> postgres-specific features, since replication is not standard across
> databases as far as I know).

No, it is not.  Since the cache is intended only for bulk configuration of
the application at startup, the control path is much simpler, involving a
very mechanical marshaling / unmarshaling of the data.  There are also data
binding services available which support both O/R mapping and XML binding,
thus providing a single-source solution.
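
Something along these lines is what I have in mind (a rough sketch in Java,
assuming JPA for the O/R mapping and JAXB for the XML binding; the class and
field names are hypothetical):

    import java.io.File;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.JAXBException;
    import javax.xml.bind.annotation.XmlRootElement;

    // A single annotated class drives both bindings: JPA maps it to the
    // configuration database, and JAXB marshals the same object to the
    // local startup cache.
    @Entity
    @XmlRootElement
    public class QueueConfig {
        @Id public String queueUri;    // hypothetical fields
        public int maxWaitSeconds;

        private static final File CACHE =
            new File("/var/sipxdata/cache/queue-config.xml");

        // Called after a successful read from the central database.
        public void writeCache() throws JAXBException {
            JAXBContext.newInstance(QueueConfig.class)
                       .createMarshaller().marshal(this, CACHE);
        }

        // Called at startup when the database is unreachable.
        public static QueueConfig readCache() throws JAXBException {
            return (QueueConfig) JAXBContext.newInstance(QueueConfig.class)
                                            .createUnmarshaller()
                                            .unmarshal(CACHE);
        }
    }

The same class definition drives both the database read and the cache file,
so there is only one schema to maintain.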

> 
>>> Having services access the SQL database directly has some bad
>>> properties:
>>> 
>>>     * It creates ambiguities, or even an inability to provide
>>>       service, when access to the database is unavailable.  This
>>>       limits the ability to distribute services on multiple systems,
>>>       especially when those systems are connected by less reliable
>>>       WAN links.
>> 
>> Again, service interruptions due to database unavailability are a
>> non-issue. 
> 
> It's not a non-issue.  You can argue that there are ways to address the
> issue, but none of them come for free, so it's an issue and an important
> one.

In the argument against direct database access, it is a non-issue, as there
are numerous valid ways to solve the problem.
 
> 
>> In addition, what is being proposed is more robust, because only the
>> subset of the configuration data relevant to the configuration change
>> need be communicated.
> 
> I don't see how that's true.  A complete configuration can be checked at
> multiple points for internal integrity and consistency (something we
> implement now in many services through schema validity tests of most XML
> configuration files).  That can't be done with partial configuration
> updates.

I do not think that there is any need to be concerned about the reliability of 
ODBC network transactions.

> 
> Granted, the amount of data to be moved when a change is made may be
> smaller (depending on the nature of the change), but I see that more as
> a performance issue than a robustness issue.
> 
>>>     * It creates potential performance bottlenecks.  We've seen this
>>>       in the call-state/CDR system in the proxy (in which records are
>>>       written to a local database by each proxy, and those databases
>>>       are read by the central call resolver to create the CDRs); the
>>>       current high-load stress tests fail first in the CDR subsystem.
>> 
>> There is no comparison between the throughput demands of a
>> configuration database and those of CDRs.  But should the PostgreSQL
>> service begin to fail due to heavy CDR database transactions, the
>> Config Server will not be able to function anyway, making this
>> argument moot.
> 
> I wasn't thinking in terms of throughput - I'm concerned with what
> happens when the database is not available.  The present scheme of
> replicating data to each component has the advantage (though we may not
> be exploiting it as often as we should) that the management application
> can know what configuration data has reached the services that need it,
> and display system state accordingly.  If the services are pulling data
> from the database directly, then sipXconfig can't know which have the
> data and which do not (unless we build an independent mechanism
> specifically to provide that feedback).

I see no reason why an application cannot provide such feedback, regardless of 
how the configuration has been conveyed.
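
For example (a sketch only; the service_status table and version column are
hypothetical), a service could record the configuration version it has
applied, and sipXconfig could display system state from that:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    class StatusReporter {
        // After applying a configuration, the service records the version
        // it is now running; the management application compares that with
        // the latest committed version to display per-service state.
        static void reportAppliedVersion(String jdbcUrl, long version)
                throws SQLException {
            try (Connection db = DriverManager.getConnection(jdbcUrl);
                 PreparedStatement st = db.prepareStatement(
                     "UPDATE service_status SET applied_version = ?"
                     + " WHERE service = 'sipxacd'")) {
                st.setLong(1, version);
                st.executeUpdate();
            }
        }
    }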

> 
>>>     * It places a greater burden on the management software to
>>>       maintain an internally consistent and usable configuration in
>>>       the database at all times.  By separating the database from the
>>>       configuration used by the services, we allow the management
>>>       software (and administrator) to directly control how and when
>>>       configuration data is propagated to the live services.  A series
>>>       of changes can be made in the database which, when complete,
>>>       form a new and useful configuration without worrying about
>>>       whether or not any intermediate state consisting of just some of
>>>       those changes is dysfunctional.  Many reconfiguration operations
>>>       consist of multiple steps, and it's easy to create situations in
>>>       which step 1 will break things until step 2 is done; by
>>>       separating the distribution and activation of configuration data
>>>       from the storage in the database, we have an obvious direct way
>>>       to control when the service 'sees' a new configuration, and can
>>>       improve the odds that it is internally consistent. 
>> 
>> I suspect that there are not many such cases.  I certainly know of
>> none for the applications that I am involved with.  
> 
> I can think of many, including in those applications.  Anything that
> changes how calls are distributed usually requires some changes in
> multiple components, since addressing and authorization are distributed
> in sipXecs.  If the proxy starts sending calls to a new ACD queue before
> the ACD has been configured to accept calls at that address and knows
> which agents are associated with it, then things are going to go wrong
> pretty quickly.

The Config Server should be responsible for committing such changes as a
complete set, regardless of how many components are involved.  Otherwise, as
I stated before, it risks leaving the configuration database in an unstable
state.  Regardless, there is always going to be a period during which related
components are out of sync while they wait to receive and act upon their
configuration changes.  In your example, if the ACD were reading its
configuration directly from the database, it would be running with the
correct configuration much sooner and would therefore be less likely to
incorrectly reject an incoming call.
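
To be concrete, the commit discipline I have in mind looks something like
this (a sketch; the tables are hypothetical stand-ins for the proxy and ACD
configuration):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    class ChangeSetCommit {
        // The whole change set is committed atomically: a reader sees
        // either the old configuration or the new one, never the
        // half-applied intermediate state.
        static void commitQueueChange(String jdbcUrl) throws SQLException {
            try (Connection db = DriverManager.getConnection(jdbcUrl)) {
                db.setAutoCommit(false);
                try (Statement st = db.createStatement()) {
                    st.executeUpdate("INSERT INTO acd_queue (uri)"
                        + " VALUES ('sip:q1@example.com')");
                    st.executeUpdate("INSERT INTO proxy_route (target)"
                        + " VALUES ('sip:q1@example.com')");
                    db.commit();    // both rows become visible together
                } catch (SQLException e) {
                    db.rollback();  // neither row becomes visible
                    throw e;
                }
            }
        }
    }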

> 
>> But if there are cases where multiple configuration changes need to be
>> performed as an atomic operation, then the Config Server must commit
>> these changes to the database only once the change set is complete.
>> Otherwise it risks leaving the database in an unstable state.  And
>> since the application is notified only after the database has been
>> updated, it is guaranteed to only retrieve complete and valid
>> configuration data.
> 
> That creates a new constraint on the sipXconfig implementation that's
> not consistent with how it uses the database now, which serves to
> illustrate the importance of the point that Dale raised - letting the
> services access the database directly makes the database itself into an
> interface.  I think it's much more work to maintain a database that is
> an interface than one that is purely internal to the application.
> 
>>> sipXsupervisor enforces that the software version number matches a
>>> configuration version number provided by sipXconfig before starting
>>> anything (preventing new software being started with old data or the
>>> other way around).
>> 
>> I see this behavior being preserved regardless of how an application
>> retrieves its configuration data.  Also, having an application retrieve
>> its configuration directly from the database is more efficient and
>> will result in shorter update times, because once the Config Server
>> has updated the database, the applications can immediately start
>> without needing to wait for configuration files to be regenerated and
>> replicated.
> 
> Applications don't have to wait for sipXconfig now - the whole point of
> the current scheme is that normally the applications have a complete
> local copy of their configuration data.

I am referring to the situation where a major software update has just
occurred.  Today the applications are prevented from running until after the
database has been updated and the configuration files have been regenerated.
By bypassing the configuration files, the applications can start up as soon
as the database update is complete.
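
Startup gating under that scheme could be as simple as polling a version row
that the Config Server bumps when its post-upgrade update finishes (a sketch;
the config_version table and the source of the expected version are
hypothetical):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    class StartupGate {
        // The service starts as soon as the database holds the
        // configuration version its software expects -- no waiting for
        // files to be regenerated and replicated.
        static void waitForConfigVersion(String jdbcUrl, long expected)
                throws SQLException, InterruptedException {
            while (true) {
                try (Connection db = DriverManager.getConnection(jdbcUrl);
                     Statement st = db.createStatement();
                     ResultSet rs = st.executeQuery(
                         "SELECT version FROM config_version")) {
                    if (rs.next() && rs.getLong(1) >= expected) {
                        return;     // update complete; safe to start
                    }
                }
                Thread.sleep(1000); // poll until the Config Server finishes
            }
        }
    }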

> 
> Efficiency is far from the most important criterion for infrequent events
> like configuration changes.  The ability to know that the changes are
> correct and be able to debug them when they are not is, in my opinion,
> far more important, and direct database access is not nearly as good in
> that regard since the interface is not as well defined.  When the
> management application generates files (or other datasets like the imdb
> contents) and writes them, their contents are directly testable for
> validity and consistency (the sipXconfig unit tests exploit this heavily
> today).  If the management application and the managed service share
> access to a single database, then each can have its own independent
> opinion as to which parts of the data the service is using and what the
> internal consistency and validity constraints are.
> 

By eliminating the need for the Config Server to generate configuration
files, the process of validation is not made harder; it is made simpler.  As
far as database record consistency and validity are concerned, the Config
Server has to be deemed the master in all cases.  The application is strictly
a read-only consumer, which is no different from how it is today with config
files.  The data is merely in a different format.
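
The read-only relationship can even be enforced mechanically rather than by
convention (a sketch; the reader role is hypothetical and would be granted
only SELECT privileges at install time):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    class ReadOnlyConsumer {
        // The service connects under a role that can only SELECT, so the
        // Config Server's mastership is enforced by the database itself.
        static Connection openConfigConnection(String jdbcUrl)
                throws SQLException {
            Connection db =
                DriverManager.getConnection(jdbcUrl, "sipxreader", "");
            db.setReadOnly(true);  // declare this connection will not write
            return db;
        }
    }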


-Mardy

_______________________________________________
sipx-dev mailing list [email protected]
List Archive: http://list.sipfoundry.org/archive/sipx-dev
Unsubscribe: http://list.sipfoundry.org/mailman/listinfo/sipx-dev
sipXecs IP PBX -- http://www.sipfoundry.org/
