Just a few thoughts below, offered to add to the conversation, not to detract 
from it.


On Dec 16, 2010, at 8:01 AM, Patricia Shanahan wrote:

> Mike McGrady wrote:
>> Intense network systems like our clients cannot work with database
>> calls.  They are too slow and do not scale.  They either reach a cost
>> or a performance ceiling.
> 
> I feel we are mixing requirements and implementation.

That may be correct, Patricia.  And the distinction is important.

On requirements: they can be divided into non-functional and functional.  The 
former drive the latter, I think we will all agree, though maybe not.  I do not 
think we can ask whether something will work without first considering whether, 
even if it works, it would be useful only in ho-hum cases.  To ho-hum business 
users, that may read as a slight.  It is not.  I am interested in this only 
because my clients are interested in high performance.

So, I think the non-functional requirements and the related technologies, e.g., 
clustering, in-memory access, etc., are primary.  For example, when scaling is 
at issue, what matters is not that 300,000,000 transactions can be handled in, 
say, 10 hours, but that we can start at 100 transactions and grow the same 
system to 300,000,000 with the same performance, taking advantage of economies 
of scale along the way.  This is particularly important in a cloud-based 
economy, where economies of scale in equipment, development time, etc., are the 
point.  That said, let me offer a few things, not as written in stone, but as 
something to think about and maybe agree on, or disagree on.

First, a qualification.  I am speaking off the top of my head, and it appears I 
need to be more guarded and/or more considerate of the time people have to 
read.  I am a toss-it-out-and-discuss-it guy, and some people do not have time 
for that.  I believe in measuring ten times and cutting once, but people have 
different drivers.  I did not mean that we do not use databases.  Of course we 
do; doesn't everyone?  However, the primary data model and structures have to 
be in-memory, because we cannot tolerate the latency of database calls 
(in-memory access is approximately 10,000 times faster).  I think this 
requirement is not sui generis to me; it is characteristic of the dominant part 
of the industry that will consider using Outrigger.  Also, unless we want to 
give up and succumb to Brewer's theorem, a multi-tiered architecture whose only 
answer is asynchronous writes to a database with eventual consistency will not 
do.  That does not mean Outrigger must not write to a database.  It must, of 
course, but there are other considerations, as we all know.
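
For concreteness, here is the bare shape of what I mean by an in-memory primary 
with the database trailing behind.  This is only a sketch I am typing into the 
mail (WriteBehindStore, persistQueue, and the stubbed writeToDatabase are 
made-up names, not Outrigger code), and the write-behind queue is exactly where 
the consistency trade-off I just mentioned lives:

    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    // Illustrative only: the in-memory map is authoritative; database writes trail behind.
    public class WriteBehindStore<K, V> {
        private final Map<K, V> primary = new ConcurrentHashMap<>();
        private final BlockingQueue<Map.Entry<K, V>> persistQueue = new LinkedBlockingQueue<>();
        private final ExecutorService writer = Executors.newSingleThreadExecutor();

        public WriteBehindStore() {
            writer.submit(() -> {
                try {
                    while (true) {
                        Map.Entry<K, V> e = persistQueue.take();   // block until there is work
                        writeToDatabase(e.getKey(), e.getValue()); // eventual, not immediate
                    }
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();            // shut down quietly
                }
            });
        }

        public void put(K key, V value) {
            primary.put(key, value);                   // callers only ever touch memory
            persistQueue.add(Map.entry(key, value));   // the database catches up later
        }

        public V get(K key) {
            return primary.get(key);                   // never a database round trip
        }

        // Stub: real code would batch these through JDBC or whatever store sits behind it.
        private void writeToDatabase(K key, V value) { /* ... */ }
    }

Reads never leave memory, which is where the roughly 10,000x comes from; the 
open question is what guarantees we attach to that queue.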

Second, that said, scaling as a requirement is roughly the capacity to keep 
adding stressors to the system (users, connections, memory, CPU cycles) and 
have the performance stay the same.  If performance drops as the stressors 
grow, the system does not scale, in this definition, which is the one I am used 
to.  To make that concrete: if an operation takes 2 ms with 100 users, it 
should still take about 2 ms with 1,000,000 users once resources are added in 
proportion.  What scaling does not mean to me is the ability to handle one 
given humongous load.

Third, I am not sure at this point where we want to go with Outrigger.  I am 
only interested in, and only have time for, building a system that will not 
knock on the door but will knock the door down.  My clients expect no less.  
Ultimately, I am not here for academic interests.  Others may have different 
interests, and I respect that, of course.  I am just stating what I need.

Fourth, I work primarily as an architect and a designer of systems-of-systems, 
starting from the non-functional values, the QCC or the "ilities".  From my 
perspective these are the goal; the functional requirements are defined by and 
flow from the non-functional ones.  So I may have a different perspective, but 
ultimately, I think, we would agree on the details.

> Response time, throughput, ACID properties and the like are requirements. 
> What Outrigger uses as persistent storage for tuple space is implementation.
> 
> I would like to get more data on your performance requirements - specifics 
> about e.g. x% of JavaSpace reads under these conditions must complete within 
> y seconds.

I can answer this in a few days, but the short answer really is "as fast as 
possible until other values are compromised."  What I can say now is that I 
work in complicated, critical systems and have to consume as little of the time 
budget in my piece as possible.  When a radar has to be read, sign-on security 
passed, multilevel security ensured, upgraded or downgraded structural data 
handled, and integration logic decoupled, whether at the application or 
virtual-machine level, etc., and all of it has to reach a screen in 1.0 
seconds, the more invisible we are the better, except in some respects.  ;-)
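
I will put real numbers behind that when I reply properly.  When I do, I expect 
the measurement to look something like the little harness below.  This is only 
a sketch I am making up for the mail (ReadLatencyProbe, the null transaction, 
and the 1-second timeout are placeholders, not anything of ours): time a batch 
of matching reads and report the percentile you describe.

    import java.util.Arrays;
    import net.jini.core.entry.Entry;
    import net.jini.space.JavaSpace;

    // Hypothetical harness: time a batch of matching reads and pull out the 95th percentile.
    public class ReadLatencyProbe {
        public static void report(JavaSpace space, Entry template, int samples) throws Exception {
            long[] nanos = new long[samples];
            for (int i = 0; i < samples; i++) {
                long start = System.nanoTime();
                space.read(template, null, 1000L);        // no transaction, 1 s timeout
                nanos[i] = System.nanoTime() - start;
            }
            Arrays.sort(nanos);
            long p95 = nanos[(int) (samples * 0.95)];     // use thousands of samples
            System.out.printf("95%% of reads completed within %.3f ms%n", p95 / 1e6);
        }
    }

The "conditions" part of your formulation is the harder half; the harness only 
makes sense once we pin down the load it runs under.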

> 
> The most useful and specific way of presenting requirements would be a 
> scalable benchmark - something where we can simulate a larger number of 
> machines doing real work by having a smaller number dedicated to driving 
> transactions at a single JavaSpace and measuring the results.

I do not know if you are covering this in what you say, and you probably are, 
but does the benchmark cover the case where the replication of the data is 
transactional?  Also, could you expand on this?  I think you have a great idea 
here, but I am not sure you have expressed the whole of it.

> 
> Measurements from a small to medium sized environment would be useful input 
> for estimating performance at higher loads. Without that sort of data, I 
> cannot even guess whether an achievable Outrigger implementation will be able 
> to meet all your requirements.

I know that neither an asynchronous write (which sacrifices consistency) nor a 
synchronous write (which sacrifices performance) to a database will work if 
that is the be-all and end-all, for a number of reasons.  If you want me to 
quantify that, I can, but the fact that we could spend millions and make it 
work is not adequate.  The cost of machines and development time is important, 
crucial even.  Scaling is necessary to achieve a consistent cost per unit of 
work between high and low levels of stressors on a system, especially in this 
age of cloud-based economies of scale.

> 
> Patricia

Michael McGrady
Chief Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
mmcgr...@topiatechnology.com


