John,

I disagree that it is a "simple thing", but it is a good example.

It also demonstrates blending well, since analogies are used all the
time in this domain to circumvent impedance mismatches.

For example, versioning the schema of a very large database is
non-trivial, since the default method doesn't scale:
alter table BigTable add /*column*/ foo int

This will lock out all readers and writers until it completes;
effectively, it is a denial-of-service attack. Predicting its
completion time is also difficult, since it depends on how the table
was previously built (e.g., whether anything fancy was done to store
sparse columns; whether there is still in-row storage space available
for the int required by the new column, avoiding a complete rebuild;
whether the table must be completely rebuilt, in which case so must
its indices; whether the table is sharded across many independent
disks, in which case the storage engine can parallelize the task). The
*intention* is to add a column to a table, presumably for some new
requirement. But there is a latent requirement layered on that
intention, forming a new meaning: nobody should observe a delay during
the schema upgrade.
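The usual workaround can be sketched like this (the helper name,
batch size, and SQLite backend are my own choices, for illustration
only): add the column as nullable, which is a quick metadata-only
change in most engines, then backfill it in small transactions so no
single statement holds locks for long.

```python
import sqlite3

def add_column_online(conn, table, column, coltype, fill, batch_size=1000):
    # Step 1: add the column as nullable -- typically a fast,
    # metadata-only change, so readers and writers are barely blocked.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {coltype}")
    conn.commit()
    # Step 2: backfill in small batches, committing between them so
    # each transaction holds its locks only briefly.
    while True:
        rows = conn.execute(
            f"SELECT rowid FROM {table} WHERE {column} IS NULL LIMIT ?",
            (batch_size,)).fetchall()
        if not rows:
            break
        marks = ",".join("?" * len(rows))
        conn.execute(
            f"UPDATE {table} SET {column} = ? WHERE rowid IN ({marks})",
            [fill] + [r[0] for r in rows])
        conn.commit()  # release locks between batches
```

No single transaction touches the whole table, so concurrent readers
and writers only ever wait for one small batch.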

Now, if the default method isn't robust enough, then what is? And what
do we call it?

Well, what I did to solve this problem was type in "how to add a
column to a large table" into Google [1].

As for naming it, well, the enterprise software community came up with
a concept called "Database Refactorings" [2] [3], or simply
"Migrations" [4]: a heuristic system for approximating the Holy Grail
of having reversible logic for schema operations (generally difficult
due to "destructive changes" and other problems). Programmers
procedurally embed knowledge of how to change the schema, and then
just pass messages to a server that has all of this procedural
knowledge embedded in it. It is interesting (to me, anyway) that
programmers have developed a human process for working around a
complex theoretical problem (e.g., see [5] for a discussion of the
challenges in building a lingua franca for data integration, schema
evolution, database design, and model management) without ever knowing
of the problems.  Good designers realize there is a structural
problem, create some structure, and encapsulate the process for
solving it.  Schema matching in general is considered AI-Complete,
since it is believed to require reproducing human intelligence to do
automatically [6], and so some approaches even take a cognitive
learning approach [7].
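A minimal sketch of the migrations idea (table names, SQL, and the
SQLite backend are invented here for illustration; real systems like
[4] are far richer): each migration pairs an "up" step with a
best-effort "down", and destructive changes simply have no inverse.

```python
import sqlite3

# Each migration pairs an "up" step with a best-effort "down".  A None
# "down" marks a change this sketch treats as irreversible -- the
# "destructive changes" problem that makes full reversibility a Holy Grail.
MIGRATIONS = [
    ("create_users",
     "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE users"),
    ("add_users_email",
     "ALTER TABLE users ADD COLUMN email TEXT",
     None),
]

def applied(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT)")
    return [r[0] for r in conn.execute("SELECT name FROM schema_migrations")]

def up(conn):
    # Apply every migration not yet recorded, in order.
    done = set(applied(conn))
    for name, up_sql, _ in MIGRATIONS:
        if name not in done:
            conn.execute(up_sql)
            conn.execute("INSERT INTO schema_migrations VALUES (?)", (name,))
    conn.commit()

def down(conn):
    # Revert the most recent migration, if it has an inverse.
    done = applied(conn)
    if not done:
        return
    name = done[-1]
    down_sql = dict((m[0], m[2]) for m in MIGRATIONS)[name]
    if down_sql is None:
        raise RuntimeError(f"{name} is a destructive change; no inverse")
    conn.execute(down_sql)
    conn.execute("DELETE FROM schema_migrations WHERE name = ?", (name,))
    conn.commit()
```

The procedural knowledge lives in the MIGRATIONS list; the caller just
sends "up" or "down" messages, exactly the message-passing shape
described above.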

But can we do even better than this conceptualization?  For example,
at what point does an engineer decide an RDBMS is the wrong tool for
the job and switch to a NoSQL database like Redis?  If we can identify
that point, we can also perhaps predict whether that trade-off was
indeed a good one.  Was the engineer simply following a pop-culture
phenomenon, or did he/she make a genuinely good choice?

Beyond that, another related example, implicit in your referential
integrity example, is dynamically federated, dynamically distributed
system design.  In the general case, we know from the CAP Theorem
that, due to partition barriers, we cannot guarantee referential
integrity while also having high availability and performance.  We
also can't implicitly trust the Java client code, due to out-of-band
communication protocol attacks; e.g., imagine a SQL injection attack.
Likewise, we might wish to re-use validation logic in multiple places,
such as in an HTML form, and it is not sufficient to depend on the
HTML form's JavaScript validation logic, since JavaScript can be
disabled, and the browser can be bypassed completely by encoding a raw
HTTP PUT/POST form action and sending it directly to the server.

Food for thought.

[1] http://www.google.com/search?q=how+to+add+a+column+to+a+large+table
[2] http://martinfowler.com/articles/evodb.html
[3] http://databaserefactoring.com/
[4] http://guides.rubyonrails.org/migrations.html
[5] http://www.mecs-press.org/ijigsp/ijigsp-200901007.pdf
[6] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.134.6252
[7] http://z-bo.tumblr.com/post/454811730/learning-to-map-between-structured-representations-of

On 4/10/11, John Nilsson <[email protected]> wrote:
> Hello John,
>
> Thanks for the pointers, I will indeed have a look at this.
>
> I have a pet project of mine trying to create a platform and
> programming model to handle this kind of problem. Such a simple thing
> as keeping referential integrity between static Java code, the
> embedded SQL, and over to the dynamic database is one of those
> irritating problems I intend to address with this approach.
>
> I envision a system with a meta-language and some standard
> transformations to editor views, compilation stages and type-systems,
> implemented in terms of this meta-language.
>
> BR,
> John
>
> On Sun, Apr 10, 2011 at 4:38 AM, John Zabroski <[email protected]>
> wrote:
>> John,
>>
>> It is true you can't know exact intention but that hasn't stopped
>> computer scientists from trying to answer the question. For example,
>> Joe Goguen's work on algebraic semiotics resulted in Joe developing a
>> few basic rules for mapping information from one medium to another.
>> Joe's first rule was "Wherever possible, preserve the structure of the
>> content."
>>
>> I could think of... and have thought of... a lot of techniques for
>> automatically porting code (an extremely difficult problem,
>> considering it covers correct live migration from an Intel to an
>> adversarial AMD processor with possibly deliberately incompatible
>> Instruction Set Architecture), including ways to automatically
>> trade-off structure with other goals in a controlled fashion.  One
>> that Goguen was interested in was "content mixing" or "predictive
>> modeling" - hot buzzwords before the AI Winter came and dried up lots
>> of interesting funding. It is starting to re-emerge because of the
>> multi-core kerfuffle, since it can achieve the sorts of
>> "parallel-busyness" chipmakers crave. I'd recommend Mark Turner's
>> paper Forging Connections, which suggests some meaning belongs to the
>> mapping itself, rather than to the source-target approaches.  In other
>> words, we tend to construct meaning in a blend between the source and
>> target. We don't just have mappings-as-meanings, but "forge" meaning
>> *from* mapping. (I hope I explained that well.)
>>
>> On 4/9/11, John Nilsson <[email protected]> wrote:
>>> I would think that it is generally impossible to automatically extract
>>> intentions from code. I run into this wall every day at work, I know
>>> _what_ the code is doing. But there is often little information as to
>>> _why_ it does what it does. It's not only that the program is
>>> shaped by the idioms and constraints of the host language; it is
>>> also that the host language in general is a machine description
>>> language, not a general problem-statement language.
>>>
>>> I guess you are referring to the first problem when you talk about
>>> expressibility.
>>>
>>> To address the second problem I'm thinking that you have to separate
>>> the problem description and solution from machine specifications.
>>> That is, have a programming model where you create languages
>>> specifically to encode the problem, and then create an interpreter
>>> for the language to create machines solving it.
>>>
>>> BR,
>>> John

_______________________________________________
fonc mailing list
[email protected]
http://vpri.org/mailman/listinfo/fonc
