On Tue, Mar 2, 2010 at 10:44 AM, Faré <[email protected]> wrote:

> On 2 March 2010 10:18, John Zabroski <[email protected]> wrote:
> > 1. I am simultaneously interested in open, reflective, dynamically
> > distributed and dynamically federated systems
> Nice way to put it. Welcome to the club!
>
> > 3. "Flight Data Recorders" is a fancy way to say all software is a living
> > system, and computer scientists suck at studying living systems, and
> > programmers lack sufficient tools for repairing living systems
> > 4. "Flight Data Recorders" biggest disadvantage is not mentioned: dealing
> > with purity
> No purity needed. See Omniscient Debugging...
>        http://www.lambdacs.com/debugger/
> Note that the idea is old, and has been done in the 1980's in
> "expert systems" that could "explain" their decisions.
> ODB shows that the idea is applicable to JVM languages.
>
>
I'm aware of Bil Lewis's work.  What I mean is that nobody has demonstrated
safe nesting of impure functions in functions with purity guarantees.  For
example, in Haskell, a good debugger is an unsolved problem worked on by
many.  Bil Lewis solved the problem for stack-based, procedural message
passing languages like Smalltalk and Java.  For what it is worth, the same
ideas apply to "recorders" in VMs, and will eventually be generalized to the
point where you can simulate a distributed environment with multiple VMs
both recording simultaneously, so that playback can be synchronized.
HOWEVER, AFAIK, nobody has cracked Haskell and come up with a really kickass
debugger.  But I don't track Haskell that carefully.
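To make the "recorder" idea concrete, here is a minimal sketch in Python (all names are illustrative, nothing here is taken from Bil Lewis's ODB): a decorator appends every call and return to an event log, so the execution history can be inspected after the fact.

```python
import functools

# A minimal "flight data recorder": every call and return is appended to
# a global event log that can be inspected/replayed after the run.
EVENTS = []

def recorded(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        EVENTS.append(("call", fn.__name__, args))
        result = fn(*args, **kwargs)
        EVENTS.append(("return", fn.__name__, result))
        return result
    return wrapper

@recorded
def square(x):
    return x * x

@recorded
def sum_of_squares(xs):
    return sum(square(x) for x in xs)

sum_of_squares([1, 2, 3])
# The log now holds the full call/return history, interleaved in time.
for event in EVENTS:
    print(event)
```

Note that the recorder itself works by mutating global state (the event log), which is exactly why this is trivial in an impure, stack-based language and exactly what is hard to reconcile with purity guarantees.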


> > 8. Compiler optimization vs. algorithms in general is a false dichotomy
> in a
> > maximally open, maximally reflective system, and I'll claim anyone who
> > thinks this dichotomy is real will not push themselves into an *extreme
> > position* necessary to radically innovate
> So you have the ambition of writing a SSC that can transform bogosort
> into quicksort?
>
>
Not at all.  I have an ambition of writing a compiler with an architecture
that lets humans actively solve problems, rather than passively waiting on
the results of the SSC.  100 years from now, I picture shuttle missions to
other planets with sudden failures and the need for fast solutions.  An SSC
might take too long to satisfy various real-time requirements.  I don't want
to be the guy responsible for writing gcc and being at fault for having
designed a shitty architecture that prevents an Apollo-like mission rescue.

HCI in Compiler Design is completely overlooked, which is probably why modern
production compilers like gcc (and frankly even llvm) are monsters in terms
of size, complexity, and trustworthiness.


> > Typed data access has never been a real problem,
> Talk to the millions of people who've seen their credit card or other
> data stolen because of a bug in a PHP site. http://xkcd.com/327/
>
>
Listen!  Pay attention to the argument I am going to lay down!

The basic assumption tied to your argument is that somehow, ODBC or JDBC are
Object-Oriented.

This myth is propagated by the folks at Smalltalk vendors as well.  Taken
from one brochure of a vendor (XXX) who shall not be named:

ODBC
XXX leverages the Microsoft ODBC (Open
Database Connectivity) interface, enabling applications
developed in XXX to access data in database
management systems (DBMS) using SQL.

and

Connectivity with databases
XXX supports the native API of all the major
databases, automatically integrating into the database
and allowing applications developed in XXX
to leverage the latest features of most databases. This
capability frees development teams from having to learn
complicated database APIs, and provides superior
application performance over generic gateways like ODBC.
XXX’s abstract database framework provides an
easy, high-level interface to your database. It lets you
maintain a high level of source compatibility if you need to
change databases, and you can use the built-in internal
database for offline prototyping. The framework is also
extended for each database to provide access to database-
specific features.
XXX provides connectivity to the following
databases:
[...]

If your best argument is that LINQ is an improvement over straight JDBC or
ODBC, I am not sure I buy it.  LINQ has no model for handling partial
failure or error recovery; everything is static production rules in terms of
LINQ2SQL or LINQ2Entities.  So all the static production rules in LINQ
require maintaining the expression tree "live" using editable expressions.
In short, if you want real software engineering properties like disruption
tolerance, partial failure, and error recovery, you know, ANYTHING
pertaining to quality beyond the trivial "must not leak authority", then
LINQ (literally, the Language Integrated Query syntax) is not it.  LINQ
actually repeats many of SQL's mistakes from a syntax point-of-view, because
you can't do anything to prevent denial-of-service attacks by end-users.
These issues are covered rather eloquently by Vadim Tropashko in his book
SQL Design Patterns and also on his personal web log.  If you don't
understand these issues, then please do not criticize me about "what data
access should look like" by linking to XKCD - that's completely
reductionist thinking, and it keeps us reasoning about data access from a
1970s viewpoint.

Microsoft Research's big achievement is applying category theory to the
design of data access. This is a huge win for category theory.
System.Reactive and System.Interactive libraries are great ideas.  However,
they don't address key issues related to self-sustaining systems such as the
ones I mentioned above.  ADO.NET Entity Framework is also an interesting
idea, but if you understand programming language theory, then you realize it
is a spined, tagged, scanner-driven interpreter.  For various reasons,
including performance, I don't think this is a good model.  In fact, it is
really no improvement over ADO Classic, or all the other Microsoft data
access APIs of the past.  An excellent history of Microsoft data access APIs
can be found at [2].

When you think about it, Entity Framework disallows a lot of interesting
compiler optimizations and also fundamentally requires eager loading of data
for even the most trivial parameterization.  So it doesn't even matter if
your DBMS can optimize based on cached query plans, because you're eagerly
loading the data and sorting it in your middle tier or on the client.  This
is so completely backwards, and also encourages a style of programming where
you can accidentally place logic at a wrong tier that accidentally leaks
through to the user, even though they don't have authority to view that
information.  As an example, you can't do Futamura Projections using Entity
Framework.  So you fundamentally can't build any kind of interesting data
mining application using an API like Entity Framework. But if you read
Microsoft's marketing material, they plan to use Entity Framework for a
whole range of data services, from Data Access Objects to Data Transfer
Objects.
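To illustrate the eager-loading complaint with a deliberately crude sketch (in-memory SQLite, hypothetical schema): if the data access layer materializes every row before filtering and sorting in the middle tier, the DBMS's planner, indexes, and cached query plans are all bypassed, even though both versions compute the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 300), (2, 100), (3, 200)])

# Anti-pattern: eagerly load everything, then filter and sort in the
# middle tier.  The DBMS only ever sees "SELECT *"; its query planner
# never gets a chance to use an index or a cached plan.
rows = conn.execute("SELECT id, amount FROM orders").fetchall()
eager = sorted((r for r in rows if r[1] >= 150), key=lambda r: r[1])

# Pushing the predicate and ordering into SQL lets the DBMS plan the
# work, and a cached plan can be reused across parameter values.
lazy = conn.execute(
    "SELECT id, amount FROM orders WHERE amount >= ? ORDER BY amount",
    (150,)).fetchall()

assert eager == lazy  # same answer, very different work distribution
```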

Keep in mind my earlier objection still applies -- you can't do metalogic
substitution with LINQ.  Don Syme even admitted this in a footnote of an
earlier Tech Report on LINQ that is now no longer on his MS Research
website.

About a year and a half ago, I e-mailed the world's foremost database
systems researcher, Mike Stonebraker of MIT, telling him that one section
of his VLDB '07 paper, "The End of an Architectural Era" [1] was wrong.  He
completely incorrectly described Ruby on Rails as having a "JDBC" interface
to the database.  He ultimately admitted JDBC was actually not a
requirement, and that his real point was that a DSL like Ruby on Rails'
ActiveRecord and associated APIs made interacting with the database much
easier and safer than SQL for standard requirements.  He then asked me what
Ruby on Rails' dependency was, and I said that it really depends on the
implementation of Ruby and the version of Ruby on Rails, but ultimately it
should be Ruby-DBI and JRuby should be using JDBC underneath the hood of
Ruby-DBI's interfaces.  I then made the point that Ruby on Rails is actually
a death trap, because it cannot coalesce requests across ActiveRecord object
instances, so you end up with "Data Access Object inside Data Access Object"
anti-patterns.  This is what idiots refer to as "Object-Relational Impedance
Mismatch", a crap term used to describe how idiotic JDBC and ODBC are.
Anyway, I agreed with Stonebraker that a much better API than SQL was needed,
but disagreed that Ruby on Rails was anywhere near nirvana.  I explained my
reasoning
to him, he called it "thoughtful", and that was literally all he had to say
at the end of the conversation -- "thanks for your thoughtful reply".  But
his startups, Vertica and Streambase, haven't done anything with these
arguments.  That indicates to me it may just be a hard problem to solve.
But considering SQL Injection is still one of the Top 25 most common
problems in software, it might be worth solving. [3]
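The "Data Access Object inside Data Access Object" trap is the classic N+1 query problem, and it can be sketched in a few lines (in-memory SQLite, hypothetical schema; parameterized queries throughout, per [3]):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'p1'), (2, 1, 'p2'), (3, 2, 'p3');
""")

# N+1 anti-pattern: one query for the parent rows, then one query per row.
# An ActiveRecord-style object per author issues its own SELECT, so the
# database sees N+1 uncoalesced round trips.
queries = 1  # the initial SELECT on authors
result_n_plus_1 = {}
for author_id, name in conn.execute("SELECT id, name FROM authors"):
    queries += 1
    titles = [t for (t,) in conn.execute(
        "SELECT title FROM posts WHERE author_id = ?", (author_id,))]
    result_n_plus_1[name] = titles

# Coalesced version: a single JOIN answers the same question in one query.
result_join = {}
for name, title in conn.execute("""
        SELECT a.name, p.title FROM authors a
        JOIN posts p ON p.author_id = a.id"""):
    result_join.setdefault(name, []).append(title)

assert result_n_plus_1 == result_join
assert queries == 3  # 1 + N(=2) round trips, versus 1 for the JOIN
```

An API that cannot coalesce the per-object SELECTs into the JOIN form is exactly the death trap described above.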

I also think SQL itself violates the general principle of separating
specification from implementation.  Your data access layer should not
contain hacks like parameter sniffing hints.  Nobody does this, though, so
nobody understands it might just be possible to pull off.  However, if you
talk to any DBA (or "Database Developer", as they are called these days)
about this, they instantly recognize I'm right.  And the real-world cost
of these non-OO data access APIs probably numbers in the billions.

Informal list of references
[1] http://www.vldb2007.org/program/papers/industrial/p1150-stonebraker.pdf

[2]
http://alexbarnett.net/blog/archive/2006/12/05/A-Short-History-of-the-Evolution-of-Microsoft-Data-Access-APIs.aspx

[3] http://www.sans.org/top25-programming-errors/
_______________________________________________
fonc mailing list
[email protected]
http://vpri.org/mailman/listinfo/fonc
