Thanks Julian. I covered Migrations above. See reference [4].
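To make "encapsulate formed meanings" concrete: a migration pairs a forward schema change with (where possible) its reverse, and a runner records which versions have been applied. A minimal sketch in Python against an in-memory SQLite database; the table names and version numbers are invented for illustration, and real tools (e.g. ActiveRecord::Migration) add much more:

```python
import sqlite3

# Each migration encapsulates one schema change as an (up, down) pair,
# keyed by a version number the runner records in its own bookkeeping table.
MIGRATIONS = {
    1: ("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE users"),
    2: ("ALTER TABLE users ADD COLUMN email TEXT",
        None),  # no reverse here: a "destructive change" in SQLite
}

def migrate(conn, target):
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (v INTEGER)")
    current = conn.execute("SELECT MAX(v) FROM schema_version").fetchone()[0] or 0
    for v in range(current + 1, target + 1):
        up, _down = MIGRATIONS[v]
        conn.execute(up)
        conn.execute("INSERT INTO schema_version VALUES (?)", (v,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn, 2)
print([c[1] for c in conn.execute("PRAGMA table_info(users)")])
# ['id', 'name', 'email']
```

The interesting part is the `None` at version 2: reversibility is only a heuristic aspiration, which is exactly the "Holy Grail" problem below.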
I would view migrations as a way to encapsulate formed meanings. Something that has always struck me as funny about the NoSQL movement is the complaints about how much of a PITA schema versioning in an RDBMS is. I've never served millions of pages a day, but I've always been deeply suspicious of the cause-effect relationship described by the engineers at the biggest sites. Put simply, it doesn't make any sense; it's a bad correlation. The next big architectural step forward will rectify this with a better correlation.

On 4/10/11, Julian Leviston <[email protected]> wrote:
> You should probably have a look at ActiveRecord::Migration, which is part of
> Rails, if you're interested in SQL-based systems. In fact, ActiveRecord in
> general is a really wonderful abstraction system - and a very good mix of
> "do what you *can* in a programming-language-based DSL, and what you can't
> in direct SQL".
>
> http://api.rubyonrails.org/classes/ActiveRecord/Migration.html
>
> Julian.
>
> On 11/04/2011, at 2:58 AM, John Nilsson wrote:
>
>> Wow, thanks. This will keep me occupied for a while ;-)
>>
>> Regarding AI completeness and the quest for automation: in my mind it's
>> better to start by making it simpler for humans to do, and just keep
>> making it simpler until you can remove the human element. This way you
>> can put something out there much quicker and get feedback on what
>> works and what doesn't. Most importantly, it puts something out there for
>> other developers (people smarter than me) to extend and improve.
>>
>> Regarding the database, I have some ideas about experimenting with a
>> "persistent" approach to data storage, employing a scheme with
>> branching and human-assisted merging to handle evolution of the
>> system. For a fast-moving target such as a typical OLTP system there
>> must be automated merging, of course, but I see no reason for the
>> algorithms to be either general or completely automatic.
>> I think we can safely assume that there will be some competent people
>> around to assist with the merging. After all, there must be humans
>> involved to trigger the evolution to begin with.
>>
>> To solve the impedance mismatch between the dynamic world of the
>> database and the static world of application development, I'm thinking
>> the best approach is to simply remove it. Why have a static,
>> unconnected version of the application at all? After all, code is
>> data, and data belongs in the database.
>>
>> To have any hope of getting any kind of prototype out there I have,
>> for now, decided to avoid thinking about distributed systems and/or
>> untrusted system components. I guess this will be a 3.0 concern ;-)
>>
>> BR,
>> John
>>
>> On Sun, Apr 10, 2011 at 4:40 PM, John Zabroski <[email protected]> wrote:
>>> John,
>>>
>>> Disagree that it is a "simple thing", but it is a good example.
>>>
>>> It also demonstrates blending well, since analogies are used all the
>>> time in this domain to circumvent impedance mismatches.
>>>
>>> For example, versioning a very large database system's schema is
>>> non-trivial, since the default method doesn't scale:
>>>
>>>     alter table BigTable add /*column*/ foo int
>>>
>>> This will lock out all readers and writers until it completes.
>>> Effectively, it is a denial-of-service attack. Predicting its
>>> completion time is difficult, since it will depend on how the table
>>> was previously built (e.g. whether anything fancy was done storing
>>> sparse columns; whether there is still storage space available in-row
>>> to store the int required by this new column, thus avoiding a complete
>>> rebuild; if the table needs to be completely rebuilt, then so do its
>>> indices; if the table is sharded across many independent disks, then
>>> the storage engine can parallelize the task). The *intention* is to
>>> add a column to a table, presumably for some new requirement.
>>> But there is a latent requirement on the intention, forming a new
>>> meaning: nobody should observe a delay during the schema upgrade.
>>>
>>> Now, if the default method isn't robust enough, then what is? And what
>>> do we call it?
>>>
>>> Well, what I did to solve this problem was type "how to add a
>>> column to a large table" into Google [1].
>>>
>>> As for naming it: the enterprise software community came up with a
>>> concept called "Database Refactorings" [2] [3], or simply
>>> "Migrations" [4], which are a heuristic system for approximating the
>>> Holy Grail of having a reversible logic for schema operations
>>> (generally difficult due to "destructive changes" and other problems).
>>> Programmers procedurally embed knowledge of how to change the schema,
>>> and then just pass messages to a server that has all of this
>>> procedural knowledge embedded in it. It is interesting (to me, anyway)
>>> that programmers have developed a human process for working around a
>>> complex theoretical problem (e.g., see [5] for a discussion of the
>>> challenges in building a lingua franca for data integration, schema
>>> evolution, database design, and model management) without ever
>>> knowing the problems. Good designers realize there is a structural
>>> problem, create some structure, and encapsulate the process for
>>> solving it. Schema matching in general is considered AI-complete,
>>> since it is believed to require reproducing human intelligence to do
>>> it automatically [6], and so some approaches even take a cognitive
>>> learning approach [7].
>>>
>>> But can we do even better than this conceptualization? For example,
>>> at what point does an engineer decide an RDBMS is the wrong tool for
>>> the job and switch to a NoSQL database like Redis? If we can
>>> identify that point, we can also perhaps predict whether that
>>> trade-off was indeed a good one.
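The non-blocking answer hinted at above usually has the same shape regardless of product: create a shadow table that already has the new column, copy rows across in small batches so no lock is held for long, then swap the names. A sketch in Python with SQLite; the table name, batch size, and the omission of concurrent-write capture (real tools use triggers or the replication log for that) are all simplifications:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO big_table (payload) VALUES (?)",
                 [("row-%d" % i,) for i in range(10000)])

# 1. Create a shadow table that already has the new column.
conn.execute("CREATE TABLE big_table_new "
             "(id INTEGER PRIMARY KEY, payload TEXT, foo INT DEFAULT 0)")

# 2. Backfill in small batches; between batches, readers and writers of
#    the original table proceed normally, so no lock is held for long.
last_id = 0
while True:
    rows = conn.execute(
        "SELECT id, payload FROM big_table WHERE id > ? ORDER BY id LIMIT 1000",
        (last_id,)).fetchall()
    if not rows:
        break
    conn.executemany(
        "INSERT INTO big_table_new (id, payload) VALUES (?, ?)", rows)
    conn.commit()  # release the write lock after each batch
    last_id = rows[-1][0]

# 3. Swap the names (writes arriving during the backfill would need to be
#    captured and replayed; that bookkeeping is omitted in this sketch).
conn.execute("ALTER TABLE big_table RENAME TO big_table_old")
conn.execute("ALTER TABLE big_table_new RENAME TO big_table")
print(conn.execute("SELECT COUNT(*) FROM big_table").fetchone()[0])  # 10000
```

Note that this is exactly the kind of procedural knowledge a migration encapsulates: the *intention* is still "add a column", but the *procedure* now carries the latent no-observable-delay requirement.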
>>> Was the engineer simply following a pop-culture phenomenon, or did
>>> he/she make a genuinely good choice?
>>>
>>> Beyond that, another related example implicit in your referential
>>> integrity example is dynamically federated, dynamically distributed
>>> system design. In the general case, we know from the CAP theorem
>>> that, due to partition barriers, we cannot guarantee referential
>>> integrity while also having high availability and performance. We
>>> also can't implicitly trust the Java client code, due to out-of-band
>>> communication protocol attacks, e.g. imagine a SQL injection attack.
>>> Likewise, we might wish to re-use validation logic in multiple
>>> places, such as in an HTML form, and it is not sufficient to depend
>>> on the HTML form's JavaScript validation logic, since JavaScript can
>>> be disabled and the browser can be bypassed completely by raw-encoding
>>> HTTP PUT/POST form actions and sending them directly to the server.
>>>
>>> Food for thought.
>>>
>>> [1] http://www.google.com/search?q=how+to+add+a+column+to+a+large+table
>>> [2] http://martinfowler.com/articles/evodb.html
>>> [3] http://databaserefactoring.com/
>>> [4] http://guides.rubyonrails.org/migrations.html
>>> [5] http://www.mecs-press.org/ijigsp/ijigsp-200901007.pdf
>>> [6] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.134.6252
>>> [7] http://z-bo.tumblr.com/post/454811730/learning-to-map-between-structured-representations-of
>>>
>>> On 4/10/11, John Nilsson <[email protected]> wrote:
>>>> Hello John,
>>>>
>>>> Thanks for the pointers, I will indeed have a look at this.
>>>>
>>>> I have a pet project of mine trying to create a platform and
>>>> programming model to handle this kind of problem. Such a simple
>>>> thing as keeping referential integrity between static Java code, the
>>>> embedded SQL, and over to the dynamic database is one of those
>>>> irritating problems I intend to address with this approach.
>>>>
>>>> I envision a system with a meta-language and some standard
>>>> transformations to editor views, compilation stages, and type
>>>> systems, implemented in terms of this meta-language.
>>>>
>>>> BR,
>>>> John
>>>>
>>>> On Sun, Apr 10, 2011 at 4:38 AM, John Zabroski <[email protected]> wrote:
>>>>> John,
>>>>>
>>>>> It is true you can't know exact intention, but that hasn't stopped
>>>>> computer scientists from trying to answer the question. For
>>>>> example, Joe Goguen's work on algebraic semiotics resulted in Joe
>>>>> developing a few basic rules for mapping information from one
>>>>> medium to another. Joe's first rule was "Wherever possible,
>>>>> preserve the structure of the content."
>>>>>
>>>>> I could think of... and have thought of... a lot of techniques for
>>>>> automatically porting code (an extremely difficult problem,
>>>>> considering it covers correct live migration from an Intel to an
>>>>> adversarial AMD processor with a possibly deliberately incompatible
>>>>> Instruction Set Architecture), including ways to automatically
>>>>> trade off structure against other goals in a controlled fashion.
>>>>> One that Goguen was interested in was "content mixing" or
>>>>> "predictive modeling" - hot buzzwords before the AI Winter came and
>>>>> dried up lots of interesting funding. It is starting to re-emerge
>>>>> because of the multi-core kerfuffle, since it can achieve the sorts
>>>>> of "parallel-busyness" chipmakers crave. I'd recommend Mark
>>>>> Turner's paper Forging Connections, which suggests some meanings
>>>>> belong to the mapping itself, rather than to the source-target
>>>>> approaches. In other words, we tend to construct meaning in a blend
>>>>> between the source and target. We don't just have
>>>>> mappings-as-meanings, but "forge" meaning *from* mapping. (I hope I
>>>>> explained that well.)
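The meta-language idea above can be shrunk to a toy: encode the problem statement as plain data rather than as host-language code, then let each "standard transformation" (an evaluator, an editor view) be an ordinary function over that data. Every name here is invented for illustration:

```python
# A toy "problem language": arithmetic over named quantities, encoded as
# nested tuples so that multiple interpreters can walk the same description.
Add = lambda a, b: ("add", a, b)
Num = lambda n: ("num", n)
Var = lambda name: ("var", name)

def evaluate(expr, env):
    """One interpretation: compute a value (a 'compilation stage')."""
    tag = expr[0]
    if tag == "num":
        return expr[1]
    if tag == "var":
        return env[expr[1]]
    if tag == "add":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    raise ValueError("unknown form: %r" % (tag,))

def render(expr):
    """Another interpretation of the same data: an 'editor view'."""
    tag = expr[0]
    if tag == "num":
        return str(expr[1])
    if tag == "var":
        return expr[1]
    if tag == "add":
        return "(%s + %s)" % (render(expr[1]), render(expr[2]))
    raise ValueError("unknown form: %r" % (tag,))

problem = Add(Var("price"), Add(Var("tax"), Num(1)))
print(render(problem))                              # (price + (tax + 1))
print(evaluate(problem, {"price": 100, "tax": 8}))  # 109
```

Adding a type checker or a new editor view means adding one more function over the same data, which is the sense in which the problem description stays separate from any one machine interpretation.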
>>>>>
>>>>> On 4/9/11, John Nilsson <[email protected]> wrote:
>>>>>> I would think that it is generally impossible to automatically
>>>>>> extract intentions from code. I run into this wall every day at
>>>>>> work: I know _what_ the code is doing, but there is often little
>>>>>> information as to _why_ it does what it does. It's not only that
>>>>>> the program is shaped by the idioms and constraints of the host
>>>>>> language; it is also that the host language in general is a
>>>>>> machine description language, not a general problem-statement
>>>>>> language.
>>>>>>
>>>>>> I guess you are referring to the first problem when you talk about
>>>>>> expressibility.
>>>>>>
>>>>>> To address the second problem, I'm thinking that you have to
>>>>>> separate the problem description and solution from machine
>>>>>> specifications. That is, have a programming model where you create
>>>>>> languages specifically to encode the problem, and then create an
>>>>>> interpreter for the language to create machines solving it.
>>>>>>
>>>>>> BR,
>>>>>> John
>>
>> _______________________________________________
>> fonc mailing list
>> [email protected]
>> http://vpri.org/mailman/listinfo/fonc
