[Haskell-cafe] streaming translation using monads

Warren Harris Tue, 18 Nov 2008 18:23:46 -0800

I am working on a query language translator, and although I feel that a monadic formulation would work well for this application, I've stumbled on a number of questions and difficulties that I thought the knowledgeable people here might be able to help me with.

Since this is a translator, there's a source language and a target language, and both of these are specified as a grammar of nested expressions. My first thought was to formulate the target language as an embedded language where each clause expression is represented as a monad operation, so that the bind operator can join the pieces together, e.g.:


 (clause1, clause2 ...)

could be specified as an embedded language as:

 clause1  >>= \ v1 ->
 clause2  >>= \ v2 -> ...

However, each of the clauses is actually an output routine to send the expression that it denotes to a remote server, and a parser for receiving the results. Since a clause is really a pair of operations, it doesn't seem possible to formulate a monad that will compose all the output routines together and compose all the input routines together in one shot. (Note that the variables in the above code (v1, v2) represent inputs to be received from the remote server -- all outputs are packaged into the clause expressions themselves and are effectively literals.)

A naive formulation of a monad to implement the above as "output -> input v" might appear to work, but has the ill effect of interleaving the output and input statements for each clause rather than composing something that can send the entire request, and then receive the entire result.
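A minimal sketch of that interleaving, under stated assumptions: the real network I/O is replaced by a pure event log, and the names Event, Naive, clause, query and naiveTrace are all hypothetical, not part of the actual translator. Each clause logs a send followed by a receive, so binding two clauses alternates the two kinds of event.

```haskell
-- Hypothetical event log standing in for real network I/O.
data Event = Sent String | Received String
  deriving (Eq, Show)

-- Naive formulation: running a clause sends its output and then reads
-- its input. The argument is the list of replies the server will give;
-- the result carries the value, the unread replies, and the event log.
newtype Naive a = Naive { runNaive :: [String] -> (a, [String], [Event]) }

instance Functor Naive where
  fmap f (Naive g) = Naive $ \rs ->
    let (a, rs', es) = g rs in (f a, rs', es)

instance Applicative Naive where
  pure a = Naive $ \rs -> (a, rs, [])
  Naive f <*> Naive g = Naive $ \rs ->
    let (h, rs1, es1) = f rs
        (a, rs2, es2) = g rs1
    in (h a, rs2, es1 ++ es2)

instance Monad Naive where
  Naive g >>= k = Naive $ \rs ->
    let (a, rs1, es1) = g rs
        (b, rs2, es2) = runNaive (k a) rs1
    in (b, rs2, es1 ++ es2)

-- A clause sends its text and immediately consumes one reply.
-- If no reply is left, it yields an empty string (a simplification).
clause :: String -> Naive String
clause out = Naive $ \rs -> case rs of
  (r : rest) -> (r, rest, [Sent out, Received r])
  []         -> ("", [], [Sent out])

query :: Naive (String, String)
query =
  clause "clause1" >>= \v1 ->
  clause "clause2" >>= \v2 ->
  pure (v1, v2)

-- The log alternates sends and receives instead of grouping them.
naiveTrace :: [Event]
naiveTrace = let (_, _, es) = runNaive query ["r1", "r2"] in es
```

In this model, naiveTrace comes out as send, receive, send, receive, whereas the desired behaviour would be both sends followed by both receives.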

This forces me to use "output * input v" as the type of each clause expression, but now it is not possible to write a monad with a bind operation that will compose pieces in terms of input variables. Instead I have had to resort to using a set of combinators that thread a continuation function through each clause and accumulate inputs as they are received:


 clause1 ==> (\ k v1 -> k (trans1 v1)) ++
 clause2 ==> (\ k v2 -> k (trans2 v2)) ++ ...

This threading is necessary in that I want to stream the translation back to the client requesting the translation rather than building up the (possibly large) results in memory.
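To make the "output * input v" type concrete, here is a rough sketch in which a clause pairs its request text with a parser over reply tokens. Everything in it (Clause, output, input, clause, query) is a hypothetical model, not the combinator set described above, and it does not stream. It only illustrates why bind is unavailable: in a bind, the second clause would be computed from a received value, so its output could not be known before anything has been sent. A weaker, applicative-style composition that never inspects received values when building the request can still be written.

```haskell
-- Hypothetical model of "output * input v": the request fragments to
-- send, paired with a parser that consumes reply tokens.
data Clause a = Clause
  { output :: [String]
  , input  :: [String] -> Maybe (a, [String])
  }

instance Functor Clause where
  fmap f (Clause o p) = Clause o $ \rs -> case p rs of
    Nothing        -> Nothing
    Just (a, rest) -> Just (f a, rest)

-- Outputs are concatenated, parsers are run in sequence. No received
-- value is needed to build the combined output.
instance Applicative Clause where
  pure a = Clause [] (\rs -> Just (a, rs))
  Clause o1 p1 <*> Clause o2 p2 = Clause (o1 ++ o2) $ \rs ->
    case p1 rs of
      Nothing       -> Nothing
      Just (f, rs1) -> case p2 rs1 of
        Nothing       -> Nothing
        Just (a, rs2) -> Just (f a, rs2)

-- A bind of type  Clause a -> (a -> Clause b) -> Clause b  would need
-- the 'output' of the second clause, which depends on a value of type
-- 'a' that only exists after parsing. Hence no Monad instance here.

-- A clause that sends one fragment and expects one reply token.
clause :: String -> Clause String
clause out = Clause [out] $ \rs -> case rs of
  (r : rest) -> Just (r, rest)
  []         -> Nothing

query :: Clause (String, String)
query = (,) <$> clause "clause1" <*> clause "clause2"
```

With this model the whole request, output query, is available before any reply is parsed, and input query can then be run over the complete response.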

This formulation has proven to be quite cumbersome in practice, as the resulting continuation types reflect the depth-first traversal of the nested query expressions, and type errors can be quite unintuitive. (It is quite interesting though that each continuation/transformation function can receive not only the input from the immediately preceding clause, but from any of the preceding clauses, and also return more or fewer results. However, getting anything wrong can be very problematic in that it can lead to either downstream *or* upstream errors depending on how the clauses are composed into an overall query expression.)

An alternative to all this would be to use an algebraic datatype to specify the target language (with separate routines for the output and input operations), but that would appear to require another sum type to express the values to be received. I'd like to avoid that if possible since the projection of those values back into my program could lead to dynamic type errors, and also causes seemingly needless memory allocations.
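A small sketch of what that alternative might look like, with every name (Target, Value, render, receive, asInt) invented for illustration and the wire format reduced to a list of string tokens. It shows the two costs mentioned: each received value is wrapped in a Value constructor, and projecting it back out can fail at run time.

```haskell
import Data.Char (isDigit)

-- Hypothetical target-language AST.
data Target
  = Lit String
  | Seq [Target]

-- Output routine: render the entire request up front.
render :: Target -> [String]
render (Lit s)  = [s]
render (Seq ts) = concatMap render ts

-- Sum type for the values received from the server.
data Value
  = VInt Int
  | VText String
  | VList [Value]
  deriving (Eq, Show)

-- Assumed wire format: a non-empty run of digits is an integer,
-- anything else is text.
parseValue :: String -> Value
parseValue s
  | not (null s) && all isDigit s = VInt (read s)
  | otherwise                     = VText s

-- Input routine: walk the same AST and consume one reply per literal.
receive :: Target -> [String] -> Maybe (Value, [String])
receive (Lit _) (r : rest) = Just (parseValue r, rest)
receive (Lit _) []         = Nothing
receive (Seq ts) rs        = go ts rs []
  where
    go [] rest acc = Just (VList (reverse acc), rest)
    go (t : ts') rest acc = case receive t rest of
      Nothing         -> Nothing
      Just (v, rest') -> go ts' rest' (v : acc)

-- Projection back into the program: the type mismatch is only
-- detected at run time.
asInt :: Value -> Either String Int
asInt (VInt n) = Right n
asInt v        = Left ("expected an integer, got " ++ show v)
```

Here the type checker cannot tell that a given clause always yields an integer, so a caller of asInt has to handle the Left case even when it can never occur.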

There must be another technique for this sort of streaming translation out there... I welcome any suggestions you might have!


Warren Harris
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
