RE: Database interface - would like advice on oracle library bind ing
Alistair Bayley wrote: Still making slow progress on an Oracle database binding... now I'm trying to fit the API I have into some sort of abstract interface (like the one(s) discussed previously: Since the previous dicussion I've discovered Krasimir Angelov's HSQL interface as part of his HTOOLKIT: https://sourceforge.net/project/showfiles.php?group_id=65248 This has a uniform abstract interface similar to the one discussed, apparently implemented for MYSQL, ODBC, and PostgreSQL. I've only played with the MYSQL variant. If possible, it would seem sensible to provide a consistent interface for your Oracle binding. If not, it may be worth persuading him to generalise the interface. Then again, perhaps you knew about this already. Tim ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface - would like advice on oracle library bind ing
Yes, I know of it. Krasimir's code uses similar functions and types in the three bindings (MySQL, ODBC, PostgreSql), but the bindings do not share a common interface (AFAICT). Because they are so similar, making them support a common interface shouldn't be a lot of work. I was hoping the earlier discussions on database interfaces would converge to a solution most interested parties agreed on. I'm still sold on the left-fold interface discussed previously. Oleg claims it is sufficient, as it can be automatically converted to a stream-like interface. I bit of me agrees with you that I should be consistent with previous work, and another bit says make it simple and elegant, and this leads me towards Oleg's proposals (left-fold + auto-generated value extraction functions + auto-generated stream functions). Do others on this list (and elsewhere) have any further opinions as to how a database interface ought to look? -Original Message- From: Tim Docker [mailto:[EMAIL PROTECTED] Sent: 29 September 2003 13:43 To: Bayley, Alistair; [EMAIL PROTECTED] Subject: RE: Database interface - would like advice on oracle library bind ing Alistair Bayley wrote: Still making slow progress on an Oracle database binding... now I'm trying to fit the API I have into some sort of abstract interface (like the one(s) discussed previously: Since the previous dicussion I've discovered Krasimir Angelov's HSQL interface as part of his HTOOLKIT: https://sourceforge.net/project/showfiles.php?group_id=65248 This has a uniform abstract interface similar to the one discussed, apparently implemented for MYSQL, ODBC, and PostgreSQL. I've only played with the MYSQL variant. If possible, it would seem sensible to provide a consistent interface for your Oracle binding. If not, it may be worth persuading him to generalise the interface. Then again, perhaps you knew about this already. Tim * The information in this email and in any attachments is confidential and intended solely for the attention and use of the named addressee(s). This information may be subject to legal professional or other privilege or may otherwise be protected by work product immunity or other legal rules. It must not be disclosed to any person without our authority. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are not authorised to and must not disclose, copy, distribute, or retain this message or any part of it. * ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface - would like advice on oracle library binding
The following code illustrates a _generic_ interface to low-level database code. The left-fold iterator doQuery is completely generic over any possible iterator -- no matter how many columns the query returns, what are the types of these columns and what is the type of the seed (accumulator). The code for doQuery remains the same. The iterator allocates buffers for columns at the beginning and frees the buffers at the very end. Again, this buffer handling is generic. There is no longer need to write extraction/unmarshalling function for specific types of rows. We only need fetching functions for specific datatypes (not columns!). Again, the query and the row buffer management code is completely generic. I guess I'm repeating myself. The tests: -- Query returns one column of type String -- Never mind undefined: we return some static data in the buffers, -- we don't have any oracle to bind to test1 = doQuery undefined undefined iter1 ([]::[String]) where iter1:: String - [String] - Either [String] [String] iter1 s acc = Right $ s:acc -- Query returns two columns of types String and Int test2 = doQuery undefined undefined iter2 ([]::[(String,Int)]) where iter2:: String - Int - [(String,Int)] - Either [(String,Int)] [(String,Int)] iter2 s i acc = Right $ (s,i):acc -- Query returns three columns of types Int, String and Int test3 = doQuery undefined undefined iter3 ([]::[(Int,String,Int)]) where iter3:: Int - String - Int - [(Int,String,Int)] - Either [(Int,String,Int)] [(Int,String,Int)] iter3 i1 s i2 acc = Right $ (i1,s,i2):acc Use the function runtests to run either of these tests. The code follows. Compiler flags: -fglasgow-exts -fallow-overlapping-instances -- DB column buffers type BufferSize = Int data BufferType = ORA_char | ORA_int type Position = Int -- column number of the result table data Buffer = Buffer { bufptr :: String -- for this stub, just use String , nullindptr :: String -- likewise , retsizeptr :: String -- likewise , size:: BufferSize , pos:: Position , ora_type:: BufferType } -- understandably, below is just a stub ... alloc_buffer (siz, typ) ps = return $ Buffer { bufptr = show ps, pos = ps, size = siz, ora_type = typ} -- In this stub, don't do anything free ptr = return () -- DB Column types class DBType a where alloc_buffer_hints:: a - (BufferSize, BufferType) col_fetch:: Buffer - IO a instance DBType String where alloc_buffer_hints _ = (2000, ORA_char) col_fetch buffer = return (bufptr buffer) instance DBType Int where alloc_buffer_hints _ = (4, ORA_int) col_fetch buffer = return (read $ bufptr buffer) -- need to add more ... -- Row iteratees. Note, the folowing two instances cover ALL possible -- iteratees. No other instances are needed class SQLIteratee iter seed where iter_apply:: [Buffer] - seed - iter - IO (Either seed seed) alloc_buffers:: Position - iter - seed - IO [Buffer] instance (DBType a) = SQLIteratee (a-seed-Either seed seed) seed where iter_apply [buf] seed fn = col_fetch buf = (\v - return$ fn v seed) alloc_buffers n _ _ = sequence [alloc_buffer (alloc_buffer_hints (undefined::a)) n] instance (SQLIteratee iter' seed, DBType a) = SQLIteratee (a-iter') seed where iter_apply (buf:others) seed fn = col_fetch buf = (\v - iter_apply others seed (fn v)) alloc_buffers n fn seed = do this_buffer - alloc_buffer (alloc_buffer_hints (undefined::a)) n other_buffers - alloc_buffers (n+1) (fn (undefined::a)) seed return (this_buffer:other_buffers) free_buffers = mapM_ free -- The left fold iterator -- the query executor data Session -- not relevant for this example data SQLStmt db_execute session query = return () db_fetch_row buffers = return () -- use static data doQuery:: (SQLIteratee iter seed) = Session - SQLStmt - iter - seed - IO seed -- In this example, we just allocate buffers, fetch two rows and terminate -- with a clean-up doQuery session query iteratee seed = do buffers - alloc_buffers 0 iteratee seed db_execute session query db_fetch_row buffers (Right seed) - iter_apply buffers seed iteratee db_fetch_row buffers (Right seed) - iter_apply buffers seed iteratee free_buffers buffers return seed -- Tests -- Query returns one column of type String test1 = doQuery undefined undefined iter1 ([]::[String]) where iter1:: String - [String] - Either [String] [String] iter1 s acc = Right $ s:acc -- Query returns two columns of types String and Int test2 = doQuery undefined undefined iter2 ([]::[(String,Int)]) where iter2:: String - Int - [(String,Int)] - Either [(String,Int)] [(String,Int)] iter2 s i acc = Right $ (s,i):acc -- Query returns three columns of types Int, String and Int test3 = doQuery
RE: Database interface - would like advice on oracle library bind ing
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] I'd like to remark first that many real databases let us avoid opening many cursors at once. It seems to be more efficient to do as much as possible within the engine ... I agree (and Tom Pledger makes the same comment). It's rare that I want to have more than one cursor open at once. When I've seen other developers do it, they're often doing something that would be better accomplished as a join. And that last time I remember doing it, it was a nested loops situation (which, on reflection, probably could have been done as a join... oh well). OTOH, I have seen interleaved fetches used to simulate a join where the specific DBMS product struggled to perform the join efficiently itself. In this case the developer is making up for a deficiency in the DBMS, and I don't want to exclude the option from the interface (if you agree with what you read on http://www.dbdebunk.com , then most DBMS products are deficient in a number of ways). Still, opening several cursors may be unavoidable. The left fold approach may still help -- we _can_ mechanically invert a left fold combinator (to be precise, a half-baked version of it) into a lazy list. Please see a separate message how to invert the left fold Found it. I'll have a look... I believe the extract functions can be constructed automatically -- similar to the way Quickcheck constructs test cases. I believe that instead of I though this might be possible, but I had no idea how to do it. I'll have a look at the Quickcheck source to see how it's done, unless you can suggest a better example. buffers. It's hard for me to write the corresponding code because I don't have Oracle handy (besides, I like Informix better). Is it possible to come up with a stub that uses flat files? We are only interested in fetching rows. It doesn't matter if these rows come from a file or from a database. That would make prototyping the interface quite easier. I wasn't interested so much in prototyping the interface, as trying to write an implementation that supported the interface(s) discussed previously. I intended to provide the left-fold interface, and was wondering if that was all that was needed (for selects). Still, this would be a good exercise for me, at least so I can figure out how to generate extraction functions. From: Tom Pledger [mailto:[EMAIL PROTECTED] Here's one approach: find the OCI equivalent of JDBC's ResultSetMetaData, and use it to drive the allocation and freeing of buffers. I've considered this, and I think it's the next route I'll take (the OCI supports it). At the least it'll give me the ability to perform arbitrary queries (at present I have to know the types of the result set columns and construct the extraction function manually). I've also considered stuffing more information into the Cursor type (which I've introduced), and using this in a modal fashion to decide what to do at each point. Here's another: ... Make getInt (and friends) behave differently depending on the mode of the cursor they're passed: either allocate a buffer and return _|_, decode and return the current column of the current row, or free a buffer and return _|_. Not wanting to sound too dumb, but how do you return _|_ ? Thanks, Alistair. * The information in this email and in any attachments is confidential and intended solely for the attention and use of the named addressee(s). This information may be subject to legal professional or other privilege or may otherwise be protected by work product immunity or other legal rules. It must not be disclosed to any person without our authority. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are not authorised to and must not disclose, copy, distribute, or retain this message or any part of it. * ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Database interface - would like advice on oracle library bind ing
Here's another: ... Make getInt (and friends) behave differently depending on the mode of the cursor they're passed: either allocate a buffer and return _|_, decode and return the current column of the current row, or free a buffer and return _|_. Not wanting to sound too dumb, but how do you return _|_ ? Return undefined or better, error Erroneously demanded bottom result of buffer allocation phase Note that undefined :: a undefined = undefined error :: String - a {- defined internally -} HTH. --KW 8-) -- Keith Wansbrough [EMAIL PROTECTED] http://www.cl.cam.ac.uk/users/kw217/ University of Cambridge Computer Laboratory. ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Database interface - would like advice on oracle library binding
(2nd attempt; mailman thinks I'm not a list member, but it still keeps sending me mail.) Still making slow progress on an Oracle database binding... now I'm trying to fit the API I have into some sort of abstract interface (like the one(s) discussed previously: http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ). 1. Is the left-fold the best/only interface to expose? I think yes, but that doesn't allow fine-grained control over cursors i.e. being able to open many cursors at once and interleave fetches from them. Or does it? 2. I'm finding it hard to write a doQuery function that takes an extraction function that isn't a pig to write. Some advice would be useful here... (and a long-ish explanation follows): The Oracle Call Interface (OCI) requires that I allocate buffers for the result of a single row fetch, before the first row is fetched. So a query involves: - prepare statement etc - allocate buffers (one for each column - call OCI C function DefineByPos) - fetch row - extract/marshal data from buffer into Haskell types (which are then processed by fold function) - fetch, marshal (repeat until no more rows) - free buffers i.e. the same buffers are re-used for each row. The problem for me is how to specify the left-fold function in terms of the low-level API. If I want to supply extraction functions (to extract Haskell values from result buffer), how do I manage the buffer allocation in the doQuery function? The buffer allocate/free code also needs to know the column positions and types, in the same manner as the extract functions. I want to be able to write code like this: results - doQuery dbconn sqltext [] \row results - do name- stringv row 1 address - stringv row 2 return (name,address):results .. where the stringv function extracts/marshals a Haskell String from the result buffer. The intermediate approach I currently have means I have to pass an IO action into the doQuery function that, when evaluated, allocates the buffer and returns two more actions: - an action that extracts the row as a tuple - another action that frees the buffer The doQuery function evaluates the initial action (to allocate the buffer), uses the extract action to build the result (at present a list), and when there are no more rows, uses the free action to free the buffer. This approach is quite awkward (especially w.r.t. writing extract functions), and it's hard for me to see how to build a better interface. Hard, because of the memory management requirements. Here's a chunk of the code. A lot of it is OCI plumbing, but I hope you can see how awkward it is to create an extract function (see ex3 at the bottom). Given pointers to the buffer, extract a string of variable length (you have to terminate it yourself). fetchStringVal :: OCIColInfo - IO String fetchStringVal (_, bufptr, nullindptr, retsizeptr) = do retsize - liftM cShort2Int (peek retsizeptr) nullind - liftM cShort2Int (peek nullindptr) -- unused pokeByteOff (castPtr bufptr) retsize nullByte val - peekCString (castPtr bufptr) return val Free a single column's buffer. freeColBuffer :: OCIColInfo - IO () freeColBuffer (_, bufptr, nullindptr, retsizeptr) = do free bufptr free retsizeptr free nullindptr Create a buffer for a string column, and return the extract and free IO actions. getExtractFnString :: Int - ErrorHandle - StmtHandle - IO (IO String, IO ()) getExtractFnString posn err stmt = do c - defineCol err stmt posn 2000 oci_SQLT_CHR return ((fetchStringVal c), (freeColBuffer c)) doQuery uses the extractFns action to create the result buffer, and the two actions (extract and free) which are passed to doQuery2. doQuery2 :: ErrorHandle - StmtHandle - IO a - IO () - [a] - IO [a] doQuery2 err stmt extractData freeMem results = do rc - fetch err stmt if rc == oci_NO_DATA then do freeMem return results else do v - extractData doQuery2 err stmt extractData freeMem (v:results) doQuery :: Session - String - (ErrorHandle - StmtHandle - IO (IO a, IO ())) - IO [a] doQuery (Sess env err con) qry extractFns = do stmt - getStmt env err prepare err stmt qry execute err con stmt (extractData, freeMem) - extractFns err stmt doQuery2 err stmt extractData freeMem [] The interface provided by doQuery means I have to write extract functions like this. Here's one for a select that returns three String columns. It's quite awkward... ex3 :: ErrorHandle - StmtHandle - IO (IO (String, String, String), IO ()) ex3 err stmt = do (fetchcol1, freecol1) - getExtractFnString 1 err stmt (fetchcol2, freecol2) - getExtractFnString 2 err stmt (fetchcol3, freecol3) - getExtractFnString 3 err stmt return ( do { s1 - fetchcol1; s2 - fetchcol2; s3 - fetchcol3; return (s1, s2, s3) } , do { freecol1; freecol2; freecol3 } )
Database interface - would like advice on oracle library binding
Bayley, Alistair writes: : | Still making slow progress on an Oracle database binding... now I'm trying | to fit the API I have into some sort of abstract interface (like the one(s) | discussed previously: | http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ). | | | 1. Is the left-fold the best/only interface to expose? I think yes, | but that doesn't allow fine-grained control over cursors i.e. being | able to open many cursors at once and interleave fetches from | them. Or does it? It looks like the interleaving would be limited to a nested loop structure: a cursor could be processed in full during one extraction for another cursor. Application-side nested loop structures are often a poor substitute for server-side joins. | 2. I'm finding it hard to write a doQuery function that takes an | extraction function that isn't a pig to write. Some advice would be | useful here... (and a long-ish explanation follows): : Here's my attempt to summarise the piggishness you describe: The interface to Oracle requires that you initialise a cursor by allocating a suitably typed buffer for each column prior to fetching the first row, and finalise a cursor by freeing those buffers after fetching the last row. This means that we must iterate over the columns 3 times. We would prefer to express this iteration only once, and have the other 2 happen automatically within the library. (As opposed to what ex3 does, which is to iterate for getExtractFnString, iterate for fetchcolN, and iterate for freecolN.) Here's one approach: find the OCI equivalent of JDBC's ResultSetMetaData, and use it to drive the allocation and freeing of buffers. Here's another: Add a mode attribute to the abstract data type which encompasses ErrorHandle and StmtHandle. (I'll persist in referring to that ADT as Cursor.) Expect extraction functions to be written along these lines: \cursor result - do field1 - getIntcursor field2 - getString cursor field3 - getString cursor return ((field1, field2, field3):result, True) Make getInt (and friends) behave differently depending on the mode of the cursor they're passed: either allocate a buffer and return _|_, decode and return the current column of the current row, or free a buffer and return _|_. doQuery could then apply the extraction function once in Allocate mode after opening the cursor, once per fetched row in Decode mode, and once in Free mode at the end. There's nothing to stop an extraction function from varying the number of get___ functions it applies, or trying to match their results when not in Decode mode. These weakness could be mitigated by: Pointing out that some database connection standards (JDBC, and for all I know also ODBC) don't guarantee that you can still get at a row's 1st column after you've looked at its 2nd column, i.e. there's a precedent for such fragility. Complicating the extraction functions by giving them the type (Cursor - b - IO (IO (b, Bool))) , expecting that all the get___ functions are applied in the outer IO layer, and undertaking that the inner IO layer will only be used in Decode mode. Regards, Tom ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
Tom Pledger wrote: How about introducing a Cursor abstract data type? doquery :: Process - String - b - (Cursor - b - IO (b, Bool)) - IO b stringv :: Cursor - CInt - IO String doublev :: Cursor - CInt - IO Double intv:: Cursor - CInt - IO Int This achieves the restriction you're after, because doquery is the only exported producer of Cursor, and stringv etc. are the only exported consumers of Cursor. I like this. Not sure about whether I'd call it Cursor or just Row. Code something like results - doquery dbconn sqltext [] \row results - do name- colv row 1 address - colv row 2 return (name,address):results seems quite tidy. It also has the benefit that the function you pass to doquery can make other calls to doquery, without mucking up the 'current row' state. There would be one current row per Cursor, not one per Process. Is it normal or common to support multiple simultaneous queries on a single DB connection? Tim ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
From: Tim Docker [mailto:[EMAIL PROTECTED] It also has the benefit that the function you pass to doquery can make other calls to doquery, without mucking up the 'current row' state. There would be one current row per Cursor, not one per Process. Is it normal or common to support multiple simultaneous queries on a single DB connection? Depends on what you mean by multiple simultaneous queries. A single query is usually synchronous w.r.t. returning a cursor i.e. you issue the query and get back a cursor in an atomic operation, and then you can do other things while you fetch rows from the cursor. If you need to have two SQL statements running at the same time (two big inserts or updates, say) then you usually need two sessions. It probably depends on the API, but some (Oracle, I believe) will allow you to open multiple sessions with the same connection. In a single session you could open a number of cursors (in any order), interleave fetches (in any order), and then close the cursors (in any order). So you could have: - a number of Connections (perhaps to different servers/databases), where - each Connection has one or more Sessions, and - each Session has zero or more Cursors (and one current Transaction). My session model and terminology are probably quite Oracle specific, but I hope it's reasonably clear. * The information in this email and in any attachments is confidential and intended solely for the attention and use of the named addressee(s). This information may be subject to legal professional or other privilege or may otherwise be protected by work product immunity or other legal rules. It must not be disclosed to any person without our authority. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are not authorised to and must not disclose, copy, distribute, or retain this message or any part of it. * ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
Tim Docker writes: : | Is it normal or common to support multiple simultaneous queries on | a single DB connection? In transaction processing, yes. There's an idiom where you use one query to select all the (financial) transactions in a batch, but there's so much variation in how you need to process each row, that you can't express it all in one SQL query. So you use a variety of little queries during the processing of each row of the main query. The result of the fold in your doquery could be, say, a record of batch totals. - Tom ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
Tim Docker writes: : | The list being folded over | is implied by the DB query, is accessible through the IO monad. | Hence a parameter is not required. It would really be: | | doquery :: Process - String - b - (b - IO (b,Bool)) - IO b : | One thing that I am unsure about is whether the column value | accessing functions that I specified before | |stringv :: Process - CInt - IO String |doublev :: Process - CInt - IO Double |intv:: Process - CInt - IO Int | | should return actions in the IO monad as above, or instead should | be in some other DBQuery monad, that trivially extends the IO monad, | but is only valid inside the doquery call. This would have the benefit | of restricting the column access functions to inside a query via the | type system. How about introducing a Cursor abstract data type? doquery :: Process - String - b - (Cursor - b - IO (b, Bool)) - IO b stringv :: Cursor - CInt - IO String doublev :: Cursor - CInt - IO Double intv:: Cursor - CInt - IO Int This achieves the restriction you're after, because doquery is the only exported producer of Cursor, and stringv etc. are the only exported consumers of Cursor. It also has the benefit that the function you pass to doquery can make other calls to doquery, without mucking up the 'current row' state. There would be one current row per Cursor, not one per Process. | I'd also probably use a typeclass to specify a single colv function. | ie: | |class DBCol a where |colv :: DBQuery a | |instance DBCol String where... |instance DBCol Double where... Good idea. The user can always use explicit type signatures to resolve ambiguity, and even then the code size will be about the same as with stringv etc. - Tom ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
Tom Pledger writes: By the way, how does the a in a - b - IO (b, Bool) work? It looks like it has something to do with the current row. Does doquery have to accommodate any possible substitution for a ? I fired this off without thinking about it too much, and looking at the prelude type signatures for fold. The list being folded over is implied by the DB query, is accessible through the IO monad. Hence a parameter is not required. It would really be: doquery :: Process - String - b - (b - IO (b,Bool)) - IO b I don't have a preference, but offer this view of the options: With an exception, Stop, and return the last b you saw. With a boolean,Stop, and return this b. I think I like the behavior where, when the bool in the tuple is true, the b in the tuple is immediately returned from the query. Exceptions would be propagated to the caller of doquery without modification (but with appropriate cleanups). One thing that I am unsure about is whether the column value accessing functions that I specified before stringv :: Process - CInt - IO String doublev :: Process - CInt - IO Double intv:: Process - CInt - IO Int should return actions in the IO monad as above, or instead should be in some other DBQuery monad, that trivially extends the IO monad, but is only valid inside the doquery call. This would have the benefit of restricting the column access functions to inside a query via the type system. I'd also probably use a typeclass to specify a single colv function. ie: class DBCol a where colv :: DBQuery a instance DBCol String where... instance DBCol Double where... doquery :: Process - String - b - (b - DBQuery (b,Bool)) - IO b Any comments? Tim ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
[EMAIL PROTECTED] wrote: If I may interject, that's precisely how a Scheme DB interface is designed. The main function is a left-fold. Not quite though: it provides for a premature termination: A major procedure: DB1:fold-left PROC INITIAL-SEED QUERY-OBJECT Premature termination sounds like a useful property. I can see two ways this could be done: keep the previous signature, and use an exception to exit early, or add a boolean return value like the scheme version: doquery :: Process - String - (a - b - IO (b,Bool)) - b - IO b Any opinions on which alternative wouuld be preferable? Tim ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
I'll probably generalise the query function to do a fold, rathen than always accumulate a list: doquery :: Process - String - (a - b - IO b) - b - IO b If I may interject, that's precisely how a Scheme DB interface is designed. The main function is a left-fold. Not quite though: it provides for a premature termination: A major procedure: DB1:fold-left PROC INITIAL-SEED QUERY-OBJECT A QUERY-OBJECT (which in this implementation is a list of fragments that make a SQL statement, in the reverse order -- without the terminating semi-colon) is submitted to the database, using the default database connection. PROC is a procedure: SEED COL COL ... The procedure PROC takes 1+n arguments where n is the number of columns in the the table returned by the query. The procedure PROC must return two values: CONTINUE? NEW-SEED The query is executed, and the PROC is applied to each returned row in order. The first invocation of PROC receives INITIAL-SEED as its first argument. Each following invocation of PROC receives as the first argument the NEW-SEED result of the previous invocation of PROC. The CONTINUE? result of PROC is an early termination flag. If that flag is returned as #f, any further applications of PROC are skipped and DB1:fold-left finishes. The function DB1:fold-left returns NEW-SEED produced by the last invocation of PROC. If the query yielded no rows, DB1:fold-left returns the INITIAL-SEED. Thus DB1:fold-left is identical to the left fold over a sequence, modulo the early termination. There are a few minor variants of the above procedure, optimized for common particular cases: a query that is expected to return at most one row, and a query that expects to return at most one row with exactly one column. The latter query has the same interface as a lookup in a finite map. The QUERY-OBJECT is of coursed not built by hand. There is a domain-specific language (which greatly resembles the source language and makes some use of quasi-quotation) whose result is a query object. Therefore, I can write both the query and the handlers of the query result using the same syntax. The premature termination is important. Database connections and cursors are too precious resources to leave them for the garbage collector to finalize. More discussion and pointers can be found at http://srfi.schemers.org/srfi-44/mail-archive/msg00023.html The interface has been used in an industrial application. ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Database interface
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I still want to use HaskellDB (or at least the relational calculus part of it), so I was thinking of splitting it into two pieces: a library that submits SQL queries and returns the results (database interface), and a library that constructs queries with the relational calculus and generates SQL from them. Obviously the database interface must be written for each DBMS product, while the relational calculus library ought to be independent of DBMS (although it might be wise to have SQL compatibility flags for the SQL it generates. For example: Oracle 8 (what I'm using now) only supports SQL-92). Ideally, we would have the following. 1/ A high level combinator library for relational calculus built on, 2/ A standard low-level Haskell API i.e. functions like Connection - String - ( a - b - IO b) - IO b 3/ Database specific bindings. The problem is that projections and cartesian products generate new types, at least in the way it was done in HaskellDB. The form the calculus takes would need to be substantially reworked. Tom -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/O6DBYha8TWXIQwoRAmkCAKCwZkNpU3TjX5bj7rOZHx5DXRjPhwCgihnY sAWwEKX+5hZ1Tiu4wfSGo7Y= =ALnJ -END PGP SIGNATURE- ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
Tim Docker writes: | Tom Pledger writes: | | This is a pretty good way to stop those nasty vague SQL row types at | the Haskell border and turn them into something respectable. Perhaps | it would even be worth constraining the extracted type to be in | DeepSeq | | doquery :: (DeepSeq v) = | Process - String - IO v - IO [v] | | Can you explain what the constraint does here? Yes. It soothes my superstitious fears about unsafeInterleaveIO. I imagined that: - doquery (like getContents) uses unsafeInterleaveIO to make the resulting list lazy, i.e. avoid grabbing all the rows at the outset, - the unsafe interleaving would *somehow* get into the extraction action, and - the current row could be overwritten while some stringv (etc.) actions on it were yet to happen, so - a good safety measure would be to force evaluation (hence DeepSeq) of each list element before overwriting the current row. But now that you ask, I think it's possible for doquery to stop the unsafeInterleaveIO from leaking into the extraction action, so the DeepSeq would only protect users from shooting themselves in the foot with weird queries like this doquery p select s from t (unsafeInterleaveIO (stringv p 1)) -- DeepSeq would allow doquery to undo the effect of unsafeInterleaveIO or this [a1, a2] - doquery p select s from t (return (stringv p 1)) s2 - a2 s1 - a1 -- DeepSeq would disallow this at compile time - Tom ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Database interface
Thomas L. Bevan writes: | Does anyone know if there is work being done on a standard Haskell | database interface. I suspect that there isn't. The pattern seems to be that someone gets an interface working well enough for some purposes, and perhaps shares it, but is too modest and/or busy to put it forward as a standard. For a row extraction mechanism, I'd vote for passing an extraction function (or IO action) to the main query function, like Tim Docker described last month. http://haskell.cs.yale.edu/pipermail/haskell-cafe/2003-July/004684.html This is a pretty good way to stop those nasty vague SQL row types at the Haskell border and turn them into something respectable. Perhaps it would even be worth constraining the extracted type to be in DeepSeq doquery :: (DeepSeq v) = Process - String - IO v - IO [v] so that the interface clearly may confine its attention to one row at a time per cursor. - Tom ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
Tom Pledger writes: This is a pretty good way to stop those nasty vague SQL row types at the Haskell border and turn them into something respectable. Perhaps it would even be worth constraining the extracted type to be in DeepSeq doquery :: (DeepSeq v) = Process - String - IO v - IO [v] Can you explain what the constraint does here? Although I haven't touched the db interface in the meantime, when/if I do, I'll probably generalise the query function to do a fold, rathen than always accumulate a list: doquery :: Process - String - (a - b - IO b) - b - IO b Tim ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Database interface
-Original Message- From: Thomas L. Bevan [mailto:[EMAIL PROTECTED] This type of interface is fine as far as it goes, but there is only ever a loose coupling between the database and application. The HaskellDB work implemented a relational calculus that ensured that all queries were checked at compile time for validity against a particular database. HaskellDB doesn't actually type check against the database; it checks against types defined in a module, which is generated from the database. If you change the database, then you need to regenerate the module. I prefer the approach of passing an extraction function/IO action to the interface. This is the approach taken by HaSql, HaskellDB (I think, although it's so indirect I find it hard to follow), and Tim Docker's Sybase library, and it's the design I intend to use for my Oracle library (when I get around to it). I still want to use HaskellDB (or at least the relational calculus part of it), so I was thinking of splitting it into two pieces: a library that submits SQL queries and returns the results (database interface), and a library that constructs queries with the relational calculus and generates SQL from them. Obviously the database interface must be written for each DBMS product, while the relational calculus library ought to be independent of DBMS (although it might be wise to have SQL compatibility flags for the SQL it generates. For example: Oracle 8 (what I'm using now) only supports SQL-92). * The information in this email and in any attachments is confidential and intended solely for the attention and use of the named addressee(s). This information may be subject to legal professional or other privilege or may otherwise be protected by work product immunity or other legal rules. It must not be disclosed to any person without our authority. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are not authorised to and must not disclose, copy, distribute, or retain this message or any part of it. * ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Designing a Haskell database interface
I'm making slow progress on an Oracle OCI binding. I've got the noddy session setup and database connection stuff working, so now I'm looking at how results should be returned from SQL queries. In Haskell, there doesn't seem to be any consistent way of returning results from SQL queries, unlike (say) Java's JDBC interface. It may be a bit early to propose a standard SQL dbms interface design, but... does anyone have any idea(s) about how one ought to look? Here's a simple survey of the sql dbms interfaces I've come across so far (I found HaskellDB the most complex, and difficult to understand). My assumptions about how these various libraries work might be quite wrong: HaskellDB: dbQuery returns IO [row r], where - dbQuery is implemented by adoQuery (this library uses Odbc): adoQuery :: IConnection () - PrimQuery - Rel r - IO [AdoRow r] - AdoRow a is an instance of class Row - class Row row a declares one function: rowSelect rowSelect :: Attr r a - row r - a - type AdoRow implements rowSelect with adoRowSelect: adoRowSelect :: Variant a = Attr r a - AdoRow r - a - Attr has one constructor: Attr Attribute, and Attribute is just a synonym for String. So (I think) the return types are determined by the types of the phantom types in Attr r a and AdoRow r. And this is where my head explodes... LibPQ: fetchAllRows returns tuple pair of the connection and row : (DBI a, [[String]]) - so, a row is a list of list of Strings. HaSql: haSQLObtainQueryResults returns SQL [a], where - SQL is some custom IO+State Monad - the type of a is the return type of a function you pass to haSQLObtainQueryResults to (I think) convert an Odbc pointer into a Haskell type. MySql-hs: mysqlQuery returns a tuple of (Integral, [[Maybe String]], [MysqlField]), where - the first element (Integral type) is the number of rows - the second element is a list of list of Strings - the result set. - the third element is metadata. MysqlField is a record describing a database column. The most sophisticated implementation wrt type information seems to be HaskellDB. My initial goal was to use this library and provide an Oracle database driver for it, but the HaskellDB seems to be quite dependent on Trex, which AFAICT is a Hugs library. I was thinking of splitting it into two parts: a library that submits SQL queries and returns the results, and a library that constructs queries with the relational calculus and generates SQL from them. The relational calculus bit was what I was interested in, but for now I want to work on getting data out of my Oracle database. Also... I was wondering how you might go about mapping arbitrary dbms types to a Haskell result set. In an ideal world, you can store any values you like in a relational database. However, most SQL dbms products give you just numbers, text, and dates. Some dbms's (PostgreSql and Oracle) allow users/programmers to create new types and let the dbms treat them in the same manner as the built-in ones i.e. with equality and ordering predicates, and indexing. The built-in support for the three basic types (numbers, text, dates) is reflected in the JDBC API; it has methods like: getByte, getShort, getFloat, getDouble, getBigDecimal, getInt, getString, getDate, getTime, getBoolean, while other types are handled by methods like getBinaryStream and getObject. So how would you convert a (say) PostgreSql Point or Box value to a Haskell type? Would the approach taken by HaSql be the best (the user provides a function that converts binary data into a Haskell value)? * The information in this email and in any attachments is confidential and intended solely for the attention and use of the named addressee(s). This information may be subject to legal professional or other privilege or may otherwise be protected by work product immunity or other legal rules. It must not be disclosed to any person without our authority. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are not authorised to and must not disclose, copy, distribute, or retain this message or any part of it. * ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: Designing a Haskell database interface
I wrote an experimental wrapper around the sybase db client libraries. This was my first attempt at using the ffi, and I was impressed with the fact that I got enough working purely using haskell. My simplistic approach was to embed everything in the IO monad, wrapping each sybase call trivially, and then building slightly higher level functions on top of each. Of course, the fun with handling queries is that you want to get values of types defined by the query string (ie unknown at compile time) from the database and into the typed haskell world. In my approach, the top level interface looked like: open :: String - String - String - IO Process -- open a connection to the database returning an abstract -- process object doquery :: Process - String - IO v - IO [v] -- Takes: a db process; the SQL query string; and -- a (no parameter) IO action that, when executed -- during a query returns a value of type v derived -- from the current row -- -- Returns the list of values of type v derived from -- each row. stringv :: Process - CInt - IO String doublev :: Process - CInt - IO Double intv:: Process - CInt - IO Int datetimev :: Process - CInt - IO DateTime -- Returns the string/double/int/datetime value of the specified -- column in the current row You could then write queries something like this: dbp - DB.open SERVER user password rows - DB.doquery dbp queryText $ do name - DB.stringv dbp 1 v1 - DB.doublev dbp 2 v2 - DB.doublev dbp 3 return (label,v1,v2) At which point rows would have type [(String,Double,Double)]. This did the job, but I'd be interested in better ways of doing it! Tim -Original Message- From: Bayley, Alistair [mailto:[EMAIL PROTECTED] Sent: Monday, July 14, 2003 4:25 PM To: [EMAIL PROTECTED] Subject: Designing a Haskell database interface I'm making slow progress on an Oracle OCI binding. I've got the noddy session setup and database connection stuff working, so now I'm looking at how results should be returned from SQL queries. In Haskell, there doesn't seem to be any consistent way of returning results from SQL queries, unlike (say) Java's JDBC interface. It may be a bit early to propose a standard SQL dbms interface design, but... does anyone have any idea(s) about how one ought to look? Here's a simple survey of the sql dbms interfaces I've come across so far (I found HaskellDB the most complex, and difficult to understand). My assumptions about how these various libraries work might be quite wrong: HaskellDB: dbQuery returns IO [row r], where - dbQuery is implemented by adoQuery (this library uses Odbc): adoQuery :: IConnection () - PrimQuery - Rel r - IO [AdoRow r] - AdoRow a is an instance of class Row - class Row row a declares one function: rowSelect rowSelect :: Attr r a - row r - a - type AdoRow implements rowSelect with adoRowSelect: adoRowSelect :: Variant a = Attr r a - AdoRow r - a - Attr has one constructor: Attr Attribute, and Attribute is just a synonym for String. So (I think) the return types are determined by the types of the phantom types in Attr r a and AdoRow r. And this is where my head explodes... LibPQ: fetchAllRows returns tuple pair of the connection and row : (DBI a, [[String]]) - so, a row is a list of list of Strings. HaSql: haSQLObtainQueryResults returns SQL [a], where - SQL is some custom IO+State Monad - the type of a is the return type of a function you pass to haSQLObtainQueryResults to (I think) convert an Odbc pointer into a Haskell type. MySql-hs: mysqlQuery returns a tuple of (Integral, [[Maybe String]], [MysqlField]), where - the first element (Integral type) is the number of rows - the second element is a list of list of Strings - the result set. - the third element is metadata. MysqlField is a record describing a database column. The most sophisticated implementation wrt type information seems to be HaskellDB. My initial goal was to use this library and provide an Oracle database driver for it, but the HaskellDB seems to be quite dependent on Trex, which AFAICT is a Hugs library. I was thinking of splitting it into two parts: a library that submits SQL queries and returns the results, and a library that constructs queries with the relational calculus and generates SQL from them. The relational calculus bit was what I was interested in, but for now I want to work on getting data out of my Oracle database. Also... I was wondering how you might go about mapping arbitrary dbms types to a Haskell result set. In an ideal world, you can store any values you like in a relational database. However, most SQL dbms products give you just numbers, text, and dates. Some dbms's (PostgreSql and Oracle) allow users