RE: Database interface - would like advice on oracle library bind ing

2003-09-29 Thread Tim Docker

Alistair Bayley wrote:

 Still making slow progress on an Oracle database binding... 
 now I'm trying to fit the API I have into some sort of 
 abstract interface (like the one(s) discussed previously:


Since the previous dicussion I've discovered Krasimir
Angelov's HSQL interface as part of his HTOOLKIT:

https://sourceforge.net/project/showfiles.php?group_id=65248

This has a uniform abstract interface similar to the one
discussed, apparently implemented for MYSQL, ODBC, and
PostgreSQL. I've only played with the MYSQL variant.

If possible, it would seem sensible to provide a consistent
interface for your Oracle binding. If not, it may be worth
persuading him to generalise the interface.

Then again, perhaps you knew about this already.

Tim
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface - would like advice on oracle library bind ing

2003-09-29 Thread Bayley, Alistair
Yes, I know of it.

Krasimir's code uses similar functions and types in the three bindings
(MySQL, ODBC, PostgreSql), but the bindings do not share a common interface
(AFAICT). Because they are so similar, making them support a common
interface shouldn't be a lot of work.

I was hoping the earlier discussions on database interfaces would converge
to a solution most interested parties agreed on. I'm still sold on the
left-fold interface discussed previously. Oleg claims it is sufficient, as
it can be automatically converted to a stream-like interface. I bit of me
agrees with you that I should be consistent with previous work, and another
bit says make it simple and elegant, and this leads me towards Oleg's
proposals (left-fold + auto-generated value extraction functions +
auto-generated stream functions).

Do others on this list (and elsewhere) have any further opinions as to how a
database interface ought to look?


 -Original Message-
 From: Tim Docker [mailto:[EMAIL PROTECTED]
 Sent: 29 September 2003 13:43
 To: Bayley, Alistair; [EMAIL PROTECTED]
 Subject: RE: Database interface - would like advice on oracle library
 bind ing
 
 
 
 Alistair Bayley wrote:
 
  Still making slow progress on an Oracle database binding... 
  now I'm trying to fit the API I have into some sort of 
  abstract interface (like the one(s) discussed previously:
 
 
 Since the previous dicussion I've discovered Krasimir
 Angelov's HSQL interface as part of his HTOOLKIT:
 
 https://sourceforge.net/project/showfiles.php?group_id=65248
 
 This has a uniform abstract interface similar to the one
 discussed, apparently implemented for MYSQL, ODBC, and
 PostgreSQL. I've only played with the MYSQL variant.
 
 If possible, it would seem sensible to provide a consistent
 interface for your Oracle binding. If not, it may be worth
 persuading him to generalise the interface.
 
 Then again, perhaps you knew about this already.
 
 Tim


*
The information in this email and in any attachments is 
confidential and intended solely for the attention and use 
of the named addressee(s). This information may be 
subject to legal professional or other privilege or may 
otherwise be protected by work product immunity or other 
legal rules.  It must not be disclosed to any person without 
our authority.

If you are not the intended recipient, or a person 
responsible for delivering it to the intended recipient, you 
are not authorised to and must not disclose, copy, 
distribute, or retain this message or any part of it.
*

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface - would like advice on oracle library binding

2003-09-25 Thread oleg

The following code illustrates a _generic_ interface to low-level
database code. The left-fold iterator doQuery is completely generic
over any possible iterator -- no matter how many columns the query
returns, what are the types of these columns and what is the type of
the seed (accumulator). The code for doQuery remains the same. The
iterator allocates buffers for columns at the beginning and frees the
buffers at the very end. Again, this buffer handling is generic. There
is no longer need to write extraction/unmarshalling function for
specific types of rows. We only need fetching functions for specific
datatypes (not columns!). Again, the query and the row buffer
management code is completely generic. I guess I'm repeating
myself. The tests:

-- Query returns one column of type String
-- Never mind undefined: we return some static data in the buffers,
-- we don't have any oracle to bind to
test1 = doQuery undefined undefined iter1 ([]::[String])
  where
 iter1:: String - [String] - Either [String] [String]
 iter1 s acc = Right $ s:acc 


-- Query returns two columns of types String and Int
test2 = doQuery undefined undefined iter2 ([]::[(String,Int)])
  where
 iter2:: String - Int - [(String,Int)] - 
 Either [(String,Int)] [(String,Int)]
 iter2 s i acc = Right $ (s,i):acc 


-- Query returns three columns of types Int, String and Int
test3 = doQuery undefined undefined iter3 ([]::[(Int,String,Int)])
  where
 iter3:: Int - String - Int - [(Int,String,Int)] - 
 Either [(Int,String,Int)] [(Int,String,Int)]
 iter3 i1 s i2 acc = Right $ (i1,s,i2):acc 

Use the function runtests to run either of these tests.


The code follows. Compiler flags: 
-fglasgow-exts -fallow-overlapping-instances

-- DB column buffers

type BufferSize = Int
data BufferType = ORA_char | ORA_int
type Position = Int  -- column number of the result table

data Buffer = Buffer { bufptr :: String -- for this stub, just use String
 , nullindptr :: String -- likewise
 , retsizeptr :: String -- likewise
 , size:: BufferSize 
 , pos:: Position
 , ora_type:: BufferType }

-- understandably, below is just a stub ...
alloc_buffer (siz, typ) ps = 
  return $ Buffer { bufptr = show ps, pos = ps,  size = siz, ora_type = typ}
  
-- In this stub, don't do anything
free ptr = return ()


-- DB Column types

class DBType a where
  alloc_buffer_hints:: a - (BufferSize, BufferType)
  col_fetch:: Buffer - IO a

instance DBType String where
  alloc_buffer_hints _ = (2000, ORA_char)
  col_fetch buffer = return (bufptr buffer)

instance DBType Int where
  alloc_buffer_hints _ = (4, ORA_int)
  col_fetch buffer = return (read $ bufptr buffer)
  
-- need to add more ...

-- Row iteratees. Note, the folowing two instances cover ALL possible
-- iteratees. No other instances are needed

class SQLIteratee iter seed where
iter_apply:: [Buffer] - seed - iter - IO (Either seed seed)
alloc_buffers:: Position - iter - seed - IO [Buffer]

instance (DBType a) = SQLIteratee (a-seed-Either seed seed) seed where
iter_apply [buf] seed fn = col_fetch buf = (\v - return$ fn v seed)
alloc_buffers n _ _ = 
   sequence [alloc_buffer (alloc_buffer_hints (undefined::a)) n]

instance (SQLIteratee iter' seed, DBType a) = SQLIteratee (a-iter') seed 
 where
iter_apply (buf:others) seed fn = 
  col_fetch buf = (\v - iter_apply others seed (fn v))
alloc_buffers n fn seed = do
  this_buffer - alloc_buffer (alloc_buffer_hints (undefined::a)) n
  other_buffers - alloc_buffers (n+1) (fn (undefined::a)) seed
  return (this_buffer:other_buffers)

free_buffers = mapM_ free

-- The left fold iterator -- the query executor

data Session   -- not relevant for this example
data SQLStmt

db_execute session query = return ()

db_fetch_row buffers = return ()  -- use static data

doQuery:: (SQLIteratee iter seed) = Session - SQLStmt - iter - seed - IO seed

-- In this example, we just allocate buffers, fetch two rows and terminate
-- with a clean-up

doQuery session query iteratee seed = do
  buffers - alloc_buffers 0 iteratee seed
  db_execute session query
  db_fetch_row buffers
  (Right seed) - iter_apply buffers seed iteratee
  db_fetch_row buffers
  (Right seed) - iter_apply buffers seed iteratee
  free_buffers buffers
  return seed
  

-- Tests

-- Query returns one column of type String
test1 = doQuery undefined undefined iter1 ([]::[String])
  where
 iter1:: String - [String] - Either [String] [String]
 iter1 s acc = Right $ s:acc 


-- Query returns two columns of types String and Int
test2 = doQuery undefined undefined iter2 ([]::[(String,Int)])
  where
 iter2:: String - Int - [(String,Int)] - 
 Either [(String,Int)] [(String,Int)]
 iter2 s i acc = Right $ (s,i):acc 


-- Query returns three columns of types Int, String and Int
test3 = doQuery 

RE: Database interface - would like advice on oracle library bind ing

2003-09-24 Thread Bayley, Alistair
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 
 I'd like to remark first that many real databases let us avoid opening
 many cursors at once. It seems to be more efficient to do as much as
 possible within the engine ...

I agree (and Tom Pledger makes the same comment). It's rare that I want to
have more than one cursor open at once. When I've seen other developers do
it, they're often doing something that would be better accomplished as a
join. And that last time I remember doing it, it was a nested loops
situation (which, on reflection, probably could have been done as a join...
oh well).

OTOH, I have seen interleaved fetches used to simulate a join where the
specific DBMS product struggled to perform the join efficiently itself. In
this case the developer is making up for a deficiency in the DBMS, and I
don't want to exclude the option from the interface (if you agree with what
you read on http://www.dbdebunk.com , then most DBMS products are deficient
in a number of ways).


 Still, opening several cursors may be unavoidable. The left fold
 approach may still help -- we _can_ mechanically invert a left fold
 combinator (to be precise, a half-baked version of it) into a lazy
 list. Please see a separate message how to invert the left fold

Found it. I'll have a look...

 I believe the extract functions can be constructed automatically --
 similar to the way Quickcheck constructs test cases. I believe that
 instead of

I though this might be possible, but I had no idea how to do it. I'll have a
look at the Quickcheck source to see how it's done, unless you can suggest a
better example.


 buffers. It's hard for me to write the corresponding code because I
 don't have Oracle handy (besides, I like Informix better). Is it
 possible to come up with a stub that uses flat files? We are only
 interested in fetching rows. It doesn't matter if these rows come from
 a file or from a database. That would make prototyping the interface 
 quite easier.

I wasn't interested so much in prototyping the interface, as trying to write
an implementation that supported the interface(s) discussed previously. I
intended to provide the left-fold interface, and was wondering if that was
all that was needed (for selects). Still, this would be a good exercise for
me, at least so I can figure out how to generate extraction functions.


 From: Tom Pledger [mailto:[EMAIL PROTECTED]

 Here's one approach: find the OCI equivalent of JDBC's
 ResultSetMetaData, and use it to drive the allocation and freeing of
 buffers.

I've considered this, and I think it's the next route I'll take (the OCI
supports it). At the least it'll give me the ability to perform arbitrary
queries (at present I have to know the types of the result set columns and
construct the extraction function manually).

I've also considered stuffing more information into the Cursor type (which
I've introduced), and using this in a modal fashion to decide what to do at
each point.


 Here's another: ...
 Make getInt (and friends) behave differently depending on the mode
 of the cursor they're passed: either allocate a buffer and return
 _|_, decode and return the current column of the current row, or
 free a buffer and return _|_.

Not wanting to sound too dumb, but how do you return _|_ ?


Thanks,
Alistair.


*
The information in this email and in any attachments is 
confidential and intended solely for the attention and use 
of the named addressee(s). This information may be 
subject to legal professional or other privilege or may 
otherwise be protected by work product immunity or other 
legal rules.  It must not be disclosed to any person without 
our authority.

If you are not the intended recipient, or a person 
responsible for delivering it to the intended recipient, you 
are not authorised to and must not disclose, copy, 
distribute, or retain this message or any part of it.
*

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Database interface - would like advice on oracle library bind ing

2003-09-24 Thread Keith Wansbrough
  Here's another: ...
  Make getInt (and friends) behave differently depending on the mode
  of the cursor they're passed: either allocate a buffer and return
  _|_, decode and return the current column of the current row, or
  free a buffer and return _|_.
 
 Not wanting to sound too dumb, but how do you return _|_ ?

Return

  undefined

or better,

  error Erroneously demanded bottom result of buffer allocation phase

Note that

  undefined :: a
  undefined = undefined

  error :: String - a
  {- defined internally -}

HTH.

--KW 8-)
  
-- 
Keith Wansbrough [EMAIL PROTECTED]
http://www.cl.cam.ac.uk/users/kw217/
University of Cambridge Computer Laboratory.

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Database interface - would like advice on oracle library binding

2003-09-23 Thread Bayley, Alistair
(2nd attempt; mailman thinks I'm not a list member, but it still keeps
sending me mail.)

Still making slow progress on an Oracle database binding... now I'm trying
to fit the API I have into some sort of abstract interface (like the one(s)
discussed previously:
 http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ).


1. Is the left-fold the best/only interface to expose? I think yes, but that
doesn't allow fine-grained control over cursors i.e. being able to open many
cursors at once and interleave fetches from them. Or does it?


2. I'm finding it hard to write a doQuery function that takes an extraction
function that isn't a pig to write. Some advice would be useful here... (and
a long-ish explanation follows):

The Oracle Call Interface (OCI) requires that I allocate buffers for the
result of a single row fetch, before the first row is fetched. So a query
involves:
 - prepare statement etc
 - allocate buffers (one for each column - call OCI C function
DefineByPos)
 - fetch row
 - extract/marshal data from buffer into Haskell types (which are then
processed by fold function)
 - fetch, marshal (repeat until no more rows)
 - free buffers

i.e. the same buffers are re-used for each row.

The problem for me is how to specify the left-fold function in terms of the
low-level API. If I want to supply extraction functions (to extract Haskell
values from result buffer), how do I manage the buffer allocation in the
doQuery function? The buffer allocate/free code also needs to know the
column positions and types, in the same manner as the extract functions.

I want to be able to write code like this:

results - doQuery dbconn sqltext [] \row results - do
name- stringv row 1
address - stringv row 2
return (name,address):results

.. where the stringv function extracts/marshals a Haskell String from the
result buffer.

The intermediate approach I currently have means I have to pass an IO action
into the doQuery function that, when evaluated, allocates the buffer and
returns two more actions:
 - an action that extracts the row as a tuple
 - another action that frees the buffer

The doQuery function evaluates the initial action (to allocate the buffer),
uses the extract action to build the result (at present a list), and when
there are no more rows, uses the free action to free the buffer.

This approach is quite awkward (especially w.r.t. writing extract
functions), and it's hard for me to see how to build a better interface.
Hard, because of the memory management requirements.



Here's a chunk of the code. A lot of it is OCI plumbing, but I hope you can
see how awkward it is to create an extract function (see ex3 at the bottom).

Given pointers to the buffer, extract a string of variable length (you have
to terminate it yourself).

 fetchStringVal :: OCIColInfo - IO String
 fetchStringVal (_, bufptr, nullindptr, retsizeptr) = do
   retsize - liftM cShort2Int (peek retsizeptr)
   nullind - liftM cShort2Int (peek nullindptr) -- unused
   pokeByteOff (castPtr bufptr) retsize nullByte
   val - peekCString (castPtr bufptr)
   return val

Free a single column's buffer.

 freeColBuffer :: OCIColInfo - IO ()
 freeColBuffer (_, bufptr, nullindptr, retsizeptr) = do
   free bufptr
   free retsizeptr
   free nullindptr

Create a buffer for a string column, and return the extract and free IO
actions.

 getExtractFnString :: Int - ErrorHandle - StmtHandle - IO (IO String,
IO ())
 getExtractFnString posn err stmt = do
   c - defineCol err stmt posn 2000 oci_SQLT_CHR
   return ((fetchStringVal c), (freeColBuffer c))


doQuery uses the extractFns action to create the result buffer,
and the two actions (extract and free) which are passed to doQuery2.

 doQuery2 :: ErrorHandle - StmtHandle - IO a - IO () - [a] - IO [a]
 doQuery2 err stmt extractData freeMem results = do
   rc - fetch err stmt
   if rc == oci_NO_DATA
 then do
   freeMem
   return results
 else do
   v - extractData
   doQuery2 err stmt extractData freeMem (v:results)

 doQuery :: Session - String - (ErrorHandle - StmtHandle - IO (IO a, IO
())) - IO [a]
 doQuery (Sess env err con) qry extractFns = do
   stmt - getStmt env err
   prepare err stmt qry
   execute err con stmt
   (extractData, freeMem) - extractFns err stmt
   doQuery2 err stmt extractData freeMem []


The interface provided by doQuery means I have to write extract functions
like this.
Here's one for a select that returns three String columns.
It's quite awkward...

 ex3 :: ErrorHandle - StmtHandle - IO (IO (String, String, String), IO
())
 ex3 err stmt = do
   (fetchcol1, freecol1) - getExtractFnString 1 err stmt
   (fetchcol2, freecol2) - getExtractFnString 2 err stmt
   (fetchcol3, freecol3) - getExtractFnString 3 err stmt
   return
 ( do { s1 - fetchcol1; s2 - fetchcol2; s3 - fetchcol3; return (s1,
s2, s3) }
 , do { freecol1; freecol2; freecol3 }
 )



Database interface - would like advice on oracle library binding

2003-09-23 Thread Tom Pledger
Bayley, Alistair writes:
 :
 | Still making slow progress on an Oracle database binding... now I'm trying
 | to fit the API I have into some sort of abstract interface (like the one(s)
 | discussed previously:
 |  http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ).
 | 
 | 
 | 1. Is the left-fold the best/only interface to expose? I think yes,
 | but that doesn't allow fine-grained control over cursors i.e. being
 | able to open many cursors at once and interleave fetches from
 | them. Or does it?

It looks like the interleaving would be limited to a nested loop
structure: a cursor could be processed in full during one extraction
for another cursor.

Application-side nested loop structures are often a poor substitute
for server-side joins.

 | 2. I'm finding it hard to write a doQuery function that takes an
 | extraction function that isn't a pig to write. Some advice would be
 | useful here... (and a long-ish explanation follows):
 :

Here's my attempt to summarise the piggishness you describe:

The interface to Oracle requires that you initialise a cursor by
allocating a suitably typed buffer for each column prior to
fetching the first row, and finalise a cursor by freeing those
buffers after fetching the last row.

This means that we must iterate over the columns 3 times.  We
would prefer to express this iteration only once, and have the
other 2 happen automatically within the library.  (As opposed to
what ex3 does, which is to iterate for getExtractFnString, iterate
for fetchcolN, and iterate for freecolN.)

Here's one approach: find the OCI equivalent of JDBC's
ResultSetMetaData, and use it to drive the allocation and freeing of
buffers.

Here's another:

Add a mode attribute to the abstract data type which encompasses
ErrorHandle and StmtHandle.  (I'll persist in referring to that
ADT as Cursor.)

Expect extraction functions to be written along these lines:

\cursor result
   - do field1 - getIntcursor
 field2 - getString cursor
 field3 - getString cursor
 return ((field1, field2, field3):result, True)

Make getInt (and friends) behave differently depending on the mode
of the cursor they're passed: either allocate a buffer and return
_|_, decode and return the current column of the current row, or
free a buffer and return _|_.

doQuery could then apply the extraction function once in Allocate
mode after opening the cursor, once per fetched row in Decode
mode, and once in Free mode at the end.

There's nothing to stop an extraction function from varying the number
of get___ functions it applies, or trying to match their results when
not in Decode mode.  These weakness could be mitigated by:

Pointing out that some database connection standards (JDBC, and
for all I know also ODBC) don't guarantee that you can still get
at a row's 1st column after you've looked at its 2nd column,
i.e. there's a precedent for such fragility.

Complicating the extraction functions by giving them the type

(Cursor - b - IO (IO (b, Bool)))

, expecting that all the get___ functions are applied in the outer
IO layer, and undertaking that the inner IO layer will only be
used in Decode mode.

Regards,
Tom

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-21 Thread Tim Docker
Tom Pledger wrote:

  How about introducing a Cursor abstract data type?
  
  doquery :: Process - String - b - (Cursor - b - IO (b,
Bool))
 - IO b
  stringv :: Cursor - CInt - IO String
  doublev :: Cursor - CInt - IO Double
  intv:: Cursor - CInt - IO Int
  
  This achieves the restriction you're after, because doquery is the
  only exported producer of Cursor, and stringv etc. are the only
  exported consumers of Cursor.

I like this. Not sure about whether I'd call it Cursor or just Row. Code
something like

results - doquery dbconn sqltext [] \row results - do
name- colv row 1
address - colv row 2
return (name,address):results

seems quite tidy.

  It also has the benefit that the function you pass to doquery can
make
  other calls to doquery, without mucking up the 'current row' state.
  There would be one current row per Cursor, not one per Process.

Is it normal or common to support multiple simultaneous queries on
a single DB connection?

Tim
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-21 Thread Bayley, Alistair
 From: Tim Docker [mailto:[EMAIL PROTECTED]

   It also has the benefit that the function you pass to doquery can
 make
   other calls to doquery, without mucking up the 'current row' state.
   There would be one current row per Cursor, not one per Process.
 
 Is it normal or common to support multiple simultaneous queries on
 a single DB connection?


Depends on what you mean by multiple simultaneous queries. A single query
is usually synchronous w.r.t. returning a cursor i.e. you issue the query
and get back a cursor in an atomic operation, and then you can do other
things while you fetch rows from the cursor.

If you need to have two SQL statements running at the same time (two big
inserts or updates, say) then you usually need two sessions. It probably
depends on the API, but some (Oracle, I believe) will allow you to open
multiple sessions with the same connection.

In a single session you could open a number of cursors (in any order),
interleave fetches (in any order), and then close the cursors (in any
order).

So you could have:
 - a number of Connections (perhaps to different servers/databases), where
 - each Connection has one or more Sessions, and
 - each Session has zero or more Cursors (and one current Transaction).

My session model and terminology are probably quite Oracle specific, but I
hope it's reasonably clear.


*
The information in this email and in any attachments is 
confidential and intended solely for the attention and use 
of the named addressee(s). This information may be 
subject to legal professional or other privilege or may 
otherwise be protected by work product immunity or other 
legal rules.  It must not be disclosed to any person without 
our authority.

If you are not the intended recipient, or a person 
responsible for delivering it to the intended recipient, you 
are not authorised to and must not disclose, copy, 
distribute, or retain this message or any part of it.
*

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-21 Thread Tom Pledger
Tim Docker writes:
 :
 | Is it normal or common to support multiple simultaneous queries on
 | a single DB connection?

In transaction processing, yes.  There's an idiom where you use one
query to select all the (financial) transactions in a batch, but
there's so much variation in how you need to process each row, that
you can't express it all in one SQL query.  So you use a variety of
little queries during the processing of each row of the main query.
The result of the fold in your doquery could be, say, a record of
batch totals.

- Tom

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-20 Thread Tom Pledger
Tim Docker writes:
 :
 | The list being folded over
 | is implied by the DB query, is accessible through the IO monad.
 | Hence a parameter is not required. It would really be:
 | 
 | doquery :: Process - String - b - (b - IO (b,Bool)) - IO b
 :
 | One thing that I am unsure about is whether the column value
 | accessing functions that I specified before
 | 
 |stringv :: Process - CInt - IO String
 |doublev :: Process - CInt - IO Double
 |intv:: Process - CInt - IO Int
 | 
 | should return actions in the IO monad as above, or instead should
 | be in some other DBQuery monad, that trivially extends the IO monad,
 | but is only valid inside the doquery call. This would have the benefit
 | of restricting the column access functions to inside a query via the
 | type system.

How about introducing a Cursor abstract data type?

doquery :: Process - String - b - (Cursor - b - IO (b, Bool))
   - IO b
stringv :: Cursor - CInt - IO String
doublev :: Cursor - CInt - IO Double
intv:: Cursor - CInt - IO Int

This achieves the restriction you're after, because doquery is the
only exported producer of Cursor, and stringv etc. are the only
exported consumers of Cursor.

It also has the benefit that the function you pass to doquery can make
other calls to doquery, without mucking up the 'current row' state.
There would be one current row per Cursor, not one per Process.

 | I'd also probably use a typeclass to specify a single colv function.
 | ie:
 | 
 |class DBCol a where
 |colv :: DBQuery a
 | 
 |instance DBCol String where...
 |instance DBCol Double where...

Good idea.  The user can always use explicit type signatures to
resolve ambiguity, and even then the code size will be about the same
as with stringv etc.

- Tom

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-19 Thread Tim Docker
Tom Pledger writes:

 By the way, how does the  a  in  a - b - IO (b, Bool)  work?
 It looks like it has something to do with the current row.  Does
 doquery have to accommodate any possible substitution for  a ?

I fired this off without thinking about it too much, and looking
at the prelude type signatures for fold. The list being folded over
is implied by the DB query, is accessible through the IO monad.
Hence a parameter is not required. It would really be:

doquery :: Process - String - b - (b - IO (b,Bool)) - IO b

 I don't have a preference, but offer this view of the options:
 
 With an exception, Stop, and return the last b you saw.
 With a boolean,Stop, and return this b.

I think I like the behavior where, when the bool in the tuple
is true, the b in the tuple is immediately returned from the
query. Exceptions would be propagated to the caller of doquery
without modification (but with appropriate cleanups).

One thing that I am unsure about is whether the column value
accessing functions that I specified before

   stringv :: Process - CInt - IO String
   doublev :: Process - CInt - IO Double
   intv:: Process - CInt - IO Int

should return actions in the IO monad as above, or instead should
be in some other DBQuery monad, that trivially extends the IO monad,
but is only valid inside the doquery call. This would have the benefit
of restricting the column access functions to inside a query via the
type system.

I'd also probably use a typeclass to specify a single colv function.
ie:

   class DBCol a where
   colv :: DBQuery a

   instance DBCol String where...
   instance DBCol Double where...

   doquery :: Process - String - b - (b - DBQuery (b,Bool)) - IO b

Any comments?

Tim
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-14 Thread Tim Docker

[EMAIL PROTECTED] wrote:

 If I may interject, that's precisely how a Scheme DB interface is
 designed. The main function is a left-fold. Not quite though: it
 provides for a premature termination:
 
 A major procedure: DB1:fold-left PROC INITIAL-SEED QUERY-OBJECT

Premature termination sounds like a useful property. I can
see two ways this could be done: keep the previous signature,
and use an exception to exit early, or add a boolean return 
value like the scheme version:

   doquery :: Process - String - (a - b  - IO (b,Bool)) - b - IO b

Any opinions on which alternative wouuld be preferable?

Tim
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-14 Thread oleg

 I'll probably generalise the query function to do a fold, rathen than
 always accumulate a list:
 doquery :: Process - String - (a - b  - IO b) - b - IO b

If I may interject, that's precisely how a Scheme DB interface is
designed. The main function is a left-fold. Not quite though: it
provides for a premature termination:

A major procedure: DB1:fold-left PROC INITIAL-SEED QUERY-OBJECT

A QUERY-OBJECT (which in this implementation is a list of fragments
that make a SQL statement, in the reverse order -- without the
terminating semi-colon) is submitted to the database, using the
default database connection.

PROC is a procedure: SEED COL COL ...
The procedure PROC takes 1+n arguments where n is the number of
columns in the the table returned by the query.  The procedure PROC
must return two values: CONTINUE? NEW-SEED

The query is executed, and the PROC is applied to each returned row in
order. The first invocation of PROC receives INITIAL-SEED as its first
argument. Each following invocation of PROC receives as the first
argument the NEW-SEED result of the previous invocation of PROC.  The
CONTINUE? result of PROC is an early termination flag. If that flag is
returned as #f, any further applications of PROC are skipped and
DB1:fold-left finishes.  The function DB1:fold-left returns NEW-SEED
produced by the last invocation of PROC. If the query yielded no rows,
DB1:fold-left returns the INITIAL-SEED.

Thus DB1:fold-left is identical to the left fold over a sequence,
modulo the early termination.

There are a few minor variants of the above procedure, optimized for
common particular cases: a query that is expected to return at most
one row, and a query that expects to return at most one row with
exactly one column. The latter query has the same interface as a
lookup in a finite map.

The QUERY-OBJECT is of coursed not built by hand. There is a
domain-specific language (which greatly resembles the source language
and makes some use of quasi-quotation) whose result is a query
object. Therefore, I can write both the query and the handlers of the
query result using the same syntax.

The premature termination is important. Database connections and
cursors are too precious resources to leave them for the garbage
collector to finalize. More discussion and pointers can be found at

http://srfi.schemers.org/srfi-44/mail-archive/msg00023.html

The interface has been used in an industrial application.
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Database interface

2003-08-14 Thread Thomas L. Bevan
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

 I still want to use HaskellDB (or at least the relational calculus part of
 it), so I was thinking of splitting it into two pieces: a library that
 submits SQL queries and returns the results (database interface), and a
 library that constructs queries with the relational calculus and generates
 SQL from them. Obviously the database interface must be written for each
 DBMS product, while the relational calculus library ought to be independent
 of DBMS (although it might be wise to have SQL compatibility flags for the
 SQL it generates. For example: Oracle 8 (what I'm using now) only supports
 SQL-92).

Ideally, we would have the following.

1/ A high level combinator library for relational calculus built on,

2/ A standard low-level Haskell API 
 i.e. functions like Connection - String - ( a - b - IO b) - IO b

3/ Database specific bindings.

The problem is that projections and cartesian products generate new types, at 
least in the way it was done in HaskellDB. The form the calculus takes would 
need to be substantially reworked.

Tom
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/O6DBYha8TWXIQwoRAmkCAKCwZkNpU3TjX5bj7rOZHx5DXRjPhwCgihnY
sAWwEKX+5hZ1Tiu4wfSGo7Y=
=ALnJ
-END PGP SIGNATURE-

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-14 Thread Tom Pledger
Tim Docker writes:
 | Tom Pledger writes:
 | 
 |   This is a pretty good way to stop those nasty vague SQL row types at
 |   the Haskell border and turn them into something respectable.  Perhaps
 |   it would even be worth constraining the extracted type to be in
 |   DeepSeq
 |  
 |  doquery :: (DeepSeq v) =
 | Process - String - IO v - IO [v]
 | 
 | Can you explain what the constraint does here?

Yes.  It soothes my superstitious fears about unsafeInterleaveIO.

I imagined that:

  - doquery (like getContents) uses unsafeInterleaveIO to make the
resulting list lazy, i.e. avoid grabbing all the rows at the
outset,

  - the unsafe interleaving would *somehow* get into the extraction
action, and

  - the current row could be overwritten while some stringv (etc.)
actions on it were yet to happen, so

  - a good safety measure would be to force evaluation (hence DeepSeq)
of each list element before overwriting the current row.

But now that you ask, I think it's possible for doquery to stop the
unsafeInterleaveIO from leaking into the extraction action, so the
DeepSeq would only protect users from shooting themselves in the foot
with weird queries like this

doquery p select s from t (unsafeInterleaveIO (stringv p 1))
-- DeepSeq would allow doquery to undo the effect of unsafeInterleaveIO

or this

[a1, a2] - doquery p select s from t (return (stringv p 1))
s2 - a2
s1 - a1
-- DeepSeq would disallow this at compile time

- Tom

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Database interface

2003-08-14 Thread Tom Pledger
Thomas L. Bevan writes:
 | Does anyone know if there is work being done on a standard Haskell
 | database interface.

I suspect that there isn't.  The pattern seems to be that someone gets
an interface working well enough for some purposes, and perhaps shares
it, but is too modest and/or busy to put it forward as a standard.

For a row extraction mechanism, I'd vote for passing an extraction
function (or IO action) to the main query function, like Tim Docker
described last month.

http://haskell.cs.yale.edu/pipermail/haskell-cafe/2003-July/004684.html

This is a pretty good way to stop those nasty vague SQL row types at
the Haskell border and turn them into something respectable.  Perhaps
it would even be worth constraining the extracted type to be in
DeepSeq

doquery :: (DeepSeq v) =
   Process - String - IO v - IO [v]

so that the interface clearly may confine its attention to one row at
a time per cursor.

- Tom

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-14 Thread Tim Docker
Tom Pledger writes:

  This is a pretty good way to stop those nasty vague SQL row types at
  the Haskell border and turn them into something respectable.  Perhaps
  it would even be worth constraining the extracted type to be in
  DeepSeq
 
 doquery :: (DeepSeq v) =
Process - String - IO v - IO [v]

Can you explain what the constraint does here?

Although I haven't touched the db interface in the meantime, when/if I
do,
I'll probably generalise the query function to do a fold, rathen than
always
accumulate a list:

doquery :: Process - String - (a - b  - IO b) - b - IO b

Tim
___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Database interface

2003-08-14 Thread Bayley, Alistair
 -Original Message-
 From: Thomas L. Bevan [mailto:[EMAIL PROTECTED]
 
 This type of interface is fine as far as it goes, but there is only 
 ever a loose coupling between the database and application.
 
 The HaskellDB work implemented a relational calculus that ensured
 that all queries were checked at compile time for validity against
 a particular database.

HaskellDB doesn't actually type check against the database; it checks
against types defined in a module, which is generated from the database. If
you change the database, then you need to regenerate the module.

I prefer the approach of passing an extraction function/IO action to the
interface. This is the approach taken by HaSql, HaskellDB (I think, although
it's so indirect I find it hard to follow), and Tim Docker's Sybase library,
and it's the design I intend to use for my Oracle library (when I get around
to it).

I still want to use HaskellDB (or at least the relational calculus part of
it), so I was thinking of splitting it into two pieces: a library that
submits SQL queries and returns the results (database interface), and a
library that constructs queries with the relational calculus and generates
SQL from them. Obviously the database interface must be written for each
DBMS product, while the relational calculus library ought to be independent
of DBMS (although it might be wise to have SQL compatibility flags for the
SQL it generates. For example: Oracle 8 (what I'm using now) only supports
SQL-92).


*
The information in this email and in any attachments is 
confidential and intended solely for the attention and use 
of the named addressee(s). This information may be 
subject to legal professional or other privilege or may 
otherwise be protected by work product immunity or other 
legal rules.  It must not be disclosed to any person without 
our authority.

If you are not the intended recipient, or a person 
responsible for delivering it to the intended recipient, you 
are not authorised to and must not disclose, copy, 
distribute, or retain this message or any part of it.
*

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


Designing a Haskell database interface

2003-07-14 Thread Bayley, Alistair
I'm making slow progress on an Oracle OCI binding. I've got the noddy
session setup and database connection stuff working, so now I'm looking at
how results should be returned from SQL queries. In Haskell, there doesn't
seem to be any consistent way of returning results from SQL queries, unlike
(say) Java's JDBC interface. It may be a bit early to propose a standard SQL
dbms interface design, but... does anyone have any idea(s) about how one
ought to look?

Here's a simple survey of the sql dbms interfaces I've come across so far (I
found HaskellDB the most complex, and difficult to understand). My
assumptions about how these various libraries work might be quite wrong:


HaskellDB:
  dbQuery returns IO [row r],  where
   - dbQuery is implemented by adoQuery (this library uses Odbc):
   adoQuery :: IConnection () - PrimQuery - Rel r - IO [AdoRow r]
   - AdoRow a is an instance of class Row
   - class Row row a declares one function: rowSelect
   rowSelect :: Attr r a - row r - a
   - type AdoRow implements rowSelect with adoRowSelect:
   adoRowSelect :: Variant a = Attr r a - AdoRow r - a
   - Attr has one constructor: Attr Attribute, and Attribute is just a
synonym for String.
  So (I think) the return types are determined by the types of the phantom
types in Attr r a and AdoRow r. And this is where my head explodes...


LibPQ:
  fetchAllRows returns tuple pair of the connection and row :  (DBI a,
[[String]])
   - so, a row is a list of list of Strings.


HaSql:
  haSQLObtainQueryResults returns SQL [a], where 
   - SQL is some custom IO+State Monad
   - the type of a is the return type of a function you pass to
haSQLObtainQueryResults to (I think) convert an Odbc pointer into a Haskell
type.


MySql-hs:
  mysqlQuery returns a tuple of (Integral, [[Maybe String]], [MysqlField]),
where
   - the first element (Integral type) is the number of rows
   - the second element is a list of list of Strings - the result set.
   - the third element is metadata. MysqlField is a record describing a
database column.


The most sophisticated implementation wrt type information seems to be
HaskellDB. My initial goal was to use this library and provide an Oracle
database driver for it, but the HaskellDB seems to be quite dependent on
Trex, which AFAICT is a Hugs library.

I was thinking of splitting it into two parts: a library that submits SQL
queries and returns the results, and a library that constructs queries with
the relational calculus and generates SQL from them. The relational calculus
bit was what I was interested in, but for now I want to work on getting data
out of my Oracle database.


Also...

I was wondering how you might go about mapping arbitrary dbms types to a
Haskell result set.

In an ideal world, you can store any values you like in a relational
database. However, most SQL dbms products give you just numbers, text, and
dates. Some dbms's (PostgreSql and Oracle) allow users/programmers to create
new types and let the dbms treat them in the same manner as the built-in
ones i.e. with equality and ordering predicates, and indexing. The built-in
support for the three basic types (numbers, text, dates) is reflected in the
JDBC API; it has methods like: getByte, getShort, getFloat, getDouble,
getBigDecimal, getInt, getString, getDate, getTime, getBoolean, while other
types are handled by methods like getBinaryStream and getObject.

So how would you convert a (say) PostgreSql Point or Box value to a Haskell
type? Would the approach taken by HaSql be the best (the user provides a
function that converts binary data into a Haskell value)?


*
The information in this email and in any attachments is 
confidential and intended solely for the attention and use 
of the named addressee(s). This information may be 
subject to legal professional or other privilege or may 
otherwise be protected by work product immunity or other 
legal rules.  It must not be disclosed to any person without 
our authority.

If you are not the intended recipient, or a person 
responsible for delivering it to the intended recipient, you 
are not authorised to and must not disclose, copy, 
distribute, or retain this message or any part of it.
*

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe


RE: Designing a Haskell database interface

2003-07-14 Thread Tim Docker
I wrote an experimental wrapper around the sybase db client
libraries. This was my first attempt at using the ffi, and I was
impressed with the fact that I got enough working purely using
haskell.

My simplistic approach was to embed everything in the IO
monad, wrapping each sybase call trivially, and then building
slightly higher level functions on top of each. Of course, the
fun with handling queries is that you want to get values of types
defined by the query string (ie unknown at compile time) from the
database and into the typed haskell world.

In my approach, the top level interface looked like:

open :: String - String - String - IO Process
-- open a connection to the database returning an abstract
-- process object

doquery :: Process - String - IO v - IO [v]
-- Takes: a db process; the SQL query string; and
-- a (no parameter) IO action that, when executed
-- during a query returns a value of type v derived
-- from the current row
--
-- Returns the list of values of type v derived from
-- each row.

stringv :: Process - CInt - IO String
doublev :: Process - CInt - IO Double
intv:: Process - CInt - IO Int
datetimev :: Process - CInt - IO DateTime
-- Returns the string/double/int/datetime value of the specified
-- column in the current row

You could then write queries something like this:

dbp - DB.open SERVER user password
rows - DB.doquery dbp queryText $ do
  name - DB.stringv dbp 1
  v1 - DB.doublev dbp 2
  v2  - DB.doublev dbp 3
  return (label,v1,v2)

At which point rows would have type [(String,Double,Double)].

This did the job, but I'd be interested in better ways of doing it!

Tim

 -Original Message-
 From: Bayley, Alistair [mailto:[EMAIL PROTECTED]
 Sent: Monday, July 14, 2003 4:25 PM
 To: [EMAIL PROTECTED]
 Subject: Designing a Haskell database interface
 
 
 I'm making slow progress on an Oracle OCI binding. I've got the noddy
 session setup and database connection stuff working, so now 
 I'm looking at
 how results should be returned from SQL queries. In Haskell, 
 there doesn't
 seem to be any consistent way of returning results from SQL 
 queries, unlike
 (say) Java's JDBC interface. It may be a bit early to propose 
 a standard SQL
 dbms interface design, but... does anyone have any idea(s) 
 about how one
 ought to look?
 
 Here's a simple survey of the sql dbms interfaces I've come 
 across so far (I
 found HaskellDB the most complex, and difficult to understand). My
 assumptions about how these various libraries work might be 
 quite wrong:
 
 
 HaskellDB:
   dbQuery returns IO [row r],  where
- dbQuery is implemented by adoQuery (this library uses Odbc):
adoQuery :: IConnection () - PrimQuery - Rel r - IO 
 [AdoRow r]
- AdoRow a is an instance of class Row
- class Row row a declares one function: rowSelect
rowSelect :: Attr r a - row r - a
- type AdoRow implements rowSelect with adoRowSelect:
adoRowSelect :: Variant a = Attr r a - AdoRow r - a
- Attr has one constructor: Attr Attribute, and Attribute is just a
 synonym for String.
   So (I think) the return types are determined by the types 
 of the phantom
 types in Attr r a and AdoRow r. And this is where my head 
 explodes...
 
 
 LibPQ:
   fetchAllRows returns tuple pair of the connection and row :  (DBI a,
 [[String]])
- so, a row is a list of list of Strings.
 
 
 HaSql:
   haSQLObtainQueryResults returns SQL [a], where 
- SQL is some custom IO+State Monad
- the type of a is the return type of a function you pass to
 haSQLObtainQueryResults to (I think) convert an Odbc pointer 
 into a Haskell
 type.
 
 
 MySql-hs:
   mysqlQuery returns a tuple of (Integral, [[Maybe String]], 
 [MysqlField]),
 where
- the first element (Integral type) is the number of rows
- the second element is a list of list of Strings - the result set.
- the third element is metadata. MysqlField is a record 
 describing a
 database column.
 
 
 The most sophisticated implementation wrt type information seems to be
 HaskellDB. My initial goal was to use this library and 
 provide an Oracle
 database driver for it, but the HaskellDB seems to be quite 
 dependent on
 Trex, which AFAICT is a Hugs library.
 
 I was thinking of splitting it into two parts: a library that 
 submits SQL
 queries and returns the results, and a library that 
 constructs queries with
 the relational calculus and generates SQL from them. The 
 relational calculus
 bit was what I was interested in, but for now I want to work 
 on getting data
 out of my Oracle database.
 
 
 Also...
 
 I was wondering how you might go about mapping arbitrary dbms 
 types to a
 Haskell result set.
 
 In an ideal world, you can store any values you like in a relational
 database. However, most SQL dbms products give you just 
 numbers, text, and
 dates. Some dbms's (PostgreSql and Oracle) allow 
 users