[Haskell-cafe] STM, IO and b-trees

2007-08-20 Thread Ben
for sake of argument, suppose an enterprising haskell newbie wanted to
code up concurrent b-trees (really b-link trees) in haskell.  if i am
understanding STM correctly, it will NOT help in any way with the
implementation, because of the IO-intensive nature of the algorithms?
so i will have to resort to the usual games with locks and latches?

(using

Ibrahim Jaluta, Seppo Sippu and Eljas Soisalon-Soininen. Concurrency
control and recovery for balanced B-link trees. The VLDB Journal,
Volume 14, Issue 2 (April 2005), Pages: 257 - 277, ISSN:1066-.

as a source for b-tree algorithms.)

take care, B
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] STM, IO and b-trees

2007-08-20 Thread Thomas Conway
On 8/21/07, Ben [EMAIL PROTECTED] wrote:
 for sake of argument, suppose an enterprising haskell newbie wanted to
 code up concurrent b-trees (really b-link trees) in haskell.  if i am
 understanding STM correctly, it will NOT help in any way with the
 implementation, because of the IO-intensive nature of the algorithms?
 so i will have to resort to the usual games with locks and latches?

I have produced exactly such an implementation in my day-job (so I
can't, at this stage, give you the code, I'm afraid), but I'll happily
give you some tips:

1. Investigate relaxed balance.

BTrees with relaxed balance enable you to break up operations into
much smaller transactions, which will reduce the amount of rerunning
on transactions (big transactions are more likely to contain
conflicts).

Also, getting all the edge cases right is hard with strict balance.
Especially in the presence of deletions. It is VASTLY simpler with
relaxed balance, though there are a few little tricks. If it was too
easy, it wouldn't be any fun (see 3, below). Hint: Although the
on-disk version doesn't need or want parent pointers, you might want
them for your in-memory version of pages.

2. Separate the IO from the BTree-stuff.

Conceptually keep a codeTVar (Map Address ByteString)/code. In the
transaction, use this to find pages. If the page is not there, throw
an exception containing the desired address. In a wrapper, catch the
exception, read the page, add it to the map as a separate transaction
then retry the original transaction. I say conceptually because
something like codeTArray Address (Maybe ByteString)/code, or
similar will yield much better concurrency. In general, you want to
push the TVars down as far as possible.

3. Have Fun

STM is very cool, so make sure you enjoy making it all hang together. :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe