[Haskell-cafe] Re: Support for lock-free/wait-free programming? / ANN: bits-atomic

2010-08-30 Thread Gabriel Wicke
On Sun, 29 Aug 2010 13:08:48 -0400, Gregory Collins wrote:

 Gregory Collins g...@gregorycollins.net writes:
 ...sigh... The programs link fine without it part is a partial lie.
 Anything that the C linker links (i.e. executables) works fine without
 an explicit -lgcc_s, but ghci and compilations using template haskell
 fail with an unknown symbol `___bswapdi2', or equivalent.
 
 Looking for a workaround now.

After some more off-list discussion with Gregory (thanks for your help!) I
split out the atomic operations to the bits-atomic package [1] which no longer
depends on libgcc_s. GCC produces native code for atomic operations, so
libgcc_s is not needed for these operations. 

Additionally, a test suite covering the most important operations is now
included.

I am still looking for a solution to drop the libgcc dependency for
Data.Bits.Extras (which will remain in bits-extras) as well, but this is
harder as the decision whether to use fall-back software implementations
depends on the CPU capabilities.

Cheers,
Gabriel

[1] http://hackage.haskell.org/package/bits-atomic-0.1.0

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Support for lock-free/wait-free programming? / ANN: bits-atomic

2010-08-30 Thread Gregory Collins
Gabriel Wicke wi...@wikidev.net writes:

 On Sun, 29 Aug 2010 13:08:48 -0400, Gregory Collins wrote:

 After some more off-list discussion with Gregory (thanks for your help!) I
 split out the atomic operations to the bits-atomic package [1] which no longer
 depends on libgcc_s. GCC produces native code for atomic operations, so
 libgcc_s is not needed for these operations. 

Wonderful, works perfectly, thanks!

G
-- 
Gregory Collins g...@gregorycollins.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-29 Thread Gabriel Wicke
On Sat, 28 Aug 2010 21:22:57 -0400, Gregory Collins wrote:
 On OSX libgcc_s is only supplied as a dynamic library, which GHC
 doesn't seem to be able to link with. Any ideas would be appreciated.
 
 It works with this patch:
 -CC-Options:   -fomit-frame-pointer -march=native -Wall -   

Hello Gregory,

thanks for your patch.

I have just uploaded bits-extras-0.1.1 [1] with tweaked CC-Options to 
only include
-Wall on non-linux systems (plus minor documentation tweaks).

Could you check if this compiles on OSX? I would prefer to keep -Wall 
enabled
if possible.

Gabriel

[1] http://hackage.haskell.org/package/bits-extras-0.1.1

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-29 Thread Gregory Collins
Gabriel Wicke wi...@wikidev.net writes:

 Hello Gregory,

 thanks for your patch.

 I have just uploaded bits-extras-0.1.1 [1] with tweaked CC-Options to
 only include -Wall on non-linux systems (plus minor documentation
 tweaks).

 Could you check if this compiles on OSX? I would prefer to keep -Wall
 enabled if possible.

It doesn't work because you missed the other half of my patch, removing
the reference to libgcc_s -- programs link just fine without it. With it
in there I get:

Resolving dependencies...
Downloading bits-extras-0.1.1...
Configuring bits-extras-0.1.1...
cabal: Missing dependency on a foreign library:
* Missing C library: gcc_s

G
-- 
Gregory Collins g...@gregorycollins.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-29 Thread Gregory Collins
Gregory Collins g...@gregorycollins.net writes:

 It doesn't work because you missed the other half of my patch, removing
 the reference to libgcc_s -- programs link just fine without it. With it
 in there I get:

...sigh... The programs link fine without it part is a partial
lie. Anything that the C linker links (i.e. executables) works fine
without an explicit -lgcc_s, but ghci and compilations using template
haskell fail with an unknown symbol `___bswapdi2', or equivalent.

Looking for a workaround now.

G
-- 
Gregory Collins g...@gregorycollins.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-28 Thread Gregory Collins
Gabriel Wicke wi...@wikidev.net writes:

 On Tue, 17 Aug 2010 01:09:41 -0400, Gregory Collins wrote:

 Hello all,
 
 Does GHC expose any primitives for things like atomic compare-and-swap?
 I can't seem to find anything in the docs. 

 Hello Gregory,

 I have recently published experimental and low-level FFI bindings to
 GCC's atomic primitives including CAS as 'bits-extras' on Hackage [1].
 Instances for Int and Word types are included.

 This is likely lower-level than what you are after, but might still be
 handy for experimentation.

On OSX libgcc_s is only supplied as a dynamic library, which GHC doesn't
seem to be able to link with. Any ideas would be appreciated.

G
-- 
Gregory Collins g...@gregorycollins.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-28 Thread Gregory Collins
Gregory Collins g...@gregorycollins.net writes:

 Gabriel Wicke wi...@wikidev.net writes:

 On Tue, 17 Aug 2010 01:09:41 -0400, Gregory Collins wrote:

 Hello all,
 
 Does GHC expose any primitives for things like atomic compare-and-swap?
 I can't seem to find anything in the docs. 

 Hello Gregory,

 I have recently published experimental and low-level FFI bindings to
 GCC's atomic primitives including CAS as 'bits-extras' on Hackage [1].
 Instances for Int and Word types are included.

 This is likely lower-level than what you are after, but might still be
 handy for experimentation.

 On OSX libgcc_s is only supplied as a dynamic library, which GHC doesn't
 seem to be able to link with. Any ideas would be appreciated.

It works with this patch:

--
Remove cc-options field from .cabal; the flags given don't work on OSX

diff --git a/bits-extras.cabal b/bits-extras.cabal
--- a/bits-extras.cabal
+++ b/bits-extras.cabal
@@ -42,8 +42,6 @@
 --CC-Options:   -O3 -fomit-frame-pointer -march=native -Wall
 -- Try link-time optimization (inlining) with gcc 4.5:
 -- CC-Options:   -fomit-frame-pointer -march=native -Wall -flto
-CC-Options:   -fomit-frame-pointer -march=native -Wall
-Extra-Libraries:  gcc_s
 Include-Dirs: cbits
 Install-Includes: bitops-gcc.h atomic-bitops-gcc.h
 Extensions:   ForeignFunctionInterface
--


G
-- 
Gregory Collins g...@gregorycollins.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-27 Thread Gabriel Wicke
On Tue, 17 Aug 2010 01:09:41 -0400, Gregory Collins wrote:

 Hello all,
 
 Does GHC expose any primitives for things like atomic compare-and-swap?
 I can't seem to find anything in the docs. 

Hello Gregory,

I have recently published experimental and low-level FFI bindings to 
GCC's 
atomic primitives including CAS as 'bits-extras' on Hackage [1].
Instances for Int and Word types are included.

This is likely lower-level than what you are after, but might still be 
handy for experimentation.

Kind regards

Gabriel

[1]: http://hackage.haskell.org/packages/archive/bits-extras/0.1.0/doc/
html/Data-Bits-Atomic.html


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-17 Thread Simon Marlow

On 17/08/2010 06:09, Gregory Collins wrote:


Does GHC expose any primitives for things like atomic compare-and-swap?
I can't seem to find anything in the docs. I'm wondering if it's
possible, for example, to implement things like the wait-free concurrent
queue from [1] or a lock-free wait-free hash table like the Azul people
are doing [2].


We could provide a compare-and-swap on IORefs, indeed I've been planning 
to try that so we could move the implementation of atomicModifyIORef 
into user-space, so to speak.


Anecdotally it's hard to beat the performance of a good pure data 
structure in a mutable container.  For example, the mutable list 
experiments we did a while back [1] were all at least a factor of 10 
slower than IORef [a].


[1] http://www.haskell.org/~simonmar/bib/concurrent-data-08_abstract.html


In network servers, we often have a large number of clients fighting
over shared resources -- even simple stuff like which threads are
scheduled to timeout? and which threads are currently active, so I can
kill them if I need to?. Shared mutable collections seem to be a fact
of life here.

Under load, pure data structures behind MVars don't perform well because
of lock contention, and in my experience atomicModifyIORef is marginally
better but still suffers from contention issues (probably related to
thunk blackholing).


This is the specific case that we addressed in the GHC runtime recently, 
and you should find that 6.14 will perform a lot better than 6.12 with 
pure data structures in mutable containers, especially with 
atomicModifyIORef.


In the meantime you'll probably get better performance using STM. 
Whether you should perform the update strictly inside the transaction or 
not is a matter for experimentation, it probably depends on the data 
structure.



Recently I got a ~35% speedup on a benchmark by switching one of these
tables from a Map behind an IORef to a hashmap using a striped-locking
scheme (see [3] if you're interested.) I think there's a need for
high-concurrency data structures like this, and I don't see much on
Hackage that addresses it. As much as we like to say that we have a
handle on concurrency issues, from what I can see java.util.concurrent.*
has us soundly thrashed in this department, especially as it relates to
exposing atomic wait-free CPU primitives like the ones in
java.util.concurrent.atomic.* [4]. Speaking as a partisan, this cannot
be allowed to stand, can it?


Absolutely.  Although the Java crowd have to grapple with a complicated 
memory model which makes building these data structures virtually 
impossible.  At least in Haskell you can whip up a concurrent version of 
any pure data structure trivially, and have a lot of confidence that you 
got it right.  Of course you could do that in Java, but they don't tend 
to build pure data structures by default, whereas we have them by the 
bucketload.


Cheers,
Simon




Cheers,

G.

--

[1] Maged M. Michael and Michael L. Scott, Simple, Fast, and Practical
Non-Blocking and Blocking Concurrent Queue Algorithms:
http://www.cs.rochester.edu/u/michael/PODC96.html

[2] http://www.stanford.edu/class/ee380/Abstracts/070221_LockFreeHash.pdf

[3] 
http://github.com/snapframework/snap-server/blob/master/src/Data/HashMap/Concurrent.hs#L1

[4] 
http://download.oracle.com/javase/6/docs/api/java/util/concurrent/atomic/AtomicInteger.html



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Support for lock-free/wait-free programming?

2010-08-17 Thread Gregory Collins
Simon Marlow marlo...@gmail.com writes:

 On 17/08/2010 06:09, Gregory Collins wrote:

 We could provide a compare-and-swap on IORefs, indeed I've been
 planning to try that so we could move the implementation of
 atomicModifyIORef into user-space, so to speak.

For hash tables it would be nice to have a CAS indexing primitive on
MutableArray#/MutableByteArray#/etc, also.


 Under load, pure data structures behind MVars don't perform well because
 of lock contention, and in my experience atomicModifyIORef is marginally
 better but still suffers from contention issues (probably related to
 thunk blackholing).

 This is the specific case that we addressed in the GHC runtime recently, and
 you should find that 6.14 will perform a lot better than 6.12 with pure data
 structures in mutable containers, especially with atomicModifyIORef.

 In the meantime you'll probably get better performance using STM. Whether you
 should perform the update strictly inside the transaction or not is a matter
 for experimentation, it probably depends on the data structure.

In the specific benchmark in question STM was slightly slower than MVars
(but that's pretty impressive!).

G
-- 
Gregory Collins g...@gregorycollins.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe