Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

2011-02-23 Thread Johan Tibell
On Wed, Feb 23, 2011 at 12:57 PM, Gwern Branwen gwe...@gmail.com wrote:
 On Wed, Feb 23, 2011 at 12:45 AM, Max Bolingbroke
 batterseapo...@hotmail.com wrote:
 I'm a bit sceptical that it is (I was not convinced by the earlier
 strict-set-inclusion argument, since that's another Data.Map feature
 I've never used). I thought of some other possibilities though:
  1. If copying an unordered-collection to a flat array you can improve
 the constant factors (not the asymptotics) with O(1) size to
 pre-allocate the array

 For kicks, I grepped my local repos for users of Data.Set's size
 function. My results are imperfect (I have few Github repos, I didn't
 check .lhs files, I only grepped files for 'Set.size' or 'S.size'),
 but I found around 100 uses of Data.Set.size.

Could you manually look at some of them to see if you find something
interesting? In particular, `Set.size s == 0` (a common use of size in
imperative languages) could be replaced by `Set.null s`.
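
A small sketch of that replacement, for illustration only (the names
`isEmptyViaSize` and `isEmptyViaNull` are made up here and do not come from
any of the grepped packages). For Data.Set both forms are cheap because
size is O(1); the difference would matter for a container whose size is
O(n).

```haskell
import qualified Data.Set as Set

-- Emptiness test written the "imperative" way, via size.
isEmptyViaSize :: Set.Set a -> Bool
isEmptyViaSize s = Set.size s == 0

-- The same test using null, which never needs to count elements.
isEmptyViaNull :: Set.Set a -> Bool
isEmptyViaNull = Set.null
```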

Johan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

2011-02-23 Thread Johan Tibell
On Wed, Feb 23, 2011 at 1:27 PM, Gwern Branwen gwe...@gmail.com wrote:
 You could look at them yourself; I attached the files. I see 6 uses
 out of ~100 which involve an == 0

Looks like the mailing list gateway didn't let your attachments
through. No need to attach them though, I can just grep Hackage for
uses.

Johan



Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

2011-02-23 Thread Johan Tibell
On Wed, Feb 23, 2011 at 1:55 PM, Max Bolingbroke
batterseapo...@hotmail.com wrote:
 Thanks for bringing some data to the table. There are definitely some
 common patterns in what you sent me:

 1) For defining Binary instances, you need to write set size before
 you write the elements: ~7 occurrences
 2) Tests against small constants (typically = 0 or 1, but also 2 and
 3!): ~15 occurrences
 3) A surprise to me: generating fresh names! People keep a set of all
 names generated so far, and then just take size+1 as their fresh name.
 Nice trick. ~17 occurrences
 4) Turning sizes into strings for human consumption: ~19 occurrences
 5) Just reexporting the functions somehow. Uninformative. ~8 occurrences

 There were ~38 occurrences over ~13 repos where it appeared to be
 somehow fundamental to an algorithm (I didn't look into most of these
 in detail). I've put those after my message.

 Frankly I am surprised how much size gets used. It seems that making
 it fast is more important than I thought.
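
The fresh-name trick from item 3 can be sketched roughly as follows. This
is a guess at the pattern, not code from any of the packages surveyed; the
function name `freshName` is hypothetical, and the sketch assumes names are
Ints that are only ever added through this function, starting from the
empty set, so that size+1 is never already in use.

```haskell
import qualified Data.Set as Set

-- Pick size+1 as the next name and record it in the set.
-- Only safe if the set is always {1..k}, i.e. nothing is ever
-- deleted or inserted by other means (an assumption of this sketch).
freshName :: Set.Set Int -> (Int, Set.Set Int)
freshName used = (n, Set.insert n used)
  where
    n = Set.size used + 1
```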

Nice analysis. Does this apply to maps as well as sets, or are sets used
differently than maps somehow?

IntMap (which shares its data structure with HashMap) only has O(n) size.
I wonder if people avoid using IntMap because of this.

I wonder if there's a way to implement size that doesn't mess up the
code so badly (see the commit on GitHub to see how badly it messes up
e.g. insert).

Johan



Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

2011-02-23 Thread Johan Tibell
Attached are all the uses of S.size and Set.size from a semi-recent
snapshot of Hackage.

Johan
Combinatorrent/0.3.2/Combinatorrent-0.3.2/src/Process/Peer.hs:let sz = S.size q
Combinatorrent/0.3.2/Combinatorrent-0.3.2/src/Process/PieceMgr.hs:ipHave = S.size . ipHaveBlocks
Combinatorrent/0.3.2/Combinatorrent-0.3.2/src/Process/PieceMgr.hs:when ( (S.size $ ipHaveBlocks ipp) = ipDone ipp)
cpsa/2.2.1/cpsa-2.2.1/src/CPSA/Graph/Tree.hs:  if S.size new == S.size old then
darcswatch/0.4.3/darcswatch-0.4.3/src/HTML.hs:  show (S.size (r2p d  r)) +++
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Coll/UnbalancedSet.hs:unsafeFromOrdSeq xs = fst (ins xs (S.size xs))
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/RevSeq.hs:fromSeq xs = N (S.size xs - 1) xs
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/RevSeq.hs:k = S.size ys
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/RevSeq.hs:k = S.size ys
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/RevSeq.hs:structuralInvariant (N i s) = i == ((S.size s) - 1)
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/SizedSeq.hs:fromSeq xs = N (S.size xs) xs
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/SizedSeq.hs:m = S.size ys
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/SizedSeq.hs:m = S.size ys
EdisonCore/1.2.1.3/EdisonCore-1.2.1.3/src/Data/Edison/Seq/SizedSeq.hs:structuralInvariant (N i s) = i == S.size s
fountain/0.0.1/fountain-0.0.1/Codec/Fountain.hs:| S.size s == degree = (s, g)
fountain/0.0.1/fountain-0.0.1/Codec/Fountain.hs:  | S.size indices == 0 = decode' decoder newDroplets
fountain/0.0.1/fountain-0.0.1/Codec/Fountain.hs:  | S.size indices == 1 = decode' (Decoder messageLength extraSymbols (M.insert (head $ S.toList indices) symbol symbols) old) (new ++ newDroplets)
hashmap/1.1.0/hashmap-1.1.0/Data/HashSet.hs:some_size (More t) = S.size t
hashmap/1.1.0/hashmap-1.1.0/Data/HashSet.hs:some_norm s = case S.size s of 0 -> Nothing
hashmap/1.1.0/hashmap-1.1.0/Data/HashSet.hs:some_norm' s = case S.size s of 1 -> Only $ S.findMin s
HaskellForMaths/0.3.1/HaskellForMaths-0.3.1/Math/Algebra/Group/PermutationGroup.hs:order gs = S.size $ eltsS gs -- length $ elts gs
HaskellForMaths/0.3.1/HaskellForMaths-0.3.1/Math/Combinatorics/LatinSquares.hs:isOneOfEach xs = length xs == S.size (S.fromList xs)
HaskellTorrent/0.1.1/HaskellTorrent-0.1.1/src/Process/Peer.hs:let sz = S.size q
HaskellTorrent/0.1.1/HaskellTorrent-0.1.1/src/Process/PieceMgr.hs:ipHave = S.size . ipHaveBlocks
HaskellTorrent/0.1.1/HaskellTorrent-0.1.1/src/Process/PieceMgr.hs:when ( (S.size $ ipHaveBlocks ipp) = ipDone ipp)
hburg/1.1.2/hburg-1.1.2/src/Csa/Csa.hs:  ' - expected type++ (if (S.size p  1) then s else ) ++
hgom/0.6/hgom-0.6/Gom/Checker.hs:f s = S.size s  1
Holumbus-MapReduce/0.1.1/Holumbus-MapReduce-0.1.1/Examples/MapReduce/Crawler/Crawl.hs:   runX (traceMsg 1 (  Status: already processed:  ++ show (S.size $ cs_wereProcessed cs) ++
Holumbus-MapReduce/0.1.1/Holumbus-MapReduce-0.1.1/Examples/MapReduce/Crawler/Crawl.hs: , to be processed:++ show (S.size $ cs_toBeProcessed cs)))
hoopl/3.8.6.0/hoopl-3.8.6.0/Compiler/Hoopl/Unique.hs:  setSize (US s) = S.size s
hs2bf/0.6.2/hs2bf-0.6.2/SAM.hs:when (S.size rs/=length args) $ report duplicate arguments
ideas/0.7/ideas-0.7/src/Domain/Math/Polynomial/RationalExercises.hs:  S.size (varSet expr)  1
ideas/0.7/ideas-0.7/src/Domain/Math/Polynomial/RationalExercises.hs:   manyVars = S.size (varSet a `S.union` varSet b)  1
mahoro/0.1.2/mahoro-0.1.2/DB.hs:delJID (n, jids) = if S.size jids == 1
minesweeper/0.9/minesweeper-0.9/Data/ChangeSet.hs:size= S.size . toSet
minesweeper/0.9/minesweeper-0.9/State/Functions.hs:= mines (configuration st) - S.size (marked $ game st)
minesweeper/0.9/minesweeper-0.9/Table.hs:=  S.size (marked g) + M.size (revealResults g) == msize c
minesweeper/0.9/minesweeper-0.9/Table.hs:= mines c - S.size (marked g)
panda/2009.4.1/panda-2009.4.1/src/Panda/Model/Tag.hs:sorted xs= xs.sortBy(compare_by (resources  S.size)).reverse
phybin/0.1.2/phybin-0.1.2/Bio/Phylogeny/PhyBin/Main.hs:putStrLn$ \nTotal unique taxa (++ show (S.size taxa) ++):  ++
regex-tdfa/1.1.7/regex-tdfa-1.1.7/Data/IntSet/EnumSet2.hs:size (EnumSet s) = S.size s
relacion/0.1/relacion-0.1/Data/Relacion.hs:size r  =   M.fold ((+) . S.size) 0 (dominio r)
repa/1.1.0.0/repa-1.1.0.0/Data/Array/Repa.hs:   $! U.enumFromTo (0 :: Int) (S.size sh - 1)
repa/1.1.0.0/repa-1.1.0.0/Data/Array/Repa.hs:   | U.length uarr /= S.size sh
repa/1.1.0.0/repa-1.1.0.0/Data/Array/Repa.hs:   , size of shape =  ++ (show $ S.size sh)  ++ \n
repa/1.1.0.0/repa-1.1.0.0/Data/Array/Repa.hs:   $ (flip reshape) (Z :. (S.size $ extent arr1))

Re: [Haskell-cafe] Automated tests with cabal

2011-03-10 Thread Johan Tibell
On Thu, Mar 10, 2011 at 10:06 AM, Bas van Dijk v.dijk@gmail.com wrote:
 On 10 March 2011 08:12, Hauschild, Klaus (EXT)
 klaus.hauschild@siemens.com wrote:
 Hi Haskellers,

 I read about the cabal features for running test code from Setup.hs with
 defaultMainWithHooks. I'm looking for more generic code that allows me to
 place any haskell in a subdirectory test or so and cabal test will run
 this test without any modification of my Setup.hs.

 Is there a possibility?

 There is. The just released Cabal-1.10.1.0 and cabal-install-0.10.2
 have support for test-suites.

 The documentation is not online yet so here's an example from my
 threads package:

You can find the docs here:

http://www.haskell.org/cabal/release/cabal-1.10.1.0/doc/users-guide/#test-suites

These are not yet the default docs (i.e. the docs linked from the
Cabal home page) but will likely be soon.

Johan



Re: [Haskell-cafe] Update Cabal

2011-03-10 Thread Johan Tibell
On Thu, Mar 10, 2011 at 11:27 AM, Hauschild, Klaus (EXT)
klaus.hauschild@siemens.com wrote:
 Hallo,

 I'm using Haskell Platform 2010.2.0.0 on a Windows XP machine. This haskell
 platform includes cabal-1.8.0.6.
 Now I want to update cabal with `cabal install cabal`. Installation works
 well.
 A call like `runhaskell ./Setup.hs` will use the updated cabal-1.10.0.0. But
 `cabal --version` still says 1.8.0.6 and I have to re-configure.

 How do I update cabal in my current Haskell Platform?

`cabal install cabal-install` will not install the latest version of
cabal-install due to a setting on Hackage (i.e. Duncan probably wants
to wait before automatically changing everyone over). You need to give
the exact version:

cabal install cabal-install-0.10.2

Johan



Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

2011-03-14 Thread Johan Tibell
On Mar 14, 2011 6:23 PM, Malcolm Wallace malcolm.wall...@me.com wrote:


 On 22 Feb 2011, at 22:21, Bryan O'Sullivan wrote:

 for some code that's (b) faster than anything else currently available


 I look forward to seeing some benchmarks against libraries other than
containers, such as AVL trees, bytestring-trie, hamtmap, list-trie, etc.
 Good comparisons of different API-usage patterns are hard to come by.

Milan Straka compared containers to a number of other libraries (including
some of those you mentioned) and found them all to be slower. Since
unordered-containers is faster than (or sometimes as fast as) containers, I
haven't really bothered comparing it to libraries other than containers.

Johan


[Haskell-cafe] Three Google Summer of Code project proposals

2011-03-20 Thread Johan Tibell
Hi,

I'd like to advertise three Google Summer of Code projects that I
recently added to the list [1] of proposed projects:

*** Build multiple Cabal packages in parallel ***
http://hackage.haskell.org/trac/summer-of-code/ticket/1594

Many developers have multi-core machines but Cabal runs the build
process in a single thread, only making use of one core. If the build
process could be parallelized, build times could be cut by perhaps a
factor of 2-8, depending on the number of cores and the opportunities
for parallel execution available.

*** Simpler support for isolated/sandboxed Cabal builds ***
http://hackage.haskell.org/trac/summer-of-code/ticket/1590

cabal-dev and capri allow developers to build packages in their own
sandboxes, using a separate package database for each. This allows for
isolated builds and prevents breakages due to e.g. package upgrades.
Merging cabal-dev into Cabal allows us to share lots of code and makes
the feature more accessible to developers.

*** Convert the text package to use UTF-8 internally ***
http://hackage.haskell.org/trac/summer-of-code/ticket/1595

When the text package was created, early benchmarks showed that using
UTF-16 as the internal representation for Unicode code points was the
fastest. The package still uses UTF-16 internally.

The benchmarks might not have given a complete picture of the
performance implications of using different internal encodings: all
benchmarks were run on input data that used the same encoding as used
internally, but most real world data uses UTF-8. If the benchmarks
had also taken into account the cost of decoding and encoding from and to
UTF-8, the results might have been different. The goal of this project
is to revisit the choice of internal encoding.

I encourage any interested students to have a look at the three
proposals (and the other proposals on the list), discuss them on this
list, and sign up for GSoC (after March 28).

1. http://hackage.haskell.org/trac/summer-of-code/report/1

Cheers,
Johan



Re: [Haskell-cafe] Help optimising a Haskell program

2011-03-21 Thread Johan Tibell
You use a lot of lists (linked lists). Are they all used to represent streams or
are they actually manifest during runtime? If it's the latter, switch to a
better data structure, like Vector.


Re: [Haskell-cafe] Three Google Summer of Code project proposals

2011-03-27 Thread Johan Tibell
On Sun, Mar 27, 2011 at 1:03 PM, Andrew Coppin
andrewcop...@btinternet.com wrote:
 *** Build multiple Cabal packages in parallel ***
 http://hackage.haskell.org/trac/summer-of-code/ticket/1594

 Many developers have multi-core machines but Cabal runs the build
 process in a single thread, only making use of one core. If the build
 process could be parallelized build times could be cut by perhaps a
 factor of 2-8, depending on the number of cores and opportunity of
 parallel execution available.

 Isn't the Cabal build process strictly I/O-limited rather than CPU-limited?

It's mostly CPU-limited due to spending most of its time in ghc --make
(which is CPU limited). It would be nice to parallelize GHC itself at
some point but that's a harder task I believe.

Johan



[Haskell-cafe] ANN: Google Summer of Code student application period opens today

2011-03-28 Thread Johan Tibell
Hi,

The Google Summer of Code student application period starts today at
19:00 UTC. If you're a student and would like to get paid to work on a
Haskell project this summer, I recommend you go find an interesting
project [1] and start working on your application. You can find more
information on the wiki [2].

Cheers,
Johan

1. http://hackage.haskell.org/trac/summer-of-code/report/1
2. http://hackage.haskell.org/trac/summer-of-code/wiki/Soc2011



Re: [Haskell-cafe] Iteratee, ghc 6.12/7.0 strange behaviour - epollControl: permission denied (Operation not permitted)

2011-03-28 Thread Johan Tibell
On Mon, Mar 28, 2011 at 7:31 PM, Michael A Baikov pa...@bk.ru wrote:


 I am still playing with the latest iteratee and I think I found something
 strange.


 let's suppose we have a file test.hs like this:

 import Data.Iteratee
 import Data.Iteratee.IO
 import Data.Iteratee.Char

 main = fileDriver printLines "/etc/passwd"

 It works fine when executed via runhaskell / ghci in ghc 6.12. Compiled 
 version in ghc 7
 also works, but when i am trying to execute it via runhaskell / ghci i am 
 getting this error:

 iter.hs: epollControl: permission denied (Operation not permitted)

 Any ideas?

Make sure you have at least GHC 7.0.2. There were some I/O manager
bugs in 7.0.1.

Johan



Re: [Haskell-cafe] Iteratee, ghc 6.12/7.0 strange behaviour -epollControl: permission denied (Operation not permitted)

2011-03-29 Thread Johan Tibell
Could you please file a bug at

http://hackage.haskell.org/trac/ghc/

Give as small a reproducible test case as possible and as much
information about your system as possible.

Thanks!

Johan

2011/3/29 Michael A Baikov pa...@bk.ru:

 I tried with both 7.0.2 and 7.0.3

 -Original Message-

 On Mon, Mar 28, 2011 at 7:31 PM, Michael A Baikov pa...@bk.ru wrote:
 
 
  I am still playing with the latest iteratee and I think I found something
  strange.
 
 
  let's suppose we have a file test.hs like this:
 
  import Data.Iteratee
  import Data.Iteratee.IO
  import Data.Iteratee.Char
 
  main = fileDriver printLines "/etc/passwd"
 
  It works fine when executed via runhaskell / ghci in ghc 6.12. Compiled 
  version in ghc 7
  also works, but when i am trying to execute it via runhaskell / ghci i am 
  getting this error:
 
  iter.hs: epollControl: permission denied (Operation not permitted)
 
  Any ideas?

 Make sure you have at least GHC 7.0.2. There were some I/O manager
 bugs in 7.0.1.

 Johan





Re: [Haskell-cafe] SoC project: advice requested

2011-04-01 Thread Johan Tibell
Hi Gábor,

There are a few non-Cabal projects on the ideas list
(http://hackage.haskell.org/trac/summer-of-code/report/1). Just
thought I'd mention it in case you missed it.

2011/4/1 Gábor Lehel illiss...@gmail.com:
 Alternately, I'd be very happy to receive suggestions
 about other GHC-related work which would be considered appropriate.
 (Or, heck, any other compiler.)

Perhaps you could send an email to the GHC mailing list and ask if
they have any good GSoC projects? I'm not sure the Simons are reading
every post on this list.

 A related problem is that, having done only minimal GHC hacking so
 far, drawing up a detailed plan / design in advance as part of the
 proposal would be difficult. If this is considered necessary and there
 is someone willing to mentor the project I'd be happy to research the
 problem in advance of the submission deadline so I can submit a more
 detailed proposal. Alternately, if it's deemed acceptable to learn the
 ropes / come up to speed as part of the SoC itself that's fine by me
 as well. (Wasn't this sort of the point originally?)

It's not required but it helps. We mentors need to figure out if
you're likely to finish your project or not. Showing that you
understand what needs to be done is a good sign. If you're not sure
what needs to be done there's still a chance you'll get accepted if
people who already know GHC think one summer is enough time to both
get familiar with GHC and add something worthwhile.

 Background info: I've taken part in the SoC once before, back in 2006
 (when I applied to KDE to work on Krita). I don't yet have any Hackage
 packages to my name, however I'm working on a C++-to-Haskell bindings
 generator for my bachelor's thesis (the primary target being Qt*)
 which is likely to spawn quite a few. (I've avoided making any noise
 about this because I didn't want to put the cart before the horse: the
 plan was (and still is) to announce something once there is something
 worth announcing, and it's not at that point quite yet.)

How about adding Haskell support for SWIG? Being able to call C++
libraries from Haskell would be very useful.

Cheers,
Johan



Re: [Haskell-cafe] SoC project: advice requested

2011-04-01 Thread Johan Tibell
2011/4/1 Gábor Lehel illiss...@gmail.com:
 Oh, hmm. Good idea. Should've cross-posted from the beginning :|.
 What's the accepted etiquette here? Forward the original message? Send
 a short heads-up with a link to this thread in the archives?

I'd send a new message with only the parts relevant for the GHC devs
(i.e. asking if they have some good GSoC project).

Johan



Re: [Haskell-cafe] ANN: Google Summer of Code student application period opens today

2011-04-05 Thread Johan Tibell
Hi Marco

On Tue, Apr 5, 2011 at 3:17 PM, Marco Túlio Gontijo e Silva
mar...@marcot.eti.br wrote:
 I've written a draft of the proposal at
 http://www2.dcc.ufmg.br/laboratorios/llp/wiki/doku.php?id=marco_soc2011 .  If
 you have any comments, I'll be glad to receive them.

Thanks for taking the time to put together such a well-written
proposal. I have two comments at this point:

    "If not all of the dependencies were build yet, the dependencies
    are included in the queue, and also the package or module, after
    them."

Minor nit: note that several packages can share a dependency so
naively adding a dependency to the queue could cause unnecessary
rebuilds.
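
To make the point concrete, here is a minimal sketch of how a build order
could be computed so that a shared dependency is scheduled only once. It is
not taken from the proposal or from Cabal; the names `Package` and
`buildOrder` are hypothetical, packages are plain strings, and the
dependency graph is assumed to be acyclic.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Package = String

-- Depth-first traversal that remembers which packages were already
-- scheduled, so a dependency shared by several packages is queued once.
-- Dependencies always appear before the packages that need them.
buildOrder :: Map.Map Package [Package] -> [Package] -> [Package]
buildOrder deps targets = reverse (snd (foldl visit (Set.empty, []) targets))
  where
    visit (seen, order) pkg
      | pkg `Set.member` seen = (seen, order)
      | otherwise =
          let children = Map.findWithDefault [] pkg deps
              (seen', order') = foldl visit (Set.insert pkg seen, order) children
          in (seen', pkg : order')
```

With packages "a" and "b" both depending on "c", "c" shows up in the
result a single time, ahead of both.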

    "I'll work on a released version of GHC, to avoid having to
    rebuild it whenever the git is updated, and to avoid handling with
    changed on the git tree during my development."

I would strongly recommend against this as you might end up with an
impossible merge towards the end of the project, putting the whole
project in jeopardy. I'd suggest getting patches in early and
frequently. By submitting patches (at least for review) early and
often you'll benefit from feedback and buy-in from the maintainer(s)
and make it easier for them to merge your work.

Johan



Re: [Haskell-cafe] how to optmize this code?

2011-04-10 Thread Johan Tibell
Hi Gilberto,

On Wed, Mar 30, 2011 at 4:39 PM, Gilberto Garcia giba@gmail.com wrote:
 fkSum :: Int -> [Int] -> Int
 fkSum a [] = 0
 fkSum a (b) = foldl (+) 0 (filter (\x -> isMultiple x b) [1..a])

 isMultiple :: Int -> [Int] -> Bool
 isMultiple a [] = False
 isMultiple a (x:xs) = if (mod a x == 0) then True else isMultiple a xs

You can make both these functions a little bit more efficient by
making them strict in the first argument, like so:

{-# LANGUAGE BangPatterns #-}

fkSum :: Int -> [Int] -> Int
fkSum !a [] = 0
fkSum a (b) = foldl (+) 0 (filter (\x -> isMultiple x b) [1..a])

isMultiple :: Int -> [Int] -> Bool
isMultiple !a [] = False
isMultiple a (x:xs) = if (mod a x == 0) then True else isMultiple a xs

This change ensures that the first argument is always evaluated.
Before, `fkSum undefined []` would return 0; now it results in an
error. The upside is that when a function is strict in an argument,
GHC can use a more efficient calling convention for the function. In
this case it means that instead of passing the first argument as a
pointer to a machine integer, it can pass the machine integer directly
(in a register).

This optimization is particularly worthwhile for accumulator parameters.
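
As an illustration of the accumulator case, here is a sketch of the same
computation written as an explicit loop with a strict accumulator. It is
not the code from this thread; `sumMultiples` and `go` are names invented
for the example, and it assumes none of the divisors is zero.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Sum of all n in [1..limit] that are a multiple of at least one divisor.
-- The bang on acc forces the running total at every step, so no chain of
-- unevaluated (+) thunks builds up.
sumMultiples :: Int -> [Int] -> Int
sumMultiples limit divisors = go 0 1
  where
    go !acc n
      | n > limit = acc
      | any (\d -> n `mod` d == 0) divisors = go (acc + n) (n + 1)
      | otherwise = go acc (n + 1)
```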

Johan



Re: [Haskell-cafe] Need comments on a libusb asynchronous select/poll loop

2011-04-20 Thread Johan Tibell
On Tue, Apr 19, 2011 at 10:36 PM, Bas van Dijk v.dijk@gmail.com wrote:
 On 19 April 2011 15:06, John Obbele john.obb...@gmail.com wrote:
        -- Step 3 is the most important step. Submitting the transfer:
        handleUSBException $ c'libusb_submit_transfer transPtr

        -- TODO: Now we need to do the complicated stuff described in:
        -- http://libusb.sourceforge.net/api-1.0/group__poll.html
        --
        -- First we need the function:
        -- getPollFds ∷ Ctx → IO [C'libusb_pollfd]
        --
        -- A C'libusb_pollfd:
        -- http://libusb.sourceforge.net/api-1.0/structlibusb__pollfd.html
        -- is a structure containing a file descriptor which should be
        -- polled by the GHC event manager and an abstract integer
        -- describing the event flags to be polled.
        --
        -- The idea is to call getPollFds and register the returned
        -- file descriptors and associated events with the GHC event
        -- manager using registerFd:
        -- 
 http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/System-Event.html#v:registerFd
        --
        -- As the callback we use libusb_handle_events_timeout.
        --
        -- But here we run into a problem: We need to turn our
        -- concrete event integer into a value of the _abstract_ type
        -- Event. But the only way to create Events is by evtRead or
        -- evtWrite!
        --
        -- I would really like a solution for this.
        -- Bryan, Johan any ideas?

Could you do something like:

toEvent :: Int -> Event
toEvent flag
    | flag .&. (#const POLLIN) /= 0 = evtRead

etc?

Not that evtRead and evtWrite maps to different things on different platforms.
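
A slightly fuller sketch of that conversion, under some loud assumptions:
it uses GHC.Event (the name under which the event manager API is exposed
in current versions of base) and relies on Event being a Monoid to combine
read and write interest. The constants `pollIn` and `pollOut` are
hard-coded to the usual Linux values of POLLIN and POLLOUT purely for
illustration; real code would take them from the C headers, e.g. via
hsc2hs.

```haskell
import Data.Bits ((.&.))
import GHC.Event (Event, evtRead, evtWrite)

-- Assumed Linux values of POLLIN and POLLOUT; not portable.
pollIn, pollOut :: Int
pollIn = 0x001
pollOut = 0x004

-- Translate a poll(2)-style bit mask into the abstract Event type by
-- testing each bit and combining the matching events monoidally.
toEvent :: Int -> Event
toEvent flags = mconcat
  [ if flags .&. pollIn /= 0 then evtRead else mempty
  , if flags .&. pollOut /= 0 then evtWrite else mempty
  ]
```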

Johan



Re: [Haskell-cafe] Need comments on a libusb asynchronous select/poll loop

2011-04-20 Thread Johan Tibell
On Wed, Apr 20, 2011 at 5:22 PM, Bas van Dijk v.dijk@gmail.com wrote:
 On 20 April 2011 17:04, Johan Tibell johan.tib...@gmail.com wrote:
 Not that evtRead and evtWrite maps to different things on different 
 platforms.

 Do you mean Not or Note?

Yes, sorry.



Re: [Haskell-cafe] Need comments on a libusb asynchronous select/poll loop

2011-04-20 Thread Johan Tibell
On Wed, Apr 20, 2011 at 6:11 PM, Bas van Dijk v.dijk@gmail.com wrote:
 I still need to add appropriate conditions for checking whether the
 program is using the threaded RTS. What is the recommended approach
 for this?

 I see GHC.Conc.IO uses a dynamic check:

 foreign import ccall unsafe "rtsSupportsBoundThreads" threaded :: Bool

 Is this also available to me as a library author?

I think there's a ticket for adding something along the lines of

getSystemEventManager :: IO (Maybe EventManager)

If that returns Just em, you're in the threaded RTS and have an EventManager.
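
A sketch of how such a check might look from library code. It assumes a
version of base in which GHC.Event exports getSystemEventManager with the
signature above, which was not yet the case when this was written; the
helper names `hasEventManager` and `describeRts` are invented for the
example. Control.Concurrent also exports rtsSupportsBoundThreads as a
plain Bool.

```haskell
import Control.Concurrent (rtsSupportsBoundThreads)
import Data.Maybe (isJust)
import GHC.Event (getSystemEventManager)

-- True when the RTS has a running system event manager.
hasEventManager :: IO Bool
hasEventManager = isJust <$> getSystemEventManager

-- Human-readable summary of both checks, e.g. for a startup log line.
describeRts :: IO String
describeRts = do
  em <- hasEventManager
  return ("bound threads: " ++ show rtsSupportsBoundThreads
          ++ ", event manager: " ++ show em)
```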

Johan



Re: [Haskell-cafe] Binary and Serialize classes

2011-05-01 Thread Johan Tibell
On Fri, Apr 29, 2011 at 10:25 AM, Evan Laforge qdun...@gmail.com wrote:
 Indeed, and I was starting to do that... well, I would make my own
 project specific Serialize class, since the type dispatch is useful.
 But copy pasting a UTF8 encoder, or the variable length Integer
 encoder, or whatever seemed kinda unpleasant.  Surely we could expose
 that stuff in a library, whose explicit goal was that they *would*
 remain compatible ways to serialize various basic types, and then just
 reuse those functions?  E.g. that is already done for words with the
 putWordN{be,le} functions, and is available separately for UTF8.

I intend to add support for different UTF encodings to
Data.Binary.Builder for this very reason. I also intend to add two
functions to Data.Binary.Builder.Internal that lets you implement
variable length encoding as efficiently as possible.

I'm a bit skeptical of adding builders for different variable length
encodings to the library, simply because there are so many
possibilities. I think creating a binary-vle (for variable length
encoding) package would be worthwhile. I have an implementation of the
VLE used in protocol buffers.
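
For readers unfamiliar with it, the protocol buffers scheme (base-128
varints) can be sketched as below. This is not the implementation
mentioned above, just an illustration on plain lists; `encodeVarint` is a
made-up name and a real builder would write into a buffer instead.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word64, Word8)

-- Emit 7 bits per byte, least significant group first. The high bit of
-- each byte is set when more bytes follow.
encodeVarint :: Word64 -> [Word8]
encodeVarint n
  | n < 0x80 = [fromIntegral n]
  | otherwise =
      (fromIntegral (n .&. 0x7f) .|. 0x80) : encodeVarint (n `shiftR` 7)
```

For example, 300 encodes to the two bytes 0xAC 0x02.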

Johan



[Haskell-cafe] ANN: unordered-containers 0.1.3.0

2011-05-05 Thread Johan Tibell
Hi all,

I've just uploaded a new version of the unordered-containers package,
a package of fast hashing-based container types. Version
0.1.3.0 [1] adds:

* the ability to take the union of two maps,
* lower memory overhead per key/value pair (contributed by Jan-Willem
  Maessen), and
* the beginning of a `Data.HashSet` module (contributed by Bryan
  O'Sullivan).

If you want to contribute, you can get the source from the git
repository:

git clone https://github.com/tibbe/unordered-containers.git

Alternatively, just fork the project on GitHub [2].

1. http://hackage.haskell.org/package/unordered-containers-0.1.3.0
2. https://github.com/tibbe/unordered-containers



Re: [Haskell-cafe] blaze-builder and FlexibleInstances in code that aims to become part of the Haskell platform

2011-05-19 Thread Johan Tibell
Hi Simon,

On Wed, May 18, 2011 at 7:32 PM, Simon Meier iridc...@gmail.com wrote:
 In fact, one of my current goals with this work is to polish it such
 that it can be integrated into the 'bytestring' library.

We should definitely add a builder monoid in the bytestring package.

Since Write mentions IO, I thought I should point out that we need to
separate any code that mentions IO from the code that doesn't
(i.e. the pure Builder API). The use of IO is an implementation detail
in bytestring. We should follow the existing bytestring pattern and
put any code that mentions IO in e.g.
Data.ByteString.Lazy.Builder.Internal. This allows the few people who
need to access the internals to do so while making it clear that these
are in fact internals. Long term we'd like to switch bytestring over
from ForeignPtr to ByteArray#, if possible. There are currently some
technical obstacles to such a switch, but factoring out the IO code at
least makes it somewhat easier if we ever get around to switching.

Avoiding IO in the main API means that the main builder type must not
mention IO (or things related to IO, such as Storable).

 The core principle used to tackle (1) is avoiding intermediate data
 structures.  The core abstraction used is the one of a Write (see [1]
 for the corresponding library.)

  data Write a = Write Int (a -> Ptr Word8 -> IO (Ptr Word8))

 A value `Write bound io :: Write a` denotes an encoding scheme for
 values of type `a` that uses at most `bound` bytes of space. Given a
 value `x :: a` and a pointer `po` to the next free byte, `io x po`
 encodes `x` to memory starting from `po` and returns the pointer to
 the next free byte after the encoding of `x`.

 In most cases Writes are used as an abstract datatype. They serve as
 an interface between implementors of the low-level bit-twiddling
 required to efficiently implement encodings like UTF-8 or Base16 and
 the providers of efficient traversal functions through streams of
 Haskell values. Hence, typical users of Writes are functions like

  fromWrite          :: Write a -> a -> Builder
  fromWriteList      :: Write a -> [a] -> Builder
  fromWriteUnfoldr   :: Write b -> (a -> Maybe (b, a)) -> a -> Builder
  mapWriteByteString :: Write Word8 -> S.ByteString -> Builder
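
To make the Write abstraction described above more tangible, here is a
self-contained sketch built on the quoted data type. Only the `Write`
declaration comes from the text; `writeWord8` is reimplemented here in the
simplest possible way, and `pairWrite` and `runWrite` are hypothetical
helpers added for the example (they are not blaze-builder functions). The
point is that two writes combine into one whose bound is the sum of the
two, so a caller needs only one free-space check.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr, minusPtr, plusPtr)
import Foreign.Storable (poke)

-- An upper bound on the bytes written, plus the action that writes them.
data Write a = Write Int (a -> Ptr Word8 -> IO (Ptr Word8))

-- Write a single byte and advance the pointer by one.
writeWord8 :: Write Word8
writeWord8 = Write 1 $ \w p -> poke p w >> return (p `plusPtr` 1)

-- Run two writes back to back; the bounds simply add up.
pairWrite :: Write a -> Write b -> Write (a, b)
pairWrite (Write n f) (Write m g) =
  Write (n + m) $ \(a, b) p -> f a p >>= g b

-- Allocate exactly `bound` bytes, run the write, and read back what
-- was actually written.
runWrite :: Write a -> a -> IO [Word8]
runWrite (Write bound io) x = allocaBytes bound $ \p -> do
  end <- io x p
  peekArray (end `minusPtr` p) p
```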

We want to allow users to efficiently create new builders, for their
own data type. This is crucial as the bytestring package cannot
provide efficient builders for every possible type, as it would have
to depend on most of Hackage (i.e. on all packages that define types
that we want efficient builders for) to do so. Allowing the user to
get hold of the underlying buffer in a controlled way makes the
builder extensible. This is good.

Write achieves this separation, but it has some costs which I'm not
entirely comfortable with.

First, it leads to lots of API duplication. For every type (e.g. Word)
we want to be able to serialize we have two morally identical functions

writeWordhost :: Word -> Write
fromWordhost :: Word -> Builder

in the API, where the latter simply calls the former and does some
additional wrapping.

See 
http://hackage.haskell.org/packages/archive/blaze-builder/0.3.0.1/doc/html/Blaze-ByteString-Builder-Word.html
for examples.

Simon, is the reason for this duplication this comment on top of
Blaze.ByteString.Builder.Word?

Note that for serializing a three tuple (x,y,z) of bytes (or other
word values) you should use the expression

fromWrite $ writeWord8 x `mappend` writeWord8 y `mappend` writeWord8 z

instead of

fromWord8 x `mappend` fromWord8 y `mappend` fromWord8 z

The first expression will result in a single atomic write of three
bytes, while the second expression will check for each byte, if
there is free space left in the output buffer. Coalescing these
checks can improve performance quite a bit, as long as you use it
sensibly.
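
To make the coalescing idea concrete, here is a rough sketch using a
simplified, monomorphic stand-in for Write. This is an assumption for
illustration only and blaze-builder's real definitions differ. Appending
two writes adds their bounds, so the combined write needs only one
free-space check:

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr, minusPtr, plusPtr)
import Foreign.Storable (poke)

-- Simplified stand-in: a size bound and a poke action.
data Write = Write Int (Ptr Word8 -> IO (Ptr Word8))

instance Semigroup Write where
    -- Bounds add up; the pokes run back to back on the same buffer.
    Write n f <> Write m g = Write (n + m) (\p -> f p >>= g)

instance Monoid Write where
    mempty = Write 0 return

writeWord8 :: Word8 -> Write
writeWord8 x = Write 1 (\p -> poke p x >> return (p `plusPtr` 1))

-- Hypothetical runner: a single allocation of `bound` bytes stands in
-- for the single free-space check, followed by all the pokes.
runWrite :: Write -> IO [Word8]
runWrite (Write bound io) =
    allocaBytes bound $ \p -> do
        p' <- io p
        peekArray (p' `minusPtr` p) p

main :: IO ()
main = runWrite (writeWord8 1 `mappend` writeWord8 2 `mappend` writeWord8 3)
           >>= print
```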

Coalescing of buffer space checks can be achieved without separating
writes into Write and Builder. I've done so in the binary package [1]
using rewrite rules. The rewrite rules fire reliably, so that any
syntactic series of puts, e.g.

f = do
    putWord8 1
    putWord8 2
    putWord8 3

results in one bounds check, followed by three pokes into the buffer.
To do so all that is needed is to define all builders in terms of

writeAtMost :: Int -> (Ptr Word8 -> IO Int) -> Builder

and create a rewrite rule for append/writeAtMost. writeAtMost is
essentially the same as your Write [2], except it never leads to any
constructors getting allocated.

At the moment, the addition of Write means that

import Blaze.ByteString.Builder

f :: [Word8] -> Builder
f xs = fromWriteList writeWord8 xs

is faster than the Data.Binary equivalent

import Data.Binary.Builder

g :: [Word8] -> Builder
g [] = mempty
g (x:xs) = singleton x `mappend` g xs

Fortunately this was due to a bug in GHC [3]. After this bug has been
fixed I expect Data.Binary to perform on par with
Blaze.ByteString.Builder, 

Re: [Haskell-cafe] blaze-builder and FlexibleInstances in code that aims to become part of the Haskell platform

2011-05-20 Thread Johan Tibell
Hi Simon,

On Thu, May 19, 2011 at 10:46 PM, Simon Meier iridc...@gmail.com wrote:
 Write achieves this separation, but it has some costs which I'm not
 entirely comfortable with.

 First, it leads to lots of API duplication. For every type (e.g. Word)
 we want to be able to serialize we have two morally identical functions

    writeWordhost :: Word -> Write
    fromWordhost :: Word -> Builder

 in the API, where the latter simply calls the former and does some
 additional wrapping.

 Yes, I agree with this duplication. I'll explain below what we gain
 from it. Note that I factored out the whole Write stuff into its own
 library (system-io-write) for the bytestring integration. Therefore,
 an end-user of bytestring will only see the Builder versions except
 he's doing more low-level stuff to gain some extra performance.

There are (at least) two cases where I think the simple Builder API
must perform well for it to be usable on its own: simple loops and
sequential writes. To be specific, if the following two cases don't
compile into near optimal code, there's a compiler bug we should fix.
First, a simple loop:

f :: [Word8] -> Builder
f [] = mempty
f (x:xs) = singleton x `mappend` f xs

This code is already quite low level, there should be enough
information here for the compiler to emit a simple loop with one
buffer bounds check per iteration. Second, a bunch of sequential
writes:

g :: Word8 -> Word8 -> Word8 -> Word8 -> Builder
g a b c d = singleton a `mappend` (singleton b `mappend` (singleton c `mappend` singleton d))

This ought to compile to a single bounds check followed by four memory writes.

The user shouldn't have to get more low-level than this in these
simple examples. Today this is only true for the second
example, which we can solve using rewrite rules. The first example
doesn't work due to the GHC compiler bug I mentioned.


 Simon, is the reason for this duplication this comment on top of
 Blaze.ByteString.Builder.Word?

    <snip>

 That's one of the reasons, but not the main one. The core reason is
 that Write's provide
 an interface between implementors of the low-level bit-twiddling
 required to efficiently implement encodings like UTF-8 or Base16 and
 the providers of efficient traversal functions through (streams of)
 Haskell values. For simple traversals like

   fromWrite          :: Write a -> a -> Builder
   fromWriteList      :: Write a -> [a] -> Builder
   fromWriteUnfoldr   :: Write b -> (a -> Maybe (b, a)) -> a -> Builder

 there might be the option that GHC is clever enough and can find the
 efficient loop. However, for more complicated functions like

   mapWriteByteString :: Write Word8 -> S.ByteString -> Builder

 That certainly isn't the case. I'm using quite a few tricks there [3]
 to enable a tight inner loop with few live variables.

Right. So this argues for having an escape hatch, and I agree we
should have one. Write and writeAtMost are both such escape hatches and
I believe them to be equal in expressiveness. This shouldn't come as a
surprise as Write is writeAtMost with one argument reified into a
constructor field:

writeAtMost :: Int -> (Ptr Word8 -> IO Int) -> IO ()
data Write = Write {-# UNPACK #-} !Int (Ptr Word8 -> IO (Ptr Word8))

(That the second argument of writeAtMost returns an Int instead of a Ptr
Word8 as in Write is an unimportant difference.)

There are some operational differences.

* The argument to Write can be inspected at runtime, while the
argument to writeAtMost can only be inspected at compile time (by a
rewrite rule).

* Write might exist at runtime, if its allocation site cannot be seen
by its use site, which is hard to guarantee in general (it requires
serious staring at Core). This is not the case for writeAtMost, unless
it's partially applied.

* The second field of Write is lazy. I'm not sure what, if any,
implications this might have for how GHC optimizes the code.

 In my opinion, Writes and Builders have different use-cases and
 different semantics. Providing a type modeling Writes makes therefore
 sense to me. Moreover, note that Writes are built as a compile time
 abstraction. All their definitions are intended to be completely
 inlined and care is taken that the inliner also does so. Therefore,
 they incur no runtime cost.

This is up to the user of the Write abstraction to ensure, as any
function that takes a Write as an argument must have the correct
INLINE incantations applied to make this happen.

Cheers,
Johan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] blaze-builder and FlexibleInstances in code that aims to become part of the Haskell platform

2011-05-23 Thread Johan Tibell
On Fri, May 20, 2011 at 11:12 PM, Simon Meier iridc...@gmail.com wrote:
 There seems to be a historical artefact here. The new Write
 abstraction in system-io-write is different from the one used in
 blaze-builder. Its type is

  data Write a = Write Int (a -> Ptr Word8 -> IO (Ptr Word8))

 This definition ensures that the bound on the number of bytes written
 is independent of the value being encoded. That's crucial for the
 implementation of `mapWriteByteString`. It also benefits the other
 Write combinators, as the bound can always be computed in a
 data-independent fashion. Inlining is therefore really sufficient to
 arrive at a constant bound during compile time.

I don't see why this makes a difference, you could still do

myWrite x = Write (length x) (\ _ p -> pokePokePoke p x)

 I don't see how this Write type can be emulated using `writeAtMost`, do you?

There's no difference, as I showed above. Both can result in data
dependent lengths. It's up to the programmer to make sure the length
is independent of the value being written, when so desired.

 Hmm, all my Writes are top-level function definitions annotated with
 {-# INLINE #-}. Moreover, all combinators for Writes are also inlined
 and all their calls are saturated. Therefore, I thought GHC is capable
 of optimizing away the pattern matches on the Write constructor.

You also need to make all top-level functions non-recursive but from
what I remember you did so. The case for Writes is the same as for
higher-order arguments, the call site must meet the definition site.
So if you have something like:

myWrite :: Write Word8

writeList :: Write a -> [a] -> ...

f xs = writeList myWrite xs

we need to make sure both myWrite and writeList are inlined into f.
The case is similar for writeAtMost. The question is what happens if
the user ever fails to get everything to inline optimally. In the
writeAtMost case we just have an indirect function call instead of a
direct one. In the Write case we also have extra allocation and
indirection. We've had such problems in e.g. attoparsec. While things
should inline properly in big programs they rarely do. Same problem
exists for fusion where fusion constructors end up in the final
program although they should have been eliminated.

 I'm happy to remove Writes, if there's a superior way of sharing the
 low-level encoding code that they abstract. However, I did peek at
 Core from time to time and found that the Write constructors were
 optimized away. I currently see Writes as an expert domain to be used
 by authors of libraries like bytestring, text, aeson, blaze-html, etc.
 With appropriate documentation and benchmarks I expect them to be able
 to make good choices w.r.t. inlining and partial application.

I agree. Writes (and writeAtMost) would be the domain of experts.

If we expect Writes to be reused a lot it might make sense to have a
separate Write type. Note that I'd be reluctant to see dependencies
that involve I/O underneath bytestring as it's designed as a pure data
structure library (and is likely to have things involving I/O on top
of it).

Cheers,
Johan



Re: [Haskell-cafe] Bytestring package: Int and Int64

2011-05-24 Thread Johan Tibell
Hi Daniel,

On Tue, May 24, 2011 at 10:07 AM, Daniel Díaz danield...@asofilak.es wrote:
 Hi, cafe,

 I just feel curiosity. In the bytestring package, Data.ByteString module,
 functions like length, index, and others with Int in its type signature,
 have Int64 in the analogous Data.ByteString.Lazy version. What is the
 reason?

A strict ByteString is one contiguous chunk of memory so it cannot be
longer than an Int (if we assume an Int is either 32 or 64 bits for a
second). However, a lazily generated stream can be much bigger than
main memory, so it makes sense to use a bigger type to refer to e.g.
its length. Now, you might say that lazy ByteString should use Integer
instead of Int64. However, Int64 performs much better so I think the
loss of generality is worth it.
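
A small sketch of the difference, assuming only the bytestring package
that ships with GHC. The type annotations are there just to make the
result types visible:

```haskell
import Data.Int (Int64)
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L

main :: IO ()
main = do
    let strict = S.pack [1, 2, 3]
        -- A lazy ByteString is a list of strict chunks.
        lazy   = L.fromChunks [strict, strict]
        n      = S.length strict :: Int    -- strict length is an Int
        n'     = L.length lazy   :: Int64  -- lazy length is an Int64
    print (n, n')
```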

Johan



Re: [Haskell-cafe] Policy for taking over a package on Hackage

2011-05-25 Thread Johan Tibell
On Wed, May 25, 2011 at 2:01 PM, Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com wrote:
 With my wl-pprint-text package, Jason Dagit suggested to me on
 #haskell that it would make sense to make such a pretty-printer be
 class-based so that the same API could be used for String, ByteString,
 Text, etc.

I'm a bit skeptical of using type classes to abstract over Unicode
string types and byte sequence types. The only API shared by the two
kinds of types is that of a sequence. Things like dots, spaces, etc.
don't make much sense on binary data. You must assume that the
ByteString contains text in some encoding to make sense of such
concepts.

Johan



Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Johan Tibell
Hi Aleksandar,

On Tue, May 31, 2011 at 6:10 PM, Aleksandar Dimitrov
aleks.dimit...@googlemail.com wrote:
 Say, we have an input file that contains a word per line. I want to find all
 unigrams (unique words) in that file, and associate with them the amount of
 times they occurred in the file. This would allow me, for example, to make a
 list of word frequencies in a given text.

Here's how I would do it:

{-# LANGUAGE BangPatterns #-}
module Ngram (countUnigrams) where

import qualified Data.ByteString as S
import qualified Data.HashMap.Strict as M
import System.IO

foldLines :: (a -> S.ByteString -> a) -> a -> Handle -> IO a
foldLines f z0 h = go z0
  where
    go !z = do
        eof <- hIsEOF h
        if eof
            then return z
            else do
                line <- S.hGetLine h
                go $ f z line
{-# INLINE foldLines #-}

-- Example use
countUnigrams :: IO (M.HashMap S.ByteString Int)
countUnigrams = foldLines (\ m s -> M.insertWith (+) s 1 m) M.empty stdin

 RANT

 I have tried and tried again to avoid writing programs in Haskell that would
 leak space like BP likes to leak oil. However, I have yet to produce a single
 instance of a program that would do anything at all and at the same time
 consume less memory than there is actual data in the input file.

 It is very disconcerting to me that I seem to be unable, even after quite some
 practice, to identify space leaks in trivial programs like the above. I know
 of no good resource to educate myself in that regard. I have read the GHC
 manual, RWH's chapter on profiling, also Inside T5's recent series on the
 Haskell heap, but no dice. Even if I can clearly see the exact line where at
 least some of the leaking happens (as I can in this case,) it seems impossible
 for me to prevent it.

 *thank you very much* for reading this far. This is probably a mostly useless
 email anyhow, I just had to get it off my chest. Maybe, just maybe, someone
 among you will have a crucial insight that will save Haskell for me :-) But
 currently, I see no justification to not start my next project in Lua, Python
 or Java. Sure, Haskell's code is pretty, and it's fun, but if I can't actually
 *run* it, why bother?  (Yes, this isn't the first time I've ran into this
 problem …)

We definitely need more accessible material on how to reliably write
fast Haskell code. There are those among us who can, but it shouldn't
be necessary to learn it in the way they did (i.e. by lots of
tinkering, learning from the elders, etc). I'd like to write a 60-page
(or so) tutorial on the subject, but haven't found the time.

In addition to RWH, perhaps the slides from the talk on
high-performance Haskell I gave could be useful:


http://blog.johantibell.com/2010/09/slides-from-my-high-performance-haskell.html

Cheers,
Johan



[Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Johan Tibell
Hi Aleksandar,

I thought it'd be educational to do some back-of-the-envelope
calculations to see how much memory we'd expect to use to store words
in a HashMap ByteString Int. First, let's start by looking at how much
memory one ByteString uses. Here's the definition of ByteString [1]:

data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8)
                     {-# UNPACK #-} !Int  -- offset
                     {-# UNPACK #-} !Int  -- length

The two Int fields are used to support O(1) slicing of ByteStrings.

We also need the definitions of ForeignPtr [2] and Int.

data ForeignPtr a = ForeignPtr Addr# ForeignPtrContents

data ForeignPtrContents
    = PlainForeignPtr !(IORef (Finalizers, [IO ()]))
    | MallocPtr (MutableByteArray# RealWorld) !(IORef (Finalizers, [IO ()]))
    | PlainPtr (MutableByteArray# RealWorld)

data Int = I# Int#

The UNPACK indicates to the compiler that it should unpack the
contents of a constructor field into the constructor itself, removing
a level of indirection. We'll end up with a structure that looks like
this:

data ByteString = PS Addr# ForeignPtrContents
                     Int#  -- offset
                     Int#  -- length

To compute the size of a ByteString, count the number of constructor
fields and add one (for the constructor itself). This is how many
words the value is going to use. In this case it's 5 words. In
addition we need to add the size of the ForeignPtrContents
constructor, which happens to be a PlainPtr in this case, so we add
two more words.

Finally we need to look at the definition of MutableByteArray# [3],
which is implemented by a C struct named StgArrWords:

typedef struct {
    StgHeader  header;
    StgWord    bytes;
    StgWord    payload[FLEXIBLE_ARRAY];
} StgArrWords;

The StgHeader takes one word (when compiling with profiling turned
off) so StgArrWords takes 2 words, plus the actual payload.

If we add it all up we get 9 words, plus the size of the payload (i.e.
the length of a word encoded using UTF-8 in your case).
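
The counting above can be written out as a small sketch. The helper
`closureWords` is a made-up name that encodes the "fields plus one"
rule of thumb from this email, assuming profiling is off:

```haskell
-- One header word for the constructor plus one word per field.
closureWords :: Int -> Int
closureWords fields = 1 + fields

-- Words used by one ByteString, excluding the payload bytes.
byteStringWords :: Int
byteStringWords =
      closureWords 4  -- PS: Addr#, ForeignPtrContents, offset, length
    + closureWords 1  -- PlainPtr: pointer to the MutableByteArray#
    + 2               -- StgArrWords: header word + byte count

main :: IO ()
main = print byteStringWords  -- 5 + 2 + 2 = 9
```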

Now let's look at the definition of HashMap [4]:

data HashMap k v
    = Bin {-# UNPACK #-} !SuffixMask
          !(HashMap k v)
          !(HashMap k v)
    | Tip {-# UNPACK #-} !Hash
          {-# UNPACK #-} !(FL.FullList k v)
    | Nil

We also need the definition of FullList [5]:

data FullList k v = FL !k v !(List k v)
data List k v = Nil | Cons !k v !(List k v)

For the sake of this discussion let's assume the tree is perfectly
balanced. For a HashMap of size N this means that we have N leaves
(Tip) and N-1 interior nodes (Bin). Each Bin takes 4 words.

The size of the Tip depends on the number of hash collisions. These
are quite rare so let's assume that the FullList only has one element.
Also, the Nil constructor is free as it can be shared by all instances
in the program. After applying the UNPACK pragmas the Tip constructor
looks like:

| Tip Int# !k v !(List k v)

This takes another 5 words.

Now that we know the overhead of both Bin and Tip we can compute the
overhead per key/value pair as: (5N + 4(N-1)) / N = 9 - 4/N ~= 9
words.

Given that an Int (not an Int#) takes 2 words, we can approximate the
memory cost of a key/value pair in a HashMap ByteString Int as

(9 + 9 + 2) * MachineWordSize + AvgByteStringLength

bytes.

For example, the average English word length is 5 characters, if you
include stop words. We'd expect to use

(9 + 9 + 2) * 8 + 5 = 165

bytes per unique word in our input corpus, on a 64-bit machine. Plus
any GC overhead. This is probably more than one would expect.
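
For anyone who wants to plug in their own numbers, the estimate can be
sketched as two small functions. The function names are made up for
this example and the constants are the ones derived above:

```haskell
-- HashMap overhead per key/value pair, in words, for a perfectly
-- balanced tree with n entries: n Tips (5 words) and n-1 Bins (4 words).
hashMapOverheadWords :: Double -> Double
hashMapOverheadWords n = (5 * n + 4 * (n - 1)) / n

-- Approximate bytes per entry in a HashMap ByteString Int:
-- 9 words (HashMap) + 9 words (ByteString) + 2 words (Int), plus payload.
bytesPerEntry :: Int -> Int -> Int
bytesPerEntry wordSize avgLen = (9 + 9 + 2) * wordSize + avgLen

main :: IO ()
main = do
    print (hashMapOverheadWords 1000000)  -- just under 9 words
    print (bytesPerEntry 8 5)             -- 64-bit machine: 165
    print (bytesPerEntry 4 5)             -- 32-bit machine: 85
```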

I'm working on switching HashMap to use another data structure, a Hash
Array Mapped Trie, in its implementation. This will bring the overhead
down from 9 words to about 4 words.

You could try using the lower overhead SmallString type from the
smallstring package [6]. It has an overhead of 4 words per string
(instead of 9 like ByteString). There's some runtime overhead involved
in converting a value (i.e. Text) to a SmallString. I don't know if
this overhead will be noticeable in your program.

1. http://code.haskell.org/bytestring/Data/ByteString/Internal.hs
2. https://github.com/ghc/packages-base/blob/master/GHC/ForeignPtr.hs
3. https://github.com/ghc/ghc/blob/master/includes/rts/storage/Closures.h
4. https://github.com/tibbe/unordered-containers/blob/master/Data/HashMap/Common.hs
5. https://github.com/tibbe/unordered-containers/blob/master/Data/FullList/Lazy.hs
6. http://hackage.haskell.org/package/smallstring

P.S. If your program indeed has a space leak this won't help you, but
it's a good way to figure out if your program uses a reasonable amount
of memory.

Cheers,
Johan



Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Johan Tibell
On Wed, Jun 1, 2011 at 4:24 PM, Aleksandar Dimitrov
aleks.dimit...@googlemail.com wrote:
 One additional thought: it might be interesting to provide this outside of
 this mailing list, perhaps as a documentation addendum to
 unordered-containers, since it really explains the size needs for HashMaps of
 ByteStrings to folks without knowledge of higher arcane wizardry.

I think it would be a good answer on StackOverflow, but no one asked
me this question there. I could list the size overhead of a HashMap in
the docs, but I'm about to change it so I won't do so until after the
change. I don't know how strong a guarantee I want to make in the docs
either, as it might constrain future improvements to the
implementation. Perhaps in an addendum, like you said.

 I ended up using Data.Text over SmallString, also because I need to do other
 operations on the words (case folding, mass matching, and possibly more) and
 Text seemed more attractive for these tasks, but I'll try using SmallString,
 too.

Text uses 6 words per value, plus the size of the UTF-16 encoded
content. There's a Google Summer of Code project this year to convert
Text to UTF-8, which should decrease the space usage. In addition,
Text values aren't pinned on the heap, unlike ByteStrings, so they
should be nicer to the GC. The lowest overhead string type you could
imagine (given how GHC implements ByteArray#) would have a 4 word
overhead. Text trades 2 extra words to support efficient slicing (just
like ByteString).

When Text uses UTF-8 internally it should be possible to convert Text
to/from SmallString in O(1) time as the underlying ByteArray# could
just be wrapped in a SmallString constructor instead of a Text
constructor. This means that you could freely convert HashMap keys to
SmallString to save some space.

 I think, overall, Text will probably use more memory than ByteString, but in
 my particular case, the problem wasn't the size of the data structure, but the
 fact that it seemed to retain chunks of the original input file.

Given that ByteString uses 3 words more than Text you'll probably use
about the same (or slightly less) amount of space, given your string
lengths.
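
As a rough sketch of that comparison, using the overheads quoted in
this thread (6 words for Text, 9 for ByteString) and assuming every
character takes one UTF-16 code unit in Text and one byte in a
ByteString. Allocation rounding is ignored:

```haskell
-- Approximate bytes for a string of n characters; w is the word size.
textBytes :: Int -> Int -> Int
textBytes w n = 6 * w + 2 * n

byteStringBytes :: Int -> Int -> Int
byteStringBytes w n = 9 * w + n

main :: IO ()
main = do
    -- A 5-character word on a 64-bit machine.
    print (textBytes 8 5, byteStringBytes 8 5)    -- (58,77)
    -- The two are equal at n = 3 * w characters.
    print (textBytes 8 24, byteStringBytes 8 24)  -- (96,96)
```

Under these assumptions Text is smaller for strings shorter than 24
characters on a 64-bit machine.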

 To me, strict byte strings were a high-performance black box I didn't think
 about. I thought, if I store stuff as a bytestring, and a strict one, no less,
 there's *no* way I could expect to perform any better by switching the string
 type. Turns out that was an unjustified assumption.

You only have to know very little to use ByteString. Knowing that it
does slicing by sharing the underlying buffer (just like Java's
Strings) is unfortunately one of those things. You can use the 'copy'
function to make sure the underlying storage matches the ByteString's
length.
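
A minimal sketch of the slicing issue. The buffer size and names are
arbitrary choices for the example:

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as C

main :: IO ()
main = do
    let big  = C.pack (replicate 1000000 'x')
        key  = S.take 5 big  -- a slice: keeps big's whole buffer alive
        key' = S.copy key    -- a fresh buffer holding just these 5 bytes
    print (S.length key, S.length key', key == key')
```

Both values compare equal and report the same length; the difference is
only in how much memory they retain, so storing `key'` rather than
`key` as a map key lets the large buffer be collected.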

 The wealth of string types in Haskell land seems kind of confusing to the
 newbie

I sympathize. My rule of thumb is: use Text for Unicode data and
ByteString for (large) byte blobs.

Cheers,
Johan



Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-06-01 Thread Johan Tibell
Hi Aleks,

On Wed, Jun 1, 2011 at 12:14 AM, Aleksandar Dimitrov
aleks.dimit...@googlemail.com wrote:
 I implemented your method, with these minimal changes (i.e. just using a main
 driver in the same file.)

 countUnigrams :: Handle -> IO (M.Map S.ByteString Int)
 countUnigrams = foldLines (\ m s -> M.insertWith (+) s 1 m) M.empty

 main :: IO ()
 main = do (f:_) <- getArgs
           openFile f ReadMode >>= countUnigrams >>= print . M.toList

 It seems to perform about 3x worse than the iteratee method in terms of time,
 and worse in terms of space :-( On Brandon's War & Peace example, hGetLine
 uses 1.565 seconds for the small file, whereas my iteratee method uses 1.085s
 for the small file, and around 2 minutes for the large file.

That's curious. I chatted with Duncan Coutts today and he mentioned
that hGetLine can be a bit slow as it needs to take a lock in every
read and causes some copying, which could explain why it's slower than
iteratee which works in blocks. However, I don't understand why it
uses more memory. The ByteStrings that are returned by hGetLine should
have an underlying storage of the same size as the ByteString (as
reported by length). You can try to verify this by calling 'copy' on
the ByteString before inserting it.

It looks like hGetLine needs some love.

 I also tried sprinkling strictness annotations throughout your above code,
 but I failed to produce good results :-(

The strictness of the code I gave should be correct. The problem
should be elsewhere.

 I, unfortunately, don't really have any contact to the elders, apart from
 what I read on their respective blogs…

You and everyone else. :) I just spent enough time talking to people
on IRC, reading good code, blogs and mailing list posts. I think Bryan
described the process pretty well in his CUFP keynote:

http://www.serpentine.com/blog/2009/09/23/video-of-my-cufp-keynote/

Cheers,
Johan



Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-01 Thread Johan Tibell
On Thu, Jun 2, 2011 at 5:10 AM, Jason Dagit dag...@gmail.com wrote:
 One of the cool things about SO is that you can answer your own
 question.  For example, you might do that if you're anticipating an
 FAQ.  I think asking this question on SO and reposting your answer
 from this thread would be great.

Good to know.

I've decided to stick it in a blog post, add some pictures, and
elaborate some more (e.g. provide numbers for all containers and for
Text, so people can refer to the post when needed).

-- Johan



Re: [Haskell-cafe] ANN: unordered-container 0.1.3.0

2011-06-04 Thread Johan Tibell
Hi Joachim,

On Sat, Jun 4, 2011 at 2:23 PM, Joachim Breitner nome...@debian.org wrote:
 Hi Johan,

 Am Donnerstag, den 05.05.2011, 23:02 +0200 schrieb Johan Tibell:
 I've just uploaded a new version of the unordered-containers package,
 a package of fast hashing-based container types.

 I was looking into using HashMap in a tool for the Debian Haskell Group
 that currently uses a somewhat large Map, but I found some functions
 missing that I was using:
  * unions
  * element
  * unionWith
  * mapWithKey

 Do you plan to provide an API as similar to Map as possible, and those
 functions just have not been implemented yet, or will those likely never
 be added?

I plan to add those functions. I just haven't gotten around to it
(hacking on GHC at the moment). Feel free to send patches.

-- Johan



Re: [Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-06-09 Thread Johan Tibell
On Thu, Jun 2, 2011 at 7:52 AM, Johan Tibell johan.tib...@gmail.com wrote:
 I've decided to stick it in a blog post, add some pictures, and
 elaborate some more (e.g. provide numbers for all containers and for
 Text, so people can refer to the post when needed).

I ended up writing two blog posts:

Computing the size of a HashMap
http://blog.johantibell.com/2011/06/computing-size-of-hashmap.html

Memory footprints of some common data types
http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.html

Johan



Re: [Haskell-cafe] Beginners Question: Problem with Data Type Declaration

2011-06-16 Thread Johan Tibell
Hi,

On Thu, Jun 16, 2011 at 9:53 AM, kaffeepause73 kaffeepaus...@yahoo.de wrote:
 I try to create an own data type containing Vector Double from the H-Matrix
 package. The code:

 ##

 data PowerSig = PowerSig Int Double Vector Double

You need to put parentheses around (Vector Double). Otherwise this is
interpreted as a constructor with 4 fields (instead of 3).
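
A self-contained sketch of the corrected declaration. The `Vector`
newtype below is only a stand-in so the example compiles without the
hmatrix package:

```haskell
-- Stand-in for hmatrix's Vector type (an assumption for this sketch).
newtype Vector a = Vector [a]
    deriving Show

-- Three fields: an Int, a Double, and a Vector of Doubles.
data PowerSig = PowerSig Int Double (Vector Double)
    deriving Show

main :: IO ()
main = print (PowerSig 1 50.0 (Vector [0.1, 0.2]))
```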

Johan



Re: [Haskell-cafe] Data.Map: Values to keys and keys to values

2011-06-16 Thread Johan Tibell
On Thu, Jun 16, 2011 at 3:01 PM, Dmitri O.Kondratiev doko...@gmail.com wrote:
 Hi,
 Data.Map has many great functions, yet I could not find the one that allows
 from one map create another map where keys are values and values are keys of
 the first one.
 Something like:
 transMap :: (Ord k, Ord a) => Map k a -> Map a k

 Does such function exist?

Note that such a function would be lossy as there might be duplicate
values in the map.

Cheers,
Johan



Re: [Haskell-cafe] Data.Map: Values to keys and keys to values

2011-06-16 Thread Johan Tibell
On Thu, Jun 16, 2011 at 3:01 PM, Dmitri O.Kondratiev doko...@gmail.com wrote:
 Hi,
 Data.Map has many great functions, yet I could not find the one that allows
 from one map create another map where keys are values and values are keys of
 the first one.
 Something like:
 transMap :: (Ord k, Ord a) => Map k a -> Map a k

I don't think implementing this function in the library would add much
as it cannot be implemented more efficiently with access to the
internal representation than it can using the public API. Just write

transMap = M.fromList . map swap . M.toList

and stick it in some utility file.
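
Spelled out with imports, as a sketch that assumes only the containers
package. The second call shows what happens when two keys share a
value: only one of them survives the inversion.

```haskell
import qualified Data.Map as M
import Data.Tuple (swap)

transMap :: (Ord k, Ord a) => M.Map k a -> M.Map a k
transMap = M.fromList . map swap . M.toList

main :: IO ()
main = do
    -- Distinct values: every entry survives.
    print (transMap (M.fromList [(1 :: Int, 'a'), (2, 'b')]))
    -- fromList [('a',1),('b',2)]

    -- Duplicate values: fromList keeps the last pair for a repeated key.
    print (transMap (M.fromList [(1 :: Int, 'a'), (2, 'a')]))
    -- fromList [('a',2)]
```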

Johan



Re: [Haskell-cafe] [ANN] mysql-simple - your go-to package for talking to MySQL

2011-06-21 Thread Johan Tibell
On Tue, Jun 21, 2011 at 2:34 PM, cheater cheater cheate...@gmail.com wrote:
 Hi,
 does the package adhere to some form of standard API that works the
 same way across other similar packages (different mysql drivers,
 postgres, mongo, couch, etc)?

 Is there such a standard for haskell?

Not at the moment. I believe Bryan has at least talked with one other
author (of a PostgreSQL binding) about perhaps sharing an API in the
future.

My opinion is that we should wait to consolidate APIs/standardize
interfaces until we actually have an idea what a good Haskell API for
databases looks like. To know that we need to see some different
ones.

Cheers,
Johan



Re: [Haskell-cafe] [ANN] mysql-simple - your go-to package for talking to MySQL

2011-06-21 Thread Johan Tibell
On Tue, Jun 21, 2011 at 4:47 PM, David Virebayre
dav.vire+hask...@gmail.com wrote:
 The problem isn't with the stored procedure, it works if I call it
 from the mysql client.

Does mysql-simple support stored procedures?

Johan



Re: [Haskell-cafe] Hackage Server not reachable

2011-06-22 Thread Johan Tibell
Hi Stuart,

On Wed, Jun 22, 2011 at 11:02 AM, Stuart Coyle stuart.co...@gmail.com wrote:
 I cannot reach the hackage server so cabal can't download packages.
 Have I the correct address?
 http://hackage.haskell.org

Yes.

 stuart@rumbaba:~# resolveip hackage.haskell.org
 IP address of hackage.haskell.org is 69.30.63.204
 I also cannot access any of the Hackage web pages.
 I suspect that my ISP may be filtering or that their DNS is giving me the
 wrong info.

http://www.downforeveryoneorjustme.com/http://hackage.haskell.org

It looks like the problem is on your end.

Cheers,
Johan



Re: [Haskell-cafe] Homework help - calculator function

2011-06-22 Thread Johan Tibell
On Wed, Jun 22, 2011 at 10:39 PM, SM Design social_me...@abv.bg wrote:
  I have a homework which is very important to be done but I can't complete
 the task at all. The program i should write is:

http://www.haskell.org/haskellwiki/Homework_help



Re: [Haskell-cafe] Inconsistent trailing comma in export list and record syntax

2011-07-11 Thread Johan Tibell
On Mon, Jul 11, 2011 at 11:51 AM, Sjoerd Visscher sjo...@w3future.com wrote:

 On Jul 11, 2011, at 11:42 AM, Jack Henahan wrote:

 Well, for your example frustration, the leading comma style would sort your 
 problem nicely. As for the particulars… hmm, not sure. I use leading commas 
 for both, so I never really noticed.

 That just shifts the problem, I think? Now you can no longer comment out the 
 first line.

I've found this quite annoying, especially when using CPP to
conditionally include something in a list, as it might force you to
reorder the list to make the commas appear correctly when the
conditional section is enabled/disabled.

Johan



Re: [Haskell-cafe] Inconsistent trailing comma in export list and record syntax

2011-07-11 Thread Johan Tibell
On Mon, Jul 11, 2011 at 3:54 PM, Henning Thielemann
thunderb...@henning-thielemann.de wrote:
 Johan Tibell wrote:
 I've found this quite annoying, especially when using CPP to
 conditionally include something in a list, as it might force you to
 reorder the list to make the commas appear correctly when the
 conditional section is enabled/disabled.

 In this case I use the colon like a terminator.

 http://www.haskell.org/haskellwiki/List_notation

Sorry, I wasn't being clear. I meant for import/export lists.

Johan



[Haskell-cafe] Please take the State of Haskell, 2011 survey

2011-07-17 Thread Johan Tibell
Hi all,

I've put together a quick, 12-question State of Haskell, 2011 survey:

http://blog.johantibell.com/2011/07/its-time-for-this-years-state-of.html

The survey will hopefully give us some insight into how people use
Haskell and perhaps also some ideas on how Haskell tools and libraries
could be improved.

This is the second year I run this survey. New from last year are
specific questions on library support and reasoning about run-time
performance.

P.S. Please direct replies to this email to haskell-cafe@haskell.org.

Cheers,
Johan



Re: [Haskell-cafe] Analyzing slow performance of a Haskell program

2011-08-09 Thread Johan Tibell
Hi Chris,

On Tue, Aug 9, 2011 at 12:47 PM, Chris Yuen kizzx2+hask...@gmail.com wrote:
 1. Why are bangs needed on the length arrays?

 If I remove them from below, performance drops 10%. I thought `unsafeIndex`
 is straight in both arguments, no?

 wordLength i = go i
   where
     go n
       | n < 10 = lengthOnes !! n
       | n < 20 = lengthTeens !! (n-10)
       | n < 100 = (lengthTens !! (n // 10)) + (lengthOnes !! (n % 10))
       | n < 1000 = (lengthOnes !! (n // 100)) + 7 + go (n % 100)
       | n < 1000000 = go (n // 1000) + 8 + go (n % 1000)
       | otherwise = go (n // 1000000) + 7 + go (n % 1000000)
     !lengthOnes = lengthVec ones
     !lengthTens = lengthVec tens
     !lengthTeens = lengthVec teens

(It's strict, not straight.)

The different lengths are not used in all branches and since Haskell
is a lazy (or to be pedantic: non-strict) language we cannot compute
them before knowing which branch will be evaluated. For example, given
that we have

ones = ...
tens = error "Boom!"

test = wordLength 0

evaluating 'test' should not cause an exception to be raised as the
first (n < 10) branch is taken, but it would if lengthTens was strict.

Delaying the evaluation has some costs, namely allocating a thunk for
e.g. `lengthVec ones` and later evaluating that thunk. By making the
lengths strict we can evaluate them earlier and avoid some allocation
and forcing of thunks.
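
As a minimal sketch of the difference (the names lazyPick, strictPick and
throws are made up for this illustration and are not part of the program
under discussion): a lazy where-binding is only evaluated if the selected
branch uses it, while a banged binding is forced on entry no matter which
guard is taken.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.Exception (SomeException, evaluate, try)

-- Lazy where-bindings: 'boom' is only evaluated if a guard that uses it
-- is selected, so small inputs never touch it.
lazyPick :: Int -> Int
lazyPick n
  | n < 10    = small
  | otherwise = boom
  where
    small = n + 1
    boom  = error "Boom!"

-- Strict where-bindings: the bangs force both bindings before any guard
-- is examined, so 'boom' is evaluated even when only 'small' is needed.
strictPick :: Int -> Int
strictPick n
  | n < 10    = small
  | otherwise = boom
  where
    !small = n + 1
    !boom  = error "Boom!"

-- Helper (hypothetical): True if forcing the value to WHNF raises an exception.
throws :: Int -> IO Bool
throws x = do
  r <- try (evaluate x) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

Here `throws (lazyPick 0)` yields False and `throws (strictPick 0)` yields
True, which is the semantic price paid for skipping the thunk allocation.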

 2. Why the single element worker wrapper pattern (`go` functions) increases
 performance?

 If we change wordLength to

 wordLength n
   | n < 10 = lengthOnes !! n
   | n < 20 = lengthTeens !! (n-10)
   | n < 100 = (lengthTens !! (n // 10)) + (lengthOnes !! (n % 10))
   | n < 1000 = (lengthOnes !! (n // 100)) + 7 + wordLength (n % 100)
   | n < 1000000 = wordLength (n // 1000) + 8 + wordLength (n % 1000)
   | otherwise = wordLength (n // 1000000) + 7 + wordLength (n % 1000000)
   where
     !lengthOnes = lengthVec ones
     !lengthTens = lengthVec tens
     !lengthTeens = lengthVec teens

 The performance drops by another 10%. This really surprised me. `go i`
 seemed obvious to me and I don't understand how it could make any
 difference. The full source code is available to GHC so it shouldn't be
 related to call-by-pointer problem? If this is the case, shouldn't we always
 wrap a go function for **any** recursive functions?

Making wordLength non-recursive lets GHC inline it, which can
sometimes help performance (e.g. if the inlining enables more
optimizations). Inlining does increase code size (and sometimes
allocation if a closure has to be allocated to capture free
variables), so it's not always a good idea.
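
A small sketch of the two shapes, using a made-up digit-sum function
(digitSumRec and digitSum are illustrative names, and non-negative input
is assumed). Whether the wrapper form is actually faster depends on the
program, so treat this as the pattern only, not as a benchmark.

```haskell
-- Directly recursive: the function calls itself, so GHC will generally
-- not inline it at its call sites.
digitSumRec :: Int -> Int
digitSumRec n
  | n < 10    = n
  | otherwise = n `mod` 10 + digitSumRec (n `div` 10)

-- Wrapper plus local worker: 'digitSum' itself is non-recursive and is
-- therefore a candidate for inlining; the recursion lives in 'go'.
digitSum :: Int -> Int
digitSum i = go i
  where
    go n
      | n < 10    = n
      | otherwise = n `mod` 10 + go (n `div` 10)
```

Both functions compute the same result; only the shape visible to the
inliner differs.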

Cheers,
Johan



Re: [Haskell-cafe] Please take the State of Haskell, 2011 survey

2011-08-22 Thread Johan Tibell
[bcc: haskell@, beginners@]

Hi all,

On Sun, Jul 17, 2011 at 1:21 PM, Johan Tibell johan.tib...@gmail.com wrote:
 I've put together a quick, 12-question State of Haskell, 2011 survey:

    http://blog.johantibell.com/2011/07/its-time-for-this-years-state-of.html

 The survey will hopefully give us some insight into how people use
 Haskell and perhaps also some ideas on how Haskell tools and libraries
 could be improved.

The results of this survey are now available:

http://blog.johantibell.com/2011/08/results-from-state-of-haskell-2011.html

Cheers,
Johan



Re: [Haskell-cafe] Please take the State of Haskell, 2011 survey

2011-08-23 Thread Johan Tibell
On Tue, Aug 23, 2011 at 10:01 PM, David Pollak
feeder.of.the.be...@gmail.com wrote:
 If you need any help with the tutorial, I might be able to help.  Beginning
 Scala is reputed to be approachable by a broad range of developers and I'd
 be happy to try to apply my approach in Beginning Scala to Haskell
 (although, I stand in slack-jawed awe of both Learn you a Haskell and Real
 World Haskell which are both amazing works.)

I'd be happy to get some help. I'll let you know if I find the time to
start writing it.

-- Johan



Re: [Haskell-cafe] Performance of concurrent array access

2011-08-23 Thread Johan Tibell
On Tue, Aug 23, 2011 at 10:04 PM, Andreas Voellmy
andreas.voel...@gmail.com wrote:
 data DAT = DAT (IOArray Int32 Char)

Try to make this a newtype instead. The data type adds a level of indirection.

    do let p j c = insertDAT a j c >> lookupDAT a j >>= \v -> v `pseq` return ()

You most likely want (insertDAT a j $! c) to make sure that the
element is forced, to avoid thunks building up in the array.

 -- Parameters
 arraySize :: Int32

Int might work better than Int32. While they should behave the same on
32-bit machines, Int might have a few more rewrite rules that make it
optimize better.
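
Put together, the three suggestions could look roughly like this. The
signatures of insertDAT and lookupDAT are guessed from the quoted snippet,
and newDAT is a hypothetical helper, so this is a sketch rather than the
original program.

```haskell
import Data.Array.IO (IOArray, newArray, readArray, writeArray)

-- A newtype has no runtime representation of its own, so there is no
-- extra indirection compared to using the IOArray directly.
-- Indices are Int rather than Int32.
newtype DAT = DAT (IOArray Int Char)

-- Hypothetical constructor: an array of the given size filled with spaces.
newDAT :: Int -> IO DAT
newDAT size = DAT <$> newArray (0, size - 1) ' '

-- ($!) forces the element to WHNF before the write, so the array never
-- holds an unevaluated thunk.
insertDAT :: DAT -> Int -> Char -> IO ()
insertDAT (DAT arr) i c = writeArray arr i $! c

lookupDAT :: DAT -> Int -> IO Char
lookupDAT (DAT arr) i = readArray arr i
```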

-- Johan



Re: [Haskell-cafe] Binding a socket to all interfaces

2011-09-21 Thread Johan Tibell
Hi Michael,

Kazu recently fixed this (in the stable branch on GitHub) in
Network.listenOn but perhaps the more basic Network.Socket.listen should
also be changed. Let's discuss what's the right thing to do in this thread.

On Wed, Sep 21, 2011 at 1:38 PM, Michael Snoyman mich...@snoyman.com wrote:

 Hi,

 One of the recurring issues that comes up in Warp is binding to IPv4
 versus IPv6 hosts. Our current code is available at [1]. It was
 updated to look like that in this commit [2] in order to support both
 IPv4 and IPv6 hosts by default. However, now it seems that on Debian
 and FreeBSD, it *only* responds to IPv6 by default[3][4]. I'm frankly
 stumped at this point on how to have our cake and eat it too.

 Does anyone have an idea of the correct incantation to get Warp to do
 the Right Thing(tm) here? And if not, is there any advice on sensible
 default behavior? I'm considering allowing a few special host values:

 * * (default, what we have now): Make this bind to IPv4
 * ipv4: Again, bind to IPv4. Guaranteed not to change in the future
 * ipv6: Bind to IPv6.

 Michael

 [1]
 https://github.com/yesodweb/wai/blob/master/warp/Network/Wai/Handler/Warp.hs#L119
 [2]
 https://github.com/snoyberg/warp/commit/02c1396c86e3fceb48cbe7df58cb631c804e24d4
 [3] https://github.com/snoyberg/warp/issues/9
 [4]
 http://stackoverflow.com/questions/7486257/yesod-devel-server-only-listening-on-ipv6




Re: [Haskell-cafe] Binding a socket to all interfaces

2011-09-21 Thread Johan Tibell
Hi,

On Wed, Sep 21, 2011 at 7:38 PM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Johan's observation is correct. Network.listenOn is already fixed but
 Network.Socket.listen, which Warp relies on, is not fixed yet. I will
 try to fix it. When the next version of the network library will be
 released, the problem will disappear, I hope.


We should consider how we fix this. Right now N.S.listen just wraps the
underlying system call. Is that the right place to set socket options?
Perhaps we should set them when creating the socket instead?

-- Johan


Re: [Haskell-cafe] Binding a socket to all interfaces

2011-09-28 Thread Johan Tibell
I've released a new version of network, 2.3.0.6, that contains the fix.

On Wed, Sep 28, 2011 at 1:34 AM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Hello,

 Sorry for the delay but I made a patch and sent a pull request:

https://github.com/haskell/network/pull/18

 After consideration, I realized that Johan's opinion is better.
 Please read the comment of this request above.

 When the next network package will be released, this problem will
 disappear, I hope. We don't have to change Warp at all.

 --Kazu

  Hi,
 
  We should consider how we fix this. Right now N.S.listen just wraps the
  underlying system call. Is that the right place to set socket options?
 Perhaps
  we should set them when creating the socket instead?
 
  Yes, of course.
 
  If I remember correctly, this option works only between socket() and
  listen(). I need to check that this option is effective to all sockets
  or only to listening sockets. Anyway, I will try this in the next week.
 
  I used to be an expert of IPv6 but I forget many things recently...
  I should remember.
 
  --Kazu
 



[Haskell-cafe] network package Windows co-maintainer/buildbot needed (Was: network 2.3.05 on windows 7 with ghc-7.2.1)

2011-09-28 Thread Johan Tibell
Let me take this opportunity to ask for a co-maintainer that can help me
keep the network package working on Windows. I don't have easy access to a
Windows machine (or even VM) anymore so testing on Windows is hard. What I'd
really like is a buildbot that builds the following Jenkins job as a slave
node:

http://ci.johantibell.com/job/network/

config.xml:

<?xml version='1.0' encoding='UTF-8'?>
<matrix-project>
  <actions/>
  <description></description>
  <keepDependencies>false</keepDependencies>
  <properties/>
  <scm class="hudson.plugins.git.GitSCM">
    <configVersion>2</configVersion>
    <userRemoteConfigs>
      <hudson.plugins.git.UserRemoteConfig>
        <name>origin</name>
        <refspec>+refs/heads/*:refs/remotes/origin/*</refspec>
        <url>https://github.com/haskell/network.git</url>
      </hudson.plugins.git.UserRemoteConfig>
    </userRemoteConfigs>
    <branches>
      <hudson.plugins.git.BranchSpec>
        <name>stable</name>
      </hudson.plugins.git.BranchSpec>
    </branches>
    <recursiveSubmodules>false</recursiveSubmodules>
    <doGenerateSubmoduleConfigurations>false</doGenerateSubmoduleConfigurations>
    <authorOrCommitter>false</authorOrCommitter>
    <clean>false</clean>
    <wipeOutWorkspace>false</wipeOutWorkspace>
    <pruneBranches>false</pruneBranches>
    <remotePoll>false</remotePoll>
    <buildChooser class="hudson.plugins.git.util.DefaultBuildChooser"/>
    <gitTool>Default</gitTool>
    <browser class="hudson.plugins.git.browser.GithubWeb">
      <url>https://github.com/haskell/network/</url>
    </browser>
    <submoduleCfg class="list"/>
    <relativeTargetDir></relativeTargetDir>
    <excludedRegions></excludedRegions>
    <excludedUsers></excludedUsers>
    <gitConfigName></gitConfigName>
    <gitConfigEmail></gitConfigEmail>
    <skipTag>false</skipTag>
    <scmName></scmName>
  </scm>
  <canRoam>true</canRoam>
  <disabled>false</disabled>
  <blockBuildWhenDownstreamBuilding>false</blockBuildWhenDownstreamBuilding>
  <blockBuildWhenUpstreamBuilding>false</blockBuildWhenUpstreamBuilding>
  <triggers class="vector">
    <hudson.triggers.SCMTrigger>
      <spec>*/5 * * * *</spec>
    </hudson.triggers.SCMTrigger>
  </triggers>
  <concurrentBuild>false</concurrentBuild>
  <axes>
    <hudson.matrix.TextAxis>
      <name>compiler</name>
      <values>
        <string>ghc-6.10.4</string>
        <string>ghc-6.12.3</string>
        <string>ghc-7.0.3</string>
      </values>
    </hudson.matrix.TextAxis>
  </axes>
  <builders>
    <hudson.tasks.Shell>
      <command>cabal install -w $compiler --only-dependencies --enable-tests
cabal clean
autoreconf
cabal configure -w $compiler --enable-tests
cabal build
cabal test
cabal sdist</command>
    </hudson.tasks.Shell>
  </builders>
  <publishers>
    <hudson.tasks.Mailer>
      <recipients>johan.tib...@gmail.com</recipients>
      <dontNotifyEveryUnstableBuild>false</dontNotifyEveryUnstableBuild>
      <sendToIndividuals>false</sendToIndividuals>
    </hudson.tasks.Mailer>
  </publishers>
  <buildWrappers/>
  <runSequentially>false</runSequentially>
</matrix-project>


Re: [Haskell-cafe] SMP parallelism increasing GC time dramatically

2011-10-05 Thread Johan Tibell
On Wed, Oct 5, 2011 at 2:37 PM, Tom Thorne thomas.thorn...@gmail.com wrote:

 The only problem is that now I am getting random occasional segmentation
 faults that I was not getting before, and once got a message saying:
 Main: schedule: re-entered unsafely
 Perhaps a 'foreign import unsafe' should be 'safe'?
 I think this may be something to do with creating a lot of sparks though,
 since this occurs whether I have the parallel GC on or not.


Unless you (or some library you're using) are doing what the error message
says then you should file a GHC bug here:

http://hackage.haskell.org/trac/ghc/

-- Johan


Re: [Haskell-cafe] proper way to generate a random data in criterion

2011-10-19 Thread Johan Tibell
Hi,

On Wed, Oct 19, 2011 at 1:13 AM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Hello,

 I'm measuring performance of the insertion operation of red-black
 trees. For input, three kinds of [Int] are prepared: increasing
 order, decreasing order, and random.

 The random case is 4 or 5 times slower than the others. I'm afraid
 that my program also measured the cost of random Int generation.

 My benchmark code can be found:


 https://github.com/kazu-yamamoto/llrbtree/blob/master/bench/insert/Bench.hs

 Does anyone kindly take a look and tell me whether or not my criterion
 code measures the cost of random Int generation?


It does. You need to use evaluate to make sure the result of ensure is
actually evaluated.


 If so, would you suggest how to avoid it?


Have a look at:


https://github.com/tibbe/unordered-containers/blob/master/benchmarks/Benchmarks.hs

-- Johan


Re: [Haskell-cafe] proper way to generate a random data in criterion

2011-10-19 Thread Johan Tibell
On Wed, Oct 19, 2011 at 12:21 PM, Gregory Collins
g...@gregorycollins.net wrote:

 On Wed, Oct 19, 2011 at 5:03 PM, Johan Tibell johan.tib...@gmail.com
 wrote:
 
  It does. You need to use evaluate to make sure the result of ensure is
  actually evaluated.
 

 I'm almost certain you're wrong about this. The bang pattern on the
 return from ensure (!r1 <- ensure $ ...) forces r1 to WHNF, which goes
 through deepseq, and thus the whole list is forced. See
 https://gist.github.com/1299380 for a short counterexample.


I should have paid more attention; I missed the bangs on the bindings.

I still recommend the pattern I linked in my previous email. If you want to
do it the way you currently do, use

let !foo = xs `deepseq` xs

no return needed.
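
A self-contained sketch of both ways to fully evaluate benchmark input
before timing starts. mkInput is a hypothetical stand-in for the random
list in the benchmark, and prepareLet and prepareEvaluate are names
invented for this illustration.

```haskell
{-# LANGUAGE BangPatterns #-}

import Control.DeepSeq (deepseq, force)
import Control.Exception (evaluate)

-- Hypothetical input generator standing in for the random [Int].
mkInput :: Int -> [Int]
mkInput n = [ (i * 7919) `mod` 1009 | i <- [1 .. n] ]

-- Variant 1: the bang forces the binding to WHNF, and because of
-- 'deepseq' reaching WHNF requires the whole list to be evaluated.
prepareLet :: Int -> IO [Int]
prepareLet n = do
  let ys  = mkInput n
      !xs = ys `deepseq` ys
  return xs

-- Variant 2: 'force' builds a value whose WHNF is the fully evaluated
-- list, and 'evaluate' forces that WHNF inside IO.
prepareEvaluate :: Int -> IO [Int]
prepareEvaluate n = evaluate (force (mkInput n))
```

Either way the generation cost is paid before the measured action runs,
so it does not show up in the timings.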


Re: [Haskell-cafe] Who is working on high performance threadsafe mutable data structures in Haskell?

2011-10-27 Thread Johan Tibell
Hi,

On Thu, Oct 27, 2011 at 8:45 AM, Ryan Newton rrnew...@gmail.com wrote:

 Based a quick perusal of Hackage there does not seem to be a lot of work in
 this area.  Of course, for Haskell the importance of this topic may be
 diminished relative to pure data structures, but for doing systems-level
 work like monad par good concurrent data structures are also very important.



Gregory Collins and I have wanted a fast lock-free hashtable for some
time. There's also a priority queue (inside an IORef) in the I/O manager
that could use a replacement. Note that this priority queue needs to support
access both by priority and key.


 We are about to embark on some work to fix this problem for monad-par 
 Deques, but if there are others working in this vicinity it would be nice to
 team up.
 We are going to try both pure Haskell approaches using the new
 casMutVar# primop as well as wrapping foreign data structures such as those
 provided by TBB.  There are a whole bunch of issues with the latter -- see
 Appendix A and help me out if you know how to do this.


You could try the FFI approach if it's not too much work but I expect it to
perform worse than a native Haskell version. It'd also be bad for the GC,
which doesn't like having lots of small pinned objects as they fragment the
heap. I'd rather look into what primops/compiler optimizations we're lacking
to be able to do this well from within Haskell.

-- Johan


Re: [Haskell-cafe] Hackage feature request: E-mail author when a package breaks

2011-10-31 Thread Johan Tibell
On Mon, Oct 31, 2011 at 12:08 AM, Gregory Crosswhite
gcrosswh...@gmail.com wrote:

 I have uploaded a number of small packages to Hackage that I no longer
 actively use so that I don't find out immediately when a new version of GHC
 has broken them.  Since Hackage is going to the trouble of finding out when
 a package no longer builds anyway, could it have a feature where when a
 working package breaks with a new version of GHC the author is
 automatically e-mailed?  This would make me (and probably others) a lot
 more likely to notice and proactively fix broken packages.  (Heck, I
 wouldn't even necessarily mind being nagged about it from time to time.
  :-) )


If done well I think this is a good idea. Currently I have my buildbot
email me whenever a package breaks (although the bot doesn't automatically
install new GHCs).

-- Johan


Re: [Haskell-cafe] The type class wilderness + Separating instances and implementations into separate packages

2011-11-02 Thread Johan Tibell
These are all very good questions! Here's my stab at it:

On Wed, Nov 2, 2011 at 11:28 AM, Ryan Newton rrnew...@gmail.com wrote:

 What is the right interface for a queue?  What is the right interface for
 a random number generator?


For any given class I'd try to get a few experts/interested parties
together and discuss.


 I don't know, but in both cases you will find many packages on hackage
 offering different takes on the matter.  In fact, there is a wilderness of
 alternative interfaces.  We've had various discussions on this list about
 the number of alternative packages.


The lack of cohesion in our library offerings is a problem and so is the
lack of interfaces. We end up programming against concrete types way too
often.


 I'm fine with lots of packages, but I think it would be great if not every
 package introduced a new interface as well as a new implementation.  If we
 could agree as a community on common interfaces to use for some basics,
 that would probably go a long way towards taming the type class wilderness.
  People have mentioned this problem before with respect to Collections
 generally.


Aside: The problem with collections is that we don't have the programming
language means to do this well yet (although soon!). The issue is that we
want to declare a type class where the context of the methods depends on
the instance e.g.

class MapLike m where
    type Ctx :: Context  -- Can't do this today!
    insert :: Ctx => k -> v -> m -> m

Java et al. cheat in their container hierarchy by doing unsafe casts (i.e.
they never solved this problem)!
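
For reference, a sketch of how this idea can be written with the
ConstraintKinds and TypeFamilies extensions found in later GHC versions.
The exact class shape (a two-parameter container m k v, and the Assoc
association-list type) is an assumption made for this example, not an
agreed-upon interface.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Constraint, Type)
import qualified Data.Map as Map

class MapLike m where
  -- Each instance picks the constraint it needs on the key type.
  type Ctx m (k :: Type) :: Constraint
  insert :: Ctx m k => k -> v -> m k v -> m k v

-- Data.Map needs ordered keys.
instance MapLike Map.Map where
  type Ctx Map.Map k = Ord k
  insert = Map.insert

-- A naive association list (hypothetical type) only needs key equality.
newtype Assoc k v = Assoc [(k, v)]

instance MapLike Assoc where
  type Ctx Assoc k = Eq k
  -- Drop any existing entry for the key, then add the new one in front.
  insert k v (Assoc xs) = Assoc ((k, v) : filter ((/= k) . fst) xs)

assocToList :: Assoc k v -> [(k, v)]
assocToList (Assoc xs) = xs
```

Code written against MapLike then carries a `Ctx m k` constraint instead
of committing to Ord or Eq up front.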


 One basic part of reaching such a goal is separating interface from
 implementation.  I ran into the following problems just  in the last 24
 hours.  In both cases I wanted to use a type class, but didn't want to
 depend on the whole package it lived in:

- I wanted to use the Benchmarkable class in Criterion in my package.
 (Criterion deserving to be a standard package.)  But I can't get that
typeclass without depending on the whole Criterion package, which has
several dependencies.  And in fact on the machine I was on at the time some
of those dependencies were broken, so I decided not to use Benchmarkable.
- I wanted to use, or at least support, an existing class for Queues.
 I found the following:


 http://hackage.haskell.org/packages/archive/queuelike/1.0.9/doc/html/Data-MQueue-Class.html


I think the best option at the moment is to break out type classes in their
own packages. That's what I did with hashable.

How can we enumerate packages that at least purport to provide standard
 interfaces that you should both use and pick up to implement?  On a Wiki
 page?


I would hope that we could get all the important interfaces into the
Haskell Platform eventually (and have all packages there use them).

-- Johan


Re: [Haskell-cafe] System calls and Haskell threads

2011-11-03 Thread Johan Tibell
On Thu, Nov 3, 2011 at 8:35 AM, Andreas Voellmy
andreas.voel...@gmail.com wrote:

 I just read Kazu Yamamoto's article on a high performance web server in
 the latest Monad.Reader, and I came across a statement that doesn't sound
 correct to me. He says:

 When a user thread issues a system call, a context switch occurs. This
 means that all Haskell user threads stop, and instead the kernel is given
 the CPU time. 

 Is this right? I thought that when a system call is made by a Haskell
 thread being run by a particular worker thread on a CPU, other runnable
 Haskell threads in the run queues of the HECs for other CPUs can continue
 running concurrently (provided we've run our Haskell program with multiple
 CPUs using the -Nx RTS argument). That's what I understood from the
 discussion of foreign calls in Runtime Support for Multicore Haskell.


That's correct. Blocking syscalls will not prevent other Haskell threads
from running. IIRC it will block the OS thread used to run the Haskell
thread making the blocking syscall, but the RTS always has one free OS
thread (i.e. it will allocate more if needed) that it can use to continue
running other Haskell threads with. Your best reference is probably
Extending the Haskell Foreign Function Interface with Concurrency.

-- Johan


Re: [Haskell-cafe] zlib build failure on recent GHC

2011-11-07 Thread Johan Tibell
On Mon, Nov 7, 2011 at 7:53 AM, Ben Gamari bgamari.f...@gmail.com wrote:

 With GHC 1ece7b27a11c6947f0ae3a11703e22b7065a6b6c zlib fails to build,
 apparently due to Safe Haskell (bug 5610 [1]). The error is specifically,

 $ cabal install zlib
 Resolving dependencies...
 Configuring zlib-0.5.3.1...
 Preprocessing library zlib-0.5.3.1...
 Building zlib-0.5.3.1...
 [1 of 5] Compiling Codec.Compression.Zlib.Stream (
 dist/build/Codec/Compression/Zlib/Stream.hs,
 dist/build/Codec/Compression/Zlib/Stream.o )

 Codec/Compression/Zlib/Stream.hsc:857:1:
    Unacceptable argument type in foreign declaration: CInt
    When checking declaration:
      foreign import ccall unsafe "static zlib.h inflateInit2_"
        c_inflateInit2_
          :: StreamState -> CInt -> Ptr CChar -> CInt -> IO CInt

 Codec/Compression/Zlib/Stream.hsc:857:1:
    Unacceptable argument type in foreign declaration: CInt
    When checking declaration:
      foreign import ccall unsafe "static zlib.h inflateInit2_"
        c_inflateInit2_
          :: StreamState -> CInt -> Ptr CChar -> CInt -> IO CInt

 Codec/Compression/Zlib/Stream.hsc:857:1:
    Unacceptable result type in foreign declaration: IO CInt
    Safe Haskell is on, all FFI imports must be in the IO monad
    When checking declaration:
      foreign import ccall unsafe "static zlib.h inflateInit2_"
        c_inflateInit2_
          :: StreamState -> CInt -> Ptr CChar -> CInt -> IO CInt
 ...

 This is a little strange since,

  a) It's not clear why Safe Haskell is enabled
  b) The declarations in question seem to be valid

 Does this seem like a compiler issue to you?


This is due to a change in how FFI imports and newtypes work. GHC was
recently changed to not allow you to use newtypes in FFI imports unless the
constructor of the newtype is in scope. This broke quite a few libraries. I
have patched a few of them and I've sent a patch to the zlib maintainer.
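
The typical fix is a one-line change to the import list. A minimal
sketch, using the C library's abs rather than anything from zlib (c_abs
and absInt are names chosen for this example):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

-- Importing 'CInt (..)' brings the newtype constructor into scope.
-- Importing only the type name, 'CInt', is what the new check rejects
-- when the type appears in a foreign declaration.
import Foreign.C.Types (CInt (..))

foreign import ccall unsafe "stdlib.h abs"
  c_abs :: CInt -> CInt

-- Thin wrapper converting to and from Haskell's Int.
absInt :: Int -> Int
absInt = fromIntegral . c_abs . fromIntegral
```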

-- Johan


Re: [Haskell-cafe] zlib build failure on recent GHC

2011-11-07 Thread Johan Tibell
On Mon, Nov 7, 2011 at 12:06 PM, Jason Dagit dag...@gmail.com wrote:

  This is due to a change in how FFI imports and newtypes work. GHC was
  recently changed to not allow you to use newtypes in FFI imports unless
 the
  constructor of the newtype is in scope. This broke quite a few
 libraries. I
  have patched a few of them and I've sent a patch to the zlib maintainer.

 This seems like a big change.  Where should I be watching to know
 about this ahead of time?  I bet I have to fix some of my packages.


I suppose it will be in the release notes. I saw it on the GHC mailing list.

-- Johan


Re: [Haskell-cafe] Tool to brute-force test against hackage libraries to determine lower bounds?

2011-11-09 Thread Johan Tibell
On Wed, Nov 9, 2011 at 3:58 PM, Ryan Newton rrnew...@gmail.com wrote:

 I don't know about you, but I personally haven't found the time to cast
 back in time for each of my package's dependencies to find a true lower
 bound version.

 Do we have any tools that would do the following?

- ask Hackage for the available versions of package foo
- use cabal-dev to build your package against foo-X.Y.Z forall {X,Y,Z}
(but leaving other packages unconstrained)
- report successes and failures, including last failure before the
present version (and therefore lower bound, exclusive)

What about dependency interactions? If you depend on foo and bar there
might be versions of foo and bar that don't build together that you might
not discover by varying their versions independently.




 Johan, would it make any sense to extend your Jenkins setup to do this?


If someone came up with a recipe, sure. It might be a bit CPU intensive for
my little VPS though.

-- Johan


Re: [Haskell-cafe] ST not strict enough?

2011-11-15 Thread Johan Tibell
Hi Jason,

On Tue, Nov 15, 2011 at 12:08 PM, Jason Dusek jason.du...@gmail.com wrote:

 Should I be annotating my functions with strictness, for the
 vector reference, for example? Should I be using STUArrays,
 instead?


From
http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.4.1.0/Control-Monad-ST-Safe.html

The >>= and >> operations are strict in the state (though not in
values stored in the state).

which implies that

 modifySTRef counter (+1)

is too lazy.

-- Johan


Re: [Haskell-cafe] ST not strict enough?

2011-11-16 Thread Johan Tibell
On Wed, Nov 16, 2011 at 11:58 AM, Jason Dusek jason.du...@gmail.com wrote:

 diff --git a/Rebuild.hs b/Rebuild.hs
 @@ -15,6 +15,7 @@ import Data.STRef
  import Data.String
  import Data.Word

 +import Control.DeepSeq
  import Data.Vector.Unboxed (Vector)
  import qualified Data.Vector.Unboxed as Vector (create, length)
  import qualified Data.Vector.Unboxed.Mutable as Vector hiding (length)
 @@ -46,8 +47,8 @@ rebuildAsVector bytes=  byteVector
 n   <-  readSTRef counter
 return (Vector.unsafeSlice 0 n v)
   writeOneByte v counter b   =  do n <- readSTRef counter
 -   Vector.unsafeWrite v n b
 +   w v n b
modifySTRef counter (+!1)
 +  (+!) a b   =  ((+) $!! a) $!! b
 +  w v n b = (Vector.unsafeWrite v $!! n) $!! b


+! doesn't work unless modifySTRef is already strict in the result of the
function application. You need to write modifySTRef' that seq:s the result
of the function application before calling writeSTRef.

-- Johan


Re: [Haskell-cafe] ST not strict enough?

2011-11-16 Thread Johan Tibell
On Wed, Nov 16, 2011 at 12:07 PM, Johan Tibell johan.tib...@gmail.com wrote:

 On Wed, Nov 16, 2011 at 11:58 AM, Jason Dusek jason.du...@gmail.com wrote:

 diff --git a/Rebuild.hs b/Rebuild.hs
 @@ -15,6 +15,7 @@ import Data.STRef
  import Data.String
  import Data.Word

 +import Control.DeepSeq
  import Data.Vector.Unboxed (Vector)
  import qualified Data.Vector.Unboxed as Vector (create, length)
  import qualified Data.Vector.Unboxed.Mutable as Vector hiding (length)
 @@ -46,8 +47,8 @@ rebuildAsVector bytes=  byteVector
 n   <-  readSTRef counter
 return (Vector.unsafeSlice 0 n v)
   writeOneByte v counter b   =  do n <- readSTRef counter
 -   Vector.unsafeWrite v n b
 +   w v n b
modifySTRef counter (+!1)
 +  (+!) a b   =  ((+) $!! a) $!! b
 +  w v n b = (Vector.unsafeWrite v $!! n) $!! b


 +! doesn't work unless modifySTRef is already strict in the result of the
 function application. You need to write modifySTRef' that seq:s the result
 of the function application before calling writeSTRef.


Just double checked. modifySTRef is too lazy:

-- |Mutate the contents of an 'STRef'
modifySTRef :: STRef s a -> (a -> a) -> ST s ()
modifySTRef ref f = writeSTRef ref . f =<< readSTRef ref

We need Data.STRef.Strict.


Re: [Haskell-cafe] ST not strict enough?

2011-11-16 Thread Johan Tibell
On Wed, Nov 16, 2011 at 12:33 PM, Antoine Latter aslat...@gmail.com wrote:

 We already have one in base - it re-exports Data.STRef in whole :-)


 http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-STRef-Strict.html


Then it's wrong. :( In what sense is it strict? I think it should be strict
in the value stored in the ref.

-- Johan


Re: [Haskell-cafe] ST not strict enough?

2011-11-16 Thread Johan Tibell
On Wed, Nov 16, 2011 at 2:23 PM, Jason Dusek jason.du...@gmail.com wrote:

  Just double checked. modifySTRef is too lazy:
  -- |Mutate the contents of an 'STRef'
  modifySTRef :: STRef s a -> (a -> a) -> ST s ()
  modifySTRef ref f = writeSTRef ref . f =<< readSTRef ref
  We need Data.STRef.Strict

 Tried a modifySTRef' defined this way:

 modifySTRef' ref f   =  do
  val   <-  (f $!!) <$> readSTRef ref
  writeSTRef ref (val `seq` val)

 ...but there was no change in memory usage.


Why not just

modifySTRef :: STRef s a -> (a -> a) -> ST s ()
modifySTRef ref f = do
    x <- readSTRef ref
    writeSTRef ref $! f x

(Note that I didn't check if modifySTRef was actually a problem in this
case).
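
A minimal, self-contained sketch of the strict variant above, for anyone who
wants to try it. The names modifySTRefStrict and countTo are made up for this
example; as far as I know, later versions of base ship an equivalent
function as modifySTRef' in Data.STRef.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- | Like modifySTRef, but forces the new value to WHNF before it is
-- written, so the ref never holds a chain of unevaluated (+ 1) thunks.
-- (Hypothetical name, chosen to avoid clashing with base.)
modifySTRefStrict :: STRef s a -> (a -> a) -> ST s ()
modifySTRefStrict ref f = do
  x <- readSTRef ref
  writeSTRef ref $! f x

-- | Increment a counter n times using the strict modify.
countTo :: Int -> Int
countTo n = runST $ do
  ref <- newSTRef 0
  forM_ [1 .. n] $ \_ -> modifySTRefStrict ref (+ 1)
  readSTRef ref
```

With the lazy modifySTRef the same loop would store n nested thunks in the
ref and only collapse them at the final read.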

-- Johan


[Haskell-cafe] Superset of Haddock and Markdown

2011-11-17 Thread Johan Tibell
Hi all,

I spent some time today documenting a library and the experience left me
wanting a better markup language. In particular, Haddock lacks:

 * markup for bold text: bold text works better than italics for emphasis
on computer monitors.
 * hyperlinks with anchor texts: having the actual URL rendered inline with
text hurts readability.

Could Haddock markup be extended to also include some Markdown features?
The new features could be hidden behind a flag so old documentation doesn't
get unwanted markup (e.g. if it uses *...* to not mean bold).

P.S. This could make a good weekend hack that shouldn't be too difficult.

-- Johan


[Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-17 Thread Johan Tibell
Hi all,

Data.Map is getting split into Data.Map.Lazy and Data.Map.Strict (with
Data.Map re-exporting the lazy API). I want to better document the
strictness properties of the two new modules. Right now the
documentation for Data.Map.Strict reads:

Strictness properties
=====================

 * All functions are strict in both key and value arguments.  Examples:

  insertWith (+) k undefined m  ==  undefined
  delete undefined m  ==  undefined

 * Keys and values are evaluated to WHNF before they are stored in the
map.  Examples:

  map (\ v -> undefined)  ==  undefined
  mapKeys (\ k -> undefined)  ==  undefined

I'm not entirely happy with this formulation. I'm looking for
something that's clear (i.e. precise and concise, without leaving out
important information), assuming that the reader already knows how
lazy evaluation works at a high level.

Ideas?

Cheers,
Johan



Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-17 Thread Johan Tibell
On Thu, Nov 17, 2011 at 9:21 PM, Johan Tibell johan.tib...@gmail.com wrote:
 I'm not entirely happy with this formulation. I'm looking for
 something that's clear (i.e. precise and concise, without leaving out
 important information), assuming that the reader already knows how
 lazy evaluation works at a high level.

 Ideas?

This reads a bit better to me:

Strictness properties
=====================

This module is strict in keys and values.  In particular,

 * key and value function arguments passed to functions are
   evaluated to WHNF before the function body is evaluated, and

 * keys and values returned by high-order function arguments are
   evaluated to WHNF before they are inserted into the map.

Here are some examples:

insertWith (+) k undefined m  ==  undefined
delete undefined m  ==  undefined
map (\ v -> undefined)  ==  undefined
mapKeys (\ k -> undefined)  ==  undefined

Any ideas for further improvements?

-- Johan



Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-18 Thread Johan Tibell
On Fri, Nov 18, 2011 at 12:09 AM, Roman Cheplyaka r...@ro-che.info wrote:
 Is it mentioned anywhere that Map is spine-strict?

It's not and we should probably mention it.

I was mulling this over last night. My initial thought was that it
shouldn't matter as long as the algorithmic complexity of the
functions is maintained. But it is important in that a lookup
following an insert might do all the work of the insert, which is
somewhat surprising (and inefficient).

 An important property, although may be non-trivial to formulate while
 keeping the implementation abstract.

Perhaps we could talk about the presence or absence of thunks of a Map
that's in WHNF?

-- Johan



Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-18 Thread Johan Tibell
On Fri, Nov 18, 2011 at 1:58 AM, Roman Leshchinskiy r...@cse.unsw.edu.au 
wrote:
 Johan Tibell wrote:

       map (\ v -> undefined)  ==  undefined
       mapKeys (\ k -> undefined)  ==  undefined

 Not really related to the question but I don't really understand how these
 properties can possibly hold. Shouldn't it be:

  map (\v -> undefined) x = undefined

 And even then, does this really hold for empty maps?

It doesn't hold. It needs the side condition that the map is not
empty. I wonder if there's any function in the API that'd let me
express this property (of HOFs) that doesn't require a side condition.
I don't think so, e.g.

insertWith (\old new -> undefined) k v m

has a side condition that k is in the map.

-- Johan



Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-18 Thread Johan Tibell
On Fri, Nov 18, 2011 at 5:02 AM, Twan van Laarhoven twa...@gmail.com wrote:
 * key and value function arguments passed to functions are
  evaluated to WHNF before the function body is evaluated, and

 function arguments passed to functions sounds a bit redundant. Either say
 arguments passed to functions or function arguments. Also before the
 function body is evaluated says something about evaluation order, does that
 really matter for strictness?

It is a bit redundant. I will remove it.

  * keys and values returned by high-order function arguments are
    evaluated to WHNF before they are inserted into the map.

 Keys and values not returned by higher order functions, but passed in
 directly are also evaluated to WHNF (per the first rule), so that
 qualification is unnecessary. Just say:

  * keys and values are evaluated to WHNF before they are
    inserted into the map.

I don't think we have any higher-order functions that don't store
evaluated keys/values in the map so this should be equivalent. Without
the part about higher-order functions it's not quite clear why this
second property is needed and that's why I included it to begin with.
Perhaps I should instead clarify that particular part with an example.

 I also think 'stored' is better here than 'inserted', because the latter
 might give the impression that it only applies to the insert function, and
 not to things like map.

'stored' is a bit more clear, I agree.

      insertWith (+) k undefined m  ==  undefined

       etc.

 As Roman suggested, use = here instead of ==.

I was trying to be consistent with e.g. Control.Functor etc, which use
== and two surrounding spaces. I think it's good, as it avoids
confusion with function declarations.

 To really illustrate the first rule, insertWith (+) is not enough, you would
 really need a function that doesn't use the value, so

    insertWith (\new old -> old) k undefined m = undefined

 But that is just nitpicking.

My example is enough, but I forgot to include the side condition that
k is in the map. Your example is a bit better in that it doesn't
require that side condition.

-- Johan



Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-18 Thread Johan Tibell
Here's an attempt at an improved version:

Strictness properties
=====================

This module satisfies the following properties:

1. Key and value arguments are evaluated to WHNF;

2. Keys and values are evaluated to WHNF before they are stored in
the map.

Here are some examples that illustrate the first property:

insertWith (\ old new -> old) k undefined m  ==  undefined
delete undefined m  ==  undefined

Here are some examples that illustrate the second property:

map (\ v -> undefined) m  ==  undefined  -- m is not empty
mapKeys (\ k -> undefined) m  ==  undefined  -- m is not empty

What do you think?
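
For anyone who wants to poke at these properties, here is a rough sketch
that checks a few of them at run time. It assumes a containers release that
already provides Data.Map.Strict and Data.Map.Lazy; isBottom, strictChecks
and lazyChecks are helper names invented for this example, and only
properties I'm confident about are checked (insert, map and delete on a
non-empty map).

```haskell
import Control.Exception (SomeException, evaluate, try)
import qualified Data.Map.Lazy as L
import qualified Data.Map.Strict as S

-- | True when evaluating the argument to WHNF throws an exception.
isBottom :: a -> IO Bool
isBottom x = do
  r <- try (evaluate x)
  return (either (\e -> const True (e :: SomeException)) (const False) r)

-- | Strict API: undefined keys or values make the resulting map bottom.
strictChecks :: IO [Bool]
strictChecks = mapM isBottom
  [ S.insert 2 undefined m      -- value argument is forced
  , S.map (const undefined) m   -- results are forced before being stored
  , S.delete undefined m        -- key argument is forced
  ]
  where m = S.fromList [(1 :: Int, 0 :: Int)]

-- | Lazy API: undefined values are stored as thunks, so the map is fine.
lazyChecks :: IO [Bool]
lazyChecks = mapM isBottom
  [ L.insert 2 undefined m
  , L.map (const undefined) m
  ]
  where m = L.fromList [(1 :: Int, 0 :: Int)]
```

The strict checks should all report bottom, while the lazy ones should not,
since a lazy map only blows up when the stored value itself is demanded.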

-- Johan



Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-18 Thread Johan Tibell
On Fri, Nov 18, 2011 at 9:16 AM, Roman Cheplyaka r...@ro-che.info wrote:
 * Johan Tibell johan.tib...@gmail.com [2011-11-18 08:06:29-0800]
 On Fri, Nov 18, 2011 at 12:09 AM, Roman Cheplyaka r...@ro-che.info wrote:
  Is it mentioned anywhere that Map is spine-strict?

 It's not and we should probably mention it.

 Hm. Perhaps I'm missing something, but

  data Map k a  = Tip
                | Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)

 looks pretty (spine-)strict to me.
 (This is in the latest rev from http://github.com/haskell/containers.git)

"It's not" as in "it's not documented".

 It's also space and stack complexities that matter (not sure if you
 include those in algorithmic complexity).

 For example, if it's not spine-strict, then

  Map.lookup k $ foldl' Map.union Map.empty longList

 would overflow the stack despite the prime in foldl'.

Good point. I will mull this over.
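
To make the effect of the strictness annotations in that data declaration
concrete, here is a toy sketch. LazyTree and StrictTree are made-up types
that only mimic the shape of Map's constructors; this is not the containers
implementation.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Toy trees: same shape, with and without strict subtree fields.
data LazyTree = LTip | LBin Int LazyTree LazyTree
data StrictTree = STip | SBin !Int !StrictTree !StrictTree

-- | True when the argument can be evaluated to WHNF without throwing.
forcesOk :: a -> IO Bool
forcesOk x = do
  r <- try (evaluate x)
  return (either (\e -> const False (e :: SomeException)) (const True) r)

-- | A lazy node can sit on top of an unevaluated (here: undefined) subtree.
lazyNodeOk :: IO Bool
lazyNodeOk = forcesOk (LBin 1 undefined LTip)

-- | A strict node cannot: building it forces the subtree, so the whole
-- spine is evaluated as soon as the root is in WHNF.
strictNodeOk :: IO Bool
strictNodeOk = forcesOk (SBin 1 undefined STip)
```

In other words, with strict subtree fields a tree in WHNF has no thunks left
in its spine, which is the property being discussed.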

-- Johan



Re: [Haskell-cafe] List x ByteString x Lazy Bytestring

2011-12-05 Thread Johan Tibell
On Mon, Dec 5, 2011 at 6:09 AM, Yves Parès limestr...@gmail.com wrote:

 However the performance issues seem odd: text is based on bytestring.


This is not the case. Text is based on ByteArray#, a GHC-internal type for
blocks of bytes. The text package depends on the bytestring package because
it allows you to encode/decode Text <-> ByteString.
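
A small sketch of that encode/decode boundary, using encodeUtf8 and
decodeUtf8 from Data.Text.Encoding. The helper names roundTrip and
utf8Length are made up for this example.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Encode to UTF-8 bytes and decode back again.
roundTrip :: T.Text -> T.Text
roundTrip = TE.decodeUtf8 . TE.encodeUtf8

-- | Number of bytes the text occupies once encoded as UTF-8.
utf8Length :: T.Text -> Int
utf8Length = B.length . TE.encodeUtf8
```

For T.pack "\233" (a single e-acute), T.length is 1 while utf8Length is 2:
Text counts characters, ByteString counts bytes.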

-- Johan


Re: [Haskell-cafe] Recommended class instances for container type

2011-12-08 Thread Johan Tibell
On Thu, Dec 8, 2011 at 8:12 AM, Christoph Breitkopf 
chbreitk...@googlemail.com wrote:

 Hello,

 I'm in the process of implementing a container data type, and wonder what
 class instances are generally considered necessary. E.g. is it ok to start
 out with a Show that's adequate for debugging, or is it a 'must' to include
 instances of everything possible (Eq, Ord if possible, Read, Show, Functor,
 ...).


Start out with Show and spend your time making sure that your container
type performs well (unless you're doing this as an exercise of course). A
featureful API for something that's as slow as linked lists isn't very
useful. ;)

-- Johan


[Haskell-cafe] ANN: ekg-0.2 - Remote monitoring of processes

2011-12-29 Thread Johan Tibell
(I forgot to announce v0.1 so this is a combined announcement.)

I'm proud to announce the ekg [1] library. The library lets you remotely
monitor any running Haskell program, using your web browser or an automated
monitoring program.

The library lets you monitor garbage collector and memory usage statistics
(similar to what's provided by +RTS -s, but live), but also lets you
define your own counters that you can update from within your program.

For some more examples see:
http://blog.johantibell.com/2011/12/remotely-monitor-any-haskell.html
http://blog.johantibell.com/2011/12/more-monitoring-goodies.html

1. http://hackage.haskell.org/package/ekg


Re: [Haskell-cafe] Package Versioning Policy

2012-01-05 Thread Johan Tibell
On Thu, Jan 5, 2012 at 9:05 AM, Christoph Breitkopf
chbreitk...@googlemail.com wrote:
 a) You are not allowed to remove or change the types of existing stuff.
 Ok.

Unless you bump the major version number (i.e. X or Y in X.Y.Z.P).

That way people can depend on

IntervalMap == X.Y.Z.*

and be guaranteed that their package won't fail to compile if you
release a bugfix to X.Y (which would increase the patch component of
the version number).

 b) You are allowed to add new functions. But that can break compilation
 because of name conflicts. Seems to be allowed on the grounds that this is
 easy to fix in the client code.

It cannot break compilation if you use explicit import lists or
qualified imports. People who use that (which I recommend they do) can
specify a dependency on

IntervalMap == X.Y.*

and still be guaranteed not to break if you make a new (minor) release.
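
A minimal sketch of the import style I mean, using Data.Map as a stand-in
for any library. The names size and wordCounts are invented for the example.

```haskell
import Data.List (foldl')          -- explicit import list
import qualified Data.Map as Map   -- qualified import

-- A local name that also exists in Data.Map. There is no clash, because
-- the library's names are only visible with the Map. prefix, and a new
-- export added in a minor release could not clash either.
size :: [a] -> Int
size = length

-- | Count how often each word occurs.
wordCounts :: [String] -> Map.Map String Int
wordCounts = foldl' (\m w -> Map.insertWith (+) w 1 m) Map.empty
```

With an unqualified, open import of Data.Map, any use of size here would be
ambiguous and the module would stop compiling.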

 c) You are not allowed to add new instances. I don't get this - how is
 this any worse than b)?

You cannot prevent the import of new instances. When you import a
module you get all its instances. This means that explicit import
lists can't protect you.

 I do understand that it is not generally possible to prevent breaking code
 - for example if the client code depends on buggy behavior that gets fixed
 in a minor version update. That seems unavoidable - after all, bugfixes are
 _the_ reason for minor updates.

Bugfixes should be in bugfix releases (i.e. bumps to the 4th version component).

Yes, if you depend on bugs you might have your code break. There's no
way around this. In practice it hasn't been a problem.

-- Johan



Re: [Haskell-cafe] Package Versioning Policy

2012-01-05 Thread Johan Tibell
On Thu, Jan 5, 2012 at 9:30 AM, Christoph Breitkopf
chbreitk...@googlemail.com wrote:
 If I understand correctly, you would recommend:

 - Major version changes: as described in the guidelines: changed interface,
 new instances
 - Minor version change: when I just add functions
 - Patchlevel change: for bugfixes, performance fixes, documentation changes

Yes. This is in fact what the PVP specifies IIRC.

-- Johan



Re: [Haskell-cafe] feed release plan

2012-01-13 Thread Johan Tibell
On Fri, Jan 13, 2012 at 10:46 AM, Simon Michael si...@joyful.com wrote:
 Aha, thanks both.

 The haskell organisation looks bigger, I think I'd like to upload feed
 there. Could the owner add contact info or a how-to-join note to the page ?

The Haskell organization on GitHub is for core libraries (i.e. the
Haskell Platform) only at this point. It exists to make it easier for
a few maintainers to maintain all those libraries.

-- Johan



Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Johan Tibell
On Tue, Jan 31, 2012 at 1:22 PM, Gregory Collins
g...@gregorycollins.net wrote:
 I completely agree on the first part, but deepseq is not a panacea either.
 It's a big hammer and overuse can sometimes cause wasteful O(n) no-op
 traversals of already-forced data structures. I also definitely wouldn't go
 so far as to say that you can't do serious parallel development without it!

I agree. The only time I ever use deepseq is in Criterion benchmarks,
as it's a convenient way to make sure that the input data is evaluated
before the benchmark starts. If you want a data structure to be fully
evaluated, evaluate it as it's created, not after the fact.
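
As a rough illustration of "evaluate it as it's created", here is a sketch
of a fold whose accumulator has strict fields, so nothing is left to force
afterwards. The Acc type and mean function are made up for this example.

```haskell
import Data.List (foldl')

-- | Accumulator with strict fields: constructing an Acc forces both the
-- running sum and the count, so no thunks build up inside it.
data Acc = Acc !Double !Int

-- | Arithmetic mean in a single pass (NaN for the empty list).
mean :: [Double] -> Double
mean xs = s / fromIntegral n
  where
    Acc s n = foldl' step (Acc 0 0) xs
    step (Acc a k) x = Acc (a + x) (k + 1)
```

foldl' alone only forces the accumulator to WHNF; it is the strict fields
that keep the two components evaluated at every step, with no deepseq
traversal needed at the end.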

 The only real solution to problems like these is a thorough understanding of
 Haskell's evaluation order, and how and why call-by-need is different than
 call-by-value. This is both a pedagogical problem and genuinely hard -- even
 Haskell experts like the guys at GHC HQ sometimes spend a lot of time
 chasing down space leaks. Haskell makes a trade-off here; reasoning about
 denotational semantics is much easier than in most other languages because
 of purity, but non-strict evaluation makes reasoning about operational
 semantics a little bit harder.

+1

We can do a much better job at teaching how to reason about
performance. A few rules of thumb get you a long way. I'm (slowly)
working on improving the state of affairs here.

-- Johan



Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-31 Thread Johan Tibell
On Tue, Jan 31, 2012 at 12:19 PM, Steve Severance
ssevera...@alphaheavy.com wrote:
 The webpage data was split out across tens of thousands of files compressed
 binary. I used enumerator to load these files and select the appropriate
 columns. This step was performed in parallel using parMap and worked fine
 once i figured out how to add the appropriate !s.

Even though they are advertised as parallel programming tools, parMap and
similar functions work in parallel over *sequential* access data
structures (i.e. linked lists). We want flat, strict, unpacked data
structures to get good performance out of parallel algorithms. DPH,
repa, and even vector show the way.

-- Johan



Re: [Haskell-cafe] The State of Testing?

2012-02-02 Thread Johan Tibell
On Thu, Feb 2, 2012 at 4:19 PM, Conrad Parker con...@metadecks.org wrote:

 I've followed what Johan Tibbell did in the hashable package:


If I had known how much confusion my childhood friends would unleash on the
Internet when they, at age 7, gave me a nickname that's spelled slightly
differently from my last name, I would have asked them to pick another one.
;)

-- Johan Tibell


Re: [Haskell-cafe] The State of Testing?

2012-02-02 Thread Johan Tibell
On Thu, Feb 2, 2012 at 4:46 PM, Conrad Parker con...@metadecks.org wrote:

 On 3 February 2012 08:30, Johan Tibell johan.tib...@gmail.com wrote:
  On Thu, Feb 2, 2012 at 4:19 PM, Conrad Parker con...@metadecks.org wrote:

  I've followed what Johan Tibbell did in the hashable package:

  If I had known how much confusion my childhood friends would unleash on the
  Internet when they, at age 7, gave me a nickname that's spelled slightly
  differently from my last name, I would have asked them to pick another one.
  ;)

 lol, sorry, I actually double-checked the number of l's before writing
 that but didn't consider the b's. For future reference I've produced a
 handy chart:



 Letter | Real-name count | Nickname count
 -------+-----------------+---------------
 b      | 1               | 2
 l      | 2               | 0
 -------+-----------------+---------------
 SUM    | 3               | 2


Excellent. I will tattoo it on my forehead.


Re: [Haskell-cafe] network-2.3.0.10 compiled for ghc 7.4.1 windows

2012-02-06 Thread Johan Tibell
Hi,

Someone recently contributed a fix that should make network build with 7.4:
https://github.com/haskell/network/pull/25

Can you see if that works for you? I haven't yet had time to merge and
release that fix (I'm on vacation.)

-- Johan


[Haskell-cafe] Google Summer of Code 2012 Announced

2012-02-06 Thread Johan Tibell
Hi all,

Here's a heads-up that this year's Google Summer of Code is kicking off. My
experience from the last few years is that we can maximize the output we
get from GSoC by being proactive and writing down semi-detailed
explanations of what kind of projects we'd like to see, instead of letting
the students pick themselves*. Here are three examples of such write-ups I
did last year:

http://blog.johantibell.com/2011/03/summer-of-code-project-suggestions.html

Concretely:

 1. Write down the project suggestions somewhere (e.g. on the wiki, your
blog, etc).
 2. Advertise the projects on haskell-cafe, reddit, twitter, Google+
 3. Profit.

* The students tend to not know what makes a good GSoC project and often
aim for something too difficult, like writing a new project from scratch
instead of contributing to an old one. Contributing to widely used
libraries or infrastructure usually results in

 * a larger benefit for the community, and
 * the students sticking around.

My guess is that students tend to stick around after the project is done
if they contribute to infrastructure projects, as they'll see their stuff
get used and will get feature requests/bug reports that will make them
continue working on the project.

-- Johan

-- Forwarded message --
From: Carol Smith car...@google.com
Date: Sat, Feb 4, 2012 at 8:43 AM
Subject: Google Summer of Code 2012 Announced
To: Google Summer of Code Announce 
google-summer-of-code-annou...@googlegroups.com


Hi all,

We're pleased to announce that Google Summer of Code will be happening
for its eighth year this year. Please check out the blog post [1] about
the program and read the FAQs [2] and Timeline [3] on Melange for
more information.

[1] -
http://google-opensource.blogspot.com/2012/02/google-summer-of-code-2012-is-on.html
[2] -
http://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2012/faqs
[3] - http://www.google-melange.com/gsoc/events/google/gsoc2012

Cheers,
Carol



Re: [Haskell-cafe] network-2.3.0.10 compiled for ghc 7.4.1 windows

2012-02-07 Thread Johan Tibell
Note that there are two branches on github, master and stable. You want the
latter.
On Feb 7, 2012 8:23 AM, Alberto G. Corona agocor...@gmail.com wrote:

 This is quite different.
 I don´t know how but I was looking at some other older patch around
 the same issue and I supposed that it was the one refered by Yohan
 Tibell.

 I´ll try your patch.

 Thanks!.

 2012/2/7 Holger Reinhardt hreinha...@gmail.com:
  Hi,
 
  (I submitted the patch that Johan linked to)
  Network/Socket/Internal.hsc has the following code:
 
  #if defined(WITH_WINSOCK) || defined(cygwin32_HOST_OS)
  type CSaFamily = (#type unsigned short)
  #elif defined(darwin_HOST_OS)
  type CSaFamily = (#type u_char)
  #else
  type CSaFamily = (#type sa_family_t)
  #endif
 
  You have patched this part to always use 'unsigned short'. But the real
  issue is that WITH_WINSOCK is not defined, even though it should be. The
  reason for this lies in include/HsNet.h:
 
  #if defined(HAVE_WINSOCK_H) && !defined(cygwin32_HOST_OS)
  # define WITH_WINSOCK  1
  #endif
 
  The problem here is that it checks for HAVE_WINSOCK_H, but the configure
  script never defines this variable. Instead it defines HAVE_WINSOCK2_H. It
  seems that the network library used Winsock1 in the past and in the
  transition to Winsock2 someone forgot to change a few of the #ifdefs.
 
  My patch just changes all occurrences of HAVE_WINSOCK_H to HAVE_WINSOCK2_H.
  You might want to try that and report back if it works for you.
 
  2012/2/7 Alberto G. Corona agocor...@gmail.com
 
  Hi Johan,
  The patch is not for the current version of network and the code is
  quite different. Basically it is necessary to define this variable as
  unsigned short, which is what the patch intends. However, I
  put it in by brute force, without regard for the preprocessor directives.
  With this change the code compiles well with:
 
 
 
 http://neilmitchell.blogspot.com/2010/12/installing-haskell-network-library-on.html
 
  However, my compiled library lacks the methods defined as foreign. I´ll
  keep trying.
 
  2012/2/6 Johan Tibell johan.tib...@gmail.com:
   Hi,
  
   Someone recently contributed a fix that should make network build with
   7.4: https://github.com/haskell/network/pull/25
  
   Can you see if that works for you? I haven't yet had time to merge and
   release that fix (I'm on vacation.)
  
   -- Johan
  
 
 
 



Re: [Haskell-cafe] network-2.3.0.10 compiled for ghc 7.4.1 windows

2012-02-08 Thread Johan Tibell
I will merge this as soon as I get back from vacation.
On Feb 8, 2012 8:54 AM, Holger Reinhardt hreinha...@gmail.com wrote:

 Having discussed the issue privately with Alberto, I've found another bug
 and updated my pull request [1]. Using that code it should be possible to
 build the network library on Windows using MSys on GHC 7.4.1.

 [1] https://github.com/haskell/network/pull/25

 2012/2/8 Alberto G. Corona agocor...@gmail.com

 yes i did it,.

 the error is as follows:

 shop.exe: NetworkSocket.hsc:(948,3)-(1007,23): Non-exhaustive patterns in
 case

 I will download network form hackage and will do it form the beginning. .



 2012/2/8 Holger Reinhardt hreinha...@gmail.com:
  Did you run cabal clean before rebuilding with Git Bash? And can you
 post
  the exact runtime error you get?
 
  2012/2/8 Alberto G. Corona agocor...@gmail.com
 
  I switched to Git bash and the runtime error produced by the library
  is the same.
  This error may be produced because  the configuration it does not
  detect the netwiorkin related includes such is socket.h. This does not
  exist neither in the ghc installation neither in GIT/Mingw
 
 
  2012/2/7 Holger Reinhardt hreinha...@gmail.com:
   I just use the version of MSys that is included with Git [1]. This
 puts
   a
   Git bash icon on your desktop which you can then use to build the
   network
   library.
  
   [1] http://code.google.com/p/msysgit/
  
  
   2012/2/7 Alberto G. Corona agocor...@gmail.com
  
   Nothing bur a long history of failures. The problem is the
   configuration and versioning of MinGW and MSys. This  is a nighmare.
  
   2012/2/7 Holger Reinhardt hreinha...@gmail.com:
Oh you are using Cygwin. I'm using MSys so this is why I cannot
reproduce
your problem. Is there anything preventing you from using MSys?
   
   
2012/2/7 Alberto G. Corona agocor...@gmail.com
   
The problem this time is in Configure :
   
case $host in
*-mingw32)
   EXTRA_SRCS=cbits/initWinSock.c, cbits/winSockErr.c,
cbits/asyncAccept.c
   EXTRA_LIBS=ws2_32
   CALLCONV=stdcall ;;
*-solaris2*)
   EXTRA_SRCS=cbits/ancilData.c
   EXTRA_LIBS=nsl, socket
   CALLCONV=ccall ;;
*)
   EXTRA_SRCS=cbits/ancilData.c
   EXTRA_LIBS=
   CALLCONV=ccall ;;
esac
   
   
   
Since I´m cross-compiling with cygwin, the variable Host does not
contain ¨*-muingw32  but i686-pc-cygwin
   
changing the case , the library incorporates the lost C coded
 files.
   
Now the library links fine win imported, but there is a runtime
error:
   
NetworkSocket.hsc:(948,3)-(1007,23): Non-exhaustive patterns in
 case
   
maybe it is due to some other preprocessor directive mismatch
   
   
2012/2/7 Holger Reinhardt hreinha...@gmail.com:
 Did you also change the files in the /cbits/ folder? Because
 they
 also
 check
 for HAVE_WINSOCK_H.


 2012/2/7 Alberto G. Corona agocor...@gmail.com

 The code is evolving and none of the versions match exactily
 with
 the
 patch, but substituting HAVE_WINSOCK by HAVE WINSOCK2 in these
 files
 solves the compilation problem at least in the network
 2.3.0.10
 version from hackage.

 However it produces the same undefined references when this
 library
 is
 imported in my application. It seems that some object code is
 not
 included in the final library.  I verified that at least some
 of
 these
 undefined references correspond with  C code in the source,
 but
 somehow this is not included in the object library

 2012/2/7 Johan Tibell johan.tib...@gmail.com:
  Note that there are two branches on github, master and
 stable.
  You
  want
  the
  latter.
 
  On Feb 7, 2012 8:23 AM, Alberto G. Corona
  agocor...@gmail.com
  wrote:
 
  This is quite different.
  I don´t know how but I was looking at some other older
 patch
  around
  the same issue and I supposed that it was the one refered
 by
  Yohan
  Tibell.
 
  I´ll try your patch.
 
  Thanks!.
 
  2012/2/7 Holger Reinhardt hreinha...@gmail.com:
   Hi,
  
   (I submitted the patch that Johan linked to)
   Network/Socket/Internal.hsc has the following code:
  
   #if defined(WITH_WINSOCK) || defined(cygwin32_HOST_OS)
   type CSaFamily = (#type unsigned short)
   #elif defined(darwin_HOST_OS)
   type CSaFamily = (#type u_char)
   #else
   type CSaFamily = (#type sa_family_t)
   #endif
  
   You have patched this part to always use 'unsigned
 short'.
   But
   the
   real
   issue is that WITH_WINSOCK is not defined, even though it
   should
   be. The
   reason for this lies in include/HsNet.h:
  
   #if defined(HAVE_WINSOCK_H) && !defined(cygwin32_HOST_OS)
   # define WITH_WINSOCK  1
   #endif

Re: [Haskell-cafe] network-2.3.0.10 compiled for ghc 7.4.1 windows

2012-02-13 Thread Johan Tibell
I've merged and pushed the changes to the stable branch on GitHub. If
someone could verify that it works fine on Windows, I'll make another
release.

In addition to running whatever program you're interested in, also run:

cabal clean
autoreconf
cabal configure --enable-tests
cabal build
cabal test

All tests should pass.

-- Johan


On Wed, Feb 8, 2012 at 11:04 AM, Johan Tibell johan.tib...@gmail.com wrote:

 I will merge this as soon as I get back from vacation.
 On Feb 8, 2012 8:54 AM, Holger Reinhardt hreinha...@gmail.com wrote:

 Having discussed the issue privately with Alberto, I've found another bug
 and updated my pull request [1]. Using that code it should be possible to
 build the network library on Windows using MSys on GHC 7.4.1.

 [1] https://github.com/haskell/network/pull/25

 2012/2/8 Alberto G. Corona agocor...@gmail.com

 yes i did it,.

 the error is as follows:

 shop.exe: NetworkSocket.hsc:(948,3)-(1007,23): Non-exhaustive patterns
 in case

 I will download network form hackage and will do it form the beginning. .



 2012/2/8 Holger Reinhardt hreinha...@gmail.com:
  Did you run cabal clean before rebuilding with Git Bash? And can you
 post
  the exact runtime error you get?
 
  2012/2/8 Alberto G. Corona agocor...@gmail.com
 
  I switched to Git bash and the runtime error produced by the library
  is the same.
  This error may be produced because  the configuration it does not
  detect the netwiorkin related includes such is socket.h. This does not
  exist neither in the ghc installation neither in GIT/Mingw
 
 
  2012/2/7 Holger Reinhardt hreinha...@gmail.com:
   I just use the version of MSys that is included with Git [1]. This
 puts
   a
   Git bash icon on your desktop which you can then use to build the
   network
   library.
  
   [1] http://code.google.com/p/msysgit/
  
  
   2012/2/7 Alberto G. Corona agocor...@gmail.com
  
   Nothing bur a long history of failures. The problem is the
   configuration and versioning of MinGW and MSys. This  is a
 nighmare.
  
   2012/2/7 Holger Reinhardt hreinha...@gmail.com:
Oh, you are using Cygwin. I'm using MSys, so this is why I cannot
reproduce your problem. Is there anything preventing you from using MSys?
   
   
2012/2/7 Alberto G. Corona agocor...@gmail.com
   
The problem this time is in configure:

case $host in
*-mingw32)
   EXTRA_SRCS="cbits/initWinSock.c, cbits/winSockErr.c, cbits/asyncAccept.c"
   EXTRA_LIBS=ws2_32
   CALLCONV=stdcall ;;
*-solaris2*)
   EXTRA_SRCS="cbits/ancilData.c"
   EXTRA_LIBS="nsl, socket"
   CALLCONV=ccall ;;
*)
   EXTRA_SRCS="cbits/ancilData.c"
   EXTRA_LIBS=
   CALLCONV=ccall ;;
esac
   
   
   
Since I'm cross-compiling with Cygwin, the variable $host does not
match *-mingw32; it contains i686-pc-cygwin.

After changing the case, the library incorporates the missing C source
files.
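
To make the fall-through concrete, here is a rough Haskell model of the
case statement above. It is only an illustration and not part of the
network build: the names BuildSettings and settingsFor are made up for
this sketch, and the shell globs are approximated with isSuffixOf and
isInfixOf.

```haskell
import Data.List (isInfixOf, isSuffixOf)

-- Hypothetical record mirroring the three shell variables set by the
-- case statement (EXTRA_SRCS, EXTRA_LIBS, CALLCONV).
data BuildSettings = BuildSettings
  { extraSrcs :: [String]
  , extraLibs :: [String]
  , callConv  :: String
  } deriving (Eq, Show)

-- Approximation of the shell patterns:
--   *-mingw32    is modelled as "ends with -mingw32"
--   *-solaris2*  is modelled as "contains -solaris2"
--   *            is the default branch
settingsFor :: String -> BuildSettings
settingsFor host
  | "-mingw32" `isSuffixOf` host =
      BuildSettings
        [ "cbits/initWinSock.c", "cbits/winSockErr.c", "cbits/asyncAccept.c" ]
        [ "ws2_32" ]
        "stdcall"
  | "-solaris2" `isInfixOf` host =
      BuildSettings [ "cbits/ancilData.c" ] [ "nsl", "socket" ] "ccall"
  | otherwise =
      BuildSettings [ "cbits/ancilData.c" ] [] "ccall"
```

Loaded in GHCi, settingsFor "i686-pc-cygwin" lands in the default
branch, so none of the WinSock C sources and no ws2_32 are selected.
That is consistent with the undefined references reported earlier in
the thread.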
   
Now the library links fine when imported, but there is a runtime
error:

NetworkSocket.hsc:(948,3)-(1007,23): Non-exhaustive patterns in case

Maybe it is due to some other preprocessor directive mismatch.
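
One way a preprocessor mismatch can end in this kind of error is a case
expression whose alternatives are guarded by #ifdef. The sketch below
is hypothetical: Option, optionCode, optionCodeMaybe and the macro
HYPOTHETICAL_HAVE_KEEPALIVE are invented for illustration and are not
taken from the network sources.

```haskell
{-# LANGUAGE CPP #-}

-- Hypothetical option type, used only for this illustration.
data Option = ReuseAddr | KeepAlive
  deriving (Eq, Show)

-- Partial function: the KeepAlive alternative is only compiled in when
-- the (hypothetical) macro is defined. If configure fails to define it,
-- calling optionCode KeepAlive fails at run time with a
-- "Non-exhaustive patterns in case" error.
optionCode :: Option -> Int
optionCode o = case o of
  ReuseAddr -> 4
#ifdef HYPOTHETICAL_HAVE_KEEPALIVE
  KeepAlive -> 8
#endif

-- Total variant: a catch-all alternative turns the missing case into a
-- value the caller can inspect instead of a run-time crash.
optionCodeMaybe :: Option -> Maybe Int
optionCodeMaybe o = case o of
  ReuseAddr -> Just 4
#ifdef HYPOTHETICAL_HAVE_KEEPALIVE
  KeepAlive -> Just 8
#endif
  _ -> Nothing
```

When the macro is defined the catch-all becomes redundant and GHC may
warn about it, which is harmless here.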
   
   
2012/2/7 Holger Reinhardt hreinha...@gmail.com:
 Did you also change the files in the /cbits/ folder? Because they
 also check for HAVE_WINSOCK_H.


 2012/2/7 Alberto G. Corona agocor...@gmail.com

 The code is evolving and none of the versions match the patch
 exactly, but substituting HAVE_WINSOCK with HAVE_WINSOCK2 in these
 files solves the compilation problem, at least in the network
 2.3.0.10 version from Hackage.

 However, it produces the same undefined references when this library
 is imported in my application. It seems that some object code is not
 included in the final library. I verified that at least some of
 these undefined references correspond to C code in the source, but
 somehow this is not included in the object library.

 2012/2/7 Johan Tibell johan.tib...@gmail.com:
  Note that there are two branches on GitHub, master and stable. You
  want the latter.
 
  On Feb 7, 2012 8:23 AM, Alberto G. Corona agocor...@gmail.com wrote:
 
  This is quite different.
  I don't know how, but I was looking at some other, older patch
  around the same issue and I supposed that it was the one referred to
  by Johan Tibell.

  I'll try your patch.

  Thanks!
 
  2012/2/7 Holger Reinhardt hreinha...@gmail.com:
   Hi,
  
   (I submitted the patch that Johan linked to)
   Network/Socket/Internal.hsc has the following code:
  
   #if defined(WITH_WINSOCK) || defined(cygwin32_HOST_OS)
   type CSaFamily = (#type unsigned short)
   #elif defined(darwin_HOST_OS)
   type CSaFamily = (#type u_char)
   #else
   type

Re: [Haskell-cafe] network-2.3.0.10 compiled for ghc 7.4.1 windows

2012-02-13 Thread Johan Tibell
Version 2.3.0.11 released.

On Mon, Feb 13, 2012 at 9:58 AM, Holger Reinhardt hreinha...@gmail.com wrote:

 Did as requested and everything seems to work fine. Test suite also passes:

  Test Cases  Total
  Passed  10  10
  Failed  0   0
  Total   10  10
 Test suite simple: PASS



 2012/2/13 Johan Tibell johan.tib...@gmail.com

 I've merged and pushed the changes to the stable branch on GitHub. If
 someone could verify that it works fine on Windows, I'll make another
 release.

 In addition to running whatever program you're interested in, also run:

 cabal clean
 autoreconf
 cabal configure --enable-tests
 cabal build
 cabal test

 All tests should pass.

 -- Johan


Re: [Haskell-cafe] network-2.3.0.10 compiled for ghc 7.4.1 windows

2012-02-13 Thread Johan Tibell
Resending as the last message got held for moderation:

On Mon, Feb 13, 2012 at 10:18 AM, Johan Tibell johan.tib...@gmail.com wrote:

 Version 2.3.0.11 released.


Re: [Haskell-cafe] Google Summer of Code 2012 Announced

2012-02-13 Thread Johan Tibell
On Mon, Feb 13, 2012 at 1:10 PM, Heinrich Apfelmus 
apfel...@quantentunnel.de wrote:

 What's the time frame for project proposals? I have two ideas in my head
 that I think are unusually cool. To make a successful SOC project, they
 need a bit of preparation on my part, though, so I'm wondering how much
 time I have to implement a proof of concept or two.


I suggest having the project proposals ready and published by the time the
application window for students opens. The timeline should be on the GSoC
site.

-- Johan


Re: [Haskell-cafe] How do I get official feedback (ratings) on my GSoC proposal?

2012-02-13 Thread Johan Tibell
On Mon, Feb 13, 2012 at 3:20 PM, Greg Weber g...@gregweber.info wrote:

 Other than changing the status myself, how do I get a priority
 attached to my GSoC proposal?


What priorities are you referring to?

-- Johan


Re: [Haskell-cafe] How do I get official feedback (ratings) on my GSoC proposal?

2012-02-13 Thread Johan Tibell
Yes. I rated some myself and left a motivation for my rating and waited for
someone to disagree. :) In general I was just trying to help students out
by pushing down proposals that (in my experience) were too hard to
complete in a summer or that were too narrow to benefit a larger portion of
the community.


Re: [Haskell-cafe] How do I get official feedback (ratings) on my GSoC proposal?

2012-02-15 Thread Johan Tibell
On Wed, Feb 15, 2012 at 7:40 PM, Ryan Newton rrnew...@gmail.com wrote:
 I'm interested in mentoring any projects related to concurrent data
 structure implementation.  Is it too late to propose new projects?


  http://parfunk.blogspot.com/2012/02/potential-gsoc-haskell-lock-free-data.html

Not at all. It's quite early in fact (I tried to get people to think
about this early on). I'd also post it to the Haskell reddit to make
sure it gets a bit more exposure (it's kind of buried here in this
thread).

-- Johan



Re: [Haskell-cafe] Compressed Data.Map for more efficient RAM usage?

2012-02-16 Thread Johan Tibell
On Thu, Feb 16, 2012 at 1:51 PM, Jeremy Shaw jer...@n-heptane.com wrote:
 Sometimes we  want to store very large collection types in RAM -- such as a
 Data.Map or Data.IxSet.

 It seems like we could trade-off some speed for space savings by compressing
 the values in RAM.

Not knowing the actual data you want to store or the usage pattern I
cannot quite say if these suggestions will be of use:

* If the data is used in a read-only fashion (e.g. as a big lookup
table), consider using an SSTable
(http://en.wikipedia.org/wiki/SSTable). The wiki page doesn't contain
a lot of documentation but I think you can find one implemented in
Cassandra. An SSTable is a sorted table mapping byte-string keys to
byte-string values. It's highly memory efficient, as the keys can be
prefix encoded (they are stored sorted) and whole blocks of the table
can be compressed using e.g. Zippy.

* If you need mutability you could consider using LevelDB.

* If you need a pure data structure, consider using a type-specialized
version of the HAMT data structure which I use in the experimental
hamt branch of unordered-containers
(https://github.com/tibbe/unordered-containers). The HAMT has quite
low overhead as is, and if you specialize the key and value types (at
the cost of lots of code duplication) you can decrease the per
key/value overhead by another 4 words per pair.
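
The prefix encoding mentioned in the SSTable suggestion can be sketched
roughly as follows. This is a toy model under stated assumptions, not
how any particular SSTable implementation lays out its blocks: it uses
String instead of byte strings to stay dependency-free, and the names
Entry, encodeKeys and decodeKeys are made up for this example.

```haskell
-- One encoded key: the number of leading characters shared with the
-- previous key in sorted order, plus the remaining suffix.
data Entry = Entry
  { sharedLen :: Int
  , suffix    :: String
  } deriving (Eq, Show)

-- Length of the longest common prefix of two strings.
commonPrefixLen :: String -> String -> Int
commonPrefixLen xs ys = length (takeWhile id (zipWith (==) xs ys))

-- Encode a sorted list of keys; each key is stored relative to the
-- key before it (the first key is stored relative to the empty string).
encodeKeys :: [String] -> [Entry]
encodeKeys keys = zipWith encode ("" : keys) keys
  where
    encode prev key =
      let n = commonPrefixLen prev key
      in Entry n (drop n key)

-- Rebuild the original keys by replaying the entries in order.
decodeKeys :: [Entry] -> [String]
decodeKeys = drop 1 . scanl step ""
  where
    step prev (Entry n rest) = take n prev ++ rest
```

For example, encodeKeys ["apple", "application", "apply", "banana"]
stores "application" as Entry 4 "ication" and "apply" as Entry 4 "y",
and decodeKeys recovers the original list. The saving depends entirely
on how much neighbouring keys overlap.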

-- Johan


Re: [Haskell-cafe] Compressed Data.Map for more efficient RAM usage?

2012-02-16 Thread Johan Tibell
On Thu, Feb 16, 2012 at 2:03 PM, Antoine Latter aslat...@gmail.com wrote:
 You could have a re-implemented HashMap which would un-pack the
 payload's ByteString constructor into the leaves of the HashMap type
 itself.

 Then you would save on both the keys and the values.

Note that ByteString has a high per-value overhead due to the internal
use of ForeignPtrs and indices that track offset/size:
http://blog.johantibell.com/2011/06/memory-footprints-of-some-common-data.html
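
As a back-of-the-envelope illustration of why this matters for small
values, here is a tiny estimator. Everything in it is an assumption for
the sake of the sketch: the 9-word fixed overhead is the figure I
recall from the post linked above and should be checked there, the
word size assumes a 64-bit platform, and rounding the payload up to
whole words is a simplification. The function names are hypothetical.

```haskell
-- Size of one machine word in bytes; 8 assumes a 64-bit platform.
wordBytes :: Int
wordBytes = 8

-- ASSUMPTION: fixed per-value overhead of a strict ByteString, in
-- words. Verify this figure against the linked post before relying on it.
byteStringOverheadWords :: Int
byteStringOverheadWords = 9

-- Round a payload length up to a whole number of words
-- (a simplifying assumption about how the payload is allocated).
roundUpToWord :: Int -> Int
roundUpToWord n = ((n + wordBytes - 1) `div` wordBytes) * wordBytes

-- Estimated total bytes used by one ByteString with the given payload.
estimatedBytes :: Int -> Int
estimatedBytes payload =
  byteStringOverheadWords * wordBytes + roundUpToWord payload

-- Share of the estimate that is fixed overhead, as a whole percentage
-- (rounded down).
overheadPercent :: Int -> Int
overheadPercent payload =
  (100 * byteStringOverheadWords * wordBytes) `div` estimatedBytes payload
```

Under these assumptions an 8-byte payload is estimated at 80 bytes, of
which 90% is overhead, while a 100-byte payload is estimated at 176
bytes with about 40% overhead.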

-- Johan



Re: [Haskell-cafe] ANN: hp2html, a tool for viewing GHC heap-profiles

2012-02-20 Thread Johan Tibell
Hi Iavor,

On Mon, Feb 20, 2012 at 6:45 PM, Iavor Diatchki
iavor.diatc...@gmail.com wrote:
 Hello,

 I am happy to announce the availability of a little tool that I wrote while
 I was doing some Haskell profiling.  It converts GHC's heap-profiles into
 HTML, and renders them nicely using the flot library.  Its functionality is
 similar to `hp2ps`.  I wrote it because I find the HTML output easier to
 work with and, also, because it can cope with partial profiles, so one can
 refresh the profile while the program is running.  The tool is a very short
 Haskell program, so it should be quite easy to modify and improve (and there
 is a lot that can be improved in it! :-).

Looks really nice. The hovering behavior is nice, but I'd like to see
the legend as well. It makes it quicker when you want to get a quick
overview of what types there are, as the eye can travel back-and-forth
between the graph and the legend.

-- Johan



Re: [Haskell-cafe] ANN: hp2html, a tool for viewing GHC heap-profiles

2012-02-20 Thread Johan Tibell
On Mon, Feb 20, 2012 at 9:23 PM, Iavor Diatchki
iavor.diatc...@gmail.com wrote:
 I started with the legend but it was too big on the program that I was
 profiling, so I switched to the hovering mode. I agree that it is not
 optimal. Perhaps there's a way to instruct flot to show only some of the
 entries or, better, order them in some useful way.  I'm no flot expert, so
 ideas (or patches) on how to do it would be most appreciated!

Big in what sense? Area wise? You could perhaps put it outside the flot graph.

-- Johan


