Re: happstack-ixset internals/performance (was Re: [Haskell-cafe] Inverse of HaskellDB)

2010-10-02 Thread Thomas M. DuBuisson
Thanks Jeremy, I just wrote up my own little analysis (below) while you
were responding.  I'll look for the kd-tree work; if I see discussion
(and am stupid enough to heap more work onto my plate) then I might get
involved.

Oops, didn't send...

Cheers,
Thomas

-

So another glance tells me there is a list of maps (one element for each
index method) and it uses Data.Map under the hood.  So I have O(m lg n)
where m is the number of index methods and n is the number of elements.

Space wise, I think Data.Map takes up 6 words per Bin constructor (1 for
the constructor, 1 for the 'Size' and one for the pointer indirection
for each additional field), so the space is 6 * n * m * w where w is
the word size.  This means indexing by 5 methods for 1M entries takes
about 256MB, assuming 28B per entry that's 28MB of data + 256 indexing ~
282MB needed.

Indexing my imaginary 1B points by user,date,lat,lon is 6 * 2^30 * 4 * 8
- or about 192 GB of indexing + 28GB of data for 220GB total.  Obviously
I shouldn't be talking about keeping a live data set of 28GB in memory
let alone indexing it all, but I was just curious about the ratio (220MB
for 1M points, which is just one heavy user).



On Sat, 2010-10-02 at 14:09 -0500, Jeremy Shaw wrote:
 In the current version of IxSet, the performance of querying on just
 the Lon would be essentially the same as if you just had a Data.Map
 Lon Point. But the queries on the second index are current not so
 great. There is work in progress to rewrite the internals of IxSet to
 be based on a kd-tree, in which case your query should be pretty
 efficient.
 
 So, that answer is pretty vague :) I am in the process of wrapping up
 happstack 0.6 which has focused on fixing some performance issues with
 happstack-server, and refactoring the code so that user API and
 internals are more clearly separated and better documented.
 
 happstack 0.7 is all about happstack-state. A key aspect will be
 nailing down some solid performance benchmarks instead of vague hand
 waving :)
 
 The numbers you give are certainly within the scope of what we would
 like 0.7 to be able to handle. Also, I should note that
 happstack-state and happstack-ixset are independent from each other.
 You can easily use something other than IxSet to store your points and
 still use happstack-state.
 
 - jeremy
 
 On Fri, Oct 1, 2010 at 1:53 PM, Thomas M. DuBuisson
 thomas.dubuis...@gmail.com wrote:
  That is pretty close to how it would look using happstack-state. Here
  is a complete, runnable example which defines the types, a query,
  creates/initializes the database, performs the query, and prints the
  results.
  [snip]
 
  How is data stored in Happstack.State?  I see the Component instance
  uses fromList from happstack-ixset but can't find any information on
  the algorithm used or its efficiency (computationally or wrt space).
 
  If making this more concrete helps then here is a possible use:
 
  == GPS Points ==
  I have a GPS logger that logs every 10 seconds when I jog.  Jogging for
  an hour a day for the past 180 days has resulted in 64k points.
  Pretending I hosted a site for joggers (and all points were in the same
  DB) I could easily result in a billion points ( 20K users).  Would
  happstack-ixset code in the form points @ (Lon -120) @ (Lon -125) @
  (Lat 45) @ (Lat 50) perform reasonably?
 
  Cheers,
  Thomas
 
 


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


happstack-ixset internals/performance (was Re: [Haskell-cafe] Inverse of HaskellDB)

2010-10-01 Thread Thomas M. DuBuisson
 That is pretty close to how it would look using happstack-state. Here
 is a complete, runnable example which defines the types, a query,
 creates/initializes the database, performs the query, and prints the
 results.
[snip]

How is data stored in Happstack.State?  I see the Component instance
uses fromList from happstack-ixset but can't find any information on
the algorithm used or its efficiency (computationally or wrt space).

If making this more concrete helps then here is a possible use:

== GPS Points ==
I have a GPS logger that logs every 10 seconds when I jog.  Jogging for
an hour a day for the past 180 days has resulted in 64k points.
Pretending I hosted a site for joggers (and all points were in the same
DB) I could easily result in a billion points ( 20K users).  Would
happstack-ixset code in the form points @ (Lon -120) @ (Lon -125) @
(Lat 45) @ (Lat 50) perform reasonably?

Cheers,
Thomas

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] cryptohash and an incremental API

2010-07-12 Thread Thomas M. DuBuisson
Vincent,

Due to spam-like comments on -cafe I hadn't been subscribed for a while
and missed your cryptohash discussion!  Particularly:

 The main reason for this library is the lack of incremental api
exposed by
 current digest libraries, and filling the void about some missing
digest
 algorithms; Also the speed comes as a nice bonus.

I've been working on a new crypto library specifically to provide a
unified API for packages implementing cryptographic algorithms.  You can
see the discussions on librar...@haskell.org [1] [2].  Please feel free
to take a look, comment, contribute, and hopefully move to the
interface.  I should be finishing up BlockCipher modes and adding hash
tests soon.

Cheers,
Thomas



[1] http://www.haskell.org/pipermail/libraries/2010-May/013688.html
[2] http://www.haskell.org/pipermail/libraries/2010-June/013782.html

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ByteString, zipWith', and rewrite rules

2010-07-11 Thread Thomas M. DuBuisson
Comments on the zipWith' function inside of Data.ByteString say:

-- Rewrite rules
-- are used to automatically covert zipWith into zipWith' when a pack is
-- performed on the result of zipWith.

This is only true internally to Data.ByteString because the zipWith'
function could be inlined away by GHC once that module is compiled,
right?  If I want pack (zipWith xor bs1 bs2) to be efficient then I'll
have to either get zipWith' exported or write my own lower level
zipWith'?

If I'm wrong here then the comment needs to be moved to somewhere that
will get haddockized, perhaps with zipWith.

Cheers,
Thomas

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Hackage Improvement Ideas

2008-10-23 Thread Thomas M. DuBuisson

 1) Popularity statistics -- like debian's popcon, gives stats on how
 many people have which packages from hackage installed

Popularity has been suggested for some time.  I think any new features
should be going into the happs version of Hackage
( http://code.haskell.org/hackage-server )
Though it seems not to have seen patches in recent weeks.

It also might interest you to know hackage has a trac:
http://hackage.haskell.org/trac/hackage/

Tom

 What else should hackage do?
Automate HPC, and quickChecks.
Automatic package dep graph.
Decentralize and distributed file serving (packages would need signed).
Package signing.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Verifying a list of properties using QuickCheck

2008-10-22 Thread Thomas M. DuBuisson
Thomas van Noort wrote:
 However, I would like a single result for the complete list of 
 properties instead of a result for each property. I realize that this 
 restricts the properties to be of the same type, but that isn't a 
 problem for my application.

You're types can be different, so long as the result of quickCheck
isn't.

I've often seen (and used):

data Test = forall a. (Testable a) = T String a

testSetA :: [Test]
testSetA = [T 1+1=2 prop_onePlusOne
,T 2+2=5 prop_twoPlusTwo
...
]

tests = testSetA ++ testSetB ++ ...

runTests = :: [Test] - Bool
runTests = and $ map runTest tests

runTest :: Test - Bool
runTest (T s a) = putStr (s ++ : )  quickCheck a  putStr \n



Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] About do notation.

2008-10-14 Thread Thomas M. DuBuisson
Magicloud wrote:
 Hi,
 As some articles say, do notation is expand to () and (=)
 when 
 being compiled.
 So I want to know the details. Like:
 main = do
   a - getArgs
   b - getLine
   myFunc1 (head a) b
   myFunc2 b (head a)
 
 I cannot figure out what is the () and (=) way of this.
 
 Thanks.

In the beginning, their was the bind (=) operator and lambdas.  You
bind the results of one action to the argument of the next
(act = \a - act2 a)

A genious realized that not all actions have an intersting result, so
instead of a meaningless binding (act = \_ - act2) you can sequence
the actions (act  act2).

So code would look like:
--
main =
  getArgs = \a - getLine = \b -
  myFunc1 (head a) b 
  myFunc2 b (head a)
--

It didn't take too long and people got in the habbit of lining up the
binding on the right-most column:

--
main =
  getArgs   = \a -
  getLine   = \b -
  myFunc1 (head a) b
  myFunc2 b (head a)
--

And eventually someone figured out this looked an aweful lot like
imparitive code with the variable and function flipped around (and some
ugly characters inbetween).
getArgs = \a -
~
a = getArgs();

So the shortest most non-descript word, 'do', was found to ensure no one
would have to type much.

Tom

* Story not historically accurate.  Technical accuracy is questionable.
Sanity of author not guarenteed.  All rights reserved.  No refunds.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] control-timeout with gtk

2008-09-19 Thread Thomas M. DuBuisson
On Fri, 2008-09-19 at 09:09 -0300, Marco TĂșlio Gontijo e Silva wrote:
 I added the NOINLINE annotations and even tried building with -fno-cse,
 but the result was the same.  Do you have any other suggestions?

A while ago I made a shim using control-event to provide the
control-timeout api in Control.Event.Timeout.  It performs worse than
control-timeout because it computes absolute times on each addTimeout
call, but might be fine for your needs.

If this is a problem with unsafePeformIO it won't be fixed by this
change - control-event uses the same hack to provide the control-timeout
API.  Alternatively, you could pass the timeout data structure as an
argument as expected by the Control.Event module.

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/control-event

Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Python's big challenges, Haskell's big advantages?

2008-09-17 Thread Thomas M. DuBuisson
jason.dusek:
   What does Haskell have to say about cloud computing?

If by 'cloud computing' you wish to discuss mapReduce then:
http://www.cs.vu.nl/~ralf/MapReduce/paper.pdf

Map reduce in Haskell, enjoy!

Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Weekly News: Issue 85 - September 13, 2008

2008-09-13 Thread Thomas M. DuBuisson

 What would theorem proofs do for me?   
Imagine if you used SmallCheck to exhastively test the ENTIRE problem
space for a given property.  Now imagine you used your brain to show the
programs correctness before the heat death of the universe...

Proofs are not features, nor are they code.  What you prove is entirely
up to you and might not be what you think.  Take, for example, the issue
of proving a sort function works correctly [1].

I'm not saying this to discourage complete proofs, but just cautioning
you that proving something as unimportant and IO laden as an IRC bot
probably isn't the best example.  Do see the linked PDF, and [2] as
well.

Oh, and for examples where people should have used FM, search for
'ariane 1996'  or the gulf war patriot missle failure

TomMD

[1]
http://www.cl.cam.ac.uk/~mjcg/Teaching/SpecVer1/Lectures/pslides07x4.pdf
[2] http://users.lava.net/~newsham/formal/reverse/

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Speed Myth

2008-08-26 Thread Thomas M. DuBuisson

  Wow!  3x the performance for a simple change.  Frustrating that there
  isn't a protable/standard way to express this.  Also frustrating that
  the threaded version doesn't improve on the situation (utilization is
  back at 50%).
GR, retraction, retraction!

I was obviously too tired when I posted this.  In generallizing the
system to take run-time specified number of CPUs (for forkOnIO) and
tokens the behavior changed from my 3 minute runs.  I'll play with it
more tonight.


Thomas

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Speed Myth

2008-08-25 Thread Thomas M. DuBuisson
dons:
 Simon Marlow sez:
 
 The thread-ring benchmark needs careful scheduling to get a speedup
 on multiple CPUs. I was only able to get a speedup by explicitly
 locking half of the ring onto each CPU. You can do this using
 GHC.Conc.forkOnIO in GHC 6.8.x, and you'll also need +RTS -qm -qw.
 
 Also make sure that you're not using the main thread for any part of
 the main computation, because the main thread is a bound thread and
 runs in its own OS thread, so communication between the main thread
 and any other thread is slow.

I had to see the results for myself :-)

old RTS: 0m54.296s
threaded RTS (-N1):0m56.839s
threaded RTS (-N2):0m52.623s

Wow!  3x the performance for a simple change.  Frustrating that there
isn't a protable/standard way to express this.  Also frustrating that
the threaded version doesn't improve on the situation (utilization is
back at 50%).

Anyway, that was a fun miro-benchmark to play with.

Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Speed Myth

2008-08-24 Thread Thomas M. DuBuisson
 Hmm thanks, that's interesting -- I was think it was probably caused  
 by OS X, but it appears to happen on Linux too.  Could you try running  
 the old code too, and see if you experience the order of magnitude  
 slowdown too?

The original program on my Linux 2.6.26 Core2 Duo:

[EMAIL PROTECTED] Test]$ time ./tr-threaded 100
37

real0m0.635s
user0m0.530s
sys 0m0.077s
[EMAIL PROTECTED] Test]$ time ./tr-nothreaded 100
37

real0m0.352s
user0m0.350s
sys 0m0.000s
[EMAIL PROTECTED] Test]$ time ./tr-threaded 100 +RTS -N2
37

real0m13.954s
user0m4.333s
sys 0m5.736s

--

Seeing as there still was obviously not enough computation to justify
the OS threads in my last example, I made a test where it hashed a 32
byte string (show . md5 . encode $ val):
[EMAIL PROTECTED] Test]$ time ./threadring-nothreaded 100
50
552

real0m1.408s
user0m1.323s
sys 0m0.083s
[EMAIL PROTECTED] Test]$ time ./threadring-threaded 100
50
552

real0m1.948s
user0m1.807s
sys 0m0.143s
[EMAIL PROTECTED] Test]$ time ./threadring-threaded 100 +RTS -N2
552
50

real0m1.663s
user0m1.427s
sys 0m0.237s
[EMAIL PROTECTED] Test]$ 

---

Seeing as this still doesn't beat the old RTS, I decided to increase the
per unit work a little more.  This code will hash 10KB every time the
token is passed / decremented.

[EMAIL PROTECTED] Test]$ time ./threadring-nothreaded 10
(308,77851ef5e9e781c04850a7df9cc855d2)


real2m56.453s
user2m55.399s
sys 0m0.457s

[EMAIL PROTECTED] Test]$ time ./threadring-threaded 10 
(308,77851ef5e9e781c04850a7df9cc855d2)


real3m6.430s
user3m5.868s
sys 0m0.460s

[EMAIL PROTECTED] Test]$ time ./threadring-threaded 10 +RTS -N2
(810,77851ef5e9e781c04850a7df9cc855d2)
(308,77851ef5e9e781c04850a7df9cc855d2)


real1m55.616s
user2m47.982s
sys 0m3.586s

* Yes, I notice its exiting before the output gets printed a couple
times, oh well.

-
REFLECTION

Yay, the multicore version pays off when the workload is non-trivial.
CPU utilization is still rather low for the -N2 case (70%).  I think the
Haskell threads have an affinity for certain OS threads (and thus a
CPU).  Perhaps it results in a CPU having both tokens of work and the
other having none?  

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Propeganda

2008-08-24 Thread Thomas M. DuBuisson
Chris said:
 I personally think such pattern matching errors
 are a weaknesss of the language; with possibly no solutions to resolve.

Actually tools like CATCH [1] exist and could be incorporated into a
compiler to eliminate this problem.

[1] http://www-users.cs.york.ac.uk/~ndm/catch/

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Propeganda

2008-08-23 Thread Thomas M. DuBuisson
wrt head [], Niels said:
 So now what? Action plan = []

Oh come now.  Between ghci, hpc, and manual analysis I've never hit a
Haskell error and thrown my hands up, I can't go any further, I'm at a
complete loss!  Also it helps that I run into this extremely rarely - I
have a larger habit of hidding black holes :-(

Niels said:
 Thank you for the URL, but I'm aware of the work in GHC(i).

It might interest you to know some people actually use hpc for debugging
with reasonble success - it colors the unevaluated sections and this can
be very helpful in determining where a program stopped and threw an
exception.

If you have your eyes on the future you should see Dana Xu's work on
static contract checking (check out her Cambridge page).

Cheers,
Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell Speed Myth

2008-08-23 Thread Thomas M. DuBuisson

 That's really interesting -- I just tried this.
 
 Compiling not using -threaded: 1.289 seconds
 Compiling using -threaded, but not running with -N2: 3.403 seconds
 Compiling using -threaded, and using -N2: 55.072 seconds
 

I was hoping to see a relative improvement when introducting an
opportunity parallelism in the program, so I made a version with two
MVars filled at the start.  This didn't work out though - perhaps some
performance stands to be gained by improving the GHC scheduler wrt cpu /
OS thread affinity for the Haskell threads?

For the curious:

-O2: 7.3 seconds (CPU: 99.7% user)
-O2 -threaded: 11.5 seconds (CPU: 95% user, 5% system)
-O2 -threaded ... +RTS -N2: killed after 3 minutes (CPUs: 15% user, 20%
system)

Thats quite a lot of CPU time going to the system.

Specs:
Linux 2.6.26 (Arch) x86_64
Intel Core 2 Duo 2.5GHz


SOURCE
Notice the threads still exist when a value of 0 is seen - this is OK as
the other value will be terminating ringsize threads earlier and this
thread will never be needed again.

import Control.Monad
import Control.Concurrent
import System.Environment

ring = 503

new l i = do
  r - newEmptyMVar
  forkIO (thread i l r)
  return r

thread :: Int - MVar Int - MVar Int - IO ()
thread i l r = go
  where go = do
  m - takeMVar l
  when (m == 1) (print i)
  putMVar r $! m - 1
  when (m  0) go

main = do
  val - return . read . head = getArgs
  a - newMVar val
  b - newMVar val
  y - foldM new a [2..ring]
  forkIO $ thread (ring+1) y b
  z - foldM new b [(ring + 2)..(ring + ring)]
  thread 1 z a


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] a recent Haskell contribution?

2008-08-20 Thread Thomas M. DuBuisson
You might be talking about my 'ipc' library [1] I mentioned here a
little while ago [2].  Don't forget the plethora of caviats (quick hack,
BSD sockets, trunkates messages around 4 kB).

I thought I'd be interested in developing and supporting some high level
IPC library but I'm really not and I don't foresee many users.  If no
ones becomes interested in this library, and that wouldn't be
surprising, I intend to delete it from hackage (if possible) to help the
community avoid clutter.

[1] http://www.haskell.org/haskellwiki/IPC
[2] http://www.mail-archive.com/haskell-cafe@haskell.org/msg43842.html

On Wed, 2008-08-20 at 03:28 -0500, Galchin, Vasili wrote:
 Hello,
 
  Recently someone made a post about a message-passing IPC that
 they contributed (within one month?). I have been searching the
 Haskell cafe archive to no avail. Can the contributor (or any one
 else) tell where the code is and any papers?
 
 Thank you, Vasili
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's in a name?

2008-08-13 Thread Thomas M. DuBuisson

 Do we have a formal convention for the naming of 
 packages and/or the naming of the modules they contain? 
There is a recommended set of categories and in general I believe
library authors try and follow the previously established names.

 How are name 
 collisions supposed to be avoided?
In the case of pureMD5 I looked at the other modules and decided to name
mine something with a proper hierarchy that doesn't collide with
'Crypto'.  Hence the extra Pure part of the module name.

I believe that an informal process, such as what I did, is much better
than formalizing every aspect of Haskell/Hackage libraries.  The cost in
terms of processes / bureaucracy are just too much to formalize
everything.

Suggestion: Have Hackage warn when a library is uploaded that has Module
conflicts with other libraries.

Thomas

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Recommended Haskell Books

2008-08-10 Thread Thomas M. DuBuisson
I know someone else is going to say it, so I may as well beat them to
the punch:

Real World Haskell isn't released yet, but beta chapters are available
online at book.realworldhaskell.org/beta

As for me, I learned though the Yet Another Haskell tutorial, Haskell
School of Expression (book), Haskell: The Craft of Functional
Programming (book), and plenty of playing around.

Tom

On Sun, 2008-08-10 at 11:29 -0700, Warren Aldred wrote:
 Hi all,
 
 I'm new to Haskell and looking for recommendations on introductory Haskell 
 books.  Online or offline.  Any suggestions?
 
 Thanks kindly,
 Warren 
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] message-passing IPC for Haskell?

2008-07-31 Thread Thomas M. DuBuisson

  I have seen postings about work on message-passing IPCs for
 Haskell. I like STM but want to keep an open mind ... I can't find
 those postings. Can something remind of this work and where/how I can
 read about?

I made a quick hack composing BSD sockets from Network.Socket for higher level 
IPC.  It was for a one use deal, but you're free to use and improve on the 
library - called 'ipc' on hackage.

Tom


Hackage:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/ipc

Basic homepage:
http://www.haskell.org/haskellwiki/IPC


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


GHCi Debugger (was Re: [Haskell-cafe] Architecturally flawed)

2008-07-10 Thread Thomas M. DuBuisson

 I could try GHC's new debugger. But my experiences with it so far have 
 shown that for all but the most trivial programs possible, it becomes 
 intractably difficult to figure out what the debugger is actually 
 showing you. 

GDB is to C as
(a) GHCi debugger :: Haskell
(b) Pigs :: Farmers
(c) Food :: TomMD
(d) None of the above

Hint: Its not (a).  The GHCi debugger seems to catch extra flack because
people want to pour through their Haskell code as they do imperative
code.  I can sympathize - I would like to do that too - but it would be
an inaccurate picture of the programs execution.  so long as you regard
the GHCi as a new/useful tool and not try to pretend its like other
debuggers you'll probably be happier.

I've found ghcid to be useful when quickcheck + HPC + ChasingBottoms
fails to narrow down the problem any further.  Its gotten to the point
where I often know exactly which LOC/module will be the next step (based
on knowledge of the data dependencies).  A fair share of bugs have
fallen to the sword named vim as a result :-).  It really is useful, but
like all other haskellisms, one must learn to ride a bike all over
again.

At times I think of ghcid as the anti-gdb.  If there's a series of let
bindings, each mutating the predecessor, its enjoyable to see the
debugger start at the bottom and crawl its way back up.

Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Why the exception?

2008-06-24 Thread Thomas M. DuBuisson
Cafe
I'm a bit lost on this exception and curious about what's going on.  Is
there a valid reason for this exception that I am missing?  Note the
hard-coded [0..100] could be any Word8 list you want (generated via
arbitrary, [], or other) and it gives the same result.

Load the module and perform:
  :break prop_LPS
  quickCheck prop_LPS
  :step
  :force ps
 *** Exception: Prelude.head: empty list

 import qualified Data.ByteString as L
 import Test.QuickCheck

 instance Arbitrary L.ByteString where
 arbitrary = do
 return $ L.pack [0..100]

 prop_LPS :: L.ByteString - Bool
 prop_LPS ps = ps `seq` True

P.S. Same result with GHCi 6.8.2 and 6.8.3.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to do this in FP way?

2008-06-15 Thread Thomas M. DuBuisson
Magicloud Magiclouds wrote:
 Say I have something like this in C:
 static int old;
 int diff (int now) { /* this would be called once a second */
   int ret = now - old;
   old = now;
   return ret;
 }
 Because there is no variable in Haskell. So how to do this in a
 FP way?

So you have a global time count that is updated every second - I presume
for the rest of the process to use, avoiding syscalls due to expense.
Am I getting this right?

You can use mutable variables in Haskell, though some people will frown,
but it might fit your need.  See MVar, STM, or IO Refs.  At any rate, if
what you are doing needs to check the clock time then this code isn't
going to be pure.

Perhaps you just want something in particular to be triggered every
second then use control-timeout, event-list, or control-event.

Tom


signature.asc
Description: This is a digitally signed message part
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Documenting the impossible

2008-06-14 Thread Thomas M. DuBuisson
I think when Andy made HPC he added a way to mark code unreachable so it
wouldn't harm your test coverage report.


On Sat, 2008-06-14 at 19:58 +0100, Andrew Coppin wrote:
 I have a small idea. I'm curios if anybody else thinks it's a good idea...
 
 How about a {-# IMPOSSIBLE #-} pragma that documents the fact that a 
 particular point in the program *should* be unreachable?
 
 For example, you look up a key in a Map. You know the key is there, 
 because you just put it there yourself two seconds ago. But, 
 theoretically, lookup returns a Maybe x so - theoretically - it's 
 possible it might return Nothing. No matter what you do, GHC will insert 
 code to check whether we have Just x or Nothing, and in the Nothing case 
 throw an exception.
 
 Obviously there is no way in hell this code path can ever be executed. 
 At least, assuming your code isn't broken... This is where the pragma 
 comes in. The idea is that you write something like
 
   case lookup k m of
 Just v - process v
 Nothing - {-# IMPOSSIBLE lookup in step 3 #-}
 
 When compiled without optimisations, the pragma just causes an exception 
 to be thrown, rather like error does. When compiled with 
 optimisations, the whole case alternative is removed, and no code is 
 generated for it. (And if the impossible somehow happens... behaviour is 
 undefined.) So you test your program with your code compiled 
 unoptimised, and when you're sure the impossible can't happen, you 
 tell the compiler to remove the check for it. (Actually, it would 
 possibly be a good idea to have a switch to turn this on and off 
 seperately if needed...)
 
 Does anybody think this is a useful idea? If so I'll add a feature 
 request ticket to the GHC Trac. But I want to see if folks like the idea 
 first...
 
 (I was thinking the message in the pragma would be what gets displayed 
 on screen, along with some auto-generated location info. Just a module 
 name and line number ought to be sufficient...)
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


signature.asc
Description: This is a digitally signed message part
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] [WARN] Bug fix release of pureMD5 (Was: pureMD5)

2008-06-12 Thread Thomas M. DuBuisson
Cafe,
Daniel Larsson noticed a correctness issue with the pureMD5 package.
This issue would affect you if you built the value incrementally via the
'updateMD5' function (vs just using 'md5') and didn't provide 512 bit
long bytestrings (an MD5 block of operation).

As you can probably tell, I didn't invest enough into the
non-performance aspects of pureMD5.  Faced with actual users ;-), I have
released version 0.2.0 which has the bug fix, a new API (type prevention
from re-finalizing a digest), and a reasonable set of quickchecks
(covering Show / Binary instances, known answer and incremented
hashing).  Oh, also the module name has changed to place it inline with
'Crypto' package naming while not colliding.

Sorry if this causes anyone headaches.

Daniel,
Good catch, I hope this didn't consume much of your time.  Thanks a
bunch!

TomMD

On Fri, 2008-06-13 at 04:14 +0200, Daniel Larsson wrote:
 Hi Thomas,
 
 
 I was fiddling around with your pureMD5 package, and encountered some
 problems with using the md5Update/md5Finalize sequence. I tried to
 calculate the md5 of some scattered data, but it kept returning the
 wrong values. It seems that each md5Update must supply an exact
 blockSize number of bits, since the mdLeftOver part isn't taken into
 account in subsequent calls to md5Update.
 
 
 I wrote a small patch, and a simple QuickCheck property, to support
 calculating md5 of scattered data, attached to this mail. Hopefully I
 didn't mess up something...
 
 
 --
 Daniel


signature.asc
Description: This is a digitally signed message part
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Simple list processing

2008-06-11 Thread Thomas M. DuBuisson
Why is there no mapAccumL' (strict)?  Just a library deficiency that we
can remedy or am I missing something?

Don Stewart wrote:
 andrewcoppin:
  OK, so this is a fairly basic question about list processing.
  
  Several times now, I have found myself wanting to process a list to 
  produce a new list. However, the way later elements are processed 
  depends on what the earlier elements are - in other words, this isn't a 
  simple map.
  
  What is the best way to handle this?
  
  According to the theory, anything that consumes a list and produces a 
  value is some kind of fold. [Assuming it traverses the list in a 
  sensible order!] So it looks like you could implement this as a fold. 
  But should that be a LEFT-fold or a RIGHT-fold? (I always get confused 
  between the two!)
  
 
 Sounds like a mapAccum, a combination of map and fold,
 
 
 http://haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html#v%3AmapAccumL
 
 If you gave a concrete example it would be easier to diagnose.
 
 -- Don
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [ANN] Bindings to Xen Control (xenctrl.h)

2008-06-07 Thread Thomas M. DuBuisson
Version 0.0.3 was release, which is nearly a complete (and completely
trivial) xenctrl.h binding.  Aside from unit tests and programs, the
main things missing are grant table operations.  There is now a quick
and dirty home page at the wiki [1] and a darcs repo on c.h.o [2].
Comments, requests, and contributions are all encouraged.

TomMD

P.S. I intend to finish the bindings then move onto
design/implementation of the higher level 'System.Xen' module when time
permits.

[1] http://haskell.org/haskellwiki/HsXenCtrl
[2] http://code.haskell.org/~tommd/hsXenCtrl

On Mon, 2008-06-02 at 00:19 -0400, Thomas M. DuBuisson wrote:
 Don,
 I'll throw future work ideas in the next releases cabal.  The most
 obvious doors opened are Haskell rewrites of the current Xen
 infrastructure (virt-install, xm, xend).  Slightly more interesting
 tasks could be (warning: random thoughts):
 
 1) HAPPS server that can manage Xen domains (without requiring python
 libs or System.cmd)
 2) Network or BSD socket based remote management programs in Haskell.
 haxr or session-types might be of use here.
 3) With the functions that can alter the CPUs availability, more complex
 rules or sharing arrangements could be implemented.  Same concept
 applies to other resources such as PCI and memory allocation.  Seeing as
 Xen doesn't over-commit memory, a guest VM program that detects low/high
 memory situations and reports to the host for policy controlled
 balloning (adjustment of VM allocated memory) would be kind of fun...
 but probably complete overkill for any user of Xen.
 
 Right now I think the bindings are enough for an 'xm' replacement but
 nothing more.  I've over doubled the number of functions bound just now.
 There will probably be regular releases for the next few weeks as I find
 time.
 
 Thomas
 
 On Sat, 2008-05-31 at 23:31 -0700, Don Stewart wrote: 
  thomas.dubuisson:
   All,
   I'm just getting started with hsXenCtrl [1] as both a fun way to play
   with Xen and become proficient with Haskell FFI.  Once I get my
   community.haskell.org account squared away I'll likely setup a public
   darcs repo (and a homepage somewhere).
   
   As for modules: I intend to expand on the trival FFI bindings in
   System.Xen.CBindings and there'll be a higher level interface in
   System.Xen (functions will standardize on returning Either or throwing
   exceptions, hiding the xc_handle open/close, etc).
   
   Typical Disclaimers:
   The _only_ thing I've actually done with this, just tonight, is pause /
   unpause domains.  API is subject to change!  Only my exact build of Xen
   is sure to work, and no promise even then.
   
   TomMD
   
   [1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hsXenCtrl
  
  Good stuff. Perhaps the synopsis should include a link or description of
  the typical use cases/ thinks we might achieve?
  
  -- Don

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] [ANN] Bindings to Xen Control (xenctrl.h)

2008-06-01 Thread Thomas M. DuBuisson
Don,
I'll throw future work ideas in the next releases cabal.  The most
obvious doors opened are Haskell rewrites of the current Xen
infrastructure (virt-install, xm, xend).  Slightly more interesting
tasks could be (warning: random thoughts):

1) HAPPS server that can manage Xen domains (without requiring python
libs or System.cmd)
2) Network or BSD socket based remote management programs in Haskell.
haxr or session-types might be of use here.
3) With the functions that can alter the CPUs availability, more complex
rules or sharing arrangements could be implemented.  Same concept
applies to other resources such as PCI and memory allocation.  Seeing as
Xen doesn't over-commit memory, a guest VM program that detects low/high
memory situations and reports to the host for policy controlled
balloning (adjustment of VM allocated memory) would be kind of fun...
but probably complete overkill for any user of Xen.

Right now I think the bindings are enough for an 'xm' replacement but
nothing more.  I've over doubled the number of functions bound just now.
There will probably be regular releases for the next few weeks as I find
time.

Thomas

On Sat, 2008-05-31 at 23:31 -0700, Don Stewart wrote: 
 thomas.dubuisson:
  All,
  I'm just getting started with hsXenCtrl [1] as both a fun way to play
  with Xen and become proficient with Haskell FFI.  Once I get my
  community.haskell.org account squared away I'll likely setup a public
  darcs repo (and a homepage somewhere).
  
  As for modules: I intend to expand on the trival FFI bindings in
  System.Xen.CBindings and there'll be a higher level interface in
  System.Xen (functions will standardize on returning Either or throwing
  exceptions, hiding the xc_handle open/close, etc).
  
  Typical Disclaimers:
  The _only_ thing I've actually done with this, just tonight, is pause /
  unpause domains.  API is subject to change!  Only my exact build of Xen
  is sure to work, and no promise even then.
  
  TomMD
  
  [1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hsXenCtrl
 
 Good stuff. Perhaps the synopsis should include a link or description of
 the typical use cases/ thinks we might achieve?
 
 -- Don

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Performance: MD5

2008-05-17 Thread Thomas M. DuBuisson
Andrew,
I spent a reasonable amount of time making pureMD5 (available on
hackage) faster, which mainly ment strictness annoitations and unboxing
strict fields, but I also spent a good deal of time with the profiler.
One of my early versions was fairly slow due to the converting of the
LPS to blocks (an Isabelle proven 'blocks' function) caused O(n) space
requirement.  I've been meaning to revisit this to optimize more and
look closly at GHC core.

I'm on vacation now, but when I get back I'll try to make time to look
at your code (unless its moot by then).  Finally, I encourage anyone
interested to make reasonably fast bytestring implementations of SHA
algorithms as well - Haskell/Hackage has a number of pure and FFI
implementations of MD5 but is a bit lacking in any other cryptographic
algorithm.

TomMD

On Sat, 2008-05-17 at 15:51 +0100, Andrew Coppin wrote:
 Hi folks.
 
 OK, try this:
 
   darcs get http://darcs.orphi.me.uk/MyMD5
   cd MyMD5
   ghc -O2 --make md5sum
   md5sum some large filename
 
 On my PC, it takes 3.5 seconds to compute the MD5 hash of a 10 MB file. 
 For comparison, the md5.exe I downloaded from the Internet does it 
 instantaneously. It's literally so fast I can't even time it. As far as 
 I know, my program always produces the *correct* answer, it just takes 
 far too long to do it.
 
 If anybody has any interesting insights as to why my version is so 
 damned slow, I'd like to hear them. (Indeed, a description of the 
 process for tracking the problem down would also be useful!)
 
 [Much to my bemusement, when I had a look at the code, I discovered that 
 tucked away in a corner somewhere, it reads a ByteString from disk and 
 instantly converts it to [Word8]. Uh... oops? Anyway, eliminating this 
 changed the numbers I get from the profiler, but it's still far too slow.]
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] type families and type signatures

2008-04-06 Thread Thomas M. DuBuisson
Id is an operation over types yielding a type, as such it doesn't make
much sense to me to have (Id a - Id a) but rather something like (a -
Id a).  One could make this compile by adding the obvious instance:

 type instance Id a = a

Curiously, is this a reduction from a real world use of families?  I
just can't think of how a (Fam a - Fam a) function would be of use.

Cheers,
Thomas

Ganesh Sittampalam wrote:
 The following program doesn't compile in latest GHC HEAD, although it does 
 if I remove the signature on foo'. Is this expected?
 
 Cheers,
 
 Ganesh
 
 {-# LANGUAGE TypeFamilies #-}
 module Test7 where
 
 type family Id a
 
 type instance Id Int = Int
 
 foo :: Id a - Id a
 foo = id
 
 foo' :: Id a - Id a
 foo' = foo
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-09 Thread Thomas M. DuBuisson
I minor changes, fixing up my chunking function (finally) thus
eliminating the space leak.  Performance is now under 3x that of C!
Yay!  Also, nano MD5 benched at 1.15x 'C' (for files small enough for
strict ByteStrings to do ok).

Get the code:
darcs get http://code.haskell.org/~tommd/pureMD5

On the 2GB benchmark it is even more competitive (see my blog on
sequence.complete).  Let me know if you get significantly different
results (and you will if you IO doesn't horribly bottle neck you like on
my laptop).

-Tom

 You might like to test against,
 
 http://hackage.haskell.org/cgi-bin/hackage-scripts/package/nano-md5-0.1
 
 which is a strict bytestring openssl binding.
 
 -- Don
-- 
The philosophy behind your actions should never change, on the other
hand, the practicality of them is never constant. - Thomas Main
DuBuisson

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-08 Thread Thomas M. DuBuisson
Glad you asked!

http://sequence.complete.org/node/367

I just posted that last night!  Once I get a a community.haskell.org
login I will put the code on darcs.

The short of it it:
1) The code is still ugly, I haven't been modivated to clean.
2) Manually unrolled, it is ~ 6 times slower than C
3) When Rolled it is still much slower than that
4) There is some optimizer bug in GHC - this code could be 2x faster, I
feel certain.
5) I benchmarked using a 200MB file, so I think it will handle whatever.

Thomas DuBuisson

On Thu, 2007-11-08 at 22:14 +, Andrew Coppin wrote:
 Don Stewart wrote:
  dpiponi:

  I was getting about 1.5s for the Haskell program and about 0.08s for
  the C one with the same n=10,000,000.
  
 
  I'm sure we can do better than that!

 That's the spirit! :-D
 
 
 Speaking of which [yes, I'm going to totally hijack this thread now...], 
 does anybody have a Haskell MD5 hash implementation that goes fast? 
 IIRC, I found one in MissingH, and it worked great. Except that as soon 
 as you feed it a 10 MB file, the standard Unix md5sum executable takes 
 about 0.001s to do it, and the Haskell version goes crazy and starts 
 eating virtual memory like candy. o_O (Although given a few minutes it 
 *does* produce the correct answer. But given that I want to run it over 
 an entire CD..)
 
 Given the choise, I'd *like* to find a fast 100% Haskell implementation 
 - but failing that, (nice) bindings to a fast C implementation will do I 
 guess. (I *only* need to compute MD5 hashes for files on disk. I don't 
 need to do anything more fancy than that...)
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe
-- 
The philosophy behind your actions should never change, on the other
hand, the practicality of them is never constant. - Thomas Main
DuBuisson

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Of phantom types and type extentions

2007-10-15 Thread Thomas M. DuBuisson
All,

I've been casually developing a PacketBB (i.e. Generalized Manet Packet
Format) library in Haskell.  I think I have a need to pass state
information as a phantom type - I'll step through the issue now.

With the 'AddressBlock' (S5.2.1 packetBB draft 8), network addresses are
abbreviated as sets of bytes (potentially just one byte each, with a
head or tail identical with other addresses).  How many bytes are in the
set is determined, in part, by the type of address stored (ex: IPv4 or
IPv6).  Thus, when serializing, I need to provide this information.

Saying this again, but in (simplified) code:

data NetworkAddress a = AddressBlock a =
  AddrBlkWire {
lenHd   :: Word8,
hd  :: [Word8],
lenTl   :: Word8,
tl  :: [Word8],
nrAddrs :: Word8,
addrs   :: [Word8] }
| AddrBlkAbstract [a]

data (NetworkAddress a) = SomeHigherLevelStruct a =
SHLS (AddressBlock a) Word32 Word8

-- length (addrs x) == (TotalAddressLength - lenHd - lenTl) * nrAddrs

I can think of several ways to convert between AddrBlkWire and
ByteStrings:
1) Make separate instance of 'Binary' for each data type element of
NetworkAddress.
instance Binary (AddressBlock IPv4) where
get = ...
put = ...
instance Binary (AddressBlock IPv6) where
get = ...
put = ...

This solution immediately causes problems with every higher level
structure you wish to serialize.  For example, now you have to have
individual instance for SHLS, you can't do:

instance (NetworkAddress a) = Binary (SomeHigherLevelStruct a) where
...

2) You can pass another argument into a custom 'get' routine.  I see
this as a hack that makes me break a good naming convention.

getNetworkAddress :: Int-- bytes per address
- Get NetworkAddress

3) If you don't worry about decoding, only encoding, then an extra field
in the data structure can fill the void of an extra argument.  Also a
hack.

I'm hoping someone here has a better solution.  Perhaps I am making a
mountain out of a mole hill, or perhaps this requires one of those type
system extensions I have yet to look hard at.  The solution I would want
looks like this:

class NetworkAddress a where
addressByteSize :: a - Int

instance (NetworkAddress a) = Binary (AddressBlock a) where
get = do
lenH - get
h- replicateM get (fromIntegral lenH)
lenT - get
t- replicateM get (fromIntegral lenT)
nr   - get
let addrSize = addressByteSize (undefined :: a)
bytes = (addrSize - lenH - lenT) * nr
addrs - replicateM get (fromIntegral bytes)
return ...

The line 'addrSize = ' is what I don't know how to write.  How does one
call an instance of a type class without knowing the type at compile
time?

Thanks,
Tom

-- 
The philosophy behind your actions should never change, on the other
hand, the practicality of them is never constant. - Thomas Main
DuBuisson

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe