Send Beginners mailing list submissions to
        beginners@haskell.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        beginners-requ...@haskell.org

You can reach the person managing the list at
        beginners-ow...@haskell.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1.  Re: I have created an ugly Haskell program.. (Heinrich Apfelmus)
   2.  Does System.Directory work on Windows XP?
      (Patrick Larrivee-Woods)
   3. Re:  Does System.Directory work on Windows XP? (Jason Dusek)
   4. Re:  Does System.Directory work on Windows XP?
      (Patrick Larrivee-Woods)
   5. Re:  Does System.Directory work on Windows XP? (Daniel Fischer)
   6.  Lazy file IO & Space leaks/waste (Aleksandar Dimitrov)
   7.  if ands (Nathan M. Holden)
   8. Re:  if ands (Joe Fredette)
   9. Re:  if ands (Keith Sheppard)


----------------------------------------------------------------------

Message: 1
Date: Wed, 04 Nov 2009 18:21:17 +0100
From: Heinrich Apfelmus <apfel...@quantentunnel.de>
Subject: [Haskell-beginners] Re: I have created an ugly Haskell
        program..
To: beginners@haskell.org
Message-ID: <hcsd6d$vv...@ger.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1

Brent Yorgey wrote:
> Ask yourself: What Would Conal Do (WWCD)?  Conal Elliott is always
> trying to get people to think about the semantic essence of their
> problems, so let's try it.
> 
> What are we REALLY trying to do here?  What are those lists of tuples,
> REALLY?  Well, it seems to me that the lists of tuples are really just
> representing *functions* on some totally ordered domain.
> [...]
>
> So, let's try converting these lists of pairs to actual functions:
> 
> 
>   asFunc :: (Ord a) => [(a,b)] -> (a -> Maybe b)
>   asFunc is a = fmap snd . listToMaybe . reverse . takeWhile ((<=a) . fst) $ 
> is
>
> [...]
>
> Now, you might object that this is much more inefficient than the
> other solutions put forth.  That is very true. [...]
>
> However, I still find it very helpful to think about the essence
> of the problem like this: elegant yet inefficient code is a much
> better starting place than the other way around! [...]
>
> You can also try to optimize, taking advantage of the fact that we
> always call the functions built by asFunc with a sequence of strictly
> increasing inputs.

I am with Brent and Conal here. Now, to continue, ask yourself: What
Would Conal Do Next (WWCDN)?

What are we really trying to do here? What is this function, really,
considering that we are only evaluating it at a strictly increasing
sequence of inputs? Well, it seems to me that it is some special kind of
function, best captured as an *abstract data type*.


In particular, the function is something which I will call a "time
series". In other words, the input is to be thought of as time.

    data Time t = Moment t | Infinity
                deriving (Eq,Ord,Show)

The inclusion of infinity will turn out to be very convenient.

Now, the time series is a function that has a value  x1  in the distant
past, until a time  t1  where it begins to have the value  x2 , again
until a time  t2  where it switches to  x3  and so on, until a value  xn
 that is kept until infinity. In Haskell, this looks like this

  function t
     | -Infinity <= t && t < t1       = x1
     |        t1 <= t && t < t2       = x2
     |        t2 <= t && t < t3       = x3
     | ...
     |        t1 <= t && t < Infinity = xn

and pictorially, something like this:

                                                 ____ xn _____
                ____ x2 ____                    |
               |            |____ x3 ____ ...   |
  _____ x1 ____|

 -Inf          t1            t2           ...   tn          Inf


Of course, we can implement this abstract data type with a list of pairs
 (tk,xk)

    newtype TimeSeries t a = TS { unTS :: [(a,Time t)] }
                           deriving (Show)

and our goal is to equip this data type with a few natural operations
that can be used to implement Philip's zip-like function.


The first two operations are

    progenitor :: TimeSeries t a -> a
    progenitor = fst . head . unTS

which returns the value from the distant past and

    beginning :: TimeSeries t a -> Time t
    beginning = snd . head . unTS

which returns the first point in time when the function changes its
value. These correspond to the operation  head  on lists.


The next operation is called  `forgetTo` t  and will throw away all
values and changes before and including a given time  t .

    forgetTo :: Ord t => TimeSeries t a -> Time t -> TimeSeries t a
    forgetTo (TS xs) Infinity = TS [last xs]
    forgetTo (TS xs) t        = TS $ dropWhile ((<= t) . snd) xs

This roughly corresponds to  tail , but takes advantage of the time
being continuous.


Last but not least, we need a way to create a time series

    forever :: a -> TimeSeries t a
    forever x = TS [(x,Infinity)]

and we need to add values to a time series, which can be done with an
operation called  prepend  that adds a new  beginning  and  replaces the
 progenitor .

        -- We assume that  t < beginning xs
    prepend :: a -> Time t -> TimeSeries t a -> TimeSeries t a
    prepend x Infinity _       = TS [(x,Infinity)]
    prepend x t        (TS xs) = TS $ (x,t) : xs

These operations correspond to [] and (:) for lists.


The key about these operations is that they have a description /
intuition that is *independent* of the implementation of times series.
At no point do we need to know how exactly  TimeSeries  is implemented
to understand what these five operations do.

Now, Philip's desired zip-like function is straightforward to implement:

    zipSeries :: Ord t => TimeSeries t a -> TimeSeries t b
                          -> TimeSeries t (a,b)
    zipSeries xs ys = prepend (progenitor xs, progenitor ys) t $
        zipSeries (xs `forgetTo` t) (ys `forgetTo` t)
        where t = min (beginning xs) (beginning ys)

and you may want to convince yourself of its correctness by appealing to
the intuition behind time series.


Regards,
apfelmus

--
http://apfelmus.nfshost.com



------------------------------

Message: 2
Date: Wed, 04 Nov 2009 16:02:47 -0800
From: Patrick Larrivee-Woods <plarr...@sfu.ca>
Subject: [Haskell-beginners] Does System.Directory work on Windows XP?
To: beginners@haskell.org
Message-ID: <4af21627.5030...@sfu.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi,

I"m having trouble getting the System.Directory module in ghci on a 
Windows XP machine. Whenever I call a function like getCurrentDirectory 
or getDirectoryContents I get no results back. I was wondering if anyone 
knows why this is happening, or if someone can point me to a module that 
works on Windows.

Here's an example of what I get:

   ___         ___ _
  / _ \ /\  /\/ __(_)
 / /_\// /_/ / /  | |      GHC Interactive, version 6.4.2, for Haskell 98.
/ /_\\/ __  / /___| |      http://www.haskell.org/ghc/
\____/\/ /_/\____/|_|      Type :? for help.

Loading package base-1.0 ... linking ... done.
Prelude> :module System.Directory
Prelude System.Directory> getCurrentDirectory
Prelude System.Directory> getDirectoryContents "."
Prelude System.Directory> getDirectoryContents "c:\\perl"
Prelude System.Directory>


------------------------------

Message: 3
Date: Wed, 4 Nov 2009 16:02:49 -0800
From: Jason Dusek <jason.du...@gmail.com>
Subject: Re: [Haskell-beginners] Does System.Directory work on Windows
        XP?
To: Patrick Larrivee-Woods <plarr...@sfu.ca>
Cc: beginners@haskell.org
Message-ID:
        <42784f260911041602h32aa484cg391e8ad3f583a...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

  Could you try a recent version of GHC and let us know if you
  still have trouble?

--
Jason Dusek


------------------------------

Message: 4
Date: Wed, 04 Nov 2009 16:14:59 -0800
From: Patrick Larrivee-Woods <plarr...@sfu.ca>
Subject: Re: [Haskell-beginners] Does System.Directory work on Windows
        XP?
To: Jason Dusek <jason.du...@gmail.com>
Cc: beginners@haskell.org
Message-ID: <4af21903.3070...@sfu.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed

With 6.10.4 it works fine. Thanks for your help,  I hadn't even realized 
I was using an older version.

Cheers,
Patrick

Jason Dusek wrote:
>   Could you try a recent version of GHC and let us know if you
>   still have trouble?
>
> --
> Jason Dusek
>   


------------------------------

Message: 5
Date: Thu, 5 Nov 2009 01:59:12 +0100
From: Daniel Fischer <daniel.is.fisc...@web.de>
Subject: Re: [Haskell-beginners] Does System.Directory work on Windows
        XP?
To: beginners@haskell.org
Message-ID: <200911050159.12432.daniel.is.fisc...@web.de>
Content-Type: text/plain;  charset="utf-8"

Am Donnerstag 05 November 2009 01:14:59 schrieb Patrick Larrivee-Woods:
> With 6.10.4 it works fine. Thanks for your help,  I hadn't even realized
> I was using an older version.
>
> Cheers,
> Patrick

Just to explain:

getDirectoryContents "." is an IO action producing a String.
In newer GHC releases (I don't remember whther that change came with 6.6 or 
6.8), when you 
invoke an IO action from the ghci-prompt, if the value returned from the action 
has not 
type (), by default it is bound to the variable 'it' and printed out (since 
it's not 
unconditionally a good thing to do that - consider readFile "HUGEFile.txt" - it 
can be 
disabled via
ghci> :set -fno-print-bind-result
- re-enable with -fprint-bind-result).
Before, the action was simply run and its result not printed out, to use the 
result of an 
IO action, you had to use
ghci> res <- getDirectoryContents "."
ghci> print res

or
ghci> getDirectoryContents "." >>= mapM_ putStrLn
or whatever.

If you still have 6.4.2 installed, you can try it out (but do your real work 
with the 
newer, it produces better code).

>
> Jason Dusek wrote:
> >   Could you try a recent version of GHC and let us know if you
> >   still have trouble?
> >
> > --
> > Jason Dusek



------------------------------

Message: 6
Date: Thu, 5 Nov 2009 12:01:11 +0100
From: Aleksandar Dimitrov <aleks.dimit...@googlemail.com>
Subject: [Haskell-beginners] Lazy file IO & Space leaks/waste
To: beginners@haskell.org
Message-ID: <20091105110110.ga7...@bylha.uni-tuebingen.de>
Content-Type: text/plain; charset="us-ascii"

Hello list,

I'm currently writing a small linguistic corpus analyzer. My input file is only
25MB, but profiling shows that the overall amount of allocation over the
program's runtime is several GB. That's a little too much - adding to that is
the fact that the program is abysmally slow, so I'm suspecting a space leak
somewhere.

I'm using the ByteString.Lazy.Char8 class in order to work efficiently with lazy
IO and I must admit that I'm very inexperienced with predicting runtime and
space behaviour of lazy IO :-( It worked well in the past, but I'm stuck now.

The program can be found here:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=11863#a11863

The important bits are as follows:

> mf :: [C.ByteString] -> StdWord
> mf [] = Word [] C.empty
> mf s = Word (tail s) (head s)
> 
> f' = mf . reverse . C.words

> main :: IO ()
> main = do
>     corpus_name <- liftM head getArgs
>     corpus <- liftM (Corpus . (map f') . C.lines) $ C.readFile corpus_name
>     print $ length (content corpus)
>     let interesting = filterForInterestingTags interestingTags corpus
>     print $ show (freqMap interesting)

freqMap also uses foldr' to fold the corpus it is given into a Data.Map, but
according to the profiling output, that's not where the bad stuff happens.
Here's the interesting parts of the profiling output:

>    analyse +RTS -p -RTS corpus/bnc-samp.tt
>
> total time  =       30.46 secs   (1523 ticks @ 20 ms)
> total alloc = 14,871,619,904 bytes  (excludes profiling overheads)
>
> COST CENTRE                 MODULE no. entries  %time %alloc %time %alloc
>
> MAIN                        MAIN     1        0   0.0    0.0 100.0  100.0
>   main                      Main   221        0   0.0    0.0   0.0    0.0
>  CAF                        Main   206       14   0.1    0.1 100.0  100.0
>   showsPrec_aSF             Main   223        1   0.0    0.0   0.0    0.0
>   interestingTags           Main   218        1   0.0    0.0   0.0    0.0
>   f'                        Main   216        1   0.0    0.0   0.0    0.0
>    mf                       Main   217        1   0.0    0.0   0.0    0.0
>   main                      Main   212        1   2.4    4.4  99.9   99.9
>    f'                       Main   219        0  89.0   91.5  90.2   92.7
>     mf                      Main   220  2427450   1.2    1.2   1.2    1.2
>    filterForInterestingTags Main   215        1   5.4    0.1   5.4    0.1
>    freqMap                  Main   214        1   0.6    0.5   2.0    2.7
>     compare_aRL             Main   222  1141996   1.4    2.2   1.4    2.2

Obviously the main cost centre seems to be f' (which I've factored out in order
to view it separately as a cost centre.) It's not the call to reverse that makes
it slow (removing it doesn't affect the run time.) Interestingly, it prints out
the first statement (length) very quickly, then takes a lot of time. I've
removed the call to freqMap, and it seems that GHC is smart enough to drop the
call to f' completely, because only the length ever gets evaluated. But the
freqMap also needs the processing from f', and that's where the bad stuff starts
happening.

Should I look into DeepSeq? I've tried adding strictness to the functions by
hand, and also to the data structures, but that didn't seem to help so far. So
I'm looking for solutions to make this faster. I'm guessing that a smart `seq`
somewhere might help, but I don't know where :-\

As a side note: how can I find out how the run-time data structures look like in
memory? I.e. I want to know *what* exactly stays around as thunks in memory so
that I can focus better on where to add strictness or redirect the program flow.
Ideally, only a very smart part of the file should ever be in memory, with
processing happening incrementally!

Thanks for any pointers!
Aleks
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : 
http://www.haskell.org/pipermail/beginners/attachments/20091105/5917de71/attachment-0001.bin

------------------------------

Message: 7
Date: Thu, 5 Nov 2009 21:05:22 -0500
From: "Nathan M. Holden" <nathanmhol...@gmail.com>
Subject: [Haskell-beginners] if ands
To: beginners@haskell.org
Message-ID: <200911052105.22124.nathanmhol...@gmail.com>
Content-Type: Text/Plain;  charset="us-ascii"

If you have an if statement like

if (a&&b) then fun else fun'

and a is false, does GHC actually bother to check b?


------------------------------

Message: 8
Date: Thu, 5 Nov 2009 21:09:13 -0500
From: Joe Fredette <jfred...@gmail.com>
Subject: Re: [Haskell-beginners] if ands
To: Nathan M.Holden <nathanmhol...@gmail.com>
Cc: beginners@haskell.org
Message-ID: <0c6944e9-12d2-4e97-8358-46f3caab1...@gmail.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

No, consider the definition of (&&)

-- I hope this is the def from the prelude. If it's not, then it's  
probably isomorphic...
(&&) :: Bool -> Bool -> Bool

True  && x = x
False && _ = False

Since (&&) ignores it's second argument if the first is false, then it  
will "Short circuit" (like most `&` operators in other languages) due  
to lazy evaluation.

/Joe

On Nov 5, 2009, at 9:05 PM, Nathan M. Holden wrote:

> If you have an if statement like
>
> if (a&&b) then fun else fun'
>
> and a is false, does GHC actually bother to check b?
> _______________________________________________
> Beginners mailing list
> Beginners@haskell.org
> http://www.haskell.org/mailman/listinfo/beginners



------------------------------

Message: 9
Date: Thu, 5 Nov 2009 21:12:49 -0500
From: Keith Sheppard <keiths...@gmail.com>
Subject: Re: [Haskell-beginners] if ands
To: "Nathan M. Holden" <nathanmhol...@gmail.com>,
        beginners@haskell.org
Message-ID:
        <92e42b740911051812j2042cf78pb6f587b604905...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Also, an nice way to check how evaluation works in ghci is to do something like:

> if False && error "error here" then "it's true" else "it's false"

This expression will evaluate as "it's false" without any "error here"
error message appearing

On Thu, Nov 5, 2009 at 9:05 PM, Nathan M. Holden
<nathanmhol...@gmail.com> wrote:
> If you have an if statement like
>
> if (a&&b) then fun else fun'
>
> and a is false, does GHC actually bother to check b?
> _______________________________________________
> Beginners mailing list
> Beginners@haskell.org
> http://www.haskell.org/mailman/listinfo/beginners
>



-- 
keithsheppard.name


------------------------------

_______________________________________________
Beginners mailing list
Beginners@haskell.org
http://www.haskell.org/mailman/listinfo/beginners


End of Beginners Digest, Vol 17, Issue 5
****************************************

Reply via email to