Beginners Digest, Vol 71, Issue 14

beginners-request Mon, 12 May 2014 09:56:19 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]


You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."


Today's Topics:

   1.  Addition of "Float" and "Int". (Venu Chakravorty)
   2. Re:  Addition of "Float" and "Int". (Brandon Allbery)
   3. Re:  Space leak while reading from a file? (David McBride)
   4. Re:  Space leak while reading from a file? (Bob Ippolito)


----------------------------------------------------------------------

Message: 1
Date: Mon, 12 May 2014 11:44:18 -0400 (EDT)
From: Venu Chakravorty <[email protected]>
To: [email protected]
Subject: [Haskell-beginners] Addition of "Float" and "Int".
Message-ID: <[email protected]>
Content-Type: text/plain; charset="us-ascii"



Hello everyone,
        I am just starting with Haskell so please bear with me.


Here's my question:


Consider the below definition / output:


Prelude> :t (+)
(+) :: (Num a) => a -> a -> a


        What I understand from the above is that "+" is a function that
takes two args which are types of anything that IS-AN instance of "Num"
(Int, Integer, Float, Double) and returns an instance of "Num".
Hence this works fine:
Prelude> 4.3 + 2
6.3


But I can't understand why this doesn't work:
Prelude> 4.3 + 4 :: Int


<interactive>:1:0:
    No instance for (Fractional Int)
      arising from the literal `4.3' at <interactive>:1:0-2
    Possible fix: add an instance declaration for (Fractional Int)
    In the first argument of `(+)', namely `4.3'
    In the expression: 4.3 + 4 :: Int
    In the definition of `it': it = 4.3 + 4 :: Int


        I expected that the second addition would work as both "Float"
and "Int" are instances of "Num". Is it that since both the formal args
are defined as "a" they have to be exactly the same instances? Had "+"
been defined something like:
(+) :: (Num a, Num b) => a -> b -> a
my second addition would have worked?


Please let me know what I am missing.


Regards,
Venu Chakravorty.


------------------------------

Message: 2
Date: Mon, 12 May 2014 12:28:32 -0400
From: Brandon Allbery <[email protected]>
To: The Haskell-Beginners Mailing List - Discussion of primarily
        beginner-level topics related to Haskell <[email protected]>
Subject: Re: [Haskell-beginners] Addition of "Float" and "Int".
Message-ID:
        <CAKFCL4VZP3WcHe=ee0gfvj12niq85dysb3e_9leua_kaehi...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Mon, May 12, 2014 at 11:44 AM, Venu Chakravorty <[email protected]> wrote:

> Prelude> :t (+)
> (+) :: (Num a) => a -> a -> a
>
>  What I understand from the above is that "+" is a function that takes
> two args
> which are types of anything that IS-AN instance of "Num" (Int, Integer,
> Float, Double)
> and returns an instance of "Num".
>

Not exactly. It says that, given some type a that is an instance of Num, it
will add two values of that type and produce a new value of that same type.
You cannot mix and match types; it always works on some specific type,
although those types may change between uses of (+).

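This is somewhat hidden by the way numeric literals are handled: a literal
without a decimal point is handled as if you had wrapped it in fromInteger,
and one with a decimal point is handled as if you had wrapped it in
fromRational. So in 4.3 + 2 both literals become the same Fractional type
(Double by default), while 4.3 + 4 :: Int asks for 4.3 to be an Int, which
would need a Fractional Int instance.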
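
To make the "same type on both sides" rule concrete, here is a minimal
sketch. The name addMixed is made up for illustration; the only conversion
it relies on is the standard fromIntegral, applied explicitly to the Int
before the addition.

```haskell
-- (+) needs both operands at one type, so the Int is converted to Float
-- explicitly before it is added. addMixed is a hypothetical helper name.
addMixed :: Float -> Int -> Float
addMixed x n = x + fromIntegral n

main :: IO ()
main = do
  -- Two literals: both are given the same type (Double by defaulting).
  print (4.3 + 2)
  -- A value that is already an Int must be converted by hand.
  let n = 4 :: Int
  print (addMixed 4.3 n)
```

Writing 4.3 + n directly, without fromIntegral, is rejected by the type
checker for the same reason as 4.3 + 4 :: Int.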

-- 
brandon s allbery kf8nh                               sine nomine associates
[email protected]                                  [email protected]
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net

------------------------------

Message: 3
Date: Mon, 12 May 2014 12:33:20 -0400
From: David McBride <[email protected]>
To: The Haskell-Beginners Mailing List - Discussion of primarily
        beginner-level topics related to Haskell <[email protected]>
Subject: Re: [Haskell-beginners] Space leak while reading from a file?
Message-ID:
        <can+tr43+1b4kjey2g8pdhn+4s17yrp9dnc5ssxqeya+zhm3...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

This is a bit advanced for the beginners list.  You would probably have
better luck on stackoverflow.


On Sun, May 11, 2014 at 7:24 AM, Jan Snajder <[email protected]> wrote:

> Dear all,
>
> I'm trying to implement a simple file-based database. I apparently have
> a space leak, but I have no clue where it comes from.
>
> Here's the file-based database implementation:
> http://pastebin.com/QqiqcXFw
>
> The idea is to have a database table in a single textual file. One line
> equals one table row. The fields within a row are whitespace separated.
> The first field is the key. Because I'd like to work with large files, I
> don't want to load the whole file into memory. Instead, I'd like to be
> able to fetch the rows on demand, by keys. Thus I first create an index
> that links keys to file seeks. I use ReaderT to add the index to the
> IO monad.
>
> For testing, I use a dummy table produced as follows:
>
> import System.IO
> import Text.Printf
> import Control.Monad
>
> row = unwords [printf "field%03d" (i::Int) | i <- [1..999]]
>
> main = do
>   forM_ [1..250000] $ \i ->
>      putStrLn $ printf "row%06d %s" (i::Int) row
>
> This generates a 2.1G textual file, which I store on my disk.
>
> The testing code:
>
> import FileDB
> import qualified Data.Text as T
> import Text.Printf
> import Control.Applicative
> import Control.Monad
> import Control.Monad.Trans
> import System.IO
> import System.Environment
>
> main = do
>   (f:_) <- getArgs
>   t <- openTable f
>   runDB t $ do
>     ks <- getKeys
>     liftIO $ do
>       putStrLn . printf "%d keys read" $ length ks
>       putStrLn "Press any key to continue..."
>       getChar
>     forM_ ks $ \k -> do
>       Just r <- getRow k
>       liftIO . putStrLn $ printf "Row \"%s\" has %d fields"
>         (T.unpack k) (length r)
>
> When I run the test on the 2.1GB file, the whole program consumes 10GB.
>
> 6GB seem to be allocated after the index is built (just before entering
> the forM_ function). The remaining 4GB are allocated while fetching all
> the rows.
>
> I find both things difficult to explain.
>
> 6GB seems too much for the index. Each key is 9 characters (stored as
> Data.Text), and I have 250K such keys in a Data.Map. Should this really
> add up to 6GB?
>
> Also, I have no idea why fetching all the rows, one by one, should
> consume any additional memory. Each row is fetched and its length is
> computed and printed out. I see no reason for the rows to be retained in
> the memory.
>
> Here's the memory allocation summary:
>
> > 1,093,931,338,632 bytes allocated in the heap
> >    2,225,144,704 bytes copied during GC
> >    4,533,898,000 bytes maximum residency (26 sample(s))
> >    3,080,926,336 bytes maximum slop
> >            10004 MB total memory in use (0 MB lost due to fragmentation)
> >
> >                                     Tot time (elapsed)  Avg pause  Max pause
> >   Gen  0     2171739 colls,     0 par   45.29s   45.26s     0.0000s    0.0030s
> >   Gen  1        26 colls,     0 par    1.50s    1.53s     0.0589s    0.7087s
> >
> >   INIT    time    0.00s  (  0.00s elapsed)
> >   MUT     time  279.92s  (284.85s elapsed)
> >   GC      time   46.80s  ( 46.79s elapsed)
> >   EXIT    time    0.68s  (  0.71s elapsed)
> >   Total   time  327.40s  (332.35s elapsed)
> >
> >   %GC     time      14.3%  (14.1% elapsed)
> >
> >   Alloc rate    3,908,073,170 bytes per MUT second
> >
> >   Productivity  85.7% of total user, 84.4% of total elapsed
>
>
> Btw., I don't get the "bytes allocated in the heap" figure, which is
> approx. 1000 GB (?).
>
> I'm obviously doing something wrong here. I'd be thankful for any help.
>
> Best,
> Jan
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners
>

------------------------------

Message: 4
Date: Mon, 12 May 2014 09:54:12 -0700
From: Bob Ippolito <[email protected]>
To: The Haskell-Beginners Mailing List - Discussion of primarily
        beginner-level topics related to Haskell <[email protected]>
Subject: Re: [Haskell-beginners] Space leak while reading from a file?
Message-ID:
        <CACwMPm__Wf2-djmQhad3MqZE6zY0Xs_ogpwKgx_t_V5HH=c...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I haven't looked closely but I suspect if you use foldM to build your Map
rather than untilM it might consume less memory since you won't be
allocating this big list.
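
A rough sketch of the foldM idea, not taken from the pastebin code: the
Index type and the names addEntry and buildIndex are assumptions, and the
(key, offset) pairs are given as a small literal list only to keep the
example self-contained. In a real program each step would come from
reading the next line of the file.

```haskell
import Control.Monad (foldM)
import qualified Data.Map.Strict as Map

-- Hypothetical index type: key -> byte offset of the row in the file.
type Index = Map.Map String Integer

-- One fold step: insert a single (key, offset) pair into the index.
-- ($!) forces the updated map before the next step runs, so no chain of
-- unevaluated inserts builds up.
addEntry :: Index -> (String, Integer) -> IO Index
addEntry idx (key, offset) = return $! Map.insert key offset idx

-- Accumulate straight into the Map rather than collecting results first.
buildIndex :: [(String, Integer)] -> IO Index
buildIndex = foldM addEntry Map.empty

main :: IO ()
main = do
  idx <- buildIndex [("row000001", 0), ("row000002", 9000)]
  print (Map.toList idx)
```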

BUT the real reason why the memory usage is so much higher than you expect
is that slicing a Data.Text is an O(1) operation that shares the
underlying buffer between the original and the slice. Every word returned
by T.words is such a slice, so holding on to any one word (for example, a
key stored in the index) keeps the full original line around. You can see
this in the implementation:
http://hackage.haskell.org/package/text-0.11.2.0/docs/src/Data-Text.html#words

This behavior is documented somewhat near there in the docs, but I think it
should really be a top-level thing:
http://hackage.haskell.org/package/text-0.11.2.0/docs/Data-Text.html#g:18

I don't know what function to use to force the array to be copied,
hopefully there is one! Erlang's binaries work similarly and there is a
copy function for them for exactly this purpose:
http://www.erlang.org/doc/man/binary.html
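
For what it's worth, Data.Text exports copy, which is documented as making
a distinct copy that shares no storage with the original. A minimal sketch
of how it could be applied to the key follows; keyOf is a hypothetical
helper, not something from the pastebin code.

```haskell
import qualified Data.Text as T

-- Take the first whitespace-separated field of a line and copy it, so the
-- result no longer shares the (much longer) line's underlying buffer.
-- keyOf is a hypothetical helper name.
keyOf :: T.Text -> Maybe T.Text
keyOf line = case T.words line of
  (k:_) -> Just (T.copy k)
  []    -> Nothing

main :: IO ()
main = do
  let line = T.pack ("row000001 " ++ unwords (replicate 999 "field"))
  print (keyOf line)
```

Only values that are kept long-term (such as Map keys) need the copy;
copying every field of a row that is discarded right away would just add
work.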

-bob


------------------------------

Subject: Digest Footer

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners


------------------------------

End of Beginners Digest, Vol 71, Issue 14
*****************************************
