Out of memory using pmap

2011-08-06 Thread Shoeb Bhinderwala
Problem summary: I am running out of memory using pmap but the same code
works with regular map function.

My problem is that I am trying to break my data into sets and process them
in parallel. My data is for an entire month and I am breaking it into 30/31
sets - one for each day. I run a function for each daily set of data using
pmap, something like:

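;; Forward declaration, because process-monthly-data refers to it first.
(declare process-daily-data)

(defn process-monthly-data
  [grp-id month year]
  ;; doall forces the lazy sequence returned by pmap.
  (doall (pmap
           #(process-daily-data grp-id % month year)
           (range 31))))

(defn process-daily-data
  [grp-id day month year]
  ;; load and process daily data …
  )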

When I run my function using regular map it works fine, but when I change it
to pmap I get an OutOfMemoryError.

What am I doing wrong?

-- Shoeb

-- 
You received this message because you are subscribed to the Google
Groups Clojure group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Out of memory using pmap

2011-08-06 Thread Sunil S Nandihalli
Just a guess: if your daily data is huge, map loads the data for only one
day at a time, whereas pmap loads the data for several days at once (as many
as there are parallel threads). That may be the cause of the problem.
Sunil.
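
The guess above can be checked with a small experiment. This is only a
sketch: peak-concurrency is a hypothetical helper, and the 50 ms sleep stands
in for the real "load and process daily data" step. It counts how many tasks
are running at the same moment under map and under pmap.

```clojure
(defn peak-concurrency
  "Applies a short sleeping task to (range n) using `mapper` (map or pmap)
  and returns the largest number of tasks seen running at the same time."
  [mapper n]
  (let [running (atom 0)
        peak    (atom 0)
        task    (fn [_]
                  ;; record how many tasks are active right now
                  (let [now (swap! running inc)]
                    (swap! peak max now)
                    (Thread/sleep 50) ; stand-in for loading one day of data
                    (swap! running dec)
                    nil))]
    (doall (mapper task (range n)))
    @peak))

(println "map :" (peak-concurrency map 31))  ; one task at a time
(println "pmap:" (peak-concurrency pmap 31)) ; several tasks overlap
```

If each task holds a full day of data while it runs, the second number is
roughly how many daily sets have to fit in the heap at once.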


Re: Out of memory using pmap

2011-08-06 Thread Shoeb Bhinderwala
You didn't understand my problem. The exact same code throws an
out-of-memory error when I change map to pmap.

My monthly data is evenly divided into 30 sets. For example, if the total
monthly data is 90,000 records, each day's set is 3,000 records. I am trying
to gain performance by processing the daily data in parallel.
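
As a rough worked example with these figures (3,000 records per day, 30
days), the sketch below estimates how many records may be resident at once.
The names are hypothetical. It assumes that pmap keeps about (+ 2 cores)
calls in flight, as in the clojure.core source. On a chunked input such as
(range 31), more calls may be started at once, so treat the middle figure as
a lower estimate.

```clojure
(def records-per-day 3000) ; figure from the post
(def days-in-month 30)     ; figure from the post

(defn records-in-flight
  "Rough upper bound on records held in memory when `parallel-days`
  daily sets are loaded at the same time."
  [parallel-days]
  (* records-per-day (min parallel-days days-in-month)))

;; Nominal number of calls pmap keeps running ahead of consumption.
(def nominal-pmap-parallelism
  (+ 2 (.availableProcessors (Runtime/getRuntime))))

(println "sequential map:" (records-in-flight 1))
(println "pmap, nominal :" (records-in-flight nominal-pmap-parallelism))
(println "whole month   :" (records-in-flight days-in-month))
```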


Re: Out of memory using pmap

2011-08-06 Thread Colin Yates
The point is that when the days are processed sequentially, the GC gets to
remove stale entries, so (simplistically) only 3,000 records are in memory at
any one time. With parallel processing, all 90,000 can be in memory at the
same time.
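
One way to keep some parallelism while limiting how many daily sets are
loaded at once is to run the work on a fixed-size thread pool. This is a
minimal sketch, not something proposed in the thread: bounded-pmap is a
hypothetical helper (it is not part of clojure.core), and the thread count
passed to it is an arbitrary choice to be tuned against the available heap.

```clojure
(import '(java.util.concurrent Executors ExecutorService Future))

(defn bounded-pmap
  "Like (doall (map f coll)), but runs at most n-threads calls of f
  at the same time. Returns a vector of results in input order."
  [n-threads f coll]
  (let [^ExecutorService pool (Executors/newFixedThreadPool n-threads)]
    (try
      (let [tasks   (mapv (fn [x] (fn [] (f x))) coll) ; Clojure fns are Callable
            futures (.invokeAll pool tasks)]           ; blocks until all finish
        (mapv (fn [^Future fut] (.get fut)) futures))
      (finally
        (.shutdown pool)))))

(println (bounded-pmap 2 inc (range 5))) ; => [1 2 3 4 5]

;; With the functions from the original post this might be called as:
;; (bounded-pmap 4 #(process-daily-data grp-id % month year) (range 31))
```

All results are still collected into one vector, so this only helps if the
memory pressure comes from the intermediate data loaded per day rather than
from the results themselves.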
