I think it's possible to retain even more than two copies of the keyval buffer.
This can happen because the comparator's buffer is set only when a comparison
is performed, so if no data exists for a partition in a spill cycle, the
partition's comparator will retain the buffer from a previous spill.
Take this scenario: we have 3 partitions (A, B, C), a keyval buffer that can
hold 3 keys, and an input data set that generates output keys in the following
sequence: AAABBBCCC
A
A
A
-- Spill occurs here. A's comparator has keyval buffer 1.
B
B
B
-- Spill occurs here. B's comparator has keyval buffer 2. A's comparator has
keyval buffer 1.
C
C
C
-- Spill occurs here. C's comparator has keyval buffer 3. B's comparator has
keyval buffer 2. A's comparator has keyval buffer 1.
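To illustrate the retention pattern, here's a minimal sketch (hypothetical
classes, not the actual Hadoop API) of how a comparator that caches a
reference to the keyval buffer keeps each spill's buffer reachable until the
next comparison for that partition, and how clearing the reference fixes it:

```java
// Minimal sketch of the retention bug. SketchComparator stands in for
// WritableComparator: it caches a reference to the buffer it last compared.
class SketchComparator {
    byte[] buffer; // set only when a comparison is performed

    int compare(byte[] data, int i, int j) {
        buffer = data; // retains a reference to the keyval buffer
        return Byte.compare(data[i], data[j]);
    }

    void clearBuffer() {
        buffer = new byte[0]; // the fix: drop the retained reference
    }
}

public class RetentionDemo {
    public static void main(String[] args) {
        SketchComparator a = new SketchComparator();
        SketchComparator b = new SketchComparator();

        byte[] spill1 = new byte[8]; // stands in for a 100M keyval buffer
        a.compare(spill1, 0, 1);     // partition A's comparator pins spill1

        byte[] spill2 = new byte[8]; // a fresh buffer for the next spill
        b.compare(spill2, 0, 1);     // B's comparator pins spill2; A's
                                     // comparator still pins spill1

        System.out.println(a.buffer == spill1); // true: spill1 still reachable
        a.clearBuffer();
        System.out.println(a.buffer == spill1); // false: spill1 can be GC'd
    }
}
```

With three partitions, three spills can leave three distinct buffers pinned
this way, which is the scenario traced above.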
This is fairly unlikely, but I think it's happening to me occasionally,
because I'm seeing OOMEs I can't explain any other way. My heap is more than
large enough to support two 100M buffers.
FYI, I added code to clear the comparator's buffer (see patch below), and a job
that was failing with 650M heaps now succeeds with 512M.
Travis
Index: src/java/org/apache/hadoop/io/WritableComparator.java
===================================================================
--- src/java/org/apache/hadoop/io/WritableComparator.java (revision 618649)
+++ src/java/org/apache/hadoop/io/WritableComparator.java (working copy)
@@ -78,6 +78,11 @@
     }
   }
 
+  /** Free up memory used by the internal DataInputBuffer. */
+  public void clearBuffer() {
+    buffer.reset(new byte[] {}, 0, 0);
+  }
+
   /** Optimization hook. Override this to make SequenceFile.Sorter's scream.
    *
    * <p>The default implementation reads the data into two {@link
Index: src/java/org/apache/hadoop/mapred/BasicTypeSorterBase.java
===================================================================
--- src/java/org/apache/hadoop/mapred/BasicTypeSorterBase.java (revision 618649)
+++ src/java/org/apache/hadoop/mapred/BasicTypeSorterBase.java (working copy)
@@ -124,6 +124,7 @@
     //release the large key-value buffer so that the GC, if necessary,
     //can collect it away
     keyValBuffer = null;
+    comparator.clearBuffer();
   }
   //A compare method that references the keyValBuffer through the indirect
   //pointers
----- Original Message ----
From: Amar Kamat <[EMAIL PROTECTED]>
To: [email protected]
Sent: Tuesday, February 5, 2008 12:08:48 AM
Subject: Re: Possible memory "leak" in MapTask$MapOutputBuffer
Hi,

Yes, you are correct. The references to the old keyval buffers are still
there even after the buffers are re-initialized, but the references persist
only between consecutive spills. The scenario before HADOOP-1965 was that the
memory used for one sort-spill phase is io.sort.mb, causing the max memory
usage to be (2 * io.sort.mb). Post HADOOP-1965, the total memory used for one
sort-spill phase is io.sort.mb/2, the max memory usage is io.sort.mb, and the
time between two consecutive spills is also reduced since they happen in
parallel. Thanks for pointing it out. I have opened HADOOP-2782 addressing
the same.

Amar

Travis Woodruff wrote:
> Well, this is what I get for not doing my homework first.
>
> I pulled down the latest code from trunk, and it looks like the updates for
> HADOOP-1965 have changed this code significantly. From what I can tell,
> these changes have removed the issue; however, the problem still exists in
> the 0.15 branch.
>
> Travis
>
> ----- Original Message ----
> From: Travis Woodruff <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, February 4, 2008 6:41:31 PM
> Subject: Possible memory "leak" in MapTask$MapOutputBuffer
>
<snip for goofy formatting>
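To make the arithmetic concrete, here is a small sketch of the worst-case
buffer retention for the A/B/C scenario traced earlier (the 100M buffer size
is the one mentioned in the first message; the three-partition count comes
from the example scenario):

```java
// Sketch of worst-case buffer retention for the A/B/C scenario above.
// Each spill allocates a fresh keyval buffer; without clearBuffer(), every
// partition's comparator pins the buffer it last compared against.
public class RetainedMemoryMath {
    public static void main(String[] args) {
        int bufferMb = 100; // buffer size mentioned in the first message
        int partitions = 3; // partitions A, B, C

        // Without the fix: after the third spill, each partition's comparator
        // still references a different buffer, so all three stay reachable.
        int retainedWithoutFix = partitions * bufferMb;

        // With the fix: stale references are cleared after each spill, so
        // only the current buffer is reachable.
        int retainedWithFix = bufferMb;

        System.out.println(retainedWithoutFix); // 300
        System.out.println(retainedWithFix);    // 100
    }
}
```

Three pinned 100M buffers would explain OOMEs on a heap sized for only two.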