Can someone point me in the right direction for configuring large sort
operations (more than 1M rows)? I keep getting out-of-memory exceptions
during the sort phase. My current settings are below. I have a 2 GB
heap on each box.
Dennis
<property>
<name>io.sort.factor</name>
<value>20</value>
<description>
The number of streams to merge at once while sorting
files. This determines the number of open file handles.
</description>
</property>
<property>
<name>io.sort.mb</name>
<value>200</value>
<description>
The total amount of buffer memory to use while sorting
files, in megabytes. By default, gives each merge stream 1MB, which
should minimize seeks.
</description>
</property>
<property>
<name>io.file.buffer.size</name>
<value>8192</value>
<description>
The size of buffer for use in sequence files.
The size of this buffer should probably be a multiple of hardware
page size (4096 on Intel x86), and it determines how much data is
buffered during read and write operations.
</description>
</property>
<property>
<name>io.bytes.per.checksum</name>
<value>4096</value>
<description>
The number of bytes per checksum. Must not be larger than
io.file.buffer.size.
</description>
</property>
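For reference, here is a rough sanity check of what these numbers work out to. It is only a sketch: it assumes the sort buffer is divided evenly across the merge streams, which is how I read the io.sort.mb description, and the variable names are just mine for illustration. It does not model anything else that lives in the heap.

```python
# Values copied from the settings above.
io_sort_factor = 20            # streams merged at once
io_sort_mb = 200               # total sort buffer, in megabytes
io_file_buffer_size = 8192     # bytes
io_bytes_per_checksum = 4096   # bytes
heap_mb = 2048                 # 2 GB heap per box

# Assumption: the sort buffer is split evenly across the merge streams.
mb_per_stream = io_sort_mb / io_sort_factor

# Fraction of the heap taken by the sort buffer alone.
sort_share_of_heap = io_sort_mb / heap_mb

# Constraint stated in the io.bytes.per.checksum description.
checksum_ok = io_bytes_per_checksum <= io_file_buffer_size

# Recommendation in the io.file.buffer.size description (4096-byte pages).
buffer_is_page_multiple = io_file_buffer_size % 4096 == 0

print("MB per merge stream:", mb_per_stream)
print("sort buffer share of heap:", round(sort_share_of_heap, 3))
print("checksum size within buffer size:", checksum_ok)
print("buffer size is a page multiple:", buffer_is_page_multiple)
```

By this estimate the sort buffer is a small part of the heap, so the settings look internally consistent on paper even though the sort still runs out of memory.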