We should reuse key and value objects in the MultithreadedMapRunner.
--------------------------------------------------------------------
Key: HADOOP-3125
URL: https://issues.apache.org/jira/browse/HADOOP-3125
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Owen O'Malley
Currently, a new key object and a new value object are allocated for each
key/value pair read from the record reader. It would be better to keep a pool of
key/value pairs that are reused. I'm picturing something like:
BlockingQueue<KeyValuePair> empties;
BlockingQueue<KeyValuePair> newInputs;
The record reader thread would take a KeyValuePair from the empties queue, read
into it using the RecordReader, and put it on the newInputs queue.
The worker threads would take pairs from newInputs, process the key and value,
and put the processed objects back on the empties queue. Initialization would
put the desired number of key/value pairs on the empties queue to start it off.
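
The scheme above could be sketched roughly as follows. This is a minimal,
standalone illustration and not the MultithreadedMapRunner code: the
KeyValuePair holder, the PooledPairDemo class, the endOfInput marker used to
stop the workers, and the use of a list of strings in place of a RecordReader
are all assumptions made for the example. Only the two queue names come from
the description.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class PooledPairDemo {
    /** Hypothetical reusable holder standing in for a key/value pair. */
    static final class KeyValuePair {
        long key;
        final StringBuilder value = new StringBuilder();
        boolean endOfInput; // assumed shutdown marker, not part of the proposal
    }

    /** Feeds records through a fixed pool of pairs; returns total characters seen. */
    static long run(List<String> records, int poolSize, int workers) {
        BlockingQueue<KeyValuePair> empties = new ArrayBlockingQueue<>(poolSize);
        BlockingQueue<KeyValuePair> newInputs = new ArrayBlockingQueue<>(poolSize);
        AtomicLong totalChars = new AtomicLong();
        try {
            // Initialization: the only place pairs are allocated.
            for (int i = 0; i < poolSize; i++) {
                empties.put(new KeyValuePair());
            }

            Thread[] threads = new Thread[workers];
            for (int i = 0; i < workers; i++) {
                threads[i] = new Thread(() -> {
                    try {
                        while (true) {
                            KeyValuePair pair = newInputs.take();
                            boolean done = pair.endOfInput;
                            if (!done) {
                                // Stand-in for the real map() call.
                                totalChars.addAndGet(pair.value.length());
                            }
                            pair.endOfInput = false;
                            empties.put(pair); // hand the pair back for reuse
                            if (done) {
                                return;
                            }
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                threads[i].start();
            }

            // Reader side: take an empty pair, overwrite it in place, publish it.
            long offset = 0;
            for (String record : records) {
                KeyValuePair pair = empties.take();
                pair.key = offset++;
                pair.value.setLength(0);
                pair.value.append(record);
                newInputs.put(pair);
            }
            // One end-of-input marker per worker, also drawn from the pool.
            for (int i = 0; i < workers; i++) {
                KeyValuePair marker = empties.take();
                marker.endOfInput = true;
                newInputs.put(marker);
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return totalChars.get();
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a", "bb", "ccc"), 2, 2));
    }
}
```

Because both queues are bounded by the pool size, the reader blocks once every
pair is in flight, so the pool size also caps how far the reader can run ahead
of the workers.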