Hi,

I am writing a MapReduce job using the MapRunnable interface.
The input format is SequenceFileInputFormat.

Each SequenceFile record contains a key-value pair of type
<Text key, Text value> (Text: org.apache.hadoop.io.Text).

The "key" Text object contains a small string, whereas the "value" Text
object contains a large XML string.
A single "value" Text object can hold 100 to 300 MB of data.

I convert the "value" Text object to a String using the value.toString() method.
For large data in the "value" object, this fails with an OutOfMemoryError.

Is there any other way to convert a large Text object to a Java String object?
Alternatively, can I limit the number of records that the RecordReader passes
to the run method, so that total memory utilization stays bounded?
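
To make the first question concrete, here is a minimal sketch of the kind of
alternative I have in mind: parsing the XML straight from the bytes with a
streaming (StAX) parser instead of building a String first. It assumes the XML
is UTF-8 (the encoding Text uses) and that the processing can be done element
by element. The class and method names are hypothetical, and the plain
byte[]/length pair stands in for value.getBytes() and value.getLength(), since
the backing array of a Text is only valid up to getLength().

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Hadoop: walks the XML held in a byte
// buffer without ever materializing it as a java.lang.String.
final class XmlStreamSketch {

    // buf/len are assumed to come from value.getBytes() and value.getLength();
    // only the first len bytes of buf are read.
    static int countStartElements(byte[] buf, int len) {
        InputStream in = new ByteArrayInputStream(buf, 0, len);
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(in, "UTF-8");
            int count = 0;
            while (reader.hasNext()) {
                // Real code would inspect element names and text here.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<doc><item/><item/></doc>".getBytes(StandardCharsets.UTF_8);
        System.out.println(countStartElements(xml, xml.length));
    }
}
```

Inside the run method this would be called as
countStartElements(value.getBytes(), value.getLength()). Whether it fits
depends on whether the downstream logic really needs the whole document as
one String.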

Thanks,
- Bhushan

