Wang Zhong wrote:

> Where did you get the large string? Can't you generate the string one
> line at a time and append it to a local file, then upload it to HDFS
> when finished?
>
> On Wed, Apr 29, 2009 at 10:47 AM, nguyenhuynh.mr
> <nguyenhuynh...@gmail.com> wrote:
>   
>> Hi all!
>>
>>
>> I have a large String and I want to write it into a file in HDFS.
>>
>> (The large String has more than 100,000 lines.)
>>
>>
>> Currently, I use the copyBytes method of org.apache.hadoop.io.IOUtils.
>> But copyBytes requires an InputStream for the content. Therefore, I have
>> to convert the String to an InputStream, something like:
>>
>>
>>
>>    InputStream in = new ByteArrayInputStream(sb.toString().getBytes());
>>
>>    The "sb" is a StringBuffer.
>>
>>
>> It does not work with the line of code above. :(
>>
>> This is the error:
>>
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>    at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>>    at java.lang.StringCoding.encode(StringCoding.java:272)
>>    at java.lang.String.getBytes(String.java:947)
>>    at asnet.haris.mapred.jobs.Test.main(Test.java:32)
>>
>>
>>
>> Please give me a good solution!
>>
>>
>> Thanks,
>>
>>
>> Best regards,
>>
>> Nguyen,
>>
Thanks for your answer!

I have a Map/Reduce job that partitions URIs from HBase into groups.
In the map phase, it takes the group name of each URI and collects the
output pair <groupname, uri>.
In the reduce phase, I get the String (all URIs of the partition) and
save it into HDFS.
Each group is one file.
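
Following Wang Zhong's suggestion, the reducer could write each URI
straight to the group's file as it arrives, instead of first collecting
everything into one big String, so the full content never has to fit in
memory at once. A minimal sketch, assuming the old
org.apache.hadoop.mapred API from this era; the class name and the
output directory "/output/groups" are placeholders:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.util.Iterator;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// One output file per group, written line by line so the URIs are
// never held in memory as a single large String.
public class GroupReducer extends MapReduceBase
        implements Reducer<Text, Text, Text, Text> {

    private JobConf conf;

    public void configure(JobConf job) {
        this.conf = job;
    }

    public void reduce(Text groupName, Iterator<Text> uris,
                       OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        FileSystem fs = FileSystem.get(conf);
        // Hypothetical layout: one HDFS file named after the group key.
        Path out = new Path("/output/groups/" + groupName.toString());
        BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(fs.create(out, true)));
        try {
            // Stream each URI to HDFS as it arrives instead of
            // appending it to a StringBuffer first.
            while (uris.hasNext()) {
                writer.write(uris.next().toString());
                writer.newLine();
            }
        } finally {
            writer.close();
        }
    }
}

The same idea avoids the original OutOfMemoryError:
sb.toString().getBytes() materializes the whole content twice (once as a
String, once again as a byte[]), while streaming line by line only ever
buffers a few kilobytes.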

Thanks,

Best regards,
NguyenHuynh.
