Wang Zhong wrote:
> Where did you get the large string? Can't you generate the string one
> line per time and append it to local files, then upload to HDFS when
> finished?
>
> On Wed, Apr 29, 2009 at 10:47 AM, nguyenhuynh.mr
> <nguyenhuynh...@gmail.com> wrote:
>
>> Hi all!
>>
>> I have the large String and I want to write it into the file in HDFS.
>>
>> (The large string has >100.000 lines.)
>>
>> Current, I use method copyBytes of class org.apache.hadoop.io.IOUtils.
>> But the copyBytes request the InputStream of content. Therefore, I have
>> to convert the String to InputStream, some things like:
>>
>> InputStream in=new ByteArrayInputStream(sb.toString().getBytes());
>>
>> The "sb" is a StringBuffer.
>>
>> It not work with the command line above. :(
>>
>> There is the error:
>>
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>> at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>> at java.lang.StringCoding.encode(StringCoding.java:272)
>> at java.lang.String.getBytes(String.java:947)
>> at asnet.haris.mapred.jobs.Test.main(Test.java:32)
>>
>> Please give me the good solution!
>>
>> Thanks,
>>
>> Best regards,
>>
>> Nguyen,
>

Thanks for your answer!

I have a Map/Reduce job. It partitions URIs from HBase into groups of URIs.
In the map phase, it gets the group name of each URI and collects the output
<groupname, uri>. In the reduce phase, I get the String (the URIs of the
partition) and save it into HDFS. Each group is written to its own file.

Thanks,

Best regards,

NguyenHuynh.
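
One possible way around the OutOfMemoryError, sketched here with only the JDK: write the URIs to the output stream one line at a time instead of building the whole String and calling getBytes() on it. The class name UriGroupWriter and the method writeLines are hypothetical, and UTF-8 with a "\n" line ending is an assumption; the original code used the platform default encoding.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop.
class UriGroupWriter {

    /**
     * Writes each line followed by '\n' to the given stream and returns
     * the number of lines written. Only one line is held in memory at a
     * time, so no large String or byte[] is ever created.
     * The caller owns the stream; it is flushed here but not closed.
     */
    static long writeLines(Iterable<String> lines, OutputStream out) throws IOException {
        // Assumption: UTF-8 output.
        BufferedWriter writer =
                new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        long count = 0;
        for (String line : lines) {
            writer.write(line);
            writer.write('\n');
            count++;
        }
        writer.flush();
        return count;
    }
}
```

In the reduce phase, the stream would presumably be the one returned by FileSystem.create(Path) for the group's file, and the lines would come straight from the reducer's values rather than from a prebuilt StringBuffer.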