Christian Feuersaenger created CSV-311:
------------------------------------------
Summary: OutOfMemory for very long rows despite using column value
of type Reader
Key: CSV-311
URL: https://issues.apache.org/jira/browse/CSV-311
Project: Commons CSV
Issue Type: Bug
Components: Printer
Affects Versions: 1.10.0
Reporter: Christian Feuersaenger
Our application makes use of commons-csv (great software, thanks!).
Recently, we got a support request because someone had unexpectedly large
column values in one CSV row - so large that our explicitly chosen limits on
the Java heap did not suffice.
We analyzed the heap dump and found that a huge row was the culprit; the stack
trace in question is
{noformat}
Caused by -> java.lang.OutOfMemoryError: Java heap space
Arrays.copyOf:3537
AbstractStringBuilder.ensureCapacityInternal:228
AbstractStringBuilder.append:802
StringBuilder.append:246
CSVFormat.printWithQuotes:2127
CSVFormat.print:1834
CSVFormat.print:1783
CSVPrinter.print:166
CSVPrinter.printRecord:259
CSVPrinter.printRecord:278
{noformat}
Note that we provide column values of type java.io.Reader as accepted by
org.apache.commons.csv.CSVFormat.print(Object, Appendable, boolean).
The problem is that CSVFormat.print does accept a java.io.Reader, but instead
of piping its characters (possibly escaped/quoted) directly into the appendable
output stream, it first reads the entire value into a StringBuilder, which is
then copied to the output.
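For illustration, a constant-memory alternative could copy the Reader into the
Appendable in fixed-size chunks, applying quoting on the fly. This is only a
minimal sketch of the idea, not commons-csv's actual implementation; the method
name and signature below are hypothetical:

{noformat}
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class StreamingPrint {
    // Sketch: stream a Reader into an Appendable in fixed-size chunks,
    // doubling the quote character for CSV quoting, so memory use is
    // bounded by the buffer size rather than the value length.
    // (Hypothetical helper, not part of the commons-csv API.)
    static void printQuoted(Reader value, Appendable out, char quote)
            throws IOException {
        out.append(quote);
        char[] buf = new char[8192];          // bounded working memory
        int n;
        while ((n = value.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == quote) {
                    out.append(quote);        // escape by doubling
                }
                out.append(buf[i]);
            }
        }
        out.append(quote);
    }

    public static void main(String[] args) throws IOException {
        StringBuilder out = new StringBuilder();
        printQuoted(new StringReader("a \"big\" value"), out, '"');
        System.out.println(out);
    }
}
{noformat}

With something along these lines, only the 8 KiB buffer lives on the heap at
any time, regardless of how large the column value is.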
Is there a way to improve this situation? We are working in a setup in which
Java heap cannot be spent arbitrarily, and we would rather have an approach
that works out of the box.
Thanks for looking into it!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)