Oscar,

there is also a fork of Excel Streamer which is worth to
mention: https://github.com/pjfanning/excel-streaming-reader
Personally I prefer that fork, but I have no good reason or argument.
In the past this fork picked up version changes of POI faster. 

Cheers
Andreas



On Mon, 2021-05-03 at 06:05 -0500, Oscar Bastidas wrote:
> Awesome.  Thanks, I'll give this a try.
> 
> Oscar
> 
> Oscar Bastidas
> Research Associate
> University of Minnesota
> 
> On Mon, May 3, 2021, 6:04 AM Andreas Reichel
> <[email protected]>
> wrote:
> 
> > Greetings.
> > 
> > Please use the Excel Streaming Reader when reading large
> > files: https://github.com/monitorjbl/excel-streaming-reader
> > 
> > import com.monitorjbl.xlsx.StreamingReader;
> > 
> > InputStream is = new FileInputStream(new
> > File("/path/to/workbook.xlsx"));
> > Workbook workbook = StreamingReader.builder()
> >         .rowCacheSize(100)    // number of rows to keep in memory
> > (defaults to 10)
> >         .bufferSize(4096)     // buffer size to use when reading
> > InputStream to file (defaults to 1024)
> >         .open(is);            // InputStream or File for XLSX file
> > (required)
> > 
> > 
> > 
> > With the code above you can loop through your rows and write it to
> > CSV.
> > Best regards
> > Andreas
> > 
> > 
> > On Mon, 2021-05-03 at 05:31 -0500, Oscar Bastidas wrote:
> > > Hello,
> > > 
> > > I am trying to read a large Excel spreadsheet (60,000 rows) but I
> > > get
> > > what
> > > appears to be a memory leak error from the JVM when I use the
> > > *XSSFWorkbook
> > > *API.  I learned recently that there are size limitations on
> > > Excel
> > > files
> > > being read in this way and the Apache POI website specifically
> > > recommends
> > > reading the file in a streaming fashion instead of taking the
> > > whole
> > > file in
> > > memory.  To do this, POI recommends using something called
> > > *XLSX2CSV*
> > > but
> > > the provided link to teach how to use this returns a "page not
> > > found
> > > error."
> > > 
> > > Would someone please point me in the direction of how to handle
> > > reading my
> > > big Excel file?
> > > 
> > > The Apache POI URL that contains the link to *XLSX2CSV* is:
> > > 
> > > http://poi.apache.org/components/spreadsheet/limitations.html
> > > 
> > > Thanks for any help anyone can provide.
> > > 
> > > Oscar
> > > 
> > > Oscar Bastidas
> > > Research Associate
> > > University of Minnesota
> > 
> > 

Reply via email to