Hi,

For the merging, I would look at the underlying structure of xlsx files. They are zip files in which the main data for each sheet is stored in a separate file, linked from other places in the package via "relationships". The only part the sheets have in common might be the "sharedStrings" part.
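To illustrate the zip layout, here is a minimal sketch that lists the parts under a given path using only java.util.zip. The class name XlsxParts and both helper methods are made up for this example, and the paths "xl/worksheets/" and "xl/sharedStrings.xml" are the names Excel and POI conventionally use; strictly speaking the real locations are defined by the relationship files in the package, so treat them as an assumption. The sample archive contains placeholder content, not a valid workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of POI.
class XlsxParts {

    /** Returns the names of all zip entries whose path starts with the given prefix. */
    static List<String> partsUnder(byte[] zipBytes, String prefix) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().startsWith(prefix)) {
                    names.add(e.getName());
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Builds a tiny in-memory zip with placeholder content, only to drive the demo. */
    static byte[] sampleZip(String... entryNames) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Entry names mimic the conventional layout of an xlsx package.
        byte[] zip = sampleZip(
                "[Content_Types].xml",
                "xl/workbook.xml",
                "xl/worksheets/sheet1.xml",
                "xl/worksheets/sheet2.xml",
                "xl/sharedStrings.xml");
        System.out.println("sheets:         " + partsUnder(zip, "xl/worksheets/"));
        System.out.println("shared strings: " + partsUnder(zip, "xl/sharedStrings.xml"));
    }
}
```

For a real file you would read the bytes from disk (for example with Files.readAllBytes) instead of calling sampleZip.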
So an approach that would be much faster would be to simply extract the sheets from the xlsx zip files and combine them into a new one, and handle the shared strings somehow, either by parsing the XML and combining them or by using the related POI classes like SharedStringsTable. It still sounds like a bit of work, and I did not actually try any of this, but in theory it should be a whole lot faster than doing it row by row and cell by cell.

Dominik.

On Thu, Feb 18, 2016 at 3:57 PM, Christopher Gardner <[email protected]> wrote:

> I'm producing a large xlsx workbook with several worksheets. The data that
> will fill each worksheet comes from different sources (database queries,
> web services, etc.). I suppose that POI won't let you write multiple
> worksheets to the same file via multithreading. Hence, I'd like to create
> a single worksheet workbook for each source and, when all individual
> workbooks are created, aggregate them into one workbook (each original
> workbook in its own worksheet). I know that I will have to copy each
> worksheet, row by row, cell by cell into the aggregated worksheet.
>
> Are there better approaches to doing this?
>
> As I copy each worksheet into the workbook (assuming I'm using SXSSF) could
> I run into memory issues as the aggregated workbook grows? In other words,
> even if I'm using SXSSF, when I open the file, is the whole file loaded
> into memory even before I start writing additional items to it?
