Darren Roberts <robertsdjguard-poi <at> yahoo.com> writes:

> 
> When you deleted the data from row 21 onwards did you just highlight the 
cells/data and press delete, or
> select all the row numbers from 21 to the end, right click and select 
delete? There is a difference, the
> former just clears the data from the cells but otherwise Excel considers 
the cells to have content
> (retains styles etc). If you do the latter it actually deletes the rows 
entirely from its internal
> representation. 
> 
> Have you opened up the xlsx file as a zip and had a look at the XML file 
for the sheet? Since you're still seeing
> SAX processing those rows it's my guess that when you saved the file Excel 
thought they were still valid
> content cells and wrote out empty cells for all of them. Try deleting the 
rows using the method I described,
> and then seeing if SAX still processes them.
> 
> >________________________________
> > From: Enrique <enrique.rochina <at> gmail.com>
> >To: user <at> poi.apache.org 
> >Sent: Monday, March 10, 2014 10:48 AM
> >Subject: XSSF - detecting end of file when reading XLSX files
> > 
> >
> >Hello,
> >
> >I'm working on making extraction of some excel files (very large files), 
> >using XSSF + SAX.
> >
> >I've used some tests files, specifically one that contained thousands of 
rows 
> >of data. I only wanted to keep few of them, so I deleted all the content 
from 
> >the 20th row. When I process this excel file, I see that since the 21th 
row, 
> >every row I processed, with empty results obviously. 
> >
> >I would rather expect to have this process finished. Is there anyway to 
say 
> >to the XSSF processor not to enter this empty rows?
> >
> >I suppose there is a relation with the fact that the file contained more 
data 
> >in this rows before, but this is a real case that our end users could do.
> >
> >Thanks for your help.
> >
> >Kind regards,
> >Enrique
> >
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: user-unsubscribe <at> poi.apache.org
> >For additional commands, e-mail: user-help <at> poi.apache.org
> >
> >
> >
> >


Hello Darren,

I've tested what you said, and that's right. My previous test was based on a 
file I deleted the cells with the Delete key. After deleting the content 
property, it is not processed by SAX. Moreover, if I've verified that the 
content in the zip file has decreased (file \xl\worksheets\sheet3.xml, many 
less row elements)

Now I have a second question that it may relate to SAX: is it possible to 
abort or terminate the processing promptly? I have a business case where I 
only need to extract the content on the 10 first rows, thus I don't need to 
spend time on processing the whole file.

I've verified, with XSSF + Usermodel it takes too long and too much memory, 
only for opening the file. So, XSSF is still a better option

Thanks



Darren Roberts <robertsdjguard-poi <at> yahoo.com> writes:



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to