Hi Bryce,
Your work here is very interesting. I think there are probably some
possibilities in creating a more efficient structure to hold rows of data,
probably down to the level of the cells within each row.
Somehow the default behavior of OOXML would need to be overridden. I'm not sure
about how to do it.
> Continuing to chip away at CTSheetDataImpl's performance issues.
> However, the problematic code is unlikely to be changed. I have a discussion
> going on the XMLBeans website, and as it turns out that code is most likely
> not going to change, or will not change easily.
It might be illuminating if you could either highlight some of the issues
discussed on the XMLBeans list, or provide a link to the thread.
If we are going to hack into or over XMLBeans we have to make a strong case.
> With that in mind, I started taking a close look at the POI side of things
> and very specifically the problematic code from a POI perspective.
I think what we are discussing is implementing random access to the
in-memory cell objects, done in a way that keeps the linear/serial structure
this silly algorithm is preserving: use a tree rather than an array, and
walk it when output is needed.
This would significantly improve the performance of ooxml.
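To make the idea concrete, here is a minimal sketch of what I have in mind.
It is only an illustration under my own assumptions: SparseSheet, setCell,
getCell and serialize are hypothetical names, not POI or XMLBeans classes,
and plain strings stand in for the real cell objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for row/cell storage; not a POI or XMLBeans class. */
class SparseSheet {
    // Row index -> (column index -> cell value). TreeMap keeps its keys sorted,
    // so the serial order is preserved without shifting array elements.
    private final TreeMap<Integer, TreeMap<Integer, String>> rows = new TreeMap<>();

    /** Random-access write: O(log n) in the number of rows and cells. */
    void setCell(int row, int col, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(col, value);
    }

    /** Random-access read; returns null if the cell does not exist. */
    String getCell(int row, int col) {
        TreeMap<Integer, String> cells = rows.get(row);
        return cells == null ? null : cells.get(col);
    }

    /** Walks the tree in row-then-column order, as would be needed at write time. */
    List<String> serialize() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, TreeMap<Integer, String>> row : rows.entrySet()) {
            for (Map.Entry<Integer, String> cell : row.getValue().entrySet()) {
                out.add("R" + row.getKey() + "C" + cell.getKey() + "=" + cell.getValue());
            }
        }
        return out;
    }
}
```

Cells can be added in any order, and the walk in serialize() still visits
them sorted by row and then by column. Whether the XMLBeans-backed objects
can be fed from such a structure cheaply is exactly the open question.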
Regards,
Dave
>
>
> The suspect code is at approximately line 2368 of XSSFSheet.java.
> I have modified it to break 1 line into 3; the slow line is the last one.
>
> CTRow[] lctRow = new CTRow[rArray.size()];
> lctRow = rArray.toArray(lctRow);
> sheetData.setRowArray(lctRow);
>
>
>
> My question is: looking at XSSFSheet.java, are there any assumptions that
> we can make about the state of sheetData (a CTSheetData object) that would
> allow us either to override the default setRowArray(lctRow) method, or to
> sidestep this call altogether by calling a method that is not generic but
> instead much more targeted to the function and process that is currently
> going on?
>
> Here is the full call in context, taken from my modified version of
> XSSFSheet.java:
>
>
> protected void write(OutputStream out) throws IOException {
>
>     if(worksheet.getColsArray().length == 1) {
>         CTCols col = worksheet.getColsArray(0);
>         CTCol[] cols = col.getColArray();
>         if(cols.length == 0) {
>             worksheet.setColsArray(null);
>         }
>     }
>
>     // Now re-generate our CTHyperlinks, if needed
>     if(hyperlinks.size() > 0) {
>         if(worksheet.getHyperlinks() == null) {
>             worksheet.addNewHyperlinks();
>         }
>         CTHyperlink[] ctHls = new CTHyperlink[hyperlinks.size()];
>         for(int i=0; i<ctHls.length; i++) {
>             // If our sheet has hyperlinks, have them add
>             // any relationships that they might need
>             XSSFHyperlink hyperlink = hyperlinks.get(i);
>             hyperlink.generateRelationIfNeeded(getPackagePart());
>             // Now grab their underlying object
>             ctHls[i] = hyperlink.getCTHyperlink();
>         }
>         worksheet.getHyperlinks().setHyperlinkArray(ctHls);
>     }
>
>     CTSheetData sheetData = worksheet.getSheetData();
>     ArrayList<CTRow> rArray = new ArrayList<CTRow>(rows.size());
>     for(XSSFRow row : rows.values()){
>         row.onDocumentWrite();
>         rArray.add(row.getCTRow());
>     }
>     long startTimeSheetData = System.currentTimeMillis();
>     CTRow[] lctRow = new CTRow[rArray.size()];
>     lctRow = rArray.toArray(lctRow);
>     sheetData.setRowArray(lctRow);
>     System.out.println("Sheet Time : " +
>         (System.currentTimeMillis() - startTimeSheetData));
>
>
>     XmlOptions xmlOptions = new XmlOptions(DEFAULT_XML_OPTIONS);
>     xmlOptions.setSaveSyntheticDocumentElement(new
>         QName(CTWorksheet.type.getName().getNamespaceURI(), "worksheet"));
>     Map<String, String> map = new HashMap<String, String>();
>     map.put(STRelationshipId.type.getName().getNamespaceURI(), "r");
>     xmlOptions.setSaveSuggestedPrefixes(map);
>     long startTimeWorkSheetSave = System.currentTimeMillis();
>
>     worksheet.save(out, xmlOptions);
>     System.out.println("Worksheet save Time : " +
>         (System.currentTimeMillis() - startTimeWorkSheetSave));
>
> }
>
>
>
> Best Regards
> Bryce Alcock
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>