Nick, I've uncovered the source of the problem. It is happening, as you suggested, in the SharedFormulaRecord.convertSharedFormulas() method in the call to fixupRelativeColumn().
The problem stems from the fact that it appears Excel is putting incorrect information in the shared formula data, and that's causing the relative column in the second parameter of the SUMIF function to be calculated incorrectly. It looks to me like Excel itself ignores or compensates for this error while the POI implementation doesn't. Here's an example of the source of a cell with a shared formula in which the relative column is miscalculated: <c r="S40" s="52"> <f t="shared" ref="M40:V42" si="59">SUMIF($AI$3:$JN$3,S$36,$AI40:$JN40)</f> <v>0</v> </c> The formula in cell S40 correctly references cell S$36 in the SUMIF() function, but it's resolved by convertSharedFormulas() as Y$36. The reason is because of the "ref" attribute. It references the cells across which this function is shared, and in this example ref="M40:V42". In fact, this formula was probably originally copied over the range C40:V42 but somehow the ref in S40 was changed to M40:V42. When fixupRelativeColumn() is called, it sees that the first column in the ref range is M (column 12) and the relative column is S (col 18). It computes the offset between those two columns (18 - 12 = 6), and adds that to the relative column (18 + 6 = 24 => Y$36). Since Y36 is empty it causes a NullPointerException. Two questions, then: 1) Why is this shared formula ref attribute getting mangled? 2) Since this mangled data exists in the *.xlsx file, how does Excel actually handle it correctly? I have no clue about 1), but I suspect that in the case of 2) that it simply ignores the ref attribute when <f[unction]> element isn't blank, in other words, if a function is actually provided. In the example case above since the SUMIF function is provided directly, rather than a pointer to a shared function, the function can be evaluated directly without calculating the relative column offsets. How does Excel handle the other columns that actually need to use the shared formula and the (bad) relative cell references? Again, no clue, though I'm guessing that when it sees a situation like this, in which a shared formula is given but the ref attribute is wrong, that it internally adjusts the ref to point to the correct starting cell. In this case instead of M40:V42 it would see the SUMIF function for S30 and internally correct the ref to be S40:V42, which would result in correct shared formula calculations since the relative offsets would then be correct. I'm sure there are numerous corner cases I'm not considering but the fact is that Excel correctly calculates the formula value despite the botched ref attribute. I'm loathe to try to write a patch in the SharedFormulaRecord class for this problem since it seems to be making up for an internal inconsistency that Excel is putting into a shared formula description. Let me know what you think of all this. Cheers, Chris On Wed, Aug 10, 2011 at 4:01 AM, Nick Burch <[email protected]> wrote: > On Tue, 9 Aug 2011, Chris McCann wrote: > >> <c r="R40" s="52"> >> <f t="shared" si="58"/> >> <v>0</v> >> </c> >> <c r="S40" s="52"> >> <f t="shared" ref="M40:V42" >> si="59">SUMIF($AI$3:$JN$3,S$**36,$AI40:$JN40)</f> >> <v>0</v> >> </c> >> >> My assumptions are that <f> is "formula", t = "type", si = "shared index" >> and <v> = value. >> > > Off the top of my head, I think that's correct. If you look in the file > format documentation (linked from the poi website, but it's a bit big...) > it'll confirm > > > > Evaluating the cells using POI (in a JRuby app) shows the following two >> cell >> formulas: >> >> R40 = SUMIF($AI$3:$JN$3,R$36,$AI40:$**JN40) >> S40 = SUMIF($AI$3:$JN$3,*Y$36*,$**AI40:$JN40) >> >> Clearly something is amiss with the Y$36 reference in S40 -- it should be >> S$36. The SUMIF for T46 likewise shows what looks to be the correct >> formula >> in the XML file but POI shows the 2nd param to the formula to be AA$36 >> instead of T$36. >> > > Firstly, any chance you could open a new bug in bugzilla, and upload the > problem file? (Or if not that one, a similar one with dummy data that shows > the same thing). If you can, then also do a short unit test using the file > which shows POI getting the formula correct on the first cell, but incorrect > on a later shared one > > If you look in XSSFCell, then I think it's the convertSharedFormula method > that's used to look up the formula in the later cells. If you have time, I'd > suggest you then dive into that with a debugger against your unit test, and > see if you can spot the incorrect logic > > Cheers > > Nick > > ------------------------------**------------------------------**--------- > To unsubscribe, e-mail: > [email protected].**org<[email protected]> > For additional commands, e-mail: [email protected] > >
