Nick,

I've uncovered the source of the problem.  It is happening, as you
suggested, in the SharedFormulaRecord.convertSharedFormulas() method in the
call to fixupRelativeColumn().

The problem stems from the fact that it appears Excel is putting incorrect
information in the shared formula data, and that's causing the relative
column in the second parameter of the SUMIF function to be calculated
incorrectly.  It looks to me like Excel itself ignores or compensates for
this error while the POI implementation doesn't.

Here's an example of the source of a cell with a shared formula in which the
relative column is miscalculated:

<c r="S40" s="52">
  <f t="shared" ref="M40:V42"
si="59">SUMIF($AI$3:$JN$3,S$36,$AI40:$JN40)</f>
  <v>0</v>
</c>

The formula in cell S40 correctly references cell S$36 in the SUMIF()
function, but it's resolved by convertSharedFormulas() as Y$36.  The reason
is because of the "ref" attribute.  It references the cells across which
this function is shared, and in this example ref="M40:V42".

In fact, this formula was probably originally copied over the range C40:V42
but somehow the ref in S40 was changed to M40:V42.  When
fixupRelativeColumn() is called, it sees that the first column in the ref
range is M (column 12) and the relative column is S (col 18).  It computes
the offset between those two columns (18 - 12 = 6), and adds that to the
relative column (18 + 6 = 24 => Y$36).  Since Y36 is empty it causes a
NullPointerException.

Two questions, then: 1) Why is this shared formula ref attribute getting
mangled? 2) Since this mangled data exists in the *.xlsx file, how does
Excel actually handle it correctly?

I have no clue about 1), but I suspect that in the case of 2) that it simply
ignores the ref attribute when <f[unction]> element isn't blank, in other
words, if a function is actually provided.  In the example case above since
the SUMIF function is provided directly, rather than a pointer to a shared
function, the function can be evaluated directly without calculating the
relative column offsets.

How does Excel handle the other columns that actually need to use the shared
formula and the (bad) relative cell references?  Again, no clue, though I'm
guessing that when it sees a situation like this, in which a shared formula
is given but the ref attribute is wrong, that it internally adjusts the ref
to point to the correct starting cell.

In this case instead of M40:V42 it would see the SUMIF function for S30 and
internally correct the ref to be S40:V42, which would result in correct
shared formula calculations since the relative offsets would then be
correct. I'm sure there are numerous corner cases I'm not considering but
the fact is that Excel correctly calculates the formula value despite the
botched ref attribute.

I'm loathe to try to write a patch in the SharedFormulaRecord class for this
problem since it seems to be making up for an internal inconsistency that
Excel is putting into a shared formula description.

Let me know what you think of all this.

Cheers,

Chris

On Wed, Aug 10, 2011 at 4:01 AM, Nick Burch <[email protected]> wrote:

> On Tue, 9 Aug 2011, Chris McCann wrote:
>
>> <c r="R40" s="52">
>>  <f t="shared" si="58"/>
>>  <v>0</v>
>> </c>
>> <c r="S40" s="52">
>>  <f t="shared" ref="M40:V42"
>> si="59">SUMIF($AI$3:$JN$3,S$**36,$AI40:$JN40)</f>
>>  <v>0</v>
>> </c>
>>
>> My assumptions are that <f> is "formula", t = "type", si = "shared index"
>> and <v> = value.
>>
>
> Off the top of my head, I think that's correct. If you look in the file
> format documentation (linked from the poi website, but it's a bit big...)
> it'll confirm
>
>
>
>  Evaluating the cells using POI (in a JRuby app) shows the following two
>> cell
>> formulas:
>>
>> R40 = SUMIF($AI$3:$JN$3,R$36,$AI40:$**JN40)
>> S40 = SUMIF($AI$3:$JN$3,*Y$36*,$**AI40:$JN40)
>>
>> Clearly something is amiss with the Y$36 reference in S40 -- it should be
>> S$36.  The SUMIF for T46 likewise shows what looks to be the correct
>> formula
>> in the XML file but POI shows the 2nd param to the formula to be AA$36
>> instead of T$36.
>>
>
> Firstly, any chance you could open a new bug in bugzilla, and upload the
> problem file? (Or if not that one, a similar one with dummy data that shows
> the same thing). If you can, then also do a short unit test using the file
> which shows POI getting the formula correct on the first cell, but incorrect
> on a later shared one
>
> If you look in XSSFCell, then I think it's the convertSharedFormula method
> that's used to look up the formula in the later cells. If you have time, I'd
> suggest you then dive into that with a debugger against your unit test, and
> see if you can spot the incorrect logic
>
> Cheers
>
> Nick
>
> ------------------------------**------------------------------**---------
> To unsubscribe, e-mail: 
> [email protected].**org<[email protected]>
> For additional commands, e-mail: [email protected]
>
>

Reply via email to