hssf eval-devguide.xml eval.xml

avik Thu, 19 May 2005 04:01:16 -0700

avik        2005/05/19 03:36:06

  Modified:    src/documentation/content/xdocs/hssf eval-devguide.xml
                        eval.xml
  Log:
  new formula eval docs
  
  Revision  Changes    Path
  1.2       +186 -183  
jakarta-poi/src/documentation/content/xdocs/hssf/eval-devguide.xml
  
  Index: eval-devguide.xml
  ===================================================================
  RCS file: 
/home/cvs/jakarta-poi/src/documentation/content/xdocs/hssf/eval-devguide.xml,v
  retrieving revision 1.1
  retrieving revision 1.2
  diff -u -r1.1 -r1.2
  --- eval-devguide.xml 13 May 2005 14:52:42 -0000      1.1
  +++ eval-devguide.xml 19 May 2005 10:36:06 -0000      1.2
  @@ -1,184 +1,187 @@
  -<?xml version="1.0" encoding="UTF-8"?>
  -<!-- Copyright (C) 2005 The Apache Software Foundation. All rights reserved. 
-->
  -<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" 
"../dtd/document-v11.dtd">
  -
  -<document>
  -    <header>
  -        <title>Developing Formula Evaluation</title>
  -        <authors>
  -                     <person email="[EMAIL PROTECTED]" name="Amol Deshmukh" 
id="AD"/>
  -        </authors>
  -    </header>
  -    <body>
  -     <section><title>Introduction</title>
  -             <p>This document is for developers wishing to contribute to the 
  -                     FormulaEvaluator API functionality.</p>
  -             <p>Currently, contribution is desired for implementing the 
standard MS 
  -                     excel functions. Place holder classes for these have 
been created, 
  -                     contributors only need to insert implementation for the 
  -                     individual "evaluate()" methods that do the actual 
evaluation.</p>
  -     </section>
  -     <section><title>Overview of FormulaEvaluator </title>
  -             <p>Briefly, a formula string (along with the sheet and workbook 
that 
  -                     form the context in which the formula is evaluated) is 
first parsed 
  -                     into RPN tokens using the FormulaParser class in 
POI-HSSF main. 
  -                     (If you dont know what RPN tokens are, now is a good 
time to 
  -                     read <link 
href="http://www-stone.ch.cam.ac.uk/documentation/rrf/rpn.html";>
  -                     this</link>.) 
  -             </p>
  -             <section><title> The big picture</title>
  -                     <p>RPN tokens are mapped to Eval classes. (Class 
hierarchy for the Evals 
  -                             is best understood if you view the class 
diagram in a class diagram 
  -                             viewer.) Depending on the type of RPN token 
(also called as Ptgs 
  -                             henceforth since that is what the FormulaParser 
calls the classes) a 
  -                             specific type of Eval wrapper is constructed to 
wrap the RPN token and 
  -                             is pushed on the stack.... UNLESS the Ptg is an 
OperationPtg. If it is an 
  -                             OperationPtg, an OperationEval instance is 
created for the specific 
  -                             type of OperationPtg. And depending on how many 
operands it takes, 
  -                             that many Evals are popped of the stack and 
passed in an array to 
  -                             the OperationEval instance's evaluate method 
which returns an Eval 
  -                             of subtype ValueEval.Thus an operation in the 
formula is evaluated. </p>
  -                             <note> An Eval is of subinterface ValueEval or 
OperationEval. 
  -                             Operands are always ValueEvals, Operations are 
always OperationEvals.</note> 
  -                             <p><code>OperationEval.evaluate(Eval[])</code> 
returns an Eval which is supposed 
  -                             to be of type ValueEval (actually since 
ValueEval is an interface, 
  -                             the return value is instance of one of the 
implementations of 
  -                             ValueEval). The valueEval resulting from 
evaluate() is pushed on the 
  -                             stack and the next RPN token is evaluated.... 
this continues till 
  -                             eventually there are no more RPN tokens at 
which point, if the formula 
  -                             string was correctly parsed, there should be 
just one Eval on the 
  -                             stack - which contains the result of evaluating 
the formula.</p>
  -                     <p>Ofcourse I glossed over the details of how AreaPtg 
and ReferencePtg 
  -                             are handled a little differently, but the code 
should be self 
  -                             explanatory for that. Very briefly, the cells 
included in AreaPtg and 
  -                             RefPtg are examined and their values are 
populated in individual 
  -                             ValueEval objects which are set into the 
AreaEval and RefEval (ok, 
  -                             since AreaEval and RefEval are interfaces, the 
implementations of 
  -                             AreaEval and RefEval - but you'll figure all 
that out from the code)</p>
  -                     <p>OperationEvals for the standard operators have been 
implemented and 
  -                             basic testing has been done </p>
  -             </section>
  -             <section><title> FunctionEval and FuncVarEval</title>
  -                     <p>FunctionEval is an abstract super class of 
FuncVarEval. The reason for this is that in the FormulaParser Ptg classes, 
there are two Ptgs, FuncPtg and FuncVarPtg. In my tests, I did not see FuncPtg 
being used so there is no corresponding FuncEval right now. But in case the 
need arises for a FuncVal class, FuncEval and FuncVarEval need to be isolated 
with a common interface/abstract class, hence FunctionEval.</p>
  -                     <p>FunctionEval also contains the mapping of which 
function class maps to which function index. This mapping has been done for all 
the functions, so all you really have to do is implement the evaluate method in 
the function class that has not already been implemented. The Function indexes 
are defined in AbstractFunctionPtg class in POI main.</p>
  -             </section>
  -     </section>
  -     <section><title>Walkthrough of an "evaluate()" implementation.</title>
  -             <p>So here is the fun part - lets walk through the 
implementation of the excel 
  -                     function... <strong>AVERAGE()</strong> </p>
  -             <section><title>The Code</title>
  -             <source>
  -public Eval evaluate(Eval[] operands) {
  -    double d = 0;
  -    int count = 0;
  -    ValueEval retval = null;
  -    for (int i = 0, iSize = operands.length; i &lt; iSize; i++) {
  -        if (operands[i] == null) continue;
  -        if (operands[i] instanceof AreaEval) {
  -            AreaEval ap = (AreaEval) operands[i];
  -            Object[] values = ap.getValues();
  -            for (int j = 0, jSize = values.length; j &lt; jSize; j++) {
  -                if (values[j] == null) continue;
  -                if (values[j] instanceof NumberEval) {
  -                //inside areas, ignore bools
  -                    d += ((NumberEval) values[j]).getNumberValue();
  -                    count++;
  -                }
  -                else if (values[j] instanceof RefEval) {
  -                    RefEval re = (RefEval) values[j];
  -                    ValueEval ve = re.getInnerValueEval();
  -                    if (ve != null &amp;&amp; ve instanceof NumberEval) {
  -                        d += ((NumberEval) ve).getNumberValue();
  -                        count++;
  -                    }
  -                }
  -            }
  -        }
  -        else if (operands[i] instanceof NumericValueEval) { 
  -            // for direct operands evaluate bools
  -            NumericValueEval np = (NumericValueEval) operands[i];
  -            d += np.getNumberValue();
  -            count++;
  -        }
  -        else if (operands[i] instanceof RefEval) {
  -            RefEval re = (RefEval) operands[i];
  -            ValueEval ve = re.getInnerValueEval();
  -            if (ve instanceof NumberEval) { 
  -                //if it is a reference, ignore bools
  -                NumberEval ne = (NumberEval) ve;
  -                d += ne.getNumberValue();
  -                count++;
  -            }
  -        }
  -    }
  -
  -    if (retval == null) {
  -        retval = (Double.isNaN(d)) ? 
  -          (ValueEval) ErrorEval.ERROR_503 : new NumberEval(d/count);
  -    }
  -    return retval;
  -}
  -     </source>
  -             </section>
  -             <section><title>Implementation Details</title>
  -             <ul>
  -                     <li>The implementation of the AVERAGE function lives in 
package 
  -                             o.a.p.hssf.record.formula.functions named  
Average.java. 
  -                             (Every excel function has a corresponding java 
source file 
  -                             in the above package) </li>
  -                     <li>If you open the file for a function thats not yet 
implemented, you will see one un-implemented method: 
  -                             <code>public Eval evaluate(Eval[] operands) 
{}</code> </li>
  -                     <li>Since the excel Average() function can take 1 or 
more operands, we iterate over all operands that are passed in the evaluate 
method:
  -                             <code>for (int i=0, iSize=operands.length; 
i&lt;iSize; i++) {...}</code></li>
  -                     <li>inside the loop, you will do the following
  -                             <ol>
  -                                     <li>Do a null check: <code>if 
(operands[i] == null) continue;</code></li>
  -                                     <li>Figure out the actual subtype of 
ValueEval that the operands 
  -                                             implements. The possible types 
that you will encounter in an 
  -                                             evaluate() are: NumberEval, 
BoolEval, StringEval, ErrorEval, 
  -                                             AreaEval, RefEval, 
BlankEval.</li>
  -                                     <li>Implement the function. See the 
next section for some
  -                                             caveats on implementing the 
Excel semantics. </li>
  -                             </ol>
  -                     </li>
  -                     <li>Finally before returning the NumberEval wrapping 
the double value that 
  -                             you computed, do one final check to see if the 
double is a NaN, 
  -                             if it is return ErrorEval.ERROR_503 (see the 
javadoc in ErrorEval.java 
  -                             for description of error codes - it is html so 
you might as well 
  -                             generate the javadocs)</li>
  -                     
  -             </ul>
  -             </section>
  -             <section><title>Modelling Excel Semantics</title>
  -                     <p>Strings are ignored. Booleans are ignored!!! (damn 
Oo.o!  I was almost misled here - nevermind). Actually here's the info on 
Bools: 
  -                             if you have formula: "=TRUE+1", it evaluates to 
2. 
  -                             So also, when you use TRUE like this: 
"=SUM(1,TRUE)", you see the result is: 2. 
  -                             So TRUE means 1 when doing numeric 
calculations, right? 
  -                             Wrong!
  -                             Because when you use TRUE in referenced cells 
with arithmetic functions, it evaluates to blank - meaning it is not evaluated 
- as if it was string or a blank cell. 
  -                             eg. "=SUM(1,A1)" when A1 is TRUE evaluates to 1.
  -                             So you have to do this kind of check for every 
possible data type as a function argument for any function before you 
understand the behaviour of the function. The operands can be entered in excel 
as comma separated or as a region specified like: A2:D4. Regions are treated as 
a single token by the parser hence we have AreaEval which stores the ValueEval 
at each cell in a region in a 1D array. So in our function if the operand is of 
type AreaEval we need to get the array of ValueEvals in the region of the 
AreaEval and iterate over each of them as if each of them were individual 
operands to the AVERAGE function. 
  -                             </p>
  -                     <p>Thus, since sometimes, Excel treats 
  -                                             Booleans as the numbers 0 and 1 
(for F and T respectively). 
  -                                             Hence BoolEval and NumberEval 
both implement a common interface: 
  -                                             NumericValueEval (since numbers 
and bools are also valid string 
  -                                             values, they also implement 
StringValueEval interface which is 
  -                                             also implemented by 
StringEval).</p>
  -                     <p>
  -                     The ValueEval inside an AreaEval can be one of: 
  -                             NumberEval, BoolEval, StringEval, ErrorEval, 
BlankEval. 
  -                     So you must handle each of these cases. 
  -                     Similarly, RefEvals have a property: innerValueEval 
that returns the ValueEval at the referenced cell. The ValueEval inside a 
RefEval can be one of: NumberEval, BoolEval, StringEval, ErrorEval, BlankEval. 
So you must handle each of these cases  - see how excel treats each one of them.
  -                     </p>                            
  -
  -             </section>
  -     </section>
  -     <section><title>Testing Framework</title>
  -     <fixme author="AD">TODO! FormulaEval comes with a testing framework, 
where you add 
  -             formula's and their expected values to an Excel sheet, and the 
test code
  -             automatically validates them. Since this is still in flux, the 
docs
  -             will be put online once the system is stable </fixme>   
  -     </section>
  -     </body>
  +<?xml version="1.0" encoding="UTF-8"?>


  +<!-- Copyright (C) 2005 The Apache Software Foundation. All rights reserved. 
-->

  +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" 
"../dtd/document-v11.dtd">

  +

  +<document>

  +    <header>

  +        <title>Developing Formula Evaluation</title>

  +        <authors>

  +                     <person email="[EMAIL PROTECTED]" name="Amol Deshmukh" 
id="AD"/>

  +        </authors>

  +    </header>

  +    <body>

  +     <section><title>Introduction</title>

  +             <p>This document is for developers wishing to contribute to the 

  +                     FormulaEvaluator API functionality.</p>

  +             <p>Currently, contribution is desired for implementing the 
standard MS 

  +                     excel functions. Place holder classes for these have 
been created, 

  +                     contributors only need to insert implementation for the 

  +                     individual "evaluate()" methods that do the actual 
evaluation.</p>

  +     </section>

  +     <section><title>Overview of FormulaEvaluator </title>

  +             <p>Briefly, a formula string (along with the sheet and workbook 
that 

  +                     form the context in which the formula is evaluated) is 
first parsed 

  +                     into RPN tokens using the FormulaParser class in 
POI-HSSF main. 

  +                     (If you dont know what RPN tokens are, now is a good 
time to 

  +                     read <link 
href="http://www-stone.ch.cam.ac.uk/documentation/rrf/rpn.html";>

  +                     this</link>.)

  +             </p>

  +             <section><title> The big picture</title>

  +                     <p>RPN tokens are mapped to Eval classes. (Class 
hierarchy for the Evals 

  +                             is best understood if you view the class 
diagram in a class diagram 

  +                             viewer.) Depending on the type of RPN token 
(also called as Ptgs 

  +                             henceforth since that is what the FormulaParser 
calls the classes) a 

  +                             specific type of Eval wrapper is constructed to 
wrap the RPN token and 

  +                             is pushed on the stack.... UNLESS the Ptg is an 
OperationPtg. If it is an 

  +                             OperationPtg, an OperationEval instance is 
created for the specific 

  +                             type of OperationPtg. And depending on how many 
operands it takes, 

  +                             that many Evals are popped of the stack and 
passed in an array to 

  +                             the OperationEval instance's evaluate method 
which returns an Eval 

  +                             of subtype ValueEval.Thus an operation in the 
formula is evaluated. </p>

  +                             <note> An Eval is of subinterface ValueEval or 
OperationEval. 

  +                             Operands are always ValueEvals, Operations are 
always OperationEvals.</note> 

  +                             <p><code>OperationEval.evaluate(Eval[])</code> 
returns an Eval which is supposed 

  +                             to be of type ValueEval (actually since 
ValueEval is an interface, 

  +                             the return value is instance of one of the 
implementations of 

  +                             ValueEval). The valueEval resulting from 
evaluate() is pushed on the 

  +                             stack and the next RPN token is evaluated.... 
this continues till 

  +                             eventually there are no more RPN tokens at 
which point, if the formula 

  +                             string was correctly parsed, there should be 
just one Eval on the 

  +                             stack - which contains the result of evaluating 
the formula.</p>

  +                     <p>Ofcourse I glossed over the details of how AreaPtg 
and ReferencePtg 

  +                             are handled a little differently, but the code 
should be self 

  +                             explanatory for that. Very briefly, the cells 
included in AreaPtg and 

  +                             RefPtg are examined and their values are 
populated in individual 

  +                             ValueEval objects which are set into the 
AreaEval and RefEval (ok, 

  +                             since AreaEval and RefEval are interfaces, the 
implementations of 

  +                             AreaEval and RefEval - but you'll figure all 
that out from the code)</p>

  +                     <p>OperationEvals for the standard operators have been 
implemented and tested.</p>

  +             </section>

  +             <section><title> FunctionEval and FuncVarEval</title>

  +                     <p>FunctionEval is an abstract super class of 
FuncVarEval. The reason for this is that in the FormulaParser Ptg classes, 
there are two Ptgs, FuncPtg and FuncVarPtg. In my tests, I did not see FuncPtg 
being used so there is no corresponding FuncEval right now. But in case the 
need arises for a FuncVal class, FuncEval and FuncVarEval need to be isolated 
with a common interface/abstract class, hence FunctionEval.</p>

  +                     <p>FunctionEval also contains the mapping of which 
function class maps to which function index. This mapping has been done for all 
the functions, so all you really have to do is implement the evaluate method in 
the function class that has not already been implemented. The Function indexes 
are defined in AbstractFunctionPtg class in POI main.</p>

  +             </section>

  +     </section>

  +     <section><title>Walkthrough of an "evaluate()" implementation.</title>

  +             <p>So here is the fun part - lets walk through the 
implementation of the excel 

  +                     function... <strong>SQRT()</strong> </p>

  +             <section><title>The Code</title>

  +             <source>

  +public class Sqrt extends NumericFunction {

  +    

  +    private static final ValueEvalToNumericXlator NUM_XLATOR = 

  +        new ValueEvalToNumericXlator((short)

  +                ( ValueEvalToNumericXlator.BOOL_IS_PARSED 

  +                | ValueEvalToNumericXlator.EVALUATED_REF_BOOL_IS_PARSED

  +                | ValueEvalToNumericXlator.EVALUATED_REF_STRING_IS_PARSED

  +                | ValueEvalToNumericXlator.REF_BOOL_IS_PARSED

  +                | ValueEvalToNumericXlator.STRING_IS_PARSED

  +                ));

  +

  +    protected ValueEvalToNumericXlator getXlator() {

  +        return NUM_XLATOR;

  +    }

  +

  +    public Eval evaluate(Eval[] operands, int srcRow, short srcCol) {

  +        double d = 0;

  +        ValueEval retval = null;

  +        

  +        switch (operands.length) {

  +        default:

  +            retval = ErrorEval.VALUE_INVALID;

  +            break;

  +        case 1:

  +            ValueEval ve = singleOperandEvaluate(operands[0], srcRow, 
srcCol);

  +            if (ve instanceof NumericValueEval) {

  +                NumericValueEval ne = (NumericValueEval) ve;

  +                d = ne.getNumberValue();

  +            }

  +            else if (ve instanceof BlankEval) {

  +                // do nothing

  +            }

  +            else {

  +                retval = ErrorEval.NUM_ERROR;

  +            }

  +        }

  +        

  +        if (retval == null) {

  +            d = Math.sqrt(d);

  +            retval = (Double.isNaN(d)) ? (ValueEval) ErrorEval.VALUE_INVALID 
: new NumberEval(d);

  +        }

  +        return retval;

  +    }

  +

  +}

  +     </source>

  +             </section>

  +             <section><title>Implementation Details</title>

  +             <ul>

  +                     <li>The first thing to realise is that classes already 
exist, even for functions that are not yet implemented.

  +                     Just that they extend from DefaultFunctionImpl whose 
behaviour is to return an ErrorEval.FUNCTION_NOT_IMPLEMENTED value.</li>

  +                     <li>In order to implement SQRT(..), we need to: a. 
Extend from the correct Abstract super class; b. implement the evaluate(..) 
method</li>

  +                     <li>Hence we extend SQRT(..) from the predefined class 
NumericFunction</li>

  +                     <li>Since SQRT(..) takes a single argument, we verify 
the length of the operands array else set the return value to 
ErrorEval.VALUE_INVALID</li>

  +                     <li>Next we normalize each operand to a limited set of 
ValueEval subtypes, specifically, we call the function 

  +                     <code>singleOperandEvaluate(..)</code> to do 
conversions of different value eval types to one of: NumericValueEval,

  +                     BlankEval and ErrorEval. The conversion logic is 
configured by a ValueEvalToNumericXlator instance which

  +                     is returned by the Factory method: 
<code>getXlator(..)</code> The flags used to create the ValueEvalToNumericXlator

  +                     instance are briefly explained as follows:

  +                     BOOL_IS_PARSED means whether this function treats 
Boolean values as 1, 

  +                     REF_BOOL_IS_PARSED means whether Boolean values in cell 
references are parsed or not.

  +                     So also, EVALUATED_REF_BOOL_IS_PARSED means if the 
operand was a RefEval that was assigned a

  +                     Boolean value as a result of evaluation of the formula 
that it contained.

  +                     eg. SQRT(TRUE) returns 1: This means BOOL_IS_PARSED 
should be set.

  +                     SQRT(A1) returns 1 when A1 has TRUE: This means 
REF_BOOL_IS_PARSED should be set.

  +                     SQRT(A1) returns 1 when A1 has a formula that evaluates 
to TRUE: This means EVALUATED_REF_BOOL_IS_PARSED should be set.

  +                     If the flag is not set for a particular case, that case 
is ignored (treated as if the cell is blank) _unless_

  +                     there is a flag like: STRING_IS_INVALID_VALUE (which 
means that Strings should be treated as resulting in VALUE_INVALID ErrorEval)

  +                     </li>

  +                     <li>Next perform the appropriate Math function on the 
double value (if an error didnt occur already).</li>

  +                     <li>Finally before returning the NumberEval wrapping 
the double value that 

  +                             you computed, do one final check to see if the 
double is a NaN, (or if it is "Infinite")

  +                             If it is return the appropriate ErrorEval 
instance. Note: The OpenOffice.org error codes

  +                             should NOT be preferred. Instead use the excel 
specific error codes like VALUE_INVALID, NUM_ERROR, DIV_ZERO etc. 

  +                             (Thanks to Avik for bringing this issue up 
early!) The Oo.o ErrorCodes will be removed (if they havent already been :)</li>

  +             </ul>

  +             </section>

  +             <section><title>Modelling Excel Semantics</title>

  +                     <p>Strings are ignored. Booleans are ignored!!!. 
Actually here's the info on Bools: 

  +                             if you have formula: "=TRUE+1", it evaluates to 
2. 

  +                             So also, when you use TRUE like this: 
"=SUM(1,TRUE)", you see the result is: 2. 

  +                             So TRUE means 1 when doing numeric 
calculations, right? 

  +                             Wrong!

  +                             Because when you use TRUE in referenced cells 
with arithmetic functions, it evaluates to blank - meaning it is not evaluated 
- as if it was string or a blank cell. 

  +                             eg. "=SUM(1,A1)" when A1 is TRUE evaluates to 1.

  +                             This behaviour changes depending on which 
function you are using. eg. SQRT(..) that was 

  +                             described earlier treats a TRUE as 1 in all 
cases. This is why the configurable ValueEvalToNumericXlator

  +                             class had to be written.

  +                             </p>

  +                     <p>Note that when you are extending from an abstract 
function class like

  +                     NumericFunction (rather than implementing the interface 
o.a.p.hssf.record.formula.eval.Function directly)

  +                     you can use the utility methods in the super class - 
singleOperandEvaluate(..) - to quickly

  +                     reduce the different ValueEval subtypes to a small set 
of possible types. However when

  +                     implemenitng the Function interface directly, you will 
have to handle the possiblity

  +                     of all different ValueEval subtypes being sent in as 
'operands'. (Hard to put this in

  +                     word, please have a look at the code for 
NumericFunction for an example of

  +                     how/why different ValueEvals need to be handled)

  +                     </p>

  +             </section>

  +     </section>

  +     <section><title>Testing Framework</title>

  +     <p>Automated testing of the implemented Function is easy.

  +     The source code for this is in the file: 
o.a.p.h.record.formula.GenericFormulaTestCase.java

  +     This class has a reference to the test xls file (not /a/ test xls, 
/the/ test xls :)

  +     which may need to be changed for your environment. Once you do that, in 
the test xls,

  +     locate the entry for the function that you have implemented and enter 
different tests 

  +     in a cell in the FORMULA row. Then copy the "value of" the formula that 
you entered in the

  +     cell just below it (this is easily done in excel as: 

  +     [copy the formula cell] > [go to cell below] > Edit > Paste Special > 
Values > "ok").

  +     You can enter multiple such formulas and paste their values in the cell 
below and the

  +     test framework will automatically test if the formula evaluation 
matches the expected

  +     value (Again, hard to put in words, so if you will, please take time to 
quickly look

  +     at the code and the currently entered tests in the patch attachment 
"FormulaEvalTestData.xls" 

  +     file).

  +     </p>    

  +     </section>

  +     </body>

   </document>
  \ No newline at end of file
  
  
  
  1.2       +145 -151  jakarta-poi/src/documentation/content/xdocs/hssf/eval.xml
  
  Index: eval.xml
  ===================================================================
  RCS file: 
/home/cvs/jakarta-poi/src/documentation/content/xdocs/hssf/eval.xml,v
  retrieving revision 1.1
  retrieving revision 1.2
  diff -u -r1.1 -r1.2
  --- eval.xml  13 May 2005 14:52:42 -0000      1.1
  +++ eval.xml  19 May 2005 10:36:06 -0000      1.2
  @@ -1,151 +1,145 @@
  -<?xml version="1.0" encoding="UTF-8"?>
  -<!-- Copyright (C) 2005 The Apache Software Foundation. All rights reserved. 
-->
  -<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" 
"../dtd/document-v11.dtd">
  -
  -<document>
  -    <header>
  -        <title>Formula Evaluation</title>
  -        <authors>
  -                     <person email="[EMAIL PROTECTED]" name="Amol Deshmukh" 
id="AD"/>
  -        </authors>
  -    </header>
  -    <body>
  -             <section><title>Introduction</title>
  -                     <p>The POI formula evaluation code enables you to 
calculate the result of 
  -                             formulas in Excels sheets read-in, or created 
in POI. This document explains
  -                             how to use the API to evaluate your formulas. 
  -                     </p>
  -                     <warning> This code currently lives in Bugzilla as 
  -                             <link 
href="http://issues.apache.org/bugzilla/show_bug.cgi?id=34828";> 
  -                             bug 34828 </link>. It is expected to land in 
POI CVS in the scratchpad 
  -                             area soon.
  -                     </warning>
  -             </section>
  -             <section><title>Status</title>
  -                     <p>     The code currently provides implementations for 
all the arithmatic operators.
  -                             It also provides implementations for about 30 
built in 
  -                             functions in Excel. The framework however makes 
is easy to add 
  -                             implementation of new functions. See the <link 
href="eval-devguide.html"> Formula
  -                             evaluation development guide</link> for 
details. </p>
  -                     <p> Note that user-defined functions are not supported, 
and is not likely to done
  -                             any time soon... at least, not till there is a 
VB implementation in Java!
  -                     </p>
  -             </section>
  -             <section><title>User API How-TO</title>
  -                     <p>The following code demonstrates how to use the 
HSSFFormulaEvaluator 
  -                             in the context of other POI excel reading code.
  -                     </p>
  -                     <p>There are two ways in which you can use the 
HSSFFormulaEvalutator API.</p>
  -                     <section><title>Using 
HSSFFormulaEvaluator.<strong>evaluate</strong>(HSSFCell cell)</title>
  -                             <source>
  -FileInputStream fis = new FileInputStream("c:/temp/test.xls");
  -HSSFWorkbook wb = new HSSFWorkbook(fis);
  -HSSFSheet sheet = wb.getSheetAt(0);
  -HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb);
  -
  -// suppose your formula is in B3
  -CellReference cellReference = new CellReference("B3"); 
  -HSSFRow row = sheet.getRow(cellReference.getRow());
  -HSSFCell cell = row.getCell(cellReference.getCol()); 
  -String formulaString = c.getCellFormula();
  -HSSFFormulaEvaluator.CellValue cellValue = 
  -        evaluator.evaluate(formulaString);
  -
  -switch (cellValue.getCellType()) {
  -     case HSSFCell.CELL_TYPE_BOOLEAN:
  -     System.out.println(cellValue.getBooleanCellValue());
  -     break;
  -     case HSSFCell.CELL_TYPE_NUMERIC:
  -     System.out.println(cellValue.getNumberCellValue());
  -     break;
  -     case HSSFCell.CELL_TYPE_STRING:
  -     System.out.println(cellValue.getStringCellValue());
  -     break;
  -     case HSSFCell.CELL_TYPE_BLANK:
  -     break;
  -     case HSSFCell.CELL_TYPE_ERROR:
  -     break;
  -
  -     // CELL_TYPE_FORMULA will never happen
  -     case HSSFCell.CELL_TYPE_FORMULA: 
  -     break;
  -}                            
  -                             </source>
  -                             <p>Thus using the retrieved value (of type 
  -                                     HSSFFormulaEvaluator.CellValue - a 
nested class) returned 
  -                                     by HSSFFormulaEvaluator is similar to 
using a HSSFCell object 
  -                                     containing the value of the formula 
evaluation. CellValue is 
  -                                     a simple value object and does not 
maintain reference 
  -                                     to the original cell.
  -                             </p>
  -                             
  -                     </section>
  -                     <section><title>Using 
HSSFFormulaEvaluator.<strong>evaluateInCell</strong>(HSSFCell cell)
  -                             </title>
  -                             <source>
  -FileInputStream fis = new FileInputStream("c:/temp/test.xls");
  -HSSFWorkbook wb = new HSSFWorkbook(fis);
  -HSSFSheet sheet = wb.getSheetAt(0);
  -HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb);
  -
  -// suppose your formula is in B3
  -CellReference cellReference = new CellReference("B3"); 
  -HSSFRow row = sheet.getRow(cellReference.getRow());
  -HSSFCell cell = row.getCell(cellReference.getCol()); 
  -String formulaString = c.getCellFormula();
  -
  -if (cell!=null) {
  -     switch (<strong>evaluator.evaluateInCell</strong>(cell).getCellType()) {
  -             case HSSFCell.CELL_TYPE_BOOLEAN:
  -                 System.out.println(cell.getBooleanCellValue());
  -                 break;
  -             case HSSFCell.CELL_TYPE_NUMERIC:
  -                 System.out.println(cell.getNumberCellValue());
  -                 break;
  -             case HSSFCell.CELL_TYPE_STRING:
  -                 System.out.println(cell.getStringCellValue());
  -                 break;
  -             case HSSFCell.CELL_TYPE_BLANK:
  -                 break;
  -             case HSSFCell.CELL_TYPE_ERROR:
  -                 System.out.println(cell.getErrorCellValue());
  -                 break;
  -             
  -             // CELL_TYPE_FORMULA will never occur
  -             case HSSFCell.CELL_TYPE_FORMULA: 
  -                 break;
  -     }
  -}
  -                                     </source>
  -
  -                             </section>
  -             </section>
  -             
  -             <section><title></title>
  -                     
  -             </section>
  -             
  -             <section><title>Performance Notes</title>
  -                     <ul>
  -                             <li>Generally you should have to create only 
one HSSFFormulaEvaluator 
  -                                     instance per sheet, but there really is 
no overhead in creating 
  -                                     multiple HSSFFormulaEvaluators per 
sheet other than that of the 
  -                                     HSSFFormulaEvaluator object creation. 
  -                             </li>
  -                             <li>Also note that HSSFFormulaEvaluator 
maintains a reference to 
  -                                     the sheet and workbook, so ensure that 
the evaluator instance 
  -                                     is available for garbage collection 
when you are done with it 
  -                                     (in other words don't maintain long 
lived reference to 
  -                                     HSSFFormulaEvaluator if you don't 
really need to - unless 
  -                                     all references to the sheet and 
workbook are removed, these 
  -                                     don't get garbage collected and 
continue to occupy potentially 
  -                                     large amounts of memory). 
  -                             </li>   
  -                             <li>CellValue instances however do not maintain 
reference to the 
  -                                     HSSFCell or the sheet or workbook, so 
these can be long-lived 
  -                                     objects without any adverse effect on 
performance.
  -                             </li>
  -                     </ul>
  -             </section>
  -     </body>
  -</document>
  +<?xml version="1.0" encoding="UTF-8"?>

  +<!-- Copyright (C) 2005 The Apache Software Foundation. All rights reserved. 
-->

  +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" 
"../dtd/document-v11.dtd">

  +

  +<document>

  +    <header>

  +        <title>Formula Evaluation</title>

  +        <authors>

  +                     <person email="[EMAIL PROTECTED]" name="Amol Deshmukh" 
id="AD"/>

  +        </authors>

  +    </header>

  +    <body>

  +             <section><title>Introduction</title>

  +                     <p>The POI formula evaluation code enables you to 
calculate the result of 

  +                             formulas in Excels sheets read-in, or created 
in POI. This document explains

  +                             how to use the API to evaluate your formulas. 

  +                     </p>

  +                     <warning> This code currently lives in Bugzilla as 

  +                             <link 
href="http://issues.apache.org/bugzilla/show_bug.cgi?id=34828";> 

  +                             bug 34828 </link>. It is expected to land in 
POI CVS in the scratchpad 

  +                             area soon.

  +                     </warning>

  +             </section>

  +             <section><title>Status</title>

  +                     <p>     The code currently provides implementations for 
all the arithmatic operators.

  +                             It also provides implementations for approx. 20 
built in 

  +                             functions in Excel. The framework however makes 
is easy to add 

  +                             implementation of new functions. See the <link 
href="eval-devguide.html"> Formula

  +                             evaluation development guide</link> for 
details. </p>

  +                     <p> Note that user-defined functions are not supported, 
and is not likely to done

  +                             any time soon... at least, not till there is a 
VB implementation in Java!

  +                     </p>

  +             </section>

  +             <section><title>User API How-TO</title>

  +                     <p>The following code demonstrates how to use the 
HSSFFormulaEvaluator 

  +                             in the context of other POI excel reading code.

  +                     </p>

  +                     <p>There are two ways in which you can use the 
HSSFFormulaEvalutator API.</p>

  +                     <section><title>Using 
HSSFFormulaEvaluator.<strong>evaluate</strong>(HSSFCell cell)</title>

  +                             <source>

  +FileInputStream fis = new FileInputStream("c:/temp/test.xls");

  +HSSFWorkbook wb = new HSSFWorkbook(fis);

  +HSSFSheet sheet = wb.getSheetAt(0);

  +HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb);

  +

  +// suppose your formula is in B3

  +CellReference cellReference = new CellReference("B3"); 

  +HSSFRow row = sheet.getRow(cellReference.getRow());

  +HSSFCell cell = row.getCell(cellReference.getCol()); 

  +HSSFFormulaEvaluator.CellValue cellValue = evaluator.evaluate(cell);

  +

  +switch (cellValue.getCellType()) {

  +     case HSSFCell.CELL_TYPE_BOOLEAN:

  +     System.out.println(cellValue.getBooleanValue());

  +     break;

  +     case HSSFCell.CELL_TYPE_NUMERIC:

  +     System.out.println(cellValue.getNumberValue());

  +     break;

  +     case HSSFCell.CELL_TYPE_STRING:

  +     System.out.println(cellValue.getStringValue());

  +     break;

  +     case HSSFCell.CELL_TYPE_BLANK:

  +     break;

  +     case HSSFCell.CELL_TYPE_ERROR:

  +     break;

  +

  +     // CELL_TYPE_FORMULA will never happen

  +     case HSSFCell.CELL_TYPE_FORMULA: 

  +     break;

  +}                            

  +                             </source>

  +                             <p>Thus using the retrieved value (of type 

  +                                     HSSFFormulaEvaluator.CellValue - a 
nested class) returned 

  +                                     by HSSFFormulaEvaluator is similar to 
using a HSSFCell object 

  +                                     containing the value of the formula 
evaluation. CellValue is 

  +                                     a simple value object and does not 
maintain reference 

  +                                     to the original cell.

  +                             </p>

  +                             

  +                     </section>

  +                     <section><title>Using 
HSSFFormulaEvaluator.<strong>evaluateInCell</strong>(HSSFCell cell)

  +                             </title>

  +                             <source>

  +FileInputStream fis = new FileInputStream("/somepath/test.xls");

  +HSSFWorkbook wb = new HSSFWorkbook(fis);

  +HSSFSheet sheet = wb.getSheetAt(0);

  +HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb);

  +

  +// suppose your formula is in B3

  +CellReference cellReference = new CellReference("B3"); 

  +HSSFRow row = sheet.getRow(cellReference.getRow());

  +HSSFCell cell = row.getCell(cellReference.getCol()); 

  +

  +

  +if (cell!=null) {

  +     switch (<strong>evaluator.evaluateInCell</strong>(cell).getCellType()) {

  +             case HSSFCell.CELL_TYPE_BOOLEAN:

  +                 System.out.println(cell.getBooleanCellValue());

  +                 break;

  +             case HSSFCell.CELL_TYPE_NUMERIC:

  +                 System.out.println(cell.getNumberCellValue());

  +                 break;

  +             case HSSFCell.CELL_TYPE_STRING:

  +                 System.out.println(cell.getStringCellValue());

  +                 break;

  +             case HSSFCell.CELL_TYPE_BLANK:

  +                 break;

  +             case HSSFCell.CELL_TYPE_ERROR:

  +                 System.out.println(cell.getErrorCellValue());

  +                 break;

  +             

  +             // CELL_TYPE_FORMULA will never occur

  +             case HSSFCell.CELL_TYPE_FORMULA: 

  +                 break;

  +     }

  +}

  +                                     </source>

  +

  +                             </section>

  +             </section>

  +             

  +             <section><title>Performance Notes</title>

  +                     <ul>

  +                             <li>Generally you should have to create only 
one HSSFFormulaEvaluator 

  +                                     instance per sheet, but there really is 
no overhead in creating 

  +                                     multiple HSSFFormulaEvaluators per 
sheet other than that of the 

  +                                     HSSFFormulaEvaluator object creation. 

  +                             </li>

  +                             <li>Also note that HSSFFormulaEvaluator 
maintains a reference to 

  +                                     the sheet and workbook, so ensure that 
the evaluator instance 

  +                                     is available for garbage collection 
when you are done with it 

  +                                     (in other words don't maintain long 
lived reference to 

  +                                     HSSFFormulaEvaluator if you don't 
really need to - unless 

  +                                     all references to the sheet and 
workbook are removed, these 

  +                                     don't get garbage collected and 
continue to occupy potentially 

  +                                     large amounts of memory). 

  +                             </li>   

  +                             <li>CellValue instances however do not maintain 
reference to the 

  +                                     HSSFCell or the sheet or workbook, so 
these can be long-lived 

  +                                     objects without any adverse effect on 
performance.

  +                             </li>

  +                     </ul>

  +             </section>

  +     </body>

  +</document>

  
  
  

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/

cvs commit: jakarta-poi/src/documentation/content/xdocs/hssf eval-devguide.xml eval.xml

Reply via email to