Thanks Nick!

Those two references helped a ton.  I've solved the problem, and posting
some of my steps here just for Google Fodder for future people.

https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ExcelExtractor.javahad
the first secret I was missing (
formula.hasCachedResultString()):

This worked to print the value of a good formula:
        if(frec.hasCachedResultString()) {
          // Formula result is a string
          // This is stored in the next record
          outputNextStringRecord = true
          nextRow = frec.getRow()
          nextColumn = frec.getColumn()
        } else {
          thisStr = formatListener.formatNumberDateCell(frec)
        }

Where this didn't (maybe that original example I referenced could get
updated?):

        if(Double.isNaN( frec.getValue() )) {
          // Formula result is a string
          // This is stored in the next record
          outputNextStringRecord = true
          nextRow = frec.getRow()
          nextColumn = frec.getColumn()
        } else {
          thisStr = formatListener.formatNumberDateCell(frec)
        }

The only missing piece now is displaying the "#DIV/0" type of error
messages.

For my particular Error Formula, frec.hasCachedResultString() returns false

frec.getCachedErrorValue() returns: 7 (
http://poi.apache.org/apidocs/org/apache/poi/hssf/record/FormulaRecord.html#getCachedErrorValue()is
a very bad javadoc) - not sure if its related to
http://poi.apache.org/apidocs/org/apache/poi/ss/usermodel/FormulaError.html

frec.getParsedExpression() is an array of three:
expression: org.apache.poi.ss.formula.ptg.RefPtg [G2]
expression: org.apache.poi.ss.formula.ptg.IntPtg [0]
expression: class org.apache.poi.ss.formula.ptg.DividePtg


I tried printing frec.getCachedErrorValue() for each Formula entry to see
what a "normal' value might look like, only to find out I triggered an
exception on my first "good formula":

09:29:44.705 [main] ERROR fm.util.Resource - Caught exception using resource
java.lang.IllegalStateException: Not an error cached value - <string>
        at
org.apache.poi.hssf.record.FormulaRecord$SpecialCachedValue.getErrorValue(FormulaRecord.java:163)
~[poi-3.9.jar:3.9]
        at
org.apache.poi.hssf.record.FormulaRecord.getCachedErrorValue(FormulaRecord.java:252)
~[poi-3.9.jar:3.9]
        at
fm.util.XLSStreamProcessor.processRecord(XLSStreamReader.scala:153)
~[classes/:na]
        at
org.apache.poi.hssf.eventusermodel.MissingRecordAwareHSSFListener.processRecord(MissingRecordAwareHSSFListener.java:189)
~[poi-3.9.jar:3.9]
        at
org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord(FormatTrackingHSSFListener.java:85)
~[poi-3.9.jar:3.9]
        at
org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord(HSSFRequest.java:112)
~[poi-3.9.jar:3.9]

Based upon the methods available to:
http://poi.apache.org/apidocs/org/apache/poi/hssf/record/FormulaRecord.htmlI
wasn't quite sure howto check for what the formula type was, but based
upon:
http://grepcode.com/file_/repo1.maven.org/maven2/org.apache.poi/poi/3.9/org/apache/poi/hssf/record/FormulaRecord.java/?v=source
I was able to see how FormulaRecord was dealing with the various
fields
under the covers.  The result is I've so far pieced this together, which
seems to do the trick:

        if(frec.hasCachedResultString()) {
          // Formula result is a string
          // This is stored in the next record
          outputNextStringRecord = true
          nextRow = frec.getRow()
          nextColumn = frec.getColumn()
        } else {
          thisStr = frec.getCachedResultType match {
            case Cell.CELL_TYPE_STRING =>
formatListener.formatNumberDateCell(frec)
            case Cell.CELL_TYPE_BOOLEAN => frec.getCachedBooleanValue()
match {
              case false => "FALSE"
              case true => "TRUE"
            }
            case Cell.CELL_TYPE_ERROR =>
ErrorEval.getText(frec.getCachedErrorValue())
            case _ => logger.warn("Unknown Formula Result Type");
"<unknown>"
          }
        }

Now my test file looks exactly like my XSSF:

Vector(
  Vector(text_value1, formatted_value1, TRUE, 8/7/12, 5:01:01 AM, 8/7/12
5:01, 1, 1.10, 11.1whee, #DIV/0!, sharedstrings_value),
  Vector(, formatted_value2, FALSE, 8/8/12, 5:02:02 AM, 8/8/12 5:02, 2,
2.20, 22.2whee, #DIV/0!, sharedstrings_value),
  Vector(text_value3, formatted_value3, TRUE, 8/9/12, 5:03:03 AM, 8/9/12
5:03, 3, 3.30, 33.3whee, #DIV/0!, ),
  Vector(text_value4, formatted_value4, FALSE, 8/10/12, 5:04:04 AM, 8/10/12
5:04, 4, 4.40, 44.4whee, #DIV/0!, sharedstrings_value),
  Vector(text_value5, formatted_value5, TRUE, , 5:05:05 AM, 8/11/12 5:05,
5, 5.50, 55.5whee, #DIV/0!, sharedstrings_value),
  Vector(text_value6, formatted_value6, FALSE, 8/12/12, 5:06:06 AM, 8/12/12
5:06, 6, 6.60, 66.6whee, #DIV/0!, sharedstrings_value)
 )

Thanks again Nick!

Eric


On Thu, Jul 18, 2013 at 3:08 AM, Nick Burch <[email protected]> wrote:

> On Wed, 17 Jul 2013, Eric Peters wrote:
>
>> Based upon
>> http://svn.apache.org/repos/**asf/poi/trunk/src/examples/**
>> src/org/apache/poi/hssf/**eventusermodel/examples/**XLS2CSVmra.java<http://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java>
>>
>
> Two other good event based examples to view are:
> https://svn.apache.org/repos/**asf/poi/trunk/src/java/org/**
> apache/poi/hssf/extractor/**EventBasedExcelExtractor.java<https://svn.apache.org/repos/asf/poi/trunk/src/java/org/apache/poi/hssf/extractor/EventBasedExcelExtractor.java>
> https://svn.apache.org/repos/**asf/tika/trunk/tika-parsers/**
> src/main/java/org/apache/tika/**parser/microsoft/**ExcelExtractor.java<https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ExcelExtractor.java>
>
>
>  I was expecting to be able to get the result of a formula (One example I
>> have a CONCATENATE() statement, but I end up getting a value like "0.00" or
>> in another column I expect an error message like: #DIV/0! but instead get
>> "0"Thoughts on how I can get this to match up with the XSSF?
>>
>
> Can you try reading that cell via the usermodel, to check if a cached
> value was actually saved to the file?
>
> Nick
>
> ------------------------------**------------------------------**---------
> To unsubscribe, e-mail: 
> [email protected].**org<[email protected]>
> For additional commands, e-mail: [email protected]
>
>

Reply via email to