Anthony,

Must agree with you about HSSF; I think it is - to quote a comic hero of mine - a 'cracking' API. The ideal, I think, would be to split it into two streams; one concerned with supporting a product that allows us to create and read BIFF8 files and another that deals with formulae as it alone is a very challenging aspect and something I associate more with the Excel the application than the files it produces. The first could be used to parse Excel files, creating a series of objects that together represent the file. This could be passed to the latter - the calculation engine - that would support performing operations with the sheet and any formulae it contained.

Actually we are combining OOXML and BIFF - HSSF / XSSF - 
http://poi.apache.org/spreadsheet/index.html

I do think that there are probably a huge number of spreadsheets out there used to collect information AND there is a huge need to pull that information out in an organized manner.

There is good reason to be able to evaluate on the server, if only to assure that the spreadsheet is consistent.

I wonder if a simple Servlet would be a good example. UPLOAD an xls / xlsx. Recalc - for any number of cycles. Display named ranges. Download resulting file. This could be a handy sample tool that someone could build on for their particular application.

Another reason to be able to evaluate the formulas and functions used in a spreadsheet could be to look for and possibly disable UDFs to follow a clear security policy even if the spreadsheet is actually used in OpenOffice, OpenOffice.org, or Microsoft Office.

Understand about OpenOffice - and I have to admit that it can be slow; certainly connecting to OpenOffice or bootstrapping as they call it takes a couple of seconds. Will keep my eyes open regarding other solutions. Finally, have you thought about working with Nick and the others on developing POI's formula support? Maybe your company would support POI or allow you to work on and then submit patches, that way we all win.

Before Yegor and I started using POI we created simple "CVS" style XLS and made PPT using PICT file imports driven by a Java COM bridge. So, we went and improved HSLF to do what we needed.

The current set of active committers to the project represent really the second and third "generation". We are trying to keep this a crackling API!

All contributions are welcome. Examples and documentation are always appreciated. Submit a patch to bugzilla.

Best Regards,
Dave




--- On Wed, 1/7/09, Darío Vasconcelos <[email protected]> wrote:
From: Darío Vasconcelos <[email protected]>
Subject: Re: Conceptual question, using HSSF as a recalculation engine
To: "POI Users List" <[email protected]>, [email protected]
Date: Wednesday, January 7, 2009, 7:19 AM

Hi,

actually I've also been looking into using UNO, seems promising, the only problem being the platform: I need AIX and there is no official support for
it. But in Windows it works, all I need is to do some other tests.

But I would prefer to use POI since it seems to me it is a "cleaner"
solution: with OpenOffice, the number of components is daunting, and the documentation is at its best sparse. Using HSSF sounds like a better idea but as you mention, maybe it is an API more meant to creating and reading
files. I'm also looking at COM but it would obviously force us to use
Windows at some point at the backend.

Please do send me any URLs that you find because I've looked at a couple of
commercial products with not very convincing results.

Regards,


On Wed, Jan 7, 2009 at 1:40 AM, Anthony Andrews
<[email protected]>wrote:

Hello David,

May I throw a spanner into the works.

To my mind, HSSF/HSSF is an API that is used to create or read files. I would argue that if you want a calculation engine then there are other
solutions and the one I would look at - well two really - is either
controling Excel itself via OLE/COM or installing OpenOffice and using UNO (Unoversal Network Objects) to accomplish a similar task. The advantages
UNO
has over OLE is that the OpenOffice API supports full access via Java (and
other languages) and OpenOffice itself is platform independent. Using
either
of these approaches would allow you to interact with a live, running
application and to take advatage of the calculation engine built into
those
pieces of software.

Alternatively, there are pieces of software that I have seen kicking
around
that claim to emulate Excel's support for formulae. If I come across
one
again this morning, I will send you the URL.

--- On Tue, 1/6/09, Darío Vasconcelos <[email protected]>
wrote:
From: Darío Vasconcelos <[email protected]>
Subject: Conceptual question, using HSSF as a recalculation engine
To: [email protected]
Date: Tuesday, January 6, 2009, 12:02 PM

Hi,

My problem is, I have a very complex XLS file (some 33,000 formulas,
unordered and scattered all over the place) which I was initially trying
to
fully implement in Java. When the complexity of the document struck us, we
wondered if POI could do the job. So the idea is to:
* Read some input data from a text file
* Open the file via HSSF
* Feed all the inputs to the cells
* Tell HSSF to recalculate all cells
* Retrieve the value of some of the results.
The XLS file is then dismissed because we're saving these results to a
database.

So this is my question: is this a good idea? Is POI supposed to do such
things? Because so far, my tests have shown that the recalculation
isn't
fully working: if I change the value of A1, and A2 and A3 have formulas
that
depend on its value, only A2 is updated but A3 not. Of course this is a
trivial example: in reality, some cells have more than 20 levels of
dependency, sometimes the formulas refer to cells that are
"before"
them,
other times refer to cells that are after.

The process I'm using is as follows:
* Open the workbook
* Create a HSSFFormulaEvaluator
* Get the value of a cell (with getSheetAt, getRow, getCell)
* Set its value to some double
* Traverse all rows and all cells (using sheet.lastRowNum() and
row.getLastCellNum() and do an evaluateFormulaCell on each one
* Get the value of a result cell

And the value isn't even the same as Excel shows before changing the
value
of the cell. I'm worried because for some process we're
implementing a
MonteCarlo method, thus we're going to to this whole process 500
times,
retrieving the results each cycle, and plotting these points in graphs.

The actual code is as follows:

                       FileInputStream fis = new


FileInputStream("/Users/dariovasconcelos/Documents/desa/QA76/ Bancomext/src_paramext/poi/riesgo4.xls");
HSSFWorkbook wb = new HSSFWorkbook(fis); // or new
                       HSSFFormulaEvaluator evaluator = new
HSSFFormulaEvaluator(wb);

                       BufferedWriter out = new BufferedWriter(new




FileWriter("/Users/dariovasconcelos/Documents/desa/QA76/Bancomext/ src_paramext/poi/sinCambios.txt"));
                       HSSFSheet hoja = wb.getSheetAt(0);
                       HSSFRow renglon = hoja.getRow(686);
                       HSSFCell celda = renglon.getCell(8);
                       celda.setCellValue((double)0.483);
                       evaluator.notifySetFormula(celda);
                       evaluator.clearAllCachedResultValues();

                       n = recalculateAll(wb, evaluator, out, n);

And recalculateAll is coded like this:

          private int recalculateAll(HSSFWorkbook wb,
HSSFFormulaEvaluator evaluator, BufferedWriter out, int
n) {
           try {
               // Recalculo todos los renglones
               //for(int sheetNum = 0; sheetNum <
wb.getNumberOfSheets();sheetNum++) {
               for(int sheetNum = 0; sheetNum < 1; sheetNum++) {
                   //int sheetNum =0;
                   HSSFSheet sheet = wb.getSheetAt(sheetNum);
                   for (int i = 0; i < sheet.getLastRowNum(); i++) {
                       HSSFRow r = sheet.getRow(i);
                       if (r == null)
                           continue;
                       for (int j=0; j < r.getLastCellNum(); j++) {
                           HSSFCell c = r.getCell(j);
                           if (c == null)
                               continue;
                           int tipo = c.getCellType();
                           int row = c.getRowIndex();
                           int col = c.getColumnIndex();
                           switch (tipo) {
                               case HSSFCell.CELL_TYPE_NUMERIC:
miPrint(out, n++, sheetNum, row, col ,
                                           "" +
c.getNumericCellValue());
                                   break;
                                   /*
                               case HSSFCell.CELL_TYPE_STRING:
miPrint(out, n++, sheetNum, row, col,
                                           c.getStringCellValue());
                                   break;
                                   */
                               case HSSFCell.CELL_TYPE_FORMULA:
miPrint(out, n++, sheetNum, row, col,

c.getCellFormula().replace("$",
                                           ""));
                                   int tipoRes =
evaluator.evaluateFormulaCell(c);
                                   if (tipoRes ==
HSSFCell.CELL_TYPE_NUMERIC)
miPrint(out, n++, sheetNum, row,
col,
                                               "" +
c.getNumericCellValue());
                                   else if (tipoRes ==
HSSFCell.CELL_TYPE_STRING)
miPrint(out, n++, sheetNum, row,
col,
c.getStringCellValue());
                                   else if (tipoRes ==
HSSFCell.CELL_TYPE_ERROR)
miPrint(out, n++, sheetNum, row,
col,
                                               "" +
c.getErrorCellValue());
                                   /*
                                   HSSFCell celdaTmp =
evaluator.evaluateInCell(c);
                                   if (celdaTmp.getCellType()
== HSSFCell.CELL_TYPE_NUMERIC) miPrint(out, n++, sheetNum, row,
col,
                                               "" +
c.getNumericCellValue());
                                               */
                                   break;
                           }
                       }
                   }
               }
           } catch (Exception e) {
               e.printStackTrace();
           }

           return n;
          }


"miPrint" is simply a routine to print the value of the cell to
a
Buffered
Writer.

Regards,

Dario
--
Here's a rule I recommend: Never practice two vices at once.
- Tallulah Bankhead








--
Here's a rule I recommend: Never practice two vices at once.
- Tallulah Bankhead





---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to