On Tue, 29 Jan 2008, Daniel Noll wrote:
Is your formula-related eventusermodel code in a format suitable for
contributing back? It'd be handy to be able to put something in svn
that would make dealing with the formula stuff much simpler. I'd be
happy to spend a bit of time tidying it up / writing tests for it, if
you could contribute it?
If I ever figure out how to handle it, I probably would contribute it
back because it would mean changes to how shared formulas work. At the
moment as you say, it does require a Workbook. At the moment I don't
have a Workbook to work with. Maybe I can store off the first however
many records and then create the Workbook from those -- I haven't tried
so I don't know what happens if you feed in a list of records without
the ones which make up the rest of the file.
I think you might be able to get away with that. If not, shout and we can
tweak things.
If it gets you close, then we should probably come up with something like
a WorkbookRecordSource interface, which model.Workbook implements. Tweak
the formula code to use those instead, then it's easier for you to pass in
the records that matter. Let us know if that looks like being worth doing.
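
To make the WorkbookRecordSource idea a bit more concrete, here is a rough
sketch of the shape it might take. Everything in it is hypothetical: the
getRecordsBySid method, the SimpleRecord stand-in and the ListRecordSource
class are invented for illustration and are not existing POI API. The idea
is only that formula code would ask the interface for the few
workbook-level records it needs, so a caller without a full model.Workbook
could supply records it saved from the start of the event stream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed record: a record type id (sid) and a label.
final class SimpleRecord {
    final int sid;
    final String label;

    SimpleRecord(int sid, String label) {
        this.sid = sid;
        this.label = label;
    }
}

// Hypothetical interface: the workbook-level lookups that formula code would use.
interface WorkbookRecordSource {
    List<SimpleRecord> getRecordsBySid(int sid);
}

// One possible implementation, backed by records captured from an event stream,
// so no full Workbook is needed.
final class ListRecordSource implements WorkbookRecordSource {
    private final List<SimpleRecord> records = new ArrayList<>();

    void add(SimpleRecord record) {
        records.add(record);
    }

    @Override
    public List<SimpleRecord> getRecordsBySid(int sid) {
        List<SimpleRecord> matches = new ArrayList<>();
        for (SimpleRecord r : records) {
            if (r.sid == sid) {
                matches.add(r);
            }
        }
        return matches;
    }
}
```

In this sketch model.Workbook would be a second implementation of the same
interface, so the existing usermodel path would not need to change.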
Memory is indeed cheap, but unless you have the luxury of a 64-bit JVM,
there is an upper limit of somewhere around 1.4GB, sometimes less.
This would normally be nearly 2GB but Windows allocates some DLLs in
weird positions on some systems, and Sun insist on allocating a
contiguous block of memory for the heap which sometimes causes a huge
unusable memory hole above that.
Have you tried tweaking your Windows box to use a 1GB/3GB split, instead
of the usual 2GB/2GB one? Might help out in the absence of a 64-bit JVM /
a licence for a non-hobbled 32-bit version of Windows.
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
In actual fact for us, something closer to RecordInputStream would be
even better, where we can just say nextRecord() and have it return a
properly constructed Record. Then we have control over the loop, which
is ideal when you need to return a Reader.
Does the newly added org.apache.poi.hssf.eventusermodel.HSSFRecordStream
look roughly like what you need? I've converted the existing
eventusermodel code to use it under the hood, so it ought to behave
pretty much the same, except with pull instead of push.
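
For anyone following along, the pull-versus-push difference can be sketched
roughly as below. This is not the actual HSSFRecordStream code: PullStream,
RecordListener and PushAdapter are stand-ins made up for the example, and
records are plain strings to keep it self-contained. It just shows a
nextRecord() style reader where the caller owns the loop, with the old
push-style callback layered on top of it.

```java
import java.util.Iterator;
import java.util.List;

// Stand-in for a push-style listener (records are plain strings here).
interface RecordListener {
    void processRecord(String record);
}

// Stand-in for a pull-style stream: the caller asks for one record at a time.
final class PullStream {
    private final Iterator<String> source;

    PullStream(List<String> records) {
        this.source = records.iterator();
    }

    // Returns the next record, or null once the stream is exhausted.
    String nextRecord() {
        return source.hasNext() ? source.next() : null;
    }
}

// Push-style processing implemented on top of the pull stream.
final class PushAdapter {
    static void processAll(PullStream in, RecordListener listener) {
        String record;
        while ((record = in.nextRecord()) != null) {
            listener.processRecord(record);
        }
    }
}
```

With the pull form, a caller can stop after any record and resume later,
which is what you need when the records feed something like a Reader.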
As far as the records keeping a copy, could they not instead keep an
offset and a reference to the original buffer? Then if someone calls a
setter, it would need to create a new buffer, set the offset to 0 and
copy the data before doing the actual set.
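
A minimal sketch of that offset-plus-reference idea, assuming a record is
just a window onto a shared byte array. CowRecord and its methods are
hypothetical names for illustration, not POI classes. Reads go straight to
the shared buffer; the first setter call copies the window into a private
array and resets the offset to 0 before writing.

```java
import java.util.Arrays;

// Hypothetical copy-on-write record: a window [offset, offset + length) over a buffer.
final class CowRecord {
    private byte[] buf;       // shared buffer until the first write
    private int offset;
    private final int length;
    private boolean owned;    // true once this record has its own private copy

    CowRecord(byte[] shared, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > shared.length) {
            throw new IllegalArgumentException("window outside buffer");
        }
        this.buf = shared;
        this.offset = offset;
        this.length = length;
    }

    byte get(int index) {
        checkIndex(index);
        return buf[offset + index];
    }

    void set(int index, byte value) {
        checkIndex(index);
        if (!owned) {
            // First write: copy only this record's bytes, then point at the copy.
            buf = Arrays.copyOfRange(buf, offset, offset + length);
            offset = 0;
            owned = true;
        }
        buf[offset + index] = value;
    }

    boolean ownsBuffer() {
        return owned;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

Read-only users never pay for a copy, and the original buffer is never
modified by a setter.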
In many cases, they only keep the parsed data in memory, and not the
source bytes. That's certainly one of the advantages of the (not so) new
RecordInputStream method.
And as far as POIFS keeping a copy, yes... POIFS is full of issues like
that. For instance, even if all you need to read is the CLSID, you still
have to read the entire file. If POIFSFileSystem could construct from a
ByteBuffer and not take unnecessary copies, it could speed things up
dramatically for that situation... but ultimately that would need to
propagate to the whole framework for it to really show benefits.
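
As a small illustration of what a ByteBuffer buys you here, the sketch
below hands out a view of part of a buffer without copying the bytes. It
uses only standard java.nio calls (duplicate, limit, position, slice); the
BufferViews class and its view method are made-up names, and nothing here
is existing POIFS code.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: returns zero-copy views into a larger buffer.
final class BufferViews {
    // Returns a buffer covering [offset, offset + length) of src.
    // The view shares storage with src; no bytes are copied.
    static ByteBuffer view(ByteBuffer src, int offset, int length) {
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer state is left alone.
        ByteBuffer dup = src.duplicate();
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();
    }
}
```

A caller that only wants a header field could take a view over just those
bytes, and with a memory-mapped buffer the rest of the file would not need
to be read into the heap at all.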
Do feel free to submit patches for that sort of thing :)
I haven't played with ByteBuffer before, so do feel free to suggest how it
might help + point at code examples / patches that show it.
Nick