[EMAIL PROTECTED] wrote:
Mike Matrigali <[EMAIL PROTECTED]> writes:
It would be fine to use an unchecked and/or an ASSERT-based check for
readFieldLengthAndSetStreamPosition. The "store" module owns this
access and is not counting on limit checks to catch anything here.
Another question:
All observed calls to setLimit() in this load (a single-tuple select) come
from the same method: StoredPage.readRecordFromArray(...).
(setLimit is also called from readRecordFromStream(), but that does not
seem to happen with this type of load.)
And the argument to setLimit() is always the local variable fieldDataLength,
which is the return value from
StoredFieldHeader.readFieldLengthAndSetStreamPosition().
So if readFieldLengthAndSetStreamPosition() can update the position without
checking, presumably the return value from this method can be trusted
as well? Is it then necessary to check this value again in setLimit(),
or could we have used an unchecked version of setLimit() here?
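To make the question concrete, here is a minimal sketch of the two variants.
This is not Derby's actual ArrayInputStream; the class and method names are
hypothetical, chosen only to illustrate the checked/unchecked distinction:

    import java.io.IOException;

    class LimitedArrayStream {
        private final byte[] pageData;
        private int position; // current read offset into pageData
        private int end;      // one past the last readable byte

        LimitedArrayStream(byte[] pageData) {
            this.pageData = pageData;
            this.end = pageData.length;
        }

        // Checked variant: rejects a limit that would run past the buffer.
        void setLimit(int length) throws IOException {
            if (length < 0 || position + length > pageData.length) {
                throw new IOException("limit out of range: " + length);
            }
            end = position + length;
        }

        // Unchecked variant: trusts the caller-supplied length completely.
        void setLimitNoCheck(int length) {
            end = position + length;
        }

        // The limit is what stops a read from running past the field.
        int read() throws IOException {
            if (position >= end) {
                throw new IOException("read past limit");
            }
            return pageData[position++] & 0xff;
        }
    }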
I'm worried by this approach of removing the checks on the limit or the
position; it's much like saying we don't need bounds checking on arrays
because I know my code is correct.
The current code provides some protection against a software bug, a
corrupted page, or a hacked page. Removing those checks may lead to
hard-to-detect bugs where a position and/or limit is calculated incorrectly
and subsequently leads to corrupted data or random exceptions being thrown.
My feeling is that the integrity of the store and the code is better
served by keeping these checks.
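As a hypothetical illustration of the failure mode, again using the
LimitedArrayStream sketch above rather than Derby's real classes: a bogus
length read from a corrupted field header fails fast on the checked path,
but is silently accepted on the unchecked one.

    import java.io.IOException;

    class CorruptedLengthDemo {
        public static void main(String[] args) {
            byte[] page = new byte[4096];
            LimitedArrayStream in = new LimitedArrayStream(page);
            int fieldDataLength = 1 << 20; // bogus length from a damaged header
            try {
                in.setLimit(fieldDataLength);    // checked: fails immediately
            } catch (IOException expected) {
                System.out.println("caught: " + expected.getMessage());
            }
            in.setLimitNoCheck(fieldDataLength); // unchecked: accepted; the
            // damage only surfaces later as corrupted data or a random
            // exception somewhere far from the actual bug
        }
    }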
I also think we need more performance numbers to justify such a change;
a single set of runs on a single VM does not justify it. I will run
numbers on Linux with a number of VMs when I get the chance.
Also, in these cases it is often better to optimize at a higher
level rather than at the lowest level (especially when the low-level
optimization removes such checks). In this case, see whether the number of
calls to setLimit() or setPosition() could be reduced, rather than
micro-optimizing these methods by changing their core functionality.
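For instance, a sketch under the same hypothetical classes as above: the
checked call could be made once per record instead of once per field, with
the per-field positioning then free to take the unchecked path.

    import java.io.IOException;

    class RecordReader {
        static void readRecord(LimitedArrayStream in, int[] fieldLengths)
                throws IOException {
            int recordLength = 0;
            for (int len : fieldLengths) {
                recordLength += len;
            }
            // One checked call covering the whole record...
            in.setLimit(recordLength);
            // ...so the per-field limits can take the unchecked path: each
            // field is already known to lie inside the validated extent.
            for (int len : fieldLengths) {
                in.setLimitNoCheck(len);
                // ... read this field's bytes here ...
            }
        }
    }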
As an example, take the setLimit() call around the readExternalFromArray
method. Maybe this responsibility could be pushed into the data type
itself, and for some built-in types we could trust their read mechanism to
read the correct amount of data. E.g. reading an INTEGER will always read
four bytes, so there is no need to set a limit around it. The limit exists
to support types that do not know their length on disk (e.g. some character
types, some binary types, user-defined types) and to support arbitrary user
types, where the engine cannot trust or require that the de-serialization
will read the complete stream and only its data.
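A sketch of what pushing that responsibility into the type might look like,
with hypothetical method names (only DataInputStream.readInt() is a real
API, and LimitedArrayStream is the illustrative class from above):

    import java.io.DataInputStream;
    import java.io.IOException;

    class TypeAwareFieldReader {
        // A fixed-width built-in type: an INTEGER always occupies four
        // bytes on disk, so readInt() cannot read past the field and no
        // surrounding limit is needed.
        static int readIntField(DataInputStream in) throws IOException {
            return in.readInt();
        }

        // A type of unknown on-disk length (some character and binary
        // types, user-defined types): the engine cannot trust the
        // de-serialization to stop at the field boundary, so the read
        // is fenced with a limit first.
        static byte[] readVarField(LimitedArrayStream in, int fieldDataLength)
                throws IOException {
            in.setLimit(fieldDataLength);
            // ... invoke the type's own read method here ...
            return new byte[fieldDataLength]; // placeholder for the bytes read
        }
    }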
Dan.