[
https://issues.apache.org/jira/browse/LUCENE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687581#comment-13687581
]
Robert Muir commented on LUCENE-5066:
-------------------------------------
I opened LUCENE-5067 for a way to add tests for this (and other existing ones)
for all of our directories.
We should also open another issue to improve codecs so that they don't OOM
trying to allocate absurdly huge data structures when there is a bug, but
instead trip a check or an assert.
We discussed some of these ideas at Berlin Buzzwords.
Instead of:
{code}
int size = readVInt();
// gives OOM if 'size' is corrupt: no chance for read past EOF
byte[] something = new byte[size];
readBytes(something, 0, size);
{code}
we can do:
{code}
int size = readVInt();
// for something like terms with a bounded limit
assert size < MAX_SIZE;
// for something unbounded, at least it must not exceed the file's length
assert getFilePointer() + size <= length();
byte[] something = new byte[size];
readBytes(something, 0, size);
{code}
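To make the idea concrete outside of Lucene, here is a minimal self-contained sketch that uses java.nio.ByteBuffer as a stand-in for the input. The class name BoundedReads, the method readLengthPrefixed and the maxSize parameter are made up for illustration and are not Lucene APIs. It throws an IOException instead of using assert, so the check still trips when assertions are disabled.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Lucene: validates a length prefix before allocating.
final class BoundedReads {

    // Decodes a variable-length int: 7 bits per byte, low-order bits first,
    // high bit set on every byte except the last.
    static int readVInt(ByteBuffer in) throws IOException {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            if (!in.hasRemaining()) {
                throw new IOException("truncated vInt");
            }
            byte b = in.get();
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IOException("vInt is longer than 5 bytes");
    }

    // Reads a vInt length followed by that many bytes, checking the length first.
    static byte[] readLengthPrefixed(ByteBuffer in, int maxSize) throws IOException {
        int size = readVInt(in);
        // Bounded case: reject anything outside the limit the format allows.
        if (size < 0 || size > maxSize) {
            throw new IOException("corrupt length " + size + " (max " + maxSize + ")");
        }
        // Unbounded case: at least it must not exceed what is left in the input.
        if (size > in.remaining()) {
            throw new IOException("length " + size + " exceeds remaining " + in.remaining());
        }
        byte[] data = new byte[size]; // safe: size has been validated
        in.get(data);
        return data;
    }
}
```

With this shape, a corrupt length fails fast with a descriptive exception rather than an OutOfMemoryError far away from the real cause.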
> TestFieldsReader fails in 4.x with OOM
> --------------------------------------
>
> Key: LUCENE-5066
> URL: https://issues.apache.org/jira/browse/LUCENE-5066
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-5066.patch
>
>
> Its FaultyIndexInput is broken (it doesn't implement seek/clone correctly).
> This causes it to read bogus data and try to allocate an enormous byte[] for
> a term.
> The bug was previously hidden:
> FaultyDirectory doesn't override openSlice, so CFS must not be used at flush
> if you want to trigger the bug.
> FaultyIndexInput's clone is broken: it uses "new" but doesn't seek the clone
> to the right place. This causes a disaster with BufferedIndexInput (which it
> extends), because BufferedIndexInput (not just the delegate) must "know" its
> position, since it has seek-within-block etc. code...
> It seems that with this (very simple) test, only the 3.x codec triggers it,
> because its term dict relies upon clones being seeked to the right place.
> I'm not sure what other codecs rely upon this, but imo we should also add a
> low-level test for directories that does something like this to ensure it's
> really tested:
> {code}
> dir.createOutput(x);
> input = dir.openInput(x);
> input.seek(somewhere);
> clone = input.clone();
> assertEquals(somewhere, clone.getFilePointer());
> {code}
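For illustration, the clone contract described in the issue can be sketched with a tiny in-memory input. PositionedInput and brokenClone are hypothetical names made up for this example, not Lucene classes. The broken variant mirrors the "uses new but doesn't seek" mistake, while clone() carries the position over.

```java
// Hypothetical in-memory input used only to illustrate the clone contract.
final class PositionedInput {
    private final byte[] data;
    private int pos; // this instance's own notion of its position

    PositionedInput(byte[] data) {
        this.data = data;
    }

    void seek(int newPos) {
        if (newPos < 0 || newPos > data.length) {
            throw new IllegalArgumentException("seek out of bounds: " + newPos);
        }
        pos = newPos;
    }

    int getFilePointer() {
        return pos;
    }

    byte readByte() {
        return data[pos++];
    }

    // Broken: creates a fresh instance but leaves it at position 0.
    PositionedInput brokenClone() {
        return new PositionedInput(data);
    }

    // Correct: the clone starts at the same position as the original
    // and then moves independently of it.
    @Override
    public PositionedInput clone() {
        PositionedInput copy = new PositionedInput(data);
        copy.seek(pos);
        return copy;
    }
}
```

A low-level test along the lines of the snippet in the description would seek the original, clone it, and compare the two file pointers; the broken variant fails that comparison for any non-zero position.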