Github user joshelser commented on a diff in the pull request:
https://github.com/apache/accumulo/pull/33#discussion_r29161718
--- Diff: docs/src/main/asciidoc/chapters/iterator_design.txt ---
@@ -145,8 +146,16 @@ alter the internal state of the Iterator.
These methods simply return the current Key-Value pair for this iterator.
If `hasTop` returns true,
both of these methods should return non-null objects. If `hasTop` returns
false, it is undefined
-what these methods should return. Multiple calls to these methods should
not alter the state
-of the Iterator like `hasTop`.
+what these methods should return. Like `hasTop`, multiple calls to these
methods should not alter
+the state of the Iterator.
+
+When saving a Key or Value from a source iterator's `getTopKey` or
`getTopValue` methods
+for use after calling `next` on the source iterator (e.g., when caching
keys or values
+from the source iterator), it is important to copy the Key or Value into a
new object
+because the source iterator may reuse the Key or Value objects for
performance reasons.
--- End diff --
Great, thanks Dylan. My memory isn't what it should be :).
> I think some iterators do this frequently
I took a gander at the "user" iterators, and I was a little surprised to
see so many of them doing a copy in some form or another. Perhaps I'm being
overly sensitive about performance. Maybe my concern could be reworded: are
there cases we could outline in which a user doesn't need to copy the Key?
"An iterator which doesn't modify the Key from its source doesn't need to
copy it"? This would catch cases like value transformations. What do you
think? Is that a safe clarification to make?
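To make the question concrete, here is a minimal sketch of the hazard the new paragraph describes. It deliberately does not use Accumulo's classes: `MutableKey` and `ReusingSource` are hypothetical stand-ins I made up for illustration, and the source is assumed to reuse a single key object across `next` calls. In this toy model, the copy only matters when a reference is held past `next`; reading the key before advancing needs no copy.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyReuseSketch {

    /** Hypothetical mutable key; a stand-in, not Accumulo's Key class. */
    static final class MutableKey {
        private String row;
        MutableKey(String row) { this.row = row; }
        MutableKey(MutableKey other) { this.row = other.row; } // copy constructor
        void set(String row) { this.row = row; }
        String row() { return row; }
    }

    /** Hypothetical source that reuses one key object across next() calls. */
    static final class ReusingSource {
        private final String[] rows;
        private final MutableKey top = new MutableKey("");
        private int pos = 0;

        ReusingSource(String... rows) {
            this.rows = rows;
            if (hasTop()) top.set(rows[0]);
        }
        boolean hasTop() { return pos < rows.length; }
        MutableKey getTopKey() { return top; }
        void next() {
            pos++;
            if (hasTop()) top.set(rows[pos]); // overwrites the object handed out earlier
        }
    }

    /** Holds references across next(): every saved entry aliases the same object. */
    static List<String> saveWithoutCopy(ReusingSource src) {
        List<MutableKey> saved = new ArrayList<>();
        while (src.hasTop()) {
            saved.add(src.getTopKey());
            src.next();
        }
        return rowsOf(saved);
    }

    /** Copies each key before advancing, so later reuse cannot change it. */
    static List<String> saveWithCopy(ReusingSource src) {
        List<MutableKey> saved = new ArrayList<>();
        while (src.hasTop()) {
            saved.add(new MutableKey(src.getTopKey()));
            src.next();
        }
        return rowsOf(saved);
    }

    /** Uses the key only before calling next(); nothing is retained, so no copy. */
    static List<String> readBeforeNext(ReusingSource src) {
        List<String> out = new ArrayList<>();
        while (src.hasTop()) {
            out.add(src.getTopKey().row().toUpperCase());
            src.next();
        }
        return out;
    }

    private static List<String> rowsOf(List<MutableKey> keys) {
        List<String> rows = new ArrayList<>();
        for (MutableKey k : keys) rows.add(k.row());
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(saveWithoutCopy(new ReusingSource("a", "b", "c"))); // [c, c, c]
        System.out.println(saveWithCopy(new ReusingSource("a", "b", "c")));    // [a, b, c]
        System.out.println(readBeforeNext(new ReusingSource("a", "b", "c")));  // [A, B, C]
    }
}
```

If the sketch is a fair model, the wording might be better framed around whether the iterator holds the Key past `next` rather than whether it modifies it, but I'd defer to whoever knows which sources actually reuse objects.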