GitHub user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/33#discussion_r29155217
  
    --- Diff: docs/src/main/asciidoc/chapters/iterator_design.txt ---
    @@ -145,8 +146,16 @@ alter the internal state of the Iterator.
     
     These methods simply return the current Key-Value pair for this iterator. If `hasTop` returns true,
     both of these methods should return non-null objects. If `hasTop` returns false, it is undefined
    -what these methods should return. Multiple calls to these methods should not alter the state
    -of the Iterator like `hasTop`.
    +what these methods should return. Like `hasTop`, multiple calls to these methods should not alter
    +the state of the Iterator.
    +
    +When saving a Key or Value from a source iterator's `getTopKey` or `getTopValue` methods
    +for use after calling `next` on the source iterator (e.g., when cacheing keys or values
    +from the source iterator), it is important to copy the Key or Value into a new object
    +because the source iterator may reuse the Key or Value objects for performance reasons.
    --- End diff --
    
    I'm a little concerned about recommending that the returned Key or Value always be copied, as it will drastically increase the number of objects created in the tserver and probably tank performance. At the same time, I don't think I've ever done this myself (copied the Key/Value in an iterator), and I haven't run into the issue you're warning against (maybe it only happens farther "down" the stack, at the iterators reading off of disk?). How did you run into this issue? Can we try to make this more specific?
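
    To make the case in question concrete, here is a minimal, self-contained sketch of the aliasing problem the new paragraph describes. It deliberately does not use the Accumulo classes: `MutableKey` and `ReusingSource` are hypothetical stand-ins, assumed only to behave like a source iterator that hands back the same mutable object from `getTopKey` on every call. Whether any given Accumulo iterator actually reuses objects this way is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {

    // Hypothetical stand-in for a mutable key type; NOT Accumulo's Key class.
    static final class MutableKey {
        private String row;
        MutableKey(String row) { this.row = row; }
        MutableKey(MutableKey other) { this.row = other.row; } // copy constructor
        void set(String row) { this.row = row; }
        String getRow() { return row; }
    }

    // Hypothetical source that reuses one MutableKey instance across next() calls.
    static final class ReusingSource {
        private final String[] rows;
        private int pos = 0;
        private final MutableKey top = new MutableKey("");

        ReusingSource(String... rows) {
            this.rows = rows;
            if (rows.length > 0) {
                top.set(rows[0]);
            }
        }

        boolean hasTop() { return pos < rows.length; }

        // Always returns the same object; only its contents change.
        MutableKey getTopKey() { return top; }

        void next() {
            pos++;
            if (hasTop()) {
                top.set(rows[pos]);
            }
        }
    }

    // Saves the reference returned by getTopKey() and then calls next():
    // every cached entry aliases the single reused object.
    static List<String> cacheByReference(ReusingSource src) {
        List<MutableKey> cached = new ArrayList<>();
        while (src.hasTop()) {
            cached.add(src.getTopKey());
            src.next();
        }
        return rowsOf(cached);
    }

    // Copies the key before calling next(): each cached entry is independent.
    static List<String> cacheByCopy(ReusingSource src) {
        List<MutableKey> cached = new ArrayList<>();
        while (src.hasTop()) {
            cached.add(new MutableKey(src.getTopKey()));
            src.next();
        }
        return rowsOf(cached);
    }

    private static List<String> rowsOf(List<MutableKey> keys) {
        List<String> rows = new ArrayList<>();
        for (MutableKey k : keys) {
            rows.add(k.getRow());
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(cacheByReference(new ReusingSource("a", "b", "c"))); // [c, c, c]
        System.out.println(cacheByCopy(new ReusingSource("a", "b", "c")));      // [a, b, c]
    }
}
```

    The copy is only needed for objects that are kept past a call to `next` on the source; an iterator that reads the top Key/Value and passes it straight through would not need one, which keeps the extra allocations limited to the caching case. In Accumulo itself the equivalent copy would presumably go through the `Key` and `Value` copy constructors, but that should be checked against the API docs.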

