This is an automated email from the ASF dual-hosted git repository.
ibella pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git
The following commit(s) were added to refs/heads/asf-site by this push:
new cb67d19 Jekyll build from master:915b78b
cb67d19 is described below
commit cb67d19452608db6a47f2ad25f5bcbf06fe14af8
Author: Ivan Bella <[email protected]>
AuthorDate: Tue Jun 11 17:06:55 2019 -0400
Jekyll build from master:915b78b
fixes #183: Added a chapter on yielding and fixed pseudocode (#185)
* fixes #183: Added a chapter on yielding and fixed pseudocode
---
docs/2.x/development/iterators.html | 46 +++++++++++++++++++++++++++++--------
feed.xml | 4 ++--
redirects.json | 2 +-
search_data.json | 2 +-
4 files changed, 41 insertions(+), 13 deletions(-)
diff --git a/docs/2.x/development/iterators.html
b/docs/2.x/development/iterators.html
index af17359..81d4bbe 100644
--- a/docs/2.x/development/iterators.html
+++ b/docs/2.x/development/iterators.html
@@ -424,7 +424,7 @@
that allow users to implement custom retrieval or computational purpose within
Accumulo TabletServers. The name rightly
brings forward similarities to the Java Iterator interface; however, Accumulo
Iterators are more complex than Java
Iterators. Notably, in addition to the expected methods to retrieve the
current element and advance to the next element
-in the iteration, Accumulo Iterators must also support the ability to “move”
(<code class="highlighter-rouge">seek</code>) to an specified point in the
+in the iteration, Accumulo Iterators must also support the ability to “move”
(<code class="highlighter-rouge">seek</code>) to a specified point in the
iteration (the Accumulo table). Accumulo Iterators are designed to be
concatenated together, similar to applying a
series of transformations to a list of elements. Accumulo Iterators can
duplicate their underlying source to create
multiple “pointers” over the same underlying data (which is extremely powerful
since each stream is sorted) or they can
@@ -434,7 +434,7 @@ are not designed to act as triggers nor are they designed
to operate outside of
<p>Understanding how TabletServers invoke the methods on a <a
href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.0.0-alpha-2/org/apache/accumulo/core/iterators/SortedKeyValueIterator.html">SortedKeyValueIterator</a>
can be obtuse as the actual code is
buried within the implementation of the TabletServer; however, it is generally
unnecessary to have a strong
-understanding of this as the interface provides clear definitions about what
each action each method should take. This
+understanding of this as the interface provides clear definitions about what
action each method should take. This
chapter aims to provide a more detailed description of how Iterators are
invoked, some best practices and some common
pitfalls.</p>
@@ -587,6 +587,24 @@ early programming assignments which implement their own
tree data structures. <c
copy on its sources (the children), copies itself, attaches the copies of the
children, and
then returns itself.</p>
+<h2 id="yielding-interface">Yielding Interface</h2>
+
+<p>If you have implemented an iterator whose next or seek call can take a very
long time,
+starving other scans within the same thread pool, try
implementing the
+optional YieldingKeyValueIterator interface, which extends
SortedKeyValueIterator.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="k">default</span> <span
class="kt">void</span> <span class="nf">enableYielding</span><span
class="o">(</span><span class="n">YieldCallback</span> <span
class="n">callback</span><span class="o">)</span> <span class="o">{</span>
<span class="o">}</span>
+</code></pre></div></div>
+
+<h3 id="enableyielding">enableYielding</h3>
+
+<p>The implementation of this method should simply cache the supplied callback
as a member of
+the iterator. Then one can call the yield(Key key) method on the callback
within a next or
+seek call when the iterator is to yield control. The supplied key will be
used as the
+start key in a follow-on seek call’s range allowing the iterator to continue
where it left
+off. Note that when an iterator yields, the hasTop() method must return false.
Also note that
+the enableYielding method will not be called in isolation mode.</p>
+
<h2 id="tabletserver-invocation-of-iterators">TabletServer invocation of
Iterators</h2>
<p>The following code is a general outline for how TabletServers invoke
Iterators.</p>
@@ -603,21 +621,34 @@ then returns itself.</p>
<span class="n">source</span> <span class="o">=</span> <span
class="n">iter</span><span class="o">;</span>
<span class="o">}</span>
- <span class="c1">// read a batch of data to return to client</span>
+ <span class="c1">// read a batch of data to return to client from</span>
<span class="c1">// the last iterator, the "top"</span>
<span class="n">SortedKeyValueIterator</span> <span
class="n">topIter</span> <span class="o">=</span> <span
class="n">source</span><span class="o">;</span>
- <span class="n">topIter</span><span class="o">.</span><span
class="na">seek</span><span class="o">(</span><span
class="n">getRangeFromUser</span><span class="o">(),</span> <span
class="o">...)</span>
+
+ <span class="n">YieldCallback</span> <span class="n">cb</span> <span
class="o">=</span> <span class="k">new</span> <span
class="n">YieldCallback</span><span class="o">();</span>
+ <span class="n">topIter</span><span class="o">.</span><span
class="na">enableYielding</span><span class="o">(</span><span
class="n">cb</span><span class="o">)</span>
+
+ <span class="n">topIter</span><span class="o">.</span><span
class="na">seek</span><span class="o">(</span><span class="n">range</span><span
class="o">,</span> <span class="o">...)</span>
<span class="k">while</span> <span class="o">(</span><span
class="n">topIter</span><span class="o">.</span><span
class="na">hasTop</span><span class="o">()</span> <span
class="o">&amp;&amp;</span> <span class="o">!</span><span
class="n">overSizeLimit</span><span class="o">(</span><span
class="n">batch</span><span class="o">))</span> <span class="o">{</span>
<span class="n">key</span> <span class="o">=</span> <span
class="n">topIter</span><span class="o">.</span><span
class="na">getTopKey</span><span class="o">()</span>
<span class="n">val</span> <span class="o">=</span> <span
class="n">topIter</span><span class="o">.</span><span
class="na">getTopValue</span><span class="o">()</span>
<span class="n">batch</span><span class="o">.</span><span
class="na">add</span><span class="o">(</span><span class="k">new</span> <span
class="n">KeyValue</span><span class="o">(</span><span
class="n">key</span><span class="o">,</span> <span class="n">val</span><span
class="o">));</span>
+ <span class="c1">// remember the last key returned</span>
+ <span class="n">setLastKeyReturned</span><span class="o">(</span><span
class="n">key</span><span class="o">);</span>
<span class="k">if</span> <span class="o">(</span><span
class="n">systemDataSourcesChanged</span><span class="o">())</span> <span
class="o">{</span>
<span class="c1">// code does not show isolation case, which
will</span>
<span class="c1">// keep using same data sources until a row
boundary is hit</span>
<span class="n">range</span> <span class="o">=</span> <span
class="k">new</span> <span class="n">Range</span><span class="o">(</span><span
class="n">key</span><span class="o">,</span> <span class="kc">false</span><span
class="o">,</span> <span class="n">range</span><span class="o">.</span><span
class="na">endKey</span><span class="o">(),</span> <span
class="n">range</span><span class="o">.</span><span
class="na">endKeyInclusive</span><span class="o">());</span>
<span class="k">break</span><span class="o">;</span>
<span class="o">}</span>
+ <span class="n">topIter</span><span class="o">.</span><span
class="na">next</span><span class="o">()</span>
+ <span class="o">}</span>
+
+ <span class="k">if</span> <span class="o">(</span><span
class="n">cb</span><span class="o">.</span><span
class="na">hasYielded</span><span class="o">())</span> <span class="o">{</span>
+ <span class="c1">// remember the yield key as the last key
returned</span>
+ <span class="n">setLastKeyReturned</span><span class="o">(</span><span
class="n">cb</span><span class="o">.</span><span class="na">getKey</span><span
class="o">());</span>
+ <span class="k">break</span><span class="o">;</span>
<span class="o">}</span>
<span class="o">}</span>
<span class="c1">//return batch of key values to client</span>
@@ -628,15 +659,12 @@ then returns itself.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="c1">// Given the above</span>
<span class="n">List</span><span class="o">&lt;</span><span
class="n">KeyValue</span><span class="o">&gt;</span> <span
class="n">batch</span> <span class="o">=</span> <span
class="n">getNextBatch</span><span class="o">();</span>
-<span class="c1">// Store off lastKeyReturned for this client</span>
-<span class="n">lastKeyReturned</span> <span class="o">=</span> <span
class="n">batch</span><span class="o">.</span><span class="na">get</span><span
class="o">(</span><span class="n">batch</span><span class="o">.</span><span
class="na">size</span><span class="o">()</span> <span class="o">-</span> <span
class="mi">1</span><span class="o">).</span><span class="na">getKey</span><span
class="o">();</span>
-
<span class="c1">// thread goes away (client stops asking for the next
batch).</span>
<span class="c1">// Eventually client comes back</span>
<span class="c1">// Setup as before...</span>
-<span class="n">Range</span> <span class="n">userRange</span> <span
class="o">=</span> <span class="n">getRangeFromUser</span><span
class="o">();</span>
-<span class="n">Range</span> <span class="n">actualRange</span> <span
class="o">=</span> <span class="k">new</span> <span class="n">Range</span><span
class="o">(</span><span class="n">lastKeyReturned</span><span
class="o">,</span> <span class="kc">false</span><span class="o">,</span> <span
class="n">userRange</span><span class="o">.</span><span
class="na">getEndKey</span><span class="o">(),</span> <span
class="n">userRange</span><span class="o">.</span><span
class="na">isEndKeyInclusive< [...]
+<span class="n">Range</span> <span class="n">userRange</span> <span
class="o">=</span> <span class="n">getRangeFromClient</span><span
class="o">();</span>
+<span class="n">Range</span> <span class="n">actualRange</span> <span
class="o">=</span> <span class="k">new</span> <span class="n">Range</span><span
class="o">(</span><span class="n">getLastKeyReturned</span><span
class="o">(),</span> <span class="kc">false</span><span class="o">,</span>
<span class="n">userRange</span><span class="o">.</span><span
class="na">getEndKey</span><span class="o">(),</span> <span
class="n">userRange</span><span class="o">.</span><span
class="na">isEndKeyInclu [...]
<span class="c1">// Use the actualRange, not the user provided one</span>
<span class="n">topIter</span><span class="o">.</span><span
class="na">seek</span><span class="o">(</span><span
class="n">actualRange</span><span class="o">);</span>
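
For illustration only, the yielding handshake described in the enableYielding section can be sketched in plain Java without Accumulo on the classpath. Everything here is a hypothetical stand-in: YieldCallback below is a simplified local class modeled only on the calls named in the text (yield, hasYielded, getKey), and FilteringIterator, scanAll, the inspection budget, and the string keys are assumptions made for the sketch, not Accumulo API. The driver resumes with an exclusive start key, mirroring the Range(getLastKeyReturned(), false, ...) call in the outline above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class YieldingSketch {

  /** Hypothetical stand-in for a yield callback; it only records the yield key. */
  static final class YieldCallback<K> {
    private K key;
    void yield(K key) { this.key = key; }
    boolean hasYielded() { return key != null; }
    K getKey() { return key; }
  }

  /** Hypothetical filtering iterator over sorted keys; keeps keys ending in "*". */
  static final class FilteringIterator {
    private final List<String> sorted;
    private final int budget;               // max rejected keys inspected per seek/next call
    private YieldCallback<String> callback; // cached by enableYielding, as the text suggests
    private int pos;
    private String top;

    FilteringIterator(List<String> sorted, int budget) {
      this.sorted = sorted;
      this.budget = budget;
    }

    void enableYielding(YieldCallback<String> cb) { this.callback = cb; }

    /** Positions on the first kept key strictly after startExclusive (null means the beginning). */
    void seek(String startExclusive) {
      pos = 0;
      while (startExclusive != null && pos < sorted.size()
          && sorted.get(pos).compareTo(startExclusive) <= 0) {
        pos++;
      }
      findTop();
    }

    void next() { pos++; findTop(); }

    boolean hasTop() { return top != null; }

    String getTopKey() { return top; }

    private void findTop() {
      top = null;
      int inspected = 0;
      while (pos < sorted.size()) {
        String k = sorted.get(pos);
        if (k.endsWith("*")) { top = k; return; }
        inspected++;
        if (callback != null && inspected >= budget) {
          callback.yield(k); // ask the caller to re-seek just after k
          return;            // top stays null, so hasTop() is false after yielding
        }
        pos++;
      }
    }
  }

  /** Driver loop: rebuild the iterator and re-seek after every yield until the scan completes. */
  static List<String> scanAll(List<String> sorted, int budget) {
    List<String> results = new ArrayList<>();
    String resumeAfter = null;
    while (true) {
      FilteringIterator iter = new FilteringIterator(sorted, budget);
      YieldCallback<String> cb = new YieldCallback<>();
      iter.enableYielding(cb);
      iter.seek(resumeAfter);
      while (iter.hasTop()) {
        resumeAfter = iter.getTopKey(); // remember the last key returned
        results.add(resumeAfter);
        iter.next();
      }
      if (!cb.hasYielded()) {
        return results;
      }
      resumeAfter = cb.getKey();        // remember the yield key as the last key returned
    }
  }

  public static void main(String[] args) {
    List<String> data = Arrays.asList("k1", "k2", "k3", "k4*", "k5", "k6", "k7*", "k8");
    System.out.println(scanAll(data, 2)); // prints [k4*, k7*]
  }
}
```

With a budget of 2, the sample data makes the sketch yield twice (at k2 and k6) before the scan completes, and each follow-on seek picks up just past the yield key so no kept key is skipped or returned twice.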
diff --git a/feed.xml b/feed.xml
index 684e00f..33c67e7 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
</description>
<link>https://accumulo.apache.org/</link>
<atom:link href="https://accumulo.apache.org/feed.xml" rel="self"
type="application/rss+xml"/>
- <pubDate>Tue, 11 Jun 2019 16:34:36 -0400</pubDate>
- <lastBuildDate>Tue, 11 Jun 2019 16:34:36 -0400</lastBuildDate>
+ <pubDate>Tue, 11 Jun 2019 17:06:47 -0400</pubDate>
+ <lastBuildDate>Tue, 11 Jun 2019 17:06:47 -0400</lastBuildDate>
<generator>Jekyll v3.8.5</generator>
diff --git a/redirects.json b/redirects.json
index 9c19363..2d3a54c 100644
--- a/redirects.json
+++ b/redirects.json
@@ -1 +1 @@
-{"/release_notes/1.5.1.html":"https://accumulo.apache.org/release/accumulo-1.5.1/","/release_notes/1.6.0.html":"https://accumulo.apache.org/release/accumulo-1.6.0/","/release_notes/1.6.1.html":"https://accumulo.apache.org/release/accumulo-1.6.1/","/release_notes/1.6.2.html":"https://accumulo.apache.org/release/accumulo-1.6.2/","/release_notes/1.7.0.html":"https://accumulo.apache.org/release/accumulo-1.7.0/","/release_notes/1.5.3.html":"https://accumulo.apache.org/release/accumulo-1.5.3/"
[...]
\ No newline at end of file
+{"/release_notes/1.5.1.html":"https://accumulo.apache.org/release/accumulo-1.5.1/","/release_notes/1.6.0.html":"https://accumulo.apache.org/release/accumulo-1.6.0/","/release_notes/1.6.1.html":"https://accumulo.apache.org/release/accumulo-1.6.1/","/release_notes/1.6.2.html":"https://accumulo.apache.org/release/accumulo-1.6.2/","/release_notes/1.7.0.html":"https://accumulo.apache.org/release/accumulo-1.7.0/","/release_notes/1.5.3.html":"https://accumulo.apache.org/release/accumulo-1.5.3/"
[...]
\ No newline at end of file
diff --git a/search_data.json b/search_data.json
index aee0418..10370fe 100644
--- a/search_data.json
+++ b/search_data.json
@@ -100,7 +100,7 @@
"docs-2-x-development-iterators": {
"title": "Iterators",
- "content" : "Accumulo SortedKeyValueIterators, commonly referred
to as Iterators for short, are server-side programming constructsthat allow
users to implement custom retrieval or computational purpose within Accumulo
TabletServers. The name rightlybrings forward similarities to the Java
Iterator interface; however, Accumulo Iterators are more complex than
JavaIterators. Notably, in addition to the expected methods to retrieve the
current element and advance to the next elementin [...]
+ "content" : "Accumulo SortedKeyValueIterators, commonly referred
to as Iterators for short, are server-side programming constructsthat allow
users to implement custom retrieval or computational purpose within Accumulo
TabletServers. The name rightlybrings forward similarities to the Java
Iterator interface; however, Accumulo Iterators are more complex than
JavaIterators. Notably, in addition to the expected methods to retrieve the
current element and advance to the next elementin [...]
"url": " /docs/2.x/development/iterators",
"categories": "development"
},