Repository: apex-site
Updated Branches:
  refs/heads/asf-site 3ce49df52 -> 029c6d054 (forced update)


from bce72c1f0df80169955ccd09ec2b21254f3c334e


Project: http://git-wip-us.apache.org/repos/asf/apex-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/apex-site/commit/029c6d05
Tree: http://git-wip-us.apache.org/repos/asf/apex-site/tree/029c6d05
Diff: http://git-wip-us.apache.org/repos/asf/apex-site/diff/029c6d05

Branch: refs/heads/asf-site
Commit: 029c6d0546fa6fc8b880cd6dc71276ce5174a749
Parents: 78858fe
Author: Thomas Weise <[email protected]>
Authored: Wed Sep 28 20:14:23 2016 -0700
Committer: Thomas Weise <[email protected]>
Committed: Wed Sep 28 20:14:23 2016 -0700

----------------------------------------------------------------------
 content/docs.html    |  8 ++---
 content/roadmap.html | 91 +++++++++++++++++++++++++++++++++--------------
 2 files changed, 69 insertions(+), 30 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/apex-site/blob/029c6d05/content/docs.html
----------------------------------------------------------------------
diff --git a/content/docs.html b/content/docs.html
index e405e95..8b20f94 100644
--- a/content/docs.html
+++ b/content/docs.html
@@ -85,6 +85,7 @@
 <ul>
 <li><a href="http://docs.datatorrent.com/beginner/" 
rel="nofollow">Beginner&#39;s Guide to Apache Apex</a> This document provides a 
comprehensive overview of Apex and is recommended for developers just starting 
out with Apex.</li>
 <li><a href="https://youtu.be/LwRWBudOjg4">Building Your First Apache Apex 
Application</a> This video has a hands-on demonstration of how to check out the 
source code repositories and build them, then run the maven archetype command 
to generate a new Apache Apex project, populate the project with Java source 
files for a new application, and finally, build and run the application -- all 
on a virtual machine running Linux with Apache Hadoop installed.</li>
+<li><a 
href="http://files.meetup.com/18978602/University%20program%20-%20Writing%20an%20Apache%20Apex%20application.pdf">Writing
 an Apache Apex application</a> A PDF document that frames a hands-on exercise 
of building a basic application; also includes a diagram illustrating the 
life-cycle of operators.</li>
 <li><a href="http://docs.datatorrent.com/tutorials/topnwords/" 
rel="nofollow">Top N Words Application Tutorial</a> This document provides a 
detailed step-by-step description of how to build and run a
 word counting application with Apache Apex starting with setting up your 
development environment, progressing to building, running and monitoring the 
application, visualizing the output and concluding with some advanced features 
such as assessing operator memory requirements, partitioning, and 
debugging.</li>
 <li><a href="http://docs.datatorrent.com/tutorials/salesdimensions/" 
rel="nofollow">Sales Dimensions Application Tutorial</a> Similar to the Top N 
Words application but covers
@@ -95,12 +96,11 @@ dimensional computations on a simulated sales data 
stream.</li>
 <h3 id="presentations">Presentations</h3>
 <ul>
 <li><a 
href="http://www.slideshare.net/ApacheApex/presentations">Slideshare/ApacheApex</a>
 Presentations from past meetup events and other talks covering Apache Apex 
introduction, feature deep dive, integration, customer use cases and more.</li>
-<li><a 
href="http://files.meetup.com/18978602/University%20program%20-%20Writing%20an%20Apache%20Apex%20application.pdf">Writing
 an Apache Apex application</a> A PDF document that frames a hands-on exercise 
of building a basic application; also includes a diagram illustrating the 
life-cycle of operators.</li>
 <li><a href="https://www.youtube.com/watch?v=98EW5NGM3u0">Next Gen Decision 
Making in &lt; 2ms</a> A video discussing CapitalOne&#39;s experience with 
Apache Apex and evaluation of competing technologies along with the <a 
href="http://www.slideshare.net/ApacheApex/capital-ones-next-generation-decision-in-less-than-2-ms">slides</a>.
 </li>
-<li><a href="https://www.youtube.com/watch?v=EdBiOnQn3Gw">Apache Nifi 
Integration with Apex</a> video and <a 
href="http://www.slideshare.net/ApacheApex/integrating-ni-fiandapex-by-bryan-bende">slide
 deck</a>.</li>
 <li><a href="https://www.brighttalk.com/webcast/13685/190407">Introducing 
Apache Apex</a> A webinar that begins with the historical context for the rise 
of Hadoop and Big Data, discusses why the promise of Hadoop remains largely 
unfulfilled and why moving beyond Map-Reduce model is essential and why 
operability is critically important. It continues with a discussion of the 
programming model, the various components of a running application on a YARN 
cluster and the large library of operators and connectors available with Apache 
Apex for reading data from and writing data to external systems. Concludes with 
a brief description of the visualization dashboards.</li>
-<li><a href="http://www.slideshare.net/PramodImmaneni/meetup-59089806">Stream 
Processing with Apache Apex</a> A broad overview slide deck covering topics 
such as windowing, static and dynamic partitioning, unification, fault 
tolerance, locality, monitoring, etc.</li>
-<li><a href="https://www.brighttalk.com/webcast/13685/194115">Fault Tolerance 
and Processing Semantics</a> A webinar and associated <a 
href="http://www.slideshare.net/ApacheApexOrganizer/webinar-fault-toleranceandprocessingsemantics">slides</a>
 covering core Apache Apex features including checkpointing and fault tolerance 
with fast, incremental recovery via a buffer server which uses a 
publish-subscribe model for inter-operator data transport. A variety of failure 
scenarios and processing guarantees are discussed.</li>
+<li><a href="https://www.youtube.com/watch?v=1DVMSRTNdIQ">Stream Processing 
with Apache Apex (video)</a> and <a 
href="http://www.slideshare.net/ApacheApex/hadoop-summit-sj-2016-next-gen-big-data-analytics-with-apache-apex">(slides)</a>
 A broad overview slide deck covering topics such as windowing, static and 
dynamic partitioning, unification, fault tolerance, locality, monitoring, 
etc.</li>
+<li><a href="https://www.youtube.com/watch?v=FCMY6Ii89Nw">Fault Tolerance and 
Processing Semantics (video)</a> and <a 
href="http://www.slideshare.net/ApacheApexOrganizer/webinar-fault-toleranceandprocessingsemantics">(slides)</a>
 A webinar covering core Apache Apex features including checkpointing and fault 
tolerance with fast, incremental recovery via a buffer server which uses a 
publish-subscribe model for inter-operator data transport. A variety of failure 
scenarios and processing guarantees are discussed.</li>
+<li><a href="https://www.youtube.com/watch?v=kJWMajIjGG0">Smart Partitioning 
with Apache Apex (video)</a> and <a 
href="http://www.slideshare.net/ApacheApex/smart-partitioning-with-apache-apex-webinar">(slides)</a>
 Webinar covering partitioning, including unique Apex features such as 
elasticity with dynamic resource allocation, parallel partitions for 
speculative execution and processing SLA etc.</li>
 <li><a 
href="http://www.slideshare.net/DevendraVyavahare/windowing-in-apex">Windows in 
Apache Apex</a> Discusses the various flavors of windows available in Apache 
Apex and how to configure and
 use them via callbacks. Contrasts windows with micro-batches.</li>
 <li><a 
href="http://www.slideshare.net/DevendraVyavahare/batch-processing-vs-real-time-data-processing-streaming">Real
 Time Stream Processing Versus Batch</a> Slide deck compares and contrasts the 
needs, use cases and challenges of stream processing with those of batch 
processing.</li>

http://git-wip-us.apache.org/repos/asf/apex-site/blob/029c6d05/content/roadmap.html
----------------------------------------------------------------------
diff --git a/content/roadmap.html b/content/roadmap.html
index 8371f65..5bee01a 100644
--- a/content/roadmap.html
+++ b/content/roadmap.html
@@ -275,6 +275,36 @@ http://mesos.apache.org/documentation/latest/fetcher/
 
         </td>
       </tr>
+      <tr>
+        <td>
+          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXCORE-498">APEXCORE-498</a>
+        </td>
+        <td title="Named Checkpoints 
+
+1. Ability to tag/name the checkpoints
+2. On demand - checkpoint the DAG
+3. Start the app from the named checkpoints
+
+All checkpoints that happened before the committed window is deleted but the 
named checkpoints won&#x27;t be deleted.">
+          Named Checkpoints - Checkpoint the DAG with a name/tag and start the 
app from that point
+        </td>
+        <td>
+    
+
+        </td>
+      </tr>
+      <tr>
+        <td>
+          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXCORE-536">APEXCORE-536</a>
+        </td>
+        <td title="Currently Apex depends on Hadoop 2.2 and runs on all later 
2.x version. Hadoop 2.2 is quite old, most Apex users have more recent Hadoop 
installs. Latest distro releases are based on 2.6 and 2.7. There are several 
important features that were added in Hadoop since 2.2 that Apex should be able 
to leverage.">
+          Upgrade Hadoop dependency
+        </td>
+        <td>
+    
+
+        </td>
+      </tr>
     </tbody>
   </table>
 
@@ -370,40 +400,49 @@ This jira item can contain tasks for providing similar 
support in Apex">
       </tr>
       <tr>
         <td>
-          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2026">APEXMALHAR-2026</a>
+          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2089">APEXMALHAR-2089</a>
         </td>
-        <td title="Add libraryies for spooling datastructures to a key value 
store. There are several customer use cases which require spooled data 
structures.
-
-1 - Some operators like AbstractFileInputOperator have ever growing state. 
This is an issue because eventually the state of the operator will grow larger 
than the memory allocated to the operator, which will cause the operator to 
perpetually fail. However if the operator&#x27;s datastructures are spooled 
then the operator will never run out of memory.
-
-2 - Some users have requested for the ability to maintain a map as well as a 
list of keys over which to iterate. Most key value stores don&#x27;t provide 
this functionality. However, with spooled datastructures this functionality can 
be provided by maintaining a spooled map and an iterable set of keys.
-
-3 - Some users have requested building graph databases within APEX. This would 
require implementing a spooled graph data structure.
-
-4 - Another use case for spooled data structures is database operators. 
Database operators need to write data to a data base, but sometimes the 
database is down. In this case most of the database operators repeatedly fail 
until the database comes back up. In order to avoid constant failures the 
database operator need to writes data to a queue when the data base is down, 
then when the database is up the operator need to take data from the queue and 
write it to the database. In the case of a database failure this queue will 
grow larger than the total amount of memory available to the operator, so the 
queue should be spooled in order to prevent the operator from failing.
-
-5 - Any operator which needs to maintain a large data structure in memory 
currently needs to have that data serialized and written out to HDFS with every 
checkpoint. This is costly when the data structure is large. If the data 
structure is spooled, then only the changes to the data structure are written 
out to HDFS instead of the entire data structure.
-
-6 - Also building an Apex Native database for aggregations requires indices. 
These indices need to take the form of spooled data structures.
+        <td title="Apex should provide a runner for Beam. This ticket is a 
proxy for BEAM-261 as the implementation should probably live in the Beam 
repository.
+">
+          Apache Beam support
+        </td>
+        <td>
+    
+
+        </td>
+      </tr>
+      <tr>
+        <td>
+          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2130">APEXMALHAR-2130</a>
+        </td>
+        <td title="This feature is used for supporting windowing.
 
-7 - In the future any operator which needs to maintain a data structure larger 
than the memory available to it will need to spool the data structure.">
-          Spill-able Datastructures
+The storage needs to have the following features:
+1. Spillable key value storage (integrate with APEXMALHAR-2026)
+2. Upon checkpoint, it saves a snapshot for the entire data set with the 
checkpointing window id.  This should be done incrementally (ManagedState) to 
avoid wasting space with unchanged data
+3. When recovering, it takes the recovery window id and restores to that 
snapshot
+4. When a window is committed, all windows with a lower ID should be purged 
from the store.
+5. It should implement the WindowedStorage and WindowedKeyedStorage 
interfaces, and because of 2 and 3, we may want to add methods to the 
WindowedStorage interface so that the implementation of WindowedOperator can 
notify the storage of checkpointing, recovering and committing of a window.
+">
+          Scalable windowed storage
         </td>
         <td>
     
 
-            <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR/fixforversion/12335815">3.5.0</a>&nbsp;
+            <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR/fixforversion/12338174">3.6.0</a>&nbsp;
 
 
         </td>
       </tr>
       <tr>
         <td>
-          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2089">APEXMALHAR-2089</a>
+          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2260">APEXMALHAR-2260</a>
         </td>
-        <td title="Apex should provide a runner for Beam. This ticket is a 
proxy for BEAM-261 as the implementation should probably live in the Beam 
repository.
+        <td title="Support execution of Python code in an operator. 
+
+https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E
 ">
-          Apache Beam support
+          Python execution for operator logic 
         </td>
         <td>
     
@@ -412,17 +451,17 @@ This jira item can contain tasks for providing similar 
support in Apex">
       </tr>
       <tr>
         <td>
-          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2142">APEXMALHAR-2142</a>
+          <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR-2261">APEXMALHAR-2261</a>
         </td>
-        <td title="">
-          High-level API window support
+        <td title="A high level API similar to the Apex Java stream API that 
lets users specify an application in Python.
+
+https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E
+">
+          Python binding for high level API
         </td>
         <td>
     
 
-            <a target="_blank" 
href="https://issues.apache.org/jira/browse/APEXMALHAR/fixforversion/12335815">3.5.0</a>&nbsp;
-
-
         </td>
       </tr>
     </tbody>
