[12/46] incubator-apex-core git commit: SPOI-6684 #resolve Modifying referneces for applications and platform to Apache Apex

thw Sun, 28 Feb 2016 23:04:22 -0800

SPOI-6684 #resolve Modifying referneces for applications and platform to Apache 
Apex



Project: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/commit/84ffc251
Tree: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/tree/84ffc251
Diff: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/diff/84ffc251

Branch: refs/heads/APEXCORE-293
Commit: 84ffc25105cb7a674fea0eb52e59beda26fde84f
Parents: 9d50fb6
Author: sashadt <[email protected]>
Authored: Tue Nov 3 18:55:32 2015 -0800
Committer: Thomas Weise <[email protected]>
Committed: Sun Feb 28 22:46:34 2016 -0800

----------------------------------------------------------------------
 apex.md                   |  6 ++--
 apex_malhar.md            | 20 +++++++-------
 application_packages.md   | 62 ++++++++++++++++--------------------------
 configuration_packages.md | 46 ++++++++++++-------------------
 4 files changed, 56 insertions(+), 78 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/84ffc251/apex.md
----------------------------------------------------------------------
diff --git a/apex.md b/apex.md
index f06bdb1..b6813a5 100644
--- a/apex.md
+++ b/apex.md
@@ -1,7 +1,7 @@
 Apache Apex
 
================================================================================
 
-Apache Apex (incubating) is the industry’s only Apache 2.0 licensed open 
source enterprise grade unified stream and batch processing engine.  Project 
Apex includes key features requested by open source developer community that 
are not available in current open source technologies.
+Apache Apex (incubating) is the industry’s only open source, 
enterprise-grade unified stream and batch processing engine.  Apache Apex 
includes key features requested by open source developer community that are not 
available in current open source technologies.
 
 * Event processing guarantees
 * In-memory performance & scalability
@@ -9,4 +9,6 @@ Apache Apex (incubating) is the industry’s only Apache 2.0 
licensed open sourc
 * Native rolling and tumbling window support
 * Hadoop-native YARN & HDFS implementation
 
-For additional information visit [Apache 
Apex](http://apex.incubator.apache.org/).
\ No newline at end of file
+For additional information visit [Apache 
Apex](http://apex.incubator.apache.org/).
+
+![](images/apex_logo.png)

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/84ffc251/apex_malhar.md
----------------------------------------------------------------------
diff --git a/apex_malhar.md b/apex_malhar.md
index d193f12..ef2e371 100644
--- a/apex_malhar.md
+++ b/apex_malhar.md
@@ -1,17 +1,17 @@
 Apache Apex Malhar
 
================================================================================
 
-Apache Apex-Malhar is an open source operator and codec library that can be 
used with the DataTorrent platform to build real-time streaming applications.  
As part of enabling enterprises extract value quickly, Malhar operators help 
get data in, analyze it in real-time and get data out of Hadoop in real-time 
with no paradigm limitations.  In addition to the operators, the library 
contains a number of demos applications, demonstrating operator features and 
capabilities.
+Apache Apex Malhar is an open source operator and codec library that can be 
used with the Apache Apex platform to build real-time streaming applications.  
As part of enabling enterprises extract value quickly, Malhar operators help 
get data in, analyze it in real-time and get data out of Hadoop in real-time 
with no paradigm limitations.  In addition to the operators, the library 
contains a number of demos applications, demonstrating operator features and 
capabilities.
 
 ![MalharDiagram](images/MalharOperatorOverview.png)
 
 # Capabilities common across Malhar operators
 
-For most streaming platforms, connectors are afterthoughts and often end up 
being simple “bolt-ons” to the platform. As a result they often cause 
performance issues or data loss when put through failure scenarios and 
scalability requirements. Malhar operators do not face these issues as they 
were designed to be integral parts of DataTorrent RTS. Hence, they have 
following core streaming runtime capabilities
+For most streaming platforms, connectors are afterthoughts and often end up 
being simple “bolt-ons” to the platform. As a result they often cause 
performance issues or data loss when put through failure scenarios and 
scalability requirements. Malhar operators do not face these issues as they 
were designed to be integral parts of apex*.md RTS. Hence, they have following 
core streaming runtime capabilities
 
-1.  **Fault tolerance** – DataTorrent Malhar operators where applicable have 
fault tolerance built in. They use the checkpoint capability provided by the 
framework to ensure that there is no data loss under ANY failure scenario.
-2.  **Processing guarantees** – DataTorrent Malhar operators where 
applicable provide out of the box support for ALL three processing guarantees 
– exactly once, at-least once & at-most once WITHOUT requiring the user to 
write any additional code. &nbsp;Some operators like MQTT operator deal with 
source systems that cant track processed data and hence need the operators to 
keep track of the data. Malhar has support for a generic operator that uses 
alternate storage like HDFS to facilitate this. Finally for databases that 
support transactions or support any sort of atomic batch operations Malhar 
operators can do exactly once down to the tuple level.
-3.  **Dynamic updates** – Based on changing business conditions you often 
have to tweak several parameters used by the operators in your streaming 
application without incurring any application downtime. You can also change 
properties of a Malhar operator at runtime without having to bring down the 
application. 
+1.  **Fault tolerance** – Apache Apex Malhar operators where applicable have 
fault tolerance built in. They use the checkpoint capability provided by the 
framework to ensure that there is no data loss under ANY failure scenario.
+2.  **Processing guarantees** – Malhar operators where applicable provide 
out of the box support for ALL three processing guarantees – exactly once, 
at-least once & at-most once WITHOUT requiring the user to write any additional 
code.  Some operators like MQTT operator deal with source systems that cant 
track processed data and hence need the operators to keep track of the data. 
Malhar has support for a generic operator that uses alternate storage like HDFS 
to facilitate this. Finally for databases that support transactions or support 
any sort of atomic batch operations Malhar operators can do exactly once down 
to the tuple level.
+3.  **Dynamic updates** – Based on changing business conditions you often 
have to tweak several parameters used by the operators in your streaming 
application without incurring any application downtime. You can also change 
properties of a Malhar operator at runtime without having to bring down the 
application.
 4.  **Ease of extensibility** – Malhar operators are based on templates that 
are easy to extend.
 5.  **Partitioning support** – In streaming applications the input data 
stream often needs to be partitioned based on the contents of the stream. Also 
for operators that ingest data from external systems partitioning needs to be 
done based on the capabilities of the external system. E.g. With the Kafka or 
Flume operator, the operator can automatically scale up or down based on the 
changes in the number of Kafka partitions or Flume channels
 
@@ -24,7 +24,7 @@ Below is a summary of the various sub categories of input and 
output operators.
 *   **File Systems** – Most streaming analytics use cases we have seen 
require the data to be stored in HDFS or perhaps S3 if the application is 
running in AWS. Also, customers often need to re-run their streaming analytical 
applications against historical data or consume data from upstream processes 
that are perhaps writing to some NFS share. Hence, it’s not just enough to be 
able to save data to various file systems. You also have to be able to read 
data from them. RTS supports input & output operators for HDFS, S3, NFS & Local 
Files
 *   **Flume** – NOTE: Flume operator is not yet part of Malhar
 
-Many customers have existing Flume deployments that are being used to 
aggregate log data from variety of sources. However Flume does not allow 
analytics on the log data on the fly. The Flume input/output operator enables 
RTS to consume data from flume and analyze it in real-time before being 
persisted. 
+Many customers have existing Flume deployments that are being used to 
aggregate log data from variety of sources. However Flume does not allow 
analytics on the log data on the fly. The Flume input/output operator enables 
RTS to consume data from flume and analyze it in real-time before being 
persisted.
 
 *   **Relational databases** – Most stream processing use cases require some 
reference data lookups to enrich, tag or filter streaming data. There is also a 
need to save results of the streaming analytical computation to a database so 
an operational dashboard can see them. RTS supports a JDBC operator so you can 
read/write data from any JDBC compliant RDBMS like Oracle, MySQL etc.
 *   **NoSQL databases** –NoSQL key-value pair databases like Cassandra & 
HBase are becoming a common part of streaming analytics application 
architectures to lookup reference data or store results. Malhar has operators 
for HBase, Cassandra, Accumulo (common with govt. & healthcare companies) 
MongoDB & CouchDB.
@@ -37,11 +37,11 @@ Many customers have existing Flume deployments that are 
being used to aggregate
 
 ## Compute
 
-One of the most important promises of a streaming analytics platform like 
DataTorrent RTS is the ability to do analytics in real-time. However delivering 
on the promise becomes really difficult when the platform does not provide out 
of the box operators to support variety of common compute functions as the user 
then has to worry about making these scalable, fault tolerant etc. Malhar takes 
this responsibility away from the application developer by providing a huge 
variety of out of the box computational operators. The application developer 
can thus focus on the analysis. 
+One of the most important promises of a streaming analytics platform like 
Apache Apex is the ability to do analytics in real-time. However delivering on 
the promise becomes really difficult when the platform does not provide out of 
the box operators to support variety of common compute functions as the user 
then has to worry about making these scalable, fault tolerant etc. Malhar takes 
this responsibility away from the application developer by providing a huge 
variety of out of the box computational operators. The application developer 
can thus focus on the analysis.
 
 Below is just a snapshot of the compute operators available in Malhar
 
-*   Statistics & Math - Provide various mathematical and statistical 
computations over application defined time windows. 
+*   Statistics & Math - Provide various mathematical and statistical 
computations over application defined time windows.
 *   Filtering & pattern matching
 *   Machine learning & Algorithms
 *   Real-time model scoring is a very common use case for stream processing 
platforms. &nbsp;Malhar allows users to invoke their R models from streaming 
applications
@@ -54,7 +54,7 @@ Many streaming use cases are legacy implementations that need 
to be ported over.
 
 ## Parsers
 
-There are many industry vertical specific data formats that a streaming 
application developer might need to parse. Often there are existing parsers 
available for these that can be directly plugged into a DataTorrent streaming 
application. E.g. In the Telco space, a Java based CDR parser can be directly 
plugged into DataTorrent RTS. To further simplify development experience, 
Malhar also provides some operators for parsing common formats like XML (DOM & 
SAX), JSON (flat map converter), Apache log files & Syslog.
+There are many industry vertical specific data formats that a streaming 
application developer might need to parse. Often there are existing parsers 
available for these that can be directly plugged into an Apache Apex 
application. For example in the Telco space, a Java based CDR parser can be 
directly plugged into Apache Apex operator. To further simplify development 
experience, Malhar also provides some operators for parsing common formats like 
XML (DOM & SAX), JSON (flat map converter), Apache log files, syslog, etc.
 
 ## Stream manipulation
 
@@ -62,4 +62,4 @@ Streaming data aka “stream” is raw data that inevitably 
needs processing to
 
 ## Social Media
 
-DataTorrent supports an operator to connect to the popular Twitter stream fire 
hose
+Malhar includes an operator to connect to the popular Twitter stream fire hose.

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/84ffc251/application_packages.md
----------------------------------------------------------------------
diff --git a/application_packages.md b/application_packages.md
index 1d6f6e8..3e58038 100644
--- a/application_packages.md
+++ b/application_packages.md
@@ -24,15 +24,11 @@ an Apex application project using Maven by running the 
following
 command.  Replace "com.example", "mydtapp" and "1.0-SNAPSHOT" with the
 appropriate values (make sure this is all on one line):
 
-```
- $ mvn archetype:generate
- -DarchetypeRepository=https://www.datatorrent.com/maven/content/reposito
- ries/releases
- -DarchetypeGroupId=com.datatorrent
- -DarchetypeArtifactId=apex-app-archetype -DarchetypeVersion=3.2.0
- -DgroupId=com.example -Dpackage=com.example.mydtapp -DartifactId=mydtapp
- -Dversion=1.0-SNAPSHOT
-```
+    $ mvn archetype:generate \
+     -DarchetypeGroupId=org.apache.apex \
+     -DarchetypeArtifactId=apex-app-archetype 
-DarchetypeVersion=3.2.0-incubating \
+     -DgroupId=com.example -Dpackage=com.example.mydtapp -DartifactId=mydtapp \
+     -Dversion=1.0-SNAPSHOT
 
 This creates a Maven project named "mydtapp". Open it with your favorite
 IDE (e.g. NetBeans, Eclipse, IntelliJ IDEA). In the project, there is a
@@ -44,9 +40,8 @@ runs the unit test for the DAG is in
 src/test/java/com/example/mydtapp/ApplicationTest.java. Try it out by
 running the following command:
 
-```
- $cd mydtapp; mvn package
-```
+    $cd mydtapp; mvn package
+
 This builds the App Package runs the unit test of the DAG.  You should
 be getting test output similar to this:
 
@@ -95,19 +90,16 @@ Then fill the Group ID, Artifact ID, Version and Repository 
entries as shown bel
 
 ![](images/AppPackage/ApplicationPackages.html-image02.png)
 
-Group ID: com.datatorrent
+Group ID: org.apache.apex
 Artifact ID: apex-app-archetype
-Version: 3.0.0 (or any later version)
-
-Repository:
-[https://www.datatorrent.com/maven/content/repositories/releases](https://www.datatorrent.com/maven/content/repositories/releases)
+Version: 3.2.0-incubating (or any later version)
 
 Press Next and fill out the rest of the required information. For
 example:
 
 ![](images/AppPackage/ApplicationPackages.html-image01.png)
 
-Click Finish, and now you have created your own DataTorrent App Package
+Click Finish, and now you have created your own Apache Apex App Package
 project, with a default unit test.  You can run the unit test, make code
 changes or make dependency changes within your IDE.  The procedure for
 other IDEs, like Eclipse or IntelliJ, is similar.
@@ -115,7 +107,7 @@ other IDEs, like Eclipse or IntelliJ, is similar.
 # Writing Your Own App Package
 
 
-Please refer to the [Application Developer 
Guide](https://www.datatorrent.com/docs/guides/ApplicationDeveloperGuide.html) 
on the basics on how to write a DataTorrent application.  In your AppPackage 
project, you can add custom operators (refer to [Operator Developer 
Guide](https://www.datatorrent.com/docs/guides/OperatorDeveloperGuide.html)), 
project dependencies, default and required configuration properties, pre-set 
configurations and other metadata.
+Please refer to the [Creating Apps](create.md) on the basics on how to write 
an Apache Apex application.  In your AppPackage project, you can add custom 
operators (refer to [Operator Development 
Guide](https://www.datatorrent.com/docs/guides/OperatorDeveloperGuide.html)), 
project dependencies, default and required configuration properties, pre-set 
configurations and other metadata.
 
 ## Adding (and removing) project dependencies
 
@@ -126,9 +118,9 @@ the default pom.xml:
   <dependencies>
     <!-- add your dependencies here -->
     <dependency>
-      <groupId>com.datatorrent</groupId>
+      <groupId>org.apache.apex</groupId>
       <artifactId>malhar-library</artifactId>
-      <version>${datatorrent.version}</version>
+      <version>${apex.version}</version>
       <!--
            If you know your application do not need the transitive 
dependencies that are pulled in by malhar-library,
            Uncomment the following to reduce the size of your app package.
@@ -143,9 +135,9 @@ the default pom.xml:
       -->
     </dependency>
     <dependency>
-      <groupId>com.datatorrent</groupId>
-      <artifactId>dt-engine</artifactId>
-      <version>${datatorrent.version}</version>
+      <groupId>org.apache.apex</groupId>
+      <artifactId>apex-engine</artifactId>
+      <version>${apex.version}</version>
       <scope>provided</scope>
     </dependency>
     <dependency>
@@ -176,7 +168,7 @@ warnings similar to the following:
 ```
 
  [WARNING] 'dependencies.dependency.exclusions.exclusion.groupId' for
- com.datatorrent:malhar-library:jar with value '*' does not match a
+ org.apache.apex:malhar-library:jar with value '*' does not match a
  valid id pattern.
 
  [WARNING]
@@ -289,7 +281,7 @@ setHost. The method is called using JAVA reflection and the 
property
 value is passed as an argument. In the above example the method setHost
 will be called on the “redis” operator with “127.0.0.1” as the 
argument.
 
-## Port attributes 
+## Port attributes
 Port attributes are used to specify the platform behavior for input and
 output ports. They can be specified using the parameter 
```dt.operator.<operator-name>.inputport.<port-name>.attr.<attribute>```
 for input port and 
```dt.operator.<operator-name>.outputport.<port-name>.attr.<attribute>```
@@ -464,14 +456,14 @@ section that looks like:
 
 ```
 <properties>
-  <datatorrent.version>3.0.0</datatorrent.version>
-  
<datatorrent.apppackage.classpath\>lib*.jar</datatorrent.apppackage.classpath>
+  <apex.version>3.2.0-incubating</apex.version>
+  <apex.apppackage.classpath\>lib*.jar</apex.apppackage.classpath>
 </properties>
 ```
-datatorrent.version is the DataTorrent RTS version that are to be used
+apex.version is the Apache Apex version that are to be used
 with this Application Package.
 
-datatorrent.apppackage.classpath is the classpath that is used when
+apex.apppackage.classpath is the classpath that is used when
 launching the application in the Application Package.  The default is
 lib/\*.jar, where lib is where all the dependency jars are kept within
 the Application Package.  One reason to change this field is when your
@@ -499,7 +491,7 @@ that file is only used for the unit test.
 # Zip Structure of Application Package
 
 
-DataTorrent Application Package files are zip files.  You can examine the 
content of any Application Package by using unzip -t on your Linux command line.
+Apache Apex Application Package files are zip files.  You can examine the 
content of any Application Package by using unzip -t on your Linux command line.
 
 There are four top level directories in an Application Package:
 
@@ -673,12 +665,6 @@ You can launch an application within an Application 
Package.
 ```
 dt> launch [-D property-name=property-value, ...] [-conf config-name]
  [-apconf config-file-within-app-package] <app-package-file>
- [matching-app-name] 
+ [matching-app-name]
  ```
 Note that -conf expects a configuration file in the file system, while -apconf 
expects a configuration file within the app package.
-
-
-
-
-
-

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/84ffc251/configuration_packages.md
----------------------------------------------------------------------
diff --git a/configuration_packages.md b/configuration_packages.md
index 27d3978..331abd1 100644
--- a/configuration_packages.md
+++ b/configuration_packages.md
@@ -3,7 +3,7 @@ Apache Apex Configuration Packages
 
 An Apache Apex Application Configuration Package is a zip file that contains
 configuration files and additional files to be launched with an
-[Application 
Package](https://www.datatorrent.com/docs/guides/ApplicationPackages.html) 
using 
+[Application Package](application_packages.md) using 
 DTCLI or REST API.  This guide assumes the reader’s familiarity of
 Application Package.  Please read the Application Package document to
 get yourself familiar with the concept first if you have not done so.
@@ -28,28 +28,23 @@ DT configuration project using Maven by running the 
following command.
  Replace "com.example", "mydtconfig" and "1.0-SNAPSHOT" with the
 appropriate values:
 
-```
- $ mvn archetype:generate                                                
- -DarchetypeRepository=https://www.datatorrent.com/maven/content/reposito 
- ries/releases                                                            
- -DarchetypeGroupId=com.datatorrent                                       
- -DarchetypeArtifactId=apex-conf-archetype -DarchetypeVersion=3.0.0       
- -DgroupId=com.example -Dpackage=com.example.mydtconfig                   
- -DartifactId=mydtconfig -Dversion=1.0-SNAPSHOT                           
-
-```
+    $ mvn archetype:generate \
+     -DarchetypeGroupId=org.apache.apex \
+     -DarchetypeArtifactId=apex-conf-archetype 
-DarchetypeVersion=3.2.0-incubating \
+     -DgroupId=com.example -Dpackage=com.example.mydtconfig 
-DartifactId=mydtconfig \
+     -Dversion=1.0-SNAPSHOT
 
 This creates a Maven project named "mydtconfig". Open it with your
 favorite IDE (e.g. NetBeans, Eclipse, IntelliJ IDEA).  Try it out by
 running the following command:
+
 ```
 $ mvn package                                                         
 ```
 
 The "mvn package" command creates the Config Package file in target
 directory as target/mydtconfig.apc. You will be able to use that
-Configuration Package file to launch an application in your actual
-DataTorrent RTS installation.
+Configuration Package file to launch an Apache Apex application.
 
 ## Using IDE 
 
@@ -64,14 +59,9 @@ shown below.
 
 ![](images/AppConfig/ApplicationConfigurationPackages.html-image02.png)
 
-Group ID: com.datatorrent
+Group ID: org.apache.apex
 Artifact ID: apex-conf-archetype
-Version: 3.0.0 (or any later version)
-
-Repository:
-[https://www.datatorrent.com/maven/content/repositories/releases](https://www.datatorrent.com/maven/content/repositories/releases)
-
-[](https://www.datatorrent.com/maven/content/repositories/releases)
+Version: 3.2.0-incubating (or any later version)
 
 Press Next and fill out the rest of the required information. For
 example:
@@ -82,19 +72,19 @@ Click Finish, and now you have created your own Apex
 Configuration Package project.  The procedure for other IDEs, like
 Eclipse or IntelliJ, is similar.
 
-#Assembling your own configuration package 
+
+# Assembling your own configuration package 
 
 Inside the project created by the archetype, these are the files that
 you should know about when assembling your own configuration package:
 
-        ./pom.xml
-        ./src/main/resources/classpath
-        ./src/main/resources/files
-
-./src/main/resources/META-INF/properties.xml
-./src/main/resources/META-INF/properties-{appname}.xml
+    ./pom.xml
+    ./src/main/resources/classpath
+    ./src/main/resources/files
+    ./src/main/resources/META-INF/properties.xml
+    ./src/main/resources/META-INF/properties-{appname}.xml
 
-##pom.xml 
+## pom.xml 
 
 Example:
