[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310987#comment-15310987
 ] 

Ilya Ganelin edited comment on APEXMALHAR-2099 at 6/1/16 7:53 PM:
------------------------------------------------------------------

The current implementation of the Apex Stream API (ApexStreamImpl.java) 
supports the following functions:

- map
- flatMap
- filter
- reduce
- fold
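
As a rough illustration of what these five functions do, here is a minimal sketch that uses java.util.stream as a stand-in. It is not the Apex Stream API, and the class and method names (StreamOpsSketch, lengths, words, and so on) are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: java.util.stream stands in for the Apex Stream API.
final class StreamOpsSketch {

    // map: exactly one output per input element.
    static List<Integer> lengths(List<String> lines) {
        return lines.stream().map(String::length).collect(Collectors.toList());
    }

    // flatMap: zero or more outputs per input element.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    // filter: keep only the elements that match a predicate.
    static List<String> longWords(List<String> words, int minLength) {
        return words.stream()
                .filter(w -> w.length() >= minLength)
                .collect(Collectors.toList());
    }

    // reduce: combine elements into a single value of the same type.
    static int total(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // fold: like reduce, but the accumulator type (here a StringBuilder)
    // may differ from the element type.
    static String joinLengths(List<String> words) {
        StringBuilder acc = new StringBuilder();
        for (String w : words) {
            acc.append(w.length());
        }
        return acc.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apex stream api", "beam");
        List<String> words = words(lines);
        System.out.println(lengths(lines));
        System.out.println(words);
        System.out.println(longWords(words, 4));
        System.out.println(total(lengths(lines)));
        System.out.println(joinLengths(words));
    }
}
```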

On the Beam side, there is no strict "API" for applying a transformation. 
Instead, Beam defines a PTransform class whose "apply" function applies the 
transformation to incoming data represented as a PCollection. There are 
presently on the order of 40 different transformations implemented for Beam: 
https://github.com/apache/incubator-beam/tree/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms
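
The "everything is a transform passed to apply" shape described above can be sketched in plain Java. The types below (Transform, Dataset, perElement) are hypothetical stand-ins written for this comment. They are not Beam's PTransform or PCollection classes and ignore windowing, coders, and distribution entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class TransformSketch {

    // Hypothetical stand-in for a PTransform: one step from a collection to a collection.
    interface Transform<I, O> {
        List<O> expand(List<I> input);
    }

    // Hypothetical stand-in for a PCollection: the only way to process it is apply().
    static final class Dataset<T> {
        private final List<T> elements;

        Dataset(List<T> elements) {
            this.elements = new ArrayList<>(elements);
        }

        <O> Dataset<O> apply(Transform<T, O> transform) {
            return new Dataset<>(transform.expand(elements));
        }

        List<T> elements() {
            return new ArrayList<>(elements);
        }
    }

    // Builds a per-element transform from an ordinary function.
    static <I, O> Transform<I, O> perElement(Function<I, O> fn) {
        return input -> {
            List<O> out = new ArrayList<>();
            for (I element : input) {
                out.add(fn.apply(element));
            }
            return out;
        };
    }

    // Chains two transforms: trim each string, then take its length.
    static List<Integer> run(List<String> input) {
        Transform<String, String> trim = perElement(String::trim);
        Transform<String, Integer> length = perElement(String::length);
        return new Dataset<>(input).apply(trim).apply(length).elements();
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add(" map ");
        input.add("flatMap");
        System.out.println(run(input));
    }
}
```

The point of the design is that new operations are added as new transform objects rather than as new methods on the stream type, which is one reason the list of Beam transformations can grow without changing the collection class.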

The analogs to the Stream API are: 
Apex => Beam
map => ParDo
flatMap => FlatMapElements
filter => Filter
reduce => Combine (sort of)
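
One possible reading of the "sort of": a reduce takes a single two-argument function over one type, while a Beam combine is described by four steps (createAccumulator, addInput, mergeAccumulators, extractOutput) with a separate accumulator type, so partial results from different workers can be merged. The sketch below mirrors that shape in plain Java. CombineLike and Mean are hypothetical names and this is not the Beam class itself.

```java
import java.util.Arrays;
import java.util.List;

final class CombineSketch {

    // Hypothetical interface shaped like a Beam combine function; not the Beam class.
    interface CombineLike<I, A, O> {
        A createAccumulator();
        A addInput(A accumulator, I input);
        A mergeAccumulators(List<A> accumulators);
        O extractOutput(A accumulator);
    }

    // Mean of integers. The accumulator is {sum, count}, which differs from
    // both the input type (Integer) and the output type (Double).
    static final class Mean implements CombineLike<Integer, long[], Double> {
        public long[] createAccumulator() {
            return new long[] {0L, 0L};
        }

        public long[] addInput(long[] acc, Integer input) {
            acc[0] += input;
            acc[1]++;
            return acc;
        }

        public long[] mergeAccumulators(List<long[]> accumulators) {
            long[] merged = createAccumulator();
            for (long[] a : accumulators) {
                merged[0] += a[0];
                merged[1] += a[1];
            }
            return merged;
        }

        public Double extractOutput(long[] acc) {
            return acc[1] == 0 ? Double.NaN : (double) acc[0] / acc[1];
        }
    }

    // Simulates two workers that each see only part of the input.
    static double meanAcrossTwoPartitions(List<Integer> left, List<Integer> right) {
        Mean fn = new Mean();
        long[] a = fn.createAccumulator();
        for (int v : left) {
            a = fn.addInput(a, v);
        }
        long[] b = fn.createAccumulator();
        for (int v : right) {
            b = fn.addInput(b, v);
        }
        return fn.extractOutput(fn.mergeAccumulators(Arrays.asList(a, b)));
    }

    public static void main(String[] args) {
        System.out.println(meanAcrossTwoPartitions(Arrays.asList(1, 2), Arrays.asList(3, 6)));
    }
}
```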

In general, Beam presently supports a much greater variety of transformations. 
It also supports different classes of transformation. For example, some 
transformations are applied over a window, while others are applied on a 
per-tuple basis. The windowing behavior can be explicitly specified by defining 
a windowing strategy. A key limitation of the current Apex Stream API is that 
it has no support for cross-stream interaction. Specifically, operations like 
groupByKey or join are not currently defined within the scope of the Apex 
Stream API, which restricts the applications that can be built on it. 
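
To make the missing operations concrete, here is a minimal sketch of groupByKey and an inner join over bounded, in-memory lists in plain Java. It is neither Apex nor Beam code, and all names (KeyedOpsSketch, entry, groupByKey, join) are hypothetical. On an unbounded stream the same logic would additionally need a window to bound the state it keeps per key.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KeyedOpsSketch {

    // Small helper to build a key/value pair.
    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // groupByKey: gather all values that share a key, in first-seen key order.
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> pairs) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Inner join of two keyed inputs: one output per (left, right) pair with equal keys.
    static <K, L, R> List<String> join(List<Map.Entry<K, L>> left, List<Map.Entry<K, R>> right) {
        Map<K, List<R>> rightByKey = groupByKey(right);
        List<String> joined = new ArrayList<>();
        for (Map.Entry<K, L> l : left) {
            List<R> matches = rightByKey.get(l.getKey());
            if (matches == null) {
                continue; // no right-side value for this key
            }
            for (R r : matches) {
                joined.add(l.getKey() + ":" + l.getValue() + "," + r);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> users =
                Arrays.asList(entry("u1", "alice"), entry("u2", "bob"));
        List<Map.Entry<String, String>> events =
                Arrays.asList(entry("u1", "click"), entry("u1", "view"), entry("u3", "click"));
        System.out.println(groupByKey(events));
        System.out.println(join(users, events));
    }
}
```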





> Identify overlap between Beam API and existing Apex Stream API
> --------------------------------------------------------------
>
>                 Key: APEXMALHAR-2099
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2099
>             Project: Apache Apex Malhar
>          Issue Type: Sub-task
>            Reporter: Ilya Ganelin
>
> There should be some overlap between the Beam API and the recently released 
> Apex Stream API. This task captures the need to understand and document this 
> overlap.
> AC:
> * A document or JIRA comment identifying which components of the Beam API are 
> implemented, similar, or absent within the Apex Stream API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
