[ 
https://issues.apache.org/jira/browse/STORM-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108824#comment-15108824
 ] 

ASF GitHub Bot commented on STORM-1214:
---------------------------------------

Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1029#discussion_r50276631
  
    --- Diff: storm-core/src/jvm/org/apache/storm/trident/operation/Function.java ---
    @@ -19,6 +19,73 @@
     
     import org.apache.storm.trident.tuple.TridentTuple;
     
    +import java.util.Map;
    +
    +/**
    + * A function takes in a set of input fields and emits zero or more tuples as output. The fields of the output tuple
    + * are appended to the original input tuple in the stream. If a function emits no tuples, the original input tuple is
    + * filtered out. Otherwise, the input tuple is duplicated for each output tuple.
    + *
    + * For example, if you have the following function:
    + *
    + * ```java
    + * public class MyFunction extends BaseFunction {
    + *     public void execute(TridentTuple tuple, TridentCollector collector) {
    + *         for (int i = 0; i < tuple.getInteger(0); i++) {
    + *             collector.emit(new Values(i));
    + *         }
    + *     }
    + * }
    + *
    + * ```
    + *
    + * Now suppose the variable `mystream` holds a stream with the fields `["a", "b", "c"]` and the following tuples:
    + *
    + * ```
    + * [1, 2, 3]
    + * [4, 1, 6]
    + * [3, 0, 8]
    + * ```
    + * If you had the following code in your topology definition:
    + *
    + * ```java
    + * mystream.each(new Fields("b"), new MyFunction(), new Fields("d"))
    + * ```
    + *
    + * The resulting tuples would have the fields `["a", "b", "c", "d"]` and look like this:
    + *
    + * ```
    + * [1, 2, 3, 0]
    + * [1, 2, 3, 1]
    + * [4, 1, 6, 0]
    + * ```
    + *
    + * In this case, the parameter `new Fields("b")` tells Trident that you would like to select the field "b" as input
    + * to the function, and that will be the only field in the Tuple passed to the `execute()` method. The value of "b" in
    + * the first tuple (2) causes the `for` loop to execute twice, so 2 tuples are emitted. Similarly, the second tuple causes
    + * one tuple to be emitted. For the third tuple, the value of 0 causes the `for` loop to be skipped, so nothing is
    + * emitted and the incoming tuple is filtered out of the stream.
    + *
    + * ### Configuration
    + * If your `Function` implementation has configuration requirements, you will typically want to extend
    + * {@link storm.trident.operation.BaseFunction} and override the
    + * {@link storm.trident.operation.Operation#prepare(Map, TridentOperationContext)} method to perform your custom
    + * initialization.
    + *
    + * ### Performance Considerations
    + * Because Trident Functions perform logic on individual tuples -- as opposed to batches -- it is advisable
    + * to avoid expensive operations such as database operations in a Function, if possible. For data store interactions
    + * it is better to use a {@link storm.trident.state.State} or {@link storm.trident.state.QueryFunction} implementation
    + * since Trident states operate on batch partitions and can perform bulk updates to a database.
    + *
    + *
    + */
    --- End diff --
    
    The `{@link}` targets here need the `org.apache` prefix too (e.g. `org.apache.storm.trident.operation.BaseFunction`).
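
As a rough, Storm-free sketch of the semantics the Javadoc in the diff describes (select one input field, append each emitted value, drop tuples that emit nothing), the following plain-Java model may help. The class `EachSemanticsSketch` and its methods `emitFor` and `each` are hypothetical names made up for illustration; they are not part of the Trident API and say nothing about how Trident implements `each()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EachSemanticsSketch {

    // Hypothetical stand-in for MyFunction.execute(): emits 0, 1, ..., n-1.
    public static List<Integer> emitFor(int n) {
        List<Integer> emitted = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            emitted.add(i);
        }
        return emitted;
    }

    // Models the described behaviour: read one field by index, then append
    // every emitted value to a copy of the input tuple. A tuple that emits
    // nothing contributes no output, so it is effectively filtered out.
    public static List<List<Integer>> each(List<List<Integer>> stream, int inputIndex) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> tuple : stream) {
            for (Integer value : emitFor(tuple.get(inputIndex))) {
                List<Integer> extended = new ArrayList<>(tuple);
                extended.add(value);
                result.add(extended);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Fields ["a", "b", "c"]; index 1 corresponds to field "b".
        List<List<Integer>> stream = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 1, 6),
                Arrays.asList(3, 0, 8));
        // prints [[1, 2, 3, 0], [1, 2, 3, 1], [4, 1, 6, 0]]
        System.out.println(each(stream, 1));
    }
}
```

The output matches the three result tuples listed in the Javadoc: two for `[1, 2, 3]`, one for `[4, 1, 6]`, and none for `[3, 0, 8]`.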


> Trident API Improvements
> ------------------------
>
>                 Key: STORM-1214
>                 URL: https://issues.apache.org/jira/browse/STORM-1214
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: P. Taylor Goetz
>            Assignee: P. Taylor Goetz
>
> There are a few idiosyncrasies in the Trident API that can sometimes trip 
> developers up (e.g. when and how to set the parallelism of components). There 
> are also a few areas where the API could be made slightly more intuitive 
> (e.g. add Java 8 streams-like methods like {{filter()}}, {{map()}}, 
> {{flatMap()}}, etc.).
> Some of these concerns can be addressed through documentation, and some by 
> altering the API. Since we are approaching a 1.0 release, it would be good to 
> address any API changes before a major release.
> The goal of this JIRA is to identify specific areas of improvement and 
> formulate an implementation that addresses them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
