[
https://issues.apache.org/jira/browse/STORM-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515234#comment-15515234
]
Roshan Naik edited comment on STORM-1961 at 9/23/16 5:11 AM:
-------------------------------------------------------------
# Strong typing in the APIs is really great!
# Would be good to do away with StreamBuilder and write it more directly and
concisely :
{quote}
Stream<String> x = new Stream(...).flatMap().blah()
{quote}
# Can we avoid the build() call here:
{quote}
StormSubmitter.submitTopologyWithProgressBar("test", new Config(),
builder.build());
{quote}
and simplify it to
{quote}
StormSubmitter.submitTopologyWithProgressBar("test", new Config(), stream );
{quote}
and have build(), or whatever else needs to happen, invoked internally
within submitTopology?
# It would be good to have overloaded versions taking arrays, in flatMap(T[])
and elsewhere, to support arrays natively, so that conversion via
Arrays.asList is not needed.
# The doc needs a more concrete definition of the Stream concept.
#* Is it just the data stream produced by the first operator/spout, or is it
the whole pipeline of operators?
#* Is it different from what we call a 'topology' in Storm?
#* When you say *Stream<T>*, what is T? Is it the type of value produced by
the terminal operator, or that of the first?
#* What if there is a branch/split and each terminal operator produces a
different type?
#* Can a stream pick up data from two different sources, for example from
Kafka and HDFS?
# The diagram in the doc shows fields and shuffle groupings. It is not clear
from the examples how the various groupings will be supported in the API.
# Would like to see API examples in the doc showing how grouping and
parallelism hints will be expressed in code.
# Would this API provide a mechanism to use the existing set of Storm spouts
and terminal bolts (like KafkaSpout, HdfsSpout, HbaseBolt, etc.)? Or do we
need new implementations?
# How will custom/user-defined operators be supported?
# The doc says
{quote}windowing defines the batch boundaries{quote}
It is a fatal mistake to associate batch boundaries with window boundaries.
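
To illustrate the array point above, here is a minimal Java sketch of what native array support in flatMap could look like. SketchStream and flatMapArray are hypothetical names made up for this example; they are not part of Storm or of the proposed API. One caveat: two flatMap overloads that differ only in the type arguments of java.util.function.Function would have the same erasure and would not compile, so the sketch uses a separately named method rather than a true overload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the proposed Stream<T>; NOT the actual Storm API.
final class SketchStream<T> {
    private final List<T> values;

    SketchStream(List<T> values) {
        this.values = new ArrayList<>(values);
    }

    // Iterable-based variant: callers holding an array must wrap it themselves.
    <R> SketchStream<R> flatMap(Function<T, Iterable<R>> fn) {
        List<R> out = new ArrayList<>();
        for (T value : values) {
            for (R result : fn.apply(value)) {
                out.add(result);
            }
        }
        return new SketchStream<>(out);
    }

    // Array-based variant (assumed name): does the Arrays.asList wrapping internally.
    <R> SketchStream<R> flatMapArray(Function<T, R[]> fn) {
        return flatMap(value -> Arrays.asList(fn.apply(value)));
    }

    // Returns a copy of the elements, for inspection in this sketch only.
    List<T> toList() {
        return new ArrayList<>(values);
    }

    public static void main(String[] args) {
        SketchStream<String> lines = new SketchStream<>(Arrays.asList("a b", "c"));

        // With only the Iterable variant, the caller converts explicitly.
        System.out.println(lines.flatMap(s -> Arrays.asList(s.split(" "))).toList());

        // With the array variant, String.split can be passed through directly.
        System.out.println(lines.flatMapArray(s -> s.split(" ")).toList());
    }
}
```

Both calls in main produce the same elements; the second just moves the Arrays.asList conversion out of user code.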
> Come up with streams api for storm core use cases
> -------------------------------------------------
>
> Key: STORM-1961
> URL: https://issues.apache.org/jira/browse/STORM-1961
> Project: Apache Storm
> Issue Type: Sub-task
> Reporter: Arun Mahadevan
> Assignee: Arun Mahadevan
> Attachments: UnifiedStreamapiforStorm.pdf
>
>