Chesnay Schepler created FLINK-3751:
---------------------------------------

             Summary: default Operator names are inconsistent
                 Key: FLINK-3751
                 URL: https://issues.apache.org/jira/browse/FLINK-3751
             Project: Flink
          Issue Type: Bug
          Components: DataSet API, DataStream API
    Affects Versions: 1.0.1
            Reporter: Chesnay Schepler
            Priority: Minor


h3. The Problem
If a user doesn't name an operator explicitly (generally using the name() 
method) then Flink auto-generates a name. These generated names are really 
(like, _really_) inconsistent within and across APIs.

In the batch API, non-source/-sink operator names are _generally_ formed like 
this:
{code}FlatMap (FlatMap at main(WordCount.java:81)){code}

We have
* FlatMap, describing the runtime operator type
* another FlatMap, describing which user-call created this operator
* main(WordCount.java:81), describing the call location
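
As a minimal sketch of how these three parts combine, the snippet below rebuilds the example name from its components. The class and method names are hypothetical and are not taken from Flink's code base; only the resulting string matches the name shown above.

```java
// Hypothetical helper, NOT part of Flink: it only illustrates the
// "<runtime type> (<user call> at <call location>)" pattern described above.
final class BatchOperatorNames {

    /** Joins the three parts into the batch-style default name. */
    static String defaultName(String runtimeType, String userCall, String callLocation) {
        return runtimeType + " (" + userCall + " at " + callLocation + ")";
    }

    public static void main(String[] args) {
        // Rebuilds the WordCount example from the issue description.
        System.out.println(defaultName("FlatMap", "FlatMap", "main(WordCount.java:81)"));
    }
}
```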

This already falls apart when you have a DataSource, which looks like this:
{code}DataSource (at getDefaultTextLineDataSet(WordCountData.java:70) 
(org.apache.flink.CollectionInputFormat)){code}
It is missing the call that created the source (fromElements()) and suddenly 
includes the inputFormat name.

Sinks are a different story yet again, since collect() is displayed as
{code} DataSink (collect()) {code}
which is missing the call location.

Then we have the Streaming API, where things are named completely differently 
as well:

The fromElements source is displayed as 
{code} Source: Collection Source {code}

Non-source/-sink operators are displayed simply as their runtime operator type
{code} FlatMap {code}

and sinks, at times, do not have a name at all.
{code} Sink: Unnamed {code}
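
The three streaming names above seem to follow a prefix-plus-name pattern with a fallback for missing names. The sketch below reproduces that pattern under this assumption; the enum, the method and the fallback logic are hypothetical and do not describe how Flink actually derives these names.

```java
// Hypothetical model, NOT Flink code: reproduces the three streaming
// display names quoted above from an operator kind and an optional name.
final class StreamingOperatorNames {

    enum Kind { SOURCE, OPERATOR, SINK }

    /** Returns the display name; a missing name is shown as "Unnamed". */
    static String displayName(Kind kind, String name) {
        String shown = (name == null || name.isEmpty()) ? "Unnamed" : name;
        switch (kind) {
            case SOURCE:
                return "Source: " + shown;
            case SINK:
                return "Sink: " + shown;
            default:
                // Plain operators carry no prefix at all.
                return shown;
        }
    }

    public static void main(String[] args) {
        System.out.println(displayName(Kind.SOURCE, "Collection Source"));
        System.out.println(displayName(Kind.OPERATOR, "FlatMap"));
        System.out.println(displayName(Kind.SINK, null));
    }
}
```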

To put the cherry on top, chains are displayed in the Batch API as
{code} CHAIN <operator> -> <operator> {code}
while the Streaming API drops the CHAIN keyword
{code} <operator> -> <operator> {code}
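
To make the difference between the two chain formats concrete, here is a small sketch that builds both variants from the same operator names. The class and method names are hypothetical and only mirror the two display formats quoted above.

```java
// Hypothetical helper, NOT Flink code: builds the two chain display
// formats quoted above from the same list of operator names.
final class ChainNames {

    /** Batch style: operator names joined by " -> " behind a "CHAIN " prefix. */
    static String batchChain(String... operators) {
        return "CHAIN " + String.join(" -> ", operators);
    }

    /** Streaming style: the same joined names without the prefix. */
    static String streamingChain(String... operators) {
        return String.join(" -> ", operators);
    }

    public static void main(String[] args) {
        System.out.println(batchChain("FlatMap", "Map"));
        System.out.println(streamingChain("FlatMap", "Map"));
    }
}
```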

Considering that these names are right in the user's face via the Dashboard, we 
should try to homogenize them a bit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
