[ 
https://issues.apache.org/jira/browse/FLINK-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745391#comment-14745391
 ] 

Greg Hogan commented on FLINK-2668:
-----------------------------------

The project methods work well, particularly for interleaving elements from 
joined {{DataSet}}s. The issue arose when composing multiple custom library 
algorithms. The first call finished with a projection:

{code}
        ...

        DataSet<Tuple2<T,T>> result = data
            .partitionByHash(2)
            .sortPartition(2, Order.ASCENDING)
            .project(0, 1);

        return result;
{code}

and the second call started with a projection:

{code}
        DataSet<Tuple1<T>> first = this.data
            .<Tuple1<T>>project(0)
            .distinct();

        ...
{code}

My workaround was to replace the first projection with a simple map function.
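
To illustrate why the map helps, here is a toy model in plain Java. It uses no Flink classes: {{ToyDataSet}}, {{ToyProjection}}, and {{identityMap}} are made-up names, and the model rests on the assumption (consistent with the field count growing from 2 to 3) that calling {{project(...)}} on an open projection appends field indexes to it rather than projecting its output. It is a sketch of the behavior, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Small demo of the three cases discussed in this comment.
class ProjectionDemo {
    public static void main(String[] args) {
        ToyDataSet result = new ToyDataSet(3).project(0, 1);
        System.out.println(result.arity());                          // 2
        System.out.println(result.project(0).arity());               // 3
        System.out.println(result.identityMap().project(0).arity()); // 1
    }
}

// Hypothetical stand-in for a tuple data set; it tracks only the field count.
class ToyDataSet {
    private final int arity;

    ToyDataSet(int arity) { this.arity = arity; }

    int arity() { return arity; }

    // Starts a fresh projection onto the given fields of this data set.
    ToyDataSet project(int... fields) {
        return new ToyProjection(this, new ArrayList<>(), fields);
    }

    // Models a map that copies every field: the result is a plain data set,
    // so a later project(...) starts from scratch.
    ToyDataSet identityMap() { return new ToyDataSet(arity()); }
}

// Hypothetical stand-in for a projection that is still "open".
class ToyProjection extends ToyDataSet {
    private final ToyDataSet input;
    private final List<Integer> fields;

    ToyProjection(ToyDataSet input, List<Integer> previous, int... added) {
        super(previous.size() + added.length);
        this.input = input;
        this.fields = new ArrayList<>(previous);
        for (int f : added) {
            if (f < 0 || f >= input.arity()) {
                throw new IndexOutOfBoundsException("No field " + f);
            }
            this.fields.add(f);
        }
    }

    // Assumed behavior: extends the open projection with more indexes
    // instead of projecting the projection's own output.
    @Override
    ToyDataSet project(int... more) {
        return new ToyProjection(input, fields, more);
    }
}
```

In this model the chained call yields three fields (0, 1, 0), while placing a map between the two calls gives the second projection a plain data set to start from, which is what the workaround relies on.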

> ProjectOperator method to close projection
> ------------------------------------------
>
>                 Key: FLINK-2668
>                 URL: https://issues.apache.org/jira/browse/FLINK-2668
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API
>    Affects Versions: master
>            Reporter: Greg Hogan
>            Priority: Minor
>
> I have come across an issue in my code where I called {{project(...)}} on a 
> {{DataSet}} which was already a {{ProjectOperator}}. Instead of reducing the 
> number of fields from 2 to 1, this increased the number of fields from 
> 2 to 3, resulting in 
> {{org.apache.flink.api.common.functions.InvalidTypesException: Input 
> mismatch: Tuple arity '3' expected but was '1'.}} when processing the next 
> operator.
> This can be resolved by adding an optional explicit call to conclude the 
> projection, perhaps {{ProjectOperator.closeProjection()}}. Can this be done 
> without creating a new no-op operator?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
