aljoscha closed pull request #6753: [FLINK-1960] [Documentation] Add docs for 
withForwardedFields and related operators in Scala API
URL: https://github.com/apache/flink/pull/6753
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala 
b/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala
index 71037c3ae1d..c3a46c79e6a 100644
--- a/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala
+++ b/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala
@@ -282,6 +282,49 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
     this
   }
 
+  /**
+    * Adds semantic information about forwarded fields of the user-defined 
function.
+    * The forwarded fields information declares fields which are never 
modified by the function and
+    * which are forwarded to the same position in the output or copied 
unchanged to another position
+    * in the output.
+    *
+    * <p>Fields that are forwarded to the same position are specified just by 
their position.
+    * The specified position must be valid for the input and output data type 
and have
+    * the same type.
+    * For example <code>withForwardedFields("_3")</code> declares that the 
third field of
+    * an input tuple is copied to the third field of an output tuple.
+    *
+    * <p>Fields which are copied to another position in the output unchanged 
are declared by
+    * specifying the source field reference in the input and the target field 
reference
+    * in the output.
+    * {@code withForwardedFields("_1->_3")} denotes that the first field of 
the input tuple is
+    * copied to the third field of the output tuple unchanged. When using a 
wildcard ("*") ensure
+    * that the number of declared fields and their types in input and output 
type match.
+    *
+    * <p>Multiple forwarded fields can be annotated in one
+    * ({@code withForwardedFields("_2; _3->_1; _4")})
+    * or separate Strings ({@code withForwardedFields("_2", "_3->_1", "_4")}).
+    * Please refer to the JavaDoc of {@link 
org.apache.flink.api.common.functions.Function}
+    * or Flink's documentation for details on field references such as nested 
fields and wildcard.
+    *
+    * <p>It is not possible to override existing semantic information about 
forwarded fields
+    * which was for example added by a
+    * {@link 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields} class
+    * annotation.
+    *
+    * <p><b>NOTE: Adding semantic information for functions is optional!
+    * If used correctly, semantic information can help the Flink optimizer to 
generate more
+    * efficient execution plans.
+    * However, incorrect semantic information can cause the optimizer to 
generate incorrect
+    * execution plans which compute wrong results!
+    * So be careful when adding semantic information.
+    * </b>
+    *
+    * @param forwardedFields A list of field forward expressions.
+    * @return This operator with annotated forwarded field information.
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation
+    * @see 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields
+    */
   def withForwardedFields(forwardedFields: String*) = {
     javaSet match {
       case op: SingleInputUdfOperator[_, _, _] => 
op.withForwardedFields(forwardedFields: _*)
@@ -292,6 +335,52 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
     this
   }
 
+  /**
+    * Adds semantic information about forwarded fields of the first input
+    * of the user-defined function.
+    * The forwarded fields information declares fields which are never 
modified by the function and
+    * which are forwarded to the same position in the output or copied 
unchanged
+    * to another position in the output.
+    *
+    * <p>Fields that are forwarded to the same position are specified just by 
their position.
+    * The specified position must be valid for the input and output data type
+    * and have the same type.
+    * For example <code>withForwardedFieldsFirst("_3")</code> declares that 
the third field
+    * of an input tuple from the first input is copied to the third field of 
an output tuple.
+    *
+    * <p>Fields which are copied from the first input to another position in 
the output unchanged
+    * are declared by specifying the source field reference in the first input 
and the target field
+    * reference in the output. {@code withForwardedFieldsFirst("_1->_3")} 
denotes that the first
+    * field of the first input tuple is copied to the third field of the 
output tuple unchanged.
+    * When using a wildcard ("*") ensure that the number of declared fields 
and their types
+    * in the first input and output type match.
+    *
+    * <p>Multiple forwarded fields can be annotated in one
+    * ({@code withForwardedFieldsFirst("_2; _3->_0; _4")})
+    * or separate Strings ({@code withForwardedFieldsFirst("_2", "_3->_0", 
"_4")}).
+    * Please refer to the JavaDoc of
+    * {@link org.apache.flink.api.common.functions.Function} or Flink's 
documentation for
+    * details on field references such as nested fields and wildcard.
+    *
+    * <p>It is not possible to override existing semantic information about 
forwarded fields
+    * of the first input which was for example added by a
+    * {@link 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsFirst}
+    * class annotation.
+    *
+    * <p><b>NOTE: Adding semantic information for functions is optional!
+    * If used correctly, semantic information can help the Flink optimizer to 
generate more
+    * efficient execution plans.
+    * However, incorrect semantic information can cause the optimizer to 
generate incorrect
+    * execution plans which compute wrong results!
+    * So be careful when adding semantic information.
+    * </b>
+    *
+    * @param forwardedFields A list of forwarded field expressions for the 
first input
+    *                        of the function.
+    * @return This operator with annotated forwarded field information.
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation
+    * @see 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsFirst
+    */
   def withForwardedFieldsFirst(forwardedFields: String*) = {
     javaSet match {
       case op: TwoInputUdfOperator[_, _, _, _] => 
op.withForwardedFieldsFirst(forwardedFields: _*)
@@ -302,6 +391,53 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
     this
   }
 
+  /**
+    * Adds semantic information about forwarded fields of the second input
+    * of the user-defined function.
+    * The forwarded fields information declares fields which are never 
modified by the function and
+    * which are forwarded to the same position in the output or copied 
unchanged
+    * to another position in the output.
+    *
+    * <p>Fields that are forwarded to the same position are specified just by 
their position.
+    * The specified position must be valid for the input and output data type
+    * and have the same type.
+    * For example <code>withForwardedFieldsFirst("_3")</code> declares that 
the third field
+    * of an input tuple from the second input is copied to the third field of 
an output tuple.
+    *
+    * <p>Fields which are copied from the second input to another position
+    * in the output unchanged are declared by specifying the source field 
reference
+    * in the second input and the target field reference in the output.
+    * {@code withForwardedFieldsFirst("_1->_3")} denotes that the first field 
of the second input
+    * tuple is copied to the third field of the output tuple unchanged. When 
using a wildcard ("*")
+    * ensure that the number of declared fields and their types in the second 
input and
+    * output type match.
+    *
+    * <p>Multiple forwarded fields can be annotated in one
+    * ({@code withForwardedFieldsFirst("_2; _3->_0; _4")})
+    * or separate Strings ({@code withForwardedFieldsFirst("_2", "_3->_0", 
"_4")}).
+    * Please refer to the JavaDoc of
+    * {@link org.apache.flink.api.common.functions.Function} or Flink's 
documentation for
+    * details on field references such as nested fields and wildcard.
+    *
+    * <p>It is not possible to override existing semantic information about 
forwarded fields
+    * of the second input which was for example added by a
+    * {@link 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsFirst}
+    * class annotation.
+    *
+    * <p><b>NOTE: Adding semantic information for functions is optional!
+    * If used correctly, semantic information can help the Flink optimizer to 
generate more
+    * efficient execution plans.
+    * However, incorrect semantic information can cause the optimizer to 
generate incorrect
+    * execution plans which compute wrong results!
+    * So be careful when adding semantic information.
+    * </b>
+    *
+    * @param forwardedFields A list of forwarded field expressions for the 
second input
+    *                        of the function.
+    * @return This operator with annotated forwarded field information.
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation
+    * @see 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsFirst
+    */
   def withForwardedFieldsSecond(forwardedFields: String*) = {
     javaSet match {
       case op: TwoInputUdfOperator[_, _, _, _] => 
op.withForwardedFieldsSecond(forwardedFields: _*)


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to