[jira] [Updated] (FLINK-3680) Remove or improve (not set) text in the Job Plan UI
[ https://issues.apache.org/jira/browse/FLINK-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jamie Grier updated FLINK-3680: --- Attachment: Screen Shot 2016-03-29 at 8.12.17 PM.png Screen Shot 2016-03-29 at 8.13.12 PM.png > Remove or improve (not set) text in the Job Plan UI > --- > > Key: FLINK-3680 > URL: https://issues.apache.org/jira/browse/FLINK-3680 > Project: Flink > Issue Type: Bug > Components: Webfrontend > Reporter: Jamie Grier > Attachments: Screen Shot 2016-03-29 at 8.12.17 PM.png, Screen Shot > 2016-03-29 at 8.13.12 PM.png > > > When running streaming jobs, the UI displays (not set) in a few > different places. This is not the case for batch jobs. > To illustrate, I've included screen shots of the UI for the batch and > streaming WordCount examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3680) Remove or improve (not set) text in the Job Plan UI
Jamie Grier created FLINK-3680: -- Summary: Remove or improve (not set) text in the Job Plan UI Key: FLINK-3680 URL: https://issues.apache.org/jira/browse/FLINK-3680 Project: Flink Issue Type: Bug Components: Webfrontend Reporter: Jamie Grier When running streaming jobs, the UI displays (not set) in a few different places. This is not the case for batch jobs. To illustrate, I've included screen shots of the UI for the batch and streaming WordCount examples.
[jira] [Commented] (FLINK-2998) Support range partition comparison for multi input nodes.
[ https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217301#comment-15217301 ] ASF GitHub Bot commented on FLINK-2998: --- Github user gallenvara commented on the pull request: https://github.com/apache/flink/pull/1838#issuecomment-203216796 @fhueske @ChengXiangLi Can you please help with the review? :) > Support range partition comparison for multi input nodes. > - > > Key: FLINK-2998 > URL: https://issues.apache.org/jira/browse/FLINK-2998 > Project: Flink > Issue Type: New Feature > Components: Optimizer > Reporter: Chengxiang Li > Priority: Minor > > The optimizer may have an opportunity to optimize the DAG when it finds > that two input range partitions are equivalent, but we do not support this > comparison yet.
[GitHub] flink pull request: [FLINK-2998] Support range partition compariso...
Github user gallenvara commented on the pull request: https://github.com/apache/flink/pull/1838#issuecomment-203216796 @fhueske @ChengXiangLi Can you please help with review? :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2998) Support range partition comparison for multi input nodes.
[ https://issues.apache.org/jira/browse/FLINK-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217300#comment-15217300 ] ASF GitHub Bot commented on FLINK-2998: --- GitHub user gallenvara opened a pull request: https://github.com/apache/flink/pull/1838 [FLINK-2998] Support range partition comparison for multi input nodes. The PR implements range partition comparison in operations such as join and coGroup for multiple inputs; the optimizer can now optimize the DAG to avoid re-partitioning if it finds that the user-supplied data distributions are equal. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gallenvara/flink flink-2998 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1838.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1838 commit 37e6147a829e50ba8a45c26f225e16e7695f6489 Author: gallenvara Date: 2016-03-29T14:36:21Z Support range partition comparison for multi input nodes. > Support range partition comparison for multi input nodes. > - > > Key: FLINK-2998 > URL: https://issues.apache.org/jira/browse/FLINK-2998 > Project: Flink > Issue Type: New Feature > Components: Optimizer > Reporter: Chengxiang Li > Priority: Minor > > The optimizer may have an opportunity to optimize the DAG when it finds > that two input range partitions are equivalent, but we do not support this > comparison yet.
[GitHub] flink pull request: [FLINK-2998] Support range partition compariso...
GitHub user gallenvara opened a pull request: https://github.com/apache/flink/pull/1838 [FLINK-2998] Support range partition comparison for multi input nodes. The PR implements range partition comparison in operations such as join and coGroup for multiple inputs; the optimizer can now optimize the DAG to avoid re-partitioning if it finds that the user-supplied data distributions are equal. You can merge this pull request into a Git repository by running: $ git pull https://github.com/gallenvara/flink flink-2998 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1838.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1838 commit 37e6147a829e50ba8a45c26f225e16e7695f6489 Author: gallenvara Date: 2016-03-29T14:36:21Z Support range partition comparison for multi input nodes.
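The optimization discussed in this thread hinges on deciding when two range partitionings are interchangeable. A minimal sketch of that equality check, with invented class and method names (this is not the PR's code): two inputs range-partitioned with identical user-supplied bucket boundaries route every key to the same partition index, so a re-partition before join/coGroup can be skipped.

```java
import java.util.Arrays;

// Illustrative sketch only: class and method names are made up, not Flink's.
public class RangePartitionCheck {

    // A user-supplied range distribution, reduced here to its bucket boundaries.
    static final class RangeDistribution {
        final int[] boundaries;
        RangeDistribution(int... boundaries) { this.boundaries = boundaries; }
    }

    // Two range partitionings are compatible if their boundaries are identical:
    // every key then maps to the same partition index on both inputs.
    static boolean compatible(RangeDistribution a, RangeDistribution b) {
        return Arrays.equals(a.boundaries, b.boundaries);
    }

    public static void main(String[] args) {
        RangeDistribution left  = new RangeDistribution(10, 20, 30);
        RangeDistribution right = new RangeDistribution(10, 20, 30);
        RangeDistribution other = new RangeDistribution(5, 15, 25);
        System.out.println(compatible(left, right)); // true  -> re-partition avoided
        System.out.println(compatible(left, other)); // false -> must re-partition
    }
}
```
A real optimizer would additionally have to check that the partitioning keys and key order match, not just the boundaries.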
[jira] [Comment Edited] (FLINK-3670) Kerberos: Improving long-running streaming jobs
[ https://issues.apache.org/jira/browse/FLINK-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217092#comment-15217092 ] Eron Wright edited comment on FLINK-3670 at 3/29/16 11:57 PM: --- Another possibility worth considering is to leverage Hadoop's 'proxy user' functionality. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html In this approach, the JobManager impersonates the job submitter when accessing HDFS, HBASE, or Hive. Those servers would be configured to treat the JobManager principal as a superuser. Note that the above solution isn't general, since Kafka (for example) doesn't provide proxy user functionality. Maybe both options could be provided. was (Author: eronwright): Another possibility worth considering is to leverage Hadoop's 'proxy user' functionality. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html In this approach, the JobManager impersonates the job submitter when accessing HDFS, HBASE, or Hive. Those servers would be configured to treat the JobManager principal as a proxy user. Note that the above solution isn't general, since Kafka (for example) doesn't provide proxy user functionality. Maybe both options could be provided. > Kerberos: Improving long-running streaming jobs > --- > > Key: FLINK-3670 > URL: https://issues.apache.org/jira/browse/FLINK-3670 > Project: Flink > Issue Type: Improvement > Components: Command-line client, Local Runtime > Reporter: Maximilian Michels > > We have seen in the past that Hadoop's delegation tokens are subject to a > number of subtle token renewal bugs. In addition, they have a maximum > lifetime that can be worked around but is very inconvenient for the user. 
> As per [mailing list > discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Kerberos-for-Streaming-amp-Kafka-td10906.html], > a way to work around the maximum lifetime of DelegationTokens would be to > pass the Kerberos principal and keytab upon job submission. A daemon could > then periodically renew the ticket.
[jira] [Commented] (FLINK-3670) Kerberos: Improving long-running streaming jobs
[ https://issues.apache.org/jira/browse/FLINK-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217092#comment-15217092 ] Eron Wright commented on FLINK-3670: - Another possibility worth considering is to leverage Hadoop's 'proxy user' functionality. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html In this approach, the JobManager impersonates the job submitter when accessing HDFS, HBASE, or Hive. Those servers would be configured to treat the JobManager principal as a proxy user. Note that the above solution isn't general, since Kafka (for example) doesn't provide proxy user functionality. Maybe both options could be provided. > Kerberos: Improving long-running streaming jobs > --- > > Key: FLINK-3670 > URL: https://issues.apache.org/jira/browse/FLINK-3670 > Project: Flink > Issue Type: Improvement > Components: Command-line client, Local Runtime > Reporter: Maximilian Michels > > We have seen in the past that Hadoop's delegation tokens are subject to a > number of subtle token renewal bugs. In addition, they have a maximum > lifetime that can be worked around but is very inconvenient for the user. > As per [mailing list > discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Kerberos-for-Streaming-amp-Kafka-td10906.html], > a way to work around the maximum lifetime of DelegationTokens would be to > pass the Kerberos principal and keytab upon job submission. A daemon could > then periodically renew the ticket.
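For reference, Hadoop's proxy-user mechanism referenced in the comment above is enabled on the impersonation target (e.g. the NameNode) via core-site.xml. A sketch assuming the JobManager runs as a principal named "flink"; the host and group values are placeholders, not project configuration:

```xml
<!-- core-site.xml on the service that should accept impersonation.
     Allows the "flink" principal to act on behalf of submitting users. -->
<property>
  <name>hadoop.proxyuser.flink.hosts</name>
  <value>jobmanager-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.flink.groups</name>
  <value>flink-users</value>
</property>
```

Restricting `hosts` and `groups` narrowly is what keeps this weaker than a blanket superuser grant.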
[jira] [Commented] (FLINK-3579) Improve String concatenation
[ https://issues.apache.org/jira/browse/FLINK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217029#comment-15217029 ] ASF GitHub Bot commented on FLINK-3579: --- Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1821#discussion_r57815729 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala --- @@ -116,7 +118,18 @@ object ExpressionParser extends JavaTokenParsers with PackratParsers { atom <~ ".cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | atom <~ ".cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | atom <~ ".cast(STRING)" ^^ { e => Cast(e, BasicTypeInfo.STRING_TYPE_INFO) } | -atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) } +atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) } | +// When an integer is directly casted. +atom <~ "cast(BYTE)" ^^ { e => Cast(e, BasicTypeInfo.BYTE_TYPE_INFO) } | +atom <~ "cast(SHORT)" ^^ { e => Cast(e, BasicTypeInfo.SHORT_TYPE_INFO) } | +atom <~ "cast(INT)" ^^ { e => Cast(e, BasicTypeInfo.INT_TYPE_INFO) } | +atom <~ "cast(LONG)" ^^ { e => Cast(e, BasicTypeInfo.LONG_TYPE_INFO) } | +atom <~ "cast(FLOAT)" ^^ { e => Cast(e, BasicTypeInfo.FLOAT_TYPE_INFO) } | +atom <~ "cast(DOUBLE)" ^^ { e => Cast(e, BasicTypeInfo.DOUBLE_TYPE_INFO) } | +atom <~ "cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | +atom <~ "cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | --- End diff -- This is about the Table API. SQL will be parsed by Calcite. So it is up to us what we accept. > Improve String concatenation > > > Key: FLINK-3579 > URL: https://issues.apache.org/jira/browse/FLINK-3579 > Project: Flink > Issue Type: Bug > Components: Table API > Reporter: Timo Walther > Assignee: ramkrishna.s.vasudevan > Priority: Minor > > Concatenation of a String and non-String does not work properly. > e.g. 
{{f0 + 42}} leads to a RelBuilder exception. > The ExpressionParser does not like {{f0 + 42.cast(STRING)}} either.
[GitHub] flink pull request: FLINK-3579 Improve String concatenation (Ram)
Github user fhueske commented on a diff in the pull request: https://github.com/apache/flink/pull/1821#discussion_r57815729 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala --- @@ -116,7 +118,18 @@ object ExpressionParser extends JavaTokenParsers with PackratParsers { atom <~ ".cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | atom <~ ".cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | atom <~ ".cast(STRING)" ^^ { e => Cast(e, BasicTypeInfo.STRING_TYPE_INFO) } | -atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) } +atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) } | +// When an integer is directly casted. +atom <~ "cast(BYTE)" ^^ { e => Cast(e, BasicTypeInfo.BYTE_TYPE_INFO) } | +atom <~ "cast(SHORT)" ^^ { e => Cast(e, BasicTypeInfo.SHORT_TYPE_INFO) } | +atom <~ "cast(INT)" ^^ { e => Cast(e, BasicTypeInfo.INT_TYPE_INFO) } | +atom <~ "cast(LONG)" ^^ { e => Cast(e, BasicTypeInfo.LONG_TYPE_INFO) } | +atom <~ "cast(FLOAT)" ^^ { e => Cast(e, BasicTypeInfo.FLOAT_TYPE_INFO) } | +atom <~ "cast(DOUBLE)" ^^ { e => Cast(e, BasicTypeInfo.DOUBLE_TYPE_INFO) } | +atom <~ "cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | +atom <~ "cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } | --- End diff -- This is about the Table API. SQL will be parsed by Calcite. So it is up to us what we accept.
[jira] [Commented] (FLINK-3477) Add hash-based combine strategy for ReduceFunction
[ https://issues.apache.org/jira/browse/FLINK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216998#comment-15216998 ] ASF GitHub Bot commented on FLINK-3477: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-203143579 Thanks for doing these experiments! The results are quite convincing. I'm currently on vacation and will be back in about a week. > Add hash-based combine strategy for ReduceFunction > -- > > Key: FLINK-3477 > URL: https://issues.apache.org/jira/browse/FLINK-3477 > Project: Flink > Issue Type: Sub-task > Components: Local Runtime > Reporter: Fabian Hueske > > This issue is about adding a hash-based combine strategy for ReduceFunctions. > The interface of the {{reduce()}} method is as follows: > {code} > public T reduce(T v1, T v2) > {code} > Input type and output type are identical and the function returns only a > single value. A ReduceFunction is incrementally applied to compute a final > aggregated value. This allows holding the pre-aggregated value in a hash > table and updating it with each function call. > The hash-based strategy requires a special implementation of an in-memory > hash table. The hash table should support in-place updates of elements (if > the new value has the same binary length as the old value) but also appending > updates with invalidation of the old value (if the binary length of the new > value differs). The hash table needs to be able to evict and emit all > elements if it runs out of memory. > We should also add {{HASH}} and {{SORT}} compiler hints to > {{DataSet.reduce()}} and {{Grouping.reduce()}} to allow users to pick the > execution strategy.
[GitHub] flink pull request: [FLINK-3477] [runtime] Add hash-based combine ...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-203143579 Thanks for doing these experiments! The results are quite convincing. I'm currently on vacation and will be back in about a week.
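The access pattern the issue describes — `reduce(T, T) -> T` keeps input and output types identical, so one pre-aggregated value per key can be updated on every call — can be sketched with a plain HashMap (names are invented here; the real strategy needs Flink's managed-memory hash table with eviction):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Minimal sketch of the hash-based combine idea, not Flink's implementation.
public class HashCombineSketch {

    // Combines (key, value) records, keeping one running aggregate per key.
    static Map<String, Integer> combine(String[] keys, int[] values,
                                        BinaryOperator<Integer> reduce) {
        Map<String, Integer> table = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            // merge() applies the reduce function incrementally -- exactly the
            // in-place update pattern the proposed hash strategy exploits.
            table.merge(keys[i], values[i], reduce);
        }
        return table; // a real combiner would emit and evict when memory runs out
    }

    public static void main(String[] args) {
        Map<String, Integer> out =
            combine(new String[]{"a", "b", "a"}, new int[]{1, 2, 3}, Integer::sum);
        System.out.println(out.get("a")); // 4
        System.out.println(out.get("b")); // 2
    }
}
```

A sort-based combine would instead sort by key and reduce adjacent records; the hash variant avoids the sort at the cost of the custom table described in the issue.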
[jira] [Commented] (FLINK-3673) Annotations for code generation
[ https://issues.apache.org/jira/browse/FLINK-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216988#comment-15216988 ] Fabian Hueske commented on FLINK-3673: -- Hi [~horvgab], can you add more detail about the annotations you are proposing with this issue? What would the semantics be, and how could they be used to generate more efficient code for serializers? Thanks > Annotations for code generation > --- > > Key: FLINK-3673 > URL: https://issues.apache.org/jira/browse/FLINK-3673 > Project: Flink > Issue Type: Sub-task > Components: Type Serialization System > Reporter: Gabor Horvath > Assignee: Gabor Horvath > Labels: gsoc2016 > > Annotations should be utilized to generate more efficient serialization code. > The very same annotations can be used to make the getLength method much > smarter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
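The issue does not yet specify the annotations, so the following is only a hypothetical illustration of the idea (the `@FixedLength` annotation and its semantics are invented here, not part of the actual proposal): a field-level annotation declaring a fixed serialized size, which a code generator could sum into a smarter `getLength()`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Purely illustrative sketch of annotation-driven serializer code generation.
public class AnnotationSketch {

    // Marks a field whose serialized form always occupies the same number of
    // bytes, so a generated serializer can use a constant length for it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface FixedLength {
        int bytes();
    }

    static class Point {
        @FixedLength(bytes = 4) int x;
        @FixedLength(bytes = 4) int y;
    }

    // Sums the declared lengths so getLength() can return a constant instead
    // of "unknown"; -1 signals that some field is variable-length.
    static int fixedLengthOf(Class<?> clazz) {
        int total = 0;
        for (Field f : clazz.getDeclaredFields()) {
            FixedLength a = f.getAnnotation(FixedLength.class);
            if (a == null) {
                return -1; // at least one variable-length field
            }
            total += a.bytes();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(fixedLengthOf(AnnotationSketch.Point.class)); // 8
    }
}
```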
[jira] [Created] (FLINK-3679) DeserializationSchema should handle zero or more outputs for every input
Jamie Grier created FLINK-3679: -- Summary: DeserializationSchema should handle zero or more outputs for every input Key: FLINK-3679 URL: https://issues.apache.org/jira/browse/FLINK-3679 Project: Flink Issue Type: Bug Components: DataStream API Reporter: Jamie Grier There are a couple of issues with the DeserializationSchema API that I think should be improved. This request has come to me via an existing Flink user. The main issue is simply that the API assumes a one-to-one mapping between inputs and outputs. In reality there are scenarios where one input message (say from Kafka) might actually map to zero or more logical elements in the pipeline. Particularly important here is the case where you receive a message from a source (such as Kafka) and the raw bytes don't deserialize properly. Right now the only recourse is to throw an IOException and therefore fail the job. This is definitely not good, since bad data is a reality and failing the job is not the right option. If the job fails we'll just end up replaying the bad data and the whole thing will start again. Instead, in this case it would be best if the user could just return the empty set. The other case is where one input message should logically be multiple output messages. This case is probably less important, since there are other ways to do this, but in general it might be good to make the DeserializationSchema.deserialize() method return a collection rather than a single element. Maybe we need to support a DeserializationSchema variant that has semantics more like that of FlatMap.
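The FlatMap-like variant suggested at the end could look roughly like this; the interface name and shape are hypothetical, not Flink's actual DeserializationSchema API. Passing a collector lets `deserialize()` emit zero elements for bad data, one for the normal case, or many for batched messages:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of a collector-based deserialization schema (hypothetical API).
public class FlatDeserializationSketch {

    interface CollectingDeserializationSchema<T> {
        void deserialize(byte[] message, Consumer<T> out);
    }

    // Example: each message holds newline-separated integers; records that
    // fail to parse are dropped instead of failing the job.
    static final CollectingDeserializationSchema<Integer> INT_LINES =
        (message, out) -> {
            for (String line : new String(message, StandardCharsets.UTF_8).split("\n")) {
                try {
                    out.accept(Integer.parseInt(line.trim()));
                } catch (NumberFormatException e) {
                    // bad data: emit nothing rather than throw an IOException
                }
            }
        };

    static List<Integer> deserializeAll(byte[] message) {
        List<Integer> result = new ArrayList<>();
        INT_LINES.deserialize(message, result::add);
        return result;
    }

    public static void main(String[] args) {
        byte[] msg = "1\noops\n3".getBytes(StandardCharsets.UTF_8);
        System.out.println(deserializeAll(msg)); // [1, 3]
    }
}
```

Note how the one bad record ("oops") simply yields no output, which is exactly the behavior the issue asks for.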
[jira] [Commented] (FLINK-2157) Create evaluation framework for ML library
[ https://issues.apache.org/jira/browse/FLINK-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216749#comment-15216749 ] ASF GitHub Bot commented on FLINK-2157: --- Github user thvasilo commented on the pull request: https://github.com/apache/flink/pull/871#issuecomment-203075630 @rawkintrevo AFAIK it's lack of time from a committer to review it. If @tillrohrmann can find some time to review this I'll refactor it to get rid of the conflicts and hopefully we can merge this and move on to #891 and #902 > Create evaluation framework for ML library > -- > > Key: FLINK-2157 > URL: https://issues.apache.org/jira/browse/FLINK-2157 > Project: Flink > Issue Type: New Feature > Components: Machine Learning Library > Reporter: Till Rohrmann > Assignee: Theodore Vasiloudis > Labels: ML > Fix For: 1.0.0 > > > Currently, FlinkML lacks the means to evaluate the performance of trained > models. It would be great to add some {{Evaluators}} which can calculate a > score based on the information about true and predicted labels. This could > also be used for cross validation to choose the right hyperparameters. > Possible scores could be the F score [1], zero-one-loss score, etc. > Resources > [1] [http://en.wikipedia.org/wiki/F1_score]
[GitHub] flink pull request: [FLINK-2157] [ml] Create evaluation framework ...
Github user thvasilo commented on the pull request: https://github.com/apache/flink/pull/871#issuecomment-203075630 @rawkintrevo AFAIK it's lack of time from a committer to review it. If @tillrohrmann can find some time to review this I'll refactor it to get rid of the conflicts and hopefully we can merge this and move on to #891 and #902
[jira] [Commented] (FLINK-2157) Create evaluation framework for ML library
[ https://issues.apache.org/jira/browse/FLINK-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216721#comment-15216721 ] ASF GitHub Bot commented on FLINK-2157: --- Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/871#issuecomment-203067582 Continued from the mailing list: Till already mentioned that having R-squared built into MLR was just a convenience method; it's not good for a number of reasons in practice. Also, what is the hold-up on this PR? What needs to be done / what are the remaining things to decide? Having some model scoring would be very handy. > Create evaluation framework for ML library > -- > > Key: FLINK-2157 > URL: https://issues.apache.org/jira/browse/FLINK-2157 > Project: Flink > Issue Type: New Feature > Components: Machine Learning Library > Reporter: Till Rohrmann > Assignee: Theodore Vasiloudis > Labels: ML > Fix For: 1.0.0 > > > Currently, FlinkML lacks the means to evaluate the performance of trained > models. It would be great to add some {{Evaluators}} which can calculate a > score based on the information about true and predicted labels. This could > also be used for cross validation to choose the right hyperparameters. > Possible scores could be the F score [1], zero-one-loss score, etc. > Resources > [1] [http://en.wikipedia.org/wiki/F1_score]
[GitHub] flink pull request: [FLINK-2157] [ml] Create evaluation framework ...
Github user rawkintrevo commented on the pull request: https://github.com/apache/flink/pull/871#issuecomment-203067582 Continued from the mailing list: Till already mentioned that having R-squared built into MLR was just a convenience method; it's not good for a number of reasons in practice. Also, what is the hold-up on this PR? What needs to be done / what are the remaining things to decide? Having some model scoring would be very handy.
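As a concrete example of one evaluator the issue asks for, the F1 score over binary true/predicted labels can be computed like this (a standalone sketch, not FlinkML code):

```java
// F1 score from true and predicted binary labels (1 = positive, 0 = negative):
// the harmonic mean of precision and recall.
public class F1ScoreSketch {

    static double f1(int[] truth, int[] predicted) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < truth.length; i++) {
            if (predicted[i] == 1 && truth[i] == 1) tp++;      // true positive
            else if (predicted[i] == 1) fp++;                  // false positive
            else if (truth[i] == 1) fn++;                      // false negative
        }
        double precision = tp == 0 ? 0.0 : (double) tp / (tp + fp);
        double recall    = tp == 0 ? 0.0 : (double) tp / (tp + fn);
        return (precision + recall) == 0.0 ? 0.0
            : 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        int[] truth     = {1, 1, 0, 0};
        int[] predicted = {1, 0, 1, 0};
        // precision = 1/2, recall = 1/2, so F1 = 0.5
        System.out.println(f1(truth, predicted));
    }
}
```

An `Evaluator` as described in the issue would wrap this kind of computation over a DataSet of (truth, prediction) pairs, which also makes it reusable for cross validation.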
[jira] [Commented] (FLINK-3678) Make Flink logs directory configurable
[ https://issues.apache.org/jira/browse/FLINK-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216397#comment-15216397 ] ASF GitHub Bot commented on FLINK-3678: --- GitHub user stefanobaghino opened a pull request: https://github.com/apache/flink/pull/1837 [FLINK-3678] Make Flink logs directory configurable * Edit config.sh * Document the newly defined log directory configuration key You can merge this pull request into a Git repository by running: $ git pull https://github.com/radicalbit/flink 3678 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1837.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1837 commit 05002c01345425d6fe9814ea7f669630fa5514b3 Author: Stefano Baghino Date: 2016-03-29T17:10:46Z [FLINK-3678] Make Flink logs directory configurable * Edit config.sh * Document the newly defined log directory configuration key > Make Flink logs directory configurable > -- > > Key: FLINK-3678 > URL: https://issues.apache.org/jira/browse/FLINK-3678 > Project: Flink > Issue Type: Improvement > Components: Start-Stop Scripts > Affects Versions: 1.0.0 > Reporter: Stefano Baghino > Assignee: Stefano Baghino > Priority: Minor > Fix For: 1.0.1 > > > Currently Flink logs are stored under {{$FLINK_HOME/log}} and the user cannot > configure an alternative storage location. It would be nice to add a > configuration key in the {{flink-conf.yaml}} and edit the {{bin/flink}} > launch script accordingly to get the value if present, or default to the > current behavior if no value is provided.
[GitHub] flink pull request: [FLINK-3678] Make Flink logs directory configu...
GitHub user stefanobaghino opened a pull request: https://github.com/apache/flink/pull/1837 [FLINK-3678] Make Flink logs directory configurable * Edit config.sh * Document the newly defined log directory configuration key You can merge this pull request into a Git repository by running: $ git pull https://github.com/radicalbit/flink 3678 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1837.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1837 commit 05002c01345425d6fe9814ea7f669630fa5514b3 Author: Stefano Baghino Date: 2016-03-29T17:10:46Z [FLINK-3678] Make Flink logs directory configurable * Edit config.sh * Document the newly defined log directory configuration key
[jira] [Commented] (FLINK-3651) Fix faulty RollingSink Restore
[ https://issues.apache.org/jira/browse/FLINK-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216361#comment-15216361 ] ASF GitHub Bot commented on FLINK-3651: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1830 > Fix faulty RollingSink Restore > -- > > Key: FLINK-3651 > URL: https://issues.apache.org/jira/browse/FLINK-3651 > Project: Flink > Issue Type: Bug > Components: Streaming > Reporter: Aljoscha Krettek > Assignee: Aljoscha Krettek > Fix For: 1.1.0, 1.0.1 > > > The RollingSink restore logic has a bug where the sink for subtask index 1 > also removes files for subtask index 11 because the regex that checks for the > file name also matches that one. Adding the suffix to the regex should solve > the problem because then the regex for 1 will only match files for subtask > index 1.
[jira] [Closed] (FLINK-3651) Fix faulty RollingSink Restore
[ https://issues.apache.org/jira/browse/FLINK-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi closed FLINK-3651. -- Resolution: Fixed Fix Version/s: 1.0.1 1.1.0 Fixed in 580a177 (master), 2089029 (release-1.0) > Fix faulty RollingSink Restore > -- > > Key: FLINK-3651 > URL: https://issues.apache.org/jira/browse/FLINK-3651 > Project: Flink > Issue Type: Bug > Components: Streaming > Reporter: Aljoscha Krettek > Assignee: Aljoscha Krettek > Fix For: 1.1.0, 1.0.1 > > > The RollingSink restore logic has a bug where the sink for subtask index 1 > also removes files for subtask index 11 because the regex that checks for the > file name also matches that one. Adding the suffix to the regex should solve > the problem because then the regex for 1 will only match files for subtask > index 1.
[GitHub] flink pull request: [FLINK-3651] Fix faulty RollingSink Restore
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1830
[jira] [Commented] (FLINK-3651) Fix faulty RollingSink Restore
[ https://issues.apache.org/jira/browse/FLINK-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216351#comment-15216351 ] ASF GitHub Bot commented on FLINK-3651: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/1830#issuecomment-203004909 I'm going to merge this to `master` and `release-1.0`. > Fix faulty RollingSink Restore > -- > > Key: FLINK-3651 > URL: https://issues.apache.org/jira/browse/FLINK-3651 > Project: Flink > Issue Type: Bug > Components: Streaming > Reporter: Aljoscha Krettek > Assignee: Aljoscha Krettek > > The RollingSink restore logic has a bug where the sink for subtask index 1 > also removes files for subtask index 11 because the regex that checks for the > file name also matches that one. Adding the suffix to the regex should solve > the problem because then the regex for 1 will only match files for subtask > index 1.
[GitHub] flink pull request: [FLINK-3651] Fix faulty RollingSink Restore
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1830#issuecomment-203004909 I'm going to merge this to `master` and `release-1.0`.
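The matching problem described in the FLINK-3651 thread can be reproduced with a toy pattern; the file-name format and the patterns below are illustrative, not the actual RollingSink code:

```java
import java.util.regex.Pattern;

// Demonstrates why a pattern for subtask index 1 can claim subtask 11's files,
// and how anchoring the part-counter suffix fixes it.
public class SubtaskRegexDemo {

    // Buggy: "part-1" is a prefix of "part-11-...", so a prefix match
    // (Matcher.lookingAt) accepts subtask 11's files for subtask 1.
    static boolean buggyMatch(String fileName, int subtaskIndex) {
        return Pattern.compile("part-" + subtaskIndex)
                      .matcher(fileName).lookingAt();
    }

    // Fixed: require the part-counter suffix right after the subtask index
    // and match the whole name.
    static boolean fixedMatch(String fileName, int subtaskIndex) {
        return Pattern.compile("part-" + subtaskIndex + "-\\d+")
                      .matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(buggyMatch("part-11-0", 1)); // true  (the bug)
        System.out.println(fixedMatch("part-11-0", 1)); // false (correct)
        System.out.println(fixedMatch("part-1-0", 1));  // true
    }
}
```

This is the essence of the fix: once the suffix is part of the pattern, "part-1" can no longer be mistaken for a prefix of "part-11".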
[jira] [Created] (FLINK-3678) Make Flink logs directory configurable
Stefano Baghino created FLINK-3678: -- Summary: Make Flink logs directory configurable Key: FLINK-3678 URL: https://issues.apache.org/jira/browse/FLINK-3678 Project: Flink Issue Type: Improvement Components: Start-Stop Scripts Affects Versions: 1.0.0 Reporter: Stefano Baghino Assignee: Stefano Baghino Priority: Minor Fix For: 1.0.1 Currently Flink logs are stored under {{$FLINK_HOME/log}} and the user cannot configure an alternative storage location. It would be nice to add a configuration key in the {{flink-conf.yaml}} and edit the {{bin/flink}} launch script accordingly to get the value if present, or default to the current behavior if no value is provided.
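The behavior requested here can be sketched in shell; the key name `env.log.dir` and the paths below are assumptions for illustration, not necessarily what the patch uses:

```shell
# Sketch of the lookup bin/config.sh could perform: read a log-directory key
# from flink-conf.yaml, falling back to $FLINK_HOME/log when it is absent.
FLINK_HOME="/opt/flink"
CONF_FILE="$(mktemp)"

# sample flink-conf.yaml that sets the (assumed) key
printf 'jobmanager.rpc.address: localhost\nenv.log.dir: /var/log/flink\n' > "$CONF_FILE"

# read a "key: value" entry; prints nothing if the key is absent
read_key() { sed -n "s/^$1:[ ]*//p" "$CONF_FILE"; }

FLINK_LOG_DIR="$(read_key 'env\.log\.dir')"
# default to the current behavior ($FLINK_HOME/log) when no value is provided
FLINK_LOG_DIR="${FLINK_LOG_DIR:-$FLINK_HOME/log}"
echo "$FLINK_LOG_DIR"            # /var/log/flink

# with the key absent, the default applies
: > "$CONF_FILE"
FLINK_LOG_DIR_DEFAULT="$(read_key 'env\.log\.dir')"
FLINK_LOG_DIR_DEFAULT="${FLINK_LOG_DIR_DEFAULT:-$FLINK_HOME/log}"
echo "$FLINK_LOG_DIR_DEFAULT"    # /opt/flink/log

rm -f "$CONF_FILE"
```

The `${VAR:-default}` expansion is what keeps the change backward compatible: existing setups without the key keep logging under `$FLINK_HOME/log`.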
[jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns
[ https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216289#comment-15216289 ] Maximilian Michels commented on FLINK-3677: --- Linked FLINK-3655, which is related to this issue. > FileInputFormat: Allow to specify include/exclude file name patterns > > > Key: FLINK-3677 > URL: https://issues.apache.org/jira/browse/FLINK-3677 > Project: Flink > Issue Type: Improvement > Components: Core > Affects Versions: 1.0.0 > Reporter: Maximilian Michels > Priority: Minor > Labels: starter > > It would be nice to be able to specify a regular expression to filter files.
[jira] [Created] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns
Maximilian Michels created FLINK-3677: - Summary: FileInputFormat: Allow to specify include/exclude file name patterns Key: FLINK-3677 URL: https://issues.apache.org/jira/browse/FLINK-3677 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.0.0 Reporter: Maximilian Michels Priority: Minor It would be nice to be able to specify a regular expression to filter files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
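The include/exclude filtering requested above can be sketched with plain `java.util.regex`. The class below is purely illustrative — it is not part of the actual `FileInputFormat` API, and the names, constructor, and semantics (check the include pattern first, then the exclude pattern) are assumptions:

```java
import java.util.regex.Pattern;

// Hypothetical sketch of include/exclude file name filtering.
// A null pattern means "no restriction" on that side.
public class FileNameFilter {
    private final Pattern include; // null: include everything
    private final Pattern exclude; // null: exclude nothing

    public FileNameFilter(String includeRegex, String excludeRegex) {
        this.include = includeRegex == null ? null : Pattern.compile(includeRegex);
        this.exclude = excludeRegex == null ? null : Pattern.compile(excludeRegex);
    }

    public boolean accept(String fileName) {
        // Must match the include pattern (when one is set) ...
        if (include != null && !include.matcher(fileName).matches()) {
            return false;
        }
        // ... and must not match the exclude pattern (when one is set).
        return exclude == null || !exclude.matcher(fileName).matches();
    }
}
```

For example, `new FileNameFilter(".*\\.csv", ".*_tmp.*")` would accept `data.csv` but reject `data_tmp.csv` and any non-CSV file.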
[jira] [Commented] (FLINK-3667) Generalize client<->cluster communication
[ https://issues.apache.org/jira/browse/FLINK-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216222#comment-15216222 ] Maximilian Michels commented on FLINK-3667: --- +1 Makes sense to convert the {{Client}} class to a base class. > Generalize client<->cluster communication > - > > Key: FLINK-3667 > URL: https://issues.apache.org/jira/browse/FLINK-3667 > Project: Flink > Issue Type: Improvement > Components: YARN Client >Reporter: Maximilian Michels >Assignee: Maximilian Michels > > Here are some notes I took when inspecting the client<->cluster classes with > regard to future integration of other resource management frameworks in > addition to Yarn (e.g. Mesos). > {noformat} > 1 Cluster Client Abstraction > > 1.1 Status Quo > ── > 1.1.1 FlinkYarnClient > ╌ > • Holds the cluster configuration (Flink-specific and Yarn-specific) > • Contains the deploy() method to deploy the cluster > • Creates the Hadoop Yarn client > • Receives the initial job manager address > • Bootstraps the FlinkYarnCluster > 1.1.2 FlinkYarnCluster > ╌╌ > • Wrapper around the Hadoop Yarn client > • Queries cluster for status updates > • Life time methods to start and shutdown the cluster > • Flink specific features like shutdown after job completion > 1.1.3 ApplicationClient > ╌╌╌ > • Acts as a middle-man for asynchronous cluster communication > • Designed to communicate with Yarn, not used in Standalone mode > 1.1.4 CliFrontend > ╌ > • Deeply integrated with FlinkYarnClient and FlinkYarnCluster > • Constantly distinguishes between Yarn and Standalone mode > • Would be nice to have a general abstraction in place > 1.1.5 Client > > • Job submission and Job related actions, agnostic of resource framework > 1.2 Proposal > > 1.2.1 ClusterConfig (before: AbstractFlinkYarnClient) > ╌ > • Extensible cluster-agnostic config > • May be extended by specific cluster, e.g. 
YarnClusterConfig > 1.2.2 ClusterClient (before: AbstractFlinkYarnClient) > ╌ > • Deals with cluster (RM) specific communication > • Exposes framework agnostic information > • YarnClusterClient, MesosClusterClient, StandaloneClusterClient > 1.2.3 FlinkCluster (before: AbstractFlinkYarnCluster) > ╌ > • Basic interface to communicate with a running cluster > • Receives the ClusterClient for cluster-specific communication > • Should not have to care about the specific implementations of the > client > 1.2.4 ApplicationClient > ╌╌╌ > • Can be changed to work cluster-agnostic (first steps already in > FLINK-3543) > 1.2.5 CliFrontend > ╌ > • CliFrontend never has to differentiate between different > cluster types after it has determined which cluster class to load. > • Base class handles framework agnostic command line arguments > • Pluggables for Yarn, Mesos handle specific commands > {noformat} > I would like to create/refactor the affected classes to set us up for a more > flexible client side resource management abstraction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3524] [kafka] Add JSONDeserializationSc...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1834#issuecomment-202964440 Looks good. Is this a Kafka-only util?
[jira] [Commented] (FLINK-3524) Provide a JSONDeserialisationSchema in the kafka connector package
[ https://issues.apache.org/jira/browse/FLINK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216193#comment-15216193 ] ASF GitHub Bot commented on FLINK-3524: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1834#issuecomment-202964440 Looks good. Is this a Kafka-only util? > Provide a JSONDeserialisationSchema in the kafka connector package > -- > > Key: FLINK-3524 > URL: https://issues.apache.org/jira/browse/FLINK-3524 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Reporter: Robert Metzger >Assignee: Chesnay Schepler > Labels: starter > > (I don't want to include this into 1.0.0) > Currently, there is no standardized way of parsing JSON data from a Kafka > stream. I see a lot of users using JSON in their topics. It would make things > easier for our users to provide a serializer for them. > I suggest to use the jackson library because we have that already as a > dependency in Flink and it allows to parse from a byte[]. > I would suggest to provide the following classes: > - JSONDeserializationSchema() > - JSONDeKeyValueSerializationSchema(bool includeMetadata) > The second variant should produce a record like this: > {code} > {"key": "keydata", > "value": "valuedata", > "metadata": {"offset": 123, "topic": "", "partition": 2 } } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-3676) WebClient hasn't been removed from the docs
[ https://issues.apache.org/jira/browse/FLINK-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels closed FLINK-3676. - Resolution: Fixed master: 875cb448d7407687402eec15c77c24683c2d5c56 release-1.0: 875cb448d7407687402eec15c77c24683c2d5c56 > WebClient hasn't been removed from the docs > --- > > Key: FLINK-3676 > URL: https://issues.apache.org/jira/browse/FLINK-3676 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.0.0, 1.1.0 >Reporter: Maximilian Michels >Assignee: Maximilian Michels > Fix For: 1.1.0, 1.0.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3676) WebClient hasn't been removed from the docs
Maximilian Michels created FLINK-3676: - Summary: WebClient hasn't been removed from the docs Key: FLINK-3676 URL: https://issues.apache.org/jira/browse/FLINK-3676 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.0.0, 1.1.0 Reporter: Maximilian Michels Assignee: Maximilian Michels Fix For: 1.1.0, 1.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3675) YARN ship folder inconsistent behavior
Stefano Baghino created FLINK-3675: -- Summary: YARN ship folder inconsistent behavior Key: FLINK-3675 URL: https://issues.apache.org/jira/browse/FLINK-3675 Project: Flink Issue Type: Bug Components: YARN Client Affects Versions: 1.0.0 Reporter: Stefano Baghino After [some discussion on the user mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-YARN-ship-folder-td5458.html] it came up that the {{flink/lib}} folder is always supposed to be shipped to the YARN cluster so that all the nodes have access to its contents. Currently however, the Flink long-running YARN session actually ships the folder because it's explicitly specified in the {{yarn-session.sh}} script, while running a single job on YARN does not automatically ship it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3477) Add hash-based combine strategy for ReduceFunction
[ https://issues.apache.org/jira/browse/FLINK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215928#comment-15215928 ] ASF GitHub Bot commented on FLINK-3477: --- Github user aalexandrov commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-202868272 Overall it seems that the hash-based combiner works better than the sort-based one for (a) uniform, or normal key distribution, and (b) fixed-length records. For skewed key distribution (like Zipf) the two strategies are practically equal, and for variable-length records the extra effort in compacting the record offsets the advantages of the hash-based aggregation approach. > Add hash-based combine strategy for ReduceFunction > -- > > Key: FLINK-3477 > URL: https://issues.apache.org/jira/browse/FLINK-3477 > Project: Flink > Issue Type: Sub-task > Components: Local Runtime >Reporter: Fabian Hueske > > This issue is about adding a hash-based combine strategy for ReduceFunctions. > The interface of the {{reduce()}} method is as follows: > {code} > public T reduce(T v1, T v2) > {code} > Input type and output type are identical and the function returns only a > single value. A Reduce function is incrementally applied to compute a final > aggregated value. This allows to hold the preaggregated value in a hash-table > and update it with each function call. > The hash-based strategy requires special implementation of an in-memory hash > table. The hash table should support in place updates of elements (if the > updated value has the same size as the new value) but also appending updates > with invalidation of the old value (if the binary length of the new value > differs). The hash table needs to be able to evict and emit all elements if > it runs out-of-memory. > We should also add {{HASH}} and {{SORT}} compiler hints to > {{DataSet.reduce()}} and {{Grouping.reduce()}} to allow users to pick the > execution strategy. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
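The comparison above benchmarks a hash-based combiner against the sort-based one. The property that makes the hash strategy possible — `reduce(T, T)` has identical input and output types, so the per-key partial aggregate can be folded in place — can be sketched with a plain `HashMap`. Flink's real implementation uses a custom in-memory hash table over managed memory with eviction; everything below is an illustrative stand-in, not the actual runtime code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Illustrative sketch of a hash-based combiner: one partial aggregate
// per key, updated in place on every incoming record.
public class HashCombiner<K, T> {
    private final Map<K, T> table = new HashMap<>();
    private final BinaryOperator<T> reducer; // stands in for ReduceFunction.reduce

    public HashCombiner(BinaryOperator<T> reducer) {
        this.reducer = reducer;
    }

    public void collect(K key, T value) {
        // merge() applies reduce(oldAggregate, newValue) when the key exists
        table.merge(key, value, reducer);
    }

    public Map<K, T> emit() {
        // Flink would emit and clear when the table runs out of memory;
        // here we simply hand the aggregates back.
        return table;
    }
}
```

A sort-based combiner must instead buffer and sort records by key before applying the reduce function, which is where the fixed-length-record advantage measured above comes from.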
[GitHub] flink pull request: [FLINK-3477] [runtime] Add hash-based combine ...
Github user aalexandrov commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-202868272 Overall it seems that the hash-based combiner works better than the sort-based one for (a) uniform, or normal key distribution, and (b) fixed-length records. For skewed key distribution (like Zipf) the two strategies are practically equal, and for variable-length records the extra effort in compacting the record offsets the advantages of the hash-based aggregation approach.
[jira] [Commented] (FLINK-3667) Generalize client<->cluster communication
[ https://issues.apache.org/jira/browse/FLINK-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215927#comment-15215927 ] Stephan Ewen commented on FLINK-3667: - Sounds good! Here are some thoughts I had when reading the proposal: I am wondering, if we get too many clients in the end (JobClient, Client, ClusterClient), and whether we should make your proposed {{ClusterClient}} and the {{Client}} class one thing. {code} AbstractClient (common functionality, like the JobManager communication, submitting jobs once everything runs) | +-- StandaloneClient (like current Client) +-- YarnClient +-- MesosClient {code} Similarly, one would have {{FlinkCluster}} (abstract superclass), and {{StandaloneCluster}}, {{YarnCluster}}, {{MesosCluster}} that would be created from the configs, and would have a {{getClient()}} call that returns the above clients. > Generalize client<->cluster communication > - > > Key: FLINK-3667 > URL: https://issues.apache.org/jira/browse/FLINK-3667 > Project: Flink > Issue Type: Improvement > Components: YARN Client >Reporter: Maximilian Michels >Assignee: Maximilian Michels > > Here are some notes I took when inspecting the client<->cluster classes with > regard to future integration of other resource management frameworks in > addition to Yarn (e.g. Mesos). 
> {noformat} > 1 Cluster Client Abstraction > > 1.1 Status Quo > ── > 1.1.1 FlinkYarnClient > ╌ > • Holds the cluster configuration (Flink-specific and Yarn-specific) > • Contains the deploy() method to deploy the cluster > • Creates the Hadoop Yarn client > • Receives the initial job manager address > • Bootstraps the FlinkYarnCluster > 1.1.2 FlinkYarnCluster > ╌╌ > • Wrapper around the Hadoop Yarn client > • Queries cluster for status updates > • Life time methods to start and shutdown the cluster > • Flink specific features like shutdown after job completion > 1.1.3 ApplicationClient > ╌╌╌ > • Acts as a middle-man for asynchronous cluster communication > • Designed to communicate with Yarn, not used in Standalone mode > 1.1.4 CliFrontend > ╌ > • Deeply integrated with FlinkYarnClient and FlinkYarnCluster > • Constantly distinguishes between Yarn and Standalone mode > • Would be nice to have a general abstraction in place > 1.1.5 Client > > • Job submission and Job related actions, agnostic of resource framework > 1.2 Proposal > > 1.2.1 ClusterConfig (before: AbstractFlinkYarnClient) > ╌ > • Extensible cluster-agnostic config > • May be extended by specific cluster, e.g. YarnClusterConfig > 1.2.2 ClusterClient (before: AbstractFlinkYarnClient) > ╌ > • Deals with cluster (RM) specific communication > • Exposes framework agnostic information > • YarnClusterClient, MesosClusterClient, StandaloneClusterClient > 1.2.3 FlinkCluster (before: AbstractFlinkYarnCluster) > ╌ > • Basic interface to communicate with a running cluster > • Receives the ClusterClient for cluster-specific communication > • Should not have to care about the specific implementations of the > client > 1.2.4 ApplicationClient > ╌╌╌ > • Can be changed to work cluster-agnostic (first steps already in > FLINK-3543) > 1.2.5 CliFrontend > ╌ > • CliFrontend never has to differentiate > between different > cluster types after it has determined which cluster class to load. 
> • Base class handles framework agnostic command line arguments > • Pluggables for Yarn, Mesos handle specific commands > {noformat} > I would like to create/refactor the affected classes to set us up for a more > flexible client side resource management abstraction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
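The hierarchy Stephan sketches in his comment can be written out as plain Java. Only the class names (`AbstractClient`, `StandaloneClient`, `YarnClient`, `FlinkCluster`) and the `getClient()` idea come from the discussion — the method bodies and signatures below are invented purely for illustration:

```java
// Sketch of the proposed layout: one abstract client with
// framework-specific subclasses, and a cluster abstraction that hands
// out the matching client. A MesosClient/MesosCluster pair would
// follow the same pattern.
abstract class AbstractClient {
    // common functionality would live here: JobManager communication,
    // submitting jobs once everything runs
    abstract String submitJob(String jobName);
}

class StandaloneClient extends AbstractClient {
    @Override
    String submitJob(String jobName) { return "standalone:" + jobName; }
}

class YarnClient extends AbstractClient {
    @Override
    String submitJob(String jobName) { return "yarn:" + jobName; }
}

abstract class FlinkCluster {
    // each concrete cluster knows which client can talk to it
    abstract AbstractClient getClient();
}

class YarnCluster extends FlinkCluster {
    @Override
    AbstractClient getClient() { return new YarnClient(); }
}
```

The point of the design is that callers like `CliFrontend` only ever see `FlinkCluster` and `AbstractClient`, so no Yarn/Standalone branching survives past cluster construction.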
[jira] [Commented] (FLINK-3477) Add hash-based combine strategy for ReduceFunction
[ https://issues.apache.org/jira/browse/FLINK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215916#comment-15215916 ] ASF GitHub Bot commented on FLINK-3477: --- Github user aalexandrov commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-202864630 @fhueske We have used the Easter break to conduct the experiments. A preliminary writeup is in the Google Doc. @ggevay will provide the results analysis later today. Cheers! > Add hash-based combine strategy for ReduceFunction > -- > > Key: FLINK-3477 > URL: https://issues.apache.org/jira/browse/FLINK-3477 > Project: Flink > Issue Type: Sub-task > Components: Local Runtime >Reporter: Fabian Hueske > > This issue is about adding a hash-based combine strategy for ReduceFunctions. > The interface of the {{reduce()}} method is as follows: > {code} > public T reduce(T v1, T v2) > {code} > Input type and output type are identical and the function returns only a > single value. A Reduce function is incrementally applied to compute a final > aggregated value. This allows to hold the preaggregated value in a hash-table > and update it with each function call. > The hash-based strategy requires special implementation of an in-memory hash > table. The hash table should support in place updates of elements (if the > updated value has the same size as the new value) but also appending updates > with invalidation of the old value (if the binary length of the new value > differs). The hash table needs to be able to evict and emit all elements if > it runs out-of-memory. > We should also add {{HASH}} and {{SORT}} compiler hints to > {{DataSet.reduce()}} and {{Grouping.reduce()}} to allow users to pick the > execution strategy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3477] [runtime] Add hash-based combine ...
Github user aalexandrov commented on the pull request: https://github.com/apache/flink/pull/1517#issuecomment-202864630 @fhueske We have used the Easter break to conduct the experiments. A preliminary writeup is in the Google Doc. @ggevay will provide the results analysis later today. Cheers!
[jira] [Resolved] (FLINK-3547) Add support for streaming projection, selection, and union
[ https://issues.apache.org/jira/browse/FLINK-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasia Kalavri resolved FLINK-3547. -- Resolution: Implemented Fix Version/s: 1.1.0 > Add support for streaming projection, selection, and union > -- > > Key: FLINK-3547 > URL: https://issues.apache.org/jira/browse/FLINK-3547 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > Fix For: 1.1.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3547) Add support for streaming projection, selection, and union
[ https://issues.apache.org/jira/browse/FLINK-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215848#comment-15215848 ] ASF GitHub Bot commented on FLINK-3547: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1820 > Add support for streaming projection, selection, and union > -- > > Key: FLINK-3547 > URL: https://issues.apache.org/jira/browse/FLINK-3547 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3547] add support for streaming filter,...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1820
[jira] [Closed] (FLINK-3545) ResourceManager: YARN integration
[ https://issues.apache.org/jira/browse/FLINK-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels closed FLINK-3545. - Resolution: Implemented Implemented in 4405235e5483d3e4ad94f4ba31627aa852580042. > ResourceManager: YARN integration > - > > Key: FLINK-3545 > URL: https://issues.apache.org/jira/browse/FLINK-3545 > Project: Flink > Issue Type: Sub-task > Components: ResourceManager, YARN Client >Affects Versions: 1.1.0 >Reporter: Maximilian Michels >Assignee: Maximilian Michels > Fix For: 1.1.0 > > > This integrates YARN support with the ResourceManager abstraction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-3544) ResourceManager runtime components
[ https://issues.apache.org/jira/browse/FLINK-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels closed FLINK-3544. - Resolution: Implemented Implemented in 92ff2b152cac3ad6a53373c0c022579306051133. > ResourceManager runtime components > -- > > Key: FLINK-3544 > URL: https://issues.apache.org/jira/browse/FLINK-3544 > Project: Flink > Issue Type: Sub-task > Components: ResourceManager >Affects Versions: 1.1.0 >Reporter: Maximilian Michels >Assignee: Maximilian Michels > Fix For: 1.1.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3544) ResourceManager runtime components
[ https://issues.apache.org/jira/browse/FLINK-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215832#comment-15215832 ] ASF GitHub Bot commented on FLINK-3544: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1741 > ResourceManager runtime components > -- > > Key: FLINK-3544 > URL: https://issues.apache.org/jira/browse/FLINK-3544 > Project: Flink > Issue Type: Sub-task > Components: ResourceManager >Affects Versions: 1.1.0 >Reporter: Maximilian Michels >Assignee: Maximilian Michels > Fix For: 1.1.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-3544] Introduce ResourceManager compone...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1741
[jira] [Created] (FLINK-3674) Add an interface for EventTime aware User Function
Stephan Ewen created FLINK-3674: --- Summary: Add an interface for EventTime aware User Function Key: FLINK-3674 URL: https://issues.apache.org/jira/browse/FLINK-3674 Project: Flink Issue Type: New Feature Components: Streaming Affects Versions: 1.0.0 Reporter: Stephan Ewen Fix For: 1.1.0 I suggest to add an interface that UDFs can implement, which will let them be notified upon watermark updates. Example usage: {code} public interface EventTimeFunction { void onWatermark(Watermark watermark); } public class MyMapper implements MapFunction<String, String>, EventTimeFunction { private long currentEventTime = Long.MIN_VALUE; public String map(String value) { return value + " @ " + currentEventTime; } public void onWatermark(Watermark watermark) { currentEventTime = watermark.getTimestamp(); } } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-3673) Annotations for code generation
Gabor Horvath created FLINK-3673: Summary: Annotations for code generation Key: FLINK-3673 URL: https://issues.apache.org/jira/browse/FLINK-3673 Project: Flink Issue Type: Sub-task Components: Type Serialization System Reporter: Gabor Horvath Assignee: Gabor Horvath Annotations should be utilized to generate more efficient serialization code. The very same annotations can be used to make the getLength method much smarter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3257) Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
[ https://issues.apache.org/jira/browse/FLINK-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215789#comment-15215789 ] ASF GitHub Bot commented on FLINK-3257: --- Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/1668#issuecomment-202818056 Thanks @StephanEwen and @uce for looking into it! I really appreciate it. How about the following: 1. I update this PR with the patch that uses ListState and apply some nice refactorings Gyula made 2. I will also address all your comments and then merge this to master 3. We start working on perfecting stream finalization on loops and backpressure deadlock elimination in separate PRs right away. These are different problems and we need to address them separately, in my view of course. > Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs > --- > > Key: FLINK-3257 > URL: https://issues.apache.org/jira/browse/FLINK-3257 > Project: Flink > Issue Type: Improvement >Reporter: Paris Carbone >Assignee: Paris Carbone > > The current snapshotting algorithm cannot support cycles in the execution > graph. An alternative scheme can potentially include records in-transit > through the back-edges of a cyclic execution graph (ABS [1]) to achieve the > same guarantees. > One straightforward implementation of ABS for cyclic graphs can work along > the following lines: > 1) Upon triggering a barrier in an IterationHead from the TaskManager, > block output and start upstream backup of all records forwarded from the > respective IterationSink. > 2) The IterationSink should eventually forward the current snapshotting epoch > barrier to the IterationSource. > 3) Upon receiving a barrier from the IterationSink, the IterationSource > should finalize the snapshot, unblock its output and emit all records > in-transit in FIFO order and continue the usual execution. 
> -- > Upon restart the IterationSource should emit all records from the injected > snapshot first and then continue its usual execution. > Several optimisations and slight variations can be potentially achieved but > this can be the initial implementation take. > [1] http://arxiv.org/abs/1506.08603 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
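The three numbered steps in the ticket description can be sketched as a small state machine for the iteration head: log back-edge records while a checkpoint is in progress, then snapshot the log and flush it in FIFO order when the barrier comes back around the loop. All class and method names below are illustrative, not the actual Flink runtime API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of the loop-snapshotting protocol described above.
public class LoopSnapshot {
    private final Queue<String> backEdgeLog = new ArrayDeque<>();
    private boolean checkpointInProgress = false;
    private List<String> lastSnapshot = new ArrayList<>();

    public void onCheckpointBarrier() {               // step 1: start logging
        checkpointInProgress = true;
    }

    public void onBackEdgeRecord(String record, List<String> out) {
        if (checkpointInProgress) {
            backEdgeLog.add(record);                  // record is "in transit"
        } else {
            out.add(record);                          // normal forwarding
        }
    }

    public void onBarrierFromSink(List<String> out) { // step 3: finalize
        lastSnapshot = new ArrayList<>(backEdgeLog);  // records become state
        checkpointInProgress = false;
        while (!backEdgeLog.isEmpty()) {
            out.add(backEdgeLog.poll());              // flush in FIFO order
        }
    }

    public List<String> snapshot() {
        return lastSnapshot;
    }
}
```

On recovery, replaying `snapshot()` before resuming normal input corresponds to the "emit all records from the injected snapshot first" step in the description.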
[jira] [Commented] (FLINK-3579) Improve String concatenation
[ https://issues.apache.org/jira/browse/FLINK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215791#comment-15215791 ] ASF GitHub Bot commented on FLINK-3579: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1821#discussion_r57699432 --- Diff: flink-libraries/flink-table/src/test/java/org/apache/flink/api/java/table/test/StringExpressionsITCase.java --- @@ -121,6 +120,66 @@ public void testNonWorkingSubstring2() throws Exception { resultSet.collect(); } + @Test + public void testStringConcat() throws Exception { + ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + TableEnvironment tableEnv = new TableEnvironment(); + + DataSet<Tuple2<String, Integer>> ds = env.fromElements( + new Tuple2<>("ABCD", 3), + new Tuple2<>("ABCD", 2)); + + Table in = tableEnv.fromDataSet(ds, "a, b"); + + Table result = in + .select("a + b + 42"); --- End diff -- What happens if you do `"42 + a"` or even `"42 + b + a"`? > Improve String concatenation > > > Key: FLINK-3579 > URL: https://issues.apache.org/jira/browse/FLINK-3579 > Project: Flink > Issue Type: Bug > Components: Table API >Reporter: Timo Walther >Assignee: ramkrishna.s.vasudevan >Priority: Minor > > Concatenation of a String and non-String does not work properly. > e.g. {{f0 + 42}} leads to RelBuilder Exception > ExpressionParser does not like {{f0 + 42.cast(STRING)}} either. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3579 Improve String concatenation (Ram)
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1821#discussion_r57699432 --- Diff: flink-libraries/flink-table/src/test/java/org/apache/flink/api/java/table/test/StringExpressionsITCase.java --- @@ -121,6 +120,66 @@ public void testNonWorkingSubstring2() throws Exception { resultSet.collect(); } + @Test + public void testStringConcat() throws Exception { + ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment(); + TableEnvironment tableEnv = new TableEnvironment(); + + DataSet<Tuple2<String, Integer>> ds = env.fromElements( + new Tuple2<>("ABCD", 3), + new Tuple2<>("ABCD", 2)); + + Table in = tableEnv.fromDataSet(ds, "a, b"); + + Table result = in + .select("a + b + 42"); --- End diff -- What happens if you do `"42 + a"` or even `"42 + b + a"`?
[GitHub] flink pull request: [FLINK-3257] Add Exactly-Once Processing Guara...
Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/1668#issuecomment-202818056 Thanks @StephanEwen and @uce for looking into it! I really appreciate it. How about the following: 1. I update this PR with the patch that uses ListState and apply some nice refactorings Gyula made 2. I will also address all your comments and then merge this to master 3. We start working on perfecting stream finalization on loops and backpressure deadlock elimination in separate PRs right away. These are different problems and we need to address them separately, in my view of course.
[jira] [Commented] (FLINK-3579) Improve String concatenation
[ https://issues.apache.org/jira/browse/FLINK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215786#comment-15215786 ] ASF GitHub Bot commented on FLINK-3579: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1821#discussion_r57699238 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/RexNodeTranslator.scala --- @@ -134,7 +136,13 @@ object RexNodeTranslator { case Plus(left, right) => val l = toRexNode(left, relBuilder) val r = toRexNode(right, relBuilder) -relBuilder.call(SqlStdOperatorTable.PLUS, l, r) +if(SqlTypeName.STRING_TYPES.contains(l.getType.getSqlTypeName)) { --- End diff -- What if `r` is a String type and `l` would have to be casted to `String`? > Improve String concatenation > > > Key: FLINK-3579 > URL: https://issues.apache.org/jira/browse/FLINK-3579 > Project: Flink > Issue Type: Bug > Components: Table API >Reporter: Timo Walther >Assignee: ramkrishna.s.vasudevan >Priority: Minor > > Concatenation of a String and non-String does not work properly. > e.g. {{f0 + 42}} leads to RelBuilder Exception > ExpressionParser does not like {{f0 + 42.cast(STRING)}} either. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-3579 Improve String concatenation (Ram)
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/1821#discussion_r57699238 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/RexNodeTranslator.scala --- @@ -134,7 +136,13 @@ object RexNodeTranslator { case Plus(left, right) => val l = toRexNode(left, relBuilder) val r = toRexNode(right, relBuilder) -relBuilder.call(SqlStdOperatorTable.PLUS, l, r) +if(SqlTypeName.STRING_TYPES.contains(l.getType.getSqlTypeName)) { --- End diff -- What if `r` is a String type and `l` would have to be casted to `String`?
[jira] [Commented] (FLINK-3547) Add support for streaming projection, selection, and union
[ https://issues.apache.org/jira/browse/FLINK-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215784#comment-15215784 ] ASF GitHub Bot commented on FLINK-3547: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/1820#issuecomment-202816038 merging this > Add support for streaming projection, selection, and union > -- > > Key: FLINK-3547 > URL: https://issues.apache.org/jira/browse/FLINK-3547 > Project: Flink > Issue Type: Sub-task > Components: Table API >Reporter: Vasia Kalavri >Assignee: Vasia Kalavri > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3579) Improve String concatenation
[ https://issues.apache.org/jira/browse/FLINK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215781#comment-15215781 ]

ASF GitHub Bot commented on FLINK-3579:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1821#discussion_r57699015

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala ---
    @@ -116,7 +118,18 @@ object ExpressionParser extends JavaTokenParsers with PackratParsers {
           atom <~ ".cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
           atom <~ ".cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
           atom <~ ".cast(STRING)" ^^ { e => Cast(e, BasicTypeInfo.STRING_TYPE_INFO) } |
    -      atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) }
    +      atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) } |
    +      // When an integer is directly casted.
    +      atom <~ "cast(BYTE)" ^^ { e => Cast(e, BasicTypeInfo.BYTE_TYPE_INFO) } |
    +      atom <~ "cast(SHORT)" ^^ { e => Cast(e, BasicTypeInfo.SHORT_TYPE_INFO) } |
    +      atom <~ "cast(INT)" ^^ { e => Cast(e, BasicTypeInfo.INT_TYPE_INFO) } |
    +      atom <~ "cast(LONG)" ^^ { e => Cast(e, BasicTypeInfo.LONG_TYPE_INFO) } |
    +      atom <~ "cast(FLOAT)" ^^ { e => Cast(e, BasicTypeInfo.FLOAT_TYPE_INFO) } |
    +      atom <~ "cast(DOUBLE)" ^^ { e => Cast(e, BasicTypeInfo.DOUBLE_TYPE_INFO) } |
    +      atom <~ "cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
    +      atom <~ "cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
    --- End diff --

    Is it valid SQL to have both `cast(BOOL)` and `cast(BOOLEAN)`?

> Improve String concatenation
> ----------------------------
>
>                 Key: FLINK-3579
>                 URL: https://issues.apache.org/jira/browse/FLINK-3579
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Minor
>
> Concatenation of a String and a non-String does not work properly.
> e.g. {{f0 + 42}} leads to a RelBuilder exception.
> The ExpressionParser does not like {{f0 + 42.cast(STRING)}} either.
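One way to keep the BOOL/BOOLEAN aliasing manageable, rather than duplicating parser rules per spelling, is a single name-to-type table. The sketch below is purely illustrative — it is not Flink's ExpressionParser, and the type-name strings are placeholders:

```scala
// Hypothetical alias table: both spellings of boolean resolve to the
// same target type, and adding a new alias is a one-line change.
val castTargets: Map[String, String] = Map(
  "BYTE" -> "ByteType", "SHORT" -> "ShortType", "INT" -> "IntType",
  "LONG" -> "LongType", "FLOAT" -> "FloatType", "DOUBLE" -> "DoubleType",
  "BOOL" -> "BooleanType", "BOOLEAN" -> "BooleanType", // aliases
  "STRING" -> "StringType", "DATE" -> "DateType"
)

// Resolve a `cast(X)` target name, case-insensitively; unknown names
// yield None instead of a silent fallthrough.
def resolveCast(name: String): Option[String] =
  castTargets.get(name.toUpperCase)
```

A single parser rule could then capture the identifier inside `cast(...)` and look it up here, so the BOOL-vs-BOOLEAN question becomes a question about the table's contents rather than the grammar.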
[jira] [Commented] (FLINK-3579) Improve String concatenation
[ https://issues.apache.org/jira/browse/FLINK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215780#comment-15215780 ]

ASF GitHub Bot commented on FLINK-3579:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1821#discussion_r57698940

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala ---
    @@ -116,7 +118,18 @@ object ExpressionParser extends JavaTokenParsers with PackratParsers {
           atom <~ ".cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
           atom <~ ".cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
           atom <~ ".cast(STRING)" ^^ { e => Cast(e, BasicTypeInfo.STRING_TYPE_INFO) } |
    -      atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) }
    +      atom <~ ".cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) } |
    +      // When an integer is directly casted.
    +      atom <~ "cast(BYTE)" ^^ { e => Cast(e, BasicTypeInfo.BYTE_TYPE_INFO) } |
    +      atom <~ "cast(SHORT)" ^^ { e => Cast(e, BasicTypeInfo.SHORT_TYPE_INFO) } |
    +      atom <~ "cast(INT)" ^^ { e => Cast(e, BasicTypeInfo.INT_TYPE_INFO) } |
    +      atom <~ "cast(LONG)" ^^ { e => Cast(e, BasicTypeInfo.LONG_TYPE_INFO) } |
    +      atom <~ "cast(FLOAT)" ^^ { e => Cast(e, BasicTypeInfo.FLOAT_TYPE_INFO) } |
    +      atom <~ "cast(DOUBLE)" ^^ { e => Cast(e, BasicTypeInfo.DOUBLE_TYPE_INFO) } |
    +      atom <~ "cast(BOOL)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
    +      atom <~ "cast(BOOLEAN)" ^^ { e => Cast(e, BasicTypeInfo.BOOLEAN_TYPE_INFO) } |
    +      atom <~ "cast(STRING)" ^^ { e => Cast(e, BasicTypeInfo.STRING_TYPE_INFO) } |
    +      atom <~ "cast(DATE)" ^^ { e => Cast(e, BasicTypeInfo.DATE_TYPE_INFO) }
    --- End diff --

    This rule was already added at the top of the list.

> Improve String concatenation
> ----------------------------
>
>                 Key: FLINK-3579
>                 URL: https://issues.apache.org/jira/browse/FLINK-3579
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Minor
>
> Concatenation of a String and a non-String does not work properly.
> e.g. {{f0 + 42}} leads to a RelBuilder exception.
> The ExpressionParser does not like {{f0 + 42.cast(STRING)}} either.
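The "already added at the top" remark matters because `ExpressionParser` uses ordered choice (`|` in parser combinators): alternatives are tried left to right and the first success wins, so a duplicated rule further down is dead code. A minimal stand-in (not the real combinator library) makes this concrete:

```scala
// Illustrative ordered choice over simple string-equality "rules";
// the names and types are invented, not Flink's parser API.
type Rule = String => Option[String]

def rule(pattern: String, result: String): Rule =
  s => if (s == pattern) Some(result) else None

// Try alternatives left to right and take the first success -- a later
// duplicate of an earlier pattern can never fire.
def firstMatch(rules: Seq[Rule])(input: String): Option[String] =
  rules.view.flatMap(r => r(input)).headOption

val rules = Seq(
  rule("cast(DATE)", "date-first"),
  rule("cast(DATE)", "date-duplicate"), // unreachable
  rule("cast(INT)", "int")
)
```

Here `firstMatch(rules)("cast(DATE)")` always yields the first alternative's result, which is why the duplicated `cast(STRING)`/`cast(DATE)` rules in the diff are harmless but redundant.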
[jira] [Commented] (FLINK-3579) Improve String concatenation
[ https://issues.apache.org/jira/browse/FLINK-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215779#comment-15215779 ]

ASF GitHub Bot commented on FLINK-3579:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1821#discussion_r57698703

    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala ---
    @@ -59,6 +59,8 @@ object ExpressionParser extends JavaTokenParsers with PackratParsers {
             Literal(str.toInt)
           } else if (str.endsWith("f") | str.endsWith("F")) {
             Literal(str.toFloat)
    +      } else if (str.endsWith(".")) {
    --- End diff --

    Is it valid SQL standard to have a format like `1.`? What happens to `.1`?

> Improve String concatenation
> ----------------------------
>
>                 Key: FLINK-3579
>                 URL: https://issues.apache.org/jira/browse/FLINK-3579
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Minor
>
> Concatenation of a String and a non-String does not work properly.
> e.g. {{f0 + 42}} leads to a RelBuilder exception.
> The ExpressionParser does not like {{f0 + 42.cast(STRING)}} either.
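To make the reviewer's `1.` / `.1` question concrete, here is a self-contained literal classifier in the same spirit as the diff. It is a sketch, not Flink's actual parser: the function name and the decision to treat both dot forms as doubles are assumptions, and whether `1.` is standard SQL remains the open question:

```scala
// Hypothetical literal classifier: integers, `f`-suffixed floats, and
// both trailing-dot (`1.`) and leading-dot (`.1`) decimal forms.
def parseLiteral(str: String): Any =
  if (str.matches("[+-]?\\d+")) str.toInt
  else if (str.toLowerCase.endsWith("f")) str.dropRight(1).toFloat
  else if (str.endsWith("."))   (str + "0").toDouble // `1.` -> 1.0
  else if (str.startsWith(".")) ("0" + str).toDouble // `.1` -> 0.1
  else str.toDouble
```

The diff under review handles the trailing-dot branch; the leading-dot branch shows what an answer to "what happens to `.1`?" could look like if that form should also be accepted.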
[jira] [Commented] (FLINK-3651) Fix faulty RollingSink Restore
[ https://issues.apache.org/jira/browse/FLINK-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215721#comment-15215721 ]

ASF GitHub Bot commented on FLINK-3651:
---------------------------------------

Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1830#issuecomment-202794232

    OK, depending on whether it is feasible to test this separately, I would go ahead and merge it as is, or add a test and then merge. :+1:

> Fix faulty RollingSink Restore
> ------------------------------
>
>                 Key: FLINK-3651
>                 URL: https://issues.apache.org/jira/browse/FLINK-3651
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> The RollingSink restore logic has a bug: the sink for subtask index 1 also removes files for subtask index 11, because the regex that checks the file name matches both. Adding the suffix to the regex should solve the problem, since the regex for index 1 will then only match files for subtask index 1.
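The prefix-matching bug described in the issue is easy to reproduce standalone. The file names below are made up for illustration (they are not RollingSink's exact naming scheme), but the regex behavior is the point:

```scala
// Hypothetical pending files for subtask indices 1 and 11.
val files = Seq("part-1-0.pending", "part-11-0.pending")

// Buggy check: a bare prefix regex for subtask 1 also matches subtask 11.
val buggy = files.filter(_.matches("part-1.*"))

// Fixed check: requiring the delimiter and suffix after the subtask
// index disambiguates 1 from 11.
val fixed = files.filter(_.matches("part-1-\\d+\\.pending"))
```

With the anchored suffix, restore for subtask 1 only touches its own files, which is exactly the fix the issue proposes.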
[jira] [Created] (FLINK-3670) Kerberos: Improving long-running streaming jobs
Maximilian Michels created FLINK-3670:
-----------------------------------------

             Summary: Kerberos: Improving long-running streaming jobs
                 Key: FLINK-3670
                 URL: https://issues.apache.org/jira/browse/FLINK-3670
             Project: Flink
          Issue Type: Improvement
          Components: Command-line client, Local Runtime
            Reporter: Maximilian Michels

We have seen in the past that Hadoop's delegation tokens are subject to a number of subtle token-renewal bugs. In addition, they have a maximum lifetime that can be worked around but is very inconvenient for the user.

As per the [mailing list discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Kerberos-for-Streaming-amp-Kafka-td10906.html], a way to work around the maximum lifetime of DelegationTokens would be to pass the Kerberos principal and keytab upon job submission. A daemon could then periodically renew the ticket.
[jira] [Commented] (FLINK-1159) Case style anonymous functions not supported by Scala API
[ https://issues.apache.org/jira/browse/FLINK-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215685#comment-15215685 ]

ASF GitHub Bot commented on FLINK-1159:
---------------------------------------

Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1704#issuecomment-202778390

    Sounds great @stefanobaghino. I think you can push your work to this PR as well, since it is all related to the partial function support. Looking forward to having partial function support :-)

> Case style anonymous functions not supported by Scala API
> ---------------------------------------------------------
>
>                 Key: FLINK-1159
>                 URL: https://issues.apache.org/jira/browse/FLINK-1159
>             Project: Flink
>          Issue Type: Bug
>          Components: Scala API
>            Reporter: Till Rohrmann
>            Assignee: Stefano Baghino
>
> In Scala it is very common to define anonymous functions of the following form:
> {code}
> {
>   case foo: Bar => foobar(foo)
>   case _ => throw new RuntimeException()
> }
> {code}
> These case-style anonymous functions are not yet supported by the Scala API. Thus, one has to write redundant code to name the function parameter.
> What works is the following pattern, but it is not intuitive for someone coming from Scala:
> {code}
> dataset.map {
>   _ match {
>     case foo: Bar => ...
>   }
> }
> {code}
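The two styles from the issue can be shown side by side on plain Scala collections, with no Flink dependency. The `Shape` types are invented for illustration; on collections both styles already compile — the issue is about making Flink's `DataSet` API accept the first one too:

```scala
// Hypothetical data types standing in for elements of a Flink DataSet.
sealed trait Shape
case class Circle(r: Double) extends Shape
case class Square(s: Double) extends Shape

val shapes = List[Shape](Circle(1.0), Square(2.0))

// Desired style: pass a case-style anonymous function directly.
val areas = shapes.map {
  case Circle(r) => math.Pi * r * r
  case Square(s) => s * s
}

// The workaround described in the issue: name the parameter, then match.
val areas2 = shapes.map { x => x match {
  case Circle(r) => math.Pi * r * r
  case Square(s) => s * s
}}
```

Both produce the same result; the PR's goal is that users of the Flink Scala API never need the second, redundant form.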
[jira] [Commented] (FLINK-3650) Add maxBy/minBy to Scala DataSet API
[ https://issues.apache.org/jira/browse/FLINK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215684#comment-15215684 ]

Till Rohrmann commented on FLINK-3650:
--------------------------------------

Sure, you can work on this [~ram_krish]. It is actually just a matter of exposing the {{maxBy/minBy}} Java API calls in the Scala API.

> Add maxBy/minBy to Scala DataSet API
> ------------------------------------
>
>                 Key: FLINK-3650
>                 URL: https://issues.apache.org/jira/browse/FLINK-3650
>             Project: Flink
>          Issue Type: Improvement
>          Components: Java API, Scala API
>    Affects Versions: 1.1.0
>            Reporter: Till Rohrmann
>            Assignee: ramkrishna.s.vasudevan
>
> The stable Java DataSet API contains the API calls {{maxBy}} and {{minBy}}. These methods are not supported by the Scala DataSet API. They should be added in order to have a consistent API.
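The "just exposing the Java API calls" approach can be sketched with stand-in classes. `JDataSet` and `SDataSet` below are invented placeholders, not Flink's real `DataSet` types; the point is only the delegation pattern:

```scala
// Stand-in for the Java DataSet: rows of integer fields, with
// field-indexed maxBy/minBy.
class JDataSet(val rows: Seq[Seq[Int]]) {
  def maxBy(field: Int): Seq[Int] = rows.maxBy(_(field))
  def minBy(field: Int): Seq[Int] = rows.minBy(_(field))
}

// Stand-in for the Scala DataSet wrapper: the new methods simply
// forward to the wrapped Java implementation, keeping the APIs
// consistent without reimplementing the logic.
class SDataSet(j: JDataSet) {
  def maxBy(field: Int): Seq[Int] = j.maxBy(field)
  def minBy(field: Int): Seq[Int] = j.minBy(field)
}
```

Calling `maxBy(0)` on the Scala wrapper then returns the row whose first field is largest, exactly as the Java call would.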