[GitHub] flink issue #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's algorithm)
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/3192 @skonto I'm traveling right now and won't be able to push an update until Monday/Tuesday. On Feb 9, 2017 09:31, "Stavros Kontopoulos" <notificati...@github.com> wrote: > @sachingoel0101 <https://github.com/sachingoel0101> could you update the > PR so I can do a final review and request a merge? > @tillrohrmann <https://github.com/tillrohrmann> could assist with the > forwardedfields question? > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <https://github.com/apache/flink/pull/3192#issuecomment-278541212>, or mute > the thread > <https://github.com/notifications/unsubscribe-auth/AIdpFUGbVFcX21M_1AJF25G4gfabG5ioks5rao-1gaJpZM4LrFiq> > . > --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's alg...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/3192#discussion_r98649636 --- Diff: flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/clustering/KMeans.scala --- @@ -0,0 +1,263 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.ml.clustering + +import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields +import org.apache.flink.api.scala.{DataSet, _} +import org.apache.flink.ml._ +import org.apache.flink.ml.common.{LabeledVector, _} +import org.apache.flink.ml.math.Breeze._ +import org.apache.flink.ml.math.{BLAS, Vector} +import org.apache.flink.ml.metrics.distances.EuclideanDistanceMetric +import org.apache.flink.ml.pipeline._ + + +/** + * Implements the KMeans algorithm which calculates cluster centroids based on a set of training data + * points and a set of k initial centroids. + * + * [[KMeans]] is a [[Predictor]] which needs to be trained on a set of data points and can then be + * used to assign new points to the learned cluster centroids.
+ * + * The KMeans algorithm works as described on Wikipedia + * (http://en.wikipedia.org/wiki/K-means_clustering): + * + * Given an initial set of k means m1(1),…,mk(1) (see below), the algorithm proceeds by alternating + * between two steps: + * + * ===Assignment step:=== + * + * Assign each observation to the cluster whose mean yields the least within-cluster sum of + * squares (WCSS). Since the sum of squares is the squared Euclidean distance, this is intuitively + * the "nearest" mean. (Mathematically, this means partitioning the observations according to the + * Voronoi diagram generated by the means). + * + * `S_i^(t) = { x_p : || x_p - m_i^(t) ||^2 ≤ || x_p - m_j^(t) ||^2 \forall j, 1 ≤ j ≤ k}`, + * where each `x_p` is assigned to exactly one `S^{(t)}`, even if it could be assigned to two or + * more of them. + * + * ===Update step:=== + * + * Calculate the new means to be the centroids of the observations in the new clusters. + * + * `m^{(t+1)}_i = ( 1 / |S^{(t)}_i| ) \sum_{x_j \in S^{(t)}_i} x_j` + * + * Since the arithmetic mean is a least-squares estimator, this also minimizes the within-cluster + * sum of squares (WCSS) objective. + * + * @example + * {{{ + * val trainingDS: DataSet[Vector] = env.fromCollection(Clustering.trainingData) + * val initialCentroids: DataSet[LabeledVector] = env.fromCollection(Clustering.initCentroids) + * + * val kmeans = KMeans() + * .setInitialCentroids(initialCentroids) + * .setNumIterations(10) + * + * kmeans.fit(trainingDS) + * + * // getting the computed centroids + * val centroidsResult = kmeans.centroids.get.collect() + * + * // get matching clusters for new points + * val testDS: DataSet[Vector] = env.fromCollection(Clustering.testData) + * val clusters: DataSet[LabeledVector] = kmeans.predict(testDS) + * }}} + * + * =Parameters= + * + * - [[org.apache.flink.ml.clustering.KMeans.NumIterations]]: + * Defines the number of iterations to recalculate the centroids of the clusters.
As it + * is a heuristic algorithm, there is no guarantee that it will converge to the global optimum. The + * centroids of the clusters and the reassignment of the data points will be repeated until the + * given number of iterations is reached. + * (Default value: '''10''') + * + * - [[org.apache.flink.ml.clustering.KMeans.InitialCentroids]]: + * Defines the initial k centroids of the k clusters. They are used as the starting point of the + * algorithm for clustering the data set. The centroids are recalculated as often as set in + * [[org.apache.flink.ml.clustering.KMeans.NumIterations
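The alternating assignment and update steps quoted in the scaladoc above can be sketched in plain Python (an illustrative, framework-free sketch; the function names are mine, not part of the Flink API):

```python
def assign(points, centroids):
    """Assignment step: give each point the index of the nearest centroid,
    i.e. the one minimizing the squared Euclidean distance (its WCSS term)."""
    def sq_dist(p, c):
        return sum((pi - ci) ** 2 for pi, ci in zip(p, c))
    return [min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            for p in points]

def update(points, labels, old_centroids):
    """Update step: recompute each centroid as the mean of its assigned
    points; an empty cluster simply keeps its old centroid here."""
    new_centroids = []
    for i, old in enumerate(old_centroids):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:
            new_centroids.append(old)
        else:
            new_centroids.append(tuple(sum(col) / len(members)
                                       for col in zip(*members)))
    return new_centroids

def lloyd(points, centroids, iterations=10):
    """Alternate the two steps a fixed number of times, mirroring the
    NumIterations parameter (default 10) with user-supplied initial centroids."""
    labels = []
    for _ in range(iterations):
        labels = assign(points, centroids)
        centroids = update(points, labels, centroids)
    return centroids, labels
```

For example, `lloyd([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)])` settles on centroids `(0.0, 0.5)` and `(10.0, 10.5)` with labels `[0, 0, 1, 1]`.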
[GitHub] flink issue #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's algorithm)
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/3192 Just fyi, that is not an example of the usage of the machine learning library. It is just a standalone implementation of the linear regression model.
[GitHub] flink issue #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's algorithm)
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/3192 I'm not sure about adding examples under the flink-examples project. None of the other ML algorithms are there either. Also, it requires adding flink-ml as a dependency in the pom. I can, however, add a KMeans section under the ML docs.
[GitHub] flink pull request #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's alg...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/3192#discussion_r98471870 --- Diff: flink-libraries/flink-ml/src/main/scala/org/apache/flink/ml/clustering/KMeans.scala --- @@ -0,0 +1,263 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.ml.clustering + +import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields +import org.apache.flink.api.scala.{DataSet, _} +import org.apache.flink.ml._ +import org.apache.flink.ml.common.{LabeledVector, _} +import org.apache.flink.ml.math.Breeze._ +import org.apache.flink.ml.math.{BLAS, Vector} +import org.apache.flink.ml.metrics.distances.EuclideanDistanceMetric +import org.apache.flink.ml.pipeline._ + + +/** + * Implements the KMeans algorithm which calculates cluster centroids based on a set of training data + * points and a set of k initial centroids. --- End diff -- For the purpose of this PR, the centroids are assumed to be provided by the user. That automatically sets a default k. I will take this into consideration after we have set up the complete initialization architecture.
Without some good initialization schemes, it's not very useful to compare different values of k.
[GitHub] flink issue #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's algorithm)
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/3192 /cc @skonto @thvasilo
[GitHub] flink issue #757: [FLINK-2131][ml]: Initialization schemes for k-means clust...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/757 I have opened a PR https://github.com/apache/flink/pull/3192 as per the discussion here to add Lloyd's algorithm first. Please review that first and then I will rebase this work on top of that.
[GitHub] flink pull request #3192: [FLINK-1731][ml] Add KMeans clustering(Lloyd's alg...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/3192 [FLINK-1731][ml] Add KMeans clustering(Lloyd's algorithm) This is a breakoff from https://github.com/apache/flink/pull/757 to add Lloyd's algorithm first. I will follow this up with initialization schemes in the above linked PR. To address a few comments from the previous PR: We cannot use `DataSet[LabeledVector]` instead of `DataSet[Seq[LabeledVector]]` because the model here is of type `Seq[LabeledVector]` and the semantics of the pipeline require it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink kmeans Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3192.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3192 commit 598f1ea9b4a0e1daf1f151c8b69c88bf83224f71 Author: Peter Schrott <peter.schrot...@gmail.com> Date: 2015-07-29T22:44:54Z [FLINK-1731][ml]Added KMeans algorithm to ML library commit d70c46e71e152b374c9b3f23c9d0bd006bf503ff Author: Florian Goessler <m...@floriangoessler.de> Date: 2015-07-29T22:50:22Z [FLINK-1731][ml]Added unit tests for KMeans algorithm
[GitHub] flink issue #757: [FLINK-2131][ml]: Initialization schemes for k-means clust...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/757 @skonto Been a bit busy. My apologies. I was working on this again some time back and would like to split this into two PRs: one for K-means itself, another for adding initialization schemes. How does that sound? Managing everything at once is a bit of a headache because the first two commits are from two other contributors. I'll try to push a commit in the next 2-3 days.
[GitHub] flink pull request #3093: [FLINK-5444] Made Flink UI links relative.
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/3093#discussion_r95749646 --- Diff: flink-runtime-web/web-dashboard/app/scripts/modules/submit/submit.ctrl.coffee --- @@ -167,7 +167,7 @@ angular.module('flinkApp') $scope.uploader['success'] = null else $scope.uploader['success'] = "Uploaded!" - xhr.open("POST", "/jars/upload") + xhr.open("POST", "jars/upload") --- End diff -- Primarily, this makes all server-side requests consistent. Further, if you check index.coffee, the jobserver URL is set to an empty string for production and a localhost string for development purposes. This helps streamline dashboard development without having to rebuild the maven modules again and again. This particular case was disabling dev on the submit page. Best to get this in while you're fixing all the URLs.
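The effect of dropping the leading slash is easiest to see with plain URL resolution (a generic illustration, not the dashboard code; the proxy path below is made up):

```python
from urllib.parse import urljoin

# Imagine the dashboard is served behind a proxy prefix, as on YARN
# (this application path is hypothetical).
base = "http://host/proxy/application_1/index.html"

# An absolute path resolves against the host root and loses the prefix:
assert urljoin(base, "/jars/upload") == "http://host/jars/upload"

# A relative path resolves against the current page, keeping the prefix:
assert urljoin(base, "jars/upload") == "http://host/proxy/application_1/jars/upload"
```

This is why relative request paths keep working when the UI is served from a non-root location, while absolute ones break.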
[GitHub] flink pull request #3093: [FLINK-5444] Made Flink UI links relative.
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/3093#discussion_r95740581 --- Diff: flink-runtime-web/web-dashboard/app/scripts/modules/submit/submit.ctrl.coffee --- @@ -167,7 +167,7 @@ angular.module('flinkApp') $scope.uploader['success'] = null else $scope.uploader['success'] = "Uploaded!" - xhr.open("POST", "/jars/upload") + xhr.open("POST", "jars/upload") --- End diff -- Can you use `flinkConfig.jobServer + "jars/upload"` here? Otherwise, looks good to me.
[GitHub] flink issue #2941: [FLINK-3555] Web interface does not render job informatio...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/2941 IMO, the best way to achieve the change equivalent to the change to the vendor.css file would be to add a new class, say, .panel-body-flowable, in index.styl which defines the overflow rule, and add this class to the elements wherever needed.
[GitHub] flink pull request #3070: [FLINK-5119][web-frontend] Fix problems in display...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/3070 [FLINK-5119][web-frontend] Fix problems in displaying TM heartbeat and path You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink flink-5119 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3070.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3070 commit 3543946e15ed0944182aae3703364efafdd23361 Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2017-01-05T18:00:26Z [FLINK-5119][web-frontend] Fix problems in displaying TM heartbeat and path.
[GitHub] flink pull request #3069: [FLINK-5381][web-frontend] Fix scrolling issues
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/3069 [FLINK-5381][web-frontend] Fix scrolling issues This also fixes FLINK-5359 and FLINK-5267; those should also be marked resolved. While working on this, I also observed scrolling issues on practically every page with content that doesn't fit in one browser window height. Those are also fixed now. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink flink-5381 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3069.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3069 commit 741d6c32dac68f73c6754ee54895a10be9146b85 Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2017-01-05T16:55:58Z [FLINK-5381][web-frontend] Fix scrolling issues
[GitHub] flink pull request #3055: [FLINK-5382][web-frontend] Fix problems with downl...
Github user sachingoel0101 closed the pull request at: https://github.com/apache/flink/pull/3055
[GitHub] flink issue #3055: [FLINK-5382][web-frontend] Fix problems with downloading ...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/3055 Unrelated failures on travis.
[GitHub] flink pull request #3055: [FLINK-5382][web-frontend] Fix problems with downl...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/3055 [FLINK-5382][web-frontend] Fix problems with downloading TM logs on Yarn Handles the transition URL relative to the current page. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink flink-5382 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3055.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3055 commit b94fd312d307cce5998766cf8438455d4cf66475 Author: Sachin <sachingoel0...@gmail.com> Date: 2017-01-03T10:28:11Z [FLINK-5382][web-frontend] Fix problems with downloading TM logs on Yarn
[GitHub] flink issue #757: [FLINK-2131][ml]: Initialization schemes for k-means clust...
Github user sachingoel0101 commented on the issue: https://github.com/apache/flink/pull/757 I'll update based on your comments in a few days. ^^ On Oct 8, 2016 06:21, "Stavros Kontopoulos" <notificati...@github.com> wrote: > @sachingoel0101 <https://github.com/sachingoel0101> @tillrohrmann > <https://github.com/tillrohrmann> any plans for this PR? > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <https://github.com/apache/flink/pull/757#issuecomment-252364165>, or mute > the thread > <https://github.com/notifications/unsubscribe-auth/AIdpFR7VFxxqFaf_74WUxwKsOrGhOi1Eks5qxrfugaJpZM4E0H5N> > . >
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181772422 That said, just for comparison purposes, Spark has its own model export and import feature, along with PMML export. Hoping to fully support PMML import in a framework like Flink or Spark is next to impossible; it would require changes to the entire way our pipelines and datasets are represented.
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181771679 As the original author of this PR, I'd say this: I tried implementing the import features, but they aren't worth it. You have to discard most of the valid PMML models because they don't fit in with the Flink framework. Further, in my opinion, the use of Flink is to train the model. Once we export that model in PMML, you can use it pretty much anywhere, say, R or MATLAB, which support complete PMML import and export functionality. The exported model is in most cases going to be used for testing, evaluation, and prediction purposes, for which Flink isn't a good platform anyway. This can be accomplished anywhere.
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181798643 That is a good point. In a streaming setting, it does indeed make sense for the model to be available. However, in my opinion, it would then make sense to just use JPMML to import the object, followed by extracting the model parameters. Granted, it is an added effort on the user side, but I still think it beats the complexity introduced by supporting imports directly. Furthermore, it would be bad design to have to reject valid PMML models just because a minor thing isn't supported in Flink.
[GitHub] flink pull request: [FLINK-1966][ml]Add support for Predictive Mod...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1186#issuecomment-181803637 I'm all for that. Flink's models should be transferable at least across Flink. But that should be part of a separate PR, and not block this one as it has been for far too long. It should be pretty easy to accomplish.
[GitHub] flink pull request: [FLINK-2928][web-dashboard] Fix confusing job ...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1421#issuecomment-168634287 Hi @StephanEwen, apologies for the delayed response. Unfortunately, I haven't been able to work on Flink recently as I've just moved abroad. I've also just joined Samsung, so I will have to check whether I can continue contributing. Since this isn't a blocker issue, I will let you know in a few days if I'll be able to continue working on it.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-165744791 @StephanEwen can you have a look at this again? Apologies for being hasty, but I've already addressed all concerns; this should be mergeable now.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-165754941 Olrite. :)
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-164384200 @StephanEwen, @rmetzger. Ping.
[GitHub] flink pull request: [FLINK-2769] [runtime-web] Add configurable jo...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1449#issuecomment-163866658 Tested it out locally. Works well with `gulp watch` and a custom web frontend port. A few things I noticed:
1. The Job Manager's log and stdout are not accessible with CORS, since they're served via `StaticFileServerHandler`, which doesn't set CORS headers. This shouldn't be an issue, however; we need a better logs service anyway, for both the Job Manager and the Task Managers IMO.
2. At some point, we will remove the `GET /jobs//yarn-cancel` method, which is only there due to a limitation of YARN. To this end, there needs to be an `OPTIONS` handler for the `DELETE` request to send CORS response headers in the preflight phase: https://developer.mozilla.org/en-US/docs/Web/HTTP/Access_control_CORS#Preflighted_requests. Not for the foreseeable future though. :-')
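The preflight flow described in point 2 can be sketched as a pure header-building function. This is a hedged illustration, not Flink's actual Netty handler; the class and method names are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not Flink's StaticFileServerHandler or any real Flink class):
// the response headers a preflight-aware endpoint would send. Before a browser
// issues a cross-origin DELETE, it sends an OPTIONS request; the server must
// answer it with the allowed methods.
public class CorsPreflightSketch {

    static Map<String, String> corsHeaders(String method) {
        Map<String, String> headers = new LinkedHashMap<>();
        // every CORS response needs the Allow-Origin header
        headers.put("Access-Control-Allow-Origin", "*");
        if ("OPTIONS".equals(method)) {
            // preflight phase: advertise that DELETE is allowed cross-origin
            headers.put("Access-Control-Allow-Methods", "GET, DELETE");
            headers.put("Access-Control-Max-Age", "3600");
        }
        return headers;
    }

    public static void main(String[] args) {
        System.out.println(corsHeaders("OPTIONS"));
    }
}
```

Only the OPTIONS response carries the `Access-Control-Allow-Methods` header; the subsequent DELETE just needs `Access-Control-Allow-Origin`.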
[GitHub] flink pull request: [FLINK-3023][web-dashboard] Display version an...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1422#discussion_r47076123 --- Diff: flink-runtime-web/web-dashboard/app/partials/overview.jade --- @@ -22,6 +22,12 @@ nav.navbar.navbar-default.navbar-fixed-top.navbar-main .navbar-title | Overview + .navbar-info.last.first +| Version: {{overview['flink-version']}} + + .navbar-info.last.first(ng-if="overview['flink-commit']") +| Commit: {{overview['flink-commit']}} --- End diff -- It will not be displayed at all.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-163188173 Hi @StephanEwen, I have modified the signature of the `handleRequest` method to separate path and query parameters. If there are any more concerns, let me know. :)
[GitHub] flink pull request: [FLINK-3023][web-dashboard] Display version an...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1422#issuecomment-163200275 Alright. Great! #1418 should also be mergeable with this one IMO.
[GitHub] flink pull request: [FLINK-3077][cli] Add version option to Cli.
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-162796587 Ping. Also, what is the procedure for creating a release artifact locally [without GPG keys, an Apache id, etc.] so I can check that this works okay for those too?
[GitHub] flink pull request: [FLINK-2928][web-dashboard] Fix confusing job ...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1421#issuecomment-162796105 That sounds great. @iampeter thanks for the suggestion. :) I will see if I can update the PR today.
[GitHub] flink pull request: [FLINK-2379][ml]Add column wise statistics for...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1032#issuecomment-162550968 Hi @chiwanpark, thanks for picking this up. :) Since a `Vector` might contain discrete fields as well as continuous fields, we need a `FieldStats` object which can cover both types. To avoid casting from `FieldStats` to `ContinuousFieldStats` and `DiscreteFieldStats`, as an abstract `FieldStats` class would require, I supported both in a single class. What do you think would be the best solution here? As for your second point regarding `T <: Vector`, I will fix it.
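The single-class design described in the comment can be sketched as follows. This is a hedged illustration of the trade-off under discussion, not the actual flink-ml `FieldStats` API; all field names, factory methods, and accessors are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a single FieldStats class covering both field kinds,
// so callers never need to downcast to a continuous/discrete subclass.
public class FieldStats {
    public enum FieldType { CONTINUOUS, DISCRETE }

    private final FieldType type;
    // populated for continuous fields only
    private double min, max, mean;
    // populated for discrete fields only
    private Map<Object, Long> valueCounts;

    private FieldStats(FieldType type) { this.type = type; }

    // factory for continuous fields
    public static FieldStats continuous(double min, double max, double mean) {
        FieldStats s = new FieldStats(FieldType.CONTINUOUS);
        s.min = min; s.max = max; s.mean = mean;
        return s;
    }

    // factory for discrete fields
    public static FieldStats discrete(Map<Object, Long> valueCounts) {
        FieldStats s = new FieldStats(FieldType.DISCRETE);
        s.valueCounts = new HashMap<>(valueCounts);
        return s;
    }

    public FieldType getType() { return type; }

    // accessors guard against reading the wrong kind of statistic
    public double getMean() {
        if (type != FieldType.CONTINUOUS) throw new IllegalStateException("not a continuous field");
        return mean;
    }

    public Map<Object, Long> getValueCounts() {
        if (type != FieldType.DISCRETE) throw new IllegalStateException("not a discrete field");
        return valueCounts;
    }
}
```

The cost of this design is that half the fields are unused for any given instance, which is exactly the tension the comment raises against the abstract-class alternative.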
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-162463009 Travis passes. Should be good to merge. This also resolves FLINK-2524, so the commit message should reflect that too.
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-162325866 @StephanEwen can you take a look again? I have introduced the `TaskInfo` object in the second commit, and we can squash them before merging.
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-162345554 @StephanEwen thanks for the review. I have addressed all but one of your comments.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-162340095 @rmetzger thanks. :)
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1386#discussion_r46775517 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/operators/BatchTask.java --- @@ -1150,8 +1149,8 @@ else if (index < 0 || index >= this.driver.getNumberOfDriverComparators()) { * @return The string for logging. */ public static String constructLogString(String message, String taskName, AbstractInvokable parent) { - return message + ": " + taskName + " (" + (parent.getEnvironment().getIndexInSubtaskGroup() + 1) + - '/' + parent.getEnvironment().getNumberOfSubtasks() + ')'; + return message + ": " + taskName + " (" + (parent.getEnvironment().getTaskInfo().getIndexOfThisSubtask() + 1) + --- End diff -- That can lead to problems in the handling of chained tasks. For those, the name of the chained task is used, but with the index and parallelism of the parent.
[GitHub] flink pull request: [FLINK-3023][web-dashboard] Display version an...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1422#discussion_r46774718 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/ClusterOverviewHandler.java --- @@ -63,6 +67,10 @@ public String handleRequest(Map<String, String> params, ActorGateway jobManager) gen.writeNumberField("jobs-finished", overview.getNumJobsFinished()); gen.writeNumberField("jobs-cancelled", overview.getNumJobsCancelled()); gen.writeNumberField("jobs-failed", overview.getNumJobsFailed()); + gen.writeStringField("flink-version", version); + if (commitID != null) { --- End diff -- Gah. Apologies. I should've been more careful. Fixed it.
[GitHub] flink pull request: [FLINK-2928][web-dashboard] Fix confusing job ...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1421#issuecomment-162341452 Makes sense. Let me figure something out here, and I will post an update in a day or two.
[GitHub] flink pull request: [FLINK-2928][web-dashboard] Fix confusing job ...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1421#issuecomment-162207406 I'm not very sure about this. Displaying two rows doesn't look good at all. Maybe @iampeter has a better suggestion here.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r46759219 --- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java --- @@ -302,6 +302,15 @@ */ public static final String JOB_MANAGER_WEB_LOG_PATH_KEY = "jobmanager.web.log.path"; + /** +* Config parameter indicating whether jobs can be uploaded and run from the web-frontend. +*/ + public static final String JOB_MANAGER_WEB_SUBMISSION_KEY = "jobmanager.web.submit.allow"; --- End diff -- Sorry about the delay. What should I rename it to? :confused:
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r46760092 --- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java --- @@ -302,6 +302,15 @@ */ public static final String JOB_MANAGER_WEB_LOG_PATH_KEY = "jobmanager.web.log.path"; + /** +* Config parameter indicating whether jobs can be uploaded and run from the web-frontend. +*/ + public static final String JOB_MANAGER_WEB_SUBMISSION_KEY = "jobmanager.web.submit.allow"; --- End diff -- Ah. I think I missed Stephan's first point. Will change this. :smile:
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-162224088 Hi @rmetzger, can you test this on a cluster again? @StephanEwen, can we discuss your concerns about this? I would like to get this merged quickly, addressing all comments actively, as I might not be able to continue working on it after a few weeks.
[GitHub] flink pull request: [FLINK-3077][cli] Add version option to Cli.
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-161910704 I have made the required changes. @StephanEwen can you take a look again?
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-161921756 @StephanEwen you're absolutely right. It will most certainly do that. I will push both things in a single commit and file an additional JIRA for introducing the `TaskInfo` object.
[GitHub] flink pull request: [FLINK-3023][web-dashboard] Display version an...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1422#issuecomment-161914994 Addressed null commit ids. If the commit id is not available, only the version will be displayed.
[GitHub] flink pull request: [FLINK-3077][cli] Add version option to Cli.
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-161339112 Makes sense. I will address it as per Stephan's comments. I won't be able to get to it till this weekend though, as I'm on vacation. On Dec 2, 2015 8:04 PM, "Max" <notificati...@github.com> wrote: > +1 please only print if the version is not null. > > Reply to this email directly or view it on GitHub > <https://github.com/apache/flink/pull/1418#issuecomment-161317022>. >
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-161449824 Yes. I was planning to start working on that after this. Since it's preferable to fix different JIRAs in different commits, it'd be good if I can base my work on this after it is committed. What do you think? On Dec 2, 2015 6:39 PM, "Stephan Ewen" <notificati...@github.com> wrote: > This generally looks good and pretty straightforward. > As such it is actually good to merge. > > I remember that a while back, we were discussing to create a TaskInfo > object that would contain "subtaskIndex", "parallelism", "name", > "name-with-subtask", "attempt", "vertex id", "attempt id", etc... > Having such an object would allow to pass all these elements simply from > the TaskDeploymentDescriptor to the Task to the RuntimeContext. And > whenever we add another field, we need not propagate it manually through > all the function calls, but simply add it to the TaskInfo. > > What do you think? > > Reply to this email directly or view it on GitHub > <https://github.com/apache/flink/pull/1386#issuecomment-161286037>. >
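The `TaskInfo` object proposed in the quoted review could look roughly like this. The field set follows the names mentioned in the review; the class shape, constructor, and the `getTaskNameWithSubtasks` format are illustrative assumptions, not the class that was eventually committed.

```java
// Illustrative sketch of the proposed TaskInfo holder: bundling the per-task
// metadata so new fields need not be threaded through every call signature
// from the TaskDeploymentDescriptor down to the RuntimeContext.
public class TaskInfo {
    private final String taskName;
    private final int indexOfThisSubtask;
    private final int numberOfParallelSubtasks;
    private final int attemptNumber;

    public TaskInfo(String taskName, int indexOfThisSubtask,
                    int numberOfParallelSubtasks, int attemptNumber) {
        this.taskName = taskName;
        this.indexOfThisSubtask = indexOfThisSubtask;
        this.numberOfParallelSubtasks = numberOfParallelSubtasks;
        this.attemptNumber = attemptNumber;
    }

    public String getTaskName() { return taskName; }
    public int getIndexOfThisSubtask() { return indexOfThisSubtask; }
    public int getNumberOfParallelSubtasks() { return numberOfParallelSubtasks; }
    public int getAttemptNumber() { return attemptNumber; }

    // the "name-with-subtask" form from the review, e.g. "Map (2/4)"
    public String getTaskNameWithSubtasks() {
        return taskName + " (" + (indexOfThisSubtask + 1) + "/" + numberOfParallelSubtasks + ")";
    }
}
```

Adding a new field (say, a vertex id) then touches only this class plus its construction site, instead of every method in the call chain.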
[GitHub] flink pull request: [FLINK-2928][web-dashboard] Fix confusing job ...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1421#issuecomment-160743609 The presence of the job id sometimes caused the cancel button, which is very important, to disappear due to lack of space. I considered a two-row approach, but that looked ugly.
[GitHub] flink pull request: [FLINK-3077][cli] Add version option to Cli.
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-160744724 Do I need to address this? Perhaps we can check if the commit id is null and not print it in such a case?
[GitHub] flink pull request: [FLINK-2399] Version checks for Job Manager an...
Github user sachingoel0101 closed the pull request at: https://github.com/apache/flink/pull/945
[GitHub] flink pull request: [FLINK-2399] Version checks for Job Manager an...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/945#issuecomment-160431771 Un-assigning myself from the issue for now.
[GitHub] flink pull request: [FLINK-3077] Add functions to access Flink Ver...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1418#discussion_r46093211 --- Diff: flink-core/src/main/java/org/apache/flink/util/VersionUtils.java --- @@ -0,0 +1,73 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.util; + +import java.io.IOException; +import java.net.URL; +import java.util.Properties; +import java.util.jar.Attributes; +import java.util.jar.Manifest; + +/** + * Utility class which provides various methods for accessing version information. + */ +public final class VersionUtils { + + private static final VersionUtils INSTANCE = new VersionUtils(); + + /** +* Private constructor used to overwrite public one. +*/ + private VersionUtils() {} + + /** +* Returns the version of Flink. +*/ + public static String getFlinkVersion() { + // a version can only be provided when running from a Jar. 
+ URL manifestUrl = INSTANCE.getClass().getClassLoader().getResource("META-INF/MANIFEST.MF"); + if (manifestUrl != null) { + try { + Attributes attr = new Manifest(manifestUrl.openStream()).getMainAttributes(); + return attr.getValue("Implementation-Version"); + } catch (IOException e) { + // + } + } + return null; + } + + /** +* Returns the commit id of the source from which the flink jar is built. +*/ + public static String getCommitId() { --- End diff -- Ah. I was entirely unaware of this class. There's also a method already for accessing the revision id, albeit abbreviated. Is the short rev id okay or should I just place a full rev id string in the same class?
[GitHub] flink pull request: [FLINK-3023][web-dashboard] Display version an...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/1422 [FLINK-3023][web-dashboard] Display version and commit information on Overview Page. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink 3023-web-version Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1422.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1422 commit ccd05eacc4fe78bf199a7195ded15ba75e0951a5 Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2015-11-28T09:26:16Z [FLINK-3023][web-dashboard] Display version and commit information on Overview Page.
[GitHub] flink pull request: [FLINK-3077][cli] Add version option to Cli.
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-160422689 @rmetzger I have split this PR to separate the web dashboard specific work into another PR, #1422.
[GitHub] flink pull request: [FLINK-2030][FLINK-2274][core][utils]Histogram...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/861#issuecomment-160428072 Ping!
[GitHub] flink pull request: [FLINK-3077][cli] Add version option to Cli.
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/1418 [FLINK-3077][cli] Add version option to Cli. Adds a version option (-v, --version) to the Cli, which prints the implementation version and the git commit for the build. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink 3077-cli-version Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1418.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1418 commit aed855c905d714858bff118e2c1120f34a4aa157 Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2015-11-28T09:26:16Z [FLINK-3077][cli] Add version option to Cli. This commit adds a version option (-v, --version) to the Cli, which prints the implementation version and the git commit for the build.
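The behavior this PR describes can be sketched as follows. This is a hedged illustration using plain string checks; the real CLI uses its own option-parsing infrastructure, and the output format and method names here are invented for the example.

```java
// Illustrative sketch of a -v/--version flag: recognize the flag and build
// the line it would print, omitting the commit id when it is unavailable
// (e.g. when not running from a release jar).
public class VersionOption {

    static boolean isVersionFlag(String arg) {
        return "-v".equals(arg) || "--version".equals(arg);
    }

    static String versionLine(String version, String commitId) {
        return commitId == null
            ? "Version: " + version
            : "Version: " + version + ", Commit: " + commitId;
    }

    public static void main(String[] args) {
        if (args.length > 0 && isVersionFlag(args[0])) {
            // the version string would come from the build's manifest/properties
            System.out.println(versionLine("1.0-SNAPSHOT", null));
        }
    }
}
```

The null-commit-id branch mirrors the review feedback on this PR: only print revision information when it is actually available.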
[GitHub] flink pull request: [FLINK-3056][web-dashboard] Represent bytes in...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/1419 [FLINK-3056][web-dashboard] Represent bytes in more readable form. Bytes are now displayed in the following fashion: 1. For [0, 1000) units, display three significant digits. 2. For [1000,1024) units, display 2 decimal points for the next higher unit. For example, 1010 KB is displayed as 0.99 MB, 10 MB is displayed as 10.0 MB and 230 MB is displayed as such. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink 3056-ui-bytes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1419.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1419 commit fc65aa4287689e05cd00fde82e2acbc45f28e854 Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2015-11-28T11:24:23Z [FLINK-3056][web-dashboard] Represent bytes in more readable form. Bytes are now displayed in the following fashion: 1. For [0, 1000) units, display three significant digits. 2. For [1000,1024) units, display 2 decimal points for the next higher unit. For example, 1010 KB is displayed as 0.99 MB, 10 MB is displayed as 10.0 MB and 230 MB is displayed as such.
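The two display rules above can be sketched like this. Note this is a hedged Java illustration of the rules; the actual PR implements them in the web dashboard's frontend code, and the class and method names here are invented.

```java
import java.util.Locale;

// Illustrative sketch of the two formatting rules:
//   [0, 1000)    -> three significant digits of the current unit
//   [1000, 1024) -> two decimals of the next higher unit (e.g. 1010 KB -> 0.99 MB)
public class ByteFormat {
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    // three significant digits for values in [0, 1000)
    private static String threeSignificant(double v) {
        if (v >= 100) return String.format(Locale.ROOT, "%.0f", v);
        if (v >= 10)  return String.format(Locale.ROOT, "%.1f", v);
        return String.format(Locale.ROOT, "%.2f", v);
    }

    public static String format(long bytes) {
        double value = bytes;
        int unit = 0;
        // promote to the largest unit with a value below 1024
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024.0;
            unit++;
        }
        if (value >= 1000 && unit < UNITS.length - 1) {
            // [1000, 1024): show two decimals of the next higher unit
            return String.format(Locale.ROOT, "%.2f %s", value / 1024.0, UNITS[unit + 1]);
        }
        return threeSignificant(value) + " " + UNITS[unit];
    }
}
```

This reproduces the PR's examples: 1010 KB renders as "0.99 MB", 10 MB as "10.0 MB", and 230 MB as "230 MB".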
[GitHub] flink pull request: [FLINK-2928][web-dashboard] Fix confusing job ...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/1421 [FLINK-2928][web-dashboard] Fix confusing job status visualization in job overview. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink 2928-job-overview Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1421.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1421 commit 23c4ad6708fc3a6ef062d35600eb564a188c969d Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2015-11-28T17:41:14Z [FLINK-2928][web-dashboard] Fix confusing job status visualization in job overview.
[GitHub] flink pull request: [FLINK-3077] Add functions to access Flink Ver...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-160311838 Also, resolves FLINK-3056. ![capture10](https://cloud.githubusercontent.com/assets/8874261/11452591/aa9f5dae-9613-11e5-9546-d862f29cfd3c.PNG)
[GitHub] flink pull request: [FLINK-3077] Add functions to access Flink Ver...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1418#issuecomment-160312062 Also resolves FLINK-3023. ![capture10](https://cloud.githubusercontent.com/assets/8874261/11452593/d80811aa-9613-11e5-9c1e-c5fe6c61adc8.PNG)
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-159853758 I'll wait for Stephan to review this then.
[GitHub] flink pull request: [FLINK-2904][web-dashboard] Fix truncation of ...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1321#issuecomment-159655565 Hi @mxm, is there any further improvement needed on this? If not, I think it should be merged.
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-159656251 Hi @rmetzger, can you take another look? I need to base some further work on this [related to cleaning up passing of task name, index, etc.].
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45640613 --- Diff: flink-runtime-web/web-dashboard/app/partials/jobs/job.jade --- @@ -43,6 +43,10 @@ nav.navbar.navbar-default.navbar-fixed-top.navbar-main(ng-if="job") span.navbar-info-button.btn.btn-default(ng-click="cancelJob($event)") | Cancel + .navbar-info.last.first(ng-if!="job.type=='STREAMING' && (job.state=='RUNNING' || job.state=='CREATED')") --- End diff -- Ah. That helps. Thanks. :)
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45639014 --- Diff: flink-runtime-web/web-dashboard/app/partials/jobs/job.jade --- @@ -43,6 +43,10 @@ nav.navbar.navbar-default.navbar-fixed-top.navbar-main(ng-if="job") span.navbar-info-button.btn.btn-default(ng-click="cancelJob($event)") | Cancel + .navbar-info.last.first(ng-if!="job.type=='STREAMING' && (job.state=='RUNNING' || job.state=='CREATED')") --- End diff -- Yes. Quite odd indeed. Can you point out the resource where you saw this? I'm a little curious.
[GitHub] flink pull request: [FLINK-3000]Adds shutdown hook to clean up lin...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1354#issuecomment-159054302 Ping.
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45634054 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitor.java --- @@ -188,9 +189,12 @@ public WebRuntimeMonitor( // Cancel a job via GET (for proper integration with YARN this has to be performed via GET) .GET("/jobs/:jobid/yarn-cancel", handler(new JobCancellationHandler())) - // DELETE is the preferred way of cancelling a job (Rest-conform) + // DELETE is the preferred way of canceling a job (Rest-conform) .DELETE("/jobs/:jobid", handler(new JobCancellationHandler())) + // stop a job + .DELETE("/jobs/:jobid/stop", handler(new JobStoppingHandler())) + --- End diff -- This is somewhat counter-intuitive. `DELETE` is meant to remove the resource at the specified URI. Specifying an action as `/stop` is not a good idea. Perhaps a better idea would be to identify these actions as `/jobs/:jobid?mode=cancel` and `/jobs/:jobid?mode=stop`. @StephanEwen might have a better idea about this.
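The suggested alternative — a single `DELETE /jobs/:jobid` route whose action is selected by a `mode` query parameter — could be dispatched roughly like this (a hypothetical Python sketch; the handler and parameter names are illustrative, not Flink's actual API):

```python
from urllib.parse import parse_qs, urlsplit

def dispatch_delete(request_uri):
    """Pick the job action from ?mode=..., defaulting to cancel."""
    parts = urlsplit(request_uri)
    job_id = parts.path.rsplit("/", 1)[-1]
    mode = parse_qs(parts.query).get("mode", ["cancel"])[0]
    if mode not in ("cancel", "stop"):
        raise ValueError("unsupported mode: " + mode)
    return (mode, job_id)
```

With this scheme, `DELETE /jobs/abc?mode=stop` stops the job, while a bare `DELETE /jobs/abc` keeps the existing cancel semantics.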
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45635941 --- Diff: flink-runtime-web/web-dashboard/web/partials/jobs/job.html --- @@ -32,7 +32,7 @@ - {{ job['end-time'] | amDateFormat:'-MM-DD, H:mm:ss' }} {{job.duration | humanizeDuration:true}} - Cancel + Cancel --- End diff -- This is weird. The stop button code isn't actually here. Are you sure this file is up-to-date? Can you run gulp again?
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45636371 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/WebRuntimeMonitor.java --- @@ -188,9 +189,12 @@ public WebRuntimeMonitor( // Cancel a job via GET (for proper integration with YARN this has to be performed via GET) .GET("/jobs/:jobid/yarn-cancel", handler(new JobCancellationHandler())) - // DELETE is the preferred way of cancelling a job (Rest-conform) + // DELETE is the preferred way of canceling a job (Rest-conform) .DELETE("/jobs/:jobid", handler(new JobCancellationHandler())) + // stop a job + .DELETE("/jobs/:jobid/stop", handler(new JobStoppingHandler())) + --- End diff -- `DELETE` is more conforming to the REST framework. `GET` is just a workaround to support yarn when working on the AM proxy page. Yarn doesn't transfer anything besides `GET` to the AM's http server.
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45634227 --- Diff: flink-runtime-web/web-dashboard/app/partials/jobs/job.jade --- @@ -43,6 +43,10 @@ nav.navbar.navbar-default.navbar-fixed-top.navbar-main(ng-if="job") span.navbar-info-button.btn.btn-default(ng-click="cancelJob($event)") | Cancel + .navbar-info.last.first(ng-if!="job.type=='STREAMING' && (job.state=='RUNNING' || job.state=='CREATED')") --- End diff -- Are you sure this works? :confused: The `!=` shouldn't be there.
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r45634353 --- Diff: flink-runtime-web/web-dashboard/app/scripts/modules/jobs/jobs.svc.coffee --- @@ -216,4 +216,7 @@ angular.module('flinkApp') # proper "DELETE jobs//" $http.get "jobs/" + jobid + "/yarn-cancel" + @stopJob = (jobid) -> +$http.delete "jobs/" + jobid + "/stop" --- End diff -- As I commented before, this should rather be a `GET` request at `/yarn-stop` to support Yarn cancellations.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-158765098
@rmetzger
1. I have introduced a separate directory for uploads, named randomly, and deleted as part of the shutdown hook.
2. I also tried closing the `URLClassLoader`, but it still doesn't allow the jars to be deleted. I will look further into it and see if I can get it to work. It shouldn't be a blocker for merging this, IMO.
@StephanEwen
1. I have removed the key to specify the upload directory, which takes care of the naming of the config entry. It is now just `jobmanager.web.submit.allow`.
2. What would be the best way to separate the passing of path and query parameters? The most obvious choice is to change the method signature of `handleRequest`, but that seems too large a change just to allow query params. Since the interface is designed by us, I think we can keep them together and disallow any query param that overrides a path param. Otherwise, if we do change this, I suggest a construct `Parameters` with fields `pathParams` and `queryParams`, and an access method `getParam(key, enum{PATH, QUERY})`. This would be a much cleaner solution.
3. I have some concerns about error handling. There are four handlers:
a. `StaticFileServerHandler`: handles exception events itself, by sending an `INTERNAL_SERVER_ERROR`.
b. `RuntimeMonitorHandler`: handles all exceptions itself, with either a `NOT_FOUND` or `INTERNAL_SERVER_ERROR` code.
c. `HttpRequestHandler` [introduced in this PR]: doesn't handle exceptions, but I'm inclined towards sending an `INTERNAL_SERVER_ERROR` code for any exceptions here.
d. `PipelineErrorHandler` [introduced in this PR]: if an exception-caught event is fired here, it can only happen because netty threw an error [as reported by @gyfora in one instance]. There's really nothing to do here except sending an internal server error.
What do you think?
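The `Parameters` construct proposed in point 2 above could look roughly like this (a sketch only; the names mirror the suggestion in the comment, and nothing here is Flink's actual interface):

```python
from enum import Enum

class ParamType(Enum):
    PATH = "path"
    QUERY = "query"

class Parameters:
    """Keeps path and query parameters separate, with one lookup method."""
    def __init__(self, path_params, query_params):
        self._params = {ParamType.PATH: dict(path_params),
                        ParamType.QUERY: dict(query_params)}

    def get_param(self, key, param_type):
        # returns None if the key is absent in the requested namespace
        return self._params[param_type].get(key)
```

A handler would then ask explicitly for `get_param("jobid", ParamType.PATH)`, so a query parameter can never silently override a path parameter.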
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1386#issuecomment-158408532 @rmetzger I have added a test case to verify the functionality.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45317406 --- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java --- @@ -635,8 +644,18 @@ * The default number of archived jobs for the jobmanager */ public static final int DEFAULT_JOB_MANAGER_WEB_ARCHIVE_COUNT = 5; - - + + /** +* By default, submitting jobs from the web-frontend is allowed. +*/ + public static final boolean DEFAULT_JOB_MANAGER_WEB_SUBMISSION = true; + + /** +* Default directory for uploaded file storage for the Web frontend. +*/ + public static final String DEFAULT_JOB_MANAGER_WEB_UPLOAD_DIR = + (System.getProperty("java.io.tmpdir") == null ? "/tmp" : System.getProperty("java.io.tmpdir")) + "/webmonitor/"; --- End diff -- @rmetzger can we make a decision on this? I'm in favor of having a directory `/jars/` inside the `webRootDir` created by the Web Monitor. There is already a shutdown hook for removing it.
[GitHub] flink pull request: [FLINK-2488] Expose Attempt Number in RuntimeC...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/1386 [FLINK-2488] Expose Attempt Number in RuntimeContext Passes the attempt number all the way from `TaskDeploymentDescriptor` to the `RuntimeContext`. Small thing I want to confirm: For `RuntimeContext` in Tez, is it okay to use `TaskContext#getTaskAttemptNumber` provided by Tez as a proxy for the attempt number? You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink attempt_number Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1386.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1386 commit c7b28bbf7d40b8571ee18d1d525535ab5ed523ca Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2015-11-19T16:38:15Z [FLINK-2488] Expose Attempt Number in RuntimeContext
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45240771 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/HttpRequestHandler.java --- @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.webmonitor; + +/* + * This code is based on the "HttpUploadServerHandler" from the + * Netty project's HTTP server example. 
+ * + * See http://netty.io and + * https://github.com/netty/netty/blob/netty-4.0.31.Final/example/src/main/java/io/netty/example/http/upload/HttpUploadServerHandler.java + */ + +import io.netty.channel.ChannelHandlerContext; +import io.netty.channel.SimpleChannelInboundHandler; +import io.netty.handler.codec.http.HttpContent; +import io.netty.handler.codec.http.HttpHeaders; +import io.netty.handler.codec.http.HttpMethod; +import io.netty.handler.codec.http.HttpObject; +import io.netty.handler.codec.http.HttpRequest; +import io.netty.handler.codec.http.LastHttpContent; +import io.netty.handler.codec.http.QueryStringDecoder; +import io.netty.handler.codec.http.QueryStringEncoder; +import io.netty.handler.codec.http.multipart.DefaultHttpDataFactory; +import io.netty.handler.codec.http.multipart.DiskFileUpload; +import io.netty.handler.codec.http.multipart.HttpDataFactory; +import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder; +import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder.EndOfDataDecoderException; +import io.netty.handler.codec.http.multipart.InterfaceHttpData; +import io.netty.handler.codec.http.multipart.InterfaceHttpData.HttpDataType; + +import java.io.File; +import java.util.UUID; + +/** + * Simple code which handles all HTTP requests from the user, and passes them to the Router + * handler directly if they do not involve file upload requests. + * If a file is required to be uploaded, it handles the upload, and in the http request to the + * next handler, passes the name of the file to the next handler. + */ +public class HttpRequestHandler extends SimpleChannelInboundHandler { + + private HttpRequest request; + + private boolean readingChunks; + + private static final HttpDataFactory factory = new DefaultHttpDataFactory(true); // use disk + + private String requestPath; + + private HttpPostRequestDecoder decoder; + + private final File uploadDir; + + /** +* The directory where files should be uploaded. 
+*/ + public HttpRequestHandler(File uploadDir) { + this.uploadDir = uploadDir; + } + + @Override + public void channelUnregistered(ChannelHandlerContext ctx) throws Exception { + if (decoder != null) { + decoder.cleanFiles(); + } + } + + @Override + public void channelRead0(ChannelHandlerContext ctx, HttpObject msg) throws Exception { + if (msg instanceof HttpRequest) { + request = (HttpRequest) msg; + requestPath = new QueryStringDecoder(request.getUri()).path(); + if (request.getMethod() != HttpMethod.POST) { --- End diff -- I'm not very sure about the conventions, but only `PUT` and `POST` methods have a payload associated with them, since they're intuitively *unsafe* methods, which change some state on the server. Further, as for `DELETE`, the HTTP specification states that there is no defined semantics for associating bodies. https://tools.ietf.org/html/rfc7231#section-4.3.5 Since the netty server currently only processes GET, DELETE and POST, I think we can safely assume no other requests will arrive, and if they
[GitHub] flink pull request: [FLINK-2111] Add "stop" signal to cleanly shut...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/750#issuecomment-157817216 Hi @mjsax, your changes to the CoffeeScript aren't ported to the actual JavaScript. It appears you forgot to run `gulp`.
[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1026#issuecomment-157818292 Yes. That's why I closed it. :') I'm thinking of having a construct named `TaskInfo`, which contains information like name, index, parallel tasks, attempt number, etc. which will be passed all the way down from the `TDD` to the `RuntimeContext`. Let me know if that's a good idea.
[GitHub] flink pull request: [FLINK-2488][FLINK-2496] Expose Task Manager c...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1026#issuecomment-157826454 Agreed. I also would like to clean up the several getter functions for these fields. We can just add a `getTaskInfo` to the `TDD`, `Task`, and `Environment`; the fields can then be accessed from this object. Since this isn't the user-facing API, it should be fine IMO. `RuntimeContext` will still provide separate access to every field.
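The proposed `TaskInfo` container might be sketched like this (the field names are hypothetical guesses from the discussion; the real class would be Java, passed along the TDD-to-RuntimeContext chain):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskInfo:
    """One immutable bundle carried from the TDD down to the RuntimeContext,
    replacing the individual getters on TDD, Task, and Environment."""
    task_name: str
    subtask_index: int
    number_of_parallel_subtasks: int
    attempt_number: int
```

`RuntimeContext` would still expose per-field accessors to users and simply delegate to this object internally.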
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45245432 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/HttpRequestHandler.java --- @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.webmonitor; + +/* + * This code is based on the "HttpUploadServerHandler" from the + * Netty project's HTTP server example. 
+ * + * See http://netty.io and + * https://github.com/netty/netty/blob/netty-4.0.31.Final/example/src/main/java/io/netty/example/http/upload/HttpUploadServerHandler.java + */ + +import io.netty.channel.ChannelHandlerContext; +import io.netty.channel.SimpleChannelInboundHandler; +import io.netty.handler.codec.http.HttpContent; +import io.netty.handler.codec.http.HttpHeaders; +import io.netty.handler.codec.http.HttpMethod; +import io.netty.handler.codec.http.HttpObject; +import io.netty.handler.codec.http.HttpRequest; +import io.netty.handler.codec.http.LastHttpContent; +import io.netty.handler.codec.http.QueryStringDecoder; +import io.netty.handler.codec.http.QueryStringEncoder; +import io.netty.handler.codec.http.multipart.DefaultHttpDataFactory; +import io.netty.handler.codec.http.multipart.DiskFileUpload; +import io.netty.handler.codec.http.multipart.HttpDataFactory; +import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder; +import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder.EndOfDataDecoderException; +import io.netty.handler.codec.http.multipart.InterfaceHttpData; +import io.netty.handler.codec.http.multipart.InterfaceHttpData.HttpDataType; + +import java.io.File; +import java.util.UUID; + +/** + * Simple code which handles all HTTP requests from the user, and passes them to the Router + * handler directly if they do not involve file upload requests. + * If a file is required to be uploaded, it handles the upload, and in the http request to the + * next handler, passes the name of the file to the next handler. + */ +public class HttpRequestHandler extends SimpleChannelInboundHandler { + + private HttpRequest request; + + private boolean readingChunks; + + private static final HttpDataFactory factory = new DefaultHttpDataFactory(true); // use disk + + private String requestPath; + + private HttpPostRequestDecoder decoder; + + private final File uploadDir; + + /** +* The directory where files should be uploaded. 
+*/ + public HttpRequestHandler(File uploadDir) { + this.uploadDir = uploadDir; + } + + @Override + public void channelUnregistered(ChannelHandlerContext ctx) throws Exception { + if (decoder != null) { + decoder.cleanFiles(); + } + } + + @Override + public void channelRead0(ChannelHandlerContext ctx, HttpObject msg) throws Exception { + if (msg instanceof HttpRequest) { + request = (HttpRequest) msg; + requestPath = new QueryStringDecoder(request.getUri()).path(); + if (request.getMethod() != HttpMethod.POST) { --- End diff -- Ah yes. It appears I made a mistake. You're right. `PUT` modifies the state of an existing resource. Just had a more careful look at the rfc too.
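The flow the handler's javadoc describes — pass non-upload requests straight to the router, but persist a POST upload to disk and forward only the stored file's name — can be sketched as follows (hypothetical shapes; the real handler is Netty-based Java using `HttpPostRequestDecoder`):

```python
import os
import uuid

def handle(method, path, payload, upload_dir, forward):
    """Sketch of the HttpRequestHandler routing described above."""
    if method != "POST" or payload is None:
        # no upload involved: hand the request to the next handler as-is
        return forward(path, None)
    # upload: write the payload to disk under a fresh name, then pass
    # the file name (not the bytes) on to the next handler
    name = uuid.uuid4().hex + ".jar"
    with open(os.path.join(upload_dir, name), "wb") as f:
        f.write(payload)
    return forward(path, name)
```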
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45091056 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/PipelineErrorHandler.java --- @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.webmonitor; + +import com.fasterxml.jackson.core.JsonGenerator; +import io.netty.buffer.Unpooled; +import io.netty.channel.ChannelHandler; +import io.netty.channel.ChannelHandlerContext; +import io.netty.channel.SimpleChannelInboundHandler; +import io.netty.handler.codec.http.DefaultFullHttpResponse; +import io.netty.handler.codec.http.HttpHeaders; +import io.netty.handler.codec.http.HttpResponseStatus; +import io.netty.handler.codec.http.HttpVersion; +import org.apache.flink.runtime.webmonitor.handlers.JsonFactory; + +import java.io.IOException; +import java.io.StringWriter; + +/** + * This is the last handler in the pipeline and logs all error messages. 
+ */ +@ChannelHandler.Sharable +public class PipelineErrorHandler extends SimpleChannelInboundHandler { + + @Override + protected void channelRead0(ChannelHandlerContext ctx, Object message) { + // we can't deal with this message. No one in the pipeline handled it. Log it. + System.err.println("Unknown message received: " + message); --- End diff -- Ah. Forgot to change this. Will do.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45090748 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarDeleteHandler.java --- @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.webmonitor.handlers; + +import com.fasterxml.jackson.core.JsonGenerator; +import org.apache.flink.runtime.instance.ActorGateway; + +import java.io.File; +import java.io.FilenameFilter; +import java.io.StringWriter; +import java.util.Map; + +/** + * Handles requests for deletion of jars. 
+ */ +public class JarDeleteHandler implements RequestHandler, RequestHandler.JsonResponse { + + private final File jarDir; + + public JarDeleteHandler(File jarDirectory) { + jarDir = jarDirectory; + } + + @Override + public String handleRequest(Map<String, String> params, ActorGateway jobManager) throws Exception { + final String file = params.get("jarid"); + try { + File[] list = jarDir.listFiles(new FilenameFilter() { + @Override + public boolean accept(File dir, String name) { + return name.equals(file); + } + }); + boolean success = false; + for (File f: list) { + // although next to impossible for multiple files, we still delete them. + success = success || f.delete(); + } + StringWriter writer = new StringWriter(); + JsonGenerator gen = JsonFactory.jacksonFactory.createJsonGenerator(writer); + gen.writeStartObject(); + if (!success) { + // this seems to always fail on Windows. --- End diff -- File deletions are an issue on Windows. The problem is that, when a `PackagedProgram` is formed using the jar, and a class loader is constructed, the jars are effectively open. And Windows doesn't allow for deleting them. Linux on the other hand doesn't care. This was not working on the earlier webclient either.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45091186 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarListHandler.java --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.runtime.webmonitor.handlers; + +import com.fasterxml.jackson.core.JsonGenerator; +import org.apache.flink.client.program.PackagedProgram; +import org.apache.flink.client.program.ProgramInvocationException; +import org.apache.flink.runtime.instance.ActorGateway; +import org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler; + +import java.io.File; +import java.io.FilenameFilter; +import java.io.IOException; +import java.io.StringWriter; +import java.util.Map; +import java.util.UUID; +import java.util.jar.JarFile; +import java.util.jar.Manifest; + +public class JarListHandler implements RequestHandler, RequestHandler.JsonResponse { + + private final File jarDir; + + public JarListHandler(File jarDirectory) { + jarDir = jarDirectory; + } + + @Override + public String handleRequest(Map<String, String> params, ActorGateway jobManager) throws Exception { + try { + StringWriter writer = new StringWriter(); + JsonGenerator gen = JsonFactory.jacksonFactory.createJsonGenerator(writer); + gen.writeStartObject(); + gen.writeStringField("address", params.get(RuntimeMonitorHandler.WEB_MONITOR_ADDRESS_KEY)); + if (jarDir != null) { --- End diff -- Well, technically, the condition is made sure of the way `jarDir` is initialized in the web monitor. But I can definitely make it more explicit. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45090990 --- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java --- @@ -635,8 +644,18 @@ * The default number of archived jobs for the jobmanager */ public static final int DEFAULT_JOB_MANAGER_WEB_ARCHIVE_COUNT = 5; - - + + /** +* By default, submitting jobs from the web-frontend is allowed. +*/ + public static final boolean DEFAULT_JOB_MANAGER_WEB_SUBMISSION = true; + + /** +* Default directory for uploaded file storage for the Web frontend. +*/ + public static final String DEFAULT_JOB_MANAGER_WEB_UPLOAD_DIR = + (System.getProperty("java.io.tmpdir") == null ? "/tmp" : System.getProperty("java.io.tmpdir")) + "/webmonitor/"; --- End diff -- No. If the user wants them persisted, they should specify a directory themselves. Otherwise, as long as the OS isn't rebooted, the jars will stay there. Also, perhaps a better idea would be to not have a default directory at all, and instead determine it while initializing the web monitor. That way, we get the directory on the job manager machine, instead of the machine where the jar was built.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45094279 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarListHandler.java --- @@ -0,0 +1,130 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.runtime.webmonitor.handlers; + +import com.fasterxml.jackson.core.JsonGenerator; +import org.apache.flink.client.program.PackagedProgram; +import org.apache.flink.client.program.ProgramInvocationException; +import org.apache.flink.runtime.instance.ActorGateway; +import org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler; + +import java.io.File; +import java.io.FilenameFilter; +import java.io.IOException; +import java.io.StringWriter; +import java.util.Map; +import java.util.UUID; +import java.util.jar.JarFile; +import java.util.jar.Manifest; + +public class JarListHandler implements RequestHandler, RequestHandler.JsonResponse { + + private final File jarDir; + + public JarListHandler(File jarDirectory) { + jarDir = jarDirectory; + } + + @Override + public String handleRequest(Map<String, String> params, ActorGateway jobManager) throws Exception { + try { + StringWriter writer = new StringWriter(); + JsonGenerator gen = JsonFactory.jacksonFactory.createJsonGenerator(writer); + gen.writeStartObject(); + gen.writeStringField("address", params.get(RuntimeMonitorHandler.WEB_MONITOR_ADDRESS_KEY)); + if (jarDir != null) { --- End diff -- That's exactly why I offered to make it more explicit. 
:') On Nov 17, 2015 11:26 PM, "Robert Metzger" <notificati...@github.com> wrote: > In > flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarListHandler.java > <https://github.com/apache/flink/pull/1338#discussion_r45094049>: > > > +public class JarListHandler implements RequestHandler, RequestHandler.JsonResponse { > > + > > + private final File jarDir; > > + > > + public JarListHandler(File jarDirectory) { > > + jarDir = jarDirectory; > > + } > > + > > + @Override > > + public String handleRequest(Map<String, String> params, ActorGateway jobManager) throws Exception { > > + try { > > + StringWriter writer = new StringWriter(); > > + JsonGenerator gen = JsonFactory.jacksonFactory.createJsonGenerator(writer); > > + gen.writeStartObject(); > > + gen.writeStringField("address", params.get(RuntimeMonitorHandler.WEB_MONITOR_ADDRESS_KEY)); > > + if (jarDir != null) { > > I don't doubt that it's working correctly now. The issue is that it's really > not obvious what this check is for. Somebody else changing your code might > easily break it! > > Reply to this email directly or view it on GitHub > <https://github.com/apache/flink/pull/1338/files#r45094049>.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45095325 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarDeleteHandler.java --- @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.webmonitor.handlers; + +import com.fasterxml.jackson.core.JsonGenerator; +import org.apache.flink.runtime.instance.ActorGateway; + +import java.io.File; +import java.io.FilenameFilter; +import java.io.StringWriter; +import java.util.Map; + +/** + * Handles requests for deletion of jars. 
+ */ +public class JarDeleteHandler implements RequestHandler, RequestHandler.JsonResponse { + + private final File jarDir; + + public JarDeleteHandler(File jarDirectory) { + jarDir = jarDirectory; + } + + @Override + public String handleRequest(Map<String, String> params, ActorGateway jobManager) throws Exception { + final String file = params.get("jarid"); + try { + File[] list = jarDir.listFiles(new FilenameFilter() { + @Override + public boolean accept(File dir, String name) { + return name.equals(file); + } + }); + boolean success = false; + for (File f: list) { + // although next to impossible for multiple files, we still delete them. + success = success || f.delete(); + } + StringWriter writer = new StringWriter(); + JsonGenerator gen = JsonFactory.jacksonFactory.createJsonGenerator(writer); + gen.writeStartObject(); + if (!success) { + // this seems to always fail on Windows. --- End diff -- Yes. I came across that too. I just gave it somewhat lower preference, since we already have too many issues with deletions on Windows. That's the primary reason `mvn verify` fails. Let me try it out and I will report back if it works. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
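The Windows behavior discussed above comes from the class loader keeping the jar's file handle open. Since Java 7, `URLClassLoader` implements `Closeable`, so closing the loader before deleting releases the handle and lets the delete succeed on Windows too. A minimal sketch under that assumption; the helper name `deleteJarSafely` is hypothetical, not part of this PR:

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;

public class JarCleanup {

    // Hypothetical helper: close the class loader first so the jar's
    // file handle is released, then delete. Without the close() call,
    // deletion fails on Windows while Linux doesn't care.
    public static boolean deleteJarSafely(URLClassLoader loader, File jar) throws Exception {
        loader.close(); // URLClassLoader implements Closeable since Java 7
        return Files.deleteIfExists(jar.toPath());
    }

    public static void main(String[] args) throws Exception {
        File jar = File.createTempFile("dummy", ".jar");
        URLClassLoader loader = new URLClassLoader(new URL[]{jar.toURI().toURL()});
        System.out.println(deleteJarSafely(loader, jar)); // prints "true"
    }
}
```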
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45101828 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/RuntimeMonitorHandler.java --- @@ -113,7 +117,17 @@ private void respondAsLeader(ChannelHandlerContext ctx, Routed routed, ActorGate DefaultFullHttpResponse response; try { - String result = handler.handleRequest(routed.pathParams(), jobManager); + Map<String, String> params = routed.pathParams(); --- End diff -- I was not aware of this particular aspect of REST. How should query params be represented? Since here, the resource is `/jars/:jarid`, and representing a plan or submit as `/jars/:jarid/:parallelism/:entry-class/...` doesn't seem okay.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45101180 --- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java --- @@ -635,8 +644,18 @@ * The default number of archived jobs for the jobmanager */ public static final int DEFAULT_JOB_MANAGER_WEB_ARCHIVE_COUNT = 5; - - + + /** +* By default, submitting jobs from the web-frontend is allowed. +*/ + public static final boolean DEFAULT_JOB_MANAGER_WEB_SUBMISSION = true; + + /** +* Default directory for uploaded file storage for the Web frontend. +*/ + public static final String DEFAULT_JOB_MANAGER_WEB_UPLOAD_DIR = + (System.getProperty("java.io.tmpdir") == null ? "/tmp" : System.getProperty("java.io.tmpdir")) + "/webmonitor/"; --- End diff -- I thought jar persistence was considered a feature of the webclient. I personally do not like it either. Should I do this then?
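The lazy resolution suggested above — reading `java.io.tmpdir` while initializing the web monitor rather than in a static constant — could look roughly like this. This is a hedged sketch; the class and method names (`UploadDirResolver`, `resolveUploadDir`) are hypothetical and not part of the PR:

```java
import java.io.File;

public class UploadDirResolver {

    // Hypothetical sketch: resolve the upload directory at web monitor
    // start-up, so java.io.tmpdir is read on the JobManager machine
    // rather than baked into a constant when the jar was built.
    public static File resolveUploadDir(String configuredDir) {
        if (configuredDir != null) {
            // user explicitly asked for persistence in this directory
            return new File(configuredDir);
        }
        String tmp = System.getProperty("java.io.tmpdir", "/tmp");
        return new File(tmp, "webmonitor");
    }

    public static void main(String[] args) {
        System.out.println(resolveUploadDir("/data/flink-uploads").getName()); // prints "flink-uploads"
        System.out.println(resolveUploadDir(null).getName());                  // prints "webmonitor"
    }
}
```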
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45103208 --- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java --- @@ -302,6 +302,15 @@ */ public static final String JOB_MANAGER_WEB_LOG_PATH_KEY = "jobmanager.web.log.path"; + /** +* Config parameter indicating whether jobs can be uploaded and run from the web-frontend. +*/ + public static final String JOB_MANAGER_WEB_SUBMISSION_KEY = "jobmanager.web.submit.allow"; --- End diff -- That's a good point. Will take care of this.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45103312 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/HttpRequestHandler.java --- @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.webmonitor; + +/* + * This code is based on the "HttpUploadServerHandler" from the + * Netty project's HTTP server example. 
+ * + * See http://netty.io and + * https://github.com/netty/netty/blob/netty-4.0.31.Final/example/src/main/java/io/netty/example/http/upload/HttpUploadServerHandler.java + */ + +import io.netty.channel.ChannelHandlerContext; +import io.netty.channel.SimpleChannelInboundHandler; +import io.netty.handler.codec.http.HttpContent; +import io.netty.handler.codec.http.HttpHeaders; +import io.netty.handler.codec.http.HttpMethod; +import io.netty.handler.codec.http.HttpObject; +import io.netty.handler.codec.http.HttpRequest; +import io.netty.handler.codec.http.LastHttpContent; +import io.netty.handler.codec.http.QueryStringDecoder; +import io.netty.handler.codec.http.QueryStringEncoder; +import io.netty.handler.codec.http.multipart.DefaultHttpDataFactory; +import io.netty.handler.codec.http.multipart.DiskFileUpload; +import io.netty.handler.codec.http.multipart.HttpDataFactory; +import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder; +import io.netty.handler.codec.http.multipart.HttpPostRequestDecoder.EndOfDataDecoderException; +import io.netty.handler.codec.http.multipart.InterfaceHttpData; +import io.netty.handler.codec.http.multipart.InterfaceHttpData.HttpDataType; + +import java.io.File; +import java.util.UUID; + +/** + * Simple code which handles all HTTP requests from the user, and passes them to the Router + * handler directly if they do not involve file upload requests. + * If a file is required to be uploaded, it handles the upload, and in the http request to the + * next handler, passes the name of the file to the next handler. + */ +public class HttpRequestHandler extends SimpleChannelInboundHandler { + + private HttpRequest request; + + private boolean readingChunks; + + private static final HttpDataFactory factory = new DefaultHttpDataFactory(true); // use disk + + private String requestPath; + + private HttpPostRequestDecoder decoder; + + private final File uploadDir; + + /** +* The directory where files should be uploaded. 
+*/ + public HttpRequestHandler(File uploadDir) { + this.uploadDir = uploadDir; + } + + @Override + public void channelUnregistered(ChannelHandlerContext ctx) throws Exception { + if (decoder != null) { + decoder.cleanFiles(); + } + } + + @Override + public void channelRead0(ChannelHandlerContext ctx, HttpObject msg) throws Exception { + if (msg instanceof HttpRequest) { + request = (HttpRequest) msg; + requestPath = new QueryStringDecoder(request.getUri()).path(); + if (request.getMethod() != HttpMethod.POST) { --- End diff -- I'm not sure what you mean. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45104323 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/RuntimeMonitorHandler.java --- @@ -113,7 +117,17 @@ private void respondAsLeader(ChannelHandlerContext ctx, Routed routed, ActorGate DefaultFullHttpResponse response; try { - String result = handler.handleRequest(routed.pathParams(), jobManager); + Map<String, String> params = routed.pathParams(); --- End diff -- I guess separation would be the better solution.
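The separation discussed here is the conventional REST split: the resource identity stays in the path (`/jars/:jarid/run`) while optional settings travel in the query string (`?parallelism=4&entry-class=...`). A small JDK-only illustration of parsing such a query string — the class and method names are illustrative, not the PR's actual API:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class QueryParams {

    // Illustrative parser: the resource identity lives in the path,
    // and optional settings (parallelism, entry-class, ...) come from
    // the query string instead of extra path segments.
    public static Map<String, String> parseQuery(String uri) {
        Map<String, String> params = new HashMap<>();
        String query = URI.create(uri).getQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                String[] kv = pair.split("=", 2);
                params.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> p =
            parseQuery("/jars/abc.jar/run?parallelism=4&entry-class=WordCount");
        System.out.println(p.get("parallelism")); // prints "4"
        System.out.println(p.get("entry-class")); // prints "WordCount"
    }
}
```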
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on a diff in the pull request: https://github.com/apache/flink/pull/1338#discussion_r45104595 --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarActionHandler.java --- @@ -0,0 +1,253 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.runtime.webmonitor.handlers; + +import com.fasterxml.jackson.core.JsonGenerator; +import org.apache.flink.client.program.Client; +import org.apache.flink.client.program.PackagedProgram; +import org.apache.flink.client.program.ProgramInvocationException; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.core.fs.Path; +import org.apache.flink.optimizer.CompilerException; +import org.apache.flink.optimizer.DataStatistics; +import org.apache.flink.optimizer.Optimizer; +import org.apache.flink.optimizer.costs.DefaultCostEstimator; +import org.apache.flink.optimizer.plan.FlinkPlan; +import org.apache.flink.optimizer.plan.OptimizedPlan; +import org.apache.flink.optimizer.plan.StreamingPlan; +import org.apache.flink.optimizer.plantranslate.JobGraphGenerator; +import org.apache.flink.runtime.client.JobClient; +import org.apache.flink.runtime.client.JobExecutionException; +import org.apache.flink.runtime.instance.ActorGateway; +import org.apache.flink.runtime.jobgraph.JobGraph; +import org.apache.flink.runtime.jobgraph.jsonplan.JsonPlanGenerator; +import scala.concurrent.duration.FiniteDuration; + +import java.io.File; +import java.io.IOException; +import java.io.OutputStream; +import java.io.PrintStream; +import java.io.PrintWriter; +import java.io.StringWriter; +import java.net.URISyntaxException; +import java.net.URL; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +/** + * This handler handles requests to fetch plan for an already uploaded jar, as well as for + * running it. + */ +public class JarActionHandler implements RequestHandler, RequestHandler.JsonResponse { + + private final File jarDir; + + private final boolean toRun; --- End diff -- Fair enough. Will change this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-157056790 @gyfora this is the PR I was talking about regarding your *connection reset* exception. If you want, you can add the `PipelineErrorHandler` at the end of the current pipeline and rerun your program to check the origin of that exception.
[GitHub] flink pull request: [FLINK-3000]Adds shutdown hook to clean up lin...
GitHub user sachingoel0101 opened a pull request: https://github.com/apache/flink/pull/1354 [FLINK-3000]Adds shutdown hook to clean up lingering yarn sessions Adds a shutdown hook to clean up the submitted YARN application when the user cancels it in any state other than `RUNNING`. If the session is already running, there are two cases: detached and non-detached. In non-detached mode, the client connects to the cluster and adds a shutdown hook; in detached mode, since the client disconnects once the session is deployed, users have to cancel it explicitly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sachingoel0101/flink yarn_cli Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1354.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1354 commit e1b8778491bf8fd592bddc8fe157d8aa8c150b7f Author: Sachin Goel <sachingoel0...@gmail.com> Date: 2015-11-15T00:41:30Z Adds shutdown hook to clean up lingering yarn sessions
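The shutdown-hook pattern this PR describes can be sketched with plain JDK calls; `killApplication` below stands in for the actual YARN client call and is purely illustrative, not the PR's implementation:

```java
public class YarnSessionCleanup {

    // Register a hook that kills the lingering YARN application if the
    // JVM exits (e.g. the user hits Ctrl-C before the session reaches
    // the RUNNING state).
    public static Thread registerCleanupHook(Runnable killApplication) {
        Thread hook = new Thread(killApplication, "yarn-session-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Thread hook = registerCleanupHook(
            () -> System.err.println("Cancelling lingering YARN session"));
        // In detached mode, once the session is successfully deployed
        // the client exits by design, so the hook is deregistered and
        // the user is responsible for explicit cancellation.
        Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```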
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-156475956 Hi @rmetzger , are you sure you built the source with Scala 2.11? I was getting the same error you showed here, but building with Scala 2.11 fixes it.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-156483602 Yes. Except for the trace leading up to the handler, I think the rest should be okay to show. In case the exception isn't a `ProgramInvocationException` or `CompilerException`, something went really wrong anyway, in which case there's no harm printing the entire stack trace. Fixing it right now.
[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/1338#issuecomment-156477959 Yes. I will write some better exception reporting. As for this particular error, how can it be detected? From what I observed, ideally, `Client#getOptimizedPlan` sets up an `OptimizerPlanEnvironment`, which in turn constructs a `StreamPlanEnvironment`. But when the Scala version doesn't match, the `execute` call delegates to the default `ExecutionEnvironment#execute` instead of `StreamPlanEnvironment#execute`, which of course leads to an error since yours was a streaming program. Can't quite see why that should happen.