Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/952#issuecomment-127185418
Ah. Okay. You can fix the remaining things.
This build also fails on the StreamCheckpointingITCase. I'll file another
JIRA ticket for that. I've observed
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/952#issuecomment-127139227
...have it actually wrap the configuration, and have all the getX()
methods delegate to the config,
while all the setX() methods fail with an exception
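A minimal sketch of the wrapping idea described in this comment, written in Python for brevity rather than as Flink code. The class names echo the discussion, but `get_string` and `set_string` are hypothetical stand-ins for the real getX()/setX() methods.

```python
class Configuration:
    """Hypothetical mutable key-value configuration (a stand-in, not Flink's class)."""

    def __init__(self):
        self._data = {}

    def get_string(self, key, default=None):
        return self._data.get(key, default)

    def set_string(self, key, value):
        self._data[key] = value


class UnmodifiableConfiguration(Configuration):
    """Read-only view: getters delegate to the wrapped config, setters raise."""

    def __init__(self, wrapped):
        super().__init__()
        self._wrapped = wrapped

    def get_string(self, key, default=None):
        # Delegate every read to the wrapped configuration.
        return self._wrapped.get_string(key, default)

    def set_string(self, key, value):
        # Every write fails with an exception.
        raise TypeError("configuration is unmodifiable")
```

Because the wrapper delegates instead of copying, later changes to the wrapped object remain visible through the read-only view.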
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/975#issuecomment-127331693
The basic methodology is this:
1. `TaskManager` keeps asking `JobManager` for running `TaskManagers` at
some interval, same as the `heartbeat` interval.
2
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-127037112
@StephanEwen, it would be much simpler to have an interface. But then we
leave the part about implementing the `setRuntimeContext` and
`getRuntimeContext
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/954#issuecomment-127041808
Yes.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/954#issuecomment-127042540
Ah. Yes, actually.
I tried running the webclient on cygwin once and it didn't work [I was
unaware of the fact that I had to add those two lines in the bash
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/971#issuecomment-127055248
@StephanEwen, I've force pushed this branch to only contain the name
change. You can merge this again.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/971#issuecomment-127040180
Pushed a fix for changing the name to `quiet`
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/952#issuecomment-127071830
Added test to verify all setter methods are overridden by the
`UnmodifiableConfiguration` class.
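One way such a check could look, sketched in Python with reflection. The classes `Config` and `ReadOnlyConfig` and the helper `unoverridden_setters` are hypothetical and exist only to exercise the idea; the actual test in the PR is not shown in this archive.

```python
import inspect


def unoverridden_setters(base, sub, prefix="set"):
    """Return names of `prefix`-methods on `base` that `sub` does not redefine itself."""
    missing = []
    for name, _member in inspect.getmembers(base, inspect.isfunction):
        if name.startswith(prefix) and name not in vars(sub):
            missing.append(name)
    return missing


# Hypothetical classes used only to exercise the check.
class Config:
    def set_int(self, key, value): ...
    def set_string(self, key, value): ...
    def get_int(self, key): ...


class ReadOnlyConfig(Config):
    def set_int(self, key, value):
        raise TypeError("read-only")
```

Here `unoverridden_setters(Config, ReadOnlyConfig)` reports `set_string`, the setter the subclass forgot to override. The check only looks at methods defined directly on the subclass.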
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/971#issuecomment-126943148
@mxm, could you verify whether we should do away with the logging test,
since you reviewed it?
If not, the problem is easily fixed by setting a free port instead
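A common way to obtain a free port, sketched in Python under the assumption that a loopback TCP port is what the test needs; the helper name `find_free_port` is illustrative and not taken from the PR.

```python
import socket


def find_free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        # The OS has picked a concrete port; read it back before closing.
        return s.getsockname()[1]
```

The port is released when the socket closes, so another process could in principle claim it before the test binds it again.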
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/970
[FLINK-2458][FLINK-2449]Access distributed cache entries for
CollectionExecution and in Iterative tasks.
1. This PR adds support for accessing distributed cache entries when
running
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/971
[FLINK-2459][cli]Cli API and doc fixes.
1. Remove CliFrontendLoggingTest. Test directly that the logging flag is
interpreted correctly.
2. [hotfix] Doc fix for cli api
3. [hotfix
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-126706999
Ah yes. I'll update them in a while. There's actually some problem with the
unit test I've written too. Travis fails sporadically.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126672392
Yeah. I think we should remove it then. Too many entries tend to confuse
people.
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/966
[FLINK-1819][core]Allow access to RuntimeContext from Input and Output
formats
1. Introduces new Rich Input and Output formats, similar to Rich Functions.
2. Makes all existing input
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126672811
I've updated the code to remove any changes in Configuration.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126675074
Sure. No problem. :)
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126695988
Travis build passes. You can merge it @mxm
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/957#discussion_r35967625
--- Diff:
flink-clients/src/test/java/org/apache/flink/client/CliFrontendLoggingTest.java
---
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126614333
@mxm , updated.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126623148
Done.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126661277
The yarnfifo test case failed.
This has nothing to do with the changes made in this PR.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/957#issuecomment-126642822
There was no specific need for it. However, since the CliFrontend only
passes the Client the configuration, I decided to include it with that.
Further, I think
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/956
[FLINK-2238][api]Add env.fromCollection(set) method to scala api
Used the same technique as in env.fromCollection(Seq) to first convert the
data to a Java Collection.
You can merge this pull
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/956#issuecomment-126257935
Ah. Yes. That does make more sense. :+1:
Updating now.
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/957
[FLINK-2248]Add flag to disable sysout logging from cli
Allows sysout messages on the CLI to be disabled via the flag `q`.
You can merge this pull request into a Git repository by running
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/952
[FLINK-2425]Provide access to task manager configuration from
RuntimeEnvironment
Also fixes [FLINK-2426]: Define an UnmodifiableConfiguration class which
doesn't allow modifications
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/954
[FLINK-2433][docs]Add script to build local documentation on windows
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sachingoel0101/flink
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/942#issuecomment-125604252
No one will actually find out unless they specifically go through the
documentation for these. Most people would only ever see the docs for
Accumulator
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/942#issuecomment-125654972
Sure. :)
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/942#issuecomment-125597443
Should we add some kind of annotation to suggest the usage of the primitive
functions? We probably can't deprecate the older ones.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/945#issuecomment-125734265
I had decided to work with my own understanding of what version means since
nobody replied to the JIRA comment.
getClass.getPackage.getImplementationVersion
Github user sachingoel0101 closed the pull request at:
https://github.com/apache/flink/pull/944
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/921#issuecomment-123390786
This leads to non-mutually exclusive splits. I tracked down the reason for
this: The input data is parallelized differently while performing the splits
for every
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/891#discussion_r34973783
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/evaluation/CrossValidation.scala
---
@@ -0,0 +1,97 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/891#discussion_r34975724
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/evaluation/CrossValidation.scala
---
@@ -0,0 +1,97 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-122705520
This now also incorporates [Flink-2379].
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/921
[FLINK-2312][ml][WIP] Randomly Splitting a Data Set according to weights
given
Adds a method for randomly splitting a data set.
However, there are a few problems. We're effectively
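The description is cut off here. As a rough illustration only, the following Python sketch splits a local collection by weights with one random draw per element, which keeps the resulting splits disjoint. It is not the PR's distributed implementation, and the name `random_split` and the `seed` parameter are assumptions.

```python
import bisect
import itertools
import random


def random_split(items, weights, seed=0):
    """Assign every item to exactly one split, with probability proportional to weights."""
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for item in items:
        # One draw per item decides its split, so splits cannot overlap.
        index = bisect.bisect_right(cumulative, rng.random() * total)
        # Clamp in case floating-point rounding pushes the draw up to the total.
        splits[min(index, len(weights) - 1)].append(item)
    return splits
```

Fixing the seed makes the assignment reproducible, so repeated runs over the same input in the same order yield the same splits.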
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/710#issuecomment-122379328
The changes proposed by Theodore in the PR #861 have been incorporated here
too. This can be reviewed now, and merging this will also close #861.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/918#issuecomment-122382475
Ah. Okay. No problem. :)
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/891#discussion_r34923042
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/evaluation/CrossValidation.scala
---
@@ -0,0 +1,97 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-121924666
Okay. @tillrohrmann, can you review this?
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/918#discussion_r34788837
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/SVM.scala
---
@@ -382,7 +402,11 @@ object SVM
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/918
[FLINK-2368][ml]Adds convergence criteria [WIP]
Adds a convergence criteria class which allows the user to decide whether
they want to terminate training at any point, based on the solutions
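A small Python sketch of what a user-supplied convergence criterion might look like in an iterative training loop. The names `train`, `has_converged`, and `small_change` are hypothetical, and the scalar "solution" is a simplification chosen for illustration; the PR's actual class is not shown in this archive.

```python
def train(step, initial, has_converged, max_iterations=100):
    """Iterate `step` until the user-supplied criterion says to stop."""
    current = initial
    for iteration in range(1, max_iterations + 1):
        previous, current = current, step(current)
        if has_converged(previous, current):
            return current, iteration
    return current, max_iterations


def small_change(tolerance):
    """Criterion: stop when successive solutions differ by less than `tolerance`."""
    return lambda previous, current: abs(current - previous) < tolerance


# Example: Newton's iteration for the square root of 2.
solution, iterations = train(lambda x: (x + 2 / x) / 2, 1.0, small_change(1e-9))
```

Passing the criterion as a function keeps the training loop unaware of how "converged" is defined.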
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-121605233
@thvasilo I've incorporated different initialization strategies in the
KMeans algorithm itself. Please review.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-121605463
@thvasilo, @tillrohrmann, I'm still waiting for a decision on this. It
would be impossible to work further on the decision tree PR until this is
merged
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-117722491
@thvasilo , right now, there aren't other features in the library which
need sampling. Perhaps it isn't a good idea to file a separate feature request
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/700#issuecomment-117731195
@peedeeX21 , try this link:
https://github.com/sachingoel0101/flink/compare/clustering_initializations...peedeeX21:feature_kmeans
I had a lot of trouble
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/700#issuecomment-117723450
@thvasilo , how do I merge this PR into mine? Maybe @peedeeX21 can create a
pull request to my branch at
https://github.com/sachingoel0101/flink/tree
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-117049680
Further, the probability distribution doesn't need to be scaled down to
[0,1]. We just take care of that while building the cumulative
distribution
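To illustrate the point about unscaled weights, here is a Python sketch (not the PR's code) that maps a uniform draw to an index by scaling the draw with the total weight instead of normalizing each weight. The function name `pick_index` is an assumption.

```python
import bisect
import itertools


def pick_index(weights, u):
    """Map a uniform draw u in [0, 1) to an index, for weights that need not sum to 1."""
    cumulative = list(itertools.accumulate(weights))
    # Scale the draw by the total instead of normalizing every weight.
    target = u * cumulative[-1]
    return min(bisect.bisect_right(cumulative, target), len(weights) - 1)
```

With weights `[2, 3, 5]` and with `[0.2, 0.3, 0.5]`, draws of 0.1, 0.3, and 0.9 select indices 0, 1, and 2 in both cases.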
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-117047575
Hi @thvasilo, thanks for taking the time to go through it.
Consider for example a probability distribution P(X_0) = 0.2, P(X_1) = 0.3,
P(X_2) = 0.5
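The comment is truncated at this point. As an illustration of sampling from this example distribution through its cumulative form, the following Python sketch uses the standard library's `random.choices` with `cum_weights`; the seed and sample size are arbitrary choices.

```python
import itertools
import random

probabilities = [0.2, 0.3, 0.5]  # P(X_0), P(X_1), P(X_2) from the example
cumulative = list(itertools.accumulate(probabilities))

rng = random.Random(42)
# random.choices accepts cumulative weights directly.
draws = rng.choices(range(3), cum_weights=cumulative, k=10_000)
frequencies = [draws.count(i) / len(draws) for i in range(3)]
```

With this many draws, the observed frequencies should land close to the stated probabilities.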
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-117051023
Sorry about the formatting though. I'll fix it. I haven't worked on this in
a while.
I'll incorporate your suggestions from the previous PR.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-117055846
Okay. I'll update it today itself with a few trivial fixes.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/700#issuecomment-117060737
Hi. IMO, the purpose of learning is to develop a model which compactly
represents the data somehow. Thus, having a distributed model doesn't make
sense. Besides
Github user sachingoel0101 closed the pull request at:
https://github.com/apache/flink/pull/757
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/757#issuecomment-117220314
Hey @thvasilo , I'm going to break up this PR further. The motivation is
that the sampling code should be available as a general feature. Given a
probability
GitHub user sachingoel0101 reopened a pull request:
https://github.com/apache/flink/pull/757
[FLINK-2131][ml]: Initialization schemes for k-means clustering
This adds the two most common initialization strategies for the k-means
clustering algorithm, namely, Random initialization
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r33358232
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/ContinuousHistogram.scala
---
@@ -0,0 +1,337 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-115210476
Okay. So I guess we can leave adding a createHistogram function to
DataSetUtils for now [It would also require utilizing the FlinkMLTools.block
for an efficient
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-115204540
How should I import a class in flink.ml.math from, say, flink-java? I tried
adding flink-staging as a dependency to pom.xml of flink-java but to no avail.
I'm
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-115199291
Where should I place the Histogram implementations? Currently, they are in
`org.apache.flink.ml.math`, but I can't import them from flink-core where
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r33151075
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/ContinuousHistogram.scala
---
@@ -0,0 +1,325 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r33150869
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/CategoricalHistogram.scala
---
@@ -0,0 +1,167 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-114895657
Adding a utility method certainly makes sense. The user would
provide an argument indicating whether the values in DataSet[Double] are
continuous
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-114880501
Hello Theodore, the semantics of the discrete histogram are such that you have
to specify which classes or discrete values are going to arrive. Once you fix
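A Python sketch of a histogram over a fixed, pre-declared set of categories, to make the described semantics concrete. The class name, the `add` and `merge` methods, and the choice to reject unknown values are assumptions for illustration, not the PR's implementation.

```python
class CategoricalHistogram:
    """Counts for a fixed, pre-declared set of categories (hypothetical sketch)."""

    def __init__(self, categories):
        self.counts = {c: 0 for c in categories}

    def add(self, value):
        # Assumption: values outside the declared categories are rejected.
        if value not in self.counts:
            raise ValueError(f"unexpected category: {value!r}")
        self.counts[value] += 1

    def merge(self, other):
        """Combine two histograms declared over the same categories."""
        if set(self.counts) != set(other.counts):
            raise ValueError("histograms cover different categories")
        merged = CategoricalHistogram(self.counts)
        for category in merged.counts:
            merged.counts[category] = self.counts[category] + other.counts[category]
        return merged
```

Declaring the categories up front makes merging partial histograms a plain per-category sum.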
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-114899521
Changing the current discrete histogram implementation would not break the
decision tree functionality. Although I might have to review the code for any
potential
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-114901644
Okay. Sure.
I will update the discrete version to make it online first.
Should I try to explicitly use Scala library data structures instead of
Java
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r33050123
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/OnlineHistogram.scala
---
@@ -0,0 +1,81 @@
+/*
+ * Licensed
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/710#discussion_r33046001
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/DecisionTree.scala
---
@@ -0,0 +1,490 @@
+/*
+ * Licensed
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/861
[Flink-2030][ml]Online Histogram: Discrete and Categorical
This implements the Online Histograms for both categorical and continuous
data. For continuous data, we emulate a continuous
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/710#issuecomment-114173765
The fundamental idea for a scalable decision tree algorithm is to reduce
the number of splits required to be checked at every node. Ideally, we'd check
for every
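The comment breaks off here. One generic way to limit the splits checked for a continuous feature is to test only a few evenly ranked candidate thresholds, sketched below in Python. This is an assumption for illustration and not necessarily how the PR selects its candidates; `candidate_thresholds` and `max_candidates` are hypothetical names.

```python
def candidate_thresholds(values, max_candidates):
    """Pick at most `max_candidates` split points at evenly spaced ranks."""
    ordered = sorted(set(values))
    if len(ordered) <= max_candidates + 1:
        # Few distinct values: use every midpoint between neighbours.
        return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    step = len(ordered) / (max_candidates + 1)
    return [ordered[int(step * i)] for i in range(1, max_candidates + 1)]
```

For 100 distinct values and `max_candidates=4`, only four thresholds are evaluated instead of 99.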
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/772#issuecomment-108482112
Great. This is exactly what I had in mind.
There is perhaps another feature we could incorporate. Every algorithm has
some performance measure so it can
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/700#issuecomment-107831123
Hey guys. You might wanna look at the initialization schemes here:
https://github.com/apache/flink/pull/757
Github user sachingoel0101 closed the pull request at:
https://github.com/apache/flink/pull/756
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/757
[FLINK-2131]: Initialization schemes for k-means clustering
This adds the two most common initialization strategies for the k-means
clustering algorithm, namely, Random initialization and kmeans
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/756
[FLINK-2131]: Initialization schemes for k-means clustering
This adds the two most common initialization strategies for the k-means
clustering algorithm, namely, Random initialization and kmeans
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/708
Decision tree [Flink-1727]
This implements a part of the Decision Tree Algorithm. As of now, only
continuous-valued fields are implemented, and only Gini-index-based splitting
is supported. Entropy
Github user sachingoel0101 closed the pull request at:
https://github.com/apache/flink/pull/708
GitHub user sachingoel0101 opened a pull request:
https://github.com/apache/flink/pull/710
Decision tree [Flink-1727]
This implements a part of the Decision Tree Algorithm. As of now, only
continuous-valued fields are implemented, and only Gini-index-based splitting
is supported. Entropy
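For reference, a Python sketch of how a Gini-based split on a continuous field is usually scored. It uses the standard Gini impurity definition; the function names are illustrative and the code is not taken from the PR.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Weighted Gini impurity of the two sides of `value <= threshold`."""
    left = [label for v, label in zip(values, labels) if v <= threshold]
    right = [label for v, label in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)
```

A lower weighted impurity means a purer split; a threshold that separates the classes perfectly scores 0.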