Github user thvasilo commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132552269
Sounds good, just make sure to update the title and description of both PRs
to reflect the current state
---
If your project is set up for it, you can reply to this
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132512002
Okay. Done.
I will modify the `createContinuousHistogram` to use `DataStats` in #1032,
since that is going to be merged after this anyway.
---
If your
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37390659
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,98 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132494738
Good! But because there is another issue (FLINK-2379) covered the
column-wise statistics, I think that spliting them would be better.
---
If your project is set up
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132128284
+1 for having two different methods by return type but we need more
comments from @tillrohrmann, @thvasilo or other people because I'm not sure
this is best approach.
Github user thvasilo commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132146208
Sounds good.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132170286
@thvasilo , @chiwanpark , I've made the required changes.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132118352
What about having two different functions? One for Discrete and one for
continuous?
Or perhaps just one for the Continuous as that is more likely to be used.
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132134747
Okay. That sounds good enough. Can we make a final decision then?
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132116059
I think that we can merge this PR after we decide the return type of
`createHistogram` method. Any other points seem okay.
---
If your project is set up for it, you
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132187018
Olrite. I'll be back in a few. :)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132183402
Is there the valid reason to have multiple names ('discrete histogram',
'categorical histogram')? I think that to avoid confusion, we need to unify the
names.
---
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132188989
@chiwanpark , modified the names.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132186504
I'm also inclined to use Discrete.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132193111
They're not coupled at all. But both are related to statistics over data
sets. This is why I combined them both in one. If you're wondering, there is a
JIRA for
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132439907
Including the collect step in the documentation seems weird, IMO. Since the
return type is a `DataSet[Histogram]`, anyone familiar with flink will know to
collect
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132430686
The only problem to merge this is invalid example in documentation. We need
calling `collect()` method and `apply(0)` to execute statistics methods such as
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132449023
Hi @chiwanpark , I have modified the documentation to include the collect
step.
I have also made a few modification to how the histogram is created by
using
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132209035
I have kept the documentation for statistics in this however. We can merge
#1032 first and then this to maintain the illusion of cause-effect. :')
---
If your
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132208131
@thvasilo , @chiwanpark , I've split this into two parts. The statistics
part is now in #1032
---
If your project is set up for it, you can reply to this email
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132264800
Haha. It's no problem. I've updated the documentation for this. :)
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user thvasilo commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-132261492
Sorry to be a nitpick, but if we are going to split the PR then the
documentation should be split accordingly, we can merge the column-wise
statistics once this one is
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37143932
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131512180
@sachingoel0101 There is no need to forgive. I just reviewed this PR and
left my opinions. :) If you feel that my comments is aggressive, I'm sorry
about that.
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144073
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/MLUtils.scala ---
@@ -119,4 +120,31 @@ object MLUtils {
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144068
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/MLUtils.scala ---
@@ -119,4 +120,31 @@ object MLUtils {
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131518423
The occasional space issues are because I stopped using Idea's reformat
tool. It sometimes messes up the indentations. :')
And no. Your comments weren't
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131523559
Fixed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131509499
Hi @chiwanpark , I've fixed the *cosmetic* issues.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144130
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,100 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131514958
I found some points to improve and added line notes. :)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144066
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/MLUtils.scala ---
@@ -119,4 +120,31 @@ object MLUtils {
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144175
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,100 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144173
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,100 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144171
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,100 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144183
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,129 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144182
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,129 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144195
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,129 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144192
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,129 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37144137
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,100 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131520764
I just found that `Predef.assume` could be replaced by `assume`. Sorry for
wrong guidance. Could you change `Predef.assume` to `assume`?
---
If your project is set
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37141049
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131474758
Hi @chiwanpark, thanks for reviewing this. This was my first time working
in Scala, so I hope you'll forgive the slight mistakes (oversights, perhaps?).
I've
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37141028
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user sachingoel0101 commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37141076
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,108 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137570
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/MLUtils.scala ---
@@ -119,4 +120,30 @@ object MLUtils {
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137605
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,135 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137592
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,135 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137631
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,135 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137647
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,135 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137658
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/CategoricalHistogram.scala
---
@@ -0,0 +1,135 @@
+/*
+ * Licensed
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137718
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137756
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137749
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137752
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137762
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137774
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137811
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137819
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137821
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137856
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137866
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37137965
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/FieldStats.scala
---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37138026
--- Diff:
flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/statistics/ContinuousHistogram.scala
---
@@ -0,0 +1,343 @@
+/*
+ * Licensed to
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37138063
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,108 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on a diff in the pull request:
https://github.com/apache/flink/pull/861#discussion_r37138093
--- Diff: docs/libs/ml/statistics.md ---
@@ -0,0 +1,108 @@
+---
+mathjax: include
+htmlTitle: FlinkML - Statistics
+title: a
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-131425060
Hi @sachingoel0101, Thanks for your contribution. I reviewed this PR and
commented the source code.
There are some problems which aren't commented. In
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/861#issuecomment-129336972
Hi, I just discovered the review request. I'll review this PR soon. Because
I'm busy in working for my graduation essay, maybe I can start reviewing on
weekend.
---
68 matches
Mail list logo