Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
I am too busy recently to fix those failed R tests. Anyone who has spare
time can take over this PR and I will help review. Thanks!
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/19621
I think we need to address that too. Sounds to me these tests aren't
stable before.
---
-
To unsubscribe,
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@felixcheung Another failed test case, spark.mlp in sparkR: it also uses
`RFormula`, so it will also generate non-deterministic results. See class
`MultilayerPerceptronClassifierWrapper` line 78:
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/19621
You can change the dataset used in testing.
Will be good if you could test with the same data before and after your
change to make sure that's not broken.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84955/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84955 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84955/testReport)**
for PR 19621 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84955 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84955/testReport)**
for PR 19621 at commit
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@felixcheung "iris" is a built-in dataset in R, used in many algorithm tests,
so is it proper to change it?
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/19621
maybe we could also change the test itself to make it more deterministic?
we could first create a new test dataset that avoid having frequency
values, run it through the original
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@felixcheung Yes, the spark.mlp test result changed because the indexer
order changed. That's because in StringIndexer, when item frequencies are equal,
there's no definite rule for the index order. And, in
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/19621
I think I understand what you are saying, but the latest test failure I see
is from spark.mlp instead, and the results are different from the existing ones.
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@felixcheung There is no breaking change. But we have run into some trouble
with non-deterministic behavior: when frequencies are equal, the indexer result
is non-deterministic. I already fixed those in
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/19621
StringIndexer is set automatically for the index column. Are we making a
breaking API change here?
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
Can anyone provide some suggestions for fixing the sparkR glm test failure
here?
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
I checked the failed tests in sparkR. There's some trouble in the failed
`glm` sparkR tests.
These tests compare sparkR glm and R-lib glm results on test data "iris",
but, what's the
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84125/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84125 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84125/testReport)**
for PR 19621 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84125 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84125/testReport)**
for PR 19621 at commit
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@viirya @MLnick Code updated. Thanks!
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@MLnick Ah, I didn't express it precisely. For the first case, what I mean is:
sort by frequency, but when frequencies are equal, sort alphabetically.
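
The rule described here (frequency first, ties broken alphabetically) can be sketched in plain Python. This is only an illustration of the ordering rule, not Spark's `StringIndexer` code; the function name `index_labels`, its `descending` parameter, and the sample labels are made up for the example.

```python
from collections import Counter


def index_labels(values, descending=True):
    """Map each distinct label to an index, ordered by frequency.

    Labels with equal frequency are ordered alphabetically, so the result
    does not depend on the order in which the values were encountered.
    """
    counts = Counter(values)
    # Sort key: frequency (negated for descending order), then the label.
    sign = -1 if descending else 1
    ordered = sorted(counts, key=lambda label: (sign * counts[label], label))
    return {label: i for i, label in enumerate(ordered)}


# "b" and "c" both occur twice; "a" occurs once.
# The two inputs hold the same elements in a different order.
print(index_labels(["c", "b", "a", "b", "c"]))
print(index_labels(["b", "c", "b", "c", "a"]))
```

Both calls produce the same mapping because the secondary key removes any dependence on encounter order. Whether ties should sort ascending alphabetically in both the ascending and descending frequency modes is a design choice this sketch simply assumes.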
---
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19621
The first case you mention wouldn't actually end up sorting by freq, no?
It
would have to be the other way around?
For second case, yes equality must mean it is the same string / key
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@MLnick How about this way:
The case "frequencyAsc/Desc": sort first by frequency and then alphabetically.
The case "alphabetAsc/Desc": sort alphabetically (and if alphabetically equal,
the
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84101/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84101 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84101/testReport)**
for PR 19621 at commit
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19621
It won't be deterministic in the case of different RDDs / partitions /
shuffle etc. For a given input RDD it _should_ be deterministic?
But perhaps we could ensure it by first sorting
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@MLnick Will RDD "count by value" aggregation be deterministic? E.g., 2
RDDs with the same elements, but with different element order and a different
partition number, will
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/19621
@WeichenXu123 with reference to
https://github.com/apache/spark/pull/19621#issuecomment-344530228 - the sort
is
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84101 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84101/testReport)**
for PR 19621 at commit
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
Jenkins retest this please.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84093 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84093/testReport)**
for PR 19621 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84093/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84093 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84093/testReport)**
for PR 19621 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84066/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84066 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84066/testReport)**
for PR 19621 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #84066 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84066/testReport)**
for PR 19621 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19621
Seems in the frequency-based string orders, the order of labels with same
frequency is non-deterministic.
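
A small plain-Python sketch of why this happens when labels are ordered by frequency alone. It is an analogy, not Spark code: the function name `index_by_frequency_only` and the sample data are hypothetical, and list order stands in for whatever order the values happen to arrive in.

```python
from collections import Counter


def index_by_frequency_only(values):
    """Index labels by descending frequency with no tie-break rule."""
    counts = Counter(values)
    # sorted() is stable, so labels with equal counts keep the order in
    # which Counter first saw them, i.e. the order of the input.
    ordered = sorted(counts, key=lambda label: -counts[label])
    return {label: i for i, label in enumerate(ordered)}


# Same elements, different arrival order; "x" and "y" both occur twice.
order_1 = ["x", "y", "x", "y", "z"]
order_2 = ["y", "x", "y", "x", "z"]

print(index_by_frequency_only(order_1))
print(index_by_frequency_only(order_2))
```

The two mappings disagree on which of the tied labels gets index 0, even though the data is the same. Only the tied labels are affected; a label with a unique frequency always lands in the same position.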
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
I want to ask: for the option `StringIndexer.frequencyDesc`, when two
labels have the same frequency, which of them will be put
first?
If this is not specified,
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83878/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #83878 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83878/testReport)**
for PR 19621 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #83878 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83878/testReport)**
for PR 19621 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83872/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #83872 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83872/testReport)**
for PR 19621 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19621
@WeichenXu123 I will try to look into this today.
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@viirya @MLnick Thanks!
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #83872 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83872/testReport)**
for PR 19621 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83396/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83392/
Test FAILed.
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
Jenkins, test this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83323/
Test FAILed.
---
Github user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19621
@viirya Code updated. Thanks!
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83265/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83263/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #83263 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83263/testReport)**
for PR 19621 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19621
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19621
**[Test build #83263 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83263/testReport)**
for PR 19621 at commit