Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
Merging with master
This is awesome to get in---thanks a lot @imatiach-msft and everyone who
contributed and reviewed!!
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84118/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84118 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84118/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84117/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84117 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84117/testReport)**
for PR 19439 at commit
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
Thanks! LGTM pending tests
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84118 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84118/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
@jkbradley good catch - I added the missing link to the license file and I
rebased the code against the very very latest master. Thanks!
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84117 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84117/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
good catch, it's from here:
https://ccsearch.creativecommons.org/image/detail/B2CVP_j5KjwZm7UAVJ3Hvw==
let me add it to the list
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
I just noticed: Where is data/mllib/images/kittens/DP153539.jpg from?
(It's missing in the license list.)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84100/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84100 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84100/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84100 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84100/testReport)**
for PR 19439 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84042/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84042 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84042/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #84042 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84042/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
@jkbradley done! rebased to latest master. Thanks!
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
LGTM, except that it looks like this doesn't merge cleanly. Would you mind
rebasing it on master?
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #3988 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3988/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #3988 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3988/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83845/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83845 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83845/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83845 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83845/testReport)**
for PR 19439 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
FWIW, I have no more comments and it seems fine. I have reviewed this and
manually tested it multiple times.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83821/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83821 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83821/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83821 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83821/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
@HyukjinKwon and @jkbradley I've updated the documentation based on your
latest comments. I believe all comments have been resolved for this PR at this
point, please let me know if I missed
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
@jkbradley agreed, we can add a warning for now. The user can always
ignore sampling by setting the sampling rate to 1 as a workaround.
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
Thanks for the explanation! Given the complexity here, I'm OK with the
random seed approach but recommend we add a warning about sampling being more
efficient but potentially non-deterministic.
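
To illustrate the trade-off discussed here, the following is a minimal sketch in plain Python, not the code from this PR. The function name `seeded_sample` and the file names are hypothetical. It assumes sampling works by drawing one number per path from a seeded random generator, so the selected subset depends on the order in which paths are visited, while a sampling rate of 1 keeps every path regardless of order.

```python
import random


def seeded_sample(paths, ratio, seed):
    """Keep a path when the next draw from a seeded RNG falls below `ratio`.

    Hypothetical helper for illustration only. Each path consumes one draw,
    so the outcome depends on both the seed and the visiting order.
    """
    rng = random.Random(seed)
    return [p for p in paths if rng.random() < ratio]


paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]

# Same seed and same visiting order: the sample is reproducible.
first = seeded_sample(paths, 0.5, seed=42)
second = seeded_sample(paths, 0.5, seed=42)
print(first == second)

# Same seed but a different visiting order: the chosen subset may differ,
# which is the potential non-determinism the warning would describe.
reordered = seeded_sample(list(reversed(paths)), 0.5, seed=42)
print(sorted(first), sorted(reordered))

# A ratio of 1.0 keeps everything, because random() is always below 1.0.
print(seeded_sample(paths, 1.0, seed=42) == paths)
```

If file listing order is not guaranteed to be stable across runs, a fixed seed alone would not make the sample reproducible under this assumption, which is why setting the rate to 1 works as a workaround.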
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/19439
@jkbradley I'm not confident enough about this part but a quick check
suggested that typically `PathFilter`s are used in
`FileInputFormat.listStatus()`, which is usually called in
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
@liancheng I see you've worked with PathFilters in Spark SQL, so I'll ask
here: We're uncertain about how PathFilters are used in Hadoop, and it would be
helpful to understand (and use) them in
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83768/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83768 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83768/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83768 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83768/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
I've updated the code to take care of all comments except this one:
"Determinism for sampling (commented above)"
I will need to think about this a bit more. @jkbradley
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83767/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83767 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83767/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83767 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83767/testReport)**
for PR 19439 at commit
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
Thanks for the updates! My only remaining comments are about:
* Default arguments for readImages in Scala not being Java-friendly (I'd
still recommend taking the easy route by having 1
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83679/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83679 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83679/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83679 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83679/testReport)**
for PR 19439 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83672/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83672 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83672/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83672 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83672/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83671 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83671/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83671/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83670/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83670 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83670/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83671 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83671/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83670 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83670/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
Yes, I am working on updating the PR, will have a new update soon, thanks!
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
@imatiach-msft would you maybe have some time to address the comments
above? I am actually very curious about this PR and want to push forward :).
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83344/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83344 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83344/testReport)**
for PR 19439 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83335/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83335 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83335/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83344 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83344/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
Jenkins retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83342/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
@jkbradley thank you for taking a look - I've moved the images used in the
scala tests to the new directory:
https://github.com/apache/spark/tree/master/data/mllib
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83335 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83335/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
@HyukjinKwon the way you disallowed the `__init__` method but cached variables
like ocvTypes in Python is very cool, nice suggestion!
---
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
Will do as soon as I can!
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
I have two concerns at a high level. One is the Python API shape,
imatiach-msft#1, and the other is Java API support related to `Map` -
https://github.com/apache/spark/pull/19439#discussion_r148289879.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
@jkbradley, BTW, would you mind checking the API structure please? I reviewed
this to be consistent with other components and code as best I could but, to be
honest, my ML knowledge and familiarity are
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/19439
Quick comment: I see that data are being added under
mllib/src/test/resources/. That appears to be a new directory, created
recently. The standard directory is
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83303/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83303 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83303/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83303 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83303/testReport)**
for PR 19439 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
I first tried the approach in
https://github.com/graphframes/graphframes/pull/169/files#diff-e81e6b169c0aa35012a3263b2f31b330R381
locally, but it seems to cause warnings and failed to generate the
doc via
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/19439
I opened a PR to deal with the docstring issue and to complete
https://github.com/apache/spark/pull/19439#discussion_r148027184 here:
https://github.com/imatiach-msft/spark/pull/1
Merging it
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83277/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83277 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83277/testReport)**
for PR 19439 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19439
**[Test build #83277 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83277/testReport)**
for PR 19439 at commit
Github user imatiach-msft commented on the issue:
https://github.com/apache/spark/pull/19439
Jenkins retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19439
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83275/
Test FAILed.
---