[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193691#comment-16193691
]
Ilya Matiach commented on SPARK-21742:
--
[~podongfeng] The test was just validating that the edge
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137773#comment-16137773
]
zhengruifeng commented on SPARK-21742:
--
[~srowen] Yes, if we cache the input dataset in the test suite,
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134326#comment-16134326
]
Sean Owen commented on SPARK-21742:
---
I haven't noticed test failures in Jenkins. Is it something that
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130045#comment-16130045
]
zhengruifeng commented on SPARK-21742:
--
[~srowen] I cache the dataset in that test, and then it fails.
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16130032#comment-16130032
]
Sean Owen commented on SPARK-21742:
---
OK, so there's no difference attributable only to persisting?
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128623#comment-16128623
]
zhengruifeng commented on SPARK-21742:
--
[~srowen] I create {{random}} and {{rdd}} twice in the REPL with
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128580#comment-16128580
]
Sean Owen commented on SPARK-21742:
---
Fixing the seed still doesn't mean that the two cases get the same
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128535#comment-16128535
]
zhengruifeng commented on SPARK-21742:
--
[~mlnick] The seed is already fixed. It looks like if we use
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128512#comment-16128512
]
Nick Pentreath commented on SPARK-21742:
--
Isn't the solution to set a fixed seed for the randomly
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128509#comment-16128509
]
zhengruifeng commented on SPARK-21742:
--
[~srowen] You are right. When I create the same dataset in a
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128455#comment-16128455
]
Sean Owen commented on SPARK-21742:
---
Can you demonstrate this with a data set that isn't randomly
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128452#comment-16128452
]
zhengruifeng commented on SPARK-21742:
--
[~srowen] I retested it in a different spark-shell. And the
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128433#comment-16128433
]
Sean Owen commented on SPARK-21742:
---
You're defining a source DataFrame that's non-deterministic, and
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128427#comment-16128427
]
zhengruifeng commented on SPARK-21742:
--
[~srowen] I set the seed for generating the dataset and training
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128393#comment-16128393
]
Sean Owen commented on SPARK-21742:
---
Is that a bug? Isn't it stochastic and dependent on the data order