[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265397#comment-16265397 ]

Apache Spark commented on SPARK-21693:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/19816

> AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
> -------------------------------------------------------------------------
>
> Key: SPARK-21693
> URL: https://issues.apache.org/jira/browse/SPARK-21693
> Project: Spark
> Issue Type: Test
> Components: Build, SparkR
> Affects Versions: 2.3.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Fix For: 2.2.1, 2.3.0
>
> We now sometimes reach the time limit, 1.5 hours:
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1676-master
> I previously asked AppVeyor to increase the limit from one hour to 1.5 hours,
> but it looks like we should fix this within Spark; we cannot keep asking for
> the limit to be raised again and again.
> I identified two things that take quite a bit of time:
> 1. The cache feature is disabled in the pull request builder, which ends up
> downloading Maven dependencies on every build (roughly 10 mins):
> https://www.appveyor.com/docs/build-cache/
> {quote}
> Note: Saving cache is disabled in Pull Request builds.
> {quote}
> See also
> http://help.appveyor.com/discussions/problems/4159-cache-doesnt-seem-to-be-working
> This seems difficult to fix within Spark.
> 2. The "MLlib classification algorithms" tests (30-35 mins):
> {code}
> MLlib classification algorithms, except for tree-based algorithms: Spark package found in SPARK_HOME: C:\projects\spark\bin\..
> ..
> {code}
> As a (I think) last resort, we could make a matrix for this test alone, so
> that the other tests run after one build and this test runs after another
> build. For example, I run the Scala tests with this workaround:
> https://ci.appveyor.com/project/spark-test/spark/build/757-20170716 (a matrix
> with 7 builds, each building and testing).
> I am also checking and testing other ways.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
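The last-resort matrix split described above could look roughly like the following {{appveyor.yml}} fragment. This is only a sketch: the {{SPARKR_TEST_GROUP}} variable and the runner script name are hypothetical, not anything in Spark's actual configuration.

{code}
# Hypothetical appveyor.yml fragment: run the slow "MLlib classification
# algorithms" suite as its own matrix job, separate from the other SparkR tests.
environment:
  matrix:
    - SPARKR_TEST_GROUP: mllib-classification
    - SPARKR_TEST_GROUP: other

build_script:
  # Each matrix job builds Spark once before running its test group.
  - cmd: mvn -DskipTests -Psparkr package

test_script:
  # run-sparkr-tests.R is a hypothetical runner that filters by group.
  - cmd: Rscript run-sparkr-tests.R %SPARKR_TEST_GROUP%
{code}

The trade-off is that each matrix job rebuilds Spark, so wall-clock time per job drops while total machine time rises.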
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248524#comment-16248524 ]

Apache Spark commented on SPARK-21693:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/19722
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248470#comment-16248470 ]

Hyukjin Kwon commented on SPARK-21693:
--------------------------------------

For caching, yes: that looks like why the test failures are less frequent on the
master branch. I am not sure about the other parts. I thought about the build
time before (IIRC) but could not come up with a good way to speed it up. For
now, I will investigate the single(?) test that takes roughly 20(?) mins (IIRC)
and share the results if I cannot make a PR to fix it.
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247884#comment-16247884 ]

Felix Cheung commented on SPARK-21693:
--------------------------------------

Let's continue this conversation. From the first link on caching, it looks like
the cache can still function if a run is kicked off not via a PR?
Also, aside from the ML test run taking 30 mins, the rest of the time is spent
building the Spark jar, right? Are there more parts we can trim there?
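If the cache can indeed be restored (though not saved) in PR builds, warming it from branch builds would only require listing the local Maven repository in {{appveyor.yml}}. A sketch, assuming the default AppVeyor user-profile path:

{code}
# Hypothetical appveyor.yml fragment: cache the local Maven repository.
# Branch builds save the cache; per the AppVeyor note quoted above, PR builds
# cannot save it, but they may still restore a cache saved by a branch build.
cache:
  - C:\Users\appveyor\.m2
{code}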
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121888#comment-16121888 ]

Hyukjin Kwon commented on SPARK-21693:
--------------------------------------

Yes, it does build multiple times. If I have observed this correctly, it won't
particularly affect queuing, but it would add roughly 25-30 mins more to each
build. I will check out other possible approaches too, and also try to time
each test in "MLlib classification algorithms".
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121866#comment-16121866 ]

Felix Cheung commented on SPARK-21693:
--------------------------------------

Splitting the test matrix is also possible. I worry, though, that since caching
is disabled, the Spark jar would be built multiple times? My main concerns are
how long the tests will run and whether that will lengthen the queuing of test
runs (the queue can already get quite long, and people sometimes ignore pending
AppVeyor runs).
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121860#comment-16121860 ]

Felix Cheung commented on SPARK-21693:
--------------------------------------

We could certainly simplify the classification set, but there is a fair number
of APIs being tested in there; perhaps we could time them to see which ones are
taking the time.
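Timing the individual API calls could be done crudely with {{system.time}}. This is a sketch only, assuming an already-started SparkR session; the timing harness and sample data are invented, though {{spark.logit}} and {{createDataFrame}} are real SparkR APIs.

{code}
# Sketch (R): record elapsed time per ML API call to find the slow ones.
timings <- list()
timed <- function(label, expr) {
  timings[[label]] <<- system.time(expr)[["elapsed"]]
}

training <- suppressWarnings(createDataFrame(iris))
timed("spark.logit", spark.logit(training, Species ~ ., regParam = 0.5))

# Print slowest first.
print(sort(unlist(timings), decreasing = TRUE))
{code}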
[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
[ https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121827#comment-16121827 ]

Hyukjin Kwon commented on SPARK-21693:
--------------------------------------

FYI, [~felixcheung] and [~shivaram].

> AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
> -------------------------------------------------------------------------
>
> Key: SPARK-21693
> URL: https://issues.apache.org/jira/browse/SPARK-21693
> Project: Spark
> Issue Type: Test
> Components: Build, SparkR
> Affects Versions: 2.3.0
> Reporter: Hyukjin Kwon
>
> We now sometimes reach the time limit, 1.5 hours:
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1676-master
> I previously asked to increase the limit from one hour to 1.5 hours, but it
> looks like we should fix this in AppVeyor; we cannot keep asking for the limit
> to be raised again and again.
> I identified three things that take quite a bit of time:
> 1. The cache feature is disabled in the pull request builder, which ends up
> downloading Maven dependencies on every build (15-20 mins):
> https://www.appveyor.com/docs/build-cache/
> {quote}
> Note: Saving cache is disabled in Pull Request builds.
> {quote}
> See also
> http://help.appveyor.com/discussions/problems/4159-cache-doesnt-seem-to-be-working
> This seems difficult to fix within Spark.
> 2. The "MLlib classification algorithms" tests (30-35 mins):
> {code}
> MLlib classification algorithms, except for tree-based algorithms: Spark package found in SPARK_HOME: C:\projects\spark\bin\..
> ..
> {code}
> As a (I think) last resort, we could make a matrix for this test alone, so
> that the other tests run after one build and this test runs after another
> build. For example, I run the Scala tests with this workaround:
> https://ci.appveyor.com/project/spark-test/spark/build/757-20170716 (a matrix
> with 7 builds, each building and testing).
> 3. {{spark.sparkr.use.daemon}} is disabled on Windows due to the limitation of
> {{mcfork}}. See [this code|https://github.com/apache/spark/blob/478fbc866fbfdb4439788583281863ecea14e8af/core/src/main/scala/org/apache/spark/api/r/RRunner.scala#L362-L392].
> With the daemon disabled, we currently fork worker processes from Java, which
> is expensive. I have not tested this yet, but reducing
> {{spark.sql.shuffle.partitions}} might be an approach to work around it. If I
> understood correctly, this is 200 by default in the R tests, which ends up
> forking 200 processes for every shuffle.
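If lowering the shuffle partition count does help, the change could be as small as one setting in the test session. A sketch using the real {{sparkR.session}} API, with the value 16 chosen arbitrarily:

{code}
# Sketch (R): start the test SparkR session with far fewer shuffle partitions,
# so that many fewer worker processes are forked per shuffle on Windows.
sparkR.session(
  master = "local[2]",
  sparkConfig = list(spark.sql.shuffle.partitions = "16")
)
{code}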