[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-11-24 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265397#comment-16265397
 ] 

Apache Spark commented on SPARK-21693:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/19816

> AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
> -
>
> Key: SPARK-21693
> URL: https://issues.apache.org/jira/browse/SPARK-21693
> Project: Spark
> Issue Type: Test
> Components: Build, SparkR
> Affects Versions: 2.3.0
> Reporter: Hyukjin Kwon
> Assignee: Hyukjin Kwon
> Fix For: 2.2.1, 2.3.0
>
>
> We now sometimes reach the 1.5-hour time limit, for example: 
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1676-master
> I previously requested an increase from 1 hour to 1.5 hours, but it looks like 
> we should fix this in Spark. I have asked for this for my account a few times 
> before, but it looks like we can't keep increasing the time limit again and again.
> I could identify two things that seem to take quite a bit of time:
> 1. The cache feature is disabled in the pull request builder, so builds end up 
> downloading Maven dependencies (roughly 10 minutes).
> https://www.appveyor.com/docs/build-cache/
> {quote}
> Note: Saving cache is disabled in Pull Request builds.
> {quote}
> and also see 
> http://help.appveyor.com/discussions/problems/4159-cache-doesnt-seem-to-be-working
> This seems difficult to fix within Spark.
> 2. The "MLlib classification algorithms" tests (roughly 30-35 minutes)
> The test below seems to take roughly 30-35 minutes.
> {code}
> MLlib classification algorithms, except for tree-based algorithms: Spark 
> package found in SPARK_HOME: C:\projects\spark\bin\..
> ..
> {code}
> As a last resort (I think), we could create a build matrix for this test alone, 
> so that the other tests run after one build and this test runs after another 
> build. For example, I run the Scala tests with this workaround: 
> https://ci.appveyor.com/project/spark-test/spark/build/757-20170716 (a matrix 
> of 7 jobs, each with its own build and tests).
> I am also checking and testing other approaches.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-11-11 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248524#comment-16248524
 ] 

Apache Spark commented on SPARK-21693:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/19722







[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-11-11 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16248470#comment-16248470
 ] 

Hyukjin Kwon commented on SPARK-21693:
--

For caching, yup, it looks like that's why the test failures are less frequent 
in the master branch.
Not sure about the other parts. IIRC, I first looked at the build time but 
failed to come up with a good idea to speed it up.
For now, I will investigate the single(?) test that takes 20ish(?) mins (IIRC) 
and share the results if I can't put together a PR to fix it.








[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-11-10 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16247884#comment-16247884
 ] 

Felix Cheung commented on SPARK-21693:
--

Let's continue this conversation.
From the first link on caching, it looks like it can still function if a run 
is not kicked off via a PR?

Also, aside from the ML test run taking 30 min, the rest of the time is spent 
building the Spark jar, right? Are there more parts we can trim there?







[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121888#comment-16121888
 ] 

Hyukjin Kwon commented on SPARK-21693:
--

Yes, it does build multiple times. If I have observed this correctly, it won't 
particularly affect queuing, but it would add roughly 25-30 more minutes to 
each build. I will look into other possible options too, and also try to 
measure the time each test takes in "MLlib classification algorithms".
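
The trade-off of splitting into a matrix can be sketched with some rough arithmetic. This is only an illustration in Python, not a measurement: the 90-minute limit, the 30-minute build (upper end of the 25-30 estimate) and the 35 minutes for the MLlib classification tests come from this thread, while the 40 minutes for the remaining tests, the job names and the helper names are assumptions made up for the example.

```python
from dataclasses import dataclass

TIME_LIMIT_MIN = 90  # AppVeyor limit discussed in this issue (1.5 hours)


@dataclass
class Job:
    name: str
    build_min: int  # minutes spent building Spark in this job
    test_min: int   # minutes spent running this job's tests

    @property
    def total_min(self) -> int:
        return self.build_min + self.test_min


def summarize(jobs):
    """Return (longest single job, summed machine time) in minutes."""
    longest = max(job.total_min for job in jobs)
    machine_time = sum(job.total_min for job in jobs)
    return longest, machine_time


# ASSUMPTION: 40 minutes for "other tests" is a placeholder, not a measured value.
single = [Job("all tests", build_min=30, test_min=35 + 40)]
split = [
    Job("mllib classification", build_min=30, test_min=35),
    Job("other tests", build_min=30, test_min=40),
]

for label, jobs in (("single job", single), ("matrix", split)):
    longest, machine_time = summarize(jobs)
    status = "within" if longest <= TIME_LIMIT_MIN else "over"
    print(f"{label}: longest job {longest} min ({status} limit), "
          f"machine time {machine_time} min")
```

Under these assumed numbers each matrix job stays below the limit, at the cost of one extra build's worth of machine time.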

> AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
> -
>
> Key: SPARK-21693
> URL: https://issues.apache.org/jira/browse/SPARK-21693
> Project: Spark
> Issue Type: Test
> Components: Build, SparkR
> Affects Versions: 2.3.0
> Reporter: Hyukjin Kwon
>
> We now sometimes reach the 1.5-hour time limit, for example: 
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1676-master
> I previously requested an increase from 1 hour to 1.5 hours, but it looks like 
> we should fix this in AppVeyor. I have asked for this for my account a few times 
> before, but it looks like we can't keep increasing the time limit again and again.
> I could identify two things that seem to take quite a bit of time:
> 1. The cache feature is disabled in the pull request builder, so builds end up 
> downloading Maven dependencies (roughly 10-20 minutes).
> https://www.appveyor.com/docs/build-cache/
> {quote}
> Note: Saving cache is disabled in Pull Request builds.
> {quote}
> and also see 
> http://help.appveyor.com/discussions/problems/4159-cache-doesnt-seem-to-be-working
> This seems difficult to fix within Spark.
> 2. The "MLlib classification algorithms" tests (roughly 30-35 minutes)
> The test below seems to take roughly 30-35 minutes.
> {code}
> MLlib classification algorithms, except for tree-based algorithms: Spark 
> package found in SPARK_HOME: C:\projects\spark\bin\..
> ..
> {code}
> As a last resort (I think), we could create a build matrix for this test alone, 
> so that the other tests run after one build and this test runs after another 
> build. For example, I run the Scala tests with this workaround: 
> https://ci.appveyor.com/project/spark-test/spark/build/757-20170716 (a matrix 
> of 7 jobs, each with its own build and tests).
> I am also checking and testing other approaches.






[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121866#comment-16121866
 ] 

Felix Cheung commented on SPARK-21693:
--

Splitting the test matrix is also possible. I worry, though: since caching is 
disabled, won't the Spark jar be built multiple times? My main concerns are 
how long the tests will run and whether that will lengthen the queue of test 
runs (which can already get quite long, and people sometimes ignore pending 
AppVeyor runs).







[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Felix Cheung (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121860#comment-16121860
 ] 

Felix Cheung commented on SPARK-21693:
--

We could certainly simplify the classification set, but there's a fair number 
of APIs being tested in there. Perhaps we could time them to see which ones 
are taking the most time.
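
One way to time them could look like the sketch below. It is written in Python only to illustrate the idea; the SparkR tests themselves are R files, so this is not how they would actually be instrumented. The helper names (timed, slowest) and the test labels are hypothetical, and the sleeps stand in for real test bodies.

```python
import time
from contextlib import contextmanager

durations = {}  # test name -> accumulated elapsed seconds


@contextmanager
def timed(name):
    """Record how long the wrapped block takes under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        durations[name] = durations.get(name, 0.0) + elapsed


def slowest(n=5):
    """Return up to n (name, seconds) pairs, slowest first."""
    return sorted(durations.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Placeholder test bodies; real tests would go inside the with-blocks.
with timed("classification test A"):
    time.sleep(0.05)
with timed("classification test B"):
    time.sleep(0.01)

for name, seconds in slowest():
    print(f"{name}: {seconds:.3f}s")
```

Sorting by elapsed time puts the candidates for trimming at the top of the list.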







[jira] [Commented] (SPARK-21693) AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests

2017-08-10 Thread Hyukjin Kwon (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121827#comment-16121827
 ] 

Hyukjin Kwon commented on SPARK-21693:
--

FYI, [~felixcheung] and [~shivaram].

> AppVeyor tests reach the time limit, 1.5 hours, sometimes in SparkR tests
> -
>
> Key: SPARK-21693
> URL: https://issues.apache.org/jira/browse/SPARK-21693
> Project: Spark
> Issue Type: Test
> Components: Build, SparkR
> Affects Versions: 2.3.0
> Reporter: Hyukjin Kwon
>
> We now sometimes reach the 1.5-hour time limit, for example: 
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1676-master
> I previously requested an increase from 1 hour to 1.5 hours, but it looks like 
> we should fix this in AppVeyor. I have asked for this for my account a few times 
> before, but it looks like we can't keep increasing the time limit again and again.
> I could identify three things that take quite a bit of time:
> 1. The cache feature is disabled in the pull request builder, so builds end up 
> downloading Maven dependencies (roughly 15-20 minutes).
> https://www.appveyor.com/docs/build-cache/
> {quote}
> Note: Saving cache is disabled in Pull Request builds.
> {quote}
> and also see 
> http://help.appveyor.com/discussions/problems/4159-cache-doesnt-seem-to-be-working
> This seems difficult to fix within Spark.
> 2. The "MLlib classification algorithms" tests (roughly 30-35 minutes)
> The test below seems to take roughly 30-35 minutes.
> {code}
> MLlib classification algorithms, except for tree-based algorithms: Spark 
> package found in SPARK_HOME: C:\projects\spark\bin\..
> ..
> {code}
> As a last resort (I think), we could create a build matrix for this test alone, 
> so that the other tests run after one build and this test runs after another 
> build. For example, I run the Scala tests with this workaround: 
> https://ci.appveyor.com/project/spark-test/spark/build/757-20170716 (a matrix 
> of 7 jobs, each with its own build and tests).
> 3. {{spark.sparkr.use.daemon}} is disabled on Windows due to a limitation of 
> {{mcfork}}
> See [this 
> code|https://github.com/apache/spark/blob/478fbc866fbfdb4439788583281863ecea14e8af/core/src/main/scala/org/apache/spark/api/r/RRunner.scala#L362-L392].
>  We disabled this feature and currently fork processes from Java, which is 
> expensive. I haven't tested this yet, but reducing 
> {{spark.sql.shuffle.partitions}} might be a way to work around this. 
> Currently, if I understand correctly, this is 200 by default in the R tests, 
> which results in 200 Java processes for every shuffle.
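
The arithmetic behind item 3 can be sketched roughly as follows. This is an illustration in Python, not Spark code: it assumes, as the description does, that one process is forked per shuffle partition. The function name, the shuffle count of 10 and the reduced setting of 4 are hypothetical values chosen only for the example.

```python
def forked_processes(num_shuffles: int, shuffle_partitions: int) -> int:
    """Estimate forked processes, assuming one process per shuffle partition."""
    if num_shuffles < 0 or shuffle_partitions < 1:
        raise ValueError("expected num_shuffles >= 0 and shuffle_partitions >= 1")
    return num_shuffles * shuffle_partitions


DEFAULT_PARTITIONS = 200  # default cited in the description above
REDUCED_PARTITIONS = 4    # ASSUMPTION: hypothetical lower setting, not from the issue
NUM_SHUFFLES = 10         # ASSUMPTION: hypothetical number of shuffles in a test file

before = forked_processes(NUM_SHUFFLES, DEFAULT_PARTITIONS)
after = forked_processes(NUM_SHUFFLES, REDUCED_PARTITIONS)
print(f"default: {before} forks, reduced: {after} forks")
```

If the one-process-per-partition assumption holds, the number of forks scales linearly with the partition setting, which is why lowering it looks like a plausible workaround to test.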


