[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238453#comment-16238453 ]

ASF GitHub Bot commented on JENA-1407:

Github user afs commented on the issue:

https://github.com/apache/jena/pull/297

The changes do make the build significantly more usable! Thanks.

> Improvements to build/test time of Elephas tests.
>
> Key: JENA-1407
> URL: https://issues.apache.org/jira/browse/JENA-1407
> Project: Apache Jena
> Issue Type: Improvement
> Affects Versions: Jena 3.4.0
> Reporter: Andy Seaborne
> Assignee: Rob Vesse
> Priority: Minor
> Fix For: Jena 3.5.0
> Attachments: Elephas-Test-Times
>
> The Elephas tests can take a significant proportion of the total build time.
> If this could be improved without loss of testing, then development work,
> release work, and local builds would all be improved.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238367#comment-16238367 ]

ASF GitHub Bot commented on JENA-1407:

Github user rvesse commented on the issue:

https://github.com/apache/jena/pull/297

I think leaving it at 2 is good. I was playing with using `false`, though obviously that lessens the benefit.

I think we can close this out if we've already delivered noticeable improvements. It's not really worth tweaking the configs much more, as any further gain is likely limited.
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238368#comment-16238368 ]

ASF GitHub Bot commented on JENA-1407:

Github user rvesse closed the pull request at:

https://github.com/apache/jena/pull/297
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238334#comment-16238334 ]

ASF GitHub Bot commented on JENA-1407:

Github user afs commented on the issue:

https://github.com/apache/jena/pull/297

Jena 3.5.0 went out with `2` which is per-core (hyperthreading seems to count as 2 for a total of 4 per CPU-chip).

`threadCount=1` made a big difference and a setting of 2 more so. Changing 2 to 4 made little difference.

[Write-up on the JIRA ticket](https://issues.apache.org/jira/browse/JENA-1407?focusedCommentId=16222769#comment-16222769).

@rvesse With that, are you happy to close the JIRA and this PR? Do you want to bump to 4 anyway?
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222769#comment-16222769 ]

Andy Seaborne commented on JENA-1407:

A comparison of the changes:

The machine is quad-core with hyperthreading (so 8 hardware threads). Watching with gnome-system-monitor.

Running old-style (no POM changes):

In the first test, BatchedTriGOutputTest, CPU use hops around for a short time (expected, start-up), then maxes out one and only one CPU at 100%, smoothly, with no thread hopping showing.

Then it hits some fast tests and the usage is jumpy, typical of many small work items.

When it hits TriXAsQuadsInputTest, CPU drops to low, noise levels.

Slow tests have a small burst of CPU at the start, then nothing. Compression adds a small amount to the CPU burst, but for the majority of the ~1 min there is still no appreciable CPU use.

Conclusion: it is not compute bound; it is some sort of thread scheduling/locking.

With threadCount=2, parallel=classes:

Long wait (~1 min) until the first Surefire message: one CPU maxed out, smooth.

{noformat}
Running org.apache.jena.hadoop.rdf.io.registry.TestHadoopRdfIORegistry
{noformat}

Much better. All 8 threads, 4 CPUs, are being used for a while, then it drops off to zero for the last minute or so of the test run.

With threadCount=1, parallel=classes (i.e. Surefire "one thread per core"):

I get much the same thread behaviour as threadCount=2, 8 threads in action. Maybe "core" = "hyperthread".

Some rough times:

||ThreadCount||Time||
|1|3m40s|
|2|2m45s|
|4|2m40s|
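As a rough aid to reading the timing table above, the following sketch converts the three reported times to seconds and expresses each as a speedup relative to threadCount=1. The helper name `to_seconds` is hypothetical and only for illustration. The figures are the rough times from the comment, so the ratios are indicative at best.

```python
def to_seconds(text):
    """Parse a duration written like '3m40s' into whole seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)


# Rough times copied from the table in the comment above.
rough_times = {1: "3m40s", 2: "2m45s", 4: "2m40s"}

baseline = to_seconds(rough_times[1])
for thread_count, text in rough_times.items():
    elapsed = to_seconds(text)
    speedup = baseline / elapsed
    print(f"threadCount={thread_count}: {elapsed}s, {speedup:.2f}x vs threadCount=1")
```

Going from 1 to 2 saves 55 seconds, while going from 2 to 4 saves only 5 more, which is consistent with the observation that the tests are not compute bound.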
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1601#comment-1601 ]

ASF GitHub Bot commented on JENA-1407:

Github user afs commented on the issue:

https://github.com/apache/jena/pull/297

From the documentation, the default mode of Surefire is that `` is count-per-core.

{noformat}
perCoreThreadCount (Default: true)
    (JUnit 4.7 provider) Indicates that threadCount, threadCountSuites, threadCountClasses, threadCountMethods are per cpu core.
    User property: perCoreThreadCount
{noformat}

So setting thread count to 50% CPUs is a number per core?

Let's go with the original PR for the 3.5.0 RC2.
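The question above comes down to simple arithmetic, sketched here. This is a simplified model of the behaviour described in the quoted documentation (thread count multiplied by the number of cores when `perCoreThreadCount` is true), not Surefire's actual code. The function names are hypothetical, and `cores=8` is an assumption taken from the quad-core, hyperthreaded machine mentioned elsewhere in this thread.

```python
import os


def half_of_cores(cores):
    """Return 50% of the core count, but never fewer than one thread."""
    return max(1, cores // 2)


def effective_threads(thread_count, per_core=True, cores=None):
    """Model the total thread count implied by a per-core or absolute setting."""
    if cores is None:
        # Falls back to 1 if the platform cannot report a CPU count.
        cores = os.cpu_count() or 1
    return thread_count * cores if per_core else thread_count


cores = 8  # assumption: 4 physical cores, each counted twice due to hyperthreading

# A fixed setting of 2, interpreted per core.
print(effective_threads(2, per_core=True, cores=cores))

# "50% of cores" computed first, then interpreted per core versus as an absolute count.
print(effective_threads(half_of_cores(cores), per_core=True, cores=cores))
print(effective_threads(half_of_cores(cores), per_core=False, cores=cores))
```

Under this model a value already derived from the core count is multiplied by the core count again if it is treated as per-core, which is the concern the comment raises.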
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221942#comment-16221942 ]

ASF GitHub Bot commented on JENA-1407:

Github user rvesse commented on the issue:

https://github.com/apache/jena/pull/297

I have now changed this to dynamically set test parallelism to use 50% of available cores
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221929#comment-16221929 ]

ASF GitHub Bot commented on JENA-1407:

Github user rvesse commented on the issue:

https://github.com/apache/jena/pull/297

Parallelism of 2 gives similar benefits. I have a quad core machine personally.

One possible approach is to use the `build-helper:cpu-count` plugin - http://www.mojohaus.org/build-helper-maven-plugin/cpu-count-mojo.html - to dynamically set the level of parallelism so that we don't oversubscribe the user's system
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221448#comment-16221448 ]

ASF GitHub Bot commented on JENA-1407:

Github user afs commented on the issue:

https://github.com/apache/jena/pull/297

Total 36:11 min with 15:46 in Elephas I/O
Total 13:38 min with 03:27 in Elephas I/O

23 min reduction, 12 mins from Elephas I/O

Maybe the saving is even more than reported (clearup? setup?)
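The arithmetic behind the figures above can be checked with a short sketch. The helper names are hypothetical. The four durations are the ones quoted in the comment, and the split into "Elephas I/O" and "everything else" is just subtraction, with no claim about what caused the remaining difference.

```python
def mmss_to_seconds(text):
    """Parse a Maven-style 'MM:SS' duration into whole seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_to_mmss(total):
    """Format whole seconds back into 'M:SS'."""
    return f"{total // 60}:{total % 60:02d}"


# Durations quoted in the comment above (before and after the patch).
total_before, total_after = mmss_to_seconds("36:11"), mmss_to_seconds("13:38")
io_before, io_after = mmss_to_seconds("15:46"), mmss_to_seconds("03:27")

total_saved = total_before - total_after
io_saved = io_before - io_after
other_saved = total_saved - io_saved  # reduction not accounted for by Elephas I/O

print("total saved:      ", seconds_to_mmss(total_saved))
print("Elephas I/O saved:", seconds_to_mmss(io_saved))
print("saved elsewhere:  ", seconds_to_mmss(other_saved))
```

The whole-build reduction is larger than the Elephas I/O reduction alone, which matches the remark that the saving may be more than the module time reports.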
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221437#comment-16221437 ]

ASF GitHub Bot commented on JENA-1407:

Github user kinow commented on the issue:

https://github.com/apache/jena/pull/297

Tested on a 4 core, 6GB, Ubuntu LTS VM, with a simple `mvn clean test install`.

First I applied the pull request #295 in order to get the build passing. Then manually applied the diff in this pull request.

Before the patch, I got the following for Elephas.

```
[INFO] Apache Jena - Elephas .. SUCCESS [ 0.125 s]
[INFO] Apache Jena - Elephas - Common API . SUCCESS [ 4.346 s]
[INFO] Apache Jena - Elephas - I/O SUCCESS [15:46 min]
[INFO] Apache Jena - Elephas - Map/Reduce . SUCCESS [ 22.584 s]
[INFO] Apache Jena - Elephas - Statistics Demo App SUCCESS [ 5.769 s]
[INFO] Apache Jena - OSGi . SUCCESS [ 0.098 s]
[INFO] Apache Jena - OSGi bundle .. SUCCESS [ 9.213 s]
[INFO] Apache Jena - OSGi Karaf features .. SUCCESS [ 4.971 s]
[INFO] Apache Jena SUCCESS [ 0.503 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 36:11 min
[INFO] Finished at: 2017-10-27T11:21:38+13:00
[INFO] Final Memory: 109M/728M
```

After applying the patch.

```
[INFO] Apache Jena - Elephas .. SUCCESS [ 0.226 s]
[INFO] Apache Jena - Elephas - Common API . SUCCESS [ 4.269 s]
[INFO] Apache Jena - Elephas - I/O SUCCESS [03:27 min]
[INFO] Apache Jena - Elephas - Map/Reduce . SUCCESS [ 20.512 s]
[INFO] Apache Jena - Elephas - Statistics Demo App SUCCESS [ 3.620 s]
[INFO] Apache Jena - OSGi . SUCCESS [ 0.083 s]
[INFO] Apache Jena - OSGi bundle .. SUCCESS [ 8.088 s]
[INFO] Apache Jena - OSGi Karaf features .. SUCCESS [ 0.170 s]
[INFO] Apache Jena SUCCESS [ 0.378 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 13:38 min
[INFO] Finished at: 2017-10-27T12:13:42+13:00
[INFO] Final Memory: 167M/739M
```

Hope that helps,
Bruno
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221385#comment-16221385 ]

ASF GitHub Bot commented on JENA-1407:

Github user afs commented on the issue:

https://github.com/apache/jena/pull/297

Will try it out ASAP.

How many cores do you have? If the parallelism is 2, how much effect is there?

I'm wondering if it consumes the whole machine for a few minutes, which might be fine. ATM the user can go do something unrelated (like read email!).
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16221319#comment-16221319 ]

Andy Seaborne commented on JENA-1407:

[^Elephas-Test-Times] is the result of running the (non-parallel) Elephas I/O test suite three times in a row, then edited to pick out the interesting numbers.

The same tests are slow each time and, run like this, the total elapsed time is consistent.

TriX and JSON-LD are >60s. Thrift and RDF/XML are not slow but take about twice as long as the fast formats. RDF-Thrift is essentially binary N-Triples, which makes that strange.

Simplified results:

{noformat}
* First test - maybe startup costs.
-- org.apache.jena.hadoop.rdf.io.output.trig.BatchedTriGOutputTest Time elapsed: 65.085 sec
-- org.apache.jena.hadoop.rdf.io.input.trix.TriXAsQuadsInputTest Time elapsed: 70.228 sec
-- org.apache.jena.hadoop.rdf.io.input.trix.TriXInputTest Time elapsed: 70.124 sec
-- org.apache.jena.hadoop.rdf.io.input.thrift.ThriftQuadInputTest Time elapsed: 6.373 sec
-- org.apache.jena.hadoop.rdf.io.input.thrift.ThriftTripleInputTest Time elapsed: 6.339 sec
-- org.apache.jena.hadoop.rdf.io.input.rdfxml.RdfXmlInputTest Time elapsed: 7.481 sec
-- org.apache.jena.hadoop.rdf.io.input.rdfxml.RdfXmlAsTriplesInputTest Time elapsed: 7.507 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.thrift.BZippedThriftTripleInputTest Time elapsed: 10.015 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.thrift.GZippedThriftQuadInputTest Time elapsed: 6.68 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.thrift.GZippedThriftTripleInputTest Time elapsed: 6.587 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.thrift.DeflatedThriftTripleInputTest Time elapsed: 6.62 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.thrift.DeflatedThriftQuadInputTest Time elapsed: 6.629 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.thrift.BZippedThriftQuadInputTest Time elapsed: 10.448 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.rdfxml.GZippedRdfXmlInputTest Time elapsed: 7.737 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.rdfxml.DeflatedRdfXmlInputTest Time elapsed: 7.782 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.rdfxml.BZippedRdfXmlInputTest Time elapsed: 10.958 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.jsonld.GZippedJsonLDQuadInputTest Time elapsed: 61.071 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.jsonld.GZippedJsonLDTripleInputTest Time elapsed: 60.825 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.jsonld.DeflatedJsonLDTripleInputTest Time elapsed: 60.728 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.jsonld.BZippedJsonLDQuadInputTest Time elapsed: 63.307 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.jsonld.BZippedJsonLDTripleInputTest Time elapsed: 61.121 sec
-- org.apache.jena.hadoop.rdf.io.input.compressed.jsonld.DeflatedJsonLDQuadInputTest Time elapsed: 60.821 sec
-- org.apache.jena.hadoop.rdf.io.input.jsonld.JsonLDQuadInputTest Time elapsed: 60.544 sec
-- org.apache.jena.hadoop.rdf.io.input.jsonld.JsonLDTripleInputTest Time elapsed: 60.489 sec
{noformat}
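A listing like the one above can be summarised mechanically. The sketch below is one possible way to pick out the slow test classes. It assumes one entry per line in the form `-- <class name> Time elapsed: <seconds> sec`, which is an assumption about layout and not necessarily how the attachment is formatted. The 60-second threshold and the function names are hypothetical, and only four of the reported entries are included as sample input.

```python
import re

# Four entries copied from the results above; the one-entry-per-line layout is assumed.
SAMPLE = """\
-- org.apache.jena.hadoop.rdf.io.input.trix.TriXInputTest Time elapsed: 70.124 sec
-- org.apache.jena.hadoop.rdf.io.input.thrift.ThriftQuadInputTest Time elapsed: 6.373 sec
-- org.apache.jena.hadoop.rdf.io.input.rdfxml.RdfXmlInputTest Time elapsed: 7.481 sec
-- org.apache.jena.hadoop.rdf.io.input.jsonld.JsonLDQuadInputTest Time elapsed: 60.544 sec
"""

ENTRY = re.compile(r"^-- (\S+) Time elapsed: ([\d.]+) sec$")


def parse_times(text):
    """Map each fully qualified test class name to its elapsed seconds."""
    results = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


def slow_tests(times, threshold=60.0):
    """Return the simple class names at or above the threshold, sorted by name."""
    return sorted(
        name.rsplit(".", 1)[-1]
        for name, seconds in times.items()
        if seconds >= threshold
    )


times = parse_times(SAMPLE)
print(slow_tests(times))
```

Run over the full listing, the same filter would single out the TriX and JSON-LD classes plus the first test, matching the summary given in the comment.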
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220816#comment-16220816 ]

Rob Vesse commented on JENA-1407:

Tried out parallel testing and it reduces test time to approximately 3-4 minutes using 4 threads. Have opened a PR for this.
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220798#comment-16220798 ]

ASF GitHub Bot commented on JENA-1407:

GitHub user rvesse opened a pull request:

https://github.com/apache/jena/pull/297

Enable parallel testing for Elephas IO (JENA-1407)

Using parallelism of 4 reduces build time for Elephas IO to approximately 3-4 minutes in my testing.

Also fixed one test that was not parallel safe to use the pre-existing temporary folder.

Ran this a ton of times in a tight loop on my system to check that there weren't any transient errors or serial test assumptions encoded in the tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rvesse/jena JENA-1407

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/297.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #297

commit b26643b4cf19704cd4ad969d65f59095b235d667
Author: Rob Vesse
Date: 2017-10-26T17:03:33Z

    Enable parallel testing for Elephas IO (JENA-1407)

    Using parallelism of 4 reduces build time for Elephas IO to approximately 3-4 minutes in my testing.

    Also fixed one test that was not parallel safe to use the pre-existing temporary folder.
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220402#comment-16220402 ]

Rob Vesse commented on JENA-1407:

Skipped tests are an artefact of the implementation of the test suite. Various abstract classes are used to provide a common set of tests across different input and output formats. Due to variations in feature support across the formats, some tests need to be skipped if the feature being tested is unsupported.

I guess it could be heap size effects? Some of the tests generate relatively large data which, depending on the format, is held fully in memory in some form or another, e.g. JSON-LD. This could be causing a certain degree of garbage collection thrashing?

Could we try enabling parallel testing for that module?
[jira] [Commented] (JENA-1407) Improvements to build/test time of Elephas tests.
[ https://issues.apache.org/jira/browse/JENA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220303#comment-16220303 ]

Andy Seaborne commented on JENA-1407:

I'm not sure it is the same tests every time. The build time seems to jump around a bit, but I need to verify that with some repeated runs under the same conditions.

It is at least TriX as well. I also see skipped tests.

(From Jenkins; the exact timing does not help, as it is bursty and slow at the moment. output.trig.BatchedTriGOutputTest was 139s but I'm assuming that's not a real problem.)

{code}
org.apache.jena.hadoop.rdf.io.input.trix.TriXAsQuadsInputTest
Tests run: 12, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 72.204 sec
{code}

(Changing the JIRA to be just this one point and putting the build profile changes elsewhere. I shouldn't have combined them to start with.)