OK, I finally managed to get the smoketester to pass on my machine, so
for THIS release I will retract my -1 and change it to a +1.

I have since reset my system configuration back, though, so we should
really fix these test problems for the future.

SUCCESS! [1:08:26.448122]

There were a few compounding issues; I will break them out into
separate issues a bit later. I don't think they need to be blockers for
THIS release, but let's please fix them! I can help dig into each one,
but here are the two biggest problems:

1. Some Solr tests don't obey their sandbox and fail when
tests.workDir is set (e.g. in the user's build.properties). These
tests try to access the wrong parts of the filesystem, which can cause
tests to meddle with each other. Obeying the test sandbox
(tests.workDir) is important: it is how I prevent these tests from
destroying my SSDs.

2. Some Solr HDFS tests will falsely fail if they "think" disk space
is low (even when the disk is not actually running out). They dump
megabytes of output, but this part is the key:

   [junit4]   2> 1000960 WARN  (IPC Server handler 3 on 33951) [     ]
o.a.h.h.s.b.BlockPlacementPolicy Failed to place enough replicas,
still in need of 2 to reach 2 (unavailableStorages=[],
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true)
For more information, please enable DEBUG log level on
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and
org.apache.hadoop.net.NetworkTopology
   [junit4]   2> 1000960 WARN  (IPC Server handler 3 on 33951) [     ]
o.a.h.h.p.BlockStoragePolicy Failed to place enough replicas: expected
size is 2 but only 0 storage types can be selected (replication=2,
selected=[], unavailable=[DISK], removed=[DISK, DISK],
policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
   [junit4]   2> 1000960 WARN  (IPC Server handler 3 on 33951) [     ]
o.a.h.h.s.b.BlockPlacementPolicy Failed to place enough replicas,
still in need of 2 to reach 2 (unavailableStorages=[DISK],
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true)
All required storage types are unavailable:
unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7,
storageTypes=[DISK], creationFallbacks=[],
replicationFallbacks=[ARCHIVE]}
   [junit4]   2> 1000961 WARN  (Thread-2642) [     ]
o.a.h.h.DataStreamer DataStreamer Exception
   [junit4]   2>           =>
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/testfile could only be written to 0 of the 1 minReplication nodes.
There are 2 datanode(s) running and 2 node(s) are excluded in this
operation.

So I think these tests should be tweaked so that they don't require
gigabytes of free space to pass (fix the threshold, or add an assume,
or something). I worked around the situation by temporarily
repartitioning and giving them another gigabyte (!). At no point was
there any danger of running out of space! They just falsely fail
even when there are hundreds of MB available. It seems they have some
kind of bogus threshold in the algorithm (e.g. inspecting percentages
or something).
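
A minimal sketch of the two checks discussed above, written in Python
only because the smoke tester is a Python script (the affected tests
themselves are Java). Everything named here is hypothetical: the helper
names, the 256 MB figure, and the idea of running these as pre-flight
checks are illustrative assumptions, not something the Solr tests or
smokeTestRelease.py are known to do.

```python
import os
import shutil
import tempfile

# Hypothetical threshold, chosen only for illustration. It is an absolute
# byte count rather than a percentage, so a nearly full but large disk
# is not mistaken for one that is out of space.
REQUIRED_FREE_BYTES = 256 * 1024 * 1024


def is_inside_sandbox(path, sandbox_root):
    """Return True if `path` resolves to a location under `sandbox_root`."""
    root = os.path.realpath(sandbox_root)
    target = os.path.realpath(path)
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised e.g. on Windows when the two paths are on different drives.
        return False


def has_enough_space(path, required_bytes=REQUIRED_FREE_BYTES):
    """Return True if the filesystem holding `path` has at least
    `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    work_dir = tempfile.gettempdir()  # stand-in for tests.workDir
    scratch = os.path.join(work_dir, "hdfs-test-data")
    print("inside sandbox:", is_inside_sandbox(scratch, work_dir))
    print("enough space:  ", has_enough_space(work_dir))
```

In a test, a check like has_enough_space would be a natural place for an
assume (skip the test) rather than a hard failure, which is the kind of
tweak suggested above.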

On Sat, Jun 12, 2021 at 12:22 PM Robert Muir <[email protected]> wrote:
>
> The tests also aren't "timing out". They are failing.
>
> On Sat, Jun 12, 2021 at 12:21 PM Robert Muir <[email protected]> wrote:
> >
> > Ishan, no, they aren't running out of resources, not even close. I have
> > 20GB of RAM and by default it is only using 3 JVMs.
> >
> > On Sat, Jun 12, 2021 at 12:04 PM Ishan Chattopadhyaya
> > <[email protected]> wrote:
> > >
> > > Hi Rob, could it be possible that the tests are timing out on your
> > > machine due to lack of resources? Can you try running them with just
> > > one JVM at a time?
> > >
> > > On Sat, 12 Jun, 2021, 8:20 pm Robert Muir, <[email protected]> wrote:
> > >>
> > >> I ran smoketester yet one more time and again numerous tests fail:
> > >>
> > >>    [junit4] Tests with failures [seed: A3FDDCE09965D7AE] (first 10 out of 18):
> > >>    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog.testFSThreadSafety
> > >>    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog (suite)
> > >>    [junit4]   - org.apache.solr.cloud.hdfs.HDFSCollectionsAPITest.testDataDirIsNotReused
> > >>    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testBasic
> > >>    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testMultiThreaded
> > >>    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest (suite)
> > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testCanDistinguishBetweenFilesAndDirectories
> > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testCanDeleteEmptyOrFullDirectories
> > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testCanDeleteIndividualFiles
> > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testArbitraryFileDataCanBeStoredAndRetrieved
> > >>    [junit4]
> > >>    [junit4]
> > >>    [junit4] JVM J0:     0.72 ..  1241.80 =  1241.08s
> > >>    [junit4] JVM J1:     0.67 ..  1198.72 =  1198.05s
> > >>    [junit4] JVM J2:     0.91 ..  1198.75 =  1197.84s
> > >>    [junit4] Execution time total: 20 minutes 41 seconds
> > >>    [junit4] Tests summary: 939 suites (5 ignored), 4884 tests, 3
> > >> suite-level errors, 15 errors, 1 failure, 2581 ignored (506
> > >> assumptions)
> > >>
> > >> On Fri, Jun 11, 2021 at 11:52 AM Robert Muir <[email protected]> wrote:
> > >> >
> > >> > After nuking all settings (I simply removed the whole
> > >> > lucene.build.properties in my homedir), it still fails. Seems like
> > >> > maybe fewer failures though?
> > >> >
> > >> > I will upload logs to the JIRA issue.
> > >> >
> > >> >    [junit4] Completed [939/939 (4!)] on J2 in 383.02s, 2 tests, 1 failure <<< FAILURES!
> > >> >    [junit4]
> > >> >    [junit4]
> > >> >    [junit4] Tests with failures [seed: AC205159663D0461]:
> > >> >    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog.testFSThreadSafety
> > >> >    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog (suite)
> > >> >    [junit4]   - org.apache.solr.core.HdfsDirectoryFactoryTest.testLocalityReporter
> > >> >    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testBasic
> > >> >    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testMultiThreaded
> > >> >    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest (suite)
> > >> >    [junit4]   - org.apache.solr.cloud.api.collections.TestLocalFSCloudBackupRestore.test
> > >> >    [junit4]
> > >> >    [junit4]
> > >> >    [junit4] JVM J0:     0.68 ..  1197.30 =  1196.62s
> > >> >    [junit4] JVM J1:     0.71 ..  1113.59 =  1112.89s
> > >> >    [junit4] JVM J2:     0.68 ..  1406.83 =  1406.15s
> > >> >    [junit4] Execution time total: 23 minutes 26 seconds
> > >> >    [junit4] Tests summary: 939 suites (5 ignored), 4884 tests, 3
> > >> > suite-level errors, 4 errors, 1 failure, 2457 ignored (517
> > >> > assumptions)
> > >> >
> > >> > BUILD FAILED
> > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/solr/build.xml:231:
> > >> > The following error occurred while executing this line:
> > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/solr/common-build.xml:550:
> > >> > The following error occurred while executing this line:
> > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/lucene/common-build.xml:1608:
> > >> > The following error occurred while executing this line:
> > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/lucene/common-build.xml:1135:
> > >> > There were test failures: 939 suites (5 ignored), 4884 tests, 3
> > >> > suite-level errors, 4 errors, 1 failure, 2457 ignored (517
> > >> > assumptions) [seed: AC205159663D0461]
> > >> >
> > >> > Total time: 24 minutes 16 seconds
> > >> >
> > >> >
> > >> > Traceback (most recent call last):
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1495, in <module>
> > >> >     main()
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1417, in main
> > >> >     smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, c.is_signed, c.local_keys, ' '.join(c.test_args),
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1483, in smokeTest
> > >> >     solrSrcUnpackPath = unpackAndVerify(java, 'solr', tmpDir, 'solr-%s-src.tgz' % version,
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 566, in unpackAndVerify
> > >> >     verifyUnpacked(java, project, artifact, unpackPath, gitRevision, version, testArgs, tmpDir, baseURL)
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 687, in verifyUnpacked
> > >> >     java.run_java8('ant clean test -Dtests.slow=false %s' % testArgs, '%s/test.log' % unpackPath)
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1212, in run_java
> > >> >     run('%s; %s' % (cmd_prefix, cmd), logfile)
> > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 500, in run
> > >> >     raise RuntimeError('command "%s" failed; see log file %s' % (command, logPath))
> > >> > RuntimeError: command "export JAVA_HOME="/home/rmuir/Downloads/jdk8u282-b08" PATH="/home/rmuir/Downloads/jdk8u282-b08/bin:$PATH" JAVACMD="/home/rmuir/Downloads/jdk8u282-b08/bin/java"; ant clean test -Dtests.slow=false -Dtests.badapples=false " failed; see log file /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/test.log
> > >> >
> > >> > On Fri, Jun 11, 2021 at 9:47 AM Robert Muir <[email protected]> wrote:
> > >> > >
> > >> > > I nuked all my settings and am rerunning with all defaults. I'll
> > >> > > report back what happens/upload log when/if it finishes or fails.
> > >> > >
> > >> > > On Fri, Jun 11, 2021 at 9:45 AM Michael Sokolov <[email protected]> 
> > >> > > wrote:
> > >> > > >
> > >> > > > I tried to comment on the JIRA, but it seems to be timing out. Now
> > >> > > > when I go back, SOLR issues are marked as "You can't view this issue
> > >> > > > It may have been deleted or you don't have permission to view it."
> > >> > > > Waat?
> > >> > > >
> > >> > > > Anyway, Robert, you suggested there that maybe the problem is being
> > >> > > > surfaced by using a different working directory for the tests. Do you
> > >> > > > think that the tests need to be fixed so that they work with this
> > >> > > > tests.workDir parameter? What if you were to cd to the place you want
> > >> > > > to use as the working dir and call the smokeTester from there?
> > >> > > >
> > >> > > >
> > >> > > > On Fri, Jun 11, 2021 at 9:29 AM Mayya Sharipova
> > >> > > > <[email protected]> wrote:
> > >> > > > >
> > >> > > > > Thanks very much, Robert, for the detailed investigation, and
> > >> > > > > thanks, Jan, for your tests.
> > >> > > > >
> > >> > > > > I will sort out the problem with my GPG key, but I am not sure
> > >> > > > > what to do about SOLR-15473. I've run the smoke tester again,
> > >> > > > > and it passed on my Mac again: SUCCESS! [1:00:03.751500]
> > >> > > > > I would appreciate more guidance on whether we need to resolve
> > >> > > > > SOLR-15473 before the 8.9 release.
> > >> > > > >
> > >> > > > >
> > >> > > > > On Fri, Jun 11, 2021 at 8:09 AM Robert Muir <[email protected]> 
> > >> > > > > wrote:
> > >> > > > >>
> > >> > > > >> Dude, if you can vote +1 when the smoketester passes, then I can
> > >> > > > >> vote -1 when it fails. This is my vote, not your vote. You don't
> > >> > > > >> get to decide about it, or change it in any way.
> > >> > > > >>
> > >> > > > >> On Fri, Jun 11, 2021 at 8:04 AM Jan Høydahl 
> > >> > > > >> <[email protected]> wrote:
> > >> > > > >> >
> > >> > > > >> > Does it reproduce for you? Are you suspecting a bug in Solr 
> > >> > > > >> > that we cannot ship, or only a bug in the smoketester py 
> > >> > > > >> > itself? The -1 should be about the released bits, not about 
> > >> > > > >> > other tooling?
> > >> > > > >> > My JVM is OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 
> > >> > > > >> > 25.292-b10, mixed mode)
> > >> > > > >> >
> > >> > > > >> > Jan
> > >> > > > >> >
> > >> > > > >> > > 11. jun. 2021 kl. 13:48 skrev Robert Muir 
> > >> > > > >> > > <[email protected]>:
> > >> > > > >> > >
> > >> > > > >> > > Jan, I'm using the same automated smoketester as everyone
> > >> > > > >> > > else. It fails, so my vote is -1.
> > >> > > > >> > >
> > >> > > > >> > > On Fri, Jun 11, 2021 at 7:22 AM Jan Høydahl 
> > >> > > > >> > > <[email protected]> wrote:
> > >> > > > >> > >>
> > >> > > > >> > >> Tested on macOS (Intel); no other verification than the
> > >> > > > >> > >> smoketester was done.
> > >> > > > >> > >>
> > >> > > > >> > >> SUCCESS! [1:08:19.953492]
> > >> > > > >> > >>
> > >> > > > >> > >> +1
> > >> > > > >> > >>
> > >> > > > >> > >> Robert - not sure if one test-run failure should cancel
> > >> > > > >> > >> the build. Our smoketester and tests are sometimes a bit
> > >> > > > >> > >> picky, and that does not mean that the artifacts are faulty.
> > >> > > > >> > >>
> > >> > > > >> > >> Jan
> > >> > > > >> > >>
> > >> > > > >> > >> 11. jun. 2021 kl. 04:14 skrev Mayya Sharipova 
> > >> > > > >> > >> <[email protected]>:
> > >> > > > >> > >>
> > >> > > > >> > >> Please vote for release candidate 1 for Lucene/Solr 8.9.0
> > >> > > > >> > >>
> > >> > > > >> > >> The artifacts can be downloaded from:
> > >> > > > >> > >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.9.0-RC1-rev05c8a6f0163fe4c330e93775e8e91f3ab66a3f80
> > >> > > > >> > >>
> > >> > > > >> > >> You can run the smoke tester directly with this command:
> > >> > > > >> > >>
> > >> > > > >> > >> python3 -u dev-tools/scripts/smokeTestRelease.py \
> > >> > > > >> > >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.9.0-RC1-rev05c8a6f0163fe4c330e93775e8e91f3ab66a3f80
> > >> > > > >> > >>
> > >> > > > >> > >> The vote will be open for at least 72 hours, i.e. until
> > >> > > > >> > >> 2021-06-16 02:00 UTC.
> > >> > > > >> > >>
> > >> > > > >> > >> [ ] +1  approve
> > >> > > > >> > >> [ ] +0  no opinion
> > >> > > > >> > >> [ ] -1  disapprove (and reason why)
> > >> > > > >> > >>
> > >> > > > >> > >> Here is my +1
> > >> > > > >> > >> SUCCESS! [0:01:43.815224]
> > >> > > > >> > >>
> > >> > > > >> > >>
> > >> > > > >> > >
> > >> > > > >> > > ---------------------------------------------------------------------
> > >> > > > >> > > To unsubscribe, e-mail: [email protected]
> > >> > > > >> > > For additional commands, e-mail: [email protected]
> > >> > > > >> > >
> > >> > > > >> >
> > >> > > > >> >
> > >> > > > >>
> > >> > > >
> > >>
