Thanks! You guys are awesome! It's nice to know I'm not completely crazy ;).
I definitely needed another set of eyes on this. Special thanks to Otto for identifying the root cause so quickly. I can confirm all of the integration tests pass for me with this PR. Working on running through the rest of Casey's test plan now.

-Kyle

> On Feb 25, 2017, at 11:57 PM, Casey Stella <[email protected]> wrote:
>
> METRON-743 (https://github.com/apache/incubator-metron/pull/467) for
> reference.
>
>> On Sat, Feb 25, 2017 at 11:51 PM, Casey Stella <[email protected]> wrote:
>> Hmm, that's a very good catch if it's the issue. I was able to verify that
>> if you botch the sort order of the files that it fails.
>>
>> Would you mind sorting the files on PcapJob line 199 by filename? Something
>> like Collections.sort(files, (o1,o2) -> o1.getName().compareTo(o2.getName()));
>>
>> I'm going to submit a PR regardless because we should own the assumptions
>> here, but I suspect that for the HDFS filesystem this works as expected.
>> That being said, it's better to be safe than sorry.
>>
>> Casey
>>
>>> On Sat, Feb 25, 2017 at 11:35 PM, Otto Fowler <[email protected]> wrote:
>>> /**
>>>  * List the statuses and block locations of the files in the given path.
>>>  * Does not guarantee to return the iterator that traverses statuses
>>>  * of the files in a sorted order.
>>>  * <pre>
>>>  * If the path is a directory,
>>>  *   if recursive is false, returns files in the directory;
>>>  *   if recursive is true, return files in the subtree rooted at the path.
>>>  * If the path is a file, return the file's status and block locations.
>>>  * </pre>
>>>  * @param f is the path
>>>  * @param recursive if the subdirectories need to be traversed recursively
>>>  *
>>>  * @return an iterator that traverses statuses of the files
>>>  *
>>>  * @throws FileNotFoundException when the path does not exist;
>>>  * @throws IOException see specific implementation
>>>  */
>>> public RemoteIterator<LocatedFileStatus> listFiles(
>>>
>>> So if we depend on this returning something sorted, it is only working
>>> accidentally?
>>>
>>> On February 25, 2017 at 23:10:59, Otto Fowler ([email protected]) wrote:
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-12009 makes it seem like
>>> there is no order
>>>
>>> On February 25, 2017 at 23:06:37, Otto Fowler ([email protected]) wrote:
>>>
>>> Maybe Hadoop Local FileSystem returns different things from ListFiles() on
>>> different platforms?
>>> That would be something to check?
>>>
>>> Sorry that is all I got right now
>>>
>>> On February 25, 2017 at 22:57:49, Otto Fowler ([email protected]) wrote:
>>>
>>> There are also some if Log.isDebugEnabled() outputs, so maybe try changing
>>> the logging level, maybe running just this test?
>>>
>>> On February 25, 2017 at 22:39:02, Otto Fowler ([email protected]) wrote:
>>>
>>> There are multiple “tests” within the test, with different parameters. If
>>> you look at where this is breaking, it is at
>>>
>>> {
>>>   //make sure I get them all.
>>>   Iterable<byte[]> results =
>>>       job.query(new Path(outDir.getAbsolutePath())
>>>               , new Path(queryDir.getAbsolutePath())
>>>               , getTimestamp(0, pcapEntries)
>>>               , getTimestamp(pcapEntries.size()-1, pcapEntries) + 1
>>>               , 10
>>>               , new EnumMap<>(Constants.Fields.class)
>>>               , new Configuration()
>>>               , FileSystem.get(new Configuration())
>>>               , new FixedPcapFilter.Configurator()
>>>               );
>>>   assertInOrder(results);
>>>   Assert.assertEquals(Iterables.size(results), pcapEntries.size());
>>>
>>> Which is the 7th test job run against the data.
>>> I am not familiar with this test or code, but that has to be significant.
>>>
>>> Maybe you should enable and print out the information of the results - and
>>> we can see a pattern there?
>>>
>>> On February 25, 2017 at 22:19:00, Kyle Richardson ([email protected]) wrote:
>>>
>>> mvn integration-test
>>>
>>> Although I have also tried...
>>> mvn clean install && mvn integration-test
>>> mvn clean package && mvn integration-test
>>> mvn install && mvn surefire-test@unit-tests && mvn surefire-test@integration-tests
>>>
>>> -Kyle
>>>
>>> On Feb 25, 2017, at 8:34 PM, Otto Fowler <[email protected]> wrote:
>>>
>>> What command are you using to build?
>>>
>>> On February 25, 2017 at 17:40:20, Kyle Richardson ([email protected]) wrote:
>>>
>>> Tried with Oracle JDK and got the same result. I went as far as trying to
>>> run it through the debugger but am not that familiar with this part of the
>>> code. The timestamps of the packets are definitely not coming back in the
>>> expected order, but I'm not sure why. Could it be related to something
>>> filesystem specific?
>>>
>>> Apologies if I'm just being dense but I'd really like to understand why
>>> this consistently fails on some platforms and not others.
>>>
>>> -Kyle
>>>
>>> > On Feb 25, 2017, at 9:07 AM, Kyle Richardson <[email protected]> wrote:
>>> >
>>> > Ok, I've tried this so many times I may be going crazy, so thought I'd
>>> > ask the community for a sanity check.
>>> >
>>> > I'm trying to verify RC5 and I keep running into the same integration
>>> > test failures but only on my Fedora (24 and 25) and CentOS 7 systems. It
>>> > passes fine on my Macbook.
>>> >
>>> > It always fails on the PcapTopologyIntegrationTest (test results pasted
>>> > below). Anyone have any ideas? I'm using the exact same version of maven in
>>> > all cases (v3.3.9). The only difference I can think of is the Fedora/CentOS
>>> > systems are using OpenJDK whereas the Macbook is running Sun/Oracle JDK.
>>> >
>>> > -------------------------------------------------------
>>> > T E S T S
>>> > -------------------------------------------------------
>>> > Running org.apache.metron.pcap.integration.PcapTopologyIntegrationTest
>>> > Formatting using clusterid: testClusterID
>>> > Formatting using clusterid: testClusterID
>>> > Sent pcap data: 20
>>> > Wrote 20 to kafka
>>> > Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 42.011 sec <<< FAILURE! - in org.apache.metron.pcap.integration.PcapTopologyIntegrationTest
>>> > testTimestampInPacket(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest) Time elapsed: 26.968 sec <<< FAILURE!
>>> > java.lang.AssertionError
>>> >     at org.junit.Assert.fail(Assert.java:86)
>>> >     at org.junit.Assert.assertTrue(Assert.java:41)
>>> >     at org.junit.Assert.assertTrue(Assert.java:52)
>>> >     at org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.assertInOrder(PcapTopologyIntegrationTest.java:537)
>>> >     at org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTopology(PcapTopologyIntegrationTest.java:383)
>>> >     at org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTimestampInPacket(PcapTopologyIntegrationTest.java:135)
>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> >     at java.lang.reflect.Method.invoke(Method.java:498)
>>> >     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>>> >     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> >     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>>> >     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> >     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>>> >     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>>> >     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>>> >     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>>> >     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>>> >     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>>> >     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>>> >     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>>> >     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>>> >     at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>>> >     at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>>> >     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>> >
>>> > testTimestampInKey(org.apache.metron.pcap.integration.PcapTopologyIntegrationTest) Time elapsed: 15.038 sec <<< FAILURE!
>>> > java.lang.AssertionError
>>> >     at org.junit.Assert.fail(Assert.java:86)
>>> >     at org.junit.Assert.assertTrue(Assert.java:41)
>>> >     at org.junit.Assert.assertTrue(Assert.java:52)
>>> >     at org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.assertInOrder(PcapTopologyIntegrationTest.java:537)
>>> >     at org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTopology(PcapTopologyIntegrationTest.java:383)
>>> >     at org.apache.metron.pcap.integration.PcapTopologyIntegrationTest.testTimestampInKey(PcapTopologyIntegrationTest.java:152)
>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> >     at java.lang.reflect.Method.invoke(Method.java:498)
>>> >     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>>> >     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> >     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>>> >     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> >     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>>> >     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>>> >     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>>> >     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>>> >     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>>> >     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>>> >     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>>> >     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>>> >     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>> >     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>>> >     at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>>> >     at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>>> >     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>> >
>>> > Results :
>>> >
>>> > Failed tests:
>>> > PcapTopologyIntegrationTest.testTimestampInKey:152->testTopology:383->assertInOrder:537 null
>>> > PcapTopologyIntegrationTest.testTimestampInPacket:135->testTopology:383->assertInOrder:537 null
>>> >
>>> > Tests run: 2, Failures: 2, Errors: 0, Skipped: 0
>>> >
>>> > [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18:test (integration-tests) on project metron-pcap-backend: There are test failures.
>>> > [ERROR]
>>> > [ERROR] Please refer to /home/kyle/projects/metron-fork/metron-platform/metron-pcap-backend/target/surefire-reports for the individual test results.
>>> > [ERROR] -> [Help 1]
>>> > [ERROR]
>>> > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
>>> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
>>> > [ERROR]
>>> > [ERROR] For more information about the errors and possible solutions, please read the following articles:
>>> > [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
>>> > [ERROR]
>>> > [ERROR] After correcting the problems, you can resume the build with the command
>>> > [ERROR] mvn <goals> -rf :metron-pcap-backend
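
For reference, the idea discussed in this thread (do not rely on the order in which listFiles() returns files; sort them by name before reading) can be sketched roughly as below. This is only an illustration and not the code from the PR: FileRef is a hypothetical stand-in for Hadoop's Path so the snippet runs without Hadoop on the classpath, and the file names are made up. It also assumes that the file names sort lexicographically in the same order as the packet timestamps, which the thread suggests but does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortedListing {

    /** Hypothetical stand-in for a listed file (Hadoop's Path in the real code). */
    public static final class FileRef {
        private final String name;

        public FileRef(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    /** Returns a copy of the listing ordered by file name; the input is not modified. */
    public static List<FileRef> sortByName(List<FileRef> files) {
        List<FileRef> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparing(FileRef::getName));
        return sorted;
    }

    /** Extracts the names, in order, so the result is easy to print or compare. */
    public static List<String> names(List<FileRef> files) {
        List<String> out = new ArrayList<>();
        for (FileRef f : files) {
            out.add(f.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        // Simulates a listing that came back in arbitrary order (made-up names).
        List<FileRef> listed = Arrays.asList(
                new FileRef("pcap_0003"), new FileRef("pcap_0001"), new FileRef("pcap_0002"));
        System.out.println(names(sortByName(listed)));
    }
}
```

Sorting a copy rather than the original list keeps the sketch free of side effects; Casey's Collections.sort suggestion sorts in place, which is equally valid when the caller owns the list.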
