Actually, I see 9 threads left running after each startup/shutdown cycle of the Drillbit in the same JVM (which would only impact tests).
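
For anyone who wants to check a count like this locally, here is a minimal sketch. It is not Drill code: `DemoService`, the `demo-status-` thread name, and `leakedAfter` are made up for illustration. It starts and closes a stand-in service several times in one JVM and counts the matching threads still alive afterwards, using only `Thread.getAllStackTraces()`.

```java
public class ThreadLeakCheck {
    // Hypothetical name prefix; a real check would match the service's own thread names.
    static final String PREFIX = "demo-status-";

    /** Stand-in for a service that owns one background status thread. */
    static class DemoService implements AutoCloseable {
        private final Thread status;
        private final boolean stopOnClose;

        DemoService(int id, boolean stopOnClose) {
            this.stopOnClose = stopOnClose;
            this.status = new Thread(() -> {
                try {
                    while (true) {
                        Thread.sleep(1000); // pretend to publish status periodically
                    }
                } catch (InterruptedException e) {
                    // interrupted by close(): fall through and let the thread exit
                }
            }, PREFIX + id);
            status.setDaemon(true);
            status.start();
        }

        @Override
        public void close() {
            if (!stopOnClose) {
                return; // leaky variant: the status thread outlives the service
            }
            status.interrupt();
            try {
                status.join(); // wait until the thread has actually terminated
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Number of live threads whose name starts with PREFIX. */
    public static long countLive() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(Thread::isAlive)
                .filter(t -> t.getName().startsWith(PREFIX))
                .count();
    }

    /** Runs start/close cycles and returns how many matching threads were left behind. */
    public static long leakedAfter(int cycles, boolean stopOnClose) {
        long before = countLive();
        for (int i = 0; i < cycles; i++) {
            try (DemoService service = new DemoService(i, stopOnClose)) {
                // a test body would run here
            }
        }
        return countLive() - before;
    }

    public static void main(String[] args) {
        System.out.println("leaky close(): " + leakedAfter(3, false) + " threads left");
        System.out.println("fixed close(): " + leakedAfter(3, true) + " threads left");
    }
}
```

The leaky variant leaves one thread behind per cycle, while the variant that interrupts and joins its thread in `close()` leaves none. Measuring the difference before and after, rather than an absolute count, keeps earlier leaks in the same JVM from skewing the result.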

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Fri, Nov 6, 2015 at 6:44 PM, Jacques Nadeau <[email protected]> wrote:

> I see that we're bleeding Workmanager Status threads that aren't shutdown
> when the Drillbit is shutdown.
>
> I'll get a patch together.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Fri, Nov 6, 2015 at 4:31 PM, Hanifi Gunes <[email protected]> wrote:
>
>> Looks like we are possibly leaking some threads. Investigating.
>>
>> On Fri, Nov 6, 2015 at 4:25 PM, Jacques Nadeau <[email protected]> wrote:
>>
>> > Hmm.. that is quite strange. I wonder if we need to look at thread counts
>> > on the daemon.
>> >
>> > We haven't changed how we create but there were changes to shutdown
>> > (although I can't imagine why that would be a problem).
>> >
>> > --
>> > Jacques Nadeau
>> > CTO and Co-Founder, Dremio
>> >
>> > On Fri, Nov 6, 2015 at 4:11 PM, Hanifi Gunes <[email protected]> wrote:
>> >
>> > > Not the testAggregateWithEmptyRequiredInput but I got the following on
>> > > my branch rebased top of master -- @CentOS.
>> > >
>> > > Tests in error:
>> > >   TestImpersonationQueries.sequenceFileChainedImpersonationWithView » UserRemote
>> > >   TestImpersonationQueries.testMultiLevelImpersonationJoinEachSideReachesMaxUserHops:233->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » Rpc
>> > >   TestImpersonationQueries.testMultiLevelImpersonationExceedsMaxUserHops:219->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>> > >   TestImpersonationQueries.avroChainedImpersonationWithView:280->BaseTestImpersonation.createView:186->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>> > >   TestImpersonationQueries.testDirectImpersonation_HasGroupReadPermissions:186->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>> > >   TestImpersonationQueries.testDirectImpersonation_NoReadPermissions:196->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>> > >   TestImpersonationQueries.testMultiLevelImpersonationEqualToMaxUserHops:210->BaseTestQuery.updateClient:222->BaseTestQuery.updateClient:236->BaseTestQuery.updateClient:213 » IllegalState
>> > >
>> > > exception details --->
>> > >
>> > > testMultiLevelImpersonationExceedsMaxUserHops(org.apache.drill.exec.impersonation.TestImpersonationQueries)
>> > > Time elapsed: 0.008 sec <<< ERROR!
>> > > java.lang.IllegalStateException: failed to create a child event loop
>> > >     at sun.nio.ch.IOUtil.makePipe(Native Method)
>> > >     at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:126)
>> > >     at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:120)
>> > >     at io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:87)
>> > >     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
>> > >     at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
>> > >     at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:61)
>> > >     at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:52)
>> > >     at org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:74)
>> > >     at org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239)
>> > >     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220)
>> > >     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178)
>> > >     at org.apache.drill.QueryTestUtil.createClient(QueryTestUtil.java:67)
>> > >     at org.apache.drill.BaseTestQuery.updateClient(BaseTestQuery.java:213)
>> > >     at org.apache.drill.BaseTestQuery.updateClient(BaseTestQuery.java:236)
>> > >
>> > > My god's telling me that we are creating too many NioEventLoopGroup's.
>> > > Did we make any recent changes around RPC causing this?
>> > >
>> > > -Hanifi
>> > >
>> > > On Fri, Nov 6, 2015 at 3:58 PM, Jacques Nadeau <[email protected]> wrote:
>> > >
>> > > > Do you have that other output/stack trace I asked about? If we can also see
>> > > > the illegalreference count on something other than the JDBC client close
>> > > > method, that would be helpful.
>> > > >
>> > > > --
>> > > > Jacques Nadeau
>> > > > CTO and Co-Founder, Dremio
>> > > >
>> > > > On Fri, Nov 6, 2015 at 2:48 PM, Jinfeng Ni <[email protected]> wrote:
>> > > >
>> > > > > I just re-run, and the previous 4 failures are gone. But it failed
>> > > > > with two new ones:
>> > > > >
>> > > > > Tests in error:
>> > > > >   TestSqlStdBasedAuthorization.org.apache.drill.exec.impersonation.hive.TestSqlStdBasedAuthorization » UserRemote
>> > > > >   TestStorageBasedHiveAuthorization.org.apache.drill.exec.impersonation.hive.TestStorageBasedHiveAuthorization » UserRemote
>> > > > >
>> > > > > I re-start the machine, and there are not too many applications
>> > > > > running and the memory should be enough. At least some days back, I
>> > > > > got clean run on the same machine.
>> > > > >
>> > > > > On Fri, Nov 6, 2015 at 2:39 PM, Jacques Nadeau <[email protected]> wrote:
>> > > > > > Can you provide the complete output for this failure:
>> > > > > >
>> > > > > > TestAggregateFunctions.testAggregateWithEmptyRequiredInput:237 » IllegalReferenceCount
>> > > > > >
>> > > > > > I haven't seen the other issues. The last one looks like the system was
>> > > > > > having an issue since thread creation failure is usually an OS problem. Was
>> > > > > > your system under resourced?
>> > > > > >
>> > > > > > --
>> > > > > > Jacques Nadeau
>> > > > > > CTO and Co-Founder, Dremio
>> > > > > >
>> > > > > > On Fri, Nov 6, 2015 at 12:55 PM, Jinfeng Ni <[email protected]> wrote:
>> > > > > >
>> > > > > >> I'm seeing unit test case failure when run "mvn clean install" over
>> > > > > >> drill master branch, on Mac.
>> > > > > >>
>> > > > > >> The first one seems to be the issue #3 in Jacques's list. The last
>> > > > > >> three seems to different from the 4 issues.
>> > > > > >> Has anyone seen this
>> > > > > >> failure before, or it just happened to my mac? Thanks.
>> > > > > >>
>> > > > > >> =================================================
>> > > > > >> git log
>> > > > > >> commit 1a24233475ca46aaf2a49a5624b4042f088382f4
>> > > > > >>
>> > > > > >> Tests in error:
>> > > > > >>   TestAggregateFunctions.testAggregateWithEmptyRequiredInput:237 » IllegalReferenceCount
>> > > > > >>   TestImpersonationQueries.testMultiLevelImpersonationEqualToMaxUserHops » UserRemote
>> > > > > >>   TestImpersonationQueries.removeMiniDfsBasedStorage:294->BaseTestImpersonation.stopMiniDfsCluster:151 » OutOfMemory
>> > > > > >>   TestImpersonationQueries>BaseTestQuery.closeClient:260 » OutOfMemory unable to...
>> > > > > >>
>> > > > > >> Tests run: 1483, Failures: 0, Errors: 4, Skipped: 118
>> > > > > >>
>> > > > > >> [INFO] ------------------------------------------------------------------------
>> > > > > >> [INFO] Reactor Summary:
>> > > > > >> [INFO]
>> > > > > >> [INFO] Apache Drill Root POM .............................. SUCCESS [ 8.440 s]
>> > > > > >> [INFO] tools/Parent Pom ................................... SUCCESS [ 0.631 s]
>> > > > > >> [INFO] tools/freemarker codegen tooling ................... SUCCESS [ 5.236 s]
>> > > > > >> [INFO] Drill Protocol ..................................... SUCCESS [ 5.839 s]
>> > > > > >> [INFO] Common (Logical Plan, Base expressions) ............ SUCCESS [ 10.831 s]
>> > > > > >> [INFO] contrib/Parent Pom ................................. SUCCESS [ 0.815 s]
>> > > > > >> [INFO] contrib/data/Parent Pom ............................ SUCCESS [ 0.331 s]
>> > > > > >> [INFO] contrib/data/tpch-sample-data ...................... SUCCESS [ 2.838 s]
>> > > > > >> [INFO] exec/Parent Pom .................................... SUCCESS [ 0.635 s]
>> > > > > >> [INFO] exec/Java Execution Engine ......................... FAILURE [12:05 min]
>> > > > > >> [INFO] exec/JDBC Driver using dependencies ................ SKIPPED
>> > > > > >> [INFO] JDBC JAR with all dependencies ..................... SKIPPED
>> > > > > >> [INFO] contrib/mongo-storage-plugin ....................... SKIPPED
>> > > > > >>
>> > > > > >> Tests run: 11, Failures: 0, Errors: 3, Skipped: 0, Time elapsed:
>> > > > > >> 17.042 sec <<< FAILURE! - in
>> > > > > >> org.apache.drill.exec.impersonation.TestImpersonationQueries
>> > > > > >>
>> > > > > >> testMultiLevelImpersonationEqualToMaxUserHops(org.apache.drill.exec.impersonation.TestImpersonationQueries)
>> > > > > >> Time elapsed: 0.099 sec <<< ERROR!
>> > > > > >> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
>> > > > > >> OutOfMemoryError: unable to create new native thread
>> > > > > >>
>> > > > > >>
>> > > > > >> [Error Id: a826ac5d-e278-49bc-8f92-fdf241d0e634 on 10.250.50.52:31010]
>> > > > > >>     at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>> > > > > >>     at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112)
>> > > > > >>     at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>> > > > > >>     at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>> > > > > >>     at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:68)
>> > > > > >>     at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:390)
>> > > > > >>     at org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
>> > > > > >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> > > > > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> > > > > >>     at java.lang.Thread.run(Thread.java:744)
>> > > > > >>
>> > > > > >> On Fri, Nov 6, 2015 at 9:42 AM, Jacques Nadeau <[email protected]> wrote:
>> > > > > >> > It seems like we have four potentially show stopping issues at the
>> > > > > >> > moment:
>> > > > > >> >
>> > > > > >> > DRILL-4042: Windows build doesn't include right version of Hadoop
>> > > > > >> > dependencies
>> > > > > >> > DRILL-3480: Random message propagation timeouts
>> > > > > >> > DRILL-4041: Reference count issue
>> > > > > >> > DRILL-4046: Performance regression for some TPCH queries
>> > > > > >> >
>> > > > > >> > Proposed next steps:
>> > > > > >> >
>> > > > > >> > DRILL-4042 has a clear fix and reproduction. Patrick, do you think can
>> > > > > >> > have a fix up for this shortly?
>> > > > > >> >
>> > > > > >> > For the 3480 & 4041, consistent reproductions are missing. It would be
>> > > > > >> > great if everybody could try to help find reproductions to these issues.
>> > > > > >> > I think we should take stock again at the end of the day to decide next
>> > > > > >> > steps and whether we want to hold the release for these.
>> > > > > >> >
>> > > > > >> > For 4046: I've heard that there are some performance regressions around
>> > > > > >> > a couple of queries but the current symptoms don't make a lot of sense.
>> > > > > >> > I'd like to collect some more data here and then decide next steps.
>> > > > > >> >
>> > > > > >> > Let's see if we can get repros for each of the inconsistent issues and
>> > > > > >> > check in again EOD.
>> > > > > >> >
>> > > > > >> > thanks,
>> > > > > >> > Jacques
>> > > > > >> >
>> > > > > >> > --
>> > > > > >> > Jacques Nadeau
>> > > > > >> > CTO and Co-Founder, Dremio
>> > > > > >> >
>> > > > > >> > On Thu, Nov 5, 2015 at 3:36 PM, Aditya <[email protected]> wrote:
>> > > > > >> >
>> > > > > >> >> Ran into another one - DRILL-4042
>> > > > > >> >> <https://issues.apache.org/jira/browse/DRILL-4042>.
>> > > > > >> >>
>> > > > > >> >> On Thu, Nov 5, 2015 at 1:48 PM, Jacques Nadeau <[email protected]> wrote:
>> > > > > >> >>
>> > > > > >> >>> Yeah, I think that sinks it. Weird how Rat complains only on windows...
>> > > > > >> >>>
>> > > > > >> >>> Let's take the rest of the business day to test the current candidate to
>> > > > > >> >>> make sure that we don't spin extra builds unnecessarily.
>> > > > > >> >>>
>> > > > > >> >>> thanks,
>> > > > > >> >>> Jacques
>> > > > > >> >>>
>> > > > > >> >>> --
>> > > > > >> >>> Jacques Nadeau
>> > > > > >> >>> CTO and Co-Founder, Dremio
>> > > > > >> >>>
>> > > > > >> >>> On Thu, Nov 5, 2015 at 1:24 PM, Aditya <[email protected]> wrote:
>> > > > > >> >>>
>> > > > > >> >>>> Oh, I thought only master/trunk branch was protected, but now I see the
>> > > > > >> >>>> mail from David Nalley.
>> > > > > >> >>>>
>> > > > > >> >>>> In such case, I propose that the release manager could push the branch
>> > > > > >> >>>> to his/her private fork and put the URL/hash in the vote starter thread.
>> > > > > >> >>>>
>> > > > > >> >>>> The reason I was looking to the commit history to determine if the
>> > > > > >> >>>> candidate suffer from DRILL-4040, which, evidently it does.
>> > > > > >> >>>>
>> > > > > >> >>>> -1 as the build from source is failing.
>> > > > > >> >>>>
>> > > > > >> >>>> [1] https://issues.apache.org/jira/browse/DRILL-4040
>> > > > > >> >>>>
>> > > > > >> >>>> On Thu, Nov 5, 2015 at 1:12 PM, Jacques Nadeau <[email protected]> wrote:
>> > > > > >> >>>>
>> > > > > >> >>>>> I'm not sure what to do here. INFRA just changed the Git behavior so it
>> > > > > >> >>>>> is no longer possible to delete branches. I generally don't like to have
>> > > > > >> >>>>> failed branches in a release history (otherwise you get a release branch
>> > > > > >> >>>>> with all these maven forward/backwards commits). As such, I would
>> > > > > >> >>>>> overwrite candidate branches historically (dropping the failed release
>> > > > > >> >>>>> commits).
>> > > > > >> >>>>>
>> > > > > >> >>>>> The commit is here right now:
>> > > > > >> >>>>> https://github.com/jacques-n/drill/tree/drill-1.3.0-rc0
>> > > > > >> >>>>>
>> > > > > >> >>>>> The parent of 4822068a006aeb251b686d2b51871573c4337e60
>> > > > > >> >>>>> is
>> > > > > >> >>>>> 3dedc158f3af8ec8320a9cd336b2798b09cc9a8d (the tip of master)
>> > > > > >> >>>>>
>> > > > > >> >>>>> --
>> > > > > >> >>>>> Jacques Nadeau
>> > > > > >> >>>>> CTO and Co-Founder, Dremio
>> > > > > >> >>>>>
>> > > > > >> >>>>> On Thu, Nov 5, 2015 at 1:01 PM, Aditya <[email protected]> wrote:
>> > > > > >> >>>>>
>> > > > > >> >>>>>> I am having trouble determining the git commit this release is based
>> > > > > >> >>>>>> on as I could not find the
>> > > > > >> >>>>>> id (4822068a006aeb251b686d2b51871573c4337e60) captured in the
>> > > > > >> >>>>>> git.properties bundled in the
>> > > > > >> >>>>>> tarballs in the Drill Git repository.
>> > > > > >> >>>>>>
>> > > > > >> >>>>>> Most likely the last commit is only in your local branch and since
>> > > > > >> >>>>>> git.properties captures only the
>> > > > > >> >>>>>> last commit, it is impossible to find the parent commit.
>> > > > > >> >>>>>>
>> > > > > >> >>>>>> Would it make sense to push the release branch?
>> > > > > >> >>>>>>
>> > > > > >> >>>>>> aditya...
>> > > > > >> >>>>>>
>> > > > > >> >>>>>> On Wed, Nov 4, 2015 at 11:08 PM, Jacques Nadeau <[email protected]> wrote:
>> > > > > >> >>>>>>
>> > > > > >> >>>>>> > Hey Everybody,
>> > > > > >> >>>>>> >
>> > > > > >> >>>>>> > I'm happy to propose a new release of Apache Drill, version 1.3.0.
>> > > > > >> >>>>>> > This is the first release candidate (rc0). It covers a total of ~50
>> > > > > >> >>>>>> > closed JIRAs [1].
>> > > > > >> >>>>>> >
>> > > > > >> >>>>>> > The tarball artifacts are hosted at [2] and the maven artifacts are
>> > > > > >> >>>>>> > hosted at [3].
>> > > > > >> >>>>>> >
>> > > > > >> >>>>>> > The vote will be open for 72 hours ending at 11PM Pacific, November 7,
>> > > > > >> >>>>>> > 2015.
>> > > > > >> >>>>>> >
>> > > > > >> >>>>>> > [ ] +1
>> > > > > >> >>>>>> > [ ] +0
>> > > > > >> >>>>>> > [ ] -1
>> > > > > >> >>>>>> >
>> > > > > >> >>>>>> > thanks,
>> > > > > >> >>>>>> > Jacques
>> > > > > >> >>>>>> >
>> > > > > >> >>>>>> > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12332946
>> > > > > >> >>>>>> > [2] http://people.apache.org/~jacques/apache-drill-1.3.0.rc0/
>> > > > > >> >>>>>> > [3] https://repository.apache.org/content/repositories/orgapachedrill-1013/
