[ 
https://issues.apache.org/jira/browse/DRILL-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464867#comment-16464867
 ] 

ASF GitHub Bot commented on DRILL-6380:
---------------------------------------

asfgit closed pull request #1249: DRILL-6380: Fix sporadic mongo db hangs.
URL: https://github.com/apache/drill/pull/1249

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestSuit.java b/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestSuit.java
index b3f0bd1374..487396d70f 100644
--- a/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestSuit.java
+++ b/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestSuit.java
@@ -20,9 +20,10 @@
 import java.io.IOException;
 import java.net.UnknownHostException;
 import java.util.ArrayList;
-import java.util.HashMap;
+import java.util.LinkedHashMap;
 import java.util.List;
 import java.util.Map;
+import java.util.TreeMap;
 import java.util.concurrent.atomic.AtomicInteger;
 
 import com.google.common.collect.Lists;
@@ -94,7 +95,9 @@ private static void setup() throws Exception {
       configServers.add(crateConfigServerConfig(CONFIG_SERVER_3_PORT));
 
       // creating replicaSets
-      Map<String, List<IMongodConfig>> replicaSets = new HashMap<>();
+      // A LinkedHashMap ensures that the config servers are started first.
+      Map<String, List<IMongodConfig>> replicaSets = new LinkedHashMap<>();
+
       List<IMongodConfig> replicaSet1 = new ArrayList<>();
       replicaSet1.add(crateIMongodConfig(MONGOD_1_PORT, false,
           REPLICA_SET_1_NAME));
@@ -102,7 +105,6 @@ private static void setup() throws Exception {
           REPLICA_SET_1_NAME));
       replicaSet1.add(crateIMongodConfig(MONGOD_3_PORT, false,
           REPLICA_SET_1_NAME));
-      replicaSets.put(REPLICA_SET_1_NAME, replicaSet1);
       List<IMongodConfig> replicaSet2 = new ArrayList<>();
       replicaSet2.add(crateIMongodConfig(MONGOD_4_PORT, false,
           REPLICA_SET_2_NAME));
@@ -110,8 +112,10 @@ private static void setup() throws Exception {
           REPLICA_SET_2_NAME));
       replicaSet2.add(crateIMongodConfig(MONGOD_6_PORT, false,
           REPLICA_SET_2_NAME));
-      replicaSets.put(REPLICA_SET_2_NAME, replicaSet2);
+
       replicaSets.put(CONFIG_REPLICA_SET, configServers);
+      replicaSets.put(REPLICA_SET_1_NAME, replicaSet1);
+      replicaSets.put(REPLICA_SET_2_NAME, replicaSet2);
 
       // create mongos
       IMongosConfig mongosConfig = createIMongosConfig();
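
The comment added in the diff relies on LinkedHashMap iterating in insertion order, whereas HashMap makes no ordering guarantee. The sketch below illustrates that property only. It assumes the test harness starts replica sets by iterating over the map, which the diff implies but does not show. The class name, key names, and port numbers are hypothetical and are not taken from MongoTestSuit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the Drill test suite.
class StartupOrderSketch {

  // Returns the keys in the order the map yields them during iteration,
  // standing in for the order in which replica sets would be started.
  static List<String> startupOrder(Map<String, List<Integer>> replicaSets) {
    return new ArrayList<>(replicaSets.keySet());
  }

  // Builds the map the way the patched setup() does: config servers are
  // inserted first, followed by the two shard replica sets.
  static Map<String, List<Integer>> buildReplicaSets() {
    Map<String, List<Integer>> replicaSets = new LinkedHashMap<>();
    // Key names and ports below are made up for the example.
    replicaSets.put("config_servers", Arrays.asList(27019, 27020, 27021));
    replicaSets.put("shard_1_replicas", Arrays.asList(27022, 27023, 27024));
    replicaSets.put("shard_2_replicas", Arrays.asList(27025, 27026, 27027));
    return replicaSets;
  }

  public static void main(String[] args) {
    // LinkedHashMap preserves insertion order, so the config servers come first.
    System.out.println(startupOrder(buildReplicaSets()));
  }
}
```

With a plain HashMap the same three put calls could be iterated in any order, which would be consistent with the sporadic nature of the hangs if startup order matters.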


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Mongo db storage plugin tests can hang on jenkins.
> --------------------------------------------------
>
>                 Key: DRILL-6380
>                 URL: https://issues.apache.org/jira/browse/DRILL-6380
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.14.0
>
>
> When running on our Jenkins server, the MongoDB tests hang because the config 
> servers take up to 5 seconds to process each request (see *Error 1*). As a 
> result, the tests never finish within a reasonable span of time. Searching 
> online shows that people run into this issue when mixing versions of MongoDB, 
> but that is not happening in our tests. A possible cause is *Error 2*, which 
> seems to indicate that the MongoDB config servers are not completely 
> initialized, since the config servers should have a lockping document when 
> starting up.
> *Error 1*
> {code}
> [mongod output] 2018-05-01T23:38:47.468-0700 I COMMAND  
> [replSetDistLockPinger] command config.lockpings command: findAndModify { 
> findAndModify: "lockpings", query: { _id: "ConfigServer" }, update: { $set: { 
> ping: new Date(1525243123413) } }, upsert: true, writeConcern: { w: 
> "majority", wtimeout: 15000 } } planSummary: IDHACK update: { $set: { ping: 
> new Date(1525243123413) } } keysExamined:0 docsExamined:0 nMatched:0 
> nModified:0 upsert:1 keysInserted:2 numYields:0 reslen:198 locks:{ Global: { 
> acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, 
> Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } }, 
> oplog: { acquireCount: { w: 1 } } } protocol:op_query 4055ms
> [mongod output] 2018-05-01T23:38:47.469-0700 W SHARDING 
> [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused 
> by :: LockStateChangeFailed: findAndModify query predicate didn't match any 
> lock document
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] lock 
> 'balancer' successfully forced
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] 
> distributed lock 'balancer' acquired, ts : 5ae95cd5d1023488104e6282
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS 
> balancer thread is recovering
> [mongod output] 2018-05-01T23:38:47.498-0700 I SHARDING [Balancer] CSRS 
> balancer thread is recovered
> [mongod output] 2018-05-01T23:38:48.056-0700 I NETWORK  [thread2] connection 
> accepted from 127.0.0.1:50244 #10 (7 connections now open)
> {code}
> *Error 2*
> {code}
> [mongod output] 2018-05-01T23:39:37.690-0700 I COMMAND  [conn7] command 
> config.settings command: find { find: "settings", filter: { _id: "chunksize" 
> }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp 
> 1525243172000|1, t: 1 } }, limit: 1, maxTimeMS: 30000 } planSummary: EOF 
> keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:0 nreturned:0 
> reslen:354 locks:{ Global: { acquireCount: { r: 2 } }, Database: { 
> acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 
> protocol:op_command 4988ms
> {code}
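
Both log excerpts above end with the command duration in milliseconds (4055ms and 4988ms). As a rough aid for spotting such slow commands in captured mongod output, the following sketch extracts that trailing duration and filters entries above a threshold. It assumes each log entry has already been rejoined into a single string (the excerpts above are wrapped by the mail client). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of Drill or the MongoDB tooling.
class SlowCommandSketch {

  // Matches a trailing duration such as "4055ms" at the end of a log entry.
  private static final Pattern DURATION = Pattern.compile("(\\d+)ms\\s*$");

  // Returns the duration in milliseconds, or -1 if the entry has no
  // trailing duration (for example, a SHARDING or NETWORK message).
  static long durationMillis(String logEntry) {
    Matcher m = DURATION.matcher(logEntry);
    return m.find() ? Long.parseLong(m.group(1)) : -1L;
  }

  // Keeps only the entries whose duration is at least thresholdMillis.
  static List<String> slowEntries(List<String> entries, long thresholdMillis) {
    List<String> slow = new ArrayList<>();
    for (String entry : entries) {
      if (durationMillis(entry) >= thresholdMillis) {
        slow.add(entry);
      }
    }
    return slow;
  }
}
```

For instance, calling slowEntries with a threshold of 4000 on the rejoined entries from *Error 1* and *Error 2* would keep the findAndModify and find commands and drop the Balancer and NETWORK messages.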



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
