The first WARN in your log was raised when the ZooKeeper (ZK) session expired. Please check the ZK connection first.
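
As a starting point for that check, here is a minimal sketch that probes each ZK server from the Kylin host with the `ruok` four-letter command. It assumes the command is enabled (it is by default on ZooKeeper 3.4; on 3.5 and later it has to be allowed through `4lw.commands.whitelist`). The host names are copied from the connectString in the log, and `zk_ruok` and `check_servers` are illustrative names, not part of Kylin or ZooKeeper.

```python
import socket

# Hosts copied from the connectString in the quoted log; adjust for your cluster.
ZK_SERVERS = [
    ("hadoop-senior01.ctcf.com", 2181),
    ("hadoop-senior02.ctcf.com", 2181),
    ("hadoop-senior03.ctcf.com", 2181),
]


def zk_ruok(host, port, timeout=5.0):
    """Send the ZooKeeper 'ruok' command and return the raw reply as text.

    Returns an empty string if the server closes the connection without
    replying (for example when the command is not whitelisted).
    Raises OSError if the connection fails or times out.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"ruok")
        chunks = []
        while True:
            data = sock.recv(64)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def check_servers(servers, timeout=5.0):
    """Probe every (host, port) pair and return {'host:port': status}."""
    results = {}
    for host, port in servers:
        key = f"{host}:{port}"
        try:
            reply = zk_ruok(host, port, timeout)
            if reply == "imok":
                results[key] = "ok"
            else:
                results[key] = f"unexpected reply: {reply!r}"
        except OSError as exc:  # refused, timed out, DNS failure, ...
            results[key] = f"unreachable: {exc}"
    return results
```

Calling `check_servers(ZK_SERVERS)` on the Kylin host returns one status per server. An "ok" only shows that the server is up and reachable at that moment; it does not by itself explain why a session expired, so it is worth running it again around the time the problem occurs.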

2017-08-10 10:58 GMT+08:00 shendandan <shendandan2015y...@163.com>:

> Hi!
> The Kylin process often hangs, and these errors appear in kylin.log:
>
> 2017-08-10 10:14:22,292 INFO [BadQueryDetector] service.BadQueryDetector:151 : System free memory less than 100 MB. 1 queries running.
> 2017-08-10 10:14:34,498 INFO [Thread-11-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED
> 2017-08-10 10:14:34,506 INFO [pool-7-thread-1] threadpool.DefaultScheduler:118 : Job Fetcher: 0 should running, 0 actual running, 0 ready, 63 already succeed, 3 error, 13 discarded, 0 others
> 2017-08-10 10:14:34,999 INFO [localhost-startStop-1-SendThread(hadoop-senior01.ctcf.com:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server hadoop-senior01.ctcf.com/10.1.8.90:2181. Will not attempt to authenticate using SASL (unknown error)
> 2017-08-10 10:14:34,999 INFO [Thread-11-SendThread(hadoop-senior03.ctcf.com:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server hadoop-senior03.ctcf.com/10.1.8.92:2181. Will not attempt to authenticate using SASL (unknown error)
> 2017-08-10 10:14:34,999 INFO [localhost-startStop-1-SendThread(hadoop-senior01.ctcf.com:2181)] zookeeper.ClientCnxn:852 : Socket connection established, initiating session, client: /10.1.8.90:38229, server: hadoop-senior01.ctcf.com/10.1.8.90:2181
> 2017-08-10 10:14:34,999 INFO [Thread-11-SendThread(hadoop-senior03.ctcf.com:2181)] zookeeper.ClientCnxn:852 : Socket connection established, initiating session, client: /10.1.8.90:32962, server: hadoop-senior03.ctcf.com/10.1.8.92:2181
> 2017-08-10 10:14:35,001 INFO [Thread-11-SendThread(hadoop-senior03.ctcf.com:2181)] zookeeper.ClientCnxn:1094 : Unable to reconnect to ZooKeeper service, session 0x25dc4bd90cc002a has expired, closing socket connection
> 2017-08-10 10:14:35,002 INFO [localhost-startStop-1-SendThread(hadoop-senior01.ctcf.com:2181)] zookeeper.ClientCnxn:1094 : Unable to reconnect to ZooKeeper service, session 0x35dc4bd90460028 has expired, closing socket connection
> 2017-08-10 10:14:35,013 WARN [localhost-startStop-1-EventThread] client.HConnectionManager$HConnectionImplementation:2468 : This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:517)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:435)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2017-08-10 10:14:35,013 INFO [Thread-11-EventThread] state.ConnectionStateManager:228 : State change: LOST
> 2017-08-10 10:14:35,014 INFO [localhost-startStop-1-EventThread] client.HConnectionManager$HConnectionImplementation:1837 : Closing zookeeper sessionid=0x35dc4bd90460028
> 2017-08-10 10:14:35,014 WARN [Thread-11-EventThread] curator.ConnectionState:289 : Session expired event received
> 2017-08-10 10:14:35,014 INFO [localhost-startStop-1-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
> 2017-08-10 10:14:35,015 INFO [Thread-11-EventThread] zookeeper.ZooKeeper:438 : Initiating client connection, connectString=hadoop-senior01.ctcf.com:2181,hadoop-senior02.ctcf.com:2181,hadoop-senior03.ctcf.com:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@4d6f4fa7
> 2017-08-10 10:14:35,017 INFO [Thread-11-SendThread(hadoop-senior02.ctcf.com:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server hadoop-senior02.ctcf.com/10.1.8.91:2181. Will not attempt to authenticate using SASL (unknown error)
> 2017-08-10 10:14:35,017 INFO [Thread-11-SendThread(hadoop-senior02.ctcf.com:2181)] zookeeper.ClientCnxn:852 : Socket connection established, initiating session, client: /10.1.8.90:57503, server: hadoop-senior02.ctcf.com/10.1.8.91:2181
> 2017-08-10 10:14:35,021 INFO [Thread-11-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
> 2017-08-10 10:14:35,021 INFO [Thread-11-SendThread(hadoop-senior02.ctcf.com:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server hadoop-senior02.ctcf.com/10.1.8.91:2181, sessionid = 0x25dc4bd90cc002c, negotiated timeout = 40000
> 2017-08-10 10:14:35,021 INFO [Thread-11-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
> 2017-08-10 10:15:09,204 INFO [pool-7-thread-1] threadpool.DefaultScheduler:118 : Job Fetcher: 0 should running, 0 actual running, 0 ready, 63 already succeed, 3 error, 13 discarded, 0 others
> 2017-08-10 10:15:26,969 INFO [BadQueryDetector] service.BadQueryDetector:179 : Slow query has been running 551.737 seconds (project:eam_contract_baseinfo, thread: 0x4a, user:ADMIN) -- SELECT * FROM "TN_QUERY"."BASEINFO_REPAY_HIS_FACT_PARTATION"
> 2017-08-10 10:15:26,971 DEBUG [BadQueryDetector] badquery.BadQueryHistoryManager:84 : Loaded 10 Bad Query(s)
> 2017-08-10 10:15:26,976 DEBUG [BadQueryDetector] hbase.HBaseResourceStore:262 : Update row /bad_query/eam_contract_baseinfo.json from oldTs: 1502331183691, to newTs: 1502331326972, operation result: true
> 2017-08-10 10:15:26,977 INFO [BadQueryDetector] service.BadQueryDetector:169 : Problematic thread 0x4a
>     at org.apache.kylin.metadata.datatype.BigDecimalSerializer.deserialize(BigDecimalSerializer.java:74)
>     at org.apache.kylin.metadata.datatype.BigDecimalSerializer.deserialize(BigDecimalSerializer.java:33)
>     at org.apache.kylin.cube.gridtable.CubeCodeSystem.decodeColumnValue(CubeCodeSystem.java:133)
>     at org.apache.kylin.gridtable.GTRecord.getValues(GTRecord.java:129)
>     at org.apache.kylin.storage.gtrecord.CubeTupleConverter.translateResult(CubeTupleConverter.java:207)
>     at org.apache.kylin.storage.gtrecord.SegmentCubeTupleIterator.hasNext(SegmentCubeTupleIterator.java:100)
>     at com.google.common.collect.Iterators$5.hasNext(Iterators.java:596)
>     at org.apache.kylin.storage.gtrecord.SequentialCubeTupleIterator.hasNext(SequentialCubeTupleIterator.java:84)
>     at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:68)
>     at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.next(Linq4j.java:673)
>
> How can I solve this problem? Thank you!