Re: Re: when build kylin cube,the job stopping at 77.78%,can't go on the job
Hi Kun!

I don't know why we need such an extra config. I guess there is something I missed.

Best regards!
wangxianbin1...@gmail.com

From: zhangrongkun
Date: 2016-05-06 10:14
To: dev
Subject: Re: when build kylin cube,the job stopping at 77.78%,can't go on the job

After adding this config to my kylin_hive_conf.xml and kylin_job_conf.xml, my cube built successfully. The config is:

hbase.zookeeper.quorum
hadoop21:2181,hadoop25:2181,hadoop37:2181,hadoop45:2181,hadoop140:2181

--
View this message in context: http://apache-kylin.74782.x6.nabble.com/when-build-kylin-cube-the-job-stopping-at-77-78-can-t-go-on-the-job-tp4421p4438.html
Sent from the Apache Kylin mailing list archive at Nabble.com.
Re: when build kylin cube,the job stopping at 77.78%,can't go on the job
After adding this config to my kylin_hive_conf.xml and kylin_job_conf.xml, my cube built successfully. The config is:

hbase.zookeeper.quorum
hadoop21:2181,hadoop25:2181,hadoop37:2181,hadoop45:2181,hadoop140:2181
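The archive kept only the property name and value of that config and dropped whatever markup surrounded them. The sketch below is an assumption, not the poster's actual file: it supposes the entry used the usual Hadoop <property>/<name>/<value> layout, and it uses two hypothetical helper functions (read_property, quorum_hosts) to confirm that a config file really carries the quorum before a cube build is started.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the entry added to kylin_hive_conf.xml and
# kylin_job_conf.xml. Only the name and the value come from the thread;
# the surrounding XML layout is the standard Hadoop one and is a guess here.
SAMPLE_CONF = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop21:2181,hadoop25:2181,hadoop37:2181,hadoop45:2181,hadoop140:2181</value>
  </property>
</configuration>"""


def read_property(conf_xml, name):
    """Return the value of a Hadoop-style property, or None if it is absent."""
    root = ET.fromstring(conf_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def quorum_hosts(conf_xml):
    """Split hbase.zookeeper.quorum into (host, port) string pairs."""
    value = read_property(conf_xml, "hbase.zookeeper.quorum") or ""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        pairs.append((host, port))
    return pairs


if __name__ == "__main__":
    for host, port in quorum_hosts(SAMPLE_CONF):
        print(host, port)
```

To check a real file, read its text and pass it to quorum_hosts; an empty result means the property is missing from that file.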
Re: Re: when build kylin cube,the job stopping at 77.78%,can't go on the job
Yes, the reducer's log shows this exception:

2016-05-06 09:33:29,020 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/hadoop
2016-05-06 09:33:29,020 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/var/u01/hadoop/yarn_dir/local/usercache/hadoop/appcache/application_1461663034262_4223/container_1461663034262_4223_01_19
2016-05-06 09:33:29,021 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=18 watcher=hconnection-0x24fc81b40x0, quorum=localhost:2181, baseZNode=/hbase
2016-05-06 09:33:29,221 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-05-06 09:33:29,223 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-05-06 09:33:29,329 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-05-06 09:33:29,329 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-05-06 09:33:29,335 INFO [main] org.apache.hadoop.hbase.util.RetryCounter: Sleeping 1000ms before retry #0...
2016-05-06 09:33:30,430 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-05-06 09:33:30,431 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-05-06 09:33:30,531 INFO [main] org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
2016-05-06 09:33:30,531 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-05-06 09:33:30,532 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-05-06 09:33:31,632 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2016-05-06 09:33:31,633 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-05-06 09:33:31,733 INFO [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown
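The telling detail in this log is connectString=localhost:2181: the reducer is looking for ZooKeeper on its own node rather than on the cluster's quorum, which is consistent with hbase.zookeeper.quorum not reaching the task. As a rough sketch (the function names and the idea of scanning task logs this way are illustrative assumptions, not part of Kylin or HBase), a few lines of Python can flag this pattern in a saved task log:

```python
import re

# Matches the connect string that the ZooKeeper client prints when it starts.
CONNECT_RE = re.compile(r"connectString=(\S+)")
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def connect_strings(log_text):
    """Return every ZooKeeper connect string found in the log text."""
    return CONNECT_RE.findall(log_text)


def uses_local_zookeeper(log_text):
    """True if any connect string names localhost or 127.0.0.1 as a host."""
    for connect in connect_strings(log_text):
        for entry in connect.split(","):
            host = entry.partition(":")[0]
            if host in LOCAL_HOSTS:
                return True
    return False


# One line copied from the reducer log in this message.
SAMPLE_LINE = (
    "2016-05-06 09:33:29,021 INFO [main] org.apache.zookeeper.ZooKeeper: "
    "Initiating client connection, connectString=localhost:2181 "
    "sessionTimeout=18 watcher=hconnection-0x24fc81b40x0, "
    "quorum=localhost:2181, baseZNode=/hbase"
)

if __name__ == "__main__":
    print(connect_strings(SAMPLE_LINE))
    print(uses_local_zookeeper(SAMPLE_LINE))
```

On a multi-node cluster a hit from uses_local_zookeeper is a hint to check which configuration files the MapReduce task actually received.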
Re: Re: when build kylin cube,the job stopping at 77.78%,can't go on the job
Hi Kun!

I mean check the status and logs of your Hadoop MapReduce tasks; something may have gone wrong there. Note that the HBase client retries many times before it throws the exception, which means you need to wait a while (in my case, over 30 minutes) before you can see the error in the MapReduce task log.

Best regards!
wangxianbin1...@gmail.com

From: zhangrongkun
Date: 2016-05-05 18:55
To: dev
Subject: Re: when build kylin cube,the job stopping at 77.78%,can't go on the job

No! My Kylin log doesn't show any exception, and my HBase works well.
Re: when build kylin cube,the job stopping at 77.78%,can't go on the job
No! My Kylin log doesn't show any exception, and my HBase works well. Here is my MRApp log:

2016-05-05 18:32:37,273 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 2
2016-05-05 18:32:37,274 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
2016-05-05 18:32:37,274 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1461663034262_1777_01_10 to attempt_1461663034262_1777_r_00_0
2016-05-05 18:32:37,274 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned to reduce
2016-05-05 18:32:37,275 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1461663034262_1777_01_11 to attempt_1461663034262_1777_r_01_0
2016-05-05 18:32:37,275 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:7 AssignedReds:2 CompletedMaps:1 CompletedReds:0 ContAlloc:10 ContRel:0 HostLocal:2 RackLocal:6
2016-05-05 18:32:37,290 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved hadoop22 to /default-rack
2016-05-05 18:32:37,291 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1461663034262_1777_r_00_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2016-05-05 18:32:37,292 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved hadoop24 to /default-rack
2016-05-05 18:32:37,293 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1461663034262_1777_r_01_0 TaskAttempt Transitioned from UNASSIGNED to ASSIGNED
2016-05-05 18:32:37,294 INFO [ContainerLauncher #4] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1461663034262_1777_01_11 taskAttempt attempt_1461663034262_1777_r_01_0
2016-05-05 18:32:37,294 INFO [ContainerLauncher #4] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1461663034262_1777_r_01_0
2016-05-05 18:32:37,295 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_LAUNCH for container container_1461663034262_1777_01_10 taskAttempt attempt_1461663034262_1777_r_00_0
2016-05-05 18:32:37,295 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Launching attempt_1461663034262_1777_r_00_0
2016-05-05 18:32:37,301 INFO [ContainerLauncher #4] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1461663034262_1777_r_01_0 : 13562
2016-05-05 18:32:37,301 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Shuffle port returned by ContainerManager for attempt_1461663034262_1777_r_00_0 : 13562
2016-05-05 18:32:37,301 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1461663034262_1777_r_01_0] using containerId: [container_1461663034262_1777_01_11 on NM: [hadoop24:45083]
2016-05-05 18:32:37,302 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1461663034262_1777_r_01_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2016-05-05 18:32:37,302 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1461663034262_1777_r_00_0] using containerId: [container_1461663034262_1777_01_10 on NM: [hadoop22:43623]
2016-05-05 18:32:37,303 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1461663034262_1777_r_00_0 TaskAttempt Transitioned from ASSIGNED to RUNNING
2016-05-05 18:32:37,303 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1461663034262_1777_r_01
2016-05-05 18:32:37,304 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1461663034262_1777_r_01 Task Transitioned from SCHEDULED to RUNNING
2016-05-05 18:32:37,304 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1461663034262_1777_r_00
2016-05-05 18:32:37,304 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1461663034262_1777_r_00 Task Transitioned from SCHEDULED to RUNNING
2016-05-05 18:32:37,968 INFO [IPC Server handler 1 on 45950]
Re: when build kylin cube,the job stopping at 77.78%,can't go on the job
Hi Kun!

Check your YARN log and see whether this is a duplicate of https://issues.apache.org/jira/browse/KYLIN-1659

Best regards!
wangxianbin1...@gmail.com

From: zhangrongkun
Date: 2016-05-05 18:46
To: dev
Subject: when build kylin cube,the job stopping at 77.78%,can't go on the job

My cluster has enough resources, and kylin.log doesn't show any exception.
<http://apache-kylin.74782.x6.nabble.com/file/n4421/QQ%E6%88%AA%E5%9B%BE20160505185528.png>

--
View this message in context: http://apache-kylin.74782.x6.nabble.com/when-build-kylin-cube-the-job-stopping-at-77-78-can-t-go-on-the-job-tp4421.html