I found the reason, but do not know the solution:

in the 36000seconds, the mapper does only one thing:

016-06-16 16:39:08,622 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #211
2016-06-16 16:39:08,622 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 1
2016-06-16 16:39:11,623 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #212
2016-06-16 16:39:11,624 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #212
2016-06-16 16:39:11,624 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 1
2016-06-16 16:39:14,625 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #213
2016-06-16 16:39:14,626 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #213
2016-06-16 16:39:14,626 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 1
2016-06-16 16:39:17,627 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #214
2016-06-16 16:39:17,628 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #214
2016-06-16 16:39:17,628 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 1
2016-06-16 16:39:20,629 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #215
2016-06-16 16:39:20,630 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #215
2016-06-16 16:39:20,630 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 1
2016-06-16 16:39:23,631 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #216
2016-06-16 16:39:23,632 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #216
2016-06-16 16:39:23,632 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 1
2016-06-16 16:39:26,633 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #217
2016-06-16 16:39:26,634 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #217
2016-06-16 16:39:26,635 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 2
2016-06-16 16:39:29,635 DEBUG [IPC Parameter Sending Thread #0] 
org.apache.hadoop.ipc.Client: IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024 sending #218
2016-06-16 16:39:29,636 DEBUG [IPC Client (1991820533) connection to 
/10.106.4.2:49222 from job_1466061903809_0024] org.apache.hadoop.ipc.Client: 
IPC Client (1991820533) connection to /10.106.4.2:49222 from 
job_1466061903809_0024 got value #218
2016-06-16 16:39:29,637 DEBUG [communication thread] org.apache.hadoop.ipc.RPC: 
Call: ping 2


The target machine is listening at the port, so the connection is there. I think the connection is built to write the map output to the directory specified by mapreduce.cluster.local.dir. But I do not see any data at this directory on the target machine. The directory is write-allowed for any users. Maybe the remote-disk-eriting is too long?

Jie




Am 16.06.2016 um 10:19 schrieb Jie Tao:
Both mappers hangs in spilling map output. May some Hadoop configuration be wrong with our cluster?



e: Handled 1200000 records!
2016-06-16 10:12:01,895 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1300000 records! 2016-06-16 10:12:02,571 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1400000 records! 2016-06-16 10:12:03,247 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1500000 records! 2016-06-16 10:12:03,948 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1600000 records! 2016-06-16 10:12:04,647 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1700000 records! 2016-06-16 10:12:05,368 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1800000 records! 2016-06-16 10:12:06,056 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 1900000 records! 2016-06-16 10:12:06,744 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2000000 records! 2016-06-16 10:12:07,445 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2100000 records! 2016-06-16 10:12:08,143 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2200000 records! 2016-06-16 10:12:08,842 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2300000 records! 2016-06-16 10:12:09,540 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2400000 records! 2016-06-16 10:12:10,238 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2500000 records! 2016-06-16 10:12:10,938 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2600000 records! 2016-06-16 10:12:11,624 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2700000 records! 2016-06-16 10:12:12,312 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2800000 records! 2016-06-16 10:12:12,999 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 2900000 records! 2016-06-16 10:12:13,698 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3000000 records! 2016-06-16 10:12:14,387 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3100000 records! 2016-06-16 10:12:15,075 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3200000 records! 2016-06-16 10:12:15,764 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3300000 records! 2016-06-16 10:12:16,460 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3400000 records! 2016-06-16 10:12:17,147 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3500000 records! 2016-06-16 10:12:17,834 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3600000 records! 2016-06-16 10:12:18,520 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3700000 records! 2016-06-16 10:12:19,216 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3800000 records! 2016-06-16 10:12:19,901 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 3900000 records! 2016-06-16 10:12:20,590 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 4000000 records! 2016-06-16 10:12:21,286 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 4100000 records! 2016-06-16 10:12:21,970 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 4200000 records! 2016-06-16 10:12:22,657 INFO [main] org.apache.kylin.engine.mr.steps.BaseCuboidMapperBase: Handled 4300000 records! 2016-06-16 10:12:22,801 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output 2016-06-16 10:12:22,802 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output 2016-06-16 10:12:22,802 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 190884986; bufvoid = 536870912 2016-06-16 10:12:22,802 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 134217724(536870896); kvend = 116938212(467752848); length = 17279513/33554432



Cheers,

Jie


Am 16.06.2016 um 04:38 schrieb hongbin ma:
​Since you intermediate table is merely 200M, each mapper deals with 100M fact table. This is not very heavy for a mapper. Have you checked the log
for the mapper to know what is it doing in the 3600s?

On Thu, Jun 16, 2016 at 10:35 AM, hongbin ma <[email protected]> wrote:

3.7 GB of 8.4 GB virtual memory used

​do you have any idea why the container is using lots of virtual memory? is there any heavy contention ​on the memory resource? Not sure if it's the
root cause for slow mappers
​


--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone





Reply via email to