Hi,

    I've had this nagging problem where a single task hangs and the
    entire job hangs with it. I'm using PySpark on Spark 1.5.1.

    

    The job output looks like this; it hangs after 31 of the 32 tasks
    have finished:

    

    ......

    15/12/29 17:00:38 INFO BlockManagerInfo: Added broadcast_0_piece0 in
    memory on 10.65.143.174:34385 (size: 5.8 KB, free: 2.1 GB)

    15/12/29 17:00:39 INFO TaskSetManager: Finished task 15.0 in stage
    0.0 (TID 15) in 11668 ms on 10.65.143.174 (29/32)

    15/12/29 17:00:39 INFO TaskSetManager: Finished task 23.0 in stage
    0.0 (TID 23) in 11684 ms on 10.65.143.174 (30/32)

    15/12/29 17:00:39 INFO TaskSetManager: Finished task 7.0 in stage
    0.0 (TID 7) in 11717 ms on 10.65.143.174 (31/32)

    {nothing here for a while, ~6mins}

    

    

    Here is the executor status, from UI.

    

    31 | 31 | 0 | RUNNING | PROCESS_LOCAL | 2 / 10.65.143.174 | 2015/12/29 17:00:28 | 6.8 min | 0 ms | 0 ms | 60 ms | 0 ms | 0 ms | 0.0 B
    

    Here is the log from executor 2 on 10.65.143.174. I never see task 31
    reach the executor. Any ideas?

    

    .....

    15/12/29 17:00:38 INFO TorrentBroadcast: Started reading broadcast
    variable 0

    15/12/29 17:00:38 INFO MemoryStore: ensureFreeSpace(5979) called
    with curMem=0, maxMem=2223023063

    15/12/29 17:00:38 INFO MemoryStore: Block broadcast_0_piece0 stored
    as bytes in memory (estimated size 5.8 KB, free 2.1 GB)

    15/12/29 17:00:38 INFO TorrentBroadcast: Reading broadcast variable
    0 took 208 ms

    15/12/29 17:00:38 INFO MemoryStore: ensureFreeSpace(8544) called
    with curMem=5979, maxMem=2223023063

    15/12/29 17:00:38 INFO MemoryStore: Block broadcast_0 stored as
    values in memory (estimated size 8.3 KB, free 2.1 GB)

    15/12/29 17:00:39 INFO PythonRunner: Times: total = 913, boot = 747,
    init = 166, finish = 0

    15/12/29 17:00:39 INFO Executor: Finished task 15.0 in stage 0.0
    (TID 15). 967 bytes result sent to driver

    15/12/29 17:00:39 INFO PythonRunner: Times: total = 955, boot = 735,
    init = 220, finish = 0

    15/12/29 17:00:39 INFO Executor: Finished task 23.0 in stage 0.0
    (TID 23). 967 bytes result sent to driver

    15/12/29 17:00:39 INFO PythonRunner: Times: total = 970, boot = 812,
    init = 158, finish = 0

    15/12/29 17:00:39 INFO Executor: Finished task 7.0 in stage 0.0 (TID
    7). 967 bytes result sent to driver

    [root@ip-10-65-143-174 2]$
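To confirm which task index is stuck without eyeballing the log, something like the following might help. It is only a minimal sketch: `unfinished_tasks` and `FINISHED` are hypothetical names, not part of Spark, and it assumes the driver log uses the same "Finished task N.A in stage S.A (TID T) in X ms on HOST (k/total)" format shown above.

```python
import re

# Matches driver-side TaskSetManager lines such as:
#   "Finished task 7.0 in stage 0.0 (TID 7) in 11717 ms on 10.65.143.174 (31/32)"
# (format assumed from the log excerpt above; other Spark versions may differ)
FINISHED = re.compile(
    r"Finished task (\d+)\.\d+ in stage (\d+)\.\d+ \(TID \d+\) "
    r"in \d+ ms on \S+ \(\d+/(\d+)\)"
)


def unfinished_tasks(log_text, stage=0):
    """Return the task indexes of `stage` that have no 'Finished task' line."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    finished, total = set(), 0
    for task, stage_id, n_tasks in FINISHED.findall(flat):
        if int(stage_id) == stage:
            finished.add(int(task))
            total = max(total, int(n_tasks))
    # The total task count is taken from the "(k/total)" suffix of the log lines.
    return [i for i in range(total) if i not in finished]
```

It needs the full driver log as input, for example `unfinished_tasks(open(path).read())` with `path` pointing at wherever the driver output was saved. The truncated excerpt above would report every task whose "Finished" line was cut off.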

