shahidki31 opened a new pull request #24398: [SPARK-27468][Core][WEBUI] BlockUpdate replication event shouldn't overwrite storage level description in the UI
URL: https://github.com/apache/spark/pull/24398
 
 
   ## What changes were proposed in this pull request?
   Test steps to reproduce this:
   
   1) bin/spark-shell --master local-cluster[2,1,1024]
   ```
   scala> import org.apache.spark.storage.StorageLevel
   scala> val rdd = sc.parallelize(1 to 10, 1).persist(StorageLevel.MEMORY_ONLY_2)
   scala> rdd.count
   ```
   The generated events look like this:
   ```
   event: SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(1, 10.8.132.160, 65473, None),rdd_0_0,StorageLevel(memory, deserialized, 2 replicas),56,0))
   event: SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(0, 10.8.132.160, 65474, None),rdd_0_0,StorageLevel(memory, deserialized, 1 replicas),56,0))
   ```
   But in the UI, the Storage tab shows the description as "Memory Deserialized 1x Replicated", even though the replication factor was set to 2.
   
   The root cause is that the block update event for a replicated copy carries a replication factor of 1. The AppStatusListener class overwrites the stored storage level with whichever event arrives last, so if the replication event arrives later, the replication factor is updated to 1.
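
   The overwrite can be illustrated with a minimal, self-contained sketch. The `Level`, `BlockUpdate`, and `LastWriteWins` names are hypothetical stand-ins invented for this example, not Spark classes, and the logic only imitates the "latest event wins" behaviour described above rather than reproducing the listener's code.

   ```scala
   // Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
   final case class Level(description: String, replication: Int)
   final case class BlockUpdate(executorId: String, blockId: String, level: Level)

   object LastWriteWins {
     // Imitates a listener that keeps whatever level the most recent event carries.
     def displayedLevel(events: Seq[BlockUpdate]): Option[Level] =
       events.lastOption.map(_.level)

     def main(args: Array[String]): Unit = {
       // Same order as the two events shown above: 2 replicas first, then 1.
       val events = Seq(
         BlockUpdate("1", "rdd_0_0", Level("memory, deserialized", 2)),
         BlockUpdate("0", "rdd_0_0", Level("memory, deserialized", 1))
       )
       println(displayedLevel(events).map(_.replication)) // prints Some(1)
     }
   }
   ```

   With the events in the opposite order the same function would report 2, which is why the displayed description depends on event arrival order.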
   
   This PR fixes the issue in the AppStatusListener class, since we need to detect whether an event is a replication event or not. Otherwise, we need to update the RDD store.
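
   One possible way to make a listener insensitive to replication events is sketched below. This is an assumption made for illustration, not the code in this patch: `ObservedLevel`, `ObservedUpdate`, and `ReplicaAwareTracker` are hypothetical names, and the rule used here (an event with a lower replication factor than the one already recorded is treated as a replica copy and ignored) is only one heuristic for detecting such events.

   ```scala
   // Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
   final case class ObservedLevel(description: String, replication: Int)
   final case class ObservedUpdate(executorId: String, blockId: String, level: ObservedLevel)

   object ReplicaAwareTracker {
     // Assumed rule: an event whose replication factor is lower than the one
     // already recorded is treated as a replica copy and does not overwrite it.
     def displayedLevel(events: Seq[ObservedUpdate]): Option[ObservedLevel] =
       events.foldLeft(Option.empty[ObservedLevel]) { (current, event) =>
         current match {
           case Some(known) if event.level.replication < known.replication => current
           case _ => Some(event.level)
         }
       }

     def main(args: Array[String]): Unit = {
       val events = Seq(
         ObservedUpdate("1", "rdd_0_0", ObservedLevel("memory, deserialized", 2)),
         ObservedUpdate("0", "rdd_0_0", ObservedLevel("memory, deserialized", 1))
       )
       println(displayedLevel(events).map(_.replication)) // prints Some(2)
     }
   }
   ```

   A known limitation of this heuristic is that it would also ignore a legitimate change to a lower replication factor, for example after the RDD is re-persisted with a different storage level, so a real fix needs a more reliable signal than the replication count alone.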
   
   
   
   ## How was this patch tested?
   
   Added a unit test and tested manually.
   
   Before patch:
   
   ![Screenshot from 2019-04-18 14-50-06](https://user-images.githubusercontent.com/23054875/56342031-6d44a400-61e9-11e9-916d-b4040b0ffd7c.png)
   
   After patch:
   ![Screenshot from 2019-04-18 14-51-04](https://user-images.githubusercontent.com/23054875/56342037-7170c180-61e9-11e9-9ef0-71b4ec0da9ec.png)
   
   
