[jira] [Commented] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2015-01-08 Thread Paul Wolfe (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269243#comment-14269243 ]

Paul Wolfe commented on SPARK-2316:
---

Any workaround ideas for users who can't yet upgrade (stuck on version 1.0.0)? 

 StorageStatusListener should avoid O(blocks) operations
 ---

              Key: SPARK-2316
              URL: https://issues.apache.org/jira/browse/SPARK-2316
          Project: Spark
       Issue Type: Bug
       Components: Spark Core, Web UI
 Affects Versions: 1.0.0
         Reporter: Patrick Wendell
         Assignee: Andrew Or
         Priority: Critical
          Fix For: 1.1.0


 In the case where jobs frequently cause blocks to be dropped, the storage 
 status listener can become a bottleneck. It is slow for a few reasons: one is 
 that we use Scala collection operations; the other is that we perform 
 operations that are O(number of blocks). I think using a few indices here 
 could make this much faster.
 {code}
  at java.lang.Integer.valueOf(Integer.java:642)
  at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:70)
  at org.apache.spark.storage.StorageUtils$$anonfun$9.apply(StorageUtils.scala:82)
  at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:328)
  at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:327)
  at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
  at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
  at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
  at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
  at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:327)
  at scala.collection.AbstractTraversable.groupBy(Traversable.scala:105)
  at org.apache.spark.storage.StorageUtils$.rddInfoFromStorageStatus(StorageUtils.scala:82)
  at org.apache.spark.ui.storage.StorageListener.updateRDDInfo(StorageTab.scala:56)
  at org.apache.spark.ui.storage.StorageListener.onTaskEnd(StorageTab.scala:67)
  - locked 0xa27ebe30 (a org.apache.spark.ui.storage.StorageListener)
 {code}
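
 To illustrate the "few indices" idea from the description, here is a minimal 
 sketch, not the actual fix. It assumes a simplified block store; BlockKey, 
 RddTotals and IndexedBlockStore are hypothetical names and are not Spark 
 classes. Per-RDD totals are kept up to date on every put and drop, so reading 
 them never needs a groupBy over all blocks.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's block bookkeeping.
final case class BlockKey(rddId: Int, partition: Int)

// Running totals for one RDD, updated in place as blocks come and go.
final class RddTotals {
  var numBlocks: Int = 0
  var memSize: Long = 0L
}

final class IndexedBlockStore {
  // block -> size in bytes
  private val blocks = mutable.HashMap.empty[BlockKey, Long]
  // rddId -> running totals (the "index")
  private val totals = mutable.HashMap.empty[Int, RddTotals]

  // Expected O(1): touches one block entry and one per-RDD entry.
  def put(key: BlockKey, size: Long): Unit = {
    val t = totals.getOrElseUpdate(key.rddId, new RddTotals)
    blocks.put(key, size) match {
      case Some(old) =>
        t.memSize += size - old
      case None =>
        t.numBlocks += 1
        t.memSize += size
    }
  }

  // Expected O(1): a dropped block adjusts only its own RDD's totals.
  def drop(key: BlockKey): Unit = {
    blocks.remove(key).foreach { old =>
      totals.get(key.rddId).foreach { t =>
        t.numBlocks -= 1
        t.memSize -= old
        if (t.numBlocks == 0) totals.remove(key.rddId)
      }
    }
  }

  // Reads the index instead of regrouping every block.
  def memSizeOf(rddId: Int): Long =
    totals.get(rddId).map(_.memSize).getOrElse(0L)

  def numBlocksOf(rddId: Int): Int =
    totals.get(rddId).map(_.numBlocks).getOrElse(0)

  // The O(number of blocks) approach, kept only for comparison.
  def memSizeByGroupBy(rddId: Int): Long =
    blocks.groupBy(_._1.rddId).get(rddId).map(_.values.sum).getOrElse(0L)
}
```

 With this layout a task-end event that drops a handful of blocks costs work 
 proportional to the blocks it touched, not to the total number of blocks 
 tracked.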



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2014-07-30 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080449#comment-14080449 ]

Apache Spark commented on SPARK-2316:
-

User 'andrewor14' has created a pull request for this issue:
https://github.com/apache/spark/pull/1679



[jira] [Commented] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2014-07-25 Thread Shivaram Venkataraman (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074692#comment-14074692 ]

Shivaram Venkataraman commented on SPARK-2316:
--

On a related note, can we have flags to turn off some of the UI listeners? If 
the StorageTab is going to be too expensive to update, it would be good to have 
a way to turn it off and have only the JobProgress tab show up in the UI.
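
A rough sketch of what such a flag could look like, assuming a toy listener 
bus. TaskEnd, UiListener, CountingListener and UiListenerBus are hypothetical 
names for illustration only; they are not Spark's listener API, and the 
`enabled` parameter is not an existing Spark setting.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical event and listener types; not Spark's SparkListener API.
final case class TaskEnd(taskId: Long)

trait UiListener {
  def onTaskEnd(event: TaskEnd): Unit
}

// Stand-in for a UI tab's listener; it only counts the events it receives.
final class CountingListener(val name: String) extends UiListener {
  var seen: Int = 0
  def onTaskEnd(event: TaskEnd): Unit = seen += 1
}

final class UiListenerBus {
  private val listeners = ArrayBuffer.empty[UiListener]

  // A disabled listener is never registered, so it costs nothing per event.
  def register(listener: UiListener, enabled: Boolean): Unit = {
    if (enabled) listeners.append(listener)
  }

  // Delivers the event to every registered listener.
  def post(event: TaskEnd): Unit = listeners.foreach(_.onTaskEnd(event))
}
```

Registering the storage listener with `enabled = false` would leave the job 
progress listener as the only one doing work on each task-end event.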



[jira] [Commented] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2014-07-17 Thread Shivaram Venkataraman (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14065256#comment-14065256 ]

Shivaram Venkataraman commented on SPARK-2316:
--

I'd just like to add that in cases where we have many thousands of blocks, the 
code path in this stack trace keeps one core constantly busy on the Master and 
is probably one of the reasons why the WebUI stops functioning after a certain 
point.
