[
https://issues.apache.org/jira/browse/SPARK-8503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027697#comment-15027697
]
AJ commented on SPARK-8503:
---------------------------
I have experienced a similar problem with spark-1.5.1-bin-hadoop2.6 that I
don't believe is fixed yet. The problem appears to be that transient fields
are counted in the estimated size, even though they will never be serialized.
In SizeEstimator.scala:308, the code checks for and skips static fields, but
it never checks for transient fields. I think the fix might be as simple as:
for (field <- cls.getDeclaredFields) {
  // CHANGE THIS LINE:
  //   if (!Modifier.isStatic(field.getModifiers)) {
  // TO:
  if (!Modifier.isStatic(field.getModifiers) &&
      !Modifier.isTransient(field.getModifiers)) {
    val fieldClass = field.getType
    if (fieldClass.isPrimitive) {
      sizeCount(primitiveSize(fieldClass)) += 1
    } else {
      field.setAccessible(true) // Enable future get()'s on this field
      sizeCount(pointerSize) += 1
      pointerFields = field :: pointerFields
    }
  }
}
Unfortunately, I don't have time right now to compile a full spark build. Were
there any test cases that were added for this that would make it easy to verify?
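To make the proposed filter concrete, here is a minimal, self-contained Java sketch (the Modifier API is the same one SizeEstimator uses from Scala). The Node class and the countableFields helper are hypothetical illustrations, not Spark code; they only show that combining Modifier.isStatic and Modifier.isTransient skips exactly the fields serialization would skip:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class TransientFieldDemo {
    // Hypothetical recursive structure: the back-reference is transient,
    // so serialization never writes it, and size estimation shouldn't count it.
    static class Node {
        int value;             // primitive: should be counted
        Node child;            // regular reference: should be counted
        transient Node parent; // transient: should be skipped
        static int instances;  // static: already skipped today
    }

    // Mirrors the proposed condition: keep a field only if it is
    // neither static nor transient.
    static List<String> countableFields(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field field : cls.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (!Modifier.isStatic(mods) && !Modifier.isTransient(mods)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // With the transient check, only "value" and "child" survive.
        System.out.println(countableFields(Node.class));
    }
}
```

With only the static check, the transient parent reference would also be walked, which is how the recursive cycle inflates (and eventually overflows) the estimate.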
> SizeEstimator returns negative value for recursive data structures
> ------------------------------------------------------------------
>
> Key: SPARK-8503
> URL: https://issues.apache.org/jira/browse/SPARK-8503
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.3.1
> Reporter: Ilya Rakitsin
>
> When estimating the size of recursive data structures like graphs, with
> transient fields referencing one another, SizeEstimator may return a negative
> value if the structure is big enough.
> This then affects the logic of other components, e.g.
> SizeTracker#takeSample() and may lead to incorrect behavior and exceptions
> like:
> java.lang.IllegalArgumentException: requirement failed: sizeInBytes was negative: -9223372036854691384
> at scala.Predef$.require(Predef.scala:233)
> at org.apache.spark.storage.BlockInfo.markReady(BlockInfo.scala:55)
> at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:810)
> at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:637)
> at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:991)
> at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:98)
> at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:84)
> at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
> at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
> at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
> at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1051)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]