[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread roberto hashioka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347355#comment-15347355
 ] 

roberto hashioka commented on SPARK-13288:
--

Do you really need to create multiple streams with createDirectStream? I think 
you can create just one and Spark handles the rest.

> [1.6.0] Memory leak in Spark streaming
> --
>
> Key: SPARK-13288
> URL: https://issues.apache.org/jira/browse/SPARK-13288
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.6.0
> Environment: Bare metal cluster, RHEL 6.6
> Reporter: JESSE CHEN
> Labels: streaming
>
> Streaming in 1.6 seems to have a memory leak.
> Running the same streaming app on Spark 1.5.1 and 1.6.0, all else being equal, 
> 1.6.0 showed gradually increasing processing time. 
> The app is simple: 1 Kafka receiver for a tweet stream and 20 executors 
> processing the tweets in 5-second batches. 
> Spark 1.5.1 handled this smoothly and showed no increase in processing time 
> during the 40-minute test, but 1.6.0 showed increasing processing time starting 
> about 8 minutes into the test. Please see the chart here:
> https://ibm.box.com/s/7q4ulik70iwtvyfhoj1dcl4nc469b116
> I captured heap dumps for both versions and compared them. Byte arrays 
> (class [B) use about 50X more space in 1.6.0.
> Here are the top classes in the heap histogram and their referrers. 
> Heap Histogram
> All Classes (excluding platform)
>
> 1.6.0 Streaming
> Class                           Instance Count  Total Size
> class [B                        8453            3,227,649,599
> class [C                        44682           4,255,502
> class java.lang.reflect.Method  9059            1,177,670
>
> 1.5.1 Streaming
> Class                           Instance Count  Total Size
> class [B                        5095            62,938,466
> class [C                        130482          12,844,182
> class java.lang.String          130171          1,562,052
>
> References by Type
> 1.6.0 Streaming: class [B [0x640039e38]
> 1.5.1 Streaming: class [B [0x6c020bb08]
>
> Referrers by Type
>
> 1.6.0 Streaming
> Class                               Count
> java.nio.HeapByteBuffer             3239
> sun.security.util.DerInputBuffer    1233
> sun.security.util.ObjectIdentifier  620
> [Ljava.lang.Object;                 408
>
> 1.5.1 Streaming
> Class                               Count
> sun.security.util.DerInputBuffer    1233
> sun.security.util.ObjectIdentifier  620
> [[B                                 397
> java.lang.reflect.Method            326
> 
> The total size of class [B is about 3 GB in 1.6.0 and only about 60 MB in 1.5.1.
> The java.nio.HeapByteBuffer referrer class did not appear among the top 
> referrers in 1.5.1. 
> I have also placed jstack output for 1.5.1 and 1.6.0 online; you can get it 
> here:
> https://ibm.box.com/sparkstreaming-jstack160
> https://ibm.box.com/sparkstreaming-jstack151
> Jesse 
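
As a quick sanity check on the "about 50X" figure quoted above, the ratio can be recomputed from the two class [B total sizes in the heap histogram. This is only a sketch: the two byte counts are copied from the histogram, while the function and constant names are made up for illustration, and which Spark version each column belongs to is taken from the histogram's column header.

```python
# Total size (in bytes) of class [B, copied from the heap histogram.
# Assumption: the first histogram column is the 1.6.0 run, as its header says.
BYTE_ARRAY_BYTES_1_6_0 = 3_227_649_599
BYTE_ARRAY_BYTES_1_5_1 = 62_938_466


def size_ratio(larger: int, smaller: int) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    if smaller <= 0:
        raise ValueError("smaller must be positive")
    return larger / smaller


def to_megabytes(num_bytes: int) -> float:
    """Convert a byte count to decimal megabytes (1 MB = 10**6 bytes)."""
    return num_bytes / 10**6


ratio = size_ratio(BYTE_ARRAY_BYTES_1_6_0, BYTE_ARRAY_BYTES_1_5_1)
print(f"1.6.0: {to_megabytes(BYTE_ARRAY_BYTES_1_6_0):.0f} MB")
print(f"1.5.1: {to_megabytes(BYTE_ARRAY_BYTES_1_5_1):.0f} MB")
print(f"ratio: {ratio:.1f}x")
```

The ratio comes out at a little over 51, which matches the rough 3 GB versus 60 MB comparison.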



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread roberto hashioka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347265#comment-15347265
 ] 

roberto hashioka commented on SPARK-13288:
--

I'm using createDirectStream. 



[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-23 Thread roberto hashioka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346803#comment-15346803
 ] 

roberto hashioka commented on SPARK-13288:
--

Yep, I tried it and I didn't see any memory usage increase. 
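
One way to make "no memory usage increase" checkable is to sample heap usage at intervals during the run and fit a straight line to the samples: a slope near zero suggests flat usage, while a clearly positive slope suggests growth. The sketch below is not from this thread; the sample values are hypothetical and the function name is made up for illustration.

```python
from typing import Sequence


def least_squares_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical samples for illustration only: minutes into the run and
# heap used (MB) at each sampling point.
minutes = [0, 10, 20, 30, 40]
flat_heap_mb = [510, 495, 505, 500, 490]
growing_heap_mb = [500, 620, 750, 870, 1000]

flat_slope = least_squares_slope(minutes, flat_heap_mb)        # MB per minute
growing_slope = least_squares_slope(minutes, growing_heap_mb)  # MB per minute
print(f"flat run slope:    {flat_slope:+.2f} MB/min")
print(f"growing run slope: {growing_slope:+.2f} MB/min")
```

The same function could be applied to per-batch processing times instead of heap samples when comparing a 1.5.1 run against a 1.6.0 run.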



[jira] [Commented] (SPARK-13288) [1.6.0] Memory leak in Spark streaming

2016-06-16 Thread roberto hashioka (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335184#comment-15335184
 ] 

roberto hashioka commented on SPARK-13288:
--

I'm having the same issue. I'll try with Spark 1.5.1 to see if the issue goes 
away. 
