[jira] [Commented] (SPARK-9539) Repeated sc.close() in PySpark causes JVM memory leak
[ https://issues.apache.org/jira/browse/SPARK-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651067#comment-14651067 ]

Andrey Zimovnov commented on SPARK-9539:

Hi, Owen! I'm not sure what the permanent generation of the Java heap is for, but it grows over time. I really do have a use case where I need to recreate the Spark context a lot. The only workaround for now, I guess, is to try increasing MaxPermSize.

Repeated sc.close() in PySpark causes JVM memory leak

Key: SPARK-9539
URL: https://issues.apache.org/jira/browse/SPARK-9539
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.4.1
Reporter: Andrey Zimovnov
Priority: Minor
Attachments: Screenshot at авг. 02 19-10-53.png

Example code in Python:

{code:python}
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

for i in range(20):
    print i
    conf = SparkConf().setAppName("test")
    sc = SparkContext(conf=conf)
    hivec = HiveContext(sc)
    hivec.sql("select id from details_info limit 1").show()
    sc.stop()
    del hivec
    del sc
{code}

Jstat output:

{noformat}
S0C      S1C      S0U      S1U      EC        EU        OC        OU    PC       PU       YGC YGCT  FGC FGCT  GCT
196608,0 196608,0 97566,2  0,0      1179648,0 542150,0  3145728,0 120,0 154112,0 153613,2 4   0,434 0   0,000 0,434
196608,0 196608,0 97566,2  0,0      1179648,0 679041,7  3145728,0 120,0 164352,0 164183,3 4   0,434 0   0,000 0,434
196608,0 196608,0 97566,2  0,0      1179648,0 907928,4  3145728,0 120,0 164352,0 164200,3 4   0,434 0   0,000 0,434
196608,0 196608,0 97566,2  0,0      1179648,0 912132,7  3145728,0 120,0 164352,0 164200,5 4   0,434 0   0,000 0,434
196608,0 196608,0 97566,2  0,0      1179648,0 913741,5  3145728,0 120,0 164352,0 164200,8 4   0,434 0   0,000 0,434
196608,0 196608,0 97566,2  0,0      1179648,0 929458,6  3145728,0 120,0 164352,0 164206,0 4   0,434 0   0,000 0,434
196608,0 196608,0 97566,2  0,0      1179648,0 1003138,1 3145728,0 120,0 168960,0 168646,0 4   0,434 0   0,000 0,434
131584,0 196608,0 0,0      109725,6 1179648,0 0,0       3145728,0 128,0 175104,0 174802,1 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 152654,9  3145728,0 128,0 175104,0 174803,3 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 158586,1  3145728,0 128,0 175104,0 174803,3 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 160659,8  3145728,0 128,0 175104,0 174805,7 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 181935,2  3145728,0 128,0 175104,0 174819,7 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 283389,1  3145728,0 128,0 185856,0 185371,0 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 342596,4  3145728,0 128,0 185856,0 185379,3 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 547634,7  3145728,0 128,0 185856,0 185385,8 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 555930,9  3145728,0 128,0 185856,0 185385,8 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 557888,6  3145728,0 128,0 185856,0 185386,0 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 573907,5  3145728,0 128,0 185856,0 185397,5 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 637955,0  3145728,0 128,0 189952,0 189533,1 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 895866,1  3145728,0 128,0 196096,0 195968,5 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 948046,5  3145728,0 128,0 196096,0 195969,4 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 952427,2  3145728,0 128,0 196096,0 195969,4 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 957977,5  3145728,0 128,0 196096,0 195973,4 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 977811,1  3145728,0 128,0 196096,0 195977,7 5   0,591 0   0,000 0,591
131584,0 196608,0 0,0      109725,6 1179648,0 1118722,0 3145728,0 128,0 206848,0 206539,0 5   0,591 0   0,000 0,591
131584,0 144384,0 118692,5 0,0      1284096,0 183470,8  3145728,0 136,0 206848,0 206543,4 6   0,773 0   0,000 0,773
131584,0 144384,0 118692,5 0,0      1284096,0 189718,5  3145728,0 136,0 206848,0 206543,4 6   0,773 0   0,000 0,773
131584,0 144384,0 118692,5 0,0      1284096,0 192165,0  3145728,0 136,0 206848,0 206543,4 6
{noformat}
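For reference, a minimal sketch of the MaxPermSize workaround mentioned in the comment above, assuming the script is started directly with the Python interpreter so the driver JVM is only launched when the first SparkContext is created; the 512m value is purely illustrative:

{code:python}
import os

# Assumption: the driver JVM has not started yet, so PYSPARK_SUBMIT_ARGS can
# still carry JVM options to it. When launching through spark-submit instead,
# pass --driver-java-options "-XX:MaxPermSize=512m" on the command line.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    '--driver-java-options "-XX:MaxPermSize=512m" pyspark-shell'
)

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("test")
sc = SparkContext(conf=conf)
{code}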
[jira] [Commented] (SPARK-9539) Repeated sc.close() in PySpark causes JVM memory leak
[ https://issues.apache.org/jira/browse/SPARK-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651073#comment-14651073 ]

Sean Owen commented on SPARK-9539:

This just shows Spark is using memory. It's normal to use some of the permanent generation. Your jstat dump shows normal growth and GC of the heap. It does not show any out-of-memory condition. It may simply be that you need to increase the memory you allocate, especially the permanent generation (you should probably read up on this). Unless you can point to an actual memory leak from a heap dump, I'd like to close this.
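To gather the kind of evidence requested above, one possibility (a sketch, not part of the original report) is to sample the driver JVM from the reproduction script itself; it assumes the JDK tools jstat and jmap are on the PATH and that pid is the driver JVM's process id:

{code:python}
import subprocess

def snapshot_jvm(pid, dump_path):
    """Print one jstat -gc sample and write a heap dump for later analysis."""
    # Single jstat sample: shows permanent-generation capacity/usage (PC/PU).
    subprocess.check_call(["jstat", "-gc", str(pid)])
    # Heap dump of live objects, for inspection with e.g. Eclipse MAT or jhat.
    subprocess.check_call(
        ["jmap", "-dump:live,format=b,file=%s" % dump_path, str(pid)]
    )
{code}

Comparing dumps taken after successive sc.stop() calls would show whether objects actually accumulate across context restarts, or whether the growth is just normal permanent-generation usage as described above.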
[jira] [Commented] (SPARK-9539) Repeated sc.close() in PySpark causes JVM memory leak
[ https://issues.apache.org/jira/browse/SPARK-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651064#comment-14651064 ]

Sean Owen commented on SPARK-9539:

Why do you think this is a memory leak? That exception does not even indicate an out-of-memory condition.
[jira] [Commented] (SPARK-9539) Repeated sc.close() in PySpark causes JVM memory leak
[ https://issues.apache.org/jira/browse/SPARK-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651078#comment-14651078 ]

Andrey Zimovnov commented on SPARK-9539:

OK, I'll work on this later and reopen if necessary. Thanks!
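Where the queries can share a single long-lived context (an assumption that may not fit the reporter's use case), one possible restructuring of the reproduction loop sets up the driver JVM and Hive classes only once:

{code:python}
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

# Create the context once and reuse it across iterations instead of
# stopping and recreating it, so Spark/Hive classes are loaded only once.
conf = SparkConf().setAppName("test")
sc = SparkContext(conf=conf)
hivec = HiveContext(sc)

for i in range(20):
    print i
    hivec.sql("select id from details_info limit 1").show()

sc.stop()
{code}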