Re: How to unit test HiveContext without OutOfMemoryError (using sbt)

2015-08-26 Thread Mike Trienis
Thanks for your response, Yana.

I can increase the MaxPermSize parameter and it will allow me to run the
unit test a few more times before I run out of memory.

However, the primary issue is that running the same unit test multiple times
in the same JVM increases memory usage with each run. I believe it has
something to do with HiveContext not releasing memory after it finishes (or
with me not shutting it down properly).

It could very well be related to sbt; however, that's not clear to me.
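
A minimal sketch of the teardown pattern in question, assuming ScalaTest and
a Spark 1.x HiveContext (the suite and names are illustrative, not the
actual test):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.scalatest.{BeforeAndAfterAll, FunSuite}

class HiveQuerySuite extends FunSuite with BeforeAndAfterAll {

  private var sc: SparkContext = _
  private var hive: HiveContext = _

  override def beforeAll(): Unit = {
    sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("HiveQuerySuite"))
    hive = new HiveContext(sc)
  }

  override def afterAll(): Unit = {
    // Stopping the context releases its threads and caches, but it does
    // not reclaim PermGen already filled by classes loaded for Hive, so
    // repeated runs inside one JVM can still exhaust PermGen.
    if (sc != null) sc.stop()
    // Avoid port clashes between runs, as Spark's own test utils do.
    System.clearProperty("spark.driver.port")
  }

  test("simple query") {
    // The assertion is trivial; the point is to exercise the query path.
    assert(hive.sql("SHOW TABLES").count() >= 0)
  }
}

Stopping the SparkContext in afterAll rules out the "not shutting it down
properly" case; if memory still grows across runs, the leak is in class
loading rather than in the context itself.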


Re: How to unit test HiveContext without OutOfMemoryError (using sbt)

2015-08-26 Thread Michael Armbrust
I'd suggest setting sbt to fork when running tests.
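
In a build.sbt that is essentially a one-line setting (sbt 0.13 syntax; the
heap flags below are assumptions, not recommendations):

// Run tests in a forked JVM so PermGen filled by each run is discarded
// with the forked process instead of accumulating inside sbt.
fork in Test := true

// Forked JVMs do not inherit SBT_OPTS, so pass the flags explicitly.
javaOptions in Test ++= Seq("-Xmx2g", "-XX:MaxPermSize=256m")

// Keep suites sequential so only one SparkContext runs at a time.
parallelExecution in Test := false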


How to unit test HiveContext without OutOfMemoryError (using sbt)

2015-08-25 Thread Mike Trienis
Hello,

I am using sbt and created a unit test where I create a `HiveContext`,
execute a query, and return. Each time I run the unit test, the JVM's memory
usage increases until I get the error:

Internal error when running tests: java.lang.OutOfMemoryError: PermGen space
Exception in thread "Thread-2" java.io.EOFException

As a work-around, I can fork a new JVM each time I run the unit test;
however, that seems like a bad solution since it takes a while to run the
unit test.

By the way, I tried importing TestHiveContext:

   - import org.apache.spark.sql.hive.test.TestHiveContext

However, it suffers from the same memory issue. Has anyone else run into
this problem? Note that I am running these unit tests on my Mac.
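
For reference, a minimal sketch of standing up a TestHiveContext in a Spark
1.x test; the constructor and the reset() helper are assumptions based on
Spark's own test sources of that era:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.test.TestHiveContext

val sc = new SparkContext(
  new SparkConf().setMaster("local[2]").setAppName("TestHiveExample"))
val hive = new TestHiveContext(sc)  // test metastore and local warehouse

hive.sql("SHOW TABLES").collect()

hive.reset()  // drops test tables; cleans same-JVM state, not PermGen
sc.stop()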

Cheers, Mike.


Re: How to unit test HiveContext without OutOfMemoryError (using sbt)

2015-08-25 Thread Yana Kadiyska
The PermGen space error is controlled by the MaxPermSize parameter. I run
with the following in my pom, which I think was copied pretty literally from
Spark's own tests. I don't know what the sbt equivalent is, but you should
be able to pass it, possibly via SBT_OPTS?


<plugin>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest-maven-plugin</artifactId>
  <version>1.0</version>
  <configuration>
    <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
    <parallel>false</parallel>
    <junitxml>.</junitxml>
    <filereports>SparkTestSuite.txt</filereports>
    <argLine>-Xmx3g -XX:MaxPermSize=256m -XX:ReservedCodeCacheSize=512m</argLine>
    <stderr/>
    <systemProperties>
      <java.awt.headless>true</java.awt.headless>
      <spark.testing>1</spark.testing>
      <spark.ui.enabled>false</spark.ui.enabled>
      <spark.driver.allowMultipleContexts>true</spark.driver.allowMultipleContexts>
    </systemProperties>
  </configuration>
  <executions>
    <execution>
      <id>test</id>
      <goals>
        <goal>test</goal>
      </goals>
    </execution>
  </executions>
</plugin>
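
On the sbt side: SBT_OPTS only reaches tests that run inside the sbt JVM
itself; once tests are forked, the same flags go through javaOptions
instead. A rough, unverified translation of the settings above into
build.sbt (sbt 0.13 syntax):

fork in Test := true

javaOptions in Test ++= Seq(
  "-Xmx3g",
  "-XX:MaxPermSize=256m",
  "-XX:ReservedCodeCacheSize=512m",
  "-Djava.awt.headless=true",
  "-Dspark.testing=1",
  "-Dspark.ui.enabled=false",
  "-Dspark.driver.allowMultipleContexts=true")

parallelExecution in Test := false  // mirrors <parallel>false</parallel>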

