[jira] [Commented] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
[ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575945#comment-15575945 ]

Justin Miller commented on SPARK-17936:
---

Hey Sean, I did a bit more digging this morning looking at SpecificUnsafeProjection and saw this commit: https://github.com/apache/spark/commit/b1b47274bfeba17a9e4e9acebd7385289f31f6c8

I thought I'd try running w/ 2.1.0-SNAPSHOT and see how things went, and it appears to work great now!

[Stage 1:> (0 + 8) / 8]11:28:33.237 INFO c.p.o.ObservationPersister - (ObservationPersister) - Thrift Parse Success: 0 / Thrift Parse Errors: 0
[Stage 3:> (0 + 8) / 8]11:29:03.236 INFO c.p.o.ObservationPersister - (ObservationPersister) - Thrift Parse Success: 89 / Thrift Parse Errors: 0
[Stage 5:> (4 + 4) / 8]11:29:33.237 INFO c.p.o.ObservationPersister - (ObservationPersister) - Thrift Parse Success: 205 / Thrift Parse Errors: 0

Since we're still testing this out, that snapshot works great for now. Do you know when 2.1.0 might be available generally?

Best,
Justin

> "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
> ------------------------------------------------------------------------------------------------------
>
>         Key: SPARK-17936
>         URL: https://issues.apache.org/jira/browse/SPARK-17936
>     Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
> Affects Versions: 2.0.1
>    Reporter: Justin Miller
>
> Greetings. I'm currently in the process of migrating a project I'm working on from Spark 1.6.2 to 2.0.1. The project uses Spark Streaming to convert Thrift structs coming from Kafka into Parquet files stored in S3. This conversion process works fine in 1.6.2, but I think there may be a bug in 2.0.1. I'll paste the stack trace below.
>
> org.codehaus.janino.JaninoRuntimeException: Code of method "(Lorg/apache/spark/sql/catalyst/expressions/GeneratedClass;[Ljava/lang/Object;)V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB
>   at org.codehaus.janino.CodeContext.makeSpace(CodeContext.java:941)
>   at org.codehaus.janino.CodeContext.write(CodeContext.java:854)
>   at org.codehaus.janino.UnitCompiler.writeShort(UnitCompiler.java:10242)
>   at org.codehaus.janino.UnitCompiler.writeLdc(UnitCompiler.java:9058)
>
> Also, later on:
>
> 07:35:30.191 ERROR o.a.s.u.SparkUncaughtExceptionHandler - Uncaught exception in thread Thread[Executor task launch worker-6,5,run-main-group-0]
> java.lang.OutOfMemoryError: Java heap space
>
> I've seen similar issues posted, but those were always on the query side. I have a hunch that this is happening at write time, as the error occurs after batchDuration. Here's the write snippet.
>
> stream.
>   flatMap {
>     case Success(row) =>
>       thriftParseSuccess += 1
>       Some(row)
>     case Failure(ex) =>
>       thriftParseErrors += 1
>       logger.error("Error during deserialization: ", ex)
>       None
>   }.foreachRDD { rdd =>
>     val sqlContext = SQLContext.getOrCreate(rdd.context)
>     transformer(sqlContext.createDataFrame(rdd, converter.schema))
>       .coalesce(coalesceSize)
>       .write
>       .mode(Append)
>       .partitionBy(partitioning: _*)
>       .parquet(parquetPath)
>   }
>
> Please let me know if you can be of assistance and if there's anything I can do to help.
>
> Best,
> Justin
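Below is a minimal sketch (not from the ticket) of the kind of job that can trip the JVM's 64 KB-per-method limit in the generated SpecificUnsafeProjection. The column count, schema, and paths are made up purely for illustration, and the exact width needed to trigger the failure varies by schema and Spark version; the point is only that a sufficiently wide flat schema stands in for the large Thrift struct described above.

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object WideSchemaCodegenSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("wide-schema-codegen")
      .getOrCreate()

    // A flat schema with a few thousand string columns, standing in for a large Thrift struct.
    val width = 3000
    val schema = StructType((0 until width).map(i => StructField(s"c$i", StringType, nullable = true)))
    val rows = spark.sparkContext.parallelize(Seq(Row.fromSeq(Seq.fill(width)("x"))))

    // On affected 2.0.x builds, planning/writing this DataFrame can fail with
    // "Code of method ... grows beyond 64 KB" from the generated SpecificUnsafeProjection;
    // on a build containing the commit linked above it should compile and write normally.
    spark.createDataFrame(rows, schema)
      .write
      .mode("overwrite")
      .parquet("/tmp/wide-schema-test")

    spark.stop()
  }
}

Running a schema like this against 2.0.1 and then against a 2.1.0-SNAPSHOT build containing the commit above is one way to check whether the split-codegen change resolves a given case.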
[jira] [Commented] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
[ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575494#comment-15575494 ]

Justin Miller commented on SPARK-17936:
---

It's just strange that the issue seems to affect previous versions (if they're the same issue) but didn't impact me when I was using 1.6.2 and the 0.8 Kafka consumer. Is it possible that Scala 2.10 vs Scala 2.11 makes a difference? There are a lot of variables at play, unfortunately.
[jira] [Commented] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
[ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575476#comment-15575476 ]

Sean Owen commented on SPARK-17936:
---

I doubt it is a different cause given it is the same type of error in the same path. That is, do you expect the resolution differs? It can be reopened if so, but otherwise I think this is more likely to fragment discussion about one problem.
[jira] [Commented] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
[ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575471#comment-15575471 ]

Justin Miller commented on SPARK-17936:
---

I'd also note this wasn't an issue in Spark 1.6.2. The process would run fine for hours and never crashed on this error.
[jira] [Commented] (SPARK-17936) "CodeGenerator - failed to compile: org.codehaus.janino.JaninoRuntimeException: Code of" method Error
[ https://issues.apache.org/jira/browse/SPARK-17936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575452#comment-15575452 ]

Justin Miller commented on SPARK-17936:
---

I did look through them and I don't think they're related. Note that the error is different, and this is trying to write data, not read large amounts of data.
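To make that write-time path concrete, here is a compilable sketch of the foreachRDD write pattern quoted in the issue description. All names below (the DStream element type, schema, transformer, partition columns, and output path) are hypothetical stand-ins rather than code from the ticket; only the flatMap-over-Try / foreachRDD / createDataFrame / write.parquet shape mirrors the report, and it is while that per-batch write is planned and executed that Catalyst generates the projection code that exceeded 64 KB here.

import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{DataFrame, Row, SQLContext, SaveMode}
import org.apache.spark.sql.types.StructType
import org.apache.spark.streaming.dstream.DStream
import org.slf4j.LoggerFactory

object ParquetPersisterSketch {
  private val logger = LoggerFactory.getLogger(getClass)

  // stream: rows already parsed from Thrift upstream and wrapped in Try
  def persist(stream: DStream[Try[Row]],
              schema: StructType,                  // stand-in for converter.schema
              transformer: DataFrame => DataFrame, // stand-in for the ticket's transformer
              partitioning: Seq[String],
              coalesceSize: Int,
              parquetPath: String): Unit = {
    stream.flatMap {
      case Success(row) =>
        Some(row)
      case Failure(ex) =>
        logger.error("Error during deserialization: ", ex)
        None
    }.foreachRDD { rdd =>
      // One DataFrame per micro-batch; the unsafe projection for the wide
      // Thrift-derived schema is generated when this write runs.
      val sqlContext = SQLContext.getOrCreate(rdd.context)
      transformer(sqlContext.createDataFrame(rdd, schema))
        .coalesce(coalesceSize)
        .write
        .mode(SaveMode.Append)
        .partitionBy(partitioning: _*)
        .parquet(parquetPath)
    }
  }
}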