TarunMootala commented on issue #8199:
URL: https://github.com/apache/hudi/issues/8199#issuecomment-2379691850
I'm hitting the same OutOfMemoryError with the Cleaner. Is there a fix or
workaround?
Hudi Version: 0.12.1
Spark: 3.3.0
Glue: 4.0
Spark Streaming batch interval: 120 seconds
#### Write Configs:
```
write_options = {
    "hoodie.table.name": args["table_name"],
    "hoodie.datasource.write.keygenerator.type": "COMPLEX",
    "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
    "hoodie.datasource.write.partitionpath.field": "entity_name",
    "hoodie.datasource.write.recordkey.field": "partition_key,sequence_number",
    "hoodie.datasource.write.precombine.field": "approximate_arrival_timestamp",
    "hoodie.datasource.write.operation": "insert",
    "hoodie.insert.shuffle.parallelism": 10,
    "hoodie.bulkinsert.shuffle.parallelism": 10,
    "hoodie.upsert.shuffle.parallelism": 10,
    "hoodie.delete.shuffle.parallelism": 10,
    "hoodie.metadata.enable": "false",
    "hoodie.datasource.hive_sync.use_jdbc": "false",
    "hoodie.datasource.hive_sync.enable": "false",
    "hoodie.keep.min.commits": 450,
    "hoodie.keep.max.commits": 465,
    "hoodie.cleaner.commits.retained": 449,
}
```
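The three retention values above follow the ordering Hudi validates at write time (`hoodie.cleaner.commits.retained` < `hoodie.keep.min.commits` < `hoodie.keep.max.commits`), so the settings themselves are consistent. A minimal sketch of that check, assuming a hypothetical helper `retention_ordering_ok` that is not part of Hudi:

```python
def retention_ordering_ok(options: dict) -> bool:
    """Hypothetical helper (not a Hudi API): check that
    cleaner retention < archival minimum < archival maximum."""
    retained = int(options["hoodie.cleaner.commits.retained"])
    keep_min = int(options["hoodie.keep.min.commits"])
    keep_max = int(options["hoodie.keep.max.commits"])
    return retained < keep_min < keep_max


# Values copied from the write configs in this comment.
retention = {
    "hoodie.keep.min.commits": 450,
    "hoodie.keep.max.commits": 465,
    "hoodie.cleaner.commits.retained": 449,
}

print(retention_ordering_ok(retention))
```

With 449 commits retained, the cleaner has a long history of file slices to consider, which may be relevant to the size of the response being serialized in the trace below.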
#### Error:
```
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
    at java.lang.StringCoding.encode(StringCoding.java:350) ~[?:1.8.0_422]
    at java.lang.String.getBytes(String.java:941) ~[?:1.8.0_422]
    at io.javalin.Context.result(Context.kt:364) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.timeline.service.RequestHandler.writeValueAsStringSync(RequestHandler.java:209) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.timeline.service.RequestHandler.writeValueAsString(RequestHandler.java:175) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.timeline.service.RequestHandler.lambda$registerFileSlicesAPI$18(RequestHandler.java:383) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.timeline.service.RequestHandler$$Lambda$3772/474269844.handle(Unknown Source) ~[?:?]
    at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:500) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.Javalin$$Lambda$3752/1708997269.manage(Unknown Source) ~[?:?]
    at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.Javalin$$Lambda$3756/1572187887.handle(Unknown Source) ~[?:?]
    at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
    at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
```