TarunMootala commented on issue #8199:
URL: https://github.com/apache/hudi/issues/8199#issuecomment-2379691850

   I'm facing the exact same OOM error with the Cleaner. Is there a fix or workaround?
   
   Hudi: 0.12.1
   Spark: 3.3.0
   Glue: 4.0
   
   Spark Streaming batch interval: 120 seconds
   
   #### Write Configs:
   
   ```
   write_options = {
       "hoodie.table.name": args["table_name"],
       "hoodie.datasource.write.keygenerator.type": "COMPLEX",
       "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
       "hoodie.datasource.write.partitionpath.field": "entity_name",
       "hoodie.datasource.write.recordkey.field": "partition_key,sequence_number",
       "hoodie.datasource.write.precombine.field": "approximate_arrival_timestamp",
       "hoodie.datasource.write.operation": "insert",
       "hoodie.insert.shuffle.parallelism": 10,
       "hoodie.bulkinsert.shuffle.parallelism": 10,
       "hoodie.upsert.shuffle.parallelism": 10,
       "hoodie.delete.shuffle.parallelism": 10,
       "hoodie.metadata.enable": "false",
       "hoodie.datasource.hive_sync.use_jdbc": "false",
       "hoodie.datasource.hive_sync.enable": "false",
       "hoodie.keep.min.commits": 450,
       "hoodie.keep.max.commits": 465,
       "hoodie.cleaner.commits.retained": 449,
   }
   ```
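
For context, here is a rough sketch of what these retention settings imply. It assumes each 120-second micro-batch produces exactly one commit (an assumption, not something measured here), and it encodes my understanding that Hudi expects `hoodie.cleaner.commits.retained` < `hoodie.keep.min.commits` < `hoodie.keep.max.commits`. The helper names are hypothetical and not part of any Hudi API.

```python
# Values copied from the write config and batch interval above.
BATCH_INTERVAL_SECONDS = 120
KEEP_MIN_COMMITS = 450
KEEP_MAX_COMMITS = 465
CLEANER_COMMITS_RETAINED = 449


def retention_window_hours(commits: int, interval_seconds: int) -> float:
    """Hours of history covered by `commits`, assuming one commit per batch."""
    return commits * interval_seconds / 3600


def ordering_ok(retained: int, keep_min: int, keep_max: int) -> bool:
    """Check cleaner retention < archival minimum < archival maximum."""
    return retained < keep_min < keep_max


if __name__ == "__main__":
    hours = retention_window_hours(CLEANER_COMMITS_RETAINED, BATCH_INTERVAL_SECONDS)
    print(f"cleaner retains roughly {hours:.1f} hours of commits")
    print("ordering ok:", ordering_ok(
        CLEANER_COMMITS_RETAINED, KEEP_MIN_COMMITS, KEEP_MAX_COMMITS))
```

Under that one-commit-per-batch assumption, the cleaner keeps a bit under 15 hours of commits, so the amount of file-slice history it has to consider grows with both the retained commit count and the number of files written per batch.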
   
   
   #### Error:
   
   ```
   java.lang.OutOfMemoryError: Requested array size exceeds VM limit
        at java.lang.StringCoding.encode(StringCoding.java:350) ~[?:1.8.0_422]
        at java.lang.String.getBytes(String.java:941) ~[?:1.8.0_422]
        at io.javalin.Context.result(Context.kt:364) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.timeline.service.RequestHandler.writeValueAsStringSync(RequestHandler.java:209) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.timeline.service.RequestHandler.writeValueAsString(RequestHandler.java:175) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.timeline.service.RequestHandler.lambda$registerFileSlicesAPI$18(RequestHandler.java:383) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.timeline.service.RequestHandler$$Lambda$3772/474269844.handle(Unknown Source) ~[?:?]
        at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:500) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.Javalin$$Lambda$3752/1708997269.manage(Unknown Source) ~[?:?]
        at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.Javalin$$Lambda$3756/1572187887.handle(Unknown Source) ~[?:?]
        at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.HandlerList.handle(HandlerList.java:61) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.Server.handle(Server.java:502) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.HttpChannel.handle(HttpChannel.java:370) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.server.HttpConnection.onFillable(HttpConnection.java:267) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
        at org.apache.hudi.org.apache.jetty.io.FillInterest.fillable(FillInterest.java:103) ~[hudi-spark3-bundle_2.12-0.12.1.jar:0.12.1]
   ```
   

