GitHub user thammegowda commented on the issue:

https://github.com/apache/incubator-joshua/pull/81

I am trying to run an experiment with a bunch of big language models, but the tuner is taking forever! In the code base, I found a few more possible bottlenecks:

1. https://github.com/apache/incubator-joshua/blob/000298e555fbc71315b1d8719f5c3918a2102e5b/scripts/training/run_tuner.py#L421
2. https://github.com/apache/incubator-joshua/blob/a30e95563a20e2ccad574f4065654b955fe8fa25/src/main/java/org/apache/joshua/zmert/ZMERT.java#L61

So a heap of 4000 MB is hardcoded for MERT.

Possible explanation: my language models are huge (one of the big ones is ~90 GB) and definitely don't fit into 4 GB, so the JVM is spending all its time in garbage collection.
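
To make the idea concrete, here is a minimal sketch of how the heap size could be passed in instead of being hardcoded. It is not the code in `run_tuner.py`. The helper names, the `TUNER_HEAP_MB` environment variable, the `joshua.jar` classpath, and the argument placeholder are all hypothetical; only the `-Xmx` and `-verbose:gc` JVM flags and the `org.apache.joshua.zmert.ZMERT` class name (taken from the path linked above) are real.

```python
import os

# The value this comment reports as hardcoded for MERT.
DEFAULT_HEAP_MB = 4000


def tuner_heap_mb(env=os.environ):
    """Return the max heap in MB, read from a hypothetical env variable."""
    raw = env.get("TUNER_HEAP_MB")  # hypothetical name, not a Joshua setting
    if raw is None:
        return DEFAULT_HEAP_MB
    heap = int(raw)
    if heap <= 0:
        raise ValueError("heap size must be positive, got %d" % heap)
    return heap


def build_java_cmd(main_class, classpath, args, heap_mb):
    """Assemble a JVM command line with a configurable maximum heap."""
    return [
        "java",
        "-Xmx%dm" % heap_mb,  # standard JVM flag: maximum heap size
        "-verbose:gc",        # standard JVM flag: log every collection
        "-cp", classpath,
        main_class,
    ] + list(args)


if __name__ == "__main__":
    # Classpath and arguments are placeholders for illustration only.
    cmd = build_java_cmd(
        "org.apache.joshua.zmert.ZMERT",
        "joshua.jar",
        ["<zmert-args>"],
        tuner_heap_mb(),
    )
    print(" ".join(cmd))
```

With `-verbose:gc` enabled, the GC log should show whether the tuner really spends most of its time collecting, which would confirm or rule out the explanation above before the default is changed.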
---