[GitHub] [druid] paul-rogers commented on a diff in pull request #13846: Suggested memory calculation in case NOT_ENOUGH_MEMORY_FAULT is thrown.

via GitHub Fri, 24 Feb 2023 15:27:14 -0800


paul-rogers commented on code in PR #13846:
URL: https://github.com/apache/druid/pull/13846#discussion_r1117801467



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/indexing/error/NotEnoughMemoryFault.java:
##########
@@ -45,19 +47,27 @@ public NotEnoughMemoryFault(
   {
     super(
         CODE,
-        "Not enough memory (total = %,d; usable = %,d; server workers = %,d; server threads = %,d)",
+        "Not enough memory. Required alteast %,d bytes. (total = %,d bytes; usable = %,d bytes; server workers = %,d; server threads = %,d). Increase JVM memory with the -xmx option",

Review Comment:
   Nit: `alteast` -> `at least`
   
   Note that there seems to be a convention that all interpolated values be 
enclosed in square brackets. Seems silly sometimes, but some folks feel strongly 
about it.
   
   Increasing JVM memory is not always possible. The other solution is to 
_decrease_ the number of workers. This is the problem with making suggestions: 
there are multiple ways to solve the problem.
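
   To illustrate the points above, here is a minimal sketch of a message that 
brackets each interpolated value and names both remedies. The class and method 
names are hypothetical and this is not the actual `NotEnoughMemoryFault` code; 
note also that the JVM flag is spelled `-Xmx`.

```java
import java.util.Locale;

/** Hypothetical sketch for illustration; not the actual NotEnoughMemoryFault code. */
final class NotEnoughMemoryMessage
{
  static String format(
      final long requiredBytes,
      final long totalBytes,
      final long usableBytes,
      final int serverWorkers,
      final int serverThreads
  )
  {
    // Locale.ENGLISH pins the grouping separator of %,d to a comma.
    return String.format(
        Locale.ENGLISH,
        "Not enough memory. Required at least [%,d] bytes. (total = [%,d] bytes; usable = [%,d] bytes; "
        + "server workers = [%,d]; server threads = [%,d]). "
        + "Either increase JVM memory with the -Xmx option or decrease the number of workers in the JVM.",
        requiredBytes,
        totalBytes,
        usableBytes,
        serverWorkers,
        serverThreads
    );
  }

  public static void main(String[] args)
  {
    // Example values are made up.
    System.out.println(format(2_000_000_000L, 1_073_741_824L, 750_000_000L, 4, 8));
  }
}
```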



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerMemoryParameters.java:
##########
@@ -431,6 +466,26 @@ private static long memoryPerBundle(
     return memoryForBundles / bundleCount;
   }
 
+  /**
+   * Used for estimating the usable memory for better exception messages when {@link NotEnoughMemoryFault} is thrown.
+   */
+  private static long estimateUsableMemory(
+      final int numWorkersInJvm,
+      final int numProcessingThreadsInJvm,
+      final long estimatedEachBundleMemory
+  )
+  {
+    final int bundleCount = numWorkersInJvm + numProcessingThreadsInJvm;
+    return estimateUsableMemory(numWorkersInJvm, estimatedEachBundleMemory * bundleCount);
+
+  }
+
+  private static long estimateUsableMemory(final int numWorkersInJvm, final long estimatedTotalBundleMemory)
+  {
+    final long estimatedWorkerMemory = numWorkersInJvm * PARTITION_STATS_MEMORY_MAX_BYTES;

Review Comment:
   Nit: Now that you've worked out the math, it would be great to explain the 
reasoning. For example, why are the partition stats large enough to worry about? 
Why isn't there any other per-worker overhead to consider?
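
   For reference, the math as it can be read from the visible part of the diff, 
written out as a standalone sketch. The constant's value is a placeholder, and 
the final sum is an assumption, since the rest of the method is not shown in 
the hunk.

```java
/** Standalone sketch of the estimate; not the actual WorkerMemoryParameters code. */
final class UsableMemoryEstimate
{
  // Placeholder value for illustration only; the real constant lives in WorkerMemoryParameters.
  static final long PARTITION_STATS_MEMORY_MAX_BYTES = 300_000_000L;

  static long estimateUsableMemory(
      final int numWorkersInJvm,
      final int numProcessingThreadsInJvm,
      final long estimatedEachBundleMemory
  )
  {
    // One bundle per worker plus one per processing thread, as in the diff.
    final long bundleCount = (long) numWorkersInJvm + numProcessingThreadsInJvm;
    final long estimatedTotalBundleMemory = estimatedEachBundleMemory * bundleCount;

    // Per-worker overhead: only partition statistics are counted here,
    // which is the choice the review comment asks to have explained.
    final long estimatedWorkerMemory = numWorkersInJvm * PARTITION_STATS_MEMORY_MAX_BYTES;

    // ASSUMPTION: the two parts are simply added; the diff cuts off before the return.
    return estimatedTotalBundleMemory + estimatedWorkerMemory;
  }
}
```

   A Javadoc sentence per term (why bundles, why partition stats, why nothing 
else) would answer the questions above.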



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerMemoryParameters.java:
##########
@@ -439,11 +494,19 @@ private static long memoryNeededForInputChannels(final int numInputWorkers)
   }
 
   /**
-   * Amount of heap memory available for our usage.
+   * Amount of heap memory available for our usage. Any computation changes done to this method should also be done in its corresponding method {@link WorkerMemoryParameters#estimateTotalMemoryInJvmFromUsableMemory}
+   */
+  private static long computeUsableMemoryInJvm(final long maxMemory, final long totalLookupFootprint)
+  {
+    return (long) ((maxMemory - totalLookupFootprint) * USABLE_MEMORY_FRACTION);
+  }
+
+  /**
+   * Estimate amount of heap memory to use in case usable memory is provided. This method is used for bettter exception messages when {@link NotEnoughMemoryFault} is thrown.
    */
-  private static long computeUsableMemoryInJvm(final Injector injector)
+  private static long estimateTotalMemoryInJvmFromUsableMemory(long usuableMemeory, final long totalLookupFootprint)

Review Comment:
   This would also benefit from explanation. On the surface, it appears we're 
estimating the memory in the JVM. But, of course, we don't have to estimate 
that: we know that. So, maybe we're estimating the amount of memory the JVM 
_would need_ to handle the given workload. Can we add a note, or change the 
name, to express that, if that is, indeed, what we're doing?
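
   If that reading is right, the method is the algebraic inverse of 
`computeUsableMemoryInJvm`. A minimal sketch under that assumption follows; the 
name `estimateRequiredJvmMemory` and the value of `USABLE_MEMORY_FRACTION` are 
hypothetical choices made for illustration.

```java
/** Sketch of the forward computation and its inverse; values and one name are assumed. */
final class RequiredJvmMemory
{
  // Assumed value for illustration; the real constant is defined in WorkerMemoryParameters.
  static final double USABLE_MEMORY_FRACTION = 0.75;

  /** Forward direction, as in the diff: usable = (max - lookups) * fraction. */
  static long computeUsableMemoryInJvm(final long maxMemory, final long totalLookupFootprint)
  {
    return (long) ((maxMemory - totalLookupFootprint) * USABLE_MEMORY_FRACTION);
  }

  /**
   * Inverse direction (hypothetical name): the JVM heap that would be needed so that
   * the forward computation yields at least requiredUsableMemory.
   * Solving usable = (max - lookups) * fraction for max gives max = usable / fraction + lookups.
   */
  static long estimateRequiredJvmMemory(final long requiredUsableMemory, final long totalLookupFootprint)
  {
    // Round up so that the round trip never undershoots the requirement.
    return (long) Math.ceil(requiredUsableMemory / USABLE_MEMORY_FRACTION) + totalLookupFootprint;
  }
}
```

   A name along these lines makes it clear that the result is a recommendation 
for the workload, not a measurement of the running JVM.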



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerMemoryParameters.java:
##########
@@ -200,15 +207,30 @@ public static WorkerMemoryParameters createProductionInstanceForWorker(
    *                                  the task capacity.
    * @param numProcessingThreadsInJvm size of the processing thread pool in the JVM.
    * @param numInputWorkers           number of workers across input stages that need to be merged together.
+   * @param totalLookUpFootprint      estimated size of the lookups loaded by the process.
    */
   public static WorkerMemoryParameters createInstance(
       final long maxMemoryInJvm,
-      final long usableMemoryInJvm,
       final int numWorkersInJvm,
       final int numProcessingThreadsInJvm,
-      final int numInputWorkers
+      final int numInputWorkers,
+      final long totalLookUpFootprint
   )
   {
+    Preconditions.checkArgument(maxMemoryInJvm > 0, "Max memory passed: [%s] should be > 0", maxMemoryInJvm);
+    Preconditions.checkArgument(numWorkersInJvm > 0, "Number of workers: [%s] in jvm should be > 0", numWorkersInJvm);

Review Comment:
   Nit: `%d` for numbers. Here and below.
   
   For extra credit, since the values are likely to be big, include comma 
separators: `%,d`.
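
   One caveat, as far as I know: Guava's `Preconditions.checkArgument` message 
templates only substitute `%s`, so `%,d` would need the message to be built 
with `String.format` (or a similar formatter) first. A minimal sketch with a 
hypothetical helper class:

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the PR. */
final class NumberFormatting
{
  /** Builds the message with comma separators, e.g. 1073741824 renders as 1,073,741,824. */
  static String describe(final long maxMemoryInJvm)
  {
    // Locale.ENGLISH pins the grouping separator of %,d to a comma.
    return String.format(Locale.ENGLISH, "Max memory passed: [%,d] should be > 0", maxMemoryInJvm);
  }

  /** Plain-Java stand-in for a precondition check that uses the formatted message. */
  static void checkPositive(final long maxMemoryInJvm)
  {
    if (maxMemoryInJvm <= 0) {
      throw new IllegalArgumentException(describe(maxMemoryInJvm));
    }
  }
}
```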



##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/WorkerMemoryParameters.java:
##########
@@ -439,11 +494,19 @@ private static long memoryNeededForInputChannels(final int numInputWorkers)
   }
 
   /**
-   * Amount of heap memory available for our usage.
+   * Amount of heap memory available for our usage. Any computation changes done to this method should also be done in its corresponding method {@link WorkerMemoryParameters#estimateTotalMemoryInJvmFromUsableMemory}
+   */
+  private static long computeUsableMemoryInJvm(final long maxMemory, final long totalLookupFootprint)
+  {
+    return (long) ((maxMemory - totalLookupFootprint) * USABLE_MEMORY_FRACTION);

Review Comment:
   What does `USABLE_MEMORY_FRACTION` represent? Should it apply to lookups? 
That is, if `USABLE_MEMORY_FRACTION` represents memory given over to Java, 
process overhead, etc., then we should apply that to total memory, and then 
subtract lookups, since lookups are subject to the same overhead as worker 
memory. Else, maybe explain the reasoning.
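
   To make the difference concrete, here is a small sketch comparing the two 
orders of operations. The fraction value and the method names are assumptions 
for illustration only.

```java
/** Compares the two ways of combining the usable fraction with the lookup footprint. */
final class UsableMemoryOrder
{
  // Assumed value for illustration; the real constant is defined in WorkerMemoryParameters.
  static final double USABLE_MEMORY_FRACTION = 0.75;

  /** As in the diff: subtract lookups first, then apply the fraction to the remainder. */
  static long lookupsSubtractedFirst(final long maxMemory, final long totalLookupFootprint)
  {
    return (long) ((maxMemory - totalLookupFootprint) * USABLE_MEMORY_FRACTION);
  }

  /** Alternative raised in the review: apply the fraction to total memory, then subtract lookups. */
  static long fractionAppliedFirst(final long maxMemory, final long totalLookupFootprint)
  {
    return (long) (maxMemory * USABLE_MEMORY_FRACTION) - totalLookupFootprint;
  }

  public static void main(String[] args)
  {
    final long maxMemory = 4_000_000_000L;
    final long lookups = 1_000_000_000L;
    System.out.println(lookupsSubtractedFirst(maxMemory, lookups));
    System.out.println(fractionAppliedFirst(maxMemory, lookups));
  }
}
```

   With these made-up inputs the two versions differ by 
`lookups * (1 - USABLE_MEMORY_FRACTION)`, so the choice matters whenever the 
lookup footprint is large.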



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
