khannaekta commented on a change in pull request #476: DL: Avoid constant folding of weights in GPDB6 plan
URL: https://github.com/apache/madlib/pull/476#discussion_r374871042
 
 

 ##########
 File path: src/ports/postgres/modules/deep_learning/madlib_keras.py_in
 ##########
 @@ -685,11 +707,27 @@ def get_loss_metric_from_keras_eval(schema_madlib, table, compile_params,
                                             ARRAY{images_per_seg},
                                             {use_gpus}::BOOLEAN,
                                             ARRAY{accessible_gpus_for_seg},
-                                            {is_final_iteration}
+                                            $2
                                             )) as loss_metric
         from {table}
-        """.format(**locals()), ["bytea"])
-    res = plpy.execute(evaluate_query, [serialized_weights])
+        """.format(**locals()), ["bytea", "boolean"])
+    # For >=GPDB6, previously, when the evaluate UDA was called with the
+    # initial weights value passed in, the query planner would create
+    # custom plans with the weights embedded in the plan itself.
+    # This meant that the query plan size would also include the size
+    # of these weights, bloating it up to hit the 1GB limit when dispatching
+    # the query plan to segments, leading to OOM for large weights.
+    # In GPDB, a PREPARE plan gets a threshold of 5 attempts at creating
+    # custom plans (constant folding the passed-in params) for execution,
+    # after which it uses a generic plan (no constant folding of the
+    # passed-in params) for all subsequent executions.
+    # Therefore, to prevent GPDB6 from creating custom plans when passing in
+    # the weights, the UDA query is executed with DUMMY weights 5 times
+    # before calling it with the actual weights.
+    if not is_platform_pg() and not is_platform_gp6():
 
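To make the plan-caching behavior described in the comment concrete, here is a minimal, self-contained sketch of the heuristic: a prepared statement is executed with a custom plan (parameters constant-folded into the plan) for its first few executions, then falls back to a generic plan. The names `PreparedPlan`, `CUSTOM_PLAN_THRESHOLD`, and `warm_up_then_run` are illustrative only, not GPDB internals or MADlib code.

```python
# Toy model of the PREPARE custom-vs-generic plan heuristic described above.
# All names here are illustrative assumptions, not actual GPDB internals.

CUSTOM_PLAN_THRESHOLD = 5  # attempts with a custom plan before going generic


class PreparedPlan:
    """Simulates the plan cache of a single prepared statement."""

    def __init__(self):
        self.executions = 0

    def execute(self, params):
        self.executions += 1
        if self.executions <= CUSTOM_PLAN_THRESHOLD:
            # Custom plan: params are constant-folded into the plan, so a
            # large parameter (e.g. serialized model weights) bloats the
            # dispatched plan by roughly the parameter's size.
            return ("custom", len(params))
        # Generic plan: params stay symbolic; plan size no longer depends
        # on the size of the passed-in parameter.
        return ("generic", 0)


def warm_up_then_run(plan, real_weights):
    """Burn the custom-plan attempts with tiny dummy weights, then run
    with the real weights so they are never folded into the plan."""
    for _ in range(CUSTOM_PLAN_THRESHOLD):
        plan.execute(b"\x00")          # dummy weights: negligible plan bloat
    return plan.execute(real_weights)  # sixth execution uses the generic plan
```

Running `warm_up_then_run(PreparedPlan(), large_weights)` returns a generic-plan result whose size contribution is zero, which is the effect the warm-up loop in the patch is after.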
 Review comment:
   Agreed! Will update the code to have it as a separate function.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services