orhankislal commented on a change in pull request #476: DL: Avoid constant folding of weights in GPDB6 plan
URL: https://github.com/apache/madlib/pull/476#discussion_r374853305
 
 

 ##########
 File path: src/ports/postgres/modules/deep_learning/madlib_keras.py_in
 ##########
 @@ -168,6 +168,23 @@ def fit(schema_madlib, source_table, model, model_arch_table,
         FROM {source_table}
         """.format(**locals()), ["bytea", "boolean"])
 
 +    # Previously, on GPDB >= 6, when the fit_step UDA was called with the
 +    # initial weights value passed in, the planner would create custom
 +    # plans with the weights constant-folded into the plan itself.
 +    # This meant that the query plan size also included the size of these
 +    # weights, bloating it up to the 1GB limit when dispatching the plan
 +    # to the segments and leading to OOM for large weights.
 +    # For PREPARE plans, GPDB attempts up to 5 custom plans (constant
 +    # folding the passed-in params) and then uses a generic plan (no
 +    # constant folding of the passed-in params) for all subsequent
 +    # executions.
 +    # Therefore, to prevent GPDB6 from creating custom plans when passing
 +    # in the weights, the UDA query is executed with DUMMY weights 5
 +    # times, prior to calling it with the actual weights.
+    if not is_platform_pg() and not is_platform_gp6():
 
 Review comment:
   The comment and the code do not match. The comment implies we want to run this loop for GPDB 6, but the code has `not is_platform_gp6()`.
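
For illustration only, a minimal sketch of the warm-up loop the comment describes,
assuming the intended guard is GPDB >= 6 (i.e. `is_platform_gp6()` without the
`not`, as the review suggests). The query text (`fit_step_query`), the dummy
weights value, and the second (boolean) argument below are placeholders, not the
actual patch:

    # Sketch only, not the actual patch: run the prepared fit_step query with
    # dummy weights enough times that GPDB's planner abandons custom plans and
    # switches to a generic plan, so the real weights are never constant-folded
    # into the dispatched plan.
    CUSTOM_PLAN_ATTEMPTS = 5  # GPDB tries this many custom plans before going generic
    if not is_platform_pg() and is_platform_gp6():
        dummy_weights = b''  # hypothetical small placeholder bytea value
        plan = plpy.prepare(fit_step_query, ["bytea", "boolean"])  # fit_step_query is assumed
        for _ in range(CUSTOM_PLAN_ATTEMPTS):
            plpy.execute(plan, [dummy_weights, False])
    # ...then call the UDA with the actual initial weights as usual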
