This is an automated email from the ASF dual-hosted git repository.

khannaekta pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/madlib.git
commit 181c28e726e72b7624195473d0618b9f9e7d3c9b
Author:     Frank McQuillan <[email protected]>
AuthorDate: Thu Oct 8 14:48:08 2020 -0700

    update user docs and examples for custom functions

    Also, fix format error in user docs
---
 .../madlib_keras_custom_function.sql_in            |  37 ++++---
 .../madlib_keras_model_selection.sql_in            | 116 +++++++++++++++++++--
 2 files changed, 131 insertions(+), 22 deletions(-)

diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in
index acdaa28..bb9864d 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in
@@ -38,7 +38,7 @@ Interface and implementation are subject to change. </em>
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#load_function">Load Function</a></li>
 <li class="level1"><a href="#delete_function">Delete Function</a></li>
-<li class="level1"><a href="#top_n_function">Top n Function</a></li>
+<li class="level1"><a href="#top_k_function">Top k Accuracy Function</a></li>
 <li class="level1"><a href="#example">Examples</a></li>
 <li class="level1"><a href="#literature">Literature</a></li>
 <li class="level1"><a href="#related">Related Topics</a></li>
@@ -52,6 +52,11 @@ The functions to be loaded must be in the form of serialized Python objects
 created using Dill, which extends Python's pickle module
 to the majority of the built-in Python types [1].
 
+Custom functions are also used to return top k categorical accuracy rate
+in the case that you want a different k value than the default from Keras.
+This module includes a helper function to create the custom function
+automatically for a specified k.
+
 There is also a utility function to delete a function
 from the table.
@@ -150,10 +155,13 @@ delete_custom_function(
   </dd>
 </dl>
 
-@anchor top_n_function
-@par Top n Function
+@anchor top_k_function
+@par Top k Accuracy Function
 
-Load a top n function with a specific n to the custom functions table.
+Create and load a custom function for a specific k into the custom functions table.
+The Keras accuracy parameter 'top_k_categorical_accuracy' returns top 5 accuracy by default [2].
+If you want a different top k value, use this helper function to create a custom
+Python function to compute the top k accuracy that you specify.
 
 <pre class="syntax">
 load_top_k_accuracy_function(
@@ -170,7 +178,7 @@ load_top_k_accuracy_function(
   </dd>
 
   <dt>k</dt>
-  <dd>INTEGER. k value for the top k accuracy function.
+  <dd>INTEGER. k value for the top k accuracy that you want.
   </dd>
 </dl>
@@ -187,12 +195,12 @@ load_top_k_accuracy_function(
     <tr>
         <th>name</th>
         <td>TEXT PRIMARY KEY. Name of the object.
-        Generated with the following pattern: (sparse_,)top_(n)_accuracy.
+        Generated with the following pattern: top_(k)_accuracy.
        </td>
     </tr>
     <tr>
        <th>description</th>
-       <td>TEXT. Description of the object (free text).
+       <td>TEXT. Description of the object.
        </td>
     </tr>
     <tr>
@@ -233,7 +241,7 @@ conn.commit()
 </pre>
 List table to see objects:
 <pre class="example">
-SELECT id, name, description FROM test_custom_function_table ORDER BY id;
+SELECT id, name, description FROM custom_function_table ORDER BY id;
 </pre>
 <pre class="result">
  id |  name  |  description
@@ -292,23 +300,28 @@ SELECT madlib.delete_custom_function( 'custom_function_table', 'rmse');
 </pre>
 If all objects are deleted from the table using this function, the table itself will be dropped.
 </pre>
-Load top 3 accuracy function:
+-# Load top 3 accuracy function followed by a top 10 accuracy function:
 <pre class="example">
 DROP TABLE IF EXISTS custom_function_table;
 SELECT madlib.load_top_k_accuracy_function('custom_function_table', 3);
+SELECT madlib.load_top_k_accuracy_function('custom_function_table',
+                                           10);
 SELECT id, name, description FROM custom_function_table ORDER BY id;
 </pre>
 <pre class="result">
- id |      name      |       description
-----+----------------+------------------------
-  1 | top_3_accuracy | returns top_3_accuracy
+ id |      name       |        description
+----+-----------------+-------------------------
+  1 | top_3_accuracy  | returns top_3_accuracy
+  2 | top_10_accuracy | returns top_10_accuracy
 </pre>
 
 @anchor literature
 @literature
 
 [1] Dill https://pypi.org/project/dill/
 
+[2] https://keras.io/api/metrics/accuracy_metrics/#topkcategoricalaccuracy-class
+
 @anchor related
 @par Related Topics

diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in
index fd18edb..870dd18 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in
@@ -81,7 +81,8 @@ generate_model_configs(
   </dd>
 
   <dt>model_selection_table</dt>
-  <dd>VARCHAR. Model selection table created by this module. A summary table
+  <dd>VARCHAR. Model selection table created by this module. If this table already
+  exists, it will be appended to. A summary table
   named <model_selection_table>_summary is also created. Contents of both output
   tables are described below.
   </dd>
@@ -119,10 +120,12 @@ generate_model_configs(
   than regular log-based sampling.
   In the case of grid search, omit the sample type and just put the grid points in the list.
-  For custom loss functions or custom metrics,
-  list the custom function name in the usual way, and provide the name of the
+  For custom loss functions, custom metrics, and custom top k categorical accuracy,
+  list the custom function name and provide the name of the
   table where the serialized Python objects reside using the
-  parameter 'object_table' below. See the examples section later on this page for more examples.
+  parameter 'object_table' below. See the examples section later on this page.
+  For more information on custom functions, please
+  see <a href="group__grp__custom__function.html">Load Custom Functions</a>.
   </dd>
 
   <dt>fit_params_grid</dt>
@@ -139,7 +142,6 @@ generate_model_configs(
      }
     $$
   </pre>
-  See the examples section later on this page for more examples.
   </dd>
 
   <dt>search_type</dt>
@@ -223,7 +225,7 @@ generate_model_configs(
 @anchor load_mst_table
 @par Load Model Selection Table [Deprecated]
 
-This method is deprecated and replaced by the method 'generate_model_configs()' described above.
+This method is deprecated and replaced by the method 'generate_model_configs' described above.
 
 <pre class="syntax">
 load_model_selection_table(
@@ -247,7 +249,7 @@ load_model_selection_table(
   <dt>model_selection_table</dt>
   <dd>VARCHAR. Model selection table created by this utility. A summary table
   named <model_selection_table>_summary is also created. Contents of both output
-  tables are the same as described above for the method 'generate_model_configs()'.
+  tables are the same as described above for the method 'generate_model_configs'.
   </dd>
 
   <dt>model_id_list</dt>
@@ -672,10 +674,104 @@ SELECT * FROM mst_table_manual_summary;
 </pre>
 
 -# Custom loss functions and custom metrics.
-TBD
-
+Let's say we have a table 'custom_function_table' that contains a custom loss
+function called 'my_custom_loss' and a custom accuracy function
+called 'my_custom_accuracy' based
+on <a href="group__grp__custom__function.html">Load Custom Functions.</a>
+Generate the model configurations with:
+<pre class="example">
+DROP TABLE IF EXISTS mst_table, mst_table_summary;
+SELECT madlib.generate_model_configs(
+    'model_arch_library',        -- model architecture table
+    'mst_table',                 -- model selection table output
+    ARRAY[1,2],                  -- model ids from model architecture table
+    $$
+      {'loss': ['my_custom_loss'],
+       'optimizer_params_list': [ {'optimizer': ['Adam', 'SGD'], 'lr': [0.001, 0.01]} ],
+       'metrics': ['my_custom_accuracy']}
+    $$,                          -- compile_param_grid
+    $$
+      { 'batch_size': [64, 128],
+        'epochs': [10]
+      }
+    $$,                          -- fit_param_grid
+    'grid',                      -- search_type
+    NULL,                        -- num_configs
+    NULL,                        -- random_state
+    'custom_function_table'      -- table with custom functions
+);
+SELECT * FROM mst_table ORDER BY mst_key;
+</pre>
+<pre class="result">
+ mst_key | model_id |                                 compile_params                                  |        fit_params
+---------+----------+---------------------------------------------------------------------------------+--------------------------
+       1 |        1 | optimizer='Adam(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss' | epochs=10,batch_size=64
+       2 |        1 | optimizer='Adam(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss' | epochs=10,batch_size=128
+       3 |        1 | optimizer='SGD(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=64
+       4 |        1 | optimizer='SGD(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=128
+       5 |        1 | optimizer='Adam(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=64
+       6 |        1 | optimizer='Adam(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=128
+       7 |        1 | optimizer='SGD(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'   | epochs=10,batch_size=64
+       8 |        1 | optimizer='SGD(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'   | epochs=10,batch_size=128
+       9 |        2 | optimizer='Adam(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss' | epochs=10,batch_size=64
+      10 |        2 | optimizer='Adam(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss' | epochs=10,batch_size=128
+      11 |        2 | optimizer='SGD(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=64
+      12 |        2 | optimizer='SGD(lr=0.001)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=128
+      13 |        2 | optimizer='Adam(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=64
+      14 |        2 | optimizer='Adam(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'  | epochs=10,batch_size=128
+      15 |        2 | optimizer='SGD(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'   | epochs=10,batch_size=64
+      16 |        2 | optimizer='SGD(lr=0.01)',metrics=['my_custom_accuracy'],loss='my_custom_loss'   | epochs=10,batch_size=128
+(16 rows)
+</pre>
+Similarly, if you created a custom top k categorical accuracy function 'top_3_accuracy'
+in <a href="group__grp__custom__function.html">Load Custom Functions</a>
+you can generate the model configurations as:
+<pre class="example">
+DROP TABLE IF EXISTS mst_table, mst_table_summary;
+SELECT madlib.generate_model_configs(
+    'model_arch_library',        -- model architecture table
+    'mst_table',                 -- model selection table output
+    ARRAY[1,2],                  -- model ids from model architecture table
+    $$
+      {'loss': ['categorical_crossentropy'],
+       'optimizer_params_list': [ {'optimizer': ['Adam', 'SGD'], 'lr': [0.001, 0.01]} ],
+       'metrics': ['top_3_accuracy']}
+    $$,                          -- compile_param_grid
+    $$
+      { 'batch_size': [64, 128],
+        'epochs': [10]
+      }
+    $$,                          -- fit_param_grid
+    'grid',                      -- search_type
+    NULL,                        -- num_configs
+    NULL,                        -- random_state
+    'custom_function_table'      -- table with custom functions
+);
+SELECT * FROM mst_table ORDER BY mst_key;
+</pre>
+<pre class="result">
+ mst_key | model_id |                                     compile_params                                     |        fit_params
+---------+----------+----------------------------------------------------------------------------------------+--------------------------
+       1 |        1 | optimizer='Adam(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy' | epochs=10,batch_size=64
+       2 |        1 | optimizer='Adam(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy' | epochs=10,batch_size=128
+       3 |        1 | optimizer='SGD(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=64
+       4 |        1 | optimizer='SGD(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=128
+       5 |        1 | optimizer='Adam(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=64
+       6 |        1 | optimizer='Adam(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=128
+       7 |        1 | optimizer='SGD(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'   | epochs=10,batch_size=64
+       8 |        1 | optimizer='SGD(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'   | epochs=10,batch_size=128
+       9 |        2 | optimizer='Adam(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy' | epochs=10,batch_size=64
+      10 |        2 | optimizer='Adam(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy' | epochs=10,batch_size=128
+      11 |        2 | optimizer='SGD(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=64
+      12 |        2 | optimizer='SGD(lr=0.001)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=128
+      13 |        2 | optimizer='Adam(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=64
+      14 |        2 | optimizer='Adam(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'  | epochs=10,batch_size=128
+      15 |        2 | optimizer='SGD(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'   | epochs=10,batch_size=64
+      16 |        2 | optimizer='SGD(lr=0.01)',metrics=['top_3_accuracy'],loss='categorical_crossentropy'   | epochs=10,batch_size=128
+(16 rows)
+</pre>
 
 -# <b>[Deprecated]</b> Load model selection table. This method is replaced
-by the 'generate_model_configs()' method described above.
+by the 'generate_model_configs' method described above.
 Select the model(s) from the model architecture table that you want to run, along with
 the compile and fit parameters. Unique combinations will be created:
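The docs above describe `load_top_k_accuracy_function()`, which generates and loads a custom metric for a caller-chosen k. As a rough sketch of the quantity that metric computes, the following plain-Python function is illustrative only: the real helper emits a dill-serialized function that operates on Keras tensors, not Python lists, and the function and variable names here are hypothetical.

```python
def top_k_accuracy(y_true, y_prob, k=3):
    """Fraction of samples whose true class index is among the k
    highest-scoring predicted classes (illustrative sketch only)."""
    hits = 0
    for true_label, probs in zip(y_true, y_prob):
        # Indices of the k largest predicted probabilities for this sample.
        top_k = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
        hits += true_label in top_k
    return hits / len(y_true)

# Example: 3 samples, 4 classes. For k=2, the true class of samples 1 and 3
# is ranked in the top 2, while sample 2's true class (index 2) ranks 3rd.
y_true = [0, 2, 3]
y_prob = [[0.6, 0.2, 0.1, 0.1],
          [0.4, 0.3, 0.2, 0.1],
          [0.1, 0.2, 0.3, 0.4]]
print(top_k_accuracy(y_true, y_prob, k=2))  # prints 0.6666666666666666
```

With k=5 this reduces to the default behavior of Keras `top_k_categorical_accuracy` [2]; the helper exists precisely so that other k values can be used in `metrics` within `compile_params`.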
