[ https://issues.apache.org/jira/browse/MADLIB-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-1312: ------------------------------------ Fix Version/s: (was: v1.16) v2.0 > Fix potential SQL injection issues > ---------------------------------- > > Key: MADLIB-1312 > URL: https://issues.apache.org/jira/browse/MADLIB-1312 > Project: Apache MADlib > Issue Type: Improvement > Components: Deep Learning > Reporter: Nandish Jayaram > Priority: Major > Fix For: v2.0 > > > Based on code review comments on PR > [https://github.com/apache/madlib/pull/355:] > In the madlib_keras fit function, we use the following prepare statement to > call fit transition function: > {code:java} > run_training_iteration = plpy.prepare(""" > SELECT {0}.fit_step( > {1}::REAL[], > {2}::SMALLINT[], > gp_segment_id, > {3}::INTEGER, > ARRAY{4}, > ARRAY{5}, > $MAD${6}$MAD$::TEXT, > {7}::TEXT, > {8}::TEXT, > {9}, > $1 > ) AS iteration_result > FROM {10} > """.format(schema_madlib, independent_varname, dependent_varname, > num_classes, seg_nums, total_buffers_per_seg, model_arch, > compile_params_to_pass, fit_params_to_pass, > use_gpu, source_table), ["bytea"]) > {code} > {code:java} > $MAD${6}$MAD$::TEXT{code} > could lead to potential SQL injection issues. > Quoting related code review comments on PR #355 by [~dvaldano]: > {quote}This allows the user to inject arbitrary SQL commands, by including > the string $madlib$ in the compile or fit params. This isn't as bad as > allowing them to execute arbitrary python code, but it's something that could > also happen unintentionally and result in strange errors. We could reduce the > risk by choosing a random string, but see below for a better way of handling > this. > Same issue as above, for model_arch. SQL injection possible If the user > passes a {{model_arch}} json string with {{$MAD$}} in it. > A better way to pass all of these parameters is in the same way we pass the > weights--label them as$1,$2,$3 in the SELECT statement string, and pass their > types as additional args to prepare. The values will then get passed to > execute as additional params. This is the recommended safe way to do sql > queries from python, and will do proper quoting and type casting for all of > them so we don't have to. > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)