Hi, I was looking for support for bind variables, as Ruslan pointed out. I came up with a different workaround, since we cannot use DataFrames in our project; we depend more heavily on SQL queries.
val HC = new HiveContext(sc)
val query = HC.sql("select * from eici_view where customername='_:parameter'")
val plan = query.queryExecution.logical
val placeHolder = "_:parameter"
val replaceValue = "medium"
val newPlan = plan transformAllExpressions {
  // backticks are required here: a lowercase name in a case pattern binds a
  // new variable; `placeHolder` forces an equality check against the val above
  case Literal(`placeHolder`, e) => Literal(replaceValue, e)
}
val myDF = new DataFrame(HC, newPlan)

I am using a placeholder, then taking the logical plan that gets created and replacing the Literal node containing the placeholder value with a Literal node that wraps the user's input. Will this workaround work in all cases?

Thanks,
Dinesh
Philips India

On Fri, Sep 11, 2015 at 6:32 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

> Using the dataframe API is a good workaround.
>
> Another way would be to use bind variables. I don't think Spark SQL
> supports them. That's what Dinesh probably meant by "was not able to find
> any API for preparing the SQL statement safely avoiding injection".
>
> E.g.
>
> val sql_handler = sqlContext.sql("SELECT name FROM people WHERE age >= :var1
>   AND age <= :var2").parse()
>
> toddlers = sql_handler.execute("var1" -> 1, "var2" -> 3)
>
> teenagers = sql_handler.execute(13, 19)
>
> SQL injection would not be possible if Spark SQL supported bind variables,
> as parameters would always be treated as values and never as part of the
> SQL text. It's also arguably easier for developers, as you don't have to
> escape/quote.
>
> ps. Another advantage is that Spark could parse and create the plan once,
> but execute it multiple times.
> http://www.akadia.com/services/ora_bind_variables.html
> This point is more relevant for OLTP-like queries, which Spark is probably
> not yet good at (e.g. returning a few rows quickly, within a few ms).
>
> --
> Ruslan Dautkhanov
>
> On Thu, Sep 10, 2015 at 12:07 PM, Michael Armbrust <mich...@databricks.com> wrote:
>
>> Either that or use the DataFrame API, which directly constructs query
>> plans and thus doesn't suffer from injection attacks (and runs on the same
>> execution engine).
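[Editor's note: the transformAllExpressions workaround in Dinesh's message above has a subtle Scala pitfall worth demonstrating. In a case pattern, a lowercase identifier binds a fresh variable rather than comparing against an outer val, so `case Literal(placeHolder, e)` would match (and rewrite) every Literal in the plan. A minimal sketch in plain Scala, no Spark required; the object and method names are illustrative only:]

```scala
// Demonstrates why backticks are needed when matching against an outer val.
object PlaceholderMatch {
  val placeHolder = "_:parameter"
  val replaceValue = "medium"

  // Buggy: `placeHolder` here is a fresh binding, so EVERY input matches
  // the first case and gets replaced.
  def naive(value: String): String = value match {
    case placeHolder => replaceValue
    case other       => other
  }

  // Correct: backticks make it a stable-identifier pattern, i.e. an
  // equality check against the outer val.
  def correct(value: String): String = value match {
    case `placeHolder` => replaceValue
    case other         => other
  }

  def main(args: Array[String]): Unit = {
    println(naive("some-other-literal"))   // medium (wrong: everything matched)
    println(correct("some-other-literal")) // some-other-literal
    println(correct("_:parameter"))        // medium
  }
}
```

The same distinction applies inside the transformAllExpressions partial function: without backticks, every string Literal in the query plan would be rewritten to the user's input.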
>>
>> On Thu, Sep 10, 2015 at 12:10 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> I don't think this is Spark-specific. Mostly you need to escape /
>>> quote user-supplied values, as with any SQL engine.
>>>
>>> On Thu, Sep 10, 2015 at 7:32 AM, V Dineshkumar
>>> <developer.dines...@gmail.com> wrote:
>>> > Hi,
>>> >
>>> > What is the preferred way of avoiding SQL injection while using Spark SQL?
>>> > In our use case we have to take the parameters directly from the users
>>> > and prepare the SQL statement. I was not able to find any API for
>>> > preparing the SQL statement safely, avoiding injection.
>>> >
>>> > Thanks,
>>> > Dinesh
>>> > Philips India