Hi,

You don't have to run the SQL statement. You can just parse it — that is the logical parsing step:
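To go one step further and list the input tables, you can collect the UnresolvedRelation nodes from the parsed plan. A minimal sketch (the query and table names abc/def are made up for illustration; the API shown matches Spark 2.x, where UnresolvedRelation carries a TableIdentifier — Spark 3.x renamed this field):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation

val spark = SparkSession.builder().master("local[*]").getOrCreate()

val query = "SELECT a.x, b.y FROM abc a JOIN def b ON a.id = b.id"

// Parsing only — no catalog lookup, no execution, so this works offline.
val plan = spark.sessionState.sqlParser.parsePlan(query)

// Walk the unresolved plan and collect every table reference.
val tables = plan.collect {
  case r: UnresolvedRelation => r.tableIdentifier.unquotedString
}.distinct

println(tables)  // prints the referenced table names, here abc and def
```

Because this never touches the catalog, it runs on a laptop without any connection to the production environment.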
val logicalPlan = ss.sessionState.sqlParser.parsePlan(sqlText = query)
println(logicalPlan.prettyJson)

[ {
  "class" : "org.apache.spark.sql.catalyst.plans.logical.Project",
  "num-children" : 1,
  "projectList" : [ [ {
    "class" : "org.apache.spark.sql.catalyst.analysis.UnresolvedStar",
    "num-children" : 0
  } ] ],
  "child" : 0
}, {
  "class" : "org.apache.spark.sql.catalyst.analysis.UnresolvedRelation",
  "num-children" : 0,
  "tableIdentifier" : {
    "product-class" : "org.apache.spark.sql.catalyst.TableIdentifier",
    "table" : "abc"
  }
} ]

On Fri, Jan 25, 2019 at 6:07 AM <l...@china-inv.cn> wrote:
> Hi, All,
>
> I tried the suggested approach and it works, but it requires 'running' the
> SQL statement first.
>
> I just want to parse the SQL statement without running it, so I can do
> this on my laptop without connecting to our production environment.
>
> I tried to write a tool that uses the SqlBase.g4 bundled with Spark SQL
> to extract the names of the input tables, and it works as expected.
>
> But I have a question:
>
> The parser generated by SqlBase.g4 only accepts 'select' statements with
> all keywords such as 'SELECT' and 'FROM', and all table names, capitalized.
> For example, it accepts 'SELECT * FROM FOO' but not 'select * from foo'.
>
> Yet I can run spark.sql("select * from foo") in spark2-shell
> without any problem.
>
> Is there another 'layer' in Spark SQL that capitalizes those 'tokens'
> before invoking the parser?
>
> If so, why not just modify SqlBase.g4 to accept lower-case keywords?
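On the capitalization question quoted above: yes, there is a layer in front of the generated parser. Spark wraps the input in a char stream that upper-cases what the lexer sees (see UpperCaseCharStream in Spark's ParseDriver.scala), so SqlBase.g4 only needs upper-case keyword tokens while getText still returns the original input for identifiers and literals. A simplified sketch of that wrapper, patterned after Spark's source rather than copied from it:

```scala
import org.antlr.v4.runtime.{CharStream, CodePointCharStream, IntStream}
import org.antlr.v4.runtime.misc.Interval

// Delegates everything to the wrapped stream, but upper-cases the characters
// the lexer inspects via LA(); getText() still returns the original text,
// so identifiers and string literals keep their case.
class UpperCaseCharStream(wrapped: CodePointCharStream) extends CharStream {
  override def consume(): Unit = wrapped.consume()
  override def getSourceName: String = wrapped.getSourceName
  override def index(): Int = wrapped.index
  override def mark(): Int = wrapped.mark
  override def release(marker: Int): Unit = wrapped.release(marker)
  override def seek(where: Int): Unit = wrapped.seek(where)
  override def size(): Int = wrapped.size
  override def getText(interval: Interval): String = wrapped.getText(interval)

  override def LA(i: Int): Int = {
    val la = wrapped.LA(i)
    if (la == 0 || la == IntStream.EOF) la else Character.toUpperCase(la)
  }
}
```

If your standalone tool constructs its lexer over a stream like this instead of the raw input, it will accept 'select * from foo' just as spark2-shell does, without touching the grammar.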
> Thanks
>
> Boying
>
> From: "Shahab Yunus" <shahab.yu...@gmail.com>
> To: "Ramandeep Singh Nanda" <ramannan...@gmail.com>
> Cc: "Tomas Bartalos" <tomas.barta...@gmail.com>, l...@china-inv.cn, "user" <user@spark.apache.org>
> Date: 2019/01/24 06:45
> Subject: Re: How to get all input tables of a SPARK SQL 'select' statement
>
> Could be a tangential idea but might help: why not use the queryExecution and
> logicalPlan objects that are available when you execute a query using
> SparkSession and get a DataFrame back? The JSON representation contains
> almost all the info that you need, and you don't need to go to Hive to get
> this info.
>
> Some details here:
> https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-Dataset.html#queryExecution
>
> On Wed, Jan 23, 2019 at 5:35 PM Ramandeep Singh Nanda <ramannan...@gmail.com> wrote:
> Explain extended or explain would list the plan along with the tables. I'm not
> aware of any statements that explicitly list dependencies or tables directly.
>
> Regards,
> Ramandeep Singh
>
> On Wed, Jan 23, 2019, 11:05 Tomas Bartalos <tomas.barta...@gmail.com> wrote:
> This might help:
> show tables;
>
> On Wed, Jan 23, 2019 at 10:43 <l...@china-inv.cn> wrote:
> Hi, All,
>
> We need to get all input tables of several Spark SQL 'select' statements.
>
> We can get that information for Hive SQL statements by using 'explain
> dependency select ...', but I can't find an equivalent command for Spark SQL.
>
> Does anyone know how to get this information for a Spark SQL 'select'
> statement?
> Thanks
>
> Boying
>
> This email message may contain confidential and/or privileged information.
> If you are not the intended recipient, please do not read, save, forward,
> disclose or copy the contents of this email or open any file attached to
> this email. We will be grateful if you could advise the sender immediately
> by replying this email, and delete this email and any attachment or links
> to this email completely and immediately from your computer system.

--
Regards,
Ramandeep Singh
http://orastack.com
+13474792296
ramannan...@gmail.com
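For completeness, the queryExecution- and explain-based suggestions earlier in the thread can be sketched as follows. Note these do require the tables to be resolvable in the session catalog (the temp view "foo" below is registered locally for illustration), which is exactly why pure parsing is preferable for offline analysis:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
spark.range(10).createOrReplaceTempView("foo")

val df = spark.sql("select * from foo")

// The analyzed plan shows the resolved input relations without running the query.
println(df.queryExecution.analyzed.treeString)

// EXPLAIN EXTENDED prints the parsed, analyzed, optimized and physical plans,
// which also name the input tables.
spark.sql("explain extended select * from foo").show(truncate = false)
```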