As far as I know, the driver should not generate additional statements. Can you share what software you're using to connect to Impala through the driver? I suspect that the client software generated these queries, possibly for schema discovery.
Cheers, Lars

On Thu, Jun 14, 2018 at 10:14 PM Jim Apple <[email protected]> wrote:

> I don’t think I understand the statement. Under what conditions are
> additional DDL statements generated by the driver? What exact query did you
> enter and what was generated instead?
>
> On Thu, Jun 14, 2018 at 5:44 PM Sunil Parmar <[email protected]>
> wrote:
>
>> The Impala JDBC driver generates additional DDL statements.
>>
>> select column1,column2 from table limit 0
>> or
>> show tables
>> or
>> use dwh;
>> or
>> describe table
>>
>> If DDL are expensive; is there a way to avoid this ?
>>
>> Sunil Parmar
>>
>>
>> On Mon, May 21, 2018 at 5:48 PM Tim Armstrong <[email protected]>
>> wrote:
>>
>>> SET is very cheap because it just changes a value in the user's session.
>>> There's no interaction with any other services.
>>>
>>> DDL operations can be a lot more expensive, although they don't compete
>>> with executing queries for resources. For the most part those DDL
>>> operations you mentioned consume resources in Java, generate load on
>>> metadata services like the HDFS namenode and Hive Metastore, and can block
>>> other DDL operations. We don't have great visibility at the moment into
>>> those resources consumed by metadata operation.
>>>
>>> On Mon, May 21, 2018 at 11:21 AM, Fawze Abujaber <[email protected]>
>>> wrote:
>>>
>>>> Thanks Tim,
>>>>
>>>> If i got your point right, then SET operation is affecting the client
>>>> Java memory and not considered as part of the impala daemon memory limit,
>>>> right?
>>>>
>>>> Is this correct also for invalidate meta data and Refresh or alter
>>>> table ... recover partitions? Are all of these client operations? Are they
>>>> use any resources assigned for impala daemon or impala resource pools?
>>>>
>>>> If they are client operations then I can use the used resources using
>>>> the Linux TOP command, if they are taking any resources from impala daemon
>>>> memory limit or resource pool, I will be happy to know where I can track
>>>> the resource usage of these DDL operations.
>>>>
>>>> On Mon, 21 May 2018 at 20:45 Tim Armstrong <[email protected]>
>>>> wrote:
>>>>
>>>>> "SET" is very cheap - it only affects the client session on the Impala
>>>>> server that you're connected to. DDL operations are often more expensive
>>>>> because they require updating metadata globally. That can sometimes
>>>>> involve a bit of work (e.g. gather metadata about existing files on
>>>>> HDFS) or can involve the operation getting queued behind other metadata
>>>>> operations.
>>>>>
>>>>> On Sun, May 20, 2018 at 4:09 AM, Fawze Abujaber <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Community,
>>>>>>
>>>>>> Does the DDL operations like alter, drop and create consume
>>>>>> resources? and does the set operations like set resource_pool=xxx also
>>>>>> consume resources?
>>>>>>
>>>>>> Yes, i'm aware these operations are quick but once they are running
>>>>>> from interfaces like Hue or MSTR through ODBC it's running till it get
>>>>>> timeout .... which may exceed few minutes
>>>>>>
>>>>>> --
>>>>>> Take Care
>>>>>> Fawze Abujaber
>>>>>>
>>>>>
>>>>> --
>>>> Take Care
>>>> Fawze Abujaber
>>>>
>>>
