The Impala JDBC driver generates additional DDL statements, for example:

select column1,column2 from table limit 0
or
show tables
or
use dwh;
or
describe table

If DDL statements are expensive, is there a way to avoid them?
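
To gauge how often the driver issues these extra statements, something like
the rough sketch below could tally them from a list of statement texts. It
assumes the statement text has already been collected somewhere (for example,
copied from the query list). The patterns, the labels, and the `classify` and
`tally` helper names are hypothetical and only cover the four examples listed
above; they are not part of the driver or of Impala.

```python
import re
from collections import Counter

# Hypothetical patterns for the driver-generated statements listed above.
# They are illustrative only and not an exhaustive list of what a driver sends.
_PATTERNS = [
    ("limit_0_probe", re.compile(r"^select\b.*\blimit\s+0\s*;?$", re.I | re.S)),
    ("show_tables", re.compile(r"^show\s+tables\b", re.I)),
    ("use", re.compile(r"^use\s+\S+", re.I)),
    ("describe", re.compile(r"^describe\s+\S+", re.I)),
]


def classify(statement):
    """Return a label for a driver-style metadata statement, else 'other'."""
    text = statement.strip()
    for label, pattern in _PATTERNS:
        if pattern.search(text):
            return label
    return "other"


def tally(statements):
    """Count how many statements fall under each label."""
    return Counter(classify(s) for s in statements)
```

Calling `tally(...)` on the collected statements gives a per-label count, which
shows whether the extra statements are frequent enough to worry about.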

Sunil Parmar


On Mon, May 21, 2018 at 5:48 PM Tim Armstrong <[email protected]>
wrote:

> SET is very cheap because it just changes a value in the user's session.
> There's no interaction with any other services.
>
> DDL operations can be a lot more expensive, although they don't compete
> with executing queries for resources. For the most part those DDL
> operations you mentioned consume resources in Java, generate load on
> metadata services like the HDFS namenode and Hive Metastore, and can block
> other DDL operations. We don't have great visibility at the moment into
> the resources consumed by metadata operations.
>
> On Mon, May 21, 2018 at 11:21 AM, Fawze Abujaber <[email protected]>
> wrote:
>
>> Thanks Tim,
>>
>> If I understood you correctly, a SET operation only affects the client's
>> Java memory and is not counted against the Impala daemon memory limit,
>> right?
>>
>> Is this also true for INVALIDATE METADATA, REFRESH, or ALTER TABLE
>> ... RECOVER PARTITIONS? Are all of these client operations? Do they use
>> any resources assigned to the Impala daemon or Impala resource pools?
>>
>> If they are client operations, then I can track the resources used with
>> the Linux top command. If they take any resources from the Impala daemon
>> memory limit or a resource pool, I would be happy to know where I can track
>> the resource usage of these DDL operations.
>>
>> On Mon, 21 May 2018 at 20:45 Tim Armstrong <[email protected]>
>> wrote:
>>
>>> "SET" is very cheap - it only affects the client session on the Impala
>>> server that you're connected to. DDL operations are often more expensive
>>> because they require updating metadata globally. That can sometimes involve
>>> a bit of work (e.g. gathering metadata about existing files on HDFS) or can
>>> involve the operation getting queued behind other metadata operations.
>>>
>>> On Sun, May 20, 2018 at 4:09 AM, Fawze Abujaber <[email protected]>
>>> wrote:
>>>
>>>> Hi Community,
>>>>
>>>> Do DDL operations like ALTER, DROP and CREATE consume resources?
>>>> And do SET operations like set resource_pool=xxx also consume
>>>> resources?
>>>>
>>>> Yes, I'm aware these operations are usually quick, but when they are run
>>>> from interfaces like Hue or MSTR through ODBC, they keep running until
>>>> they time out, which may take more than a few minutes.
>>>>
>>>> --
>>>> Take Care
>>>> Fawze Abujaber
>>>>
>>>
>>> --
>> Take Care
>> Fawze Abujaber
>>
>
>
