Hi Cobby,

 

In your statement and I quote:

 

“Empirical tests, by the show, suggest that indeed, partition optimization does 
take place. “

 

Do you have an example of this please? 

 

Thanks,

 

Mich

 

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and 
Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries 
or their employees, unless expressly so stated. It is the responsibility of the 
recipient to ensure that this email is virus free, therefore neither Peridale 
Ltd, its subsidiaries nor their employees accept any responsibility.

 

From: cobby cohen [mailto:ququr...@yahoo.com] 
Sent: 16 March 2015 16:14
To: user@hive.apache.org
Subject: Re: view over partitioned table

 

hi,

i am using CDH 5.2 (hive 13).

as for partitioned views, to my understanding they are more about abstracting a 
non partitioned back end. My questing is related to how and if the query 
optimizer can detect possible uses of partitioned column, even if it might be 
"masked" by a view (or multiple view).

Empirical tests, by the show, suggest that indeed, partition optimization does 
take place.  

regards, cobby.

 

  _____  

From: "Moore, Douglas" <douglas.mo...@thinkbiganalytics.com>
To: "user@hive.apache.org" <user@hive.apache.org>; 'cobby cohen' 
<ququr...@yahoo.com> 
Sent: Monday, March 16, 2015 4:19 PM
Subject: Re: view over partitioned table

 

Mich,

 

What version of Hive are you running?

Have you seen this?

https://cwiki.apache.org/confluence/display/Hive/PartitionedViews

 

- Douglas

 

From: Mich Talebzadeh <m...@peridale.co.uk>
Reply-To: <user@hive.apache.org>
Date: Sun, 15 Mar 2015 19:01:57 +0000
To: <user@hive.apache.org>, 'cobby cohen' <ququr...@yahoo.com>
Subject: RE: view over partitioned table

 

OK,

 

This is the way I read it. Crearte table t1 .. partitioned by date will use 
horizontal partitioning as per common with any RDBMS say Oracle.

 

The view I will create it as follows:

 

hive> create view v1 as select * from t1;

OK

Time taken: 0.073 seconds

 

hive> analyze table t1 partition (date) compute statistics;

Query ID = hduser_20150315184646_af3890b2-4079-4f81-b38b-da9af67b2751

Total jobs = 1

Launching Job 1 out of 1

-----

Stage-Stage-0: Map: 1   Cumulative CPU: 0.6 sec   HDFS Read: 299 HDFS Write: 0 
SUCCESS

Total MapReduce CPU Time Spent: 600 msec

OK

 

hive> insert into table t1 partition (date)  values (1,'2015-04-04');

hive> insert into table t1 partition (date)  values (1,'2015-05-04');

 

Help to update stats

 

hive> analyze table t1 partition (date) compute statistics;

 

select from the view after exiting hive and back

 

 

hive> explain select id from v1 where date = '2015-04-04';

OK

STAGE DEPENDENCIES:

  Stage-0 is a root stage

 

STAGE PLANS:

  Stage: Stage-0

    Fetch Operator

      limit: -1

      Processor Tree:

        TableScan

          alias: t1

          Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column 
stats: NONE

          Select Operator

            expressions: id (type: string)

            outputColumnNames: _col0

            Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column 
stats: NONE

            ListSink

 

Time taken: 0.627 seconds, Fetched: 17 row(s)

 

 

Now select the same from base table

 

hive> explain select id from t1 where date = '2015-04-04';

OK

STAGE DEPENDENCIES:

  Stage-0 is a root stage

 

STAGE PLANS:

  Stage: Stage-0

    Fetch Operator

      limit: -1

      Processor Tree:

        TableScan

          alias: t1

          Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column 
stats: NONE

          Select Operator

            expressions: id (type: string)

            outputColumnNames: _col0

            Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column 
stats: NONE

            ListSink

 

Time taken: 0.597 seconds, Fetched: 17 row(s)

 

The plan looks the same but it is slower on the view

 

HTH

 

Mich 

 

NOTE: The information in this email is proprietary and confidential. This 
message is for the designated recipient only, if you are not the intended 
recipient, you should destroy it immediately. Any information in this message 
shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries 
or their employees, unless expressly so stated. It is the responsibility of the 
recipient to ensure that this email is virus free, therefore neither Peridale 
Ltd, its subsidiaries nor their employees accept any responsibility.

 

From: cobby cohen [mailto:ququr...@yahoo.com] 
Sent: 15 March 2015 17:52
To: user@hive.apache.org
Subject: view over partitioned table

 

in the following case:

 

create table T1(id string)

partitioned by (date date);

 

create view V1(id, date)

select id, date from T1

 

if i do the following

select id from V1 where date = '2015-04-04'

will the query optimize on T1's partitioning?

 

and what if V1 is a bit more complex, further depending on other views (all of 
which expose the partitioned column as-is)?

 

thx!

 

Reply via email to