Latha
What error are you getting when doing explain plan.

On Mon, Apr 20, 2015 at 4:40 PM, Sivasubramaniam, Latha <
[email protected]> wrote:

> I am unable to do an explain plan for this query.  Is there a way to pass
> a sql file for the Explain plan command. Query runs fine from the file with
> the !run command though. Will try to fix the errors and see if I can get
> the query plan.
>
> Thanks,
> latha
>
> -----Original Message-----
> From: Jacques Nadeau [mailto:[email protected]]
> Sent: Monday, April 20, 2015 4:12 PM
> To: [email protected]
> Subject: Re: Query performance Comparison - Drill and Impala
>
> Can you confirm you are getting broadcast joins for all the dimension
> tables? Additionally, can you confirm the planning time? I'm guessing both
> might be an issue given the times you are seeing.
> On Apr 20, 2015 6:46 PM, "Sivasubramaniam, Latha" <
> [email protected]> wrote:
>
> > Hi,
> >
> > I am running performance tests for our use cases on Impala and Drill
> > to make a choice and pick one for our Data warehouse implementation on
> > top of Hadoop.  I used the same cluster for both the tests. Select
> > Query that takes ~20 seconds in impala, takes around 440 seconds in
> > drill on same hardware cluster (nearly 20 times more). I would like to
> > make sure that I am not missing anything in tuning or configuration.
> > Below is our configuration, appreciate comments on if these
> > performance numbers are as expected or if any tuning can be done to
> improve drill performance.
> >
> > Hadoop Cluster configuration and data set:
> >
> >
> > *         5 nodes cluster, 1 name node and 4 data nodes, 32GB memory , 4
> > cpus on each node.
> >
> > *         There are two main fact tables. 98 million records in one and
> 87
> > million records in another. There are around 10 dimension tables. For
> > now, we are running the tests with normalized schema from out data
> warehouse.
> >
> > *         Running the same query against Impala and Drill except for the
> > syntax differences everything is same.
> >
> > Impala Configuration
> >
> >
> > *         CDH 5.3.2 with Impala 2.1.2 version
> >
> > *         4 impala daemons running on 4 data nodes.
> >
> > *          Impala daemons memory set to 25GB.
> >
> > *         Partitioned parquet tables were created from text format hdfs
> > files using impala itself (CTAS command).
> >
> > *         Parquet block size 256MB
> >
> > Drill Configuration
> >
> >
> > *         Drill 0.8 version, added drill on the CDH cluster setup
> >
> > *         4 drillbits running on 4 data nodes
> >
> > *         Drillbits memory set to 16GB and heap memory set to 8GB.  I ran
> > with 8GB memory and 4GB for heap, there was no change in the query
> timings.
> >
> > *         Partitioned parquet files.
> >
> > SQL Query
> >
> > SELECT
> >
> >                 DIM_Sites_parq.Site_Desc,
> >
> >                 DIM_Workgroups_AgentEventDetails.Workgroup_Desc,
> >                 DIM_Agents_AgentEventDetails.Last_Name,
> >
> >                 DIM_Agents_AgentEventDetails.First_Name,
> >
> >
> >                 DIM_Agent_Status_AgentEventDetails.Agent_Status_Key,
> >                 DIM_CallTypes_ContactDetails.Call_Type_Desc,
> >
> >
> >
> >                 DIM_Call_Actions.Call_Action_Desc,
> >
> >                 FACT_Contact_Detail.ANI,
> >                 FACT_Contact_Detail.DNIS,
> >
> >                 FACT_Contact_Detail.Begin_Date_Time,
> >                 FACT_Contact_Detail.Call_Start_Date_Time,
> >
> >                 FACT_Contact_Detail.Orig_DNIS,
> >                 FACT_Agent_Activity_Detail.Start_Date_Time,
> >                 FACT_Agent_Activity_Detail.End_Date_Time,
> >                 FACT_Agent_Activity_Detail.Dur_Time,
> >
> >                 FACT_Agent_Activity_Detail.Sequence_Number,
> >
> >                 FACT_Contact_Detail.Begin_Date_Time,
> >                 FACT_Contact_Detail.Call_Start_Date_Time,
> >                 FACT_Agent_Activity_Detail.Start_Date_Time,
> >                 FACT_Agent_Activity_Detail.End_Date_Time,
> >
> >
> >                 DIM_Dispositions.Disposition_Desc,
> >                 DIM_Agents_AgentEventDetails.Full_Name ,
> >
> >
> > DIM_Resource_Groups_AgentEventDetails.Resource_Group_Desc,
> >
> >                 FACT_Contact_Detail.Sequence_Number,
> >
> >                 FACT_Contact_Detail.Service_Key,
> >                 FACT_Contact_Detail.Station,
> >                 FACT_Contact_Detail.Near_Off_Hook_Date_Time,
> >                 FACT_Contact_Detail.Far_Off_Hook_Date_Time,
> >                 FACT_Contact_Detail.Near_On_Hook_Date_Time,
> >                 FACT_Contact_Detail.Far_On_Hook_Date_Time,
> >                 FACT_Contact_Detail.Begin_Greeting_Date_Time,
> >
> >                 FACT_Contact_Detail.End_Guard_Date_Time,
> >                 FACT_Contact_Detail.Resource_Group_Type,
> >                 FACT_Contact_Detail.Info_Digits,
> >                 FACT_Contact_Detail.Agent_Site_Key,
> >                 FACT_Contact_Detail.Agent_Switch_Key,
> >
> >                 FACT_Contact_Detail.Line_Number,
> >                 FACT_Contact_Detail.Queue_Start_Date_Time,
> >
> >                 FACT_Contact_Detail.CallQ_Start_Date_Time,
> >                 FACT_Contact_Detail.Queue_End_Date_Time,
> >                 FACT_Contact_Detail.Dur_Queue_Time,
> >                 FACT_Contact_Detail.Conn_Clear_Date_Time,
> >                 FACT_Contact_Detail.Wrap_End_Date_Time,
> >
> > FACT_Contact_Detail.Answer_Flag,
> >
> > FACT_Contact_Detail.Call_End_Date_Time,
> >  FACT_Contact_Detail.Script_Id_Source,
> >
> >   FACT_Contact_Detail.IPNIQ_ASBR_Flag,
> >  FACT_Contact_Detail.Reserved_Start_Date_Time,
> >
> >   FACT_Contact_Detail.Reserved_End_Date_Time,
> >  FACT_Contact_Detail.Dur_IPNIQ_Q_Time,
> >  FACT_Contact_Detail.SeqLeg_Number,
> >
> > FACT_Contact_Detail.Num_Accept_Responses,
> >  FACT_Contact_Detail.Num_Reject_Responses,
> >  FACT_Contact_Detail.Preview_Start_Date_Time,
> >
> >   FACT_Contact_Detail.Preview_End_Date_Time,
> >
> > FACT_Contact_Detail.Detection_Date_Time,
> >  FACT_Contact_Detail.Answer_Date_Time,
> >  FACT_Contact_Detail.Record_Number,
> >  FACT_Contact_Detail.Detection_Type,
> >  FACT_Contact_Detail.Calling_Party_Name,
> >  FACT_Contact_Detail.Num_Phone_Rings,
> >
> >   FACT_Contact_Detail.End_Greeting_Date_Time,
> >  FACT_Contact_Detail.Phone_Start_Ringing_Date_Time,
> >
> >   FACT_Contact_Detail.Begin_Greeting_Date_Time,
> >  FACT_Contact_Detail.Begin_Msg_Playback_Date_Time,
> >  FACT_Contact_Detail.Xfer_Command_Date_Time,
> >  FACT_Contact_Detail.Connect_Date_Time,
> >  FACT_Contact_Detail.Second_Party_Park_Flag,
> >  FACT_Contact_Detail.Dur_Short_Call,
> >  FACT_Agent_Activity_Detail.Call_Id_Source,
> >
> >   FACT_Contact_Detail.Call_Category_Key,
> >
> >   DIM_Agents_AgentEventDetailsSecondParty.User_Id_Source
> >
> >   FROM
> >     hdfs.rootparq.`fact_agent_activity_detail_12m_partparq` AS
> > FACT_Agent_Activity_Detail
> >
> >                 LEFT OUTER JOIN
> >
> >  (select * from hdfs.rootparq.`fact_contact_detail_12m_partparq` where
> > Call_Start_Date_Time
> >
> > between
> >
> > '2014-12-03 00:00:00' AND '2014-12-03 13:00:00' AND
> > fact_contact_detail_12m_partparq.dir0 = 2014 and
> > fact_contact_detail_12M_partParq.dir1 = 12 ) AS FACT_Contact_Detail
> > ON
> >        FACT_Agent_Activity_Detail.Data_Source_Key =
> > FACT_Contact_Detail.Data_Source_Key
> >                 AND
> >        FACT_Agent_Activity_Detail.Call_Id_Source =
> > FACT_Contact_Detail.Call_Id_Source
> >                 AND
> >        FACT_Agent_Activity_Detail.Sequence_Number =
> > FACT_Contact_Detail.Sequence_Number
> >
> >
> >                 LEFT OUTER JOIN
> >                    DIM_Services_Parq AS
> > DIM_Services_Agent_Activity_Detail
> > ON DIM_Services_Agent_Activity_Detail.Service_Key =
> > FACT_Agent_Activity_Detail.Service_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Services_Parq AS DIM_Services_Contact_Detail ON
> > FACT_Contact_Detail.Service_Key =
> > DIM_Services_Contact_Detail.Service_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Call_Types_Parq AS DIM_CallTypes_ContactDetails ON
> >        FACT_Contact_Detail.Call_Type_Key =
> > DIM_CallTypes_ContactDetails.Call_Type_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Resource_Groups_parq AS DIM_Resource_Groups_AgentEventDetails ON
> >        FACT_Contact_Detail.Resource_Group_Key =
> > DIM_Resource_Groups_AgentEventDetails.Resource_Group_Key
> >
> >
> >                 LEFT OUTER JOIN
> >     DIM_Call_Actions_parq AS DIM_Call_Actions ON
> > DIM_Call_Actions.Call_Action_Key = FACT_Contact_Detail.Call_Action_Key
> >
> >                 INNER JOIN
> >         DIM_Agents_Parquet AS DIM_Agents_AgentEventDetails ON
> > FACT_Agent_Activity_Detail.Agent_Key =
> > DIM_Agents_AgentEventDetails.Agent_Key
> >
> >                 INNER JOIN
> >         DIM_Sites_Parq ON FACT_Agent_Activity_Detail.Site_Key =
> > DIM_Sites_Parq.Site_Key
> >
> >                 INNER JOIN
> >         DIM_Workgroups_parq AS DIM_Workgroups_AgentEventDetails ON
> >        FACT_Agent_Activity_Detail.Workgroup_Key =
> > DIM_Workgroups_AgentEventDetails.Workgroup_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Dispositions_parq AS DIM_Dispositions ON
> > FACT_Contact_Detail.Agent_Disp_Key = DIM_Dispositions.Disposion_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Agents_Parquet AS DIM_Agents_AgentEventDetailsSecondParty ON
> >        FACT_Contact_Detail.Second_Party_Agent_Key =
> > DIM_Agents_AgentEventDetailsSecondParty.Agent_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Call_Action_Reasons_parq AS DIM_Call_Action_Reasons ON
> > FACT_Contact_Detail.Call_Action_Reason_Key =
> > DIM_Call_Action_Reasons.Call_Action_Reason_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Agent_Status_parq AS DIM_Agent_Status_AgentEventDetails ON
> >        FACT_Agent_Activity_Detail.Status_Key =
> > DIM_Agent_Status_AgentEventDetails.Agent_Status_Key
> >
> >                 LEFT OUTER JOIN
> >     DIM_Agent_Status_Reasons_parq AS DIM_Agent_Reason_AgentEventDetails
> ON
> >        FACT_Agent_Activity_Detail.Reason_Key =
> > DIM_Agent_Reason_AgentEventDetails.Agent_Status_Reason_Key
> >
> >
> > WHERE
> >  (
> >
> > DIM_Workgroups_AgentEventDetails.Workgroup_Key  IN  ( 3,1)
> >
> >   AND
> >
> >
> >    ( FACT_Agent_Activity_Detail.Start_Date_Time
> >
> > between
> > '2014-12-03 00:00:00' AND '2014-12-03 13:00:00' AND
> > FACT_Agent_Activity_Detail.dir0 = 2014 and
> > FACT_Agent_Activity_Detail.dir1 =
> >
> > 12
> >    )
> >
> >   );
> >
> >
> > Thanks,
> > Latha
> >
> >
> > This email (including any attachments) is proprietary to Aspect
> > Software, Inc. and may contain information that is confidential. If
> > you have received this message in error, please do not read, copy or
> forward this message.
> > Please notify the sender immediately, delete it from your system and
> > destroy any copies. You may not further disclose or distribute this
> > email or its attachments.
> >
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.
>

Reply via email to