It probably won't make a difference to performance — the subqueries will be 
executed in separate map-reduce jobs, and the output put on disk anyway.

First, get it to work; second, optimize.

Robin

From: dyuti a <hadoop.hiv...@gmail.com<mailto:hadoop.hiv...@gmail.com>>
Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>
Date: Friday, June 28, 2013 12:05 PM
To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>
Subject: Re: Need urgent help in hive query

Hi Robin,
Thanks for your reply. Hope this will help us but still looking into 
performance wise too....if no other option then have to go by this.

Dti


On Sat, Jun 29, 2013 at 12:03 AM, Robin Morris 
<r...@baynote.com<mailto:r...@baynote.com>> wrote:
Split up the query – put results of the subquery into a table, run the final 
query on that, then drop the temporary table.

Robin

From: dyuti a <hadoop.hiv...@gmail.com<mailto:hadoop.hiv...@gmail.com>>
Reply-To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>
Date: Friday, June 28, 2013 10:54 AM
To: "user@hive.apache.org<mailto:user@hive.apache.org>" 
<user@hive.apache.org<mailto:user@hive.apache.org>>, Michael Malak 
<michaelma...@yahoo.com<mailto:michaelma...@yahoo.com>>
Subject: Re: Fwd: Need urgent help in hive query

Hi Michael,
Thanks for your help, is there any other possible options apart from this.


On Fri, Jun 28, 2013 at 10:33 PM, Michael Malak 
<michaelma...@yahoo.com<mailto:michaelma...@yahoo.com>> wrote:
Just copy and paste the whole long expressions to their second occurrences.

________________________________
From: dyuti a <hadoop.hiv...@gmail.com<mailto:hadoop.hiv...@gmail.com>>
To: user@hive.apache.org<mailto:user@hive.apache.org>
Sent: Friday, June 28, 2013 10:58 AM
Subject: Fwd: Need urgent help in hive query

Hi Experts,
I'm trying with the below SQL query in Hive, which does not support column 
alias access in subsequent columns as shown below in the query. Is there any 
other way to rewrite the same without using alias? any of your help are really 
appreciated.

INSERT INTO CAUD
   (
     pst_cc
  pst_no
  pst_ber
  pst_tkn
  pst_dtm
  pst_nbr
  pst_cde
  pst_dte
)
SELECT der.cc<http://der.cc/>
  der.no<http://der.no/>
  der.ber
  der.tkn
  der.dtm
  der.nbr
  der.cde
  der.dte

FROM (SELECT udp.cc<http://udp.cc/>
   udp.no<http://udp.no/>
   udp.ber
   udp.tkn
   ,CASE WHEN udp.SYSTEM_DTE>16000000 AND udp.SYSTEM_DTE<100000000 THEN 
udp.SYSTEM_DTE
            WHEN udp.DTE_OUT>16000000 AND udp.DTE_OUT<100000000 THEN udp.DTE_OUT
            WHEN udp.DTE_IN>16000000 AND udp.DTE_IN<100000000 THEN udp.DTE_IN
            ELSE '1231'
            END  AS DTE_OUT
   ,CASE WHEN udp.TME_OUT <> 0 THEN udp.TME_OUT
            WHEN udp.TME_IN <> 0 THEN udp.TME_IN
            ELSE 0
            END AS TME_OUT
            ,TRIM(CAST(TME_OUT  AS CHAR(6))) AS TME_OUT1
    ,CAST(CAST(SUBSTR(TRIM(DTE_OUT),1,8)  AS CHAR(8)) AS DATE FORMAT 
'yyyymmdd')  AS DTE_OUT_O
    ,CASE WHEN TME_OUT>99999 THEN CAST(TME_OUT1 AS CHAR(6))
            WHEN TME_OUT>9999 AND TME_OUT<=99999 THEN CAST('0'||TME_OUT1  AS 
CHAR(6))
            WHEN TME_OUT>999 AND TME_OUT<=9999 THEN CAST('00'||TME_OUT1 AS 
CHAR(6))
            WHEN TME_OUT>99 AND TME_OUT<=999 THEN CAST('000'||TME_OUT1 AS 
CHAR(6))
            WHEN TME_OUT>9 AND TME_OUT<=99 THEN CAST('0000'||TME_OUT1 AS 
CHAR(6))
            WHEN TME_OUT>0 AND TME_OUT<=9 THEN CAST('00000'||TME_OUT1 AS 
CHAR(6))
            WHEN TME_OUT=0 THEN '000000'
            END AS TME_OUT2
            
,SUBSTR(TRIM(TME_OUT2),1,2)||':'||SUBSTR(TRIM(TME_OUT2),3,2)||':'||SUBSTR(TRIM(TME_OUT2),5,2)
 AS
            TME_OUT_O
            , CAST( DTE_OUT_O||' '||TME_OUT_O AS TIMESTAMP FORMAT 'YYYYMMDD:HH: 
MI:SS')     AS DTM
            ,udp.nbr  AS  nbr

   FROM   STS_GNCAUDP   udp

            INNER JOIN LOAD_LOG LZ_LL  ON udp.LOG_KEY=LZ_LL.LOG_KEY
            INNER JOIN ESA_LOAD_LOG ESA_LL  ON 
LZ_LL.ESA_LOAD_LOG_KEY=ESA_LL.LOG_KEY
            AND ESA_LL.PBLSH_IND='$PBLSH_IND'
            AND ESA_LL.LOAD_END_DTM ='$HIGH_DATE_TIME'
            AND ESA_LL.SOR_CD= '$CLM_SOR_CD'
            AND ESA_LL.SUBJ_AREA_NM= '$SUBJ_AREA_NM'
            AND ESA_LL.WORK_FLOW_NM= '$WORK_FLOW_NM'
            QUALIFY ROW_NUMBER()  OVER  (PARTITION BY udp.cc,udp.pst_no,
            udp.cde,udp.nbr,udp.dte,udp.LOG_KEY
            ORDER BY DTM DESC)=1) AS der
;



Thanks in advance!
Dti






Reply via email to