The advice I have always seen for your case is to transform the subquery in the WHERE clause into a LEFT OUTER JOIN.
On Thu, Jul 25, 2013 at 11:04 AM, Edson Ramiro <erlfi...@gmail.com> wrote: > AFAIK, > > Hive supports subqueries only in the FROM clause. > > Maybe you have to split you query into more queries... > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries > > > > > Edson Ramiro > > > On Thu, Jul 25, 2013 at 9:31 AM, Jérôme Verdier < > verdier.jerom...@gmail.com> wrote: > >> Hi Bennie, >> >> I was trying some solutions to pass through my problem, and a problem >> occurs >> >> here is the error : >> >> FAILED: ParseException line 26:14 cannot recognize input near 'SELECT' >> 'cal' '.' in expression specification >> >> Is AND...BETWEEN ( SELECT..... is possible in Hive? >> >> >> 2013/7/25 Bennie Schut <bsc...@ebuddy.com> >> >>> Hi Jerome, >>> >>> Yes it looks like you could stop using GET_SEMAINE and directly joining >>> "calendrier_hebdo" with "calendrier" for example. For "FCALC_IDJOUR" you >>> will have to make a udf so I hope you have some java skills :) >>> The "calendrier" tables suggests you have star schema with a calendar >>> table. If on oracle you partitioned on a date and use a subquery to get the >>> dates you want from the fact table you can expect this to be a problem in >>> hive. Partition pruning works during planning it will not know which >>> partitioned to prune and thus run on all the data in the fact table and >>> filter after it's done making partitioning useless. There are ways of >>> working around this, it seems most people decide to use a "deterministic" >>> udf which produces the dates and this causes the udfs to be run during >>> planning and partition pruning magically works again. >>> Hope this helps. >>> >>> Bennie. >>> >>> Op 25-7-2013 09:50, Jérôme Verdier schreef: >>> >>> Hi, >>> >>> I need some help to translate a PL/SQL script in HiveQL. >>> >>> Problem : my PL/SQL script is calling two functions. >>> >>> you can see the script below : >>> >>> SELECT >>> in_co_societe as co_societe, >>> 'SEMAINE' as co_type_periode, >>> a.type_entite as type_entite, >>> a.code_entite as code_entite, >>> a.type_rgrp_produits as type_rgrp_produits, >>> a.co_rgrp_produits as co_rgrp_produits, >>> SUM(a.MT_CA_NET_TTC) as MT_CA_NET_TTC, >>> SUM(a.MT_OBJ_CA_NET_TTC) as MT_OBJ_CA_NET_TTC, >>> SUM(a.NB_CLIENTS) as NB_CLIENTS, >>> SUM(a.MT_CA_NET_TTC_COMP) as MT_CA_NET_TTC_COMP, >>> SUM(a.MT_OBJ_CA_NET_TTC_COMP) as >>> MT_OBJ_CA_NET_TTC_COMP, >>> SUM(a.NB_CLIENTS_COMP) as NB_CLIENTS_COMP >>> from >>> kpi.thm_ca_rgrp_produits_jour/*@o_bi.match.eu*/ a >>> WHERE >>> a.co_societe = in_co_societe >>> AND a.dt_jour between >>> ( >>> SELECT >>> cal.dt_jour_deb >>> FROM ods.calendrier_hebdo cal >>> WHERE cal.co_societe = in_co_societe >>> AND cal.co_an_semaine = ods.package_date.get_semaine( >>> ods.package_date.fcalc_idjour( >>> CASE >>> WHEN TO_CHAR(D_Dernier_Jour,'YYYY') = >>> TO_CHAR(D_Dernier_Jour-364,'YYYY') THEN >>> NEXT_DAY(D_Dernier_Jour-364,1)-7 >>> ELSE >>> D_Dernier_Jour-364 >>> END >>> ) >>> ) >>> ) >>> AND D_Dernier_Jour-364 >>> -- On ne calcule rien si la semaine est compl¿¿te >>> AND ( >>> TO_CHAR(D_Dernier_Jour,'DDMM') <> '3112' >>> AND TO_CHAR(D_Dernier_Jour,'D') <> '7' >>> ) >>> GROUP BY >>> a.type_entite, >>> a.code_entite, >>> a.type_rgrp_produits, >>> a.co_rgrp_produits; >>> >>> The function ods.package_date.get_semaine is : >>> >>> FUNCTION GET_SEMAINE >>> (ID_DEB IN NUMBER) >>> RETURN NUMBER >>> IS >>> SEMAINE NUMBER(10); >>> BEGIN >>> SELECT CO_AN_SEMAINE >>> INTO SEMAINE >>> FROM CALENDRIER >>> WHERE ID_JOUR = ID_DEB; >>> >>> RETURN (SEMAINE); >>> EXCEPTION >>> WHEN NO_DATA_FOUND THEN >>> RETURN (0); >>> WHEN OTHERS THEN >>> RETURN (0); >>> END; >>> >>> The function ods.package_date.fcalc_idjour is below : >>> >>> FUNCTION FCALC_IDJOUR >>> (DATE_REFERENCE IN DATE) >>> RETURN NUMBER >>> IS >>> NM_ANNEE NUMBER := TO_NUMBER(TO_CHAR(DATE_REFERENCE,'YYYY')); >>> NM_MOIS NUMBER := >>> TO_NUMBER(SUBSTR(TO_CHAR(DATE_REFERENCE,'YYYYMM'),5,2)); >>> NM_JOUR NUMBER := >>> TO_NUMBER(SUBSTR(TO_CHAR(DATE_REFERENCE,'YYYYMMDD'),7,2)); >>> IDJOUR_CALCULE NUMBER := 0; >>> BEGIN >>> IF NM_ANNEE < 1998 >>> OR DATE_REFERENCE IS NULL THEN >>> IDJOUR_CALCULE := 0; >>> ELSE >>> IDJOUR_CALCULE := ((NM_ANNEE - 1998) * 600) + ((NM_MOIS - 01) * >>> 50) + NM_JOUR; >>> END IF; >>> >>> RETURN IDJOUR_CALCULE; >>> DBMS_OUTPUT.PUT_LINE(IDJOUR_CALCULE); >>> END FCALC_IDJOUR; >>> >>> Is it possible to translate this in one HiveQL script ? >>> >>> >>> >> >> >> -- >> *Jérôme VERDIER* >> 06.72.19.17.31 >> verdier.jerom...@gmail.com >> >> > -- https://github.com/bearrito @deepbearrito