Hey nicole ,
your query is working fine in mysql but when it comes to hive then it
just fails as it is

hive> select users_info.country,count(1) from (select userid from
users_audit
    > where SUBSTR(users_audit.logtime,1,17)>='1971-06-27 13:00' and
SUBSTR(users_audit.logtime,1,17)<='2010-06-27 14:00')
    > users_audit inner  join users_info on
(users_info.id=users_audit.userid)
    > group by users_info.country order by count(1) desc;
FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
        at 
org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:214)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:684)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:805)
        at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:161)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7506)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:7464)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:4562)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5936)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6524)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7282)
        at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

can u just tell me why this is happening :)



On Wed, Jun 27, 2012 at 11:32 PM, Gesli, Nicole
<nicole.ge...@memorylane.com> wrote:
> Soham, I think you need something like this:
>
> SELECT ui.country, COUNT(1)
> FROM ( SELECT DISTINCT userid
>        FROM   users_audit
>        WHERE  SUBSTR(ua.logtime, 1, 17) >= '2012-06-27 13:00'
>        AND    SUBSTR(ua.logtime, 1, 17) <= '2012-06-27 14:00'
>      ) ua
>        JOIN
>        users_info ui ON (ui.id = ua.userid)
> GROUP BY
>        ui.country;
>
> -Nicole
>
> From: Nitin Pawar <nitinpawar...@gmail.com>
> Reply-To: <user@hive.apache.org>
> Date: Wed, 27 Jun 2012 15:22:23 +0530
>
> To: <user@hive.apache.org>
> Subject: Re: date datatype in hive
>
> soham,
>
> the reason u r getting the error is there may be multiple names associated
> with one country
> so when you do  a group by there is no unique output to generate
>
> so group by country, name will give you unique results
>
> same is the case with select *
>
> in case you want to sort the output you may try with order by country
> instead of group by
> if you want to omit duplicates you can use distinct
>
> so try something like
>
> select count(*), country from users_info group by country
> select count(*) from users_info group by country
> select distinct name, country from users_info order by country
> select name, county from users_info group by country, name
>
> all above should work unless i made a typo error :)
>
>
>
> On Wed, Jun 27, 2012 at 3:11 PM, Soham Sardar <sohamsardart...@gmail.com>
> wrote:
>>
>> Thanks Bejoy and Nitin for replyin
>> well now i got that thing ryt but then i have serious issues
>>
>> in hive querying like some of such queries are
>>
>> 1)  select * from users_info group by country;
>>
>> and this is giving me only the countries grouped alphabetically and
>> not any other information
>>
>> 2) select name,country from users_info group by country;
>>
>> and this gives me an error
>> FAILED: Error in semantic analysis: Line 1:7 Expression not in GROUP
>> BY key 'name'
>>
>> can someone help me out with this :)
>>
>>
>>
>>
>> On Wed, Jun 27, 2012 at 1:39 PM, Nitin Pawar <nitinpawar...@gmail.com>
>> wrote:
>> > soham,
>> >
>> > in your query
>> > hive> select name from users_info  group by country;
>> > FAILED: Error in semantic analysis: Line 1:7 Expression not in GROUP
>> > BY key 'name'
>> >
>> > when you run group by it needs to be present in the select columns as
>> > well
>> > thats what Bejoy suggested
>> >
>> > On Wed, Jun 27, 2012 at 12:05 PM, Soham Sardar
>> > <sohamsardart...@gmail.com>
>> > wrote:
>> >>
>> >> Hey bejoy thats the problem i am not able to run the group by query in
>> >> hive i dunno whether i m making a mistake or some thing
>> >> see my previoius reply to this same thread i put up the same issue ...
>> >>
>> >>
>> >> On Wed, Jun 27, 2012 at 12:02 PM, Bejoy KS <bejoy...@yahoo.com> wrote:
>> >> > Hi Soham
>> >> >
>> >> > Rewrite your query with the columns in Group By included in Select as
>> >> > well. Something like
>> >> >
>> >> > select country,name from users_info  group by country;
>> >> >
>> >> > Regards
>> >> > Bejoy KS
>> >> >
>> >> > Sent from handheld, please excuse typos.
>> >> >
>> >> > -----Original Message-----
>> >> > From: Soham Sardar <sohamsardart...@gmail.com>
>> >> > Date: Wed, 27 Jun 2012 11:57:23
>> >> > To: <user@hive.apache.org>; Bejoy Ks<bejoy...@yahoo.com>
>> >> > Reply-To: user@hive.apache.org
>> >> > Subject: Re: date datatype in hive
>> >> >
>> >> > And btw does group by works in hive because the same wuery i am
>> >> > running in mysql and its working fine but its failing in hive
>> >> >
>> >> > select name from users_info  group by country;
>> >> >
>> >> > in mysql its working but whn i try to run it in hive its telling
>> >> >
>> >> > hive> select name from users_info  group by country;
>> >> > FAILED: Error in semantic analysis: Line 1:7 Expression not in GROUP
>> >> > BY key 'name'
>> >> >
>> >> > I wanna know why is it failing
>> >> >
>> >> >
>> >> > On Wed, Jun 27, 2012 at 10:19 AM, Soham Sardar
>> >> > <sohamsardart...@gmail.com> wrote:
>> >> >> See Bejoy and Everyone ,
>> >> >> I have two tables
>> >> >> one users_info and one users_audit
>> >> >> in hive ..
>> >> >>
>> >> >> hive> desc users_audit;
>> >> >> OK
>> >> >> id      int
>> >> >> userid  int
>> >> >> logtime string
>> >> >>
>> >> >> hive> desc users_info;
>> >> >> OK
>> >> >> id      int
>> >> >> name    string
>> >> >> age     int
>> >> >> country string
>> >> >> gender  string
>> >> >> bday    string
>> >> >>
>> >> >> now i have given the description of the tables in hive
>> >> >>
>> >> >> my goal is to find:
>> >> >> the maximum number of users loging - in at between a certain time
>> >> >> frame say 1PM to 2PM- belonging to a specific country
>> >> >>
>> >> >> for example
>> >> >> in between 1PM to 2PM there are 10000 users loging into and 5500 are
>> >> >> from africa , then i need to print the names of all the users who
>> >> >> are
>> >> >> logged in between the time stamp and who are from africa ..
>> >> >> the logtime in the users_audit table gives the login time and the
>> >> >> names and country are from the users_info table .
>> >> >>
>> >> >> Can someone help me out with this . I am a new user in hive so would
>> >> >> like some ones help !!!!!!
>> >> >> Thanks in advance!!  :)
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Tue, Jun 26, 2012 at 6:53 PM, Bejoy Ks <bejoy...@yahoo.com>
>> >> >> wrote:
>> >> >>>
>> >> >>>
>> >> >>> Hi Soham
>> >> >>>
>> >> >>> Hive Supports pretty much all the primitive data types including
>> >> >>> INT.
>> >> >>> For a detaild list please refer
>> >> >>>
>> >> >>>
>> >> >>> https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-PrimitiveTypes
>> >> >>>
>> >> >>> The only draw back as in common is when you have the data type as
>> >> >>> String you cannot use it directly on Mathematical functions.
>> >> >>>
>> >> >>> Your requirement can easily be satisfied with a few date functions
>> >> >>> on
>> >> >>> String data itself.
>> >> >>>
>> >> >>> Regards
>> >> >>> Bejoy KS
>> >> >>>
>> >> >>>
>> >> >>> ________________________________
>> >> >>> From: Soham Sardar <sohamsardart...@gmail.com>
>> >> >>> To: user@hive.apache.org; user <u...@sqoop.apache.org>
>> >> >>> Sent: Tuesday, June 26, 2012 2:23 PM
>> >> >>> Subject: date datatype in hive
>> >> >>>
>> >> >>> I have a native data type in mysql and i just imported it into hive
>> >> >>> and the data type of the column has now become string ..
>> >> >>> Now i would like to know if there is any native data type in hive
>> >> >>> and
>> >> >>> What are the pros and cons of using string type in hive rather than
>> >> >>> (int)(thats what i expect ) type
>> >> >>>
>> >> >>> And with the string type can i just run any type of queries like
>> >> >>> print the names of the person which has the maximum number of users
>> >> >>> between 1 PM and 3 PM
>> >> >>>
>> >> >>> and the table should have just name int(5) and login time (datetime
>> >> >>> )
>> >> >>> {as per the mysql syntax}
>> >
>> >
>> >
>> >
>> > --
>> > Nitin Pawar
>> >
>
>
>
>
> --
> Nitin Pawar
>

Reply via email to