1. Hive does not support "FROM a, b" kind of join. You can use "FROM a
JOIN b ON ..".

2. You might want to use the following simpler (and faster) query:

SELECT report_name, m.col1 as report_run_date, m.col2 as user_name
FROM (
  SELECT  report_name, max(struct(report_run_date, user_name)) as m
  FROM report_history
) tmp;

The struct udf is added recently. It's only in trunk.
Zheng

On Tue, Feb 23, 2010 at 1:14 PM, Tom Nichols <[email protected]> wrote:
> Really?  It looked to me like Hive was complaining about the comma at
> the end of "from report_history rh,"
>
> SELECT rh.user_name, rh.report_name, rh.report_run_date
> FROM report_history rh,
>   (SELECT max(report_run_date) as maxdate, report_name
>     FROM report_history
>     GROUP BY report_name) maxresults
> WHERE rh.report_name = maxresults.report_name
> AND rh.report_run_date= maxresults.maxdate;
>
> FAILED: Parse Error: line 2:20 mismatched input ',' expecting EOF
>
> I was able to perform the same task by using an intermediate table for
> the inner query but that obviously complicates things and I'd suspect
> makes it take longer.
>
> Thanks again for the help!
> -Tom
>
>
> On Tue, Feb 23, 2010 at 4:00 PM, Carl Steinbach <[email protected]> wrote:
>> Tom,
>>
>> The example in the link will compile and execute on Hive without
>> modification.
>>
>> Carl
>>
>> On Tue, Feb 23, 2010 at 12:46 PM, Tom Nichols <[email protected]> wrote:
>>>
>>> Hi Hive users,
>>>
>>> I've got a somewhat convoluted query which I'm wondering how I would
>>> translate it to hive...  It is similar to the first FAQ example here:
>>> http://www.techonthenet.com/sql/max.php
>>> So following the example above, I could easily get the last run date
>>> for each report, but how would I get the _user_ who last ran the
>>> report?  It involves a join based on a subquery.  Conceptually this
>>> seems like it wouldn't be difficult because I can carry all of the
>>> columns while searching for a 'max' of one column in a M-R job...  it
>>> just doesn't express itself in SQL quite the same way.
>>>
>>> Is there a way to put the result of my inner select directly into
>>> another Hive table and then my outer select would just join on that
>>> table?
>>>
>>> Thanks.
>>
>>
>



-- 
Yours,
Zheng

Reply via email to