Here is my stab at it. I have not tested it, but it should get you started.
The following points are important:
1. I added a WHERE clause in the subquery to limit the data set by any
partition you may have.
2. You have to write a collect UDF to use it. See the class GenericUDAFCollect
in Chapter 13, "Functions", of Wampler/Capriolo's book.
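SELECT
  pcw.page_url,
  pcw.token,
  -- collect is the custom UDAF from point 2; it must be registered first
  collect(concat_ws('|', pcw.original_category, CAST(pcw.weight AS STRING)))
FROM
  (SELECT
     page_url,
     token,
     original_category,
     weight
   FROM
     media_visit_info
   WHERE
     partition_column = 'partition_col_val'
   -- every selected column has to appear in the GROUP BY
   GROUP BY
     page_url,
     token,
     original_category,
     weight
  ) pcw
-- one output row per (page_url, token)
GROUP BY
  pcw.page_url,
  pcw.token
LIMIT 10
;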
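If writing a custom UDAF is not an option, a rough alternative is Hive's built-in collect_set. This is an untested sketch under a few assumptions: it reuses the table and column names from the query above, partition_column and 'partition_col_val' are still placeholders, and weight is assumed to be numeric (hence the CAST, which is harmless if it is already a string). collect_set drops duplicate values and does not guarantee any order, which should be acceptable here because each category/weight pair is only needed once per page.

```sql
-- Sketch only: placeholder partition filter, names taken from the query above.
SELECT
  page_url,
  token,
  -- built-in aggregate: returns an array of the distinct 'category|weight' strings
  collect_set(concat_ws('|', original_category, CAST(weight AS STRING))) AS categorys
FROM
  media_visit_info
WHERE
  partition_column = 'partition_col_val'
GROUP BY
  page_url,
  token
LIMIT 10
;
```

Because collect_set already removes duplicates, the inner GROUP BY subquery is not needed in this variant.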
From: ch huang <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, August 19, 2013 2:04 AM
To: "[email protected]" <[email protected]>
Subject: question about hive SQL
Hi all,
I am not very familiar with HQL. My problem is that I now have two queries:
Q1: select page_url, original_category, token from media_visit_info group by
page_url, original_category, token limit 10
Q2: select original_category as code, weight from media_visit_info where
page_url='X' group by original_category, weight;
Each page_url value from Q1 should be passed to the WHERE condition of Q2, and
the two query results should be combined like this:
{
url:http\\:www.baidu.com,
category:|CN10,
token:20,
categorys:
[
{code:|CN10-1-1,weight:0.5},
{code:|CN11-2-2,weight:0.1},
{code:|CN10-1-3,weight:0.02}
]
}
I do not know whether this can be written as one query (JOIN + SUBQUERY?). Can anyone help?