hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Mohit Durgapal
I have a Hive table partitioned by date. It contains e-commerce data in the format siteid,sitecatid,catid,subcatgid,pid,pname,pprice,pmrp,pdesc. What I need to do is run a query on the table above in Hive for the top 10 products (count-wise) in each subcategory. What adds a bit more complexity is

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Nitin Pawar
Maybe you can share your table DDL, your query, and what output you are looking for. On Fri, Apr 11, 2014 at 12:26 PM, Mohit Durgapal durgapalmo...@gmail.com wrote: I have a hive table partitioned by dates. It contains ecomm data in the format

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Mohit Durgapal
Hi Nitin, The DDL is as follows: CREATE EXTERNAL TABLE user_logs( users_iduuid string, siteid int, site_catid int, stext string, catg int, // CATEGORY scatg int, // SUBCATEGORY catgname string, scatgname string, brand string, // PRODUCT BRAND NAME prrange

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Nitin Pawar
Would it be a good idea to just get the top 10 ranked products by whatever your ranking is based on, and then join that with its metadata (self join or any other way)? On Fri, Apr 11, 2014 at 1:52 PM, Mohit Durgapal durgapalmo...@gmail.com wrote: Hi Nitin, The ddl is as follows: CREATE EXTERNAL

Reducer won't finish - window function

2014-04-11 Thread Juraj jiv
Hello, I may need your help with one query. It always ends with a reducer timeout in YARN. I tried increasing the timeout to 30 min, but that is still not enough and the progress is not moving at all. Here is the query: INSERT INTO TABLE TMP2 SELECT a.rn ,MAX( a.date_report_end) over ( PARTITION BY a.field1

Re: hive query to select top 10 product of each subcategory and select most recent product info

2014-04-11 Thread Adrian Hains
I think you need to separate out the logic that does your group by aggregations from the logic of then retrieving all of the other columns for a single row from that set. Something like: select tbl.myKeyColumn1, tbl.myKeyColumn2, tbl.otherValueColumn1,
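
The approach suggested in this thread (aggregate and rank first, then join back to fetch the remaining columns of the most recent row) can be tried out locally. The following is only a minimal sketch, not the query from the truncated message: it uses Python's standard sqlite3 module (assuming an SQLite build with window-function support) in place of Hive, and the table is a hypothetical, cut-down stand-in for user_logs with assumed column names (dt, scatg, pid, pname, pprice). The SQL itself relies on ROW_NUMBER() OVER (PARTITION BY ...), which Hive also offers in versions that include windowing functions; in Hive the WITH clauses might need to be written as nested subqueries, depending on the version.

```python
import sqlite3

# Hypothetical, simplified stand-in for the poster's table. Column names are
# assumptions for illustration; the real DDL in the thread is truncated.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_logs "
    "(dt TEXT, scatg INTEGER, pid INTEGER, pname TEXT, pprice REAL)"
)
rows = [
    ("2014-04-01", 1, 100, "Phone A", 199.0),
    ("2014-04-02", 1, 100, "Phone A v2", 189.0),
    ("2014-04-03", 1, 100, "Phone A v2", 179.0),
    ("2014-04-01", 1, 101, "Phone B", 299.0),
    ("2014-04-02", 1, 101, "Phone B", 289.0),
    ("2014-04-01", 1, 102, "Phone C", 99.0),
    ("2014-04-01", 2, 200, "Shirt A", 20.0),
    ("2014-04-04", 2, 200, "Shirt A (new)", 18.0),
    ("2014-04-02", 2, 201, "Shirt B", 25.0),
]
conn.executemany("INSERT INTO user_logs VALUES (?, ?, ?, ?, ?)", rows)

TOP_N = 2  # the thread asks for 10; 2 keeps the sample data small

QUERY = """
WITH counts AS (
    -- Step 1: aggregate only; one row per (subcategory, product).
    SELECT scatg, pid, COUNT(*) AS cnt, MAX(dt) AS last_dt
    FROM user_logs
    GROUP BY scatg, pid
),
ranked AS (
    -- Step 2: rank products inside each subcategory by their count.
    SELECT scatg, pid, cnt, last_dt,
           ROW_NUMBER() OVER (
               PARTITION BY scatg ORDER BY cnt DESC, pid
           ) AS rn
    FROM counts
)
-- Step 3: join back to pick up the other columns of the most recent row.
-- If a product has several rows on its latest date, this returns all of them.
SELECT r.scatg, r.pid, r.cnt, l.pname, l.pprice
FROM ranked r
JOIN user_logs l
  ON l.scatg = r.scatg AND l.pid = r.pid AND l.dt = r.last_dt
WHERE r.rn <= ?
ORDER BY r.scatg, r.rn
"""

top_products = conn.execute(QUERY, (TOP_N,)).fetchall()
for row in top_products:
    print(row)
```

Keeping the aggregation in its own step means the GROUP BY only has to carry the key columns, and the descriptive columns are looked up afterwards for the handful of rows that survive the ranking.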

Re: Hive jdbc access to HiveServer2. How to debug?

2014-04-11 Thread Thejas Nair
This hanging issue has been seen when there is a mismatch between the auth modes of the HS2 server and the client (usually SASL vs NOSASL). On Sun, Apr 6, 2014 at 5:40 AM, Jay Vyas jayunit...@gmail.com wrote: Hi hive. I can't run JDBC queries against HiveServer2. It appears the client is connecting