Hello, 

I am trying to use Kylin to perform some complex query. For example, I want
to get the moving distinct count of buyers in the sample using following
query in web console:

SELECT
    t.part_dt,
    (SELECT 
        count(distinct buyer_id)
    FROM
        KYLIN_SALES
    WHERE part_dt BETWEEN TIMESTAMPADD(day, -60, t.part_dt) AND t.part_dt
    ) as total
FROM (
    SELECT part_dt
    FROM KYLIN_SALES
    WHERE part_dt BETWEEN DATE '2012-12-01' AND DATE '2013-02-01'
    GROUP BY part_dt
) t

I got following message:
null while executing SQL: "SELECT t.part_dt, (SELECT count(distinct
buyer_id) FROM KYLIN_SALES WHERE part_dt BETWEEN TIMESTAMPADD(day, -60,
t.part_dt) AND t.part_dt ) as total FROM ( SELECT part_dt FROM KYLIN_SALES
WHERE part_dt BETWEEN DATE '2012-12-01' AND DATE '2013-02-01' GROUP BY
part_dt ) t LIMIT 50000"

Environment: 
Centos 7 
Hadoop on CDH-5.8 
dedicated Kafka-2.1 (not included in CDH) 

Such query seems to work on MySQL and Postgres (with modification of time
related function), as shown in http://sqlfiddle.com/#!9/6b1307/1/0
http://sqlfiddle.com/#!17/b968f/3/0
I also tested on MySQL with data loaded and got expected data.

Is this behavior expected? If so, is there any suggested method to generate
such statistic data? (there may be some gap in data, so window of fixed rows
may be not suitable in my use case)

Lifan Su

--
Sent from: http://apache-kylin.74782.x6.nabble.com/

Reply via email to