[
https://issues.apache.org/jira/browse/KYLIN-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355493#comment-17355493
]
star_dev commented on KYLIN-5000:
---------------------------------
[~zhangyaqian]
1. 分页形式独立请求
一次执行
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH'
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 0
一次执行
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH'
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 1000
结果有重复,如图
!image-2021-06-02-13-31-44-052.png!
2. union 形式请求
select COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID, kkk from (
select * from (SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk
FROM MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH'
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 0)
UNION
SELECT * from (
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH'
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 1000)
UNION
SELECT * from (
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH'
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 2000)
UNION
SELECT * from (
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH'
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 3000)
) tmp
结果无重复,如图
!image-2021-06-02-13-33-02-400.png!
3. [^0-1000.csv]
> kylin query分页多次请求间返回结果有重复
> -------------------------
>
> Key: KYLIN-5000
> URL: https://issues.apache.org/jira/browse/KYLIN-5000
> Project: Kylin
> Issue Type: Task
> Components: Query Engine
> Affects Versions: v3.1.1
> Reporter: star_dev
> Priority: Major
> Attachments: 0-1000.csv, 0-4000.csv, 1000-2000.csv,
> image-2021-06-02-13-31-44-052.png, image-2021-06-02-13-33-02-400.png
>
>
> 请以分页形式分多次请求kylin查询时,如第一次
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 0
> 第二次
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 1000
> 两次查询的结果有重复数据
>
> 但当执行
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 0
> union
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 1000
> 时,返回了2000条不重复的数据
>
> 问题:
> # 当以分页形式分别独立请求kylin查询时,为何不同的请求间会有重复数据?
> # 当以union 形式请求kylin查询时,能按照需求返回不重复的满足条数的数据,和1有什么不同吗?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)