[ 
https://issues.apache.org/jira/browse/KYLIN-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-3335:
----------------------------------
    Description: 
Currently it's painful to search for the jobs related to a cube or project,
since that information is hidden in the values. In particular, when users want
to list the jobs under one project within a given period, the current design
requires all of the job output info to be read into memory. If this kind of
operation is done very often within a short period, it's easy to hit an OOM.

If the job id is prefixed with project and cube names, then we can push down 
prefix filters, which is efficient and safe.
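
To illustrate why a prefixed job id helps, here is a minimal sketch, not Kylin
code. It assumes a hypothetical id layout of "project/cube/uuid" and models the
store as a sorted list of keys; the names make_job_id and prefix_scan are made
up for this example. Because matching keys sit next to each other in sorted
order, the scan can stop at the first non-matching key and never needs to read
the values of other projects.

```python
import bisect


def make_job_id(project, cube, uuid):
    # Hypothetical layout: the separator keeps "sales/" from matching "sales2/".
    return f"{project}/{cube}/{uuid}"


def prefix_scan(sorted_ids, prefix):
    """Return all ids starting with prefix from an already sorted list."""
    # Every key that starts with the prefix compares >= the prefix itself,
    # so the matching range begins at the bisect position.
    start = bisect.bisect_left(sorted_ids, prefix)
    matches = []
    for job_id in sorted_ids[start:]:
        if not job_id.startswith(prefix):
            break  # keys are sorted, so nothing further can match
        matches.append(job_id)
    return matches


job_ids = sorted([
    make_job_id("sales", "orders", "a1"),
    make_job_id("sales", "returns", "b2"),
    make_job_id("sales2", "orders", "c3"),
    make_job_id("hr", "staff", "d4"),
])

print(prefix_scan(job_ids, "sales/"))         # all jobs of project "sales"
print(prefix_scan(job_ids, "sales/orders/"))  # jobs of one cube only
```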

This kind of change will cause a backward compatibility issue. There are two
ways to deal with this:
* Set a milestone with a time tag. If a search only relates to data later than
this time, then just one scan with the prefix filter is needed. Otherwise, two
scans are needed: one with the prefix filter and the other using the current
strategy. As time goes on, old job info will be deleted. Once there's no data
older than the milestone, only one scan is needed.
* Migrate the old data once.
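
The milestone option could look roughly like the sketch below. It is only an
illustration under stated assumptions: jobs created at or after the milestone
carry a "project/" prefix in their id, older jobs keep unprefixed ids, and the
store is modelled as a plain dict whose values hold hypothetical "project" and
"created" fields. The function name list_jobs and all parameters are invented
for this example.

```python
def list_jobs(store, project, start, end, milestone):
    """List job ids of one project created in [start, end).

    Returns the sorted ids and the number of scans that were needed.
    """
    prefix = project + "/"
    # Scan 1: filter on the key prefix. A real key-value store would run this
    # as a key-range scan; the dict loop here only imitates the result.
    hits = [job_id for job_id in store
            if job_id.startswith(prefix)
            and start <= store[job_id]["created"] < end]
    scans = 1
    if start < milestone:
        # Scan 2 (current strategy): the range reaches back before the
        # milestone, so old entries must be read and filtered by value.
        scans = 2
        hits += [job_id for job_id, info in store.items()
                 if info["created"] < milestone
                 and info["project"] == project
                 and start <= info["created"] < end]
    return sorted(hits), scans
```

Once every job older than the milestone has been deleted, the second branch can
be dropped entirely, which is the end state the first option aims for.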

  was:
Currently it's painful to search cube or project related jobs, since those 
infos are hidden in values. Especially, when users want to list job in a period 
under one project, by current design, all of the job output info have to be 
read into memory. If this kind of operation is very often with a short period, 
it's easy to get OOM.

If the job id is prefixed with project and cube names, then we can push down 
prefix filters, which is efficient and safe.

This kind of change will cause backward compatibility issue. There're two ways 
to deal with this:
* Set a milestone with time tag, if a search relates to data earlier than this 
time, then just need to do one scan with prefix filter. Otherwise, two scans 
are needed. One with prefix filter and the other use current strategy. As time 
goes on, old job infos will be deleted. Once there's no data older than the 
time, only one scan is needed.
* Do migration for the old data once.


> Add project & cube related info to the job id for better filtering
> ------------------------------------------------------------------
>
>                 Key: KYLIN-3335
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3335
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: Zhong Yanghong
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
