Jianfeng Jia has posted comments on this change.

Change subject: [ASTERIXDB-1946][STO][IDX]Create RTree/InvertedIdx for 
Correlated Datasets
......................................................................


Patch Set 6:

@Chenluo, I don't have much to say about this patch on the code side. I want 
to share some findings after using this patch on my test data.

1. Too many components? 
There are a lot more secondary components than the prefix policy generated. In 
one partition I have 557 inverted index components (there are 282 primary index 
components), and half of them are very tiny (e.g., 2M). Previously we had only 
60 inverted index components. 
2. Performance is a bit slower than the prefix policy. (?)
I ran a simple count test for tweets containing "election" and "happy".
            prefix   correlated
election    88s      93s
happy       190s     241.351s

The correlated policy is slower than the prefix policy, which contradicts our 
conjecture. Maybe it is also related to having too many secondary components?
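
To put the gap in relative terms, here is a quick sketch that only redoes the 
arithmetic on the numbers above. The function and variable names are just for 
illustration, and it assumes the earlier count of 60 refers to the same 
partition.

```python
# Timings in seconds, copied from the table above.
timings = {
    "election": {"prefix": 88.0, "correlated": 93.0},
    "happy": {"prefix": 190.0, "correlated": 241.351},
}


def relative_slowdown(prefix_s, correlated_s):
    """Extra time of the correlated run, as a fraction of the prefix run."""
    return (correlated_s - prefix_s) / prefix_s


slowdowns = {
    keyword: relative_slowdown(t["prefix"], t["correlated"])
    for keyword, t in timings.items()
}

# Component counts for one partition, from item 1 above.
# Assumption: the earlier count of 60 was measured on the same partition.
component_growth = 557 / 60   # inverted index components now vs. previously
per_primary = 557 / 282       # inverted index components per primary component

for keyword, s in slowdowns.items():
    print(f"{keyword}: correlated is {s:.1%} slower than prefix")
print(f"inverted index components grew {component_growth:.1f}x")
print(f"{per_primary:.2f} inverted index components per primary component")
```

So the gap is small for "election" (under 6%) but about 27% for "happy", with 
roughly 9x as many inverted index components as before.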

This is not an objection to this patch. I think we still need to merge it to 
complete the "correlated" policy. After that, we need more performance tests 
and deeper analysis to improve it in future patches. :-)

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1845
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I100fc0b86b8a6fa36a95d77806107bad0307544e
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Luo Chen <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Luo Chen <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: No
