Jianfeng Jia has posted comments on this change.
Change subject: [ASTERIXDB-1946][STO][IDX]Create RTree/InvertedIdx for
Correlated Datasets
......................................................................
Patch Set 6:
@Chenluo, I don't have much to say about this patch on the code side. I want
to share some findings from using this patch on my test data.
1. Too many components?
There are far more secondary index components than the prefix policy
generated. In one partition I have 557 inverted index components (versus 282
primary index components), and half of them are very tiny (e.g., 2M).
Previously we only had 60 inverted indexes.
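To put those counts in proportion, here is a small sketch of the arithmetic. The variable names are mine, and it assumes the earlier figure of 60 refers to inverted index components in the same partition under the prefix policy.

```python
# Component counts reported above for one partition.
inverted_correlated = 557  # inverted index components with this patch
primary = 282              # primary index components
inverted_prefix = 60       # earlier count (assumed: prefix policy, same partition)


def ratio(a, b):
    """Return a / b rounded to two decimals."""
    return round(a / b, 2)


# Inverted index components per primary index component.
secondary_per_primary = ratio(inverted_correlated, primary)
# Growth in inverted index components relative to the earlier count.
growth_vs_prefix = ratio(inverted_correlated, inverted_prefix)

print(secondary_per_primary, growth_vs_prefix)
```

So there are roughly two inverted index components per primary component, and about nine times as many as before.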
2. Performance is a bit slower than with the prefix policy (?)
I ran a simple count test for tweets containing "election" and "happy".
            prefix    correlated
election    88s       93s
happy       190s      241.351s
The performance is slower than with the prefix policy, which contradicts our
conjecture. Maybe it is also related to having too many secondary components?
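As a rough sketch of the size of the gap, the relative slowdown can be computed from the timings in the table above. The helper name is hypothetical and the numbers are only the ones reported here.

```python
# Query times in seconds from the table above: (prefix, correlated).
timings = {
    "election": (88.0, 93.0),
    "happy": (190.0, 241.351),
}


def slowdown_percent(prefix_s, correlated_s):
    """Percentage by which the correlated run is slower than the prefix run."""
    return (correlated_s - prefix_s) / prefix_s * 100.0


# Slowdown per keyword, rounded to one decimal place.
slowdowns = {
    keyword: round(slowdown_percent(prefix_s, correlated_s), 1)
    for keyword, (prefix_s, correlated_s) in timings.items()
}

print(slowdowns)
```

That is about 6% slower for "election" and about 27% slower for "happy".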
This is not an objection to this patch. I think we still need to merge it to
complete the "correlated" policy. After that, we need more performance tests
and deeper analysis to improve it in future patches. :-)
--
To view, visit https://asterix-gerrit.ics.uci.edu/1845
Gerrit-MessageType: comment
Gerrit-Change-Id: I100fc0b86b8a6fa36a95d77806107bad0307544e
Gerrit-PatchSet: 6
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Luo Chen <[email protected]>
Gerrit-Reviewer: Ian Maxon <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Luo Chen <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-Reviewer: abdullah alamoudi <[email protected]>
Gerrit-HasComments: No