Taewoo Kim has posted comments on this change.

Change subject: Fulltext search initial implementation
......................................................................


Patch Set 5:

(1 comment)

https://asterix-gerrit.ics.uci.edu/#/c/989/5/asterixdb/asterix-doc/src/site/markdown/aql/manual.md
File asterixdb/asterix-doc/src/site/markdown/aql/manual.md:

Line 720: `rtree` for spatial data, and `keyword`, `ngram`, and `fulltext` for 
textual (string) data.
> what is the difference between `keyword` and `fulltext`?
The keyword index is a length-partitioned index, while the full-text index is 
a single-partition index. For a length-partitioned index, we first group the 
indexed field values by length (= number of tokens), then tokenize each value 
and store its tokens. So, an entry would look like [7][president][PK1]. Here, 
7 is the number of tokens in the indexed field of the record whose primary key 
(PK) is PK1, and "president" is a word token. The length is needed to compute 
similarity quickly, since the similarity check has a lower and an upper bound 
on the token count. For full-text search, this is not required. In fact, the 
"partition" feature should be disabled. So, for the full-text index, we just 
store [president][PK1].
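
To make the two layouts concrete, here is a minimal Python sketch. It is not the actual AsterixDB code: the function names, the whitespace tokenizer, and the tuple representation are all hypothetical. The bounds function assumes Jaccard similarity, which is only one possible reading of the "lower and higher bound" mentioned above.

```python
import math


def tokenize(text):
    # Hypothetical word tokenizer: lowercase, then split on whitespace.
    return text.lower().split()


def keyword_entries(pk, text):
    # Length-partitioned layout: (token_count, token, pk).
    tokens = tokenize(text)
    return [(len(tokens), tok, pk) for tok in tokens]


def fulltext_entries(pk, text):
    # Single-partition layout: (token, pk), with no length prefix.
    return [(tok, pk) for tok in tokenize(text)]


def jaccard_length_bounds(query_len, threshold):
    # Assumed similarity: Jaccard. A candidate with n tokens can reach
    # Jaccard >= threshold only if
    #   ceil(threshold * query_len) <= n <= floor(query_len / threshold),
    # so only those length partitions need to be searched.
    low = math.ceil(threshold * query_len)
    high = math.floor(query_len / threshold)
    return low, high


field = "the president of the united states spoke"  # 7 word tokens
kw = keyword_entries("PK1", field)    # contains (7, "president", "PK1")
ft = fulltext_entries("PK1", field)   # contains ("president", "PK1")
```

With the length stored in front, a similarity query can skip every partition outside the computed bounds. A full-text query has no such bounds, so the length prefix would only get in the way.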


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/989
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I71887c2ea847e4488f4c98a11f8a5bcad02cac5a
Gerrit-PatchSet: 5
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Taewoo Kim <[email protected]>
Gerrit-Reviewer: Heri Ramampiaro <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Michael Blow <[email protected]>
Gerrit-Reviewer: Taewoo Kim <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-HasComments: Yes
