Taewoo Kim has uploaded a new change for review.
https://asterix-gerrit.ics.uci.edu/1448
Change subject: Add corner-case handling to NGramUTF8StringBinaryTokenizer
......................................................................
Add corner-case handling to NGramUTF8StringBinaryTokenizer
- When usePrePost is false and the given string is shorter than the
gram length, the tokenizer now returns 0 as the total number of grams.
Change-Id: I5965856b4da018276b37460bed7fb1fc60d8c2f3
---
M hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/NGramUTF8StringBinaryTokenizer.java
1 file changed, 5 insertions(+), 1 deletion(-)
git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/48/1448/1
diff --git a/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/NGramUTF8StringBinaryTokenizer.java b/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/NGramUTF8StringBinaryTokenizer.java
index 4c486c5..7b3af14 100644
--- a/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/NGramUTF8StringBinaryTokenizer.java
+++ b/hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/NGramUTF8StringBinaryTokenizer.java
@@ -110,7 +110,11 @@
         if (usePrePost) {
             totalGrams = numChars + gramLength - 1;
         } else {
-            totalGrams = numChars - gramLength + 1;
+            if (length >= gramLength) {
+                totalGrams = numChars - gramLength + 1;
+            } else {
+                totalGrams = 0;
+            }
         }
     }
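
The arithmetic behind the change can be sketched in isolation. The class and method names below (GramCountSketch, totalGrams) are hypothetical and not part of Hyracks; the sketch only mirrors the two formulas in the hunk above. It also assumes the guard is meant to compare the character count against the gram length, whereas the patch as uploaded compares `length`. Without the guard, a one-character string with gramLength 3 would yield 1 - 3 + 1 = -1 grams.

```java
// Hypothetical, standalone sketch of the gram-count arithmetic.
// Not the Hyracks implementation; names are illustrative only.
final class GramCountSketch {

    // Returns the number of n-grams for a string of numChars characters.
    // Assumption: the guard compares the character count, not a byte length.
    static int totalGrams(int numChars, int gramLength, boolean usePrePost) {
        if (usePrePost) {
            // Formula taken from the hunk for the usePrePost case.
            return numChars + gramLength - 1;
        }
        // Without the guard, short inputs can produce a negative count.
        return numChars >= gramLength ? numChars - gramLength + 1 : 0;
    }

    public static void main(String[] args) {
        String s = "a";
        int numChars = s.codePointCount(0, s.length());
        System.out.println(totalGrams(numChars, 3, false)); // guarded result
        System.out.println(numChars - 3 + 1);               // unguarded formula
        System.out.println(totalGrams(5, 3, false));        // ordinary case
    }
}
```

Returning 0 rather than a negative value lets a caller that loops up to the total gram count simply produce no tokens for inputs that are too short.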
--
To view, visit https://asterix-gerrit.ics.uci.edu/1448
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5965856b4da018276b37460bed7fb1fc60d8c2f3
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Taewoo Kim <[email protected]>