[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896273#comment-15896273
 ] 

Wenhai commented on ASTERIXDB-1813:
-----------------------------------

Refer to patch https://asterix-gerrit.ics.uci.edu/#/c/1076/.

> similarity-jaccard-prefix() issue
> ---------------------------------
>
>                 Key: ASTERIXDB-1813
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1813
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Wenhai
>
> For the following two records, similarity-jaccard-prefix() does not generate 
> the correct result. To see the difference, switch which of the two "where" 
> lines in the query below is commented out (the +skip-index one or the 
> +indexnl one). To reproduce this, you need to enable the fuzzy join rule. 
> The problem does not occur on master yet. This bug needs to be fixed before 
> the fuzzy join rule is enabled. 
> {code}
> drop dataverse test if exists;
> create dataverse test;
> use dataverse test;
> create type DBLPType as open {
>   id: uuid
> }
> create dataset AmazonReviewNoDup(DBLPType)
>   primary key id;
> create index AmazonReviewNoDup_summary_b_idx
> on AmazonReviewNoDup(summary:string?) type btree enforced;
> create index AmazonReviewNoDup_summary_kw_idx
> on AmazonReviewNoDup(summary:string?) type keyword enforced;
> insert into dataset AmazonReviewNoDup(
> { "id": uuid("83208a78-7007-8d77-935b-d9127e4cc9dc"), "summary": "Clear, Concise, and fun!" }
> );
> insert into dataset AmazonReviewNoDup(
> { "id": uuid("83208a78-7007-8d77-935b-d9127e4cc9dd"), "summary": "Clear, Concise, and Charitable" }
> );
> for $o in dataset AmazonReviewNoDup
> for $i in dataset AmazonReviewNoDup
> //where /* +indexnl */ similarity-jaccard(word-tokens($o.summary), word-tokens($i.summary)) >= 0.6
> where /* +skip-index */ similarity-jaccard(word-tokens($o.summary), word-tokens($i.summary)) >= 0.6
> and $o.id < $i.id
> return {"oid":$o.id, "iid":$i.id};
> {code}
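
For reference, the expected similarity for the two records quoted above can be checked by hand. The following is a minimal Python sketch, not AsterixDB code. It assumes that word-tokens() lowercases its input and splits on whitespace and punctuation; that tokenizer behavior is an assumption and is not stated in this issue. The helper names word_tokens and jaccard are hypothetical stand-ins.

```python
import re


def word_tokens(text):
    # Hypothetical stand-in for AQL word-tokens(): lowercase the string and
    # split on any run of non-alphanumeric characters (assumed behavior).
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def jaccard(a, b):
    # Set-based Jaccard similarity: |A intersect B| / |A union B|.
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0  # convention chosen for this sketch only
    return len(sa & sb) / len(sa | sb)


left = word_tokens("Clear, Concise, and fun!")
right = word_tokens("Clear, Concise, and Charitable")
score = jaccard(left, right)
print(left, right, score, score >= 0.6)
```

Under these assumptions the two summaries share three tokens (clear, concise, and) out of five distinct tokens, so the similarity is 3/5 = 0.6 and the pair sits exactly on the >= 0.6 threshold used in the query.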



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
