[jira] [Commented] (ASTERIXDB-1880) Unequal number of valid ... exception during a similarity join query

Taewoo Kim (JIRA) Mon, 10 Apr 2017 18:14:57 -0700

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963710#comment-15963710 ]


Taewoo Kim commented on ASTERIXDB-1880:
---------------------------------------

Updated:

Each node has two partitions (A and B). Partition A seems OK: on every node 
it contains exactly four index component files (plus .metadata), like the 
following:

{code}
-rw-r--r-- 1 taewok2 102  26M Mar  1 15:26 2017-03-01-15-24-56-697_2017-03-01-15-24-56-697_b
-rw-r--r-- 1 taewok2 102 257K Mar  1 15:26 2017-03-01-15-24-56-697_2017-03-01-15-24-56-697_d
-rw-r--r-- 1 taewok2 102    0 Mar  1 15:24 2017-03-01-15-24-56-697_2017-03-01-15-24-56-697_f
-rw-r--r-- 1 taewok2 102 340M Mar  1 15:26 2017-03-01-15-24-56-697_2017-03-01-15-24-56-697_i
-rw-r--r-- 1 taewok2 102 3.6K Mar  1 15:16 .metadata
{code}

On partition B, the Bloom filter file (suffix: _f) is missing on all nodes, 
so only three component files are present:

{code}
-rw-r--r-- 1 taewok2 102  26M Mar  1 15:26 2017-03-01-15-24-56-666_2017-03-01-15-24-56-666_b
-rw-r--r-- 1 taewok2 102 257K Mar  1 15:26 2017-03-01-15-24-56-666_2017-03-01-15-24-56-666_d
-rw-r--r-- 1 taewok2 102 340M Mar  1 15:26 2017-03-01-15-24-56-666_2017-03-01-15-24-56-666_i
-rw-r--r-- 1 taewok2 102 3.6K Mar  1 15:16 .metadata
{code}
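
To repeat this check on every node without comparing listings by eye, something like the following sketch could be used. It is only an illustration, not part of AsterixDB: the function name find_incomplete_components is made up here, and it assumes that every component file is named <component>_<suffix> with the four one-letter suffixes seen in the listings above (_b, _d, _f, _i). Only _f is known from this report to be the Bloom filter; the script does not rely on what the other suffixes mean.

```python
import os
import sys
from collections import defaultdict

# Assumption: these are the four suffixes seen in the listings above.
EXPECTED_SUFFIXES = {"b", "d", "f", "i"}


def find_incomplete_components(partition_dir):
    """Return {component_name: sorted list of missing suffixes} for one directory.

    Hypothetical helper: groups files by the part of the name before the
    last underscore and reports groups that lack an expected suffix.
    """
    found = defaultdict(set)
    for name in os.listdir(partition_dir):
        if name.startswith("."):
            continue  # skip .metadata and other hidden files
        component, sep, suffix = name.rpartition("_")
        if sep and suffix in EXPECTED_SUFFIXES:
            found[component].add(suffix)
    return {
        component: sorted(EXPECTED_SUFFIXES - suffixes)
        for component, suffixes in found.items()
        if suffixes != EXPECTED_SUFFIXES
    }


if __name__ == "__main__":
    # Usage: pass one or more index partition directories as arguments.
    for directory in sys.argv[1:]:
        for component, missing in find_incomplete_components(directory).items():
            print(f"{directory}: {component} is missing {missing}")
```

Run against a directory that looks like the partition B listing, this should report the single component with _f missing; for a directory like partition A it should report nothing. The directory paths are left to the caller because the actual storage layout is not shown in this report.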

> Unequal number of valid ... exception during a similarity join query
> --------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1880
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1880
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>
> On an 8-node cluster, the similarity join query below fails with the 
> following exception. There is a keyword index on the summary field.
> {code}
> Unequal number of valid Dictionary BTree, Inverted Lists, Deleted BTree, and Bloom Filter files found. Aborting cleanup. [HyracksDataException]
> {code}
> {code}
> use dataverse exp;
> count(
>   for $p in dataset "AmazonReviewProductID"
>   for $o in dataset "AmazonReviewNoDup"
>   for $i in dataset "AmazonReviewNoDup"
>   where $p.asin /* +indexnl */ = $o.asin
>     and $p.id >= int64("6450")
>     and $p.id <= int64("7449")
>     and /* +indexnl */ similarity-jaccard(word-tokens($o.summary), word-tokens($i.summary)) >= 0.8
>     and $o.id < $i.id
>   return {"oid":$o.id, "iid":$i.id}
> );
> {code}
> DDL
> {code}
> drop dataverse exp if exists;
> create dataverse exp;
> use dataverse exp;
> create type AmazonReviewType as open {
>       id: uuid
> }
> create dataset AmazonReviewNoDup(AmazonReviewType) primary key id autogenerated;
> create index AmazonReviewNoDup_summary_kw_idx on AmazonReviewNoDup(summary:string?) type keyword enforced;
> create type AmazonProductIDType as closed {
>       id: int64,
>       asin: string
> }
> create dataset AmazonReviewProductID(AmazonProductIDType) primary key id;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
