[ https://issues.apache.org/jira/browse/ASTERIXDB-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Taewoo Kim closed ASTERIXDB-1743.
---------------------------------
    Resolution: Invalid

> Hash Join on 9-node is slower than conducting the same join on 1-node.
> ----------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1743
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1743
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>
> For the same amount of data, a simple hash join like the following AQL
> query takes 2 hours and 30 minutes to finish on 9 nodes, while it takes
> 1 hour and 20 minutes on 1 node. The data file is a 5.5 GB JSON file.
> The difference is that spilling happens on 1 node but not on 9 nodes.
> {code}
> create type AmazonReviewType as open {
>     id: uuid
> }
> create dataset AmazonReview9Mline(AmazonReviewType) primary key id auto generated;
> omit load ...
> count(
>     for $o in dataset AmazonReview9Mline
>     for $i in dataset AmazonReview9Mline
>     where $o.reviewerID = $i.reviewerID and $o.id < $i.id
>     return {"oid":$o.reviewerID, "iid":$i.reviewerID}
> );
> {code}
> The following is a sample record.
> {code}
> {
>     "reviewerID": "A2SUAM1J3GNN3B",
>     "asin": "0000013714",
>     "reviewerName": "J. McDonald",
>     "helpful": [2, 3],
>     "reviewText": "I bought this for my husband who plays the piano. He is having a wonderful time playing these old hymns. The music is at times hard to read because we think the book was published for singing from more than playing from. Great purchase though!",
>     "overall": 5.0,
>     "summary": "Heavenly Highway Hymns",
>     "unixReviewTime": 1252800000,
>     "reviewTime": "09 13, 2009"
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)