[ https://issues.apache.org/jira/browse/ASTERIXDB-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Maxon resolved ASTERIXDB-1776.
----------------------------------
    Resolution: Fixed

> Data loss in many multi-partitions
> ----------------------------------
>
>                 Key: ASTERIXDB-1776
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1776
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: Hyracks Core
>         Environment: MAC/Linux
>            Reporter: Wenhai
>            Assignee: Ian Maxon
>            Priority: Critical
>         Attachments: cc.log, demo.xml, execute.log, tpch_node1.log,
> tpch_node2.log
>
>
> Description: If we configure more than 24 partitions in each NC, we
> always lose almost half of the partitions, with no error messages or
> log entries.
> Schema:
> {noformat}
> drop dataverse tpch if exists;
> create dataverse tpch;
> use dataverse tpch;
> create type LineItemType as closed {
>   l_orderkey: int32,
>   l_partkey: int32,
>   l_suppkey: int32,
>   l_linenumber: int32,
>   l_quantity: int32,
>   l_extendedprice: double,
>   l_discount: double,
>   l_tax: double,
>   l_returnflag: string,
>   l_linestatus: string,
>   l_shipdate: string,
>   l_commitdate: string,
>   l_receiptdate: string,
>   l_shipinstruct: string,
>   l_shipmode: string,
>   l_comment: string
> }
> create dataset LineItem(LineItemType)
>   primary key l_orderkey, l_linenumber;
> load dataset LineItem
> using localfs
> (("path"="127.0.0.1:///path-to-tpch-data/tpch0.001/lineitem.tbl"),("format"="delimited-text"),("delimiter"="|"));
> {noformat}
> Query:
> {noformat}
> use dataverse tpch;
> let $s := count(
> for $d in dataset LineItem
> return $d
> )
> return $s
> {noformat}
> Return:
> {noformat}
> 6005
> {noformat}
> Command:
> {noformat}
> managix stop -n tpch
> managix start -n tpch
> {noformat}
> Query:
> {noformat}
> use dataverse tpch;
> let $s := count(
> for $d in dataset LineItem
> return $d
> )
> return $s
> {noformat}
> Return:
> {noformat}
> 4521
> {noformat}
> We lose about a quarter of the records (1484 of 6005) in this tiny test.
> When we increase the tpch scale to 200 GB across 196 partitions
> (distributed as 8 X 24), we should get 1.2 billion records, but the
> query returned only 0.45 billion!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
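
As a sanity check on the counts reported in the description, the sketch below computes the fraction of records missing after the restart, both for the small test (6005 before, 4521 after) and for the large run (1.2 billion expected, 0.45 billion returned). The helper name loss_fraction is hypothetical and not part of AsterixDB; the figures are taken directly from the report.

```python
def loss_fraction(before: int, after: int) -> float:
    """Return the fraction of records missing, given counts before and after."""
    if before <= 0:
        raise ValueError("the 'before' count must be positive")
    return (before - after) / before


# Counts from the small test: count() before and after the managix restart.
small = loss_fraction(6005, 4521)

# Counts from the 200 GB run: expected records vs. records actually returned.
large = loss_fraction(1_200_000_000, 450_000_000)

print(f"small test: {small:.1%} of records missing")  # 24.7%
print(f"large run:  {large:.1%} of records missing")  # 62.5%
```

Under these figures the small test loses roughly a quarter of its records, while the large run loses well over half.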