Ali Alsuliman created ASTERIXDB-3498:
----------------------------------------
Summary: Misalignment between operators assigned locations
Key: ASTERIXDB-3498
URL: https://issues.apache.org/jira/browse/ASTERIXDB-3498
Project: Apache AsterixDB
Issue Type: Bug
Components: COMP - Compiler
Reporter: Ali Alsuliman
Assignee: Ali Alsuliman
Datasets are always created on the sorted cluster locations (sorted by node id).
For example, if there are two nodes with ids 'a' and 'b', each having 3
partitions, then one possible list of cluster locations is as follows (depending
on the order in which the nodes join the cluster):
{{cluster locations = ['b', 'b', 'b', 'a', 'a', 'a']}}
When a dataset is created, its locations will be the sorted cluster locations,
which are the locations used by data-scan operators:
{{['a', 'a', 'a', 'b', 'b', 'b']}}
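The relationship between the two lists can be sketched in a few lines of Python. This is only an illustration of the example above, not AsterixDB code; the variable names are hypothetical.

```python
# Illustrative sketch only; the names below are hypothetical, not AsterixDB APIs.

# One entry per partition, in the order the nodes happened to join the cluster.
cluster_locations = ['b', 'b', 'b', 'a', 'a', 'a']

# Dataset locations are the same entries sorted by node id.
# These are the locations that data-scan operators use.
dataset_locations = sorted(cluster_locations)

print(dataset_locations)  # ['a', 'a', 'a', 'b', 'b', 'b']
```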
This causes failures for queries in which some operators are assigned the
unsorted cluster locations. That can happen for operators that are not
connected to data-scan operators: they use the cluster locations instead of
inheriting the data-scan operators' sorted locations. As a result of this
misalignment, the first activity of an operator (for example, the build
activity of a hash-join) runs in one partition and the second activity (the
probe) runs in a completely different partition, when both activities should
have run in the same partition.
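A minimal Python sketch of the misalignment follows. It assumes, purely for illustration, that one activity is placed using the unsorted cluster locations and the other using the sorted locations; the function and variable names are hypothetical and do not correspond to AsterixDB code.

```python
# Illustrative sketch only; assumes one activity follows the unsorted cluster
# locations and the other follows the sorted locations. Names are hypothetical.

def misaligned_partitions(first_locations, second_locations):
    """Return the partition indexes whose two activities land on different nodes."""
    return [
        i
        for i, (first, second) in enumerate(zip(first_locations, second_locations))
        if first != second
    ]

cluster_locations = ['b', 'b', 'b', 'a', 'a', 'a']   # node join order
sorted_locations = sorted(cluster_locations)          # ['a', 'a', 'a', 'b', 'b', 'b']

# Assumption: build is placed by cluster locations, probe by sorted locations.
build_locations = cluster_locations
probe_locations = sorted_locations

mismatches = misaligned_partitions(build_locations, probe_locations)
print(mismatches)  # [0, 1, 2, 3, 4, 5]
```

With the example lists from this report, every partition index maps to a different node in the two lists, so no partition would have its build and probe activities co-located under this assumption.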
--
This message was sent by Atlassian Jira
(v8.20.10#820010)