Hussain Towaileb created ASTERIXDB-3073:
-------------------------------------------
Summary: Dynamic Prefixes for External Datasets
Key: ASTERIXDB-3073
URL: https://issues.apache.org/jira/browse/ASTERIXDB-3073
Project: Apache AsterixDB
Issue Type: Epic
Components: EXT - External data
Affects Versions: 0.9.8
Reporter: Hussain Towaileb
Assignee: Hussain Towaileb
Fix For: 0.9.9
Currently, when a user creates an external dataset, a prefix can be provided
which directs the external dataset to the location the files need to be read
from. This has a major impact on performance as it allows us to read only the
files we are interested in and avoid reading unnecessary files.
However, a limitation of the current implementation is that the prefix is
always a static path. This makes it hard to read, for example, only the files
of all userId > 1 or only the files of userId IN [1, 2, 3]. In such scenarios
we always end up reading all the files, which can be a very expensive
operation, and then use our WHERE clause to get the desired result.
This feature aims to support a more dynamic approach to allow for a flexible
prefix that can support different scenarios (for example, the user passing the
desired userId in the prefix instead of a single prefix value) and still
maintain the behavior of reading the minimal number of files.
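
As a rough illustration of the idea only (this is not AsterixDB code or syntax),
the sketch below assumes a hypothetical prefix template of the form
"users/{userId}/" and a hypothetical helper prune() that keeps only the paths
whose extracted userId satisfies a predicate. The names compile_prefix, prune,
the template syntax, and the sample paths are all assumptions made for this
example.

```python
import re


def compile_prefix(template):
    """Turn a hypothetical template such as 'users/{userId}/' into a regex.

    Each {name} placeholder becomes a named group matching one path segment.
    """
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            # Literal text between placeholders.
            pattern += re.escape(part)
        else:
            # Placeholder name captured by the split above.
            pattern += f"(?P<{part}>[^/]+)"
    return re.compile(pattern)


def prune(paths, template, predicate):
    """Keep only paths that match the template and satisfy the predicate."""
    matcher = compile_prefix(template)
    selected = []
    for path in paths:
        match = matcher.match(path)
        if match and predicate(match.groupdict()):
            selected.append(path)
    return selected


# Made-up object keys, standing in for a listing of an external store.
paths = [
    "users/1/a.json",
    "users/2/b.json",
    "users/3/c.json",
    "users/7/d.json",
    "logs/1/e.json",
]

# Equivalent of "userId IN [2, 3]": only two files would need to be read.
in_list = prune(paths, "users/{userId}/", lambda f: int(f["userId"]) in {2, 3})

# Equivalent of "userId > 1".
greater = prune(paths, "users/{userId}/", lambda f: int(f["userId"]) > 1)
```

With a static prefix of "users/", all four user files would be read and then
filtered by the WHERE clause; in this sketch the filter is applied to the path
itself, so the unwanted files are never opened.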
--
This message was sent by Atlassian Jira
(v8.20.10#820010)