Hussain Towaileb created ASTERIXDB-3073:
-------------------------------------------

             Summary: Dynamic Prefixes for External Datasets
                 Key: ASTERIXDB-3073
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3073
             Project: Apache AsterixDB
          Issue Type: Epic
          Components: EXT - External data
    Affects Versions: 0.9.8
            Reporter: Hussain Towaileb
            Assignee: Hussain Towaileb
             Fix For: 0.9.9


Currently, when a user creates an external dataset, a prefix can be provided
which directs the external dataset to the location the files need to be read
from. This has a major impact on performance, as it allows us to read only the
files we are interested in and avoid reading unnecessary files.

However, a limitation of the current implementation is that the prefix is
always a static path. This makes it hard to read, for example, only the files
for userId > 1 or only the files for userId IN [1, 2, 3]. In such scenarios we
always end up reading all the files, which can be a very expensive operation,
and then apply the WHERE clause to get the desired result.

This feature aims to support a more dynamic approach to allow for a flexible 
prefix that can support different scenarios (for example, the user passing the 
desired userId in the prefix instead of a single prefix value) and still 
maintain the behavior of reading the minimal number of files.
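
To make the idea concrete, here is a minimal sketch of how a dynamic prefix
could prune files before they are read. It is an illustration only and not
AsterixDB code: the "{userId}" placeholder syntax, the key layout, and the
helper names compile_template and prune are all assumptions made for this
example.

```python
import re


def compile_template(template):
    """Turn a hypothetical template such as 'users/{userId}/' into a regex
    with one named group per placeholder."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)        # literal path text
        else:
            pattern += f"(?P<{part}>[^/]+)"   # one path segment per placeholder
    return re.compile(pattern)


def prune(keys, template, predicate):
    """Keep only the keys whose extracted placeholder values satisfy predicate."""
    rx = compile_template(template)
    selected = []
    for key in keys:
        match = rx.match(key)
        if match and predicate(match.groupdict()):
            selected.append(key)
    return selected


# Hypothetical object keys in an external store.
keys = [
    "users/1/a.json",
    "users/2/b.json",
    "users/3/c.json",
    "users/10/d.json",
]

# userId > 1
greater_than_one = prune(keys, "users/{userId}/", lambda f: int(f["userId"]) > 1)

# userId IN [1, 2, 3]
in_list = prune(keys, "users/{userId}/", lambda f: int(f["userId"]) in {1, 2, 3})
```

In this sketch the filter is evaluated against values extracted from the path,
so only the matching files would be opened. A real implementation would also
need to limit how many keys are listed in the first place, which this example
does not attempt.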



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
