[ https://issues.apache.org/jira/browse/ASTERIXDB-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Maxon updated ASTERIXDB-3324:
---------------------------------
    Labels: triaged  (was: )

> Stabilize columnar datasets
> ---------------------------
>
>                 Key: ASTERIXDB-3324
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3324
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler, RT - Runtime, STO - Storage
>    Affects Versions: 0.9.9
>            Reporter: Wail Y. Alkowaileet
>            Assignee: Wail Y. Alkowaileet
>            Priority: Major
>              Labels: triaged
>             Fix For: 0.9.9
>
>
> Multiple issues were found while running SQLPPExecutionTest with columnar as
> the default storage format.
> h1. Filter and project pushdowns
>  * A NullPointerException could be thrown by PushdownUtil.getFieldName(...)
> when accessing a join expression. The type computer of the join doesn't
> include any typing information; hence the thrown exception
>  * The same applies to other operators: we need to pick the right type
> computer (input vs. output) depending on the operator
>  * Change the pushdown scope when a UNION ALL operator is encountered to
> avoid (incorrectly) pushing SELECT conditions after UNION ALL.
>  * Avoid re-registering record variables when computing the expected schema.
> Such variables should be marked as irreplaceable
>  * Nested functions' arguments should not be assigned to their produced
> variables (if any).
>  * UNION ALL is special: it contains LogicalVariable triplets (not variable
> expressions). When computing the defUse chains, the computer should account
> for the variables used by the UNION ALL
>  * Disallow pushing SELECT conditions of CASE WHEN expressions
>  * Place NoOpAccessor for PKs in columnar filters to avoid (incorrectly)
> advancing the PKs
>  * The current way of providing FilterAccessorProvider to the filter's
> IScalarEvaluatorFactories has a race condition, as IHyracksTaskContext can be
> shared. Instead, FilterAccessorProvider should be provided by a dedicated
> IEvaluatorContext (namely ColumnFilterEvaluatorContext).
>  * Disable filters against fields with heterogeneous numerical values (e.g.,
> double and bigint)
>  * Avoid advancing ColumnarAssembler readers if the mega-leaf node is
> filtered out (otherwise, we can overrun the reader, causing a "no more
> values" exception, or we can read incorrect data)
>  * ColumnLeafFrame should duplicate the page buffer (a shared buffer) to
> avoid a race condition when a dataset is being scanned twice at the same time.
> h1. Storage and record assembly:
>  * Retain empty objects
>  * Preserve the type of declared fields during record assembly (currently, we
> only produce bigint and doubles, which could be interpreted incorrectly in
> closed fields if smaller precision types are used)
>  * Ensure PK column indexes are in the range [0, N-1] (where N = the number
> of PKs), whether the PKs are in the root or nested in one or more objects
>  * Ensure there's always a "delegate" when assembling objects, especially
> when closed and open fields are accessed at the same time. Otherwise, we can
> end up with empty objects.
>  * Ensure PK readers created by *PathExtractorVisitor* have max def-level =
> 1 even if they're nested
>  * Use correct items for array and multiset declared items when
> LazyVisitablePointable is used
>  * Process actual types instead of union when using LazyVisitablePointable
>  * Ensure key uniqueness on LOAD
>  * Avoid accessing closed fields in empty objects (produced by the column
> assembler)
>  * Disallow LSM filters on columnar datasets
>  * Disallow correlated-prefix merge-policy (optimized for LSM-filters) in
> columnar datasets
> h1. Misc.
>  * If the storage format is specified incorrectly (i.e., it is neither row
> nor column), a NullPointerException is thrown
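
For the ColumnLeafFrame item under "Filter and project pushdowns", the idea of duplicating a shared page buffer can be sketched with the standard java.nio.ByteBuffer API. This is only an illustration, assuming pages are exposed as ByteBuffer instances; the class PageBufferSketch and the method readInts are hypothetical names and are not taken from the AsterixDB code base. ByteBuffer.duplicate() returns a view that shares the underlying bytes but keeps its own position and limit, so two scans of the same page do not move each other's cursor.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not AsterixDB code.
final class PageBufferSketch {

    // Reads `count` ints through a private view of the shared page.
    // duplicate() shares the content but gives the view its own position/limit,
    // so the caller's buffer (and any other scan's view) is left untouched.
    static int[] readInts(ByteBuffer sharedPage, int count) {
        ByteBuffer view = sharedPage.duplicate();
        view.position(0);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = view.getInt();
        }
        return out;
    }

    public static void main(String[] args) {
        // Build a 16-byte "page" holding 10, 20, 30, 40.
        ByteBuffer page = ByteBuffer.allocate(16);
        for (int i = 1; i <= 4; i++) {
            page.putInt(i * 10);
        }
        page.flip();

        // Two concurrent scans would each take their own view.
        ByteBuffer scanA = page.duplicate();
        ByteBuffer scanB = page.duplicate();
        scanA.getInt();
        scanA.getInt();

        // Only scanA's cursor has advanced.
        System.out.println(scanA.position() + " " + scanB.position() + " " + page.position());
        // prints "8 0 0"
    }
}
```

Note that a duplicate only isolates the cursor state; the bytes themselves are still shared, so this pattern helps read-only scans and does not by itself protect against concurrent writes to the page.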