From Wail Alkowaileet <[email protected]>: Wail Alkowaileet has submitted this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17866 )
Change subject: [MULTIPLE ISSUES][COMP] Multiple fixes for external filters
......................................................................

[MULTIPLE ISSUES][COMP] Multiple fixes for external filters

- user model changes: yes
- storage format changes: no
- interface changes: no

Details:
- ASTERIXDB-3280: Make inlined types open by default when creating datasets
- ASTERIXDB-3281: Ensure the type is open when 'embed-filter-values' is enabled

Change-Id: I04c66c5f637e49faca610fc2cb14668a8635187b
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17866
Integration-Tests: Jenkins <[email protected]>
Tested-by: Wail Alkowaileet <[email protected]>
Reviewed-by: Wail Alkowaileet <[email protected]>
Reviewed-by: Ali Alsuliman <[email protected]>
---
M asterixdb/asterix-app/src/test/resources/runtimets/results/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.2.adm
M asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.1.ddl.sqlpp
M asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.3.ddl.sqlpp
A asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.query.sqlpp
M asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml
A asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/01-01-2023/aircraft.csv
M asterixdb/asterix-app/src/main/java/org/apache/asterix/app/translator/QueryTranslator.java
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataUtils.java
A asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.000.ddl.sqlpp
A asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/03-01-2023/aircraft.csv
A asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/embed-with-closed-type/embed-with-closed-type.000.ddl.sqlpp
A asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/04-01-2023/aircraft.csv
A asterixdb/asterix-app/src/test/resources/runtimets/results/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.adm
A asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/02-01-2023/aircraft.csv
M asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
15 files changed, 248 insertions(+), 22 deletions(-)

Approvals:
  Wail Alkowaileet: Looks good to me, but someone else must approve; Verified
  Ali Alsuliman: Looks good to me, approved
  Jenkins: Verified

diff --git a/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/01-01-2023/aircraft.csv b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/01-01-2023/aircraft.csv
new file mode 100644
index 0000000..3b4ec33
--- /dev/null
+++ b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/01-01-2023/aircraft.csv
@@ -0,0 +1,10 @@
+tail_num,code,description,state
+80009E,2819,2819,MN
+80019E,2805,2805,IA
+80059E,2824,2824,MN
+80129E,2801,2801,MN
+80139E,2804,2804,MN
+80199E,2804,2804,WI
+80209E,2843,2843,ND
+80219E,2804,2804,WI
+80239E,2800,2800,IA
\ No newline at end of file
diff --git a/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/02-01-2023/aircraft.csv b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/02-01-2023/aircraft.csv
new file mode 100644
index 0000000..3b4ec33
--- /dev/null
+++ b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/02-01-2023/aircraft.csv
@@ -0,0 +1,10 @@
+tail_num,code,description,state
+80009E,2819,2819,MN
+80019E,2805,2805,IA
+80059E,2824,2824,MN
+80129E,2801,2801,MN
+80139E,2804,2804,MN
+80199E,2804,2804,WI
+80209E,2843,2843,ND
+80219E,2804,2804,WI
+80239E,2800,2800,IA
\ No newline at end of file
diff --git a/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/03-01-2023/aircraft.csv b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/03-01-2023/aircraft.csv
new file mode 100644
index 0000000..3b4ec33
--- /dev/null
+++ b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/03-01-2023/aircraft.csv
@@ -0,0 +1,10 @@
+tail_num,code,description,state
+80009E,2819,2819,MN
+80019E,2805,2805,IA
+80059E,2824,2824,MN
+80129E,2801,2801,MN
+80139E,2804,2804,MN
+80199E,2804,2804,WI
+80209E,2843,2843,ND
+80219E,2804,2804,WI
+80239E,2800,2800,IA
\ No newline at end of file
diff --git a/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/04-01-2023/aircraft.csv b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/04-01-2023/aircraft.csv
new file mode 100644
index 0000000..3b4ec33
--- /dev/null
+++ b/asterixdb/asterix-app/data/csv/external-filter/embed-csv/sales/04-01-2023/aircraft.csv
@@ -0,0 +1,10 @@
+tail_num,code,description,state
+80009E,2819,2819,MN
+80019E,2805,2805,IA
+80059E,2824,2824,MN
+80129E,2801,2801,MN
+80139E,2804,2804,MN
+80199E,2804,2804,WI
+80209E,2843,2843,ND
+80219E,2804,2804,WI
+80239E,2800,2800,IA
\ No newline at end of file
diff --git a/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/translator/QueryTranslator.java b/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/translator/QueryTranslator.java
index a792267..f43fbde 100644
--- a/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/translator/QueryTranslator.java
+++ b/asterixdb/asterix-app/src/main/java/org/apache/asterix/app/translator/QueryTranslator.java
@@ -971,6 +971,7 @@
  itemTypeEntity, metadataProvider, mdTxnCtx);
  ExternalDataUtils.normalize(properties);
  ExternalDataUtils.validate(properties);
+ ExternalDataUtils.validateType(properties, (ARecordType) itemType);
  validateExternalDatasetProperties(externalDetails,
properties, dd.getSourceLocation(), mdTxnCtx, appCtx); datasetDetails = new ExternalDatasetDetails(externalDetails.getAdapter(), properties, new Date(), diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.1.ddl.sqlpp b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.1.ddl.sqlpp index 4776ab4..896d593 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.1.ddl.sqlpp +++ b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.1.ddl.sqlpp @@ -37,7 +37,7 @@ /* Internal datasets */ -CREATE DATASET A_Customers_Default_Closed( +CREATE DATASET A_Customers_Default_Open( c_custkey integer not unknown, c_name string not unknown, c_phone string, @@ -60,7 +60,7 @@ /* External datasets */ -CREATE EXTERNAL DATASET B_Orders_Default_Closed( +CREATE EXTERNAL DATASET B_Orders_Default_Open( o_orderkey integer not unknown, o_custkey integer not unknown, o_orderstatus string not unknown, @@ -107,7 +107,7 @@ /* Internal datasets with inline META type */ -CREATE DATASET C_Customers_Meta_Default_Closed( +CREATE DATASET C_Customers_Meta_Default_Open( c_custkey integer not unknown, c_name string not unknown, c_phone string, diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.3.ddl.sqlpp b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.3.ddl.sqlpp index 8a08888..6599a67 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.3.ddl.sqlpp +++ 
b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.3.ddl.sqlpp @@ -19,12 +19,12 @@ USE test; -DROP DATASET A_Customers_Default_Closed; +DROP DATASET A_Customers_Default_Open; DROP DATASET A_Customers_Closed; DROP DATASET A_Customers_Open; -DROP DATASET B_Orders_Default_Closed; +DROP DATASET B_Orders_Default_Open; DROP DATASET B_Orders_Closed; DROP DATASET B_Orders_Open; -DROP DATASET C_Customers_Meta_Default_Closed; +DROP DATASET C_Customers_Meta_Default_Open; DROP DATASET C_Customers_Meta_Closed; DROP DATASET C_Customers_Meta_Open; \ No newline at end of file diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.000.ddl.sqlpp b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.000.ddl.sqlpp new file mode 100644 index 0000000..4cc642f --- /dev/null +++ b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.000.ddl.sqlpp @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +DROP DATAVERSE test IF EXISTS; +CREATE DATAVERSE test; +USE test; + +CREATE TYPE TestType AS { + tail_num: STRING, + code: INT, + description: INT, + state: STRING +}; + +CREATE EXTERNAL DATASET Sales(TestType) USING %adapter% ( + %template%, + ("container"="playground"), + ("definition"="external-filter/embed-csv/sales/{day:int}-{month:int}-{year:int}"), + ("embed-filter-values" = "true"), + ("header"="true"), + ("format"="csv") +); \ No newline at end of file diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.query.sqlpp b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.query.sqlpp new file mode 100644 index 0000000..646d85d --- /dev/null +++ b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.query.sqlpp @@ -0,0 +1,27 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +USE test; + +SELECT VALUE s +FROM Sales s +ORDER BY s.tail_num, + s.year, + s.month, + s.day; \ No newline at end of file diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/embed-with-closed-type/embed-with-closed-type.000.ddl.sqlpp b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/embed-with-closed-type/embed-with-closed-type.000.ddl.sqlpp new file mode 100644 index 0000000..dc2fb6e --- /dev/null +++ b/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/external-dataset/common/dynamic-prefixes/embed-with-closed-type/embed-with-closed-type.000.ddl.sqlpp @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +DROP DATAVERSE test IF EXISTS; +CREATE DATAVERSE test; +USE test; + +CREATE TYPE TestType AS CLOSED { + tail_num: STRING, + code: INT, + description: INT, + state: STRING +}; + +CREATE EXTERNAL DATASET Sales(TestType) USING %adapter% ( + %template%, + ("container"="playground"), + ("definition"="external-filter/embed-csv/sales/{day:int}-{month:int}-{year:int}"), + ("embed-filter-values" = "true"), + ("header"="true"), + ("format"="csv") +); \ No newline at end of file diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.2.adm b/asterixdb/asterix-app/src/test/resources/runtimets/results/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.2.adm index 8af06a6..b7f2d90 100644 --- a/asterixdb/asterix-app/src/test/resources/runtimets/results/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.2.adm +++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/ddl/create-dataset-inline-type-1/create-dataset-inline-type-1.2.adm @@ -1,21 +1,21 @@ { "en": "Dataset", "DatatypeName": "$d$t$i$A_Customers_Closed", "DatasetName": "A_Customers_Closed", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ] ] } -{ "en": "Dataset", "DatatypeName": "$d$t$i$A_Customers_Default_Closed", "DatasetName": "A_Customers_Default_Closed", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ] ] } +{ "en": "Dataset", "DatatypeName": "$d$t$i$A_Customers_Default_Open", "DatasetName": "A_Customers_Default_Open", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ] ] } { "en": "Dataset", "DatatypeName": "$d$t$i$A_Customers_Open", "DatasetName": "A_Customers_Open", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ], [ "c_name" ] ] } { "en": "Dataset", "DatatypeName": "$d$t$i$B_Orders_Closed", "DatasetName": "B_Orders_Closed", "DatatypeDataverseName": "test" } -{ "en": "Dataset", "DatatypeName": "$d$t$i$B_Orders_Default_Closed", "DatasetName": 
"B_Orders_Default_Closed", "DatatypeDataverseName": "test" } +{ "en": "Dataset", "DatatypeName": "$d$t$i$B_Orders_Default_Open", "DatasetName": "B_Orders_Default_Open", "DatatypeDataverseName": "test" } { "en": "Dataset", "DatatypeName": "$d$t$i$B_Orders_Open", "DatasetName": "B_Orders_Open", "DatatypeDataverseName": "test" } { "en": "Dataset", "DatatypeName": "$d$t$i$C_Customers_Meta_Closed", "DatasetName": "C_Customers_Meta_Closed", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ], [ "c_x" ] ], "MetatypeDataverseName": "test", "MetatypeName": "$d$t$m$C_Customers_Meta_Closed", "KeySourceIndicator": [ 0, 1 ] } -{ "en": "Dataset", "DatatypeName": "$d$t$i$C_Customers_Meta_Default_Closed", "DatasetName": "C_Customers_Meta_Default_Closed", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ] ], "MetatypeDataverseName": "test", "MetatypeName": "$d$t$m$C_Customers_Meta_Default_Closed" } +{ "en": "Dataset", "DatatypeName": "$d$t$i$C_Customers_Meta_Default_Open", "DatasetName": "C_Customers_Meta_Default_Open", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_custkey" ] ], "MetatypeDataverseName": "test", "MetatypeName": "$d$t$m$C_Customers_Meta_Default_Open" } { "en": "Dataset", "DatatypeName": "$d$t$i$C_Customers_Meta_Open", "DatasetName": "C_Customers_Meta_Open", "DatatypeDataverseName": "test", "PrimaryKey": [ [ "c_x" ], [ "c_y" ] ], "MetatypeDataverseName": "test", "MetatypeName": "$d$t$m$C_Customers_Meta_Open", "KeySourceIndicator": [ 1, 1 ] } { "en": "Datatype", "DatatypeName": "$d$t$i$A_Customers_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": 
true, "IsMissable": true } ] } } } -{ "en": "Datatype", "DatatypeName": "$d$t$i$A_Customers_Default_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } +{ "en": "Datatype", "DatatypeName": "$d$t$i$A_Customers_Default_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", "DatatypeName": "$d$t$i$A_Customers_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", "DatatypeName": "$d$t$i$B_Orders_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "o_orderkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { 
"FieldName": "o_orderstatus", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_totalprice", "FieldType": "double", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderdate", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderpriority", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_clerk", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_shippriority", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } -{ "en": "Datatype", "DatatypeName": "$d$t$i$B_Orders_Default_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "o_orderkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderstatus", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_totalprice", "FieldType": "double", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderdate", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderpriority", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_clerk", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_shippriority", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } +{ "en": "Datatype", "DatatypeName": "$d$t$i$B_Orders_Default_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "o_orderkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": 
"o_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderstatus", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_totalprice", "FieldType": "double", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderdate", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderpriority", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_clerk", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_shippriority", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", "DatatypeName": "$d$t$i$B_Orders_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "o_orderkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderstatus", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_totalprice", "FieldType": "double", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderdate", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_orderpriority", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_clerk", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_shippriority", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "o_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", "DatatypeName": "$d$t$i$C_Customers_Meta_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "c_custkey", "FieldType": 
"int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } -{ "en": "Datatype", "DatatypeName": "$d$t$i$C_Customers_Meta_Default_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } +{ "en": "Datatype", "DatatypeName": "$d$t$i$C_Customers_Meta_Default_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", "DatatypeName": "$d$t$i$C_Customers_Meta_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "c_custkey", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_name", "FieldType": "string", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_phone", "FieldType": "string", "IsNullable": true, "IsMissable": true }, { "FieldName": "c_comment", "FieldType": "string", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", 
"DatatypeName": "$d$t$m$C_Customers_Meta_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "c_x", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_y", "FieldType": "int32", "IsNullable": true, "IsMissable": true } ] } } } -{ "en": "Datatype", "DatatypeName": "$d$t$m$C_Customers_Meta_Default_Closed", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": false, "Fields": [ { "FieldName": "c_x", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_y", "FieldType": "int32", "IsNullable": true, "IsMissable": true } ] } } } +{ "en": "Datatype", "DatatypeName": "$d$t$m$C_Customers_Meta_Default_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "c_x", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_y", "FieldType": "int32", "IsNullable": true, "IsMissable": true } ] } } } { "en": "Datatype", "DatatypeName": "$d$t$m$C_Customers_Meta_Open", "Derived": { "Tag": "RECORD", "IsAnonymous": true, "Record": { "IsOpen": true, "Fields": [ { "FieldName": "c_x", "FieldType": "int32", "IsNullable": false, "IsMissable": false }, { "FieldName": "c_y", "FieldType": "int32", "IsNullable": false, "IsMissable": false } ] } } } diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/results/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.adm b/asterixdb/asterix-app/src/test/resources/runtimets/results/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.adm new file mode 100644 index 0000000..5cee80e --- /dev/null +++ b/asterixdb/asterix-app/src/test/resources/runtimets/results/external-dataset/common/dynamic-prefixes/csv/embed-multiple-missing-values/embed-multiple-missing-values.001.adm @@ -0,0 +1,36 @@ +{ "tail_num": 
"80009E", "code": 2819, "description": 2819, "state": "MN", "day": 1, "month": 1, "year": 2023 } +{ "tail_num": "80009E", "code": 2819, "description": 2819, "state": "MN", "day": 2, "month": 1, "year": 2023 } +{ "tail_num": "80009E", "code": 2819, "description": 2819, "state": "MN", "day": 3, "month": 1, "year": 2023 } +{ "tail_num": "80009E", "code": 2819, "description": 2819, "state": "MN", "day": 4, "month": 1, "year": 2023 } +{ "tail_num": "80019E", "code": 2805, "description": 2805, "state": "IA", "day": 1, "month": 1, "year": 2023 } +{ "tail_num": "80019E", "code": 2805, "description": 2805, "state": "IA", "day": 2, "month": 1, "year": 2023 } +{ "tail_num": "80019E", "code": 2805, "description": 2805, "state": "IA", "day": 3, "month": 1, "year": 2023 } +{ "tail_num": "80019E", "code": 2805, "description": 2805, "state": "IA", "day": 4, "month": 1, "year": 2023 } +{ "tail_num": "80059E", "code": 2824, "description": 2824, "state": "MN", "day": 1, "month": 1, "year": 2023 } +{ "tail_num": "80059E", "code": 2824, "description": 2824, "state": "MN", "day": 2, "month": 1, "year": 2023 } +{ "tail_num": "80059E", "code": 2824, "description": 2824, "state": "MN", "day": 3, "month": 1, "year": 2023 } +{ "tail_num": "80059E", "code": 2824, "description": 2824, "state": "MN", "day": 4, "month": 1, "year": 2023 } +{ "tail_num": "80129E", "code": 2801, "description": 2801, "state": "MN", "day": 1, "month": 1, "year": 2023 } +{ "tail_num": "80129E", "code": 2801, "description": 2801, "state": "MN", "day": 2, "month": 1, "year": 2023 } +{ "tail_num": "80129E", "code": 2801, "description": 2801, "state": "MN", "day": 3, "month": 1, "year": 2023 } +{ "tail_num": "80129E", "code": 2801, "description": 2801, "state": "MN", "day": 4, "month": 1, "year": 2023 } +{ "tail_num": "80139E", "code": 2804, "description": 2804, "state": "MN", "day": 1, "month": 1, "year": 2023 } +{ "tail_num": "80139E", "code": 2804, "description": 2804, "state": "MN", "day": 2, "month": 1, "year": 2023 
}
+{ "tail_num": "80139E", "code": 2804, "description": 2804, "state": "MN", "day": 3, "month": 1, "year": 2023 }
+{ "tail_num": "80139E", "code": 2804, "description": 2804, "state": "MN", "day": 4, "month": 1, "year": 2023 }
+{ "tail_num": "80199E", "code": 2804, "description": 2804, "state": "WI", "day": 1, "month": 1, "year": 2023 }
+{ "tail_num": "80199E", "code": 2804, "description": 2804, "state": "WI", "day": 2, "month": 1, "year": 2023 }
+{ "tail_num": "80199E", "code": 2804, "description": 2804, "state": "WI", "day": 3, "month": 1, "year": 2023 }
+{ "tail_num": "80199E", "code": 2804, "description": 2804, "state": "WI", "day": 4, "month": 1, "year": 2023 }
+{ "tail_num": "80209E", "code": 2843, "description": 2843, "state": "ND", "day": 1, "month": 1, "year": 2023 }
+{ "tail_num": "80209E", "code": 2843, "description": 2843, "state": "ND", "day": 2, "month": 1, "year": 2023 }
+{ "tail_num": "80209E", "code": 2843, "description": 2843, "state": "ND", "day": 3, "month": 1, "year": 2023 }
+{ "tail_num": "80209E", "code": 2843, "description": 2843, "state": "ND", "day": 4, "month": 1, "year": 2023 }
+{ "tail_num": "80219E", "code": 2804, "description": 2804, "state": "WI", "day": 1, "month": 1, "year": 2023 }
+{ "tail_num": "80219E", "code": 2804, "description": 2804, "state": "WI", "day": 2, "month": 1, "year": 2023 }
+{ "tail_num": "80219E", "code": 2804, "description": 2804, "state": "WI", "day": 3, "month": 1, "year": 2023 }
+{ "tail_num": "80219E", "code": 2804, "description": 2804, "state": "WI", "day": 4, "month": 1, "year": 2023 }
+{ "tail_num": "80239E", "code": 2800, "description": 2800, "state": "IA", "day": 1, "month": 1, "year": 2023 }
+{ "tail_num": "80239E", "code": 2800, "description": 2800, "state": "IA", "day": 2, "month": 1, "year": 2023 }
+{ "tail_num": "80239E", "code": 2800, "description": 2800, "state": "IA", "day": 3, "month": 1, "year": 2023 }
+{ "tail_num": "80239E", "code": 2800, "description": 2800, "state": "IA", "day": 4, "month": 1, "year": 2023 }
diff --git a/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml b/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml
index 1d0d038..6d9de7a 100644
--- a/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml
+++ b/asterixdb/asterix-app/src/test/resources/runtimets/testsuite_external_dataset_s3.xml
@@ -223,6 +223,13 @@
     <!-- Parquet Tests End -->
     <!-- Dynamic prefixes tests start -->
     <test-case FilePath="external-dataset/common/dynamic-prefixes">
+      <compilation-unit name="embed-with-closed-type">
+        <placeholder name="adapter" value="S3" />
+        <output-dir compare="Text">embed-with-closed-type</output-dir>
+        <expected-error>Compilation error: A closed type cannot be used when 'embed-filter-values' is enabled</expected-error>
+      </compilation-unit>
+    </test-case>
+    <test-case FilePath="external-dataset/common/dynamic-prefixes">
       <compilation-unit name="one-field">
         <placeholder name="adapter" value="S3" />
         <output-dir compare="Text">one-field</output-dir>
@@ -388,6 +395,12 @@
         <output-dir compare="Text">embed-multiple-values</output-dir>
       </compilation-unit>
     </test-case>
+    <test-case FilePath="external-dataset/common/dynamic-prefixes/csv">
+      <compilation-unit name="embed-multiple-missing-values">
+        <placeholder name="adapter" value="S3" />
+        <output-dir compare="Text">embed-multiple-missing-values</output-dir>
+      </compilation-unit>
+    </test-case>
     <test-case FilePath="external-dataset/common/dynamic-prefixes/parquet">
       <compilation-unit name="computed-field-segment-pattern-mismatch">
         <placeholder name="adapter" value="S3" />
diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataUtils.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataUtils.java
index e740e68..62dc466 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataUtils.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/ExternalDataUtils.java
@@ -1048,4 +1048,13 @@
         return protocol + "://" + container + "/";
     }
+
+    public static void validateType(Map<String, String> properties, ARecordType itemType) throws CompilationException {
+        boolean embedValues = Boolean.parseBoolean(
+                properties.getOrDefault(ExternalDataConstants.KEY_EMBED_FILTER_VALUES, ExternalDataConstants.FALSE));
+        if (ExternalDataPrefix.containsComputedFields(properties) && embedValues && !itemType.isOpen()) {
+            throw new CompilationException(ErrorCode.COMPILATION_ERROR, "A closed type cannot be used when '"
+                    + ExternalDataConstants.KEY_EMBED_FILTER_VALUES + "' is enabled");
+        }
+    }
 }
diff --git a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
index 0eb09a9..3675876 100644
--- a/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
+++ b/asterixdb/asterix-lang-sqlpp/src/main/javacc/SQLPP.jj
@@ -1188,7 +1188,7 @@
 }
 {
   nameComponents = QualifiedName()
-  (typeExpr = DatasetTypeSpecification())?
+  (typeExpr = DatasetTypeSpecification(RecordTypeDefinition.RecordKind.OPEN))?
   (
     { String name; }
     <WITH>
@@ -1199,7 +1199,7 @@
           "We can only support one additional associated field called \"meta\".");
       }
     }
-    metaTypeExpr = DatasetTypeSpecification()
+    metaTypeExpr = DatasetTypeSpecification(RecordTypeDefinition.RecordKind.OPEN)
   )?
   ifNotExists = IfNotExists()
   (LOOKAHEAD(3) primaryKeyFieldsWithTypes = PrimaryKeyWithType()
@@ -1261,7 +1261,7 @@
 }
 {
   nameComponents = QualifiedName()
-  typeExpr = DatasetTypeSpecification()
+  typeExpr = DatasetTypeSpecification(RecordTypeDefinition.RecordKind.OPEN)
   ifNotExists = IfNotExists()
   <USING> adapterName = AdapterName() properties = Configuration()
   ( <HINTS> hints = Properties() )?
@@ -1280,13 +1280,13 @@
   }
 }

-TypeExpression DatasetTypeSpecification() throws ParseException:
+TypeExpression DatasetTypeSpecification(RecordTypeDefinition.RecordKind defaultRecordKind) throws ParseException:
 {
   TypeExpression typeExpr = null;
 }
 {
   (
-    LOOKAHEAD(3) typeExpr = DatasetRecordTypeSpecification(true)
+    LOOKAHEAD(3) typeExpr = DatasetRecordTypeSpecification(true, defaultRecordKind)
   | typeExpr = DatasetReferenceTypeSpecification()
   )
   {
@@ -1305,7 +1305,7 @@
   }
 }

-RecordTypeDefinition DatasetRecordTypeSpecification(boolean allowRecordKindModifier) throws ParseException:
+RecordTypeDefinition DatasetRecordTypeSpecification(boolean allowRecordKindModifier, RecordTypeDefinition.RecordKind defaultRecordKind) throws ParseException:
 {
   RecordTypeDefinition recordTypeDef = null;
   RecordTypeDefinition.RecordKind recordKind = null;
@@ -1316,7 +1316,7 @@
   ( recordKind = RecordTypeKind() { recordKindToken = token; } <TYPE> )?
   {
     if (recordKind == null) {
-      recordKind = RecordTypeDefinition.RecordKind.CLOSED;
+      recordKind = defaultRecordKind;
     } else if (!allowRecordKindModifier) {
       throw createUnexpectedTokenError(recordKindToken);
     }
@@ -1730,7 +1730,7 @@
   nameComponents = QualifiedName()
   (
     (
-      typeExpr = DatasetTypeSpecification()
+      typeExpr = DatasetTypeSpecification(RecordTypeDefinition.RecordKind.CLOSED)
       ifNotExists = IfNotExists()
       viewConfigDefaultNull = CastDefaultNull()
       {
@@ -2865,7 +2865,7 @@
     namespace = nameComponents.first;
     datasetName = nameComponents.second;
   }
-  (<AS> typeExpr = DatasetTypeSpecification())?
+  (<AS> typeExpr = DatasetTypeSpecification(RecordTypeDefinition.RecordKind.OPEN))?
   <USING> adapterName = AdapterName() properties = Configuration()
   {
     ExternalDetailsDecl edd = new ExternalDetailsDecl();

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17866

Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I04c66c5f637e49faca610fc2cb14668a8635187b
Gerrit-Change-Number: 17866
Gerrit-PatchSet: 5
Gerrit-Owner: Wail Alkowaileet <[email protected]>
Gerrit-Reviewer: Ali Alsuliman <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Murtadha Hubail <[email protected]>
Gerrit-Reviewer: Wail Alkowaileet <[email protected]>
Gerrit-MessageType: merged
