Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Ian Maxon has submitted this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. [ASTERIXDB-3384][DOC] Document COPY Details: Add some brief documentation about each COPY statement. Also add the simplified BNF for railroad diagrams for each. Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 Reviewed-by: Ian Maxon Reviewed-by: Wail Alkowaileet Integration-Tests: Jenkins Tested-by: Jenkins --- M asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf M asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md 2 files changed, 98 insertions(+), 0 deletions(-) Approvals: Ian Maxon: Looks good to me, but someone else must approve Wail Alkowaileet: Looks good to me, approved Jenkins: Verified; Verified

diff --git a/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf b/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf
index af67e33..cbfa92e 100644
--- a/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf
+++ b/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf
@@ -258,4 +258,9 @@
 UpsertStmnt ::= "UPSERT" "INTO" QualifiedName ("AS" Variable)? Query ("RETURNING" Expr)?
+
+CopyStmnt ::= "COPY" "INTO"? QualifiedName ("AS" Variable)? "FROM" Identifier "AT" QualifiedName ("PATH" StringLiteral)? ("WITH" ObjectConstructor)?
+
+CopyToStmnt ::= "COPY" ( QualifiedName | "(" Query ")" ) "TO" AdapterName "PATH" ParenthesizedArrayConstructor ("OVER" "(" ("PARTITION" "BY" Expr ("AS" Variable)? ("," Expr ("AS" Variable)? )? )? OrderbyClause ")" )? "WITH" ObjectConstructor
+
 DeleteStmnt ::= "DELETE" "FROM" QualifiedName (("AS")? Variable)? ("WHERE" Expr)?
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md
index 4ec5033..71e6af5 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md
@@ -697,3 +697,77 @@
 DELETE FROM customers WHERE custid = "C47";
+### Copy Statement
+
+# CopyStmnt
+![](../images/diagrams/CopyStmnt.png)
+
+The `COPY` statement is used to load data in bulk from an external source into a dataset. It differs from `LOAD` in that
+it can be performed multiple times, to upsert new data into the dataset that may have changed in the source.
+
+For example, this statement would copy the contents of the JSON file `customerData.json` residing on the NC named `asterix_nc1`
+into the Customers dataset.
+
+# Example
+
+COPY Customers
+FROM localfs
+PATH ("asterix_nc1://data/nontagged/customerData.json")
+WITH {
+"format": "json"
+};
+
+
+### Copy To Statement
+
+# CopyToStmnt
+![](../images/diagrams/CopyToStmnt.png)
+
+The `COPY TO` statement allows easy export of the result of a query, or an entire dataset, into a file or set of files on
+an external source. This can be any source that has an adapter.
+
+For example, this statement would create a copy of `ColumnDataset` on an S3 bucket, myBucket:
+
+# Example
+
+COPY ColumnDataset
+TO s3
+PATH("CopyToResult/")
+WITH {
+"format" : "json",
+"container": "myBucket",
+"accessKeyId": "",
+"secretAccessKey": "",
+"region": "us-west-2"
+};
+
+The statement allows for much more precise exports than this, however. A typical pattern of data accessed via object stores
+like S3 is for it to be partitioned into files, with each folder containing some of those files representing a key. The use
+of `OVER` and `PARTITION BY` allows exports to match this. For example:
+
+# Example
+
+COPY (SELECT cd.uid uid,
+cd.sensor_info.name name,
+to_bigint(cd.sensor_info.battery_status) battery_status
+FROM ColumnDataset cd
+) toWrite
+TO s3
+PATH("CopyToResult", to_string(b))
+OVER (
+PARTITION BY toWrite.battery_status b
+ORDER BY toWrite.name
+)
+WITH {
+"format" : "json",
+"compression": "gzip",
+"max-objects-per-file": 1000,
+"container": "myBucket",
+"accessKeyId": "",
+"secretAccessKey": "",
+"region": "us-west-2"
+};
+
+This query will be exported as partitions into a set of folders, with one folder for each value of `battery_status`.
+Each partition itself will also be sorted by the `name` field, and compressed with `gzip` and divided into files of 1000
+objects or fewer per file.

-- To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 6 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb
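
To make the partitioned layout described in the `COPY TO` documentation above concrete, here is a minimal Python sketch of the grouping it describes: one folder per partition value, rows sorted within each partition, and each partition split into files holding at most a fixed number of objects. This is only a simulation under stated assumptions, not AsterixDB's implementation. The function name `plan_export`, the `part-00000.json.gz` file naming, and the in-memory dictionary are all hypothetical; the only behaviour taken from the thread is that the `PATH` parts are joined with `/`.

```python
from collections import defaultdict


def plan_export(records, partition_key, order_key, path_prefix, max_objects_per_file):
    """Simulate the folder/file layout of a partitioned export.

    Groups records by partition_key, sorts each group by order_key, and
    splits each group into chunks of at most max_objects_per_file records.
    Returns a dict mapping a (hypothetical) object path to its records.
    """
    partitions = defaultdict(list)
    for record in records:
        partitions[record[partition_key]].append(record)

    layout = {}
    for value, rows in partitions.items():
        # ORDER BY: sort the rows inside one partition.
        rows.sort(key=lambda r: r[order_key])
        # PATH("prefix", to_string(b)): parts are joined with "/".
        folder = "/".join([path_prefix, str(value)])
        # max-objects-per-file: cut the partition into bounded chunks.
        for file_no, start in enumerate(range(0, len(rows), max_objects_per_file)):
            chunk = rows[start:start + max_objects_per_file]
            # The file name below is invented for illustration only.
            layout[f"{folder}/part-{file_no:05d}.json.gz"] = chunk
    return layout


if __name__ == "__main__":
    sample = [
        {"uid": 1, "name": "c", "battery_status": 100},
        {"uid": 2, "name": "a", "battery_status": 100},
        {"uid": 3, "name": "b", "battery_status": 100},
        {"uid": 4, "name": "z", "battery_status": 20},
    ]
    # A tiny limit of 2 is used so the split is visible; the review thread
    # suggests the real option has a much larger minimum.
    for path, rows in sorted(plan_export(sample, "battery_status", "name", "CopyToResult", 2).items()):
        print(path, [r["name"] for r in rows])
```

With the sample rows, the `battery_status` value 100 partition is split across two files and the value 20 partition fits in one, which mirrors the "one folder per value, bounded objects per file" description in the patch.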
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Wail Alkowaileet: Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Wail Alkowaileet has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 5: Code-Review+2 -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 5 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-Reviewer: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Thu, 02 May 2024 15:39:24 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Attention is currently required from: Peeyush Gupta, Ian Maxon, Wail Alkowaileet, Hussain Towaileb. Ian Maxon has removed a vote from this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Removed Contrib-2 by Unrecognized Gerrit Account 1000171 (1000171) -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 5 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Wail Alkowaileet Gerrit-Attention: Hussain Towaileb Gerrit-MessageType: deleteVote
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
Attention is currently required from: Peeyush Gupta, Ian Maxon, Wail Alkowaileet, Hussain Towaileb. Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 5: Contrib-2 Analytics Compatibility Tests Failed https://cbjenkins.page.link/fW9m5Xe4UECzaFyW9 : UNSTABLE -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 5 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Wail Alkowaileet Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Wed, 01 May 2024 19:07:50 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Jenkins: Attention is currently required from: Peeyush Gupta, Wail Alkowaileet, Hussain Towaileb. Jenkins has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 5: Integration-Tests+1 Integration Tests Successful https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-trigger/433/ : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 5 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Wail Alkowaileet Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Wed, 01 May 2024 18:43:34 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
Attention is currently required from: Peeyush Gupta, Wail Alkowaileet, Hussain Towaileb. Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 5: Analytics Compatibility Compilation Successful https://cbjenkins.page.link/szkDjafufrEHx4W58 : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 5 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Wail Alkowaileet Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Wed, 01 May 2024 16:59:02 + Gerrit-HasComments: No Gerrit-Has-Labels: No Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Attention is currently required from: Peeyush Gupta, Wail Alkowaileet, Hussain Towaileb. Ian Maxon has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 5: Code-Review+1 (4 comments)

File asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/b50e2851_abebf510
PS1, Line 734: localfs
> Probably it is better to keep everything as S3. […]
Done

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/24554ac2_5785cd72
PS1, Line 752: PATH("CopyToResult/" || to_string(b))
> PATH("CopyToResult", to_string(b)) is much cleaner
Done

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/93e7ac5d_a057c34b
PS1, Line 760: 100
> I believe this will throw an error. […]
Done

File asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/c7228de1_eefe42de
PS3, Line 756: /
> You won't need /. […]
Done

-- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 5 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Wail Alkowaileet Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Wed, 01 May 2024 16:52:17 + Gerrit-HasComments: Yes Gerrit-Has-Labels: Yes Comment-In-Reply-To: Wail Alkowaileet Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Hello Peeyush Gupta, Hussain Towaileb, Jenkins, Anon. E. Moose #1000171, I'd like you to reexamine a change. Please visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 to look at the new patch set (#4). Change subject: [ASTERIXDB-3384][DOC] Document COPY .. [ASTERIXDB-3384][DOC] Document COPY Details: Add some brief documentation about each COPY statement. Also add the simplified BNF for railroad diagrams for each. Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c --- M asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf M asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md 2 files changed, 93 insertions(+), 0 deletions(-) git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/43/18243/4 -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 4 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-MessageType: newpatchset
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Wail Alkowaileet: Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Wail Alkowaileet has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 3: (1 comment)

File asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/3f9e7e15_07d1101d
PS3, Line 756: /
You won't need /. It will be added for you :)

-- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 3 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Sat, 27 Apr 2024 17:24:12 + Gerrit-HasComments: Yes Gerrit-Has-Labels: No Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 3: Contrib+1 Analytics Compatibility Tests Successful https://cbjenkins.page.link/UpsqJXehJev955MAA : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 3 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Fri, 26 Apr 2024 11:12:38 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Jenkins: Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Jenkins has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 3: Integration-Tests+1 Integration Tests Successful https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-trigger/382/ : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 3 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Fri, 26 Apr 2024 09:38:43 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 3: Analytics Compatibility Compilation Successful https://cbjenkins.page.link/PuyT25TSPJ8LrRo18 : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 3 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Fri, 26 Apr 2024 09:01:17 + Gerrit-HasComments: No Gerrit-Has-Labels: No Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Hello Peeyush Gupta, Hussain Towaileb, Jenkins, Anon. E. Moose #1000171, I'd like you to reexamine a change. Please visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 to look at the new patch set (#2). Change subject: [ASTERIXDB-3384][DOC] Document COPY .. [ASTERIXDB-3384][DOC] Document COPY Details: Add some brief documentation about each COPY statement. Also add the simplified BNF for railroad diagrams for each. Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c --- M asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf M asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md 2 files changed, 93 insertions(+), 0 deletions(-) git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/43/18243/2 -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 2 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-MessageType: newpatchset
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Wail Alkowaileet: Attention is currently required from: Peeyush Gupta, Ian Maxon, Hussain Towaileb. Wail Alkowaileet has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 1: (3 comments)

File asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/9eefca59_990991d2
PS1, Line 734: localfs
Probably it is better to keep everything as S3. Not that it won't work, but it is going to be weird reading those files in a distributed setting.

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/def178a6_4261270e
PS1, Line 752: PATH("CopyToResult/" || to_string(b))
PATH("CopyToResult", to_string(b)) is much cleaner

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243/comment/e3611820_6846466b
PS1, Line 760: 100
I believe this will throw an error. The minimum is 1000

-- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-CC: Wail Alkowaileet Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Ian Maxon Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Fri, 19 Apr 2024 16:57:19 + Gerrit-HasComments: Yes Gerrit-Has-Labels: No Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Jenkins: Attention is currently required from: Peeyush Gupta, Hussain Towaileb. Jenkins has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 1: Integration-Tests+1 Integration Tests Successful https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-trigger/328/ : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Fri, 19 Apr 2024 16:55:58 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Attention is currently required from: Peeyush Gupta, Hussain Towaileb. Ian Maxon has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 1: Code-Review+1 -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Hussain Towaileb Gerrit-Reviewer: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-Reviewer: Peeyush Gupta Gerrit-Attention: Peeyush Gupta Gerrit-Attention: Hussain Towaileb Gerrit-Comment-Date: Fri, 19 Apr 2024 16:20:19 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
Attention is currently required from: Ian Maxon. Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 1: Contrib+1 Analytics Compatibility Tests Successful https://cbjenkins.page.link/p156rtFtA1gCwL5B6 : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Anon. E. Moose #1000171 Gerrit-Reviewer: Jenkins Gerrit-Attention: Ian Maxon Gerrit-Comment-Date: Fri, 19 Apr 2024 12:04:54 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Jenkins: Attention is currently required from: Ian Maxon. Jenkins has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 1: Integration-Tests-1 Integration Tests Failed https://asterix-jenkins.ics.uci.edu/job/asterix-gerrit-trigger/326/ : UNSTABLE -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-Reviewer: Jenkins Gerrit-CC: Anon. E. Moose #1000171 Gerrit-Attention: Ian Maxon Gerrit-Comment-Date: Fri, 19 Apr 2024 10:17:34 + Gerrit-HasComments: No Gerrit-Has-Labels: Yes Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
Anon. E. Moose #1000171 has posted comments on this change. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. Patch Set 1: Analytics Compatibility Compilation Successful https://cbjenkins.page.link/2ttifn43HhfZKEMc7 : SUCCESS -- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-CC: Anon. E. Moose #1000171 Gerrit-CC: Jenkins Gerrit-Comment-Date: Fri, 19 Apr 2024 09:50:19 + Gerrit-HasComments: No Gerrit-Has-Labels: No Gerrit-MessageType: comment
Change in asterixdb[master]: [ASTERIXDB-3384][DOC] Document COPY
From Ian Maxon: Ian Maxon has uploaded this change for review. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/18243 ) Change subject: [ASTERIXDB-3384][DOC] Document COPY .. [ASTERIXDB-3384][DOC] Document COPY Details: Add some brief documentation about each COPY statement. Also add the simplified BNF for railroad diagrams for each. Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c --- M asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf M asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md 2 files changed, 88 insertions(+), 0 deletions(-) git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/43/18243/1

diff --git a/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf b/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf
index af67e33..31df730 100644
--- a/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf
+++ b/asterixdb/asterix-doc/src/main/grammar/sqlpp.ebnf
@@ -258,4 +258,8 @@
 UpsertStmnt ::= "UPSERT" "INTO" QualifiedName ("AS" Variable)? Query ("RETURNING" Expr)?
+CopyStmnt ::= "COPY" "INTO"? QualifiedName ("AS" Variable)? "FROM" Identifier "AT" QualifiedName ("PATH" StringLiteral)? (WITH ObjectConstructor)?
+
+CopyToStmnt ::= "COPY" ( QualifiedName | "(" Query ")" ) "TO" AdapterName "PATH" ParenthesizedArrayConstructor ("OVER" "(" ("PARTITION" "BY" Expr ("AS" Variable)? ("," Expr ("AS" Variable)? )? )? OrderbyClause ")" )?
+
 DeleteStmnt ::= "DELETE" "FROM" QualifiedName (("AS")? Variable)? ("WHERE" Expr)?
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md
index 4ec5033..a7a 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/7_ddl_dml.md
@@ -697,3 +697,73 @@
 DELETE FROM customers WHERE custid = "C47";
+### Copy Statement
+
+# CopyStmnt
+![](../images/diagrams/CopyStmnt.png)
+
+The `COPY` statement is used to load data in bulk from an external source into a dataset. It differs from `LOAD` in that
+it can be performed multiple times, to upsert new data into the dataset that may have changed in the source.
+
+For example, this statement would copy the contents of the JSON file `customerData.json` residing on the NC named `asterix_nc1`
+into the Customers dataset.
+
+# Example
+
+COPY Customers
+FROM localfs
+PATH ("asterix_nc1://data/nontagged/customerData.json")
+WITH {
+"format": "json"
+};
+
+
+### Copy To Statement
+
+# CopyToStmnt
+![](../images/diagrams/CopyToStmnt.png)
+
+The `COPY TO` statement allows easy export of the result of a query, or an entire dataset, into a file or set of files on
+an external source. This can be any source that has an adapter.
+
+For example, this statement would create a copy of `ColumnDataset` on each node as a single JSON file
+
+# Example
+
+COPY ColumnDataset
+TO localfs
+PATH("localhost:///media/backup/CopyToResult")
+WITH {
+"format" : "json"
+};
+
+The statement allows for much more precise exports than this, however. A typical pattern of data accessed via object stores
+like S3 is for it to be partitioned into files, with each folder containing some of those files representing a key. The use
+of `OVER` and `PARTITION BY` allow exports to match this. For example:
+
+# Example
+
+COPY (SELECT cd.uid uid,
+cd.sensor_info.name name,
+to_bigint(cd.sensor_info.battery_status) battery_status
+FROM ColumnDataset cd
+) toWrite
+TO s3
+PATH("CopyToResult/" || to_string(b))
+OVER (
+PARTITION BY toWrite.battery_status b
+ORDER BY toWrite.name
+)
+WITH {
+"format" : "json",
+"compression": "gzip",
+"max-objects-per-file": 100,
+"container": "myBucket",
+"accessKeyId": "",
+"secretAccessKey": "",
+"region": "us-west-2"
+};
+
+This query will be exported as partitions into a set of folders, with one folder for each value of `battery_status`.
+Each partition itself will also be sorted by the `name` field, and compressed with `gzip` and divided into files of 100
+objects or fewer per file.

-- Gerrit-Project: asterixdb Gerrit-Branch: master Gerrit-Change-Id: Ibdacf4e6b156a3b6ef15b1420a4102c122f8bf1c Gerrit-Change-Number: 18243 Gerrit-PatchSet: 1 Gerrit-Owner: Ian Maxon Gerrit-MessageType: newchange