[jira] [Updated] (ASTERIXDB-3059) EXCEPT in SelectClause

Glenn Justo Galvizo (Jira) Fri, 09 Sep 2022 20:02:08 -0700


     [ 
https://issues.apache.org/jira/browse/ASTERIXDB-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Glenn Justo Galvizo updated ASTERIXDB-3059:
-------------------------------------------
    Description: 
It would be nice to have an option to exclude certain fields when using SELECT 
*. 
[BigQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#select_except]
 gives users the option to exclude columns with 'SELECT * EXCEPT f1, f2, ...'. 
'EXCEPT' seems like a nice addition to our grammar, here are a few examples on 
how we can extend our SELECT clause to include 'EXCEPT':

 

Let's start off with some DDLs and some data:
{code:sql}
CREATE TYPE     GenericType AS { _id: uuid };
CREATE DATASET  Users (GenericType) PRIMARY KEY _id AUTOGENERATED;
CREATE DATASET  Friends (GenericType) PRIMARY KEY _id AUTOGENERATED;

INSERT INTO     Users [
  { "user_id": 1, "name": "Sally",
    "address": { "zip_code": "92512", "state": "CA" } },
  { "user_id": 2, "name": "Mark" },
  { "user_id": 3, "name": "John" },
  { "user_id": 4, "name": "Tracy", "title": "Dr" }
];
INSERT INTO     Friends [
  { "user_id": 1, "friend": 2 },
  { "user_id": 1, "friend": 3 },
  { "user_id": 2, "friend": 1 },
  { "user_id": 3, "friend": 1 }
];{code}
----
Now we can get into the examples:

1. Get me all users U1 that have friends. Do not include the F dataset. In the 
example below, we have a 'SELECT *' followed by an 'EXCEPT ( F )'. In contrast 
to BigQuery, we do not specify columns / fields in our exclusion list here, 
instead we specify an exclusion of _variables_ that would have otherwise been 
returned with our 'SELECT' clause.
{code:sql}
FROM   Users U1, Friends F
WHERE  U1.user_id = F.user_id
SELECT * EXCEPT F;{code}
Returns the following:
{code:java}
> { "U1":  { "user_id": 1, "name": "Sally",
             "address": { "zip_code": "92512", "state": "CA" } } }
> { "U1": { "user_id": 2, "name": "Mark" } }
> { "U1": { "user_id": 3, "name": "John" } }{code}
2. Get me all users U1 that have friends. Do not include that user's address 
and title. The example below more closely aligns with BigQuery, as we specify 
columns / fields that we want to exclude from U1 in the exclusion list 
'address' and 'title'.
{code:sql}
FROM   Users U1, Friends F
WHERE  U1.user_id = F.user_id 
SELECT DISTINCT U1.* EXCEPT address, title;{code}
Returns the following:
{code:java}
> { "user_id": 1, "name": "Sally" }
> { "user_id": 2, "name": "Mark" }
> { "user_id": 3, "name": "John" }{code}
3. If we want to exclude some field in a nested object, we can specify the 
fields within our object with the '.' syntax. Say we don't want to include the 
zipcode of an address object within a user that has friends. We would use the 
query below:
{code:sql}
FROM   Users U1, Friends F
WHERE  U1.user_id = F.user_id
SELECT DISTINCT U1.* EXCEPT address.zip_code;{code}
Returns the following:
{code:java}
> { "user_id": 1, "name": "Sally", "address": { "state": "CA" } }
> { "user_id": 2, "name": "Mark" }
> { "user_id": 3, "name": "John" }{code}
4. Now suppose we want users and the user documents of their immediate friends, 
excluding addresses from both user documents. We would use the following query:
{code:sql}
FROM   Users U1, Friends F, Users U2
WHERE  U1.user_id = F.user_id AND
       U2.user_id = F.friend
SELECT U1, U2 EXCEPT U1.address, U2.address;{code}
Returns the following:
{code:java}
> { "U1": { "user_id": 1, "name": "Sally" }, 
    "U2": { "user_id": 2, "name": "Mark" } }
> { "U1": { "user_id": 1, "name": "Sally" }, 
    "U2": { "user_id": 3, "name": "John" } }
> { "U1": { "user_id": 2, "name": "Mark" }, 
    "U2": { "user_id": 1, "name": "Sally" } }
> { "U1": { "user_id": 3, "name": "John" }, 
    "U2": { "user_id": 1, "name": "Sally" } }{code}
----
The exact grammar I have in mind is given below, where we simply add the 
'EXCEPT' at the tail of our 'SELECT' clause. Logically, the 'EXCEPT' clause 
surrounds our SELECT- EXPRESSION with a call to OBJECT_REMOVE_FIELDS. Using #2 
as an example, a logical equivalent would be:
{code:sql}
FROM   (
  FROM   Users U1, Friends F
  WHERE  U1.user_id = F.user_id
  SELECT DISTINCT U1.* 
) TMP
SELECT VALUE OBJECT_REMOVE_FIELDS(TMP, ["address", "title"]);{code}

  was:
It would be nice to have an option to exclude certain fields when using SELECT 
*. 
[BigQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#select_except]
 gives users the option to exclude columns with 'SELECT * EXCEPT f1, f2, ...'. 
'EXCEPT' seems like a nice addition to our grammar, here are a few examples on 
how we can extend our SELECT clause to include 'EXCEPT':

 

Let's start off with some DDLs and some data:
{code:sql}
CREATE TYPE     GenericType AS { _id: uuid };
CREATE DATASET  Users (GenericType) PRIMARY KEY _id AUTOGENERATED;
CREATE DATASET  Friends (GenericType) PRIMARY KEY _id AUTOGENERATED;

INSERT INTO     Users [
  { "user_id": 1, "name": "Sally",
    "address": { "zip_code": "92512", "state": "CA" } },
  { "user_id": 2, "name": "Mark" },
  { "user_id": 3, "name": "John" },
  { "user_id": 4, "name": "Tracy", "title": "Dr" }
];
INSERT INTO     Friends [
  { "user_id": 1, "friend": 2 },
  { "user_id": 1, "friend": 3 },
  { "user_id": 2, "friend": 1 },
  { "user_id": 3, "friend": 1 }
];{code}
----
Now we can get into the examples:

1. Get me all users U1 that have friends. Do not include the F dataset. In the 
example below, we have a 'SELECT *' followed by an 'EXCEPT ( F )'. In contrast 
to BigQuery, we do not specify columns / fields in our exclusion list here, 
instead we specify an exclusion of _variables_ that would have otherwise been 
returned with our 'SELECT' clause.
{code:sql}
FROM   Users U1, Friends F
WHERE  U1.user_id = F.user_id
SELECT * EXCEPT F;{code}
Returns the following:
{code:java}
> { "U1":  { "user_id": 1, "name": "Sally",
             "address": { "zip_code": "92512", "state": "CA" } } }
> { "U1": { "user_id": 2, "name": "Mark" } }
> { "U1": { "user_id": 3, "name": "John" } }{code}
2. Get me all users U1 that have friends. Do not include that user's address 
and title. The example below more closely aligns with BigQuery, as we specify 
columns / fields that we want to exclude from U1 in the exclusion list 
'address' and 'title'.
{code:sql}
FROM   Users U1, Friends F
WHERE  U1.user_id = F.user_id 
SELECT DISTINCT U1.* EXCEPT address, title;{code}
Returns the following:
{code:java}
> { "user_id": 1, "name": "Sally" }
> { "user_id": 2, "name": "Mark" }
> { "user_id": 3, "name": "John" }{code}
3. If we want to exclude some field in a nested object, we can specify the 
fields within our object with the '.' syntax. Say we don't want to include the 
zipcode of an address object within a user that has friends. We would use the 
query below:
{code:sql}
FROM   Users U1, Friends F
WHERE  U1.user_id = F.user_id
SELECT DISTINCT U1.* EXCEPT address.zip_code;{code}
Returns the following:
{code:java}
> { "user_id": 1, "name": "Sally", "address": { "state": "CA" } }
> { "user_id": 2, "name": "Mark" }
> { "user_id": 3, "name": "John" }{code}
 
----
The exact grammar I have in mind is given below, where we simply add the 
'EXCEPT' at the tail of our 'SELECT' clause. Logically, the 'EXCEPT' clause 
surrounds our SELECT- EXPRESSION with a call to OBJECT_REMOVE_FIELDS. Using #2 
as an example, a logical equivalent would be:
{code:sql}
FROM   (
  FROM   Users U1, Friends F
  WHERE  U1.user_id = F.user_id
  SELECT DISTINCT U1.* 
) TMP
SELECT VALUE OBJECT_REMOVE_FIELDS(TMP, ["address", "title"]);{code}


> EXCEPT in SelectClause
> ----------------------
>
>                 Key: ASTERIXDB-3059
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3059
>             Project: Apache AsterixDB
>          Issue Type: New Feature
>          Components: SQL - Translator SQL++
>            Reporter: Glenn Justo Galvizo
>            Assignee: Glenn Justo Galvizo
>            Priority: Major
>         Attachments: SelectClause-1.png
>
>
> It would be nice to have an option to exclude certain fields when using 
> SELECT *. 
> [BigQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#select_except]
>  gives users the option to exclude columns with 'SELECT * EXCEPT f1, f2, 
> ...'. 'EXCEPT' seems like a nice addition to our grammar, here are a few 
> examples on how we can extend our SELECT clause to include 'EXCEPT':
>  
> Let's start off with some DDLs and some data:
> {code:sql}
> CREATE TYPE     GenericType AS { _id: uuid };
> CREATE DATASET  Users (GenericType) PRIMARY KEY _id AUTOGENERATED;
> CREATE DATASET  Friends (GenericType) PRIMARY KEY _id AUTOGENERATED;
> INSERT INTO     Users [
>   { "user_id": 1, "name": "Sally",
>     "address": { "zip_code": "92512", "state": "CA" } },
>   { "user_id": 2, "name": "Mark" },
>   { "user_id": 3, "name": "John" },
>   { "user_id": 4, "name": "Tracy", "title": "Dr" }
> ];
> INSERT INTO     Friends [
>   { "user_id": 1, "friend": 2 },
>   { "user_id": 1, "friend": 3 },
>   { "user_id": 2, "friend": 1 },
>   { "user_id": 3, "friend": 1 }
> ];{code}
> ----
> Now we can get into the examples:
> 1. Get me all users U1 that have friends. Do not include the F dataset. In 
> the example below, we have a 'SELECT *' followed by an 'EXCEPT ( F )'. In 
> contrast to BigQuery, we do not specify columns / fields in our exclusion 
> list here, instead we specify an exclusion of _variables_ that would have 
> otherwise been returned with our 'SELECT' clause.
> {code:sql}
> FROM   Users U1, Friends F
> WHERE  U1.user_id = F.user_id
> SELECT * EXCEPT F;{code}
> Returns the following:
> {code:java}
> > { "U1":  { "user_id": 1, "name": "Sally",
>              "address": { "zip_code": "92512", "state": "CA" } } }
> > { "U1": { "user_id": 2, "name": "Mark" } }
> > { "U1": { "user_id": 3, "name": "John" } }{code}
> 2. Get me all users U1 that have friends. Do not include that user's address 
> and title. The example below more closely aligns with BigQuery, as we specify 
> columns / fields that we want to exclude from U1 in the exclusion list 
> 'address' and 'title'.
> {code:sql}
> FROM   Users U1, Friends F
> WHERE  U1.user_id = F.user_id 
> SELECT DISTINCT U1.* EXCEPT address, title;{code}
> Returns the following:
> {code:java}
> > { "user_id": 1, "name": "Sally" }
> > { "user_id": 2, "name": "Mark" }
> > { "user_id": 3, "name": "John" }{code}
> 3. If we want to exclude some field in a nested object, we can specify the 
> fields within our object with the '.' syntax. Say we don't want to include 
> the zipcode of an address object within a user that has friends. We would use 
> the query below:
> {code:sql}
> FROM   Users U1, Friends F
> WHERE  U1.user_id = F.user_id
> SELECT DISTINCT U1.* EXCEPT address.zip_code;{code}
> Returns the following:
> {code:java}
> > { "user_id": 1, "name": "Sally", "address": { "state": "CA" } }
> > { "user_id": 2, "name": "Mark" }
> > { "user_id": 3, "name": "John" }{code}
> 4. Now suppose we want users and the user documents of their immediate 
> friends, excluding addresses from both user documents. We would use the 
> following query:
> {code:sql}
> FROM   Users U1, Friends F, Users U2
> WHERE  U1.user_id = F.user_id AND
>        U2.user_id = F.friend
> SELECT U1, U2 EXCEPT U1.address, U2.address;{code}
> Returns the following:
> {code:java}
> > { "U1": { "user_id": 1, "name": "Sally" }, 
>     "U2": { "user_id": 2, "name": "Mark" } }
> > { "U1": { "user_id": 1, "name": "Sally" }, 
>     "U2": { "user_id": 3, "name": "John" } }
> > { "U1": { "user_id": 2, "name": "Mark" }, 
>     "U2": { "user_id": 1, "name": "Sally" } }
> > { "U1": { "user_id": 3, "name": "John" }, 
>     "U2": { "user_id": 1, "name": "Sally" } }{code}
> ----
> The exact grammar I have in mind is given below, where we simply add the 
> 'EXCEPT' at the tail of our 'SELECT' clause. Logically, the 'EXCEPT' clause 
> surrounds our SELECT- EXPRESSION with a call to OBJECT_REMOVE_FIELDS. Using 
> #2 as an example, a logical equivalent would be:
> {code:sql}
> FROM   (
>   FROM   Users U1, Friends F
>   WHERE  U1.user_id = F.user_id
>   SELECT DISTINCT U1.* 
> ) TMP
> SELECT VALUE OBJECT_REMOVE_FIELDS(TMP, ["address", "title"]);{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (ASTERIXDB-3059) EXCEPT in SelectClause

Reply via email to