[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r213001675 ## File path: docs/dev/table/sqlClient.md ## @@ -459,6 +466,50 @@ Web interface: http://localhost:8081 {% top %} +SQL Views +- + +Views allow to define virtual tables from SQL queries. The view definition is parsed and validated immediately. However, the actual execution happens when the view is accessed during the submission of a general `INSERT INTO` or `SELECT` statement. + +Views can either be defined in [environment files](sqlClient.html#environment-files) or within the CLI session. + +The following example shows how to define multiple views in a file: + +{% highlight yaml %} +views: + - name: MyRestrictedView +query: "SELECT MyField2 FROM MyTableSource" + - name: MyComplexView +query: > + SELECT MyField2 + 42, CAST(MyField1 AS VARCHAR) + FROM MyTableSource + WHERE MyField2 > 200 +{% endhighlight %} + +Similar to table sources and sinks, views defined in a session environment file have highest precendence. + +Views can also be created within a CLI session using the `CREATE VIEW` statement: + +{% highlight text %} +CREATE VIEW MyNewView AS SELECT MyField2 FROM MyTableSource +{% endhighlight %} + +The `SHOW VIEW` statement allows for printing a previously created view again: Review comment: OK, let's remove it for now and figure that out later. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212979054 ## File path: flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliClient.java ## @@ -420,17 +420,17 @@ private void callCreateView(SqlCommandCall cmdCall) { final String name = cmdCall.operands[0]; final String query = cmdCall.operands[1]; + final String previousQuery = context.getViews().get(name); Review comment: I think we should not allow to override a view. It should be explicitly dropped first. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212977695 ## File path: docs/dev/table/sqlClient.md ## @@ -459,6 +466,50 @@ Web interface: http://localhost:8081 {% top %} +SQL Views +- + +Views allow to define virtual tables from SQL queries. The view definition is parsed and validated immediately. However, the actual execution happens when the view is accessed during the submission of a general `INSERT INTO` or `SELECT` statement. + +Views can either be defined in [environment files](sqlClient.html#environment-files) or within the CLI session. + +The following example shows how to define multiple views in a file: + +{% highlight yaml %} +views: + - name: MyRestrictedView +query: "SELECT MyField2 FROM MyTableSource" + - name: MyComplexView +query: > + SELECT MyField2 + 42, CAST(MyField1 AS VARCHAR) + FROM MyTableSource + WHERE MyField2 > 200 +{% endhighlight %} + +Similar to table sources and sinks, views defined in a session environment file have highest precendence. + +Views can also be created within a CLI session using the `CREATE VIEW` statement: + +{% highlight text %} +CREATE VIEW MyNewView AS SELECT MyField2 FROM MyTableSource +{% endhighlight %} + +The `SHOW VIEW` statement allows for printing a previously created view again: Review comment: OK, I don't mind listing views are tables in `SHOW TABLES`. However, if there is no way to distinguish a view from a table, we can also remove the `SHOW VIEW xxx` command for now. Why should somebody try to look up the definition of a view if he's not aware that it is a view. He can only know that if he created the view himself and in that case probably know the definition. I also think the syntax of `SHOW VIEW xxx` is not well chosen because it is inconsistent with `SHOW TABLES` and `SHOW FUNCTIONS`. It gives *details about a single view* where as the other `SHOW` commands *list all elements of a type*. I think the semantics are more similar to the `DESCRIBE` command. We could rename `SHOW VIEW` to ``DESCRIBE VIEW`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212977695 ## File path: docs/dev/table/sqlClient.md ## @@ -459,6 +466,50 @@ Web interface: http://localhost:8081 {% top %} +SQL Views +- + +Views allow to define virtual tables from SQL queries. The view definition is parsed and validated immediately. However, the actual execution happens when the view is accessed during the submission of a general `INSERT INTO` or `SELECT` statement. + +Views can either be defined in [environment files](sqlClient.html#environment-files) or within the CLI session. + +The following example shows how to define multiple views in a file: + +{% highlight yaml %} +views: + - name: MyRestrictedView +query: "SELECT MyField2 FROM MyTableSource" + - name: MyComplexView +query: > + SELECT MyField2 + 42, CAST(MyField1 AS VARCHAR) + FROM MyTableSource + WHERE MyField2 > 200 +{% endhighlight %} + +Similar to table sources and sinks, views defined in a session environment file have highest precendence. + +Views can also be created within a CLI session using the `CREATE VIEW` statement: + +{% highlight text %} +CREATE VIEW MyNewView AS SELECT MyField2 FROM MyTableSource +{% endhighlight %} + +The `SHOW VIEW` statement allows for printing a previously created view again: Review comment: OK, I don't mind listing views are tables in `SHOW TABLES`. However, if there is no way to distinguish a view from a table, we can also remove the `SHOW VIEW xxx` command. Why should somebody try to look up the definition of a view if he's not aware that it is a view. He can only know that if he created the view himself and in that case probably know the definition. I also think the syntax of `SHOW VIEW xxx` is not well chosen because it is inconsistent with `SHOW TABLES` and `SHOW FUNCTIONS`. It gives *details about a single view* where as the other `SHOW` commands *list all elements of a type*. I think the semantics are more similar to the `DESCRIBE` command. We could rename `SHOW VIEW` to ``DESCRIBE VIEW`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212621362 ## File path: docs/dev/table/sqlClient.md ## @@ -459,6 +466,50 @@ Web interface: http://localhost:8081 {% top %} +SQL Views +- + +Views allow to define virtual tables from SQL queries. The view definition is parsed and validated immediately. However, the actual execution happens when the view is accessed during the submission of a general `INSERT INTO` or `SELECT` statement. + +Views can either be defined in [environment files](sqlClient.html#environment-files) or within the CLI session. + +The following example shows how to define multiple views in a file: + +{% highlight yaml %} +views: + - name: MyRestrictedView +query: "SELECT MyField2 FROM MyTableSource" + - name: MyComplexView +query: > + SELECT MyField2 + 42, CAST(MyField1 AS VARCHAR) + FROM MyTableSource + WHERE MyField2 > 200 +{% endhighlight %} + +Similar to table sources and sinks, views defined in a session environment file have highest precendence. + +Views can also be created within a CLI session using the `CREATE VIEW` statement: + +{% highlight text %} +CREATE VIEW MyNewView AS SELECT MyField2 FROM MyTableSource +{% endhighlight %} + +The `SHOW VIEW` statement allows for printing a previously created view again: Review comment: I think the way that these commands work is a bit inconsistent with the `SHOW TABLES`, `SHOW FUNCTIONS`, and `DESCRIBE xxx` commands. 1. Views are listed as regular tables when calling `SHOW TABLES`. * List tables and views together, but mark views as views * Or add a `SHOW VIEWS` to only list views (and only list tables with `LIST TABLES`) 2. The schema of a view is returned with `DESCRIBE xxx`. This is fine, but IMO, we should also return the query. Showing schema and query with different commands is not intuitive to me. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212693978 ## File path: flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/CliClient.java ## @@ -390,51 +395,101 @@ private void callSelect(SqlCommandCall cmdCall) { view.open(); // view left - terminal.writer().println(CliStrings.messageInfo(CliStrings.MESSAGE_RESULT_QUIT).toAnsi()); - terminal.flush(); + printInfo(CliStrings.MESSAGE_RESULT_QUIT); } catch (SqlExecutionException e) { - printException(e); + printExecutionException(e); } } private boolean callInsertInto(SqlCommandCall cmdCall) { - terminal.writer().println(CliStrings.messageInfo(CliStrings.MESSAGE_SUBMITTING_STATEMENT).toAnsi()); - terminal.flush(); + printInfo(CliStrings.MESSAGE_SUBMITTING_STATEMENT); try { final ProgramTargetDescriptor programTarget = executor.executeUpdate(context, cmdCall.operands[0]); terminal.writer().println(CliStrings.messageInfo(CliStrings.MESSAGE_STATEMENT_SUBMITTED).toAnsi()); terminal.writer().println(programTarget.toString()); terminal.flush(); } catch (SqlExecutionException e) { - printException(e); + printExecutionException(e); return false; } return true; } - private void callSource(SqlCommandCall cmdCall) { - final String pathString = cmdCall.operands[0]; + private void callCreateView(SqlCommandCall cmdCall) { + final String name = cmdCall.operands[0]; + final String query = cmdCall.operands[1]; + + try { + // validate with a copy + final SessionContext contextCopy = context.copy(); Review comment: This check fails if the VIEW query contains UDFs. It seems that these are not registered yet. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212609696 ## File path: docs/dev/table/sqlClient.md ## @@ -459,6 +466,50 @@ Web interface: http://localhost:8081 {% top %} +SQL Views +- + +Views allow to define virtual tables from SQL queries. The view definition is parsed and validated immediately. However, the actual execution happens when the view is accessed during the submission of a general `INSERT INTO` or `SELECT` statement. + +Views can either be defined in [environment files](sqlClient.html#environment-files) or within the CLI session. + +The following example shows how to define multiple views in a file: + +{% highlight yaml %} +views: + - name: MyRestrictedView +query: "SELECT MyField2 FROM MyTableSource" + - name: MyComplexView +query: > + SELECT MyField2 + 42, CAST(MyField1 AS VARCHAR) + FROM MyTableSource + WHERE MyField2 > 200 +{% endhighlight %} + +Similar to table sources and sinks, views defined in a session environment file have highest precendence. + +Views can also be created within a CLI session using the `CREATE VIEW` statement: + +{% highlight text %} +CREATE VIEW MyNewView AS SELECT MyField2 FROM MyTableSource +{% endhighlight %} + +The `SHOW VIEW` statement allows for printing a previously created view again: + +{% highlight text %} +SHOW VIEW MyNewView +{% endhighlight %} + +Views created within a CLI session can also be removed again using the `DROP VIEW` statement: + +{% highlight text %} +DROP VIEW MyNewView +{% endhighlight %} + +Attention The definition of views is limited to the mentioned semantics above. Defining a schema for views or escape whitespaces in table names will be supported in future versions. Review comment: semantics -> syntax This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212622912 ## File path: flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/cli/SqlCommandParser.java ## @@ -30,86 +35,124 @@ private SqlCommandParser() { } public static Optional parse(String stmt) { - String trimmed = stmt.trim(); + // normalize + stmt = stmt.trim(); // remove ';' at the end because many people type it intuitively - if (trimmed.endsWith(";")) { - trimmed = trimmed.substring(0, trimmed.length() - 1); + if (stmt.endsWith(";")) { + stmt = stmt.substring(0, stmt.length() - 1).trim(); } + + // parse for (SqlCommand cmd : SqlCommand.values()) { - int pos = 0; - int tokenCount = 0; - for (String token : trimmed.split("\\s")) { - pos += token.length() + 1; // include space character - // check for content - if (token.length() > 0) { - // match - if (tokenCount < cmd.tokens.length && token.equalsIgnoreCase(cmd.tokens[tokenCount])) { - if (tokenCount == cmd.tokens.length - 1) { - final SqlCommandCall call = new SqlCommandCall( - cmd, - splitOperands(cmd, trimmed, trimmed.substring(Math.min(pos, trimmed.length( - ); - return Optional.of(call); - } - } else { - // next sql command - break; - } - tokenCount++; // check next token + final Pattern pattern = Pattern.compile(cmd.matchingRegex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL); Review comment: Pattern compilation is quite heavy. Can we do this just once and not for every entered command? Maybe store the compiled pattern in the enum? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client
fhueske commented on a change in pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client URL: https://github.com/apache/flink/pull/6606#discussion_r212654542 ## File path: flink-libraries/flink-sql-client/src/main/java/org/apache/flink/table/client/config/Environment.java ## @@ -67,24 +75,36 @@ public Environment() { public void setTables(List> tables) { this.tables = new HashMap<>(tables.size()); tables.forEach(config -> { - if (!config.containsKey(TABLE_NAME)) { - throw new SqlClientException("The 'name' attribute of a table is missing."); - } - final Object nameObject = config.get(TABLE_NAME); - if (nameObject == null || !(nameObject instanceof String) || ((String) nameObject).length() <= 0) { - throw new SqlClientException("Invalid table name '" + nameObject + "'."); - } - final String name = (String) nameObject; + final String name = extractEarlyStringProperty(config, TABLE_NAME, "table"); final Map properties = new HashMap<>(config); properties.remove(TABLE_NAME); - if (this.tables.containsKey(name)) { + if (this.tables.containsKey(name) || this.views.containsKey(name)) { throw new SqlClientException("Duplicate table name '" + name + "'."); } this.tables.put(name, createTableDescriptor(name, properties)); }); } + public Map getViews() { + return views; + } + + public void setViews(List> views) { + // the order of how views are registered matters because + // they might reference each other + this.views = new LinkedHashMap<>(views.size()); + views.forEach(config -> { + final String name = extractEarlyStringProperty(config, VIEW_NAME, "view"); + final String query = extractEarlyStringProperty(config, VIEW_QUERY, "view"); + + if (this.tables.containsKey(name) || this.views.containsKey(name)) { + throw new SqlClientException("Duplicate table name '" + name + "'."); Review comment: `table name` -> `view name` (or rewrite error message to sth like "Cannot create view XXX because another table or view with that name is already registered." This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services