Change in asterixdb[master]: Making the SQL++ reference manual a bit more generic in how ...

Yingyi Bu (Code Review) Mon, 03 Oct 2016 12:01:01 -0700

Yingyi Bu has submitted this change and it was merged.

Change subject: Making the SQL++ reference manual a bit more generic in how it reads.
......................................................................



Making the SQL++ reference manual a bit more generic in how it reads.

Change-Id: I184ede1398de3190b60bec2947d826bdc5278594
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1237
Sonar-Qube: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Reviewed-by: Yingyi Bu <buyin...@gmail.com>
---
M asterixdb/asterix-doc/src/main/markdown/sqlpp/1_intro.md
M asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md
M asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
M asterixdb/asterix-doc/src/main/markdown/sqlpp/4_ddl.md
4 files changed, 46 insertions(+), 31 deletions(-)

Approvals:
  Yingyi Bu: Looks good to me, approved
  Jenkins: Verified; No violations found



diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/1_intro.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/1_intro.md
index 808d713..fdc04cb 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/1_intro.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/1_intro.md
@@ -19,7 +19,22 @@
 
 # <a id="Introduction">1. Introduction</a><font size="3"/>
 
-This document is intended as a reference guide to the full syntax and semantics of the SQL++ Query Language, a SQL-inspired language for working with semistructured data. SQL++ has much in common with SQL, but there are also differences due to the data model that the language is designed to serve. (SQL was designed in the 1970's for interacting with the flat, schema-ified world of relational databases, while SQL++ is designed for the nested, schema-less/schema-optional world of modern NoSQL systems.) In particular, SQL++ in the context of Apache AsterixDB is intended for working with the Asterix Data Model (ADM), which is a data model aimed at a superset of JSON with an enriched and flexible type system.
+This document is intended as a reference guide to the full syntax and semantics of
+the SQL++ Query Language, a SQL-inspired language for working with semistructured data.
+SQL++ has much in common with SQL, but some differences do exist due to the different
+data models that the two languages were designed to serve.
+SQL was designed in the 1970's for interacting with the flat, schema-ified world of
+relational databases, while SQL++ is much newer and targets the nested, schema-optional
+(or even schema-less) world of modern NoSQL systems.
 
-New AsterixDB users are encouraged to read and work through the (friendlier) guide "AsterixDB 101: An ADM and SQL++ Primer" before attempting to make use of this document. In addition, readers are advised to read and understand the Asterix Data Model (ADM) reference guide since a basic understanding of ADM concepts is a prerequisite to understanding SQL++. In what follows, we detail the features of the SQL++ language in a grammar-guided manner: we list and briefly explain each of the productions in the SQL++ grammar, offering examples (and results) for clarity.
+In the context of Apache AsterixDB, SQL++ is intended for working with the Asterix Data Model (ADM),
+a data model based on a superset of JSON with an enriched and flexible type system.
+New AsterixDB users are encouraged to read and work through the (much friendlier) guide
+"AsterixDB 101: An ADM and SQL++ Primer" before attempting to make use of this document.
+In addition, readers are advised to read through the Asterix Data Model (ADM) reference guide
+first as well, as an understanding of the data model is a prerequisite to understanding SQL++.
+
+In what follows, we detail the features of the SQL++ language in a grammar-guided manner.
+We list and briefly explain each of the productions in the SQL++ grammar, offering examples
+(and results) for clarity.
 
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md
index c2bab77..732daa4 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/2_expr.md
@@ -21,7 +21,7 @@
 
     Expression ::= OperatorExpression | CaseExpression | QuantifiedExpression
 
-SQL++ is a highly composable expression language. Each SQL++ expression returns zero or more Asterix Data Model (ADM) instances. There are three major kinds of expressions in SQL++. At the topmost level, a SQL++ expression can be an OperatorExpression (similar to a mathematical expression), an ConditionalExpression (to choose between alternative values), or a QuantifiedExpression (which yields a boolean value). Each will be detailed as we explore the full SQL++ grammar.
+SQL++ is a highly composable expression language. Each SQL++ expression returns zero or more data model instances. There are three major kinds of expressions in SQL++. At the topmost level, a SQL++ expression can be an OperatorExpression (similar to a mathematical expression), an ConditionalExpression (to choose between alternative values), or a QuantifiedExpression (which yields a boolean value). Each will be detailed as we explore the full SQL++ grammar.
 
 ## <a id="Primary_expressions">Primary Expressions</a>
 
@@ -29,9 +29,9 @@
                   | VariableReference
                   | ParenthesizedExpression
                   | FunctionCallExpression
-                  | Constructor
+                 | Constructor
 
-The most basic building block for any SQL++ expression is PrimaryExpression. This can be a simple literal (constant) value, a reference to a query variable that is in scope, a parenthesized expression, a function call, or a newly constructed instance of the Asterix Data Model (such as a newly constructed ADM record or list of ADM instances).
+The most basic building block for any SQL++ expression is PrimaryExpression. This can be a simple literal (constant) value, a reference to a query variable that is in scope, a parenthesized expression, a function call, or a newly constructed instance of the data model (such as a newly constructed record or list of data model instances).
 
 ### <a id="Literals">Literals</a>
 
@@ -75,7 +75,7 @@
     <LETTER>    ::= ["A" - "Z", "a" - "z"]
     DelimitedIdentifier   ::= "\`" (<ESCAPE_APOS> | ~["\'"])* "\`"
 
-A variable in SQL++ can be bound to any legal ADM value. A variable reference refers to the value to which an in-scope variable is bound. (E.g., a variable binding may originate from one of the `FROM`, `WITH` or `LET` clauses of a `SELECT` statement or from an input parameter in the context of a function body.) Backticks, e.g., \`id\`, are used for delimited identifiers. Delimiting is needed when a variable's desired name clashes with a SQL++ keyword or includes characters not allowed in regular identifiers.
+A variable in SQL++ can be bound to any legal data model value. A variable reference refers to the value to which an in-scope variable is bound. (E.g., a variable binding may originate from one of the `FROM`, `WITH` or `LET` clauses of a `SELECT` statement or from an input parameter in the context of a function body.) Backticks, e.g., \`id\`, are used for delimited identifiers. Delimiting is needed when a variable's desired name clashes with a SQL++ keyword or includes characters not allowed in regular identifiers.
 
 ##### Examples
 
@@ -100,7 +100,7 @@
 
     FunctionCallExpression ::= FunctionName "(" ( Expression ( "," Expression )* )? ")"
 
-Functions are included in SQL++, like most languages, as a way to package useful functionality or to componentize complicated or reusable SQL++ computations. A function call is a legal SQL++ query expression that represents the ADM value resulting from the evaluation of its body expression with the given parameter bindings; the parameter value bindings can themselves be any SQL++ expressions.
+Functions are included in SQL++, like most languages, as a way to package useful functionality or to componentize complicated or reusable SQL++ computations. A function call is a legal SQL++ query expression that represents the value resulting from the evaluation of its body expression with the given parameter bindings; the parameter value bindings can themselves be any SQL++ expressions.
 
 The following example is a (built-in) function call expression whose value is 8.
 
@@ -116,7 +116,7 @@
     RecordConstructor        ::= "{" ( FieldBinding ( "," FieldBinding )* )? "}"
     FieldBinding             ::= Expression ":" Expression
 
-A major feature of SQL++ is its ability to construct new ADM data instances. This is accomplished using its constructors for each of the major ADM complex object structures, namely lists (ordered or unordered) and records. Ordered lists are like JSON arrays, while unordered lists have multiset (bag) semantics. Records are built from attributes that are field-name/field-value pairs, again like JSON. (See the AsterixDB Data Model document for more details on each.)
+A major feature of SQL++ is its ability to construct new data model instances. This is accomplished using its constructors for each of the model's complex object structures, namely lists (ordered or unordered) and records. Ordered lists are like JSON arrays, while unordered lists have multiset (bag) semantics. Records are built from attributes that are field-name/field-value pairs, again like JSON. (See the data model document for more details on each.)
 
 The following examples illustrate how to construct a new ordered list with 3 items, a new record with 2 fields, and a new unordered list with 4 items, respectively. List elements can be homogeneous (as in the first example), which is the common case, or they may be heterogeneous (as in the third example). The data values and field name values used to construct lists and records in constructors are all simply SQL++ expressions. Thus, the list elements, field names, and field values used in constructors can be simple literals or they can come from query variable references or even arbitrarily complex SQL++ expressions (subqueries).
 
@@ -125,8 +125,8 @@
     [ 'a', 'b', 'c' ]
 
     {
-      'project name': 'AsterixDB',
-      'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras' ]
+      'project name': 'Hyracks',
+      'project members': [ 'vinayakb', 'dtabass', 'chenli', 'tsotras', 'tillw' ]
     }
 
     {{ 42, "forty-two!", { "rank": "Captain", "name": "America" }, 3.14159 }}
@@ -137,7 +137,7 @@
     Field           ::= "." Identifier
     Index           ::= "[" ( Expression | "?" ) "]"
 
-Components of complex types in ADM are accessed via path expressions. Path access can be applied to the result of a SQL++ expression that yields an instance of  a complex type, e.g., a record or list instance. For records, path access is based on field names. For ordered lists, path access is based on (zero-based) array-style indexing. SQL++ also supports an "I'm feeling lucky" style index accessor, [?], for selecting an arbitrary element from an ordered list. Attempts to access non-existent fields or out-of-bound list elements produce the special value `MISSING`.
+Components of complex types in the data model are accessed via path expressions. Path access can be applied to the result of a SQL++ expression that yields an instance of  a complex type, e.g., a record or list instance. For records, path access is based on field names. For ordered lists, path access is based on (zero-based) array-style indexing. SQL++ also supports an "I'm feeling lucky" style index accessor, [?], for selecting an arbitrary element from an ordered list. Attempts to access non-existent fields or out-of-bound list elements produce the special value `MISSING`.
 
 The following examples illustrate field access for a record, index-based element access for an ordered list, and also a composition thereof.
 
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
index c6dcf61..bfe4f0e 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
@@ -72,7 +72,7 @@
     OrderbyClause      ::= <ORDER> <BY> Expression ( <ASC> | <DESC> )? ( "," Expression ( <ASC> | <DESC> )? )*
     LimitClause        ::= <LIMIT> Expression ( <OFFSET> Expression )?
 
-In this section, we will make use of two stored collections of records (datasets in ADM parlance), `GleambookUsers` and `GleambookMessages`, in a series of running examples to explain `SELECT` queries. The contents of the example collections are as follows:
+In this section, we will make use of two stored collections of records (datasets), `GleambookUsers` and `GleambookMessages`, in a series of running examples to explain `SELECT` queries. The contents of the example collections are as follows:
 
 `GleambookUsers` collection:
 
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/4_ddl.md b/asterixdb/asterix-doc/src/main/markdown/sqlpp/4_ddl.md
index a2eebbd..217a670 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/4_ddl.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/4_ddl.md
@@ -30,15 +30,16 @@
                       | DeleteStatement
                       | Query ";"
 
-In addition to queries, the AsterixDB implementation of SQL++ supports statements for data definition and
-manipulation purposes as well as controlling the context to be used in evaluating SQL++ expressions.
-This section details the DDL and DML statements supported in the SQL++ language as realized in Apache AsterixDB.
+In addition to queries, an implementation of SQL++ needs to support statements for data definition
+and manipulation purposes as well as controlling the context to be used in evaluating SQL++ expressions.
+This section details the DDL and DML statements supported in the SQL++ language as realized today in
+Apache AsterixDB.
 
 ## <a id="Declarations">Declarations</a>
 
     DatabaseDeclaration ::= "USE" Identifier
 
-The world of data in an AsterixDB instance is organized into data namespaces called **dataverses**.
+At the uppermost level, the world of data is organized into data namespaces called **dataverses**.
 To set the default dataverse for a series of statements, the USE statement is provided in SQL++.
 
 As an example, the following statement sets the default dataverse to be "TinySocial".
@@ -116,15 +117,15 @@
     OrderedListTypeDef   ::= "[" ( TypeExpr ) "]"
     UnorderedListTypeDef ::= "{{" ( TypeExpr ) "}}"
 
-The CREATE TYPE statement is used to create a new named ADM datatype.
-This type can then be used to create stored collections or utilized when defining one or more other ADM datatypes.
-Much more information about the Asterix Data Model (ADM) is available in the [data model reference guide](datamodel.html) to ADM.
+The CREATE TYPE statement is used to create a new named datatype.
+This type can then be used to create stored collections or utilized when defining one or more other datatypes.
+Much more information about the data model is available in the [data model reference guide](datamodel.html).
 A new type can be a record type, a renaming of another type, an ordered list type, or an unordered list type.
 A record type can be defined as being either open or closed.
 Instances of a closed record type are not permitted to contain fields other than those specified in the create type statement.
 Instances of an open record type may carry additional fields, and open is the default for new types if neither option is specified.
 
-The following example creates a new ADM record type called GleambookUser type.
+The following example creates a new record type called GleambookUser type.
 Since it is defined as (defaulting to) being an open type,
 instances will be permitted to contain more than what is specified in the type definition.
 The first four fields are essentially traditional typed name/value pairs (much like SQL fields).
@@ -142,7 +143,7 @@
       employment: [ EmploymentType ]
     };
 
-The next example creates a new ADM record type, closed this time, called MyUserTupleType.
+The next example creates a new record type, closed this time, called MyUserTupleType.
 Instances of this closed type will not be permitted to have extra fields,
 although the alias field is marked as optional and may thus be NULL or MISSING in legal instances of the type.
 Note that the type of the id field in the example is UUID.
@@ -177,7 +178,7 @@
     CompactionPolicy     ::= Identifier
 
 The CREATE DATASET statement is used to create a new dataset.
-Datasets are named, unordered collections of ADM record type instances;
+Datasets are named, unordered collections of record type instances;
 they are where data lives persistently and are the usual targets for SQL++ queries.
 Datasets are typed, and the system ensures that their contents conform to their type definitions.
 An Internal dataset (the default kind) is a dataset whose content lives within and is managed by the system.
@@ -190,8 +191,8 @@
 
 Another advanced option, when creating an Internal dataset, is to specify the merge policy to control which of the
 underlying LSM storage components to be merged.
-(AsterixDB supports Log-Structured Merge tree based physical storage for Internal datasets.)
-Apache AsterixDB currently supports four different component merging policies that can be chosen per dataset:
+(The system supports Log-Structured Merge tree based physical storage for Internal datasets.)
+Currently the system supports four different component merging policies that can be chosen per dataset:
 no-merge, constant, prefix, and correlated-prefix.
 The no-merge policy simply never merges disk components.
 The constant policy merges disk components when the number of components reaches a constant number k that can be configured by the user.
@@ -200,14 +201,14 @@
 If such a sequence exists, the components in the sequence are merged together to form a single component.
 Finally, the correlated-prefix policy is similar to the prefix policy, but it delegates the decision of merging the disk components of all the indexes in a dataset to the primary index.
 When the correlated-prefix policy decides that the primary index needs to be merged (using the same decision criteria as for the prefix policy), then it will issue successive merge requests on behalf of all other indexes associated with the same dataset.
-The default policy for AsterixDB is the prefix policy except when there is a filter on a dataset, where the preferred policy for filters is the correlated-prefix.
+The system's default policy is the prefix policy except when there is a filter on a dataset, where the preferred policy for filters is the correlated-prefix.
 
 Another advanced option shown in the syntax above, related to performance and mentioned above, is that a **filter** can optionally be created on a field to further optimize range queries with predicates on the filter's field.
 Filters allow some range queries to avoid searching all LSM components when the query conditions match the filter.
 (Refer to [Filter-Based LSM Index Acceleration](filters.html) for more information about filters.)
 
 An External dataset, in contrast to an Internal dataset, has data stored outside of the system's control.
-Files living in HDFS or in the local filesystem(s) of a cluster's nodes are currently supported in AsterixDB.
+Files living in HDFS or in the local filesystem(s) of a cluster's nodes are currently supported.
 External dataset support allows SQL++ queries to treat foreign data as though it were stored in the system,
 making it possible to query "legacy" file data (e.g., Hive data) without having to physically import it.
 When defining an External dataset, an appropriate adapter type must be selected for the desired external data.
@@ -369,7 +370,7 @@
 (See the [guide to external data](externaldata.html) for more information on the available adapters.)
 If a dataset has an auto-generated primary key field, the file to be imported should not include that field in it.
 
-The following example shows how to bulk load the GleambookUsers dataset from an external file containing data that has been prepared in ADM format.
+The following example shows how to bulk load the GleambookUsers dataset from an external file containing data that has been prepared in ADM (Asterix Data Model) format.
 
 ##### Example
 
@@ -390,7 +391,7 @@
 (The system will automatically extend the provided record with this additional field and a corresponding value.)
 Insertion will fail if the dataset already has data with the primary key value(s) being inserted.
 
-In AsterixDB, inserts are processed transactionally.
+Inserts are processed transactionally by the system.
 The transactional scope of each insert transaction is the insertion of a single object plus its affiliated secondary index entries (if any).
 If the query part of an insert returns a single object, then the INSERT statement will be a single, atomic transaction.
 If the query part returns multiple objects, each object being inserted will be treated as a separate tranaction.
@@ -414,8 +415,7 @@
 
     UPSERT INTO UsersCopy (SELECT VALUE user FROM GleambookUsers user)
 
-*Editor's note: Upserts currently work in AQL but are apparently disabled at the moment in SQL++.
-(@Yingyi, is that indeed the case?)*
+*Editor's note: Upserts currently work in AQL but are not yet enabled (at the moment) in SQL++.
 
 ### <a id="Deletes">DELETEs</a>
 
@@ -424,7 +424,7 @@
 The SQL++ DELETE statement is used to delete data from a target dataset.
 The data to be deleted is identified by a boolean expression involving the variable bound to the target dataset in the DELETE statement.
 
-Deletes in AsterixDB are processed transactionally.
+Deletes are processed transactionally by the system.
 The transactional scope of each delete transaction is the deletion of a single object plus its affiliated secondary index entries (if any).
 If the boolean expression for a delete identifies a single object, then the DELETE statement itself will be a single, atomic transaction.
 If the expression identifies multiple objects, then each object deleted will be handled as a separate transaction.

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1237
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I184ede1398de3190b60bec2947d826bdc5278594
Gerrit-PatchSet: 3
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Michael Carey <dtab...@gmail.com>
Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Michael Carey <dtab...@gmail.com>
Gerrit-Reviewer: Yingyi Bu <buyin...@gmail.com>
