maropu commented on a change in pull request #21479:
URL: https://github.com/apache/spark/pull/21479#discussion_r409933251



##########
File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##########
@@ -592,6 +592,7 @@ primaryExpression
     | identifier                                                               #columnReference
     | base=primaryExpression '.' fieldName=identifier                          #dereference
     | '(' expression ')'                                                       #parenthesizedExpression
+    | EXTRACT '(' field=identifier FROM source=valueExpression ')'             #extract

Review comment:
       Ah, I see. Nice catch! The Python script that we are working on 
(https://github.com/apache/spark/pull/28224) just dumps the entries of 
`ExpressionDescription(ExpressionInfo)`, so its output unfortunately cannot 
include a doc entry for `EXTRACT` for now. I can think of three options to 
document it:
   
    - (The simplest fix) Add a description of `EXTRACT` to the SELECT 
syntax page (e.g., in the `named_expression` section), then add a link to 
`date_part` in the built-in function page.
   
    - Add a dummy `ExpressionDescription` for `EXTRACT`, like this:
   ```
   @ExpressionDescription(
     usage = "_FUNC_(field FROM source) - Extracts a part of the date/timestamp or interval source.",
     arguments = """ ... """,
     examples = """
       Examples:
         > SELECT _FUNC_(YEAR FROM TIMESTAMP '2019-08-12 01:00:00.123456');
          2019
     """,
     since = "3.0.0")
   case class Extract(...) extends DatePart(field, source, child)
   ```
   
    - Add a new entry for an alias name in `ExpressionDescription`, like this:
   ```
   @ExpressionDescription(
     usage = "_FUNC_(field FROM source) - Extracts a part of the date/timestamp or interval source.",
     arguments = """... """,
     alias = "extract",
     examples = """
       Examples:
         ...
         > SELECT _FUNC_ALIAS_(seconds FROM interval 5 hours 30 seconds 1 milliseconds 1 microseconds);
          30.001001
     """,
     since = "3.0.0")
   case class DatePart(...) extends RuntimeReplaceable {
   ```
   Which one do you prefer? Or does anyone have a smarter idea?
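
   To make the alias option more concrete, here is a rough sketch of how a doc script could emit an extra entry when an expression carries an alias. This is only an illustration, not the script from the PR: the `ExpressionInfo` dataclass, its `alias` field, and `render_entries` are hypothetical names invented here, and the `###` heading format is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpressionInfo:
    """Hypothetical stand-in for the per-function fields a doc script reads."""
    name: str
    usage: str
    alias: Optional[str] = None  # assumed new field (the alias option above)


def render_entries(infos: List[ExpressionInfo]) -> List[str]:
    """Return a heading and a usage line per function, plus a stub per alias."""
    lines = []
    for info in infos:
        lines.append(f"### {info.name}")
        # Substitute the placeholder with the registered function name.
        lines.append(info.usage.replace("_FUNC_", info.name))
        if info.alias is not None:
            # Emit a short cross-reference so the alias gets its own doc entry.
            lines.append(f"### {info.alias}")
            lines.append(f"Alias syntax for `{info.name}`.")
    return lines


date_part = ExpressionInfo(
    name="date_part",
    usage="_FUNC_(field, source) - Extracts a part of the date/timestamp or interval source.",
    alias="extract",
)
doc = render_entries([date_part])
print("\n".join(doc))
```

   Without the `alias` field, the loop only ever sees `date_part`, which is why `EXTRACT` currently gets no entry of its own.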
   






