[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15677


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r86062599
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -898,8 +995,12 @@ case class ToUTCTimestamp(left: Expression, right: 
Expression)
  * Returns the date part of a timestamp or string.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Extracts the date part of the date or datetime 
expression expr.",
-  extended = "> SELECT _FUNC_('2009-07-30 04:17:52');\n '2009-07-30'")
+  usage = "_FUNC_(date) - Extracts the date part of the date or timestamp 
expression.",
--- End diff --

Hm, primarily this takes `DateType` via implicit conversion. I understand that, 
in practice, it is fair to mention other data types since the input is basically 
cast, but in that case we should mention something like `string expression` too. 
I hope we can improve this when describing the arguments as well (we may have to 
change `date` to something like `date/timestamp/string`).
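
For reference, the cast-then-truncate behavior being discussed can be sketched in plain Python (a stdlib analogue of `to_date`, not the Spark API; the fixed format string is an assumption for the example):

```python
from datetime import datetime

def to_date(expr: str) -> str:
    """Extract the date part of a timestamp-like string.

    A stand-in for Spark SQL's to_date: a string input is first parsed
    (cast) to a timestamp, then truncated to its date part.
    """
    ts = datetime.strptime(expr, "%Y-%m-%d %H:%M:%S")
    return ts.date().isoformat()

print(to_date("2009-07-30 04:17:52"))  # 2009-07-30
```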





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r86062169
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -109,8 +121,12 @@ case class MapValues(child: Expression)
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(array(obj1, obj2, ...), ascendingOrder) - Sorts the 
input array in ascending order according to the natural ordering of the array 
elements.",
-  extended = " > SELECT _FUNC_(array('b', 'd', 'c', 'a'), true);\n 'a', 
'b', 'c', 'd'")
+  usage = "_FUNC_(array[, ascendingOrder]) - Sorts the input array in 
ascending order according to the natural ordering of the array elements.",
--- End diff --

Sorry, do you mean change `_FUNC_(array[, ascendingOrder])` to 
`_FUNC_(array(obj1, obj2, ...), ascendingOrder)`?
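
For context, the documented `sort_array` behavior, sketched as a stdlib Python analogue (not the Spark implementation):

```python
def sort_array(arr, ascending=True):
    # Stand-in for Spark SQL's sort_array: sorts by the natural ordering
    # of the elements, with an optional ascendingOrder flag (default true).
    return sorted(arr, reverse=not ascending)

print(sort_array(['b', 'd', 'c', 'a']))         # ['a', 'b', 'c', 'd']
print(sort_array(['b', 'd', 'c', 'a'], False))  # ['d', 'c', 'b', 'a']
```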





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r86000653
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -1200,7 +1333,7 @@ case class UnBase64(child: Expression) extends 
UnaryExpression with ImplicitCast
  * If either argument is null, the result will also be null.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(bin, str) - Decode the first argument using the second 
argument character set.")
+  usage = "_FUNC_(bin, str) - Decodes the first argument using the second 
argument character set.")
--- End diff --

The same here. Can you also add a test case to `StringFunctionsSuite`?





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85998364
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -1055,8 +1161,12 @@ case class Substring(str: Expression, pos: 
Expression, len: Expression)
  * A function that return the length of the given string or binary 
expression.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str | binary) - Returns the length of str or number of 
bytes in binary data.",
-  extended = "> SELECT _FUNC_('Spark SQL');\n 9")
+  usage = "_FUNC_(expr) - Returns the length of `str` or number of bytes 
in binary data.",
--- End diff --

Where is `str`?
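
For context, the behavior the usage string is describing, as a rough Python analogue (the name `length` here is illustrative, not Spark's implementation):

```python
def length(expr) -> int:
    # Stand-in for Spark SQL's length: the character length of a string,
    # or the number of bytes of binary data.
    return len(expr)

print(length("Spark SQL"))  # 9
```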





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85999566
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -1200,7 +1333,7 @@ case class UnBase64(child: Expression) extends 
UnaryExpression with ImplicitCast
  * If either argument is null, the result will also be null.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(bin, str) - Decode the first argument using the second 
argument character set.")
+  usage = "_FUNC_(bin, str) - Decodes the first argument using the second 
argument character set.")
--- End diff --

`_FUNC_(bin, str)` -> `_FUNC_(bin, charset)`
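
The suggested `_FUNC_(bin, charset)` signature maps naturally onto, e.g., this Python analogue (illustrative only, not Spark code):

```python
def decode(bin_value: bytes, charset: str) -> str:
    # Stand-in for Spark SQL's decode(bin, charset): decodes the first
    # argument using the character set named by the second argument.
    return bin_value.decode(charset)

print(decode(b"Spark SQL", "UTF-8"))  # Spark SQL
```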





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85999505
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -1200,7 +1333,7 @@ case class UnBase64(child: Expression) extends 
UnaryExpression with ImplicitCast
  * If either argument is null, the result will also be null.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(bin, str) - Decode the first argument using the second 
argument character set.")
+  usage = "_FUNC_(bin, str) - Decodes the first argument using the second 
argument character set.")
--- End diff --

`the second argument character set` -> 
```
the character set `charset`
```





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85998607
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -1177,7 +1305,12 @@ case class Base64(child: Expression) extends 
UnaryExpression with ImplicitCastIn
  * Converts the argument from a base 64 string to BINARY.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str) - Convert the argument from a base 64 string to 
binary.")
+  usage = "_FUNC_(str) - Convert the argument from a base 64 string `str` 
to a binary.",
--- End diff --

`Convert` -> `Converts`
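
For context, `unbase64`'s documented conversion, as a stdlib Python analogue (not Spark code):

```python
import base64

def unbase64(s: str) -> bytes:
    # Stand-in for Spark SQL's unbase64: converts a base-64 string to binary.
    return base64.b64decode(s)

print(unbase64("U3BhcmsgU1FM"))  # b'Spark SQL'
```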





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85997612
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -407,9 +436,15 @@ case class StringTranslate(srcExpr: Expression, 
matchingExpr: Expression, replac
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = """_FUNC_(str, str_array) - Returns the index (1-based) of the 
given string (left) in the comma-delimited list (right).
-Returns 0, if the string wasn't found or if the given string (left) 
contains a comma.""",
-  extended = "> SELECT _FUNC_('ab','abc,b,ab,c,def');\n 3")
+  usage = """
+_FUNC_(str, str_array) - Returns the index (1-based) of the given 
string (left) in the comma-delimited list (right).
+  Returns 0, if the string was not found or if the given string (left) 
contains a comma.
--- End diff --

Instead of using `(left)`/`(right)`, we can directly use the parm names, 
just like what we did above.
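
For context, the documented `find_in_set` semantics (1-based index, 0 on a missing string or embedded comma), as a Python sketch with illustrative parameter names:

```python
def find_in_set(s: str, str_array: str) -> int:
    # Stand-in for Spark SQL's find_in_set: 1-based index of `s` in the
    # comma-delimited list `str_array`; 0 if `s` is not found or contains
    # a comma itself.
    if "," in s:
        return 0
    items = str_array.split(",")
    return items.index(s) + 1 if s in items else 0

print(find_in_set("ab", "abc,b,ab,c,def"))  # 3
```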





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85996873
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -121,7 +121,7 @@ case class Like(left: Expression, right: Expression)
 }
 
 @ExpressionDescription(
-  usage = "str _FUNC_ regexp - Returns true if str matches regexp and 
false otherwise.")
+  usage = "str _FUNC_ regexp - Returns true if `str` matches `regexp` and 
false otherwise.")
--- End diff --

`and false` -> `, or false`
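
For context, the documented RLIKE semantics, sketched with Python's `re` module (an analogue only; Spark actually uses Java regular expressions):

```python
import re

def rlike(s: str, regexp: str) -> bool:
    # Stand-in for Spark SQL's `str RLIKE regexp`: true if `s` matches
    # `regexp` (unanchored search, as RLIKE is), or false otherwise.
    return re.search(regexp, s) is not None

print(rlike("Spark", "S.*k"))   # True
print(rlike("Spark", "^SQL$"))  # False
```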





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85996783
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -68,7 +68,7 @@ trait StringRegexExpression extends 
ImplicitCastInputTypes {
  * Simple RegEx pattern matching function
  */
 @ExpressionDescription(
-  usage = "str _FUNC_ pattern - Returns true if str matches pattern and 
false otherwise.")
+  usage = "str _FUNC_ pattern - Returns true if `str` matches `pattern` 
and false otherwise.")
--- End diff --

`and false` -> `, or false`





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85996472
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
 ---
@@ -410,7 +410,7 @@ object Equality {
 }
 
 @ExpressionDescription(
-  usage = "a _FUNC_ b - Returns TRUE if a equals b and false otherwise.")
+  usage = "expr1 _FUNC_ expr2 - Returns true if `expr1` equals `expr2` and 
false otherwise.")
--- End diff --

`and false` -> `, or false`





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85995869
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala
 ---
@@ -168,7 +200,12 @@ case class Nvl2(expr1: Expression, expr2: Expression, 
expr3: Expression)
  * Evaluates to `true` iff it's NaN.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(a) - Returns true if a is NaN and false otherwise.")
+  usage = "_FUNC_(expr) - Returns true if `expr` is NaN and false 
otherwise.",
--- End diff --

`and false` -> `, or false`
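
For context, the documented `isnan` behavior as a one-line Python analogue:

```python
import math

def isnan(expr: float) -> bool:
    # Stand-in for Spark SQL's isnan: true if `expr` is NaN, or false otherwise.
    return math.isnan(expr)

print(isnan(float("nan")))  # True
print(isnan(1.0))           # False
```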





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85984067
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -898,8 +995,12 @@ case class ToUTCTimestamp(left: Expression, right: 
Expression)
  * Returns the date part of a timestamp or string.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(expr) - Extracts the date part of the date or datetime 
expression expr.",
-  extended = "> SELECT _FUNC_('2009-07-30 04:17:52');\n '2009-07-30'")
+  usage = "_FUNC_(date) - Extracts the date part of the date or timestamp 
expression.",
--- End diff --

`_FUNC_(date)` -> `_FUNC_(timestamp)`





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85978972
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -616,8 +685,12 @@ case class LastDay(startDate: Expression) extends 
UnaryExpression with ImplicitC
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(start_date, day_of_week) - Returns the first date which 
is later than start_date and named as indicated.",
-  extended = "> SELECT _FUNC_('2015-01-14', 'TU');\n '2015-01-20'")
+  usage = "_FUNC_(start_date, day_of_week) - Returns the first date which 
is later than `start_date` and named as indicated.",
--- End diff --

This needs a description of `day_of_week`
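
For context, the `next_day` semantics the usage string describes, as a Python sketch (the two-letter day codes follow the example in the diff; the helper is illustrative, not Spark code):

```python
from datetime import date, timedelta

# Stand-in for Spark SQL's next_day(start_date, day_of_week): the first
# date strictly later than start_date that falls on the named weekday.
_DAYS = {"MO": 0, "TU": 1, "WE": 2, "TH": 3, "FR": 4, "SA": 5, "SU": 6}

def next_day(start_date: date, day_of_week: str) -> date:
    target = _DAYS[day_of_week[:2].upper()]
    # Number of days forward, always in 1..7 so the result is strictly later.
    delta = (target - start_date.weekday() - 1) % 7 + 1
    return start_date + timedelta(days=delta)

print(next_day(date(2015, 1, 14), "TU"))  # 2015-01-20
```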





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85978496
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -374,7 +428,14 @@ case class ToUnixTimestamp(timeExp: Expression, 
format: Expression) extends Unix
  * second parameter.
  */
 @ExpressionDescription(
-  usage = "_FUNC_([date[, pattern]]) - Returns the UNIX timestamp of 
current or specified time.")
+  usage = "_FUNC_([expr[, pattern]]) - Returns the UNIX timestamp of 
current or specified time.",
--- End diff --

The same here. 





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85978329
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -351,7 +400,12 @@ case class DateFormatClass(left: Expression, right: 
Expression) extends BinaryEx
  * Deterministic version of [[UnixTimestamp]], must have at least one 
parameter.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(date[, pattern]) - Returns the UNIX timestamp of the 
give time.")
+  usage = "_FUNC_(expr[, pattern]) - Returns the UNIX timestamp of the 
give time.",
--- End diff --

This description also needs to explain how `pattern` is used.
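
For context, how `pattern` drives the parse in `unix_timestamp`, sketched in Python (`strptime` directives stand in for the Java SimpleDateFormat patterns Spark actually uses; the default pattern here is an assumption for the example):

```python
from datetime import datetime, timezone

def unix_timestamp(expr: str, pattern: str = "%Y-%m-%d %H:%M:%S") -> int:
    # Stand-in for Spark SQL's unix_timestamp(expr[, pattern]): parses
    # `expr` according to `pattern` and returns seconds since the epoch
    # (UTC is assumed here to keep the example deterministic).
    dt = datetime.strptime(expr, pattern).replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

print(unix_timestamp("2016-04-08", "%Y-%m-%d"))  # 1460073600
```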





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-11-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85976537
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
 ---
@@ -24,7 +24,12 @@ import org.apache.spark.sql.types._
 
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(expr1,expr2,expr3) - If expr1 is TRUE then IF() returns 
expr2; otherwise it returns expr3.")
+  usage = "_FUNC_(expr1, expr2, expr3) - If `expr1` evaluates to true, 
then returns `expr2`; otherwise it returns `expr3`.",
--- End diff --

`it returns` -> `returns`
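
For context, the documented `if` semantics as a trivial Python analogue:

```python
def if_(expr1: bool, expr2, expr3):
    # Stand-in for Spark SQL's if(expr1, expr2, expr3): if expr1 evaluates
    # to true, returns expr2; otherwise returns expr3.
    return expr2 if expr1 else expr3

print(if_(1 < 2, "a", "b"))  # a
```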





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85653989
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -970,7 +1136,14 @@ case class StringRepeat(str: Expression, times: 
Expression)
  */
 @ExpressionDescription(
   usage = "_FUNC_(str) - Returns the reversed given string.",
-  extended = "> SELECT _FUNC_('Spark SQL');\n 'LQS krapS'")
+  extended = """
+Arguments:
+  str - a string expression.
--- End diff --

Please check this. This is wrong.
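
For context, the `reverse` behavior the description covers, as a one-line Python analogue:

```python
def reverse(s: str) -> str:
    # Stand-in for Spark SQL's reverse: returns the reversed given string.
    return s[::-1]

print(reverse("Spark SQL"))  # LQS krapS
```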





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85644253
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -148,8 +171,15 @@ case class Hour(child: Expression) extends 
UnaryExpression with ImplicitCastInpu
 }
 
 @ExpressionDescription(
-  usage = "_FUNC_(param) - Returns the minute component of the 
string/timestamp/interval.",
-  extended = "> SELECT _FUNC_('2009-07-30 12:58:59');\n 58")
+  usage = "_FUNC_(timestamp) - Returns the minute component of the 
string/timestamp.",
+  extended = """
+Arguments:
+  timestamp - a timestamp expression.
--- End diff --

In a DBMS, this is a very common issue. If users want to use strings to 
represent timestamps or other datetime values, they need to follow specific 
formats. If I were a user, my first question would be which formats to follow; 
for example, is the precision 6 or 2?

`string/timestamp` is not clear. We definitely should improve it. Below is 
a document you can refer to: 
http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/sqlref/src/tpc/db2z_datetimestringrepresentation.html
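
For context, the string-cast behavior under discussion, as a Python sketch that accepts only one assumed format (illustrative; Spark's actual casts accept more):

```python
from datetime import datetime

def minute(timestamp: str) -> int:
    # Stand-in for Spark SQL's minute: accepts a timestamp, or a string in
    # the 'yyyy-MM-dd HH:mm:ss' format discussed above (cast to timestamp).
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").minute

print(minute("2009-07-30 12:58:59"))  # 58
```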





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85635839
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala
 ---
@@ -148,8 +171,15 @@ case class Hour(child: Expression) extends 
UnaryExpression with ImplicitCastInpu
 }
 
 @ExpressionDescription(
-  usage = "_FUNC_(param) - Returns the minute component of the 
string/timestamp/interval.",
-  extended = "> SELECT _FUNC_('2009-07-30 12:58:59');\n 58")
+  usage = "_FUNC_(timestamp) - Returns the minute component of the 
string/timestamp.",
+  extended = """
+Arguments:
+  timestamp - a timestamp expression.
--- End diff --

Oh, yes. I knew this, but I was hesitant to fix `string/timestamp` because 
changing it to just `timestamp` sounds like (strictly speaking) a regression in 
the documentation (I mean, `string/timestamp` is not inaccurate); however, I 
also could not add `string` to the `extended` part, as that would be 
inconsistent with the others.





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85635824
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -234,7 +259,16 @@ case class CreateStruct(children: Seq[Expression]) 
extends Expression {
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(name1, val1, name2, val2, ...) - Creates a struct with 
the given field names and values.")
+  usage = "_FUNC_(name1, val1, name2, val2, ...) - Creates a struct with 
the given field names and values.",
+  extended = """
+Arguments:
+  name - a string expression literal that represents the field name.
+  val - an expression of any type.
+
+Examples:
+  > SELECT _FUNC_("a", 1, "b", 2, "c", 3);
--- End diff --

Could you specify string literals in the example in a consistent way? Most of 
them use single quotes.





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85635703
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -82,7 +90,16 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
  * The children are a flatted sequence of kv pairs, e.g. (key1, value1, 
key2, value2, ...)
  */
 @ExpressionDescription(
-  usage = "_FUNC_(key0, value0, key1, value1...) - Creates a map with the 
given key/value pairs.")
+  usage = "_FUNC_(key0, value0, key1, value1...) - Creates a map with the 
given key/value pairs.",
--- End diff --

Nit: please correct all of them to make the usage of `...` consistent. For 
example, `x, y, z, ...`





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85635610
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
 ---
@@ -25,7 +25,14 @@ import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.CalendarInterval
 
 @ExpressionDescription(
-  usage = "_FUNC_(a) - Returns -a.")
+  usage = "_FUNC_(expr) - Returns the negated value of `expr`.",
+  extended = """
+Arguments:
+  expr - a numeric or interval expression.
+Examples:
--- End diff --

Nit: add an extra space





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85634799
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -1263,10 +1516,20 @@ case class Encode(value: Expression, charset: 
Expression)
  * fractional part.
  */
 @ExpressionDescription(
-  usage = """_FUNC_(X, D) - Formats the number X like '#,###,###.##', 
rounded to D decimal places.
-If D is 0, the result has no decimal point or fractional part.
-This is supposed to function like MySQL's FORMAT.""",
-  extended = "> SELECT _FUNC_(12332.123456, 4);\n '12,332.1235'")
+  usage = """
+_FUNC_(expr1, expr2) - Formats the number `expr1` like '#,###,###.##', 
rounded to `expr2`
+  decimal places. If `expr2` is 0, the result has no decimal point or 
fractional part.
+  This is supposed to function like MySQL's FORMAT.
+  """,
+  extended = """
+Arguments:
+  expr1 - numeric type expression.
+  expr2 - numeric type expression that defines the decimal places to 
round.
--- End diff --

`numeric type expression` -> `a numeric type expression`
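The MySQL-FORMAT-style behavior being documented in this diff can be sketched in plain Python — `format_number` below is a hypothetical model, not Spark's implementation, and Python's `format` uses round-half-even, which may differ from Spark at exact rounding ties:

```python
def format_number(value, d):
    """Format `value` like '#,###,###.##', rounded to `d` decimal places.

    With d == 0 the result has no decimal point or fractional part,
    mirroring the behavior described in the usage string.
    """
    return f"{value:,.{d}f}"

print(format_number(12332.123456, 4))  # 12,332.1235
print(format_number(12332.123456, 0))  # 12,332
```

This reproduces the `SELECT _FUNC_(12332.123456, 4)` example from the original `extended` string.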





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15677#discussion_r85634741
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala
 ---
@@ -102,8 +102,17 @@ case class UserDefinedGenerator(
  * }}}
  */
 @ExpressionDescription(
-  usage = "_FUNC_(n, v1, ..., vk) - Separate v1, ..., vk into n rows.",
-  extended = "> SELECT _FUNC_(2, 1, 2, 3);\n  [1,2]\n  [3,null]")
+  usage = "_FUNC_(n, expr1, ..., exprk) - Separates `expr1`, ..., `exprk` 
into `n` rows.",
+  extended = """
+Arguments:
+  n - an integer literal that represents the number of output rows.
+  expr - an expression of any type.
--- End diff --

This is an example. Have you tried the complex type?
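The row-separating semantics shown in the diff's `SELECT _FUNC_(2, 1, 2, 3)` example can be modeled with a toy Python sketch — `stack` here is an illustrative stand-in, with `None` playing the role of SQL `null`:

```python
def stack(n, *exprs):
    """Separate expr1, ..., exprk into n rows, padding short rows with None."""
    width = -(-len(exprs) // n)  # ceil(k / n) columns per row
    rows = []
    for i in range(n):
        chunk = list(exprs[i * width:(i + 1) * width])
        rows.append(chunk + [None] * (width - len(chunk)))
    return rows

print(stack(2, 1, 2, 3))  # [[1, 2], [3, None]], matching [1,2] / [3,null]
```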





[GitHub] spark pull request #15677: [SPARK-17963][SQL][Documentation] Add examples (e...

2016-10-28 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/15677

[SPARK-17963][SQL][Documentation] Add examples (extend) in each expression 
and improve documentation with arguments

## What changes were proposed in this pull request?

This PR proposes to change the documentation for functions. Please refer to 
the discussion in https://github.com/apache/spark/pull/15513

The changes include

 - Re-indent the documentation
 - Add examples/arguments in `extended` where the arguments are multiple or 
require a specific format (e.g. xml/json).

For example, the documentation was updated as below:

### Functions with single line usage

**Before**
  - `pow`

```sql
Usage: pow(x1, x2) - Raise x1 to the power of x2.
Extended Usage:
> SELECT pow(2, 3);
 8.0
```

  - `current_timestamp`

```sql
Usage: current_timestamp() - Returns the current timestamp at the start 
of query evaluation.
Extended Usage:
No example for current_timestamp.
```

**After**

  - `pow`

```sql
Usage: pow(expr1, expr2) - Raise expr1 to the power of expr2.
Extended Usage:
Arguments:
  expr1 - a numeric expression.
  expr2 - a numeric expression.

Examples:
  > SELECT pow(2, 3);
   8.0
```

  - `current_timestamp`

```sql
Usage: current_timestamp() - Returns the current timestamp at the start 
of query evaluation.
Extended Usage:
No example/argument for current_timestamp.
```
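As a sanity check on the `pow` example above, the documented result (`8.0`, a double rather than an integer) can be reproduced with a small sketch — `sql_pow` is a hypothetical stand-in for the SQL function, not Spark's code:

```python
def sql_pow(expr1, expr2):
    """Raise expr1 to the power of expr2, returning a double as Spark's pow does."""
    return float(expr1) ** float(expr2)

print(sql_pow(2, 3))  # 8.0
```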


### Functions with (already) multiple line usage

**Before**

  - `approx_count_distinct`

```sql
Usage: approx_count_distinct(expr) - Returns the estimated cardinality 
by HyperLogLog++.
approx_count_distinct(expr, relativeSD=0.05) - Returns the 
estimated cardinality by HyperLogLog++
  with relativeSD, the maximum estimation error allowed.

Extended Usage:
No example for approx_count_distinct.
```

  - `percentile_approx`

```sql
Usage:
  percentile_approx(col, percentage [, accuracy]) - Returns the 
approximate percentile value of numeric
  column `col` at the given percentage. The value of percentage 
must be between 0.0
  and 1.0. The `accuracy` parameter (default: 1) is a positive 
integer literal which
  controls approximation accuracy at the cost of memory. Higher 
value of `accuracy` yields
  better accuracy, `1.0/accuracy` is the relative error of the 
approximation.

  percentile_approx(col, array(percentage1 [, percentage2]...) [, 
accuracy]) - Returns the approximate
  percentile array of column `col` at the given percentage array. 
Each value of the
  percentage array must be between 0.0 and 1.0. The `accuracy` 
parameter (default: 1) is
  a positive integer literal which controls approximation accuracy 
at the cost of memory.
  Higher value of `accuracy` yields better accuracy, `1.0/accuracy` 
is the relative error of
  the approximation.

Extended Usage:
No example for percentile_approx.
```

**After**

  - `approx_count_distinct`

```sql
Usage:
approx_count_distinct(expr[, relativeSD]) - Returns the estimated 
cardinality by HyperLogLog++.
  relativeSD defines the maximum estimation error allowed.

Extended Usage:
Arguments:
  expr - an expression of any type that represents data to count.
  relativeSD - a numeric literal that defines the maximum 
estimation error allowed.
```

  - `percentile_approx`

```sql
Usage:
percentile_approx(col, percentage [, accuracy]) - Returns the 
approximate percentile value of numeric
  column `col` at the given percentage. The value of `percentage` 
must be between 0.0
  and 1.0. The `accuracy` parameter (default: 1) is a positive 
integer literal which
  controls approximation accuracy at the cost of memory. Higher 
value of `accuracy` yields
  better accuracy, `1.0/accuracy` is the relative error of the 
approximation.
  When `percentage` is an array, each value of the percentage array 
must be between 0.0 and 1.0.

Extended Usage:
Arguments:
  col - a numeric expression.
  percentage - a numeric literal or an array literal of numeric 
type that defines the
percentile. For example, 0.5 means the 50th percentile.
  accuracy - a numeric literal.

Examples: