[GitHub] spark pull request #18029: [SPARK-20168][WIP][DStream] Add changes to use ki...

2017-06-03 Thread yssharma
Github user yssharma commented on a diff in the pull request:

https://github.com/apache/spark/pull/18029#discussion_r119986035
  
--- Diff: 
external/kinesis-asl/src/test/java/org/apache/spark/streaming/kinesis/JavaKinesisInputDStreamBuilderSuite.java
 ---
@@ -45,7 +46,7 @@ public void testJavaKinesisDStreamBuilder() {
   .streamName(streamName)
   .endpointUrl(endpointUrl)
   .regionName(region)
-  .initialPositionInStream(initialPosition)
+  .initialPositionInStream(initialPosition, scala.Option.apply(null))
--- End diff --

@budde Not having the overloaded methods introduces this backward-compatibility issue, which I didn't like much. What are your thoughts on this?
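A minimal sketch of the overloading being discussed, using a simplified stand-in builder with illustrative names (not the actual Spark Kinesis API): keeping the old one-argument method alongside a new two-argument one would let existing callers compile unchanged, instead of forcing Java callers to pass `scala.Option.apply(null)`.

```scala
// Illustrative builder fragment only; InitialPositionInStream comes from the
// Kinesis Client Library, everything else here is a stand-in.
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream

class KinesisBuilderSketch {
  private var position: InitialPositionInStream = InitialPositionInStream.LATEST
  private var resumeFrom: Option[java.util.Date] = None

  // Existing signature kept as-is for backward compatibility.
  def initialPositionInStream(pos: InitialPositionInStream): this.type = {
    position = pos
    this
  }

  // New overload carrying the optional AT_TIMESTAMP resume point.
  def initialPositionInStream(
      pos: InitialPositionInStream,
      timestamp: Option[java.util.Date]): this.type = {
    position = pos
    resumeFrom = timestamp
    this
  }
}
```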





[GitHub] spark pull request #18029: [SPARK-20168][WIP][DStream] Add changes to use ki...

2017-06-03 Thread yssharma
Github user yssharma commented on a diff in the pull request:

https://github.com/apache/spark/pull/18029#discussion_r119986280
  
--- Diff: 
external/kinesis-asl/src/test/scala/org/apache/spark/streaming/kinesis/KinesisInputDStreamBuilderSuite.scala
 ---
@@ -111,5 +110,28 @@ class KinesisInputDStreamBuilderSuite extends 
TestSuiteBase with BeforeAndAfterE
 assert(dstream.kinesisCreds == customKinesisCreds)
 assert(dstream.dynamoDBCreds == Option(customDynamoDBCreds))
 assert(dstream.cloudWatchCreds == Option(customCloudWatchCreds))
+
+val yesterday = DateUtils.addDays(new Date, -1)
+val dStreamFromTimestamp = builder
+.endpointUrl(customEndpointUrl)
+.regionName(customRegion)
+.initialPositionInStream(InitialPositionInStream.AT_TIMESTAMP, 
Some(yesterday))
--- End diff --

@budde Added an optional timestamp for resume, but having to pass it as Some() doesn't seem very appealing. Passing a date directly seems more intuitive. Thoughts?
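A small sketch of the two call shapes being weighed here, with a stand-in builder (names are illustrative, not the final API); the second overload is the "pass the date directly" variant suggested above.

```scala
import java.util.Date

// Stand-in builder: the position argument is a plain String placeholder.
class TimestampBuilderSketch {
  private var resumeFrom: Option[Date] = None

  // Current shape in the patch: the caller wraps the timestamp in Some(...).
  def initialPositionInStream(pos: String, timestamp: Option[Date]): this.type = {
    resumeFrom = timestamp
    this
  }

  // Suggested shape: the caller passes the Date directly.
  def initialPositionInStream(pos: String, timestamp: Date): this.type = {
    resumeFrom = Option(timestamp)
    this
  }
}

// Usage comparison (both hypothetical):
//   builder.initialPositionInStream("AT_TIMESTAMP", Some(yesterday)) // wrapped
//   builder.initialPositionInStream("AT_TIMESTAMP", yesterday)       // direct
```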





[GitHub] spark pull request #18159: [SPARK-20703][SQL] Associate metrics with data wr...

2017-06-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18159#discussion_r12935
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala 
---
@@ -17,38 +17,97 @@
 
 package org.apache.spark.sql.execution.command
 
+import scala.collection.mutable
+
 import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.{Row, SparkSession, SQLContext}
 import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
 import org.apache.spark.sql.catalyst.errors.TreeNodeException
 import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
 import org.apache.spark.sql.catalyst.plans.{logical, QueryPlan}
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.sql.execution.{SparkPlan, SQLExecution}
+import org.apache.spark.sql.execution.datasources.ExecutedWriteSummary
 import org.apache.spark.sql.execution.debug._
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.sql.execution.streaming.{IncrementalExecution, 
OffsetSeqMetadata}
 import org.apache.spark.sql.streaming.OutputMode
 import org.apache.spark.sql.types._
 
 /**
- * A logical command that is executed for its side-effects.  
`RunnableCommand`s are
- * wrapped in `ExecutedCommand` during execution.
+ * A logical command specialized for writing data out. 
`WriteOutFileCommand`s are
+ * wrapped in `WrittenFileCommandExec` during execution.
  */
-trait RunnableCommand extends logical.Command {
-  def run(sparkSession: SparkSession, children: Seq[SparkPlan]): Seq[Row] 
= {
+trait WriteOutFileCommand extends logical.Command {
+
+  /**
+   * Those metrics will be updated once the command finishes writing data 
out. Those metrics will
+   * be taken by `WrittenFileCommandExe` as its metrics when showing in UI.
+   */
+  def metrics(sqlContext: SQLContext): Map[String, SQLMetric] = {
+val sparkContext = sqlContext.sparkContext
+
+Map(
+  // General metrics.
--- End diff --

@cloud-fan I removed the per-file/partition metrics. The PR is now about 600 lines, of which about 180 lines are tests. Do you think that's okay?





[GitHub] spark issue #18159: [SPARK-20703][SQL] Associate metrics with data writes on...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18159
  
**[Test build #77710 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77710/testReport)**
 for PR 18159 at commit 
[`9819f01`](https://github.com/apache/spark/commit/9819f0103a15dd948c049eb7130f577f084b28e4).





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12478
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -730,4 +730,58 @@ public void testToLong() throws IOException {
   assertFalse(negativeInput, 
UTF8String.fromString(negativeInput).toLong(wrapper));
 }
   }
+
+  @Test
+  public void trim() {
--- End diff --

rename to `trimBothWithTrimString` or something like that? Same for the 
other two.





[GitHub] spark issue #18159: [SPARK-20703][SQL] Associate metrics with data writes on...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18159
  
**[Test build #77709 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77709/testReport)**
 for PR 18159 at commit 
[`a6438ef`](https://github.com/apache/spark/commit/a6438ef30ce058905baf8614f3e17eaa41f5a4c3).





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12519
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -502,69 +503,232 @@ case class FindInSet(left: Expression, right: 
Expression) extends BinaryExpressi
   override def prettyName: String = "find_in_set"
 }
 
+trait String2TrimExpression extends ImplicitCastInputTypes {
+  self: Expression =>
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = 
Seq.fill(children.size)(StringType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def sql: String = {
+if (children.size == 1) {
+  val childrenSQL = children.map(_.sql).mkString(", ")
+  s"$prettyName($childrenSQL)"
+} else {
+  val trimSQL = children(0).map(_.sql).mkString(", ")
+  val tarSQL = children(1).map(_.sql).mkString(", ")
+  s"$prettyName($trimSQL, $tarSQL)"
+}
+  }
+}
+
 /**
- * A function that trim the spaces from both ends for the specified string.
- */
+ * A function that takes a character string, removes the leading and/or 
trailing characters matching with the characters
+ * in the trim string, returns the new string. If BOTH and trimStr 
keywords are not specified, it defaults to remove
+ * space character from both ends.
+ * trimStr: A character string to be trimmed from the source string, if it 
has multiple characters, the function
+ * searches for each character in the source string, removes the 
characters from the source string until it
+ * encounters the first non-match character.
+ * BOTH: removes any characters from both ends of the source string that 
matches characters in the trim string.
+  */
 @ExpressionDescription(
-  usage = "_FUNC_(str) - Removes the leading and trailing space characters 
from `str`.",
+  usage = """
+_FUNC_(str) - Removes the leading and trailing space characters from 
`str`.
+_FUNC_(BOTH trimStr FROM str) - Remove the leading and trailing 
trimString from `str`
+  """,
   extended = """
+Arguments:
+  str - a string expression
+  trimString - the trim string
+  BOTH, FROM - these are keyword to specify for trim string from both 
ends of the string
 Examples:
   > SELECT _FUNC_('SparkSQL   ');
SparkSQL
+  > SELECT _FUNC_(BOTH 'SL' FROM 'SSparkSQLS');
+   parkSQ
   """)
-case class StringTrim(child: Expression)
-  extends UnaryExpression with String2StringExpression {
+case class StringTrim(children: Seq[Expression])
+  extends Expression with String2TrimExpression {
 
-  def convert(v: UTF8String): UTF8String = v.trim()
+  require(children.size <= 2 && children.nonEmpty,
+s"$prettyName requires at least one argument and no more than two.")
 
   override def prettyName: String = "trim"
 
+  // trim function can take one or two arguments.
+  // Specify one child, it is for the trim space function.
+  // Specify the two children, it is for the trim function with BOTH 
option.
+  override def eval(input: InternalRow): Any = {
+val inputs = children.map(_.eval(input).asInstanceOf[UTF8String])
+if (inputs(0) != null) {
+  if (children.size == 1) {
+return inputs(0).trim()
+  } else if (inputs(1) != null) {
+return inputs(1).trim(inputs(0))
+  }
+}
+null
+  }
+
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-defineCodeGen(ctx, ev, c => s"($c).trim()")
+if (children.size == 2 && !children(0).isInstanceOf[Literal]) {
+  throw new AnalysisException(s"The trimming parameter should be 
Literal.")}
+
+val evals = children.map(_.genCode(ctx))
+val inputs = evals.map { eval =>
+  s"${eval.isNull} ? null : ${eval.value}"
+}
+val getTrimFunction = if (children.size == 1) {
+  s"UTF8String ${ev.value} = ${inputs(0)}.trim();"
+} else {
+  s"UTF8String ${ev.value} = ${inputs(1)}.trim(${inputs(0)});"
+}
+ev.copy(evals.map(_.code).mkString("\n") + s"""
+  boolean ${ev.isNull} = false;
+  $getTrimFunction
+  if (${ev.value} == null) {
+${ev.isNull} = true;
+  }
+""")
   }
 }
 
 /**
- * A function that trim the spaces from left end for given string.
+ * A function that trims the characters from left end for a given string, 
If LEADING and trimStr keywords are not
+ * specified, it defaults to remove space character from the left end.
+ * 

[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12313
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -503,58 +503,63 @@ case class FindInSet(left: Expression, right: 
Expression) extends BinaryExpressi
   override def prettyName: String = "find_in_set"
 }
 
+trait String2TrimExpression extends ImplicitCastInputTypes {
+  self: Expression =>
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = 
Seq.fill(children.size)(StringType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def sql: String = {
+if (children.size == 1) {
+  val childrenSQL = children.map(_.sql).mkString(", ")
+  s"$prettyName($childrenSQL)"
+} else {
+  val trimSQL = children(0).map(_.sql).mkString(", ")
+  val tarSQL = children(1).map(_.sql).mkString(", ")
+  s"$prettyName($trimSQL, $tarSQL)"
+}
+  }
+}
+
 /**
  * A function that takes a character string, removes the leading and/or 
trailing characters matching with the characters
- * in the trim string, returns the new string. If LEADING/TRAILING/BOTH 
and trimStr keywords are not specified, it
- * defaults to remove space character from both ends.
+ * in the trim string, returns the new string. If BOTH and trimStr 
keywords are not specified, it defaults to remove
+ * space character from both ends.
  * trimStr: A character string to be trimmed from the source string, if it 
has multiple characters, the function
  * searches for each character in the source string, removes the 
characters from the source string until it
  * encounters the first non-match character.
- * LEADING: removes any characters from the left end of the source string 
that matches characters in the trim string.
- * TRAILING: removes any characters from the right end of the source 
string that matches characters in the trim string.
  * BOTH: removes any characters from both ends of the source string that 
matches characters in the trim string.
   */
 @ExpressionDescription(
   usage = """
 _FUNC_(str) - Removes the leading and trailing space characters from 
`str`.
 _FUNC_(BOTH trimStr FROM str) - Remove the leading and trailing 
trimString from `str`
-_FUNC_(LEADING trimStr FROM str) - Remove the leading trimString from 
`str`
-_FUNC_(TRAILING trimStr FROM str) - Remove the trailing trimString 
from `str`
   """,
   extended = """
 Arguments:
   str - a string expression
   trimString - the trim string
   BOTH, FROM - these are keyword to specify for trim string from both 
ends of the string
-  LEADING, FROM - these are keyword to specify for trim string from 
left end of the string
-  TRAILING, FROM - these are keyword to specify for trim string from 
right end of the string
 Examples:
   > SELECT _FUNC_('SparkSQL   ');
SparkSQL
   > SELECT _FUNC_(BOTH 'SL' FROM 'SSparkSQLS');
parkSQ
-  > SELECT _FUNC_(LEADING 'paS' FROM 'SSparkSQLS');
-   rkSQLS
-  > SELECT _FUNC_(TRAILING 'SLQ' FROM 'SSparkSQLS');
-   SSparkS
   """)
 case class StringTrim(children: Seq[Expression])
-  extends Expression with ImplicitCastInputTypes {
+  extends Expression with String2TrimExpression {
--- End diff --

Can we let `String2TrimExpression` extend `Expression` so that we can simplify this?
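For reference, a rough sketch of what that suggestion could look like (a sketch of the reviewer's idea, not the patch itself): if the trait itself extends `Expression`, each concrete trim class only needs to extend the trait.

```scala
import org.apache.spark.sql.catalyst.expressions.{Expression, ImplicitCastInputTypes}
import org.apache.spark.sql.types.{AbstractDataType, DataType, StringType}

// The trait pulls in Expression, so subclasses no longer repeat it.
trait String2TrimExpression extends Expression with ImplicitCastInputTypes {
  override def dataType: DataType = StringType
  override def inputTypes: Seq[AbstractDataType] = Seq.fill(children.size)(StringType)
  override def nullable: Boolean = children.exists(_.nullable)
  override def foldable: Boolean = children.forall(_.foldable)
}

// A subclass then becomes simply:
//   case class StringTrim(children: Seq[Expression]) extends String2TrimExpression { ... }
```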





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12268
  
--- Diff: 
common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java 
---
@@ -746,10 +751,6 @@ public void trim() {
 
   @Test
   public void trimLeft() {
-assertEquals(fromString("  hello "), fromString("  hello 
").trimLeft(fromString("")));
-assertEquals(fromString(""), 
fromString("a").trimLeft(fromString("a")));
-assertEquals(fromString("b"), 
fromString("b").trimLeft(fromString("a")));
-assertEquals(fromString("b"), 
fromString("b").trimLeft(fromString("a")));
--- End diff --

I know there's a duplicated case, but why remove the others?





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12329
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -503,58 +503,63 @@ case class FindInSet(left: Expression, right: 
Expression) extends BinaryExpressi
   override def prettyName: String = "find_in_set"
 }
 
+trait String2TrimExpression extends ImplicitCastInputTypes {
+  self: Expression =>
+
+  override def dataType: DataType = StringType
+  override def inputTypes: Seq[AbstractDataType] = 
Seq.fill(children.size)(StringType)
+
+  override def nullable: Boolean = children.exists(_.nullable)
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  override def sql: String = {
+if (children.size == 1) {
+  val childrenSQL = children.map(_.sql).mkString(", ")
+  s"$prettyName($childrenSQL)"
+} else {
+  val trimSQL = children(0).map(_.sql).mkString(", ")
+  val tarSQL = children(1).map(_.sql).mkString(", ")
+  s"$prettyName($trimSQL, $tarSQL)"
+}
+  }
+}
+
 /**
  * A function that takes a character string, removes the leading and/or 
trailing characters matching with the characters
- * in the trim string, returns the new string. If LEADING/TRAILING/BOTH 
and trimStr keywords are not specified, it
- * defaults to remove space character from both ends.
+ * in the trim string, returns the new string. If BOTH and trimStr 
keywords are not specified, it defaults to remove
+ * space character from both ends.
  * trimStr: A character string to be trimmed from the source string, if it 
has multiple characters, the function
  * searches for each character in the source string, removes the 
characters from the source string until it
  * encounters the first non-match character.
- * LEADING: removes any characters from the left end of the source string 
that matches characters in the trim string.
- * TRAILING: removes any characters from the right end of the source 
string that matches characters in the trim string.
  * BOTH: removes any characters from both ends of the source string that 
matches characters in the trim string.
   */
 @ExpressionDescription(
   usage = """
 _FUNC_(str) - Removes the leading and trailing space characters from 
`str`.
 _FUNC_(BOTH trimStr FROM str) - Remove the leading and trailing 
trimString from `str`
-_FUNC_(LEADING trimStr FROM str) - Remove the leading trimString from 
`str`
-_FUNC_(TRAILING trimStr FROM str) - Remove the trailing trimString 
from `str`
   """,
   extended = """
 Arguments:
   str - a string expression
   trimString - the trim string
   BOTH, FROM - these are keyword to specify for trim string from both 
ends of the string
-  LEADING, FROM - these are keyword to specify for trim string from 
left end of the string
-  TRAILING, FROM - these are keyword to specify for trim string from 
right end of the string
 Examples:
   > SELECT _FUNC_('SparkSQL   ');
SparkSQL
   > SELECT _FUNC_(BOTH 'SL' FROM 'SSparkSQLS');
parkSQ
-  > SELECT _FUNC_(LEADING 'paS' FROM 'SSparkSQLS');
-   rkSQLS
-  > SELECT _FUNC_(TRAILING 'SLQ' FROM 'SSparkSQLS');
-   SSparkS
   """)
 case class StringTrim(children: Seq[Expression])
-  extends Expression with ImplicitCastInputTypes {
+  extends Expression with String2TrimExpression {
 
   require(children.size <= 2 && children.nonEmpty,
 s"$prettyName requires at least one argument and no more than two.")
 
-  override def dataType: DataType = StringType
-  override def inputTypes: Seq[AbstractDataType] = 
Seq.fill(children.size)(StringType)
-
-  override def nullable: Boolean = children.exists(_.nullable)
-  override def foldable: Boolean = children.forall(_.foldable)
-
   override def prettyName: String = "trim"
 
   // trim function can take one or two arguments.
-  // For one argument(children size is 1), it is the trim space function.
-  // For two arguments(children size is 2), it is the trim function with 
one of these options: BOTH/LEADING/TRAILING.
+  // Specify one child, it is for the trim space function.
+  // Specify the two children, it is for the trim function with BOTH 
option.
--- End diff --

Can you please give a clearer description here: what do the first argument 
and the second argument mean? Maybe my previous comment was misleading, sorry 
about that.
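As a starting point for that description, here is what the argument order appears to be when reading the `eval` above (inferred from `inputs(1).trim(inputs(0))`, so worth double-checking); the patch's doc comment could spell this out explicitly.

```scala
// Inferred meaning of the children in the quoted eval(), to be confirmed:
//   - one child:    children(0) is the source string; plain space trim.
//   - two children: children(0) is the trim string (trimStr) and
//                   children(1) is the source string (srcStr), i.e.
//                   TRIM(BOTH trimStr FROM srcStr) => Seq(trimStr, srcStr).
```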



[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12347
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
 ---
@@ -578,39 +583,29 @@ case class StringTrim(children: Seq[Expression])
 val getTrimFunction = if (children.size == 1) {
   s"UTF8String ${ev.value} = ${inputs(0)}.trim();"
 } else {
-  s"UTF8String ${ev.value} = 
${inputs(1)}.trim(${inputs(0)});".stripMargin
+  s"UTF8String ${ev.value} = ${inputs(1)}.trim(${inputs(0)});"
--- End diff --

I personally think this is a little strange; would it be better to always use inputs(0) as the source string?





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12387
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1105,19 +1105,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create a name LTRIM for TRIM(Leading), RTRIM for TRIM(Trailing), TRIM 
for TRIM(BOTH)
+   * Create a function name LTRIM for TRIM(Leading), RTRIM for 
TRIM(Trailing), TRIM for TRIM(BOTH),
+   * otherwise, returnthe original funcID.
--- End diff --

nit: `return the original function identifier.`





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12233
  
--- Diff: 
common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -511,7 +511,7 @@ public UTF8String trim() {
   }
 
   /**
-   * Removes the given trim string from both ends of a string
+   * Removes the given source string starting from both ends
--- End diff --

Based on the given trim string, trim this string starting from both ends?
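If it helps, a tiny usage sketch of the behavior that comment is trying to describe, assuming the `trim(UTF8String trimString)` overload this patch adds; the expected values mirror the SQL examples quoted elsewhere in this thread.

```scala
import org.apache.spark.unsafe.types.UTF8String.fromString

// Characters from the trim string are stripped from both ends until the first
// non-matching character is reached.
assert(fromString("SSparkSQLS").trim(fromString("SL")) == fromString("parkSQ"))

// The no-argument trim() still removes only spaces.
assert(fromString("  SparkSQL   ").trim() == fromString("SparkSQL"))
```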





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12208
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2658,17 +2658,17 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   sql("select LTRIM(BOTH 'S' FROM 'SS abc S')").head
 }
 assert(ae1.getMessage contains
-  "The specified function ltrim doesn't support with option BOTH")
+  "The specified function LTRIM doesn't support with option BOTH")
 val ae2 = intercept[ParseException]{
   sql("select RTRIM(TRAILING 'S' FROM 'SS abc S')").head
 }
 assert(ae2.getMessage contains
-  "The specified function rtrim doesn't support with option TRAILING")
+  "The specified function RTRIM doesn't support with option TRAILING")
 val ae3 = intercept[ParseException]{
-  sql("select TRIM(WINDOW 'S' FROM 'SS abc S')").head
+  sql("select TRIM(OVER 'S' AND 'SS abc S')").head
--- End diff --

Why change this?





[GitHub] spark pull request #12646: [SPARK-14878][SQL] Trim characters string functio...

2017-06-03 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/12646#discussion_r12222
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2658,17 +2658,17 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   sql("select LTRIM(BOTH 'S' FROM 'SS abc S')").head
 }
 assert(ae1.getMessage contains
-  "The specified function ltrim doesn't support with option BOTH")
+  "The specified function LTRIM doesn't support with option BOTH")
 val ae2 = intercept[ParseException]{
   sql("select RTRIM(TRAILING 'S' FROM 'SS abc S')").head
 }
 assert(ae2.getMessage contains
-  "The specified function rtrim doesn't support with option TRAILING")
+  "The specified function RTRIM doesn't support with option TRAILING")
 val ae3 = intercept[ParseException]{
-  sql("select TRIM(WINDOW 'S' FROM 'SS abc S')").head
+  sql("select TRIM(OVER 'S' AND 'SS abc S')").head
 }
 assert(ae3.getMessage contains
-  "The specified function trim doesn't support with option WINDOW")
+  "Literals of type 'OVER' are currently not supported")
--- End diff --

Is this caused by another check? But this is not what we want to cover in the test.





[GitHub] spark issue #18159: [SPARK-20703][SQL] Associate metrics with data writes on...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18159
  
**[Test build #77708 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77708/testReport)**
 for PR 18159 at commit 
[`778007d`](https://github.com/apache/spark/commit/778007d354030648e379040d7bb647553422a031).





[GitHub] spark issue #18159: [SPARK-20703][SQL] Associate metrics with data writes on...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18159
  
**[Test build #77707 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77707/testReport)**
 for PR 18159 at commit 
[`30f202e`](https://github.com/apache/spark/commit/30f202e37fc1d27d5752918f1886d500ea46aa11).





[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-06-03 Thread kevinyu98
Github user kevinyu98 commented on the issue:

https://github.com/apache/spark/pull/12646
  
@wzhfy Hello Zhenhua, can you help take a look at the updated code? Thanks a lot.





[GitHub] spark pull request #18159: [SPARK-20703][SQL] Associate metrics with data wr...

2017-06-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18159#discussion_r11871
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala 
---
@@ -17,38 +17,97 @@
 
 package org.apache.spark.sql.execution.command
 
+import scala.collection.mutable
+
 import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.{Row, SparkSession, SQLContext}
 import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
 import org.apache.spark.sql.catalyst.errors.TreeNodeException
 import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
 import org.apache.spark.sql.catalyst.plans.{logical, QueryPlan}
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.sql.execution.{SparkPlan, SQLExecution}
+import org.apache.spark.sql.execution.datasources.ExecutedWriteSummary
 import org.apache.spark.sql.execution.debug._
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.sql.execution.streaming.{IncrementalExecution, 
OffsetSeqMetadata}
 import org.apache.spark.sql.streaming.OutputMode
 import org.apache.spark.sql.types._
 
 /**
- * A logical command that is executed for its side-effects.  
`RunnableCommand`s are
- * wrapped in `ExecutedCommand` during execution.
+ * A logical command specialized for writing data out. 
`WriteOutFileCommand`s are
+ * wrapped in `WrittenFileCommandExec` during execution.
  */
-trait RunnableCommand extends logical.Command {
-  def run(sparkSession: SparkSession, children: Seq[SparkPlan]): Seq[Row] 
= {
+trait WriteOutFileCommand extends logical.Command {
+
+  /**
+   * Those metrics will be updated once the command finishes writing data 
out. Those metrics will
+   * be taken by `WrittenFileCommandExe` as its metrics when showing in UI.
+   */
+  def metrics(sqlContext: SQLContext): Map[String, SQLMetric] = {
+val sparkContext = sqlContext.sparkContext
+
+Map(
+  // General metrics.
--- End diff --

OK, let me revert the specified metrics.





[GitHub] spark issue #18159: [SPARK-20703][SQL] Associate metrics with data writes on...

2017-06-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/18159
  
can you also post some screenshots?





[GitHub] spark pull request #18159: [SPARK-20703][SQL] Associate metrics with data wr...

2017-06-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18159#discussion_r11816
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala 
---
@@ -17,38 +17,97 @@
 
 package org.apache.spark.sql.execution.command
 
+import scala.collection.mutable
+
 import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.{Row, SparkSession, SQLContext}
 import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
 import org.apache.spark.sql.catalyst.errors.TreeNodeException
 import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
 import org.apache.spark.sql.catalyst.plans.{logical, QueryPlan}
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.sql.execution.{SparkPlan, SQLExecution}
+import org.apache.spark.sql.execution.datasources.ExecutedWriteSummary
 import org.apache.spark.sql.execution.debug._
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.sql.execution.streaming.{IncrementalExecution, 
OffsetSeqMetadata}
 import org.apache.spark.sql.streaming.OutputMode
 import org.apache.spark.sql.types._
 
 /**
- * A logical command that is executed for its side-effects.  
`RunnableCommand`s are
- * wrapped in `ExecutedCommand` during execution.
+ * A logical command specialized for writing data out. 
`WriteOutFileCommand`s are
+ * wrapped in `WrittenFileCommandExec` during execution.
  */
-trait RunnableCommand extends logical.Command {
-  def run(sparkSession: SparkSession, children: Seq[SparkPlan]): Seq[Row] 
= {
+trait WriteOutFileCommand extends logical.Command {
+
+  /**
+   * Those metrics will be updated once the command finishes writing data 
out. Those metrics will
+   * be taken by `WrittenFileCommandExe` as its metrics when showing in UI.
+   */
+  def metrics(sqlContext: SQLContext): Map[String, SQLMetric] = {
+val sparkContext = sqlContext.sparkContext
+
+Map(
+  // General metrics.
--- End diff --

Shall we just add the general metrics first? I hope this can make the PR smaller...
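For illustration, a minimal sketch of what a "general metrics only" map could look like (the metric names here are illustrative, not necessarily the PR's exact set):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

// General, command-level metrics only, without the per-file/per-partition
// breakdown that makes the change large.
def generalWriteMetrics(sparkContext: SparkContext): Map[String, SQLMetric] = Map(
  "numFiles" -> SQLMetrics.createMetric(sparkContext, "number of written files"),
  "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
  "writingTime" -> SQLMetrics.createMetric(sparkContext, "total writing time (ms)")
)
```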





[GitHub] spark issue #18189: [SPARK-20972][SQL] rename HintInfo.isBroadcastable to fo...

2017-06-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/18189
  
I found this problem when I was playing with CBO stuff. This name (`isBroadcastable`) will be displayed with the query plan when users run `EXPLAIN COST`; that's why I think it matters.
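For context, a sketch of how an end user would hit that path (the table names and the `spark` session are assumed to exist already; the hint keyword shown is the SQL `BROADCAST` hint, not the internal field):

```scala
// Assuming an existing SparkSession `spark` and two registered tables t1, t2.
// EXPLAIN COST prints the optimized logical plan together with its statistics,
// which is where the internal hint field name becomes visible to users.
spark.sql(
  "EXPLAIN COST SELECT /*+ BROADCAST(t2) */ * FROM t1 JOIN t2 ON t1.id = t2.id"
).show(truncate = false)
```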





[GitHub] spark issue #18192: [SPARK-20944][SHUFFLE] Move shouldBypassMergeSort from S...

2017-06-03 Thread zhengcanbin
Github user zhengcanbin commented on the issue:

https://github.com/apache/spark/pull/18192
  
@heary-cao Thanks. What title do you suggest, and which comment format is correct?





[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18128
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18128
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77706/
Test PASSed.





[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18128
  
**[Test build #77706 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77706/testReport)**
 for PR 18128 at commit 
[`2a7e6e3`](https://github.com/apache/spark/commit/2a7e6e3cde6431a6f2015c3ee731ed354934674d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18128
  
**[Test build #77706 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77706/testReport)**
 for PR 18128 at commit 
[`2a7e6e3`](https://github.com/apache/spark/commit/2a7e6e3cde6431a6f2015c3ee731ed354934674d).





[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/18128
  
@felixcheung If I remove `as.integer`, the backend doesn't recognize it as `integer`.





[GitHub] spark pull request #18113: [SPARK-20890][SQL] Added min and max typed aggreg...

2017-06-03 Thread setjet
Github user setjet commented on a diff in the pull request:

https://github.com/apache/spark/pull/18113#discussion_r119996472
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/typedaggregators.scala
 ---
@@ -95,7 +93,123 @@ class TypedAverage[IN](val f: IN => Double) extends 
Aggregator[IN, (Double, Long
 
   // Java api support
   def this(f: MapFunction[IN, java.lang.Double]) = this(x => 
f.call(x).asInstanceOf[Double])
+
   def toColumnJava: TypedColumn[IN, java.lang.Double] = {
 toColumn.asInstanceOf[TypedColumn[IN, java.lang.Double]]
   }
 }
+
+class TypedMinDouble[IN](val f: IN => Double)
+  extends Aggregator[IN, java.lang.Double, java.lang.Double] {
+
+  override def zero: java.lang.Double = null
--- End diff --

@cloud-fan do you agree with this?
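For readers following along, a minimal self-contained sketch of the design under discussion (not the PR's actual code): a boxed `java.lang.Double` buffer whose `null` zero means "no input seen yet", so the min of an empty group can come out as null rather than a sentinel value.

```scala
import org.apache.spark.sql.{Encoder, Encoders}
import org.apache.spark.sql.expressions.Aggregator

class TypedMinDoubleSketch[IN](val f: IN => Double)
  extends Aggregator[IN, java.lang.Double, java.lang.Double] {

  // null buffer = "nothing aggregated yet"
  override def zero: java.lang.Double = null

  override def reduce(b: java.lang.Double, a: IN): java.lang.Double =
    if (b == null) f(a) else math.min(b, f(a))

  override def merge(b1: java.lang.Double, b2: java.lang.Double): java.lang.Double =
    if (b1 == null) b2 else if (b2 == null) b1 else math.min(b1, b2)

  override def finish(reduction: java.lang.Double): java.lang.Double = reduction

  override def bufferEncoder: Encoder[java.lang.Double] = Encoders.DOUBLE
  override def outputEncoder: Encoder[java.lang.Double] = Encoders.DOUBLE
}
```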





[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-06-03 Thread zero323
Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/17969
  
Not a problem. It is just easier to reopen this in the future than to keep resolving ongoing conflicts. This is mostly deletions, but it covers a large part of the API, and even with recursive + patience, git doesn't handle that well.





[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18148
  
Merged build finished. Test PASSed.





[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18148
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77704/
Test PASSed.





[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18148
  
**[Test build #77704 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77704/testReport)**
 for PR 18148 at commit 
[`2832253`](https://github.com/apache/spark/commit/2832253afe2a48daae3f78568315b19a5aeb045f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #18128: [SPARK-20906][SparkR]:Constrained Logistic Regres...

2017-06-03 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18128#discussion_r119995950
  
--- Diff: R/pkg/R/mllib_classification.R ---
@@ -239,21 +253,64 @@ function(object, path, overwrite = FALSE) {
 setMethod("spark.logit", signature(data = "SparkDataFrame", formula = 
"formula"),
   function(data, formula, regParam = 0.0, elasticNetParam = 0.0, 
maxIter = 100,
tol = 1E-6, family = "auto", standardization = TRUE,
-   thresholds = 0.5, weightCol = NULL, aggregationDepth = 
2) {
+   thresholds = 0.5, weightCol = NULL, aggregationDepth = 
2,
+   lowerBoundsOnCoefficients = NULL, 
upperBoundsOnCoefficients = NULL,
+   lowerBoundsOnIntercepts = NULL, upperBoundsOnIntercepts 
= NULL) {
 formula <- paste(deparse(formula), collapse = "")
+row <- 0
+col <- 0
 
 if (!is.null(weightCol) && weightCol == "") {
   weightCol <- NULL
 } else if (!is.null(weightCol)) {
   weightCol <- as.character(weightCol)
 }
 
+if (!is.null(lowerBoundsOnIntercepts)) {
+lowerBoundsOnIntercepts <- 
as.array(lowerBoundsOnIntercepts)
+}
+
+if (!is.null(upperBoundsOnIntercepts)) {
+upperBoundsOnIntercepts <- 
as.array(upperBoundsOnIntercepts)
+}
+
+if (!is.null(lowerBoundsOnCoefficients)) {
+  if (class(lowerBoundsOnCoefficients) != "matrix") {
+stop("lowerBoundsOnCoefficients must be a matrix.")
+  }
+  row <- nrow(lowerBoundsOnCoefficients)
+  col <- ncol(lowerBoundsOnCoefficients)
+  lowerBoundsOnCoefficients <- 
as.array(as.vector(lowerBoundsOnCoefficients))
+}
+
+if (!is.null(upperBoundsOnCoefficients)) {
+  if (class(upperBoundsOnCoefficients) != "matrix") {
+stop("upperBoundsOnCoefficients must be a matrix.")
+  }
+
+  if (!is.null(lowerBoundsOnCoefficients) & (row != 
nrow(upperBoundsOnCoefficients)
+| col != ncol(upperBoundsOnCoefficients))) {
+stop(paste("dimension of upperBoundsOnCoefficients ",
+   "is not the same as lowerBoundsOnCoefficients", 
sep = ""))
+  }
+
+  if (is.null(lowerBoundsOnCoefficients)) {
+row <- nrow(upperBoundsOnCoefficients)
+col <- ncol(upperBoundsOnCoefficients)
+  }
--- End diff --

ok thanks, L290-291





[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-06-03 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/17969
  
@zero323 I think folks are generally very busy these 2 weeks for various 
reasons ;)
I'd suggest revisiting this in a couple of weeks.





[GitHub] spark issue #17899: [SPARK-20636] Add new optimization rule to flip adjacent...

2017-06-03 Thread ptkool
Github user ptkool commented on the issue:

https://github.com/apache/spark/pull/17899
  
@hvanhovell @gatorsmile Can you have another look at this?





[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-06-03 Thread zero323
Github user zero323 closed the pull request at:

https://github.com/apache/spark/pull/17969





[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-06-03 Thread zero323
Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/17969
  
@felixcheung I assume there is no interest in that. We can revisit this some other time, I guess.





[GitHub] spark issue #18159: [SPARK-20703][SQL] Associate metrics with data writes on...

2017-06-03 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/18159
  
Hmm, any way to shorten the change? This change is a bit too big for metrics...





[GitHub] spark pull request #18159: [SPARK-20703][SQL] Associate metrics with data wr...

2017-06-03 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18159#discussion_r119995109
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/commands.scala 
---
@@ -17,38 +17,97 @@
 
 package org.apache.spark.sql.execution.command
 
+import scala.collection.mutable
+
 import org.apache.spark.rdd.RDD
-import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.{Row, SparkSession, SQLContext}
 import org.apache.spark.sql.catalyst.{CatalystTypeConverters, InternalRow}
 import org.apache.spark.sql.catalyst.errors.TreeNodeException
 import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
 import org.apache.spark.sql.catalyst.plans.{logical, QueryPlan}
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.sql.execution.{SparkPlan, SQLExecution}
+import org.apache.spark.sql.execution.datasources.ExecutedWriteSummary
 import org.apache.spark.sql.execution.debug._
+import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.sql.execution.streaming.{IncrementalExecution, 
OffsetSeqMetadata}
 import org.apache.spark.sql.streaming.OutputMode
 import org.apache.spark.sql.types._
 
 /**
- * A logical command that is executed for its side-effects.  
`RunnableCommand`s are
- * wrapped in `ExecutedCommand` during execution.
+ * A logical command specialized for writing data out. 
`WriteOutFileCommand`s are
+ * wrapped in `WrittenFileCommandExec` during execution.
  */
-trait RunnableCommand extends logical.Command {
-  def run(sparkSession: SparkSession, children: Seq[SparkPlan]): Seq[Row] 
= {
+trait WriteOutFileCommand extends logical.Command {
--- End diff --

move it to a new file?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18189: [SPARK-20972][SQL] rename HintInfo.isBroadcastable to fo...

2017-06-03 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/18189
  
tbh the difference is so small that i don't think it is worth spending time 
here ... as pointed out it is not forceBroadcast either.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/18128
  
Local test passed. Let me check it tonight.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17922: [SPARK-20601][PYTHON][ML] Python API Changes for Constra...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17922
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17922: [SPARK-20601][PYTHON][ML] Python API Changes for Constra...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17922
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77705/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17922: [SPARK-20601][PYTHON][ML] Python API Changes for Constra...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17922
  
**[Test build #77705 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77705/testReport)**
 for PR 17922 at commit 
[`649bf28`](https://github.com/apache/spark/commit/649bf2843241f1ea7a4e6fd231355c4347118750).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18052: [SPARK-20347][PYSPARK][WIP] Provide AsyncRDDActions in P...

2017-06-03 Thread zero323
Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/18052
  
__Note__: [Waiting for some 
feedback](https://twitter.com/holdenkarau/status/866672579318337537).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17922: [SPARK-20601][PYTHON][ML] Python API Changes for Constra...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17922
  
**[Test build #77705 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77705/testReport)**
 for PR 17922 at commit 
[`649bf28`](https://github.com/apache/spark/commit/649bf2843241f1ea7a4e6fd231355c4347118750).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17922: [SPARK-20601][PYTHON][ML] Python API Changes for Constra...

2017-06-03 Thread zero323
Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/17922
  
Sure @yanboliang. Give me a sec.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18066
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77702/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18066
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18066
  
**[Test build #77702 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77702/testReport)**
 for PR 18066 at commit 
[`c183032`](https://github.com/apache/spark/commit/c183032a0edc3837bd0e697a15b298b47a2ab5ab).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18148
  
**[Test build #77704 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77704/testReport)**
 for PR 18148 at commit 
[`2832253`](https://github.com/apache/spark/commit/2832253afe2a48daae3f78568315b19a5aeb045f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/18148
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/18148
  
Fun, but probably unrelated:

```
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f115a81f0e9, pid=125279, tid=139711419225856
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 
1.8.0_60-b27)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode 
linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x88a0e9]  LoadKlassNode::make(PhaseGVN&, Node*, Node*, 
Node*, TypePtr const*, TypeKlassPtr const*)+0x49
#
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17922: [SPARK-20601][PYTHON][ML] Python API Changes for ...

2017-06-03 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/17922#discussion_r119992981
  
--- Diff: python/pyspark/ml/tests.py ---
@@ -819,6 +847,84 @@ def logistic_regression_check_thresholds(self):
 LogisticRegression, threshold=0.42, thresholds=[0.5, 0.5]
 )
 
+def test_binomial_logistic_regression_bounds(self):
--- End diff --

Usually there's no need to write exactly the same test as in Scala for 
PySpark; we can use a simple test that loads a dataset, or generates a very 
simple one, and runs constrained LR on it. You can refer to the test cases in 
[tests.py](https://github.com/apache/spark/blob/master/python/pyspark/ml/tests.py)
 or other tests like 
[this](https://github.com/apache/spark/blob/master/python/pyspark/ml/classification.py#L200).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17094: [SPARK-19762][ML] Hierarchy for consolidating ML aggrega...

2017-06-03 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/17094
  
@srowen Speaking for myself, I think the other concerns can be issued as 
follow ups, yes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18046: [SPARK-20749][SQL] Built-in SQL Function Support - all v...

2017-06-03 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18046
  
ping @gatorsmile


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18128
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18128
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77703/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18128
  
**[Test build #77703 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77703/testReport)**
 for PR 18128 at commit 
[`b89a0f7`](https://github.com/apache/spark/commit/b89a0f71d22b02b854ee0c4cad4656ae31fc0321).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-06-03 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18066
  
ping @hvanhovell @sameeragarwal


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18128
  
**[Test build #77703 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77703/testReport)**
 for PR 18128 at commit 
[`b89a0f7`](https://github.com/apache/spark/commit/b89a0f71d22b02b854ee0c4cad4656ae31fc0321).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18128: [SPARK-20906][SparkR]:Constrained Logistic Regression fo...

2017-06-03 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/18128
  
Jenkins retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18066: [SPARK-20822][SQL] Generate code to build table cache us...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18066
  
**[Test build #77702 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77702/testReport)**
 for PR 18066 at commit 
[`c183032`](https://github.com/apache/spark/commit/c183032a0edc3837bd0e697a15b298b47a2ab5ab).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18033: [SPARK-20807][SQL] Add compression/decompression of colu...

2017-06-03 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/18033
  
ping @hvanhovell @sameeragarwal 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18192: [SPARK-20944][SHUFFLE] Move shouldBypassMergeSort from S...

2017-06-03 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18192
  
Suggest modifying the title; also, the comment format is incorrect.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17953: [SPARK-20680][SQL] Spark-sql do not support for void col...

2017-06-03 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/17953
  
Retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17953: [SPARK-20680][SQL] Spark-sql do not support for void col...

2017-06-03 Thread LantaoJin
Github user LantaoJin commented on the issue:

https://github.com/apache/spark/pull/17953
  
retest this commit please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18192: [SPARK-20944][SHUFFLE] Move shouldBypassMergeSort from S...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18192
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18192: [SPARK-20944][SHUFFLE] Move shouldBypassMergeSort...

2017-06-03 Thread zhengcanbin
GitHub user zhengcanbin opened a pull request:

https://github.com/apache/spark/pull/18192

[SPARK-20944][SHUFFLE] Move shouldBypassMergeSort from SortShuffleWriter to 
SortShuffleManager

## What changes were proposed in this pull request?


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhengcanbin/spark SPARK-20944

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18192.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18192


commit 628af189dc316ed56f584d537ea7999ffefd1f75
Author: Canbin Zheng <2056268...@qq.com>
Date:   2017-06-03T14:16:12Z

[SPARK-20944][SHUFFLE] Move shouldBypassMergeSort from SortShuffleWriter to 
SortShuffleManager




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18174: [SPARK-20950][CORE]Improve diskWriteBufferSize configura...

2017-06-03 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18174
  
@srowen 
Yes, you're right, they are times, and the unit is ms.
The numbers are the average time of 10 runs of `forceSorterToSpill`.
I assume copying a big buffer takes longer than copying a small one, even 
though the small buffer is copied many more times; or that the local file 
system takes longer to write a big buffer than a small one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18029
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77701/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18029
  
**[Test build #77701 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77701/testReport)**
 for PR 18029 at commit 
[`f1ffcbd`](https://github.com/apache/spark/commit/f1ffcbd897f5a9f6408458c55dd2b7b63ae73b93).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18029
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18029
  
**[Test build #77701 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77701/testReport)**
 for PR 18029 at commit 
[`f1ffcbd`](https://github.com/apache/spark/commit/f1ffcbd897f5a9f6408458c55dd2b7b63ae73b93).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18171: [SPARK-20945] Fix TID key not found in TaskSchedulerImpl

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18171
  
**[Test build #3775 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3775/testReport)**
 for PR 18171 at commit 
[`c4fff63`](https://github.com/apache/spark/commit/c4fff63c8ad2f959c7a8168d051e3b8ebfc06abb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18186: [SPARK-20966][WEB-UI][SQL]Table data is not sorted by st...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18186
  
**[Test build #3774 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3774/testReport)**
 for PR 18186 at commit 
[`4c3dbeb`](https://github.com/apache/spark/commit/4c3dbeb9c7d4db927bc15b60cf3d4ea8eb8fdc45).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18029
  
**[Test build #77700 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77700/testReport)**
 for PR 18029 at commit 
[`3cac39b`](https://github.com/apache/spark/commit/3cac39b49b7b56547eee117f53d0182798270111).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18029
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77700/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18029
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18029: [SPARK-20168][WIP][DStream] Add changes to use kinesis f...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18029
  
**[Test build #77700 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77700/testReport)**
 for PR 18029 at commit 
[`3cac39b`](https://github.com/apache/spark/commit/3cac39b49b7b56547eee117f53d0182798270111).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18029: [SPARK-20168][WIP][DStream] Add changes to use ki...

2017-06-03 Thread yssharma
Github user yssharma commented on a diff in the pull request:

https://github.com/apache/spark/pull/18029#discussion_r119984045
  
--- Diff: 
external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisInputDStream.scala
 ---
@@ -38,6 +40,7 @@ private[kinesis] class KinesisInputDStream[T: ClassTag](
 val endpointUrl: String,
 val regionName: String,
 val initialPositionInStream: InitialPositionInStream,
+val initialPositionInStreamTimestamp: Date,
--- End diff --

@budde - I had two approaches in mind while adding this functionality:
1. An additional parameter that can be set by an overloaded method in the 
Builder.
2. A new case class wrapping the initial position with an optional timestamp.

I went ahead with the first one for backward compatibility, so that users can 
keep using their existing builders.
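
For illustration only, a minimal sketch of the second approach, assuming the 
KCL `InitialPositionInStream` enum already used by the builder; the 
`InitialPosition` wrapper and its helpers are hypothetical names, not code 
from this PR:

```scala
import java.util.Date

import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream

// Wraps the KCL enum together with an optional timestamp, so callers only
// supply a Date when they actually start from AT_TIMESTAMP.
case class InitialPosition(
    position: InitialPositionInStream,
    timestamp: Option[Date] = None)

object InitialPosition {
  val latest: InitialPosition = InitialPosition(InitialPositionInStream.LATEST)

  def atTimestamp(ts: Date): InitialPosition =
    InitialPosition(InitialPositionInStream.AT_TIMESTAMP, Some(ts))
}
```

A builder overload could then accept this wrapper while the existing 
`initialPositionInStream(InitialPositionInStream)` method stays untouched for 
backward compatibility.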


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #17759: [DOCS] Fix a typo in Encoder.clsTag

2017-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17759


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17759: [DOCS] Fix a typo in Encoder.clsTag

2017-06-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17759
  
Merged this to close it, but we'd generally discourage this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18186: [SPARK-20966][WEB-UI][SQL]Table data is not sorted by st...

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18186
  
**[Test build #3774 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3774/testReport)**
 for PR 18186 at commit 
[`4c3dbeb`](https://github.com/apache/spark/commit/4c3dbeb9c7d4db927bc15b60cf3d4ea8eb8fdc45).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18171: [SPARK-20945] Fix TID key not found in TaskSchedulerImpl

2017-06-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18171
  
**[Test build #3775 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3775/testReport)**
 for PR 18171 at commit 
[`c4fff63`](https://github.com/apache/spark/commit/c4fff63c8ad2f959c7a8168d051e3b8ebfc06abb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18186: [SPARK-20966][WEB-UI][SQL]Table data is not sorte...

2017-06-03 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/18186#discussion_r119982964
  
--- Diff: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/ui/ThriftServerSessionPage.scala
 ---
@@ -147,42 +147,6 @@ private[ui] class ThriftServerSessionPage(parent: 
ThriftServerTab)
 {errorSummary}{details}
   }
 
-  /** Generate stats of batch sessions of the thrift server program */
-  private def generateSessionStatsTable(): Seq[Node] = {
--- End diff --

There's also a method like this in `ThriftServerPage.scala` above. I guess 
we can remove it too. It's dead code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13373: [SPARK-15616] [SQL] Metastore relation should fallback t...

2017-06-03 Thread lianhuiwang
Github user lianhuiwang commented on the issue:

https://github.com/apache/spark/pull/13373
  
@cloud-fan I do not think the PruneFileSourcePartitions rule applies to Hive's 
CatalogRelation. The example in this PR cannot get the expected result on the 
master branch, so I will update it with the latest code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18174: [SPARK-20950][CORE]Improve diskWriteBufferSize configura...

2017-06-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18174
  
There's no description of your test or what the numbers mean. I assume 
they're times. Why would a smaller buffer be faster?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18158: [SPARK-20936][CORE]Lack of an important case abou...

2017-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18158


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17094: [SPARK-19762][ML] Hierarchy for consolidating ML aggrega...

2017-06-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17094
  
@sethah @MLnick am I reading right that this can be merged as a step 
forward?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18158: [SPARK-20936][CORE]Lack of an important case about the t...

2017-06-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18158
  
Merged to master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18188: [SPARK-20790] [MLlib] Remove extraneous logging i...

2017-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18188


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18188: [SPARK-20790] [MLlib] Remove extraneous logging in test

2017-06-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/18188
  
Merged to master/2.2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #12894: [SPARK-15117][SQL][WIP] Generate Java code that gets a v...

2017-06-03 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/12894
  
@HyukjinKwon Thank you for pointing out this.
This PR will be replaced with 
https://issues.apache.org/jira/browse/SPARK-20823


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #12894: [SPARK-15117][SQL][WIP] Generate Java code that g...

2017-06-03 Thread kiszk
Github user kiszk closed the pull request at:

https://github.com/apache/spark/pull/12894


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18174: [SPARK-20950][CORE]Improve diskWriteBufferSize configura...

2017-06-03 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/18174
  
@srowen 
Thanks for reviewing it.
In our performance tuning we found record rows larger than 2 MB, so 
`initialSerBufferSize` needs to be configurable. But changing 
`initialSerBufferSize` is not good for performance tuning, whereas changing 
the spill `diskWriteBufferSize` is.
So I did a little experiment, varying `diskWriteBufferSize` (1M, 512K, 256K, 
128K, 64K, 32K, 16K, 8K, 4K) and measuring the time (ms):

| diskWriteBufferSize | 1M  | 512K | 256K | 128K | 64K | 32K | 16K | 8K  | 4K  |
|---------------------|-----|------|------|------|-----|-----|-----|-----|-----|
| RecordSize: 2.5M    | 742 | 722  | 694  | 686  | 667 | 668 | 671 | 669 | 683 |
| RecordSize: 1M      | 294 | 293  | 292  | 287  | 283 | 285 | 281 | 279 | 285 |

To eliminate interference from other factors, each result is the average of 
10 runs.

Please review the code again.
Thanks.
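
For illustration only, a minimal sketch of how such a configurable spill 
buffer could be read from `SparkConf`; the key name 
`spark.shuffle.spill.diskWriteBufferSize` is an assumption here and the exact 
name in the patch may differ:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()

// Assumed key name; falls back to 1m, matching the previously hard-coded
// 1 MB disk write buffer in the spill path.
val diskWriteBufferSize: Long =
  conf.getSizeAsBytes("spark.shuffle.spill.diskWriteBufferSize", "1m")
```

The sweep in the table above would then just vary this one setting.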



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18148: [SPARK-20926][SQL] Removing exposures to guava library c...

2017-06-03 Thread rezasafi
Github user rezasafi commented on the issue:

https://github.com/apache/spark/pull/18148
  
Not clear why the tests failed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18164: [SPARK-19732][SQL][PYSPARK] Add fill functions fo...

2017-06-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18164


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org