date:20150430

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-97675656
  
  [Test build #31382 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31382/consoleFull)
 for   PR 5647 at commit 
[`0319821`](https://github.com/apache/spark/commit/0319821db7406f3cca359af5bc021d2f3fd92a17).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-97675701
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31382/
Test PASSed.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread scwf

Github user scwf commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29406960
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,13 @@ class HiveResolutionSuite extends HiveComparisonTest {
   .toDF().registerTempTable("caseSensitivityTest")
 
 val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+assert(query.schema.fields.map(_.name) === Seq("a", "b", "a", "b", "a", "b", "a", "b"),
   "The output schema did not preserve the case of the query.")
--- End diff --

Yes, I think that in the case-insensitive case we should normalize the table 
name and attribute name.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29406952
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/package.scala
 ---
@@ -29,12 +29,23 @@ package object analysis {
 
   /**
* Resolver should return true if the first string refers to the same 
entity as the second string.
-   * For example, by using case insensitive equality.
+   * For example, by using case insensitive equality. Besides, Resolver 
also provides the ability
+   * to normalize the string according to its semantic.
*/
-  type Resolver = (String, String) => Boolean
+  trait Resolver {
+def apply(a: String, b: String): Boolean
+def apply(a: String): String
+  }
+
+  val caseInsensitiveResolution = new Resolver {
+override def apply(a: String, b: String): Boolean = 
a.equalsIgnoreCase(b)
+override def apply(a: String): String = a.toLowerCase // as Hive does
--- End diff --

I'd like to keep the first `apply` as it was, because I don't want to impact a 
lot of existing code. I agree we should rename the second `apply` to 
`normalize`.
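
One reading of this suggestion, as a minimal sketch: the two-argument `apply`
keeps its name so existing `resolver(a, b)` call sites still compile, and the
one-argument overload becomes `normalize`. The enclosing object name
`ResolverSketch` and the case-sensitive variant are assumptions added for
illustration and are not taken from the patch.

```scala
import java.util.Locale

// Hypothetical sketch; only Resolver, apply and caseInsensitiveResolution
// appear in the quoted diff. Everything else is illustrative.
trait Resolver {
  // True if both strings refer to the same entity.
  def apply(a: String, b: String): Boolean
  // Canonical form of a name under this resolver's semantics.
  def normalize(a: String): String
}

object ResolverSketch {
  val caseInsensitiveResolution: Resolver = new Resolver {
    override def apply(a: String, b: String): Boolean = a.equalsIgnoreCase(b)
    // Locale.ROOT avoids locale-specific lowercasing surprises.
    override def normalize(a: String): String = a.toLowerCase(Locale.ROOT)
  }

  // Assumed counterpart: names are compared and kept exactly as written.
  val caseSensitiveResolution: Resolver = new Resolver {
    override def apply(a: String, b: String): Boolean = a == b
    override def normalize(a: String): String = a
  }
}
```

With this shape, a caller that only compares names keeps writing
`resolver(a, b)`, while code that needs a canonical name calls
`resolver.normalize(a)`.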



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5799#discussion_r29406904
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
@@ -0,0 +1,55 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql
+
+import org.apache.spark.annotation.Experimental
+import org.apache.spark.sql.execution.stat.FrequentItems
+
+/**
+ * :: Experimental ::
+ * Statistic functions for [[DataFrame]]s.
+ */
+@Experimental
+final class DataFrameStatFunctions private[sql](df: DataFrame) {
+
+  /**
+   * Finding frequent items for columns, possibly with false positives. 
Using the
+   * frequent element count algorithm described in
+   * [[http://dx.doi.org/10.1145/762471.762473, proposed by Karp, 
Schenker, and Papadimitriou]].
+   *
+   * @param cols the names of the columns to search frequent items in
+   * @param support The minimum frequency for an item to be considered 
`frequent`
+   * @return A Local DataFrame with the Array of frequent items for each 
column.
+   */
+  def freqItems(cols: Seq[String], support: Double): DataFrame = {
--- End diff --

also make sure you add a test to the JavaDataFrameSuite
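
For readers unfamiliar with the frequent element count algorithm named in the
Scaladoc above, here is a minimal counter-based sketch over a plain Scala
collection. It is not the `FrequentItems` implementation from this pull
request; the object name `FrequentItemsSketch`, the method `candidates`, and
the choice of `ceil(1 / support)` counters are assumptions made for
illustration.

```scala
import scala.collection.mutable

// Hypothetical sketch of counter-based frequent-item detection.
// It may return false positives, which matches the Scaladoc's caveat.
object FrequentItemsSketch {
  def candidates[T](items: Iterable[T], support: Double): Set[T] = {
    require(support > 0.0 && support <= 1.0, "support must be in (0, 1]")
    // Assumed sizing: keep at most ceil(1 / support) counters.
    val maxCounters = math.ceil(1.0 / support).toInt
    val counts = mutable.HashMap.empty[T, Long]
    items.foreach { item =>
      if (counts.contains(item)) {
        counts(item) += 1L
      } else if (counts.size < maxCounters) {
        counts(item) = 1L
      } else {
        // No free counter: decrement every counter and drop those at zero.
        counts.keys.toList.foreach { k =>
          val c = counts(k) - 1L
          if (c == 0L) counts.remove(k) else counts(k) = c
        }
      }
    }
    counts.keySet.toSet
  }
}
```

Under these assumptions, an item that makes up at least a `support` fraction
of the input should survive to the end, while rare items may or may not
appear, so a second exact counting pass would be needed to remove false
positives.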



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97683034
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-6913][SQL] Fixed java.sql.SQLException...

2015-04-30 Thread SlavikBaranov

Github user SlavikBaranov commented on the pull request:

https://github.com/apache/spark/pull/5782#issuecomment-97683350
  
Thanks for comments, fixed.



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97683624
  
  [Test build #31393 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31393/consoleFull)
 for   PR 5744 at commit 
[`c87f517`](https://github.com/apache/spark/commit/c87f51774a8e4f488557865657e8974d2c06ba4b).



[GitHub] spark pull request: [SPARK-5938][SQL] Improve JsonRDD performance

2015-04-30 Thread NathanHowell

GitHub user NathanHowell opened a pull request:

https://github.com/apache/spark/pull/5801

[SPARK-5938][SQL] Improve JsonRDD performance

This patch comprises a few related pieces of work:

* Schema inference is performed directly on the JSON token stream
* `String => Row` conversion populates Spark SQL structures without 
intermediate types
* Projection pushdown is implemented via CatalystScan for DataFrame queries

I've run some basic queries on a 300MB/100k row dataset with a flat schema 
and the results are promising:

* Before: ```INFO DAGScheduler: Job 8 finished: count at <console>:20, took 2.916653 s```
* After: ```INFO DAGScheduler: Job 8 finished: count at <console>:20, took 2.184896 s```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NathanHowell/spark json-performance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5801.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5801


commit 1e441e23a2cfd8712720a728056e363e41538d1f
Author: Nathan Howell nhow...@godaddy.com
Date:   2015-04-29T05:44:19Z

Eliminate arrow pattern, replace with pattern matches

commit 73a56927d09c670eb62317f611c47a90096fe693
Author: Nathan Howell nhow...@godaddy.com
Date:   2015-04-27T22:38:28Z

Improve JSON parsing and type inference performance

commit 1abf1d6010c71cd1cffa97d7564f8fb71eb19f10
Author: Nathan Howell nhow...@godaddy.com
Date:   2015-04-30T02:16:33Z

Add projection pushdown support to JsonRDD





[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97692345
  
  [Test build #31397 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31397/consoleFull)
 for   PR 5680 at commit 
[`3ad00d9`](https://github.com/apache/spark/commit/3ad00d9d1171cdf0563167a0e368482fb798043b).



[GitHub] spark pull request: [SPARK-7031][ThriftServer]let thrift server ta...

2015-04-30 Thread WangTaoTheTonic

Github user WangTaoTheTonic commented on the pull request:

https://github.com/apache/spark/pull/5609#issuecomment-97692357
  
Another different case failed. Jenkins, retest this please.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29408785
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/InputInfoTracker.scala
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.scheduler
+
+import scala.collection.mutable
+
+import org.apache.spark.Logging
+import org.apache.spark.streaming.{Time, StreamingContext}
+
+/** To track the information of input stream at specified batch time. */
+case class InputInfo(batchTime: Time, inputStreamId: Int, numRecords: Long)
+
+/**
+ * This class manages all the input streams as well as their input data 
statistics. The information
+ * will output to StreamingListener to better monitoring.
+ */
+private[streaming] class InputInfoTracker(ssc: StreamingContext) extends 
Logging {
+
+  /** Track all the input streams registered in DStreamGraph */
+  val inputStreams = ssc.graph.getInputStreams()
--- End diff --

Can this be private?



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29408797
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/InputInfoTracker.scala
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.scheduler
+
+import scala.collection.mutable
+
+import org.apache.spark.Logging
+import org.apache.spark.streaming.{Time, StreamingContext}
+
+/** To track the information of input stream at specified batch time. */
+case class InputInfo(batchTime: Time, inputStreamId: Int, numRecords: Long)
+
+/**
+ * This class manages all the input streams as well as their input data 
statistics. The information
+ * will output to StreamingListener to better monitoring.
+ */
+private[streaming] class InputInfoTracker(ssc: StreamingContext) extends 
Logging {
+
+  /** Track all the input streams registered in DStreamGraph */
+  val inputStreams = ssc.graph.getInputStreams()
+  /** Track all the id of input streams registered in DStreamGraph */
+  val inputStreamIds = inputStreams.map(_.id)
--- End diff --

Can this be private?



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29408816
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/InputInfoTracker.scala
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.scheduler
+
+import scala.collection.mutable
+
+import org.apache.spark.Logging
+import org.apache.spark.streaming.{Time, StreamingContext}
+
+/** To track the information of input stream at specified batch time. */
+case class InputInfo(batchTime: Time, inputStreamId: Int, numRecords: Long)
+
+/**
+ * This class manages all the input streams as well as their input data 
statistics. The information
+ * will output to StreamingListener to better monitoring.
+ */
+private[streaming] class InputInfoTracker(ssc: StreamingContext) extends 
Logging {
+
+  /** Track all the input streams registered in DStreamGraph */
+  val inputStreams = ssc.graph.getInputStreams()
+  /** Track all the id of input streams registered in DStreamGraph */
+  val inputStreamIds = inputStreams.map(_.id)
+
+  // Map to track all the InputInfo related to specific batch time and 
input stream.
+  private val batchTimeToInputInfos = new mutable.HashMap[Time, 
mutable.HashMap[Int, InputInfo]]
+
+  /** Report the input information with batch time to the tracker */
+  def reportInfo(batchTime: Time, inputInfo: InputInfo): Unit = 
synchronized {
+val inputInfos = batchTimeToInputInfos.getOrElseUpdate(batchTime,
+  new mutable.HashMap[Int, InputInfo]())
+
+if (inputInfos.contains(inputInfo.inputStreamId)) {
+  throw new IllegalStateException(s"Input stream ${inputInfo.inputStreamId}} for batch" +
+s"$batchTime is already added into InputInfoTracker, this is a illegal state")
+}
+inputInfos += ((inputInfo.inputStreamId, inputInfo))
+  }
+
+  /** Get the all the input stream's information of specified batch time */
+  def getInfo(batchTime: Time): Map[Int, InputInfo] = synchronized {
+val inputInfos = batchTimeToInputInfos.get(batchTime)
+// Convert mutable HashMap to immutable Map for the caller
+inputInfos.map(_.toMap).getOrElse(Map[Int, InputInfo]())
+  }
+
+  /** Get the input information of specified batch time and input stream 
id */
+  def getInfoOfBatchAndStream(batchTime: Time, inputStreamId: Int
--- End diff --

This is not used anywhere other than tests, is this necessary?



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread jerryshao

Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97695062
  
Yes, I will do this. Please take a look at the whole design, thanks a 
lot :)



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread jerryshao

Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29410293
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala
 ---
@@ -70,9 +70,8 @@ private[streaming] class 
StreamingJobProgressListener(ssc: StreamingContext)
 runningBatchInfos(batchStarted.batchInfo.batchTime) = 
batchStarted.batchInfo
 waitingBatchInfos.remove(batchStarted.batchInfo.batchTime)
 
-batchStarted.batchInfo.receivedBlockInfo.foreach { case (_, infos) =>
-  totalReceivedRecords += infos.map(_.numRecords).sum
-}
+// TODO. this should be fixed when input stream is not receiver based 
stream.
+totalReceivedRecords += 
batchStarted.batchInfo.streamIdToNumRecords.values.sum
--- End diff --

Yes, will do. I also have one concern: if `batchStarted` is not a 
receiver-based batchInfo, do we need to count these records in 
`totalReceivedRecords` when the batch has just started?



[GitHub] spark pull request: [SPARK-6612] [MLLib] [PySpark] Python KMeans p...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5647#issuecomment-97675700
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-1406] Mllib pmml model export

2015-04-30 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3062



[GitHub] spark pull request: SPARK-6846 [WEBUI] Stage kill URL easy to acci...

2015-04-30 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/5528#issuecomment-97676514
  
That's fine, in the sense that the endpoint returns no data. OK, so it 
works except for this proxying. Hm, surely the YARN proxy can pass on a POST. 
We'll have to look into this. Any wisdom from YARN folks about where to look?



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29406469
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,13 @@ class HiveResolutionSuite extends HiveComparisonTest {
   .toDF().registerTempTable("caseSensitivityTest")
 
 val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+assert(query.schema.fields.map(_.name) === Seq("a", "b", "a", "b", "a", "b", "a", "b"),
   "The output schema did not preserve the case of the query.")
--- End diff --

Supporting normalization is good. However, when the case is explicitly 
specified in the query, should we preserve the case of the query instead of 
normalizing it like this?



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97691954
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97691943
  
  [Test build #31386 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31386/consoleFull)
 for   PR 5799 at commit 
[`8279d4d`](https://github.com/apache/spark/commit/8279d4d4cb09f78e2f8f83f9a3738101b940ed40).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97691955
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31386/
Test PASSed.



[GitHub] spark pull request: [SPARK-7196][SQL] Support precision and scale ...

2015-04-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5777#issuecomment-97695518
  
@viirya apparently this doesn't fix the problem. Can you look into it more? 
Thanks.




[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29408973
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/BatchInfo.scala 
---
@@ -32,7 +33,7 @@ import org.apache.spark.streaming.Time
 @DeveloperApi
 case class BatchInfo(
 batchTime: Time,
-receivedBlockInfo: Map[Int, Array[ReceivedBlockInfo]],
+streamIdToNumRecords: Map[Int, Long],
 submissionTime: Long,
 processingStartTime: Option[Long],
 processingEndTime: Option[Long]
--- End diff --

Can you make a method called `numRecords` which returns the sum? This is 
the same approach taken by @zsxwing in #5533, so it will be easier to resolve 
merge conflicts later.
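The suggestion above could be sketched roughly as follows. This is only an illustration, not the code from the pull request: `BatchInfoSketch` is a hypothetical, trimmed-down stand-in for `BatchInfo` (its `batchTime` is a plain `Long` here rather than Spark's `Time`), and only `streamIdToNumRecords` and `numRecords` come from the discussion.

```scala
// Hypothetical, simplified stand-in for BatchInfo; not Spark's actual class.
case class BatchInfoSketch(
    batchTime: Long,                        // assumption: plain millis instead of Time
    streamIdToNumRecords: Map[Int, Long]) {

  /** Total number of records across all input streams in this batch. */
  def numRecords: Long = streamIdToNumRecords.values.sum
}

object BatchInfoSketchDemo {
  def main(args: Array[String]): Unit = {
    val info = BatchInfoSketch(1000L, Map(0 -> 10L, 1 -> 32L))
    println(info.numRecords) // prints 42
  }
}
```

With such a method, callers would no longer need to sum `streamIdToNumRecords.values` themselves at each use site.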



[GitHub] spark pull request: [SPARK-7274][SQL] Create Column expression for...

2015-04-30 Thread mengxr

Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5802#discussion_r29410114
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -363,6 +380,28 @@ object functions {
   def sqrt(e: Column): Column = Sqrt(e.expr)
 
   /**
+   * Creates a new struct column. The input column must be a column in a 
[[DataFrame]], or
+   * a derived column expression that is named (i.e. aliased).
+   *
+   * @group normal_funcs
+   */
+  @scala.annotation.varargs
+  def struct(cols: Column*): Column = {
--- End diff --

Do we allow empty input `struct()`?



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread jerryshao

Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29410045
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala
 ---
@@ -135,28 +132,25 @@ private[streaming] class 
StreamingJobProgressListener(ssc: StreamingContext)
 
   def receivedRecordsDistributions: Map[Int, Option[Distribution]] = 
synchronized {
 val latestBatchInfos = retainedBatches.reverse.take(batchInfoLimit)
-val latestBlockInfos = latestBatchInfos.map(_.receivedBlockInfo)
-(0 until numReceivers).map { receiverId =>
-  val blockInfoOfParticularReceiver = latestBlockInfos.map { batchInfo =>
-batchInfo.get(receiverId).getOrElse(Array.empty)
-  }
-  val recordsOfParticularReceiver = blockInfoOfParticularReceiver.map { blockInfo =>
-  // calculate records per second for each batch
-blockInfo.map(_.numRecords).sum.toDouble * 1000 / batchDuration
-  }
-  val distributionOption = Distribution(recordsOfParticularReceiver)
-  (receiverId, distributionOption)
+
+// TODO. this should be fixed when receiver-less input stream is mixed 
into BatchInfo
--- End diff --

This is what concerns me most. `BatchInfo`'s `streamIdToNumRecords` will now 
hold the statistics of all input streams, not only the receiver-based ones, 
so do we need to differentiate the statistics?



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97704110
  
  [Test build #31404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31404/consoleFull)
 for   PR 5799 at commit 
[`3a5c177`](https://github.com/apache/spark/commit/3a5c177e247ddb44a38e4ee4211c57ec3cad58eb).



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97709044
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97709002
  
  [Test build #31408 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31408/consoleFull)
 for   PR 5680 at commit 
[`8325787`](https://github.com/apache/spark/commit/8325787bf13bcca16a405561413f1d81b3229941).



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97709046
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31392/
Test PASSed.



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97709011
  
  [Test build #31392 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31392/consoleFull)
 for   PR 5799 at commit 
[`482e741`](https://github.com/apache/spark/commit/482e74180445d30d0b5a769cd5f9bd0e94abfd17).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `jaxb-api-2.2.7.jar`
   * `jaxb-core-2.2.7.jar`
   * `jaxb-impl-2.2.7.jar`
   * `pmml-agent-1.1.15.jar`
   * `pmml-model-1.1.15.jar`
   * `pmml-schema-1.1.15.jar`

 * This patch **removes the following dependencies:**
   * `activation-1.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`




[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97708818
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97708865
  
Merged build started.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97712927
  
  [Test build #31411 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31411/consoleFull)
 for   PR 5680 at commit 
[`8325787`](https://github.com/apache/spark/commit/8325787bf13bcca16a405561413f1d81b3229941).



[GitHub] spark pull request: [SPARK-7267][SQL]Push down Project when it's c...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5797#issuecomment-97715468
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: SPARK-7171: Added a method to retrieve metrics...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5805#issuecomment-97715782
  
  [Test build #31412 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31412/consoleFull)
 for   PR 5805 at commit 
[`9ed86ca`](https://github.com/apache/spark/commit/9ed86cabbd07de338adfe3153afa0ed4b005cee7).



[GitHub] spark pull request: [SPARK-7267][SQL]Push down Project when it's c...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5797#issuecomment-97715469
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31396/
Test PASSed.



[GitHub] spark pull request: [SPARK-6967] [SQL] fix date type convertion in...

2015-04-30 Thread nadavoosh

Github user nadavoosh commented on the pull request:

https://github.com/apache/spark/pull/5590#issuecomment-97717732
  
Hi @adrian-wang! I am using Spark and needed to include this fix, since I 
am reading from a table that has Date types. I just ran into a new problem, 
though: when the Date field has null values, Spark throws this error:
java.lang.NullPointerException
at 
org.apache.spark.sql.types.DateUtils$.javaDateToDays(DateUtils.scala:39)
Any ideas on how I can fix that?
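One general way to avoid this kind of failure is to guard the conversion against null before it runs. The sketch below is only an illustration under stated assumptions: `NullSafeDateSketch`, `daysSinceEpoch`, and `safeDays` are hypothetical names, the day computation ignores time-zone offsets for brevity, and none of this is Spark's `DateUtils` or the fix applied in the pull request.

```scala
import java.sql.Date
import java.util.concurrent.TimeUnit

object NullSafeDateSketch {
  private val MillisPerDay: Long = TimeUnit.DAYS.toMillis(1)

  // Hypothetical helper: whole days since 1970-01-01T00:00:00Z,
  // computed from the epoch millis without any time-zone adjustment.
  def daysSinceEpoch(d: Date): Int =
    java.lang.Math.floorDiv(d.getTime, MillisPerDay).toInt

  // Option(null) is None, so a null Date never reaches the conversion.
  def safeDays(d: Date): Option[Int] = Option(d).map(daysSinceEpoch)
}
```

A caller could then map `None` back to a SQL NULL instead of letting the exception propagate.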



[GitHub] spark pull request: [SPARK-7031][ThriftServer]let thrift server ta...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5609#issuecomment-97717662
  
  [Test build #31398 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31398/consoleFull)
 for   PR 5609 at commit 
[`8d3fc16`](https://github.com/apache/spark/commit/8d3fc16dd22c87fbf768951b64dabe7d121731ec).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `jaxb-api-2.2.7.jar`
   * `jaxb-core-2.2.7.jar`
   * `jaxb-impl-2.2.7.jar`
   * `pmml-agent-1.1.15.jar`
   * `pmml-model-1.1.15.jar`
   * `pmml-schema-1.1.15.jar`

 * This patch **removes the following dependencies:**
   * `activation-1.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`




[GitHub] spark pull request: [SPARK-7031][ThriftServer]let thrift server ta...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5609#issuecomment-97717674
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31398/
Test PASSed.



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread rxin

Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97677458
  
I will let @marmbrus take a look at this tomorrow.

Meantime, can you add the apply method and Python getitem method? Thanks.




[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97677730
  
 Merged build triggered.



[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-04-30 Thread sven0726

Github user sven0726 closed the pull request at:

https://github.com/apache/spark/pull/5800



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97677728
  
@rxin , already done :)



[GitHub] spark pull request: Merge pull request #1 from apache/master

2015-04-30 Thread sven0726

GitHub user sven0726 opened a pull request:

https://github.com/apache/spark/pull/5800

Merge pull request #1 from apache/master



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sven0726/spark-1 master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5800.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5800


commit 5da8c543ff542951c3fefe6e123b891f66edf4b6
Author: sven0726 sven0...@gmail.com
Date:   2015-04-27T08:21:55Z

Merge pull request #1 from apache/master

2015-04-27 first merge





[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5799#discussion_r29406784
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala 
---
@@ -0,0 +1,127 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.stat
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Column, DataFrame, Row}
+import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
+import org.apache.spark.sql.types.{ArrayType, StructField, StructType}
+
+import scala.collection.mutable.{Map => MutableMap}
+
+private[sql] object FrequentItems extends Logging {
+
+  /** A helper class wrapping `MutableMap[Any, Long]` for simplicity. */
+  private class FreqItemCounter(size: Int) extends Serializable {
+val baseMap: MutableMap[Any, Long] = MutableMap.empty[Any, Long]
+
+/**
+ * Add a new example to the counts if it exists, otherwise deduct the 
count
+ * from existing items.
+ */
+def add(key: Any, count: Long): this.type = {
+  if (baseMap.contains(key))  {
+baseMap(key) += count
+  } else {
+if (baseMap.size < size) {
+  baseMap += key -> count
+} else {
+  // TODO: Make this more efficient... A flatMap?
+  baseMap.retain((k, v) => v > count)
+  baseMap.transform((k, v) => v - count)
+}
+  }
+  this
+}
+
+/**
+ * Merge two maps of counts.
+ * @param other The map containing the counts for that partition
+ */
+def merge(other: FreqItemCounter): this.type = {
+  other.toSeq.foreach { case (k, v) =>
+add(k, v)
+  }
+  this
+}
+
+def toSeq: Seq[(Any, Long)] = baseMap.toSeq
--- End diff --

You don't need this, do you? You can just operate on the map directly. I'm 
asking because I'm not sure whether `baseMap.toSeq` materializes a whole Seq, 
which might be unnecessary.
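Operating on the map directly might look something like the sketch below. It is a simplified, hypothetical rewrite for illustration (`CounterSketch` is not the class from the pull request), and the eviction branch uses an explicit loop over a copy of the keys instead of `retain`/`transform`, purely so the example compiles across Scala versions.

```scala
import scala.collection.mutable

// Hypothetical, simplified counter; not the code under review.
class CounterSketch(size: Int) extends Serializable {
  val baseMap: mutable.Map[Any, Long] = mutable.Map.empty[Any, Long]

  /** Increment an existing key, insert it if there is room, else deduct from all. */
  def add(key: Any, count: Long): this.type = {
    if (baseMap.contains(key)) {
      baseMap(key) += count
    } else if (baseMap.size < size) {
      baseMap += key -> count
    } else {
      // Copy the keys first so the map is not modified while being traversed.
      baseMap.keys.toList.foreach { k =>
        val remaining = baseMap(k) - count
        if (remaining > 0) baseMap(k) = remaining else baseMap -= k
      }
    }
    this
  }

  /** Merge another counter by iterating its map directly, without calling toSeq. */
  def merge(other: CounterSketch): this.type = {
    other.baseMap.foreach { case (k, v) => add(k, v) }
    this
  }
}
```

Iterating `other.baseMap` with `foreach` visits the entries in place, so no intermediate sequence has to be built for the merge.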



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5799#discussion_r29406755
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
@@ -0,0 +1,55 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql
+
+import org.apache.spark.annotation.Experimental
+import org.apache.spark.sql.execution.stat.FrequentItems
+
+/**
+ * :: Experimental ::
+ * Statistic functions for [[DataFrame]]s.
+ */
+@Experimental
+final class DataFrameStatFunctions private[sql](df: DataFrame) {
+
+  /**
+   * Finding frequent items for columns, possibly with false positives. 
Using the
+   * frequent element count algorithm described in
+   * [[http://dx.doi.org/10.1145/762471.762473, proposed by Karp, 
Schenker, and Papadimitriou]].
+   *
--- End diff --

make sure you document the range of support allowed.



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97683042
  
Merged build started.



[GitHub] spark pull request: [SPARK-7267][SQL]Push down Project when it's c...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5797#issuecomment-97688351
  
Merged build started.



[GitHub] spark pull request: [SPARK-7267][SQL]Push down Project when it's c...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5797#issuecomment-97688243
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29408488
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,13 @@ class HiveResolutionSuite extends HiveComparisonTest {
   .toDF().registerTempTable("caseSensitivityTest")
 
 val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+assert(query.schema.fields.map(_.name) === Seq("a", "b", "a", "b", "a", "b", "a", "b"),
   "The output schema did not preserve the case of the query.")
--- End diff --

OK, I see your point. I will keep the change minimal.



[GitHub] spark pull request: [SPARK-7031][ThriftServer]let thrift server ta...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5609#issuecomment-97695851
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31389/
Test PASSed.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29409065
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala
 ---
@@ -70,9 +70,8 @@ private[streaming] class 
StreamingJobProgressListener(ssc: StreamingContext)
 runningBatchInfos(batchStarted.batchInfo.batchTime) = 
batchStarted.batchInfo
 waitingBatchInfos.remove(batchStarted.batchInfo.batchTime)
 
-batchStarted.batchInfo.receivedBlockInfo.foreach { case (_, infos) =>
-  totalReceivedRecords += infos.map(_.numRecords).sum
-}
+// TODO. this should be fixed when input stream is not receiver based 
stream.
+totalReceivedRecords += 
batchStarted.batchInfo.streamIdToNumRecords.values.sum
--- End diff --

This can be replaced by `batchStarted.batchInfo.numRecords` if you 
implement `numRecords` as I said above.



[GitHub] spark pull request: [SPARK-7031][ThriftServer]let thrift server ta...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5609#issuecomment-97695850
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7031][ThriftServer]let thrift server ta...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5609#issuecomment-97695837
  
  [Test build #31389 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31389/consoleFull)
 for   PR 5609 at commit 
[`8d3fc16`](https://github.com/apache/spark/commit/8d3fc16dd22c87fbf768951b64dabe7d121731ec).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6602][Core] Update Master, Worker, Clie...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5392#issuecomment-97695959
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31388/
Test PASSed.



[GitHub] spark pull request: [SPARK-7249] Updated Hadoop dependencies due t...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5786#issuecomment-97698375
  
  [Test build #31400 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31400/consoleFull)
 for   PR 5786 at commit 
[`7e9955d`](https://github.com/apache/spark/commit/7e9955df29b5d5c9cda950636d51da753e6d17ea).



[GitHub] spark pull request: [SPARK-7274][SQL] Create Column expression for...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5802#issuecomment-97700907
  
  [Test build #31401 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31401/consoleFull)
 for   PR 5802 at commit 
[`0603a91`](https://github.com/apache/spark/commit/0603a915a75ce1429d0ceca843081602ce17c500).



[GitHub] spark pull request: [SPARK-7274][SQL] Create Column expression for...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5802#issuecomment-97700671
  
Merged build started.



[GitHub] spark pull request: [SPARK-6888][SQL] Make the jdbc driver handlin...

2015-04-30 Thread rtreffer

Github user rtreffer commented on the pull request:

https://github.com/apache/spark/pull/#issuecomment-97702611
  
Still no build :cry: 



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29411128
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,11 @@ class HiveResolutionSuite extends HiveComparisonTest {
   .toDF().registerTempTable("caseSensitivityTest")
 
 val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+assert(query.schema.fields.map(_.name) === Seq("a", "B", "a", "B", "a", "b", "A", "B"),
--- End diff --

I'm not sure what we really want here. When a user runs `SELECT b FROM t` and `t` 
has a column `B`, which name should we use in the result schema: `b` or `B`?
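
To make the two options concrete, here is a small sketch. It is not Spark's analyzer; `SchemaNameDemo` and its helpers are hypothetical and only model a case-insensitive column lookup:

```scala
object SchemaNameDemo {
  // Case-insensitive match of a requested name against the table's columns.
  def resolve(requested: String, columns: Seq[String]): Option[String] =
    columns.find(_.equalsIgnoreCase(requested))

  // Option 1: the result schema keeps the name as typed in the query.
  def nameAsQueried(requested: String, columns: Seq[String]): Option[String] =
    resolve(requested, columns).map(_ => requested)

  // Option 2: the result schema uses the name as declared in the table.
  def nameAsDeclared(requested: String, columns: Seq[String]): Option[String] =
    resolve(requested, columns)

  def main(args: Array[String]): Unit = {
    val columns = Seq("a", "B")
    println(nameAsQueried("b", columns))  // Some(b)
    println(nameAsDeclared("b", columns)) // Some(B)
  }
}
```

Both options resolve the same column; they differ only in which spelling ends up in the output schema.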



[GitHub] spark pull request: [SPARK-6999] [SQL] Remove the infinite recursi...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5804#issuecomment-97706125
  
Merged build started.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29412354
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,11 @@ class HiveResolutionSuite extends HiveComparisonTest {
   .toDF().registerTempTable("caseSensitivityTest")
 
 val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+assert(query.schema.fields.map(_.name) === Seq("a", "B", "a", "B", "a", "b", "A", "B"),
--- End diff --

Does that matter for a case-insensitive system? 
We do, however, need to keep the attribute name identical throughout the reference chain. 
This is a workaround for the bug fix; in the long term, we probably 
need to refactor the AttributeReference `equality` for name (or take the 
Resolver in?).




[GitHub] spark pull request: [SPARK-5342][YARN] Allow long running Spark ap...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4688#issuecomment-97712538
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31395/
Test PASSed.



[GitHub] spark pull request: [SPARK-4699][SQL] make caseSensitive configura...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5806#issuecomment-97712608
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-5342][YARN] Allow long running Spark ap...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4688#issuecomment-97712526
  
  [Test build #31395 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31395/consoleFull)
 for   PR 4688 at commit 
[`36eb8a9`](https://github.com/apache/spark/commit/36eb8a956c357388e4fdf5858cb4f27236f26a9e).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **removes the following dependencies:**
   * `RoaringBitmap-0.4.5.jar`
   * `activation-1.1.jar`
   * `akka-actor_2.10-2.3.4-spark.jar`
   * `akka-remote_2.10-2.3.4-spark.jar`
   * `akka-slf4j_2.10-2.3.4-spark.jar`
   * `aopalliance-1.0.jar`
   * `arpack_combined_all-0.1.jar`
   * `avro-1.7.7.jar`
   * `breeze-macros_2.10-0.11.2.jar`
   * `breeze_2.10-0.11.2.jar`
   * `chill-java-0.5.0.jar`
   * `chill_2.10-0.5.0.jar`
   * `commons-beanutils-1.7.0.jar`
   * `commons-beanutils-core-1.8.0.jar`
   * `commons-cli-1.2.jar`
   * `commons-codec-1.10.jar`
   * `commons-collections-3.2.1.jar`
   * `commons-compress-1.4.1.jar`
   * `commons-configuration-1.6.jar`
   * `commons-digester-1.8.jar`
   * `commons-httpclient-3.1.jar`
   * `commons-io-2.1.jar`
   * `commons-lang-2.5.jar`
   * `commons-lang3-3.3.2.jar`
   * `commons-math-2.1.jar`
   * `commons-math3-3.4.1.jar`
   * `commons-net-2.2.jar`
   * `compress-lzf-1.0.0.jar`
   * `config-1.2.1.jar`
   * `core-1.1.2.jar`
   * `curator-client-2.4.0.jar`
   * `curator-framework-2.4.0.jar`
   * `curator-recipes-2.4.0.jar`
   * `gmbal-api-only-3.0.0-b023.jar`
   * `grizzly-framework-2.1.2.jar`
   * `grizzly-http-2.1.2.jar`
   * `grizzly-http-server-2.1.2.jar`
   * `grizzly-http-servlet-2.1.2.jar`
   * `grizzly-rcm-2.1.2.jar`
   * `groovy-all-2.3.7.jar`
   * `guava-14.0.1.jar`
   * `guice-3.0.jar`
   * `hadoop-annotations-2.2.0.jar`
   * `hadoop-auth-2.2.0.jar`
   * `hadoop-client-2.2.0.jar`
   * `hadoop-common-2.2.0.jar`
   * `hadoop-hdfs-2.2.0.jar`
   * `hadoop-mapreduce-client-app-2.2.0.jar`
   * `hadoop-mapreduce-client-common-2.2.0.jar`
   * `hadoop-mapreduce-client-core-2.2.0.jar`
   * `hadoop-mapreduce-client-jobclient-2.2.0.jar`
   * `hadoop-mapreduce-client-shuffle-2.2.0.jar`
   * `hadoop-yarn-api-2.2.0.jar`
   * `hadoop-yarn-client-2.2.0.jar`
   * `hadoop-yarn-common-2.2.0.jar`
   * `hadoop-yarn-server-common-2.2.0.jar`
   * `ivy-2.4.0.jar`
   * `jackson-annotations-2.4.0.jar`
   * `jackson-core-2.4.4.jar`
   * `jackson-core-asl-1.8.8.jar`
   * `jackson-databind-2.4.4.jar`
   * `jackson-jaxrs-1.8.8.jar`
   * `jackson-mapper-asl-1.8.8.jar`
   * `jackson-module-scala_2.10-2.4.4.jar`
   * `jackson-xc-1.8.8.jar`
   * `jansi-1.4.jar`
   * `javax.inject-1.jar`
   * `javax.servlet-3.0.0.v201112011016.jar`
   * `javax.servlet-3.1.jar`
   * `javax.servlet-api-3.0.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`
   * `jcl-over-slf4j-1.7.10.jar`
   * `jersey-client-1.9.jar`
   * `jersey-core-1.9.jar`
   * `jersey-grizzly2-1.9.jar`
   * `jersey-guice-1.9.jar`
   * `jersey-json-1.9.jar`
   * `jersey-server-1.9.jar`
   * `jersey-test-framework-core-1.9.jar`
   * `jersey-test-framework-grizzly2-1.9.jar`
   * `jets3t-0.7.1.jar`
   * `jettison-1.1.jar`
   * `jetty-util-6.1.26.jar`
   * `jline-0.9.94.jar`
   * `jline-2.10.4.jar`
   * `jodd-core-3.6.3.jar`
   * `json4s-ast_2.10-3.2.10.jar`
   * `json4s-core_2.10-3.2.10.jar`
   * `json4s-jackson_2.10-3.2.10.jar`
   * `jsr305-1.3.9.jar`
   * `jtransforms-2.4.0.jar`
   * `jul-to-slf4j-1.7.10.jar`
   * `kryo-2.21.jar`
   * `log4j-1.2.17.jar`
   * `lz4-1.2.0.jar`
   * `management-api-3.0.0-b012.jar`
   * `mesos-0.21.0-shaded-protobuf.jar`
   * `metrics-core-3.1.0.jar`
   * `metrics-graphite-3.1.0.jar`
   * `metrics-json-3.1.0.jar`
   * `metrics-jvm-3.1.0.jar`
   * `minlog-1.2.jar`
   * `netty-3.8.0.Final.jar`
   * `netty-all-4.0.23.Final.jar`
   * `objenesis-1.2.jar`
   * `opencsv-2.3.jar`
   * `oro-2.0.8.jar`
   * `paranamer-2.6.jar`
   * `parquet-column-1.6.0rc3.jar`
   * `parquet-common-1.6.0rc3.jar`
   * `parquet-encoding-1.6.0rc3.jar`
   * `parquet-format-2.2.0-rc1.jar`
   * `parquet-generator-1.6.0rc3.jar`
   * `parquet-hadoop-1.6.0rc3.jar`
   * `parquet-jackson-1.6.0rc3.jar`
   * `protobuf-java-2.4.1.jar`
   * `protobuf-java-2.5.0-spark.jar`
   * `py4j-0.8.2.1.jar`
   * `pyrolite-2.0.1.jar`
   * `quasiquotes_2.10-2.0.1.jar`
   * `reflectasm-1.07-shaded.jar`
   * `scala-compiler-2.10.4.jar`
   * `scala-library-2.10.4.jar`
   *

[GitHub] spark pull request: [SPARK-4699][SQL] make caseSensitive configura...

2015-04-30 Thread scwf

GitHub user scwf opened a pull request:

https://github.com/apache/spark/pull/5806

[SPARK-4699][SQL] make caseSensitive configurable in Analyzer.scala



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/scwf/spark case

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5806.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5806


commit 578d167bfccdc2d1d5ce9ca06cab7b7b753bb3eb
Author: Jacky Li jacky.li...@huawei.com
Date:   2014-12-02T17:33:44Z

make caseSensitive configurable

commit f57f15ce72652b3a04229a860a2aba22297368b8
Author: Jacky Li jacky.li...@huawei.com
Date:   2014-12-03T17:48:04Z

add testcase

commit 91b1b9606055211cfab409dbdecaa708aa83be34
Author: Jacky Li jacky.li...@huawei.com
Date:   2014-12-20T16:38:19Z

make caseSensitive configurable in Analyzer

commit e7bca31f6856a4fe2e301bc2ea608d709dcbe334
Author: Jacky Li jacky.li...@huawei.com
Date:   2014-12-20T17:25:20Z

make caseSensitive configuration in Analyzer and Catalog

commit fcbf0d9162574cf6f28dc703224e23d357f0aad9
Author: Jacky Li jacky.li...@huawei.com
Date:   2014-12-20T17:36:44Z

fix scalastyle check

commit 6332e0ffeac2180406cabfe789ef0ba697b49fa9
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-01-03T13:56:51Z

fix bug

commit 005c56d7a4a9c0797870810efe227d3cef225b12
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-01-03T14:08:27Z

make SQLContext caseSensitivity configurable

commit 9bf4cc7dbb069c4969c5f317590e3e9ddc4efd4f
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-01-03T14:39:10Z

fix bug in catalyst

commit 73c16b13b23e2b9e98ac6fb1864d8c98a3813dfb
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-01-03T17:02:17Z

fix bug in sql/hive

commit 05b09a3c1008869571e438c12e8593def7ecdc2c
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-01-19T07:42:39Z

fix conflict base on the latest master branch

commit dee56e9ae71ebd9c8464cf6be763895e8bcdf2e6
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-01-19T09:55:50Z

fix test case failure

commit 39e369c67f92105956486624cbc7d937627fd141
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-02-03T17:42:21Z

fix confilct after DataFrame PR

commit 12eca9a71d05fe74d44e4298f0587af31bf380d4
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-02-21T15:29:16Z

solve conflict with master

commit 664d1e9e610f2bef172cc3a10de452f1752ca51b
Author: Jacky Li jacky.li...@huawei.com
Date:   2015-02-21T15:30:37Z

Merge branch 'master' of https://github.com/apache/spark into case

commit 56034ca4baa25819b322905490cb0b75543f500c
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T07:47:38Z

fix conflicts and improve for catalystconf

commit 5472b0832213aa0d7f092c06f54095477e695c93
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:17:28Z

fix compile issue

commit 69b3b708c2b78ed2e1061d69ef3e7c3b5e2d94c6
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:32:35Z

fix AnalysisSuite

commit fd30e25f84e569769519282cc3ec39e58a200e87
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:34:34Z

added override

commit 966e719b77e3e8e3e715e58f3a0aeed3b4aba009
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:41:13Z

set CASE_SENSITIVE false in hivecontext

commit 5d7c45618bcc0ba1195e406230972a9c237016c7
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:46:35Z

set CASE_SENSITIVE false in TestHive

commit 6ef31cfb5269e6298349cf97fbe28fcfa43c26ec
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:53:07Z

revert pom changes

commit eee75bad4d7eacb73cfc57ea733aed1dcd97ec11
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:55:34Z

fix EmptyConf

commit d5a99337c86c92e705098b95f844b928e5129213
Author: wangfei wangf...@huawei.com
Date:   2015-04-30T08:59:04Z

fix style





[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97678166
  
  [Test build #31390 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31390/consoleFull)
 for   PR 5744 at commit 
[`51719b7`](https://github.com/apache/spark/commit/51719b7f612859219ba31658da4e9582c6ef2856).



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5744#issuecomment-97677743
  
Merged build started.



[GitHub] spark pull request: [SPARK-7133][SQL] Implement struct, array, and...

2015-04-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5744#discussion_r29406389
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1166,7 +1166,7 @@ def __init__(self, jc):
 
 # container operators
 __contains__ = _bin_op("contains")
-__getitem__ = _bin_op("getItem")
+__getitem__ = _bin_op("apply")
--- End diff --

can we add a unit test?

you can add it in tests.py



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97681263
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5799#discussion_r29406968
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala 
---
@@ -0,0 +1,127 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*    http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+
+package org.apache.spark.sql.execution.stat
+
+import org.apache.spark.Logging
+import org.apache.spark.sql.{Column, DataFrame, Row}
+import org.apache.spark.sql.catalyst.plans.logical.LocalRelation
+import org.apache.spark.sql.types.{ArrayType, StructField, StructType}
+
+import scala.collection.mutable.{Map => MutableMap}
+
+private[sql] object FrequentItems extends Logging {
+
+  /** A helper class wrapping `MutableMap[Any, Long]` for simplicity. */
+  private class FreqItemCounter(size: Int) extends Serializable {
+val baseMap: MutableMap[Any, Long] = MutableMap.empty[Any, Long]
+
+/**
+ * Add a new example to the counts if it exists, otherwise deduct the 
count
+ * from existing items.
+ */
+def add(key: Any, count: Long): this.type = {
+  if (baseMap.contains(key))  {
+baseMap(key) += count
+  } else {
+if (baseMap.size < size) {
+  baseMap += key -> count
+} else {
+  // TODO: Make this more efficient... A flatMap?
+  baseMap.retain((k, v) => v > count)
+  baseMap.transform((k, v) => v - count)
+}
+  }
+  this
+}
+
+/**
+ * Merge two maps of counts.
+ * @param other The map containing the counts for that partition
+ */
+def merge(other: FreqItemCounter): this.type = {
+  other.toSeq.foreach { case (k, v) =>
+add(k, v)
+  }
+  this
+}
+
+def toSeq: Seq[(Any, Long)] = baseMap.toSeq
+
+def foldLeft[A, B](start: A)(f: (A, (Any, Long)) => A): A = 
baseMap.foldLeft(start)(f)
+
+def freqItems: Seq[Any] = baseMap.keys.toSeq
+  }
+
+  /**
+   * Finding frequent items for columns, possibly with false positives. 
Using the 
+   * frequent element count algorithm described in
+   * [[http://dx.doi.org/10.1145/762471.762473, proposed by Karp, 
Schenker, and Papadimitriou]].
+   * For Internal use only.
+   *
+   * @param df The input DataFrame
+   * @param cols the names of the columns to search frequent items in
+   * @param support The minimum frequency for an item to be considered 
`frequent`
+   * @return A Local DataFrame with the Array of frequent items for each 
column.
+   */
+  private[sql] def singlePassFreqItems(
+  df: DataFrame, 
+  cols: Seq[String],
+  support: Double): DataFrame = {
+if (support < 1e-6) {
--- End diff --

```scala
require(support >= 1e-6, s"support ($support) must be greater than 1e-6.")
```
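
For illustration, a self-contained sketch of how such a `require` check behaves. The `counterSize` helper and the way it derives a size from `support` are assumptions made for this example, not code from the PR:

```scala
object SupportCheckDemo {
  // Hypothetical helper: validates `support` up front, then derives a
  // counter size from it. `require` throws IllegalArgumentException on failure.
  def counterSize(support: Double): Int = {
    require(support >= 1e-6, s"support ($support) must be greater than 1e-6.")
    (1 / support).toInt
  }

  def main(args: Array[String]): Unit = {
    println(counterSize(0.25))
    // A too-small support is rejected before any work is done.
    val rejected = scala.util.Try(counterSize(1e-7))
    println(rejected.isFailure)
  }
}
```

Compared with an explicit `if` plus a manual `throw`, `require` states the precondition in one line and carries the offending value in the message.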



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29407261
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/package.scala
 ---
@@ -29,12 +29,23 @@ package object analysis {
 
   /**
* Resolver should return true if the first string refers to the same 
entity as the second string.
-   * For example, by using case insensitive equality.
+   * For example, by using case insensitive equality. Besides, Resolver 
also provides the ability
+   * to normalize the string according to its semantic.
*/
-  type Resolver = (String, String) => Boolean
+  trait Resolver {
+def apply(a: String, b: String): Boolean
+def apply(a: String): String
+  }
+
+  val caseInsensitiveResolution = new Resolver {
+override def apply(a: String, b: String): Boolean = 
a.equalsIgnoreCase(b)
+override def apply(a: String): String = a.toLowerCase // as Hive does
--- End diff --

If we want to add this, I think we should call it normalize. Maybe change 
the first apply to something else in the future.

I'm not sure if we need to add this though. I will let @marmbrus comment on 
that.
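
A sketch of what the renamed method might look like, assuming the trait from the diff with the one-argument `apply` renamed to `normalize` (the `ResolverDemo` object and the case-sensitive variant are added here purely for illustration):

```scala
// Hypothetical variant of the Resolver trait from the diff: the
// one-argument `apply` is renamed to `normalize`, as suggested above.
trait Resolver {
  def apply(a: String, b: String): Boolean
  def normalize(a: String): String
}

object ResolverDemo {
  val caseInsensitiveResolution: Resolver = new Resolver {
    override def apply(a: String, b: String): Boolean = a.equalsIgnoreCase(b)
    override def normalize(a: String): String = a.toLowerCase
  }

  val caseSensitiveResolution: Resolver = new Resolver {
    override def apply(a: String, b: String): Boolean = a == b
    override def normalize(a: String): String = a
  }

  def main(args: Array[String]): Unit = {
    println(caseInsensitiveResolution("Key", "key"))
    println(caseInsensitiveResolution.normalize("Key"))
    println(caseSensitiveResolution("Key", "key"))
  }
}
```

Giving the two operations distinct names avoids overloading `apply` with two unrelated meanings (comparison and normalization).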




[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread jerryshao

Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29410424
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
@@ -95,7 +95,7 @@ private[ui] class StreamingPage(parent: StreamingTab)
 "Maximum rate\n[events/sec]",
 "Last Error"
   )
-  val dataRows = (0 until listener.numReceivers).map { receiverId =>
--- End diff --

Now all the input streams will have a unique id (not only receiver based 
input streams), so assuming  this will get error.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5798#issuecomment-97702165
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5798#issuecomment-97702224
  
  [Test build #31403 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31403/consoleFull)
 for   PR 5798 at commit 
[`1f0ed92`](https://github.com/apache/spark/commit/1f0ed9236527bf1071f2cc4a5815f5f705f85dc5).



[GitHub] spark pull request: SPARK-7171: Added a method to retrieve metrics...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5805#issuecomment-97713353
  
  [Test build #31406 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31406/consoleFull)
 for   PR 5805 at commit 
[`92aa76f`](https://github.com/apache/spark/commit/92aa76fc559499470595fcd772d750b34d128cc6).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `jaxb-api-2.2.7.jar`
   * `jaxb-core-2.2.7.jar`
   * `jaxb-impl-2.2.7.jar`
   * `pmml-agent-1.1.15.jar`
   * `pmml-model-1.1.15.jar`
   * `pmml-schema-1.1.15.jar`

 * This patch **removes the following dependencies:**
   * `activation-1.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`




[GitHub] spark pull request: SPARK-7171: Added a method to retrieve metrics...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5805#issuecomment-97713362
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: SPARK-7171: Added a method to retrieve metrics...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5805#issuecomment-97713365
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31406/
Test FAILed.



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97679106
  
Merged build started.



[GitHub] spark pull request: [SPARK-7242][SQL][MLLIB] Frequent items for Da...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5799#issuecomment-97679097
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7232] [SQL] Add a Substitution batch fo...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5776#issuecomment-97680177
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31385/
Test PASSed.



[GitHub] spark pull request: [SPARK-7232] [SQL] Add a Substitution batch fo...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5776#issuecomment-97680176
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7232] [SQL] Add a Substitution batch fo...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5776#issuecomment-97680167
  
  [Test build #31385 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31385/consoleFull)
 for   PR 5776 at commit 
[`553005a`](https://github.com/apache/spark/commit/553005a4e9aebcbb42c712efd833118235d205dc).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29406841
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,13 @@ class HiveResolutionSuite extends HiveComparisonTest {
       .toDF().registerTempTable("caseSensitivityTest")
 
     val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-    assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+    assert(query.schema.fields.map(_.name) === Seq("a", "b", "a", "b", "a", "b", "a", "b"),
       "The output schema did not preserve the case of the query.")
--- End diff --

In Hive
```
hive> create table ddDD as select Key, valUe from src;
hive> desc extended ;
OK
key string  
value   string  
 
Detailed Table Information  Table(tableName:, dbName:default, 
owner:hcheng, createTime:1430368423, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:key, type:string, comment:null), 
FieldSchema(name:value, type:string, comment:null)], 
location:file:/home/hcheng/warehouse/, 
inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
partitionKeys:[], parameters:{numFiles=1, COLUMN_STATS_ACCURATE=true, 
transient_lastDdlTime=1430368423, numRows=0, totalSize=5824, rawDataSize=0}, 
viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)   
Time taken: 0.111 seconds, Fetched: 4 row(s)
```
You will see that both the table name and the column names are normalized (to
lower case), so I think preserving the case is probably not necessary (the
normalized name is what we want, isn't it?)



[GitHub] spark pull request: [SPARK-7112][Streaming] Add a DirectStreamTrac...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97692199
  
Build started.



[GitHub] spark pull request: [SPARK-7112][Streaming] Add a DirectStreamTrac...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97692178
  
 Build triggered.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/5680#issuecomment-97694754
  
There are merge conflicts! Please merge master!



[GitHub] spark pull request: [SPARK-6602][Core] Update Master, Worker, Clie...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5392#issuecomment-97695946
  
  [Test build #31388 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31388/consoleFull)
 for   PR 5392 at commit 
[`72304f0`](https://github.com/apache/spark/commit/72304f0150e74eb6432fc3141d3d5bc71bb93d61).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class Heartbeat(workerId: String, worker: RpcEndpointRef) extends DeployMessage`
  * `  case class RegisteredWorker(master: RpcEndpointRef, masterWebUiUrl: String) extends DeployMessage`
  * `  case class RegisterApplication(appDescription: ApplicationDescription, driver: RpcEndpointRef)`
  * `  case class RegisteredApplication(appId: String, master: RpcEndpointRef) extends DeployMessage`
  * `  case class MasterChanged(master: RpcEndpointRef, masterWebUiUrl: String)`

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6602][Core] Update Master, Worker, Clie...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5392#issuecomment-97695958
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread jerryshao

Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29409167
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/InputInfoTracker.scala
 ---
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.scheduler
+
+import scala.collection.mutable
+
+import org.apache.spark.Logging
+import org.apache.spark.streaming.{Time, StreamingContext}
+
+/** To track the information of input stream at specified batch time. */
+case class InputInfo(batchTime: Time, inputStreamId: Int, numRecords: Long)
+
+/**
+ * This class manages all the input streams as well as their input data statistics. The information
+ * will output to StreamingListener to better monitoring.
+ */
+private[streaming] class InputInfoTracker(ssc: StreamingContext) extends Logging {
+
+  /** Track all the input streams registered in DStreamGraph */
+  val inputStreams = ssc.graph.getInputStreams()
+  /** Track all the id of input streams registered in DStreamGraph */
+  val inputStreamIds = inputStreams.map(_.id)
+
+  // Map to track all the InputInfo related to specific batch time and input stream.
+  private val batchTimeToInputInfos = new mutable.HashMap[Time, mutable.HashMap[Int, InputInfo]]
+
+  /** Report the input information with batch time to the tracker */
+  def reportInfo(batchTime: Time, inputInfo: InputInfo): Unit = synchronized {
+    val inputInfos = batchTimeToInputInfos.getOrElseUpdate(batchTime,
+      new mutable.HashMap[Int, InputInfo]())
+
+    if (inputInfos.contains(inputInfo.inputStreamId)) {
+      throw new IllegalStateException(s"Input stream ${inputInfo.inputStreamId}} for batch" +
+        s"$batchTime is already added into InputInfoTracker, this is a illegal state")
+    }
+    inputInfos += ((inputInfo.inputStreamId, inputInfo))
+  }
+
+  /** Get the all the input stream's information of specified batch time */
+  def getInfo(batchTime: Time): Map[Int, InputInfo] = synchronized {
+    val inputInfos = batchTimeToInputInfos.get(batchTime)
+    // Convert mutable HashMap to immutable Map for the caller
+    inputInfos.map(_.toMap).getOrElse(Map[Int, InputInfo]())
+  }
+
+  /** Get the input information of specified batch time and input stream id */
+  def getInfoOfBatchAndStream(batchTime: Time, inputStreamId: Int
--- End diff --

Yes, it is only used for tests; I can remove it if necessary.
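
If the test-only accessor were dropped, a test could presumably reach the same entry through `getInfo`. The sketch below is a simplified stand-in and not the code in this PR: `InputInfoSketch` and `InputInfoTrackerSketch` are hypothetical names, and the batch time is assumed to be a plain `Long` in milliseconds instead of Spark's `Time`, so that it runs without Spark on the classpath.

```scala
import scala.collection.mutable

// Hypothetical stand-in for InputInfo; batchTime is a Long, not Spark's Time.
case class InputInfoSketch(batchTime: Long, inputStreamId: Int, numRecords: Long)

// Hypothetical stand-in for the tracker, keeping only the two public methods.
class InputInfoTrackerSketch {
  private val batchTimeToInputInfos =
    new mutable.HashMap[Long, mutable.HashMap[Int, InputInfoSketch]]

  /** Records one stream's input for a batch and rejects duplicate reports. */
  def reportInfo(batchTime: Long, info: InputInfoSketch): Unit = synchronized {
    val infos = batchTimeToInputInfos.getOrElseUpdate(
      batchTime, new mutable.HashMap[Int, InputInfoSketch]())
    if (infos.contains(info.inputStreamId)) {
      throw new IllegalStateException(
        s"Input stream ${info.inputStreamId} for batch $batchTime was already reported")
    }
    infos += ((info.inputStreamId, info))
  }

  /** Returns an immutable snapshot of everything reported for a batch. */
  def getInfo(batchTime: Long): Map[Int, InputInfoSketch] = synchronized {
    batchTimeToInputInfos.get(batchTime)
      .map(_.toMap)
      .getOrElse(Map.empty[Int, InputInfoSketch])
  }
}
```

With this shape, a test can look up a single stream with `tracker.getInfo(batchTime).get(inputStreamId)`, which returns an `Option` and so covers the missing-batch and missing-stream cases without a separate method.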



[GitHub] spark pull request: [SPARK-7196][SQL] Support precision and scale ...

2015-04-30 Thread viirya

Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/5777#issuecomment-97696066
  
@rxin ok.



[GitHub] spark pull request: [SPARK-7112][Streaming][WIP] Add a InputInfoTr...

2015-04-30 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/5680#discussion_r29409159
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala
 ---
@@ -135,28 +132,25 @@ private[streaming] class StreamingJobProgressListener(ssc: StreamingContext)
 
   def receivedRecordsDistributions: Map[Int, Option[Distribution]] = synchronized {
     val latestBatchInfos = retainedBatches.reverse.take(batchInfoLimit)
-    val latestBlockInfos = latestBatchInfos.map(_.receivedBlockInfo)
-    (0 until numReceivers).map { receiverId =>
-      val blockInfoOfParticularReceiver = latestBlockInfos.map { batchInfo =>
-        batchInfo.get(receiverId).getOrElse(Array.empty)
-      }
-      val recordsOfParticularReceiver = blockInfoOfParticularReceiver.map { blockInfo =>
-      // calculate records per second for each batch
-        blockInfo.map(_.numRecords).sum.toDouble * 1000 / batchDuration
-      }
-      val distributionOption = Distribution(recordsOfParticularReceiver)
-      (receiverId, distributionOption)
+
+    // TODO. this should be fixed when receiver-less input stream is mixed into BatchInfo
--- End diff --

What does this TODO mean?



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/5798#issuecomment-97701616
  
Thank you for the comments. I've updated the code to preserve the
attribute name. Attribute name normalization still seems to require some
discussion, so let's leave it for a future improvement.



[GitHub] spark pull request: SPARK-7171: Added a method to retrieve metrics...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5805#issuecomment-97706740
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/5798#discussion_r29411449
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala
 ---
@@ -81,9 +81,11 @@ class HiveResolutionSuite extends HiveComparisonTest {
       .toDF().registerTempTable("caseSensitivityTest")
 
     val query = sql("SELECT a, b, A, B, n.a, n.b, n.A, n.B FROM caseSensitivityTest")
-    assert(query.schema.fields.map(_.name) === Seq("a", "b", "A", "B", "a", "b", "A", "B"),
+    assert(query.schema.fields.map(_.name) === Seq("a", "B", "a", "B", "a", "b", "A", "B"),
--- End diff --

I'm not sure what we really want here. When a user runs `SELECT b FROM t` and
`t` has a column `B`, which one should we use in the result schema: `b` or `B`?
cc @marmbrus
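
To make the two candidate behaviours concrete, here is a minimal sketch in plain Scala. It is only an illustration of the question, not the analyzer's actual resolution logic: `CaseResolutionSketch`, `resolve`, `nameFromQuery`, and `nameFromTable` are hypothetical names, and a table is assumed to be nothing more than a list of column names.

```scala
// Hypothetical, simplified model of case-insensitive column resolution.
// None of these names come from Spark; they only illustrate the two choices.
object CaseResolutionSketch {
  /** Finds the table column that matches `requested`, ignoring case. */
  def resolve(tableColumns: Seq[String], requested: String): Option[String] =
    tableColumns.find(_.equalsIgnoreCase(requested))

  /** Choice 1: the result schema keeps the spelling used in the query. */
  def nameFromQuery(tableColumns: Seq[String], requested: String): Option[String] =
    resolve(tableColumns, requested).map(_ => requested)

  /** Choice 2: the result schema keeps the spelling stored in the table. */
  def nameFromTable(tableColumns: Seq[String], requested: String): Option[String] =
    resolve(tableColumns, requested)
}
```

For a table with columns `a` and `B`, `nameFromQuery(Seq("a", "B"), "b")` yields `Some("b")`, while `nameFromTable(Seq("a", "B"), "b")` yields `Some("B")`. Both resolve to the same column and differ only in the name reported in the result schema.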



[GitHub] spark pull request: SPARK-7171: Added a method to retrieve metrics...

2015-04-30 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5805#issuecomment-97706828
  
  [Test build #31406 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31406/consoleFull)
 for   PR 5805 at commit 
[`92aa76f`](https://github.com/apache/spark/commit/92aa76fc559499470595fcd772d750b34d128cc6).



[GitHub] spark pull request: [SPARK-6913][SQL] Fixed java.sql.SQLException...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5782#issuecomment-97715156
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-7269] [SQL] Incorrect analysis for aggr...

2015-04-30 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5798#issuecomment-97679300
  
Merged build finished. Test PASSed.


