[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73003/
Test PASSed.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #73003 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73003/testReport)**
 for PR 16857 at commit 
[`96807d2`](https://github.com/apache/spark/commit/96807d21ea687b4ce5f1a298b969e9117548a3a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16957: [SPARK-19550][HOTFIX][BUILD] Use JAVA_HOME/bin/java if J...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16957
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73001/
Test PASSed.





[GitHub] spark issue #16957: [SPARK-19550][HOTFIX][BUILD] Use JAVA_HOME/bin/java if J...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16957
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16957: [SPARK-19550][HOTFIX][BUILD] Use JAVA_HOME/bin/java if J...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16957
  
**[Test build #73001 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73001/testReport)**
 for PR 16957 at commit 
[`17a61f1`](https://github.com/apache/spark/commit/17a61f1d47e44f0fc27b0354dddefcd9bc9adf57).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16960: [SPARK-19447] Make Range operator generate "recordsRead"...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16960
  
**[Test build #73005 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73005/testReport)**
 for PR 16960 at commit 
[`088556b`](https://github.com/apache/spark/commit/088556b0ee1b5c33c88b4ccddcae11b2eb18660f).





[GitHub] spark issue #9608: [SPARK-11638] [Mesos + Docker Bridge networking]: Run Spa...

2017-02-16 Thread cherryii
Github user cherryii commented on the issue:

https://github.com/apache/spark/pull/9608
  
Yes





[GitHub] spark issue #9608: [SPARK-11638] [Mesos + Docker Bridge networking]: Run Spa...

2017-02-16 Thread radekg
Github user radekg commented on the issue:

https://github.com/apache/spark/pull/9608
  
Is this running on Mesos?





[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-02-16 Thread erikerlandson
Github user erikerlandson commented on the issue:

https://github.com/apache/spark/pull/13440
  
Hi @wangmiao1981,

I am still interested in this, but I don't have any sense about whether 
upstream has any interest.  Does upstream have any intention to accept it?





[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-02-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16954
  
cc @hvanhovell I did the internal review. It is ready for you to review it. 
Thanks!





[GitHub] spark issue #16960: [SPARK-19447] Make Range operator generate "recordsRead"...

2017-02-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16960
  
cc @hvanhovell if you have a min to review this ...






[GitHub] spark pull request #16960: [SPARK-19447] Make Range operator generate "recor...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16960#discussion_r101575264
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala ---
@@ -309,4 +314,84 @@ class SQLMetricsSuite extends SparkFunSuite with SharedSQLContext {
     assert(metricInfoDeser.metadata === Some(AccumulatorContext.SQL_ACCUM_IDENTIFIER))
   }
 
+  test("range metrics") {
+    val res1 = InputOutputMetricsHelper.run(
+      spark.range(30).filter(x => x % 3 == 0).toDF()
+    )
+    assert(res1 === (30L, 0L, 30L) :: Nil)
+
+    val res2 = InputOutputMetricsHelper.run(
+      spark.range(150).repartition(4).filter(x => x < 10).toDF()
+    )
+    assert(res2 === (150L, 0L, 150L) :: (0L, 150L, 10L) :: Nil)
+
+    withTempDir { tempDir =>
+      val dir = new File(tempDir, "pqS").getCanonicalPath
+
+      spark.range(10).write.parquet(dir)
+      spark.read.parquet(dir).createOrReplaceTempView("pqS")
+
+      val res3 = InputOutputMetricsHelper.run(
+        spark.range(0, 30).repartition(3).crossJoin(sql("select * from pqS")).repartition(2).toDF()
+      )
+      assert(res3 === (10L, 0L, 10L) :: (30L, 0L, 30L) :: (0L, 30L, 300L) :: (0L, 300L, 0L) :: Nil)
+    }
+  }
+}
+
+object InputOutputMetricsHelper {
+   private class InputOutputMetricsListener extends SparkListener {
+    private case class MetricsResult(
+        var recordsRead: Long = 0L,
+        var shuffleRecordsRead: Long = 0L,
+        var sumMaxOutputRows: Long = 0L)
+
+    private[this] var stageIdToMetricsResult = HashMap.empty[Int, MetricsResult]
+
+    def reset(): Unit = {
+      stageIdToMetricsResult = HashMap.empty[Int, MetricsResult]
+    }
+
+    def getResults(): List[(Long, Long, Long)] = {
--- End diff --

Here too: please document what the three Long values are for.





[GitHub] spark pull request #16960: [SPARK-19447] Make Range operator generate "recor...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16960#discussion_r101575199
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala ---
@@ -309,4 +314,84 @@ class SQLMetricsSuite extends SparkFunSuite with SharedSQLContext {
     assert(metricInfoDeser.metadata === Some(AccumulatorContext.SQL_ACCUM_IDENTIFIER))
   }
 
+  test("range metrics") {
+    val res1 = InputOutputMetricsHelper.run(
+      spark.range(30).filter(x => x % 3 == 0).toDF()
+    )
+    assert(res1 === (30L, 0L, 30L) :: Nil)
+
+    val res2 = InputOutputMetricsHelper.run(
+      spark.range(150).repartition(4).filter(x => x < 10).toDF()
+    )
+    assert(res2 === (150L, 0L, 150L) :: (0L, 150L, 10L) :: Nil)
+
+    withTempDir { tempDir =>
+      val dir = new File(tempDir, "pqS").getCanonicalPath
+
+      spark.range(10).write.parquet(dir)
+      spark.read.parquet(dir).createOrReplaceTempView("pqS")
+
+      val res3 = InputOutputMetricsHelper.run(
+        spark.range(0, 30).repartition(3).crossJoin(sql("select * from pqS")).repartition(2).toDF()
+      )
+      assert(res3 === (10L, 0L, 10L) :: (30L, 0L, 30L) :: (0L, 30L, 300L) :: (0L, 300L, 0L) :: Nil)
+    }
+  }
+}
+
+object InputOutputMetricsHelper {
+   private class InputOutputMetricsListener extends SparkListener {
+    private case class MetricsResult(
+        var recordsRead: Long = 0L,
+        var shuffleRecordsRead: Long = 0L,
+        var sumMaxOutputRows: Long = 0L)
+
+    private[this] var stageIdToMetricsResult = HashMap.empty[Int, MetricsResult]
+
+    def reset(): Unit = {
+      stageIdToMetricsResult = HashMap.empty[Int, MetricsResult]
+    }
+
+    def getResults(): List[(Long, Long, Long)] = {
+      stageIdToMetricsResult.keySet.toList.sorted.map({ stageId =>
+        val res = stageIdToMetricsResult(stageId)
+        (res.recordsRead, res.shuffleRecordsRead, res.sumMaxOutputRows)})
+    }
+
+    override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = synchronized {
+      val res = stageIdToMetricsResult.getOrElseUpdate(taskEnd.stageId, { MetricsResult() })
+
+      res.recordsRead += taskEnd.taskMetrics.inputMetrics.recordsRead
+      res.shuffleRecordsRead += taskEnd.taskMetrics.shuffleReadMetrics.recordsRead
+
+      var maxOutputRows = 0L
+      for (accum <- taskEnd.taskMetrics.externalAccums) {
+        val info = accum.toInfo(Some(accum.value), None)
+        if (info.name.toString.contains("number of output rows")) {
+          info.update match {
+            case Some(n: Number) =>
+              if (n.longValue() > maxOutputRows) {
+                maxOutputRows = n.longValue()
+              }
+            case _ => // Ignore.
+          }
+        }
+      }
+      res.sumMaxOutputRows += maxOutputRows
+    }
+  }
+
+  def run(df: DataFrame): List[(Long, Long, Long)] = {
--- End diff --

Can you document what the three Long values are for?
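
One way to address this review comment is to replace the bare tuple with a small, documented case class. The following is only a sketch: `StageMetricsSketch`, `StageMetrics`, and `sortedByStage` are hypothetical names that do not appear in the PR, and the field descriptions are inferred from the `MetricsResult` fields and the `onTaskEnd` logic quoted in the diff.

```scala
// Hypothetical alternative to List[(Long, Long, Long)]; not part of the PR.
object StageMetricsSketch {
  /**
   * Per-stage totals, as accumulated by the listener in the diff.
   *
   * @param recordsRead        sum of inputMetrics.recordsRead over the stage's tasks
   * @param shuffleRecordsRead sum of shuffleReadMetrics.recordsRead over the stage's tasks
   * @param sumMaxOutputRows   sum, over tasks, of each task's largest
   *                           "number of output rows" accumulator value
   */
  final case class StageMetrics(
      recordsRead: Long,
      shuffleRecordsRead: Long,
      sumMaxOutputRows: Long)

  /** Returns the per-stage metrics ordered by stage id, like getResults() in the diff. */
  def sortedByStage(byStage: Map[Int, StageMetrics]): List[StageMetrics] =
    byStage.toList.sortBy(_._1).map(_._2)
}
```

With named fields, an assertion such as `res2 === (150L, 0L, 150L) :: (0L, 150L, 10L) :: Nil` could instead spell out which number is which.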






[GitHub] spark issue #15120: [SPARK-4563][core] Allow driver to advertise a different...

2017-02-16 Thread cherryii
Github user cherryii commented on the issue:

https://github.com/apache/spark/pull/15120
  
I am using Spark 2.1.0. My Docker container is a Spark driver that only runs
my job, and its entrypoint is a script that runs spark-submit. I use the
client deploy mode and set the conf values defined above, but I get the error
below when using bridge networking. Does anyone know what I'm doing wrong, or
why bridge networking isn't working with those conf values?

java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:433)
    at sun.nio.ch.Net.bind(Net.java:425)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
    at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
    at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
    at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
    at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
    at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
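
"Cannot assign requested address" is what a bind reports when no local interface owns the requested address, which can happen in bridge mode if the driver tries to bind to the Docker host's IP from inside the container. The message does not quote the conf values actually used, so the sketch below only assumes they are the settings this PR is concerned with: `spark.driver.bindAddress` for the address bound inside the container, and `spark.driver.host` plus `spark.driver.port` for what is advertised to executors. The object name, method name, and example addresses are hypothetical.

```scala
// Sketch only: the thread does not list the conf values that were actually set.
object BridgeNetworkingConfSketch {
  /**
   * Builds spark-submit --conf arguments for a driver in a bridge-networked container.
   *
   * @param bindAddress    address the driver binds to inside the container
   * @param advertisedHost address executors should use to reach the driver
   * @param driverPort     fixed driver port, assumed to be published by the container
   */
  def driverConfArgs(
      bindAddress: String,
      advertisedHost: String,
      driverPort: Int): Seq[String] =
    Seq(
      "spark.driver.bindAddress" -> bindAddress,
      "spark.driver.host" -> advertisedHost,
      "spark.driver.port" -> driverPort.toString
    ).flatMap { case (key, value) => Seq("--conf", s"$key=$value") }
}
```

For example, `driverConfArgs("0.0.0.0", "10.0.0.5", 7001)` yields the flags for binding on all container interfaces while advertising a (made-up) host address; whether that resolves the error in this particular setup is not confirmed by the thread.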





[GitHub] spark issue #16960: [SPARK-19447] Make Range operator generate "recordsRead"...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16960
  
**[Test build #73004 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73004/testReport)**
 for PR 16960 at commit 
[`10a53a7`](https://github.com/apache/spark/commit/10a53a783682a3d8966a8ba6bb255aadcb9dc87d).





[GitHub] spark pull request #16960: [SPARK-19447] Make Range operator generate "recor...

2017-02-16 Thread ala
GitHub user ala opened a pull request:

https://github.com/apache/spark/pull/16960

[SPARK-19447] Make Range operator generate "recordsRead" metric

## What changes were proposed in this pull request?

The Range was modified to produce "recordsRead" metric instead of 
"generated rows". The tests were updated and partially moved to SQLMetricsSuite.

## How was this patch tested?

Unit tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ala/spark range-records-read

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16960.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16960


commit 10a53a783682a3d8966a8ba6bb255aadcb9dc87d
Author: Ala Luszczak 
Date:   2017-02-15T14:52:23Z

Using recordsRead instead of generated rows.







[GitHub] spark issue #16959: [SPARK-19631][CORE] OutputCommitCoordinator should not a...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16959
  
Can one of the admins verify this patch?





[GitHub] spark pull request #16959: [SPARK-19631][CORE] OutputCommitCoordinator shoul...

2017-02-16 Thread pwoody
GitHub user pwoody opened a pull request:

https://github.com/apache/spark/pull/16959

[SPARK-19631][CORE] OutputCommitCoordinator should not allow commits for 
already failed tasks

## What changes were proposed in this pull request?

Previously it was possible for there to be a race between a task failure 
and committing the output of a task. This ensures that any previously failed 
task attempts cannot enter the commit protocol.
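
The idea can be illustrated with a deliberately simplified model. This is not Spark's `OutputCommitCoordinator`: it tracks a single stage, has no RPC layer, and `CommitCoordinatorSketch` with its method names is hypothetical. It only shows the rule described above: an attempt that has already been recorded as failed is never authorized to commit.

```scala
import scala.collection.mutable

// Simplified single-stage model; an illustration, not the code in this PR.
final class CommitCoordinatorSketch {
  // partition -> attempt numbers already reported as failed
  private val failedAttempts = mutable.Map.empty[Int, mutable.Set[Int]]
  // partition -> attempt number currently authorized to commit
  private val authorized = mutable.Map.empty[Int, Int]

  /** Records a failure and releases the commit lock if the failed attempt held it. */
  def attemptFailed(partition: Int, attempt: Int): Unit = synchronized {
    failedAttempts.getOrElseUpdate(partition, mutable.Set.empty[Int]) += attempt
    if (authorized.get(partition).contains(attempt)) authorized -= partition
  }

  /** Authorizes the first attempt that asks, unless that attempt has already failed. */
  def canCommit(partition: Int, attempt: Int): Boolean = synchronized {
    if (failedAttempts.get(partition).exists(_.contains(attempt))) {
      false
    } else if (authorized.contains(partition)) {
      authorized(partition) == attempt
    } else {
      authorized(partition) = attempt
      true
    }
  }
}
```

Because both methods synchronize on the same object, a failure notification and a commit request for the same attempt cannot interleave in this model; whichever arrives first determines the outcome.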

## How was this patch tested?

Added a unit test


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pwoody/spark pw/recordFailuresForCommitter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16959.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16959


commit ce17e02fbebbac6a3e4c92e5a9ec8b2a59879f20
Author: Patrick Woody 
Date:   2017-02-16T15:03:35Z

Record failed attempts in the OutputCommitCoordinator







[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16956
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72999/
Test FAILed.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16956
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16956
  
**[Test build #72999 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72999/testReport)**
 for PR 16956 at commit 
[`accd3b9`](https://github.com/apache/spark/commit/accd3b9b58f7976f587af3b65a3df3d5aca104f0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16958
  
So nice when I got two LGTMs and then Jenkins disagreed.






[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-16 Thread budde
Github user budde commented on a diff in the pull request:

https://github.com/apache/spark/pull/16944#discussion_r101560890
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -161,23 +161,49 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
           bucketSpec,
           Some(partitionSchema))
 
+        val catalogTable = metastoreRelation.catalogTable
         val logicalRelation = cached.getOrElse {
           val sizeInBytes =
             metastoreRelation.stats(sparkSession.sessionState.conf).sizeInBytes.toLong
           val fileIndex = {
-            val index = new CatalogFileIndex(
-              sparkSession, metastoreRelation.catalogTable, sizeInBytes)
+            val index = new CatalogFileIndex(sparkSession, catalogTable, sizeInBytes)
             if (lazyPruningEnabled) {
               index
             } else {
               index.filterPartitions(Nil)  // materialize all the partitions in memory
             }
           }
           val partitionSchemaColumnNames = partitionSchema.map(_.name.toLowerCase).toSet
-          val dataSchema =
-            StructType(metastoreSchema
+          val filteredMetastoreSchema = StructType(metastoreSchema
               .filterNot(field => partitionSchemaColumnNames.contains(field.name.toLowerCase)))
 
+          val inferenceMode = sparkSession.sessionState.conf.schemaInferenceMode
+          val dataSchema = if (inferenceMode != "NEVER_INFER" &&
+              !catalogTable.schemaFromTableProps) {
+            val fileStatuses = fileIndex.listFiles(Nil).flatMap(_.files)
+            val inferred = defaultSource.inferSchema(sparkSession, options, fileStatuses)
+            val merged = if (fileType.equals("parquet")) {
+              inferred.map(ParquetFileFormat.mergeMetastoreParquetSchema(metastoreSchema, _))
+            } else {
+              inferred
--- End diff --

I took this from how the schema was inferred in HiveMetastoreCatalog prior 
to 2.1.0. Only ParquetFileFormat has a merge method.
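
To make the merge step concrete, here is a simplified model of reconciling a case-insensitive metastore schema with a schema inferred from data files. It is not `ParquetFileFormat.mergeMetastoreParquetSchema`; it assumes a schema can be treated as an ordered list of (field name, type name) pairs, and `SchemaMergeSketch` and `restoreCase` are hypothetical names.

```scala
// Simplified model: a schema is an ordered list of (fieldName, typeName) pairs.
object SchemaMergeSketch {
  type Schema = Seq[(String, String)]

  /**
   * Keeps the metastore's field order and types, but takes each field's name
   * (and therefore its case) from the schema inferred from the data files.
   * Fields missing from the inferred schema keep their metastore name.
   */
  def restoreCase(metastore: Schema, inferred: Schema): Schema = {
    // Lookup from lower-cased name to the case-preserving name found in the files.
    val inferredNames = inferred.map { case (name, _) => name.toLowerCase -> name }.toMap
    metastore.map { case (name, tpe) =>
      inferredNames.getOrElse(name.toLowerCase, name) -> tpe
    }
  }
}
```

In this model a file format without such a merge step would simply use the inferred schema as-is, which corresponds to the `else` branch in the diff.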





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16958
  
**[Test build #73002 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73002/testReport)**
 for PR 16958 at commit 
[`2cb7552`](https://github.com/apache/spark/commit/2cb75525a90a8400228f344a82dde8f3ffe4422f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16958
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16958
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73002/
Test FAILed.





[GitHub] spark pull request #16944: [SPARK-19611][SQL] Introduce configurable table s...

2017-02-16 Thread budde
Github user budde commented on a diff in the pull request:

https://github.com/apache/spark/pull/16944#discussion_r101562475
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -296,6 +296,17 @@ object SQLConf {
       .longConf
       .createWithDefault(250 * 1024 * 1024)
 
+  val HIVE_SCHEMA_INFERENCE_MODE = buildConf("spark.sql.hive.schemaInferenceMode")
+    .doc("Configures the action to take when a case-sensitive schema cannot be read from a Hive " +
+      "table's properties. Valid options include INFER_AND_SAVE (infer the case-sensitive " +
+      "schema from the underlying data files and write it back to the table properties), " +
+      "INFER_ONLY (infer the schema but don't attempt to write it to the table properties) and " +
+      "NEVER_INFER (fallback to using the case-insensitive metastore schema instead of inferring).")
+    .stringConf
+    .transform(_.toUpperCase())
+    .checkValues(Set("INFER_AND_SAVE", "INFER_ONLY", "NEVER_INFER"))
+    .createWithDefault("INFER_AND_SAVE")
--- End diff --

I'll update the code to catch and log any nonfatal exception when
performing the `alterTable()` to save the table schema when
`INFER_AND_SAVE` is enabled.
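
A minimal sketch of that catch-and-log pattern, using `scala.util.control.NonFatal` from the standard library. The `save` and `log` parameters stand in for the real `alterTable()` call and logger, and `SaveSchemaSketch` and `trySave` are hypothetical names; the actual change in the PR may look different.

```scala
import scala.util.control.NonFatal

object SaveSchemaSketch {
  /**
   * Runs `save` and reports whether it succeeded. Non-fatal exceptions are
   * passed to `log` and swallowed; fatal ones (such as OutOfMemoryError)
   * still propagate to the caller.
   */
  def trySave(save: () => Unit, log: String => Unit): Boolean =
    try {
      save()
      true
    } catch {
      case NonFatal(e) =>
        log(s"Unable to save case-sensitive schema: ${e.getMessage}")
        false
    }
}
```

The point of the design is that a failed write-back only costs the cached schema: the query can continue with the inferred schema in memory, and inference would simply run again next time.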





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread bogdanrdc
Github user bogdanrdc commented on the issue:

https://github.com/apache/spark/pull/16958
  
LGTM





[GitHub] spark issue #16826: [WIP][SPARK-19540][SQL] Add ability to clone SparkSessio...

2017-02-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16826
  
What's WIP about this?






[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-02-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16611
  
For SQL, rather than "array", can we follow Python, e.g.

```
CREATE TEMPORARY TABLE tableA USING csv
OPTIONS (nullValue ['NA', 'null'], ...)
```





[GitHub] spark pull request #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support f...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16611#discussion_r101553890
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -97,6 +99,15 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
   def option(key: String, value: Double): DataFrameReader = option(key, 
value.toString)
 
   /**
+   * Adds an input option for the underlying data source.
+   *
+   * @since 2.2.0
+   */
+  def option(key: String, value: Array[String]): DataFrameReader = {
--- End diff --

I'd also support Seq in scala.
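
A rough sketch of what an additional `Seq` overload could look like, using a hypothetical `ReaderSketch` class rather than Spark's `DataFrameReader`. How the values are ultimately stored or encoded is up to the PR and is not shown here:

```scala
// Hypothetical stand-in for a reader; not Spark's DataFrameReader.
final case class ReaderSketch(options: Map[String, Seq[String]] = Map.empty) {
  // Array overload, mirroring the signature in the diff above.
  def option(key: String, value: Array[String]): ReaderSketch =
    option(key, value.toSeq)

  // Seq overload, so Scala callers can pass a List, Vector, and so on.
  def option(key: String, value: Seq[String]): ReaderSketch =
    copy(options = options + (key -> value))
}
```

With both overloads, `ReaderSketch().option("nullValue", List("NA", "null"))` and the `Array` form would be accepted alike.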






[GitHub] spark issue #9608: [SPARK-11638] [Mesos + Docker Bridge networking]: Run Spa...

2017-02-16 Thread cherryii
Github user cherryii commented on the issue:

https://github.com/apache/spark/pull/9608
  
I am using Docker. My container is a Spark driver that only runs my job, and 
its entrypoint is a script that runs spark-submit. I use the client 
deploy-mode and set all the config values you mention above, but I get the 
error below with bridge networking. Spark starts and runs fine with host 
networking, but that creates other problems I'll have to deal with which are 
specific to my organization and deployment environment. Bridge networking 
really would be the best option for me if I could get it working:

java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)





[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-02-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16534
  
Change looks good to me but I didn't look super carefully.

@holdenk can you take a look at this?






[GitHub] spark issue #9608: [SPARK-11638] [Mesos + Docker Bridge networking]: Run Spa...

2017-02-16 Thread radekg
Github user radekg commented on the issue:

https://github.com/apache/spark/pull/9608
  
Are you deploying with Docker? If so, what problems are you facing when 
using host networking?





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/16958
  
LGTM





[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16476
  
**[Test build #72994 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72994/testReport)**
 for PR 16476 at commit 
[`0665a1d`](https://github.com/apache/spark/commit/0665a1d0d76bdb553010775eb1c53b8d461e792a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16958
  
**[Test build #73002 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73002/testReport)**
 for PR 16958 at commit 
[`2cb7552`](https://github.com/apache/spark/commit/2cb75525a90a8400228f344a82dde8f3ffe4422f).





[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16476
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72994/
Test PASSed.





[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16476
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

2017-02-16 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/16672#discussion_r101541205
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with 
SharedSQLContext with BeforeAndAfterEach {
   }
 }
   }
+
+  test("insert data to a data source table which has a not existed 
location should succeed") {
+withTable("t") {
+  withTempDir { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t(a string, b int)
+ |USING parquet
+ |OPTIONS(path "$dir")
+   """.stripMargin)
+val table = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("t"))
+val expectedPath = dir.getAbsolutePath.stripSuffix("/")
+assert(table.location.stripSuffix("/") == expectedPath)
+
+dir.delete
+val tableLocFile = new File(table.location.stripPrefix("file:"))
+assert(!tableLocFile.exists)
+spark.sql("INSERT INTO TABLE t SELECT 'c', 1")
+assert(tableLocFile.exists)
+checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
+
+Utils.deleteRecursively(dir)
+assert(!tableLocFile.exists)
+spark.sql("INSERT OVERWRITE TABLE t SELECT 'c', 1")
+assert(tableLocFile.exists)
+checkAnswer(spark.table("t"), Row("c", 1) :: Nil)
+
+val newDir = dir.getAbsolutePath.stripSuffix("/") + "/x"
--- End diff --

ok, I will fix this when I do another pr, thanks~





[GitHub] spark pull request #16672: [SPARK-19329][SQL]Reading from or writing to a da...

2017-02-16 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/16672#discussion_r101541033
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala 
---
@@ -1816,4 +1816,123 @@ class DDLSuite extends QueryTest with 
SharedSQLContext with BeforeAndAfterEach {
   }
 }
   }
+
+  test("insert data to a data source table which has a not existed 
location should succeed") {
+withTable("t") {
+  withTempDir { dir =>
+spark.sql(
+  s"""
+ |CREATE TABLE t(a string, b int)
+ |USING parquet
+ |OPTIONS(path "$dir")
--- End diff --

Currently, it will throw an exception saying that the path does not exist. 
Maybe we can check whether the path is a directory or a file: a directory is 
allowed to not exist yet, but a file must exist?





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #73003 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73003/testReport)**
 for PR 16857 at commit 
[`96807d2`](https://github.com/apache/spark/commit/96807d21ea687b4ce5f1a298b969e9117548a3a4).





[GitHub] spark issue #16927: [SPARK-19571][R] Fix SparkR test break on Windows via Ap...

2017-02-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16927
  
Hm.. @felixcheung, this seems to originate from a Hadoop issue.

```
java.io.IOException: (null) entry in command string: null chmod 0644 C:\Users\appveyor\AppData\Local\Temp\1\RtmpCuLueF\spark-mlpb486948329c.tmp\rMetadata\_temporary\0\_temporary\attempt_20170214140636_0069_m_00_70\part-0
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:859)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:842)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:661)
at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:501)
at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:482)
at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:498)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:467)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:433)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:801)
```

Probably it is https://issues.apache.org/jira/browse/HADOOP-10775. I will 
file an issue there after testing this with Hadoop 2.8+ when I am able to test.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73000/
Test FAILed.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #73000 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73000/testReport)**
 for PR 16857 at commit 
[`6fe61b0`](https://github.com/apache/spark/commit/6fe61b0faeb94af2438b66291361221acc985247).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved.

2017-02-16 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16958
  
cc @hvanhovell @bogdanrdc 





[GitHub] spark pull request #16958: [SPARK-13721][SQL] Make GeneratorOuter unresolved...

2017-02-16 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/16958

[SPARK-13721][SQL] Make GeneratorOuter unresolved.

## What changes were proposed in this pull request?
This is a small change to make GeneratorOuter always unresolved. It is 
mostly a no-op change, but it makes the intent clearer, since GeneratorOuter 
shouldn't survive the analysis phase.

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-13721

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16958.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16958


commit 2cb75525a90a8400228f344a82dde8f3ffe4422f
Author: Reynold Xin 
Date:   2017-02-16T15:02:45Z

[SPARK-13721][SQL] Make GeneratorOuter unresolved.







[GitHub] spark issue #16938: [SPARK-19583][SQL]CTAS for data source table with a crea...

2017-02-16 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/16938
  
@cloud-fan @gatorsmile @tejasapatil  let's discuss this together?





[GitHub] spark issue #16957: [SPARK-19550][HOTFIX][BUILD] Use JAVA_HOME/bin/java if J...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16957
  
**[Test build #73001 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73001/testReport)**
 for PR 16957 at commit 
[`17a61f1`](https://github.com/apache/spark/commit/17a61f1d47e44f0fc27b0354dddefcd9bc9adf57).





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #16957: [SPARK-19550][HOTFIX][BUILD] Use JAVA_HOME/bin/ja...

2017-02-16 Thread srowen
GitHub user srowen opened a pull request:

https://github.com/apache/spark/pull/16957

[SPARK-19550][HOTFIX][BUILD] Use JAVA_HOME/bin/java if JAVA_HOME is set in 
dev/mima 

## What changes were proposed in this pull request?

Use JAVA_HOME/bin/java if JAVA_HOME is set in the dev/mima script to run MiMa.
This follows on https://github.com/apache/spark/pull/16871. It's a 
slightly separate issue, but it is currently causing a build failure.

## How was this patch tested?

Manually tested.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srowen/spark SPARK-19550.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16957


commit 17a61f1d47e44f0fc27b0354dddefcd9bc9adf57
Author: Sean Owen 
Date:   2017-02-16T14:58:20Z

Use JAVA_HOME/bin/java if JAVA_HOME is set in dev/mima script to run MiMa







[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #73000 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73000/testReport)**
 for PR 16857 at commit 
[`6fe61b0`](https://github.com/apache/spark/commit/6fe61b0faeb94af2438b66291361221acc985247).





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16956
  
**[Test build #72999 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72999/testReport)**
 for PR 16956 at commit 
[`accd3b9`](https://github.com/apache/spark/commit/accd3b9b58f7976f587af3b65a3df3d5aca104f0).





[GitHub] spark issue #16871: [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16871
  
Yep, since `dev/mima` runs just `java`, it's likely picking up Java 7 from 
the Jenkins machines. The jobs all set `JAVA_HOME`, and set it to the Java 8 
home. Other Spark scripts here will use `$JAVA_HOME/bin/java` to run `java` 
commands if that's set, and otherwise just `java`. See `dev/check-license`. I 
think that's the fix here too. I'll push this shortly as a hot-fix if it works 
for me, though it's kind of a small separate issue.
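
The fallback rule described above can be sketched as follows. The real scripts are shell; this is Scala only to match the other code in these threads, and `JavaCommandSketch` and `javaCommand` are hypothetical names, not part of Spark:

```scala
object JavaCommandSketch {
  // Returns "<JAVA_HOME>/bin/java" when JAVA_HOME is set and non-empty,
  // otherwise plain "java", which is then resolved through PATH.
  def javaCommand(env: Map[String, String] = sys.env): String =
    env.get("JAVA_HOME").filter(_.nonEmpty) match {
      case Some(home) => s"$home/bin/java"
      case None       => "java"
    }
}
```

Passing the environment as a parameter keeps the rule easy to check without changing the process environment.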





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #72998 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72998/testReport)**
 for PR 16857 at commit 
[`a563325`](https://github.com/apache/spark/commit/a56332518050b29d049d5bedead5e8b1d76293fe).
 * This patch **fails RAT tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #72998 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72998/testReport)**
 for PR 16857 at commit 
[`a563325`](https://github.com/apache/spark/commit/a56332518050b29d049d5bedead5e8b1d76293fe).





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72998/
Test FAILed.





[GitHub] spark pull request #16956: [SPARK-19598][SQL]Remove the alias parameter in U...

2017-02-16 Thread windpiger
Github user windpiger commented on a diff in the pull request:

https://github.com/apache/spark/pull/16956#discussion_r101535507
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -54,10 +54,6 @@ object ResolveHints {
 
   val newNode = CurrentOrigin.withOrigin(plan.origin) {
 plan match {
-  case r: UnresolvedRelation =>
--- End diff --

Oh, yes, removing the case will miss the hint on a table without an alias.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16857
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72997/
Test FAILed.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #72997 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72997/testReport)**
 for PR 16857 at commit 
[`a728282`](https://github.com/apache/spark/commit/a72828222d4ba0d07c6f7ebd4b5f0722bef34a9a).
 * This patch **fails RAT tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16857: [SPARK-19517][SS] KafkaSource fails to initialize partit...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16857
  
**[Test build #72997 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72997/testReport)**
 for PR 16857 at commit 
[`a728282`](https://github.com/apache/spark/commit/a72828222d4ba0d07c6f7ebd4b5f0722bef34a9a).





[GitHub] spark pull request #16722: [SPARK-19591][ML][MLlib] Add sample weights to de...

2017-02-16 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request:

https://github.com/apache/spark/pull/16722#discussion_r101532020
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/tree/impl/BaggedPoint.scala ---
@@ -60,12 +68,14 @@ private[spark] object BaggedPoint {
   subsamplingRate: Double,
   numSubsamples: Int,
   withReplacement: Boolean,
+  extractSampleWeight: (Datum => Double) = (_: Datum) => 1.0,
   seed: Long = Utils.random.nextLong()): RDD[BaggedPoint[Datum]] = {
+// TODO: implement weighted bootstrapping
--- End diff --

sure, sounds good





[GitHub] spark issue #16871: [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16871
  
This caused the SBT build to fail because MiMa doesn't like Java 8 
bytecode, but this might be a simple config issue. Investigating ...


https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/2466/console





[GitHub] spark pull request #16722: [SPARK-19591][ML][MLlib] Add sample weights to de...

2017-02-16 Thread imatiach-msft
Github user imatiach-msft commented on a diff in the pull request:

https://github.com/apache/spark/pull/16722#discussion_r101531848
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/ml/classification/DecisionTreeClassifierSuite.scala
 ---
@@ -351,6 +370,36 @@ class DecisionTreeClassifierSuite
     dt.fit(df)
   }
 
+  test("training with sample weights") {
+    val df = linearMulticlassDataset
+    val numClasses = 3
+    val predEquals = (x: Double, y: Double) => x == y
+    // (impurity, maxDepth)
+    val testParams = Seq(
+      ("gini", 10),
+      ("entropy", 10),
+      ("gini", 5)
+    )
+    for ((impurity, maxDepth) <- testParams) {
+      val estimator = new DecisionTreeClassifier()
+        .setMaxDepth(maxDepth)
+        .setSeed(seed)
+        .setMinWeightFractionPerNode(0.049)
--- End diff --

nope, that's not exactly what I was looking for - I wanted to see tests 
that validate that setting the params for this specific estimator gives us 
the desired errors.  From my point of view even subtle functionality like 
this should be validated.  But I guess Spark usually doesn't have such tests.  
There are actually a lot of tests that would be desirable in Spark ML 
estimators/transforms, e.g. fuzzing tests with randomly generated data and 
parameters, regularly scheduled performance tests, and tests for various metrics 
(accuracy, precision, etc.) on a variety of datasets (e.g. from the UCI repository) 
with different characteristics.  I'm sure there are a lot of bugs that could be 
found this way which users may be running into but not reporting.
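
To make the kind of test described above concrete, here is a minimal sketch. Everything in it is a stand-in: `setMinWeightFractionPerNode` below is a hypothetical plain function, not the estimator's setter, and the `[0.0, 0.5)` range is an assumption for illustration. It shows one check that a bad value produces the expected error, plus a tiny seeded fuzz loop.

```scala
import scala.util.{Random, Try}

object ParamCheckSketch {
  // Hypothetical setter: the [0.0, 0.5) range is assumed here for illustration only.
  def setMinWeightFractionPerNode(value: Double): Double = {
    require(value >= 0.0 && value < 0.5,
      s"minWeightFractionPerNode must be in [0.0, 0.5), got $value")
    value
  }

  // True when the setter rejects the value with an IllegalArgumentException.
  def rejects(value: Double): Boolean =
    Try(setMinWeightFractionPerNode(value)).failed.toOption
      .exists(_.isInstanceOf[IllegalArgumentException])

  // Tiny fuzz loop: seeded random values in [-1.0, 1.0) must be accepted or
  // rejected exactly according to the assumed range.
  def fuzzAgrees(seed: Long, samples: Int): Boolean = {
    val rng = new Random(seed)
    Seq.fill(samples)(rng.nextDouble() * 2 - 1).forall { v =>
      rejects(v) == !(v >= 0.0 && v < 0.5)
    }
  }
}
```

In a real suite the same idea would presumably be written against the estimator itself, with the fixed seed kept so that any failure is reproducible.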





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16956
  
**[Test build #72996 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72996/testReport)**
 for PR 16956 at commit 
[`3c4be76`](https://github.com/apache/spark/commit/3c4be7695b0d02d32f3eae178af96992372306e2).





[GitHub] spark issue #16955: [SPARK-19626][YARN]Using the correct config to set crede...

2017-02-16 Thread yaooqinn
Github user yaooqinn commented on the issue:

https://github.com/apache/spark/pull/16955
  
@srowen @jerryshao  Thanks for your comments. I have added some 
descriptions both in the JIRA and here; please check whether they are OK.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16956
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72996/
Test FAILed.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16956
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16956
  
**[Test build #72996 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72996/testReport)**
 for PR 16956 at commit 
[`3c4be76`](https://github.com/apache/spark/commit/3c4be7695b0d02d32f3eae178af96992372306e2).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16956
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16956
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72995/
Test FAILed.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16956
  
**[Test build #72995 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72995/testReport)**
 for PR 16956 at commit 
[`4023727`](https://github.com/apache/spark/commit/4023727176b9ea548143d3e03b8b2a524436314c).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16956: [SPARK-19598][SQL]Remove the alias parameter in Unresolv...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16956
  
**[Test build #72995 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72995/testReport)**
 for PR 16956 at commit 
[`4023727`](https://github.com/apache/spark/commit/4023727176b9ea548143d3e03b8b2a524436314c).





[GitHub] spark pull request #16956: [SPARK-19598][SQL]Remove the alias parameter in U...

2017-02-16 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16956#discussion_r101530187
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala
 ---
@@ -54,10 +54,6 @@ object ResolveHints {
 
   val newNode = CurrentOrigin.withOrigin(plan.origin) {
 plan match {
-  case r: UnresolvedRelation =>
--- End diff --

actually you can't remove this case entirely. you still need to match on 
UnresolvedRelation, but just use `r.tableIdentifier.table`
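
A rough sketch of what this could look like, using toy stand-ins rather than the real Catalyst classes (the case classes and `applyHint` below are hypothetical, and only the top-level node is handled): the relation case stays, but the name it checks comes from `r.tableIdentifier.table` instead of an alias field.

```scala
object HintSketch {
  // Toy stand-ins for the Catalyst classes; not the real Spark definitions.
  case class TableIdentifier(table: String, database: Option[String] = None)

  sealed trait Plan
  case class UnresolvedRelation(tableIdentifier: TableIdentifier) extends Plan
  case class SubqueryAlias(alias: String, child: Plan) extends Plan
  case class BroadcastHint(child: Plan) extends Plan

  // Wrap the node in a hint when its visible name is one of the hinted names.
  def applyHint(plan: Plan, toBroadcast: Set[String]): Plan = plan match {
    // Table without an alias: match on the table name itself.
    case r: UnresolvedRelation if toBroadcast.contains(r.tableIdentifier.table) =>
      BroadcastHint(r)
    // Aliased table or subquery: match on the alias.
    case a: SubqueryAlias if toBroadcast.contains(a.alias) =>
      BroadcastHint(a)
    case other => other
  }
}
```

Without the first case, a hint naming a table that has no alias would fall through to `other` and be silently dropped.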





[GitHub] spark pull request #16956: [SPARK-19598][SQL]Remove the alias parameter in U...

2017-02-16 Thread windpiger
GitHub user windpiger opened a pull request:

https://github.com/apache/spark/pull/16956

[SPARK-19598][SQL]Remove the alias parameter in UnresolvedRelation

## What changes were proposed in this pull request?

Remove the alias parameter in `UnresolvedRelation`, and use `SubqueryAlias` 
to replace it.
This can simplify some `match case` situations.

For example, the broadcast hint pull request can have one fewer case 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveHints.scala#L57-L61
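
As a rough illustration of the proposed shape (toy case classes only; `OldUnresolvedRelation` and `migrate` are hypothetical names, not part of this patch), the alias stops being a field on the relation and becomes a wrapping `SubqueryAlias` node:

```scala
object AliasSketch {
  sealed trait Plan
  // Toy version of the shape before the change: the alias rides along on the relation.
  case class OldUnresolvedRelation(table: String, alias: Option[String]) extends Plan
  // Toy version of the shape after the change: the relation carries only the table name.
  case class UnresolvedRelation(table: String) extends Plan
  case class SubqueryAlias(alias: String, child: Plan) extends Plan

  // Rewrite the old shape into the new one: wrap in SubqueryAlias only when an alias exists.
  def migrate(old: OldUnresolvedRelation): Plan = {
    val relation = UnresolvedRelation(old.table)
    old.alias.map(a => SubqueryAlias(a, relation)).getOrElse(relation)
  }
}
```

Code that matches on plans then only needs to look for the alias in one place, the `SubqueryAlias` node.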

## How was this patch tested?
add some unit tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/windpiger/spark removeUnresolveTableAlias

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16956.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16956


commit 94084bdca9870b3c7a2ea947dd6bfaa011de3c36
Author: windpiger 
Date:   2017-02-16T14:11:45Z

[SPARK-19598][SQL]remove the alias parameter in UnresolvedRelation

commit 4023727176b9ea548143d3e03b8b2a524436314c
Author: windpiger 
Date:   2017-02-16T14:20:30Z

fix a bug







[GitHub] spark issue #9608: [SPARK-11638] [Mesos + Docker Bridge networking]: Run Spa...

2017-02-16 Thread cherryii
Github user cherryii commented on the issue:

https://github.com/apache/spark/pull/9608
  
The reason I'm asking is because bridge networking would solve a lot of 
other problems. So does that mean bridge networking on mesos and docker isn't 
supported right now?





[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16928
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72993/
Test PASSed.





[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16928
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16928: [SPARK-18699][SQL] Put malformed tokens into a new field...

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16928
  
**[Test build #72993 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72993/testReport)**
 for PR 16928 at commit 
[`df39e39`](https://github.com/apache/spark/commit/df39e3934b2f3948847c0e9177155f940be949b5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/16476#discussion_r101516174
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
 ---
@@ -340,3 +343,104 @@ object CaseKeyWhen {
 CaseWhen(cases, elseValue)
   }
 }
+
+/**
+ * A function that returns the index of expr in (expr1, expr2, ...) list or 0 if not found.
+ * It takes at least 2 parameters, and all parameters should be subtype of AtomicType or NullType.
+ * It's also acceptable to give parameters of different types. When the parameters have different
+ * types, comparing will be done based on type firstly. For example, ''999'' 's type is StringType,
+ * while 999's type is IntegerType, so that no further comparison need to be done since they have
+ * different types.
+ * If the search expression is NULL, the return value is 0 because NULL fails equality comparison
+ * with any value.
+ * To also point out, no implicit cast will be done in this expression.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(expr, expr1, expr2, ...) - Returns the index of expr in the expr1, expr2, ... or 0 if not found.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(10, 9, 3, 10, 4);
+       3
+      > SELECT _FUNC_('a', 'b', 'c', 'd', 'a');
+       4
+      > SELECT _FUNC_('999', 'a', 999, 9.99, '999');
+       4
+  """)
+// scalastyle:on line.size.limit
+case class Field(children: Seq[Expression]) extends Expression {
+
+  /** Even if expr is not found in (expr1, expr2, ...) list, the value will be 0, not null */
+  override def nullable: Boolean = false
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  private lazy val ordering = TypeUtils.getInterpretedOrdering(children(0).dataType)
+
+  private val dataTypeMatchIndex: Array[Int] = children.zipWithIndex.tail.filter(
+    _._1.dataType.sameType(children.head.dataType)).map(_._2).toArray
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.length <= 1) {
+      TypeCheckResult.TypeCheckFailure(s"FIELD requires at least 2 arguments")
+    } else if (!children.forall(
+        e => e.dataType.isInstanceOf[AtomicType] || e.dataType.isInstanceOf[NullType])) {
+      TypeCheckResult.TypeCheckFailure(s"FIELD requires all arguments to be of AtomicType")
+    } else {
+      TypeCheckResult.TypeCheckSuccess
+    }
+  }
+
+  override def dataType: DataType = IntegerType
+  override def eval(input: InternalRow): Any = {
+    val target = children.head.eval(input)
+    @tailrec def findEqual(index: Int): Int = {
+      if (index == dataTypeMatchIndex.length) {
+        0
+      } else {
+        val value = children(dataTypeMatchIndex(index)).eval(input)
+        if (value != null && ordering.equiv(target, value)) {
+          dataTypeMatchIndex(index)
+        } else {
+          findEqual(index + 1)
+        }
+      }
+    }
+    if (target == null) 0 else findEqual(index = 0)
+  }
+
+  protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val evalChildren = children.map(_.genCode(ctx))
--- End diff --

I think you already call `genCode` on every child here, and only filter out 
the type-mismatched `ExprCode` at line 441.
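
One way to read this, as a sketch only: filter the children by type first and generate code just for the survivors. The `Expr` class, the `genCode` function, and the `generated` log below are toy stand-ins, not the Catalyst API.

```scala
import scala.collection.mutable.ArrayBuffer

object GenCodeSketch {
  // Toy stand-in for a Catalyst expression.
  case class Expr(name: String, dataType: String)

  // Records the name of every expression that code was generated for.
  val generated = ArrayBuffer.empty[String]

  // Toy stand-in for code generation: logs the call and returns a placeholder snippet.
  def genCode(e: Expr): String = {
    generated += e.name
    s"/* code for ${e.name} */"
  }

  // Generate code only for candidates whose type matches the target's type.
  def genMatching(children: Seq[Expr]): Seq[String] = {
    val target = children.head
    children.tail.filter(_.dataType == target.dataType).map(genCode)
  }
}
```

Calling `genMatching` on a string target followed by an int and a string candidate would log only the string candidate, whereas mapping `genCode` over all children before filtering generates code for every one of them.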





[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/16476#discussion_r101515786
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
 ---
@@ -340,3 +341,91 @@ object CaseKeyWhen {
 CaseWhen(cases, elseValue)
   }
 }
+
+/**
+ * A function that returns the index of str in (str1, str2, ...) list or 0 if not found.
+ * It takes at least 2 parameters, and all parameters' types should be subtypes of AtomicType.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(str, str1, str2, ...) - Returns the index of str in the str1,str2,... or 0 if not found.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(10, 9, 3, 10, 4);
+       3
+  """)
+case class Field(children: Seq[Expression]) extends Expression {
+
+  override def nullable: Boolean = false
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  private lazy val ordering = TypeUtils.getInterpretedOrdering(children(0).dataType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.length <= 1) {
+      TypeCheckResult.TypeCheckFailure(s"FIELD requires at least 2 arguments")
+    } else if (!children.forall(_.dataType.isInstanceOf[AtomicType])) {
+      TypeCheckResult.TypeCheckFailure(s"FIELD requires all arguments to be of AtomicType")
+    } else
+      TypeCheckResult.TypeCheckSuccess
+  }
+
+  override def dataType: DataType = IntegerType
+
+  override def eval(input: InternalRow): Any = {
+    val target = children.head.eval(input)
+    val targetDataType = children.head.dataType
+    def findEqual(target: Any, params: Seq[Expression], index: Int): Int = {
+      params.toList match {
+        case Nil => 0
+        case head::tail if targetDataType == head.dataType
+          && head.eval(input) != null && ordering.equiv(target, head.eval(input)) => index
+        case _ => findEqual(target, params.tail, index + 1)
+      }
+    }
+    if(target == null)
+      0
+    else
+      findEqual(target, children.tail, 1)
+  }
+
+  protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val evalChildren = children.map(_.genCode(ctx))
+    val target = evalChildren(0)
+    val targetDataType = children(0).dataType
+    val rest = evalChildren.drop(1)
+    val restDataType = children.drop(1).map(_.dataType)
+
+    def updateEval(evalWithIndex: ((ExprCode, DataType), Int)): String = {
+      val ((eval, dataType), index) = evalWithIndex
+      s"""
+        ${eval.code}
+        if (${dataType.equals(targetDataType)}
+          && ${ctx.genEqual(targetDataType, eval.value, target.value)}) {
+          ${ev.value} = ${index};
+        }
+      """
+    }
+
+    def genIfElseStructure(code1: String, code2: String): String = {
--- End diff --

Oh, yeah, right, if we use `foldLeft`, there is still a floating `else`. We 
can only remove it by using `foldRight`.
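
For illustration, a minimal sketch of the `foldRight` shape (the `chain` helper and its string-pair branches are hypothetical, not the code in this PR). Folding from the right means each branch receives the already-built remainder of the chain as its `else` part, so the last branch ends in the fallback and nothing is left hanging.

```scala
object IfElseChain {
  // Each branch is a (condition, body) pair of generated source snippets.
  // foldRight starts from the fallback and wraps it one branch at a time,
  // so every `else` already has its continuation when it is emitted.
  def chain(branches: Seq[(String, String)], fallback: String): String =
    branches.foldRight(fallback) { case ((cond, body), rest) =>
      s"if ($cond) { $body } else { $rest }"
    }

  def main(args: Array[String]): Unit = {
    val code = chain(Seq("a == t" -> "r = 1;", "b == t" -> "r = 2;"), "r = 0;")
    println(code)
    // prints: if (a == t) { r = 1; } else { if (b == t) { r = 2; } else { r = 0; } }
  }
}
```

With `foldLeft` the accumulator is the chain built so far, and it has no slot for the branch that comes next, which is where the floating `else` comes from.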





[GitHub] spark issue #16936: [SPARK-19605][DStream] Fail it if existing resource is n...

2017-02-16 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/16936
  
Let us call @zsxwing for some suggestions.





[GitHub] spark issue #16949: [SPARK-16122][CORE] Add rest api for job environment

2017-02-16 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/16949
  
@srowen good question! IMHO, we should add this API:

- it provides a complete API, the same as what users see in the web UI
- if this is a security issue, we should address it in other ways
- maybe the existing API also has the security issue you mentioned. Maybe we 
need some authorization check or something else to address it.

Any suggestions are appreciated!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16955: [SPARK-19626]update cred using spark.yarn.credentials.up...

2017-02-16 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/16955
  
Agreed with @srowen, please describe the problem and the fix both here in the 
PR and in JIRA, and change the title to be more meaningful. That would make it 
easier for others without the context to understand the issue quickly.





[GitHub] spark pull request #16926: [MINOR][BUILD] Fix javadoc8 break

2017-02-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16926





[GitHub] spark issue #16926: [MINOR][BUILD] Fix javadoc8 break

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16926
  
Merged to master





[GitHub] spark pull request #16871: [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 sup...

2017-02-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16871





[GitHub] spark issue #16871: [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16871
  
Merged to master





[GitHub] spark issue #16949: [SPARK-16122][CORE] Add rest api for job environment

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16949
  
It's a simple change, but I wonder if this is that important to add?
I always have a worry in the back of my mind that this becomes a security 
hole, as it's a way to look through the complete environment of a bunch of jobs.





[GitHub] spark issue #16955: [SPARK-19626]update cred using spark.yarn.credentials.up...

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16955
  
There's no detail in the JIRA or PR. I can't tell whether you are solving 
the problem you set out to solve. Are you saying this is the entire problem, 
that the wrong config is being queried?





[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16476
  
**[Test build #72994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72994/testReport)** for PR 16476 at commit [`0665a1d`](https://github.com/apache/spark/commit/0665a1d0d76bdb553010775eb1c53b8d461e792a).





[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field

2017-02-16 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request:

https://github.com/apache/spark/pull/16476#discussion_r101505199
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala ---
@@ -340,3 +341,91 @@ object CaseKeyWhen {
     CaseWhen(cases, elseValue)
   }
 }
+
+/**
+ * A function that returns the index of str in (str1, str2, ...) list or 0 if not found.
+ * It takes at least 2 parameters, and all parameters' types should be subtypes of AtomicType.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(str, str1, str2, ...) - Returns the index of str in the str1,str2,... or 0 if not found.",
+  extended = """
+    Examples:
+      > SELECT _FUNC_(10, 9, 3, 10, 4);
+       3
+  """)
+case class Field(children: Seq[Expression]) extends Expression {
+
+  override def nullable: Boolean = false
+  override def foldable: Boolean = children.forall(_.foldable)
+
+  private lazy val ordering = TypeUtils.getInterpretedOrdering(children(0).dataType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.length <= 1) {
+      TypeCheckResult.TypeCheckFailure(s"FIELD requires at least 2 arguments")
+    } else if (!children.forall(_.dataType.isInstanceOf[AtomicType])) {
+      TypeCheckResult.TypeCheckFailure(s"FIELD requires all arguments to be of AtomicType")
+    } else
+      TypeCheckResult.TypeCheckSuccess
+  }
+
+  override def dataType: DataType = IntegerType
+
+  override def eval(input: InternalRow): Any = {
+    val target = children.head.eval(input)
+    val targetDataType = children.head.dataType
+    def findEqual(target: Any, params: Seq[Expression], index: Int): Int = {
+      params.toList match {
+        case Nil => 0
+        case head::tail if targetDataType == head.dataType
+          && head.eval(input) != null && ordering.equiv(target, head.eval(input)) => index
+        case _ => findEqual(target, params.tail, index + 1)
+      }
+    }
+    if(target == null)
+      0
+    else
+      findEqual(target, children.tail, 1)
+  }
+
+  protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val evalChildren = children.map(_.genCode(ctx))
+    val target = evalChildren(0)
+    val targetDataType = children(0).dataType
+    val rest = evalChildren.drop(1)
+    val restDataType = children.drop(1).map(_.dataType)
+
+    def updateEval(evalWithIndex: ((ExprCode, DataType), Int)): String = {
+      val ((eval, dataType), index) = evalWithIndex
+      s"""
+        ${eval.code}
+        if (${dataType.equals(targetDataType)}
+          && ${ctx.genEqual(targetDataType, eval.value, target.value)}) {
+          ${ev.value} = ${index};
+        }
+      """
+    }
+
+    def genIfElseStructure(code1: String, code2: String): String = {
--- End diff --

Sorry, I don't understand how to use the `foldLeft` approach. I think we can 
only use `foldRight` or `reduceRight`, because the code for later children 
has to be nested inside the code for earlier ones.
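
To make the nesting argument concrete, here is a minimal standalone sketch. It is not the code from this PR: `genIfElse`, `nestRight`, `nestLeft`, and the `result` variable are hypothetical names, and the conditions are plain strings rather than generated `ExprCode`. It shows `foldRight` building the nested if/else directly, and, as one possible reading of the `foldLeft` suggestion, a left fold that accumulates a wrapping function instead of a string and yields the same text.

```scala
// Illustrative only: all names here are assumptions, not Spark APIs.
object NestedIfElseSketch {
  // Emits one check whose else-branch holds the code for the remaining candidates.
  def genIfElse(cond: String, assign: String, inner: String): String =
    s"if ($cond) { $assign } else { $inner }"

  // foldRight starts from the innermost default and wraps outward,
  // so the last candidate ends up nested deepest.
  def nestRight(conds: Seq[String]): String =
    conds.zipWithIndex.foldRight("result = 0;") {
      case ((cond, i), inner) => genIfElse(cond, s"result = ${i + 1};", inner)
    }

  // foldLeft can produce the same nesting if the accumulator is a function
  // "given the inner code, return the whole structure" rather than a string.
  def nestLeft(conds: Seq[String]): String = {
    val wrap = conds.zipWithIndex.foldLeft((s: String) => s) {
      case (outer, (cond, i)) =>
        (inner: String) => outer(genIfElse(cond, s"result = ${i + 1};", inner))
    }
    wrap("result = 0;")
  }

  def main(args: Array[String]): Unit = {
    val conds = Seq("a == t", "b == t")
    println(nestRight(conds))
    println(nestLeft(conds))
  }
}
```

Both functions return the same string for the same input. The `foldRight` version is shorter, while the function-accumulating `foldLeft` version trades that for an extra closure per child.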





[GitHub] spark pull request #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark Histor...

2017-02-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/16924#discussion_r101502214
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -137,7 +138,13 @@ private[spark] class EventLoggingListener(
 // scalastyle:on println
 if (flushLogger) {
   writer.foreach(_.flush())
-  hadoopDataStream.foreach(_.hflush())
+  hadoopDataStream.foreach(ds => {
--- End diff --

OK, so it isn't better to just call `hsync` in all cases? You have to 
special-case this?





[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-02-16 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/16954
  
cc @hvanhovell @gatorsmile @nsyca





[GitHub] spark issue #16936: [SPARK-19605][DStream] Fail it if existing resource is n...

2017-02-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16936
  
Hm, it just seems like the wrong approach to externally estimate whether, in 
theory, it won't schedule. It is certainly a problem if streaming doesn't work, 
though users would already realize it. The error check or message could be more 
explicit, but it seems like something the streaming machinery should know and 
warn about?





[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72992/
Test PASSed.





[GitHub] spark issue #16954: [SPARK-18874][SQL] First phase: Deferring the correlated...

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16954
  
Merged build finished. Test PASSed.




