svn commit: r28957 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_25_12_01-c17a8ff-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-25 Thread pwendell
Author: pwendell
Date: Sat Aug 25 19:16:22 2018
New Revision: 28957

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_25_12_01-c17a8ff docs


[This commit notification would consist of 1478 parts, 
which exceeds the limit of 50, so it was shortened to this summary.]

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



spark git commit: [SPARK-25214][SS][FOLLOWUP] Fix the issue that Kafka v2 source may return duplicated records when `failOnDataLoss=false`

2018-08-25 Thread zsxwing
Repository: spark
Updated Branches:
  refs/heads/master 6c66ab8b3 -> c17a8ff52


[SPARK-25214][SS][FOLLOWUP] Fix the issue that Kafka v2 source may return 
duplicated records when `failOnDataLoss=false`

## What changes were proposed in this pull request?

This is a follow-up PR for #22207 to fix a potentially flaky test. 
`processAllAvailable` doesn't work for continuous processing, so we should not 
use it for a continuous query.
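The fix replaces the blocking call with a poll-until-condition loop. The snippet below is a plain-Python sketch of that `eventually`-style pattern, not Spark's API (the names `eventually`, `poll`, and the toy `sink` are illustrative): instead of calling `processAllAvailable()`, which is undefined for continuous processing, the test polls the sink until the last record shows up or a timeout expires.

```python
import time


def eventually(condition, timeout=10.0, interval=0.01):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    A minimal stand-in for the `eventually(timeout(...))` helper used in the
    fix: retry a check instead of blocking on `processAllAvailable()`.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(interval)


# Toy usage: each poll simulates the continuous query writing one more row;
# we wait until the last expected record ("49") appears in the sink.
sink = []


def poll():
    sink.append(str(len(sink)))
    return "49" in sink


eventually(poll, timeout=5.0)
assert "49" in sink
```

The design point mirrored here is that the waiting logic lives in the test, as a retried assertion, rather than relying on the engine to report "all available data processed".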

## How was this patch tested?

Jenkins.

Closes #22230 from zsxwing/SPARK-25214-2.

Authored-by: Shixiong Zhu 
Signed-off-by: Shixiong Zhu 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c17a8ff5
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c17a8ff5
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c17a8ff5

Branch: refs/heads/master
Commit: c17a8ff52377871ab4ff96b648ebaf4112f0b5be
Parents: 6c66ab8
Author: Shixiong Zhu 
Authored: Sat Aug 25 09:17:40 2018 -0700
Committer: Shixiong Zhu 
Committed: Sat Aug 25 09:17:40 2018 -0700

--
 .../spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala| 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c17a8ff5/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala
--
diff --git a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala
index 0ff341c..39c4e3f 100644
--- a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala
+++ b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDontFailOnDataLossSuite.scala
@@ -80,7 +80,7 @@ trait KafkaMissingOffsetsTest extends SharedSQLContext {
   }
 }
 
-class KafkaDontFailOnDataLossSuite extends KafkaMissingOffsetsTest {
+class KafkaDontFailOnDataLossSuite extends StreamTest with KafkaMissingOffsetsTest {
 
   import testImplicits._
 
@@ -165,7 +165,11 @@ class KafkaDontFailOnDataLossSuite extends KafkaMissingOffsetsTest {
 .trigger(Trigger.Continuous(100))
 .start()
   try {
-query.processAllAvailable()
+// `processAllAvailable` doesn't work for continuous processing, so just wait until the last
+// record appears in the table.
+eventually(timeout(streamingTimeout)) {
+  assert(spark.table(table).as[String].collect().contains("49"))
+}
   } finally {
 query.stop()
   }





svn commit: r28956 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_25_08_02-6c66ab8-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-25 Thread pwendell
Author: pwendell
Date: Sat Aug 25 15:19:27 2018
New Revision: 28956

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_25_08_02-6c66ab8 docs


[This commit notification would consist of 1478 parts, 
which exceeds the limit of 50, so it was shortened to this summary.]




spark git commit: [SPARK-24688][EXAMPLES] Modify the comments about LabeledPoint

2018-08-25 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master 3e4f1666a -> 6c66ab8b3


[SPARK-24688][EXAMPLES] Modify the comments about LabeledPoint

## What changes were proposed in this pull request?

An RDD is created using LabeledPoint, but the comment reads 
`# LabeledPoint(feature, label)`.
Although in the method ChiSquareTest.test the second parameter is the feature 
and the third parameter is the label, it is better to write the label before 
the feature here, because when an RDD is created from LabeledPoint, what we 
actually get are (label, feature) pairs.
The comment is now changed to `LabeledPoint(label, feature)`.

The comments in the Scala and Java examples have the same typo.
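The argument order at issue can be shown with a tiny stand-in. This is not pyspark itself; the namedtuple below merely mirrors the constructor order of MLlib's `LabeledPoint(label, features)`, which is why the pairs are (label, feature) and not the other way around:

```python
from collections import namedtuple

# Stand-in that mirrors MLlib's LabeledPoint argument order: the label
# comes first, then the feature vector -- hence "(label, feature) pairs".
LabeledPoint = namedtuple("LabeledPoint", ["label", "features"])

obs = [
    LabeledPoint(0.0, [0.5, 10.0, 0.5]),
    LabeledPoint(1.0, [1.5, 30.0, 0.0]),
]

assert obs[0].label == 0.0                 # first positional argument is the label
assert obs[1].features == [1.5, 30.0, 0.0]  # second is the feature vector
```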

## How was this patch tested?

tested

https://issues.apache.org/jira/browse/SPARK-24688

Author: Weizhe Huang 492816239qq.com

Please review http://spark.apache.org/contributing.html before opening a pull 
request.

Closes #21665 from uzmijnlm/my_change.

Authored-by: Huangweizhe 
Signed-off-by: Sean Owen 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6c66ab8b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6c66ab8b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6c66ab8b

Branch: refs/heads/master
Commit: 6c66ab8b334c5358bc77995650f1886e4c43231d
Parents: 3e4f166
Author: Huangweizhe 
Authored: Sat Aug 25 09:24:20 2018 -0500
Committer: Sean Owen 
Committed: Sat Aug 25 09:24:20 2018 -0500

--
 .../spark/examples/mllib/JavaHypothesisTestingExample.java   | 2 +-
 examples/src/main/python/mllib/hypothesis_testing_example.py | 2 +-
 .../apache/spark/examples/mllib/HypothesisTestingExample.scala   | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6c66ab8b/examples/src/main/java/org/apache/spark/examples/mllib/JavaHypothesisTestingExample.java
--
diff --git a/examples/src/main/java/org/apache/spark/examples/mllib/JavaHypothesisTestingExample.java b/examples/src/main/java/org/apache/spark/examples/mllib/JavaHypothesisTestingExample.java
index b48b95f..2732736 100644
--- a/examples/src/main/java/org/apache/spark/examples/mllib/JavaHypothesisTestingExample.java
+++ b/examples/src/main/java/org/apache/spark/examples/mllib/JavaHypothesisTestingExample.java
@@ -67,7 +67,7 @@ public class JavaHypothesisTestingExample {
   )
 );
 
-// The contingency table is constructed from the raw (feature, label) pairs and used to conduct
+// The contingency table is constructed from the raw (label, feature) pairs and used to conduct
 // the independence test. Returns an array containing the ChiSquaredTestResult for every feature
 // against the label.
 ChiSqTestResult[] featureTestResults = Statistics.chiSqTest(obs.rdd());

http://git-wip-us.apache.org/repos/asf/spark/blob/6c66ab8b/examples/src/main/python/mllib/hypothesis_testing_example.py
--
diff --git a/examples/src/main/python/mllib/hypothesis_testing_example.py b/examples/src/main/python/mllib/hypothesis_testing_example.py
index e566ead..21a5584 100644
--- a/examples/src/main/python/mllib/hypothesis_testing_example.py
+++ b/examples/src/main/python/mllib/hypothesis_testing_example.py
@@ -51,7 +51,7 @@ if __name__ == "__main__":
 [LabeledPoint(1.0, [1.0, 0.0, 3.0]),
  LabeledPoint(1.0, [1.0, 2.0, 0.0]),
  LabeledPoint(1.0, [-1.0, 0.0, -0.5])]
-)  # LabeledPoint(feature, label)
+)  # LabeledPoint(label, feature)
 
 # The contingency table is constructed from an RDD of LabeledPoint and used to conduct
 # the independence test. Returns an array containing the ChiSquaredTestResult for every feature

http://git-wip-us.apache.org/repos/asf/spark/blob/6c66ab8b/examples/src/main/scala/org/apache/spark/examples/mllib/HypothesisTestingExample.scala
--
diff --git a/examples/src/main/scala/org/apache/spark/examples/mllib/HypothesisTestingExample.scala b/examples/src/main/scala/org/apache/spark/examples/mllib/HypothesisTestingExample.scala
index add1719..9b3c326 100644
--- a/examples/src/main/scala/org/apache/spark/examples/mllib/HypothesisTestingExample.scala
+++ b/examples/src/main/scala/org/apache/spark/examples/mllib/HypothesisTestingExample.scala
@@ -61,9 +61,9 @@ object HypothesisTestingExample {
   LabeledPoint(-1.0, Vectors.dense(-1.0, 0.0, -0.5)
   )
 )
-  ) // (feature, label) pairs.
+  ) // (label, feature) pairs.
 
-// The contingency table is constructed from the raw (feature, label) pairs and used to conduct
+// The contingency table is constructed from the raw (label, feature) pairs and used to conduct

svn commit: r28954 - in /dev/spark/2.4.0-SNAPSHOT-2018_08_25_00_02-3e4f166-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/java/ _site/api/java/lib/ _site/api/java/org/ _site/api/java/org/apache/ _s

2018-08-25 Thread pwendell
Author: pwendell
Date: Sat Aug 25 07:19:12 2018
New Revision: 28954

Log:
Apache Spark 2.4.0-SNAPSHOT-2018_08_25_00_02-3e4f166 docs


[This commit notification would consist of 1478 parts, 
which exceeds the limit of 50, so it was shortened to this summary.]
