[GitHub] spark issue #22204: [SPARK-25196][SQL] Analyze column statistics in cached q...

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22204
  
cc @wzhfy as well


---




[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21732
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95240/
Test PASSed.


---




[GitHub] spark pull request #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22229


---




[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21732
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21732
  
**[Test build #95240 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95240/testReport)**
 for PR 21732 at commit 
[`c1f798f`](https://github.com/apache/spark/commit/c1f798f7e9cba0d04223eed06f1b1f547ec29dc5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22229
  
Merged to master.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22229
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95239/
Test PASSed.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22229
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22229
  
**[Test build #95239 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95239/testReport)**
 for PR 22229 at commit 
[`20a2aa2`](https://github.com/apache/spark/commit/20a2aa21813cbbb131110e00d2e5c1426d6dae78).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22214: [SPARK-23698][Python] Avoid 'undefined name' by defining...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22214
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95243/
Test PASSed.


---




[GitHub] spark issue #22214: [SPARK-23698][Python] Avoid 'undefined name' by defining...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22214
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22214: [SPARK-23698][Python] Avoid 'undefined name' by defining...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22214
  
**[Test build #95243 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95243/testReport)**
 for PR 22214 at commit 
[`a077542`](https://github.com/apache/spark/commit/a077542e67213c0add6093c0bf8c7568b6cb7b32).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22214: [SPARK-23698][Python] Avoid 'undefined name' by d...

2018-08-24 Thread cclauss
Github user cclauss closed the pull request at:

https://github.com/apache/spark/pull/22214


---




[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21087
  
**[Test build #95244 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95244/testReport)**
 for PR 21087 at commit 
[`ebd9265`](https://github.com/apache/spark/commit/ebd926530c1d8b2f515a4a233f5963eafc17e460).


---




[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21087
  
retest this please


---




[GitHub] spark issue #21665: [SPARK-24688][examples]Modify the comments about Labeled...

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21665
  
@srowen, +1 for taking over. Seems it's been inactive for almost a month.


---




[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-24 Thread yucai
Github user yucai commented on a diff in the pull request:

https://github.com/apache/spark/pull/22197#discussion_r212788978
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
 ---
@@ -377,7 +378,7 @@ class ParquetFileFormat
   // Collects all converted Parquet filter predicates. Notice that 
not all predicates can be
   // converted (`ParquetFilters.createFilter` returns an 
`Option`). That's why a `flatMap`
   // is used here.
-  .flatMap(parquetFilters.createFilter(parquetSchema, _))
+  .flatMap(parquetFilters.createFilter(parquetSchema, _, 
isCaseSensitive))
--- End diff --

Yes, that way is better.


---




[GitHub] spark pull request #22214: [SPARK-23698][Python] Avoid 'undefined name' by d...

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22214#discussion_r212788823
  
--- Diff: python/setup.py ---
@@ -28,6 +28,7 @@
   file=sys.stderr)
 sys.exit(-1)
 
+__version__ = "Unknown"  # Prevent linters from raising 'undefined name'
--- End diff --

Shall we simply use `# noqa` for `VERSION = __version__`? It looks weird to have a 
version defined as "Unknown" just to fix lint errors.


---




[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-24 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22197#discussion_r212788741
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
 ---
@@ -377,7 +378,7 @@ class ParquetFileFormat
   // Collects all converted Parquet filter predicates. Notice that 
not all predicates can be
   // converted (`ParquetFilters.createFilter` returns an 
`Option`). That's why a `flatMap`
   // is used here.
-  .flatMap(parquetFilters.createFilter(parquetSchema, _))
+  .flatMap(parquetFilters.createFilter(parquetSchema, _, 
isCaseSensitive))
--- End diff --

can we pass this config when creating `ParquetFilters`?
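
For illustration, a minimal sketch of the shape being suggested; the constructor parameter and the simplified signature are assumptions for this sketch, not the code that was actually merged:

```scala
import org.apache.parquet.filter2.predicate.FilterPredicate
import org.apache.parquet.schema.MessageType
import org.apache.spark.sql.sources.Filter

// Sketch only: take the flag once at construction time, so call sites such as
// ParquetFileFormat can keep the original two-argument createFilter call.
class ParquetFilters(caseSensitive: Boolean) {
  def createFilter(schema: MessageType, predicate: Filter): Option[FilterPredicate] = {
    // field-name resolution would honour `caseSensitive` here
    None // placeholder body for the sketch
  }
}
```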


---




[GitHub] spark pull request #22189: Correct missing punctuation in the documentation

2018-08-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22189


---




[GitHub] spark issue #22214: [SPARK-23698][Python] Avoid 'undefined name' by defining...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22214
  
**[Test build #95243 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95243/testReport)**
 for PR 22214 at commit 
[`a077542`](https://github.com/apache/spark/commit/a077542e67213c0add6093c0bf8c7568b6cb7b32).


---




[GitHub] spark issue #22214: [SPARK-23698][Python] Avoid 'undefined name' by defining...

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22214
  
It would have been better to fix these together rather than opening 3 PRs for the same 
instances.


---




[GitHub] spark issue #22214: [SPARK-23698][Python] Avoid 'undefined name' by defining...

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22214
  
ok to test


---




[GitHub] spark issue #22189: Correct missing punctuation in the documentation

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22189
  
Merged to master.


---




[GitHub] spark pull request #22208: [SPARK-25216][SQL] Improve error message when a c...

2018-08-24 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22208#discussion_r212788594
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -216,8 +216,16 @@ class Dataset[T] private[sql](
   private[sql] def resolve(colName: String): NamedExpression = {
 queryExecution.analyzed.resolveQuoted(colName, 
sparkSession.sessionState.analyzer.resolver)
   .getOrElse {
-throw new AnalysisException(
-  s"""Cannot resolve column name "$colName" among 
(${schema.fieldNames.mkString(", ")})""")
+if (schema.fieldNames.contains(colName)) {
+  throw new AnalysisException(
+s"""Cannot resolve column name "$colName" among 
(${schema.fieldNames.mkString(", ")}).
+   | Try adding backticks to the column name, i.e., 
`$colName`"""
--- End diff --

Yup, please go ahead.


---




[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21087
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21087
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95238/
Test FAILed.


---




[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21087
  
**[Test build #95238 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95238/testReport)**
 for PR 21087 at commit 
[`ebd9265`](https://github.com/apache/spark/commit/ebd926530c1d8b2f515a4a233f5963eafc17e460).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22112
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2550/
Test PASSed.


---




[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22112
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22112
  
**[Test build #95242 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95242/testReport)**
 for PR 22112 at commit 
[`81bd74a`](https://github.com/apache/spark/commit/81bd74ae132c9ec89cfaeeb1f2f8cba20cab3f9e).


---




[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22209
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95233/
Test PASSed.


---




[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22209
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22209
  
**[Test build #95233 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95233/testReport)**
 for PR 22209 at commit 
[`70678dc`](https://github.com/apache/spark/commit/70678dc957255d459ba0b5c3960f8a6c8767f50d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...

2018-08-24 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22198
  
@dilipbiswal @gatorsmile ping


---




[GitHub] spark issue #22204: [SPARK-25196][SQL] Analyze column statistics in cached q...

2018-08-24 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22204
  
@gatorsmile ping


---




[GitHub] spark issue #21734: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation t...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21734
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21734: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation t...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21734
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95241/
Test PASSed.


---




[GitHub] spark issue #21734: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation t...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21734
  
**[Test build #95241 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95241/testReport)**
 for PR 21734 at commit 
[`f3aa2a4`](https://github.com/apache/spark/commit/f3aa2a4e7ed180e54157394dc9721c53de444866).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22141: [SPARK-25154][SQL] Support NOT IN sub-queries ins...

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22141#discussion_r212785260
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
 ---
@@ -137,13 +137,21 @@ object RewritePredicateSubquery extends 
Rule[LogicalPlan] with PredicateHelper {
   plan: LogicalPlan): (Option[Expression], LogicalPlan) = {
 var newPlan = plan
 val newExprs = exprs.map { e =>
-  e transformUp {
+  e transformDown {
 case Exists(sub, conditions, _) =>
   val exists = AttributeReference("exists", BooleanType, nullable 
= false)()
   // Deduplicate conflicting attributes if any.
   newPlan = dedupJoin(
 Join(newPlan, sub, ExistenceJoin(exists), 
conditions.reduceLeftOption(And)))
   exists
+case (Not(InSubquery(values, ListQuery(sub, conditions, _, _ =>
+  val exists = AttributeReference("exists", BooleanType, nullable 
= false)()
--- End diff --

yea, it's ok to keep the current one. Thanks!


---




[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22227
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95237/
Test FAILed.


---




[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22227
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22227
  
**[Test build #95237 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95237/testReport)**
 for PR 22227 at commit 
[`ceb3f41`](https://github.com/apache/spark/commit/ceb3f41238c8731606164cea5c45a0b87bb5d6f2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22208: [SPARK-25216][SQL] Improve error message when a column c...

2018-08-24 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22208
  
Could you add some unit tests for this? At least, we had better check the 
error message for both `spark.sql.caseSensitive=true` and 
`spark.sql.caseSensitive=false`.
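
A rough sketch of the kind of test being requested, assuming it lives inside one of the existing SQL test suites (e.g. something extending `QueryTest` with `SharedSQLContext`, which provides `spark`, `test`, `withSQLConf` and `intercept`); the test name and the asserted message fragment are illustrative:

```scala
import org.apache.spark.sql.AnalysisException
import org.apache.spark.sql.internal.SQLConf

test("error message for an unresolvable column under both case-sensitivity settings") {
  Seq("true", "false").foreach { caseSensitive =>
    withSQLConf(SQLConf.CASE_SENSITIVE.key -> caseSensitive) {
      val df = spark.range(1).toDF("id")
      // Column lookup goes through Dataset.resolve, the code path whose
      // error message this PR changes.
      val e = intercept[AnalysisException] { df("no_such_column") }
      assert(e.getMessage.contains("Cannot resolve column name"))
    }
  }
}
```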


---




[GitHub] spark pull request #22208: [SPARK-25216][SQL] Improve error message when a c...

2018-08-24 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22208#discussion_r212785096
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -216,8 +216,16 @@ class Dataset[T] private[sql](
   private[sql] def resolve(colName: String): NamedExpression = {
 queryExecution.analyzed.resolveQuoted(colName, 
sparkSession.sessionState.analyzer.resolver)
   .getOrElse {
-throw new AnalysisException(
-  s"""Cannot resolve column name "$colName" among 
(${schema.fieldNames.mkString(", ")})""")
+if (schema.fieldNames.contains(colName)) {
--- End diff --

@icexelloss, this cannot handle mixed cases like the following, which should be 
handled for the purpose of this PR. Please use 
`sparkSession.sessionState.analyzer.resolver`.
```python
spark.range(0, 1).toDF('A.b')['a.B']
```
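
A small standalone illustration of the difference, using the catalyst resolver directly (field names taken from the example above; `ResolverCheckSketch` is just a name for this sketch):

```scala
import org.apache.spark.sql.catalyst.analysis.caseInsensitiveResolution

object ResolverCheckSketch {
  def main(args: Array[String]): Unit = {
    val fieldNames = Seq("A.b")
    val colName = "a.B"

    // The exact-match check in the diff (schema.fieldNames.contains(colName)) misses it.
    println(fieldNames.contains(colName)) // false

    // The analyzer resolver (caseInsensitiveResolution unless spark.sql.caseSensitive=true)
    // treats "A.b" and "a.B" as the same field.
    println(fieldNames.exists(caseInsensitiveResolution(_, colName))) // true
  }
}
```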


---




[GitHub] spark pull request #22208: [SPARK-25216][SQL] Improve error message when a c...

2018-08-24 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22208#discussion_r212784956
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -216,8 +216,16 @@ class Dataset[T] private[sql](
   private[sql] def resolve(colName: String): NamedExpression = {
 queryExecution.analyzed.resolveQuoted(colName, 
sparkSession.sessionState.analyzer.resolver)
   .getOrElse {
-throw new AnalysisException(
-  s"""Cannot resolve column name "$colName" among 
(${schema.fieldNames.mkString(", ")})""")
+if (schema.fieldNames.contains(colName)) {
+  throw new AnalysisException(
+s"""Cannot resolve column name "$colName" among 
(${schema.fieldNames.mkString(", ")}).
+   | Try adding backticks to the column name, i.e., 
`$colName`"""
+  .stripMargin.replaceAll("\n", ""))
+} else {
+  throw new AnalysisException(
+s"""Cannot resolve column name "$colName" among 
(${schema.fieldNames.mkString(", ")}"""
--- End diff --

At the end of the message, the closing `)` is missing.


---




[GitHub] spark issue #22226: [SPARK-24391][SQL] Support arrays of any types by to_jso...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22226
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22226: [SPARK-24391][SQL] Support arrays of any types by to_jso...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22226
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95235/
Test PASSed.


---




[GitHub] spark issue #22226: [SPARK-24391][SQL] Support arrays of any types by to_jso...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22226
  
**[Test build #95235 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95235/testReport)**
 for PR 22226 at commit 
[`c1755de`](https://github.com/apache/spark/commit/c1755decf93c503dcb3de957408908177a75cf3e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21734: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation t...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21734
  
**[Test build #95241 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95241/testReport)**
 for PR 21734 at commit 
[`f3aa2a4`](https://github.com/apache/spark/commit/f3aa2a4e7ed180e54157394dc9721c53de444866).


---




[GitHub] spark issue #21734: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation t...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21734
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2549/
Test PASSed.


---




[GitHub] spark issue #21734: [SPARK-24149][YARN][FOLLOW-UP] Only get the delegation t...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21734
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #22218: [SPARK-25228][CORE]Add executor CPU time metric.

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22218#discussion_r212784518
  
--- Diff: 
core/src/main/scala/org/apache/spark/executor/ExecutorSource.scala ---
@@ -73,6 +75,13 @@ class ExecutorSource(threadPool: ThreadPoolExecutor, 
executorId: String) extends
 registerFileSystemStat(scheme, "write_ops", _.getWriteOps(), 0)
   }
 
+  // Dropwizard metrics gauge measuring the executor's process (JVM) CPU 
time.
+  // The value is returned in nanoseconds, the method return -1 if this 
operation is not supported.
+  val osMXBean = 
ManagementFactory.getOperatingSystemMXBean.asInstanceOf[OperatingSystemMXBean]
+  metricRegistry.register(MetricRegistry.name("executorCPUTime" ), new 
Gauge[Long] {
+override def getValue: Long = osMXBean.getProcessCpuTime()
--- End diff --

Is this metric useful for users?


---




[GitHub] spark issue #22222: [SPARK-25083][SQL] Remove the type erasure hack in data ...

2018-08-24 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22222
  
cc @cloud-fan and @rdblue, please have a look when you have time. If this PR 
doesn't match your expectations, I'll close it soon. Thanks!


---




[GitHub] spark pull request #22222: [SPARK-25083][SQL] Remove the type erasure hack i...

2018-08-24 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22222#discussion_r212784374
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala 
---
@@ -40,6 +42,29 @@ private[sql] trait ColumnarBatchScan extends 
CodegenSupport {
 "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of 
output rows"),
 "scanTime" -> SQLMetrics.createTimingMetric(sparkContext, "scan time"))
 
+  /**
+   * Returns all the RDDs of ColumnarBatch which generates the input rows.
+   */
+  def inputBatchRDDs(): Seq[RDD[ColumnarBatch]]
+
+  /**
+   * Returns all the RDDs of InternalRow which generates the input rows.
+   */
+  def inputRowRDDs(): Seq[RDD[InternalRow]]
+
+  /**
+   * Get input RDD depends on supportsBatch.
+   */
+  final def getInputRDDs(): Seq[RDD[InternalRow]] = {
+if (supportsBatch) {
+  inputBatchRDDs().asInstanceOf[Seq[RDD[InternalRow]]]
--- End diff --

This is maybe the last explicit type-erasure hack left; please check whether it is 
acceptable or not.


---




[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22063
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22063
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95234/
Test FAILed.


---




[GitHub] spark issue #22063: [WIP][SPARK-25044][SQL] Address translation of LMF closu...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22063
  
**[Test build #95234 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95234/testReport)**
 for PR 22063 at commit 
[`721860d`](https://github.com/apache/spark/commit/721860d38b889848a4facd691146bd81bf8fa905).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21732
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2548/
Test PASSed.


---




[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21732
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21732
  
**[Test build #95240 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95240/testReport)**
 for PR 21732 at commit 
[`c1f798f`](https://github.com/apache/spark/commit/c1f798f7e9cba0d04223eed06f1b1f547ec29dc5).


---




[GitHub] spark pull request #22220: [SPARK-25229][SQL] Partition fields are uniformly...

2018-08-24 Thread ouyangxiaochen
Github user ouyangxiaochen closed the pull request at:

https://github.com/apache/spark/pull/22220


---




[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22209
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95229/
Test PASSed.


---




[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22209
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22209: [SPARK-24415][Core] Fixed the aggregated stage metrics b...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22209
  
**[Test build #95229 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95229/testReport)**
 for PR 22209 at commit 
[`70678dc`](https://github.com/apache/spark/commit/70678dc957255d459ba0b5c3960f8a6c8767f50d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212783356
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
+"The `limit` parameter controls the number of times the pattern is 
applied and " +
+"therefore affects the length of the resulting array. If the limit n 
is " +
+"greater than zero then the pattern will be applied at most n - 1 
times, " +
+"the array's length will be no greater than n, and the array's last 
entry " +
+"will contain all input beyond the last matched delimiter. If n is " +
+"non-positive then the pattern will be applied as many times as " +
+"possible and the array can have any length. If n is zero then the " +
+"pattern will be applied as many times as possible, the array can " +
+"have any length, and trailing empty strings will be discarded.",
--- End diff --

hmm, is it possible to make this usage more compact? I think the usage here 
should be concise.


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212783481
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
+"The `limit` parameter controls the number of times the pattern is 
applied and " +
+"therefore affects the length of the resulting array. If the limit n 
is " +
+"greater than zero then the pattern will be applied at most n - 1 
times, " +
+"the array's length will be no greater than n, and the array's last 
entry " +
+"will contain all input beyond the last matched delimiter. If n is " +
+"non-positive then the pattern will be applied as many times as " +
+"possible and the array can have any length. If n is zero then the " +
+"pattern will be applied as many times as possible, the array can " +
+"have any length, and trailing empty strings will be discarded.",
   examples = """
 Examples:
-  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]');
+  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', -1);
["one","two","three",""]
+|  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', 2);
+ |   ["one","twoBthreeC"]
   """)
-case class StringSplit(str: Expression, pattern: Expression)
-  extends BinaryExpression with ImplicitCastInputTypes {
+case class StringSplit(str: Expression, pattern: Expression, limit: 
Expression)
+  extends TernaryExpression with ImplicitCastInputTypes {
 
-  override def left: Expression = str
-  override def right: Expression = pattern
   override def dataType: DataType = ArrayType(StringType)
-  override def inputTypes: Seq[DataType] = Seq(StringType, StringType)
+  override def inputTypes: Seq[DataType] = Seq(StringType, StringType, 
IntegerType)
+  override def children: Seq[Expression] = str :: pattern :: limit :: Nil
 
-  override def nullSafeEval(string: Any, regex: Any): Any = {
-val strings = 
string.asInstanceOf[UTF8String].split(regex.asInstanceOf[UTF8String], -1)
+  override def nullSafeEval(string: Any, regex: Any, limit: Any): Any = {
--- End diff --

I think we still need to do some checking on `limit`. According to the Presto 
documentation, `limit` must be a positive number; -1 is only used when no `limit` 
parameter is given (the default value).
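
For reference, the `java.lang.String.split(regex, limit)` semantics that the usage text in the diff above paraphrases, shown in plain Scala with no Spark APIs:

```scala
object SplitLimitDemo {
  def main(args: Array[String]): Unit = {
    val s = "oneAtwoBthreeC"
    // limit > 0: the pattern is applied at most limit - 1 times
    println(s.split("[ABC]", 2).mkString("|"))  // one|twoBthreeC
    // negative limit: applied as often as possible, trailing empty strings kept
    println(s.split("[ABC]", -1).mkString("|")) // one|two|three|
    // limit == 0: applied as often as possible, trailing empty strings dropped
    println(s.split("[ABC]", 0).mkString("|"))  // one|two|three
  }
}
```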


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212783375
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
+"The `limit` parameter controls the number of times the pattern is 
applied and " +
+"therefore affects the length of the resulting array. If the limit n 
is " +
+"greater than zero then the pattern will be applied at most n - 1 
times, " +
+"the array's length will be no greater than n, and the array's last 
entry " +
+"will contain all input beyond the last matched delimiter. If n is " +
+"non-positive then the pattern will be applied as many times as " +
+"possible and the array can have any length. If n is zero then the " +
+"pattern will be applied as many times as possible, the array can " +
+"have any length, and trailing empty strings will be discarded.",
   examples = """
 Examples:
-  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]');
+  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', -1);
--- End diff --

I think it is better to keep the original example for the default value.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22229
  
**[Test build #95239 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95239/testReport)**
 for PR 22229 at commit 
[`20a2aa2`](https://github.com/apache/spark/commit/20a2aa21813cbbb131110e00d2e5c1426d6dae78).


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22229
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22229
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2547/
Test PASSed.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22229
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/22229
  
retest this please.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22229
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95231/
Test FAILed.


---




[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22229
  
**[Test build #95231 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95231/testReport)**
 for PR 22229 at commit 
[`20a2aa2`](https://github.com/apache/spark/commit/20a2aa21813cbbb131110e00d2e5c1426d6dae78).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212783216
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
+"The `limit` parameter controls the number of times the pattern is 
applied and " +
+"therefore affects the length of the resulting array. If the limit n 
is " +
+"greater than zero then the pattern will be applied at most n - 1 
times, " +
+"the array's length will be no greater than n, and the array's last 
entry " +
+"will contain all input beyond the last matched delimiter. If n is " +
+"non-positive then the pattern will be applied as many times as " +
+"possible and the array can have any length. If n is zero then the " +
+"pattern will be applied as many times as possible, the array can " +
+"have any length, and trailing empty strings will be discarded.",
   examples = """
 Examples:
-  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]');
+  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', -1);
["one","two","three",""]
+|  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', 2);
+ |   ["one","twoBthreeC"]
   """)
-case class StringSplit(str: Expression, pattern: Expression)
-  extends BinaryExpression with ImplicitCastInputTypes {
+case class StringSplit(str: Expression, pattern: Expression, limit: 
Expression)
--- End diff --

For test coverage, better to add tests in `string-functions.sql`  for the 
two cases: two arguments and three arguments.


---




[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/22227
  
@phegstrom Thanks for your contribution!
Btw, it seems like the email address in your commits is not connected to your 
GitHub account. Could you connect that address to your account, or use another 
address that is connected to it? Otherwise, your contribution will not be 
attributed to you.


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212783068
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2554,7 +2554,27 @@ object functions {
* @since 1.5.0
*/
   def split(str: Column, pattern: String): Column = withExpr {
-StringSplit(str.expr, lit(pattern).expr)
+StringSplit(str.expr, lit(pattern).expr, lit(-1).expr)
+  }
+
+  /**
+   * Splits str around pattern (pattern is a regular expression) up to 
`limit-1` times.
+   *
+   * The limit parameter controls the number of times the pattern is 
applied and therefore
+   * affects the length of the resulting array. If the limit n is greater 
than zero then the
+   * pattern will be applied at most n - 1 times, the array's length will 
be no greater than
+   * n, and the array's last entry will contain all input beyond the last 
matched delimiter.
+   * If n is non-positive then the pattern will be applied as many times 
as possible and the
+   * array can have any length. If n is zero then the pattern will be 
applied as many times as
+   * possible, the array can have any length, and trailing empty strings 
will be discarded.
+   *
+   * @note Pattern is a string representation of the regular expression.
+   *
+   * @group string_funcs
+   * @since 1.5.0
+   */
+  def split(str: Column, pattern: String, limit: Int): Column = withExpr {
+StringSplit(str.expr, lit(pattern).expr, lit(limit).expr)
--- End diff --

nit: better to directly use `Literal`
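
A tiny standalone sketch of what the nit points at: building the catalyst literal directly with `Literal(...)` instead of going through `lit(...).expr` (the object name is invented for this sketch):

```scala
import org.apache.spark.sql.catalyst.expressions.Literal

object LiteralNitSketch {
  def main(args: Array[String]): Unit = {
    // Inside functions.scala the nit would turn lit(limit).expr into Literal(limit).
    val limitExpr = Literal(-1)
    println(limitExpr.dataType) // IntegerType
  }
}
```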


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212782616
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
+"The `limit` parameter controls the number of times the pattern is 
applied and " +
+"therefore affects the length of the resulting array. If the limit n 
is " +
+"greater than zero then the pattern will be applied at most n - 1 
times, " +
+"the array's length will be no greater than n, and the array's last 
entry " +
+"will contain all input beyond the last matched delimiter. If n is " +
+"non-positive then the pattern will be applied as many times as " +
+"possible and the array can have any length. If n is zero then the " +
+"pattern will be applied as many times as possible, the array can " +
+"have any length, and trailing empty strings will be discarded.",
   examples = """
 Examples:
-  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]');
+  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', -1);
["one","two","three",""]
+|  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', 2);
+ |   ["one","twoBthreeC"]
   """)
-case class StringSplit(str: Expression, pattern: Expression)
-  extends BinaryExpression with ImplicitCastInputTypes {
+case class StringSplit(str: Expression, pattern: Expression, limit: 
Expression)
--- End diff --

We still need to support 2 arguments. Please add a constructor `def 
this(str: Expression, pattern: Expression)`.
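
For readers less familiar with the pattern being requested, a tiny standalone analogue of a two-argument auxiliary constructor that defaults the limit; the class and field names here are invented, not the real `StringSplit`:

```scala
// An auxiliary constructor on a case class that fills in a default third argument,
// which is what keeping the two-argument form of StringSplit amounts to.
case class Split3(str: String, pattern: String, limit: Int) {
  def this(str: String, pattern: String) = this(str, pattern, -1)
}

object Split3Demo {
  def main(args: Array[String]): Unit = {
    val s = new Split3("oneAtwoBthreeC", "[ABC]") // two-argument form, limit defaults to -1
    println(s)                                    // Split3(oneAtwoBthreeC,[ABC],-1)
  }
}
```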


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212782550
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
+"The `limit` parameter controls the number of times the pattern is 
applied and " +
+"therefore affects the length of the resulting array. If the limit n 
is " +
+"greater than zero then the pattern will be applied at most n - 1 
times, " +
+"the array's length will be no greater than n, and the array's last 
entry " +
+"will contain all input beyond the last matched delimiter. If n is " +
+"non-positive then the pattern will be applied as many times as " +
+"possible and the array can have any length. If n is zero then the " +
+"pattern will be applied as many times as possible, the array can " +
+"have any length, and trailing empty strings will be discarded.",
   examples = """
 Examples:
-  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]');
+  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', -1);
["one","two","three",""]
+|  > SELECT _FUNC_('oneAtwoBthreeC', '[ABC]', 2);
+ |   ["one","twoBthreeC"]
--- End diff --

nit: remove `|`.


---




[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22227#discussion_r212782784
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -232,30 +232,41 @@ case class RLike(left: Expression, right: Expression) 
extends StringRegexExpress
  * Splits str around pat (pattern is a regular expression).
  */
 @ExpressionDescription(
-  usage = "_FUNC_(str, regex) - Splits `str` around occurrences that match 
`regex`.",
+  usage = "_FUNC_(str, regex, limit) - Splits `str` around occurrences 
that match `regex`." +
--- End diff --

Can you refine the description and the format to be consistent with the others, e.g., 
`RLike`?

https://github.com/apache/spark/blob/ceb3f41238c8731606164cea5c45a0b87bb5d6f2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala#L78


---




[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21087
  
**[Test build #95238 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95238/testReport)**
 for PR 21087 at commit 
[`ebd9265`](https://github.com/apache/spark/commit/ebd926530c1d8b2f515a4a233f5963eafc17e460).


---




[GitHub] spark pull request #21977: [SPARK-25004][CORE] Add spark.executor.pyspark.me...

2018-08-24 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/21977#discussion_r212782657
  
--- Diff: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/BaseYarnClusterSuite.scala
 ---
@@ -161,6 +162,11 @@ abstract class BaseYarnClusterSuite
 }
 extraJars.foreach(launcher.addJar)
 
+if (outFile.isDefined) {
--- End diff --

Like I said, I think `foreach` is a bad practice with options, so I'd 
rather not change to use it. I'd be happy to change this to a pattern match if 
you think it is really desirable to get rid of the `.get`.
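
For reference, a standalone sketch of the three Option styles being debated; 
the body of the branch isn't shown in the diff, so `handle` below is just a 
placeholder:

```scala
import java.io.File

def handle(f: File): Unit = println(f.getAbsolutePath)

val outFile: Option[File] = Some(new File("stdout.txt"))

// 1. isDefined + .get, as in the diff
if (outFile.isDefined) {
  handle(outFile.get)
}

// 2. foreach, the style the comment argues against
outFile.foreach(handle)

// 3. pattern match, the offered alternative that avoids the .get
outFile match {
  case Some(f) => handle(f)
  case None => // nothing to do
}
```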


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-24 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21087
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/7
  
**[Test build #95237 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95237/testReport)**
 for PR 7 at commit 
[`ceb3f41`](https://github.com/apache/spark/commit/ceb3f41238c8731606164cea5c45a0b87bb5d6f2).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/7#discussion_r212782576
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2554,7 +2554,27 @@ object functions {
* @since 1.5.0
*/
   def split(str: Column, pattern: String): Column = withExpr {
-StringSplit(str.expr, lit(pattern).expr)
+StringSplit(str.expr, lit(pattern).expr, lit(-1).expr)
+  }
+
+  /**
+   * Splits str around pattern (pattern is a regular expression) up to 
`limit-1` times.
--- End diff --

Could you drop "up to `limit-1` times" from the first line? The behaviour 
depends on the value of `limit`, as described below.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/7
  
ok to test.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22216: [SPARK-25223][SQL] Use a map to store values for NamedLa...

2018-08-24 Thread ueshin
Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/22216
  
I see. Let me think about it again. I'll close this for now. Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/7#discussion_r212781563
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2554,7 +2554,27 @@ object functions {
* @since 1.5.0
*/
   def split(str: Column, pattern: String): Column = withExpr {
-StringSplit(str.expr, lit(pattern).expr)
+StringSplit(str.expr, lit(pattern).expr, lit(-1).expr)
+  }
+
+  /**
+   * Splits str around pattern (pattern is a regular expression) up to 
`limit-1` times.
+   *
+   * The limit parameter controls the number of times the pattern is 
applied and therefore
+   * affects the length of the resulting array. If the limit n is greater 
than zero then the
+   * pattern will be applied at most n - 1 times, the array's length will 
be no greater than
+   * n, and the array's last entry will contain all input beyond the last 
matched delimiter.
+   * If n is non-positive then the pattern will be applied as many times 
as possible and the
+   * array can have any length. If n is zero then the pattern will be 
applied as many times as
+   * possible, the array can have any length, and trailing empty strings 
will be discarded.
+   *
+   * @note Pattern is a string representation of the regular expression.
+   *
+   * @group string_funcs
+   * @since 1.5.0
--- End diff --

`1.5.0` -> `2.4.0`
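
For context, a sketch of the new overload in use, assuming it takes 
`(str, pattern, limit)` as the Javadoc above describes; runnable in 
spark-shell (where `spark` and its implicits are in scope) on a build with 
this patch:

```scala
import org.apache.spark.sql.functions.split
import spark.implicits._

val df = Seq("2018-08-24 12:34:56").toDF("ts")
// limit = 4: the pattern is applied at most 3 times, so the remainder of the
// string is kept intact in the last entry.
df.select(split($"ts", "[- :]", 4)).show(false)  // [2018, 08, 24, 12:34:56]
```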


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22216: [SPARK-25223][SQL] Use a map to store values for ...

2018-08-24 Thread ueshin
Github user ueshin closed the pull request at:

https://github.com/apache/spark/pull/22216


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-08-24 Thread NiharS
Github user NiharS commented on a diff in the pull request:

https://github.com/apache/spark/pull/22192#discussion_r212781171
  
--- Diff: core/src/test/java/org/apache/spark/ExecutorPluginSuite.java ---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import org.apache.spark.api.java.JavaSparkContext;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+// Tests loading plugins into executors
+public class ExecutorPluginSuite {
+  // Static value modified by testing plugin to ensure plugin loaded 
correctly.
+  public static int numSuccessfulPlugins = 0;
+
+  private SparkConf initializeSparkConf(String pluginNames) {
+return new SparkConf()
+.setMaster("local")
+.setAppName("test")
+.set("spark.executor.plugins", pluginNames);
--- End diff --

I had the same first reaction. The original executor plugin code is a Java 
class, so I figured I should keep the test in Java as well (I checked 
beforehand and couldn't find any cases of Java code being tested primarily 
from Scala, so it seemed best not to start). Perhaps the point is to let 
plugin writers choose their language, since it's easier to extend a Java 
interface from Scala than to extend a Scala trait from Java? I'll leave it 
as is for the time being.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/7
  
Can you add tests in `StringFunctionsSuite`, too?
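
A sketch of what such a test could look like inside `StringFunctionsSuite` 
(which already pulls in `testImplicits` and `functions._`), assuming the 
three-argument `split` is exposed both on the Column API and in SQL:

```scala
test("string split function with limit") {
  val df = Seq(("oneAtwoBthreeC", "[ABC]")).toDF("s", "p")

  // positive limit: at most limit - 1 splits
  checkAnswer(
    df.select(split($"s", "[ABC]", 2)),
    Row(Seq("one", "twoBthreeC")))

  // negative limit: unlimited splits, trailing empty strings kept
  checkAnswer(
    df.selectExpr("split(s, '[ABC]', -1)"),
    Row(Seq("one", "two", "three", "")))
}
```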


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-08-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22192#discussion_r212780758
  
--- Diff: core/src/test/java/org/apache/spark/ExecutorPluginSuite.java ---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import org.apache.spark.api.java.JavaSparkContext;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+// Tests loading plugins into executors
+public class ExecutorPluginSuite {
+  // Static value modified by testing plugin to ensure plugin loaded 
correctly.
+  public static int numSuccessfulPlugins = 0;
+
+  private SparkConf initializeSparkConf(String pluginNames) {
+return new SparkConf()
+.setMaster("local")
+.setAppName("test")
+.set("spark.executor.plugins", pluginNames);
--- End diff --

Ah, crap, this is Java... any specific reason to do this in Java? Otherwise 
it's probably fine the way it is.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/7
  
@gatorsmile @ueshin can you trigger this test?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22227: [SPARK-25202] [Core] Implements split with limit sql fun...

2018-08-24 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/7
  
The tag in the title should be `[SQL]`, not `[CORE]`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-08-24 Thread NiharS
Github user NiharS commented on a diff in the pull request:

https://github.com/apache/spark/pull/22192#discussion_r212780252
  
--- Diff: core/src/test/java/org/apache/spark/ExecutorPluginSuite.java ---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import org.apache.spark.api.java.JavaSparkContext;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+// Tests loading plugins into executors
+public class ExecutorPluginSuite {
+  // Static value modified by testing plugin to ensure plugin loaded 
correctly.
+  public static int numSuccessfulPlugins = 0;
+
+  private SparkConf initializeSparkConf(String pluginNames) {
+return new SparkConf()
+.setMaster("local")
+.setAppName("test")
+.set("spark.executor.plugins", pluginNames);
--- End diff --

Is it alright to use Seq in a Java file? I couldn't find it used in any 
other Java files in Spark outside of a target/ directory. The compiler 
doesn't seem to complain when I import it, but I want to make sure I'm 
sticking to the usual style rules.
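
For what it's worth, a sketch of what a typed entry for this setting could 
look like in `org.apache.spark.internal.config` (the entry and its default 
are assumptions for illustration, not something this PR adds as shown):

```scala
import org.apache.spark.internal.config.ConfigBuilder

// Meant to live alongside the other entries in the config package object.
private[spark] val EXECUTOR_PLUGINS =
  ConfigBuilder("spark.executor.plugins")
    .doc("Comma-separated list of ExecutorPlugin class names to instantiate on each executor.")
    .stringConf
    .toSequence
    .createWithDefault(Nil)
```

A `Seq`-typed entry like this can be passed to `SparkConf.set` from Scala 
code inside Spark, but from a Java test the simplest options are to keep the 
raw string key or to join the class names with `String.join(",", ...)`.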


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22230: [SPARK-25214][SS][FOLLOWUP]Fix the issue that Kafka v2 s...

2018-08-24 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/22230
  
LGTM.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-08-24 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22192#discussion_r212779435
  
--- Diff: core/src/test/java/org/apache/spark/ExecutorPluginSuite.java ---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import org.apache.spark.api.java.JavaSparkContext;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+// Tests loading plugins into executors
+public class ExecutorPluginSuite {
+  // Static value modified by testing plugin to ensure plugin loaded 
correctly.
+  public static int numSuccessfulPlugins = 0;
+
+  private SparkConf initializeSparkConf(String pluginNames) {
+return new SparkConf()
+.setMaster("local")
+.setAppName("test")
+.set("spark.executor.plugins", pluginNames);
--- End diff --

`.set(EXECUTOR_PLUGINS, Seq("class1", "class2"))`



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-08-24 Thread NiharS
Github user NiharS commented on a diff in the pull request:

https://github.com/apache/spark/pull/22192#discussion_r212778867
  
--- Diff: core/src/test/java/org/apache/spark/ExecutorPluginSuite.java ---
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import org.apache.spark.api.java.JavaSparkContext;
+
+import org.junit.Assert;
+import org.junit.Test;
+
+// Tests loading plugins into executors
+public class ExecutorPluginSuite {
+  // Static value modified by testing plugin to ensure plugin loaded 
correctly.
+  public static int numSuccessfulPlugins = 0;
+
+  private SparkConf initializeSparkConf(String pluginNames) {
+return new SparkConf()
+.setMaster("local")
+.setAppName("test")
+.set("spark.executor.plugins", pluginNames);
--- End diff --

Sorry, could you explain this for me? Do you mean that instead of 
`.set("spark.executor.plugins", pluginNames)` I should (a) have a constant 
storing the config name and (b) pass the plugin names to initializeSparkConf 
as a list, ending up with `.set(EXECUTOR_PLUGIN_CONF_NAME, String.join(",", 
pluginNames))`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22192: [SPARK-24918][Core] Executor Plugin API

2018-08-24 Thread NiharS
Github user NiharS commented on a diff in the pull request:

https://github.com/apache/spark/pull/22192#discussion_r212777617
  
--- Diff: core/src/main/java/org/apache/spark/ExecutorPlugin.java ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import org.apache.spark.annotation.DeveloperApi;
+
+/**
+ * A plugin which can be automatically instantiated within each Spark 
executor.  Users can specify
+ * plugins which should be created with the "spark.executor.plugins" 
configuration.  An instance
+ * of each plugin will be created for every executor, including those 
created by dynamic allocation,
+ * before the executor starts running any tasks.
+ *
+ * The specific API exposed to end users is still considered to be very 
unstable.  We will
+ * *hopefully* be able to keep compatibility by providing default 
implementations for any methods
+ * added, but make no guarantees this will always be possible across all 
spark releases.
+ *
+ * Spark does nothing to verify the plugin is doing legitimate things, or 
to manage the resources
+ * it uses.  A plugin acquires the same privileges as the user running the 
task.  A bad plugin
+ * could also interfere with task execution and make the executor fail in 
unexpected ways.
+ */
+@DeveloperApi
+public interface ExecutorPlugin {
--- End diff --

Pinging @squito for this one; I'm not sure whether he specifically wanted an 
empty API to start with (letting plugins run to their own completion, or 
until the JVM cleans them up) or was considering adding methods later.
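
For illustration, a minimal plugin written against the interface as it 
stands in this diff (no methods yet); the class name and what it does are 
made up:

```scala
package com.example

import org.apache.spark.ExecutorPlugin

class NoopExecutorPlugin extends ExecutorPlugin {
  // With no lifecycle methods on the interface yet, any setup has to happen
  // in the constructor; the instance then lives until the executor JVM exits.
  System.err.println("NoopExecutorPlugin loaded on " +
    java.net.InetAddress.getLocalHost.getHostName)
}
```

It would be enabled with something like 
`--conf spark.executor.plugins=com.example.NoopExecutorPlugin`.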


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


