[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93446/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21823: [SPARK-24870][SQL]Cache can't work normally if th...

2018-07-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21823#discussion_r204482993
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/SameResultSuite.scala ---
@@ -58,4 +61,16 @@ class SameResultSuite extends QueryTest with 
SharedSQLContext {
 val df4 = spark.range(10).agg(sumDistinct($"id"))
 
assert(df3.queryExecution.executedPlan.sameResult(df4.queryExecution.executedPlan))
   }
+
+  test("Canonicalized result is not case-insensitive") {
--- End diff --

`Canonicalized result is not case-insensitive` -> `Canonicalized result is 
case-insensitive`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Pyt...

2018-07-23 Thread icexelloss
Github user icexelloss commented on a diff in the pull request:

https://github.com/apache/spark/pull/21650#discussion_r204482591
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -5060,6 +5049,144 @@ def test_type_annotation(self):
 df = self.spark.range(1).select(pandas_udf(f=_locals['noop'], 
returnType='bigint')('id'))
 self.assertEqual(df.first()[0], 0)
 
+def test_mixed_udf(self):
+import pandas as pd
+from pyspark.sql.functions import udf, pandas_udf
+
+df = self.spark.range(0, 1).toDF('v')
+
+# Test mixture of multiple UDFs and Pandas UDFs
+
+@udf('int')
+def f1(x):
+assert type(x) == int
+return x + 1
+
+@pandas_udf('int')
+def f2(x):
+assert type(x) == pd.Series
+return x + 10
+
+@udf('int')
+def f3(x):
+assert type(x) == int
+return x + 100
+
+@pandas_udf('int')
+def f4(x):
+assert type(x) == pd.Series
+return x + 1000
+
+# Test mixed udfs in a single projection
+df1 = df.withColumn('f1', f1(df['v']))
+df1 = df1.withColumn('f2', f2(df1['v']))
+df1 = df1.withColumn('f3', f3(df1['v']))
+df1 = df1.withColumn('f4', f4(df1['v']))
+df1 = df1.withColumn('f2_f1', f2(df1['f1']))
+df1 = df1.withColumn('f3_f1', f3(df1['f1']))
+df1 = df1.withColumn('f4_f1', f4(df1['f1']))
+df1 = df1.withColumn('f3_f2', f3(df1['f2']))
+df1 = df1.withColumn('f4_f2', f4(df1['f2']))
+df1 = df1.withColumn('f4_f3', f4(df1['f3']))
+df1 = df1.withColumn('f3_f2_f1', f3(df1['f2_f1']))
+df1 = df1.withColumn('f4_f2_f1', f4(df1['f2_f1']))
+df1 = df1.withColumn('f4_f3_f1', f4(df1['f3_f1']))
+df1 = df1.withColumn('f4_f3_f2', f4(df1['f3_f2']))
+df1 = df1.withColumn('f4_f3_f2_f1', f4(df1['f3_f2_f1']))
+
+# Test mixed udfs in a single expression
+df2 = df.withColumn('f1', f1(df['v']))
+df2 = df2.withColumn('f2', f2(df['v']))
+df2 = df2.withColumn('f3', f3(df['v']))
+df2 = df2.withColumn('f4', f4(df['v']))
+df2 = df2.withColumn('f2_f1', f2(f1(df['v'])))
+df2 = df2.withColumn('f3_f1', f3(f1(df['v'])))
+df2 = df2.withColumn('f4_f1', f4(f1(df['v'])))
+df2 = df2.withColumn('f3_f2', f3(f2(df['v'])))
+df2 = df2.withColumn('f4_f2', f4(f2(df['v'])))
+df2 = df2.withColumn('f4_f3', f4(f3(df['v'])))
+df2 = df2.withColumn('f3_f2_f1', f3(f2(f1(df['v']))))
+df2 = df2.withColumn('f4_f2_f1', f4(f2(f1(df['v']))))
+df2 = df2.withColumn('f4_f3_f1', f4(f3(f1(df['v']))))
+df2 = df2.withColumn('f4_f3_f2', f4(f3(f2(df['v']))))
+df2 = df2.withColumn('f4_f3_f2_f1', f4(f3(f2(f1(df['v'])))))
+
+# expected result
+df3 = df.withColumn('f1', df['v'] + 1)
+df3 = df3.withColumn('f2', df['v'] + 10)
+df3 = df3.withColumn('f3', df['v'] + 100)
+df3 = df3.withColumn('f4', df['v'] + 1000)
+df3 = df3.withColumn('f2_f1', df['v'] + 11)
+df3 = df3.withColumn('f3_f1', df['v'] + 101)
+df3 = df3.withColumn('f4_f1', df['v'] + 1001)
+df3 = df3.withColumn('f3_f2', df['v'] + 110)
+df3 = df3.withColumn('f4_f2', df['v'] + 1010)
+df3 = df3.withColumn('f4_f3', df['v'] + 1100)
+df3 = df3.withColumn('f3_f2_f1', df['v'] + 111)
+df3 = df3.withColumn('f4_f2_f1', df['v'] + 1011)
+df3 = df3.withColumn('f4_f3_f1', df['v'] + 1101)
+df3 = df3.withColumn('f4_f3_f2', df['v'] + 1110)
+df3 = df3.withColumn('f4_f3_f2_f1', df['v'] + 1111)
+
+self.assertEquals(df3.collect(), df1.collect())
+self.assertEquals(df3.collect(), df2.collect())
+
+def test_mixed_udf_and_sql(self):
+import pandas as pd
+from pyspark.sql.functions import udf, pandas_udf
+
+df = self.spark.range(0, 1).toDF('v')
+
+# Test mixture of UDFs, Pandas UDFs and SQL expression.
+
+@udf('int')
+def f1(x):
+assert type(x) == int
+return x + 1
+
+def f2(x):
+return x + 10
+
+@pandas_udf('int')
+def f3(x):
+assert type(x) == pd.Series
+return x + 100
+
+df1 = df.withColumn('f1', f1(df['v']))
+df1 = df1.withColumn('f2', f2(df['v']))
+df1 = df1.withColumn('f3', f3(df['v']))
+df1 = df1.withColumn('f1_f2', f1(f2(df['v'])))
+df1 = df1.wi

[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21758
  
**[Test build #93446 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93446/testReport)**
 for PR 21758 at commit 
[`c16a47f`](https://github.com/apache/spark/commit/c16a47f0d15998133b9d61d8df5310f1f66b11b0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93445/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21650
  
**[Test build #93451 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93451/testReport)**
 for PR 21650 at commit 
[`4c9c007`](https://github.com/apache/spark/commit/4c9c007858aef65c2c190b35673404dd61279369).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21758
  
**[Test build #93445 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93445/testReport)**
 for PR 21758 at commit 
[`9ae56d1`](https://github.com/apache/spark/commit/9ae56d12b580f0a3cecb90dfe275d067e8f3a7f3).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...

2018-07-23 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/21118
  
@cloud-fan, any update on merging this?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21845: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-07-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21845
  
Of course I am, as usual. I have actually already been taking care of it. The thing is, 
tests just keep being added even when they duplicate something that already exists. I feel like 
it's a bit excessive so far. In general, I don't think there are particular tests 
that take an especially long time, IMHO. What we should 
do is put some effort into deduplicating the tests.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21805


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

2018-07-23 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/21546#discussion_r204479024
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -2095,9 +2095,11 @@ def toPandas(self):
 _check_dataframe_localize_timestamps
 import pyarrow
 
-tables = self._collectAsArrow()
-if tables:
-table = pyarrow.concat_tables(tables)
+# Collect un-ordered list of batches, and list of 
correct order indices
+batches, batch_order = self._collectAsArrow()
+if batches:
--- End diff --

Sure, I was playing around with this being an iterator, but I will change 
it since it is a list now


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21805
  
LGTM

Thanks! Merged to master


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21840: [WIP] New copy() method for Column of StructType

2018-07-23 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/21840#discussion_r204476440
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -3858,3 +3858,29 @@ object ArrayUnion {
 new GenericArrayData(arrayBuffer)
   }
 }
+
+case class StructCopy(
+struct: Expression,
+fieldName: String,
+fieldValue: Expression) extends Expression with CodegenFallback {
+
+  override def children: Seq[Expression] = Seq(struct, fieldValue)
+  override def nullable: Boolean = struct.nullable
+
+  lazy val fieldIndex = 
struct.dataType.asInstanceOf[StructType].fieldIndex(fieldName)
+
+  override def dataType: DataType = {
+val structType = struct.dataType.asInstanceOf[StructType]
+val field = structType.fields(fieldIndex).copy(dataType = 
fieldValue.dataType)
+
+structType.copy(fields = structType.fields.updated(fieldIndex, field))
+  }
+
+  override def eval(input: InternalRow): Any = {
+val newFieldValue = fieldValue.eval(input)
+val structValue = struct.eval(input).asInstanceOf[GenericInternalRow]
--- End diff --

You cannot assume this, and you also cannot update the row in place. You 
will need to copy the row, I am afraid.
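
A minimal sketch of the copy-based approach suggested above, assuming the struct value is an `InternalRow` and reusing `fieldIndex` from the diff; the names and shape here are illustrative, not the PR's actual code:

```scala
override def eval(input: InternalRow): Any = {
  val structValue = struct.eval(input).asInstanceOf[InternalRow]
  if (structValue == null) {
    null
  } else {
    val structType = struct.dataType.asInstanceOf[StructType]
    // Build a fresh row instead of mutating the incoming row in place,
    // because the input row may be reused by the caller.
    val values = structType.fields.indices.map { i =>
      if (i == fieldIndex) fieldValue.eval(input)
      else structValue.get(i, structType.fields(i).dataType)
    }.toArray
    new GenericInternalRow(values)
  }
}
```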


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...

2018-07-23 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21826
  
No we can't because you can still use string concat in filters, e.g.

colA || colB == "ab"

What is "||" here?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-23 Thread maryannxue
Github user maryannxue commented on the issue:

https://github.com/apache/spark/pull/21821
  
I just ran a test with the once-strategy check and found out that a few 
batches/rules do not stop, e.g. AggregatePushDown, "Convert to Spark client 
exec", PartitionPruning. I believe most of them are edge rules and none of them 
are analyzer rules. Still, let's keep this fix just to be on the safe side.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21845: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-07-23 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21845
  
This helps, but it is not sustainable to keep increasing the threshold. 
What we need to do is to look at test time distribution and figure out what 
test suites are unnecessarily long and actually cut down the time there. 
@HyukjinKwon  Would you be interested in doing that?



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21805
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93444/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21805
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21805
  
**[Test build #93444 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93444/testReport)**
 for PR 21805 at commit 
[`2a21c80`](https://github.com/apache/spark/commit/2a21c80f0b751277e8ac975279474378c765af6c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21847: [SPARK-24855][SQL][EXTERNAL][WIP]: Built-in AVRO support...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21847
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21847: [SPARK-24855][SQL][EXTERNAL][WIP]: Built-in AVRO support...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21847
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21847: [SPARK-24855][SQL][EXTERNAL][WIP]: Built-in AVRO support...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21847
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21847: [SPARK-24855][SQL][EXTERNAL][WIP]: Built-in AVRO ...

2018-07-23 Thread lindblombr
GitHub user lindblombr opened a pull request:

https://github.com/apache/spark/pull/21847

[SPARK-24855][SQL][EXTERNAL][WIP]: Built-in AVRO support should support 
specified schema on write

## What changes were proposed in this pull request?

Allows the `avroSchema` option to be specified on write, allowing a user to 
specify a schema in cases where this is required.  A trivial use case is 
reading in an Avro dataset, making some small adjustment to a column or columns, 
and writing out using the same schema.  Implicit schema creation from a SQL 
struct results in a schema that, while functionally similar for the most part, 
is not necessarily compatible.

Allows the `fixed` field type to be utilized for records with a specified 
`avroSchema`.
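
For illustration, a hedged usage sketch of the proposed option; the `avroSchema` option name comes from this description, while the data source name, file paths, and the existing `df` DataFrame are assumptions:

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Load the Avro schema the user wants to enforce on write (hypothetical path).
val avroSchemaJson =
  new String(Files.readAllBytes(Paths.get("/tmp/user.avsc")), StandardCharsets.UTF_8)

df.write
  .format("avro")                       // assumed built-in data source name
  .option("avroSchema", avroSchemaJson) // option proposed in this PR
  .save("/tmp/output")
```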

## How was this patch tested?

Unit tests in AvroSuite are extended to write out included sample datasets 
using new `avroSchema` option and verify readability of records.  Ideally, 
tests should be added for each `Schema.Type` to ensure records are written with 
full fidelity.

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindblombr/spark specify_schema_on_write

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21847.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21847


commit 30fc1ae98a2c29824b83e956c2cbe87e9bbb3749
Author: Brian Lindblom 
Date:   2018-07-21T20:43:26Z

SPARK-24855: Built-in AVRO support should support specified schema on write




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21650
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1238/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21650
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21650
  
**[Test build #93450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93450/testReport)**
 for PR 21650 at commit 
[`78f2ebf`](https://github.com/apache/spark/commit/78f2ebf3b11fe8849fe0d41300f74319ca174d42).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17174: [SPARK-19145][SQL] Timestamp to String casting is slowin...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/17174
  
@tanejagagan Can you update?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19434: [SPARK-21785][SQL]Support create table from a parquet fi...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/19434
  
@CrazyJacky Can you close this for now cuz it's not active for a long time?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21839: [SPARK-24339][SQL] Prunes the unused columns from child ...

2018-07-23 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21839
  
LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21805
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93443/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21805
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19773: [SPARK-22546][SQL] Supporting for changing column dataTy...

2018-07-23 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/19773
  
I'll resolve the conflicts today, thanks for pinging me.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21805
  
**[Test build #93443 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93443/testReport)**
 for PR 21805 at commit 
[`de3f63e`](https://github.com/apache/spark/commit/de3f63e61fb45b61bc6544b6b5487e7f276f7ac7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19745: [SPARK-2926][Core][Follow Up] Sort shuffle reader...

2018-07-23 Thread xuanyuanking
Github user xuanyuanking closed the pull request at:

https://github.com/apache/spark/pull/19745


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19745: [SPARK-2926][Core][Follow Up] Sort shuffle reader for Sp...

2018-07-23 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/19745
  
No problem.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20854: [SPARK-23712][SQL] Interpreted UnsafeRowJoiner [WIP]

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20854
  
@hvanhovell What's the status of this pr?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21839: [SPARK-24339][SQL] Prunes the unused columns from child ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21839
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21839: [SPARK-24339][SQL] Prunes the unused columns from child ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21839
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1237/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21846: [SPARK-24887][SQL]Avro: use SerializableConfigura...

2018-07-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21846


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21839: [SPARK-24339][SQL] Prunes the unused columns from child ...

2018-07-23 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/21839
  
@gatorsmile Thanks for your advice, added ut in ScriptTransformationSuite.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20699: [SPARK-23544][SQL]Remove redundancy ShuffleExchange in t...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20699
  
**[Test build #93449 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93449/testReport)**
 for PR 20699 at commit 
[`be96e39`](https://github.com/apache/spark/commit/be96e390c87ecf1550a4297a92a68497caaedca4).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Pyt...

2018-07-23 Thread icexelloss
Github user icexelloss commented on a diff in the pull request:

https://github.com/apache/spark/pull/21650#discussion_r204447950
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -5060,6 +5049,144 @@ def test_type_annotation(self):
 df = self.spark.range(1).select(pandas_udf(f=_locals['noop'], 
returnType='bigint')('id'))
 self.assertEqual(df.first()[0], 0)
 
+def test_mixed_udf(self):
+import pandas as pd
+from pyspark.sql.functions import udf, pandas_udf
+
+df = self.spark.range(0, 1).toDF('v')
+
+# Test mixture of multiple UDFs and Pandas UDFs
+
+@udf('int')
+def f1(x):
+assert type(x) == int
+return x + 1
+
+@pandas_udf('int')
+def f2(x):
+assert type(x) == pd.Series
+return x + 10
+
+@udf('int')
+def f3(x):
+assert type(x) == int
+return x + 100
+
+@pandas_udf('int')
+def f4(x):
+assert type(x) == pd.Series
+return x + 1000
+
+# Test mixed udfs in a single projection
+df1 = df.withColumn('f1', f1(df['v']))
+df1 = df1.withColumn('f2', f2(df1['v']))
+df1 = df1.withColumn('f3', f3(df1['v']))
+df1 = df1.withColumn('f4', f4(df1['v']))
+df1 = df1.withColumn('f2_f1', f2(df1['f1']))
+df1 = df1.withColumn('f3_f1', f3(df1['f1']))
+df1 = df1.withColumn('f4_f1', f4(df1['f1']))
+df1 = df1.withColumn('f3_f2', f3(df1['f2']))
+df1 = df1.withColumn('f4_f2', f4(df1['f2']))
+df1 = df1.withColumn('f4_f3', f4(df1['f3']))
+df1 = df1.withColumn('f3_f2_f1', f3(df1['f2_f1']))
+df1 = df1.withColumn('f4_f2_f1', f4(df1['f2_f1']))
+df1 = df1.withColumn('f4_f3_f1', f4(df1['f3_f1']))
+df1 = df1.withColumn('f4_f3_f2', f4(df1['f3_f2']))
+df1 = df1.withColumn('f4_f3_f2_f1', f4(df1['f3_f2_f1']))
+
+# Test mixed udfs in a single expression
+df2 = df.withColumn('f1', f1(df['v']))
+df2 = df2.withColumn('f2', f2(df['v']))
+df2 = df2.withColumn('f3', f3(df['v']))
+df2 = df2.withColumn('f4', f4(df['v']))
+df2 = df2.withColumn('f2_f1', f2(f1(df['v'])))
+df2 = df2.withColumn('f3_f1', f3(f1(df['v'])))
+df2 = df2.withColumn('f4_f1', f4(f1(df['v'])))
+df2 = df2.withColumn('f3_f2', f3(f2(df['v'])))
+df2 = df2.withColumn('f4_f2', f4(f2(df['v'])))
+df2 = df2.withColumn('f4_f3', f4(f3(df['v'])))
+df2 = df2.withColumn('f3_f2_f1', f3(f2(f1(df['v']))))
+df2 = df2.withColumn('f4_f2_f1', f4(f2(f1(df['v']))))
+df2 = df2.withColumn('f4_f3_f1', f4(f3(f1(df['v']))))
+df2 = df2.withColumn('f4_f3_f2', f4(f3(f2(df['v']))))
+df2 = df2.withColumn('f4_f3_f2_f1', f4(f3(f2(f1(df['v'])))))
+
+# expected result
+df3 = df.withColumn('f1', df['v'] + 1)
+df3 = df3.withColumn('f2', df['v'] + 10)
+df3 = df3.withColumn('f3', df['v'] + 100)
+df3 = df3.withColumn('f4', df['v'] + 1000)
+df3 = df3.withColumn('f2_f1', df['v'] + 11)
+df3 = df3.withColumn('f3_f1', df['v'] + 101)
+df3 = df3.withColumn('f4_f1', df['v'] + 1001)
+df3 = df3.withColumn('f3_f2', df['v'] + 110)
+df3 = df3.withColumn('f4_f2', df['v'] + 1010)
+df3 = df3.withColumn('f4_f3', df['v'] + 1100)
+df3 = df3.withColumn('f3_f2_f1', df['v'] + 111)
+df3 = df3.withColumn('f4_f2_f1', df['v'] + 1011)
+df3 = df3.withColumn('f4_f3_f1', df['v'] + 1101)
+df3 = df3.withColumn('f4_f3_f2', df['v'] + 1110)
+df3 = df3.withColumn('f4_f3_f2_f1', df['v'] + 1111)
+
+self.assertEquals(df3.collect(), df1.collect())
+self.assertEquals(df3.collect(), df2.collect())
+
+def test_mixed_udf_and_sql(self):
+import pandas as pd
+from pyspark.sql.functions import udf, pandas_udf
+
+df = self.spark.range(0, 1).toDF('v')
+
+# Test mixture of UDFs, Pandas UDFs and SQL expression.
+
+@udf('int')
+def f1(x):
+assert type(x) == int
+return x + 1
+
+def f2(x):
+return x + 10
+
+@pandas_udf('int')
+def f3(x):
+assert type(x) == pd.Series
+return x + 100
+
+df1 = df.withColumn('f1', f1(df['v']))
+df1 = df1.withColumn('f2', f2(df['v']))
+df1 = df1.withColumn('f3', f3(df['v']))
+df1 = df1.withColumn('f1_f2', f1(f2(df['v'])))
+df1 = df1.wi

[GitHub] spark pull request #21839: [SPARK-24339][SQL] Prunes the unused columns from...

2018-07-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/21839#discussion_r204447671
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -450,13 +450,16 @@ object ColumnPruning extends Rule[LogicalPlan] {
 case d @ DeserializeToObject(_, _, child) if (child.outputSet -- 
d.references).nonEmpty =>
   d.copy(child = prunedChild(child, d.references))
 
-// Prunes the unused columns from child of Aggregate/Expand/Generate
+// Prunes the unused columns from child of 
Aggregate/Expand/Generate/ScriptTransformation
 case a @ Aggregate(_, _, child) if (child.outputSet -- 
a.references).nonEmpty =>
   a.copy(child = prunedChild(child, a.references))
 case f @ FlatMapGroupsInPandas(_, _, _, child) if (child.outputSet -- 
f.references).nonEmpty =>
   f.copy(child = prunedChild(child, f.references))
 case e @ Expand(_, _, child) if (child.outputSet -- 
e.references).nonEmpty =>
   e.copy(child = prunedChild(child, e.references))
+case s @ ScriptTransformation(_, _, _, child, _)
+  if (child.outputSet -- s.references).nonEmpty =>
--- End diff --

Thanks, fix in 2cf131f.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21839: [SPARK-24339][SQL] Prunes the unused columns from child ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21839
  
**[Test build #93448 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93448/testReport)**
 for PR 21839 at commit 
[`2cf131f`](https://github.com/apache/spark/commit/2cf131fe1b6af368a10964ddb6067b1b52898420).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19773: [SPARK-22546][SQL] Supporting for changing column dataTy...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/19773
  
@xuanyuanking  Any update?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20699: [SPARK-23544][SQL]Remove redundancy ShuffleExchange in t...

2018-07-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20699
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21764: [SPARK-24802][SQL] Add a new config for Optimizat...

2018-07-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21764


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19745: [SPARK-2926][Core][Follow Up] Sort shuffle reader for Sp...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/19745
  
@xuanyuanking Can you close this for now because it's not active for a long 
time.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21764: [SPARK-24802][SQL] Add a new config for Optimization Rul...

2018-07-23 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21764
  
Thanks! Merged to master.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15970: [SPARK-18134][SQL] Comparable MapTypes [POC]

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/15970
  
@hvanhovell Do we still need to keep this PR open? Either way, we would need to rework 
it based on this PR. If so, can you close this for now?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21821: [SPARK-24867] [SQL] Add AnalysisBarrier to DataFrameWrit...

2018-07-23 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21821
  
@hvanhovell The question is whether `HandleNullInputsForUDF` is the only 
rule that caused the issue. If not, we still need to add an AnalysisBarrier. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18697: [SPARK-16683][SQL] Repeated joins to same table can leak...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/18697
  
@aray Can you close this for now because it's not active for a long time? 
(I'm not sure the current master still has this issue..., so you should check 
it first)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15334: [SPARK-10367][SQL] Support Parquet logical type INTERVAL

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/15334
  
Oh, I noticed the JIRA ticket has already been resolved as "Later", so can you 
close this? @dilipbiswal 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15334: [SPARK-10367][SQL] Support Parquet logical type INTERVAL

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/15334
  
IIUC we have no plan to expose Interval types now, so can we close this for 
now? cc: @gatorsmile 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18954: [SPARK-17654] [SQL] Enable populating hive bucketed tabl...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/18954
  
@tejasapatil Can you close this for now because it's not active for a long 
time.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15071: [SPARK-17517][SQL]Improve generated Code for BroadcastHa...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/15071
  
@yaooqinn Can you close this because it hasn't been active for a long time?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
> Does it make sense to release byteChannel at deallocate()?

You could, just to let GC kick in a *bit* earlier, but I don't think it's 
going to make a big difference.  (Netty's ByteBufs must be released, since they may be 
owned by Netty's internal pools and never gc'ed.)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21805
  
LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function

2018-07-23 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21102
  
I want to hear others' opinions about the ordering of the result.
cc @ueshin


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
> I like it, but this will still create the byte channel right? is there a 
way to reuse it?

we could create a pool, though management becomes a bit more complex.  
would you ever shrink the pool, or would it always stay the same size?  I guess 
it shouldn't grow more than 1 byte array per io thread.

I'd rather get this simple fix in first before doing anything more 
complicated ...
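
For illustration only, a minimal sketch of the "one byte array per IO thread" reuse idea mentioned above (a grow-only, thread-local buffer); this is not code from the PR:

```scala
import java.nio.ByteBuffer

// Each IO thread keeps at most one reusable buffer, which only grows when a
// larger request arrives; nothing is ever shrunk or returned to a shared pool.
private val reusableBuffer = new ThreadLocal[ByteBuffer] {
  override def initialValue(): ByteBuffer = ByteBuffer.allocate(0)
}

def bufferOfAtLeast(size: Int): ByteBuffer = {
  val current = reusableBuffer.get()
  if (current.capacity() >= size) {
    current.clear()
    current
  } else {
    val grown = ByteBuffer.allocate(size)
    reusableBuffer.set(grown)
    grown
  }
}
```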


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21828: Update regression.py

2018-07-23 Thread woodthom2
Github user woodthom2 commented on the issue:

https://github.com/apache/spark/pull/21828
  
OK thank you. I will close


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21828: Update regression.py

2018-07-23 Thread woodthom2
Github user woodthom2 closed the pull request at:

https://github.com/apache/spark/pull/21828


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21650: [SPARK-24624][SQL][PYTHON] Support mixture of Pyt...

2018-07-23 Thread icexelloss
Github user icexelloss commented on a diff in the pull request:

https://github.com/apache/spark/pull/21650#discussion_r204429892
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala
 ---
@@ -94,36 +95,59 @@ object ExtractPythonUDFFromAggregate extends 
Rule[LogicalPlan] {
  */
 object ExtractPythonUDFs extends Rule[SparkPlan] with PredicateHelper {
 
-  private def hasPythonUDF(e: Expression): Boolean = {
+  private def hasScalarPythonUDF(e: Expression): Boolean = {
 e.find(PythonUDF.isScalarPythonUDF).isDefined
   }
 
-  private def canEvaluateInPython(e: PythonUDF): Boolean = {
-e.children match {
-  // single PythonUDF child could be chained and evaluated in Python
-  case Seq(u: PythonUDF) => canEvaluateInPython(u)
-  // Python UDF can't be evaluated directly in JVM
-  case children => !children.exists(hasPythonUDF)
+  private def canEvaluateInPython(e: PythonUDF, evalType: Int): Boolean = {
+if (e.evalType != evalType) {
+  false
+} else {
+  e.children match {
+// single PythonUDF child could be chained and evaluated in Python
+case Seq(u: PythonUDF) => canEvaluateInPython(u, evalType)
+// Python UDF can't be evaluated directly in JVM
+case children => !children.exists(hasScalarPythonUDF)
+  }
 }
   }
 
-  private def collectEvaluatableUDF(expr: Expression): Seq[PythonUDF] = 
expr match {
-case udf: PythonUDF if PythonUDF.isScalarPythonUDF(udf) && 
canEvaluateInPython(udf) => Seq(udf)
-case e => e.children.flatMap(collectEvaluatableUDF)
+  private def collectEvaluableUDF(expr: Expression, evalType: Int): 
Seq[PythonUDF] = expr match {
+case udf: PythonUDF if PythonUDF.isScalarPythonUDF(udf) && 
canEvaluateInPython(udf, evalType) =>
+  Seq(udf)
+case e => e.children.flatMap(collectEvaluableUDF(_, evalType))
+  }
+
+  /**
+   * Collect evaluable UDFs from the current node.
+   *
+   * This function collects Python UDFs or Scalar Python UDFs from 
expressions of the input node,
+   * and returns a list of UDFs of the same eval type.
+   *
+   * If expressions contain both UDFs eval types, this function will only 
return Python UDFs.
+   *
+   * The caller should call this function multiple times until all 
evaluable UDFs are collected.
+   */
+  private def collectEvaluableUDFs(plan: SparkPlan): Seq[PythonUDF] = {
+val pythonUDFs =
+  plan.expressions.flatMap(collectEvaluableUDF(_, 
PythonEvalType.SQL_BATCHED_UDF))
+
+if (pythonUDFs.isEmpty) {
+  plan.expressions.flatMap(collectEvaluableUDF(_, 
PythonEvalType.SQL_SCALAR_PANDAS_UDF))
+} else {
+  pythonUDFs
--- End diff --

What you said makes sense, and that was actually my first attempt, but it ended up 
being pretty complicated. The issue is that it is hard to find the UDFs in a single traversal of 
the expression tree, because we need to pass the evalType down to every subtree, and 
the result of one subtree can affect the result of another 
(i.e., if we find one type of UDF in one subtree, we need to pass that type to 
all other subtrees because they must agree on the evalType); this makes the code 
more complicated...

Another way is to do two traversals: in the first traversal we look 
for the eval type, and in the second traversal we collect the UDFs of that eval type 
(sketched below). But this isn't much different from what I have now in terms of efficiency, and I 
find the current logic simpler and less likely to have bugs. I actually 
tried these approaches and found the current way the easiest to implement 
and the least likely to have bugs.

WDYT?
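
For reference, a rough sketch of the two-traversal alternative described above, reusing the `collectEvaluableUDF` helper from the diff; this is only an illustration of the idea, not the code in this PR:

```scala
// First traversal: decide which eval type to extract, preferring plain Python
// UDFs when both eval types are present (mirroring the current preference).
private def chosenEvalType(plan: SparkPlan): Option[Int] = {
  val evalTypes = plan.expressions.flatMap(_.collect {
    case udf: PythonUDF if PythonUDF.isScalarPythonUDF(udf) => udf.evalType
  })
  if (evalTypes.contains(PythonEvalType.SQL_BATCHED_UDF)) {
    Some(PythonEvalType.SQL_BATCHED_UDF)
  } else {
    evalTypes.headOption
  }
}

// Second traversal: collect only the UDFs of the chosen eval type.
private def collectEvaluableUDFsTwoPass(plan: SparkPlan): Seq[PythonUDF] = {
  chosenEvalType(plan).toSeq.flatMap { evalType =>
    plan.expressions.flatMap(collectEvaluableUDF(_, evalType))
  }
}
```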



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21635: [SPARK-24594][YARN] Introducing metrics for YARN

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21635
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93439/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21635: [SPARK-24594][YARN] Introducing metrics for YARN

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21635
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21635: [SPARK-24594][YARN] Introducing metrics for YARN

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21635
  
**[Test build #93439 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93439/testReport)**
 for PR 21635 at commit 
[`0b86788`](https://github.com/apache/spark/commit/0b86788e7ec7b367c779cb5517f9dd294f99dd4b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21811
  
**[Test build #4221 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4221/testReport)**
 for PR 21811 at commit 
[`8b46534`](https://github.com/apache/spark/commit/8b465341314b1cbbef726950d589d5fb49db341b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21653: [SPARK-13343] speculative tasks that didn't commit shoul...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21653
  
**[Test build #93447 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93447/testReport)**
 for PR 21653 at commit 
[`b6585da`](https://github.com/apache/spark/commit/b6585da0f137d3d3675925368c4668c884de900c).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21653: [SPARK-13343] speculative tasks that didn't commit shoul...

2018-07-23 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/21653
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21758
  
**[Test build #93446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93446/testReport)**
 for PR 21758 at commit 
[`c16a47f`](https://github.com/apache/spark/commit/c16a47f0d15998133b9d61d8df5310f1f66b11b0).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1236/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21758
  
**[Test build #93445 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93445/testReport)**
 for PR 21758 at commit 
[`9ae56d1`](https://github.com/apache/spark/commit/9ae56d12b580f0a3cecb90dfe275d067e8f3a7f3).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21758: [SPARK-24795][CORE] Implement barrier execution mode

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21758
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1235/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21758#discussion_r204394561
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
---
@@ -1411,6 +1420,76 @@ class DAGScheduler(
   }
 }
 
+  case failure: TaskFailedReason if task.isBarrier =>
+// Also handle the task failed reasons here.
+failure match {
+  case Resubmitted =>
+handleResubmittedFailure(task, stage)
+
+  case _ => // Do nothing.
+}
+
+// Always fail the current stage and retry all the tasks when a 
barrier task fail.
+val failedStage = stageIdToStage(task.stageId)
+logInfo(s"Marking $failedStage (${failedStage.name}) as failed due 
to a barrier task " +
+  "failed.")
+val message = s"Stage failed because barrier task $task finished 
unsuccessfully. " +
+  s"${failure.toErrorString}"
--- End diff --

sure, it can be just `failure.toErrorString`, no need to wrap it into 
`s"..."`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21758#discussion_r204391134
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
---
@@ -1411,6 +1420,76 @@ class DAGScheduler(
   }
 }
 
+  case failure: TaskFailedReason if task.isBarrier =>
+// Also handle the task failed reasons here.
+failure match {
+  case Resubmitted =>
+handleResubmittedFailure(task, stage)
+
+  case _ => // Do nothing.
+}
+
+// Always fail the current stage and retry all the tasks when a 
barrier task fail.
+val failedStage = stageIdToStage(task.stageId)
+logInfo(s"Marking $failedStage (${failedStage.name}) as failed due 
to a barrier task " +
+  "failed.")
+val message = s"Stage failed because barrier task $task finished 
unsuccessfully. " +
+  s"${failure.toErrorString}"
--- End diff --

Why is it not needed? Could you expand more on this?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...

2018-07-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21758#discussion_r204390597
  
--- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.spark.annotation.{Experimental, Since}
+
+/** A [[TaskContext]] with extra info and tooling for a barrier stage. */
+trait BarrierTaskContext extends TaskContext {
+
+  /**
+   * :: Experimental ::
+   * Sets a global barrier and waits until all tasks in this stage hit 
this barrier. Similar to
+   * MPI_Barrier function in MPI, the barrier() function call blocks until 
all tasks in the same
+   * stage have reached this routine.
+   */
+  @Experimental
+  @Since("2.4.0")
+  def barrier(): Unit
+
+  /**
+   * :: Experimental ::
+   * Returns the all task infos in this barrier stage, the task infos are 
ordered by partitionId.
--- End diff --

The major reason is that tasks within the same barrier stage may need 
to communicate with each other; we order the task infos by partitionId so that a 
task can find its peer tasks by index.
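
As a hedged usage sketch of that ordering (the `BarrierTaskContext.get()`, `getTaskInfos()`, and `address` names are assumptions about this WIP API, added only for illustration):

```scala
// Inside a barrier task: the infos are ordered by partitionId, so a task can
// index into the list with its own partitionId and treat the rest as peers.
val context = BarrierTaskContext.get()    // assumed accessor, mirroring TaskContext.get()
val infos = context.getTaskInfos()        // assumed method returning infos ordered by partitionId
val myInfo = infos(context.partitionId())
val peerAddresses = infos.map(_.address)  // e.g. addresses for peer-to-peer communication
```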


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16374: [SPARK-18925][STREAMING] Reduce memory usage of m...

2018-07-23 Thread vpchelko
Github user vpchelko closed the pull request at:

https://github.com/apache/spark/pull/16374


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21805
  
**[Test build #93444 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93444/testReport)**
 for PR 21805 at commit 
[`2a21c80`](https://github.com/apache/spark/commit/2a21c80f0b751277e8ac975279474378c765af6c).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21789
  
**[Test build #93440 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93440/testReport)**
 for PR 21789 at commit 
[`05cbdc8`](https://github.com/apache/spark/commit/05cbdc8cf25b50b832febe73e0963885431679a6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21789
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93440/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21789
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21789
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21789
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93442/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21789
  
**[Test build #93442 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93442/testReport)**
 for PR 21789 at commit 
[`05cbdc8`](https://github.com/apache/spark/commit/05cbdc8cf25b50b832febe73e0963885431679a6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21846: [SPARK-24887][SQL]Avro: use SerializableConfiguration in...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21846
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21846: [SPARK-24887][SQL]Avro: use SerializableConfiguration in...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21846
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93441/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21846: [SPARK-24887][SQL]Avro: use SerializableConfiguration in...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21846
  
**[Test build #93441 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93441/testReport)**
 for PR 21846 at commit 
[`a8dd96d`](https://github.com/apache/spark/commit/a8dd96df3eef66bab29f83b141a67386676603fa).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-23 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r204379093
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
 ---
@@ -207,4 +207,7 @@ case class InMemoryRelation(
   }
 
   override protected def otherCopyArgs: Seq[AnyRef] = Seq(statsOfPlanToCache)
+
+  override def simpleString: String = s"InMemoryRelation(${output}, ${cacheBuilder.storageLevel})"
+
--- End diff --

nit: remove the blank line


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-23 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r204378903
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetCacheSuite.scala ---
@@ -206,4 +206,20 @@ class DatasetCacheSuite extends QueryTest with 
SharedSQLContext with TimeLimits
 // first time use, load cache
 checkDataset(df5, Row(10))
   }
+
+  test("SPARK-24850 InMemoryRelation string representation does not 
include cached plan") {
+val dummyQueryExecution = spark.range(0, 1).toDF().queryExecution
+val inMemoryRelation = InMemoryRelation(
+  true,
+  1000,
+  StorageLevel.MEMORY_ONLY,
+  dummyQueryExecution.sparkPlan,
+  Some("test-relation"),
+  dummyQueryExecution.logical)
+
+
assert(!inMemoryRelation.simpleString.contains(dummyQueryExecution.sparkPlan.toString))
+assert(inMemoryRelation.simpleString ==
+  s"InMemoryRelation(${inMemoryRelation.output},"
+  + " StorageLevel(memory, deserialized, 1 replicas))")
+  }
--- End diff --

How about just comparing explain results?
```
val df = Seq((1, 2)).toDF("a", "b").cache
val outputStream = new java.io.ByteArrayOutputStream()
Console.withOut(outputStream) {
  df.explain(false)
}
assert(outputStream.toString.replaceAll("#\\d+", "#x").contains(
  "InMemoryRelation [a#x, b#x], StorageLevel(disk, memory, deserialized, 1 replicas)"))
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21805: [SPARK-24850][SQL] fix str representation of Cach...

2018-07-23 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21805#discussion_r204378696
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
 ---
@@ -207,4 +207,7 @@ case class InMemoryRelation(
   }
 
   override protected def otherCopyArgs: Seq[AnyRef] = Seq(statsOfPlanToCache)
+
+  override def simpleString: String = s"InMemoryRelation(${output}, ${cacheBuilder.storageLevel})"
--- End diff --

How about `s"InMemoryRelation [${Utils.truncatedString(output, ", ")}], 
${cacheBuilder.storageLevel}"`?
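
For illustration, with truncation the override might look roughly like the 
sketch below (assuming `Utils.truncatedString(seq, sep)` from 
`org.apache.spark.util.Utils` is in scope here; this is just a sketch of the 
suggestion, not the PR's final code):

```scala
// Sketch: truncate a potentially long output attribute list in simpleString.
override def simpleString: String =
  s"InMemoryRelation [${Utils.truncatedString(output, ", ")}], ${cacheBuilder.storageLevel}"
```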


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21805: [SPARK-24850][SQL] fix str representation of CachedRDDBu...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21805
  
**[Test build #93443 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93443/testReport)**
 for PR 21805 at commit 
[`de3f63e`](https://github.com/apache/spark/commit/de3f63e61fb45b61bc6544b6b5487e7f276f7ac7).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21846: [SPARK-24887][SQL]Avro: use SerializableConfiguration in...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21846
  
**[Test build #93441 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93441/testReport)**
 for PR 21846 at commit 
[`a8dd96d`](https://github.com/apache/spark/commit/a8dd96df3eef66bab29f83b141a67386676603fa).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21846: [SPARK-24887][SQL]Avro: use SerializableConfiguration in...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21846
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21789
  
**[Test build #93440 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93440/testReport)**
 for PR 21789 at commit 
[`05cbdc8`](https://github.com/apache/spark/commit/05cbdc8cf25b50b832febe73e0963885431679a6).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21846: [SPARK-24887][SQL]Avro: use SerializableConfiguration in...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21846
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1234/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21789: [SPARK-24829][STS]In Spark Thrift Server, CAST AS FLOAT ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21789
  
**[Test build #93442 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93442/testReport)**
 for PR 21789 at commit 
[`05cbdc8`](https://github.com/apache/spark/commit/05cbdc8cf25b50b832febe73e0963885431679a6).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


