[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23124
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5434/
Test PASSed.


---




[GitHub] spark pull request #23025: [SPARK-26024][SQL]: Update documentation for repa...

2018-11-27 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/23025#discussion_r236970732
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -767,6 +767,14 @@ setMethod("repartition",
 #'  using \code{spark.sql.shuffle.partitions} as number of partitions.}
 #'}
 #'
+#' At least one partition-by expression must be specified.
--- End diff --

L761 is significant too, but it is correct.

Essentially:
1. The first line of the blob is the title (L760).
2. The text after the first empty line is the description (L762).
3. The text after the next empty line is the "detail note", which is stashed all the way at the bottom of the doc page.

So generally you want the "important" part of the description on top and not in the "detail" section, because it is easily missed there.

This is the most common pattern in this code base. There's another, where multiple functions are documented together as a group, e.g. the collection SQL functions (in functions.R). Other, finer control is possible as well but is not used today in this code base.

Similarly, L829 is good; L831 is a bit fuzzy. I'd personally prefer it without L831, to keep the whole text in the description section of the doc.



---




[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23124
  
**[Test build #99357 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99357/testReport)** for PR 23124 at commit [`6dff654`](https://github.com/apache/spark/commit/6dff6545f272e0d5117ac17fdc27b686573c5626).


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread stczwd
Github user stczwd commented on the issue:

https://github.com/apache/spark/pull/22575
  
I have sent an email to Ryan Blue.

> > > Can you send a mail to Ryan Blue to add this SPIP topic to tomorrow's meeting? The meeting will be held tomorrow at 05:00 pm PST. If you confirm, then we can also attend the meeting.
> > 
> > 
> > I have sent an email to Ryan Blue to attend this meeting.
> 
> I think you should also ask him to add your SPIP topic to tomorrow's discussion. The agenda has to be set beforehand.

Tomorrow's discussion will mainly focus on the DataSource V2 API; I don't think they will spend time discussing the SQL API. However, we can mention it while discussing the Catalog API.


---




[GitHub] spark pull request #21957: [SPARK-24994][SQL] When the data type of the fiel...

2018-11-27 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21957#discussion_r236965962
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala 
---
@@ -269,7 +269,8 @@ case class FileSourceScanExec(
   }
 
   @transient
-  private val pushedDownFilters = dataFilters.flatMap(DataSourceStrategy.translateFilter)
+  private val pushedDownFilters = dataFilters.flatMap(DataSourceStrategy.
+    translateFilter(_, !relation.fileFormat.isInstanceOf[ParquetSource]))
--- End diff --

Thanks.
Yeah, this is not a good solution, but I can't find a better way to solve this problem right now.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5433/
Test PASSed.


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99348/
Test PASSed.


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22163
  
**[Test build #99348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99348/testReport)** for PR 22163 at commit [`b7ff915`](https://github.com/apache/spark/commit/b7ff9152ef17762ce370c7ec7c8be772a73f926e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23128
  
**[Test build #99355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99355/testReport)** for PR 23128 at commit [`d12ea31`](https://github.com/apache/spark/commit/d12ea311e58e7925f21d343e5de13bfec6737549).


---




[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23153
  
**[Test build #99356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99356/testReport)** for PR 23153 at commit [`7b985d8`](https://github.com/apache/spark/commit/7b985d84cb0fd853d40610b2380313389791298e).


---




[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23052
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5432/
Test PASSed.


---




[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23052
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23153
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5431/
Test PASSed.


---




[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23153
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23124#discussion_r236962499
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala
 ---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.types._
+
+/**
+ * A builder of [[ArrayBasedMapData]], which fails if a null map key is detected, and removes
+ * duplicated map keys w.r.t. the last wins policy.
+ */
+class ArrayBasedMapBuilder(keyType: DataType, valueType: DataType) extends Serializable {
+  assert(!keyType.existsRecursively(_.isInstanceOf[MapType]), "key of map cannot be/contain map")
+  assert(keyType != NullType, "map key cannot be null type.")
+
+  private lazy val keyToIndex = keyType match {
+    case _: AtomicType | _: CalendarIntervalType => mutable.HashMap.empty[Any, Int]
--- End diff --

I think for performance-critical code paths we should prefer Java collections. Thanks for pointing it out!


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/23128
  
retest this please


---




[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23052
  
**[Test build #99354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99354/testReport)** for PR 23052 at commit [`76e1466`](https://github.com/apache/spark/commit/76e1466a39aa2a40d999791bb9d3b09628921e85).


---




[GitHub] spark issue #23052: [SPARK-26081][SQL] Prevent empty files for empty partiti...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23052
  
retest this please


---




[GitHub] spark issue #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for parsing...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22590
  
**[Test build #99353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99353/testReport)** for PR 22590 at commit [`9e3c4bd`](https://github.com/apache/spark/commit/9e3c4bda06011cf6b4d21321d8e7336495839325).


---




[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23124#discussion_r236955791
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala
 ---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.types._
+
+/**
+ * A builder of [[ArrayBasedMapData]], which fails if a null map key is detected, and removes
+ * duplicated map keys w.r.t. the last wins policy.
+ */
+class ArrayBasedMapBuilder(keyType: DataType, valueType: DataType) extends Serializable {
+  assert(!keyType.existsRecursively(_.isInstanceOf[MapType]), "key of map cannot be/contain map")
+  assert(keyType != NullType, "map key cannot be null type.")
+
+  private lazy val keyToIndex = keyType match {
+    case _: AtomicType | _: CalendarIntervalType => mutable.HashMap.empty[Any, Int]
--- End diff --

We need to exempt `BinaryType` from `AtomicType` here.
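
A minimal sketch of why `BinaryType` keys probably need separate handling, assuming binary values are held as JVM `byte[]` arrays (an assumption here, not something the diff states). Arrays inherit identity-based `equals`/`hashCode`, so a plain hash map treats two arrays with the same bytes as different keys. The class and method names below are hypothetical, and wrapping with `ByteBuffer` is only one way to get content-based comparison; it is not necessarily what the PR ends up doing.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class BinaryKeyDemo {
    // Counts distinct keys when raw byte[] arrays are used directly as map keys.
    // byte[] uses identity-based hashCode/equals, so equal contents do not collide.
    public static int distinctRawKeys(byte[][] keys) {
        Map<byte[], Integer> index = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            index.put(keys[i], i);
        }
        return index.size();
    }

    // Counts distinct keys when each array is wrapped in a ByteBuffer,
    // whose equals/hashCode are based on the remaining bytes.
    public static int distinctWrappedKeys(byte[][] keys) {
        Map<ByteBuffer, Integer> index = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            index.put(ByteBuffer.wrap(keys[i]), i);
        }
        return index.size();
    }

    public static void main(String[] args) {
        // Two arrays with equal content but distinct identity.
        byte[][] keys = { {1, 2}, {1, 2} };
        System.out.println("raw keys: " + distinctRawKeys(keys));
        System.out.println("wrapped keys: " + distinctWrappedKeys(keys));
    }
}
```

With raw arrays the duplicate goes undetected (two entries); with content-based wrappers it collapses to one entry.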


---




[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/23124#discussion_r236958252
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilderSuite.scala
 ---
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.{UnsafeArrayData, UnsafeRow}
+import org.apache.spark.sql.types.{ArrayType, IntegerType, StructType}
+import org.apache.spark.unsafe.Platform
+
+class ArrayBasedMapBuilderSuite extends SparkFunSuite {
+
+  test("basic") {
+    val builder = new ArrayBasedMapBuilder(IntegerType, IntegerType)
+    builder.put(1, 1)
+    builder.put(InternalRow(2, 2))
+    builder.putAll(new GenericArrayData(Seq(3)), new GenericArrayData(Seq(3)))
+    val map = builder.build()
+    assert(map.numElements() == 3)
+    assert(ArrayBasedMapData.toScalaMap(map) == Map(1 -> 1, 2 -> 2, 3 -> 3))
+  }
+
+  test("fail with null key") {
+    val builder = new ArrayBasedMapBuilder(IntegerType, IntegerType)
+    builder.put(1, null) // null value is OK
+    val e = intercept[RuntimeException](builder.put(null, 1))
+    assert(e.getMessage.contains("Cannot use null as map key"))
+  }
+
+  test("remove duplicated keys with last wins policy") {
+    val builder = new ArrayBasedMapBuilder(IntegerType, IntegerType)
+    builder.put(1, 1)
+    builder.put(2, 2)
+    builder.put(1, 2)
+    val map = builder.build()
+    assert(map.numElements() == 2)
+    assert(ArrayBasedMapData.toScalaMap(map) == Map(1 -> 2, 2 -> 2))
+  }
+
+  test("struct type key") {
+    val builder = new ArrayBasedMapBuilder(new StructType().add("i", "int"), IntegerType)
+    builder.put(InternalRow(1), 1)
+    builder.put(InternalRow(2), 2)
+    val unsafeRow = {
+      val row = new UnsafeRow(1)
+      val bytes = new Array[Byte](16)
+      row.pointTo(bytes, 16)
+      row.setInt(0, 1)
+      row
+    }
+    builder.put(unsafeRow, 3)
+    val map = builder.build()
+    assert(map.numElements() == 2)
+    assert(ArrayBasedMapData.toScalaMap(map) == Map(InternalRow(1) -> 3, InternalRow(2) -> 2))
+  }
+
+  test("array type key") {
+    val builder = new ArrayBasedMapBuilder(ArrayType(IntegerType), IntegerType)
+    builder.put(new GenericArrayData(Seq(1, 1)), 1)
+    builder.put(new GenericArrayData(Seq(2, 2)), 2)
+    val unsafeArray = {
+      val array = new UnsafeArrayData()
+      val bytes = new Array[Byte](24)
+      Platform.putLong(bytes, Platform.BYTE_ARRAY_OFFSET, 2)
+      array.pointTo(bytes, Platform.BYTE_ARRAY_OFFSET, 24)
+      array.setInt(0, 1)
+      array.setInt(1, 1)
+      array
+    }
+    builder.put(unsafeArray, 3)
+    val map = builder.build()
+    assert(map.numElements() == 2)
+    assert(ArrayBasedMapData.toScalaMap(map) ==
+      Map(new GenericArrayData(Seq(1, 1)) -> 3, new GenericArrayData(Seq(2, 2)) -> 2))
+  }
--- End diff --

Should we have a binary type key test as well?


---




[GitHub] spark issue #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for parsing...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22590
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5430/
Test PASSed.


---




[GitHub] spark issue #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for parsing...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22590
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22590: [SPARK-25574][SQL]Add an option `keepQuotes` for parsing...

2018-11-27 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/22590
  
@HyukjinKwon I think it is not important, but our customers need this feature.
Yeah, it would be better to find a way to set arbitrary parser settings options.


---




[GitHub] spark pull request #23086: [SPARK-25528][SQL] data source v2 API refactor (b...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23086#discussion_r236957293
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/Table.java 
---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
--- End diff --

It's unclear to me what would be the best choice:
1. move data source API to catalyst module
2. move data source related rules to SQL core module
3. define private catalog related APIs in catalyst module and implement 
them in SQL core

Can we delay the discussion until we have a PR that adds catalog support after the refactor?


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/23128
  
The Python UT failed because of a JVM crash.
retest this please.


---




[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/23124#discussion_r236952729
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala
 ---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.util
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.types._
+
+/**
+ * A builder of [[ArrayBasedMapData]], which fails if a null map key is detected, and removes
+ * duplicated map keys w.r.t. the last wins policy.
+ */
+class ArrayBasedMapBuilder(keyType: DataType, valueType: DataType) extends Serializable {
+  assert(!keyType.existsRecursively(_.isInstanceOf[MapType]), "key of map cannot be/contain map")
+  assert(keyType != NullType, "map key cannot be null type.")
+
+  private lazy val keyToIndex = keyType match {
+    case _: AtomicType | _: CalendarIntervalType => mutable.HashMap.empty[Any, Int]
--- End diff --

FYI: I had a test lying around from when I worked on map_concat. With this 
PR:

- map_concat of two small maps (20 string keys per map, no dups) for 2M 
rows is about 17% slower.
- map_concat of two big maps (500 string keys per map, no dups) for 1M rows 
is about 25% slower.

The baseline code is the same branch as the PR, but without the 4 commits.

Some cost makes sense, as we're checking for dups, but it's odd that the 
overhead grows disproportionately as the size of the maps grows.


I remember that at one time, mutable.HashMap had some performance issues (rumor has it, anyway). So as a test, I modified ArrayBasedMapBuilder.scala to use java.util.HashMap instead. After that:

- map_concat of two small maps (20 string keys per map, no dups) for 2M 
rows is about 12% slower.
- map_concat of two big maps (500 string keys per map, no dups) for 1M rows 
is about 15% slower.

It's a little more proportionate. I don't know if switching HashMap 
implementations would have some negative consequences.

Also, my test is a dumb benchmark that uses System.currentTimeMillis to time the concatenation of simple [String,Integer] maps.
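
For context, the duplicate check being measured can be sketched roughly as follows: a hash map from key to position, consulted on every put, with later values overwriting earlier ones. This is an illustrative Java sketch built on `java.util.HashMap`, not the PR's actual code; the class `LastWinsMapBuilder` and its methods are hypothetical names, and the null-key message is borrowed from the PR's test.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LastWinsMapBuilder<K, V> {
    // Maps each key to its position in the parallel keys/values lists.
    private final Map<K, Integer> keyToIndex = new HashMap<>();
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    public void put(K key, V value) {
        if (key == null) {
            throw new RuntimeException("Cannot use null as map key");
        }
        Integer index = keyToIndex.get(key);
        if (index == null) {
            // First time this key is seen: append a new entry.
            keyToIndex.put(key, keys.size());
            keys.add(key);
            values.add(value);
        } else {
            // Duplicate key: keep the original position, overwrite the value (last wins).
            values.set(index, value);
        }
    }

    public int size() {
        return keys.size();
    }

    public List<K> keys() {
        return keys;
    }

    public List<V> values() {
        return values;
    }

    public static void main(String[] args) {
        LastWinsMapBuilder<Integer, Integer> builder = new LastWinsMapBuilder<>();
        builder.put(1, 1);
        builder.put(2, 2);
        builder.put(1, 2);
        System.out.println(builder.keys() + " -> " + builder.values());
    }
}
```

Every put costs one hash lookup on top of the append, which is consistent with some slowdown even when there are no duplicates. The sketch does not explain why the overhead would grow with map size.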





---




[GitHub] spark issue #23162: [MINOR][DOC] Correct some document description errors

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23162
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23162: [MINOR][DOC] Correct some document description errors

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23162
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5429/
Test PASSed.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99345/
Test PASSed.


---




[GitHub] spark issue #23162: [MINOR][DOC] Correct some document description errors

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23162
  
**[Test build #99352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99352/testReport)** for PR 23162 at commit [`e9aba19`](https://github.com/apache/spark/commit/e9aba19b526610f3f31fa6a5b56140f6be8dc1c1).


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23160
  
**[Test build #99345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99345/testReport)** for PR 23160 at commit [`37bcd62`](https://github.com/apache/spark/commit/37bcd6231816c7bc2b2561bff10955b822934ac6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99346/
Test PASSed.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23160
  
**[Test build #99346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99346/testReport)** for PR 23160 at commit [`bbd745a`](https://github.com/apache/spark/commit/bbd745ac0f727de5988bcc876e57d11a32eadb31).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23162: [MINOR][DOC] Correct some document description er...

2018-11-27 Thread 10110346
GitHub user 10110346 opened a pull request:

https://github.com/apache/spark/pull/23162

[MINOR][DOC] Correct some document description errors

## What changes were proposed in this pull request?

Correct some document description errors.

## How was this patch tested?
N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/10110346/spark docerror

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23162.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23162


commit e9aba19b526610f3f31fa6a5b56140f6be8dc1c1
Author: liuxian 
Date:   2018-11-28T06:06:51Z

fix




---




[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23124#discussion_r236949897
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ---
@@ -558,8 +558,11 @@ private[parquet] class ParquetRowConverter(
 
 override def getConverter(fieldIndex: Int): Converter = 
keyValueConverter
 
-override def end(): Unit =
+override def end(): Unit = {
+  // The parquet map may contains null or duplicated map keys. When it 
happens, the behavior is
+  // undefined.
--- End diff --

done


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99347/
Test FAILed.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23128
  
**[Test build #99347 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99347/testReport)**
 for PR 23128 at commit 
[`d12ea31`](https://github.com/apache/spark/commit/d12ea311e58e7925f21d343e5de13bfec6737549).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/22575
  
> > Can you send a mail to Ryan Blue asking him to add this SPIP topic to 
tomorrow's meeting? The meeting will be held tomorrow at 05:00 pm PST. If you 
confirm, then we can also attend the meeting.
> 
> I have sent an email to Ryan Blue about attending this meeting.

I think you should also ask him to add your SPIP topic to tomorrow's 
discussion. The agenda has to be set beforehand.


---




[GitHub] spark pull request #23127: [SPARK-26159] Codegen for LocalTableScanExec and ...

2018-11-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23127


---




[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23127
  
thanks, merging to master!


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread stczwd
Github user stczwd commented on the issue:

https://github.com/apache/spark/pull/22575
  
> Can you send a mail to Ryan Blue asking him to add this SPIP topic to 
tomorrow's meeting? The meeting will be held tomorrow at 05:00 pm PST. If you 
confirm, then we can also attend the meeting.

I have sent an email to Ryan Blue about attending this meeting.


---




[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23083
  
**[Test build #99351 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99351/testReport)**
 for PR 23083 at commit 
[`1723819`](https://github.com/apache/spark/commit/17238196719de1e68cbcb1eb930cb3176308e437).


---




[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23083
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23083
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5428/
Test PASSed.


---




[GitHub] spark issue #23083: [SPARK-26114][CORE] ExternalSorter's readingIterator fie...

2018-11-27 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23083
  
retest this please


---




[GitHub] spark issue #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23161
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23161
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99350/
Test PASSed.


---




[GitHub] spark issue #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23161
  
**[Test build #99350 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99350/testReport)**
 for PR 23161 at commit 
[`423711e`](https://github.com/apache/spark/commit/423711ec45883822942be309d8052cee976ef8c0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/22575
  
Can you send a mail to Ryan Blue asking him to add this SPIP topic to
tomorrow's meeting? The meeting will be held tomorrow at 05:00 pm PST. If you
confirm, then we can also attend the meeting.

On Wed, 28 Nov 2018 at 10:27 AM, stczwd  wrote:

> [image: image]
>
> I have removed the 'stream' keyword.
>
> There is a DatasourceV2 community sync meetup tomorrow which is
> coordinated by Ryan Blue; can we discuss this point?
>
> Yep, it's a good idea.



---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread stczwd
Github user stczwd commented on the issue:

https://github.com/apache/spark/pull/22575
  
> 
![image](https://user-images.githubusercontent.com/12999161/49129177-ab056680-f2f4-11e8-8f71-4695ebc045c1.png)

I have removed the 'stream' keyword.
> There is a DatasourceV2 community sync meetup tomorrow which is 
coordinated by Ryan Blue; can we discuss this point?

Yep, it's a good idea.


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/22575
  
cc @koeninger 


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/22575
  

![image](https://user-images.githubusercontent.com/12999161/49129177-ab056680-f2f4-11e8-8f71-4695ebc045c1.png)

There is a DatasourceV2 community sync meetup tomorrow which is coordinated 
by Ryan Blue; can we discuss this point?


---




[GitHub] spark issue #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23161
  
**[Test build #99350 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99350/testReport)**
 for PR 23161 at commit 
[`423711e`](https://github.com/apache/spark/commit/423711ec45883822942be309d8052cee976ef8c0).


---




[GitHub] spark issue #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23161
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5427/
Test PASSed.


---




[GitHub] spark issue #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23161
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99349/
Test FAILed.


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #23161: [SPARK-26189][R]Fix unionAll doc in SparkR

2018-11-27 Thread huaxingao
GitHub user huaxingao opened a pull request:

https://github.com/apache/spark/pull/23161

[SPARK-26189][R]Fix unionAll doc in SparkR

## What changes were proposed in this pull request?

Fix unionAll doc in SparkR

## How was this patch tested?

Manually ran test


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huaxingao/spark spark-26189

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23161


commit 423711ec45883822942be309d8052cee976ef8c0
Author: Huaxin Gao 
Date:   2018-11-28T04:22:33Z

[SPARK-26189][R]Fix unionAll doc in SparkR




---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23144
  
**[Test build #99349 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99349/testReport)**
 for PR 23144 at commit 
[`f46b6b7`](https://github.com/apache/spark/commit/f46b6b7ab82e9de39f888cd40e2b1904ae4df73a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23144
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5426/
Test PASSed.


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/23144
  
Using an optional `normalize` function argument may be OK; I will give it a try.


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/23144
  
@srowen  To adopt an optional `normalize` function argument, we may need to 
create a new class `StringParam` and add the argument into it. But this will be 
a breaking change, since existing string params are of type `Param[String]`.
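
To make the idea concrete, here is a minimal sketch in plain Python (not Spark ML code). `Param`, `StringParam` and the `normalize` argument are hypothetical stand-ins for the proposal; the real classes are written in Scala and may look different.

```python
from typing import Callable, Optional


class Param:
    """Hypothetical stand-in for an ML param that stores a value as-is."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value: Optional[str] = None

    def set(self, value: str) -> "Param":
        self.value = value
        return self


class StringParam(Param):
    """Hypothetical string param with an optional normalize function."""

    def __init__(self, name: str,
                 normalize: Optional[Callable[[str], str]] = None) -> None:
        super().__init__(name)
        # Fall back to the identity function when no normalizer is given.
        self._normalize = normalize if normalize is not None else (lambda s: s)

    def set(self, value: str) -> "StringParam":
        # Normalize once on write so every reader sees the canonical form.
        self.value = self._normalize(value)
        return self


impurity = StringParam("impurity", normalize=str.lower).set("Gini")
print(impurity.value)  # gini
```

The compatibility concern is visible even in this sketch: callers that construct or expect the plain `Param` type would have to be changed to use the new subclass.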


---




[GitHub] spark issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensi...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23144
  
**[Test build #99349 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99349/testReport)**
 for PR 23144 at commit 
[`f46b6b7`](https://github.com/apache/spark/commit/f46b6b7ab82e9de39f888cd40e2b1904ae4df73a).


---




[GitHub] spark pull request #23104: [SPARK-26138][SQL] Cross join requires push Local...

2018-11-27 Thread guoxiaolongzte
Github user guoxiaolongzte commented on a diff in the pull request:

https://github.com/apache/spark/pull/23104#discussion_r236929433
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -459,6 +459,7 @@ object LimitPushDown extends Rule[LogicalPlan] {
   val newJoin = joinType match {
 case RightOuter => join.copy(right = maybePushLocalLimit(exp, 
right))
 case LeftOuter => join.copy(left = maybePushLocalLimit(exp, left))
+case Cross => join.copy(left = maybePushLocalLimit(exp, left), 
right = maybePushLocalLimit(exp, right))
--- End diff --

There are two tables as follows:
CREATE TABLE `test1`(`id` int, `name` int);
CREATE TABLE `test2`(`id` int, `name` int);

test1 table data:
2,2
1,1

test2 table data:
2,2
3,3
4,4

Execute the SQL: select * from test1 t1 left anti join test2 t2 on 
t1.id=t2.id limit 1; The result:
1,1

But 
   if we push the limit 1 to the left side, the result is not correct: it is 
empty, because the only remaining left row (2,2) has a match.
   if we push the limit 1 to the right side, the result is not guaranteed to 
be correct either: a test2 row that would have matched may be dropped, so a 
row such as 2,2 can be returned.

So
a left anti join must not push down the limit. The same logic applies to a 
left semi join.
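
The example can be replayed with a small plain-Python simulation. This is only a sketch of the join semantics, not Spark code; `left_anti_join` is a made-up helper, and the row that a pushed-down limit keeps is an assumption, since row order is not guaranteed.

```python
from typing import List, Tuple

Row = Tuple[int, int]

test1: List[Row] = [(2, 2), (1, 1)]
test2: List[Row] = [(2, 2), (3, 3), (4, 4)]


def left_anti_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Keep the left rows whose id has no match on the right."""
    right_ids = {row[0] for row in right}
    return [row for row in left if row[0] not in right_ids]


# Reference plan: join first, then apply the limit.
expected = left_anti_join(test1, test2)[:1]

# Limit pushed to the left side: the only surviving left row (2, 2) matches.
pushed_left = left_anti_join(test1[:1], test2)[:1]

# Limit pushed to the right side, assuming the limit happens to keep (3, 3):
# (2, 2) is no longer filtered out, so it can be returned instead of (1, 1).
pushed_right = left_anti_join(test1, [test2[1]])[:1]

print(expected)      # [(1, 1)]
print(pushed_left)   # []
print(pushed_right)  # [(2, 2)]
```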



---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/22163
  
cc @kiszk  @maropu 


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22163
  
**[Test build #99348 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99348/testReport)**
 for PR 22163 at commit 
[`b7ff915`](https://github.com/apache/spark/commit/b7ff9152ef17762ce370c7ec7c8be772a73f926e).


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23128
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5424/
Test PASSed.


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5425/
Test PASSed.


---




[GitHub] spark pull request #23126: [SPARK-26158] [MLLIB] fix covariance accuracy pro...

2018-11-27 Thread KyleLi1985
Github user KyleLi1985 commented on a diff in the pull request:

https://github.com/apache/spark/pull/23126#discussion_r236927771
  
--- Diff: 
mllib/src/test/scala/org/apache/spark/mllib/linalg/distributed/RowMatrixSuite.scala
 ---
@@ -266,6 +266,16 @@ class RowMatrixSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 }
   }
 
+  test("dense vector covariance accuracy (SPARK-26158)") {
+val rdd1 = sc.parallelize(Array(10.04, 10.12, 
9.931, 9.977))
--- End diff --

handy thing


---




[GitHub] spark pull request #23126: [SPARK-26158] [MLLIB] fix covariance accuracy pro...

2018-11-27 Thread KyleLi1985
Github user KyleLi1985 commented on a diff in the pull request:

https://github.com/apache/spark/pull/23126#discussion_r236927721
  
--- Diff: mllib/src/test/java/org/apache/spark/ml/feature/JavaPCASuite.java 
---
@@ -67,7 +66,7 @@ public void testPCA() {
 JavaRDD dataRDD = jsc.parallelize(points, 2);
 
 RowMatrix mat = new RowMatrix(dataRDD.map(
-(Vector vector) -> (org.apache.spark.mllib.linalg.Vector) new 
DenseVector(vector.toArray())
+(Vector vector) -> 
org.apache.spark.mllib.linalg.Vectors.fromML(vector)
--- End diff --

Sure, as you said, if the **first item** is a sparse vector, it will follow 
the computeSparseVectorCovariance logic, and if the **first item** is a dense 
vector, it will follow the computeDenseVectorCovariance logic. The rest is 
the user's choice. 
But maybe we can add a note to the doc comment of the function 
computeCovariance to tell the user about this, like:

/**
   * Computes the covariance matrix, treating each row as an observation.
   *
   * Note:
   * When the first row is a DenseVector, we use computeDenseVectorCovariance
   * to calculate the covariance matrix; if the first row is a SparseVector,
   * we use computeSparseVectorCovariance to calculate the covariance matrix.
   *
   * @return a local dense matrix of size n x n
   *
   * @note This cannot be computed on matrices with more than 65535 columns.
   */
  @Since("1.0.0")
  def computeCovariance(): Matrix = {
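
The selection rule described in the note can be sketched in plain Python. This is a toy model, not MLlib code: `DenseVector`, `SparseVector` and `compute_covariance` are hypothetical stand-ins, and both paths share the same arithmetic here, so only the "first row decides" rule is illustrated.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class DenseVector:
    values: List[float]

    def to_list(self) -> List[float]:
        return list(self.values)


@dataclass
class SparseVector:
    size: int
    entries: Dict[int, float]

    def to_list(self) -> List[float]:
        # Missing indices are implicit zeros.
        return [self.entries.get(i, 0.0) for i in range(self.size)]


Vector = Union[DenseVector, SparseVector]


def covariance(rows: List[List[float]]) -> List[List[float]]:
    """Sample covariance, treating each row as an observation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


def compute_covariance(rows: List[Vector]) -> Tuple[str, List[List[float]]]:
    """Pick the code path by looking only at the type of the first row."""
    path = "dense" if isinstance(rows[0], DenseVector) else "sparse"
    return path, covariance([r.to_list() for r in rows])


mixed = [DenseVector([1.0, 2.0]), SparseVector(2, {0: 3.0, 1: 6.0})]
print(compute_covariance(mixed)[0])        # dense
print(compute_covariance(mixed[::-1])[0])  # sparse
```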


---




[GitHub] spark issue #23128: [SPARK-26142][SQL] Implement shuffle read metrics in SQL

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23128
  
**[Test build #99347 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99347/testReport)**
 for PR 23128 at commit 
[`d12ea31`](https://github.com/apache/spark/commit/d12ea311e58e7925f21d343e5de13bfec6737549).


---




[GitHub] spark pull request #23128: [SPARK-26142][SQL] Implement shuffle read metrics...

2018-11-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/23128#discussion_r236926403
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala ---
@@ -154,7 +156,10 @@ class ShuffledRowRDD(
 
   override def compute(split: Partition, context: TaskContext): 
Iterator[InternalRow] = {
 val shuffledRowPartition = split.asInstanceOf[ShuffledRowRDDPartition]
-val metrics = context.taskMetrics().createTempShuffleReadMetrics()
+val tempMetrics = context.taskMetrics().createTempShuffleReadMetrics()
+// metrics here could be empty cause user can use ShuffledRowRDD 
directly,
+// so we just use the tempMetrics created in TaskContext in this case.
--- End diff --

Removing this.


---




[GitHub] spark pull request #23055: [SPARK-26080][PYTHON] Disable 'spark.executor.pys...

2018-11-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23055#discussion_r236926113
  
--- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", 
true)
   // each python worker gets an equal part of the allocation. the worker 
pool will grow to the
   // number of concurrent tasks, which is determined by the number of 
cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff --

I don't think I'm only the one tho. 

> Why is this code needed? 

I explained this above multiple times. See above.

> it's not doing anything useful if you keep the python check around

See above. I want to delete them but added per review comment.

>  The JVM doesn't understand exactly what Python supports or not, it's 
better to let the python code decide that.

Not really. The resource module is a Python built-in module that exists only 
on Unix-based systems. It just does not exist on Windows.

> You say we should disable the feature on Windows. The python-side changes 
already do that. 

I explained above. See the first change I proposed, 2d3315a. It relies on 
the environment variable.

> We should not remove the extra memory requested from the resource manager 
just because you're running on Windows - you'll still need that memory, you'll 
just get a different error message if you end up using more than you requested.

Yea, I know it would probably work. My question is: has it ever been tested? 
One failure case was found, and it looks a bit odd that we document that it 
works when it's not even tested. Shouldn't we keep it simple rather than make 
it work differently until it's tested?
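
For reference, a minimal sketch of how Python code can detect whether the `resource` module is available before trying to apply a limit. This is not the PySpark worker code; `has_resource_module` and `try_set_memory_limit` are illustrative names, and capping `RLIMIT_AS` is just one assumed way of enforcing a memory limit.

```python
import importlib.util
import sys


def has_resource_module() -> bool:
    """Return True if the Unix-only standard library module `resource` exists."""
    return importlib.util.find_spec("resource") is not None


def try_set_memory_limit(limit_mb: int) -> bool:
    """Try to cap this process's address space; return False when unsupported."""
    if limit_mb <= 0 or not has_resource_module():
        return False
    import resource

    limit_bytes = limit_mb * 1024 * 1024
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY and limit_bytes > hard:
        return False  # the soft limit cannot be raised above the hard limit
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))
    return True


print(sys.platform, has_resource_module())
```

On Windows `has_resource_module()` is expected to return `False`, so `try_set_memory_limit` becomes a no-op there instead of failing at import time.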


---




[GitHub] spark issue #23154: [SPARK-26195][SQL] Correct exception messages in some cl...

2018-11-27 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/23154
  
LGTM, thanks


---




[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99342/
Test PASSed.


---




[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23159
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23159: [SPARK-26191][SQL] Control truncation of Spark plans via...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23159
  
**[Test build #99342 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99342/testReport)**
 for PR 23159 at commit 
[`e9e7893`](https://github.com/apache/spark/commit/e9e789311b998d961aa8fcf76463307a410969ea).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-11-27 Thread lcqzte10192193
Github user lcqzte10192193 commented on a diff in the pull request:

https://github.com/apache/spark/pull/23154#discussion_r236923346
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java
 ---
@@ -510,42 +510,42 @@ public void readIntegers(int total, 
WritableColumnVector c, int rowId) {
 
   @Override
   public byte readByte() {
-throw new UnsupportedOperationException("only readInts is valid.");
+throw new UnsupportedOperationException("only readByte is valid.");
--- End diff --

ok, thanks


---




[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-11-27 Thread lcqzte10192193
Github user lcqzte10192193 commented on a diff in the pull request:

https://github.com/apache/spark/pull/23154#discussion_r236923308
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -204,10 +204,10 @@ case class UnresolvedGenerator(name: 
FunctionIdentifier, children: Seq[Expressio
 throw new UnsupportedOperationException(s"Cannot evaluate expression: 
$this")
 
   override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): 
ExprCode =
-throw new UnsupportedOperationException(s"Cannot evaluate expression: 
$this")
+throw new UnsupportedOperationException(s"Cannot generate code 
expression: $this")
--- End diff --

Yes, I fixed them.


---




[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-27 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/23151#discussion_r236922709
  
--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
   logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
 }
   }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
--- End diff --

`trait SQLTestUtils extends SparkFunSuite with SQLTestUtilsBase with PlanTest`
If `SparkFunSuite` and `SQLTestUtilsBase` both define a method named `withTempDir`, could that cause a name conflict? Thanks.
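
To make the concern concrete, here is a minimal sketch of the situation, not the actual Spark classes. `BaseSuite`, `SqlUtilsBase`, `MixedSuite` and `tempDirPrefixOk` are hypothetical names, and only the JDK is used. When a class and a trait both supply a concrete method with the same signature, the Scala compiler should report that the mixing class inherits conflicting members unless that class declares an explicit override:

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical stand-in for SparkFunSuite.
abstract class BaseSuite {
  protected def withTempDir(f: File => Unit): Unit = {
    val dir = Files.createTempDirectory("base").toFile
    try f(dir) finally dir.delete()
  }
}

// Hypothetical stand-in for SQLTestUtilsBase.
trait SqlUtilsBase {
  protected def withTempDir(f: File => Unit): Unit = {
    val dir = Files.createTempDirectory("sql").toFile
    try f(dir) finally dir.delete()
  }
}

// Without the override below, this class is expected to fail to compile
// because it inherits two unrelated concrete `withTempDir` members.
class MixedSuite extends BaseSuite with SqlUtilsBase {
  override protected def withTempDir(f: File => Unit): Unit =
    super[SqlUtilsBase].withTempDir(f) // pick one implementation explicitly

  // Returns true when the trait's implementation was the one used.
  def tempDirPrefixOk(): Boolean = {
    var name = ""
    withTempDir(dir => name = dir.getName)
    name.startsWith("sql")
  }
}
```

Using a distinct name in one of the two places avoids the need for such an override in every class that mixes both in.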


---




[GitHub] spark issue #22575: [SPARK-24630][SS] Support SQLStreaming in Spark

2018-11-27 Thread stczwd
Github user stczwd commented on the issue:

https://github.com/apache/spark/pull/22575
  
@sujithjay 
 Please refer to 
[SPARK-24630](https://issues.apache.org/jira/browse/SPARK-24630) for more 
details.


---




[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-11-27 Thread 10110346
Github user 10110346 commented on a diff in the pull request:

https://github.com/apache/spark/pull/23154#discussion_r236920634
  
--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java ---
@@ -510,42 +510,42 @@ public void readIntegers(int total, WritableColumnVector c, int rowId) {
 
   @Override
   public byte readByte() {
-throw new UnsupportedOperationException("only readInts is valid.");
+throw new UnsupportedOperationException("only readByte is valid.");
--- End diff --

These exception messages seem to be correct?


---




[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-11-27 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23154#discussion_r236919935
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -258,7 +258,7 @@ case class GeneratorOuter(child: Generator) extends UnaryExpression with Generat
 throw new UnsupportedOperationException(s"Cannot evaluate expression: $this")
 
   final override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode =
-throw new UnsupportedOperationException(s"Cannot evaluate expression: $this")
+throw new UnsupportedOperationException(s"Cannot generate code expression: $this")
--- End diff --

ditto


---




[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Add a withCreateTempDir...

2018-11-27 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/23151#discussion_r236919725
  
--- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala ---
@@ -105,5 +105,16 @@ abstract class SparkFunSuite
   logInfo(s"\n\n= FINISHED $shortSuiteName: '$testName' =\n")
 }
   }
-
+  /**
+   * Creates a temporary directory, which is then passed to `f` and will be deleted after `f`
+   * returns.
+   *
+   * @todo Probably this method should be moved to a more general place
+   */
+  protected def withCreateTempDir(f: File => Unit): Unit = {
+val dir = Utils.createTempDir()
--- End diff --

I'm not sure whether I need to call `.getCanonicalFile` again; it feels a little 
redundant. Tracing the call chain `Utils.createTempDir()` --> `createDirectory` --> 
`dir.getCanonicalFile`, `.getCanonicalFile` has already been called. Thanks.
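
The helper under discussion follows the loan pattern, which can be sketched with the JDK alone. This is an illustration under stated assumptions rather than the code in the pull request: `TempDirSketch`, its `createTempDir` and `deleteRecursively` are hypothetical stand-ins for Spark's `Utils` methods. The directory is canonicalised once when it is created, so the helper does not repeat the call:

```scala
import java.io.File
import java.nio.file.Files

object TempDirSketch {
  // Hypothetical stand-in for Utils.createTempDir(): canonicalise once, here.
  def createTempDir(): File =
    Files.createTempDirectory("spark-sketch").toFile.getCanonicalFile

  // Hypothetical stand-in for a recursive delete helper.
  private def deleteRecursively(file: File): Unit = {
    // listFiles() returns null for non-directories, hence the Option wrapper.
    Option(file.listFiles()).foreach(_.foreach(deleteRecursively))
    file.delete()
  }

  // Lend a fresh directory to `f` and remove it afterwards, even if `f` throws.
  def withTempDir(f: File => Unit): Unit = {
    val dir = createTempDir() // already canonical, no second call needed
    try f(dir) finally deleteRecursively(dir)
  }
}
```

A caller would write `TempDirSketch.withTempDir { dir => ... }` and can rely on the directory being gone once the block returns.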


---




[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-11-27 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23154#discussion_r236919395
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -204,10 +204,10 @@ case class UnresolvedGenerator(name: FunctionIdentifier, children: Seq[Expressio
 throw new UnsupportedOperationException(s"Cannot evaluate expression: $this")
 
   override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode =
-throw new UnsupportedOperationException(s"Cannot evaluate expression: $this")
+throw new UnsupportedOperationException(s"Cannot generate code expression: $this")
--- End diff --

Would it be better to use `generate code for expression`, or some other wording, 
rather than `generate code expression`?


---




[GitHub] spark issue #23154: [SPARK-26195][SQL] Correct exception messages in some cl...

2018-11-27 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/23154
  
@lcqzte10192193 I am sorry for my misunderstanding. The original code in 
`VectorizedRleValuesReader.java` was correct. Could you please revert your 
change?
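
For context, one plausible reading of the original message is that it names the only operation the reader supports, not the method that was called. A minimal sketch under that assumption, with a hypothetical `IntOnlyReader` class that is not the Spark reader:

```scala
// Hypothetical reader that supports integer reads only.
class IntOnlyReader(values: Array[Int]) {
  private var pos = 0

  // The one supported operation.
  def readInteger(): Int = {
    val v = values(pos)
    pos += 1
    v
  }

  // Unsupported: the message points the caller at the supported operation,
  // so it mentions integer reads even though readByte was invoked.
  def readByte(): Byte =
    throw new UnsupportedOperationException("only readInts is valid.")
}
```

Under this reading, changing the text inside `readByte` to "only readByte is valid." would contradict the behaviour, which is why the original wording stands.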


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23160
  
**[Test build #99346 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99346/testReport)**
 for PR 23160 at commit 
[`bbd745a`](https://github.com/apache/spark/commit/bbd745ac0f727de5988bcc876e57d11a32eadb31).


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5423/
Test PASSed.


---




[GitHub] spark issue #23160: [SPARK-26196]Total tasks title in the stage page is inco...

2018-11-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23160
  
**[Test build #99345 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99345/testReport)**
 for PR 23160 at commit 
[`37bcd62`](https://github.com/apache/spark/commit/37bcd6231816c7bc2b2561bff10955b822934ac6).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


