[GitHub] spark issue #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21696
  
Don't block on me. I just wanted to be doubly sure that this is the only way. I am 
fine.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21596
  
**[Test build #92554 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92554/testReport)**
 for PR 21596 at commit 
[`5006467`](https://github.com/apache/spark/commit/50064675706f7ac46f2665da752e0f410ad84183).


---




[GitHub] spark issue #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21696
  
Yea, I get that it should have been done like this, and I wonder if we can 
avoid it. It sounds more like a band-aid fix mainly caused by decimal. FWIW, 
timestamp as INT96 (deprecated in Parquet) is legacy and something we should 
remove.


---




[GitHub] spark issue #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21696
  
You can't get the physical schema information in a higher layer, as it may 
vary across files. The table schema can evolve (add/drop columns).


---




[GitHub] spark issue #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21696
  
Physical schema information usually shouldn't be referred to in a higher layer, 
though. cc @liancheng. It's something we should avoid. I get that we need 
this, but I wonder if this is the only way to get through.


---




[GitHub] spark pull request #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21696#discussion_r199686314
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
 ---
@@ -379,14 +366,29 @@ class ParquetFileFormat
   null)
 
   val sharedConf = broadcastedHadoopConf.value.value
+
+  val fileMetaData =
+ParquetFileReader.readFooter(sharedConf, fileSplit.getPath, 
SKIP_ROW_GROUPS).getFileMetaData
--- End diff --

Yes, I think so. It should be avoided. `isCreatedByParquetMr` was 
intentionally made a function to avoid it by short-circuiting.
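
The short-circuiting point can be illustrated with a small sketch. The names below (`readFooterCreatedBy`, `needsConversion`, `footerReads`) are made up for illustration and are not Spark or Parquet APIs; the sketch only assumes that the footer read is the expensive step and shows why a `def` on the right-hand side of `&&` skips it when the cheap check already fails.

```scala
object ShortCircuitSketch {
  // Counts how often the (pretend) footer read happens.
  var footerReads = 0

  // Stand-in for an expensive footer read; hypothetical, not a Parquet call.
  def readFooterCreatedBy(): String = {
    footerReads += 1
    "parquet-mr version 1.10.0"
  }

  // A def is evaluated only when it is called, so && can skip it entirely.
  def isCreatedByParquetMr: Boolean =
    readFooterCreatedBy().startsWith("parquet-mr")

  // The footer is read only when the cheap flag is true.
  def needsConversion(cheapFlag: Boolean): Boolean =
    cheapFlag && isCreatedByParquetMr
}
```

Had `isCreatedByParquetMr` been a `val`, the read would happen once up front regardless of the flag, which is the cost the comment above wants to avoid.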


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92551/
Test PASSed.


---




[GitHub] spark pull request #21668: [SPARK-24690][SQL] Add a new config to control pl...

2018-07-02 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21668#discussion_r199682495
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
 ---
@@ -375,16 +375,16 @@ case class CatalogStatistics(
* Convert [[CatalogStatistics]] to [[Statistics]], and match column 
stats to attributes based
* on column names.
*/
-  def toPlanStats(planOutput: Seq[Attribute], cboEnabled: Boolean): 
Statistics = {
-if (cboEnabled && rowCount.isDefined) {
+  def toPlanStats(planOutput: Seq[Attribute], planStatsEnabled: Boolean): 
Statistics = {
+if (planStatsEnabled && rowCount.isDefined) {
   val attrStats = AttributeMap(planOutput
 .flatMap(a => colStats.get(a.name).map(a -> _.toPlanStat(a.name, 
a.dataType))))
   // Estimate size as number of rows * row size.
   val size = EstimationUtils.getOutputSize(planOutput, rowCount.get, 
attrStats)
   Statistics(sizeInBytes = size, rowCount = rowCount, attributeStats = 
attrStats)
 } else {
-  // When CBO is disabled or the table doesn't have other statistics, 
we apply the size-only
-  // estimation strategy and only propagate sizeInBytes in statistics.
+  // When plan statistics are disabled or the table doesn't have other 
statistics,
+  // we apply the size-only estimation strategy and only propagate 
sizeInBytes in statistics.
   Statistics(sizeInBytes = sizeInBytes)
--- End diff --

yea, I see. We might do so.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #92551 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92551/testReport)**
 for PR 21469 at commit 
[`c9aada5`](https://github.com/apache/spark/commit/c9aada520889b87ace0886805910f0d56d099bd2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class StateStoreCustomSumMetric(name: String, desc: String) 
extends StateStoreCustomMetric`


---




[GitHub] spark issue #21701: [SPARK-24730][SS] Add policy to choose max as global wat...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21701
  
**[Test build #92553 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92553/testReport)**
 for PR 21701 at commit 
[`c0d1c6e`](https://github.com/apache/spark/commit/c0d1c6e0a5532eeab0848834d2dc348808e54069).


---




[GitHub] spark issue #21701: [SPARK-24730][SS] Add policy to choose max as global wat...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21701
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/631/
Test PASSed.


---




[GitHub] spark issue #21701: [SPARK-24730][SS] Add policy to choose max as global wat...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21701
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21701: [SPARK-24730][SS] Add policy to choose max as glo...

2018-07-02 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/21701

[SPARK-24730][SS] Add policy to choose max as global watermark when 
streaming query has multiple watermarks 

## What changes were proposed in this pull request?

Currently, when a streaming query has multiple watermarks, the policy is to 
choose the min of them as the global watermark. This is safe because the 
global watermark moves with the slowest stream, so it does not unexpectedly 
drop data as late. While this is indeed the safe thing to do, in some cases 
you may want the watermark to advance with the fastest stream, that is, take 
the max of multiple watermarks. This PR adds that configuration. It makes the 
following changes. 

- Adds a configuration to specify max as the policy.
- Saves the configuration in OffsetSeqMetadata because changing it in the 
middle can lead to unpredictable results. 
   - For old checkpoints without the configuration, it assumes the default 
policy as min (irrespective of the policy set in the session where the query is 
being restarted). This is to ensure that existing queries are not affected in 
any way. 

- [ ] Add a test for recovery from existing checkpoints.
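
The difference between the two policies described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: `WatermarkPolicy`, `MinPolicy`, `MaxPolicy`, `globalWatermark`, and `policyFromCheckpoint` are hypothetical names, and watermarks are modeled as plain epoch-millisecond values.

```scala
// Hypothetical sketch of the two global-watermark policies.
// None of these names are taken from the Spark code base.
object WatermarkPolicySketch {
  sealed trait WatermarkPolicy
  case object MinPolicy extends WatermarkPolicy // advance with the slowest stream (default)
  case object MaxPolicy extends WatermarkPolicy // advance with the fastest stream

  // Combine per-stream watermarks (epoch millis) into one global watermark.
  // Returns None when no watermark has been reported yet.
  def globalWatermark(policy: WatermarkPolicy, watermarks: Seq[Long]): Option[Long] =
    if (watermarks.isEmpty) None
    else policy match {
      case MinPolicy => Some(watermarks.min)
      case MaxPolicy => Some(watermarks.max)
    }

  // Old checkpoints carry no saved policy, so fall back to min
  // regardless of what the restarting session is configured with.
  def policyFromCheckpoint(saved: Option[WatermarkPolicy]): WatermarkPolicy =
    saved.getOrElse(MinPolicy)
}
```

With watermarks at 10 and 30, the min policy yields 10 (nothing newer than the slowest stream is dropped as late), while the max policy yields 30.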

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark SPARK-24730

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21701.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21701


commit c0d1c6e0a5532eeab0848834d2dc348808e54069
Author: Tathagata Das 
Date:   2018-07-03T04:28:05Z

Implemented




---




[GitHub] spark pull request #21605: [SPARK-24385][SQL] Resolve self-join condition am...

2018-07-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21605


---




[GitHub] spark issue #21605: [SPARK-24385][SQL] Resolve self-join condition ambiguity...

2018-07-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21605
  
thanks, merging to master!


---




[GitHub] spark issue #21668: [SPARK-24690][SQL] Add a new config to control plan stat...

2018-07-02 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21668
  
yea, ok. I'll reconsider this again. Thanks!


---




[GitHub] spark issue #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21696
  
Makes sense to me, since we need the physical schema information to 
push down decimal and timestamp. Also cc @rdblue @michal-databricks 


---




[GitHub] spark pull request #21073: [SPARK-23936][SQL] Implement map_concat

2018-07-02 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/21073#discussion_r199678852
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -551,6 +551,36 @@ object TypeCoercion {
   case None => s
 }
 
+  case m @ MapConcat(children) if children.forall(c => 
MapType.acceptsType(c.dataType)) &&
+!haveSameType(children) =>
+val keyTypes = 
children.map(_.dataType.asInstanceOf[MapType].keyType)
--- End diff --

I don't necessarily have a _good_ reason, but here are two reasons I did 
that extra junk:

1) The Concat-like code didn't find a wider type amongst types 
map and map. So, it just fell through to `case None => m`.

2) If the call to map_concat has the same child types but multiple 
valueContainsNull values, the Concat-style code added a Cast to each child 
(this is because haveSameType considers expressions with different 
valueContainsNull values to have different types). It does no harm, as far as I 
can tell, but it seemed wrong.

About issue 1): I will debug. I might have done something wrong there. 
Plus, even if it's a real bug in findWiderCommonType, it affects my longer 
code, which may be looking for a wider common type amongst the keys or values, 
which could themselves be maps.

About issue 2): Maybe not an issue. Or I can create an alternate 
haveSameType() function for Maps.
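
Issue 2) can be reproduced in miniature. The sketch below uses simplified stand-ins for the type classes (they are not Spark's `DataType` hierarchy), and `haveSameMapTypeIgnoringNullability` is a hypothetical name for the alternate check mentioned above. It assumes the strict check relies on case-class equality, which also compares `valueContainsNull`.

```scala
object MapTypeSketch {
  // Simplified stand-ins for a type hierarchy; hypothetical, for illustration only.
  sealed trait DataType
  case object IntType extends DataType
  case object StringType extends DataType
  case class MapType(keyType: DataType, valueType: DataType, valueContainsNull: Boolean)
    extends DataType

  // Strict check: case-class equality also compares valueContainsNull.
  def haveSameType(types: Seq[DataType]): Boolean =
    types.distinct.size <= 1

  // Alternate check for maps: ignore nullability, compare key and value types only.
  def haveSameMapTypeIgnoringNullability(types: Seq[MapType]): Boolean =
    types.map(t => (t.keyType, t.valueType)).distinct.size <= 1
}
```

Two maps of `IntType` to `StringType` that differ only in `valueContainsNull` fail the strict check but pass the alternate one. Note that this sketch does not look inside nested map values, so it would still treat nested maps with differing nullability as different.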



---




[GitHub] spark pull request #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21696#discussion_r199678338
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala
 ---
@@ -379,14 +366,29 @@ class ParquetFileFormat
   null)
 
   val sharedConf = broadcastedHadoopConf.value.value
+
+  val fileMetaData =
+ParquetFileReader.readFooter(sharedConf, fileSplit.getPath, 
SKIP_ROW_GROUPS).getFileMetaData
--- End diff --

Will we read the footer again in the Parquet reader?


---




[GitHub] spark issue #21668: [SPARK-24690][SQL] Add a new config to control plan stat...

2018-07-02 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21668
  
Yea, this is a real problem, but I feel a better solution is to integrate 
StarSchemaDetection into CBO. How hard would it be?


---




[GitHub] spark issue #21700: [SPARK-24717][SS] Split out min retain version of state ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21700: [SPARK-24717][SS] Split out min retain version of state ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92550/
Test PASSed.


---




[GitHub] spark issue #21700: [SPARK-24717][SS] Split out min retain version of state ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92550 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92550/testReport)**
 for PR 21700 at commit 
[`345b33a`](https://github.com/apache/spark/commit/345b33ab5b9042eb7be86b2993dc9b6306480f5d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/21696#discussion_r199672805
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
 ---
@@ -19,166 +19,186 @@ package 
org.apache.spark.sql.execution.datasources.parquet
 
 import java.sql.Date
 
+import scala.collection.JavaConverters._
+
 import org.apache.parquet.filter2.predicate._
 import org.apache.parquet.filter2.predicate.FilterApi._
 import org.apache.parquet.io.api.Binary
-import org.apache.parquet.schema.PrimitiveComparator
+import org.apache.parquet.schema._
+import org.apache.parquet.schema.OriginalType._
+import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName._
 
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
 import org.apache.spark.sql.catalyst.util.DateTimeUtils.SQLDate
 import org.apache.spark.sql.sources
-import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.UTF8String
 
 /**
  * Some utility function to convert Spark data source filters to Parquet 
filters.
  */
 private[parquet] class ParquetFilters(pushDownDate: Boolean, 
pushDownStartWith: Boolean) {
 
+  case class ParquetSchemaType(
+  originalType: OriginalType,
+  primitiveTypeName: PrimitiveType.PrimitiveTypeName,
+  decimalMetadata: DecimalMetadata)
+
   private def dateToDays(date: Date): SQLDate = {
 DateTimeUtils.fromJavaDate(date)
   }
 
-  private val makeEq: PartialFunction[DataType, (String, Any) => 
FilterPredicate] = {
-case BooleanType =>
+  private val makeEq: PartialFunction[ParquetSchemaType, (String, Any) => 
FilterPredicate] = {
+// BooleanType
+case ParquetSchemaType(null, BOOLEAN, null) =>
--- End diff --

Mapping type reference:

https://github.com/apache/spark/blob/21a7bfd5c324e6c82152229f1394f26afeae771c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L338-L560
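
The pattern in the diff above, a `PartialFunction` keyed on the physical Parquet type instead of the Spark `DataType`, can be sketched without any Parquet dependency. Everything below is a simplified assumption: the enums are stand-ins for Parquet's `PrimitiveTypeName` and `OriginalType`, `SchemaType` uses `Option` where the diff uses `null`, and predicates are rendered as strings rather than built with `FilterApi`.

```scala
object SchemaTypeMatchSketch {
  // Stand-ins for Parquet's primitive and original (logical) types; hypothetical.
  sealed trait Primitive
  case object BOOLEAN extends Primitive
  case object INT32 extends Primitive
  case object BINARY extends Primitive

  sealed trait Original
  case object UTF8 extends Original
  case object DATE extends Original

  // Option replaces the nulls used in the diff for "no original type".
  case class SchemaType(original: Option[Original], primitive: Primitive)

  // Maps a physical type to a function that renders an equality predicate as text.
  val makeEq: PartialFunction[SchemaType, (String, Any) => String] = {
    case SchemaType(None, BOOLEAN) =>
      (col: String, v: Any) => s"eq($col, $v)"
    case SchemaType(None, INT32) =>
      (col: String, v: Any) => s"eq($col, $v)"
    case SchemaType(Some(UTF8), BINARY) =>
      (col: String, v: Any) => s"""eq($col, Binary("$v"))"""
  }

  // Unsupported combinations yield None, meaning no filter is pushed down.
  def createEq(t: SchemaType, col: String, v: Any): Option[String] =
    makeEq.lift(t).map(f => f(col, v))
}
```

Because the match is on the file's own type, a combination the sketch does not list (for example `DATE` on `INT32`) simply produces no predicate instead of a wrong one.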


---




[GitHub] spark issue #21690: [SPARK-24713]AppMatser of spark streaming kafka OOM if t...

2018-07-02 Thread yuanboliu
Github user yuanboliu commented on the issue:

https://github.com/apache/spark/pull/21690
  
The first pause is used to stop poll() in the method paranoidPoll.
The second one is added because of p.partition().
I'm not sure whether the paused state will be overwritten after these 
methods are called, so I call pause repeatedly.  


---




[GitHub] spark issue #21618: [SPARK-20408][SQL] Get the glob path in parallel to redu...

2018-07-02 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/21618
  
gentle ping @cloud-fan @gatorsmile 


---




[GitHub] spark issue #21633: [SPARK-24646][CORE] Minor change to spark.yarn.dist.forc...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21633
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/630/
Test PASSed.


---




[GitHub] spark issue #21633: [SPARK-24646][CORE] Minor change to spark.yarn.dist.forc...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21633
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21633: [SPARK-24646][CORE] Minor change to spark.yarn.dist.forc...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21633
  
**[Test build #92552 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92552/testReport)**
 for PR 21633 at commit 
[`4419f52`](https://github.com/apache/spark/commit/4419f52bf0104cc44fc6b27183030876778bbdc4).


---




[GitHub] spark issue #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream format ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21546
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream format ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21546
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92543/
Test PASSed.


---




[GitHub] spark issue #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream format ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21546
  
**[Test build #92543 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92543/testReport)**
 for PR 21546 at commit 
[`e25acd2`](https://github.com/apache/spark/commit/e25acd2dbd7bdc176d43fa6957cc150edf19bdcd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21459: [SPARK-24420][Build] Upgrade ASM to 6.1 to support JDK9+

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21459
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92540/
Test PASSed.


---




[GitHub] spark issue #21459: [SPARK-24420][Build] Upgrade ASM to 6.1 to support JDK9+

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21459
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21459: [SPARK-24420][Build] Upgrade ASM to 6.1 to support JDK9+

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21459
  
**[Test build #92540 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92540/testReport)**
 for PR 21459 at commit 
[`e6040c3`](https://github.com/apache/spark/commit/e6040c3c1b26a591c3bae5e7fb8ae95b5eafeea9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92545/
Test FAILed.


---




[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92545 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92545/testReport)**
 for PR 21601 at commit 
[`b351406`](https://github.com/apache/spark/commit/b3514067db43b543d8ceac38a0e1ffe6c1a5692e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #92551 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92551/testReport)**
 for PR 21469 at commit 
[`c9aada5`](https://github.com/apache/spark/commit/c9aada520889b87ace0886805910f0d56d099bd2).


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92544/
Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #92544 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92544/testReport)**
 for PR 21469 at commit 
[`c9aada5`](https://github.com/apache/spark/commit/c9aada520889b87ace0886805910f0d56d099bd2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class StateStoreCustomSumMetric(name: String, desc: String) 
extends StateStoreCustomMetric`


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21469
  
retest this, please


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Build finished. Test FAILed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21469
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92542/
Test FAILed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #92542 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92542/testReport)**
 for PR 21469 at commit 
[`96115bc`](https://github.com/apache/spark/commit/96115bcc48cfe67c4b7bd37315963b9ba1366b3a).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `case class StateStoreCustomSumMetric(name: String, desc: String) 
extends StateStoreCustomMetric`


---




[GitHub] spark issue #21692: [SPARK-24715][Build] Override jline version as 2.14.3 in...

2018-07-02 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21692
  
Thanks @srowen for the explanation. As far as I can tell, `-verbose:class` 
shows that jline classes come from `jline-2.12.jar`, though 
`sbt dependency-tree` shows `jline:jline:0.9.94`. After this override, jline 
classes come from `jline-2.14.3.jar`.
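
As a complement to `-verbose:class`, the jar a class was actually loaded from can also be checked programmatically. The helper below is a minimal sketch using only standard JVM calls (`getProtectionDomain`, `getCodeSource`, `getLocation`); `ClassOriginSketch` and `originOf` are names made up here, not part of Spark or sbt.

```scala
object ClassOriginSketch {
  // Returns the location (usually a jar path) a class was loaded from,
  // if the JVM exposes it. Classes loaded by the bootstrap loader have
  // no CodeSource, hence the Option wrappers.
  def originOf(cls: Class[_]): Option[String] =
    for {
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)
      location <- Option(source.getLocation)
    } yield location.toString
}
```

For example, assuming jline 2.x is on the classpath, `ClassOriginSketch.originOf(Class.forName("jline.console.ConsoleReader"))` in a Spark shell should report the jar that wins at runtime, independent of what the dependency tree claims.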


---




[GitHub] spark issue #21686: [SPARK-24709][SQL] schema_of_json() - schema inference f...

2018-07-02 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21686
  
Thanks. Awesome. This matches what I had in mind then.



---




[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21699
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92541/
Test PASSed.


---




[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21699
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21699: [SPARK-24722][SQL] pivot() with Column type argument

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21699
  
**[Test build #92541 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92541/testReport)**
 for PR 21699 at commit 
[`0fdd11f`](https://github.com/apache/spark/commit/0fdd11ff26b4f4ca3b79bdd116aaf1c558643698).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21692: [SPARK-24715][Build] Override jline version as 2.14.3 in...

2018-07-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/21692
  
Thank you all!


---




[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21596
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92537/
Test PASSed.


---




[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21596
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21596
  
**[Test build #92537 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92537/testReport)**
 for PR 21596 at commit 
[`8dbc310`](https://github.com/apache/spark/commit/8dbc310ac6a8bcd2eb9f583b0a9990f9bed2aea0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21459: [SPARK-24420][Build] Upgrade ASM to 6.1 to support JDK9+

2018-07-02 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21459
  
SGTM.

On Mon, Jul 2, 2018 at 4:38 PM DB Tsai  wrote:

> There are three approvals from the committers, and the changes are pretty
> trivial to revert if we see any performance regression which is unlikely.
> To move thing forward, if there is no further objection, I'll merge it
> tomorrow. Thanks.



---




[GitHub] spark issue #21662: [SPARK-24662][SQL][SS] Support limit in structured strea...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21662
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21662: [SPARK-24662][SQL][SS] Support limit in structured strea...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21662
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92536/
Test PASSed.


---




[GitHub] spark issue #21662: [SPARK-24662][SQL][SS] Support limit in structured strea...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21662
  
**[Test build #92536 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92536/testReport)**
 for PR 21662 at commit 
[`8671944`](https://github.com/apache/spark/commit/8671944b801907b2dced7027ea3da3fb04ed2e8f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92550 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92550/testReport)**
 for PR 21700 at commit 
[`345b33a`](https://github.com/apache/spark/commit/345b33ab5b9042eb7be86b2993dc9b6306480f5d).


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92549/
Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92549 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92549/testReport)**
 for PR 21700 at commit 
[`0819412`](https://github.com/apache/spark/commit/081941248792612000fe4a1d92be917d771117eb).
 * This patch **fails RAT tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92549 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92549/testReport)**
 for PR 21700 at commit 
[`0819412`](https://github.com/apache/spark/commit/081941248792612000fe4a1d92be917d771117eb).


---




[GitHub] spark issue #21459: [SPARK-24420][Build] Upgrade ASM to 6.1 to support JDK9+

2018-07-02 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21459
  
There are three approvals from the committers, and the changes are pretty 
trivial to revert if we see any performance regression, which is unlikely. To 
move things forward, if there is no further objection, I'll merge it tomorrow. 
Thanks.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92548 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92548/testReport)**
 for PR 21700 at commit 
[`cab25df`](https://github.com/apache/spark/commit/cab25dfd8599a2edfdefe83ad9b9be1f827aaad0).
 * This patch **fails to generate documentation**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class BoundedSortedMap extends TreeMap `


---




[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-07-02 Thread mallman
Github user mallman commented on a diff in the pull request:

https://github.com/apache/spark/pull/21320#discussion_r199648692
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
 ---
@@ -182,18 +182,20 @@ private[parquet] class ParquetRowConverter(
 
   // Converters for each field.
   private val fieldConverters: Array[Converter with 
HasParentContainerUpdater] = {
-parquetType.getFields.asScala.zip(catalystType).zipWithIndex.map {
-  case ((parquetFieldType, catalystField), ordinal) =>
-// Converted field value should be set to the `ordinal`-th cell of 
`currentRow`
-newConverter(parquetFieldType, catalystField.dataType, new 
RowUpdater(currentRow, ordinal))
+parquetType.getFields.asScala.map {
+  case parquetField =>
+val fieldIndex = catalystType.fieldIndex(parquetField.getName)
--- End diff --

I dropped into the `sql/console` and attempted to write a parquet file with 
duplicate column names. It didn't work. Transcript below.

```
scala> import org.apache.spark.sql._
import org.apache.spark.sql._

scala> val sameColumnNames = StructType(StructField("a", IntegerType) :: 
StructField("a", StringType) :: Nil)
sameColumnNames: org.apache.spark.sql.types.StructType = 
StructType(StructField(a,IntegerType,true), StructField(a,StringType,true))

scala> val rowRDD = sqlContext.sparkContext.parallelize(Row(1, "one") :: 
Row(2, "two") :: Nil, 1)
rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = 
ParallelCollectionRDD[0] at parallelize at :51

scala> val df = sqlContext.createDataFrame(rowRDD, sameColumnNames)
18/07/02 16:31:33 INFO SharedState: Setting hive.metastore.warehouse.dir 
('null') to the value of spark.sql.warehouse.dir 
('file:/Volumes/VideoAmpCS/msa/workspace/spark-public/spark-warehouse').
18/07/02 16:31:33 INFO SharedState: Warehouse path is 
'file:/Volumes/VideoAmpCS/msa/workspace/spark-public/spark-warehouse'.
18/07/02 16:31:33 INFO ContextHandler: Started 
o.e.j.s.ServletContextHandler@7b13b737{/SQL,null,AVAILABLE,@Spark}
18/07/02 16:31:33 INFO ContextHandler: Started 
o.e.j.s.ServletContextHandler@3c9fb104{/SQL/json,null,AVAILABLE,@Spark}
18/07/02 16:31:33 INFO ContextHandler: Started 
o.e.j.s.ServletContextHandler@3d5cadbe{/SQL/execution,null,AVAILABLE,@Spark}
18/07/02 16:31:33 INFO ContextHandler: Started 
o.e.j.s.ServletContextHandler@73732e26{/SQL/execution/json,null,AVAILABLE,@Spark}
18/07/02 16:31:33 INFO ContextHandler: Started 
o.e.j.s.ServletContextHandler@72a13c4a{/static/sql,null,AVAILABLE,@Spark}
18/07/02 16:31:34 INFO StateStoreCoordinatorRef: Registered 
StateStoreCoordinator endpoint
df: org.apache.spark.sql.DataFrame = [a: int, a: string]

scala> df.write.parquet("sameColumnNames.parquet")
org.apache.spark.sql.AnalysisException: Found duplicate column(s) when 
inserting into 
file:/Volumes/VideoAmpCS/msa/workspace/spark-public/sameColumnNames.parquet: 
`a`;
  at 
org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
  at 
org.apache.spark.sql.util.SchemaUtils$.checkSchemaColumnNameDuplication(SchemaUtils.scala:42)
  at 
org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:64)
  at 
org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
  at 
org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
  at 
org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
  at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
  at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
  at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
  at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
  at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
  at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
  at 
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:662)
  at 
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:662)
  at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
  at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
  at 

[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92548/
Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92548 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92548/testReport)**
 for PR 21700 at commit 
[`cab25df`](https://github.com/apache/spark/commit/cab25dfd8599a2edfdefe83ad9b9be1f827aaad0).


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21700
  
Missing newline at EOF in two new Java files. Just addressed.
Jenkins, retest this please.


---




[GitHub] spark issue #21696: [SPARK-24716][SQL] Refactor ParquetFilters

2018-07-02 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/21696
  
cc @gatorsmile @cloud-fan 


---




[GitHub] spark issue #21459: [SPARK-24420][Build] Upgrade ASM to 6.1 to support JDK9+

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21459
  
**[Test build #4202 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4202/testReport)**
 for PR 21459 at commit 
[`bec3e81`](https://github.com/apache/spark/commit/bec3e81a3522b54692150584c86d1925799c08da).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92547/
Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92547 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92547/testReport)**
 for PR 21700 at commit 
[`45796d8`](https://github.com/apache/spark/commit/45796d8c74d0a55bf0d3a22f1c526dc764c0e924).
 * This patch **fails Java style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class BoundedSortedMap extends TreeMap `


---




[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-07-02 Thread mallman
Github user mallman commented on a diff in the pull request:

https://github.com/apache/spark/pull/21320#discussion_r199643803
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala
 ---
@@ -71,9 +80,22 @@ private[parquet] class ParquetReadSupport(val convertTz: 
Option[TimeZone])
   StructType.fromString(schemaString)
 }
 
-val parquetRequestedSchema =
+val clippedParquetSchema =
   ParquetReadSupport.clipParquetSchema(context.getFileSchema, 
catalystRequestedSchema)
 
+val parquetRequestedSchema = if (parquetMrCompatibility) {
+  // Parquet-mr will throw an exception if we try to read a superset 
of the file's schema.
+  // Therefore, we intersect our clipped schema with the underlying 
file's schema
+  ParquetReadSupport.intersectParquetGroups(clippedParquetSchema, 
context.getFileSchema)
+.map(intersectionGroup =>
+  new MessageType(intersectionGroup.getName, 
intersectionGroup.getFields))
+.getOrElse(ParquetSchemaConverter.EMPTY_MESSAGE)
+} else {
+  // Spark's built-in Parquet reader will throw an exception in some 
cases if the requested
+  // schema is not the same as the clipped schema
--- End diff --

I believe the failure occurs because the requested schema and the file 
schema, while having columns with identical names and types, have their columns 
in a different order. The one test that fails in `ParquetFilterSuite`, namely 
"Filter applied on merged Parquet schema with new column should work", 
appears to be the only one for which the column order changes. These 
are the file and requested schemas for that test:

```
Parquet file schema:
message spark_schema {
  required int32 c;
  optional binary b (UTF8);
}

Parquet requested schema:
message spark_schema {
  optional binary b (UTF8);
  required int32 c;
}
```

I would say the Spark reader expects identical column order, whereas the 
parquet-mr reader accepts different column order but identical (or compatible) 
column names. That's my supposition at least.
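
To make the supposition concrete, here is a minimal sketch that models the two schemas above as plain Scala values and compares them positionally and by name. `SchemaOrderSketch`, `Column`, `matchesByPosition` and `matchesByName` are hypothetical names for illustration only; this does not call either reader and does not claim to reproduce their actual matching logic.

```scala
object SchemaOrderSketch {
  // Simplified stand-in for a Parquet column: name, physical type, repetition.
  final case class Column(name: String, physicalType: String, repetition: String)

  // The file schema from the failing test: c first, then b.
  val fileSchema: Seq[Column] = Seq(
    Column("c", "int32", "required"),
    Column("b", "binary", "optional"))

  // The requested schema from the failing test: b first, then c.
  val requestedSchema: Seq[Column] = Seq(
    Column("b", "binary", "optional"),
    Column("c", "int32", "required"))

  // Positional matching: the i-th requested column must equal the i-th file column.
  def matchesByPosition(file: Seq[Column], requested: Seq[Column]): Boolean =
    file == requested

  // Name-based matching: every requested column must exist in the file with the
  // same type and repetition, wherever it sits.
  def matchesByName(file: Seq[Column], requested: Seq[Column]): Boolean = {
    val byName = file.map(c => c.name -> c).toMap
    requested.forall(c => byName.get(c.name).contains(c))
  }
}
```

Under this model the two schemas agree when matched by name but not when matched by position, which is the distinction the supposition rests on.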


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92547 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92547/testReport)**
 for PR 21700 at commit 
[`45796d8`](https://github.com/apache/spark/commit/45796d8c74d0a55bf0d3a22f1c526dc764c0e924).


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21700
  
cc. @tdas @zsxwing @jose-torres @jerryshao @arunmahadevan @HyukjinKwon


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21700
  
retest this, please


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92546 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92546/testReport)**
 for PR 21700 at commit 
[`22f0e22`](https://github.com/apache/spark/commit/22f0e220f661b5457584ef83b1ecddc18212fa73).
 * This patch **fails Java style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class BoundedSortedMap extends TreeMap `


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92546/
Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21700
  
**[Test build #92546 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92546/testReport)**
 for PR 21700 at commit 
[`22f0e22`](https://github.com/apache/spark/commit/22f0e220f661b5457584ef83b1ecddc18212fa73).


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21700
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21700: SPARK-24717 Split out min retain version of state for me...

2018-07-02 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21700
  
Pasting JIRA issue description to explain why this patch is needed:

The default value of "spark.sql.streaming.minBatchesToRetain" is set high 
(100). That doesn't strictly require 100x the memory, but I'm seeing 10x ~ 
80x memory consumption for various workloads. In addition, in some cases 
even requiring 2x the memory is unacceptable, so we should split out a 
configuration for memory and let users trade off memory usage against 
cache misses (building state from files).

In the normal case, the default value '2' would cover both cases, success and 
restoring after failure, with less than or around 2x memory usage, and '1' would 
cover only the success case but no longer require more than 1x memory. In the 
extreme case, the user can set the value to '0' to completely disable the map cache 
and maximize executor memory (covers #21500).


---




[GitHub] spark pull request #21700: SPARK-24717 Split out min retain version of state...

2018-07-02 Thread HeartSaVioR
GitHub user HeartSaVioR opened a pull request:

https://github.com/apache/spark/pull/21700

SPARK-24717 Split out min retain version of state for memory in 
HDFSBackedStateStoreProvider

## What changes were proposed in this pull request?

This patch proposes splitting the configuration for how many batches of 
state to retain into two pieces: files and in memory (cache). While this patch 
reuses the existing configuration for files, it introduces a new configuration, 
"spark.sql.streaming.maxBatchesToRetainInMemory", to configure the max count of 
batches to retain in memory.

This patch also introduces BoundedSortedMap, which retains at most the first N 
elements (sorted by key) and can be leveraged for loadedMaps in 
HDFSBackedStateStoreProvider.
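
As a rough illustration of the "retain at most the first N elements" idea, the sketch below bounds a `java.util.TreeMap` by evicting from the tail on every `put`. It is not the class from this pull request: `BoundedSortedMapSketch`, `keyOrder`, `limit` and `retainedVersions` are hypothetical names, and the reverse-ordered comparator is an assumption made so that the "first N" keys are the newest N versions.

```scala
import java.util.{Collections, Comparator, TreeMap}
import scala.collection.mutable.ListBuffer

// Hypothetical sketch: a TreeMap that keeps at most `limit` entries, always the
// first ones under `keyOrder`. Only `put` is bounded here; bulk operations such
// as `putAll` are not handled.
class BoundedSortedMapSketch[K, V](keyOrder: Comparator[K], limit: Int)
    extends TreeMap[K, V](keyOrder) {

  override def put(key: K, value: V): V = {
    val previous = super.put(key, value)
    // Evict from the tail (the last key under keyOrder) until within the bound.
    while (size() > limit) pollLastEntry()
    previous
  }
}

object BoundedSortedMapDemo {
  // Inserts the given versions in order and returns the keys that survive.
  def retainedVersions(limit: Int, versions: Seq[Long]): List[Long] = {
    // Reverse order puts the highest version first, so the tail is the oldest.
    val loaded = new BoundedSortedMapSketch[java.lang.Long, String](
      Collections.reverseOrder[java.lang.Long](), limit)
    versions.foreach(v => loaded.put(java.lang.Long.valueOf(v), s"state-$v"))

    val kept = ListBuffer.empty[Long]
    val it = loaded.keySet().iterator()
    while (it.hasNext) kept += it.next().longValue()
    kept.toList
  }
}
```

With a limit of 2, inserting versions 1 through 5 leaves only versions 5 and 4; with a limit of 0 nothing is retained, which corresponds to disabling the in-memory cache.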

## How was this patch tested?

Applied this patch on top of SPARK-24441 
(https://github.com/apache/spark/pull/21469), and manually tested to ensure the 
overall size of state is around 2x or less, instead of 10x ~ 80x, across 
various workloads.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HeartSaVioR/spark SPARK-24717

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21700.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21700


commit 22f0e220f661b5457584ef83b1ecddc18212fa73
Author: Jungtaek Lim 
Date:   2018-07-02T22:04:49Z

SPARK-24717 Split out min retain version of state for memory in 
HDFSBackedStateStoreProvider

* introduce BoundedSortedMap which implements bounded size of sorted map
  * only first N elements will be retained
* replace loadedMaps to BoundedSortedMap to retain only N versions of states
  * no need to cleanup in maintenance phase
* introduce new configuration: 
spark.sql.streaming.minBatchesToRetainInMemory




---




[GitHub] spark issue #21692: [SPARK-24715][Build] Override jline version as 2.14.3 in...

2018-07-02 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/21692
  
I haven't looked into this particular issue thoroughly, but I'm aware that 
SBT and Maven don't actually resolve dependencies in quite the same way. I 
think they resolve conflicts with different rules -- most recent wins vs 
'nearest' wins. That could be it.
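
A minimal sketch of how the two rules can disagree, assuming a simplified model in which each occurrence of a library is just a version plus its distance from the root project. `ConflictResolutionSketch`, `Candidate`, `nearestWins` and `latestWins` are hypothetical names, the depths in the usage note are made up, and neither function reproduces the real resolvers.

```scala
object ConflictResolutionSketch {
  // One occurrence of a library in a dependency graph: its version and how many
  // hops it is from the root project.
  final case class Candidate(version: String, depth: Int)

  // Split a dotted version into numeric segments (assumes digits only).
  private def versionKey(v: String): Seq[Int] = v.split('.').toSeq.map(_.toInt)

  // Compare two dotted versions segment by segment, padding with zeros.
  def compareVersions(a: String, b: String): Int =
    versionKey(a).zipAll(versionKey(b), 0, 0)
      .collectFirst { case (x, y) if x != y => x.compare(y) }
      .getOrElse(0)

  // "Nearest wins": pick the occurrence closest to the root (first one on ties).
  def nearestWins(candidates: Seq[Candidate]): Candidate =
    candidates.minBy(_.depth)

  // "Most recent wins": pick the highest version regardless of depth.
  def latestWins(candidates: Seq[Candidate]): Candidate =
    candidates.reduceLeft((a, b) => if (compareVersions(b.version, a.version) > 0) b else a)
}
```

For example, given `Candidate("0.9.94", 1)` and `Candidate("2.12", 2)`, `nearestWins` selects 0.9.94 while `latestWins` selects 2.12, so the same graph can yield different jars depending on the rule.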


---




[GitHub] spark issue #21687: [SPARK-24165][SQL] Fixing the output data type of CaseWh...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21687
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92535/
Test PASSed.


---




[GitHub] spark issue #21687: [SPARK-24165][SQL] Fixing the output data type of CaseWh...

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21687
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21687: [SPARK-24165][SQL] Fixing the output data type of CaseWh...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21687
  
**[Test build #92535 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92535/testReport)**
 for PR 21687 at commit 
[`c040d31`](https://github.com/apache/spark/commit/c040d315346aba04b843f4bda12038a357a14912).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21605: [SPARK-24385][SQL] Resolve self-join condition ambiguity...

2018-07-02 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/21605
  
kindly ping @cloud-fan 


---




[GitHub] spark pull request #21599: [SPARK-24598][SQL] Overflow on arithmetic operati...

2018-07-02 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21599#discussion_r199632638
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala ---
@@ -128,17 +128,31 @@ abstract class BinaryArithmetic extends BinaryOperator with NullIntolerant {
   def calendarIntervalMethod: String =
     sys.error("BinaryArithmetics must override either calendarIntervalMethod or genCode")
 
+  def checkOverflowCode(result: String, op1: String, op2: String): String =
+    sys.error("BinaryArithmetics must override either checkOverflowCode or genCode")
+
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = dataType match {
     case _: DecimalType =>
       defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1.$decimalMethod($eval2)")
     case CalendarIntervalType =>
       defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1.$calendarIntervalMethod($eval2)")
+    // In the following cases, overflow can happen, so we need to check the result is valid.
+    // Otherwise we throw an ArithmeticException
--- End diff --

@gatorsmile @hvanhovell do you have time to check this and give your 
opinion here? Thanks.


---




[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-07-02 Thread mallman
Github user mallman commented on a diff in the pull request:

https://github.com/apache/spark/pull/21320#discussion_r199631341
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala ---
@@ -47,16 +47,25 @@ import org.apache.spark.sql.types._
  *
  * Due to this reason, we no longer rely on [[ReadContext]] to pass requested schema from [[init()]]
  * to [[prepareForRead()]], but use a private `var` for simplicity.
+ *
+ * @param parquetMrCompatibility support reading with parquet-mr or Spark's built-in Parquet reader
  */
-private[parquet] class ParquetReadSupport(val convertTz: Option[TimeZone])
+private[parquet] class ParquetReadSupport(val convertTz: Option[TimeZone],
+    parquetMrCompatibility: Boolean)
     extends ReadSupport[UnsafeRow] with Logging {
   private var catalystRequestedSchema: StructType = _
 
+  /**
+   * Construct a [[ParquetReadSupport]] with [[convertTz]] set to [[None]] and
+   * [[parquetMrCompatibility]] set to [[false]].
+   *
+   * We need a zero-arg constructor for SpecificParquetRecordReaderBase.  But that is only
+   * used in the vectorized reader, where we get the convertTz value directly, and the value here
+   * is ignored. Further, we set [[parquetMrCompatibility]] to [[false]] as this constructor is only
+   * called by the Spark reader.
--- End diff --

I don't understand your confusion. I think the comment makes it very clear 
why we need to set that parameter to false. How can I make it better? Or can 
you be more specific about what is unclear to you?


---




[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92545/testReport)** for PR 21601 at commit [`b351406`](https://github.com/apache/spark/commit/b3514067db43b543d8ceac38a0e1ffe6c1a5692e).


---




[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/629/
Test PASSed.


---




[GitHub] spark issue #21469: [SPARK-24441][SS] Expose total estimated size of states ...

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21469
  
**[Test build #92544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92544/testReport)** for PR 21469 at commit [`c9aada5`](https://github.com/apache/spark/commit/c9aada520889b87ace0886805910f0d56d099bd2).


---



