[GitHub] [spark] sarutak opened a new pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


sarutak opened a new pull request #34649:
URL: https://github.com/apache/spark/pull/34649


   ### What changes were proposed in this pull request?
   
   This PR mitigates an issue where MiMa fails for Scala 2.13 after SPARK-35672 (#34120).
   ```
   $ dev/change-scala-version.sh 2.13
   $ dev/mima
   ...
   [error] spark-core: Failed binary compatibility check against org.apache.spark:spark-core_2.13:3.2.0! Found 8 potential problems (filtered 905)
   [error]  * method userClassPath()scala.collection.mutable.ListBuffer in class org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments does not have a correspondent in current version
   [error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.userClassPath")
   [error]  * method copy(java.lang.String,java.lang.String,java.lang.String,java.lang.String,Int,java.lang.String,scala.Option,scala.collection.mutable.ListBuffer,scala.Option,Int)org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments in class org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments does not have a correspondent in current version
   [error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.copy")
   [error]  * synthetic method copy$default$10()Int in class org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments does not have a correspondent in current version
   [error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.copy$default$10")
   [error]  * synthetic method copy$default$8()scala.collection.mutable.ListBuffer in class org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments has a different result type in current version, where it is scala.Option rather than scala.collection.mutable.ListBuffer
   [error]    filter with: ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.copy$default$8")
   [error]  * synthetic method copy$default$9()scala.Option in class org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments has a different result type in current version, where it is Int rather than scala.Option
   [error]    filter with: ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.copy$default$9")
   [error]  * method this(java.lang.String,java.lang.String,java.lang.String,java.lang.String,Int,java.lang.String,scala.Option,scala.collection.mutable.ListBuffer,scala.Option,Int)Unit in class org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments does not have a correspondent in current version
   [error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.this")
   [error]  * the type hierarchy of object org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments is different in current version. Missing types {scala.runtime.AbstractFunction10}
   [error]    filter with: ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend$Arguments$")
   [error]  * method apply(java.lang.String,java.lang.String,java.lang.String,java.lang.String,Int,java.lang.String,scala.Option,scala.collection.mutable.ListBuffer,scala.Option,Int)org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments in object org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments does not have a correspondent in current version
   [error]    filter with: ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.executor.CoarseGrainedExecutorBackend#Arguments.apply")
   ...
   ```
   
   Oddly, although the class `Arguments` is `public`, it is a member class of `CoarseGrainedExecutorBackend`, which is package-private, and MiMa raises no error for Scala 2.12. Adding an exclusion rule is one workaround.
   
   ### Why are the changes needed?
   
   To keep the build stable.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Confirmed MiMa passed.
   ```
   $ dev/change-scala-version.sh 2.13
   $ dev/mima
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] attilapiros commented on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


attilapiros commented on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973068166


   Thanks @sarutak for taking care of this.





[GitHub] [spark] mridulm commented on a change in pull request #34632: [SPARK-37356][CORE] Add fine grained locking to the BlockInfoManager

2021-11-18 Thread GitBox


mridulm commented on a change in pull request #34632:
URL: https://github.com/apache/spark/pull/34632#discussion_r752476094



##
File path: core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala
##
@@ -166,6 +189,48 @@ private[storage] class BlockInfoManager extends Logging {
     Option(TaskContext.get()).map(_.taskAttemptId()).getOrElse(BlockInfo.NON_TASK_WRITER)
   }
 
+  /**
+   * Helper for lock acquisition.
+   */
+  private def acquireLock(
+      blockId: BlockId,
+      blocking: Boolean)(
+      f: BlockInfo => Boolean): Option[BlockInfo] = {
+    var done = false
+    var result: Option[BlockInfo] = None
+    while (!done) {
+      val wrapper = blockInfoWrappers.get(blockId)
+      if (wrapper == null) {
+        done = true
+      } else {
+        wrapper.withLock { (info, condition) =>
+          if (f(info)) {
+            result = Some(info)
+            done = true
+          } else if (!blocking) {
+            done = true
+          } else {
+            condition.await()

Review comment:
   You are right, `acquireLock` is used only where we previously had a `wait`.
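
For readers following the hunk above, here is a minimal Java sketch of the same pattern: look up a per-block wrapper, take its lock, try a predicate, and either give up (non-blocking) or wait on the condition and retry. It is not Spark's `BlockInfoManager`; the names `PerBlockLockSketch`, `Info`, `Wrapper`, `register`, `acquire` and `releaseWrite` are hypothetical, and `awaitUninterruptibly()` is used here only to keep the sketch free of checked exceptions.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Predicate;

/** Hypothetical stand-in for per-block lock state; not Spark's BlockInfoManager. */
final class PerBlockLockSketch {
  static final long NO_WRITER = -1L;

  /** Assumed minimal block state: only the id of the task holding the write lock. */
  static final class Info {
    long writerTask = NO_WRITER;
  }

  /** One lock and one condition per block, so unrelated blocks do not contend. */
  private static final class Wrapper {
    final Info info = new Info();
    final ReentrantLock lock = new ReentrantLock();
    final Condition condition = lock.newCondition();
  }

  private final ConcurrentHashMap<String, Wrapper> wrappers = new ConcurrentHashMap<>();

  void register(String blockId) {
    wrappers.putIfAbsent(blockId, new Wrapper());
  }

  /**
   * Runs tryAcquire under the block's lock. Returns the info if the predicate
   * succeeded, empty if the block is unknown or the call is non-blocking and
   * the predicate failed. A blocking caller waits on the condition and retries.
   */
  Optional<Info> acquire(String blockId, boolean blocking, Predicate<Info> tryAcquire) {
    while (true) {
      Wrapper w = wrappers.get(blockId); // re-read each round, as the hunk does
      if (w == null) {
        return Optional.empty();
      }
      w.lock.lock();
      try {
        if (tryAcquire.test(w.info)) {
          return Optional.of(w.info);
        }
        if (!blocking) {
          return Optional.empty();
        }
        w.condition.awaitUninterruptibly(); // releases the lock while waiting
      } finally {
        w.lock.unlock();
      }
    }
  }

  /** Clears the writer and wakes every waiter so each can re-test its predicate. */
  void releaseWrite(String blockId) {
    Wrapper w = wrappers.get(blockId);
    if (w == null) {
      return;
    }
    w.lock.lock();
    try {
      w.info.writerTask = NO_WRITER;
      w.condition.signalAll();
    } finally {
      w.lock.unlock();
    }
  }

  public static void main(String[] args) {
    PerBlockLockSketch locks = new PerBlockLockSketch();
    locks.register("block-1");
    Predicate<Info> takeWrite = info -> {
      if (info.writerTask != NO_WRITER) {
        return false;
      }
      info.writerTask = 1L;
      return true;
    };
    System.out.println(locks.acquire("block-1", false, takeWrite).isPresent()); // true
    System.out.println(locks.acquire("block-1", false, takeWrite).isPresent()); // false
    locks.releaseWrite("block-1");
    System.out.println(locks.acquire("block-1", false, takeWrite).isPresent()); // true
  }
}
```

Because the predicate is re-tested after every wake-up, a spurious wake-up or a competing waiter winning the race only costs another loop iteration.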







[GitHub] [spark] AmplabJenkins removed a comment on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34611:
URL: https://github.com/apache/spark/pull/34611#issuecomment-973101827


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145388/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-973101824


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145395/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973148314


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145400/
   





[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973166396


   **[Test build #145402 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145402/testReport)**
 for PR 34650 at commit 
[`a1ce1fd`](https://github.com/apache/spark/commit/a1ce1fd4bdea0b6755d65168feaf997b33d309ac).
* This patch **fails MiMa tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA removed a comment on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973156390


   **[Test build #145402 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145402/testReport)**
 for PR 34650 at commit 
[`a1ce1fd`](https://github.com/apache/spark/commit/a1ce1fd4bdea0b6755d65168feaf997b33d309ac).





[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973193881


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49875/
   





[GitHub] [spark] HeartSaVioR commented on a change in pull request #34502: [SPARK-37224][SS] Optimize write path on RocksDB state store provider

2021-11-18 Thread GitBox


HeartSaVioR commented on a change in pull request #34502:
URL: https://github.com/apache/spark/pull/34502#discussion_r752576654



##
File path: docs/structured-streaming-programming-guide.md
##
@@ -1956,8 +1956,21 @@ Here are the configs regarding to RocksDB instance of 
the state store provider:
 Whether we resets all ticker and histogram stats for RocksDB on 
load.
 True
   
+  
+spark.sql.streaming.stateStore.rocksdb.trackTotalNumberOfRows
+Whether we track the total number of rows in state store. Please refer 
the details in Performance-aspect 
considerations.
+True
+  
 
 
+# Performance-aspect considerations
+
+1. For write-heavy workloads, you may want to disable the track of total 
number of rows.

Review comment:
   `1.` is not a typo. I just wanted to reserve space for items we may add later. I'm not a RocksDB expert, so I don't have insights to offer as tuning guidance, but RocksDB itself seems to provide lots of things to tune, so more items may come up later.
   
   I agree that "write-heavy workloads" sounds unclear; basically it means a higher volume of updates (write/delete) against the state store. This cannot be inferred from the volume of inputs alone, since it depends on the operator and window: if the input produces lots of state keys in a streaming aggregation, it is going to issue lots of writes against state. If the input is huge but binds to only a few windows, there are only a few writes against state.
   
   Probably we can leverage the state metrics "rows to update" and "rows to delete", which represent the amount of updates. Technically this change doesn't introduce a perf. regression in any workload, so it's not limited to write-heavy workloads. We make a trade-off on observability, so it's up to end users to choose performance vs. observability.
   
   It looks like it'd be better to remove the phrase "For write-heavy workloads" and simply add "to gain additional performance on state store", with a hint that it will be more effective if the state metrics "rows to update" and "rows to delete" are high.
   
   Thanks for the inputs!
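
To make the performance vs. observability trade-off concrete, here is a small Java sketch. It assumes that keeping an exact row count requires an existence check before every put and remove; that mechanism is an assumption for illustration and is not stated in this thread. The class `CountingStoreSketch`, its in-memory map, and the `-1` "unknown" value are all hypothetical and do not reflect the RocksDB state store provider's API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical key-value store illustrating the cost of exact row counting. */
final class CountingStoreSketch {
  private final Map<String, String> store = new HashMap<>();
  private final boolean trackTotalNumberOfRows;
  private long numKeys = 0;
  private long existenceChecks = 0; // extra lookups paid only for the metric

  CountingStoreSketch(boolean trackTotalNumberOfRows) {
    this.trackTotalNumberOfRows = trackTotalNumberOfRows;
  }

  void put(String key, String value) {
    if (trackTotalNumberOfRows) {
      existenceChecks++;
      if (!store.containsKey(key)) {
        numKeys++; // only a brand-new key changes the row count
      }
    }
    store.put(key, value);
  }

  void remove(String key) {
    if (trackTotalNumberOfRows) {
      existenceChecks++;
      if (store.containsKey(key)) {
        numKeys--; // removing an absent key must not change the count
      }
    }
    store.remove(key);
  }

  /** Returns the tracked count, or -1 (hypothetical sentinel) when tracking is off. */
  long totalNumberOfRows() {
    return trackTotalNumberOfRows ? numKeys : -1L;
  }

  long existenceChecks() {
    return existenceChecks;
  }

  public static void main(String[] args) {
    CountingStoreSketch tracked = new CountingStoreSketch(true);
    tracked.put("a", "1");
    tracked.put("a", "2"); // update of an existing key
    tracked.put("b", "1");
    tracked.remove("a");
    System.out.println(tracked.totalNumberOfRows()); // 1
    System.out.println(tracked.existenceChecks());   // 4
  }
}
```

In this model the extra lookups grow with the number of updates and deletes rather than with input volume, which is consistent with the suggestion to look at "rows to update" and "rows to delete" when deciding whether to disable tracking.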








[GitHub] [spark] SparkQA commented on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


SparkQA commented on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973214542


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49874/
   





[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


kazuyukitanimura commented on a change in pull request #34611:
URL: https://github.com/apache/spark/pull/34611#discussion_r752601914



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java
##
@@ -53,19 +53,46 @@ public void skip() {
     throw new UnsupportedOperationException();
   }
 
+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }
 
   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    int skipBytes = (bitOffset + total - 8) / 8;
+    bitOffset = (bitOffset + total) & 7;
+    if (skipBytes >= 0) {

Review comment:
   Ah, you are right. I will fix this and add tests.
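
As an illustration of the bookkeeping a bit-level skip needs, here is a standalone Java sketch. It is not the fix made in this PR. It assumes booleans are bit-packed least-significant-bit first, eight per byte, and that `bitOffset == 0` means no partially consumed byte is pending. The class `PlainBooleanReaderSketch` and its helper `skipFully` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical reader for LSB-first bit-packed booleans; not Spark's reader. */
final class PlainBooleanReaderSketch {
  private final InputStream in;
  private int currentByte = 0; // last byte read, 0..255
  private int bitOffset = 0;   // bits of currentByte already consumed; 0 = none pending

  PlainBooleanReaderSketch(byte[] data) {
    this.in = new ByteArrayInputStream(data);
  }

  boolean readBoolean() {
    if (bitOffset == 0) {
      currentByte = readByte();
    }
    boolean value = ((currentByte >> bitOffset) & 1) == 1;
    bitOffset = (bitOffset + 1) & 7;
    return value;
  }

  void skipBooleans(int total) {
    if (total <= 0) {
      return;
    }
    // Bits still unread in the partially consumed byte, if there is one.
    int remainingInCurrent = bitOffset == 0 ? 0 : 8 - bitOffset;
    if (total <= remainingInCurrent) {
      bitOffset = (bitOffset + total) & 7; // stays inside the current byte
      return;
    }
    int rest = total - remainingInCurrent;
    skipFully(rest / 8);        // whole bytes that are skipped entirely
    int tailBits = rest % 8;
    if (tailBits > 0) {
      currentByte = readByte(); // the skip ends inside this byte
    }
    bitOffset = tailBits;
  }

  private int readByte() {
    try {
      int b = in.read();
      if (b < 0) {
        throw new UncheckedIOException(new EOFException("No more bytes"));
      }
      return b;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private void skipFully(long n) {
    try {
      long left = n;
      while (left > 0) {
        long skipped = in.skip(left);
        if (skipped <= 0) {
          throw new UncheckedIOException(new EOFException("Skip past end of input"));
        }
        left -= skipped;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] data = {(byte) 0xB4, (byte) 0x03};
    PlainBooleanReaderSketch reader = new PlainBooleanReaderSketch(data);
    reader.readBoolean();                     // consume bit 0
    reader.skipBooleans(7);                   // skip bits 1..7
    System.out.println(reader.readBoolean()); // bit 8, the lowest bit of 0x03
  }
}
```

The cases worth testing are a skip that stays inside the current byte, one that ends exactly on a byte boundary, and one that ends partway into a later byte.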







[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973244493


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49876/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973244532


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49876/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973244532


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49876/
   





[GitHub] [spark] mridulm commented on a change in pull request #34632: [SPARK-37356][CORE] Add fine grained locking to the BlockInfoManager

2021-11-18 Thread GitBox


mridulm commented on a change in pull request #34632:
URL: https://github.com/apache/spark/pull/34632#discussion_r752507849



##
File path: core/src/main/scala/org/apache/spark/storage/BlockInfoManager.scala
##
@@ -341,106 +415,103 @@ private[storage] class BlockInfoManager extends Logging {
    *
    * @return the ids of blocks whose pins were released
    */
-  def releaseAllLocksForTask(taskAttemptId: TaskAttemptId): Seq[BlockId] = synchronized {
+  def releaseAllLocksForTask(taskAttemptId: TaskAttemptId): Seq[BlockId] = {
     val blocksWithReleasedLocks = mutable.ArrayBuffer[BlockId]()
 
-    val readLocks = readLocksByTask.remove(taskAttemptId).getOrElse(ImmutableMultiset.of[BlockId]())
-    val writeLocks = writeLocksByTask.remove(taskAttemptId).getOrElse(Seq.empty)
-
-    for (blockId <- writeLocks) {
-      infos.get(blockId).foreach { info =>
+    val writeLocks = Option(writeLocksByTask.remove(taskAttemptId)).getOrElse(Collections.emptySet)
+    writeLocks.forEach { blockId =>
+      blockInfo(blockId) { (info, condition) =>
         assert(info.writerTask == taskAttemptId)
         info.writerTask = BlockInfo.NO_WRITER
+        condition.signalAll()
       }
       blocksWithReleasedLocks += blockId
     }
 
-    readLocks.entrySet().iterator().asScala.foreach { entry =>
+    val readLocks = Option(readLocksByTask.remove(taskAttemptId))
+      .getOrElse(ImmutableMultiset.of[BlockId])
+    readLocks.entrySet().forEach { entry =>

Review comment:
   I was trying to investigate whether the delay between the writeLocks removal and the readLocks removal might cause any issues (given the condition signal in the write locks iteration).
   
   I did not find cause for concern though, so I am resolving this conversation.







[GitHub] [spark] SparkQA removed a comment on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA removed a comment on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973114717


   **[Test build #145400 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145400/testReport)**
 for PR 34636 at commit 
[`12a4096`](https://github.com/apache/spark/commit/12a4096d740e2cfb6842f08ceb5088ecc3a72db4).





[GitHub] [spark] AmplabJenkins commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973148314


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145400/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973194334









[GitHub] [spark] AmplabJenkins removed a comment on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973194329









[GitHub] [spark] AmplabJenkins commented on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973194331


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49871/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973194331


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49871/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973202711


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49873/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973202711


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49873/
   





[GitHub] [spark] SparkQA commented on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down

2021-11-18 Thread GitBox


SparkQA commented on pull request #34593:
URL: https://github.com/apache/spark/pull/34593#issuecomment-973244136


   **[Test build #145406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145406/testReport)** for PR 34593 at commit [`42f637e`](https://github.com/apache/spark/commit/42f637e64e9ffe1f5f0e93a94ee08bcb4c10731c).





[GitHub] [spark] SparkQA commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-11-18 Thread GitBox


SparkQA commented on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-973244430


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49878/
   





[GitHub] [spark] HeartSaVioR commented on pull request #34652: [SPARK-37224][SS][FOLLOWUP] Clarify the guide doc and fix the method doc

2021-11-18 Thread GitBox


HeartSaVioR commented on pull request #34652:
URL: https://github.com/apache/spark/pull/34652#issuecomment-973265193


   cc. @viirya @xuanyuanking 





[GitHub] [spark] HeartSaVioR opened a new pull request #34652: [SPARK-37224][SS][FOLLOWUP] Clarify the guide doc and fix the method doc

2021-11-18 Thread GitBox


HeartSaVioR opened a new pull request #34652:
URL: https://github.com/apache/spark/pull/34652


   ### What changes were proposed in this pull request?
   
   This PR is a follow-up of #34502 to address post-review comments. 
   
   This PR rewords the explanation of performance tuning for the RocksDB state 
store to make it less confusing, and also fixes the method docs to be in sync with 
the code changes.
   
   ### Why are the changes needed?
   
   1. The explanation of performance tuning for the RocksDB state store was unclear in 
a couple of spots.
   2. We changed the method signature, but the change was not reflected in the 
method doc.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, end users will be less confused by the explanation of performance 
tuning for the RocksDB state store.
   
   ### How was this patch tested?
   
   N/A





[GitHub] [spark] SparkQA removed a comment on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


SparkQA removed a comment on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973150233


   **[Test build #145401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145401/testReport)** for PR 34651 at commit [`71f09df`](https://github.com/apache/spark/commit/71f09df4c464b92ca0ec2d603ab86d4eb3b5af31).





[GitHub] [spark] SparkQA commented on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive

2021-11-18 Thread GitBox


SparkQA commented on pull request #34647:
URL: https://github.com/apache/spark/pull/34647#issuecomment-973017066


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49869/
   





[GitHub] [spark] SparkQA commented on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-18 Thread GitBox


SparkQA commented on pull request #34644:
URL: https://github.com/apache/spark/pull/34644#issuecomment-973035927


   **[Test build #145385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145385/testReport)** for PR 34644 at commit [`39feabe`](https://github.com/apache/spark/commit/39feabe27f21f2eec7eef8a324eefb250e766c61).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


SparkQA commented on pull request #34611:
URL: https://github.com/apache/spark/pull/34611#issuecomment-973062089


   **[Test build #145388 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145388/testReport)** for PR 34611 at commit [`a24339a`](https://github.com/apache/spark/commit/a24339ace97b9926ff63732e2b4621fa5abdb5e3).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973114717


   **[Test build #145400 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145400/testReport)** for PR 34636 at commit [`12a4096`](https://github.com/apache/spark/commit/12a4096d740e2cfb6842f08ceb5088ecc3a72db4).





[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973162274


   **[Test build #145403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145403/testReport)** for PR 34650 at commit [`33d3918`](https://github.com/apache/spark/commit/33d3918edf9ab6ec7430144dfe17f45791bfa087).





[GitHub] [spark] AmplabJenkins commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973241342


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49875/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973241347


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145398/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973241347


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145398/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973241345


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49874/
   





[GitHub] [spark] hvanhovell closed pull request #34632: [SPARK-37356][CORE] Add fine grained locking to the BlockInfoManager

2021-11-18 Thread GitBox


hvanhovell closed pull request #34632:
URL: https://github.com/apache/spark/pull/34632


   





[GitHub] [spark] SparkQA commented on pull request #34645: [SPARK-37371][SQL] UnionExec should support columnar if all children support columnar

2021-11-18 Thread GitBox


SparkQA commented on pull request #34645:
URL: https://github.com/apache/spark/pull/34645#issuecomment-973292081


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49880/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973291905


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145403/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973291906


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49877/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973291906


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49877/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-973291903


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49878/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973291904


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145401/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973291905


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145403/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-973291903


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49878/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973291904


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145401/
   





[GitHub] [spark] SparkQA commented on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


SparkQA commented on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-973034670


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49868/
   





[GitHub] [spark] SparkQA commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973125812


   **[Test build #145400 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145400/testReport)** for PR 34636 at commit [`12a4096`](https://github.com/apache/spark/commit/12a4096d740e2cfb6842f08ceb5088ecc3a72db4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] attilapiros opened a new pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


attilapiros opened a new pull request #34651:
URL: https://github.com/apache/spark/pull/34651


   
   ### What changes were proposed in this pull request?
   
   Collecting `LocalSparkContext` worker logs in case of test failure. 
   
   ### Why are the changes needed?
   
   About 50 test suites use `LocalSparkContext` with "local-cluster" as the 
cluster URL. In this case the executor logs are written under the worker dir, 
which is a temporary directory and as such is deleted at shutdown (for details see 
https://github.com/apache/spark/blob/0a4961df29aab6912492e87e4e719865fe20d981/core/src/main/scala/org/apache/spark/deploy/LocalSparkCluster.scala#L70).
   
   So when such a test fails and the error was on the executor side, the log 
is lost.
   
   This affects only local-cluster tests, not standalone tests, where logs 
are kept in the "/work".
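   
   The idea of gathering executor logs from a worker directory before it is 
deleted can be sketched roughly as below. This is only an illustration, not 
code from this PR: the class `WorkerLogCollector`, its methods, and the choice 
to match files ending in `.log` are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: walks a worker directory and gathers log file contents.
public class WorkerLogCollector {

    // Find every regular file under workerRoot whose name ends with ".log".
    static List<Path> findLogFiles(Path workerRoot) throws IOException {
        try (Stream<Path> paths = Files.walk(workerRoot)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".log"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    // Concatenate the logs, each preceded by a "- Logfile: <path>" header line.
    static String collect(Path workerRoot) throws IOException {
        StringBuilder sb = new StringBuilder();
        String eol = System.lineSeparator();
        for (Path log : findLogFiles(workerRoot)) {
            sb.append("- Logfile: ").append(log).append(eol);
            for (String line : Files.readAllLines(log)) {
                sb.append(line).append(eol);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            System.out.print(collect(Paths.get(args[0])));
        }
    }
}
```

   A test harness would have to call something like `collect` from its failure 
handler, before the temporary worker directory is removed at shutdown.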
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Manually, by adding temporary code that makes one of the tests fail, then 
checking unit-tests.log:
   
   ```
   21/11/18 18:59:32.924 dag-scheduler-event-loop INFO TaskSchedulerImpl: 
Killing all running tasks in stage 0: Stage finished
   21/11/18 18:59:32.924 pool-1-thread-1-ScalaTest-running-DistributedSuite 
INFO DAGScheduler: Job 0 finished: $anonfun$new$13 at OutcomeOf.scala:85, took 
4.339006 s
   21/11/18 18:59:32.930 pool-1-thread-1-ScalaTest-running-DistributedSuite 
INFO DistributedSuite: 
   
   = EXTRA LOGS FOR THE FAILED TEST
   
   21/11/18 18:59:32.930 pool-1-thread-1-ScalaTest-running-DistributedSuite 
INFO DistributedSuite: 
   - Logfile: 
/Users/attilazsoltpiros/git/attilapiros/spark/core/target/tmp/org.apache.spark.DistributedSuite/worker-85d9c1f8-3dae-453d-b105-fc2087ef110c/app-2028095928-/1/target/unit-tests.log
   21/11/18 18:59:32.939 pool-1-thread-1-ScalaTest-running-DistributedSuite 
INFO DistributedSuite: 21/11/18 18:59:29.877 main INFO 
CoarseGrainedExecutorBackend: Started daemon with process name: 62486@Budlap-617
   21/11/18 18:59:29.885 main INFO SignalUtils: Registering signal handler for 
TERM
   21/11/18 18:59:29.886 main INFO SignalUtils: Registering signal handler for 
HUP
   21/11/18 18:59:29.886 main INFO SignalUtils: Registering signal handler for 
INT
   21/11/18 18:59:30.459 main WARN NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
   ...
   21/11/18 18:59:32.208 Executor task launch worker for task 1.0 in stage 0.0 
(TID 1) INFO MemoryStore: Block broadcast_0 stored as values in memory 
(estimated size 6.2 KiB, free 546.3 MiB)
   21/11/18 18:59:32.792 Executor task launch worker for task 1.0 in stage 0.0 
(TID 1) INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 923 bytes result 
sent to driver
   
   21/11/18 18:59:32.940 pool-1-thread-1-ScalaTest-running-DistributedSuite 
INFO DistributedSuite: 
   - Logfile: 
/Users/attilazsoltpiros/git/attilapiros/spark/core/target/tmp/org.apache.spark.DistributedSuite/worker-67734ffb-23d8-4aa5-85cf-7b1857961c48/app-2028095928-/2/target/unit-tests.log
   21/11/18 18:59:32.940 pool-1-thread-1-ScalaTest-running-DistributedSuite 
INFO DistributedSuite: 21/11/18 18:59:30.005 main INFO 
CoarseGrainedExecutorBackend: Started daemon with process name: 62488@Budlap-617
   21/11/18 18:59:30.014 main INFO SignalUtils: Registering signal handler for 
TERM
   21/11/18 18:59:30.015 main INFO SignalUtils: Registering signal handler for 
HUP
   21/11/18 18:59:30.015 main INFO SignalUtils: Registering signal handler for 
INT
   ...  
   
   ```
   
   
   Here you can see the paths were:
   - 
spark/core/target/tmp/org.apache.spark.DistributedSuite/worker-85d9c1f8-3dae-453d-b105-fc2087ef110c/app-2028095928-/1/target/unit-tests.log
 
   - 
spark/core/target/tmp/org.apache.spark.DistributedSuite/worker-0c97959f-f8df-464e-a9c4-941a1a40701b/app-2028095928-/0/target/unit-tests.log





[GitHub] [spark] sunchao commented on pull request #34613: [SPARK-37342][BUILD] Upgrade Apache Arrow to 6.0.0

2021-11-18 Thread GitBox


sunchao commented on pull request #34613:
URL: https://github.com/apache/spark/pull/34613#issuecomment-973153680


   Hmm, I must have missed something, since I always get this "ConnectionRefusedError: 
[Errno 61] Connection refused" even though I followed the exact steps above. 
Not sure what host & port it tries to access: I did open up passwordless ssh to 
localhost.





[GitHub] [spark] kazuyukitanimura commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


kazuyukitanimura commented on a change in pull request #34611:
URL: https://github.com/apache/spark/pull/34611#discussion_r752537385



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java
##
@@ -147,6 +147,18 @@ public void putBooleans(int rowId, int count, boolean value) {
 }
   }
 
+  @Override
+  public void putBooleans(int rowId, byte src) {

Review comment:
   Thanks, sounds good!







[GitHub] [spark] sunchao commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


sunchao commented on a change in pull request #34611:
URL: https://github.com/apache/spark/pull/34611#discussion_r752544556



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java
##
@@ -53,19 +53,46 @@ public void skip() {
     throw new UnsupportedOperationException();
   }
 
+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }
 
   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    int skipBytes = (bitOffset + total - 8) / 8;
+    bitOffset = (bitOffset + total) & 7;
+    if (skipBytes >= 0) {

Review comment:
   Ehh, I think `(-4) / 8 == 0` in Java, right?
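
For reference, Java integer division truncates toward zero, so a negative numerator smaller in magnitude than the divisor yields 0 rather than -1. The sketch below is illustrative only and is not part of the PR: the class and method names are hypothetical, and only the expression `(bitOffset + total - 8) / 8` is taken from the diff. `Math.floorDiv` is shown as one standard-library way to get floor semantics, not as the change the PR adopted.

```java
public class TruncatingDivisionDemo {
    // Same expression as in the diff under review; `/` truncates toward zero.
    static int skipBytesTruncating(int bitOffset, int total) {
        return (bitOffset + total - 8) / 8;
    }

    // Hypothetical variant using floor division, which rounds toward negative infinity.
    static int skipBytesFloor(int bitOffset, int total) {
        return Math.floorDiv(bitOffset + total - 8, 8);
    }

    public static void main(String[] args) {
        // bitOffset = 0, total = 4 gives a numerator of -4.
        System.out.println(skipBytesTruncating(0, 4));  // prints 0
        System.out.println(skipBytesFloor(0, 4));       // prints -1
        // For non-negative numerators the two variants agree.
        System.out.println(skipBytesTruncating(3, 13)); // prints 1
        System.out.println(skipBytesFloor(3, 13));      // prints 1
    }
}
```

With truncating division, a check such as `skipBytes >= 0` also holds for every numerator from -7 through -1, which is the case the comment above points at.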







[GitHub] [spark] SparkQA commented on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


SparkQA commented on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973179932


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49874/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973209654


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145404/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973209654


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145404/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA removed a comment on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973196575


   **[Test build #145404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145404/testReport)**
 for PR 34636 at commit 
[`e1e61f2`](https://github.com/apache/spark/commit/e1e61f24dc88993733e7b2c274fa6e4c20fffb26).





[GitHub] [spark] SparkQA commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973209346


   **[Test build #145404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145404/testReport)**
 for PR 34636 at commit 
[`e1e61f2`](https://github.com/apache/spark/commit/e1e61f24dc88993733e7b2c274fa6e4c20fffb26).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


SparkQA commented on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973229564


   **[Test build #145398 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145398/testReport)**
 for PR 34649 at commit 
[`5a60e86`](https://github.com/apache/spark/commit/5a60e863d5400d90947fb4a556ee33ac1727bb32).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973230438


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49877/
   





[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973229794


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49875/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


SparkQA removed a comment on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973101987


   **[Test build #145398 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145398/testReport)**
 for PR 34649 at commit 
[`5a60e86`](https://github.com/apache/spark/commit/5a60e863d5400d90947fb4a556ee33ac1727bb32).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973241342


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49875/
   





[GitHub] [spark] SparkQA commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-11-18 Thread GitBox


SparkQA commented on pull request #32875:
URL: https://github.com/apache/spark/pull/32875#issuecomment-973242337


   **[Test build #145405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145405/testReport)**
 for PR 32875 at commit 
[`200e42a`](https://github.com/apache/spark/commit/200e42a83de4c14dfb2ffbb5fadb3debd5607de9).





[GitHub] [spark] SparkQA commented on pull request #34645: [SPARK-37371][SQL] UnionExec should support columnar if all children support columnar

2021-11-18 Thread GitBox


SparkQA commented on pull request #34645:
URL: https://github.com/apache/spark/pull/34645#issuecomment-973259922


   **[Test build #145407 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145407/testReport)**
 for PR 34645 at commit 
[`7e07975`](https://github.com/apache/spark/commit/7e0797594c6d0ade0b33dca394ad3ded77f488fd).





[GitHub] [spark] viirya commented on pull request #34645: [SPARK-37371][SQL] UnionExec should support columnar if all children support columnar

2021-11-18 Thread GitBox


viirya commented on pull request #34645:
URL: https://github.com/apache/spark/pull/34645#issuecomment-973260082


   cc @cloud-fan @sunchao @dongjoon-hyun 





[GitHub] [spark] SparkQA commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973269669


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49877/
   





[GitHub] [spark] SparkQA commented on pull request #34645: [SPARK-37371][SQL] UnionExec should support columnar if all children support columnar

2021-11-18 Thread GitBox


SparkQA commented on pull request #34645:
URL: https://github.com/apache/spark/pull/34645#issuecomment-973293622


   **[Test build #145409 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145409/testReport)**
 for PR 34645 at commit 
[`93a3a2d`](https://github.com/apache/spark/commit/93a3a2db66b284b63f138cbb88855172ac601950).





[GitHub] [spark] SparkQA commented on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down

2021-11-18 Thread GitBox


SparkQA commented on pull request #34593:
URL: https://github.com/apache/spark/pull/34593#issuecomment-973293666


   **[Test build #145410 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145410/testReport)**
 for PR 34593 at commit 
[`f93e759`](https://github.com/apache/spark/commit/f93e7596beba14d8eaa4532180fd817ae240ba09).





[GitHub] [spark] sunchao commented on a change in pull request #34645: [SPARK-37371][SQL] UnionExec should support columnar if all children support columnar

2021-11-18 Thread GitBox


sunchao commented on a change in pull request #34645:
URL: https://github.com/apache/spark/pull/34645#discussion_r752645500



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameSetOperationsSuite.scala
##
@@ -1355,6 +1357,45 @@ class DataFrameSetOperationsSuite extends QueryTest with SharedSparkSession {
   Row(Row(Seq(Seq(Row("bb", null) ::
   Row(Row(Seq(Seq(Row(null, "ba") :: Nil)
   }
+
+  test("SPARK-37371: UnionExec should support columnar if all children support columnar") {
+    def checkIfColumnar(
+        plan: SparkPlan,
+        targetPlan: (SparkPlan) => Boolean,
+        isColumnar: Boolean): Unit = {
+      val target = plan.collect {
+        case p if targetPlan(p) => p
+      }
+      assert(target.nonEmpty)
+      if (isColumnar) {

Review comment:
   nit: can replace this with one line: 
`assert(target.forall(_.supportsColumnar == isColumnar))`







[GitHub] [spark] AmplabJenkins commented on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-973037748


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49868/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34647:
URL: https://github.com/apache/spark/pull/34647#issuecomment-973037746


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49869/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34644:
URL: https://github.com/apache/spark/pull/34644#issuecomment-973037789


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145385/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34634: [SPARK-37357][SQL] Create skew partition specs should respect min partition size

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34634:
URL: https://github.com/apache/spark/pull/34634#issuecomment-973037747


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49870/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-973037748


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49868/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34634: [SPARK-37357][SQL] Create skew partition specs should respect min partition size

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34634:
URL: https://github.com/apache/spark/pull/34634#issuecomment-973037747


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49870/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34647: [SPARK-36180][SQL] Support TimestampNTZ type in Hive

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34647:
URL: https://github.com/apache/spark/pull/34647#issuecomment-973037746


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49869/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34611:
URL: https://github.com/apache/spark/pull/34611#issuecomment-973101827


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145388/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-973101824


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145395/
   





[GitHub] [spark] SparkQA commented on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


SparkQA commented on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973101987


   **[Test build #145398 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145398/testReport)**
 for PR 34649 at commit 
[`5a60e86`](https://github.com/apache/spark/commit/5a60e863d5400d90947fb4a556ee33ac1727bb32).





[GitHub] [spark] SparkQA commented on pull request #34634: [SPARK-37357][SQL] Create skew partition specs should respect min partition size

2021-11-18 Thread GitBox


SparkQA commented on pull request #34634:
URL: https://github.com/apache/spark/pull/34634#issuecomment-973200417


   **[Test build #145397 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145397/testReport)**
 for PR 34634 at commit 
[`a6ea23c`](https://github.com/apache/spark/commit/a6ea23cb0ac7671ca257ff0cdc18b4abe3855312).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973241345


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49874/
   





[GitHub] [spark] SparkQA commented on pull request #34634: [SPARK-37357][SQL] Create skew partition specs should respect min partition size

2021-11-18 Thread GitBox


SparkQA commented on pull request #34634:
URL: https://github.com/apache/spark/pull/34634#issuecomment-973034173


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49870/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34644: [SPARK-36357][SQL] Support pushdown Timestamp with local time zone for orc

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34644:
URL: https://github.com/apache/spark/pull/34644#issuecomment-973037789


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145385/
   





[GitHub] [spark] SparkQA commented on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


SparkQA commented on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-973084694


   **[Test build #145395 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145395/testReport)**
 for PR 34326 at commit 
[`818eba6`](https://github.com/apache/spark/commit/818eba69f703deaca110076d99c6830f5af481b2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #34326: [SPARK-37053][CORE] Add metrics to SparkHistoryServer

2021-11-18 Thread GitBox


SparkQA removed a comment on pull request #34326:
URL: https://github.com/apache/spark/pull/34326#issuecomment-972924514


   **[Test build #145395 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145395/testReport)**
 for PR 34326 at commit 
[`818eba6`](https://github.com/apache/spark/commit/818eba69f703deaca110076d99c6830f5af481b2).





[GitHub] [spark] holdenk opened a new pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


holdenk opened a new pull request #34650:
URL: https://github.com/apache/spark/pull/34650


   ### What changes were proposed in this pull request?
   
   Keep track of, and communicate to the listener bus, how long we are waiting 
for executors to be allocated from the underlying cluster manager.
   
   
   ### Why are the changes needed?
   
   Sometimes the cluster manager may choke or otherwise be unable to allocate 
resources, and we have no good way of detecting this situation, which makes it 
difficult for the user to debug and to tell apart from Spark not scaling up 
correctly.
   
   ### Does this PR introduce _any_ user-facing change?
   
   New field in the listener bus message for when an executor is allocated.
   
   ### How was this patch tested?
   
   New unit test in the listener suite.





[GitHub] [spark] SparkQA commented on pull request #34649: [SPARK-35672][FOLLOWUP][TESTS] Add an exclusion rule to MimaExcludes.scala for Scala 2.13.

2021-11-18 Thread GitBox


SparkQA commented on pull request #34649:
URL: https://github.com/apache/spark/pull/34649#issuecomment-973136239


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49871/
   





[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973144066


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49872/
   





[GitHub] [spark] SparkQA commented on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


SparkQA commented on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973150233


   **[Test build #145401 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145401/testReport)**
 for PR 34651 at commit 
[`71f09df`](https://github.com/apache/spark/commit/71f09df4c464b92ca0ec2d603ab86d4eb3b5af31).





[GitHub] [spark] sunchao commented on a change in pull request #34611: [SPARK-35867][SQL] Enable vectorized read for VectorizedPlainValuesReader.readBooleans

2021-11-18 Thread GitBox


sunchao commented on a change in pull request #34611:
URL: https://github.com/apache/spark/pull/34611#discussion_r752516781



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedPlainValuesReader.java
##
@@ -53,19 +53,46 @@ public void skip() {
     throw new UnsupportedOperationException();
   }
 
+  private void updateCurrentByte() {
+    try {
+      currentByte = (byte) in.read();
+    } catch (IOException e) {
+      throw new ParquetDecodingException("Failed to read a byte", e);
+    }
+  }
+
   @Override
   public final void readBooleans(int total, WritableColumnVector c, int rowId) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      c.putBoolean(rowId + i, readBoolean());
+    int i = 0;
+    if (bitOffset > 0) {
+      i = Math.min(8 - bitOffset, total);
+      c.putBooleans(rowId, i, currentByte, bitOffset);
+      bitOffset = (bitOffset + i) & 7;
+    }
+    for (; i + 7 < total; i += 8) {
+      updateCurrentByte();
+      c.putBooleans(rowId + i, currentByte);
+    }
+    if (i < total) {
+      updateCurrentByte();
+      bitOffset = total - i;
+      c.putBooleans(rowId + i, bitOffset, currentByte, 0);
     }
   }
 
   @Override
   public final void skipBooleans(int total) {
-    // TODO: properly vectorize this
-    for (int i = 0; i < total; i++) {
-      readBoolean();
+    int skipBytes = (bitOffset + total - 8) / 8;
+    bitOffset = (bitOffset + total) & 7;
+    if (skipBytes >= 0) {
Review comment:
   Hmm, why should we enter this when `skipBytes == 0`? Suppose `bitOffset = 1` 
and `total = 3`; we shouldn't update `currentByte`, right?
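
The arithmetic behind this question can be checked in isolation. The sketch below is not part of the PR; it only evaluates the two expressions from the diff for the reviewer's example. The class and method names are hypothetical, and the `Math.floorDiv` variant is shown purely for comparison, not as the fix the PR adopted.

```java
public class SkipBooleansArithmetic {
    // Same expression as in the diff: Java's `/` truncates toward zero,
    // so a negative numerator smaller in magnitude than 8 yields 0.
    static int skipBytesTruncating(int bitOffset, int total) {
        return (bitOffset + total - 8) / 8;
    }

    // Hypothetical variant for comparison only: floorDiv rounds toward
    // negative infinity, so the same input yields -1 instead of 0.
    static int skipBytesFloor(int bitOffset, int total) {
        return Math.floorDiv(bitOffset + total - 8, 8);
    }

    // Same expression as in the diff for the new bit offset.
    static int nextBitOffset(int bitOffset, int total) {
        return (bitOffset + total) & 7;
    }

    public static void main(String[] args) {
        int bitOffset = 1;
        int total = 3;
        System.out.println(skipBytesTruncating(bitOffset, total)); // 0
        System.out.println(skipBytesFloor(bitOffset, total));      // -1
        System.out.println(nextBitOffset(bitOffset, total));       // 4
    }
}
```

With `bitOffset = 1` and `total = 3`, the skip stays inside the current byte (the new offset is 4), yet the truncating division gives `skipBytes == 0`, so the `skipBytes >= 0` branch is still taken.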

##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java
##
@@ -147,6 +147,18 @@ public void putBooleans(int rowId, int count, boolean value) {
     }
   }
 
+  @Override
+  public void putBooleans(int rowId, byte src) {

Review comment:
    I think this is much easier to read :) we can always optimize it later 
if it turns out to be important for performance (right now it seems not to be).
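
For readers without the PR open, a straightforward loop of the kind this comment appears to favor might look like the sketch below. The method body is not shown in this message, so everything here is an assumption: the class name, the plain `boolean[]` standing in for the column vector's storage, and the least-significant-bit-first order (which matches Parquet's PLAIN encoding for booleans).

```java
public class UnpackBooleansSketch {
    // Hypothetical stand-in for putBooleans(int rowId, byte src):
    // writes the 8 bits of `src` into dest[rowId .. rowId + 7],
    // least significant bit first (assumed bit order).
    static void putBooleans(boolean[] dest, int rowId, byte src) {
        for (int bit = 0; bit < 8; bit++) {
            dest[rowId + bit] = ((src >> bit) & 1) == 1;
        }
    }

    public static void main(String[] args) {
        boolean[] column = new boolean[16];
        // 0b10000101 has bits 0, 2 and 7 set.
        putBooleans(column, 8, (byte) 0b10000101);
        for (int i = 8; i < 16; i++) {
            System.out.println(i + ": " + column[i]);
        }
    }
}
```

A loop like this trades a little speed for clarity; a table lookup or unrolled version could replace it later if profiling shows it matters.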







[GitHub] [spark] SparkQA commented on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


SparkQA commented on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973156390


   **[Test build #145402 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145402/testReport)**
 for PR 34650 at commit 
[`a1ce1fd`](https://github.com/apache/spark/commit/a1ce1fd4bdea0b6755d65168feaf997b33d309ac).





[GitHub] [spark] AmplabJenkins commented on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-11-18 Thread GitBox


AmplabJenkins commented on pull request #34352:
URL: https://github.com/apache/spark/pull/34352#issuecomment-973156024


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145393/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34352: [SPARK-37018][SQL] Spark SQL should support create function with Aggregator

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34352:
URL: https://github.com/apache/spark/pull/34352#issuecomment-973156024


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145393/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34650: [WIP][SPARK-36664][CORE] Log time waiting for cluster resources

2021-11-18 Thread GitBox


AmplabJenkins removed a comment on pull request #34650:
URL: https://github.com/apache/spark/pull/34650#issuecomment-973118297


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145399/
   





[GitHub] [spark] SparkQA commented on pull request #34636: [WIP][SPARK-37359][K8S] Cleanup the Spark Kubernetes Integration tests

2021-11-18 Thread GitBox


SparkQA commented on pull request #34636:
URL: https://github.com/apache/spark/pull/34636#issuecomment-973196575


   **[Test build #145404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145404/testReport)**
 for PR 34636 at commit 
[`e1e61f2`](https://github.com/apache/spark/commit/e1e61f24dc88993733e7b2c274fa6e4c20fffb26).





[GitHub] [spark] SparkQA commented on pull request #34651: [WIP][SPARK-37373] Collecting LocalSparkContext worker logs in case of test failure

2021-11-18 Thread GitBox


SparkQA commented on pull request #34651:
URL: https://github.com/apache/spark/pull/34651#issuecomment-973277581


   **[Test build #145401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145401/testReport)**
 for PR 34651 at commit 
[`71f09df`](https://github.com/apache/spark/commit/71f09df4c464b92ca0ec2d603ab86d4eb3b5af31).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #34593: [SPARK-37324][SQL] Adds support for decimal rounding mode up, down, half_down

2021-11-18 Thread GitBox


SparkQA commented on pull request #34593:
URL: https://github.com/apache/spark/pull/34593#issuecomment-973284096


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49879/
   





[GitHub] [spark] SparkQA commented on pull request #34652: [SPARK-37224][SS][FOLLOWUP] Clarify the guide doc and fix the method doc

2021-11-18 Thread GitBox


SparkQA commented on pull request #34652:
URL: https://github.com/apache/spark/pull/34652#issuecomment-973293544


   **[Test build #145408 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145408/testReport)**
 for PR 34652 at commit 
[`6411abd`](https://github.com/apache/spark/commit/6411abd14ad1467b95a3713bb272b2ada1b89fe2).




