[jira] [Created] (SPARK-13735) Log for parquet relation reading files is too verbose
Zhong Wang created SPARK-13735:
--

Summary: Log for parquet relation reading files is too verbose
Key: SPARK-13735
URL: https://issues.apache.org/jira/browse/SPARK-13735
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 1.6.0
Reporter: Zhong Wang
Priority: Trivial

The INFO-level logging lists every file read by the Parquet relation, which is far too verbose when the input contains many files.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
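Until the logging itself is demoted, one workaround is to raise the threshold for the relevant logger in conf/log4j.properties. The logger name below is an assumption based on Spark 1.6's package layout and may need adjusting for other versions:

```properties
# Silence the per-file INFO lines printed when a Parquet relation enumerates
# its input files (logger name assumes the Spark 1.6 package layout).
log4j.logger.org.apache.spark.sql.execution.datasources.parquet=WARN
```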
[jira] [Created] (SPARK-13704) TaskSchedulerImpl.createTaskSetManager can be expensive, and result in lost executors due to blocked heartbeats
Zhong Wang created SPARK-13704:
--

Summary: TaskSchedulerImpl.createTaskSetManager can be expensive, and result in lost executors due to blocked heartbeats
Key: SPARK-13704
URL: https://issues.apache.org/jira/browse/SPARK-13704
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 1.6.0, 1.5.2, 1.4.1, 1.3.1
Reporter: Zhong Wang

In some cases, TaskSchedulerImpl.createTaskSetManager can be expensive. For example, in a Yarn cluster it may call the topology script for rack awareness. When submitting a very large job on a very large Yarn cluster, the topology script may take significant time to run, and this blocks the receipt of executors' heartbeats, which may result in lost executors.

Stack traces we observed that are related to this issue:

{code}
"dag-scheduler-event-loop" daemon prio=10 tid=0x7f8392875800 nid=0x26e8 runnable [0x7f83576f4000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:272)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	- locked <0xf551f460> (a java.lang.UNIXProcess$ProcessPipeInputStream)
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
	- locked <0xf5529740> (a java.io.InputStreamReader)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.fill(BufferedReader.java:154)
	at java.io.BufferedReader.read1(BufferedReader.java:205)
	at java.io.BufferedReader.read(BufferedReader.java:279)
	- locked <0xf5529740> (a java.io.InputStreamReader)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:728)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:524)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.runResolveCommand(ScriptBasedMapping.java:251)
	at org.apache.hadoop.net.ScriptBasedMapping$RawScriptBasedMapping.resolve(ScriptBasedMapping.java:188)
	at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:119)
	at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:101)
	at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:81)
	at org.apache.spark.scheduler.cluster.YarnScheduler.getRackForHost(YarnScheduler.scala:38)
	at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:210)
	at org.apache.spark.scheduler.TaskSetManager$$anonfun$org$apache$spark$scheduler$TaskSetManager$$addPendingTask$1.apply(TaskSetManager.scala:189)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.scheduler.TaskSetManager.org$apache$spark$scheduler$TaskSetManager$$addPendingTask(TaskSetManager.scala:189)
	at org.apache.spark.scheduler.TaskSetManager$$anonfun$1.apply$mcVI$sp(TaskSetManager.scala:158)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
	at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:157)
	at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:187)
	at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:161)
	- locked <0xea3b8a88> (a org.apache.spark.scheduler.cluster.YarnScheduler)
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:872)
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:778)
	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:762)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

"sparkDriver-akka.actor.default-dispatcher-15" daemon prio=10 tid=0x7f829c02 nid=0x2737 waiting for monitor entry [0x7f8355ebd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.spark.scheduler.TaskSchedule
{code}
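The frames above show each addPendingTask call resolving a host's rack through the topology script while the scheduler lock is held, so the cost scales with the number of tasks rather than the number of distinct hosts. A minimal sketch of why memoizing the lookup helps (plain Python with a hypothetical stand-in for the script invocation; Spark's actual scheduler is Scala and its fix may differ):

```python
from functools import lru_cache

# Counts how often the "topology script" actually runs; in a real cluster
# each invocation forks a shell process, which is the expensive part.
resolve_calls = 0

@lru_cache(maxsize=None)
def rack_for_host(host: str) -> str:
    """Memoized stand-in for the rack-awareness topology script."""
    global resolve_calls
    resolve_calls += 1
    return "/rack-%d" % (hash(host) % 4)

# Submitting a large task set touches many tasks but few distinct hosts:
tasks = ["node1", "node2", "node1", "node3", "node2", "node1"] * 1000

for host in tasks:
    rack_for_host(host)

# The script ran once per distinct host, not once per task.
assert resolve_calls == 3
```

With the cache, the blocking work drops from O(tasks) script invocations to O(distinct hosts), which shortens the window during which heartbeat processing is blocked.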
[jira] [Commented] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179339#comment-15179339 ]

Zhong Wang commented on SPARK-13337:
--

The current join method with the usingColumns argument generates a result like TableC. The limitation is that it doesn't support a null-safe join.

> DataFrame join-on-columns function should support null-safe equal
> -
>
> Key: SPARK-13337
> URL: https://issues.apache.org/jira/browse/SPARK-13337
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: Zhong Wang
> Priority: Minor
>
> Currently, the join-on-columns function:
> {code}
> def join(right: DataFrame, usingColumns: Seq[String], joinType: String): DataFrame
> {code}
> performs a null-unsafe join. It would be great if there were an option for a null-safe join.
[jira] [Comment Edited] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175154#comment-15175154 ]

Zhong Wang edited comment on SPARK-13337 at 3/2/16 6:50 AM:
--

Suppose we are joining two tables:

TableA
||key1||key2||value1||
|null|k1|v1|
|k2|k3|v2|

TableB
||key1||key2||value2||
|null|k1|v3|
|k4|k5|v4|

The result table I want is:

TableC
||key1||key2||value1||value2||
|null|k1|v1|v3|
|k2|k3|v2|null|
|k4|k5|null|v4|

We cannot use the current join-using-columns interface, because it doesn't support null-safe joins and we have null values in the first row. We cannot use join-select with an explicit "<=>" either, because the output table will look like:

||df1.key1||df1.key2||df2.key1||df2.key2||value1||value2||
|null|k1|null|k1|v1|v3|
|k2|k3|null|null|v2|null|
|null|null|k4|k5|null|v4|

It is difficult to get a result like TableC using a select clause, because the null values from the outer join (rows 2 and 3) can appear in both the df1.* columns and the df2.* columns.

Hope this makes sense to you. I'd like to submit a PR if this is a real use case.
[jira] [Commented] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175154#comment-15175154 ]

Zhong Wang commented on SPARK-13337:
--

Suppose we have two tables:

TableA
||key1||key2||value1||
|null|k1|v1|
|k2|k3|v2|

TableB
||key1||key2||value2||
|null|k1|v3|
|k4|k5|v4|

The result table I want is:

TableC
||key1||key2||value1||value2||
|null|k1|v1|v3|
|k2|k3|v2|null|
|k4|k5|null|v4|

We cannot use the current join-using-columns interface, because it doesn't support null-safe joins and we have null values in the first row. We cannot use join-select with an explicit "<=>" either, because the output table will look like:

||df1.key1||df1.key2||df2.key1||df2.key2||value1||value2||
|null|k1|null|k1|v1|v3|
|k2|k3|null|null|v2|null|
|null|null|k4|k5|null|v4|

It is difficult to get a result like TableC using a select clause, because the null values from the outer join (rows 2 and 3) can appear in both the df1.* columns and the df2.* columns.

Hope this makes sense to you. I'd like to submit a PR if this is a real use case.
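The desired semantics can be made concrete with a small simulation. This sketch uses plain Python tuples in place of DataFrame rows (with None standing in for SQL null, since the actual Spark API under discussion is Scala): a full outer join on the two key columns using null-safe equality, with the join keys coalesced so each appears once, reproducing TableC from the comment above:

```python
# Rows from the example: TableA and TableB, with None as SQL null.
table_a = [(None, "k1", "v1"), ("k2", "k3", "v2")]
table_b = [(None, "k1", "v3"), ("k4", "k5", "v4")]

def null_safe_outer_join(left, right):
    """Full outer join on (key1, key2) using null-safe equality (<=>),
    coalescing the key columns so each appears once, as in TableC."""
    result, matched_right = [], set()
    for lk1, lk2, lv in left:
        hit = False
        for i, (rk1, rk2, rv) in enumerate(right):
            # Tuple equality treats None == None as true, i.e. null-safe.
            if (lk1, lk2) == (rk1, rk2):
                result.append((lk1, lk2, lv, rv))
                matched_right.add(i)
                hit = True
        if not hit:
            result.append((lk1, lk2, lv, None))
    # Right rows with no match contribute their own keys and a null value1.
    for i, (rk1, rk2, rv) in enumerate(right):
        if i not in matched_right:
            result.append((rk1, rk2, None, rv))
    return result

table_c = null_safe_outer_join(table_a, table_b)
assert table_c == [
    (None, "k1", "v1", "v3"),
    ("k2", "k3", "v2", None),
    ("k4", "k5", None, "v4"),
]
```

This is exactly what a null-safe join-using-columns would do in one call: the (None, "k1") rows match each other instead of producing two half-null rows, and the key columns are not duplicated in the output.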
[jira] [Comment Edited] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172881#comment-15172881 ]

Zhong Wang edited comment on SPARK-13337 at 2/29/16 11:40 PM:
--

It doesn't help in my case, because it doesn't support null-safe joins. It would be great if there were an interface like:

{code}
def join(right: DataFrame, usingColumns: Seq[String], joinType: String, nullSafe: Boolean): DataFrame
{code}

The current join-using-column interface works great if the joining tables don't contain null values: it can automatically eliminate the null columns generated from outer joins. The general join methods in your example support null-safe joins perfectly, but they cannot automatically eliminate the null columns generated from outer joins.

Sorry that it is a little bit complicated here. Please let me know if you need a concrete example.
[jira] [Commented] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172881#comment-15172881 ]

Zhong Wang commented on SPARK-13337:
--

It doesn't help in my case, because it doesn't support null-safe joins. It would be great if there were an interface like:

{code}
def join(right: DataFrame, usingColumns: Seq[String], joinType: String, nullSafe: Boolean): DataFrame
{code}

It works great if the joining tables don't contain null values: it can automatically eliminate the null columns generated from outer joins. The general join methods in your example support null-safe joins perfectly, but they cannot automatically eliminate the null columns generated from outer joins.

Sorry that it is a little bit complicated here. Please let me know if you need a concrete example.
[jira] [Comment Edited] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172709#comment-15172709 ]

Zhong Wang edited comment on SPARK-13337 at 2/29/16 10:05 PM:
--

For an outer join, it is difficult to eliminate the null columns from the result, because the null columns can come from both tables. The `join-using-column` interface can automatically eliminate those columns, which is very convenient. Sorry that I missed this point in my last reply.
[jira] [Commented] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172709#comment-15172709 ]

Zhong Wang commented on SPARK-13337:
--

For an outer join, it is difficult to eliminate the null columns from the result. The `join-using-column` interface can automatically eliminate those columns, which is very convenient. Sorry that I missed this point in my last reply.
[jira] [Commented] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156579#comment-15156579 ]

Zhong Wang commented on SPARK-13337:
--

Unfortunately no... I use the join-on-columns function to perform a natural join. It can eliminate the redundant columns in the resulting table, which is required by our use case.
[jira] [Updated] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
[ https://issues.apache.org/jira/browse/SPARK-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Wang updated SPARK-13337:
---

Description:
Currently, the join-on-columns function:
{code}
def join(right: DataFrame, usingColumns: Seq[String], joinType: String): DataFrame
{code}
performs a null-unsafe join. It would be great if there were an option for a null-safe join.

was:
Currently, the join-on-columns function:
def join(right: DataFrame, usingColumns: Seq[String], joinType: String): DataFrame
performs a null-unsafe join. It would be great if there were an option for a null-safe join.
[jira] [Created] (SPARK-13337) DataFrame join-on-columns function should support null-safe equal
Zhong Wang created SPARK-13337:
--

Summary: DataFrame join-on-columns function should support null-safe equal
Key: SPARK-13337
URL: https://issues.apache.org/jira/browse/SPARK-13337
Project: Spark
Issue Type: Improvement
Affects Versions: 1.6.0
Reporter: Zhong Wang
Priority: Minor

Currently, the join-on-columns function:

def join(right: DataFrame, usingColumns: Seq[String], joinType: String): DataFrame

performs a null-unsafe join. It would be great if there were an option for a null-safe join.