[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21342 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r190237452 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => --- End diff -- Sure, I'm glad take this and resolve in another pr. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189905459 --- Diff: core/src/main/scala/org/apache/spark/util/SparkFatalException.scala --- @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * fatal throwable in {@link scala.concurrent.Future}'s body, and re-throw + * SparkFatalException, which wraps the fatal throwable inside. --- End diff -- let's also mention that, we must use `ThreadUtil.awaitResult` to run the Future, catch this exception and re-throw the original exception. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189904589 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => --- End diff -- yea, I think Spark will only throw `SparkOutOfMemoryError` and we should only extend the error message for `SparkOutOfMemoryError`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189900950 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => --- End diff -- To be clear, @cloud-fan, do you mean that, ideally during `relationFuture`, all the OOM error thrown of type SparkOutOfMemoryError ? (SparkOutOfMemoryError subclass of OutOfMemoryError) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189754203 --- Diff: core/src/main/scala/org/apache/spark/util/SparkFatalException.scala --- @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * fatal throwable in {@link scala.concurrent.Future}'s body, and re-throw + * SparkFatalException, which wraps the fatal throwable inside. + */ +private[spark] final class SparkFatalException(val throwable: Throwable) extends Exception --- End diff -- OTOH I guess we're actually only using this in one place right now, so I think things are correct as written, but I was just kind of abstractly worrying about potential future pitfalls in case people start using this pattern in new code without also noticing the `ThreadUtils.awayResult` requirement. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189754010 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => -throw new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " + +throw new SparkFatalException( + new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " + --- End diff -- I agree that we're likely to have reclaimable space at this point, so the chance of a second OOM / failure here seems small. I'm pretty sure that the OutOfMemoryError being caught here often originates from Spark itself where we explicitly throw another `OutOfMemoryError` at a lower layer of the system, in which case we still actually have heap to allocate strings. We should investigate and clean up that practice, but let's do that in a separate PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189753564 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => --- End diff -- Agreed; let's fix that separately. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189753488 --- Diff: core/src/main/scala/org/apache/spark/util/SparkFatalException.scala --- @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * fatal throwable in {@link scala.concurrent.Future}'s body, and re-throw + * SparkFatalException, which wraps the fatal throwable inside. + */ +private[spark] final class SparkFatalException(val throwable: Throwable) extends Exception --- End diff -- Are there places where we fetch results from Futures without going through the `ThreadUtils.awaitResult`? In other words, is that a narrow waist? Would it make sense to add a second redundant layer of unwrapping at the top of `SparkUncaughtExceptionHandler` to handle that case? Not sure yet, but just thinking aloud here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189634704 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => -throw new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " + +throw new SparkFatalException( + new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " + --- End diff -- Just curious: Can we perform object operations (allocate `OutOfMemoryError`, allocate and concatenate `String`s) when we caught ` OutOfMemoryError`? I think that we have space since we failed to allocate a large object. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189632062 --- Diff: core/src/main/scala/org/apache/spark/util/SparkFatalException.scala --- @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * fatal throwable in {@link scala.concurrent.Future}'s body, and re-throw + * SparkFatalException, which wraps the fatal throwable inside. + */ +private[spark] final class SparkFatalException(val throwable: Throwable) extends Exception --- End diff -- I believe it will not. The `SparkFatalException` has a short life cycle: it is created inside `Future {}` and then caught and stripped by `ThreadUtils.awaitResult`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189627101 --- Diff: core/src/main/scala/org/apache/spark/util/SparkFatalException.scala --- @@ -0,0 +1,24 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.util + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * fatal throwable in {@link scala.concurrent.Future}'s body, and re-throw + * SparkFatalException, which wraps the fatal throwable inside. + */ +private[spark] final class SparkFatalException(val throwable: Throwable) extends Exception --- End diff -- Will this change impact https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/SparkUncaughtExceptionHandler.scala#L42 ? cc @zsxwing @JoshRosen --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189603209 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -111,12 +112,18 @@ case class BroadcastExchangeExec( SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq) broadcasted } catch { + // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw + // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult + // will catch this exception and re-throw the wrapped fatal throwable. case oe: OutOfMemoryError => --- End diff -- not related to this PR, but I'm a little worried about catching OOM here. Spark has `SparkOutOfMemoryError`, and it seems more reasonable to catch `SparkOutOfMemoryError`. This can be fixed in another PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189482538 --- Diff: core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryException.java --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.memory; + +import org.apache.spark.annotation.Private; + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * {@link OutOfMemoryError} in {@link scala.concurrent.Future}'s body, and re-throw + * SparkOutOfMemoryException instead. + */ +@Private +public final class SparkOutOfMemoryException extends Exception { + + private OutOfMemoryError oe; --- End diff -- @felixcheung thanks for review. In current change there is no SparkOutOfMemoryException. I wrap fatal Throwable in SparkFatalException --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21342#discussion_r189471396 --- Diff: core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryException.java --- @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.memory; + +import org.apache.spark.annotation.Private; + +/** + * SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we catch + * {@link OutOfMemoryError} in {@link scala.concurrent.Future}'s body, and re-throw + * SparkOutOfMemoryException instead. + */ +@Private +public final class SparkOutOfMemoryException extends Exception { + + private OutOfMemoryError oe; --- End diff -- so OutOfMemoryError subclasses java.lang.Error where as this is subclassing from Exeception, does this matter here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21342: [SPARK-24294] Throw SparkException when OOM in Br...
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/21342 [SPARK-24294] Throw SparkException when OOM in BroadcastExchangeExec ## What changes were proposed in this pull request? When OutOfMemoryError thrown from BroadcastExchangeExec, scala.concurrent.Future will hit scala bug â https://github.com/scala/bug/issues/9554, and hang until future timeout: We could wrap the OOM inside SparkException to resolve this issue. ## How was this patch tested? Manually tested. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jinxing64/spark SPARK-24294 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21342.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21342 commit 9d83f1fa397e205a826c2719a3b2978dae7d09ed Author: jinxingDate: 2018-05-16T07:49:34Z [SPARK-24294] Throw SparkException when OOM in BroadcastExchangeExec --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org