Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/23058
The change looks good to me. I understand that this change uses memory
efficiently but I am wondering whether it causes any performance degradation
compared to memory mapping. If yes, can we
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
retest this please
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226718233
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -800,14 +817,33 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226710526
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -800,14 +817,33 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226492916
--- Diff: docs/configuration.md ---
@@ -266,6 +266,37 @@ of the most common options to set are:
Only has effect in Spark standalone mode or Mesos
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226492665
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226492521
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -800,14 +817,33 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226463678
--- Diff: docs/configuration.md ---
@@ -266,6 +266,37 @@ of the most common options to set are:
Only has effect in Spark standalone mode or Mesos
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226463522
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226421974
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226421401
--- Diff: docs/configuration.md ---
@@ -266,6 +266,37 @@ of the most common options to set are:
Only has effect in Spark standalone mode or Mesos
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r226418231
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r225343405
--- Diff:
core/src/test/scala/org/apache/spark/util/logging/DriverLoggerSuite.scala ---
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r224189470
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -806,6 +806,22 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223901876
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,175 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223901520
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -806,6 +806,22 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223900426
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -806,6 +806,22 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223871565
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,175 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223866023
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -48,6 +48,19 @@ package object config {
.bytesConf
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223859868
--- Diff:
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -806,6 +806,22 @@ private[history] class FsHistoryProvider
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r223519892
--- Diff: core/src/main/scala/org/apache/spark/internal/Logging.scala ---
@@ -192,7 +211,15 @@ private[spark] object Logging
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
I looked further into it and it seems that writing to yarn app dir will
mean following certain guidelines: writing filelength, acls, owner, version,
using a compression algorithm, etc. And since
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
Got it, I didn't test that. Let me verify that and explore alternate
options.
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
It worked in my environment, but let me look at the Hadoop classes to see if it
can be done in a better manner. Meanwhile, I will merge the cleaner code in
this PR
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
Also, for our use case, we are planning to get the logs just via yarn logs
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
Thanks for your comment @tgravescs
Moving the file to yarn logs app dir is optional and users can choose to
disable that. In my case, if they only enable syncToHdfs, the file
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22616#discussion_r222125433
--- Diff: core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala ---
@@ -285,8 +285,6 @@ private[spark] object ClosureCleaner extends Logging
Github user ankuriitg closed the pull request at:
https://github.com/apache/spark/pull/22604
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22616
@srowen @yanboliang
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22604
I have opened another PR for this, with the recommended changes:
https://github.com/apache/spark/pull/22616
I had to change a few other logDebug statements as well. Please let me know
GitHub user ankuriitg opened a pull request:
https://github.com/apache/spark/pull/22616
[SPARK-25586][Core] Remove logdebug statement for outer objects from
ClosureCleaner
## What changes were proposed in this pull request?
Cause: Recently test_glr_summary failed
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22604
Makes sense and let me update the description about SPARK-25118
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22604
If I remove the following code, then the test succeeds and no spark jobs
are started. Shall I do this instead?
`
sb.append("\n")
val nd = s"Null de
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22604
A few variables (I think there are 3) printed in toString cause a Spark job
to be started, mainly because those variables are lazily evaluated.
I can remove those variables from
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22604
@actuaryzhang @yanboliang
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22604
@vanzin @squito @bersprockets
GitHub user ankuriitg opened a pull request:
https://github.com/apache/spark/pull/22604
[SPARK-25586][MLlib][Core] Replace toString method with a summarize method in
GeneralizedLinearRegressionTrainingSummary
## What changes were proposed in this pull request
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
Sure, let me do that
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
@srowen - the bug is causing a unit-test failure on this PR. How should I
resolve it?
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
@srowen @jkbradley - it seems that my PR has uncovered a bug in another
class. GeneralizedLinearRegressionTrainingSummary.toString actually runs a
Spark job, so it ends up in an infinite
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
Yes, I will try to reproduce it locally
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22504#discussion_r220274067
--- Diff:
core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala ---
@@ -0,0 +1,241 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22504
cc: @vanzin @squito
GitHub user ankuriitg opened a pull request:
https://github.com/apache/spark/pull/22504
[SPARK-25118][Submit] Persist Driver Logs in Yarn Client mode to Hdfs
## What changes were proposed in this pull request?
Currently, we do not have a mechanism to collect driver logs
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r215345106
--- Diff:
streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala ---
@@ -77,7 +77,14 @@ class UISeleniumSuite
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r215056633
--- Diff:
core/src/test/scala/org/apache/spark/status/AppStatusListenerSuite.scala ---
@@ -1190,6 +1190,61 @@ class AppStatusListenerSuite extends
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r214384020
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,20 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r214372932
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,20 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r214372050
--- Diff:
core/src/test/scala/org/apache/spark/status/AppStatusListenerSuite.scala ---
@@ -1190,6 +1190,61 @@ class AppStatusListenerSuite extends
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r213859644
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,21 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r213859611
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -506,7 +516,16 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r213857850
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -506,7 +516,16 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r213495693
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,22 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r213372734
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,22 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r212773791
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,16 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r212688205
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,16 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r212498716
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,16 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r212471192
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,16 @@ private[spark] class AppStatusListener
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22209
Just checked: this fixes the Spark History Server issue (SPARK-24539) as
well. I just needed to restart SHS for my changes to take effect
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22209
Thanks for your comment @tgravescs. This fix is indeed for the SparkUI
(running jobs) issue. It does not fix it for history server yet
Github user ankuriitg commented on the issue:
https://github.com/apache/spark/pull/22209
@vanzin @squito
GitHub user ankuriitg opened a pull request:
https://github.com/apache/spark/pull/22209
[SPARK-24415][Core] HistoryServer does not display metrics from tasks that
complete after stage failure
## What changes were proposed in this pull request
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21916#discussion_r206264100
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21916#discussion_r206265858
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21916#discussion_r206269773
--- Diff: core/src/main/scala/org/apache/spark/memory/MemoryManager.scala ---
@@ -180,6 +181,34 @@ private[spark] abstract class MemoryManager
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21916#discussion_r206266419
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21916#discussion_r206264805
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software
Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/21916#discussion_r206267854
--- Diff:
core/src/main/scala/org/apache/spark/executor/ProcfsBasedSystems.scala ---
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software
69 matches