[GitHub] spark issue #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv writer...

2017-11-27 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19814 OK. Explicit description not found. I've just tested it manually and taken a look at the source. It tries to match the character with the configured quote and it never matches because u

[GitHub] spark pull request #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv...

2017-11-27 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19814#discussion_r153199400 --- Diff: python/pyspark/sql/readwriter.py --- @@ -828,8 +828,7 @@ def csv(self, path, mode=None, compression=None, sep=None, quote=None, escape

[GitHub] spark issue #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv writer...

2017-11-27 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19814 Suggested fix added. Happy to contribute in the followups if there are possibilities. Thanks :) --- - To unsubscribe, e

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-27 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/19826 [SPARK-22428][DOC] Add spark application garbage collector configurat… ## What changes were proposed in this pull request? The spark properties for configuring the ContextCleaner

[GitHub] spark pull request #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv...

2017-11-26 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19814#discussion_r153067049 --- Diff: python/pyspark/sql/readwriter.py --- @@ -828,8 +828,7 @@ def csv(self, path, mode=None, compression=None, sep=None, quote=None, escape

[GitHub] spark issue #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv writer...

2017-11-26 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19814 Good point, I'll speak with the Univocity guys... Then the question is what we should do with this PR and Jira ticket? I'm a newbie so any help/advice is highly appreciated

[GitHub] spark issue #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv writer...

2017-11-24 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19814 As far as I can see, the univocity library does not support explicitly turning off quoting on the write side

[GitHub] spark pull request #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv...

2017-11-24 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/19814 [SPARK-22484][DOC] Document PySpark DataFrame csv writer behavior whe… ## What changes were proposed in this pull request? In PySpark API Document, DataFrame.write.csv() says

[GitHub] spark issue #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv writer...

2017-11-26 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19814 Seems like there is a good reason why this is not supported. Here is their opinion: > The writer will only put quotes when absolutely required (i.e. the value has a delimiter, l
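
As a rough illustration of the "quotes only when absolutely required" behaviour quoted above, here is a minimal Python sketch. It uses Python's standard `csv` module with `csv.QUOTE_MINIMAL`, not univocity or Spark's CSV writer, so it only mirrors the idea and says nothing about how those libraries are implemented. The helper name `write_rows` is hypothetical.

```python
import csv
import io


def write_rows(rows, quotechar='"'):
    """Serialise rows to CSV text, quoting a field only when it needs it.

    Hypothetical helper for illustration; this is Python's csv module,
    not the univocity writer discussed in the thread.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields with special characters
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


# A plain value stays unquoted; a value containing the delimiter is quoted.
print(write_rows([["plain", "a,b"]]))
```

With this quoting mode a plain value is written as-is, while a value containing the delimiter or the quote character is wrapped in quotes.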

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-29 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19826#discussion_r153832461 --- Diff: docs/configuration.md --- @@ -2306,6 +2306,51 @@ showDF(properties, numRows = 200, truncate = FALSE) +### Spark

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-30 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19826#discussion_r154156746 --- Diff: docs/configuration.md --- @@ -2306,7 +2346,6 @@ showDF(properties, numRows = 200, truncate = FALSE) - --- End

[GitHub] spark issue #19814: [SPARK-22484][DOC] Document PySpark DataFrame csv writer...

2017-11-27 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19814 Seems like the job died because of a JVM internal issue: > *** glibc detected *** /usr/java/jdk1.8.0_60/bin/java: double free or corruption (out): 0x000100038

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19826#discussion_r153430954 --- Diff: docs/configuration.md --- @@ -2306,6 +2306,50 @@ showDF(properties, numRows = 200, truncate = FALSE) +### Spark

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19826#discussion_r153431150 --- Diff: docs/configuration.md --- @@ -2306,6 +2306,50 @@ showDF(properties, numRows = 200, truncate = FALSE) +### Spark

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19826#discussion_r153431216 --- Diff: docs/configuration.md --- @@ -2306,6 +2306,50 @@ showDF(properties, numRows = 200, truncate = FALSE) +### Spark

[GitHub] spark pull request #19826: [SPARK-22428][DOC] Add spark application garbage ...

2017-11-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19826#discussion_r153431033 --- Diff: docs/configuration.md --- @@ -2306,6 +2306,50 @@ showDF(properties, numRows = 200, truncate = FALSE) +### Spark

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-05 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/19893 [SPARK-16139][TEST] Add logging functionality for leaked threads in tests ## What changes were proposed in this pull request? Lots of our tests don't properly shutdown everything

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-14 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 Seems like the new feature caught some false positives in SQL: ``` = THREAD AUDIT POST ACTION CALLED WITHOUT PRE ACTION IN SUITE o.a.s.sql.sources.DataSourceAnalysisSuite

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-14 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 The last suspicious big group of threads (at least for me) is broadcast-exchange.* but as far as I can see this is not a false positive because the thread pool is never stopped. In BroadcastExchangeExec

[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames

2017-12-19 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/20019 cc @jiangxb1987 @gatorsmile @hvanhovell

[GitHub] spark pull request #20019: [SPARK-22361][SQL][TEST] Add unit test for Window...

2017-12-19 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/20019 [SPARK-22361][SQL][TEST] Add unit test for Window Frames ## What changes were proposed in this pull request? There are already quite a few integration tests using window frames

[GitHub] spark pull request #20022: Add unit test for Window spilling

2017-12-19 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/20022 Add unit test for Window spilling ## What changes were proposed in this pull request? There is already a test using window spilling, but the test coverage is not ideal

[GitHub] spark issue #20022: [SPARK-22363][SQL][TEST] Add unit test for Window spilli...

2017-12-19 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/20022 cc @jiangxb1987 @gatorsmile @hvanhovell

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-19 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 @squito I mean another jira, because it needs deeper analysis and discussion.

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 @squito thanks for sharing your findings, it's helpful. Yeah, I'm slowly digging deeper and finding out what these threads

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 @vanzin I've fixed the problematic tests and added code to ThreadAudit to highlight such situations. After the build we can see more

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-14 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r156902350 --- Diff: core/src/test/scala/org/apache/spark/ThreadAudit.scala --- @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-14 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r156902396 --- Diff: core/src/test/scala/org/apache/spark/ThreadAudit.scala --- @@ -0,0 +1,129 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-08 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 > Those try to keep the same session alive for multiple suites Good point to make this part clear. As a first step I've taken a look at the code and as I see SparkSess

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-08 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 As a next step I analysed the SQL test flow. Here are the steps: 1. SharedSparkSession.beforeAll is called, which initialises SparkSession and SQLContext 2. SparkFunSuite.beforeAll creates

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-08 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 Still to come. I'll put hive findings here the same way.

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155639667 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155639701 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155639843 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155639821 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155511923 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -20,10 +20,12 @@ package org.apache.spark // scalastyle:off import

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r16989 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,24 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155596370 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155600929 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155601696 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155598719 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155603848 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155605695 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155608281 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 Yeah, this is completely true. This enhancement definitely will not solve the issues once and for all. The problems were hidden until now and we would like to take a step forward and make

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-07 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155626333 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-11 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 I've analysed the Hive-related test flow and found SparkSession and SQLContext sharing between suites, as you mentioned. Here is the execution flow: 1. The first hive test suite

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-11 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 @jiangxb1987 I'm not sure I understand your concern correctly, but there is no intention to modify the shared `TestHiveContext` among suites. It will remain as it is now. As an additional

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-11 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 @jiangxb1987 feel free to take a look at it. More eyes, more possibilities.

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 I've taken a look at the failed test but it seems unrelated.

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 I have manually gathered statistics about the current state. I've grep-ed unit-test.logs in the whole build: ``` bash-3.2$ find . -type f | xargs grep "POSSIBLE THREAD LEAK&

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-05 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 cc @squito @srowen @HyukjinKwon

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155370597 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -52,6 +62,23 @@ abstract class SparkFunSuite getTestResourceFile

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155370856 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -72,3 +99,27 @@ abstract class SparkFunSuite

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 In the meantime I analysed a couple of cases and found netty-related threads: - netty.* - globalEventExecutor.* - threadDeathWatcher.* I've added them to the whitelist
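
The whitelist idea in the comment above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Scala `ThreadAudit` code from the pull request: only the three patterns come from the comment, while the function name `unexpected_threads` and the sample thread names are hypothetical.

```python
import re

# Patterns taken from the comment above; everything else is illustrative.
THREAD_WHITELIST = [
    r"netty.*",
    r"globalEventExecutor.*",
    r"threadDeathWatcher.*",
]


def unexpected_threads(before, after, whitelist=THREAD_WHITELIST):
    """Return names of threads that appeared between two snapshots
    and do not match any whitelisted pattern (hypothetical helper)."""
    patterns = [re.compile(p) for p in whitelist]
    new_names = set(after) - set(before)
    return sorted(
        name
        for name in new_names
        if not any(p.fullmatch(name) for p in patterns)
    )


# Sample thread names below are made up for the example.
before = {"main"}
after = {"main", "netty-worker-1", "globalEventExecutor-1-2", "broadcast-exchange-0"}
print(unexpected_threads(before, after))
```

Comparing name snapshots taken before and after a suite keeps the check cheap, and the whitelist keeps known long-lived threads out of the report.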

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155370650 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -52,6 +62,23 @@ abstract class SparkFunSuite getTestResourceFile

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155370694 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -52,6 +62,23 @@ abstract class SparkFunSuite getTestResourceFile

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155370741 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -52,6 +62,23 @@ abstract class SparkFunSuite getTestResourceFile

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155383597 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,24 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155385446 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -52,6 +66,23 @@ abstract class SparkFunSuite getTestResourceFile

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155386336 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,24 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r155381411 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -20,10 +20,12 @@ package org.apache.spark // scalastyle:off import

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-12 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r156385068 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-12 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r156385819 --- Diff: core/src/test/scala/org/apache/spark/SparkFunSuite.scala --- @@ -34,12 +36,53 @@ abstract class SparkFunSuite with Logging

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-12 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 @vanzin I've fixed the SQL test flow and additionally I've made the implementation less invasive by extracting the logic into a trait

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 On the other hand, globalEventExecutor.* and dag-scheduler-event-loop were an issue in the tests that I've taken a look

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 I've just started to take a deeper look at it and found some patterns. Namely we can exclude all netty.* threads + ForkJoinPool.* is most of the time but not always created inside scala

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-06 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 Here is a list but it definitely contains false positives. [SPARK-16139.txt](https://github.com/apache/spark/files/1535397/SPARK-16139.txt

[GitHub] spark issue #19893: [SPARK-16139][TEST] Add logging functionality for leaked...

2017-12-05 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/19893 Good point, I've also struggled to collect all actual problems :) Format changed to the following: ``` = FINISHED o.a.s.scheduler.DAGSchedulerSuite: 'task end event should

[GitHub] spark pull request #19893: [SPARK-16139][TEST] Add logging functionality for...

2017-12-05 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19893#discussion_r154958833 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala --- @@ -683,7 +683,7 @@ class TaskSetManagerSuite extends

[GitHub] spark pull request #19035: [SPARK-21822][SQL]When insert Hive Table is finis...

2017-12-20 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19035#discussion_r158047414 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -435,6 +435,18 @@ case class

[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames

2017-12-20 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/20019 @smurakozi nice catch, added them. Additionally found a nit.

[GitHub] spark pull request #19035: [SPARK-21822][SQL]When insert Hive Table is finis...

2017-12-20 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19035#discussion_r158050733 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -435,6 +435,18 @@ case class

[GitHub] spark issue #21430: [SPARK-23991][DSTREAMS] Fix data loss when WAL write fai...

2018-05-25 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21430 cc @vanzin

[GitHub] spark pull request #21430: [SPARK-23991][DSTREAMS] Fix data loss when WAL wr...

2018-05-25 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/21430 [SPARK-23991][DSTREAMS] Fix data loss when WAL write fails in allocateBlocksToBatch ## What changes were proposed in this pull request? When blocks tried to get allocated to a batch

[GitHub] spark issue #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of cached ...

2018-05-22 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/20997 Do I need to make any further changes?

[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-06-12 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21450 The issue seems unrelated. ``` [error] (kubernetes-integration-tests/*:checkstyle) java.io.FileNotFoundException: checkstyle-config.xml (No such file or directory) ``` Trying

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-06-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21450#discussion_r195016532 --- Diff: launcher/src/main/java/org/apache/spark/launcher/Main.java --- @@ -101,6 +99,22 @@ public static void main(String[] argsArray) throws

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-06-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21450#discussion_r195016477 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -138,25 +139,30 @@ case

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-06-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21450#discussion_r195016584 --- Diff: launcher/src/test/java/org/apache/spark/launcher/SparkSubmitCommandBuilderSuite.java --- @@ -190,6 +197,36 @@ public void testSparkRShell

[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-05-29 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21450 test this please

[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-05-29 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21450 cc @vanzin

[GitHub] spark issue #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution w...

2018-05-30 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21450 Updated the description to reflect the no arg case as well.

[GitHub] spark issue #21430: [SPARK-23991][DSTREAMS] Fix data loss when WAL write fai...

2018-05-29 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21430 Thanks @vanzin @jerryshao for the help.

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-06-01 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21450#discussion_r192386271 --- Diff: launcher/src/test/java/org/apache/spark/launcher/SparkSubmitCommandBuilderSuite.java --- @@ -190,6 +194,23 @@ public void testSparkRShell

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-06-01 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21450#discussion_r192385912 --- Diff: launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java --- @@ -229,7 +238,7 @@ args.add(join

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-06-01 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21450#discussion_r192385957 --- Diff: launcher/src/test/java/org/apache/spark/launcher/SparkSubmitCommandBuilderSuite.java --- @@ -344,8 +365,15 @@ private

[GitHub] spark pull request #21450: [SPARK-24319][SPARK SUBMIT] Fix spark-submit exec...

2018-05-29 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/21450 [SPARK-24319][SPARK SUBMIT] Fix spark-submit execution where no main class is required. ## What changes were proposed in this pull request? With [PR 20925](https://github.com/apache

[GitHub] spark pull request #21430: [SPARK-23991][DSTREAMS] Fix data loss when WAL wr...

2018-05-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21430#discussion_r191130178 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceivedBlockTracker.scala --- @@ -112,10 +112,13 @@ private[streaming

[GitHub] spark pull request #21430: [SPARK-23991][DSTREAMS] Fix data loss when WAL wr...

2018-05-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21430#discussion_r191130212 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceivedBlockTrackerSuite.scala --- @@ -308,12 +354,16 @@ class

[GitHub] spark pull request #21430: [SPARK-23991][DSTREAMS] Fix data loss when WAL wr...

2018-05-28 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/21430#discussion_r191130186 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceivedBlockTrackerSuite.scala --- @@ -115,6 +117,50 @@ class

[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21685 What I can't really understand is why the `Scheduler Delay` is so different. `Scheduler delay includes time to ship the task from the scheduler to the executor, and time to send

[GitHub] spark issue #21685: [SPARK-24707][DSTREAMS] Enable spark-kafka-streaming to ...

2018-07-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21685 In general `KafkaConsumer.poll` should take a couple of seconds, but 10+ is extremely high. The question `why does it take so long?` has to be answered first. In the processing time chart I see

[GitHub] spark issue #21455: [SPARK-24093][DStream][Minor]Make some fields of KafkaSt...

2018-06-26 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21455 Why is it required at all? Making things visible without a proper reason is not a good idea.

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-05-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r185451919 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,359

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-05-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r185451988 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,359

[GitHub] spark pull request #21214: [SPARK-23775][TEST] Make DataFrameRangeSuite not ...

2018-05-02 Thread gaborgsomogyi
GitHub user gaborgsomogyi opened a pull request: https://github.com/apache/spark/pull/21214 [SPARK-23775][TEST] Make DataFrameRangeSuite not flaky ## What changes were proposed in this pull request? DataFrameRangeSuite.test("Cancelling stage in a query with Range."

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-05-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r185452496 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,359

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-05-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r185451746 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,359

[GitHub] spark issue #21214: [SPARK-23775][TEST] Make DataFrameRangeSuite not flaky

2018-05-02 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/21214 retest this please
