[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-28 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/21087 Any further changes? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-08-09 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/21087#discussion_r209081079 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -580,6 +580,11 @@ object SQLConf { .booleanConf

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-06 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/21087 retest this please

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-08-06 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/21087#discussion_r207815199 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -580,6 +580,11 @@ object SQLConf { .booleanConf

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-03 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/21087 It seems the tests timed out. Any chance to re-run them?

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-02 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/21087 retest this please

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-01 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/21087 It would be great if some admin could review. If there is anything to improve, please let me know. It is very simple though

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-04-23 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/21087 retest this please

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-04-17 Thread ferdonline
GitHub user ferdonline opened a pull request: https://github.com/apache/spark/pull/21087 [SPARK-23997][SQL] Configurable maximum number of buckets ## What changes were proposed in this pull request? This PR allows the user to override the maximum number

[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

2018-01-16 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/20269 retest this please

[GitHub] spark issue #20269: [SPARK-23029] [DOCS] Specifying default units of configu...

2018-01-16 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/20269 Hi. Thanks for your review. Sounds good, I will go around and add a "unit blurb" to them. I wrote "Default unit: X" to keep it the shortest and very obvious, but I

[GitHub] spark pull request #20269: [SPARK-23029] [DOCS] Specifying default units of ...

2018-01-15 Thread ferdonline
GitHub user ferdonline opened a pull request: https://github.com/apache/spark/pull/20269 [SPARK-23029] [DOCS] Specifying default units of configuration entries ## What changes were proposed in this pull request? This PR completes the docs, specifying the default units assumed

[GitHub] spark pull request #19805: [SPARK-22649][PYTHON][SQL] Adding localCheckpoint...

2017-12-17 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r157371948 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -540,9 +540,52 @@ class Dataset[T] private[sql

[GitHub] spark issue #19805: [SPARK-22649][PYTHON][SQL] Adding localCheckpoint to Dat...

2017-12-12 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/19805 Existing checkpoint tests applied to localCheckpoint as well, all working well. Please verify. I'm getting an unrelated fail in SparkR, did anything change in the build system

[GitHub] spark pull request #19805: [SPARK-22649][PYTHON][SQL] Adding localCheckpoint...

2017-12-02 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r154495320 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -537,9 +537,48 @@ class Dataset[T] private[sql

[GitHub] spark pull request #19805: [SPARK-22649][PYTHON][SQL] Adding localCheckpoint...

2017-11-29 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r153802577 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -537,9 +536,55 @@ class Dataset[T] private[sql

[GitHub] spark pull request #19805: [SPARK-22649][PYTHON][SQL] Adding localCheckpoint...

2017-11-29 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r153746129 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -537,9 +536,55 @@ class Dataset[T] private[sql

[GitHub] spark pull request #19805: [PYTHON][SQL] Adding localCheckpoint to Dataset A...

2017-11-29 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r153722443 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -537,9 +536,55 @@ class Dataset[T] private[sql

[GitHub] spark issue #19805: [PYTHON][SQL] Adding localCheckpoint to Dataset API

2017-11-28 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/19805 retest this please

[GitHub] spark issue #19805: [PYTHON][SQL] Adding localCheckpoint to Dataset API

2017-11-27 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/19805 retest this please

[GitHub] spark pull request #19805: [PYTHON][SQL] Adding localCheckpoint to Dataset A...

2017-11-27 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r153147567 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -524,22 +524,41 @@ class Dataset[T] private[sql

[GitHub] spark pull request #19805: [PYTHON][SQL] Adding localCheckpoint to Dataset A...

2017-11-27 Thread ferdonline
Github user ferdonline commented on a diff in the pull request: https://github.com/apache/spark/pull/19805#discussion_r153145177 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -524,22 +524,41 @@ class Dataset[T] private[sql

[GitHub] spark issue #19805: [PYTHON][SQL] Adding localCheckpoint to Dataset API

2017-11-27 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/19805 Thanks for your review. I'm working on the changes

[GitHub] spark issue #19805: [SQL] Adding localCheckpoint to Dataset API

2017-11-24 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/19805 cc @andrewor14 since I believe you know this part of Spark pretty well, maybe you could help me integrate this. Any idea why Jenkins didn't start the testing

[GitHub] spark issue #19805: Adding localCheckpoint to Dataframe API

2017-11-24 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/19805 retest this please

[GitHub] spark pull request #19805: Adding localCheckpoint to Dataframe API

2017-11-23 Thread ferdonline
GitHub user ferdonline opened a pull request: https://github.com/apache/spark/pull/19805 Adding localCheckpoint to Dataframe API ## What changes were proposed in this pull request? This change adds local checkpoint support to datasets and the respective binding from Python

[GitHub] spark issue #9428: [SPARK-8582][Core]Optimize checkpointing to avoid computi...

2017-11-17 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/9428 That's the reason why I want to checkpoint when they are first calculated. Further transformations use these results several times. Of course it's not a problem per se to calculate twice

[GitHub] spark issue #9428: [SPARK-8582][Core]Optimize checkpointing to avoid computi...

2017-11-16 Thread ferdonline
Github user ferdonline commented on the issue: https://github.com/apache/spark/pull/9428 Hello. I find this feature to be really important and I would be happy to contribute here. Even though we would potentially not support every use case, it would already be great