[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-11-03 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 I am closing this because as @kiszk pointed out in his comment, there is no reliable way to get `SQLConf` here. --- - To

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-08 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19447 I feel it is a bit annoying to add a parameters for each Constant Pool issue and we better look for solutions so that less parameters (e.g., other metrics as @kiszk suggested) can almost solve the

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-07 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19447 About which "number", which "limit" to set there, we want to see what problem happens in test cases or your precise description (what class for code generation causes a lot of small methods while the

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-07 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Thanks @kiszk, that would most probably work, but I am not sure about which "number", which "limit" to set there. This is the reason why I thought that a configuration might have been better,

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-07 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19447 Thank you for your clarification. I understand that you originally addressed `JVM limit of 0x` caused by a lot of small methods. For 2, my idea is that it would be good to monitor the

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-07 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Thanks for your comment @kiszk. It make sense, actually, that a very small number like 1 can cause exceptions. I will explain later why this happens, now I will start with a brief introduction

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-07 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19447 @mgaido91 When I reduced this value to 1, [this test case](https://github.com/apache/spark/pull/16648/files#diff-c7f041fda7fd9aa1a5225f86bab4b1b0R69) causes an exception `JVM limit of 0x`.

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-07 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Yes, what I can do as a test case is that I can check how many inner classes are generated changing the configuration value, is it ok? ---

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19447 @gatorsmile I am happy to take it. We have [one test case](https://github.com/apache/spark/pull/16648/files#diff-c7f041fda7fd9aa1a5225f86bab4b1b0R69) that cause the error `JVM limit of 0x`. I am

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19447 How about making a separated function for checking the threshold and then test it like #18810: https://github.com/apache/spark/pull/18810/files#diff-8bcc5aea39c73d4bf38aef6f6951d42cR363 ---

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19447 We still need to add a test case. We also should capture the exception and issue a better one; otherwise, users will not know what they should do when they hit such a confusing error.

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Here the answers to your questions @gatorsmile , please tell me if I need to elaborate more deeply. This conf controls how many inner classes are generated. A big value means that we will have

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19447 Instead of simply introducing a new conf, we need to answer three questions; otherwise, this is not useful to the users. - What is the perf impact of this conf? - When should users

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Yes, with small values it will produce a lot of small `NestedClass`es, but it will work. Instead, if the value is too high this, all the functions (methods) which are created are inlined in the

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19447 Do you mean Spark will work with small value like 1000? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 @dongjoon-hyun I am not sure how I can test it. The use case in which this was useful is quite complex and I have not been able to reproduce it in a simpler way. ---

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19447 Could you add a test case for this? Since it's configurable, you can set a small number and catch some exceptions? --- -

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 thank you for your comments @kiszk and @dongjoon-hyun, I changed a bit the approach according to a similar approach in the same file:

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19447 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional