[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-23 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/9222#discussion_r42883859 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -130,16 +130,18 @@ private[r] object SQLUtils { } def

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-23 Thread FRosner
Github user FRosner commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150567323 @felixcheung it is strange. When I did some performance checks today, the run time did not seem to be quadradic. Is it maybe because the map transformation in the

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-23 Thread FRosner
Github user FRosner commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150494201 @felixcheung the ticket is mentioning two numbers (https://issues.apache.org/jira/browse/SPARK-11258#) in the description. For a small data frame it is already a

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-23 Thread FRosner
Github user FRosner commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150494394 @sun-rui so there is an existing R test case for this method? In the end I just did a refactoring so existing tests should not break. The idea of unit tests

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread FRosner
GitHub user FRosner opened a pull request: https://github.com/apache/spark/pull/9222 SPARK-11258 Remove quadratic runtime complexity for converting a Spark DataFrame into an R data.frame https://issues.apache.org/jira/browse/SPARK-11258 I was not able to locate an existing

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150153510 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/9222#discussion_r42733892 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -130,16 +130,17 @@ private[r] object SQLUtils { } def

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread FRosner
Github user FRosner commented on a diff in the pull request: https://github.com/apache/spark/pull/9222#discussion_r42734750 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -130,16 +130,17 @@ private[r] object SQLUtils { } def

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread FRosner
Github user FRosner commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150179591 Thanks for the reply. > Instead of creating a new testsuite in Scala, you can add a new test case in R, using callJStatic to invoke "dfToCols" on the Scala

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread felixcheung
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150382796 Do you have benchmark numbers for this change? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread sun-rui
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150377869 dfToCols is meant to be called from R side, it makes sense to test it in R. Having a test case in R for it can test not only the logic in Scala, but also helps to test

[GitHub] spark pull request: SPARK-11258 Remove quadratic runtime complexit...

2015-10-22 Thread sun-rui
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/9222#issuecomment-150166755 Instead of creating a new testsuite in Scala, you can add a new test case in R, using callJStatic to invoke "dfToCols" on the Scala side. --- If your project is