[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-04-24 Thread yanboliang
Github user yanboliang closed the pull request at: https://github.com/apache/spark/pull/4527

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-03-31 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-87971697 Thanks - sorry for not having looked at this earlier. Do you see any performance gains with this change? My understanding is that JSON is already very slow, and thus the

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74410038 [Test build #27513 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27513/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74410073 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74410071 [Test build #27513 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27513/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74410504 [Test build #27514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27514/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74410182 [Test build #27514 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27514/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74421311 [Test build #27522 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27522/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74421514 [Test build #27524 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27524/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread yanbohappy
Github user yanbohappy commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74424959 cc @liancheng @rxin @yhuai

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74424392 [Test build #27524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27524/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74424393 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74424550 [Test build #27522 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27522/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74424555 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread yanbohappy
Github user yanbohappy commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74424908 This improvement is very similar to #758, so I ran a similar performance test. The benchmark suggests this optimization made the optimized version about

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74410507 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-14 Thread yanbohappy
Github user yanbohappy commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74388397 @yhuai This improvement is very similar to #758, so I have leveraged the performance test there. The benchmark suggests this optimization made the optimized

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74059744 [Test build #27351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27351/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-12 Thread yanbohappy
Github user yanbohappy commented on a diff in the pull request: https://github.com/apache/spark/pull/4527#discussion_r24574169 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala --- @@ -39,7 +39,19 @@ private[sql] object JsonRDD extends Logging {

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-12 Thread yanbohappy
Github user yanbohappy commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74059428 @chenghao-intel @yhuai Thank you for your advice; it's very useful. We can now use mutable rows for both top-level records and inner structures.

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74066899 [Test build #27351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27351/consoleFull) for PR 4527 at commit

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-74066906 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-73925301 Also, can you add performance numbers?

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-73924548 Thank you for working on it. It seems `new SpecificMutableRow(schema.fields.map(_.dataType))` cannot handle nested structures. I think we need to use the schema to

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-73927673 Oh, `enforceCorrectType` will take care of inner structures by calling `asRow`. It would be great if we could use mutable rows for inner structures as well.
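
For readers unfamiliar with the row-reuse idea under discussion, here is a minimal sketch of the general pattern for top-level records only. It is not the code from this PR and does not use Spark's `SpecificMutableRow`; `SimpleMutableRow`, `RowReuseSketch`, and `toRows` are hypothetical names invented for illustration, and parsed records are assumed to arrive as plain `Map[String, Any]` values.

```scala
// Hypothetical stand-in for a mutable row; NOT Spark's SpecificMutableRow.
final class SimpleMutableRow(size: Int) {
  private val values = new Array[Any](size)
  def update(i: Int, v: Any): Unit = values(i) = v
  def apply(i: Int): Any = values(i)
  // Immutable snapshot for callers that need to keep a row after the iterator advances.
  def copy(): Vector[Any] = values.toVector
}

object RowReuseSketch {
  // Converts parsed records (field name -> value) into rows,
  // allocating a single row buffer per iterator instead of one per record.
  def toRows(
      records: Iterator[Map[String, Any]],
      fields: IndexedSeq[String]): Iterator[SimpleMutableRow] = {
    val row = new SimpleMutableRow(fields.length) // allocated once
    records.map { record =>
      var i = 0
      while (i < fields.length) {
        // Every slot is overwritten; missing fields become null so that
        // values from the previous record never leak into this one.
        row(i) = record.getOrElse(fields(i), null)
        i += 1
      }
      row // the same instance is returned for every record
    }
  }
}
```

Because every element of the returned iterator is the same object, a consumer that buffers rows (for example with `toList`) has to call `copy()` on each one first; otherwise all buffered entries show the last record's values. That trade-off is the usual cost of saving one allocation per record.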

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4527#discussion_r24503655 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala --- @@ -39,7 +39,19 @@ private[sql] object JsonRDD extends Logging {

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4527#discussion_r24504239 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala --- @@ -39,7 +39,19 @@ private[sql] object JsonRDD extends Logging {

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4527#discussion_r24503309 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala --- @@ -39,7 +39,19 @@ private[sql] object JsonRDD extends Logging {

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-73864173 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-5738] [SQL] Reuse mutable row for each ...

2015-02-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4527#issuecomment-73864166 [Test build #27288 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27288/consoleFull) for PR 4527 at commit