[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16134 **[Test build #69633 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69633/consoleFull)** for PR 16134 at commit

[GitHub] spark issue #16133: [SPARK-18702][SQL] input_file_block_start and input_file...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16133 **[Test build #69632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69632/consoleFull)** for PR 16133 at commit

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16134 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16133: [SPARK-18702][SQL] input_file_block_start and input_file...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69632/ Test PASSed.

[GitHub] spark issue #16133: [SPARK-18702][SQL] input_file_block_start and input_file...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16133 Merged build finished. Test PASSed.

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16134 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69633/ Test PASSed.

[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16090 After looking more at the code, now I agree with your approach. One question: it seems we still scan the files when creating an unpartitioned external data source table?

[GitHub] spark issue #16114: [SPARK-18620][Streaming][Kinesis] Flatten input rates in...

2016-12-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16114 @brkyvz maybe you can give this a look to make sure it makes sense? especially the bit about the checkpointer.

[GitHub] spark pull request #16131: [SPARK-18701][ML] Fix Poisson GLM failure due to ...

2016-12-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16131#discussion_r90772810 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -505,7 +505,7 @@ object GeneralizedLinearRegression

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16135 **[Test build #69635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69635/consoleFull)** for PR 16135 at commit

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15915 This introduces a whole new config property. I think the suggestion was to base it off of some existing property, like page size? @JoshRosen would be best positioned to comment on that. We can more

[GitHub] spark issue #15736: [SPARK-18224] [CORE] Optimise PartitionedPairBuffer impl...

2016-12-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15736 I'd certainly be curious to see a benchmark of the 'final' version with inlined comparator. I would honestly be surprised if that's not fastest of all.

[GitHub] spark pull request #15620: [SPARK-18091] [SQL] Deep if expressions cause Gen...

2016-12-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15620#discussion_r90773625 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala --- @@ -97,6 +97,27 @@ class

[GitHub] spark issue #15620: [SPARK-18091] [SQL] Deep if expressions cause Generated ...

2016-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15620 thanks, merging to master/2.1!

[GitHub] spark pull request #15620: [SPARK-18091] [SQL] Deep if expressions cause Gen...

2016-12-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15620

[GitHub] spark pull request #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge...

2016-12-04 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/16037#discussion_r90773871 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala --- @@ -241,16 +241,27 @@ object LBFGS extends Logging { val bcW =

[GitHub] spark pull request #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge...

2016-12-04 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/16037#discussion_r90773986 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala --- @@ -241,16 +241,27 @@ object LBFGS extends Logging { val bcW =

[GitHub] spark pull request #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge...

2016-12-04 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/16037#discussion_r90773915 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/optimization/LBFGS.scala --- @@ -241,16 +241,27 @@ object LBFGS extends Logging { val bcW =

[GitHub] spark issue #16037: [SPARK-18471][MLLIB] In LBFGS, avoid sending huge vector...

2016-12-04 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16037 @MLnick Yeah, this is likely a problem with all the ML aggregators as well. We can probably take care of it using lazy evaluation.

[GitHub] spark pull request #16133: [SPARK-18702][SQL] input_file_block_start and inp...

2016-12-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16133#discussion_r90774313 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala --- @@ -533,31 +533,54 @@ class ColumnExpressionSuite extends QueryTest

[GitHub] spark pull request #16133: [SPARK-18702][SQL] input_file_block_start and inp...

2016-12-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16133#discussion_r90774307 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala --- @@ -533,31 +533,54 @@ class ColumnExpressionSuite extends QueryTest

[GitHub] spark pull request #16133: [SPARK-18702][SQL] input_file_block_start and inp...

2016-12-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16133#discussion_r90774314 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala --- @@ -567,10 +590,22 @@ class ColumnExpressionSuite extends QueryTest

[GitHub] spark pull request #16133: [SPARK-18702][SQL] input_file_block_start and inp...

2016-12-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16133#discussion_r90774453 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -132,54 +132,57 @@ class NewHadoopRDD[K, V]( override def

[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-04 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/16090 Yeah I was wondering if we should also try to fix that. It seems maybe not as bad since unpartitioned tables usually aren't that big. We can create separate tickets for investigating that,

[GitHub] spark issue #15620: [SPARK-18091] [SQL] Deep if expressions cause Generated ...

2016-12-04 Thread kapilsingh5050
Github user kapilsingh5050 commented on the issue: https://github.com/apache/spark/pull/15620 @cloud-fan Can we merge this change to 1.6 and 2.0 too?

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-12-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14638 Hi, @rxin. Do you want me to prepare something based on `InputFileBlockHolder` of #16133?

[GitHub] spark pull request #16136: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ...

2016-12-04 Thread yanboliang
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/16136 [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML programming guide. ## What changes were proposed in this pull request? Add R examples to ML programming guide for the following algorithms

[GitHub] spark issue #16053: [SPARK-17931] Eliminate unncessary task (de) serializati...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16053 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69640/ Test FAILed.

[GitHub] spark issue #16053: [SPARK-17931] Eliminate unncessary task (de) serializati...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16053 **[Test build #69640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69640/consoleFull)** for PR 16053 at commit

[GitHub] spark issue #16053: [SPARK-17931] Eliminate unncessary task (de) serializati...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16053 Build finished. Test FAILed.

[GitHub] spark pull request #15819: [SPARK-18372][SQL].Staging directory fail to be r...

2016-12-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15819#discussion_r90782492 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -54,6 +61,61 @@ case class InsertIntoHiveTable(

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16134 **[Test build #69641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69641/consoleFull)** for PR 16134 at commit

[GitHub] spark issue #15819: [SPARK-18372][SQL].Staging directory fail to be removed

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 Yes, exactly. This path is only for Spark 1.x. What I proposed here is that we need to use the code of Spark 2.0.x to fix the bug in Spark 1.x. You can see this message from my previous

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69635/ Test PASSed.

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774685 --- Diff: python/MANIFEST.in --- @@ -17,6 +17,8 @@ global-exclude *.py[cod] __pycache__ .DS_Store recursive-include deps/jars *.jar graft

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16135 Merged build finished. Test PASSed.

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774620 --- Diff: python/setup.py --- @@ -69,10 +69,15 @@ EXAMPLES_PATH = os.path.join(SPARK_HOME, "examples/src/main/python") SCRIPTS_PATH =

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774722 --- Diff: python/setup.py --- @@ -69,10 +69,13 @@ EXAMPLES_PATH = os.path.join(SPARK_HOME, "examples/src/main/python") SCRIPTS_PATH =

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16082 **[Test build #69636 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69636/consoleFull)** for PR 16082 at commit

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16082 One minor nit, but other than that LGTM - thank you for taking the time to do this @lins05 :)

[GitHub] spark issue #16002: [SPARK-18341][ML] Eliminate use of SingularMatrixExcepti...

2016-12-04 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16002 That's a good point about IRLS. If we can find a nice, functional way to handle this error and propagate it out to WLS, then I'm for the change. But I think the current way is acceptable now and

[GitHub] spark pull request #16033: SPARK-18607 get a result on a percent of the task...

2016-12-04 Thread Ru-Xiang
Github user Ru-Xiang closed the pull request at: https://github.com/apache/spark/pull/16033

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-04 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979 Yes it worked before On Dec 4, 2016 02:33, "Wenchen Fan" wrote: > val x: Dataset[String, Option[(String, String)]] = ... >

[GitHub] spark issue #16120: [SPARK-18634][PySpark][SQL] Corruption and Correctness i...

2016-12-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16120 cc @hvanhovell @cloud-fan

[GitHub] spark issue #16136: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML prog...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16136 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69639/ Test PASSed.

[GitHub] spark issue #16136: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML prog...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16136 Merged build finished. Test PASSed.

[GitHub] spark issue #16136: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML prog...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16136 **[Test build #69639 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69639/consoleFull)** for PR 16136 at commit

[GitHub] spark issue #15787: [SPARK-18286][ML] Add Scala/Java examples for MinHash an...

2016-12-04 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/15787 This PR can be closed since https://github.com/apache/spark/pull/15795 has been merged

[GitHub] spark pull request #15787: [SPARK-18286][ML] Add Scala/Java examples for Min...

2016-12-04 Thread bravo-zhang
Github user bravo-zhang closed the pull request at: https://github.com/apache/spark/pull/15787

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add ReadWriteLock for each table's re...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16135 **[Test build #69635 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69635/consoleFull)** for PR 16135 at commit

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16082 This looks good to me, maybe @srowen, @davies, or @JoshRosen could do a final pass/merge?

[GitHub] spark issue #16117: [SPARK-18686][SparkR][ML] Several cleanup and improvemen...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16117 Merged build finished. Test PASSed.

[GitHub] spark issue #16117: [SPARK-18686][SparkR][ML] Several cleanup and improvemen...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16117 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69638/ Test PASSed.

[GitHub] spark issue #16117: [SPARK-18686][SparkR][ML] Several cleanup and improvemen...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16117 **[Test build #69638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69638/consoleFull)** for PR 16117 at commit

[GitHub] spark pull request #16059: [SPARK-18625][ML] OneVsRestModel should support s...

2016-12-04 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16059#discussion_r90776155 --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala --- @@ -136,6 +137,19 @@ class OneVsRestSuite extends

[GitHub] spark pull request #16033: SPARK-18607 get a result on a percent of the task...

2016-12-04 Thread Ru-Xiang
GitHub user Ru-Xiang reopened a pull request: https://github.com/apache/spark/pull/16033 SPARK-18607 get a result on a percent of the tasks succeed ## What changes were proposed in this pull request? In this patch, we modify the code corresponding to runApproximateJob so

[GitHub] spark issue #16136: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML prog...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16136 **[Test build #69639 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69639/consoleFull)** for PR 16136 at commit

[GitHub] spark issue #16136: [SPARK-18279][DOC][ML][SPARKR] Add R examples to ML prog...

2016-12-04 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16136 cc @jkbradley @felixcheung

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774898 --- Diff: python/setup.py --- @@ -69,10 +69,13 @@ EXAMPLES_PATH = os.path.join(SPARK_HOME, "examples/src/main/python") SCRIPTS_PATH =

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16082 **[Test build #69637 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69637/consoleFull)** for PR 16082 at commit

[GitHub] spark pull request #16090: [SPARK-18661] [SQL] Creating a partitioned dataso...

2016-12-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16090

[GitHub] spark issue #16090: [SPARK-18661] [SQL] Creating a partitioned datasource ta...

2016-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16090 LGTM, merging to master/2.1! @ericl please create tickets for the other 2 issues

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16082 Merged build finished. Test PASSed.

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16082 **[Test build #69637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69637/consoleFull)** for PR 16082 at commit

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69637/ Test PASSed.

[GitHub] spark issue #16122: [SPARK-18681][SQL] Fix filtering to compatible with part...

2016-12-04 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/16122 @wangyum Thanks for fixing this. The fact that our tests did not catch this bug means we have a gap in our test coverage. It looks like the test in `HiveClientSuite` is incorrect. Can you fix it?

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-04 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979 spark 2.0.x does not have mapValues. but this works: scala> Seq(("a", Some((1, 1))), ("a", None)).toDS.groupByKey(_._2).count.show +---++ |

[GitHub] spark pull request #16134: [SPARK-18703] [SQL] Drop Staging Directories and ...

2016-12-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16134#discussion_r90781586 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -166,6 +166,29 @@ class InsertIntoHiveTableSuite

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-12-04 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/15979 admittedly the result looks weird. it really should be: +---++ |key|count(1)| +---++ | null| 1| | [1,1]|

[GitHub] spark issue #16053: [SPARK-17931] Eliminate unncessary task (de) serializati...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16053 **[Test build #69640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69640/consoleFull)** for PR 16053 at commit

[GitHub] spark issue #16059: [SPARK-18625][ML] OneVsRestModel should support setFeatu...

2016-12-04 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16059 @srowen Thanks for cc'ing me; I'm taking a look now.

[GitHub] spark issue #15620: [SPARK-18091] [SQL] Deep if expressions cause Generated ...

2016-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15620 merged to 2.0. Can you send a new PR to backport this to 1.6? There are a lot of code changes between 1.6 and 2.1, so it's safer to open a PR and run all tests.

[GitHub] spark pull request #16117: [SPARK-18686][SparkR][ML] Several cleanup and imp...

2016-12-04 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16117#discussion_r90776748 --- Diff: R/pkg/R/mllib.R --- @@ -817,44 +804,29 @@ setMethod("predict", signature(object = "LogisticRegressionModel"), # Get the summary of an

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16134 https://github.com/apache/spark/pull/15819 is copying the code from Spark 2.0. The file deletion is based on the existing way: `fs.deleteOnExit`. This PR tries to delete the staging directory

[GitHub] spark issue #16033: SPARK-18607 get a result on a percent of the tasks succe...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16033 Can one of the admins verify this patch?

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16082 **[Test build #69636 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69636/consoleFull)** for PR 16082 at commit

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69636/ Test PASSed.

[GitHub] spark issue #16082: [SPARK-18652][PYTHON] Include the example data and third...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16082 Merged build finished. Test PASSed.

[GitHub] spark pull request #16059: [SPARK-18625][ML] OneVsRestModel should support s...

2016-12-04 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16059#discussion_r90776114 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala --- @@ -175,6 +183,8 @@ final class OneVsRestModel private[ml] (

[GitHub] spark issue #16117: [SPARK-18686][SparkR][ML] Several cleanup and improvemen...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16117 **[Test build #69638 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69638/consoleFull)** for PR 16117 at commit

[GitHub] spark pull request #16117: [SPARK-18686][SparkR][ML] Several cleanup and imp...

2016-12-04 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/16117#discussion_r90776572 --- Diff: R/pkg/R/mllib.R --- @@ -746,45 +744,35 @@ setMethod("predict", signature(object = "KMeansModel"), #' \dontrun{ #' sparkR.session()

[GitHub] spark pull request #16122: [SPARK-18681][SQL] Fix filtering to compatible wi...

2016-12-04 Thread mallman
Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/16122#discussion_r90781272 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -590,8 +590,10 @@ private[client] class Shim_v0_13 extends

[GitHub] spark issue #16137: [SPARK-18708][CORE] Improvement/improve docs in spark co...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16137 Can one of the admins verify this patch?

[GitHub] spark issue #16117: [SPARK-18686][SparkR][ML] Several cleanup and improvemen...

2016-12-04 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16117 LGTM except a minor comment on example. thanks!

[GitHub] spark issue #16131: [SPARK-18701][ML] Fix Poisson GLM failure due to wrong i...

2016-12-04 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16131 @srowen Try this example below or the example @sethah had issue with in #15683. I have tried running the 2.1 version Poisson GLM on our data and it fails for most of them (it

[GitHub] spark issue #16138: [WIP][Spark-16609] Add to_date with format function.

2016-12-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16138 Can you use "SPARK-16609" instead of "Spark-16609"?

[GitHub] spark issue #16053: [SPARK-17931] Eliminate unncessary task (de) serializati...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16053 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69646/ Test FAILed. ---

[GitHub] spark pull request #16138: [WIP][SPARK-16609] Add to_date/to_timestamp with ...

2016-12-04 Thread anabranch
Github user anabranch commented on a diff in the pull request: https://github.com/apache/spark/pull/16138#discussion_r90785863 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2661,12 +2661,30 @@ object functions { def unix_timestamp(s: Column,

[GitHub] spark issue #16138: [WIP][SPARK-16609] Add to_date/to_timestamp with format ...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16138 **[Test build #69649 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69649/consoleFull)** for PR 16138 at commit

[GitHub] spark issue #16094: [SPARK-18541][Python]Add metadata parameter to pyspark.s...

2016-12-04 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16094 Let's see if maybe @marmbrus or @davies has some time to look at this?

[GitHub] spark issue #16138: [WIP][SPARK-16609] Add to_date/to_timestamp with format ...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16138 **[Test build #69651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69651/consoleFull)** for PR 16138 at commit

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-12-04 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/15998 > @mallman do you know which tests fail the partition spec checking? It looks to me that before we call partition related API in SessionCatalog, the partition column names should be normalized

[GitHub] spark issue #16077: [SPARK-18643][SPARKR] SparkR hangs at session start when...

2016-12-04 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16077 Right, I was just reviewing possible code paths for this in the last few days and I'm pretty confident that this change will not run install.spark in cluster modes (which would have Spark/JVM

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-12-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15998 **[Test build #69652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69652/consoleFull)** for PR 15998 at commit

[GitHub] spark issue #16014: [SPARK-18590][SPARKR] build R source package when making...

2016-12-04 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16014 As for release-build.sh - I had the change in there but I've changed it to make it clearer.

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69652/ Test PASSed. ---

[GitHub] spark pull request #16133: [SPARK-18702][SQL] input_file_block_start and inp...

2016-12-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/16133#discussion_r90794591 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala --- @@ -533,31 +533,54 @@ class ColumnExpressionSuite extends QueryTest

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15998 Merged build finished. Test PASSed.

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15819 We have stopped making new releases for 1.5 so it makes no sense to backport.

[GitHub] spark issue #15819: [SPARK-18372][SQL][Branch-1.6].Staging directory fail to...

2016-12-04 Thread merlintang
Github user merlintang commented on the issue: https://github.com/apache/spark/pull/15819 This bug is related to 1.5.x as well as 1.6.x. Please backport to 1.5.x as well. On Sun, Dec 4, 2016 at 6:20 PM, Reynold Xin wrote: > If it is
