spark git commit: [SPARK-19126][DOCS] Update Join Documentation Across Languages

2017-01-08 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.1 8690d4bd1 -> 8779e6a46 [SPARK-19126][DOCS] Update Join Documentation Across Languages ## What changes were proposed in this pull request? - [X] Make sure all join types are clearly mentioned - [X] Make join labeling/style consistent -

spark git commit: [SPARK-18903][SPARKR][BACKPORT-2.1] Add API to get SparkUI URL

2017-01-08 Thread felixcheung
6507 from felixcheung/portsparkuir21. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80a3e13e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/80a3e13e Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/80a3e13e Branch: r

spark git commit: [DOC][BUILD][MINOR] add doc on new make-distribution switches

2016-12-27 Thread felixcheung
How was this patch tested? Doc only Author: Felix Cheung <felixcheun...@hotmail.com> Closes #16364 from felixcheung/buildguide. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2af8b5cf Tree: http://git-wip-us.apache.org/repos/

spark git commit: [BUILD] make-distribution should find JAVA_HOME for non-RHEL systems

2016-12-21 Thread felixcheung
How was this patch tested? Manually Author: Felix Cheung <felixcheun...@hotmail.com> Closes #16363 from felixcheung/buildjava. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e1b43dc4 Tree: http://git-wip-us.apache.

spark git commit: [SPARK-18903][SPARKR] Add API to get SparkUI URL

2016-12-21 Thread felixcheung
ung <felixcheun...@hotmail.com> Closes #16367 from felixcheung/rwebui. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7e8994ff Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7e8994ff Diff: http://git-wip-us.apache.

spark git commit: [SPARK-18579][SQL] Use ignoreLeadingWhiteSpace and ignoreTrailingWhiteSpace options in CSV writing

2017-03-23 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 12cd00706 -> 07c12c09a [SPARK-18579][SQL] Use ignoreLeadingWhiteSpace and ignoreTrailingWhiteSpace options in CSV writing ## What changes were proposed in this pull request? This PR proposes to support _not_ trimming the white spaces

spark git commit: [SPARK-20105][TESTS][R] Add tests for checkType and type string in structField in R

2017-03-27 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 314cf51de -> 3fada2f50 [SPARK-20105][TESTS][R] Add tests for checkType and type string in structField in R ## What changes were proposed in this pull request? It seems `checkType` and the type string in `structField` are not being tested

spark git commit: [MINOR][DOCS] Match several documentation changes in Scala to R/Python

2017-03-26 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 0bc8847aa -> 3fbf0a5f9 [MINOR][DOCS] Match several documentation changes in Scala to R/Python ## What changes were proposed in this pull request? This PR proposes to match minor documentation changes in

spark git commit: [MINOR][SPARKR] Add run command comment in examples

2017-03-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 79636054f -> 471de5db5 [MINOR][SPARKR] Add run command comment in examples ## What changes were proposed in this pull request? There are two examples in the r folder that are missing the run commands. In this PR, I just add the missing comment, which

spark git commit: [SPARK-20092][R][PROJECT INFRA] Add the detection for Scala codes dedicated for R in AppVeyor tests

2017-03-26 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 0b903caef -> 2422c86f2 [SPARK-20092][R][PROJECT INFRA] Add the detection for Scala codes dedicated for R in AppVeyor tests ## What changes were proposed in this pull request? We are currently detecting the changes in `R/` directory only

spark git commit: [SPARK-18817][SPARKR][SQL] change derby log output to temp dir

2017-03-19 Thread felixcheung
xcheun...@hotmail.com> Closes #16330 from felixcheung/rderby. (cherry picked from commit 422aa67d1bb84f913b06e6d94615adb6557e2870) Signed-off-by: Felix Cheung <felixche...@apache.org> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-18817][SPARKR][SQL] change derby log output to temp dir

2017-03-19 Thread felixcheung
xcheun...@hotmail.com> Closes #16330 from felixcheung/rderby. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/422aa67d Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/422aa67d Diff: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [MINOR][R] Reorder `Collate` fields in DESCRIPTION file

2017-03-19 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 5c165596d -> 60262bc95 [MINOR][R] Reorder `Collate` fields in DESCRIPTION file ## What changes were proposed in this pull request? It seems the CRAN check scripts correct `R/pkg/DESCRIPTION` and follow the order in `Collate` fields. This

spark git commit: [SPARK-19654][SPARKR][SS] Structured Streaming API for R

2017-03-18 Thread felixcheung
Author: Felix Cheung <felixcheun...@hotmail.com> Closes #16982 from felixcheung/rss. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5c165596 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5c165596 Diff: http://gi

spark git commit: [SPARK-20020][SPARKR] DataFrame checkpoint API

2017-03-19 Thread felixcheung
lix Cheung <felixcheun...@hotmail.com> Closes #17351 from felixcheung/rdfcheckpoint. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c4059772 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c4059772 Diff: http:

spark git commit: [SPARK-20020][SPARKR][FOLLOWUP] DataFrame checkpoint API fix version tag

2017-03-20 Thread felixcheung
xcheun...@hotmail.com> Closes #17356 from felixcheung/rdfcheckpoint2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f14f81e9 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f14f81e9 Diff: http://git-wip-us.apache.

spark git commit: [SPARK-19849][SQL] Support ArrayType in to_json to produce JSON array

2017-03-19 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 990af630d -> 0cdcf9114 [SPARK-19849][SQL] Support ArrayType in to_json to produce JSON array ## What changes were proposed in this pull request? This PR proposes to support an array of struct type in `to_json` as below: ```scala import

spark git commit: [SPARK-19828][R] Support array type in from_json in R

2017-03-14 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 8fb2a02e2 -> d1f6c64c4 [SPARK-19828][R] Support array type in from_json in R ## What changes were proposed in this pull request? Since we could not directly define the array type in R, this PR proposes to support array types in R as

spark git commit: [MINOR][R] Reorder `Collate` fields in DESCRIPTION file

2017-04-04 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 0736980f3 -> 0e2ee8204 [MINOR][R] Reorder `Collate` fields in DESCRIPTION file ## What changes were proposed in this pull request? It seems the CRAN check scripts correct `R/pkg/DESCRIPTION` and follow the order in `Collate` fields. This

spark git commit: [SPARK-19825][R][ML] spark.ml R API for FPGrowth

2017-04-04 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 51d3c854c -> b34f7665d [SPARK-19825][R][ML] spark.ml R API for FPGrowth ## What changes were proposed in this pull request? Adds SparkR API for FPGrowth: [SPARK-19825](https://issues.apache.org/jira/browse/SPARK-19825): -

spark git commit: [SPARK-20159][SPARKR][SQL] Support all catalog API in R

2017-04-02 Thread felixcheung
d? manual tests, unit tests Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17483 from felixcheung/rcatalog. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/93dbfe70 Tree: http://git-wip-us.apache.org/repos/asf

spark git commit: [SPARK-20195][SPARKR][SQL] add createTable catalog API and deprecate createExternalTable

2017-04-06 Thread felixcheung
2.0) and deprecate createExternalTable, plus a number of minor fixes ## How was this patch tested? manual, unit tests Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17511 from felixcheung/rceatetable. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.a

spark git commit: [SPARK-20196][PYTHON][SQL] update doc for catalog functions for all languages, add pyspark refreshByPath API

2017-04-06 Thread felixcheung
add refreshByPath in python ## How was this patch tested? manual Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17512 from felixcheung/catalogdoc. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bccc3301 T

spark git commit: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json.array in from_json function in R

2017-04-17 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 86d251c58 -> 24f09b39c [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json.array in from_json function in R ## What changes were proposed in this pull request? This was suggested to be `as.json.array` in the first place in the PR to

spark git commit: [SPARK-20278][R] Disable 'multiple_dots_linter' lint rule that is against project's code style

2017-04-16 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master ad935f526 -> 86d251c58 [SPARK-20278][R] Disable 'multiple_dots_linter' lint rule that is against project's code style ## What changes were proposed in this pull request? Currently, multi-dot separated variables in R are not allowed. For

spark git commit: [SPARK-17647][SQL][FOLLOWUP][MINOR] fix typo

2017-04-18 Thread felixcheung
7663 from felixcheung/likedoctypo. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b0a1e93e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b0a1e93e Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b0a1e93e Bra

spark git commit: [SPARK-20375][R] R wrappers for array and map

2017-04-19 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master bdc605691 -> 46c574976 [SPARK-20375][R] R wrappers for array and map ## What changes were proposed in this pull request? Adds wrappers for `o.a.s.sql.functions.array` and `o.a.s.sql.functions.map` ## How was this patch tested? Unit

spark git commit: [SPARK-20371][R] Add wrappers for collect_list and collect_set

2017-04-21 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master eb00378f0 -> fd648bff6 [SPARK-20371][R] Add wrappers for collect_list and collect_set ## What changes were proposed in this pull request? Adds wrappers for `collect_list` and `collect_set`. ## How was this patch tested? Unit tests,
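
For readers unfamiliar with the two aggregates this entry wraps, the following is a minimal plain-Python sketch of the difference between them. It is not SparkR or Spark code: the row layout (a list of dicts) and the helper names are assumptions made for illustration, and Spark itself does not guarantee the order of collected elements the way this sketch does.

```python
from collections import defaultdict


def collect_list(rows, key, value):
    """Group rows by `key` and keep every `value`, duplicates included."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return dict(groups)


def collect_set(rows, key, value):
    """Group rows by `key` and keep each distinct `value` only once."""
    groups = defaultdict(set)
    for row in rows:
        groups[row[key]].add(row[value])
    return dict(groups)


# Hypothetical sample data: two identical rows in group "a".
rows = [
    {"dept": "a", "item": 1},
    {"dept": "a", "item": 1},
    {"dept": "b", "item": 2},
]
lists = collect_list(rows, "dept", "item")  # duplicates preserved
sets = collect_set(rows, "dept", "item")    # duplicates dropped
```

The list variant keeps the repeated value in group "a", while the set variant keeps it once.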

spark git commit: [SPARK-19282][ML][SPARKR] RandomForest Wrapper and GBT Wrapper return param "maxDepth" to R models

2017-03-12 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 2f5187bde -> 9f8ce4825 [SPARK-19282][ML][SPARKR] RandomForest Wrapper and GBT Wrapper return param "maxDepth" to R models ## What changes were proposed in this pull request? RandomForest R Wrapper and GBT R Wrapper return param

spark git commit: [SPARK-19391][SPARKR][ML] Tweedie GLM API for SparkR

2017-03-14 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 415f9f342 -> f6314eab4 [SPARK-19391][SPARKR][ML] Tweedie GLM API for SparkR ## What changes were proposed in this pull request? Port Tweedie GLM #16344 to SparkR felixcheung yanboliang ## How was this patch tested? new test in Spa

spark git commit: [SPARK-19818][SPARKR] rbind should check for name consistency of input data frames

2017-03-06 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 9909f6d36 -> 1f6c090c1 [SPARK-19818][SPARKR] rbind should check for name consistency of input data frames ## What changes were proposed in this pull request? Added checks for name consistency of input data frames in union. ## How was
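
The kind of name-consistency check this entry describes can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the SparkR implementation: data frames are modeled as dicts with a `columns` list, the function name is hypothetical, and requiring the same column order is an assumption of this sketch.

```python
def check_name_consistency(*frames):
    """Raise ValueError unless all frames share the same column names.

    Assumption: names must match in the same order, not just as a set.
    """
    if not frames:
        return
    expected = list(frames[0]["columns"])
    for position, frame in enumerate(frames[1:], start=2):
        names = list(frame["columns"])
        if names != expected:
            raise ValueError(
                f"frame {position} has columns {names}, expected {expected}"
            )


# Hypothetical frames: the third one renames a column.
df1 = {"columns": ["id", "name"]}
df2 = {"columns": ["id", "name"]}
df3 = {"columns": ["id", "label"]}

check_name_consistency(df1, df2)  # passes silently

try:
    check_name_consistency(df1, df2, df3)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```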

spark git commit: [SPARK-19795][SPARKR] add column functions to_json, from_json

2017-03-05 Thread felixcheung
tch tested? unit tests, manual Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17134 from felixcheung/rtojson. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80d5338b Tree: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [DOC][MINOR][SPARKR] Update SparkR doc for names, columns and colnames

2017-03-01 Thread felixcheung
subset assignment, so the length of `value` can be less than the number of columns, e.g., `colnames(df)[1] <- "a"`. felixcheung Author: actuaryzhang <actuaryzhan...@gmail.com> Closes #17115 from actuaryzhang/sparkRMinorDoc. Project: http://git-wip-us.apache.org/repos/asf/spar

spark git commit: [SPARK-20197][SPARKR][BRANCH-2.1] CRAN check fail with package installation

2017-04-02 Thread felixcheung
ung <felixcheun...@hotmail.com> Closes #17515 from felixcheung/rcrancheck. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ca144106 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ca144106 Diff: http://git-wip-us.a

spark git commit: [SPARKR][DOC] update doc for fpgrowth

2017-04-04 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b28bbffba -> c1b8b6675 [SPARKR][DOC] update doc for fpgrowth ## What changes were proposed in this pull request? minor update zero323 Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17526 from felixcheung/rfpgrowt

spark git commit: [SPARK-20197][SPARKR] CRAN check fail with package installation

2017-04-07 Thread felixcheung
ung <felixcheun...@hotmail.com> Closes #17516 from felixcheung/rdircheckincran. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8feb799a Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8feb799a Diff: http://git-wip-us.a

spark git commit: [SPARK-20026][DOC][SPARKR] Add Tweedie example for SparkR in programming guide

2017-04-07 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 9e0893b53 -> 870b9d9aa [SPARK-20026][DOC][SPARKR] Add Tweedie example for SparkR in programming guide ## What changes were proposed in this pull request? Add Tweedie example for SparkR in programming guide. The doc was already updated in

spark git commit: [SPARK-20258][DOC][SPARKR] Fix SparkR logistic regression example in programming guide (did not converge)

2017-04-07 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 8feb799af -> 1ad73f0a2 [SPARK-20258][DOC][SPARKR] Fix SparkR logistic regression example in programming guide (did not converge) ## What changes were proposed in this pull request? SparkR logistic regression example did not converge in

spark git commit: [SPARK-20208][R][DOCS] Document R fpGrowth support

2017-04-18 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master e468a96c4 -> 702d85af2 [SPARK-20208][R][DOCS] Document R fpGrowth support ## What changes were proposed in this pull request? Document fpGrowth in: - vignettes - programming guide - code example ## How was this patch tested? Manual

spark git commit: [SPARK-20208][R][DOCS] Document R fpGrowth support

2017-04-18 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 a33d44805 -> ef6923f7e [SPARK-20208][R][DOCS] Document R fpGrowth support ## What changes were proposed in this pull request? Document fpGrowth in: - vignettes - programming guide - code example ## How was this patch tested?

spark git commit: [SPARK-20438][R] SparkR wrappers for split and repeat

2017-04-24 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 90264aced -> 8a272ddc9 [SPARK-20438][R] SparkR wrappers for split and repeat ## What changes were proposed in this pull request? Add wrappers for `o.a.s.sql.functions`: - `split` as `split_string` - `repeat` as `repeat_string` ## How

spark git commit: [SPARK-21266][R][PYTHON] Support schema a DDL-formatted string in dapply/gapply/from_json

2017-07-10 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 18b3b00ec -> 2bfd5accd [SPARK-21266][R][PYTHON] Support schema a DDL-formatted string in dapply/gapply/from_json ## What changes were proposed in this pull request? This PR supports schema in a DDL formatted string for `from_json` in

spark git commit: [SPARK-20307][SPARKR] SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-07-08 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master d0bfc6733 -> a7b46c627 [SPARK-20307][SPARKR] SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer ## What changes were proposed in this pull request? For randomForest classifier, if test data contains unseen

spark git commit: [SPARK-20456][DOCS] Add examples for functions collection for pyspark

2017-07-08 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master a7b46c627 -> f5f02d213 [SPARK-20456][DOCS] Add examples for functions collection for pyspark ## What changes were proposed in this pull request? This adds documentation to many functions in pyspark.sql.functions.py: `upper`, `lower`,

spark git commit: [SPARK-21093][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak

2017-07-08 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master c3712b77a -> 08e0d033b [SPARK-21093][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak ## What changes were proposed in this pull request? This is a retry for #18320. This PR was reverted due to unexpected

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for STRING column methods

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b72b8521d -> 376d90d55 [SPARK-20889][SPARKR] Grouped documentation for STRING column methods ## What changes were proposed in this pull request? Grouped documentation for string column methods. Author: actuaryzhang

spark git commit: Revert "[SPARK-21094][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak"

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master db44f5f3e -> fc92d25f2 Revert "[SPARK-21094][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak" This reverts commit 6b3d02285ee0debc73cbcab01b10398a498fbeb8. Project:

spark git commit: [SPARK-21224][R] Specify a schema by using a DDL-formatted string when reading in R

2017-06-28 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 0c8444cf6 -> db44f5f3e [SPARK-21224][R] Specify a schema by using a DDL-formatted string when reading in R ## What changes were proposed in this pull request? This PR proposes to support a DDL-formatted string as schema as below: ```r

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for MISC column methods

2017-06-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master e2f32ee45 -> fddb63f46 [SPARK-20889][SPARKR] Grouped documentation for MISC column methods ## What changes were proposed in this pull request? Grouped documentation for column misc methods. Author: actuaryzhang

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for COLLECTION column methods

2017-06-30 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master fddb63f46 -> 52981715b [SPARK-20889][SPARKR] Grouped documentation for COLLECTION column methods ## What changes were proposed in this pull request? Grouped documentation for column collection methods. Author: actuaryzhang

spark git commit: [MINOR][SPARKR] ignore Rplots.pdf test output after running R tests

2017-07-04 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master cec392150 -> daabf425e [MINOR][SPARKR] ignore Rplots.pdf test output after running R tests ## What changes were proposed in this pull request? After running R tests in local build, it outputs Rplots.pdf. This one should be ignored in the

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for WINDOW column methods

2017-07-04 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 4d6d8192c -> cec392150 [SPARK-20889][SPARKR] Grouped documentation for WINDOW column methods ## What changes were proposed in this pull request? Grouped documentation for column window methods. Author: actuaryzhang

spark git commit: [SPARK-20889][SPARKR][FOLLOWUP] Clean up grouped doc for column methods

2017-07-04 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master ce10545d3 -> e9a93f814 [SPARK-20889][SPARKR][FOLLOWUP] Clean up grouped doc for column methods ## What changes were proposed in this pull request? Add doc for methods that were left out, and fix various style and consistency issues.

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for NONAGGREGATE column methods

2017-06-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 9f6b3e65c -> a2d562354 [SPARK-20889][SPARKR] Grouped documentation for NONAGGREGATE column methods ## What changes were proposed in this pull request? Grouped documentation for nonaggregate column methods. Author: actuaryzhang

spark git commit: [SPARK-20437][R] R wrappers for rollup and cube

2017-04-25 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 57e1da394 -> df58a95a3 [SPARK-20437][R] R wrappers for rollup and cube ## What changes were proposed in this pull request? - Add `rollup` and `cube` methods and corresponding generics. - Add short description to the vignette. ## How was
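
As background for the `rollup` and `cube` methods named in this entry, the sketch below enumerates the grouping sets that each operation aggregates over, following standard SQL semantics. It is plain Python rather than SparkR, and the function names are hypothetical.

```python
from itertools import combinations


def rollup_grouping_sets(columns):
    """Rollup groups by successively shorter prefixes, down to the grand total."""
    cols = tuple(columns)
    return [cols[:n] for n in range(len(cols), -1, -1)]


def cube_grouping_sets(columns):
    """Cube groups by every subset of the columns, down to the grand total."""
    cols = tuple(columns)
    return [
        subset
        for n in range(len(cols), -1, -1)
        for subset in combinations(cols, n)
    ]


rollup_sets = rollup_grouping_sets(["a", "b"])
cube_sets = cube_grouping_sets(["a", "b"])
```

For two columns, rollup yields three grouping sets and cube yields four; the empty tuple stands for the grand-total row.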

spark git commit: [DOCS][MINOR] Add missing since to SparkR repeat_string note.

2017-04-27 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b4724db19 -> b58cf77c4 [DOCS][MINOR] Add missing since to SparkR repeat_string note. ## What changes were proposed in this pull request? Replace note repeat_string 2.3.0 with note repeat_string since 2.3.0 ## How was this

spark git commit: [SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide

2017-04-27 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b58cf77c4 -> ba7666274 [SPARK-20208][DOCS][FOLLOW-UP] Add FP-Growth to SparkR programming guide ## What changes were proposed in this pull request? Add `spark.fpGrowth` to SparkR programming guide. ## How was this patch tested? Manual

spark git commit: [SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel

2017-04-25 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 55834a898 -> f971ce5dd [SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel ## What changes were proposed in this pull request? Pregel-based iterative algorithms with more than ~50 iterations begin to slow down and eventually

spark git commit: [SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel

2017-04-25 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 67eef47ac -> 0a7f5f279 [SPARK-5484][GRAPHX] Periodically do checkpoint in Pregel ## What changes were proposed in this pull request? Pregel-based iterative algorithms with more than ~50 iterations begin to slow down and eventually fail

spark git commit: [SPARKR][DOC] Document LinearSVC in R programming guide

2017-04-27 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b90bf520f -> 7fe824979 [SPARKR][DOC] Document LinearSVC in R programming guide ## What changes were proposed in this pull request? add link to svmLinear in the SparkR programming document. ## How was this patch tested? Build doc

spark git commit: [SPARKR][DOC] Document LinearSVC in R programming guide

2017-04-27 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 e02b6ebfd -> f60ed0c2c [SPARKR][DOC] Document LinearSVC in R programming guide ## What changes were proposed in this pull request? add link to svmLinear in the SparkR programming document. ## How was this patch tested? Build doc

spark git commit: [SPARK-19791][ML] Add doc and example for fpgrowth

2017-04-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b28c3bc20 -> add9d1bba [SPARK-19791][ML] Add doc and example for fpgrowth ## What changes were proposed in this pull request? Add a new section for fpm Add Example for FPGrowth in scala and Java updated: Rewrite transform to be more

spark git commit: [SPARK-19791][ML] Add doc and example for fpgrowth

2017-04-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 4a86d8db4 -> 9789d5c57 [SPARK-19791][ML] Add doc and example for fpgrowth ## What changes were proposed in this pull request? Add a new section for fpm Add Example for FPGrowth in scala and Java updated: Rewrite transform to be more

spark git commit: [SPARK-20533][SPARKR] SparkR Wrappers Model should be private and value should be lazy

2017-04-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master add9d1bba -> ee694cdff [SPARK-20533][SPARKR] SparkR Wrappers Model should be private and value should be lazy ## What changes were proposed in this pull request? MultilayerPerceptronClassifierWrapper model should be private.

spark git commit: [SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programming guide

2017-04-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 814a61a86 -> b28c3bc20 [SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programming guide ## What changes were proposed in this pull request? Add a hyperlink in the SparkR programming guide. ## How was this patch tested?

spark git commit: [SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programming guide

2017-04-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 ca6c59e7e -> 4a86d8db4 [SPARK-20477][SPARKR][DOC] Document R bisecting k-means in R programming guide ## What changes were proposed in this pull request? Add a hyperlink in the SparkR programming guide. ## How was this patch tested?

spark git commit: [SPARK-20493][R] De-duplicate parse logics for DDL-like type strings in R

2017-04-29 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master ee694cdff -> 70f1bcd7b [SPARK-20493][R] De-duplicate parse logics for DDL-like type strings in R ## What changes were proposed in this pull request? It seems we are using `SQLUtils.getSQLDataType` for type string in structField. It looks

spark git commit: [SPARK-21381][SPARKR] SparkR: pass on setHandleInvalid for classification algorithms

2017-07-31 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 6b186c9d6 -> 9570e81aa [SPARK-21381][SPARKR] SparkR: pass on setHandleInvalid for classification algorithms ## What changes were proposed in this pull request? SPARK-20307 Added handleInvalid option to RFormula for tree-based

spark git commit: [SPARK-21622][ML][SPARKR] Support offset in SparkR GLM

2017-08-06 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 74b47845e -> 55aa4da28 [SPARK-21622][ML][SPARKR] Support offset in SparkR GLM ## What changes were proposed in this pull request? Support offset in SparkR GLM #16699 Author: actuaryzhang Closes #18831 from

spark git commit: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-14 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master aa3df1590 -> 5a799fd8c [SPARK-20726][SPARKR] wrapper for SQL broadcast ## What changes were proposed in this pull request? - Adds R wrapper for `o.a.s.sql.functions.broadcast`. - Renames `broadcast` to `broadcast_`. ## How was this patch

spark git commit: [DOCS][SPARKR] Use verbose names for family annotations in functions.R

2017-05-14 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 1283c3d11 -> aa3df1590 [DOCS][SPARKR] Use verbose names for family annotations in functions.R ## What changes were proposed in this pull request? - Change current short annotations (same as Scala `group`) to verbose names (same as Scala

spark git commit: [SPARK-20704][SPARKR] change CRAN test to run single thread

2017-05-12 Thread felixcheung
How was this patch tested? Jenkins Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17945 from felixcheung/rchangesforpackage. (cherry picked from commit 888b84abe8d3fd36c5c2226aeb9e202029936f94) Signed-off-by: Felix Cheung <felixche...@apache.org> Project: http://git-wip-us.apach

spark git commit: [SPARK-20619][ML] StringIndexer supports multiple ways to order label

2017-05-12 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 888b84abe -> af40bb115 [SPARK-20619][ML] StringIndexer supports multiple ways to order label ## What changes were proposed in this pull request? StringIndexer maps labels to numbers according to the descending order of label frequency.
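
The default ordering mentioned in this entry (labels indexed by descending frequency) can be sketched in plain Python as follows. This is not the Spark ML implementation; the function name is hypothetical, and breaking ties alphabetically is an assumption made here so the result is deterministic.

```python
from collections import Counter


def index_labels_by_frequency(labels):
    """Map each label to an index, most frequent label first.

    Assumption: ties in frequency are broken alphabetically.
    """
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: index for index, label in enumerate(ordered)}


# Hypothetical input: "c" appears three times, "b" twice, "a" once.
mapping = index_labels_by_frequency(["a", "b", "b", "c", "c", "c"])
```

Supporting other orderings, as the entry title proposes, would amount to swapping the sort key (for example ascending frequency or alphabetical order).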

spark git commit: [SPARK-20704][SPARKR] change CRAN test to run single thread

2017-05-12 Thread felixcheung
tch tested? Jenkins Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17945 from felixcheung/rchangesforpackage. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/888b84ab Tree: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames() test fails

2017-05-08 Thread felixcheung
ame/master/R/pkg/inst/tests/testthat/test_sparkSQL.R#L3355 for catalog APIs ## How was this patch tested? unit tests, this needs to combine with another commit with SQL change to check Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17905 from felixcheung/rtabletests. Proj

spark git commit: [SPARK-20661][SPARKR][TEST][FOLLOWUP] SparkR tableNames() test fails

2017-05-08 Thread felixcheung
ark/blame/master/R/pkg/inst/tests/testthat/test_sparkSQL.R#L3355 for catalog APIs ## How was this patch tested? unit tests, this needs to combine with another commit with SQL change to check Author: Felix Cheung <felixcheun...@hotmail.com> Closes #17905 from felixcheung/rtabletests. (cher

spark git commit: [SPARK-20670][ML] Simplify FPGrowth transform

2017-05-10 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master a90c5cd82 -> a819dab66 [SPARK-20670][ML] Simplify FPGrowth transform ## What changes were proposed in this pull request? jira: https://issues.apache.org/jira/browse/SPARK-20670 As suggested by Sean Owen in

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for AGGREGATE column methods

2017-06-19 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 9b57cd8d5 -> 8965fe764 [SPARK-20889][SPARKR] Grouped documentation for AGGREGATE column methods ## What changes were proposed in this pull request? Grouped documentation for the aggregate functions for Column. Author: actuaryzhang

spark git commit: [SPARK-20906][SPARKR] Constrained Logistic Regression for SparkR

2017-06-21 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 215281d88 -> 53543374c [SPARK-20906][SPARKR] Constrained Logistic Regression for SparkR ## What changes were proposed in this pull request? PR https://github.com/apache/spark/pull/17715 Added Constrained Logistic Regression for ML. We

spark git commit: [SPARK-20889][SPARKR] Grouped documentation for DATETIME column methods

2017-06-22 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 2dadea95c -> 19331b8e4 [SPARK-20889][SPARKR] Grouped documentation for DATETIME column methods ## What changes were proposed in this pull request? Grouped documentation for datetime column methods. Author: actuaryzhang

spark git commit: [SPARK-21149][R] Add job description API for R

2017-06-23 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master f3dea6079 -> 07479b3cf [SPARK-21149][R] Add job description API for R ## What changes were proposed in this pull request? Extend `setJobDescription` to SparkR API. ## How was this patch tested? It looks difficult to add a test. Manually

spark git commit: [SPARK-21093][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak

2017-06-25 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 884347e1f -> 6b3d02285 [SPARK-21093][R] Terminate R's worker processes in the parent of R's daemon to prevent a leak ## What changes were proposed in this pull request? `mcfork` in R appears to open a pipe ahead, but the existing logic does

spark git commit: [SPARK-15767][ML][SPARKR] Decision Tree wrapper in SparkR

2017-05-22 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 363091100 -> 4be337583 [SPARK-15767][ML][SPARKR] Decision Tree wrapper in SparkR ## What changes were proposed in this pull request? support decision tree in R ## How was this patch tested? added tests Author: Zheng RuiFeng

spark git commit: [SPARK-20815][SPARKR] NullPointerException in RPackageUtils#checkManifestForR

2017-05-22 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master a2460be9c -> 4dbb63f08 [SPARK-20815][SPARKR] NullPointerException in RPackageUtils#checkManifestForR ## What changes were proposed in this pull request? - Add a null check to RPackageUtils#checkManifestForR so that jars w/o manifests

spark git commit: [SPARK-20815][SPARKR] NullPointerException in RPackageUtils#checkManifestForR

2017-05-22 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 d8328d8d1 -> ddc199eef [SPARK-20815][SPARKR] NullPointerException in RPackageUtils#checkManifestForR ## What changes were proposed in this pull request? - Add a null check to RPackageUtils#checkManifestForR so that jars w/o manifests

spark git commit: [SPARK-20727] Skip tests that use Hadoop utils on CRAN Windows

2017-05-23 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 4dbb63f08 -> d06610f99 [SPARK-20727] Skip tests that use Hadoop utils on CRAN Windows ## What changes were proposed in this pull request? This change skips tests that use the Hadoop libraries while running on CRAN check with Windows as

spark git commit: [SPARK-20727] Skip tests that use Hadoop utils on CRAN Windows

2017-05-23 Thread felixcheung
Repository: spark Updated Branches: refs/heads/branch-2.2 ddc199eef -> 5e9541a4d [SPARK-20727] Skip tests that use Hadoop utils on CRAN Windows ## What changes were proposed in this pull request? This change skips tests that use the Hadoop libraries while running on CRAN check with Windows

spark git commit: [SPARK-20849][DOC][SPARKR] Document R DecisionTree

2017-05-26 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 8ce0d8ffb -> a97c49704 [SPARK-20849][DOC][SPARKR] Document R DecisionTree ## What changes were proposed in this pull request? 1. Add an example for SparkR `decisionTree`. 2. Document it in the user guide. ## How was this patch tested? local

spark git commit: [SPARKR][DOCS][MINOR] Use consistent names in rollup and cube examples

2017-05-19 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master ea3b1e352 -> 2d90c04f2 [SPARKR][DOCS][MINOR] Use consistent names in rollup and cube examples ## What changes were proposed in this pull request? Rename `carsDF` to `df` in SparkR `rollup` and `cube` examples. ## How was this patch

spark git commit: [SPARKR] Fix bad examples in DataFrame methods and style issues

2017-05-19 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 2d90c04f2 -> 7f203a248 [SPARKR] Fix bad examples in DataFrame methods and style issues ## What changes were proposed in this pull request? Some examples in the DataFrame methods are syntactically wrong, even though they are pseudo code.

spark git commit: [SPARK-20980][DOCS] update doc to reflect multiLine change

2017-06-15 Thread felixcheung
xcheun...@hotmail.com> Closes #18312 from felixcheung/sqljsonwholefiledoc. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1bf55e39 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1bf55e39 Diff: http://git-wip-us.apache.org/

spark git commit: [SPARK-20980][DOCS] update doc to reflect multiLine change

2017-06-15 Thread felixcheung
xcheun...@hotmail.com> Closes #18312 from felixcheung/sqljsonwholefiledoc. (cherry picked from commit 1bf55e396c7b995a276df61d9a4eb8e60bcee334) Signed-off-by: Felix Cheung <felixche...@apache.org> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/

spark git commit: [SPARK-20892][SPARKR] Add SQL trunc function to SparkR

2017-06-18 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 05f83c532 -> 110ce1f27 [SPARK-20892][SPARKR] Add SQL trunc function to SparkR ## What changes were proposed in this pull request? Add SQL trunc function ## How was this patch tested? standard test Author: actuaryzhang

spark git commit: [SPARK-21128][R] Remove both "spark-warehouse" and "metastore_db" before listing files in R tests

2017-06-18 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 75a6d0585 -> 05f83c532 [SPARK-21128][R] Remove both "spark-warehouse" and "metastore_db" before listing files in R tests ## What changes were proposed in this pull request? This PR proposes to list the files in test _after_ removing both

spark git commit: [TEST][SPARKR][CORE] Fix broken SparkSubmitSuite

2017-06-12 Thread felixcheung
ore and subsequent post-commit with the change built fine (again because it wasn't building core) actually appveyor builds everything but it's not running scala suites ... ## How was this patch tested? jenkins srowen gatorsmile Author: Felix Cheung <felixcheun...@hotmail.com> Closes #18283 from fe

spark git commit: [TEST][SPARKR][CORE] Fix broken SparkSubmitSuite

2017-06-12 Thread felixcheung
ent post-commit with the change built fine (again because it wasn't building core) actually appveyor builds everything but it's not running scala suites ... ## How was this patch tested? jenkins srowen gatorsmile Author: Felix Cheung <felixcheun...@hotmail.com> Closes #18283 from fe

[1/7] spark git commit: [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN

2017-06-11 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master 5301a19a0 -> dc4c35183 http://git-wip-us.apache.org/repos/asf/spark/blob/dc4c3518/R/pkg/tests/fulltests/test_streaming.R -- diff --git

[2/7] spark git commit: [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN

2017-06-11 Thread felixcheung
http://git-wip-us.apache.org/repos/asf/spark/blob/dc4c3518/R/pkg/tests/fulltests/test_sparkSQL.R -- diff --git a/R/pkg/tests/fulltests/test_sparkSQL.R b/R/pkg/tests/fulltests/test_sparkSQL.R new file mode 100644 index

[3/7] spark git commit: [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN

2017-06-11 Thread felixcheung
http://git-wip-us.apache.org/repos/asf/spark/blob/0b0be47e/R/pkg/tests/fulltests/test_mllib_fpm.R -- diff --git a/R/pkg/tests/fulltests/test_mllib_fpm.R b/R/pkg/tests/fulltests/test_mllib_fpm.R new file mode 100644 index

[3/7] spark git commit: [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN

2017-06-11 Thread felixcheung
http://git-wip-us.apache.org/repos/asf/spark/blob/dc4c3518/R/pkg/tests/fulltests/test_mllib_fpm.R -- diff --git a/R/pkg/tests/fulltests/test_mllib_fpm.R b/R/pkg/tests/fulltests/test_mllib_fpm.R new file mode 100644 index
