spark git commit: [SPARK-22268][BUILD] Fix lint-java

2017-10-19 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 5a07aca4d -> 7fae7995b [SPARK-22268][BUILD] Fix lint-java ## What changes were proposed in this pull request? Fix java style issues ## How was this patch tested? Run `./dev/lint-java` locally since it's not run on Jenkins Author:

spark git commit: [SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR

2017-10-26 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 3073344a2 -> a83d8d5ad [SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR ## What changes were proposed in this pull request? This PR proposes to revive `stringsAsFactors` option in collect API, which was mistakenly
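For readers unfamiliar with R, the effect of `stringsAsFactors` can be illustrated with a small language-neutral sketch in Python: a character column becomes integer codes over its sorted levels. This is only a toy illustration of factor encoding, not SparkR's code.

```python
def as_factor(values):
    """Encode a character column as a factor: sorted unique levels plus
    1-based integer codes, loosely mimicking R's stringsAsFactors=TRUE."""
    levels = sorted(set(values))
    codes = [levels.index(v) + 1 for v in values]
    return codes, levels

print(as_factor(["b", "a", "b"]))
```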

spark git commit: [SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR

2017-10-26 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.2 d2dc175a1 -> 24fe7ccba [SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR ## What changes were proposed in this pull request? This PR proposes to revive `stringsAsFactors` option in collect API, which was

spark git commit: [SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR

2017-10-26 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.1 3e77b7481 -> aa023fddb [SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR ## What changes were proposed in this pull request? This PR proposes to revive `stringsAsFactors` option in collect API, which was

spark git commit: [SPARK-24709][SQL] schema_of_json() - schema inference from an example

2018-07-03 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 5585c5765 -> 776f299fc [SPARK-24709][SQL] schema_of_json() - schema inference from an example ## What changes were proposed in this pull request? In the PR, I propose to add new function - *schema_of_json()* which infers schema of JSON
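As a rough illustration of what "schema inference from an example" means, here is a minimal Python sketch that derives a `struct<...>`-style schema string from a single JSON value. It is not Spark's implementation; the type names and formatting are only loosely modelled on Spark SQL's.

```python
import json

def schema_of_json(sample):
    """Infer a struct-style schema string from one JSON example (toy version)."""
    def type_of(v):
        if isinstance(v, bool):      # check bool before int: bool subclasses int
            return "boolean"
        if isinstance(v, int):
            return "bigint"
        if isinstance(v, float):
            return "double"
        if isinstance(v, list):
            inner = type_of(v[0]) if v else "string"
            return f"array<{inner}>"
        if isinstance(v, dict):
            fields = ",".join(f"{k}:{type_of(x)}" for k, x in v.items())
            return f"struct<{fields}>"
        return "string"
    return type_of(json.loads(sample))

print(schema_of_json('{"a": 1, "b": [0.5]}'))
```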

[4/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.3.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/5660fb9a/site/docs/2.3.1/api/python/pyspark.mllib.html -- diff --git a/site/docs/2.3.1/api/python/pyspark.mllib.html b/site/docs/2.3.1/api/python/pyspark.mllib.html index

[1/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.3.1

2018-07-03 Thread gurwls223
Repository: spark-website Updated Branches: refs/heads/asf-site 26b527127 -> 5660fb9a4 http://git-wip-us.apache.org/repos/asf/spark-website/blob/5660fb9a/site/docs/2.3.1/api/python/searchindex.js -- diff --git

[3/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.3.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/5660fb9a/site/docs/2.3.1/api/python/pyspark.sql.html -- diff --git a/site/docs/2.3.1/api/python/pyspark.sql.html b/site/docs/2.3.1/api/python/pyspark.sql.html index

[2/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.3.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/5660fb9a/site/docs/2.3.1/api/python/pyspark.streaming.html -- diff --git a/site/docs/2.3.1/api/python/pyspark.streaming.html

[6/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.3.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/5660fb9a/site/docs/2.3.1/api/python/_modules/pyspark/profiler.html -- diff --git a/site/docs/2.3.1/api/python/_modules/pyspark/profiler.html

[5/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.3.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/5660fb9a/site/docs/2.3.1/api/python/pyspark.ml.html -- diff --git a/site/docs/2.3.1/api/python/pyspark.ml.html b/site/docs/2.3.1/api/python/pyspark.ml.html index

[7/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
Fix signature description broken in PySpark API documentation in 2.2.1 Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/26b52712 Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/26b52712 Diff:

[5/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/pyspark.ml.html -- diff --git a/site/docs/2.2.1/api/python/pyspark.ml.html b/site/docs/2.2.1/api/python/pyspark.ml.html index

[6/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/_modules/pyspark/rdd.html -- diff --git a/site/docs/2.2.1/api/python/_modules/pyspark/rdd.html

[2/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/pyspark.streaming.html -- diff --git a/site/docs/2.2.1/api/python/pyspark.streaming.html

[4/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/pyspark.mllib.html -- diff --git a/site/docs/2.2.1/api/python/pyspark.mllib.html b/site/docs/2.2.1/api/python/pyspark.mllib.html index

spark git commit: [SPARK-23698] Remove raw_input() from Python 2

2018-07-03 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 776f299fc -> b42fda8ab [SPARK-23698] Remove raw_input() from Python 2 Signed-off-by: cclauss ## What changes were proposed in this pull request? Humans will be able to enter text in Python 3 prompts, which they cannot do today. The

spark git commit: [BUILD] Close stale PRs

2018-07-03 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master b42fda8ab -> 5bf95f2a3 [BUILD] Close stale PRs Closes #20932 Closes #17843 Closes #13477 Closes #14291 Closes #20919 Closes #17907 Closes #18766 Closes #20809 Closes #8849 Closes #21076 Closes #21507 Closes #21336 Closes #21681 Closes

[3/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/pyspark.sql.html -- diff --git a/site/docs/2.2.1/api/python/pyspark.sql.html b/site/docs/2.2.1/api/python/pyspark.sql.html index

[1/7] spark-website git commit: Fix signature description broken in PySpark API documentation in 2.2.1

2018-07-03 Thread gurwls223
Repository: spark-website Updated Branches: refs/heads/asf-site 8857572df -> 26b527127 http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/searchindex.js -- diff --git

spark git commit: [SPARK-24732][SQL] Type coercion between MapTypes.

2018-07-03 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 5bf95f2a3 -> 7c08eb6d6 [SPARK-24732][SQL] Type coercion between MapTypes. ## What changes were proposed in this pull request? Currently we don't allow type coercion between maps. We can support type coercion between MapTypes where both
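The idea of coercing two MapTypes can be sketched by coercing key and value types separately. The widening order below is a simplified placeholder, not Spark's full type-coercion rules.

```python
def wider_numeric(a, b):
    """Pick the wider of two numeric type names (simplified ordering)."""
    order = ["int", "bigint", "double"]
    return a if order.index(a) >= order.index(b) else b

def coerce_map_types(m1, m2):
    """Find a common map type by coercing key and value types independently."""
    (k1, v1), (k2, v2) = m1, m2
    return (wider_numeric(k1, k2), wider_numeric(v1, v2))

print(coerce_map_types(("int", "double"), ("bigint", "int")))
```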

spark git commit: [SPARK-22924][SPARKR] R API for sortWithinPartitions

2017-12-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master fd7d141d8 -> ea0a5eef2 [SPARK-22924][SPARKR] R API for sortWithinPartitions ## What changes were proposed in this pull request? Add to `arrange` the option to sort only within partition ## How was this patch tested? manual, unit tests

spark git commit: [SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test failure when xmlrunner is installed.

2017-12-29 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master dbd492b7e -> 11a849b3a [SPARK-22370][SQL][PYSPARK][FOLLOW-UP] Fix a test failure when xmlrunner is installed. ## What changes were proposed in this pull request? This is a follow-up pr of #19587. If `xmlrunner` is installed,

spark git commit: [HOTFIX] Fix Scala style checks

2017-12-23 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master ea2642eb0 -> f6084a88f [HOTFIX] Fix Scala style checks ## What changes were proposed in this pull request? This PR fixes a style that broke the build. ## How was this patch tested? Manually tested. Author: hyukjinkwon

spark git commit: [SPARK-22844][R] Adds date_trunc in R API

2017-12-23 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master f6084a88f -> aeb45df66 [SPARK-22844][R] Adds date_trunc in R API ## What changes were proposed in this pull request? This PR adds `date_trunc` in R API as below: ```r > df <- createDataFrame(list(list(a = as.POSIXlt("2012-12-13
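To show what truncating a timestamp to a unit means, here is a minimal Python stand-in for a `date_trunc`-style function; only a few units are sketched, and it does not reproduce Spark's exact semantics.

```python
from datetime import datetime

def date_trunc(fmt, ts):
    """Truncate a timestamp to the given unit (toy subset of units)."""
    if fmt == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if fmt == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if fmt == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {fmt!r}")

print(date_trunc("month", datetime(2012, 12, 13, 9, 30)))
```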

spark git commit: [SPARK-22967][TESTS] Fix VersionSuite's unit tests by change Windows path into URI path

2018-01-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 1c70da3bf -> 0552c36e0 [SPARK-22967][TESTS] Fix VersionSuite's unit tests by change Windows path into URI path ## What changes were proposed in this pull request? Two unit tests will fail due to Windows-format paths: 1.test(s"$version:

spark git commit: [SPARK-22967][TESTS] Fix VersionSuite's unit tests by change Windows path into URI path

2018-01-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 b78130123 -> 799598905 [SPARK-22967][TESTS] Fix VersionSuite's unit tests by change Windows path into URI path ## What changes were proposed in this pull request? Two unit tests will fail due to Windows-format paths:

spark git commit: [SPARK-19732][FOLLOW-UP] Document behavior changes made in na.fill and fillna

2018-01-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 76892bcf2 -> b46e58b74 [SPARK-19732][FOLLOW-UP] Document behavior changes made in na.fill and fillna ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/18164 introduces the behavior changes. We need

spark git commit: [SPARK-19732][FOLLOW-UP] Document behavior changes made in na.fill and fillna

2018-01-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 9ca0f6eaf -> f624850fe [SPARK-19732][FOLLOW-UP] Document behavior changes made in na.fill and fillna ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/18164 introduces the behavior changes. We

spark git commit: [SPARK-23009][PYTHON] Fix for non-str col names to createDataFrame from Pandas

2018-01-10 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 eb4fa551e -> 551ccfba5 [SPARK-23009][PYTHON] Fix for non-str col names to createDataFrame from Pandas ## What changes were proposed in this pull request? This is the case when calling `SparkSession.createDataFrame` using a Pandas

spark git commit: [SPARK-23141][SQL][PYSPARK] Support data type string as a returnType for registerJavaFunction.

2018-01-18 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 8a9827482 -> e0421c650 [SPARK-23141][SQL][PYSPARK] Support data type string as a returnType for registerJavaFunction. ## What changes were proposed in this pull request? Currently `UDFRegistration.registerJavaFunction` doesn't

spark git commit: [SPARK-23141][SQL][PYSPARK] Support data type string as a returnType for registerJavaFunction.

2018-01-18 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master e28eb4311 -> 5063b7481 [SPARK-23141][SQL][PYSPARK] Support data type string as a returnType for registerJavaFunction. ## What changes were proposed in this pull request? Currently `UDFRegistration.registerJavaFunction` doesn't support
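The convenience this change adds — accepting a type string in place of a type object — can be sketched as a small lookup. The mapping below uses placeholder Python types, not Spark's DDL parser or DataType classes.

```python
# Hypothetical mapping of simple type strings to placeholder type objects.
_TYPES = {"string": str, "integer": int, "double": float, "boolean": bool}

def parse_return_type(return_type):
    """Accept either a type object or a data type string like 'integer'."""
    if isinstance(return_type, type):
        return return_type           # already a type object
    try:
        return _TYPES[return_type]   # a data type string
    except KeyError:
        raise ValueError(f"unsupported type string: {return_type!r}")

print(parse_return_type("integer"))
```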

spark git commit: [SPARK-23094] Fix invalid character handling in JsonDataSource

2018-01-18 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 b8c6d9303 -> a295034da [SPARK-23094] Fix invalid character handling in JsonDataSource ## What changes were proposed in this pull request? There were two related fixes regarding `from_json`, `get_json_object` and `json_tuple` ([Fix

spark git commit: [SPARK-23094] Fix invalid character handling in JsonDataSource

2018-01-18 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master f568e9cf7 -> e01919e83 [SPARK-23094] Fix invalid character handling in JsonDataSource ## What changes were proposed in this pull request? There were two related fixes regarding `from_json`, `get_json_object` and `json_tuple` ([Fix

spark git commit: [SPARK-23080][SQL] Improve error message for built-in functions

2018-01-15 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 6c81fe227 -> 8ab2d7ea9 [SPARK-23080][SQL] Improve error message for built-in functions ## What changes were proposed in this pull request? When a user puts the wrong number of parameters in a function, an AnalysisException is thrown. If

spark git commit: [SPARK-23080][SQL] Improve error message for built-in functions

2018-01-15 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 706a308bd -> bb8e5addc [SPARK-23080][SQL] Improve error message for built-in functions ## What changes were proposed in this pull request? When a user puts the wrong number of parameters in a function, an AnalysisException is thrown.

spark git commit: [SPARK-22978][PYSPARK] Register Vectorized UDFs for SQL Statement

2018-01-16 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 20c69816a -> 5c06ee2d4 [SPARK-22978][PYSPARK] Register Vectorized UDFs for SQL Statement ## What changes were proposed in this pull request? Register Vectorized UDFs for SQL Statement. For example, ```Python >>> from

spark git commit: [SPARK-22978][PYSPARK] Register Vectorized UDFs for SQL Statement

2018-01-16 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 66217dac4 -> b85eb946a [SPARK-22978][PYSPARK] Register Vectorized UDFs for SQL Statement ## What changes were proposed in this pull request? Register Vectorized UDFs for SQL Statement. For example, ```Python >>> from pyspark.sql.functions

spark git commit: [SPARK-23069][DOCS][SPARKR] fix R doc for describe missing text

2018-01-14 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 7a3d0aad2 -> 66738d29c [SPARK-23069][DOCS][SPARKR] fix R doc for describe missing text ## What changes were proposed in this pull request? fix truncated doc ## How was this patch tested? manually Author: Felix Cheung

spark git commit: [SPARK-23069][DOCS][SPARKR] fix R doc for describe missing text

2018-01-14 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 a335a49ce -> 0d425c336 [SPARK-23069][DOCS][SPARKR] fix R doc for describe missing text ## What changes were proposed in this pull request? fix truncated doc ## How was this patch tested? manually Author: Felix Cheung

spark git commit: [SPARK-20947][PYTHON] Fix encoding/decoding error in pipe action

2018-01-21 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 12faae295 -> 602c6d82d [SPARK-20947][PYTHON] Fix encoding/decoding error in pipe action ## What changes were proposed in this pull request? Pipe action convert objects into strings using a way that was affected by the default encoding
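The class of bug described here — serializing text for a piped subprocess with the platform's default encoding — is avoided by encoding explicitly. A minimal sketch of the safe pattern (not PySpark's actual pipe code):

```python
def to_pipe_bytes(obj, encoding="utf-8"):
    """Serialize one element for a piped subprocess with an explicit encoding,
    instead of relying on the platform default."""
    s = obj if isinstance(obj, str) else str(obj)
    return (s + "\n").encode(encoding)

print(to_pipe_bytes("café"))
```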

spark git commit: [SPARK-23169][INFRA][R] Run lintr on the changes of lint-r script and .lintr configuration

2018-01-21 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 2239d7a41 -> 12faae295 [SPARK-23169][INFRA][R] Run lintr on the changes of lint-r script and .lintr configuration ## What changes were proposed in this pull request? When running the `run-tests` script, seems we don't run lintr on the

spark git commit: [SPARK-20749][SQL][FOLLOW-UP] Override prettyName for bit_length and octet_length

2018-01-23 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 96cb60bc3 -> ee572ba8c [SPARK-20749][SQL][FOLLOW-UP] Override prettyName for bit_length and octet_length ## What changes were proposed in this pull request? We need to override the prettyName for bit_length and octet_length for getting

spark git commit: [SPARK-23047][PYTHON][SQL] Change MapVector to NullableMapVector in ArrowColumnVector

2018-01-17 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 79ccd0cad -> 6e509fde3 [SPARK-23047][PYTHON][SQL] Change MapVector to NullableMapVector in ArrowColumnVector ## What changes were proposed in this pull request? This PR changes usage of `MapVector` in Spark codebase to use

spark git commit: [SPARK-23047][PYTHON][SQL] Change MapVector to NullableMapVector in ArrowColumnVector

2018-01-17 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master e946c63dd -> 4e6f8fb15 [SPARK-23047][PYTHON][SQL] Change MapVector to NullableMapVector in ArrowColumnVector ## What changes were proposed in this pull request? This PR changes usage of `MapVector` in Spark codebase to use

spark git commit: [SPARK-23132][PYTHON][ML] Run doctests in ml.image when testing

2018-01-17 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 4e6f8fb15 -> 45ad97df8 [SPARK-23132][PYTHON][ML] Run doctests in ml.image when testing ## What changes were proposed in this pull request? This PR proposes to actually run the doctests in `ml/image.py`. ## How was this patch tested?

spark git commit: [SPARK-23132][PYTHON][ML] Run doctests in ml.image when testing

2018-01-17 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 6e509fde3 -> b84c2a306 [SPARK-23132][PYTHON][ML] Run doctests in ml.image when testing ## What changes were proposed in this pull request? This PR proposes to actually run the doctests in `ml/image.py`. ## How was this patch tested?

spark git commit: [SPARK-23081][PYTHON] Add colRegex API to PySpark

2018-01-25 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 8532e26f3 -> 8480c0c57 [SPARK-23081][PYTHON] Add colRegex API to PySpark ## What changes were proposed in this pull request? Add colRegex API to PySpark ## How was this patch tested? add a test in sql/tests.py Author: Huaxin Gao

spark git commit: [SPARK-23081][PYTHON] Add colRegex API to PySpark

2018-01-25 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 8866f9c24 -> 2f65c20ea [SPARK-23081][PYTHON] Add colRegex API to PySpark ## What changes were proposed in this pull request? Add colRegex API to PySpark ## How was this patch tested? add a test in sql/tests.py Author: Huaxin Gao
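The behaviour of a `colRegex`-style API — selecting columns whose names match a regular expression — can be illustrated with plain Python over a dict of columns. The column names below are made up for illustration.

```python
import re

def col_regex(row, pattern):
    """Select columns whose names fully match a regex (toy colRegex analogue)."""
    rx = re.compile(pattern)
    return {k: v for k, v in row.items() if rx.fullmatch(k)}

row = {"Col1": 1, "Col2": 2, "Col3": 3, "other": 4}
print(col_regex(row, r"Col[12]"))
```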

spark git commit: [SPARK-23009][PYTHON] Fix for non-str col names to createDataFrame from Pandas

2018-01-09 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 7bcc26668 -> e59983724 [SPARK-23009][PYTHON] Fix for non-str col names to createDataFrame from Pandas ## What changes were proposed in this pull request? This is the case when calling `SparkSession.createDataFrame` using a Pandas DataFrame

spark git commit: [SPARK-22980][PYTHON][SQL] Clarify the length of each series is of each batch within scalar Pandas UDF

2018-01-12 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 55dbfbca3 -> cd9f49a2a [SPARK-22980][PYTHON][SQL] Clarify the length of each series is of each batch within scalar Pandas UDF ## What changes were proposed in this pull request? This PR proposes to add a note saying that the length of a

spark git commit: [SPARK-22980][PYTHON][SQL] Clarify the length of each series is of each batch within scalar Pandas UDF

2018-01-12 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 60bcb4685 -> ca27d9cb5 [SPARK-22980][PYTHON][SQL] Clarify the length of each series is of each batch within scalar Pandas UDF ## What changes were proposed in this pull request? This PR proposes to add a note saying that the length

spark git commit: [SPARK-23261][PYSPARK] Rename Pandas UDFs

2018-01-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 0a9ac0248 -> 7a2ada223 [SPARK-23261][PYSPARK] Rename Pandas UDFs ## What changes were proposed in this pull request? Rename the public APIs and names of pandas udfs. - `PANDAS SCALAR UDF` -> `SCALAR PANDAS UDF` - `PANDAS GROUP MAP UDF` ->

spark git commit: [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*.py to .gitignore file.

2018-01-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 84bcf9dc8 -> a23187f53 [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*.py to .gitignore file. ## What changes were proposed in this pull request? This is a follow-up pr of #20338 which changed the downloaded file name of the

spark git commit: [MINOR] Fix typos in dev/* scripts.

2018-01-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 58fcb5a95 -> 9623a9824 [MINOR] Fix typos in dev/* scripts. ## What changes were proposed in this pull request? Consistency in style, grammar and removal of extraneous characters. ## How was this patch tested? Manually as this is a doc

spark git commit: [SPARK-23238][SQL] Externalize SQLConf configurations exposed in documentation

2018-01-29 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 49b0207dc -> 39d2c6b03 [SPARK-23238][SQL] Externalize SQLConf configurations exposed in documentation ## What changes were proposed in this pull request? This PR proposes to expose a few internal configurations found in the documentation.

spark git commit: [SPARK-23238][SQL] Externalize SQLConf configurations exposed in documentation

2018-01-29 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 5dda5db12 -> 8229e155d [SPARK-23238][SQL] Externalize SQLConf configurations exposed in documentation ## What changes were proposed in this pull request? This PR proposes to expose a few internal configurations found in the

spark git commit: [SPARK-23248][PYTHON][EXAMPLES] Relocate module docstrings to the top in PySpark examples

2018-01-27 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 3b6fc286d -> 8ff0cc48b [SPARK-23248][PYTHON][EXAMPLES] Relocate module docstrings to the top in PySpark examples ## What changes were proposed in this pull request? This PR proposes to relocate the docstrings in modules of examples

spark git commit: [SPARK-23248][PYTHON][EXAMPLES] Relocate module docstrings to the top in PySpark examples

2018-01-27 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 3227d14fe -> b8c32dc57 [SPARK-23248][PYTHON][EXAMPLES] Relocate module docstrings to the top in PySpark examples ## What changes were proposed in this pull request? This PR proposes to relocate the docstrings in modules of examples to

spark git commit: [SPARK-23157][SQL][FOLLOW-UP] DataFrame -> SparkDataFrame in R comment

2018-01-31 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 9ff1d96f0 -> f470df2fc [SPARK-23157][SQL][FOLLOW-UP] DataFrame -> SparkDataFrame in R comment Author: Henry Robinson Closes #20443 from henryr/SPARK-23157. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SPARK-23157][SQL][FOLLOW-UP] DataFrame -> SparkDataFrame in R comment

2018-01-31 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 8ee3a71c9 -> 7ccfc7530 [SPARK-23157][SQL][FOLLOW-UP] DataFrame -> SparkDataFrame in R comment Author: Henry Robinson Closes #20443 from henryr/SPARK-23157. (cherry picked from commit

spark git commit: [SPARK-23228][PYSPARK] Add Python Created jsparkSession to JVM's defaultSession

2018-01-31 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 161a3f2ae -> 3d0911bbe [SPARK-23228][PYSPARK] Add Python Created jsparkSession to JVM's defaultSession ## What changes were proposed in this pull request? In the current PySpark code, Python created `jsparkSession` doesn't add to JVM's

spark git commit: [SPARK-23300][TESTS][BRANCH-2.3] Prints out if Pandas and PyArrow are installed or not in PySpark SQL tests

2018-02-07 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 05239afc9 -> 2ba07d5b1 [SPARK-23300][TESTS][BRANCH-2.3] Prints out if Pandas and PyArrow are installed or not in PySpark SQL tests This PR backports https://github.com/apache/spark/pull/20473 to branch-2.3. Author: hyukjinkwon

spark git commit: [SPARK-23319][TESTS][BRANCH-2.3] Explicitly specify Pandas and PyArrow versions in PySpark tests (to skip or test)

2018-02-07 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 db59e5542 -> 053830256 [SPARK-23319][TESTS][BRANCH-2.3] Explicitly specify Pandas and PyArrow versions in PySpark tests (to skip or test) This PR backports https://github.com/apache/spark/pull/20487 to branch-2.3. Author: hyukjinkwon

spark git commit: [SPARK-23256][ML][PYTHON] Add columnSchema method to PySpark image reader

2018-02-04 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 551dff2bc -> 715047b02 [SPARK-23256][ML][PYTHON] Add columnSchema method to PySpark image reader ## What changes were proposed in this pull request? This PR proposes to add `columnSchema` in Python side too. ```python >>> from

spark git commit: [SPARK-23290][SQL][PYTHON][BACKPORT-2.3] Use datetime.date for date type when converting Spark DataFrame to Pandas DataFrame.

2018-02-06 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 521494d7b -> 44933033e [SPARK-23290][SQL][PYTHON][BACKPORT-2.3] Use datetime.date for date type when converting Spark DataFrame to Pandas DataFrame. ## What changes were proposed in this pull request? This is a backport of #20506.

spark git commit: [SPARK-23334][SQL][PYTHON] Fix pandas_udf with return type StringType() to handle str type properly in Python 2.

2018-02-06 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 44933033e -> a51154482 [SPARK-23334][SQL][PYTHON] Fix pandas_udf with return type StringType() to handle str type properly in Python 2. ## What changes were proposed in this pull request? In Python 2, when `pandas_udf` tries to

spark git commit: [SPARK-23334][SQL][PYTHON] Fix pandas_udf with return type StringType() to handle str type properly in Python 2.

2018-02-06 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 8141c3e3d -> 63c5bf13c [SPARK-23334][SQL][PYTHON] Fix pandas_udf with return type StringType() to handle str type properly in Python 2. ## What changes were proposed in this pull request? In Python 2, when `pandas_udf` tries to return

spark git commit: [SPARK-23352][PYTHON] Explicitly specify supported types in Pandas UDFs

2018-02-12 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 6efd5d117 -> c338c8cf8 [SPARK-23352][PYTHON] Explicitly specify supported types in Pandas UDFs ## What changes were proposed in this pull request? This PR targets to explicitly specify supported types in Pandas UDFs. The main change here

spark git commit: [SPARK-23084][PYTHON] Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark

2018-02-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master a34fce19b -> 8acb51f08 [SPARK-23084][PYTHON] Add unboundedPreceding(), unboundedFollowing() and currentRow() to PySpark ## What changes were proposed in this pull request? Added unboundedPreceding(), unboundedFollowing() and currentRow()
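The frame boundaries these helpers expose can be sketched by resolving a rows-between frame inside a partition. The constant values below are illustrative only; Spark's actual sentinel values differ.

```python
import sys

# Illustrative sentinels for window-frame boundaries (not Spark's real values).
UNBOUNDED_PRECEDING = -sys.maxsize
UNBOUNDED_FOLLOWING = sys.maxsize
CURRENT_ROW = 0

def frame_bounds(i, start, end, n):
    """Resolve a rows-between frame for row i in a partition of n rows."""
    lo = 0 if start == UNBOUNDED_PRECEDING else max(0, i + start)
    hi = n - 1 if end == UNBOUNDED_FOLLOWING else min(n - 1, i + end)
    return lo, hi

print(frame_bounds(2, UNBOUNDED_PRECEDING, CURRENT_ROW, 5))
```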

spark git commit: [SPARK-23387][SQL][PYTHON][TEST][BRANCH-2.3] Backport assertPandasEqual to branch-2.3.

2018-02-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 9fa7b0e10 -> 8875e47ce [SPARK-23387][SQL][PYTHON][TEST][BRANCH-2.3] Backport assertPandasEqual to branch-2.3. ## What changes were proposed in this pull request? When backporting a pr with tests using `assertPandasEqual` from master

spark git commit: [SPARK-22624][PYSPARK] Expose range partitioning shuffle introduced by spark-22614

2018-02-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 8acb51f08 -> eacb62fbb [SPARK-22624][PYSPARK] Expose range partitioning shuffle introduced by spark-22614 ## What changes were proposed in this pull request? Expose range partitioning shuffle introduced by spark-22614 ## How was this

spark git commit: [SPARK-23314][PYTHON] Add ambiguous=False when localizing tz-naive timestamps in Arrow codepath to deal with dst

2018-02-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 0783876c8 -> a34fce19b [SPARK-23314][PYTHON] Add ambiguous=False when localizing tz-naive timestamps in Arrow codepath to deal with dst ## What changes were proposed in this pull request? When tz_localize is applied to a tz-naive timestamp, pandas will

spark git commit: [SPARK-23314][PYTHON] Add ambiguous=False when localizing tz-naive timestamps in Arrow codepath to deal with dst

2018-02-11 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 b7571b9bf -> 9fa7b0e10 [SPARK-23314][PYTHON] Add ambiguous=False when localizing tz-naive timestamps in Arrow codepath to deal with dst ## What changes were proposed in this pull request? When tz_localize is applied to a tz-naive timestamp, pandas

spark git commit: [SPARK-20090][FOLLOW-UP] Revert the deprecation of `names` in PySpark

2018-02-12 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master f17b936f0 -> 407f67249 [SPARK-20090][FOLLOW-UP] Revert the deprecation of `names` in PySpark ## What changes were proposed in this pull request? Deprecating the field `name` in PySpark is not expected. This PR is to revert the change. ##

spark git commit: [SPARK-20090][FOLLOW-UP] Revert the deprecation of `names` in PySpark

2018-02-12 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 43f5e4067 -> 3737c3d32 [SPARK-20090][FOLLOW-UP] Revert the deprecation of `names` in PySpark ## What changes were proposed in this pull request? Deprecating the field `name` in PySpark is not expected. This PR is to revert the change.

spark git commit: [SPARK-23360][SQL][PYTHON] Get local timezone from environment via pytz, or dateutil.

2018-02-10 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 f3a9a7f6b -> b7571b9bf [SPARK-23360][SQL][PYTHON] Get local timezone from environment via pytz, or dateutil. ## What changes were proposed in this pull request? Currently we use `tzlocal()` to get Python local timezone, but it

spark git commit: [SPARK-23360][SQL][PYTHON] Get local timezone from environment via pytz, or dateutil.

2018-02-10 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 6d7c38330 -> 97a224a85 [SPARK-23360][SQL][PYTHON] Get local timezone from environment via pytz, or dateutil. ## What changes were proposed in this pull request? Currently we use `tzlocal()` to get Python local timezone, but it sometimes

spark git commit: [SPARK-23300][TESTS] Prints out if Pandas and PyArrow are installed or not in PySpark SQL tests

2018-02-05 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master a24c03138 -> 8141c3e3d [SPARK-23300][TESTS] Prints out if Pandas and PyArrow are installed or not in PySpark SQL tests ## What changes were proposed in this pull request? This PR proposes to log if PyArrow and Pandas are installed or not

spark git commit: [SPARK-23122][PYSPARK][FOLLOWUP] Replace registerTempTable by createOrReplaceTempView

2018-02-07 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master c36fecc3b -> 9775df67f [SPARK-23122][PYSPARK][FOLLOWUP] Replace registerTempTable by createOrReplaceTempView ## What changes were proposed in this pull request? Replace `registerTempTable` by `createOrReplaceTempView`. ## How was this

spark git commit: [SPARK-23122][PYSPARK][FOLLOWUP] Replace registerTempTable by createOrReplaceTempView

2018-02-07 Thread gurwls223
Repository: spark Updated Branches: refs/heads/branch-2.3 874d3f89f -> cb22e830b [SPARK-23122][PYSPARK][FOLLOWUP] Replace registerTempTable by createOrReplaceTempView ## What changes were proposed in this pull request? Replace `registerTempTable` by `createOrReplaceTempView`. ## How was

spark git commit: [SPARK-23319][TESTS] Explicitly specify Pandas and PyArrow versions in PySpark tests (to skip or test)

2018-02-07 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 9775df67f -> 71cfba04a [SPARK-23319][TESTS] Explicitly specify Pandas and PyArrow versions in PySpark tests (to skip or test) ## What changes were proposed in this pull request? This PR proposes to explicitly specify Pandas and PyArrow

spark git commit: [SPARK-23240][PYTHON] Better error message when extraneous data in pyspark.daemon's stdout

2018-02-20 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master aadf9535b -> 862fa697d [SPARK-23240][PYTHON] Better error message when extraneous data in pyspark.daemon's stdout ## What changes were proposed in this pull request? Print more helpful message when daemon module's stdout is empty or
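The daemon handshake writes the listening port to stdout as a 4-byte big-endian int; the fix is about surfacing a useful message when that read sees empty or extraneous bytes. A sketch of the pattern, assuming that wire format (`read_daemon_port` is a hypothetical helper, not the real PySpark internals):

```python
import struct
from io import BytesIO

def read_daemon_port(stdout):
    """Read the 4-byte big-endian port the daemon is expected to write,
    and fail with a diagnostic that includes the bytes actually seen."""
    raw = stdout.read(4)
    if len(raw) != 4:
        raise RuntimeError(
            "daemon exited before writing its port; stdout was %r" % raw)
    port = struct.unpack(">i", raw)[0]
    if not 0 < port <= 65535:
        raise RuntimeError("invalid daemon port %d (raw bytes %r)" % (port, raw))
    return port

print(read_daemon_port(BytesIO(struct.pack(">i", 50123))))
```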

spark git commit: [SPARK-22843][R] Adds localCheckpoint in R

2017-12-28 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master ded6d27e4 -> 76e8a1d7e [SPARK-22843][R] Adds localCheckpoint in R ## What changes were proposed in this pull request? This PR proposes to add `localCheckpoint(..)` in R API. ```r df <- localCheckpoint(createDataFrame(iris)) ``` ## How

spark git commit: [SPARK-21208][R] Adds setLocalProperty and getLocalProperty in R

2017-12-28 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 76e8a1d7e -> 1eebfbe19 [SPARK-21208][R] Adds setLocalProperty and getLocalProperty in R ## What changes were proposed in this pull request? This PR adds `setLocalProperty` and `getLocalProperty`in R. ```R > df <- createDataFrame(iris) >

spark git commit: [SPARK-21552][SQL] Add DecimalType support to ArrowWriter.

2017-12-26 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 0e6833006 -> eb386be1e [SPARK-21552][SQL] Add DecimalType support to ArrowWriter. ## What changes were proposed in this pull request? Decimal type is not yet supported in `ArrowWriter`. This is adding the decimal type support. ## How was

spark git commit: [SPARK-22874][PYSPARK][SQL] Modify checking pandas version to use LooseVersion.

2017-12-22 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 8df1da396 -> 13190a4f6 [SPARK-22874][PYSPARK][SQL] Modify checking pandas version to use LooseVersion. ## What changes were proposed in this pull request? Currently we check pandas version by capturing if `ImportError` for the specific
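The point of the change is to compare version strings rather than merely catch `ImportError`. A minimal stand-in for the `LooseVersion` comparison, assuming purely numeric dotted versions (the `at_least` helper and the `"0.19.2"` floor are illustrative, taken from PySpark's documented pandas minimum at the time):

```python
def at_least(installed, minimum):
    """True if dotted version `installed` >= `minimum`.
    Stand-in for distutils.version.LooseVersion used by the commit;
    assumes purely numeric components like "0.19.2".
    """
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(minimum)

minimum_pandas_version = "0.19.2"
print(at_least("0.23.4", minimum_pandas_version))
```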

spark git commit: [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py file.

2017-12-27 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 6674acd1e -> b8bfce51a [SPARK-22324][SQL][PYTHON][FOLLOW-UP] Update setup.py file. ## What changes were proposed in this pull request? This is a follow-up pr of #19884 updating setup.py file to add pyarrow dependency. ## How was this

spark git commit: [SPARK-21616][SPARKR][DOCS] update R migration guide and vignettes

2018-01-01 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master f5b7714e0 -> 7a702d8d5 [SPARK-21616][SPARKR][DOCS] update R migration guide and vignettes ## What changes were proposed in this pull request? update R migration guide and vignettes ## How was this patch tested? manually Author: Felix

spark git commit: [MINOR] Fix a bunch of typos

2018-01-01 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 7a702d8d5 -> c284c4e1f [MINOR] Fix a bunch of typos Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c284c4e1 Tree:

spark git commit: [SPARK-21893][SPARK-22142][TESTS][FOLLOWUP] Enables PySpark tests for Flume and Kafka in Jenkins

2018-01-01 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 1c9f95cb7 -> e734a4b9c [SPARK-21893][SPARK-22142][TESTS][FOLLOWUP] Enables PySpark tests for Flume and Kafka in Jenkins ## What changes were proposed in this pull request? This PR proposes to enable PySpark tests for Flume and Kafka in

spark git commit: [SPARK-22530][PYTHON][SQL] Adding Arrow support for ArrayType

2018-01-01 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master c284c4e1f -> 1c9f95cb7 [SPARK-22530][PYTHON][SQL] Adding Arrow support for ArrayType ## What changes were proposed in this pull request? This change adds `ArrayType` support for working with Arrow in pyspark when creating a DataFrame,

spark git commit: [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF and Scalar Pandas UDF

2018-07-27 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 6424b146c -> e8752095a [SPARK-24624][SQL][PYTHON] Support mixture of Python UDF and Scalar Pandas UDF ## What changes were proposed in this pull request? This PR add supports for using mixed Python UDF and Scalar Pandas UDF, in the

spark git commit: [SPARK-24924][SQL][FOLLOW-UP] Add mapping for built-in Avro data source

2018-07-27 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master e8752095a -> c6a3db2fb [SPARK-24924][SQL][FOLLOW-UP] Add mapping for built-in Avro data source ## What changes were proposed in this pull request? Add one more test case for `com.databricks.spark.avro`. ## How was this patch tested? N/A

spark git commit: [SPARK-24945][SQL] Switching to uniVocity 2.7.3

2018-08-02 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 7cf16a7fa -> b3f2911ee [SPARK-24945][SQL] Switching to uniVocity 2.7.3 ## What changes were proposed in this pull request? In the PR, I propose to upgrade uniVocity parser from **2.6.3** to **2.7.3**. The recent version includes a fix

spark git commit: [SPARK-24773] Avro: support logical timestamp type with different precisions

2018-08-02 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 29077a1d1 -> 7cf16a7fa [SPARK-24773] Avro: support logical timestamp type with different precisions ## What changes were proposed in this pull request? Support reading/writing Avro logical timestamp type with different precisions
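The commit itself is in the Scala Avro datasource; as a language-neutral illustration, the two Avro logical timestamp precisions differ only in the unit counted from the epoch (helper names here are illustrative):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_timestamp_millis(value):
    # Avro logical type timestamp-millis: milliseconds since the epoch.
    return EPOCH + timedelta(milliseconds=value)

def from_timestamp_micros(value):
    # Avro logical type timestamp-micros: microseconds since the epoch.
    return EPOCH + timedelta(microseconds=value)

print(from_timestamp_millis(1_533_168_000_000))
print(from_timestamp_micros(1_533_168_000_000_000))
```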

spark git commit: [SPARK-25011][ML] add prefix to __all__ in fpm.py

2018-08-03 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 19a453191 -> ebf33a333 [SPARK-25011][ML] add prefix to __all__ in fpm.py ## What changes were proposed in this pull request? jira: https://issues.apache.org/jira/browse/SPARK-25011 add prefix to __all__ in fpm.py ## How was this patch
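The `__all__` list in a module controls which names a star-import exports, which is what the fpm.py change adjusts. A self-contained demo using a throwaway module (the `fpm_demo` module and its classes are hypothetical, not the real pyspark.ml.fpm contents):

```python
import sys
import types

# Build a throwaway module whose __all__ exports only the public name.
mod = types.ModuleType("fpm_demo")
exec(
    "__all__ = ['FPGrowth']\n"
    "class FPGrowth: pass\n"
    "class _InternalHelper: pass\n",
    mod.__dict__,
)
sys.modules["fpm_demo"] = mod

ns = {}
exec("from fpm_demo import *", ns)
print("FPGrowth" in ns, "_InternalHelper" in ns)
```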

spark git commit: [SPARK-24952][SQL] Support LZMA2 compression by Avro datasource

2018-07-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master 2fbe294cf -> d20c10fdf [SPARK-24952][SQL] Support LZMA2 compression by Avro datasource ## What changes were proposed in this pull request? In the PR, I propose to support `LZMA2` (`XZ`) and `BZIP2` compressions by `AVRO` datasource in
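The datasource change is Scala-side, but both codecs are available in the Python standard library for a quick round-trip illustration (assuming, as the commit title suggests, that Avro's `xz` codec stores an LZMA2 stream in an XZ container):

```python
import bz2
import lzma

payload = b"spark avro compression demo " * 64

# XZ container holding an LZMA2 stream.
xz_bytes = lzma.compress(payload, format=lzma.FORMAT_XZ)
assert lzma.decompress(xz_bytes) == payload

# BZIP2, the other codec the commit enables.
bz2_bytes = bz2.compress(payload)
assert bz2.decompress(bz2_bytes) == payload

print(len(payload), len(xz_bytes), len(bz2_bytes))
```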

spark git commit: [SPARK-24956][BUILD][FOLLOWUP] Upgrade Maven version to 3.5.4 for AppVeyor as well

2018-07-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master d20c10fdf -> f1550aaf1 [SPARK-24956][BUILD][FOLLOWUP] Upgrade Maven version to 3.5.4 for AppVeyor as well ## What changes were proposed in this pull request? Maven version was upgraded and AppVeyor should also use upgraded maven version.

spark git commit: [SPARK-23633][SQL] Update Pandas UDFs section in sql-programming-guide

2018-07-30 Thread gurwls223
Repository: spark Updated Branches: refs/heads/master f1550aaf1 -> 8141d5592 [SPARK-23633][SQL] Update Pandas UDFs section in sql-programming-guide ## What changes were proposed in this pull request? Update Pandas UDFs section in sql-programming-guide. Add section for grouped aggregate
