spark git commit: [SPARK-14206][SQL] buildReader() implementation for CSV

2016-03-30 Thread yhuai
Repository: spark Updated Branches: refs/heads/master da54abfd8 -> 26445c2e4 [SPARK-14206][SQL] buildReader() implementation for CSV ## What changes were proposed in this pull request? Major changes: 1. Implement `FileFormat.buildReader()` for the CSV data source. 2. Add an extra argument

spark git commit: [SPARK-14081][SQL] - Preserve DataFrame column types when filling nulls.

2016-03-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 258a24341 -> da54abfd8 [SPARK-14081][SQL] - Preserve DataFrame column types when filling nulls. ## What changes were proposed in this pull request? This change resolves an issue where `DataFrameNaFunctions.fill` changes a `FloatType`

spark git commit: [SPARK-11507][MLLIB] add compact in Matrices fromBreeze

2016-03-30 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.5 7e5a90651 -> 13f0f4892 [SPARK-11507][MLLIB] add compact in Matrices fromBreeze jira: https://issues.apache.org/jira/browse/SPARK-11507 "In certain situations when adding two block matrices, I get an error regarding colPtr and the

spark git commit: [SPARK-14282][SQL] CodeFormatter should handle oneline comment with /* */ properly

2016-03-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master dadf0138b -> 258a24341 [SPARK-14282][SQL] CodeFormatter should handle oneline comment with /* */ properly ## What changes were proposed in this pull request? This PR improves `CodeFormatter` to fix the following malformed indentations.

spark git commit: [SPARK-11507][MLLIB] add compact in Matrices fromBreeze

2016-03-30 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.6 84ad2544f -> 3cc3d8578 [SPARK-11507][MLLIB] add compact in Matrices fromBreeze jira: https://issues.apache.org/jira/browse/SPARK-11507 "In certain situations when adding two block matrices, I get an error regarding colPtr and the

spark git commit: [SPARK-14259][SQL] Add a FileSourceStrategy option for limiting #files in a partition

2016-03-30 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ca458618d -> dadf0138b [SPARK-14259][SQL] Add a FileSourceStrategy option for limiting #files in a partition ## What changes were proposed in this pull request? This PR adds a config to control the maximum number of files as even

spark git commit: [SPARK-11507][MLLIB] add compact in Matrices fromBreeze

2016-03-30 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master f301df37c -> ca458618d [SPARK-11507][MLLIB] add compact in Matrices fromBreeze jira: https://issues.apache.org/jira/browse/SPARK-11507 "In certain situations when adding two block matrices, I get an error regarding colPtr and the

spark git commit: [SPARK-14152][ML][PYSPARK] MultilayerPerceptronClassifier supports save/load for Python API

2016-03-30 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 5dc948e81 -> f301df37c [SPARK-14152][ML][PYSPARK] MultilayerPerceptronClassifier supports save/load for Python API ## What changes were proposed in this pull request? `MultilayerPerceptronClassifier` supports save/load for Python API.

spark git commit: [MINOR][ML] Fix the wrong param name of LDA topicDistributionCol

2016-03-30 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 529d6ce8f -> 5dc948e81 [MINOR][ML] Fix the wrong param name of LDA topicDistributionCol ## What changes were proposed in this pull request? Fix the wrong param name of LDA `topicDistributionCol`. ## How was this patch tested? No tests.

spark git commit: [SPARK-14181] TrainValidationSplit should have HasSeed

2016-03-30 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master bdabfd43f -> 529d6ce8f [SPARK-14181] TrainValidationSplit should have HasSeed https://issues.apache.org/jira/browse/SPARK-14181 TrainValidationSplit should have HasSeed for the random split of RDD. I also changed the random split from

spark git commit: [SPARK-13955][YARN] Also look for Spark jars in the build directory.

2016-03-30 Thread vanzin
Repository: spark Updated Branches: refs/heads/master d46c71b39 -> bdabfd43f [SPARK-13955][YARN] Also look for Spark jars in the build directory. Move the logic to find Spark jars to CommandBuilderUtils and make it available for YARN code, so that it's possible to easily launch Spark on YARN

spark git commit: [SPARK-14268][SQL] rename toRowExpressions and fromRowExpression to serializer and deserializer in ExpressionEncoder

2016-03-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 816f359cf -> d46c71b39 [SPARK-14268][SQL] rename toRowExpressions and fromRowExpression to serializer and deserializer in ExpressionEncoder ## What changes were proposed in this pull request? In `ExpressionEncoder`, we use

spark git commit: [SPARK-14114][SQL] implement buildReader for text data source

2016-03-30 Thread lian
Repository: spark Updated Branches: refs/heads/master 7320f9bd1 -> 816f359cf [SPARK-14114][SQL] implement buildReader for text data source ## What changes were proposed in this pull request? This PR implements `buildReader` for the text data source and enables it in the new data source code path.