spark git commit: [SPARK-12961][CORE] Prevent snappy-java memory leak

2016-01-26 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 b40e58cf2 -> 572bc3999 [SPARK-12961][CORE] Prevent snappy-java memory leak JIRA: https://issues.apache.org/jira/browse/SPARK-12961 To prevent memory leak in snappy-java, just call the method once and cache the result. After the

spark git commit: [SPARK-12961][CORE] Prevent snappy-java memory leak

2016-01-26 Thread srowen
Repository: spark Updated Branches: refs/heads/master 6743de3a9 -> 5936bf9fa [SPARK-12961][CORE] Prevent snappy-java memory leak JIRA: https://issues.apache.org/jira/browse/SPARK-12961 To prevent memory leak in snappy-java, just call the method once and cache the result. After the library

spark git commit: [SPARK-3369][CORE][STREAMING] Java mapPartitions Iterator->Iterable is inconsistent with Scala's Iterator->Iterator

2016-01-26 Thread srowen
Repository: spark Updated Branches: refs/heads/master 5936bf9fa -> 649e9d0f5 [SPARK-3369][CORE][STREAMING] Java mapPartitions Iterator->Iterable is inconsistent with Scala's Iterator->Iterator Fix Java function API methods for flatMap and mapPartitions to require producing only an Iterator,

spark git commit: [SPARK-12967][NETTY] Avoid NettyRpc error message during sparkContext shutdown

2016-01-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 58f5d8c1d -> bae3c9a4e [SPARK-12967][NETTY] Avoid NettyRpc error message during sparkContext shutdown If there's an RPC issue while sparkContext is alive but stopped (which would happen only when executing SparkContext.stop), log a

spark git commit: [SPARK-12935][SQL] DataFrame API for Count-Min Sketch

2016-01-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master e7f9199e7 -> ce38a35b7 [SPARK-12935][SQL] DataFrame API for Count-Min Sketch This PR integrates Count-Min Sketch from spark-sketch into DataFrame. This version resorts to `RDD.aggregate` for building the sketch. A more performant UDAF

spark git commit: [SPARK-12728][SQL] Integrates SQL generation with native view

2016-01-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ce38a35b7 -> 58f5d8c1d [SPARK-12728][SQL] Integrates SQL generation with native view This PR is a follow-up of PR #10541. It integrates the newly introduced SQL generation feature with native view to make native view canonical. In this

spark git commit: [SPARK-12780] Inconsistency returning value of ML python models' properties

2016-01-26 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master bae3c9a4e -> 4db255c7a [SPARK-12780] Inconsistency returning value of ML python models' properties https://issues.apache.org/jira/browse/SPARK-12780 Author: Xusen Yin Closes #10724 from yinxusen/SPARK-12780.

spark git commit: [SPARK-12937][SQL] bloom filter serialization

2016-01-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master d54cfed5a -> 6743de3a9 [SPARK-12937][SQL] bloom filter serialization This PR adds serialization support for BloomFilter. A version number is added to version the serialized binary format. Author: Wenchen Fan
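
The idea of versioning a serialized binary format can be illustrated with a minimal sketch. This is not Spark's BloomFilter or its actual wire format: the class, the header layout, the hash scheme, and the `FORMAT_VERSION` value below are all assumptions made for illustration.

```python
import hashlib
import struct

# Hypothetical version number and header layout for this sketch only.
FORMAT_VERSION = 1
_HEADER = struct.Struct(">iii")  # version, number of hash functions, number of bits


class BloomFilter:
    """Toy Bloom filter whose serialized form starts with a version number."""

    def __init__(self, num_bits, num_hashes, bits=None):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray(bits) if bits is not None else bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(seed.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

    def to_bytes(self) -> bytes:
        # The version goes first so a reader can reject formats it does not know.
        return _HEADER.pack(FORMAT_VERSION, self.num_hashes, self.num_bits) + bytes(self.bits)

    @classmethod
    def from_bytes(cls, data: bytes) -> "BloomFilter":
        version, num_hashes, num_bits = _HEADER.unpack_from(data)
        if version != FORMAT_VERSION:
            raise ValueError(f"unsupported bloom filter format version: {version}")
        return cls(num_bits, num_hashes, data[_HEADER.size:])
```

Writing the version first lets a reader fail fast on data produced by an incompatible writer, instead of silently misinterpreting the bit array.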

spark git commit: [SPARK-10911] Executors should System.exit on clean shutdown.

2016-01-26 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 649e9d0f5 -> ae0309a88 [SPARK-10911] Executors should System.exit on clean shutdown. Call System.exit explicitly to make sure non-daemon user threads terminate. Without this, user applications might live forever if the cluster manager

spark git commit: [SPARK-12682][SQL] Add support for (optionally) not storing tables in hive metadata format

2016-01-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ae0309a88 -> 08c781ca6 [SPARK-12682][SQL] Add support for (optionally) not storing tables in hive metadata format This PR adds a new table option (`skip_hive_metadata`) that'd allow the user to skip storing the table metadata in hive

spark git commit: [SPARK-12682][SQL] Add support for (optionally) not storing tables in hive metadata format

2016-01-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 572bc3999 -> f0c98a60f [SPARK-12682][SQL] Add support for (optionally) not storing tables in hive metadata format This PR adds a new table option (`skip_hive_metadata`) that'd allow the user to skip storing the table metadata in hive

spark git commit: [SPARK-12682][SQL][HOT-FIX] Fix test compilation

2016-01-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 f0c98a60f -> 6ce3dd940 [SPARK-12682][SQL][HOT-FIX] Fix test compilation Author: Yin Huai Closes #10925 from yhuai/branch-1.6-hot-fix. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-7799][STREAMING][DOCUMENT] Add the linking and deploying instructions for streaming-akka project

2016-01-26 Thread tdas
Repository: spark Updated Branches: refs/heads/master 08c781ca6 -> cbd507d69 [SPARK-7799][STREAMING][DOCUMENT] Add the linking and deploying instructions for streaming-akka project Since `actorStream` is an external project, we should add the linking and deploying instructions for it. A

spark git commit: [SPARK-11923][ML] Python API for ml.feature.ChiSqSelector

2016-01-26 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master cbd507d69 -> 8beab6815 [SPARK-11923][ML] Python API for ml.feature.ChiSqSelector https://issues.apache.org/jira/browse/SPARK-11923 Author: Xusen Yin Closes #10186 from yinxusen/SPARK-11923. Project:

spark git commit: [SPARK-12611][SQL][PYSPARK][TESTS] Fix test_infer_schema_to_local

2016-01-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 6ce3dd940 -> 85518eda4 [SPARK-12611][SQL][PYSPARK][TESTS] Fix test_infer_schema_to_local Previously (when the PR was first created) not specifying b= explicitly was fine (and treated as default null) - instead be explicit about b

spark git commit: [SPARK-12952] EMLDAOptimizer initialize() should return EMLDAOptimizer other than its parent class

2016-01-26 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 8beab6815 -> fbf7623d4 [SPARK-12952] EMLDAOptimizer initialize() should return EMLDAOptimizer other than its parent class https://issues.apache.org/jira/browse/SPARK-12952 Author: Xusen Yin Closes #10863 from

spark git commit: [SPARK-8725][PROJECT-INFRA] Test modules in topologically-sorted order in dev/run-tests

2016-01-26 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master fbf7623d4 -> ee74498de [SPARK-8725][PROJECT-INFRA] Test modules in topologically-sorted order in dev/run-tests This patch improves our `dev/run-tests` script to test modules in a topologically-sorted order based on modules' dependencies.
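
Ordering modules topologically by their dependencies can be sketched as follows. This is not the code from `dev/run-tests`: the module names and the dependency graph are hypothetical, and the sketch relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical module graph: each key depends on the modules in its set.
DEPENDENCIES = {
    "core": set(),
    "sql": {"core"},
    "streaming": {"core"},
    "mllib": {"sql", "streaming"},
}


def modules_in_test_order(dependencies):
    """Return modules so that each one appears after all of its dependencies."""
    # TopologicalSorter treats the mapped values as predecessors, so
    # static_order() yields dependencies before the modules that need them.
    return list(TopologicalSorter(dependencies).static_order())


order = modules_in_test_order(DEPENDENCIES)
```

With this ordering, a failure in a low-level module surfaces before the modules that build on it are tested.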

spark git commit: [SQL] Minor Scaladoc format fix

2016-01-26 Thread lian
Repository: spark Updated Branches: refs/heads/master ee74498de -> 83507fea9 [SQL] Minor Scaladoc format fix Otherwise the `^` character is always marked as error in IntelliJ since it represents an unclosed superscript markup tag. Author: Cheng Lian Closes #10926 from

spark git commit: [SPARK-12993][PYSPARK] Remove usage of ADD_FILES in pyspark

2016-01-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 83507fea9 -> 19fdb21af [SPARK-12993][PYSPARK] Remove usage of ADD_FILES in pyspark The environment variable ADD_FILES was created for adding Python files to the Spark context to be distributed to executors (SPARK-865); it is now deprecated.

spark git commit: [SPARK-12614][CORE] Don't throw non fatal exception from ask

2016-01-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master eb917291c -> 22662b241 [SPARK-12614][CORE] Don't throw non fatal exception from ask Right now RpcEndpointRef.ask may throw an exception in some corner cases, such as calling ask after stopping RpcEnv. It's better to avoid throwing an exception

spark git commit: [SPARK-11622][MLLIB] Make LibSVMRelation extends HadoopFsRelation and…

2016-01-26 Thread meng
Repository: spark Updated Branches: refs/heads/master 22662b241 -> 1dac964c1 [SPARK-11622][MLLIB] Make LibSVMRelation extends HadoopFsRelation and… … Add LibSVMOutputWriter The behavior of LibSVMRelation is not changed except adding LibSVMOutputWriter * Partition is still not supported *

spark git commit: [SPARK-7780][MLLIB] Intercept in LogisticRegressionWithLBFGS should not be regularized

2016-01-26 Thread dbtsai
Repository: spark Updated Branches: refs/heads/master 555127387 -> b72611f20 [SPARK-7780][MLLIB] Intercept in LogisticRegressionWithLBFGS should not be regularized The intercept in Logistic Regression represents a prior on categories which should not be regularized. In MLlib, the

spark git commit: [SPARK-12903][SPARKR] Add covar_samp and covar_pop for SparkR

2016-01-26 Thread shivaram
Repository: spark Updated Branches: refs/heads/master b72611f20 -> e7f9199e7 [SPARK-12903][SPARKR] Add covar_samp and covar_pop for SparkR Add `covar_samp` and `covar_pop` for SparkR. Should we also provide a `cov` alias for `covar_samp`? There is a `cov` implementation at