git commit: [SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by default.

2014-11-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.2 8395e8fbd -> 786e75b33 [SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by default. This PR simplifies the serializer: it always uses a batched serializer (AutoBatchedSerializer by default), even when the batch size is 1. Author: Dav

git commit: [SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by default.

2014-11-03 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master b671ce047 -> e4f42631a [SPARK-3886] [PySpark] simplify serializer, use AutoBatchedSerializer by default. This PR simplifies the serializer: it always uses a batched serializer (AutoBatchedSerializer by default), even when the batch size is 1. Author: Davies
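The auto-batching idea behind this change can be sketched in pure Python. The function names, the length-prefixed framing, and the 64 KB growth target below are illustrative assumptions, not PySpark's actual `AutoBatchedSerializer` code:

```python
import pickle
from itertools import islice

def auto_batched_dump(iterator, stream, target_size=1 << 16):
    """Write items to `stream` in pickled batches, doubling the batch
    size while batches come in under `target_size` bytes and halving
    it when they badly overshoot (a sketch of auto-batching)."""
    iterator = iter(iterator)
    batch = 1
    while True:
        chunk = list(islice(iterator, batch))
        if not chunk:
            break
        data = pickle.dumps(chunk)
        stream.write(len(data).to_bytes(4, "big"))  # length-prefixed frame
        stream.write(data)
        if len(data) < target_size:
            batch *= 2              # tiny batches: amortize per-batch overhead
        elif len(data) > target_size * 10 and batch > 1:
            batch //= 2             # huge batches: back off to bound memory

def auto_batched_load(stream):
    """Inverse of auto_batched_dump: yield the items back one by one."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            break
        for item in pickle.loads(stream.read(int.from_bytes(header, "big"))):
            yield item
```

Starting at batch size 1 and adapting by observed serialized size is what lets a single code path cover both "unbatched" and batched use, which is the simplification the PR describes.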

git commit: [SPARK-4166][Core] Add a backward compatibility test for ExecutorLostFailure

2014-11-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 9bdc8412a -> b671ce047 [SPARK-4166][Core] Add a backward compatibility test for ExecutorLostFailure Author: zsxwing Closes #3085 from zsxwing/SPARK-4166-back-comp and squashes the following commits: 89329f4 [zsxwing] Add a backward comp

git commit: [SPARK-4163][Core] Add a backward compatibility test for FetchFailed

2014-11-03 Thread adav
Repository: spark Updated Branches: refs/heads/master 1a9c6cdda -> 9bdc8412a [SPARK-4163][Core] Add a backward compatibility test for FetchFailed /cc aarondav Author: zsxwing Closes #3086 from zsxwing/SPARK-4163-back-comp and squashes the following commits: 21cb2a8 [zsxwing] Add a backwar

git commit: [SPARK-3573][MLLIB] Make MLlib's Vector compatible with SQL's SchemaRDD

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 42d02db86 -> 8395e8fbd [SPARK-3573][MLLIB] Make MLlib's Vector compatible with SQL's SchemaRDD Register MLlib's Vector as a SQL user-defined type (UDT) in both Scala and Python. With this PR, we can easily map a RDD[LabeledPoint] to a

git commit: [SPARK-3573][MLLIB] Make MLlib's Vector compatible with SQL's SchemaRDD

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 04450d115 -> 1a9c6cdda [SPARK-3573][MLLIB] Make MLlib's Vector compatible with SQL's SchemaRDD Register MLlib's Vector as a SQL user-defined type (UDT) in both Scala and Python. With this PR, we can easily map a RDD[LabeledPoint] to a Sche

git commit: [SPARK-4192][SQL] Internal API for Python UDT

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/master c5912ecc7 -> 04450d115 [SPARK-4192][SQL] Internal API for Python UDT Following #2919, this PR adds Python UDT (for internal use only) with tests under "pyspark.tests". Before `SQLContext.applySchema`, we check whether we need to convert u
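The shape of a user-defined type can be illustrated with a minimal stdlib-only sketch; the class names below are hypothetical and this is not Spark's internal UDT API, only the serialize/deserialize contract the entry describes (converting user objects before `applySchema` and back on read):

```python
class PyVector:
    """Hypothetical user type standing in for MLlib's Vector."""
    def __init__(self, values):
        self.values = [float(v) for v in values]
    def __eq__(self, other):
        return isinstance(other, PyVector) and self.values == other.values

class PyVectorUDT:
    """A user-defined type maps a user class to a plain, SQL-friendly
    datum and back, so rows containing it can flow through a schema."""
    def serialize(self, obj):
        return list(obj.values)        # user object -> row-compatible value
    def deserialize(self, datum):
        return PyVector(datum)         # row value -> user object
```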

git commit: [SPARK-4192][SQL] Internal API for Python UDT

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 0826eed9c -> 42d02db86 [SPARK-4192][SQL] Internal API for Python UDT Following #2919, this PR adds Python UDT (for internal use only) with tests under "pyspark.tests". Before `SQLContext.applySchema`, we check whether we need to conve

git commit: [FIX][MLLIB] fix seed in BaggedPointSuite

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 52db2b942 -> 0826eed9c [FIX][MLLIB] fix seed in BaggedPointSuite Saw Jenkins test failures due to random seeds. jkbradley manishamde Author: Xiangrui Meng Closes #3084 from mengxr/fix-baggedpoint-suite and squashes the following co

git commit: [FIX][MLLIB] fix seed in BaggedPointSuite

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 4f035dd2c -> c5912ecc7 [FIX][MLLIB] fix seed in BaggedPointSuite Saw Jenkins test failures due to random seeds. jkbradley manishamde Author: Xiangrui Meng Closes #3084 from mengxr/fix-baggedpoint-suite and squashes the following commit
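The failure mode and the fix generalize beyond MLlib: a test that generates data from an unpinned random seed can pass locally and flake on Jenkins. A minimal stdlib illustration of the pattern (not the actual BaggedPointSuite code):

```python
import random

def bagged_sample(seed, n=5):
    """Build 'random' test data from an explicit, fixed seed so every
    CI run sees identical values and seed-sensitive assertions
    cannot flake."""
    rng = random.Random(seed)   # local RNG: independent of global state
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```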

git commit: [SPARK-611] Display executor thread dumps in web UI

2014-11-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 97a466eca -> 4f035dd2c [SPARK-611] Display executor thread dumps in web UI This patch allows executor thread dumps to be collected on-demand and viewed in the Spark web UI. The thread dumps are collected using Thread.getAllStackTraces().
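Spark's implementation is Scala calling Java's `Thread.getAllStackTraces()`; the same on-demand snapshot idea can be sketched with the Python standard library (this analogue is for illustration only):

```python
import sys
import threading
import traceback

def thread_dump():
    """Snapshot the stack of every live thread, keyed by thread name --
    a Python analogue of Java's Thread.getAllStackTraces()."""
    names = {t.ident: t.name for t in threading.enumerate()}
    dump = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, "unknown-%d" % ident)
        dump[name] = "".join(traceback.format_stack(frame))
    return dump
```

Collecting the dump only when a user clicks through in the UI keeps the cost at zero in the steady state, which is presumably why the patch makes it on-demand rather than periodic.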

git commit: [SPARK-4168][WebUI] web stages number should show correctly when stages are more than 1000

2014-11-03 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 15b58a223 -> 97a466eca [SPARK-4168][WebUI] web stages number should show correctly when stages are more than 1000 The number of completed stages and failed stages shown on the web UI will always be less than 1000. This is really misleading w

git commit: [SQL] Convert arguments to Scala UDFs

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 fa86d862f -> 52db2b942 [SQL] Convert arguments to Scala UDFs Author: Michael Armbrust Closes #3077 from marmbrus/udfsWithUdts and squashes the following commits: 34b5f27 [Michael Armbrust] style 504adef [Michael Armbrust] Convert arg

git commit: [SQL] Convert arguments to Scala UDFs

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 28128150e -> 15b58a223 [SQL] Convert arguments to Scala UDFs Author: Michael Armbrust Closes #3077 from marmbrus/udfsWithUdts and squashes the following commits: 34b5f27 [Michael Armbrust] style 504adef [Michael Armbrust] Convert argumen

git commit: SPARK-4178. Hadoop input metrics ignore bytes read in RecordReader insta...

2014-11-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.2 51985f78c -> fa86d862f SPARK-4178. Hadoop input metrics ignore bytes read in RecordReader instantiation Author: Sandy Ryza Closes #3045 from sryza/sandy-spark-4178 and squashes the following commits: 8d2e70e [Sandy Ryza] Kost

git commit: SPARK-4178. Hadoop input metrics ignore bytes read in RecordReader insta...

2014-11-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/master 25bef7e69 -> 28128150e SPARK-4178. Hadoop input metrics ignore bytes read in RecordReader instantiation Author: Sandy Ryza Closes #3045 from sryza/sandy-spark-4178 and squashes the following commits: 8d2e70e [Sandy Ryza] Kostas's

git commit: [SQL] More aggressive defaults

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e83f13e8d -> 25bef7e69 [SQL] More aggressive defaults - Turns on compression for in-memory cached data by default - Changes the default parquet compression format back to gzip (we have seen more OOMs with production workloads due to the

git commit: [SQL] More aggressive defaults

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 6104754f7 -> 51985f78c [SQL] More aggressive defaults - Turns on compression for in-memory cached data by default - Changes the default parquet compression format back to gzip (we have seen more OOMs with production workloads due to

git commit: [SPARK-4152] [SQL] Avoid data change in CTAS while table already existed

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 572300ba8 -> 6104754f7 [SPARK-4152] [SQL] Avoid data change in CTAS while table already existed CREATE TABLE t1 (a String); CREATE TABLE t1 AS SELECT key FROM src; – throw exception CREATE TABLE if not exists t1 AS SELECT key FROM src

git commit: [SPARK-4152] [SQL] Avoid data change in CTAS while table already existed

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c238fb423 -> e83f13e8d [SPARK-4152] [SQL] Avoid data change in CTAS while table already existed CREATE TABLE t1 (a String); CREATE TABLE t1 AS SELECT key FROM src; – throw exception CREATE TABLE if not exists t1 AS SELECT key FROM src; –

git commit: [SPARK-4202][SQL] Simple DSL support for Scala UDF

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 cc5dc4247 -> 572300ba8 [SPARK-4202][SQL] Simple DSL support for Scala UDF This feature is based on an offline discussion with mengxr, hopefully can be useful for the new MLlib pipeline API. For the following test snippet ```scala cas

git commit: [SPARK-4202][SQL] Simple DSL support for Scala UDF

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 24544fbce -> c238fb423 [SPARK-4202][SQL] Simple DSL support for Scala UDF This feature is based on an offline discussion with mengxr, hopefully can be useful for the new MLlib pipeline API. For the following test snippet ```scala case cl

git commit: [SPARK-3594] [PySpark] [SQL] take more rows to infer schema or sampling

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 2b6e1ce6e -> 24544fbce [SPARK-3594] [PySpark] [SQL] take more rows to infer schema or sampling This patch will try to infer the schema for an RDD which has empty values (None, [], {}) in the first row. It will try the first 100 rows and merge the type
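The merging idea can be sketched in plain Python (the function names and dict-of-types representation below are illustrative assumptions, not PySpark's actual inference code): a `None` in the first row contributes no type information, so the inferred type is the merge over a sample of rows rather than the first row alone.

```python
def merge_type(a, b):
    """Merge two inferred field types; NoneType defers to a concrete type."""
    if a is type(None):
        return b
    if b is type(None) or a is b:
        return a
    raise TypeError("incompatible types: %r vs %r" % (a, b))

def infer_schema(rows, sample=100):
    """Infer {field: type} from up to `sample` rows, so a None in the
    first row no longer forces a failed or wrong inference."""
    schema = {}
    for row in rows[:sample]:
        for key, value in row.items():
            schema[key] = merge_type(schema.get(key, type(None)), type(value))
    return schema
```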

git commit: [SPARK-3594] [PySpark] [SQL] take more rows to infer schema or sampling

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 292da4ef2 -> cc5dc4247 [SPARK-3594] [PySpark] [SQL] take more rows to infer schema or sampling This patch will try to infer the schema for an RDD which has empty values (None, [], {}) in the first row. It will try the first 100 rows and merge the

git commit: [SPARK-4207][SQL] Query which has syntax like 'not like' is not working in Spark SQL

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master df607da02 -> 2b6e1ce6e [SPARK-4207][SQL] Query which has syntax like 'not like' is not working in Spark SQL Queries which have 'not like' are not working in Spark SQL. sql("SELECT * FROM records where value not like 'val%'") same query works

git commit: [SPARK-4207][SQL] Query which has syntax like 'not like' is not working in Spark SQL

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 fc782896b -> 292da4ef2 [SPARK-4207][SQL] Query which has syntax like 'not like' is not working in Spark SQL Queries which have 'not like' are not working in Spark SQL. sql("SELECT * FROM records where value not like 'val%'") same query wo

git commit: [SPARK-4211][Build] Fixes hive.version in Maven profile hive-0.13.1

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3cca19622 -> df607da02 [SPARK-4211][Build] Fixes hive.version in Maven profile hive-0.13.1 instead of `hive.version=0.13.1`. e.g. mvn -Phive -Phive=0.13.1 Note: `hive.version=0.13.1a` is the default property value. However, when explicitl

git commit: [SPARK-4211][Build] Fixes hive.version in Maven profile hive-0.13.1

2014-11-03 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 a68321400 -> fc782896b [SPARK-4211][Build] Fixes hive.version in Maven profile hive-0.13.1 instead of `hive.version=0.13.1`. e.g. mvn -Phive -Phive=0.13.1 Note: `hive.version=0.13.1a` is the default property value. However, when expli

git commit: [SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 76386e1a2 -> a68321400 [SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample The current way of seed distribution makes the random sequences from partition i and i+1 offset by 1. ~~~ In [14]: import random In

git commit: [SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample

2014-11-03 Thread meng
Repository: spark Updated Branches: refs/heads/master 2aca97c7c -> 3cca19622 [SPARK-4148][PySpark] fix seed distribution and add some tests for rdd.sample The current way of seed distribution makes the random sequences from partition i and i+1 offset by 1. ~~~ In [14]: import random In [15]
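One standard way to decorrelate per-partition random streams (not necessarily the exact scheme this patch adopted) is to hash the seed together with the partition index instead of adding them, since adjacent additive seeds can yield closely related sequences:

```python
import hashlib
import random

def partition_rng(seed, split):
    """Give each partition an independent random stream by hashing
    (seed, split) into a fresh seed. With plain `seed + split`,
    partitions i and i+1 start from adjacent seeds, which is the
    offset-by-one correlation the commit describes."""
    digest = hashlib.sha256(b"%d:%d" % (seed, split)).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))
```

The derivation stays deterministic for a given (seed, split), so sampled results remain reproducible across runs, which is what `rdd.sample` tests need.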

git commit: [EC2] Factor out Mesos spark-ec2 branch

2014-11-03 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 76386e1a2 -> 2aca97c7c [EC2] Factor out Mesos spark-ec2 branch We reference a specific branch in two places. This patch makes it one place. Author: Nicholas Chammas Closes #3008 from nchammas/mesos-spark-ec2-branch and squashes the follo

Git Push Summary

2014-11-03 Thread pwendell
Repository: spark Updated Tags: refs/tags/v1.1.0-rc4 [deleted] 5918ea4c9 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org

Git Push Summary

2014-11-03 Thread pwendell
Repository: spark Updated Branches: refs/heads/branch-1.2 [created] 76386e1a2