svn commit: r1643476 - /spark/faq.md

2014-12-05 Thread rxin
Author: rxin Date: Sat Dec 6 00:35:47 2014 New Revision: 1643476 URL: http://svn.apache.org/r1643476 Log: Updated cluster size Modified: spark/faq.md Modified: spark/faq.md URL: http://svn.apache.org/viewvc/spark/faq.md?rev=1643476&r1=1643475&r2=1643476&view=diff

svn commit: r1643477 - /spark/site/faq.html

2014-12-05 Thread rxin
Author: rxin Date: Sat Dec 6 00:36:33 2014 New Revision: 1643477 URL: http://svn.apache.org/r1643477 Log: Updated FAQ html page Modified: spark/site/faq.html Modified: spark/site/faq.html URL: http://svn.apache.org/viewvc/spark/site/faq.html?rev=1643477&r1=1643476&r2=1643477&view=diff

spark git commit: [SPARK-4740] Create multiple concurrent connections between two peer nodes in Netty.

2014-12-09 Thread rxin
Xin r...@databricks.com Closes #3625 from rxin/SPARK-4740 and squashes the following commits: ad4241a [Reynold Xin] Updated javadoc. f33c72b [Reynold Xin] Code review feedback. 0fefabb [Reynold Xin] Use double check in synchronization. 41dfcb2 [Reynold Xin] Added test case. 9076b4a [Reynold Xin

spark git commit: Add mesos specific configurations into doc

2014-12-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 253b72b56 - d9956f86a Add mesos specific configurations into doc Author: Timothy Chen tnac...@gmail.com Closes #3349 from tnachen/mesos_doc and squashes the following commits: 737ef49 [Timothy Chen] Add TOC 5ca546a [Timothy Chen] Update

spark git commit: Add mesos specific configurations into doc

2014-12-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.2 f305e7db2 - 19efa5bf9 Add mesos specific configurations into doc Author: Timothy Chen tnac...@gmail.com Closes #3349 from tnachen/mesos_doc and squashes the following commits: 737ef49 [Timothy Chen] Add TOC 5ca546a [Timothy Chen]

spark git commit: [SPARK-4880] remove spark.locality.wait in Analytics

2014-12-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 59a49db59 - a7ed6f3cc [SPARK-4880] remove spark.locality.wait in Analytics spark.locality.wait set to 10 in examples/graphx/Analytics.scala. Should be left to the user. Author: Ernest earney...@gmail.com Closes #3730 from

spark git commit: [SPARK-4880] remove spark.locality.wait in Analytics

2014-12-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.2 ef5c23626 - e7f9dd5cd [SPARK-4880] remove spark.locality.wait in Analytics spark.locality.wait set to 10 in examples/graphx/Analytics.scala. Should be left to the user. Author: Ernest earney...@gmail.com Closes #3730 from

spark git commit: Small refactoring to pass SparkEnv into Executor rather than creating SparkEnv in Executor.

2014-12-19 Thread rxin
...@databricks.com Closes #3738 from rxin/sparkEnvDepRefactor and squashes the following commits: 82e02cc [Reynold Xin] Fixed couple bugs. 217062a [Reynold Xin] Code review feedback. bd00af7 [Reynold Xin] Small refactoring to pass SparkEnv into Executor rather than creating SparkEnv in Executor. Project

spark git commit: [SPARK-2075][Core] Make the compiler generate same bytes code for Hadoop 1.+ and Hadoop 2.+

2014-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master c6a3c0d50 - 6ee6aa70b [SPARK-2075][Core] Make the compiler generate same bytes code for Hadoop 1.+ and Hadoop 2.+ `NullWritable` is a `Comparable` rather than `Comparable[NullWritable]` in Hadoop 1.+, so the compiler cannot find an

spark git commit: [SPARK-2075][Core] Make the compiler generate same bytes code for Hadoop 1.+ and Hadoop 2.+

2014-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.2 4346a2ba1 - 665653d24 [SPARK-2075][Core] Make the compiler generate same bytes code for Hadoop 1.+ and Hadoop 2.+ `NullWritable` is a `Comparable` rather than `Comparable[NullWritable]` in Hadoop 1.+, so the compiler cannot find an

spark git commit: [SPARK-2075][Core] backport for branch-1.2

2014-12-22 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.2 665653d24 - b89696372 [SPARK-2075][Core] backport for branch-1.2 backport #3740 for branch-1.2 Author: zsxwing zsxw...@gmail.com Closes #3758 from zsxwing/SPARK-2075-branch-1.2 and squashes the following commits: b57d440 [zsxwing]

spark git commit: [SPARK-4918][Core] Reuse Text in saveAsTextFile

2014-12-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6ee6aa70b - 93b2f3a88 [SPARK-4918][Core] Reuse Text in saveAsTextFile Reuse Text in saveAsTextFile to reduce GC. /cc rxin Author: zsxwing zsxw...@gmail.com Closes #3762 from zsxwing/SPARK-4918 and squashes the following commits

spark git commit: [Minor] Fix a typo of type parameter in JavaUtils.scala

2014-12-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 815de5400 - 8d72341ab [Minor] Fix a typo of type parameter in JavaUtils.scala In JavaUtils.scala, there is a typo of type parameter. In addition, the type information is removed at the time of compile by erasure. This issue is really

spark git commit: SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions

2014-12-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 02b55de3d - 9bc0df680 SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions takeOrdered should skip reduce step in case mapped RDDs have no partitions. This prevents the mentioned exception : 4. run query

spark git commit: SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions

2014-12-29 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.2 76046664d - e81c86967 SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions takeOrdered should skip reduce step in case mapped RDDs have no partitions. This prevents the mentioned exception : 4. run query

spark git commit: [SPARK-5038][SQL] Add explicit return type for implicit functions in Spark SQL

2014-12-31 Thread rxin
and potentially unexpected runtime behavior. Author: Reynold Xin r...@databricks.com Closes #3859 from rxin/sql-implicits and squashes the following commits: 30c2c24 [Reynold Xin] [SPARK-5038] Add explicit return type for implicit functions in Spark SQL. Project: http://git-wip-us.apache.org/repos/asf

spark git commit: [SPARK-5038] Add explicit return type for implicit functions.

2014-12-31 Thread rxin
up PR for rest of Spark (outside Spark SQL). The original PR for Spark SQL can be found at https://github.com/apache/spark/pull/3859 Author: Reynold Xin r...@databricks.com Closes #3860 from rxin/implicit and squashes the following commits: 73702f9 [Reynold Xin] [SPARK-5038] Add explicit return

spark git commit: [SPARK-5214][Test] Add a test to demonstrate EventLoop can be stopped in the event thread

2015-01-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 09e09c548 - 0d1e67ee9 [SPARK-5214][Test] Add a test to demonstrate EventLoop can be stopped in the event thread Author: zsxwing zsxw...@gmail.com Closes #4174 from zsxwing/SPARK-5214-unittest and squashes the following commits: 443e564

spark git commit: [SPARK-5355] use j.u.c.ConcurrentHashMap instead of TrieMap

2015-01-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 81251682e - 142093179 [SPARK-5355] use j.u.c.ConcurrentHashMap instead of TrieMap j.u.c.ConcurrentHashMap is more battle tested. cc rxin JoshRosen pwendell Author: Davies Liu dav...@databricks.com Closes #4208 from davies/safe-conf

spark git commit: [SPARK-4795][Core] Redesign the primitive type => Writable implicit APIs to make them be activated automatically

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 b22d5b5f8 - 5c63e0567 [SPARK-4795][Core] Redesign the primitive type => Writable implicit APIs to make them be activated automatically Try to redesign the primitive type => Writable implicit APIs to make them be activated automatically

spark git commit: [SPARK-4795][Core] Redesign the primitive type => Writable implicit APIs to make them be activated automatically

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1077f2e1d - d37978d8a [SPARK-4795][Core] Redesign the primitive type => Writable implicit APIs to make them be activated automatically Try to redesign the primitive type => Writable implicit APIs to make them be activated automatically and

spark git commit: [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 679228b7f - cb7f783df [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions ```scala df.selectExpr("abs(colA)", "colB") df.filter("age > 21") ``` Author: Reynold Xin r...@databricks.com Closes #4348 from rxin/SPARK

spark git commit: [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master eb1563185 - 40c4cb2fe [SPARK-5579][SQL][DataFrame] Support for project/filter using SQL expressions ```scala df.selectExpr("abs(colA)", "colB") df.filter("age > 21") ``` Author: Reynold Xin r...@databricks.com Closes #4348 from rxin/SPARK-5579

[1/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master e380d2d46 - 1077f2e1d http://git-wip-us.apache.org/repos/asf/spark/blob/1077f2e1/sql/core/src/main/scala/org/apache/spark/sql/UdfRegistration.scala -- diff --git

[2/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
[SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs A more convenient way to define user-defined functions. Author: Reynold Xin r...@databricks.com Closes #4345 from rxin/defineUDF and squashes the following commits: 639c0f8 [Reynold Xin] udf tests. 0a0b339

[1/2] spark git commit: [SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users to use UDFs

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 298ef5ba4 - b22d5b5f8 http://git-wip-us.apache.org/repos/asf/spark/blob/b22d5b5f/sql/core/src/main/scala/org/apache/spark/sql/UdfRegistration.scala -- diff --git

spark git commit: [SPARK-5588] [SQL] support select/filter by SQL expression

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 38a416f03 - ac0b2b788 [SPARK-5588] [SQL] support select/filter by SQL expression ``` df.selectExpr('a + 1', 'abs(age)') df.filter('age > 3') df[ df.age > 3 ] df[ ['age', 'name'] ] ``` Author: Davies Liu dav...@databricks.com Closes #4359

spark git commit: [SQL][DataFrame] Minor cleanup.

2015-02-04 Thread rxin
#4374 from rxin/df-style and squashes the following commits: e493342 [Reynold Xin] [SQL][DataFrame] Minor cleanup. (cherry picked from commit 6b4c7f08068b6099145ab039d0499e3fef68e2e9) Signed-off-by: Reynold Xin r...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: [SQL][DataFrame] Minor cleanup.

2015-02-04 Thread rxin
#4374 from rxin/df-style and squashes the following commits: e493342 [Reynold Xin] [SQL][DataFrame] Minor cleanup. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6b4c7f08 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree

spark git commit: [SPARK-5577] Python udf for DataFrame

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 06da8682b - dc9ead907 [SPARK-5577] Python udf for DataFrame Author: Davies Liu dav...@databricks.com Closes #4351 from davies/python_udf and squashes the following commits: d250692 [Davies Liu] fix conflict 34234d4 [Davies Liu] Merge

spark git commit: [SPARK-5605][SQL][DF] Allow using String to specify colum name in DSL aggregate functions

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 47e4d579e - 478ee3f91 [SPARK-5605][SQL][DF] Allow using String to specify colum name in DSL aggregate functions Author: Reynold Xin r...@databricks.com Closes #4376 from rxin/SPARK-5605 and squashes the following commits: c55f5fa

spark git commit: [SPARK-5605][SQL][DF] Allow using String to specify colum name in DSL aggregate functions

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9a7ce70ea - 1fbd124b1 [SPARK-5605][SQL][DF] Allow using String to specify colum name in DSL aggregate functions Author: Reynold Xin r...@databricks.com Closes #4376 from rxin/SPARK-5605 and squashes the following commits: c55f5fa

spark git commit: [SPARK-5612][SQL] Move DataFrame implicit functions into SQLContext.implicits.

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9d3a75ef8 - 7d789e117 [SPARK-5612][SQL] Move DataFrame implicit functions into SQLContext.implicits. Author: Reynold Xin r...@databricks.com Closes #4386 from rxin/df-implicits and squashes the following commits: 9d96606 [Reynold Xin

spark git commit: [SPARK-5612][SQL] Move DataFrame implicit functions into SQLContext.implicits.

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 bf43781bd - 0040b6128 [SPARK-5612][SQL] Move DataFrame implicit functions into SQLContext.implicits. Author: Reynold Xin r...@databricks.com Closes #4386 from rxin/df-implicits and squashes the following commits: 9d96606 [Reynold Xin

spark git commit: [MLlib] Minor: UDF style update.

2015-02-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 0040b6128 - 40746749a [MLlib] Minor: UDF style update. Author: Reynold Xin r...@databricks.com Closes #4388 from rxin/mllib-style and squashes the following commits: 61d465b [Reynold Xin] oops 3364295 [Reynold Xin] Missed one

spark git commit: [SPARK-5514] DataFrame.collect should call executeCollect

2015-02-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master dca6faa29 - 8aa3cfff6 [SPARK-5514] DataFrame.collect should call executeCollect Author: Reynold Xin r...@databricks.com Closes #4313 from rxin/SPARK-5514 and squashes the following commits: e34e91b [Reynold Xin] [SPARK-5514

spark git commit: SPARK-5500. Document that feeding hadoopFile into a shuffle operation wi...

2015-02-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 842d00032 - 830934976 SPARK-5500. Document that feeding hadoopFile into a shuffle operation wi... ...ll cause problems Author: Sandy Ryza sa...@cloudera.com Closes #4293 from sryza/sandy-spark-5500 and squashes the following commits:

[2/2] spark git commit: [SQL] Improve DataFrame API error reporting

2015-02-02 Thread rxin
...@databricks.com Closes #4296 from rxin/col-computability and squashes the following commits: 6527b86 [Reynold Xin] Merge pull request #8 from davies/col-computability fd92bc7 [Reynold Xin] Merge branch 'master' into col-computability f79034c [Davies Liu] fix python tests 5afe1ff [Reynold Xin] Fix scala

[1/2] spark git commit: [SQL] Improve DataFrame API error reporting

2015-02-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master eccb9fbb2 - 554403fd9 http://git-wip-us.apache.org/repos/asf/spark/blob/554403fd/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala -- diff --git

spark git commit: [Doc] Minor: Fixes several formatting issues

2015-02-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7930d2bef - 60f67e7a1 [Doc] Minor: Fixes several formatting issues Fixes several minor formatting issues in the [Continuous Compilation] [1] section. [1]: http://spark.apache.org/docs/latest/building-spark.html#continuous-compilation

spark git commit: [SPARK-5219][Core] Add locks to avoid scheduling race conditions

2015-02-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 60f67e7a1 - c306555f4 [SPARK-5219][Core] Add locks to avoid scheduling race conditions Author: zsxwing zsxw...@gmail.com Closes #4019 from zsxwing/SPARK-5219 and squashes the following commits: 36a8b4e [zsxwing] Add locks to avoid race

spark git commit: [SPARK-5414] Add SparkFirehoseListener class for consuming all SparkListener events

2015-02-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 13531dd97 - b8ebebeaa [SPARK-5414] Add SparkFirehoseListener class for consuming all SparkListener events There isn't a good way to write a SparkListener that receives all SparkListener events and which will be future-compatible (e.g. it

spark git commit: [SPARK-5549] Define TaskContext interface in Scala.

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 523a93523 - bebf4c42b [SPARK-5549] Define TaskContext interface in Scala. So the interface documentation shows up in ScalaDoc. Author: Reynold Xin r...@databricks.com Closes #4324 from rxin/TaskContext-scala and squashes the following

spark git commit: [SPARK-5501][SPARK-5420][SQL] Write support for the data source API

2015-02-02 Thread rxin
- [ ] Python API (another PR) - [x] More unit tests - [ ] Documents (another PR) marmbrus liancheng rxin Author: Yin Huai yh...@databricks.com Closes #4294 from yhuai/writeSupport and squashes the following commits: 3db1539 [Yin Huai] save does not take overwrite. 1c98881 [Yin Huai] Fix test

spark git commit: [BUILD] Add the ability to launch spark-shell from SBT.

2015-02-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1390e56fa - e9a4fe12d [BUILD] Add the ability to launch spark-shell from SBT. Now you can quickly launch the spark-shell without building an assembly. For quick development iteration run `build/sbt ~sparkShell` and calling exit will

spark git commit: [BUILD] Add the ability to launch spark-shell from SBT.

2015-02-07 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 6ec0cdc14 - 6bda16969 [BUILD] Add the ability to launch spark-shell from SBT. Now you can quickly launch the spark-shell without building an assembly. For quick development iteration run `build/sbt ~sparkShell` and calling exit will

spark git commit: Minor: Fix TaskContext deprecated annotations.

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master bebf4c42b - f7948f3f5 Minor: Fix TaskContext deprecated annotations. Made a mistake in https://github.com/apache/spark/pull/4324 Author: Reynold Xin r...@databricks.com Closes #4333 from rxin/taskcontext-deprecate and squashes

spark git commit: [SPARK-5135][SQL] Add support for describe table to DDL in SQLContext

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 785a2e3de - 55cebcf5b [SPARK-5135][SQL] Add support for describe table to DDL in SQLContext Hi, rxin marmbrus I considered your suggestion (in #4127) and now re-write it. This is now up-to-date. Could u please review it ? Author

spark git commit: [SPARK-5617][SQL] fix test failure of SQLQuerySuite

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 17ef7f930 - 785a2e3de [SPARK-5617][SQL] fix test failure of SQLQuerySuite SQLQuerySuite test failure: [info] - simple select (22 milliseconds) [info] - sorting (722 milliseconds) [info] - external sorting (728 milliseconds) [info] -

spark git commit: [SPARK-5135][SQL] Add support for describe table to DDL in SQLContext

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master a83936e10 - 4d8d070c4 [SPARK-5135][SQL] Add support for describe table to DDL in SQLContext Hi, rxin marmbrus I considered your suggestion (in #4127) and now re-write it. This is now up-to-date. Could u please review it ? Author

spark git commit: [HOTFIX] MLlib build break.

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master c3ba4d4cd - 6580929fa [HOTFIX] MLlib build break. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6580929f Tree:

spark git commit: [SPARK-5620][DOC] group methods in generated unidoc

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 50c48ebbe - e2be79d1b [SPARK-5620][DOC] group methods in generated unidoc It seems that `(ScalaUnidoc, unidoc)` is the correct way to overwrite `scalacOptions` in unidoc. CC: rxin gzm0 Author: Xiangrui Meng m...@databricks.com

spark git commit: [SPARK-5620][DOC] group methods in generated unidoc

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master a9ed51178 - 85ccee81a [SPARK-5620][DOC] group methods in generated unidoc It seems that `(ScalaUnidoc, unidoc)` is the correct way to overwrite `scalacOptions` in unidoc. CC: rxin gzm0 Author: Xiangrui Meng m...@databricks.com Closes

spark git commit: [SPARK-5638][SQL] Add a config flag to disable eager analysis of DataFrames

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/master 85ccee81a - e8a5d50a9 [SPARK-5638][SQL] Add a config flag to disable eager analysis of DataFrames Author: Reynold Xin r...@databricks.com Closes #4408 from rxin/df-config-eager and squashes the following commits: c0204cf [Reynold Xin

spark git commit: [SPARK-5638][SQL] Add a config flag to disable eager analysis of DataFrames

2015-02-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 e2be79d1b - 4fd67e436 [SPARK-5638][SQL] Add a config flag to disable eager analysis of DataFrames Author: Reynold Xin r...@databricks.com Closes #4408 from rxin/df-config-eager and squashes the following commits: c0204cf [Reynold Xin

spark git commit: [SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API

2015-02-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 092d4ba57 - 4640623bc [SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API Add more tests and docs for DataFrame Python API, improve test coverage, fix bugs. Author: Davies Liu dav...@databricks.com Closes #4331 from

spark git commit: [SPARK-5643][SQL] Add a show method to print the content of a DataFrame in tabular format.

2015-02-08 Thread rxin
0.570307 1982 020.4365040.475256 1983 030.4105160.442194 1984 040.4500900.483521 ``` Author: Reynold Xin r...@databricks.com Closes #4416 from rxin/SPARK-5643 and squashes the following commits: d0e0d6e [Reynold Xin] [SQL] Minor update to data source

spark git commit: [SPARK-5643][SQL] Add a show method to print the content of a DataFrame in tabular format.

2015-02-08 Thread rxin
0.570307 1982 020.4365040.475256 1983 030.4105160.442194 1984 040.4500900.483521 ``` Author: Reynold Xin r...@databricks.com Closes #4416 from rxin/SPARK-5643 and squashes the following commits: d0e0d6e [Reynold Xin] [SQL] Minor update to data source

spark git commit: [SPARK-5472][SQL] Fix Scala code style

2015-02-08 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 fa8ea48f2 - 955f2863e [SPARK-5472][SQL] Fix Scala code style Fix Scala code style. Author: Hung Lin h...@zoomdata.com Closes #4464 from hunglin/SPARK-5472 and squashes the following commits: ef7a3b3 [Hung Lin] SPARK-5472: fix scala

spark git commit: [SPARK-5472][SQL] Fix Scala code style

2015-02-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4396dfb37 - 4575c5643 [SPARK-5472][SQL] Fix Scala code style Fix Scala code style. Author: Hung Lin h...@zoomdata.com Closes #4464 from hunglin/SPARK-5472 and squashes the following commits: ef7a3b3 [Hung Lin] SPARK-5472: fix scala

spark git commit: [SPARK-5235] Make SQLConf Serializable

2015-01-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 259936be7 - 2fd7f72b6 [SPARK-5235] Make SQLConf Serializable Declare SQLConf to be serializable to fix Task not serializable exceptions in SparkSQL Author: Alex Baretta alexbare...@gmail.com Closes #4031 from

spark git commit: [SPARK-5274][SQL] Reconcile Java and Scala UDFRegistration.

2015-01-15 Thread rxin
. For Scala UDFs, added type tags. 4. Added all Java UDF registration methods to Scala's UDFRegistration. 5. Documentation Author: Reynold Xin r...@databricks.com Closes #4056 from rxin/udf-registration and squashes the following commits: ae9c556 [Reynold Xin] Updated example. 675a3c9 [Reynold Xin

spark git commit: [SPARK-5214][Core] Add EventLoop and change DAGScheduler to an EventLoop

2015-01-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master 74de94ea6 - e69fb8c75 [SPARK-5214][Core] Add EventLoop and change DAGScheduler to an EventLoop This PR adds a simple `EventLoop` and use it to replace Actor in DAGScheduler. `EventLoop` is a general class to support that posting events in

spark git commit: [SQL] fix typo in class description

2015-01-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 195564548 - 7dbf1fdb8 [SQL] fix typo in class description Author: Jacky Li jacky.li...@gmail.com Closes #4100 from jackylk/patch-9 and squashes the following commits: b13b9d6 [Jacky Li] Update SQLConf.scala 4d3f83d [Jacky Li] Update

spark git commit: [SQL][minor] Add a log4j file for catalyst test.

2015-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 306ff187a - debc03195 [SQL][minor] Add a log4j file for catalyst test. Author: Reynold Xin r...@databricks.com Closes #4117 from rxin/catalyst-test-log4j and squashes the following commits: 8ad610b [Reynold Xin] [SQL][minor] Add a log4j

spark git commit: [SPARK-5287][SQL] Add defaultSizeOf to every data type.

2015-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 23e25543b - bc20a52b3 [SPARK-5287][SQL] Add defaultSizeOf to every data type. JIRA: https://issues.apache.org/jira/browse/SPARK-5287 This PR only add `defaultSizeOf` to data types and make those internal type classes `protected[sql]`. I

spark git commit: [SQL] [Minor] Remove deprecated parquet tests

2015-01-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master b328ac6c8 - ba19689fe [SQL] [Minor] Remove deprecated parquet tests This PR removes the deprecated `ParquetQuerySuite`, renamed `ParquetQuerySuite2` to `ParquetQuerySuite`, and refactored changes introduced in #4115 to

spark git commit: [SPARK-5202] [SQL] Add hql variable substitution support

2015-01-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9bad06226 - 27bccc5ea [SPARK-5202] [SQL] Add hql variable substitution support https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution This is a block issue for the CLI user, it impacts the existed hql scripts

spark git commit: [SPARK-4937][SQL] Comment for the newly optimization rules in `BooleanSimplification`

2015-01-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master f3bfc768d - c1f3c27f2 [SPARK-4937][SQL] Comment for the newly optimization rules in `BooleanSimplification` Follow up of #3778 /cc rxin Author: scwf wangf...@huawei.com Closes #4086 from scwf/commentforspark-4937 and squashes

spark git commit: [SQL][Minor] Added comments and examples to explain BooleanSimplification

2015-01-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 610b0 - e7884bc95 [SQL][Minor] Added comments and examples to explain BooleanSimplification Author: Reynold Xin r...@databricks.com Closes #4090 from rxin/booleanSimplification and squashes the following commits: 68c8986 [Reynold Xin

spark git commit: [SQL][minor] Put DataTypes.java in java dir.

2015-01-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1a200a3ee - 195564548 [SQL][minor] Put DataTypes.java in java dir. Author: Reynold Xin r...@databricks.com Closes #4097 from rxin/javarename and squashes the following commits: c5ce96a [Reynold Xin] [SQL][minor] Put DataTypes.java

[2/2] spark git commit: [SPARK-5193][SQL] Remove Spark SQL Java-specific API.

2015-01-16 Thread rxin
/spark/pull/4030 https://github.com/apache/spark/pull/3965 https://github.com/apache/spark/pull/3958 Author: Reynold Xin r...@databricks.com Closes #4065 from rxin/sql-java-api and squashes the following commits: b1fd860 [Reynold Xin] Fix Mima 6d86578 [Reynold Xin] Ok one more attempt in fixing

[1/2] spark git commit: [SPARK-5193][SQL] Remove Spark SQL Java-specific API.

2015-01-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master ee1c1f3a0 - 61b427d4b http://git-wip-us.apache.org/repos/asf/spark/blob/61b427d4/sql/core/src/test/java/org/apache/spark/sql/api/java/JavaApplySchemaSuite.java -- diff

spark git commit: [SQL][minor] Improved Row documentation.

2015-01-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 61b427d4b - f3bfc768d [SQL][minor] Improved Row documentation. Author: Reynold Xin r...@databricks.com Closes #4085 from rxin/row-doc and squashes the following commits: f77cb27 [Reynold Xin] [SQL][minor] Improved Row documentation

spark git commit: [SPARK-5193][SQL] Tighten up HiveContext API

2015-01-14 Thread rxin
Experimental tag to analyze command. Author: Reynold Xin r...@databricks.com Closes #4054 from rxin/hivecontext-api and squashes the following commits: 25cc00a [Reynold Xin] Add implicit conversion back. cbca886 [Reynold Xin] [SPARK-5193][SQL] Tighten up HiveContext API Project: http://git-wip

spark git commit: [SPARK-5248] [SQL] move sql.types.decimal.Decimal to sql.types.Decimal

2015-01-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master d5eeb3516 - a3f7421b4 [SPARK-5248] [SQL] move sql.types.decimal.Decimal to sql.types.Decimal rxin follow up of #3732 Author: Daoyuan Wang daoyuan.w...@intel.com Closes #4041 from adrian-wang/decimal and squashes the following commits

spark git commit: [SPARK-4959][SQL] Attributes are case sensitive when using a select query from a projection(Backport to Spark-1.2)

2015-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.2 692dc5b66 - 92c238c19 [SPARK-4959][SQL] Attributes are case sensitive when using a select query from a projection(Backport to Spark-1.2) This is a follow up of #3796 , which can not be merged back to Spark-1.2. Manually merge it.

spark git commit: [SQL][Minor] Refactors deeply nested FP style code in BooleanSimplification

2015-01-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9d9294aeb - 814080278 [SQL][Minor] Refactors deeply nested FP style code in BooleanSimplification This is a follow-up of #4090. The original deeply nested `reduceOption` code is hard to grasp.

spark git commit: [SPARK-5279][SQL] Use java.math.BigDecimal as the exposed Decimal type.

2015-01-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master ad16da1bc - 1727e0841 [SPARK-5279][SQL] Use java.math.BigDecimal as the exposed Decimal type. Author: Reynold Xin r...@databricks.com Closes #4092 from rxin/bigdecimal and squashes the following commits: 27b08c9 [Reynold Xin] Fixed test

spark git commit: [SQL][Minor] Update sql doc according to data type APIs changes

2015-01-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1727e0841 - 1a200a3ee [SQL][Minor] Update sql doc according to data type APIs changes Follow up of #3925 /cc rxin Author: scwf wangf...@huawei.com Closes #4095 from scwf/sql-doc and squashes the following commits: 97e311b [scwf] update

spark git commit: [SPARK-5677] [SPARK-5734] [SQL] [PySpark] Python DataFrame API remaining tasks

2015-02-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1ac099e3e - b694eb9c2 [SPARK-5677] [SPARK-5734] [SQL] [PySpark] Python DataFrame API remaining tasks 1. DataFrame.renameColumn 2. DataFrame.show() and _repr_ 3. Use simpleString() rather than jsonValue in DataFrame.dtypes 4.

spark git commit: [SPARK-5677] [SPARK-5734] [SQL] [PySpark] Python DataFrame API remaining tasks

2015-02-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 864dccd70 - d66aae217 [SPARK-5677] [SPARK-5734] [SQL] [PySpark] Python DataFrame API remaining tasks 1. DataFrame.renameColumn 2. DataFrame.show() and _repr_ 3. Use simpleString() rather than jsonValue in DataFrame.dtypes 4.

spark git commit: [SPARK-5702][SQL] Allow short names for built-in data sources.

2015-02-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master b96918265 - b8f88d327 [SPARK-5702][SQL] Allow short names for built-in data sources. Also took the chance to fix up some style ... Author: Reynold Xin r...@databricks.com Closes #4489 from rxin/SPARK-5702 and squashes the following

spark git commit: [SPARK-3688][SQL] More inline comments for LogicalPlan.

2015-02-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 44b2311d9 - fa6bdc6e8 [SPARK-3688][SQL] More inline comments for LogicalPlan. As a follow-up to https://github.com/apache/spark/pull/4524 Author: Reynold Xin r...@databricks.com Closes #4539 from rxin/SPARK-3688 and squashes

spark git commit: [SPARK-3688][SQL] More inline comments for LogicalPlan.

2015-02-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 e136f477e - 08ab3d236 [SPARK-3688][SQL] More inline comments for LogicalPlan. As a follow-up to https://github.com/apache/spark/pull/4524 Author: Reynold Xin r...@databricks.com Closes #4539 from rxin/SPARK-3688 and squashes

spark git commit: [SQL] Two DataFrame fixes.

2015-02-11 Thread rxin
Closes #4543 from rxin/df-cleanup and squashes the following commits: 81ec915 [Reynold Xin] [SQL] More DataFrame fixes. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d931b01d Tree: http://git-wip-us.apache.org/repos/asf

spark git commit: [SQL] Two DataFrame fixes.

2015-02-11 Thread rxin
...@databricks.com Closes #4543 from rxin/df-cleanup and squashes the following commits: 81ec915 [Reynold Xin] [SQL] More DataFrame fixes. (cherry picked from commit d931b01dcaaf009dcf68dcfe83428bd7f9e857cc) Signed-off-by: Reynold Xin r...@databricks.com Project: http://git-wip-us.apache.org/repos/asf/spark

[2/2] spark git commit: [Minor] [SQL] Cleans up DataFrame variable names and toDF() calls

2015-02-17 Thread rxin
[Minor] [SQL] Cleans up DataFrame variable names and toDF() calls Although we've migrated to the DataFrame API, lots of code still uses `rdd` or `srdd` as local variable names. This PR tries to address these naming inconsistencies and some other minor DataFrame related style issues.

[1/2] spark git commit: [Minor] [SQL] Cleans up DataFrame variable names and toDF() calls

2015-02-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3912d3324 - 61ab08549 http://git-wip-us.apache.org/repos/asf/spark/blob/61ab0854/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUdfSuite.scala -- diff

[2/2] spark git commit: [Minor] [SQL] Cleans up DataFrame variable names and toDF() calls

2015-02-17 Thread rxin
[Minor] [SQL] Cleans up DataFrame variable names and toDF() calls Although we've migrated to the DataFrame API, lots of code still uses `rdd` or `srdd` as local variable names. This PR tries to address these naming inconsistencies and some other minor DataFrame related style issues.

[1/2] spark git commit: [Minor] [SQL] Cleans up DataFrame variable names and toDF() calls

2015-02-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 f8f9a64eb - 2bd33ce62 http://git-wip-us.apache.org/repos/asf/spark/blob/2bd33ce6/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUdfSuite.scala -- diff

spark git commit: [SPARK-5799][SQL] Compute aggregation function on specified numeric columns

2015-02-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master a3afa4a1b - 5c78be7a5 [SPARK-5799][SQL] Compute aggregation function on specified numeric columns Compute aggregation function on specified numeric columns. For example: val df = Seq((a, 1, 0, b), (b, 2, 4, c), (a, 2, 3,

spark git commit: Minor fixes for commit https://github.com/apache/spark/pull/4592.

2015-02-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5c78be7a5 - 9baac56cc Minor fixes for commit https://github.com/apache/spark/pull/4592. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9baac56c Tree:

spark git commit: [SPARK-5878] fix DataFrame.repartition() in Python

2015-02-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 9a565b84f - aca799159 [SPARK-5878] fix DataFrame.repartition() in Python Also add tests for distinct() Author: Davies Liu dav...@databricks.com Closes #4667 from davies/repartition and squashes the following commits: 79059fd [Davies

spark git commit: [SPARK-5878] fix DataFrame.repartition() in Python

2015-02-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master de0dd6de2 - c1b6fa983 [SPARK-5878] fix DataFrame.repartition() in Python Also add tests for distinct() Author: Davies Liu dav...@databricks.com Closes #4667 from davies/repartition and squashes the following commits: 79059fd [Davies

spark git commit: Avoid deprecation warnings in JDBCSuite.

2015-02-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 2bd33ce62 - 9a565b84f Avoid deprecation warnings in JDBCSuite. This pull request replaces calls to deprecated methods from `java.util.Date` with near-equivalents in `java.util.Calendar`. Author: Tor Myklebust tmykl...@gmail.com

spark git commit: Avoid deprecation warnings in JDBCSuite.

2015-02-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 61ab08549 - de0dd6de2 Avoid deprecation warnings in JDBCSuite. This pull request replaces calls to deprecated methods from `java.util.Date` with near-equivalents in `java.util.Calendar`. Author: Tor Myklebust tmykl...@gmail.com Closes

spark git commit: [SQL][DOCS] Update sql documentation

2015-02-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 e26c14990 - cbd659e5f [SQL][DOCS] Update sql documentation Updated examples using the new api and added DataFrame concept Author: Antonio Navarro Perez ajnava...@users.noreply.github.com Closes #4560 from

spark git commit: [SQL] Various DataFrame doc changes.

2015-02-16 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.3 385a339a2 - e355b54de [SQL] Various DataFrame doc changes. Added a bunch of tags. Also changed parquetFile to take varargs rather than a string followed by varargs. Author: Reynold Xin r...@databricks.com Closes #4636 from rxin/df

spark git commit: [SQL] Various DataFrame doc changes.

2015-02-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master 58a82a788 - 0e180bfc3 [SQL] Various DataFrame doc changes. Added a bunch of tags. Also changed parquetFile to take varargs rather than a string followed by varargs. Author: Reynold Xin r...@databricks.com Closes #4636 from rxin/df-doc
