git commit: [SPARK-2218] rename Equals to EqualTo in Spark SQL expressions.

2014-06-20 Thread rxin
and pattern matching in the optimizer. Note that this sits on top of #1144. Author: Reynold Xin r...@apache.org Closes #1146 from rxin/equals and squashes the following commits: f8583fd [Reynold Xin] Merge branch 'master' of github.com:apache/spark into equals 326b388 [Reynold Xin] Merge branch

git commit: [SQL] Use hive.SessionState, not the thread local SessionState

2014-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master d4c7572db - 204478491 [SQL] Use hive.SessionState, not the thread local SessionState Note that this is simply mimicking lookupRelation(). I do not have a concrete notion of why this solution is necessarily right-er than SessionState.get,

git commit: [SQL] Use hive.SessionState, not the thread local SessionState

2014-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 91dc0641c - 36668662f [SQL] Use hive.SessionState, not the thread local SessionState Note that this is simply mimicking lookupRelation(). I do not have a concrete notion of why this solution is necessarily right-er than

git commit: [SQL] Pass SQLContext instead of SparkContext into physical operators.

2014-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 648553d48 - ca5d8b590 [SQL] Pass SQLContext instead of SparkContext into physical operators. This makes it easier to use config options in operators. Author: Reynold Xin r...@apache.org Closes #1164 from rxin/sqlcontext and squashes

git commit: [SQL] Pass SQLContext instead of SparkContext into physical operators.

2014-06-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 36668662f - 1829ec411 [SQL] Pass SQLContext instead of SparkContext into physical operators. This makes it easier to use config options in operators. Author: Reynold Xin r...@apache.org Closes #1164 from rxin/sqlcontext and squashes

git commit: Cleanup on Connection, ConnectionManagerId, ConnectionManager classes part 2

2014-06-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master 21ddd7d1e - 383bf72c1 Cleanup on Connection, ConnectionManagerId, ConnectionManager classes part 2 Cleanup on Connection, ConnectionManagerId, and ConnectionManager classes part 2 while I was working at the code there to help IDE: 1.

git commit: [SPARK-2227] Support dfs command in SQL.

2014-06-23 Thread rxin
...@apache.org Closes #1167 from rxin/commands and squashes the following commits: 56f04f8 [Reynold Xin] [SPARK-2227] Support dfs command in SQL. (cherry picked from commit 51c8168377a89d20d0b2d7b9a28af58593a0fe0c) Signed-off-by: Reynold Xin r...@apache.org Project: http://git-wip-us.apache.org

git commit: [SPARK-2227] Support dfs command in SQL.

2014-06-23 Thread rxin
...@apache.org Closes #1167 from rxin/commands and squashes the following commits: 56f04f8 [Reynold Xin] [SPARK-2227] Support dfs command in SQL. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/51c81683 Tree: http://git-wip

git commit: [SPARK-2252] Fix MathJax for HTTPs.

2014-06-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 c43835305 - 05f84e28f [SPARK-2252] Fix MathJax for HTTPs. Found out about this from the Hacker News link to GraphX which was using HTTPs. @mengxr Author: Reynold Xin r...@apache.org Closes #1189 from rxin/mllib-doc and squashes

git commit: [SPARK-2252] Fix MathJax for HTTPs.

2014-06-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 56eb8af18 - 420c1c3e1 [SPARK-2252] Fix MathJax for HTTPs. Found out about this from the Hacker News link to GraphX which was using HTTPs. @mengxr Author: Reynold Xin r...@apache.org Closes #1189 from rxin/mllib-doc and squashes

git commit: Fix possible null pointer in accumulator toString

2014-06-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 54055fb2b - 2714968e1 Fix possible null pointer in accumulator toString Author: Michael Armbrust mich...@databricks.com Closes #1204 from marmbrus/nullPointerToString and squashes the following commits: 35b5fce [Michael Armbrust] Fix

git commit: Fix possible null pointer in accumulator toString

2014-06-24 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 e199a02dd - d3dbaf5a7 Fix possible null pointer in accumulator toString Author: Michael Armbrust mich...@databricks.com Closes #1204 from marmbrus/nullPointerToString and squashes the following commits: 35b5fce [Michael Armbrust] Fix

git commit: [SPARK-2263][SQL] Support inserting MAP<K, V> to Hive tables

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master b6b44853c - 8fade8973 [SPARK-2263][SQL] Support inserting MAP<K, V> to Hive tables JIRA issue: [SPARK-2263](https://issues.apache.org/jira/browse/SPARK-2263) Map objects were not converted to Hive types before inserting into Hive tables.

git commit: [SPARK-2263][SQL] Support inserting MAP<K, V> to Hive tables

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 d3dbaf5a7 - a31def10a [SPARK-2263][SQL] Support inserting MAP<K, V> to Hive tables JIRA issue: [SPARK-2263](https://issues.apache.org/jira/browse/SPARK-2263) Map objects were not converted to Hive types before inserting into Hive

git commit: [BUGFIX][SQL] Should match java.math.BigDecimal when unwrapping Hive output

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 a31def10a - 65a559cfc [BUGFIX][SQL] Should match java.math.BigDecimal when unwrapping Hive output The `BigDecimal` branch in `unwrap` matches `scala.math.BigDecimal` rather than `java.math.BigDecimal`. Author: Cheng Lian

git commit: [BUGFIX][SQL] Should match java.math.BigDecimal when unwrapping Hive output

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8fade8973 - 22036aeb1 [BUGFIX][SQL] Should match java.math.BigDecimal when unwrapping Hive output The `BigDecimal` branch in `unwrap` matches `scala.math.BigDecimal` rather than `java.math.BigDecimal`. Author: Cheng Lian

git commit: [SPARK-2267] Log exception when TaskResultGetter fails to fetch/deserialize task result

2014-06-25 Thread rxin
rxin/SPARK-2267 and squashes the following commits: ce1b19b [Reynold Xin] [SPARK-2267] Log exception when TaskResultGetter fails to fetch/deserialize task result Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c68be53d Tree

git commit: SPARK-2038: rename conf parameters in the saveAsHadoop functions with source-compatibility

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 22036aeb1 - acc01ab32 SPARK-2038: rename conf parameters in the saveAsHadoop functions with source-compatibility https://issues.apache.org/jira/browse/SPARK-2038 to differentiate with SparkConf object and at the same time keep the source

git commit: Replace doc reference to Shark with Spark SQL.

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 c68be53d0 - 731a788eb Replace doc reference to Shark with Spark SQL. (cherry picked from commit ac06a85da59db8f2654cdf6601d186348da09c01) Signed-off-by: Reynold Xin r...@apache.org Project:

git commit: [SPARK-2284][UI] Mark all failed tasks as failures.

2014-06-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 fa167194c - c445b3af3 [SPARK-2284][UI] Mark all failed tasks as failures. Previously, only tasks that failed with an ExceptionFailure reason were marked as failures. Author: Reynold Xin r...@apache.org Closes #1224 from rxin/SPARK-2284

git commit: [SPARK-2172] PySpark cannot import mllib modules in YARN-client mode

2014-06-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4a346e242 - 441cdcca6 [SPARK-2172] PySpark cannot import mllib modules in YARN-client mode Include pyspark/mllib python sources as resources in the mllib.jar. This way they will be included in the final assembly. Author: Szul, Piotr

git commit: [SPARK-2254] [SQL] ScalaReflection should mark primitive types as non-nullable.

2014-06-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 c445b3af3 - 47f8829e0 [SPARK-2254] [SQL] ScalaReflection should mark primitive types as non-nullable. Author: Takuya UESHIN ues...@happy-camper.st Closes #1193 from ueshin/issues/SPARK-2254 and squashes the following commits: cfd6088

git commit: [SPARK-2254] [SQL] ScalaReflection should mark primitive types as non-nullable.

2014-06-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 441cdcca6 - e4899a253 [SPARK-2254] [SQL] ScalaReflection should mark primitive types as non-nullable. Author: Takuya UESHIN ues...@happy-camper.st Closes #1193 from ueshin/issues/SPARK-2254 and squashes the following commits: cfd6088

git commit: Removed throwable field from FetchFailedException and added MetadataFetchFailedException

2014-06-26 Thread rxin
instead). Author: Reynold Xin r...@apache.org Closes #1227 from rxin/metadataFetchException and squashes the following commits: 5cb1e0a [Reynold Xin] MetadataFetchFailedException extends FetchFailedException. 8861ee2 [Reynold Xin] Throw MetadataFetchFailedException in MapOutputTracker. Project

git commit: [SPARK-2297][UI] Make task attempt and speculation more explicit in UI.

2014-06-26 Thread rxin
.png) Author: Reynold Xin r...@apache.org Closes #1236 from rxin/ui-task-attempt and squashes the following commits: 3b645dd [Reynold Xin] Expose attemptId in Stage. c0474b1 [Reynold Xin] Beefed up unit test. c404bdd [Reynold Xin] Fix ReplayListenerSuite. f56be4b [Reynold Xin] Fixed

git commit: Improve MapOutputTracker error logging.

2014-06-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3c104c79d - 2053d793c Improve MapOutputTracker error logging. Author: Reynold Xin r...@apache.org Closes #1258 from rxin/mapOutputTracker and squashes the following commits: a7c95b6 [Reynold Xin] Improve MapOutputTracker error logging

git commit: [SPARK-2320] Reduce exception/code block font size in web ui

2014-06-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2053d793c - cdf613fc5 [SPARK-2320] Reduce exception/code block font size in web ui Author: Reynold Xin r...@apache.org Closes #1261 from rxin/ui-pre-size and squashes the following commits: 7ab1a69 [Reynold Xin] [SPARK-2320] Reduce

git commit: [SPARK-2104] Fix task serializing issues when sort with Java non serializable class

2014-06-30 Thread rxin
some unit tests to validate the issue. @rxin , would you please take a look at this PR, thanks a lot. Author: jerryshao saisai.s...@intel.com Closes #1245 from jerryshao/SPARK-2104 and squashes the following commits: c8ee362 [jerryshao] Make field partitions transient 2b41917 [jerryshao] Minor

git commit: SPARK-897: preemptively serialize closures

2014-06-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 66135a341 - a484030da SPARK-897: preemptively serialize closures These commits cause `ClosureCleaner.clean` to attempt to serialize the cleaned closure with the default closure serializer and throw a `SparkException` if doing so fails.
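
The idea described in the SPARK-897 snippet (try to serialize up front and fail immediately rather than later at task time) can be illustrated with a rough Python analogue. This is only a sketch under stated assumptions, not Spark's `ClosureCleaner`: it uses the standard `pickle` module in place of the closure serializer, and `SerializationCheckError` and `ensure_serializable` are hypothetical names standing in for `SparkException` and the check.

```python
import pickle
import threading


class SerializationCheckError(Exception):
    """Hypothetical error type standing in for SparkException."""


def ensure_serializable(payload):
    """Try to serialize payload now so failures surface early."""
    try:
        pickle.dumps(payload)
    except Exception as exc:  # pickle raises several different error types
        raise SerializationCheckError(
            f"payload is not serializable: {exc}"
        ) from exc
    return payload


ensure_serializable({"threshold": 10})  # plain data passes the check

try:
    ensure_serializable(threading.Lock())  # locks cannot be pickled
    failed_early = False
except SerializationCheckError:
    failed_early = True
```

The point of checking eagerly is that the error is raised where the payload is created, which is usually easier to debug than a failure reported from a remote worker.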

git commit: SPARK-2077 Log serializer that actually ends up being used

2014-06-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master a484030da - 680364225 SPARK-2077 Log serializer that actually ends up being used I could settle with this being a debug also if we provided an example of how to turn it on in `log4j.properties`

git commit: [SPARK-2322] Exception in resultHandler should NOT crash DAGScheduler and shutdown SparkContext.

2014-06-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 680364225 - 358ae1534 [SPARK-2322] Exception in resultHandler should NOT crash DAGScheduler and shutdown SparkContext. This should go into 1.0.1. Author: Reynold Xin r...@apache.org Closes #1264 from rxin/SPARK-2322 and squashes

git commit: [SPARK-2185] Emit warning when task size exceeds a threshold.

2014-07-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3319a3e3c - 05c3d90e3 [SPARK-2185] Emit warning when task size exceeds a threshold. This functionality was added in an earlier commit but shortly after was removed due to a bad git merge (totally my fault). Author: Kay Ousterhout

git commit: update the comments in SqlParser

2014-07-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 05c3d90e3 - 6596392da update the comments in SqlParser SqlParser has been case-insensitive after https://github.com/apache/spark/commit/dab5439a083b5f771d5d5b462d0d517fa8e9aaf2 was merged Author: CodingCat zhunans...@gmail.com Closes

git commit: update the comments in SqlParser

2014-07-01 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 d468b3d74 - a4c754194 update the comments in SqlParser SqlParser has been case-insensitive after https://github.com/apache/spark/commit/dab5439a083b5f771d5d5b462d0d517fa8e9aaf2 was merged Author: CodingCat zhunans...@gmail.com

git commit: [SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand`

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 97a0bfe1c - 544880457 [SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand` This is a fix for the problem revealed by PR #1265. Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since Catalyst

git commit: [SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand`

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 313f202e2 - 5c43758fb [SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand` This is a fix for the problem revealed by PR #1265. Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since Catalyst

git commit: [SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand`

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0-jdbc 9f7cf5bdb - e23656960 [SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand` This is a fix for the problem revealed by PR #1265. Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since

git commit: Update SQLConf.scala

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master d43415075 - 0bbe61223 Update SQLConf.scala use concurrent.ConcurrentHashMap instead of util.Collections.synchronizedMap Author: baishuo(白硕) vc_j...@hotmail.com Closes #1272 from baishuo/master and squashes the following commits:

git commit: Update SQLConf.scala

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 6e0b7e530 - dc73ee13c Update SQLConf.scala use concurrent.ConcurrentHashMap instead of util.Collections.synchronizedMap Author: baishuo(白硕) vc_j...@hotmail.com Closes #1272 from baishuo/master and squashes the following commits:

git commit: Update SQLConf.scala

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0-jdbc e23656960 - 519167524 Update SQLConf.scala use concurrent.ConcurrentHashMap instead of util.Collections.synchronizedMap Author: baishuo(白硕) vc_j...@hotmail.com Closes #1272 from baishuo/master and squashes the following

git commit: [SPARK-2059][SQL] Add analysis checks

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0bbe61223 - b3e768e15 [SPARK-2059][SQL] Add analysis checks This replaces #1263 with a test case. Author: Reynold Xin r...@apache.org Author: Michael Armbrust mich...@databricks.com Closes #1265 from rxin/sql-analysis-error and squashes

git commit: [SPARK-2059][SQL] Add analysis checks

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0-jdbc 519167524 - 55c427a92 [SPARK-2059][SQL] Add analysis checks This replaces #1263 with a test case. Author: Reynold Xin r...@apache.org Author: Michael Armbrust mich...@databricks.com Closes #1265 from rxin/sql-analysis-error

git commit: [SPARK-2059][SQL] Add analysis checks

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 dc73ee13c - 354a62739 [SPARK-2059][SQL] Add analysis checks This replaces #1263 with a test case. Author: Reynold Xin r...@apache.org Author: Michael Armbrust mich...@databricks.com Closes #1265 from rxin/sql-analysis-error

git commit: Added SignalLogger to HistoryServer.

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master fc7165893 - 0db5d5a22 Added SignalLogger to HistoryServer. This was omitted in #1260. @aarondav Author: Reynold Xin r...@apache.org Closes #1300 from rxin/historyServer and squashes the following commits: af720a3 [Reynold Xin] Added

git commit: [SPARK-2370][SQL] Decrease metadata retrieved for partitioned hive queries.

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0-jdbc 55c427a92 - f5f37b2ec [SPARK-2370][SQL] Decrease metadata retrieved for partitioned hive queries. Author: Michael Armbrust mich...@databricks.com Closes #1305 from marmbrus/usePrunerPartitions and squashes the following commits:

git commit: [SPARK-2370][SQL] Decrease metadata retrieved for partitioned hive queries.

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 d9b5a8e2f - b77715a5b [SPARK-2370][SQL] Decrease metadata retrieved for partitioned hive queries. Author: Michael Armbrust mich...@databricks.com Closes #1305 from marmbrus/usePrunerPartitions and squashes the following commits:

git commit: [SPARK-2306]:BoundedPriorityQueue is private and not registered with Kry...

2014-07-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9d006c973 - 42f3abd52 [SPARK-2306]:BoundedPriorityQueue is private and not registered with Kry... Due to the non-registration of BoundedPriorityQueue with kryoserializer, operations which are dependent on BoundedPriorityQueue are giving

svn commit: r1608626 - in /spark: downloads.md site/downloads.html

2014-07-07 Thread rxin
Author: rxin Date: Mon Jul 7 22:47:19 2014 New Revision: 1608626 URL: http://svn.apache.org/r1608626 Log: Fix 1.0.0 release notes link. Modified: spark/downloads.md spark/site/downloads.html Modified: spark/downloads.md URL: http://svn.apache.org/viewvc/spark/downloads.md?rev

git commit: Resolve sbt warnings during build Ⅱ

2014-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0128905ee - 3cd5029be Resolve sbt warnings during build Ⅱ Author: witgo wi...@qq.com Closes #1153 from witgo/expectResult and squashes the following commits: 97541d8 [witgo] merge master ead26e7 [witgo] Resolve sbt warnings during

git commit: [SPARK-2391][SQL] Custom take() for LIMIT queries.

2014-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3cd5029be - 5a4063645 [SPARK-2391][SQL] Custom take() for LIMIT queries. Using Spark's take can result in an entire in-memory partition being shipped in order to retrieve a single row. Author: Michael Armbrust mich...@databricks.com

git commit: [SPARK-2392] Executors should not start their own HTTP servers

2014-07-08 Thread rxin
Repository: spark Updated Branches: refs/heads/master e6f7bfcfb - bf04a390e [SPARK-2392] Executors should not start their own HTTP servers Executors currently start their own unused HTTP file servers. This is because we use the same SparkEnv class for both executors and drivers, and we do

git commit: SPARK-2115: Stage kill link is too close to stage details link

2014-07-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2b18ea982 - c2babc089 SPARK-2115: Stage kill link is too close to stage details link Moved (kill) link to the right side. Add confirmation dialog when (kill) link is clicked. Author: Masayoshi TSUZUKI tsudu...@oss.nttdata.co.jp Closes

git commit: SPARK-2427: Fix Scala examples that use the wrong command line arguments index

2014-07-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2dd672485 - ae8ca4dfb SPARK-2427: Fix Scala examples that use the wrong command line arguments index The Scala examples HBaseTest and HdfsTest don't use the correct indexes for the command line arguments. This is due to the fix of JIRA

git commit: fix Graph partitionStrategy comment

2014-07-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2f59ce7db - 282cca0e4 fix Graph partitionStrategy comment Author: CrazyJvm crazy...@gmail.com Closes #1368 from CrazyJvm/graph-comment-1 and squashes the following commits: d47f3c5 [CrazyJvm] fix style e190d6f [CrazyJvm] fix Graph

git commit: [SPARK-2457] Inconsistent description in README about build option

2014-07-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master b23e9c3e4 - cbff18774 [SPARK-2457] Inconsistent description in README about build option We should now use -Pyarn instead of SPARK_YARN when building, but the README says the following: For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x,

git commit: [SPARK-2455] Mark (Shippable)VertexPartition serializable

2014-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2245c87af - 7a0135293 [SPARK-2455] Mark (Shippable)VertexPartition serializable VertexPartition and ShippableVertexPartition are contained in RDDs but are not marked Serializable, leading to NotSerializableExceptions when using Java

git commit: [SPARK-2455] Mark (Shippable)VertexPartition serializable

2014-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 2a5514f7d - 354ce4d30 [SPARK-2455] Mark (Shippable)VertexPartition serializable VertexPartition and ShippableVertexPartition are contained in RDDs but are not marked Serializable, leading to NotSerializableExceptions when using Java

git commit: [SPARK-2441][SQL] Add more efficient distinct operator.

2014-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7a0135293 - 7e26b5761 [SPARK-2441][SQL] Add more efficient distinct operator. Author: Michael Armbrust mich...@databricks.com Closes #1366 from marmbrus/partialDistinct and squashes the following commits: 12a31ab [Michael Armbrust] Add

git commit: [SPARK-2441][SQL] Add more efficient distinct operator.

2014-07-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 354ce4d30 - 37e49433a [SPARK-2441][SQL] Add more efficient distinct operator. Author: Michael Armbrust mich...@databricks.com Closes #1366 from marmbrus/partialDistinct and squashes the following commits: 12a31ab [Michael Armbrust]

git commit: Made rdd.py pep8 compliant by using Autopep8 and a little manual editing.

2014-07-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 635888cbe - aab534966 Made rdd.py pep8 compliant by using Autopep8 and a little manual editing. Author: Prashant Sharma prashan...@imaginea.com Closes #1354 from ScrapCodes/pep8-comp-1 and squashes the following commits: 9858ea8

git commit: Update README.md to include a slightly more informative project description.

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 52beb20f7 - 8f1d4226c Update README.md to include a slightly more informative project description. (cherry picked from commit 401083be9f010f95110a819a49837ecae7d9c4ec) Signed-off-by: Reynold Xin r...@apache.org Project:

git commit: README update: added for Big Data.

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8f1d4226c - 6555618c8 README update: added for Big Data. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6555618c Tree:

git commit: Added LZ4 to compression codec in configuration page.

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 72ea56da8 - e7ec815d9 Added LZ4 to compression codec in configuration page. Author: Reynold Xin r...@apache.org Closes #1417 from rxin/lz4 and squashes the following commits: 472f6a1 [Reynold Xin] Set the proper default. 9cf0b2f [Reynold

git commit: [SPARK-2500] Move the logInfo for registering BlockManager to BlockManagerMasterActor.register method

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4576d80a5 - 9c12de509 [SPARK-2500] Move the logInfo for registering BlockManager to BlockManagerMasterActor.register method PR for SPARK-2500 Move the logInfo call for BlockManager to BlockManagerMasterActor.register instead of

git commit: follow pep8 None should be compared using is or is not

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9c12de509 - 563acf5ed follow pep8 None should be compared using is or is not http://legacy.python.org/dev/peps/pep-0008/ ## Programming Recommendations - Comparisons to singletons like None should always be done with is or is not, never
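
A short Python sketch of why the PEP 8 recommendation quoted above matters: `==` can be overridden by a class, while `is` compares identity with the `None` singleton. The `AlwaysEqual` class and `is_missing` helper are hypothetical and exist only for illustration; they are not taken from the PySpark change.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


def is_missing(value):
    # Identity check: true only for the None singleton itself.
    return value is None


obj = AlwaysEqual()
equality_says_none = (obj == None)  # noqa: E711 - deliberately non-PEP8
identity_says_none = is_missing(obj)
```

Here `equality_says_none` is misleading because `__eq__` was overridden, while the identity check is not affected by it.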

git commit: [SPARK-2509][SQL] Add optimization for Substring.

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 90ca532a0 - 9b38b7c71 [SPARK-2509][SQL] Add optimization for Substring. `Substring` including `null` literal cases could be added to `NullPropagation`. Author: Takuya UESHIN ues...@happy-camper.st Closes #1428 from

git commit: [SPARK-2509][SQL] Add optimization for Substring.

2014-07-15 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 96fdc7c38 - 16c8d562d [SPARK-2509][SQL] Add optimization for Substring. `Substring` including `null` literal cases could be added to `NullPropagation`. Author: Takuya UESHIN ues...@happy-camper.st Closes #1428 from

git commit: Tightening visibility for various Broadcast related classes.

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master 33e64ecac - efe2a8b12 Tightening visibility for various Broadcast related classes. In preparation for SPARK-2521. Author: Reynold Xin r...@apache.org Closes #1438 from rxin/broadcast and squashes the following commits: 432f1cc [Reynold

git commit: [SPARK-2525][SQL] Remove as many compilation warning messages as possible in Spark SQL

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 e61149dd0 - fb38b9cc5 [SPARK-2525][SQL] Remove as many compilation warning messages as possible in Spark SQL JIRA: https://issues.apache.org/jira/browse/SPARK-2525. Author: Yin Huai h...@cse.ohio-state.edu Closes #1444 from

git commit: [SQL] Cleaned up ConstantFolding slightly.

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master df95d82da - 1c5739f68 [SQL] Cleaned up ConstantFolding slightly. Moved couple rules out of NullPropagation and added more comments. Author: Reynold Xin r...@apache.org Closes #1430 from rxin/sql-folding-rule and squashes the following

git commit: SPARK-2519. Eliminate pattern-matching on Tuple2 in performance-critical...

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1c5739f68 - fc7edc9e7 SPARK-2519. Eliminate pattern-matching on Tuple2 in performance-critical... ... aggregation code Author: Sandy Ryza sa...@cloudera.com Closes #1435 from sryza/sandy-spark-2519 and squashes the following commits:

git commit: [SPARK-2518][SQL] Fix foldability of Substring expression.

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master fc7edc9e7 - cc965eea5 [SPARK-2518][SQL] Fix foldability of Substring expression. This is a follow-up of #1428. Author: Takuya UESHIN ues...@happy-camper.st Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following commits:

git commit: [SPARK-2518][SQL] Fix foldability of Substring expression.

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 fb38b9cc5 - bf1ddc7b8 [SPARK-2518][SQL] Fix foldability of Substring expression. This is a follow-up of #1428. Author: Takuya UESHIN ues...@happy-camper.st Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following

git commit: [SPARK-2517] Remove some compiler warnings.

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master cc965eea5 - ef48222c1 [SPARK-2517] Remove some compiler warnings. Author: Reynold Xin r...@apache.org Closes #1433 from rxin/compile-warning and squashes the following commits: 8d0b890 [Reynold Xin] Remove some compiler warnings

git commit: [SPARK-2522] set default broadcast factory to torrent

2014-07-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master ef48222c1 - 96f28c972 [SPARK-2522] set default broadcast factory to torrent HttpBroadcastFactory is the current default broadcast factory. It sends the broadcast data to each worker one by one, which is slow when the cluster is big.

git commit: [SPARK-2317] Improve task logging.

2014-07-16 Thread rxin
14/07/15 19:44:40 INFO Executor: Finished task 6.0 in stage 1.0 (TID 6). 847 bytes result sent to driver 14/07/15 19:44:40 INFO Executor: Finished task 7.0 in stage 1.0 (TID 7). 847 bytes result sent to driver ``` Author: Reynold Xin r...@apache.org Closes #1259 from rxin/betterTaskLogging

git commit: [SPARK-2534] Avoid pulling in the entire RDD in various operators

2014-07-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9c73822a0 - d988d345d [SPARK-2534] Avoid pulling in the entire RDD in various operators This should go into both master and branch-1.0. Author: Reynold Xin r...@apache.org Closes #1450 from rxin/agg-closure and squashes the following

git commit: [SPARK-2534] Avoid pulling in the entire RDD in various operators (branch-1.0 backport)

2014-07-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 3bb5d2f8a - 26c428acb [SPARK-2534] Avoid pulling in the entire RDD in various operators (branch-1.0 backport) This backports #1450 into branch-1.0. Author: Reynold Xin r...@apache.org Closes #1469 from rxin/closure-1.0 and squashes

git commit: [SPARK-2299] Consolidate various stageIdTo* hash maps in JobProgressListener

2014-07-17 Thread rxin
: Reynold Xin r...@apache.org Closes #1262 from rxin/ui-consolidate-hashtables and squashes the following commits: 1ac3f97 [Reynold Xin] Oops. Properly handle description. f5736ad [Reynold Xin] Code review comments. b8828dc [Reynold Xin] Merge branch 'master' into ui-consolidate-hashtables 7a7b6c4

git commit: [SPARK-2570] [SQL] Fix the bug of ClassCastException

2014-07-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6afca2d10 - 29809a6d5 [SPARK-2570] [SQL] Fix the bug of ClassCastException Exception thrown when running the example of HiveFromSpark. Exception in thread main java.lang.ClassCastException: java.lang.Long cannot be cast to

git commit: SPARK-2553. Fix compile error

2014-07-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master e52b8719c - 30b8d369d SPARK-2553. Fix compile error Author: Sandy Ryza sa...@cloudera.com Closes #1479 from sryza/sandy-spark-2553 and squashes the following commits: 2cb5ed8 [Sandy Ryza] SPARK-2553. Fix compile error Project:

git commit: Reservoir sampling implementation.

2014-07-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7f87ab981 - 586e716e4 Reservoir sampling implementation. This is going to be used in https://issues.apache.org/jira/browse/SPARK-2568 Author: Reynold Xin r...@apache.org Closes #1478 from rxin/reservoirSample and squashes the following
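
For readers unfamiliar with the technique named in this commit, the following is a minimal Python sketch of textbook reservoir sampling (often called Algorithm R). It is not the code from the Spark commit, whose details are not shown in this snippet; the function name, parameters, and `seed` argument are assumptions made for illustration.

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1),
            # overwriting a uniformly chosen slot.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), 10, seed=42)
```

The input is consumed in a single pass and only `k` items are held in memory, which is why the approach suits data whose size is not known in advance.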

git commit: [SPARK-2521] Broadcast RDD object (instead of sending it along with every task).

2014-07-19 Thread rxin
: 3.416348793 s, 1.477846558 s, 1.553432156 s ``` Author: Reynold Xin r...@apache.org Closes #1452 from rxin/broadcast-task and squashes the following commits: 762e0be [Reynold Xin] Warn large broadcasts. ade6eac [Reynold Xin] Log broadcast size. c3b6f11 [Reynold Xin] Added a unit test for clean up

git commit: put 'curRequestSize = 0' after 'logDebug' it

2014-07-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7b8cd1752 - 805f329bb put 'curRequestSize = 0' after 'logDebug' it This is a minor change. We should first logDebug($curRequestSize) and then set it to 0. Author: Lijie Xu csxuli...@gmail.com Closes #1477 from JerryLead/patch-1 and

git commit: [SPARK-2598] RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 98ab41122 - fa51b0fb5 [SPARK-2598] RangePartitioner's binary search does not use the given Ordering We should fix this in branch-1.0 as well. Author: Reynold Xin r...@apache.org Closes #1500 from rxin/rangePartitioner and squashes
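To illustrate why a binary search over partition bounds has to use the caller's ordering, here is a small sketch. It is not the RangePartitioner code; `binary_search`, `bounds`, and `compare` are hypothetical names, and the comparator convention (negative, zero, positive) is an assumption made for this example.

```python
def binary_search(bounds, key, compare):
    """Return the index of the first element of `bounds` that is not less
    than `key` under `compare`. `bounds` must be sorted under `compare`.

    Hypothetical illustration; not Spark's implementation.
    """
    lo, hi = 0, len(bounds)
    while lo < hi:
        mid = (lo + hi) // 2
        if compare(bounds[mid], key) < 0:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Bounds sorted in descending order, with a matching comparator.
descending = lambda a, b: b - a
natural = lambda a, b: a - b
bounds = [30, 20, 10]

print(binary_search(bounds, 25, descending))  # uses the ordering the bounds were sorted with
print(binary_search(bounds, 25, natural))     # ignores that ordering and lands in the wrong slot
```

With the descending comparator the key 25 falls between 30 and 20, so the search returns index 1. With the natural ordering applied to the same descending bounds, the search walks off the end and returns 3, which is the kind of misplacement that follows from searching with a different ordering than the one used to sort.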

git commit: [SPARK-2598] RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 11670bf1a - 480669f2b [SPARK-2598] RangePartitioner's binary search does not use the given Ordering We should fix this in branch-1.0 as well. Author: Reynold Xin r...@apache.org Closes #1500 from rxin/rangePartitioner and squashes

git commit: [SPARK-2495][MLLIB] remove private[mllib] from linear models' constructors

2014-07-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master fa51b0fb5 - 1b10b8114 [SPARK-2495][MLLIB] remove private[mllib] from linear models' constructors This is part of SPARK-2495 to allow users to construct linear models manually. Author: Xiangrui Meng m...@databricks.com Closes #1492 from

git commit: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master c3462c656 - 5d16d5bbf [SPARK-2470] PEP8 fixes to PySpark This pull request aims to resolve all outstanding PEP8 violations in PySpark. Author: Nicholas Chammas nicholas.cham...@gmail.com Author: nchammas nicholas.cham...@gmail.com Closes

git commit: [SPARK-2617] Correct doc and usages of preservesPartitioning

2014-07-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6c2be93f0 - 4c7243e10 [SPARK-2617] Correct doc and usages of preservesPartitioning The name `preservesPartitioning` is ambiguous: 1) preserves the indices of partitions, 2) preserves the partitioner. The latter is correct and

git commit: Replace RoutingTableMessage with pair

2014-07-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master 60f0ae3d8 - 2d25e3481 Replace RoutingTableMessage with pair RoutingTableMessage was used to construct routing tables to enable joining VertexRDDs with partitioned edges. It stored three elements: the destination vertex ID, the source edge

git commit: [SPARK-2658][SQL] Add rule for true = 1.

2014-07-23 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 c6421b6f6 - 6b0804640 [SPARK-2658][SQL] Add rule for true = 1. Author: Michael Armbrust mich...@databricks.com Closes #1556 from marmbrus/fixBooleanEqualsOne and squashes the following commits: ad8edd4 [Michael Armbrust] Add rule

git commit: [SPARK-2658][SQL] Add rule for true = 1.

2014-07-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9e7725c86 - 78d18fdba [SPARK-2658][SQL] Add rule for true = 1. Author: Michael Armbrust mich...@databricks.com Closes #1556 from marmbrus/fixBooleanEqualsOne and squashes the following commits: ad8edd4 [Michael Armbrust] Add rule for

git commit: [SPARK-2529] Clean closures in foreach and foreachPartition.

2014-07-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8529ced35 - eb82abd8e [SPARK-2529] Clean closures in foreach and foreachPartition. Author: Reynold Xin r...@apache.org Closes #1583 from rxin/closureClean and squashes the following commits: 8982fe6 [Reynold Xin] [SPARK-2529] Clean

git commit: [SPARK-2529] Clean closures in foreach and foreachPartition.

2014-07-25 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.0 70109da21 - 797c663ae [SPARK-2529] Clean closures in foreach and foreachPartition. Author: Reynold Xin r...@apache.org Closes #1583 from rxin/closureClean and squashes the following commits: 8982fe6 [Reynold Xin] [SPARK-2529] Clean

git commit: Part of [SPARK-2456] Removed some HashMaps from DAGScheduler by storing information in Stage.

2014-07-25 Thread rxin
@markhamstra please take a look ... Author: Reynold Xin r...@apache.org Closes #1561 from rxin/dagSchedulerHashMaps and squashes the following commits: 1c44e15 [Reynold Xin] Clear pending tasks in submitMissingTasks. 620a0d1 [Reynold Xin] Use filterKeys. 5b54404 [Reynold Xin] Code review feedback

git commit: Excess judgment

2014-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 39ab87b92 - 16ef4d110 Excess judgment Author: Yadong Qi qiyadong2...@gmail.com Closes #1629 from watermen/bug-fix2 and squashes the following commits: 59b7237 [Yadong Qi] Update HiveQl.scala Project:

git commit: [SPARK-2174][MLLIB] treeReduce and treeAggregate

2014-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 96ba04bbf - 20424dad3 [SPARK-2174][MLLIB] treeReduce and treeAggregate In `reduce` and `aggregate`, the driver node spends time linear in the number of partitions. It becomes a bottleneck when there are many partitions and the data from
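The general idea behind a tree-style reduction can be sketched as follows: rather than one node combining every partition's partial result in sequence, partial results are combined in groups over several levels, so the final step only sees a handful of values. This single-process sketch shows the combining pattern only and is not the code from this commit; `tree_reduce` and its `fan_in` parameter are hypothetical names.

```python
from functools import reduce
from operator import add


def tree_reduce(partials, op, fan_in=2):
    """Combine per-partition partial results level by level.

    Hypothetical illustration of the combining pattern; in a distributed
    setting each group would be combined on a worker, not in one process.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(partials)
    if not level:
        raise ValueError("cannot reduce an empty collection")
    while len(level) > 1:
        # Combine each group of up to `fan_in` neighbours into one value.
        level = [
            reduce(op, level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


# Eight per-partition partial sums combined in three levels of pairs.
print(tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], add))
```

The result matches a plain sequential reduce as long as `op` is associative; what changes is that the number of values arriving at the last step is bounded by `fan_in` instead of growing with the number of partitions.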

git commit: [SPARK-2568] RangePartitioner should run only one job if data is balanced

2014-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 84467468d - 2e6efcace [SPARK-2568] RangePartitioner should run only one job if data is balanced As of Spark 1.0, RangePartitioner goes through data twice: once to compute the count and once to do sampling. As a result, to do sortByKey,

git commit: [SPARK-2747] git diff --dirstat can miss sql changes and not run Hive tests

2014-07-30 Thread rxin
(e.g. 1k loc change in core, and only 1 loc change in hive). We should use git diff --name-only master instead. Author: Reynold Xin r...@apache.org Closes #1656 from rxin/hiveTest and squashes the following commits: f5eab9f [Reynold Xin] [SPARK-2747] git diff --dirstat can miss sql changes
