and
pattern matching in the optimizer.
Note that this sits on top of #1144.
Author: Reynold Xin r...@apache.org
Closes #1146 from rxin/equals and squashes the following commits:
f8583fd [Reynold Xin] Merge branch 'master' of github.com:apache/spark into
equals
326b388 [Reynold Xin] Merge branch
Repository: spark
Updated Branches:
refs/heads/master d4c7572db - 204478491
[SQL] Use hive.SessionState, not the thread local SessionState
Note that this is simply mimicking lookupRelation(). I do not have a concrete
notion of why this solution is necessarily right-er than SessionState.get,
Repository: spark
Updated Branches:
refs/heads/branch-1.0 91dc0641c - 36668662f
[SQL] Use hive.SessionState, not the thread local SessionState
Note that this is simply mimicking lookupRelation(). I do not have a concrete
notion of why this solution is necessarily right-er than
Repository: spark
Updated Branches:
refs/heads/master 648553d48 - ca5d8b590
[SQL] Pass SQLContext instead of SparkContext into physical operators.
This makes it easier to use config options in operators.
Author: Reynold Xin r...@apache.org
Closes #1164 from rxin/sqlcontext and squashes
Repository: spark
Updated Branches:
refs/heads/branch-1.0 36668662f - 1829ec411
[SQL] Pass SQLContext instead of SparkContext into physical operators.
This makes it easier to use config options in operators.
Author: Reynold Xin r...@apache.org
Closes #1164 from rxin/sqlcontext and squashes
Repository: spark
Updated Branches:
refs/heads/master 21ddd7d1e - 383bf72c1
Cleanup on Connection, ConnectionManagerId, ConnectionManager classes part 2
Cleanup on Connection, ConnectionManagerId, and ConnectionManager classes part
2 while I was working at the code there to help IDE:
1.
...@apache.org
Closes #1167 from rxin/commands and squashes the following commits:
56f04f8 [Reynold Xin] [SPARK-2227] Support dfs command in SQL.
(cherry picked from commit 51c8168377a89d20d0b2d7b9a28af58593a0fe0c)
Signed-off-by: Reynold Xin r...@apache.org
Project: http://git-wip-us.apache.org
...@apache.org
Closes #1167 from rxin/commands and squashes the following commits:
56f04f8 [Reynold Xin] [SPARK-2227] Support dfs command in SQL.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/51c81683
Tree: http://git-wip
Repository: spark
Updated Branches:
refs/heads/branch-1.0 c43835305 - 05f84e28f
[SPARK-2252] Fix MathJax for HTTPS.
Found out about this from the Hacker News link to GraphX which was using HTTPS.
@mengxr
Author: Reynold Xin r...@apache.org
Closes #1189 from rxin/mllib-doc and squashes
Repository: spark
Updated Branches:
refs/heads/master 56eb8af18 - 420c1c3e1
[SPARK-2252] Fix MathJax for HTTPS.
Found out about this from the Hacker News link to GraphX which was using HTTPS.
@mengxr
Author: Reynold Xin r...@apache.org
Closes #1189 from rxin/mllib-doc and squashes
Repository: spark
Updated Branches:
refs/heads/master 54055fb2b - 2714968e1
Fix possible null pointer in accumulator toString
Author: Michael Armbrust mich...@databricks.com
Closes #1204 from marmbrus/nullPointerToString and squashes the following
commits:
35b5fce [Michael Armbrust] Fix
Repository: spark
Updated Branches:
refs/heads/branch-1.0 e199a02dd - d3dbaf5a7
Fix possible null pointer in accumulator toString
Author: Michael Armbrust mich...@databricks.com
Closes #1204 from marmbrus/nullPointerToString and squashes the following
commits:
35b5fce [Michael Armbrust] Fix
Repository: spark
Updated Branches:
refs/heads/master b6b44853c - 8fade8973
[SPARK-2263][SQL] Support inserting MAP<K, V> to Hive tables
JIRA issue: [SPARK-2263](https://issues.apache.org/jira/browse/SPARK-2263)
Map objects were not converted to Hive types before inserting into Hive tables.
Repository: spark
Updated Branches:
refs/heads/branch-1.0 d3dbaf5a7 - a31def10a
[SPARK-2263][SQL] Support inserting MAP<K, V> to Hive tables
JIRA issue: [SPARK-2263](https://issues.apache.org/jira/browse/SPARK-2263)
Map objects were not converted to Hive types before inserting into Hive
Repository: spark
Updated Branches:
refs/heads/branch-1.0 a31def10a - 65a559cfc
[BUGFIX][SQL] Should match java.math.BigDecimal when unwrapping Hive output
The `BigDecimal` branch in `unwrap` matches to `scala.math.BigDecimal` rather
than `java.math.BigDecimal`.
Author: Cheng Lian
Repository: spark
Updated Branches:
refs/heads/master 8fade8973 - 22036aeb1
[BUGFIX][SQL] Should match java.math.BigDecimal when unwrapping Hive output
The `BigDecimal` branch in `unwrap` matches to `scala.math.BigDecimal` rather
than `java.math.BigDecimal`.
Author: Cheng Lian
rxin/SPARK-2267 and squashes the following commits:
ce1b19b [Reynold Xin] [SPARK-2267] Log exception when TaskResultGetter fails to
fetch/deserialize task result
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c68be53d
Tree
Repository: spark
Updated Branches:
refs/heads/master 22036aeb1 - acc01ab32
SPARK-2038: rename conf parameters in the saveAsHadoop functions with
source-compatibility
https://issues.apache.org/jira/browse/SPARK-2038
to differentiate with SparkConf object and at the same time keep the source
Repository: spark
Updated Branches:
refs/heads/branch-1.0 c68be53d0 - 731a788eb
Replace doc reference to Shark with Spark SQL.
(cherry picked from commit ac06a85da59db8f2654cdf6601d186348da09c01)
Signed-off-by: Reynold Xin r...@apache.org
Project:
Repository: spark
Updated Branches:
refs/heads/branch-1.0 fa167194c - c445b3af3
[SPARK-2284][UI] Mark all failed tasks as failures.
Previously, only tasks that failed with an ExceptionFailure reason were marked as failures.
Author: Reynold Xin r...@apache.org
Closes #1224 from rxin/SPARK-2284
Repository: spark
Updated Branches:
refs/heads/master 4a346e242 - 441cdcca6
[SPARK-2172] PySpark cannot import mllib modules in YARN-client mode
Include pyspark/mllib python sources as resources in the mllib.jar.
This way they will be included in the final assembly
Author: Szul, Piotr
Repository: spark
Updated Branches:
refs/heads/branch-1.0 c445b3af3 - 47f8829e0
[SPARK-2254] [SQL] ScalaReflection should mark primitive types as non-nullable.
Author: Takuya UESHIN ues...@happy-camper.st
Closes #1193 from ueshin/issues/SPARK-2254 and squashes the following commits:
cfd6088
Repository: spark
Updated Branches:
refs/heads/master 441cdcca6 - e4899a253
[SPARK-2254] [SQL] ScalaReflection should mark primitive types as non-nullable.
Author: Takuya UESHIN ues...@happy-camper.st
Closes #1193 from ueshin/issues/SPARK-2254 and squashes the following commits:
cfd6088
instead).
Author: Reynold Xin r...@apache.org
Closes #1227 from rxin/metadataFetchException and squashes the following
commits:
5cb1e0a [Reynold Xin] MetadataFetchFailedException extends FetchFailedException.
8861ee2 [Reynold Xin] Throw MetadataFetchFailedException in MapOutputTracker.
Project
.png)
Author: Reynold Xin r...@apache.org
Closes #1236 from rxin/ui-task-attempt and squashes the following commits:
3b645dd [Reynold Xin] Expose attemptId in Stage.
c0474b1 [Reynold Xin] Beefed up unit test.
c404bdd [Reynold Xin] Fix ReplayListenerSuite.
f56be4b [Reynold Xin] Fixed
Repository: spark
Updated Branches:
refs/heads/master 3c104c79d - 2053d793c
Improve MapOutputTracker error logging.
Author: Reynold Xin r...@apache.org
Closes #1258 from rxin/mapOutputTracker and squashes the following commits:
a7c95b6 [Reynold Xin] Improve MapOutputTracker error logging
Repository: spark
Updated Branches:
refs/heads/master 2053d793c - cdf613fc5
[SPARK-2320] Reduce exception/code block font size in web ui
Author: Reynold Xin r...@apache.org
Closes #1261 from rxin/ui-pre-size and squashes the following commits:
7ab1a69 [Reynold Xin] [SPARK-2320] Reduce
some unit tests to validate the issue.
@rxin , would you please take a look at this PR, thanks a lot.
Author: jerryshao saisai.s...@intel.com
Closes #1245 from jerryshao/SPARK-2104 and squashes the following commits:
c8ee362 [jerryshao] Make field partitions transient
2b41917 [jerryshao] Minor
Repository: spark
Updated Branches:
refs/heads/master 66135a341 - a484030da
SPARK-897: preemptively serialize closures
These commits cause `ClosureCleaner.clean` to attempt to serialize the cleaned
closure with the default closure serializer and throw a `SparkException` if
doing so fails.
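
The idea described above, serializing a closure eagerly so that a failure
surfaces where the closure is defined rather than when a task is shipped, can
be sketched in Python. This is only an illustration under stated assumptions:
it uses the standard pickle module in place of Spark's closure serializer, and
check_serializable and TaskNotSerializableError are hypothetical names, not
Spark APIs.

```python
import functools
import operator
import pickle
import threading


class TaskNotSerializableError(Exception):
    """Hypothetical stand-in for the SparkException mentioned above."""


def check_serializable(closure):
    """Serialize the closure right away and fail fast if that is impossible."""
    try:
        pickle.dumps(closure)
    except Exception as exc:  # pickle raises several unrelated error types
        raise TaskNotSerializableError(f"closure is not serializable: {exc}") from exc
    return closure


# A partial application that captures only picklable state passes the check.
add_one = check_serializable(functools.partial(operator.add, 1))

# Capturing a lock makes the closure unpicklable, so the check rejects it here
# instead of much later, when the work would be sent to another process.
try:
    check_serializable(functools.partial(operator.add, threading.Lock()))
except TaskNotSerializableError as err:
    print("rejected early:", err)
```

The trade-off is one extra serialization per closure in exchange for an error
that points at the call site.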
Repository: spark
Updated Branches:
refs/heads/master a484030da - 680364225
SPARK-2077 Log serializer that actually ends up being used
I could settle with this being a debug also if we provided an example of how to
turn it on in `log4j.properties`
Repository: spark
Updated Branches:
refs/heads/master 680364225 - 358ae1534
[SPARK-2322] Exception in resultHandler should NOT crash DAGScheduler and
shutdown SparkContext.
This should go into 1.0.1.
Author: Reynold Xin r...@apache.org
Closes #1264 from rxin/SPARK-2322 and squashes
Repository: spark
Updated Branches:
refs/heads/master 3319a3e3c - 05c3d90e3
[SPARK-2185] Emit warning when task size exceeds a threshold.
This functionality was added in an earlier commit but shortly
after was removed due to a bad git merge (totally my fault).
Author: Kay Ousterhout
Repository: spark
Updated Branches:
refs/heads/master 05c3d90e3 - 6596392da
update the comments in SqlParser
SqlParser has been case-insensitive after
https://github.com/apache/spark/commit/dab5439a083b5f771d5d5b462d0d517fa8e9aaf2
was merged
Author: CodingCat zhunans...@gmail.com
Closes
Repository: spark
Updated Branches:
refs/heads/branch-1.0 d468b3d74 - a4c754194
update the comments in SqlParser
SqlParser has been case-insensitive after
https://github.com/apache/spark/commit/dab5439a083b5f771d5d5b462d0d517fa8e9aaf2
was merged
Author: CodingCat zhunans...@gmail.com
Repository: spark
Updated Branches:
refs/heads/master 97a0bfe1c - 544880457
[SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand`
This is a fix for the problem revealed by PR #1265.
Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since
Catalyst
Repository: spark
Updated Branches:
refs/heads/branch-1.0 313f202e2 - 5c43758fb
[SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand`
This is a fix for the problem revealed by PR #1265.
Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since
Catalyst
Repository: spark
Updated Branches:
refs/heads/branch-1.0-jdbc 9f7cf5bdb - e23656960
[SPARK-2059][SQL] Don't throw TreeNodeException in `execution.ExplainCommand`
This is a fix for the problem revealed by PR #1265.
Currently `HiveComparisonSuite` ignores output of `ExplainCommand` since
Repository: spark
Updated Branches:
refs/heads/master d43415075 - 0bbe61223
Update SQLConf.scala
use concurrent.ConcurrentHashMap instead of util.Collections.synchronizedMap
Author: baishuo(白硕) vc_j...@hotmail.com
Closes #1272 from baishuo/master and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/branch-1.0 6e0b7e530 - dc73ee13c
Update SQLConf.scala
use concurrent.ConcurrentHashMap instead of util.Collections.synchronizedMap
Author: baishuo(白硕) vc_j...@hotmail.com
Closes #1272 from baishuo/master and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/branch-1.0-jdbc e23656960 - 519167524
Update SQLConf.scala
use concurrent.ConcurrentHashMap instead of util.Collections.synchronizedMap
Author: baishuo(白硕) vc_j...@hotmail.com
Closes #1272 from baishuo/master and squashes the following
Repository: spark
Updated Branches:
refs/heads/master 0bbe61223 - b3e768e15
[SPARK-2059][SQL] Add analysis checks
This replaces #1263 with a test case.
Author: Reynold Xin r...@apache.org
Author: Michael Armbrust mich...@databricks.com
Closes #1265 from rxin/sql-analysis-error and squashes
Repository: spark
Updated Branches:
refs/heads/branch-1.0-jdbc 519167524 - 55c427a92
[SPARK-2059][SQL] Add analysis checks
This replaces #1263 with a test case.
Author: Reynold Xin r...@apache.org
Author: Michael Armbrust mich...@databricks.com
Closes #1265 from rxin/sql-analysis-error
Repository: spark
Updated Branches:
refs/heads/branch-1.0 dc73ee13c - 354a62739
[SPARK-2059][SQL] Add analysis checks
This replaces #1263 with a test case.
Author: Reynold Xin r...@apache.org
Author: Michael Armbrust mich...@databricks.com
Closes #1265 from rxin/sql-analysis-error
Repository: spark
Updated Branches:
refs/heads/master fc7165893 - 0db5d5a22
Added SignalLogger to HistoryServer.
This was omitted in #1260. @aarondav
Author: Reynold Xin r...@apache.org
Closes #1300 from rxin/historyServer and squashes the following commits:
af720a3 [Reynold Xin] Added
Repository: spark
Updated Branches:
refs/heads/branch-1.0-jdbc 55c427a92 - f5f37b2ec
[SPARK-2370][SQL] Decrease metadata retrieved for partitioned hive queries.
Author: Michael Armbrust mich...@databricks.com
Closes #1305 from marmbrus/usePrunerPartitions and squashes the following
commits:
Repository: spark
Updated Branches:
refs/heads/branch-1.0 d9b5a8e2f - b77715a5b
[SPARK-2370][SQL] Decrease metadata retrieved for partitioned hive queries.
Author: Michael Armbrust mich...@databricks.com
Closes #1305 from marmbrus/usePrunerPartitions and squashes the following
commits:
Repository: spark
Updated Branches:
refs/heads/master 9d006c973 - 42f3abd52
[SPARK-2306]:BoundedPriorityQueue is private and not registered with Kry...
Due to the non-registration of BoundedPriorityQueue with kryoserializer,
operations which are dependent on BoundedPriorityQueue are giving
Author: rxin
Date: Mon Jul 7 22:47:19 2014
New Revision: 1608626
URL: http://svn.apache.org/r1608626
Log:
Fix 1.0.0 release notes link.
Modified:
spark/downloads.md
spark/site/downloads.html
Modified: spark/downloads.md
URL:
http://svn.apache.org/viewvc/spark/downloads.md?rev
Repository: spark
Updated Branches:
refs/heads/master 0128905ee - 3cd5029be
Resolve sbt warnings during build Ⅱ
Author: witgo wi...@qq.com
Closes #1153 from witgo/expectResult and squashes the following commits:
97541d8 [witgo] merge master
ead26e7 [witgo] Resolve sbt warnings during
Repository: spark
Updated Branches:
refs/heads/master 3cd5029be - 5a4063645
[SPARK-2391][SQL] Custom take() for LIMIT queries.
Using Spark's take can result in an entire in-memory partition being shipped in
order to retrieve a single row.
Author: Michael Armbrust mich...@databricks.com
Repository: spark
Updated Branches:
refs/heads/master e6f7bfcfb - bf04a390e
[SPARK-2392] Executors should not start their own HTTP servers
Executors currently start their own unused HTTP file servers. This is because
we use the same SparkEnv class for both executors and drivers, and we do
Repository: spark
Updated Branches:
refs/heads/master 2b18ea982 - c2babc089
SPARK-2115: Stage kill link is too close to stage details link
Moved (kill) link to the right side. Add confirmation dialog when (kill) link
is clicked.
Author: Masayoshi TSUZUKI tsudu...@oss.nttdata.co.jp
Closes
Repository: spark
Updated Branches:
refs/heads/master 2dd672485 - ae8ca4dfb
SPARK-2427: Fix Scala examples that use the wrong command line arguments index
The Scala examples HBaseTest and HdfsTest don't use the correct indexes for the
command line arguments. This is due to the fix of JIRA
Repository: spark
Updated Branches:
refs/heads/master 2f59ce7db - 282cca0e4
fix Graph partitionStrategy comment
Author: CrazyJvm crazy...@gmail.com
Closes #1368 from CrazyJvm/graph-comment-1 and squashes the following commits:
d47f3c5 [CrazyJvm] fix style
e190d6f [CrazyJvm] fix Graph
Repository: spark
Updated Branches:
refs/heads/master b23e9c3e4 - cbff18774
[SPARK-2457] Inconsistent description in README about build option
Now, we should use -Pyarn instead of SPARK_YARN when building but README says
as follows.
For Apache Hadoop 2.2.X, 2.1.X, 2.0.X, 0.23.x,
Repository: spark
Updated Branches:
refs/heads/master 2245c87af - 7a0135293
[SPARK-2455] Mark (Shippable)VertexPartition serializable
VertexPartition and ShippableVertexPartition are contained in RDDs but are not
marked Serializable, leading to NotSerializableExceptions when using Java
Repository: spark
Updated Branches:
refs/heads/branch-1.0 2a5514f7d - 354ce4d30
[SPARK-2455] Mark (Shippable)VertexPartition serializable
VertexPartition and ShippableVertexPartition are contained in RDDs but are not
marked Serializable, leading to NotSerializableExceptions when using Java
Repository: spark
Updated Branches:
refs/heads/master 7a0135293 - 7e26b5761
[SPARK-2441][SQL] Add more efficient distinct operator.
Author: Michael Armbrust mich...@databricks.com
Closes #1366 from marmbrus/partialDistinct and squashes the following commits:
12a31ab [Michael Armbrust] Add
Repository: spark
Updated Branches:
refs/heads/branch-1.0 354ce4d30 - 37e49433a
[SPARK-2441][SQL] Add more efficient distinct operator.
Author: Michael Armbrust mich...@databricks.com
Closes #1366 from marmbrus/partialDistinct and squashes the following commits:
12a31ab [Michael Armbrust]
Repository: spark
Updated Branches:
refs/heads/master 635888cbe - aab534966
Made rdd.py PEP8 compliant by using Autopep8 and a little manual editing.
Author: Prashant Sharma prashan...@imaginea.com
Closes #1354 from ScrapCodes/pep8-comp-1 and squashes the following commits:
9858ea8
Repository: spark
Updated Branches:
refs/heads/master 52beb20f7 - 8f1d4226c
Update README.md to include a slightly more informative project description.
(cherry picked from commit 401083be9f010f95110a819a49837ecae7d9c4ec)
Signed-off-by: Reynold Xin r...@apache.org
Project:
Repository: spark
Updated Branches:
refs/heads/master 8f1d4226c - 6555618c8
README update: added for Big Data.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6555618c
Tree:
Repository: spark
Updated Branches:
refs/heads/master 72ea56da8 - e7ec815d9
Added LZ4 to compression codec in configuration page.
Author: Reynold Xin r...@apache.org
Closes #1417 from rxin/lz4 and squashes the following commits:
472f6a1 [Reynold Xin] Set the proper default.
9cf0b2f [Reynold
Repository: spark
Updated Branches:
refs/heads/master 4576d80a5 - 9c12de509
[SPARK-2500] Move the logInfo for registering BlockManager to
BlockManagerMasterActor.register method
PR for SPARK-2500
Move the logInfo call for BlockManager to BlockManagerMasterActor.register
instead of
Repository: spark
Updated Branches:
refs/heads/master 9c12de509 - 563acf5ed
follow pep8 None should be compared using is or is not
http://legacy.python.org/dev/peps/pep-0008/
## Programming Recommendations
- Comparisons to singletons like None should always be done with is or is not,
never
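
A minimal Python sketch of why the recommendation quoted above matters. The
AlwaysEqual class is hypothetical and made up for illustration; it is not
taken from PySpark.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with every object."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

# The equality operator delegates to __eq__, so it can be fooled.
equals_none = (value == None)  # noqa: E711 (deliberately the discouraged form)

# The identity operator cannot be overridden; it checks for the None singleton.
is_none = (value is None)

print(equals_none, is_none)
```

The first comparison reports a match even though value is not None, which is
the kind of bug the identity check avoids.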
Repository: spark
Updated Branches:
refs/heads/master 90ca532a0 - 9b38b7c71
[SPARK-2509][SQL] Add optimization for Substring.
`Substring` including `null` literal cases could be added to `NullPropagation`.
Author: Takuya UESHIN ues...@happy-camper.st
Closes #1428 from
Repository: spark
Updated Branches:
refs/heads/branch-1.0 96fdc7c38 - 16c8d562d
[SPARK-2509][SQL] Add optimization for Substring.
`Substring` including `null` literal cases could be added to `NullPropagation`.
Author: Takuya UESHIN ues...@happy-camper.st
Closes #1428 from
Repository: spark
Updated Branches:
refs/heads/master 33e64ecac - efe2a8b12
Tightening visibility for various Broadcast related classes.
In preparation for SPARK-2521.
Author: Reynold Xin r...@apache.org
Closes #1438 from rxin/broadcast and squashes the following commits:
432f1cc [Reynold
Repository: spark
Updated Branches:
refs/heads/branch-1.0 e61149dd0 - fb38b9cc5
[SPARK-2525][SQL] Remove as many compilation warning messages as possible in
Spark SQL
JIRA: https://issues.apache.org/jira/browse/SPARK-2525.
Author: Yin Huai h...@cse.ohio-state.edu
Closes #1444 from
Repository: spark
Updated Branches:
refs/heads/master df95d82da - 1c5739f68
[SQL] Cleaned up ConstantFolding slightly.
Moved couple rules out of NullPropagation and added more comments.
Author: Reynold Xin r...@apache.org
Closes #1430 from rxin/sql-folding-rule and squashes the following
Repository: spark
Updated Branches:
refs/heads/master 1c5739f68 - fc7edc9e7
SPARK-2519. Eliminate pattern-matching on Tuple2 in performance-critical...
... aggregation code
Author: Sandy Ryza sa...@cloudera.com
Closes #1435 from sryza/sandy-spark-2519 and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/master fc7edc9e7 - cc965eea5
[SPARK-2518][SQL] Fix foldability of Substring expression.
This is a follow-up of #1428.
Author: Takuya UESHIN ues...@happy-camper.st
Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following commits:
Repository: spark
Updated Branches:
refs/heads/branch-1.0 fb38b9cc5 - bf1ddc7b8
[SPARK-2518][SQL] Fix foldability of Substring expression.
This is a follow-up of #1428.
Author: Takuya UESHIN ues...@happy-camper.st
Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following
Repository: spark
Updated Branches:
refs/heads/master cc965eea5 - ef48222c1
[SPARK-2517] Remove some compiler warnings.
Author: Reynold Xin r...@apache.org
Closes #1433 from rxin/compile-warning and squashes the following commits:
8d0b890 [Reynold Xin] Remove some compiler warnings
Repository: spark
Updated Branches:
refs/heads/master ef48222c1 - 96f28c972
[SPARK-2522] set default broadcast factory to torrent
HttpBroadcastFactory is the current default broadcast factory. It sends the
broadcast data to each worker one by one, which is slow when the cluster is
big.
```
14/07/15 19:44:40 INFO Executor: Finished task 6.0 in stage 1.0 (TID 6). 847 bytes result sent to driver
14/07/15 19:44:40 INFO Executor: Finished task 7.0 in stage 1.0 (TID 7). 847 bytes result sent to driver
```
Author: Reynold Xin r...@apache.org
Closes #1259 from rxin/betterTaskLogging
Repository: spark
Updated Branches:
refs/heads/master 9c73822a0 - d988d345d
[SPARK-2534] Avoid pulling in the entire RDD in various operators
This should go into both master and branch-1.0.
Author: Reynold Xin r...@apache.org
Closes #1450 from rxin/agg-closure and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.0 3bb5d2f8a - 26c428acb
[SPARK-2534] Avoid pulling in the entire RDD in various operators (branch-1.0
backport)
This backports #1450 into branch-1.0.
Author: Reynold Xin r...@apache.org
Closes #1469 from rxin/closure-1.0 and squashes
: Reynold Xin r...@apache.org
Closes #1262 from rxin/ui-consolidate-hashtables and squashes the following
commits:
1ac3f97 [Reynold Xin] Oops. Properly handle description.
f5736ad [Reynold Xin] Code review comments.
b8828dc [Reynold Xin] Merge branch 'master' into ui-consolidate-hashtables
7a7b6c4
Repository: spark
Updated Branches:
refs/heads/master 6afca2d10 - 29809a6d5
[SPARK-2570] [SQL] Fix the bug of ClassCastException
Exception thrown when running the example of HiveFromSpark.
Exception in thread main java.lang.ClassCastException: java.lang.Long cannot
be cast to
Repository: spark
Updated Branches:
refs/heads/master e52b8719c - 30b8d369d
SPARK-2553. Fix compile error
Author: Sandy Ryza sa...@cloudera.com
Closes #1479 from sryza/sandy-spark-2553 and squashes the following commits:
2cb5ed8 [Sandy Ryza] SPARK-2553. Fix compile error
Project:
Repository: spark
Updated Branches:
refs/heads/master 7f87ab981 - 586e716e4
Reservoir sampling implementation.
This is going to be used in https://issues.apache.org/jira/browse/SPARK-2568
Author: Reynold Xin r...@apache.org
Closes #1478 from rxin/reservoirSample and squashes the following
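
For readers unfamiliar with the technique named in the commit above, here is a
short Python sketch of classic reservoir sampling (Algorithm R). It is not the
code from the commit; reservoir_sample and its parameters are hypothetical
names chosen for this illustration.

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return up to k items drawn uniformly from an iterable of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), 5, seed=42)
print(sample)
```

The input is traversed once and only k items are held in memory, which is why
the approach suits data whose size is not known in advance.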
```
: 3.416348793 s, 1.477846558 s, 1.553432156 s
```
Author: Reynold Xin r...@apache.org
Closes #1452 from rxin/broadcast-task and squashes the following commits:
762e0be [Reynold Xin] Warn large broadcasts.
ade6eac [Reynold Xin] Log broadcast size.
c3b6f11 [Reynold Xin] Added a unit test for clean up
Repository: spark
Updated Branches:
refs/heads/master 7b8cd1752 - 805f329bb
put 'curRequestSize = 0' after 'logDebug' it
This is a minor change. We should first logDebug($curRequestSize) and then set
it to 0.
Author: Lijie Xu csxuli...@gmail.com
Closes #1477 from JerryLead/patch-1 and
Repository: spark
Updated Branches:
refs/heads/master 98ab41122 - fa51b0fb5
[SPARK-2598] RangePartitioner's binary search does not use the given Ordering
We should fix this in branch-1.0 as well.
Author: Reynold Xin r...@apache.org
Closes #1500 from rxin/rangePartitioner and squashes
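
The class of bug named in the title above can be illustrated with a small
Python sketch: a binary search over range bounds gives a wrong partition index
when it compares keys with the natural order instead of the ordering the
bounds were sorted with. This is an assumption-laden illustration, not Spark's
code; find_partition, natural and reverse are hypothetical names.

```python
def find_partition(bounds, key, compare):
    """Index of the first bound that is not less than key under compare."""
    lo, hi = 0, len(bounds)
    while lo < hi:
        mid = (lo + hi) // 2
        if compare(bounds[mid], key) < 0:
            lo = mid + 1
        else:
            hi = mid
    return lo


def natural(a, b):
    """Ascending comparator: negative, zero or positive, like a C comparator."""
    return (a > b) - (a < b)


def reverse(a, b):
    """Descending comparator, obtained by swapping the arguments."""
    return natural(b, a)


desc_bounds = [30, 20, 10]  # bounds sorted in descending order

correct = find_partition(desc_bounds, 25, reverse)  # uses the given ordering
wrong = find_partition(desc_bounds, 25, natural)    # ignores it
print(correct, wrong)
```

With the matching comparator the key 25 lands between the bounds 30 and 20;
with the natural order the search runs off the end of the bounds.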
Repository: spark
Updated Branches:
refs/heads/branch-1.0 11670bf1a - 480669f2b
[SPARK-2598] RangePartitioner's binary search does not use the given Ordering
We should fix this in branch-1.0 as well.
Author: Reynold Xin r...@apache.org
Closes #1500 from rxin/rangePartitioner and squashes
Repository: spark
Updated Branches:
refs/heads/master fa51b0fb5 - 1b10b8114
[SPARK-2495][MLLIB] remove private[mllib] from linear models' constructors
This is part of SPARK-2495 to allow users to construct linear models manually.
Author: Xiangrui Meng m...@databricks.com
Closes #1492 from
Repository: spark
Updated Branches:
refs/heads/master c3462c656 - 5d16d5bbf
[SPARK-2470] PEP8 fixes to PySpark
This pull request aims to resolve all outstanding PEP8 violations in PySpark.
Author: Nicholas Chammas nicholas.cham...@gmail.com
Author: nchammas nicholas.cham...@gmail.com
Closes
Repository: spark
Updated Branches:
refs/heads/master 6c2be93f0 - 4c7243e10
[SPARK-2617] Correct doc and usages of preservesPartitioning
The name `preservesPartitioning` is ambiguous: 1) preserves the indices of
partitions, 2) preserves the partitioner. The latter is correct and
Repository: spark
Updated Branches:
refs/heads/master 60f0ae3d8 - 2d25e3481
Replace RoutingTableMessage with pair
RoutingTableMessage was used to construct routing tables to enable
joining VertexRDDs with partitioned edges. It stored three elements: the
destination vertex ID, the source edge
Repository: spark
Updated Branches:
refs/heads/branch-1.0 c6421b6f6 - 6b0804640
[SPARK-2658][SQL] Add rule for true = 1.
Author: Michael Armbrust mich...@databricks.com
Closes #1556 from marmbrus/fixBooleanEqualsOne and squashes the following
commits:
ad8edd4 [Michael Armbrust] Add rule
Repository: spark
Updated Branches:
refs/heads/master 9e7725c86 - 78d18fdba
[SPARK-2658][SQL] Add rule for true = 1.
Author: Michael Armbrust mich...@databricks.com
Closes #1556 from marmbrus/fixBooleanEqualsOne and squashes the following
commits:
ad8edd4 [Michael Armbrust] Add rule for
Repository: spark
Updated Branches:
refs/heads/master 8529ced35 - eb82abd8e
[SPARK-2529] Clean closures in foreach and foreachPartition.
Author: Reynold Xin r...@apache.org
Closes #1583 from rxin/closureClean and squashes the following commits:
8982fe6 [Reynold Xin] [SPARK-2529] Clean
Repository: spark
Updated Branches:
refs/heads/branch-1.0 70109da21 - 797c663ae
[SPARK-2529] Clean closures in foreach and foreachPartition.
Author: Reynold Xin r...@apache.org
Closes #1583 from rxin/closureClean and squashes the following commits:
8982fe6 [Reynold Xin] [SPARK-2529] Clean
@markhamstra please take a look ...
Author: Reynold Xin r...@apache.org
Closes #1561 from rxin/dagSchedulerHashMaps and squashes the following commits:
1c44e15 [Reynold Xin] Clear pending tasks in submitMissingTasks.
620a0d1 [Reynold Xin] Use filterKeys.
5b54404 [Reynold Xin] Code review feedback
Repository: spark
Updated Branches:
refs/heads/master 39ab87b92 - 16ef4d110
Excess judgment
Author: Yadong Qi qiyadong2...@gmail.com
Closes #1629 from watermen/bug-fix2 and squashes the following commits:
59b7237 [Yadong Qi] Update HiveQl.scala
Project:
Repository: spark
Updated Branches:
refs/heads/master 96ba04bbf - 20424dad3
[SPARK-2174][MLLIB] treeReduce and treeAggregate
In `reduce` and `aggregate`, the driver node spends linear time on the number
of partitions. It becomes a bottleneck when there are many partitions and the
data from
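
The motivation above can be sketched locally in Python: instead of one node
folding all partial results in sequence, partial results are combined in
groups, level by level, so the number of combine rounds grows logarithmically
with the number of partitions. This is a single-process illustration only.
tree_reduce and fan_in are hypothetical names and do not mirror the signature
of Spark's treeReduce.

```python
import functools
import operator


def tree_reduce(partials, op, fan_in=2):
    """Combine per-partition results in groups of fan_in until one value remains."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(partials)
    if not level:
        raise ValueError("cannot reduce an empty collection")
    while len(level) > 1:
        # Each group stands for work that could run on a separate node.
        level = [
            functools.reduce(op, level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


total = tree_reduce(range(1, 9), operator.add)
print(total)
```

Eight partial results are combined in three rounds here, and the last round
merges only two values rather than all eight.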
Repository: spark
Updated Branches:
refs/heads/master 84467468d - 2e6efcace
[SPARK-2568] RangePartitioner should run only one job if data is balanced
As of Spark 1.0, RangePartitioner goes through data twice: once to compute the
count and once to do sampling. As a result, to do sortByKey,
(e.g. 1k loc
change in core, and only 1 loc change in hive).
We should use git diff --name-only master instead.
Author: Reynold Xin r...@apache.org
Closes #1656 from rxin/hiveTest and squashes the following commits:
f5eab9f [Reynold Xin] [SPARK-2747] git diff --dirstat can miss sql changes