GitHub user yejiming opened a pull request:
https://github.com/apache/spark/pull/4942
Branch 1.3
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-1.3
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4942.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4942
----
commit 9a1de4b20fcfa756f228b263f2a778534f6ca90d
Author: Venkata Ramana Gollamudi <[email protected]>
Date: 2015-02-12T22:44:21Z
[SPARK-5765][Examples]Fixed word split problem in run-example and
compute-classpath
Author: Venkata Ramana G <ramana.gollamudihuawei.com>
Author: Venkata Ramana Gollamudi <[email protected]>
Closes #4561 from gvramana/word_split and squashes the following commits:
285c8d4 [Venkata Ramana Gollamudi] Fixed word split problem in run-example
and compute-classpath
(cherry picked from commit 629d0143eeb3c153dac9c65e7b556723c6b4bfc7)
Signed-off-by: Andrew Or <[email protected]>
commit 0040fc50918cf5e53554b0dc8053528af58e6ba8
Author: Kay Ousterhout <[email protected]>
Date: 2015-02-12T22:46:37Z
[SPARK-5762] Fix shuffle write time for sort-based shuffle
mateiz, was excluding the time to write this final file from the shuffle
write time intentional?
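Illustration (not part of the patch): the squashed commits for this change
mention using nano time and moving the metrics update into a finally block.
A minimal Python sketch of that timing pattern follows. The names
`WriteMetrics` and `timed_write` are hypothetical and are not Spark's API.

```python
import time


class WriteMetrics:
    """Hypothetical metrics holder; not Spark's actual metrics class."""

    def __init__(self):
        self.write_time_ns = 0


def timed_write(metrics, write_fn):
    """Run write_fn and add its elapsed time to metrics, even if it raises."""
    start = time.perf_counter_ns()  # monotonic clock, nanosecond units
    try:
        return write_fn()
    finally:
        # Runs on success and on failure, so the time is always recorded.
        metrics.write_time_ns += time.perf_counter_ns() - start
```

Updating the counter in `finally` means a write that fails partway still
contributes the time it spent to the total.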
Author: Kay Ousterhout <[email protected]>
Closes #4559 from kayousterhout/SPARK-5762 and squashes the following
commits:
5c6f3d9 [Kay Ousterhout] Use foreach
94e4237 [Kay Ousterhout] Removed open time metrics added inadvertently
ace156c [Kay Ousterhout] Moved metrics to finally block
d773276 [Kay Ousterhout] Use nano time
5a59906 [Kay Ousterhout] [SPARK-5762] Fix shuffle write time for sort-based
shuffle
(cherry picked from commit 47c73d410ab533c3196184d2b6004081e79daeaa)
Signed-off-by: Andrew Or <[email protected]>
commit 11d108030516b1a0bd45f36312f6210dc9a577b0
Author: Andrew Or <[email protected]>
Date: 2015-02-12T22:47:52Z
[SPARK-5760][SPARK-5761] Fix standalone rest protocol corner cases + revamp
tests
The changes are summarized in the commit message. Test or test-related code
accounts for 90% of the lines changed.
Author: Andrew Or <[email protected]>
Closes #4557 from andrewor14/rest-tests and squashes the following commits:
b4dc980 [Andrew Or] Merge branch 'master' of github.com:apache/spark into
rest-tests
b55e40f [Andrew Or] Add test for unknown fields
cc96993 [Andrew Or] private[spark] -> private[rest]
578cf45 [Andrew Or] Clean up test code a little
d82d971 [Andrew Or] v1 -> serverVersion
ea48f65 [Andrew Or] Merge branch 'master' of github.com:apache/spark into
rest-tests
00999a8 [Andrew Or] Revamp tests + fix a few corner cases
(cherry picked from commit 1d5663e92cdaaa3dabfa58fdd7aede7e4fa4ec63)
Signed-off-by: Andrew Or <[email protected]>
commit 02d5b32bbebc055c1b4cde4f08a8194397921aa9
Author: lianhuiwang <[email protected]>
Date: 2015-02-12T22:50:16Z
[SPARK-5759][Yarn]ExecutorRunnable should catch YarnException while
NMClient start contain...
Sometimes NMClient fails to start a container and throws an exception. For
example, if spark_shuffle is not configured on some machines, it throws:
java.lang.Error:
org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The
auxService:spark_shuffle does not exist.
Because YarnAllocator uses a ThreadPoolExecutor to start containers, we
cannot tell which container or hostname threw the exception. I think we should
catch YarnException in ExecutorRunnable when starting a container, so that on
failure we know the container id or hostname of the failed container.
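Illustration (not part of the patch): a minimal Python sketch of the idea,
which is to catch the failure where the container id and hostname are still
known and re-raise it with that context. `ContainerStartError` and
`start_container` are hypothetical names; the real change is in Spark's
ExecutorRunnable and is not shown in this message.

```python
class ContainerStartError(Exception):
    """Hypothetical wrapper that carries container context."""


def start_container(container_id, hostname, launch):
    """Call launch() and attach container context to any failure."""
    try:
        launch()
    except Exception as exc:
        # Chain the original exception so its stack trace is preserved.
        raise ContainerStartError(
            f"Exception while starting container {container_id} "
            f"on host {hostname}"
        ) from exc
```

When the launch runs on a thread pool, the wrapped message identifies the
failing container even though the stack trace alone does not.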
Author: lianhuiwang <[email protected]>
Closes #4554 from lianhuiwang/SPARK-5759 and squashes the following commits:
caf5a99 [lianhuiwang] use SparkException to wrap exception
c02140f [lianhuiwang] ExecutorRunnable should catch YarnException while
NMClient start container
(cherry picked from commit 947b8bd82ec0f4c45910e6d781df4661f56e4587)
Signed-off-by: Andrew Or <[email protected]>
commit 11a0d5b6dce49c2beac8fd7eae2ccadf59a1e030
Author: David Y. Ross <[email protected]>
Date: 2015-02-12T22:52:38Z
SPARK-5747: Fix wordsplitting bugs in make-distribution.sh
The `$MVN` command variable may contain spaces, so references to it must be
wrapped in quotes.
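Illustration (not part of the patch): the effect can be approximated in
Python with `shlex`, which tokenizes a command line using POSIX shell-like
rules. The path below is a hypothetical example, and `shlex.split` only
models the shell's behavior; it is not what make-distribution.sh runs.

```python
import shlex

mvn = "/opt/my tools/mvn"  # hypothetical path containing a space

# Unquoted: the path is split at the space into two separate words.
unquoted = shlex.split(f"{mvn} clean package")

# Quoted: the path survives as a single word.
quoted = shlex.split(f"{shlex.quote(mvn)} clean package")

print(unquoted)  # ['/opt/my', 'tools/mvn', 'clean', 'package']
print(quoted)    # ['/opt/my tools/mvn', 'clean', 'package']
```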
Author: David Y. Ross <[email protected]>
Closes #4540 from dyross/dyr-fix-make-distribution2 and squashes the
following commits:
5a41596 [David Y. Ross] SPARK-5747: Fix wordsplitting bugs in
make-distribution.sh
(cherry picked from commit 26c816e7388eaa336a59183029f86548f1cc279c)
Signed-off-by: Andrew Or <[email protected]>
commit bf0d15c5255f054d2fb70d82ca96797a3665f058
Author: Davies Liu <[email protected]>
Date: 2015-02-12T22:54:38Z
[SPARK-5780] [PySpark] Mute the logging during unit tests
The driver and workers emit a lot of logging, including many exceptions. It
is noisy and alarming, and people get confused about whether the tests are
failing.
This PR mutes the logging during tests and only shows it if a test fails.
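Illustration (not part of the patch): one way to get this behavior is to
buffer log output while a test runs and surface it only on failure. The
sketch below uses Python's standard `logging` module. `run_quietly` and the
logger name are hypothetical, and this message does not show how the PR
itself implements the muting.

```python
import io
import logging


def run_quietly(test_fn, logger_name="demo"):
    """Run test_fn(logger) with log output buffered.

    Returns (passed, shown_log); shown_log is empty when the test passes.
    """
    logger = logging.getLogger(logger_name)
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    old_propagate, old_level = logger.propagate, logger.level
    logger.propagate = False  # keep records away from the root handlers
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    try:
        test_fn(logger)
        return True, ""
    except AssertionError:
        return False, buffer.getvalue()
    finally:
        # Restore the logger so later tests are unaffected.
        logger.removeHandler(handler)
        logger.propagate = old_propagate
        logger.setLevel(old_level)
```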
Author: Davies Liu <[email protected]>
Closes #4572 from davies/mute and squashes the following commits:
1e9069c [Davies Liu] mute the logging during python tests
(cherry picked from commit 0bf031582588723dd5a4ca42e6f9f36bc2da1a0b)
Signed-off-by: Andrew Or <[email protected]>
commit b0c79daf4a24739963726dfecedff9a4b129f3c0
Author: Yin Huai <[email protected]>
Date: 2015-02-12T23:17:25Z
[SPARK-5758][SQL] Use LongType as the default type for integers in JSON
schema inference.
Author: Yin Huai <[email protected]>
Closes #4544 from yhuai/jsonUseLongTypeByDefault and squashes the following
commits:
6e2ffc2 [Yin Huai] Use LongType as the default type for integers in JSON
schema inference.
(cherry picked from commit c352ffbdb9112714c176a747edff6115e9369e58)
Signed-off-by: Michael Armbrust <[email protected]>
commit c7eb9ee2ccd93211c9ec125fd2baae267b35d3d4
Author: Michael Armbrust <[email protected]>
Date: 2015-02-12T23:19:19Z
[SPARK-5573][SQL] Add explode to dataframes
Author: Michael Armbrust <[email protected]>
Closes #4546 from marmbrus/explode and squashes the following commits:
eefd33a [Michael Armbrust] whitespace
a8d496c [Michael Armbrust] Merge remote-tracking branch 'apache/master'
into explode
4af740e [Michael Armbrust] Merge remote-tracking branch 'origin/master'
into explode
dc86a5c [Michael Armbrust] simple version
d633d01 [Michael Armbrust] add scala specific
950707a [Michael Armbrust] fix comments
ba8854c [Michael Armbrust] [SPARK-5573][SQL] Add explode to dataframes
(cherry picked from commit ee04a8b19be8330bfc48f470ef365622162c915f)
Signed-off-by: Michael Armbrust <[email protected]>
commit f7103b3437363bd81e4f4cfa282229019fcdcdad
Author: Daoyuan Wang <[email protected]>
Date: 2015-02-12T23:22:07Z
[SPARK-5755] [SQL] remove unnecessary Add
explain extended select +key from src;
before:
== Parsed Logical Plan ==
'Project [(0 + 'key) AS _c0#8]
'UnresolvedRelation [src], None
== Analyzed Logical Plan ==
Project [(0 + key#10) AS _c0#8]
MetastoreRelation test, src, None
== Optimized Logical Plan ==
Project [(0 + key#10) AS _c0#8]
MetastoreRelation test, src, None
== Physical Plan ==
Project [(0 + key#10) AS _c0#8]
HiveTableScan [key#10], (MetastoreRelation test, src, None), None
after this patch:
== Parsed Logical Plan ==
'Project ['key]
'UnresolvedRelation [src], None
== Analyzed Logical Plan ==
Project [key#10]
MetastoreRelation test, src, None
== Optimized Logical Plan ==
Project [key#10]
MetastoreRelation test, src, None
== Physical Plan ==
HiveTableScan [key#10], (MetastoreRelation test, src, None), None
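Illustration (not part of the patch): the plans above show `+key` going from
`(0 + key)` to plain `key`. A toy Python model of that difference follows.
The `Attr`, `Literal`, and `Add` classes and both functions are hypothetical
stand-ins for the expression nodes in the plan output, not Catalyst code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    name: str


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def unary_plus_before(operand):
    """Old behavior per the plans above: +x becomes (0 + x)."""
    return Add(Literal(0), operand)


def unary_plus_after(operand):
    """New behavior per the plans above: +x is just x."""
    return operand
```

With the extra `Add` gone, the projection is a bare column reference, which
is consistent with the Project node disappearing from the physical plan.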
Author: Daoyuan Wang <[email protected]>
Closes #4551 from adrian-wang/positive and squashes the following commits:
0821ae4 [Daoyuan Wang] remove unnecessary Add
(cherry picked from commit d5fc51491808630d0328a5937dbf349e00de361f)
Signed-off-by: Michael Armbrust <[email protected]>
commit 5c9db4e756b9f5567d9510e614ac399fbd68246a
Author: Vladimir Grigor <[email protected]>
Date: 2015-02-12T23:26:24Z
[SPARK-5335] Fix deletion of security groups within a VPC
Please see https://issues.apache.org/jira/browse/SPARK-5335.
The fix itself is in commit e58a8b01a8bedcbfbbc6d04b1c1489255865cf87. The two
earlier commits fix another VPC-related bug that is waiting to be merged. I
should have created the earlier bug fix in its own branch; then this fix would
not include the earlier fixes. :(
This code is released under the project's license.
Author: Vladimir Grigor <[email protected]>
Author: Vladimir Grigor <[email protected]>
Closes #4122 from voukka/SPARK-5335_delete_sg_vpc and squashes the
following commits:
090dca9 [Vladimir Grigor] fixes as per review: removed printing of group_id
and added comment
730ec05 [Vladimir Grigor] fix for SPARK-5335: Destroying cluster in VPC
with "--delete-groups" fails to remove security groups
(cherry picked from commit ada993e954e2825c0fe13326fc23b0e1a567cd55)
Signed-off-by: Sean Owen <[email protected]>
commit 925fd84a1d25fd453af8ad457569dddb54938388
Author: Yin Huai <[email protected]>
Date: 2015-02-12T23:32:17Z
[SQL] Move SaveMode to SQL package.
Author: Yin Huai <[email protected]>
Closes #4542 from yhuai/moveSaveMode and squashes the following commits:
65a4425 [Yin Huai] Move SaveMode to sql package.
(cherry picked from commit c025a468826e9b9f62032e207daa9d42d9dba3ca)
Signed-off-by: Michael Armbrust <[email protected]>
commit edbac178d186f6936408b211385a5fea9e4f4603
Author: Yin Huai <[email protected]>
Date: 2015-02-13T02:08:01Z
[SPARK-3299][SQL]Public API in SQLContext to list tables
https://issues.apache.org/jira/browse/SPARK-3299
Author: Yin Huai <[email protected]>
Closes #4547 from yhuai/tables and squashes the following commits:
6c8f92e [Yin Huai] Add tableNames.
acbb281 [Yin Huai] Update Python test.
7793dcb [Yin Huai] Fix scala test.
572870d [Yin Huai] Address comments.
aba2e88 [Yin Huai] Format.
12c86df [Yin Huai] Add tables() to SQLContext to return a DataFrame
containing existing tables.
(cherry picked from commit 1d0596a16e1d3add2631f5d8169aeec2876a1362)
Signed-off-by: Michael Armbrust <[email protected]>
commit b9f332ab680f671a368a8411679bb4c52d495486
Author: tianyi <[email protected]>
Date: 2015-02-13T06:18:39Z
[SPARK-3365][SQL]Wrong schema generated for List type
This PR fixes the issue SPARK-3365.
The cause is that Spark generated the wrong schema for the type `List` in
`ScalaReflection.scala`.
For example:
the generated schema for type `Seq[String]` is:
```
{"name":"x","type":{"type":"array","elementType":"string","containsNull":true},"nullable":true,"metadata":{}}
```
the generated schema for type `List[String]` is:
```
{"name":"x","type":{"type":"struct","fields":[]},"nullable":true,"metadata":{}}
```
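Illustration (not part of the patch): the squashed commit for this change
says it changes the order of resolution. The toy Python model below shows how
check order alone can produce the two schemas above when a type satisfies
more than one check. It assumes, without confirmation from this message, that
`List` matched both a product-like check and a sequence check. All class and
function names are hypothetical.

```python
class Product:
    """Stand-in for a struct-like type."""


class Seq:
    """Stand-in for a sequence type."""


class ListLike(Seq, Product):
    """Stand-in for a type that satisfies both checks."""


STRUCT = {"type": "struct", "fields": []}
ARRAY = {"type": "array", "elementType": "string", "containsNull": True}


def schema_product_first(value):
    """Checks Product before Seq, so ListLike resolves to a struct."""
    if isinstance(value, Product):
        return STRUCT
    if isinstance(value, Seq):
        return ARRAY
    raise TypeError("unsupported type")


def schema_seq_first(value):
    """Checks Seq before Product, so ListLike resolves to an array."""
    if isinstance(value, Seq):
        return ARRAY
    if isinstance(value, Product):
        return STRUCT
    raise TypeError("unsupported type")
```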
Author: tianyi <[email protected]>
Closes #4581 from tianyi/SPARK-3365 and squashes the following commits:
a097e86 [tianyi] change the order of resolution in ScalaReflection.scala
(cherry picked from commit 1c8633f3fe9d814c83384e339b958740c250c00c)
Signed-off-by: Cheng Lian <[email protected]>
commit a8f560c4e8c0b77b929f2564ac60fd558e62d72e
Author: Yin Huai <[email protected]>
Date: 2015-02-13T04:37:55Z
[SQL] Fix docs of SQLContext.tables
Author: Yin Huai <[email protected]>
Closes #4579 from yhuai/tablesDoc and squashes the following commits:
7f8964c [Yin Huai] Fix doc.
(cherry picked from commit 2aea892ebd4d6c802defeef35ef7ebfe42c06eba)
Signed-off-by: Cheng Lian <[email protected]>
commit 1255e83f841b59fd3c52fff3e6a733b8132c8d30
Author: WangTaoTheTonic <[email protected]>
Date: 2015-02-13T10:27:23Z
[SPARK-4832][Deploy]some other processes might take the daemon pid
Another process might be using the pid saved in the pid file. In that case
we should ignore it and launch the daemon.
JIRA is down for maintenance. I will file an issue once it returns.
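Illustration (not part of the patch): a minimal Python sketch of the decision
described above. `should_launch`, `command_of`, and `expected_marker` are
hypothetical names. The process lookup is passed in as a function so the
logic stays independent of how a platform reports command lines. The real
fix is not shown in this message.

```python
def should_launch(pid_file_text, command_of, expected_marker):
    """Decide whether to launch a daemon given its pid file contents.

    command_of(pid) returns the command line of that process, or None if
    no such process exists.
    """
    try:
        pid = int(pid_file_text.strip())
    except ValueError:
        return True  # unreadable pid file: treat it as stale
    command = command_of(pid)
    if command is None:
        return True  # nothing is running under that pid
    # The pid is alive; only refuse to launch if it is really our daemon.
    return expected_marker not in command
```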
Author: WangTaoTheTonic <[email protected]>
Author: WangTaoTheTonic <[email protected]>
Closes #3683 from WangTaoTheTonic/otherproc and squashes the following
commits:
daa86a1 [WangTaoTheTonic] some bash style fix
8befee7 [WangTaoTheTonic] handle the mistake scenario
cf4ecc6 [WangTaoTheTonic] remove redundant condition
f36cfb4 [WangTaoTheTonic] some other processes might take the pid
(cherry picked from commit 1768bd51438670c493ca3ca02988aee3ae31e87e)
Signed-off-by: Sean Owen <[email protected]>
commit 5c883df09eae5c9a8a0f54c265096a9e10a17fa5
Author: uncleGen <[email protected]>
Date: 2015-02-13T17:43:10Z
[SPARK-5732][CORE]:Add an option to print the spark version in spark script.
We may need an option to print the Spark version in the spark scripts. This
is common in command-line tools.

Author: uncleGen <[email protected]>
Author: genmao.ygm <[email protected]>
Closes #4522 from uncleGen/master-clean-150211 and squashes the following
commits:
9f2127c [genmao.ygm] revert the behavior of "-v"
015ddee [uncleGen] minor changes
463f02c [uncleGen] minor changes
(cherry picked from commit c0ccd2564182695ea5771524840bf1a99d5aa842)
Signed-off-by: Andrew Or <[email protected]>
commit 5e639422207a113eee4ea3796c221004664ede1a
Author: sboeschhuawei <[email protected]>
Date: 2015-02-13T17:45:57Z
[SPARK-5503][MLLIB] Example code for Power Iteration Clustering
Author: sboeschhuawei <[email protected]>
Closes #4495 from javadba/picexamples and squashes the following commits:
3c84b14 [sboeschhuawei] PIC Examples updates from Xiangrui's comments round
5
2878675 [sboeschhuawei] Fourth round with xiangrui on PICExample
d7ac350 [sboeschhuawei] Updates to PICExample from Xiangrui's comments
round 3
d7f0cba [sboeschhuawei] Updates to PICExample from Xiangrui's comments
round 3
cef28f4 [sboeschhuawei] Further updates to PICExample from Xiangrui's
comments
f7ff43d [sboeschhuawei] Update to PICExample from Xiangrui's comments
efeec45 [sboeschhuawei] Update to PICExample from Xiangrui's comments
03e8de4 [sboeschhuawei] Added PICExample
c509130 [sboeschhuawei] placeholder for pic examples
5864d4a [sboeschhuawei] placeholder for pic examples
(cherry picked from commit e1a1ff8108463ca79299ec0eb555a0c8db9dffa0)
Signed-off-by: Xiangrui Meng <[email protected]>
commit e5690a502f04ab948cc8f8f7fd04be55498ea0cc
Author: Ryan Williams <[email protected]>
Date: 2015-02-13T17:47:26Z
[SPARK-5783] Better eventlog-parsing error messages
Author: Ryan Williams <[email protected]>
Closes #4573 from ryan-williams/history and squashes the following commits:
a8647ec [Ryan Williams] fix test calls to .replay()
98aa3fe [Ryan Williams] include filename in history-parsing error message
8deecf0 [Ryan Williams] add line number to history-parsing error message
b668b52 [Ryan Williams] add log info line to history-eventlog parsing
commit cc9eec1a076624628d3d582e7c679f0861ecb39c
Author: Josh Rosen <[email protected]>
Date: 2015-02-13T17:53:57Z
[SPARK-5735] Replace uses of EasyMock with Mockito
This patch replaces all uses of EasyMock with Mockito. There are two
motivations for this:
1. We should use a single mocking framework in our tests in order to keep
things consistent.
2. EasyMock may be responsible for non-deterministic unit test failures due
to its Objenesis dependency (see SPARK-5626 for more details).
Most of these changes are fairly mechanical translations of EasyMock code to
Mockito, although I made a small change that strengthens the assertions in one
test in KinesisReceiverSuite.
Author: Josh Rosen <[email protected]>
Closes #4578 from JoshRosen/SPARK-5735-remove-easymock and squashes the
following commits:
0ab192b [Josh Rosen] Import sorting plus two minor changes to more closely
match old semantics.
977565b [Josh Rosen] Remove EasyMock from build.
fae1d8f [Josh Rosen] Remove EasyMock usage in KinesisReceiverSuite.
7cca486 [Josh Rosen] Remove EasyMock usage in MesosSchedulerBackendSuite
fc5e94d [Josh Rosen] Remove EasyMock in CacheManagerSuite
(cherry picked from commit 077eec2d9dba197f51004ee4a322d0fa71424ea0)
Signed-off-by: Andrew Or <[email protected]>
commit ad731897bc7e33ecc37340614f9b9b300ab3d982
Author: Emre Sevinç <[email protected]>
Date: 2015-02-13T20:31:27Z
SPARK-5805 Fixed the type error in documentation.
Fixes SPARK-5805 : Fix the type error in the final example given in MLlib -
Clustering documentation.
Author: Emre Sevinç <[email protected]>
Closes #4596 from emres/SPARK-5805 and squashes the following commits:
1029f66 [Emre Sevinç] SPARK-5805 Fixed the type error in documentation.
(cherry picked from commit 9f31db061019414a964aac432e946eac61f8307c)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 41603717abe3fd46b14049c15301b81a3a614989
Author: Andrew Or <[email protected]>
Date: 2015-02-13T21:10:29Z
[HOTFIX] Fix build break in MesosSchedulerBackendSuite
commit efffc2e428b1e867a586749685da90875f6bcfc4
Author: Daoyuan Wang <[email protected]>
Date: 2015-02-13T21:46:50Z
[SPARK-5642] [SQL] Apply column pruning on unused aggregation fields
select k from (select key k, max(value) v from src group by k) t
Author: Daoyuan Wang <[email protected]>
Author: Michael Armbrust <[email protected]>
Closes #4415 from adrian-wang/groupprune and squashes the following commits:
5d2d8a3 [Daoyuan Wang] address Michael's comments
61f8ef7 [Daoyuan Wang] add a unit test
80ddcc6 [Daoyuan Wang] keep project
b69d385 [Daoyuan Wang] add a prune rule for grouping set
(cherry picked from commit 2cbb3e433ae334d5c318f05b987af314c854fbcc)
Signed-off-by: Michael Armbrust <[email protected]>
commit d9d0250fc5dfe529bebd4f67f945f4d7c3fc4106
Author: Yin Huai <[email protected]>
Date: 2015-02-13T21:51:06Z
[SPARK-5789][SQL]Throw a better error message if JsonRDD.parseJson
encounters unrecoverable parsing errors.
Author: Yin Huai <[email protected]>
Closes #4582 from yhuai/jsonErrorMessage and squashes the following commits:
152dbd4 [Yin Huai] Update error message.
1466256 [Yin Huai] Throw a better error message when a JSON object in the
input dataset span multiple records (lines for files or strings for an RDD of
strings).
(cherry picked from commit 2e0c084528409e1c565e6945521a33c0835ebbee)
Signed-off-by: Michael Armbrust <[email protected]>
commit 965876328d037f2a817f8c6bf5df0b3071abb43a
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-13T23:09:27Z
[SPARK-5806] re-organize sections in mllib-clustering.md
Put example code close to the algorithm description.
Author: Xiangrui Meng <[email protected]>
Closes #4598 from mengxr/SPARK-5806 and squashes the following commits:
a137872 [Xiangrui Meng] re-organize sections in mllib-clustering.md
(cherry picked from commit cc56c8729a76af85aa6eb5d2f99787cca5e5b38f)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 356b798b3878bac1f89304e0be0f698f9eed6ec0
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-14T00:43:49Z
[SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays
because ArrayBuffer is not specialized.
Author: Xiangrui Meng <[email protected]>
Closes #4594 from mengxr/SPARK-5803 and squashes the following commits:
1261bd5 [Xiangrui Meng] merge master
a4ea872 [Xiangrui Meng] use ArrayBuilder to build primitive arrays
(cherry picked from commit d50a91d529b0913364b483c511397d4af308a435)
Signed-off-by: Xiangrui Meng <[email protected]>
commit fccd38d2e08fb3502440a942a6958af5aada539b
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-14T00:45:59Z
[SPARK-5730][ML] add doc groups to spark.ml components
This PR adds three groups to the ScalaDoc: `param`, `setParam`, and
`getParam`. Params will show up in the generated Scala API doc as the top
group. Setters/getters will be at the bottom.
Preview:

Author: Xiangrui Meng <[email protected]>
Closes #4600 from mengxr/SPARK-5730 and squashes the following commits:
febed9a [Xiangrui Meng] add doc groups to spark.ml components
(cherry picked from commit 4f4c6d5a5db04a56906bacdc85d7e5589b6edada)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 152147f5f884ae4eea3873f01719e6ab9bc7afd2
Author: Josh Rosen <[email protected]>
Date: 2015-02-14T01:45:31Z
[SPARK-5227] [SPARK-5679] Disable FileSystem cache in
WholeTextFileRecordReaderSuite
This patch fixes two difficult-to-reproduce Jenkins test failures in
InputOutputMetricsSuite (SPARK-5227 and SPARK-5679). The problem was that
WholeTextFileRecordReaderSuite modifies the `fs.local.block.size` Hadoop
configuration and this change was affecting subsequent test suites due to
Hadoop's caching of FileSystem instances (see HADOOP-8490 for more details).
The fix implemented here is to disable FileSystem caching in
WholeTextFileRecordReaderSuite.
Author: Josh Rosen <[email protected]>
Closes #4599 from JoshRosen/inputoutputsuite-fix and squashes the following
commits:
47dc447 [Josh Rosen] [SPARK-5227] [SPARK-5679] Disable FileSystem cache in
WholeTextFileRecordReaderSuite
(cherry picked from commit d06d5ee9b33505774ef1e5becc01b47492f1a2dc)
Signed-off-by: Patrick Wendell <[email protected]>
commit db5747921a648c3f7cf1de6dba70b82584afd097
Author: Sean Owen <[email protected]>
Date: 2015-02-14T04:12:52Z
SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus
This just unpersist()s each RDD in this code that was cache()ed.
Author: Sean Owen <[email protected]>
Closes #4234 from srowen/SPARK-3290 and squashes the following commits:
66c1e11 [Sean Owen] unpersist() each RDD that was cache()ed
(cherry picked from commit 0ce4e430a81532dc317136f968f28742e087d840)
Signed-off-by: Ankur Dave <[email protected]>
commit ba91bf5f4f048a721d97eb5779957ec39b15319f
Author: Reynold Xin <[email protected]>
Date: 2015-02-14T07:03:22Z
[SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
- The old implicit would convert RDDs directly to DataFrames, and that
added too many methods.
- toDataFrame -> toDF
- Dsl -> functions
- implicits moved into SQLContext.implicits
- addColumn -> withColumn
- renameColumn -> withColumnRenamed
Python changes:
- toDataFrame -> toDF
- Dsl -> functions package
- addColumn -> withColumn
- renameColumn -> withColumnRenamed
- add toDF functions to RDD on SQLContext init
- add flatMap to DataFrame
Author: Reynold Xin <[email protected]>
Author: Davies Liu <[email protected]>
Closes #4556 from rxin/SPARK-5752 and squashes the following commits:
5ef9910 [Reynold Xin] More fix
61d3fca [Reynold Xin] Merge branch 'df5' of github.com:davies/spark into
SPARK-5752
ff5832c [Reynold Xin] Fix python
749c675 [Reynold Xin] count(*) fixes.
5806df0 [Reynold Xin] Fix build break again.
d941f3d [Reynold Xin] Fixed explode compilation break.
fe1267a [Davies Liu] flatMap
c4afb8e [Reynold Xin] style
d9de47f [Davies Liu] add comment
b783994 [Davies Liu] add comment for toDF
e2154e5 [Davies Liu] schema() -> schema
3a1004f [Davies Liu] Dsl -> functions, toDF()
fb256af [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits
moved into SQLContext.implicits - addColumn -> withColumn - renameColumn ->
withColumnRenamed
0dd74eb [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs
directly to DataFrames
97dd47c [Davies Liu] fix mistake
6168f74 [Davies Liu] fix test
1fc0199 [Davies Liu] fix test
a075cd5 [Davies Liu] clean up, toPandas
663d314 [Davies Liu] add test for agg('*')
9e214d5 [Reynold Xin] count(*) fixes.
1ed7136 [Reynold Xin] Fix build break again.
921b2e3 [Reynold Xin] Fixed explode compilation break.
14698d4 [Davies Liu] flatMap
ba3e12d [Reynold Xin] style
d08c92d [Davies Liu] add comment
5c8b524 [Davies Liu] add comment for toDF
a4e5e66 [Davies Liu] schema() -> schema
d377fc9 [Davies Liu] Dsl -> functions, toDF()
6b3086c [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits
moved into SQLContext.implicits - addColumn -> withColumn - renameColumn ->
withColumnRenamed
807e8b1 [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs
directly to DataFrames
(cherry picked from commit e98dfe627c5d0201464cdd0f363f391ea84c389a)
Signed-off-by: Reynold Xin <[email protected]>
commit e99e170c7bff95a102b3bf00cc31bfa81951d0cf
Author: gasparms <[email protected]>
Date: 2015-02-14T20:10:29Z
[SPARK-5800] Streaming Docs. Change linked files according the selected
language
Currently, the Spark Streaming Programming Guide, after the updateStateByKey
explanation, links to stateful_network_wordcount.py with the note "For the
complete Scala code ..." regardless of which language tab is selected. This is
inconsistent.
I've changed the guide so that each tab links to its own example file.
The JavaStatefulNetworkWordCount.java example did not exist, so I added it in
this commit.
Author: gasparms <[email protected]>
Closes #4589 from gasparms/feature/streaming-guide and squashes the
following commits:
7f37f89 [gasparms] More style changes
ec202b0 [gasparms] Follow spark style guide
f527328 [gasparms] Improve example to look like scala example
4d8785c [gasparms] Remove throw exception
e92e6b8 [gasparms] Fix incoherence
92db405 [gasparms] Fix Streaming Programming Guide. Change files according
the selected language
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---