[GitHub] spark pull request: Branch 1.3

yejiming Sun, 08 Mar 2015 19:04:56 -0700

GitHub user yejiming opened a pull request:

    https://github.com/apache/spark/pull/4942


    Branch 1.3

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/spark branch-1.3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4942
    
----
commit 9a1de4b20fcfa756f228b263f2a778534f6ca90d
Author: Venkata Ramana Gollamudi <[email protected]>
Date:   2015-02-12T22:44:21Z

    [SPARK-5765][Examples]Fixed word split problem in run-example and 
compute-classpath
    
    Author: Venkata Ramana G <ramana.gollamudihuawei.com>
    
    Author: Venkata Ramana Gollamudi <[email protected]>
    
    Closes #4561 from gvramana/word_split and squashes the following commits:
    
    285c8d4 [Venkata Ramana Gollamudi] Fixed word split problem in run-example 
and compute-classpath
    
    (cherry picked from commit 629d0143eeb3c153dac9c65e7b556723c6b4bfc7)
    Signed-off-by: Andrew Or <[email protected]>

commit 0040fc50918cf5e53554b0dc8053528af58e6ba8
Author: Kay Ousterhout <[email protected]>
Date:   2015-02-12T22:46:37Z

    [SPARK-5762] Fix shuffle write time for sort-based shuffle
    
    mateiz, was it intentional to exclude the time to write this final file from the shuffle write time?
    
    Author: Kay Ousterhout <[email protected]>
    
    Closes #4559 from kayousterhout/SPARK-5762 and squashes the following 
commits:
    
    5c6f3d9 [Kay Ousterhout] Use foreach
    94e4237 [Kay Ousterhout] Removed open time metrics added inadvertently
    ace156c [Kay Ousterhout] Moved metrics to finally block
    d773276 [Kay Ousterhout] Use nano time
    5a59906 [Kay Ousterhout] [SPARK-5762] Fix shuffle write time for sort-based 
shuffle
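
The squashed commits above mention timing with nano time and recording metrics in a finally block. A minimal Python sketch of that pattern follows; the names `ShuffleWriteMetrics` and `write_final_file` are hypothetical and this is not Spark's actual code.

```python
import time


class ShuffleWriteMetrics:
    """Hypothetical stand-in for a task's shuffle write metrics."""

    def __init__(self):
        self.write_time_ns = 0


def write_final_file(chunks, sink, metrics):
    """Write all chunks to sink and charge the elapsed time to metrics.

    The timer is read in a finally block so the time is recorded even
    if a write raises part-way through.
    """
    start = time.perf_counter_ns()  # monotonic clock, nanosecond units
    try:
        for chunk in chunks:
            sink.append(chunk)
    finally:
        metrics.write_time_ns += time.perf_counter_ns() - start


metrics = ShuffleWriteMetrics()
out = []
write_final_file([b"a", b"b"], out, metrics)
```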
    
    (cherry picked from commit 47c73d410ab533c3196184d2b6004081e79daeaa)
    Signed-off-by: Andrew Or <[email protected]>

commit 11d108030516b1a0bd45f36312f6210dc9a577b0
Author: Andrew Or <[email protected]>
Date:   2015-02-12T22:47:52Z

    [SPARK-5760][SPARK-5761] Fix standalone rest protocol corner cases + revamp 
tests
    
    The changes are summarized in the commit message. Test or test-related code 
accounts for 90% of the lines changed.
    
    Author: Andrew Or <[email protected]>
    
    Closes #4557 from andrewor14/rest-tests and squashes the following commits:
    
    b4dc980 [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
rest-tests
    b55e40f [Andrew Or] Add test for unknown fields
    cc96993 [Andrew Or] private[spark] -> private[rest]
    578cf45 [Andrew Or] Clean up test code a little
    d82d971 [Andrew Or] v1 -> serverVersion
    ea48f65 [Andrew Or] Merge branch 'master' of github.com:apache/spark into 
rest-tests
    00999a8 [Andrew Or] Revamp tests + fix a few corner cases
    
    (cherry picked from commit 1d5663e92cdaaa3dabfa58fdd7aede7e4fa4ec63)
    Signed-off-by: Andrew Or <[email protected]>

commit 02d5b32bbebc055c1b4cde4f08a8194397921aa9
Author: lianhuiwang <[email protected]>
Date:   2015-02-12T22:50:16Z

    [SPARK-5759][Yarn]ExecutorRunnable should catch YarnException while 
NMClient start contain...
    
    Sometimes, for various reasons, an exception is thrown while NMClient starts containers. For example, if spark_shuffle is not configured on some machines, it throws:
    java.lang.Error: org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:spark_shuffle does not exist.
    Because YarnAllocator uses a ThreadPoolExecutor to start containers, we cannot tell which container or hostname threw the exception. I think we should catch YarnException in ExecutorRunnable when starting a container, so that if an exception occurs we know the container ID or hostname of the failed container.
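
The idea of catching the failure and re-raising it with the container's identity can be sketched in Python as below. `ContainerStartError`, `start_container`, and the container and host names are hypothetical; the actual change is in Scala and wraps the error in a SparkException.

```python
class ContainerStartError(Exception):
    """Hypothetical wrapper carrying the failing container's identity."""


def start_container(container_id, hostname, launch):
    """Run launch() and re-raise any failure with container context."""
    try:
        launch()
    except Exception as exc:
        raise ContainerStartError(
            f"Exception while starting container {container_id} "
            f"on host {hostname}"
        ) from exc


def failing_launch():
    # Simulates the node manager rejecting the start request.
    raise RuntimeError("The auxService:spark_shuffle does not exist")


try:
    start_container("container_01", "node-7", failing_launch)
except ContainerStartError as err:
    message = str(err)
    cause = err.__cause__
```

Chaining with `from exc` keeps the original exception available as the cause, so no detail is lost when the context is added.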
    
    Author: lianhuiwang <[email protected]>
    
    Closes #4554 from lianhuiwang/SPARK-5759 and squashes the following commits:
    
    caf5a99 [lianhuiwang] use SparkException to wrap exception
    c02140f [lianhuiwang] ExecutorRunnable should catch YarnException while 
NMClient start container
    
    (cherry picked from commit 947b8bd82ec0f4c45910e6d781df4661f56e4587)
    Signed-off-by: Andrew Or <[email protected]>

commit 11a0d5b6dce49c2beac8fd7eae2ccadf59a1e030
Author: David Y. Ross <[email protected]>
Date:   2015-02-12T22:52:38Z

    SPARK-5747: Fix wordsplitting bugs in make-distribution.sh
    
    The `$MVN` command variable may contain spaces, so references to it must be wrapped in quotes.
    
    Author: David Y. Ross <[email protected]>
    
    Closes #4540 from dyross/dyr-fix-make-distribution2 and squashes the 
following commits:
    
    5a41596 [David Y. Ross] SPARK-5747: Fix wordsplitting bugs in 
make-distribution.sh
    
    (cherry picked from commit 26c816e7388eaa336a59183029f86548f1cc279c)
    Signed-off-by: Andrew Or <[email protected]>

commit bf0d15c5255f054d2fb70d82ca96797a3665f058
Author: Davies Liu <[email protected]>
Date:   2015-02-12T22:54:38Z

    [SPARK-5780] [PySpark] Mute the logging during unit tests
    
    There is a lot of logging coming from the driver and workers. It is noisy and alarming, with many exceptions in it, so people get confused about whether the tests are failing.
    
    This PR mutes the logging during tests and only shows it if a test fails.
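
One way to get "quiet unless something fails" behaviour is to buffer log output per test and surface it only on failure. The sketch below uses Python's standard `logging` module; `run_quietly` and the logger name `noisy` are hypothetical, and this is not necessarily how the PR implements it.

```python
import io
import logging


def run_quietly(test_fn, logger_name="noisy"):
    """Run test_fn with its log output buffered.

    Returns (passed, shown_output). The buffered log text is returned
    only when the test fails; otherwise an empty string is returned.
    """
    logger = logging.getLogger(logger_name)
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    old_propagate, old_level = logger.propagate, logger.level
    logger.addHandler(handler)
    logger.propagate = False  # keep records away from the root handlers
    logger.setLevel(logging.DEBUG)
    try:
        test_fn()
        return True, ""
    except AssertionError:
        return False, buffer.getvalue()
    finally:
        logger.removeHandler(handler)
        logger.propagate = old_propagate
        logger.setLevel(old_level)


def passing_test():
    logging.getLogger("noisy").warning("expected exception in worker")


def failing_test():
    logging.getLogger("noisy").warning("expected exception in worker")
    assert 1 + 1 == 3


ok_result = run_quietly(passing_test)
bad_result = run_quietly(failing_test)
```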
    
    Author: Davies Liu <[email protected]>
    
    Closes #4572 from davies/mute and squashes the following commits:
    
    1e9069c [Davies Liu] mute the logging during python tests
    
    (cherry picked from commit 0bf031582588723dd5a4ca42e6f9f36bc2da1a0b)
    Signed-off-by: Andrew Or <[email protected]>

commit b0c79daf4a24739963726dfecedff9a4b129f3c0
Author: Yin Huai <[email protected]>
Date:   2015-02-12T23:17:25Z

    [SPARK-5758][SQL] Use LongType as the default type for integers in JSON 
schema inference.
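
A toy illustration of why a wide integer type is a convenient default: the first record seen may hold a small value while a later one does not fit in 32 bits. This is a simplified sketch with hypothetical helper names, not Spark's JSON inference code, and the motivation stated here is an assumption.

```python
import json


def infer_type(value):
    """Map a parsed JSON value to a simplified type name."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"  # default to the wide integer type
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "unknown"


def infer_schema(lines):
    """Infer a flat {field: type} schema from JSON-lines records."""
    schema = {}
    for line in lines:
        for key, value in json.loads(line).items():
            schema.setdefault(key, infer_type(value))
    return schema


records = ['{"id": 1, "name": "a"}', '{"id": 3000000000, "score": 1.5}']
schema = infer_schema(records)
```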
    
    Author: Yin Huai <[email protected]>
    
    Closes #4544 from yhuai/jsonUseLongTypeByDefault and squashes the following 
commits:
    
    6e2ffc2 [Yin Huai] Use LongType as the default type for integers in JSON 
schema inference.
    
    (cherry picked from commit c352ffbdb9112714c176a747edff6115e9369e58)
    Signed-off-by: Michael Armbrust <[email protected]>

commit c7eb9ee2ccd93211c9ec125fd2baae267b35d3d4
Author: Michael Armbrust <[email protected]>
Date:   2015-02-12T23:19:19Z

    [SPARK-5573][SQL] Add explode to dataframes
    
    Author: Michael Armbrust <[email protected]>
    
    Closes #4546 from marmbrus/explode and squashes the following commits:
    
    eefd33a [Michael Armbrust] whitespace
    a8d496c [Michael Armbrust] Merge remote-tracking branch 'apache/master' 
into explode
    4af740e [Michael Armbrust] Merge remote-tracking branch 'origin/master' 
into explode
    dc86a5c [Michael Armbrust] simple version
    d633d01 [Michael Armbrust] add scala specific
    950707a [Michael Armbrust] fix comments
    ba8854c [Michael Armbrust] [SPARK-5573][SQL] Add explode to dataframes
    
    (cherry picked from commit ee04a8b19be8330bfc48f470ef365622162c915f)
    Signed-off-by: Michael Armbrust <[email protected]>

commit f7103b3437363bd81e4f4cfa282229019fcdcdad
Author: Daoyuan Wang <[email protected]>
Date:   2015-02-12T23:22:07Z

    [SPARK-5755] [SQL] remove unnecessary Add
    
        explain extended select +key from src;
    before:
    == Parsed Logical Plan ==
    'Project [(0 + 'key) AS _c0#8]
     'UnresolvedRelation [src], None
    
    == Analyzed Logical Plan ==
    Project [(0 + key#10) AS _c0#8]
     MetastoreRelation test, src, None
    
    == Optimized Logical Plan ==
    Project [(0 + key#10) AS _c0#8]
     MetastoreRelation test, src, None
    
    == Physical Plan ==
    Project [(0 + key#10) AS _c0#8]
     HiveTableScan [key#10], (MetastoreRelation test, src, None), None
    
    after this patch:
    == Parsed Logical Plan ==
    'Project ['key]
     'UnresolvedRelation [src], None
    
    == Analyzed Logical Plan ==
    Project [key#10]
     MetastoreRelation test, src, None
    
    == Optimized Logical Plan ==
    Project [key#10]
     MetastoreRelation test, src, None
    
    == Physical Plan ==
    HiveTableScan [key#10], (MetastoreRelation test, src, None), None
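
The difference between the two sets of plans can be mimicked with a tiny expression builder. The tuple-based tree and both function names are hypothetical; the sketch only shows that a unary plus needs no `Add` node.

```python
def parse_unary(sign, operand):
    """Build a tiny expression tree for a signed operand.

    Expressions are tuples such as ("attr", "key") or ("add", left, right).
    """
    if sign == "+":
        return operand  # unary plus is a no-op, so no Add node is needed
    if sign == "-":
        return ("negate", operand)
    raise ValueError(f"unsupported sign: {sign!r}")


def parse_unary_old(sign, operand):
    """The 'before' behaviour from the plans above: +x becomes 0 + x."""
    if sign == "+":
        return ("add", ("literal", 0), operand)
    return parse_unary(sign, operand)


key = ("attr", "key")
before = parse_unary_old("+", key)
after = parse_unary("+", key)
```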
    
    Author: Daoyuan Wang <[email protected]>
    
    Closes #4551 from adrian-wang/positive and squashes the following commits:
    
    0821ae4 [Daoyuan Wang] remove unnecessary Add
    
    (cherry picked from commit d5fc51491808630d0328a5937dbf349e00de361f)
    Signed-off-by: Michael Armbrust <[email protected]>

commit 5c9db4e756b9f5567d9510e614ac399fbd68246a
Author: Vladimir Grigor <[email protected]>
Date:   2015-02-12T23:26:24Z

    [SPARK-5335] Fix deletion of security groups within a VPC
    
    Please see https://issues.apache.org/jira/browse/SPARK-5335.
    
    The fix itself is in commit e58a8b01a8bedcbfbbc6d04b1c1489255865cf87. The two earlier commits fix another VPC-related bug and are waiting to be merged. I should have created the earlier bug fix in its own branch so that this fix would not include it. :(
    
    This code is released under the project's license.
    
    Author: Vladimir Grigor <[email protected]>
    Author: Vladimir Grigor <[email protected]>
    
    Closes #4122 from voukka/SPARK-5335_delete_sg_vpc and squashes the 
following commits:
    
    090dca9 [Vladimir Grigor] fixes as per review: removed printing of group_id 
and added comment
    730ec05 [Vladimir Grigor] fix for SPARK-5335: Destroying cluster in VPC 
with "--delete-groups" fails to remove security groups
    
    (cherry picked from commit ada993e954e2825c0fe13326fc23b0e1a567cd55)
    Signed-off-by: Sean Owen <[email protected]>

commit 925fd84a1d25fd453af8ad457569dddb54938388
Author: Yin Huai <[email protected]>
Date:   2015-02-12T23:32:17Z

    [SQL] Move SaveMode to SQL package.
    
    Author: Yin Huai <[email protected]>
    
    Closes #4542 from yhuai/moveSaveMode and squashes the following commits:
    
    65a4425 [Yin Huai] Move SaveMode to sql package.
    
    (cherry picked from commit c025a468826e9b9f62032e207daa9d42d9dba3ca)
    Signed-off-by: Michael Armbrust <[email protected]>

commit edbac178d186f6936408b211385a5fea9e4f4603
Author: Yin Huai <[email protected]>
Date:   2015-02-13T02:08:01Z

    [SPARK-3299][SQL]Public API in SQLContext to list tables
    
    https://issues.apache.org/jira/browse/SPARK-3299
    
    Author: Yin Huai <[email protected]>
    
    Closes #4547 from yhuai/tables and squashes the following commits:
    
    6c8f92e [Yin Huai] Add tableNames.
    acbb281 [Yin Huai] Update Python test.
    7793dcb [Yin Huai] Fix scala test.
    572870d [Yin Huai] Address comments.
    aba2e88 [Yin Huai] Format.
    12c86df [Yin Huai] Add tables() to SQLContext to return a DataFrame 
containing existing tables.
    
    (cherry picked from commit 1d0596a16e1d3add2631f5d8169aeec2876a1362)
    Signed-off-by: Michael Armbrust <[email protected]>

commit b9f332ab680f671a368a8411679bb4c52d495486
Author: tianyi <[email protected]>
Date:   2015-02-13T06:18:39Z

    [SPARK-3365][SQL]Wrong schema generated for List type
    
    This PR fixes SPARK-3365.
    The cause is that Spark generated the wrong schema for the type `List` in `ScalaReflection.scala`. For example:
    
    the generated schema for type `Seq[String]` is:
    ```
    {"name":"x","type":{"type":"array","elementType":"string","containsNull":true},"nullable":true,"metadata":{}}
    ```
    
    the generated schema for type `List[String]` is:
    ```
    {"name":"x","type":{"type":"struct","fields":[]},"nullable":true,"metadata":{}}
    ```
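
The commit below describes the fix as changing the order of resolution. A Python analogy of why order matters when one type satisfies two cases is sketched here. The classes `Product`, `Seq`, and `ConsList` are hypothetical stand-ins, and the assumption that Scala's `List` is both a sequence and a `Product` is mine, not stated in the source.

```python
class Product:
    """Hypothetical marker for record-like types."""


class Seq:
    """Hypothetical marker for sequence types."""


class ConsList(Seq, Product):
    """A list type that is both a sequence and a record."""


def schema_product_first(tpe):
    # Problematic order: the list is treated as a record with no fields.
    if issubclass(tpe, Product):
        return {"type": "struct", "fields": []}
    if issubclass(tpe, Seq):
        return {"type": "array"}
    raise TypeError(tpe)


def schema_seq_first(tpe):
    # Reordered: the sequence case is tried before the record case.
    if issubclass(tpe, Seq):
        return {"type": "array"}
    if issubclass(tpe, Product):
        return {"type": "struct", "fields": []}
    raise TypeError(tpe)


wrong = schema_product_first(ConsList)
right = schema_seq_first(ConsList)
```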
    
    Author: tianyi <[email protected]>
    
    Closes #4581 from tianyi/SPARK-3365 and squashes the following commits:
    
    a097e86 [tianyi] change the order of resolution in ScalaReflection.scala
    
    (cherry picked from commit 1c8633f3fe9d814c83384e339b958740c250c00c)
    Signed-off-by: Cheng Lian <[email protected]>

commit a8f560c4e8c0b77b929f2564ac60fd558e62d72e
Author: Yin Huai <[email protected]>
Date:   2015-02-13T04:37:55Z

    [SQL] Fix docs of SQLContext.tables
    
    Author: Yin Huai <[email protected]>
    
    Closes #4579 from yhuai/tablesDoc and squashes the following commits:
    
    7f8964c [Yin Huai] Fix doc.
    
    (cherry picked from commit 2aea892ebd4d6c802defeef35ef7ebfe42c06eba)
    Signed-off-by: Cheng Lian <[email protected]>

commit 1255e83f841b59fd3c52fff3e6a733b8132c8d30
Author: WangTaoTheTonic <[email protected]>
Date:   2015-02-13T10:27:23Z

    [SPARK-4832][Deploy]some other processes might take the daemon pid
    
    Another process might be using the pid saved in the pid file. In that case we should ignore it and launch the daemon.
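
The decision logic can be sketched as a small Python function. The real change is in a shell script; how it identifies the owning process is not stated here, so the `running` mapping and the `daemon_marker` substring check are assumptions made for illustration.

```python
def should_launch(pid_from_file, running, daemon_marker):
    """Return True if the daemon should be (re)launched.

    running maps live pids to their command lines (hypothetical input).
    """
    command = running.get(pid_from_file)
    if command is None:
        return True  # no such process: the pid file is stale
    # The pid is alive; launch only if it belongs to some other program.
    return daemon_marker not in command


running = {
    4242: "java org.apache.spark.deploy.worker.Worker",
    5151: "vim notes.txt",
}

already_up = should_launch(4242, running, "spark.deploy")
pid_reused = should_launch(5151, running, "spark.deploy")
stale_file = should_launch(9999, running, "spark.deploy")
```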
    
    JIRA is down for maintenance. I will file an issue once it returns.
    
    Author: WangTaoTheTonic <[email protected]>
    Author: WangTaoTheTonic <[email protected]>
    
    Closes #3683 from WangTaoTheTonic/otherproc and squashes the following 
commits:
    
    daa86a1 [WangTaoTheTonic] some bash style fix
    8befee7 [WangTaoTheTonic] handle the mistake scenario
    cf4ecc6 [WangTaoTheTonic] remove redundant condition
    f36cfb4 [WangTaoTheTonic] some other processes might take the pid
    
    (cherry picked from commit 1768bd51438670c493ca3ca02988aee3ae31e87e)
    Signed-off-by: Sean Owen <[email protected]>

commit 5c883df09eae5c9a8a0f54c265096a9e10a17fa5
Author: uncleGen <[email protected]>
Date:   2015-02-13T17:43:10Z

    [SPARK-5732][CORE]:Add an option to print the spark version in spark script.
    
    It would be natural to add an option to print the Spark version in the spark scripts. This is common in script tools.
    
![9](https://cloud.githubusercontent.com/assets/7402327/6183331/cab1b74e-b38e-11e4-9daa-e26e6015cff3.JPG)
    
    Author: uncleGen <[email protected]>
    Author: genmao.ygm <[email protected]>
    
    Closes #4522 from uncleGen/master-clean-150211 and squashes the following 
commits:
    
    9f2127c [genmao.ygm] revert the behavior of "-v"
    015ddee [uncleGen] minor changes
    463f02c [uncleGen] minor changes
    
    (cherry picked from commit c0ccd2564182695ea5771524840bf1a99d5aa842)
    Signed-off-by: Andrew Or <[email protected]>

commit 5e639422207a113eee4ea3796c221004664ede1a
Author: sboeschhuawei <[email protected]>
Date:   2015-02-13T17:45:57Z

    [SPARK-5503][MLLIB] Example code for Power Iteration Clustering
    
    Author: sboeschhuawei <[email protected]>
    
    Closes #4495 from javadba/picexamples and squashes the following commits:
    
    3c84b14 [sboeschhuawei] PIC Examples updates from Xiangrui's comments round 
5
    2878675 [sboeschhuawei] Fourth round with xiangrui on PICExample
    d7ac350 [sboeschhuawei] Updates to PICExample from Xiangrui's comments 
round 3
    d7f0cba [sboeschhuawei] Updates to PICExample from Xiangrui's comments 
round 3
    cef28f4 [sboeschhuawei] Further updates to PICExample from Xiangrui's 
comments
    f7ff43d [sboeschhuawei] Update to PICExample from Xiangrui's comments
    efeec45 [sboeschhuawei] Update to PICExample from Xiangrui's comments
    03e8de4 [sboeschhuawei] Added PICExample
    c509130 [sboeschhuawei] placeholder for pic examples
    5864d4a [sboeschhuawei] placeholder for pic examples
    
    (cherry picked from commit e1a1ff8108463ca79299ec0eb555a0c8db9dffa0)
    Signed-off-by: Xiangrui Meng <[email protected]>

commit e5690a502f04ab948cc8f8f7fd04be55498ea0cc
Author: Ryan Williams <[email protected]>
Date:   2015-02-13T17:47:26Z

    [SPARK-5783] Better eventlog-parsing error messages
    
    Author: Ryan Williams <[email protected]>
    
    Closes #4573 from ryan-williams/history and squashes the following commits:
    
    a8647ec [Ryan Williams] fix test calls to .replay()
    98aa3fe [Ryan Williams] include filename in history-parsing error message
    8deecf0 [Ryan Williams] add line number to history-parsing error message
    b668b52 [Ryan Williams] add log info line to history-eventlog parsing
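
Including the file name and line number in a parse error, as the commits above describe, might look like this in Python. `replay` here is a hypothetical function over JSON lines, not the Spark method of the same name.

```python
import json


def replay(lines, source_name):
    """Parse one JSON event per line, reporting where parsing fails."""
    events = []
    for line_number, line in enumerate(lines, start=1):
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(
                f"Malformed event at {source_name}:{line_number}: {exc.msg}"
            ) from exc
    return events


good = replay(['{"Event": "start"}'], "app-1.log")

try:
    replay(['{"Event": "start"}', '{"Event": '], "app-1.log")
except ValueError as err:
    error_message = str(err)
```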

commit cc9eec1a076624628d3d582e7c679f0861ecb39c
Author: Josh Rosen <[email protected]>
Date:   2015-02-13T17:53:57Z

    [SPARK-5735] Replace uses of EasyMock with Mockito
    
    This patch replaces all uses of EasyMock with Mockito.  There are two 
motivations for this:
    
    1. We should use a single mocking framework in our tests in order to keep 
things consistent.
    2. EasyMock may be responsible for non-deterministic unit test failures due to its Objenesis dependency (see SPARK-5626 for more details).
    
    Most of these changes are fairly mechanical translations of EasyMock code to Mockito, although I made a small change that strengthens the assertions in one test in KinesisReceiverSuite.
    
    Author: Josh Rosen <[email protected]>
    
    Closes #4578 from JoshRosen/SPARK-5735-remove-easymock and squashes the 
following commits:
    
    0ab192b [Josh Rosen] Import sorting plus two minor changes to more closely 
match old semantics.
    977565b [Josh Rosen] Remove EasyMock from build.
    fae1d8f [Josh Rosen] Remove EasyMock usage in KinesisReceiverSuite.
    7cca486 [Josh Rosen] Remove EasyMock usage in MesosSchedulerBackendSuite
    fc5e94d [Josh Rosen] Remove EasyMock in CacheManagerSuite
    
    (cherry picked from commit 077eec2d9dba197f51004ee4a322d0fa71424ea0)
    Signed-off-by: Andrew Or <[email protected]>

commit ad731897bc7e33ecc37340614f9b9b300ab3d982
Author: Emre Sevinç <[email protected]>
Date:   2015-02-13T20:31:27Z

    SPARK-5805 Fixed the type error in documentation.
    
    Fixes SPARK-5805 : Fix the type error in the final example given in MLlib - 
Clustering documentation.
    
    Author: Emre Sevinç <[email protected]>
    
    Closes #4596 from emres/SPARK-5805 and squashes the following commits:
    
    1029f66 [Emre Sevinç] SPARK-5805 Fixed the type error in documentation.
    
    (cherry picked from commit 9f31db061019414a964aac432e946eac61f8307c)
    Signed-off-by: Xiangrui Meng <[email protected]>

commit 41603717abe3fd46b14049c15301b81a3a614989
Author: Andrew Or <[email protected]>
Date:   2015-02-13T21:10:29Z

    [HOTFIX] Fix build break in MesosSchedulerBackendSuite

commit efffc2e428b1e867a586749685da90875f6bcfc4
Author: Daoyuan Wang <[email protected]>
Date:   2015-02-13T21:46:50Z

    [SPARK-5642] [SQL] Apply column pruning on unused aggregation fields
    
    select k from (select key k, max(value) v from src group by k) t
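
For the query above, the outer query only reads `k`, so `max(value)` never needs to be computed. A toy version of that pruning step, with a hypothetical `prune_aggregate` helper and plain strings in place of Catalyst expressions:

```python
def prune_aggregate(aggregate_exprs, referenced):
    """Keep only the aggregate outputs that the parent operator reads.

    aggregate_exprs maps output names to expressions; referenced is the
    set of output names used by the enclosing query.
    """
    return {
        name: expr
        for name, expr in aggregate_exprs.items()
        if name in referenced
    }


inner = {"k": "key", "v": "max(value)"}
pruned = prune_aggregate(inner, referenced={"k"})
```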
    
    Author: Daoyuan Wang <[email protected]>
    Author: Michael Armbrust <[email protected]>
    
    Closes #4415 from adrian-wang/groupprune and squashes the following commits:
    
    5d2d8a3 [Daoyuan Wang] address Michael's comments
    61f8ef7 [Daoyuan Wang] add a unit test
    80ddcc6 [Daoyuan Wang] keep project
    b69d385 [Daoyuan Wang] add a prune rule for grouping set
    
    (cherry picked from commit 2cbb3e433ae334d5c318f05b987af314c854fbcc)
    Signed-off-by: Michael Armbrust <[email protected]>

commit d9d0250fc5dfe529bebd4f67f945f4d7c3fc4106
Author: Yin Huai <[email protected]>
Date:   2015-02-13T21:51:06Z

    [SPARK-5789][SQL]Throw a better error message if JsonRDD.parseJson 
encounters unrecoverable parsing errors.
    
    Author: Yin Huai <[email protected]>
    
    Closes #4582 from yhuai/jsonErrorMessage and squashes the following commits:
    
    152dbd4 [Yin Huai] Update error message.
    1466256 [Yin Huai] Throw a better error message when a JSON object in the input dataset spans multiple records (lines for files or strings for an RDD of strings).
    
    (cherry picked from commit 2e0c084528409e1c565e6945521a33c0835ebbee)
    Signed-off-by: Michael Armbrust <[email protected]>

commit 965876328d037f2a817f8c6bf5df0b3071abb43a
Author: Xiangrui Meng <[email protected]>
Date:   2015-02-13T23:09:27Z

    [SPARK-5806] re-organize sections in mllib-clustering.md
    
    Put example code close to the algorithm description.
    
    Author: Xiangrui Meng <[email protected]>
    
    Closes #4598 from mengxr/SPARK-5806 and squashes the following commits:
    
    a137872 [Xiangrui Meng] re-organize sections in mllib-clustering.md
    
    (cherry picked from commit cc56c8729a76af85aa6eb5d2f99787cca5e5b38f)
    Signed-off-by: Xiangrui Meng <[email protected]>

commit 356b798b3878bac1f89304e0be0f698f9eed6ec0
Author: Xiangrui Meng <[email protected]>
Date:   2015-02-14T00:43:49Z

    [SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays
    
    because ArrayBuffer is not specialized.
    
    Author: Xiangrui Meng <[email protected]>
    
    Closes #4594 from mengxr/SPARK-5803 and squashes the following commits:
    
    1261bd5 [Xiangrui Meng] merge master
    a4ea872 [Xiangrui Meng] use ArrayBuilder to build primitive arrays
    
    (cherry picked from commit d50a91d529b0913364b483c511397d4af308a435)
    Signed-off-by: Xiangrui Meng <[email protected]>

commit fccd38d2e08fb3502440a942a6958af5aada539b
Author: Xiangrui Meng <[email protected]>
Date:   2015-02-14T00:45:59Z

    [SPARK-5730][ML] add doc groups to spark.ml components
    
    This PR adds three groups to the ScalaDoc: `param`, `setParam`, and 
`getParam`. Params will show up in the generated Scala API doc as the top 
group. Setters/getters will be at the bottom.
    
    Preview:
    
    ![screen shot 2015-02-13 at 2 47 49 
pm](https://cloud.githubusercontent.com/assets/829644/6196657/5740c240-b38f-11e4-94bb-bd8ef5a796c5.png)
    
    Author: Xiangrui Meng <[email protected]>
    
    Closes #4600 from mengxr/SPARK-5730 and squashes the following commits:
    
    febed9a [Xiangrui Meng] add doc groups to spark.ml components
    
    (cherry picked from commit 4f4c6d5a5db04a56906bacdc85d7e5589b6edada)
    Signed-off-by: Xiangrui Meng <[email protected]>

commit 152147f5f884ae4eea3873f01719e6ab9bc7afd2
Author: Josh Rosen <[email protected]>
Date:   2015-02-14T01:45:31Z

    [SPARK-5227] [SPARK-5679] Disable FileSystem cache in 
WholeTextFileRecordReaderSuite
    
    This patch fixes two difficult-to-reproduce Jenkins test failures in 
InputOutputMetricsSuite (SPARK-5227 and SPARK-5679).  The problem was that 
WholeTextFileRecordReaderSuite modifies the `fs.local.block.size` Hadoop 
configuration and this change was affecting subsequent test suites due to 
Hadoop's caching of FileSystem instances (see HADOOP-8490 for more details).
    
    The fix implemented here is to disable FileSystem caching in 
WholeTextFileRecordReaderSuite.
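
The leak described above can be reproduced in miniature with a cache keyed only by scheme. Everything in this sketch is hypothetical (`ToyFileSystem`, `get_file_system`, and the `block.size` and `disable.cache` keys); it models the caching behaviour, not Hadoop's API.

```python
_CACHE = {}


class ToyFileSystem:
    """Hypothetical file system that captures its config at creation."""

    def __init__(self, conf):
        self.block_size = conf.get("block.size", 64)


def get_file_system(scheme, conf):
    """Return a file system for scheme, cached unless caching is disabled."""
    if conf.get("disable.cache", False):
        return ToyFileSystem(conf)
    if scheme not in _CACHE:  # note: the cache key ignores conf
        _CACHE[scheme] = ToyFileSystem(conf)
    return _CACHE[scheme]


# Suite A customises the block size; with caching, suite B inherits it.
leaky_a = get_file_system("file", {"block.size": 32})
leaky_b = get_file_system("file", {})

# With caching disabled in suite A, suite B gets a fresh default instance.
_CACHE.clear()
isolated_a = get_file_system("file", {"block.size": 32, "disable.cache": True})
isolated_b = get_file_system("file", {})
```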
    
    Author: Josh Rosen <[email protected]>
    
    Closes #4599 from JoshRosen/inputoutputsuite-fix and squashes the following 
commits:
    
    47dc447 [Josh Rosen] [SPARK-5227] [SPARK-5679] Disable FileSystem cache in 
WholeTextFileRecordReaderSuite
    
    (cherry picked from commit d06d5ee9b33505774ef1e5becc01b47492f1a2dc)
    Signed-off-by: Patrick Wendell <[email protected]>

commit db5747921a648c3f7cf1de6dba70b82584afd097
Author: Sean Owen <[email protected]>
Date:   2015-02-14T04:12:52Z

    SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus
    
    This just unpersist()s each RDD in this code that was cache()ed.
    
    Author: Sean Owen <[email protected]>
    
    Closes #4234 from srowen/SPARK-3290 and squashes the following commits:
    
    66c1e11 [Sean Owen] unpersist() each RDD that was cache()ed
    
    (cherry picked from commit 0ce4e430a81532dc317136f968f28742e087d840)
    Signed-off-by: Ankur Dave <[email protected]>

commit ba91bf5f4f048a721d97eb5779957ec39b15319f
Author: Reynold Xin <[email protected]>
Date:   2015-02-14T07:03:22Z

    [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
    
    - The old implicit would convert RDDs directly to DataFrames, and that 
added too many methods.
    - toDataFrame -> toDF
    - Dsl -> functions
    - implicits moved into SQLContext.implicits
    - addColumn -> withColumn
    - renameColumn -> withColumnRenamed
    
    Python changes:
    - toDataFrame -> toDF
    - Dsl -> functions package
    - addColumn -> withColumn
    - renameColumn -> withColumnRenamed
    - add toDF functions to RDD on SQLContext init
    - add flatMap to DataFrame
    
    Author: Reynold Xin <[email protected]>
    Author: Davies Liu <[email protected]>
    
    Closes #4556 from rxin/SPARK-5752 and squashes the following commits:
    
    5ef9910 [Reynold Xin] More fix
    61d3fca [Reynold Xin] Merge branch 'df5' of github.com:davies/spark into 
SPARK-5752
    ff5832c [Reynold Xin] Fix python
    749c675 [Reynold Xin] count(*) fixes.
    5806df0 [Reynold Xin] Fix build break again.
    d941f3d [Reynold Xin] Fixed explode compilation break.
    fe1267a [Davies Liu] flatMap
    c4afb8e [Reynold Xin] style
    d9de47f [Davies Liu] add comment
    b783994 [Davies Liu] add comment for toDF
    e2154e5 [Davies Liu] schema() -> schema
    3a1004f [Davies Liu] Dsl -> functions, toDF()
    fb256af [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits 
moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> 
withColumnRenamed
    0dd74eb [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs 
directly to DataFrames
    97dd47c [Davies Liu] fix mistake
    6168f74 [Davies Liu] fix test
    1fc0199 [Davies Liu] fix test
    a075cd5 [Davies Liu] clean up, toPandas
    663d314 [Davies Liu] add test for agg('*')
    9e214d5 [Reynold Xin] count(*) fixes.
    1ed7136 [Reynold Xin] Fix build break again.
    921b2e3 [Reynold Xin] Fixed explode compilation break.
    14698d4 [Davies Liu] flatMap
    ba3e12d [Reynold Xin] style
    d08c92d [Davies Liu] add comment
    5c8b524 [Davies Liu] add comment for toDF
    a4e5e66 [Davies Liu] schema() -> schema
    d377fc9 [Davies Liu] Dsl -> functions, toDF()
    6b3086c [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits 
moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> 
withColumnRenamed
    807e8b1 [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs 
directly to DataFrames
    
    (cherry picked from commit e98dfe627c5d0201464cdd0f363f391ea84c389a)
    Signed-off-by: Reynold Xin <[email protected]>

commit e99e170c7bff95a102b3bf00cc31bfa81951d0cf
Author: gasparms <[email protected]>
Date:   2015-02-14T20:10:29Z

    [SPARK-5800] Streaming Docs. Change linked files according to the selected language
    
    Currently, after the updateStateByKey explanation, the Spark Streaming Programming Guide links to the file stateful_network_wordcount.py and shows the note "For the complete Scala code ..." regardless of which language tab is selected. This is inconsistent.
    
    I've changed the guide so that each language tab links to its own example file. The JavaStatefulNetworkWordCount.java example did not exist, so I added it to this commit.
    
    Author: gasparms <[email protected]>
    
    Closes #4589 from gasparms/feature/streaming-guide and squashes the 
following commits:
    
    7f37f89 [gasparms] More style changes
    ec202b0 [gasparms] Follow spark style guide
    f527328 [gasparms] Improve example to look like scala example
    4d8785c [gasparms] Remove throw exception
    e92e6b8 [gasparms] Fix incoherence
    92db405 [gasparms] Fix Streaming Programming Guide. Change files according to the selected language

----


