GitHub user shankervalipireddy opened a pull request:
https://github.com/apache/spark/pull/5021
[SPARK-1301][WebUI] Add UI elements to collapse "Aggregated Metrics by
Executor" pane on stage page
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-1.3
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/5021.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5021
----
commit efffc2e428b1e867a586749685da90875f6bcfc4
Author: Daoyuan Wang <[email protected]>
Date: 2015-02-13T21:46:50Z
[SPARK-5642] [SQL] Apply column pruning on unused aggregation fields
select k from (select key k, max(value) v from src group by k) t
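A hedged sketch of the effect (assumes a `SQLContext` named `sqlContext` with the `src` table registered; the pruning rule itself lives in the Catalyst optimizer, not user code):
```scala
// Only `k` survives the outer projection, so after pruning the optimizer
// can drop the unused aggregate max(value) and avoid scanning `value`.
val df = sqlContext.sql(
  "select k from (select key k, max(value) v from src group by k) t")
df.explain(true)  // the optimized plan should no longer compute max(value)
```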
Author: Daoyuan Wang <[email protected]>
Author: Michael Armbrust <[email protected]>
Closes #4415 from adrian-wang/groupprune and squashes the following commits:
5d2d8a3 [Daoyuan Wang] address Michael's comments
61f8ef7 [Daoyuan Wang] add a unit test
80ddcc6 [Daoyuan Wang] keep project
b69d385 [Daoyuan Wang] add a prune rule for grouping set
(cherry picked from commit 2cbb3e433ae334d5c318f05b987af314c854fbcc)
Signed-off-by: Michael Armbrust <[email protected]>
commit d9d0250fc5dfe529bebd4f67f945f4d7c3fc4106
Author: Yin Huai <[email protected]>
Date: 2015-02-13T21:51:06Z
[SPARK-5789][SQL]Throw a better error message if JsonRDD.parseJson
encounters unrecoverable parsing errors.
Author: Yin Huai <[email protected]>
Closes #4582 from yhuai/jsonErrorMessage and squashes the following commits:
152dbd4 [Yin Huai] Update error message.
1466256 [Yin Huai] Throw a better error message when a JSON object in the
input dataset spans multiple records (lines for files or strings for an RDD of
strings).
(cherry picked from commit 2e0c084528409e1c565e6945521a33c0835ebbee)
Signed-off-by: Michael Armbrust <[email protected]>
commit 965876328d037f2a817f8c6bf5df0b3071abb43a
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-13T23:09:27Z
[SPARK-5806] re-organize sections in mllib-clustering.md
Put example code close to the algorithm description.
Author: Xiangrui Meng <[email protected]>
Closes #4598 from mengxr/SPARK-5806 and squashes the following commits:
a137872 [Xiangrui Meng] re-organize sections in mllib-clustering.md
(cherry picked from commit cc56c8729a76af85aa6eb5d2f99787cca5e5b38f)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 356b798b3878bac1f89304e0be0f698f9eed6ec0
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-14T00:43:49Z
[SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays
because ArrayBuffer is not specialized.
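For context, a minimal sketch of the difference using plain Scala collections:
```scala
import scala.collection.mutable.ArrayBuilder

// ArrayBuilder writes into an unboxed Array[Double] directly, while
// ArrayBuffer[Double] would box every element because it is not @specialized.
val builder = ArrayBuilder.make[Double]
builder += 1.0
builder += 2.5
val values: Array[Double] = builder.result()
```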
Author: Xiangrui Meng <[email protected]>
Closes #4594 from mengxr/SPARK-5803 and squashes the following commits:
1261bd5 [Xiangrui Meng] merge master
a4ea872 [Xiangrui Meng] use ArrayBuilder to build primitive arrays
(cherry picked from commit d50a91d529b0913364b483c511397d4af308a435)
Signed-off-by: Xiangrui Meng <[email protected]>
commit fccd38d2e08fb3502440a942a6958af5aada539b
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-14T00:45:59Z
[SPARK-5730][ML] add doc groups to spark.ml components
This PR adds three groups to the ScalaDoc: `param`, `setParam`, and
`getParam`. Params will show up in the generated Scala API doc as the top
group. Setters/getters will be at the bottom.
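Roughly, the Scaladoc group tags look like this (the class and param below are illustrative, not from the PR):
```scala
// Illustrative only: how @group tags sort members into ScalaDoc groups.
class MyEstimator {
  /**
   * Regularization parameter.
   * @group param
   */
  var regParam: Double = 0.0

  /** @group setParam */
  def setRegParam(value: Double): this.type = { regParam = value; this }

  /** @group getParam */
  def getRegParam: Double = regParam
}
```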
Preview: (screenshot omitted)
Author: Xiangrui Meng <[email protected]>
Closes #4600 from mengxr/SPARK-5730 and squashes the following commits:
febed9a [Xiangrui Meng] add doc groups to spark.ml components
(cherry picked from commit 4f4c6d5a5db04a56906bacdc85d7e5589b6edada)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 152147f5f884ae4eea3873f01719e6ab9bc7afd2
Author: Josh Rosen <[email protected]>
Date: 2015-02-14T01:45:31Z
[SPARK-5227] [SPARK-5679] Disable FileSystem cache in
WholeTextFileRecordReaderSuite
This patch fixes two difficult-to-reproduce Jenkins test failures in
InputOutputMetricsSuite (SPARK-5227 and SPARK-5679). The problem was that
WholeTextFileRecordReaderSuite modifies the `fs.local.block.size` Hadoop
configuration and this change was affecting subsequent test suites due to
Hadoop's caching of FileSystem instances (see HADOOP-8490 for more details).
The fix implemented here is to disable FileSystem caching in
WholeTextFileRecordReaderSuite.
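For readers unfamiliar with the knob involved, a hedged sketch of disabling the cache for the local scheme (property name follows Hadoop's `fs.<scheme>.impl.disable.cache` convention per HADOOP-8490; the suite's actual code may differ):
```scala
import org.apache.hadoop.conf.Configuration

// Disable FileSystem caching for the local "file" scheme so that a modified
// fs.local.block.size cannot leak into later suites via a cached instance.
val conf = new Configuration()
conf.setBoolean("fs.file.impl.disable.cache", true)
```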
Author: Josh Rosen <[email protected]>
Closes #4599 from JoshRosen/inputoutputsuite-fix and squashes the following
commits:
47dc447 [Josh Rosen] [SPARK-5227] [SPARK-5679] Disable FileSystem cache in
WholeTextFileRecordReaderSuite
(cherry picked from commit d06d5ee9b33505774ef1e5becc01b47492f1a2dc)
Signed-off-by: Patrick Wendell <[email protected]>
commit db5747921a648c3f7cf1de6dba70b82584afd097
Author: Sean Owen <[email protected]>
Date: 2015-02-14T04:12:52Z
SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus
This just unpersist()s each RDD in this code that was cache()ed.
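The pattern, as a minimal sketch (assumes a SparkContext named `sc`):
```scala
// Cache an RDD while it is reused, then explicitly release its storage.
val rdd = sc.parallelize(1 to 1000).cache()
val total = rdd.sum()  // materializes the cached partitions
rdd.unpersist()        // free the storage once it is no longer reused
```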
Author: Sean Owen <[email protected]>
Closes #4234 from srowen/SPARK-3290 and squashes the following commits:
66c1e11 [Sean Owen] unpersist() each RDD that was cache()ed
(cherry picked from commit 0ce4e430a81532dc317136f968f28742e087d840)
Signed-off-by: Ankur Dave <[email protected]>
commit ba91bf5f4f048a721d97eb5779957ec39b15319f
Author: Reynold Xin <[email protected]>
Date: 2015-02-14T07:03:22Z
[SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
- The old implicit would convert RDDs directly to DataFrames, and that
added too many methods.
- toDataFrame -> toDF
- Dsl -> functions
- implicits moved into SQLContext.implicits
- addColumn -> withColumn
- renameColumn -> withColumnRenamed
Python changes:
- toDataFrame -> toDF
- Dsl -> functions package
- addColumn -> withColumn
- renameColumn -> withColumnRenamed
- add toDF functions to RDD on SQLContext init
- add flatMap to DataFrame
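A short sketch of the renamed Scala API (assumes a SparkContext named `sc`; the case class and column names are illustrative):
```scala
// Spark 1.3-style API after this change.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._  // implicits now live on the SQLContext instance

case class Person(name: String, age: Int)
val people = sc.parallelize(Seq(Person("alice", 30))).toDF() // was toDataFrame
val older = people.withColumn("agePlusOne", people("age") + 1) // was addColumn
val renamed = older.withColumnRenamed("agePlusOne", "age2") // was renameColumn
```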
Author: Reynold Xin <[email protected]>
Author: Davies Liu <[email protected]>
Closes #4556 from rxin/SPARK-5752 and squashes the following commits:
5ef9910 [Reynold Xin] More fix
61d3fca [Reynold Xin] Merge branch 'df5' of github.com:davies/spark into
SPARK-5752
ff5832c [Reynold Xin] Fix python
749c675 [Reynold Xin] count(*) fixes.
5806df0 [Reynold Xin] Fix build break again.
d941f3d [Reynold Xin] Fixed explode compilation break.
fe1267a [Davies Liu] flatMap
c4afb8e [Reynold Xin] style
d9de47f [Davies Liu] add comment
b783994 [Davies Liu] add comment for toDF
e2154e5 [Davies Liu] schema() -> schema
3a1004f [Davies Liu] Dsl -> functions, toDF()
fb256af [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits
moved into SQLContext.implicits - addColumn -> withColumn - renameColumn ->
withColumnRenamed
0dd74eb [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs
directly to DataFrames
97dd47c [Davies Liu] fix mistake
6168f74 [Davies Liu] fix test
1fc0199 [Davies Liu] fix test
a075cd5 [Davies Liu] clean up, toPandas
663d314 [Davies Liu] add test for agg('*')
9e214d5 [Reynold Xin] count(*) fixes.
1ed7136 [Reynold Xin] Fix build break again.
921b2e3 [Reynold Xin] Fixed explode compilation break.
14698d4 [Davies Liu] flatMap
ba3e12d [Reynold Xin] style
d08c92d [Davies Liu] add comment
5c8b524 [Davies Liu] add comment for toDF
a4e5e66 [Davies Liu] schema() -> schema
d377fc9 [Davies Liu] Dsl -> functions, toDF()
6b3086c [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits
moved into SQLContext.implicits - addColumn -> withColumn - renameColumn ->
withColumnRenamed
807e8b1 [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs
directly to DataFrames
(cherry picked from commit e98dfe627c5d0201464cdd0f363f391ea84c389a)
Signed-off-by: Reynold Xin <[email protected]>
commit e99e170c7bff95a102b3bf00cc31bfa81951d0cf
Author: gasparms <[email protected]>
Date: 2015-02-14T20:10:29Z
[SPARK-5800] Streaming Docs. Change linked files according to the selected
language
Currently, after the updateStateByKey explanation, the Spark Streaming
Programming Guide links to stateful_network_wordcount.py and notes "For the
complete Scala code ..." regardless of which language tab is selected, which is
inconsistent. I've changed the guide so each tab links to its pertinent example
file. The JavaStatefulNetworkWordCount.java example did not exist, so I added
it in this commit.
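For reference, the updateStateByKey pattern that section documents, as a minimal Scala sketch (mirrors the stateful word count example; `wordDstream` is assumed from the guide):
```scala
// Running count per key: merge newly arrived values into the previous state.
def updateFunction(newValues: Seq[Int], runningCount: Option[Int]): Option[Int] =
  Some(runningCount.getOrElse(0) + newValues.sum)

// wordDstream: DStream[(String, Int)] built earlier in the guide's example.
val stateDstream = wordDstream.updateStateByKey[Int](updateFunction _)
```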
Author: gasparms <[email protected]>
Closes #4589 from gasparms/feature/streaming-guide and squashes the
following commits:
7f37f89 [gasparms] More style changes
ec202b0 [gasparms] Follow spark style guide
f527328 [gasparms] Improve example to look like scala example
4d8785c [gasparms] Remove throw exception
e92e6b8 [gasparms] Fix incoherence
92db405 [gasparms] Fix Streaming Programming Guide. Change files according to
the selected language
commit 1945fcfd9ecbe84e9af7f35ee1d6ba06ac06d8e3
Author: Sean Owen <[email protected]>
Date: 2015-02-14T20:12:29Z
Revise formatting of previous commit
f80e2629bb74bc62960c61ff313f7e7802d61319
commit f87f3b755817aa239ae2efa718f7c1f4569d84bd
Author: gli <[email protected]>
Date: 2015-02-14T20:43:27Z
SPARK-5822 [BUILD] cannot import src/main/scala & src/test/scala into
eclipse as source folder
When importing the whole project into Eclipse as a Maven project, I found
that src/main/scala & src/test/scala could not be set as source folders by
default, so this adds an "add-source" goal to scala-maven-plugin to make that
work.
Author: gli <[email protected]>
Closes #4531 from ligangty/addsource and squashes the following commits:
4e4db4c [gli] [IDE] cannot import src/main/scala & src/test/scala into
eclipse as source folder
(cherry picked from commit ed5f4bb7cb2c934b818d1e8b8b4e6a0056119c80)
Signed-off-by: Sean Owen <[email protected]>
commit 9c1c70d8cc8cf3afedecbc8868b3765c15bd493e
Author: Takeshi Yamamuro <[email protected]>
Date: 2015-02-15T14:42:20Z
[SPARK-5827][SQL] Add missing import in the example of SqlContext
If one tries the example via copy & paste, an exception is thrown.
Author: Takeshi Yamamuro <[email protected]>
Closes #4615 from maropu/AddMissingImportInSqlContext and squashes the
following commits:
ab21b66 [Takeshi Yamamuro] Add missing import in the example of SqlContext
(cherry picked from commit c771e475c449fe07cf45f37bdca2ba6ce9600bfc)
Signed-off-by: Sean Owen <[email protected]>
commit 70ebad4d972101dc2f920ac014cd2359b99a50f9
Author: Reynold Xin <[email protected]>
Date: 2015-02-13T20:43:53Z
[HOTFIX] Ignore DirectKafkaStreamSuite.
commit d96e188c7a2b52cff32814f8e0596f030c14ad21
Author: martinzapletal <[email protected]>
Date: 2015-02-15T17:10:03Z
[MLLIB][SPARK-5502] User guide for isotonic regression
User guide for isotonic regression added to docs/mllib-regression.md
including code examples for Scala and Java.
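A hedged Scala sketch of the documented API (the data is made up; the guide's examples are more complete, and `sc` is an assumed SparkContext):
```scala
import org.apache.spark.mllib.regression.IsotonicRegression

// Training tuples are (label, feature, weight).
val training = sc.parallelize(
  Seq((1.0, 1.0, 1.0), (2.0, 2.0, 1.0), (3.0, 3.0, 1.0)))

val model = new IsotonicRegression().setIsotonic(true).run(training)
model.predict(2.5)  // interpolates between learned boundaries
```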
Author: martinzapletal <[email protected]>
Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following
commits:
67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use
more general language rather than the code/implementation specific terms
80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic
regression, added links to the page, updated data and examples
7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic
regression including examples for Scala and Java
504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic
regression including examples for Scala and Java
(cherry picked from commit 61eb12674b90143388a01c22bf51cb7d02ab0447)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 4e099d757fc1bc4266f7849db6da0e996bf917be
Author: Sean Owen <[email protected]>
Date: 2015-02-15T17:15:48Z
SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed
libgfortran, libgcc code via JBLAS
Exclude libgfortran, libgcc bundled by JBLAS for Windows. This much is
simple, and solves the essential license issue. But the more important question
is whether MLlib works on Windows then.
Author: Sean Owen <[email protected]>
Closes #4453 from srowen/SPARK-5669 and squashes the following commits:
734dd86 [Sean Owen] Exclude libgfortran, libgcc bundled by JBLAS, affecting
Windows / OS X / Linux 32-bit (not Linux 64-bit)
(cherry picked from commit 836577b382695558f5c97d94ee725d0156ebfad2)
Signed-off-by: Xiangrui Meng <[email protected]>
commit d71099133b64a4b9e9ab430cf1b314ee7deaf08d
Author: Xiangrui Meng <[email protected]>
Date: 2015-02-16T04:29:26Z
[SPARK-5769] Set params in constructors and in setParams in Python ML
pipelines
This PR allows Python users to set params in constructors and in setParams,
where we use the `keyword_only` decorator to force keyword arguments. The
trade-off is discussed in the design doc of SPARK-4586.
Generated doc: (link omitted)
CC: davies rxin
Author: Xiangrui Meng <[email protected]>
Closes #4564 from mengxr/py-pipeline-kw and squashes the following commits:
fedf720 [Xiangrui Meng] use toDF
d565f2c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into
py-pipeline-kw
cbc15d3 [Xiangrui Meng] fix style
5032097 [Xiangrui Meng] update pipeline signature
950774e [Xiangrui Meng] simplify keyword_only and update
constructor/setParams signatures
fdde5fc [Xiangrui Meng] fix style
c9384b8 [Xiangrui Meng] fix sphinx doc
8e59180 [Xiangrui Meng] add setParams and make constructors take params,
where we force keyword args
(cherry picked from commit cd4a15366244657c4b7936abe5054754534366f2)
Signed-off-by: Xiangrui Meng <[email protected]>
commit db3c539f20e17e327b2f284bf6fbb3f1abd7fe64
Author: Sean Owen <[email protected]>
Date: 2015-02-16T04:41:27Z
SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from
JBLAS
Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with a
return type that doesn't include DoubleMatrix
CC mengxr
Author: Sean Owen <[email protected]>
Closes #4614 from srowen/SPARK-5815 and squashes the following commits:
288cb05 [Sean Owen] Clarify deprecation plans in scaladoc
497458e [Sean Owen] Deprecate SVDPlusPlus.run and introduce
SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix
(cherry picked from commit acf2558dc92901c342262c35eebb95f2a9b7a9ae)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 9cf7d7088d245b9b41ec78295cd2d6e3e395793d
Author: Peter Rudenko <[email protected]>
Date: 2015-02-16T04:51:32Z
[Ml] SPARK-5796 Don't transform data on the last estimator in a Pipeline
If a stage is the last estimator in a Pipeline, there's no need to transform
the data, since there is no next stage to consume it.
Author: Peter Rudenko <[email protected]>
Closes #4590 from petro-rudenko/patch-1 and squashes the following commits:
d13ec33 [Peter Rudenko] [Ml] SPARK-5796 Don't transform data on a last
estimator in Pipeline
(cherry picked from commit c78a12c4cc4d4312c4ee1069d3b218882d32d678)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 0d932058ed95c2b65dc308fd523cfea6d9b29b16
Author: Peter Rudenko <[email protected]>
Date: 2015-02-16T08:07:23Z
[Ml] SPARK-5804 Explicitly manage cache in Crossvalidator k-fold loop
On a big dataset, explicitly unpersisting the training and validation folds
allows more data to be loaded into memory in the next loop iteration. On my
environment (single node, 8 GB worker RAM, 2 GB dataset file, 3 folds for cross
validation), this saved more than 5 minutes.
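The caching pattern described, as a simplified sketch (`dataset` is an assumed RDD; the real k-fold loop differs):
```scala
// Inside each fold: persist each split while it is in use, release it after,
// so the next fold's splits can fit in memory.
val training = dataset.sample(withReplacement = false, fraction = 0.66).cache()
val validation = dataset.subtract(training).cache()
// ... fit the estimator on `training`, evaluate on `validation` ...
training.unpersist()
validation.unpersist()
```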
Author: Peter Rudenko <[email protected]>
Closes #4595 from petro-rudenko/patch-2 and squashes the following commits:
66a7cfb [Peter Rudenko] Move validationDataset cache to declaration
c5f3265 [Peter Rudenko] [Ml] SPARK-5804 Explicitly manage cache in
Crossvalidator k-fold loop
(cherry picked from commit d51d6ba1547ae75ac76c9e6d8ea99e937eb7d09f)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 066301c65075bce515770d8e70294b3b2f588b96
Author: Cheng Lian <[email protected]>
Date: 2015-02-16T09:33:37Z
[Minor] [SQL] Renames stringRddToDataFrame to stringRddToDataFrameHolder
for consistency
Author: Cheng Lian <[email protected]>
Closes #4613 from liancheng/df-implicit-rename and squashes the following
commits:
db8bdd3 [Cheng Lian] Renames stringRddToDataFrame to
stringRddToDataFrameHolder for consistency
(cherry picked from commit 199a9e80275ac70582ea32f0f2f5a0a15b168785)
Signed-off-by: Cheng Lian <[email protected]>
commit 78f7edb85be5a397c0d1a2f3fd26aa83675cc0b1
Author: Cheng Lian <[email protected]>
Date: 2015-02-16T09:38:31Z
[SPARK-4553] [SPARK-5767] [SQL] Wires Parquet data source with the newly
introduced write support for data source API
This PR migrates the Parquet data source to the new data source write
support API. Now users can also overwrite and append to existing tables. Note
that inserting into partitioned tables is not supported yet.
When the Parquet data source is enabled, insertion into Hive Metastore Parquet
tables is also fulfilled by the Parquet data source. This is done by the newly
introduced `HiveMetastoreCatalog.ParquetConversions` rule, which is a "proper"
implementation of the original hacky `HiveStrategies.ParquetConversion`. The
latter is still preserved, and can be removed together with the old Parquet
support in the future.
TODO:
- [x] Update outdated comments in `newParquet.scala`.
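As a rough usage sketch of the new write path (assuming the Spark 1.3-era `DataFrame.save(path, source, mode)` signature; `df` and the path are illustrative):
```scala
import org.apache.spark.sql.SaveMode

// Append a DataFrame to existing Parquet data through the data source API,
// or replace it entirely.
df.save("/data/events", "parquet", SaveMode.Append)
df.save("/data/events", "parquet", SaveMode.Overwrite)
```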
Author: Cheng Lian <[email protected]>
Closes #4563 from liancheng/parquet-refining and squashes the following
commits:
fa98d27 [Cheng Lian] Fixes test cases which should disable the Parquet data
source
2476e82 [Cheng Lian] Fixes compilation error introduced during rebasing
a83d290 [Cheng Lian] Passes Hive Metastore partitioning information to
ParquetRelation2
(cherry picked from commit 3ce58cf9c0ffe8b867ca79b404fe3fa291cf0e56)
Signed-off-by: Cheng Lian <[email protected]>
commit 0165e9d1324e24571c702b32d8d76edca8808887
Author: Liang-Chi Hsieh <[email protected]>
Date: 2015-02-16T18:06:11Z
[SPARK-5799][SQL] Compute aggregation function on specified numeric columns
Compute aggregation function on specified numeric columns. For example:
```scala
val df = Seq(("a", 1, 0, "b"), ("b", 2, 4, "c"), ("a", 2, 3, "d"))
  .toDataFrame("key", "value1", "value2", "rest")
df.groupBy("key").min("value2")
```
Author: Liang-Chi Hsieh <[email protected]>
Closes #4592 from viirya/specific_cols_agg and squashes the following
commits:
9446896 [Liang-Chi Hsieh] For comments.
314c4cd [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master'
into specific_cols_agg
353fad7 [Liang-Chi Hsieh] For python unit tests.
54ed0c4 [Liang-Chi Hsieh] Address comments.
b079e6b [Liang-Chi Hsieh] Remove duplicate codes.
55100fb [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master'
into specific_cols_agg
880c2ac [Liang-Chi Hsieh] Fix Python style checks.
4c63a01 [Liang-Chi Hsieh] Fix pyspark.
b1a24fc [Liang-Chi Hsieh] Address comments.
2592f29 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master'
into specific_cols_agg
27069c3 [Liang-Chi Hsieh] Combine functions and add varargs annotation.
371a3f7 [Liang-Chi Hsieh] Compute aggregation function on specified numeric
columns.
(cherry picked from commit 5c78be7a515fc2fc92cda0517318e7b5d85762f4)
Signed-off-by: Reynold Xin <[email protected]>
commit fef2267cd4299de412a50b18cfd5e97ea7e7d851
Author: Sean Owen <[email protected]>
Date: 2015-02-16T19:32:31Z
SPARK-5795 [STREAMING] api.java.JavaPairDStream.saveAsNewAPIHadoopFiles may
not be friendly to Java
Revise the JavaPairDStream API declaration on the saveAs Hadoop methods so
they can be called directly as intended.
CC tdas for review
Author: Sean Owen <[email protected]>
Closes #4608 from srowen/SPARK-5795 and squashes the following commits:
36f1ead [Sean Owen] Add code that shows compile problem and fix
036bd27 [Sean Owen] Revise JavaPairDStream API declaration on saveAs Hadoop
methods, to allow it to be called directly as intended.
(cherry picked from commit 8e25373ce72061d3b6a353259ec627606afa4a5f)
Signed-off-by: Sean Owen <[email protected]>
commit 1a8895560f668faed33e99bcb88cafefd64fef03
Author: Cheng Hao <[email protected]>
Date: 2015-02-16T20:21:08Z
[SQL] [Minor] Update the SpecificMutableRow.copy
When profiling Join / Aggregate queries via VisualVM, I noticed lots of
`SpecificMutableRow` objects being created, as well as `MutableValue`
instances. `SpecificMutableRow` is mostly used in data source implementations,
but its `copy` method can be called multiple times in upper modules (e.g. in
Join / aggregation), so creating duplicate instances should be avoided.
Author: Cheng Hao <[email protected]>
Closes #4619 from chenghao-intel/specific_mutable_row and squashes the
following commits:
9300d23 [Cheng Hao] update the SpecificMutableRow.copy
(cherry picked from commit cc552e042896350e21eec9b78593de25006ecc70)
Signed-off-by: Michael Armbrust <[email protected]>
commit c2eaaea9f9f77662a4c9405b2796aa6bd362466e
Author: Daoyuan Wang <[email protected]>
Date: 2015-02-16T20:31:36Z
[SPARK-5824] [SQL] add null format in ctas and set default col comment to
null
Author: Daoyuan Wang <[email protected]>
Closes #4609 from adrian-wang/ctas and squashes the following commits:
0a75d5a [Daoyuan Wang] reorder import
93d1863 [Daoyuan Wang] add null format in ctas and set default col comment
to null
(cherry picked from commit 275a0c08134dea1896eab73a8e017256900fb1db)
Signed-off-by: Michael Armbrust <[email protected]>
commit 63fa123f1c2113caea74a7cf9a7293f256441dc7
Author: Michael Armbrust <[email protected]>
Date: 2015-02-16T20:32:56Z
[SQL] Initial support for reporting location of error in sql string
Author: Michael Armbrust <[email protected]>
Closes #4587 from marmbrus/position and squashes the following commits:
0810052 [Michael Armbrust] fix tests
395c019 [Michael Armbrust] Merge remote-tracking branch 'marmbrus/position'
into position
e155dce [Michael Armbrust] more errors
f3efa51 [Michael Armbrust] Update AnalysisException.scala
d45ff60 [Michael Armbrust] [SQL] Initial support for reporting location of
error in sql string
(cherry picked from commit 104b2c45805ce0a9c86e2823f402de6e9f0aee81)
Signed-off-by: Michael Armbrust <[email protected]>
commit 0368494c502c33c05f806d106ff2042acad91cee
Author: OopsOutOfMemory <[email protected]>
Date: 2015-02-16T20:34:09Z
[SQL] Add fetched row count in SparkSQLCLIDriver
before this change:
```scala
Time taken: 0.619 seconds
```
after this change:
```scala
Time taken: 0.619 seconds, Fetched: 4 row(s)
```
Author: OopsOutOfMemory <[email protected]>
Closes #4604 from OopsOutOfMemory/rowcount and squashes the following
commits:
7252dea [OopsOutOfMemory] add fetched row count
(cherry picked from commit b4d7c7032d755de42951f92d9535287ef6230b9b)
Signed-off-by: Michael Armbrust <[email protected]>
commit 363a9a7d5ad682f828288f792a836c2c0b5e2f89
Author: Cheng Lian <[email protected]>
Date: 2015-02-16T20:48:55Z
[SPARK-5296] [SQL] Add more filter types for data sources API
This PR adds the following filter types for data sources API:
- `IsNull`
- `IsNotNull`
- `Not`
- `And`
- `Or`
The code that converts Catalyst predicate expressions to data source filters
is very similar to the filter conversion logic in `ParquetFilters`, which
converts Catalyst predicates to Parquet filter predicates. This way we can
support nested AND/OR/NOT predicates without changing the current `BaseScan`
type hierarchy.
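A sketch of the kind of predicate a data source can now receive (constructor shapes assumed from the list above; package `org.apache.spark.sql.sources`):
```scala
import org.apache.spark.sql.sources._

// NOT (a IS NULL) AND (b IS NULL OR c IS NOT NULL)
val pushed: Filter = And(
  Not(IsNull("a")),
  Or(IsNull("b"), IsNotNull("c")))
```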
Author: Cheng Lian <[email protected]>
This patch had conflicts when merged, resolved by
Committer: Michael Armbrust <[email protected]>
Closes #4623 from liancheng/more-fiters and squashes the following commits:
1b296f4 [Cheng Lian] Add more filter types for data sources API
commit 864d77e0d23b974943a1875b7372de05b3595bd5
Author: Cheng Lian <[email protected]>
Date: 2015-02-16T20:52:05Z
[SPARK-5833] [SQL] Adds REFRESH TABLE command
Lifts `HiveMetastoreCatalog.refreshTable` to `Catalog`. Adds a `RefreshTable`
command to refresh (possibly cached) metadata in external data source tables.
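Usage, as a minimal sketch (the table name is illustrative; `sqlContext` is an assumed SQLContext):
```scala
// Re-reads possibly cached metadata for an external data source table.
sqlContext.sql("REFRESH TABLE logs")
```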
Author: Cheng Lian <[email protected]>
Closes #4624 from liancheng/refresh-table and squashes the following
commits:
8d1aa4c [Cheng Lian] Adds REFRESH TABLE command
(cherry picked from commit c51ab37faddf4ede23243058dfb388e74a192552)
Signed-off-by: Michael Armbrust <[email protected]>
commit dd977dfed4303825fd2d5da036fcfd53820aefd8
Author: Matt Whelan <[email protected]>
Date: 2015-02-16T22:54:32Z
SPARK-5841: remove DiskBlockManager shutdown hook on stop
After a call to stop, the shutdown hook is redundant, and causes a
memory leak.
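The general JVM pattern involved, as a hedged sketch (not the actual DiskBlockManager code):
```scala
// Register a cleanup hook at startup.
val hook = new Thread(new Runnable {
  override def run(): Unit = { /* delete temporary block files */ }
})
Runtime.getRuntime.addShutdownHook(hook)

// In stop(): cleanup already happened, so deregister the hook rather than
// letting the JVM retain a reference to it (and everything it captures).
Runtime.getRuntime.removeShutdownHook(hook)
```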
Author: Matt Whelan <[email protected]>
Closes #4627 from MattWhelan/SPARK-5841 and squashes the following commits:
d5f5c7f [Matt Whelan] SPARK-5841: remove DiskBlockManager shutdown hook on
stop
(cherry picked from commit bb05982dd25e008fb01684dff1f95d03e7271721)
Signed-off-by: Sean Owen <[email protected]>
----