GitHub user witgo reopened a pull request:
https://github.com/apache/spark/pull/332
[SPARK-1470] remove scalalogging-slf4j dependency
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark remove_scalalogging
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/332.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #332
----
commit dd95abada78b4d0aec97dacda50fdfd74464b073
Author: Reynold Xin <[email protected]>
Date: 2014-07-15T08:46:57Z
[SPARK-2399] Add support for LZ4 compression.
Based on Greg Bowyer's patch from JIRA
https://issues.apache.org/jira/browse/SPARK-2399
Author: Reynold Xin <[email protected]>
Closes #1416 from rxin/lz4 and squashes the following commits:
6c8fefe [Reynold Xin] Fixed typo.
8a14d38 [Reynold Xin] [SPARK-2399] Add support for LZ4 compression.
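As a rough illustration of how a pluggable codec registry can work, here is a Python sketch; the short names and fully-qualified class names follow Spark naming conventions but are assumptions, not taken from this patch:

```python
# Illustrative sketch of a compression codec registry keyed by short name.
# The class names are assumed from Spark conventions, not from this commit.
CODECS = {
    "lzf": "org.apache.spark.io.LZFCompressionCodec",
    "lz4": "org.apache.spark.io.LZ4CompressionCodec",
    "snappy": "org.apache.spark.io.SnappyCompressionCodec",
}

def resolve_codec(short_name, default="snappy"):
    """Map a short codec name (e.g. the value of a config property such as
    spark.io.compression.codec) to a concrete codec class name."""
    return CODECS.get(short_name, CODECS[default])
```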
commit 52beb20f7904e0333198b9b14619366ddf53ab85
Author: DB Tsai <[email protected]>
Date: 2014-07-15T09:14:58Z
[SPARK-2477][MLlib] Using appendBias for adding intercept in
GeneralizedLinearAlgorithm
Instead of using prependOne, as GeneralizedLinearAlgorithm currently does, we
would like to use appendBias in order to 1) keep the indices of the original
training set unchanged, by adding the intercept as the last element of the
vector, and 2) use the same public API for consistently adding the intercept.
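The index-preservation point can be seen in a minimal Python sketch (the function names mirror the commit; plain lists stand in for MLlib vectors):

```python
def prepend_one(features):
    # Old approach: the intercept term goes first, shifting every
    # original feature index up by one.
    return [1.0] + list(features)

def append_bias(features):
    # New approach: the intercept term goes last, so the indices of the
    # original features are unchanged.
    return list(features) + [1.0]
```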
Author: DB Tsai <[email protected]>
Closes #1410 from dbtsai/SPARK-2477_intercept_with_appendBias and squashes
the following commits:
011432c [DB Tsai] From Alpine Data Labs
commit 8f1d4226c285e33d2fb839d3163bb374eb6db0e7
Author: Reynold Xin <[email protected]>
Date: 2014-07-15T09:15:29Z
Update README.md to include a slightly more informative project description.
(cherry picked from commit 401083be9f010f95110a819a49837ecae7d9c4ec)
Signed-off-by: Reynold Xin <[email protected]>
commit 6555618c8f39b4e7da9402c3fd9da7a75bf7794e
Author: Reynold Xin <[email protected]>
Date: 2014-07-15T09:20:01Z
README update: added "for Big Data".
commit 04b01bb101eeaf76c2e7c94c291669f0b2372c9a
Author: Alexander Ulanov <[email protected]>
Date: 2014-07-15T15:40:22Z
[MLLIB] [SPARK-2222] Add multiclass evaluation metrics
Adding two classes:
1) MulticlassMetrics implements various multiclass evaluation metrics
2) MulticlassMetricsSuite implements unit tests for MulticlassMetrics
Author: Alexander Ulanov <[email protected]>
Author: unknown <[email protected]>
Author: Xiangrui Meng <[email protected]>
Closes #1155 from avulanov/master and squashes the following commits:
2eae80f [Alexander Ulanov] Merge pull request #1 from mengxr/avulanov-master
5ebeb08 [Xiangrui Meng] minor updates
79c3555 [Alexander Ulanov] Addressing reviewers comments mengxr
0fa9511 [Alexander Ulanov] Addressing reviewers comments mengxr
f0dadc9 [Alexander Ulanov] Addressing reviewers comments mengxr
4811378 [Alexander Ulanov] Removing println
87fb11f [Alexander Ulanov] Addressing reviewers comments mengxr. Added
confusion matrix
e3db569 [Alexander Ulanov] Addressing reviewers comments mengxr. Added true
positive rate and false positive rate. Test suite code style.
a7e8bf0 [Alexander Ulanov] Addressing reviewers comments mengxr
c3a77ad [Alexander Ulanov] Addressing reviewers comments mengxr
e2c91c3 [Alexander Ulanov] Fixes to multiclass metrics
d5ce981 [unknown] Comments about Double
a5c8ba4 [unknown] Unit tests. Class rename
fcee82d [unknown] Unit tests. Class rename
d535d62 [unknown] Multiclass evaluation
commit cb09e93c1d7ef9c8f0a1abe4e659783c74993a4e
Author: William Benton <[email protected]>
Date: 2014-07-15T16:13:39Z
Reformat multi-line closure argument.
Author: William Benton <[email protected]>
Closes #1419 from willb/reformat-2486 and squashes the following commits:
2676231 [William Benton] Reformat multi-line closure argument.
commit 9dd635eb5df52835b3b7f4f2b9c789da9e813c71
Author: witgo <[email protected]>
Date: 2014-07-15T17:46:17Z
SPARK-2480: Resolve sbt warnings "NOTE: SPARK_YARN is deprecated, please
use -Pyarn flag"
Author: witgo <[email protected]>
Closes #1404 from witgo/run-tests and squashes the following commits:
f703aee [witgo] fix Note: implicit method fromPairDStream is not applicable
here because it comes after the application point and it lacks an explicit
result type
2944f51 [witgo] Remove "NOTE: SPARK_YARN is deprecated, please use -Pyarn
flag"
ef59c70 [witgo] fix Note: implicit method fromPairDStream is not applicable
here because it comes after the application point and it lacks an explicit
result type
6cefee5 [witgo] Remove "NOTE: SPARK_YARN is deprecated, please use -Pyarn
flag"
commit 72ea56da8e383c61c6f18eeefef03b9af00f5158
Author: witgo <[email protected]>
Date: 2014-07-15T18:52:56Z
SPARK-1291: Link the spark UI to RM ui in yarn-client mode
Author: witgo <[email protected]>
Closes #1112 from witgo/SPARK-1291 and squashes the following commits:
6022bcd [witgo] review commit
1fbb925 [witgo] add addAmIpFilter to yarn alpha
210299c [witgo] review commit
1b92a07 [witgo] review commit
6896586 [witgo] Add comments to addWebUIFilter
3e9630b [witgo] review commit
142ee29 [witgo] review commit
1fe7710 [witgo] Link the spark UI to RM ui in yarn-client mode
commit e7ec815d9a2b0f89a56dc7dd3106c31a09492028
Author: Reynold Xin <[email protected]>
Date: 2014-07-15T20:13:33Z
Added LZ4 to compression codec in configuration page.
Author: Reynold Xin <[email protected]>
Closes #1417 from rxin/lz4 and squashes the following commits:
472f6a1 [Reynold Xin] Set the proper default.
9cf0b2f [Reynold Xin] Added LZ4 to compression codec in configuration page.
commit a21f9a7543309320bb2791468243c8f10bc6e81b
Author: Xiangrui Meng <[email protected]>
Date: 2014-07-15T21:00:54Z
[SPARK-2471] remove runtime scope for jets3t
The assembly jar (built by sbt) doesn't include jets3t if we set it to
runtime only, but I don't know whether it was set this way for a particular
reason.
CC: srowen ScrapCodes
Author: Xiangrui Meng <[email protected]>
Closes #1402 from mengxr/jets3t and squashes the following commits:
bfa2d17 [Xiangrui Meng] remove runtime scope for jets3t
commit 0f98ef1a2c9ecf328f6c5918808fa5ca486e8afd
Author: Michael Armbrust <[email protected]>
Date: 2014-07-15T21:01:48Z
[SPARK-2483][SQL] Fix parsing of repeated, nested data access.
Author: Michael Armbrust <[email protected]>
Closes #1411 from marmbrus/nestedRepeated and squashes the following
commits:
044fa09 [Michael Armbrust] Fix parsing of repeated, nested data access.
commit bcd0c30c7eea4c50301cb732c733fdf4d4142060
Author: Michael Armbrust <[email protected]>
Date: 2014-07-15T21:04:01Z
[SQL] Whitelist more Hive tests.
Author: Michael Armbrust <[email protected]>
Closes #1396 from marmbrus/moreTests and squashes the following commits:
6660b60 [Michael Armbrust] Blacklist a test that requires DFS command.
8b6001c [Michael Armbrust] Add golden files.
ccd8f97 [Michael Armbrust] Whitelist more tests.
commit 8af46d58464b96471825ce376c3e11c8b1108c0e
Author: Yin Huai <[email protected]>
Date: 2014-07-15T21:06:45Z
[SPARK-2474][SQL] For a registered table in OverrideCatalog, the Analyzer
failed to resolve references in the format of "tableName.fieldName"
Please refer to JIRA (https://issues.apache.org/jira/browse/SPARK-2474) for
how to reproduce the problem and my understanding of the root cause.
Author: Yin Huai <[email protected]>
Closes #1406 from yhuai/SPARK-2474 and squashes the following commits:
96b1627 [Yin Huai] Merge remote-tracking branch 'upstream/master' into
SPARK-2474
af36d65 [Yin Huai] Fix comment.
be86ba9 [Yin Huai] Correct SQL console settings.
c43ad00 [Yin Huai] Wrap the relation in a Subquery named by the table name
in OverrideCatalog.lookupRelation.
a5c2145 [Yin Huai] Support sql/console.
commit 61de65bc69f9a5fc396b76713193c6415436d452
Author: William Benton <[email protected]>
Date: 2014-07-15T21:11:57Z
SPARK-2407: Added internal implementation of SQL SUBSTR()
This replaces the Hive UDF for SUBSTR(ING) with an implementation in
Catalyst
and adds tests to verify correct operation.
Author: William Benton <[email protected]>
Closes #1359 from willb/internalSqlSubstring and squashes the following
commits:
ccedc47 [William Benton] Fixed too-long line.
a30a037 [William Benton] replace view bounds with implicit parameters
ec35c80 [William Benton] Adds fixes from review:
4f3bfdb [William Benton] Added internal implementation of SQL SUBSTR()
commit 502f90782ad474e2630ed5be4d3c4be7dab09c34
Author: Michael Armbrust <[email protected]>
Date: 2014-07-16T00:56:17Z
[SQL] Attribute equality comparisons should be done by exprId.
Author: Michael Armbrust <[email protected]>
Closes #1414 from marmbrus/exprIdResolution and squashes the following
commits:
97b47bc [Michael Armbrust] Attribute equality comparisons should be done by
exprId.
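A minimal sketch of the idea, assuming a simplified Attribute class (the names are illustrative, not Catalyst's actual API):

```python
import itertools

_next_id = itertools.count()

class Attribute:
    """Simplified sketch: attributes compare by a unique exprId rather than
    by name, so two columns that happen to share a name stay distinct."""
    def __init__(self, name, expr_id=None):
        self.name = name
        self.expr_id = next(_next_id) if expr_id is None else expr_id

    def __eq__(self, other):
        return isinstance(other, Attribute) and self.expr_id == other.expr_id

    def __hash__(self):
        return hash(self.expr_id)
```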
commit c2048a5165b270f5baf2003fdfef7bc6c5875715
Author: Zongheng Yang <[email protected]>
Date: 2014-07-16T00:58:28Z
[SPARK-2498] [SQL] Synchronize on a lock when using scala reflection inside
data type objects.
JIRA ticket: https://issues.apache.org/jira/browse/SPARK-2498
Author: Zongheng Yang <[email protected]>
Closes #1423 from concretevitamin/scala-ref-catalyst and squashes the
following commits:
325a149 [Zongheng Yang] Synchronize on a lock when initializing data type
objects in Catalyst.
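The fix pattern, sketched in Python with threading.Lock standing in for the Scala lock (the cache is an illustrative detail, not part of the patch):

```python
import threading

_reflection_lock = threading.Lock()
_type_cache = {}

def type_info(name, compute):
    """Sketch: serialize a non-thread-safe initialization step (a stand-in
    for Scala runtime reflection) behind one shared lock."""
    with _reflection_lock:
        if name not in _type_cache:
            _type_cache[name] = compute()
        return _type_cache[name]
```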
commit 4576d80a5155c9fbfebe9c36cca06c208bca5bd3
Author: Reynold Xin <[email protected]>
Date: 2014-07-16T01:47:39Z
[SPARK-2469] Use Snappy (instead of LZF) for default shuffle compression
codec
This reduces shuffle compression memory usage by 3x.
Author: Reynold Xin <[email protected]>
Closes #1415 from rxin/snappy and squashes the following commits:
06c1a01 [Reynold Xin] SPARK-2469: Use Snappy (instead of LZF) for default
shuffle compression codec.
commit 9c12de5092312319aa22f24df47a6de0e41a0102
Author: Henry Saputra <[email protected]>
Date: 2014-07-16T04:21:52Z
[SPARK-2500] Move the logInfo for registering BlockManager to
BlockManagerMasterActor.register method
PR for SPARK-2500
Move the logInfo call for BlockManager to BlockManagerMasterActor.register
instead of BlockManagerInfo constructor.
Previously the logInfo call for registering a BlockManager happened in the
BlockManagerInfo constructor. This is confusing because the code could call
"new BlockManagerInfo" without actually registering a BlockManager, which
could mislead anyone reading the log files.
Author: Henry Saputra <[email protected]>
Closes #1424 from
hsaputra/move_registerblockmanager_log_to_registration_method and squashes the
following commits:
3370b4a [Henry Saputra] Move the loginfo for BlockManager to
BlockManagerMasterActor.register instead of BlockManagerInfo constructor.
commit 563acf5edfbfb2fa756a1f0accde0940592663e9
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T04:34:05Z
follow pep8 None should be compared using is or is not
http://legacy.python.org/dev/peps/pep-0008/
## Programming Recommendations
- Comparisons to singletons like None should always be done with is or is
not, never the equality operators.
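The rule matters because `==` can be overridden by a class while `is` always checks identity; a quick demonstration:

```python
class AlwaysEqual:
    # A class whose __eq__ always answers True, which makes `== None`
    # give a misleading result.
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)  # True, even though obj is clearly not None
print(obj is None)  # False: identity comparison cannot be fooled
```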
Author: Ken Takagiwa <[email protected]>
Closes #1422 from giwa/apache_master and squashes the following commits:
7b361f3 [Ken Takagiwa] follow pep8 None should be checked using is or is not
commit 90ca532a0fd95dc85cff8c5722d371e8368b2687
Author: Aaron Staple <[email protected]>
Date: 2014-07-16T04:35:36Z
[SPARK-2314][SQL] Override collect and take in JavaSchemaRDD, forwarding to
SchemaRDD implementations.
Author: Aaron Staple <[email protected]>
Closes #1421 from staple/SPARK-2314 and squashes the following commits:
73e04dc [Aaron Staple] [SPARK-2314] Override collect and take in
JavaSchemaRDD, forwarding to SchemaRDD implementations.
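The forwarding pattern, sketched with illustrative Python classes (not the actual Spark Java API):

```python
class SchemaRDD:
    """Stand-in for the Scala-side implementation holding the real logic."""
    def __init__(self, rows):
        self._rows = rows
    def collect(self):
        return list(self._rows)
    def take(self, n):
        return list(self._rows)[:n]

class JavaSchemaRDD:
    """Stand-in for the Java wrapper: its overrides forward to the
    underlying implementation instead of re-implementing the logic."""
    def __init__(self, srdd):
        self._srdd = srdd
    def collect(self):
        return self._srdd.collect()
    def take(self, n):
        return self._srdd.take(n)
```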
commit 9b38b7c71352bb5e6d359515111ad9ca33299127
Author: Takuya UESHIN <[email protected]>
Date: 2014-07-16T05:35:34Z
[SPARK-2509][SQL] Add optimization for Substring.
Cases where `Substring` has a `null` literal operand could be handled by
`NullPropagation`.
Author: Takuya UESHIN <[email protected]>
Closes #1428 from ueshin/issues/SPARK-2509 and squashes the following
commits:
d9eb85f [Takuya UESHIN] Add Substring cases to NullPropagation.
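The optimization in sketch form: when any operand of a substring expression is a null literal, the whole expression can be folded to null at planning time, without evaluating anything (Python stand-ins, not Catalyst's API; 0-based slicing is a simplification):

```python
NULL = object()  # stand-in for a SQL null literal

def fold_substring(s, pos, length):
    """Sketch of a NullPropagation-style rewrite: null in, null out,
    skipping evaluation of the substring entirely."""
    if s is NULL or pos is NULL or length is NULL:
        return NULL
    return s[pos:pos + length]  # simplified 0-based semantics
```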
commit 632fb3d9a9ebb3d2218385403145d5b89c41c025
Author: Takuya UESHIN <[email protected]>
Date: 2014-07-16T05:43:48Z
[SPARK-2504][SQL] Fix nullability of Substring expression.
This is a follow-up of #1359 with nullability narrowing.
Author: Takuya UESHIN <[email protected]>
Closes #1426 from ueshin/issues/SPARK-2504 and squashes the following
commits:
5157832 [Takuya UESHIN] Remove unnecessary white spaces.
80958ac [Takuya UESHIN] Fix nullability of Substring expression.
commit efc452a16322e8b20b3c4fe1d6847315f928cd2d
Author: Cheng Lian <[email protected]>
Date: 2014-07-16T16:44:51Z
[SPARK-2119][SQL] Improved Parquet performance when reading off S3
JIRA issue: [SPARK-2119](https://issues.apache.org/jira/browse/SPARK-2119)
Essentially this PR fixed three issues to gain much better performance when
reading large Parquet files off S3.
1. When reading the schema, fetching Parquet metadata from a part-file
rather than the `_metadata` file
The `_metadata` file contains metadata of all row groups, and can be
very large if there are many row groups. Since schema information and row group
metadata are coupled within a single Thrift object, we have to read the whole
`_metadata` to fetch the schema. On the other hand, schema is replicated among
footers of all part-files, which are fairly small.
1. Only add the root directory of the Parquet file, rather than all the
part-files, to the input paths
The HDFS API automatically filters out hidden files and underscore
files (`_SUCCESS` & `_metadata`), so there's no need to filter out all the
part-files and add them individually to the input paths. What makes it much
worse is that `FileInputFormat.listStatus()` calls `FileSystem.globStatus()`
on each individual input path sequentially, each call resulting in a blocking
remote S3 HTTP request.
1. Worked around
[PARQUET-16](https://issues.apache.org/jira/browse/PARQUET-16)
Essentially PARQUET-16 is similar to the above issue, and results in lots
of sequential `FileSystem.getFileStatus()` calls, which are further translated
into a bunch of remote S3 HTTP requests.
`FilteringParquetRowInputFormat` should be cleaned up once PARQUET-16 is
fixed.
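The first fix boils down to reading the schema from any visible part-file's footer instead of the large `_metadata` file, since the schema is replicated in every footer. A sketch of that selection (path handling simplified, names illustrative):

```python
def pick_schema_file(paths):
    """Sketch: any non-hidden part-file's footer carries the schema, so
    skip _metadata/_SUCCESS and dotfiles and take the first part-file."""
    for p in paths:
        name = p.rsplit("/", 1)[-1]
        if not name.startswith(("_", ".")):
            return p
    return None
```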
Below is the micro-benchmark result. The dataset used is an S3 Parquet file
consisting of 3,793 partitions, about 110MB per partition on average. The
benchmark was done with a 9-node AWS cluster.
- Creating a Parquet `SchemaRDD` (Parquet schema is fetched)
```scala
val tweets = parquetFile(uri)
```
- Before: 17.80s
- After: 8.61s
- Fetching partition information
```scala
tweets.getPartitions
```
- Before: 700.87s
- After: 21.47s
- Counting the whole file (both steps above are executed altogether)
```scala
parquetFile(uri).count()
```
- Before: ??? (not tested yet)
- After: 53.26s
Author: Cheng Lian <[email protected]>
Closes #1370 from liancheng/faster-parquet and squashes the following
commits:
94a2821 [Cheng Lian] Added comments about schema consistency
d2c4417 [Cheng Lian] Worked around PARQUET-16 to improve Parquet performance
1c0d1b9 [Cheng Lian] Accelerated Parquet schema retrieving
5bd3d29 [Cheng Lian] Fixed Parquet log level
commit 33e64ecacbc44567f9cba2644a30a118653ea5fa
Author: Rui Li <[email protected]>
Date: 2014-07-16T17:23:37Z
SPARK-2277: make TaskScheduler track hosts on rack
Hi mateiz, I've created
[SPARK-2277](https://issues.apache.org/jira/browse/SPARK-2277) to make
TaskScheduler track hosts on each rack. Please help to review, thanks.
Author: Rui Li <[email protected]>
Closes #1212 from lirui-intel/trackHostOnRack and squashes the following
commits:
2b4bd0f [Rui Li] SPARK-2277: refine UT
fbde838 [Rui Li] SPARK-2277: add UT
7bbe658 [Rui Li] SPARK-2277: rename the method
5e4ef62 [Rui Li] SPARK-2277: remove unnecessary import
79ac750 [Rui Li] SPARK-2277: make TaskScheduler track hosts on rack
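One plausible shape for the tracking structure (illustrative Python, not the actual TaskScheduler code):

```python
from collections import defaultdict

class RackTracker:
    """Sketch: maintain a rack -> live hosts index so a scheduler can tell
    whether a rack still has any alive hosts."""
    def __init__(self):
        self.hosts_by_rack = defaultdict(set)

    def add_host(self, host, rack):
        self.hosts_by_rack[rack].add(host)

    def remove_host(self, host, rack):
        self.hosts_by_rack[rack].discard(host)
        if not self.hosts_by_rack[rack]:
            del self.hosts_by_rack[rack]  # drop racks with no live hosts

    def has_hosts_on_rack(self, rack):
        return rack in self.hosts_by_rack
```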
commit efe2a8b1262a371471f52ca7d47dc34789e80558
Author: Reynold Xin <[email protected]>
Date: 2014-07-16T17:44:54Z
Tightening visibility for various Broadcast related classes.
In preparation for SPARK-2521.
Author: Reynold Xin <[email protected]>
Closes #1438 from rxin/broadcast and squashes the following commits:
432f1cc [Reynold Xin] Tightening visibility for various Broadcast related
classes.
commit df95d82da7c76c074fd4064f7c870d55d99e0d8e
Author: Yin Huai <[email protected]>
Date: 2014-07-16T17:53:59Z
[SPARK-2525][SQL] Remove as many compilation warning messages as possible
in Spark SQL
JIRA: https://issues.apache.org/jira/browse/SPARK-2525.
Author: Yin Huai <[email protected]>
Closes #1444 from yhuai/SPARK-2517 and squashes the following commits:
edbac3f [Yin Huai] Removed some compiler type erasure warnings.
commit 1c5739f68510c2336bf6cb3e18aea03d85988bfb
Author: Reynold Xin <[email protected]>
Date: 2014-07-16T17:55:47Z
[SQL] Cleaned up ConstantFolding slightly.
Moved a couple of rules out of NullPropagation and added more comments.
Author: Reynold Xin <[email protected]>
Closes #1430 from rxin/sql-folding-rule and squashes the following commits:
7f9a197 [Reynold Xin] Updated documentation for ConstantFolding.
7f8cf61 [Reynold Xin] [SQL] Cleaned up ConstantFolding slightly.
commit fc7edc9e76f97b25e456ae7b72ef8636656f4f1a
Author: Sandy Ryza <[email protected]>
Date: 2014-07-16T18:07:16Z
SPARK-2519. Eliminate pattern-matching on Tuple2 in performance-critical...
... aggregation code
Author: Sandy Ryza <[email protected]>
Closes #1435 from sryza/sandy-spark-2519 and squashes the following commits:
640706a [Sandy Ryza] SPARK-2519. Eliminate pattern-matching on Tuple2 in
performance-critical aggregation code
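In Scala, writing `{ case (k, v) => ... }` over every element of a hot loop runs a pattern match per element, whereas reading `_1`/`_2` directly does not. A loose Python analogy of the two styles (in Python both are cheap; this only illustrates the structural difference):

```python
def sum_values_unpacked(pairs):
    # Analogous to `pairs.map { case (k, v) => ... }`: destructure each pair.
    total = 0
    for _, v in pairs:
        total += v
    return total

def sum_values_indexed(pairs):
    # Analogous to accessing kv._1 / kv._2 directly, skipping destructuring.
    total = 0
    for kv in pairs:
        total += kv[1]
    return total
```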
commit cc965eea510397642830acb21f61127b68c098d6
Author: Takuya UESHIN <[email protected]>
Date: 2014-07-16T18:13:38Z
[SPARK-2518][SQL] Fix foldability of Substring expression.
This is a follow-up of #1428.
Author: Takuya UESHIN <[email protected]>
Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following
commits:
37d1ace [Takuya UESHIN] Fix foldability of Substring expression.
commit ef48222c10be3d29a83dfc2329f455eba203cd38
Author: Reynold Xin <[email protected]>
Date: 2014-07-16T18:15:07Z
[SPARK-2517] Remove some compiler warnings.
Author: Reynold Xin <[email protected]>
Closes #1433 from rxin/compile-warning and squashes the following commits:
8d0b890 [Reynold Xin] Remove some compiler warnings.
----