[GitHub] flink pull request #5023: [hotfix][docs] Review of concepts docs for grammar...

ChrisChinchilla Thu, 16 Nov 2017 15:27:49 -0800

GitHub user ChrisChinchilla opened a pull request:

    https://github.com/apache/flink/pull/5023


    [hotfix][docs] Review of concepts docs for grammar, formatting etc

    Spending some time doing a brief review of a few docs sections, just a 
startâ¦

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChrisChinchilla/flink concepts-review

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5023.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5023
    
----
commit 77946f1d8772f2c2c095e93d5a76c8df42fbe790
Author: Chris Ward <[email protected]>
Date:   2017-10-02T13:20:50Z

    Polishing grammar and tone for consistency and clarity

commit e5eafba7053eb41f0bcd4e177ed1fdae7c2c3e66
Author: Chris Ward <[email protected]>
Date:   2017-10-05T18:26:21Z

    Clarify support of older APIs

commit 027914715744e673e41893067beec58b52a177ce
Author: Chris Ward <[email protected]>
Date:   2017-10-12T13:44:28Z

    Begin fixes for voicing, passive voice, weasel words etcâ¦

commit 05cb3f04d332886aafc2b5675358aed316f76826
Author: Nico Kruber <[email protected]>
Date:   2017-07-11T09:19:20Z

    [FLINK-7068][blob] change BlobService sub-classes for permanent and 
transient BLOBs
    
    [FLINK-7068][blob] start introducing a new BLOB storage abstraction
    
    This is incomplete and may not compile and/or run tests successfully yet.
    
    [FLINK-7068][blob] remove BlobView from TransientBlobCache
    
    The transient BLOB cache is not supposed to work with the HA store since it 
only
    serves non-HA files.
    
    [FLINK-7068][blob] remove unnecessary use of BlobClient
    
    [FLINK-7068][blob] implement TransientBlobCache#put methods
    
    [FLINK-7068][blob] remove further unnecessary use of BlobClient and adapt 
to HA get/put methods
    
    [FLINK-7068][blob] fix BlobServer#getFileInternal not being guarded by locks
    
    [FLINK-7068][blob] add incoming file cleanup at BlobServer in cases of 
errors
    
    [FLINK-7068] fix missing BlobServer#putHA() jobId propagation
    
    [FLINK-7068][blob] remove BlobClient use from BlobServer{Get|Put}Test
    
    [FLINK-7068][blob] make helper methods work with any BlobService
    
    [FLINK-7068][blob] start adding a BlobCacheGetTest
    
    [FLINK-7068][blob] verify get contents in separate threads
    
    This allows (at a slight chance) that we may see an intermediate file.
    
    [FLINK-7068][blob] better locking granularity during file retrieval
    
    This allows multiple parallel downloads from the HA store to the 
BlobServer's
    local store although only one of these downloaded staging files will 
actually
    be used. In practice, this happens only during recovery and not in parallel
    anyways.
    
    [FLINK-7068][blob] share more code among BlobServer and BlobServerConnection
    
    This also applies the better locking granularity of the previous commit to
    BlobServerConnection.
    
    [FLINK-7068][blob] properly cleanup temporary staging files in all cases
    
    [FLINK-7068][blob] make PermanentBlobCache and TransientBlobCache 
thread-safe
    
    [FLINK-7068][tests] improve various tests
    
    [FLINK-7068][blob] change the signature of the delete calls to return 
success
    
    We will not throw exceptions in case of failures anymore and return whether 
the
    operation was successful instead. Failure details will still be accessible 
in
    the written logs.
    
    [FLINK-7068][tests] extend and adapt BlobServerDeleteTest
    
    [FLINK-7068][tests] adapt further BlobCache tests
    
    [FLINK-7068][tests] adapt BlobClientTest
    
    [FLINK-7068][blob] cleanup BlobClient methods
    
    BlobClient is not supposed to be used by anyone else than the
    BlobServer/BlobCache classes. Most accessors were already package-private, 
now
    remove the ones that just blow up the code.
    
    [FLINK-7068] add a TODO to fix the currently failing tests
    
    [FLINK-7068][tests] add a BlobCacheRecoveryTest
    
    This currently fails due to TransientBlobCache#put also storing files in HA
    store which it should not!
    
    [FLINK-7068][tests] improve failure message
    
    [FLINK-7068][blob] add permanent/transient BLOB modes to BlobClient
    
    This allows a better control of which should end up in HA store and which 
should
    not. Also, during GET methods, we do not check the HA store unnecessarily.
    
    [FLINK-7068][tests] extend the Blob{Server|Cache}GetTest
    
    This adds some failing GET operations and verifies that the files are 
cleaned
    up accordingly.
    
    [FLINK-7068][blob] remove "final" flag from BlobCache class
    
    This re-enables mocking in various unit tests.
    
    [FLINK-7068][tests] fix test relying on order of folder contents
    
    [FLINK-7068][blob] some BlobServer cleanup
    
    [FLINK-7068][hotfix] fix checkstyle errors
    
    [FLINK-7068][tests] fix tests now requiring a more complete BlobCache mock
    
    A suitable BlobCache mock should at least return a mock for a permanent and 
a
    transient BLOB store, so mock(BlobCache.class) is not sufficient anymore.
    
    [FLINK-7068] final wrap up
    
    * remove a left-over TODO
    * remove useless tests for the concurrency of the GET operations (we cannot 
test
    that the file write is guarded by a lock directly - rely on the concurrent
    checks in the individual threads instead)
    * fix some log messages
    
    [FLINK-7068][blob] remove Thread#start() call from BlobServer constructor
    
    This is bad design and limits extensibility, e.g. in tests like the
    BlobCacheRetriesTest where this caused a race condition with the sub-class.
    Instead, the user must now call BlobServer#start() explicitely.
    
    [FLINK-7068][tests] remove unused imports
    
    [FLINK-7068][tests] fix a typo
    
    [FLINK-7068][tests] add some tests that verify behaviour with corrupted 
files
    
    Also add corruption checks for HA-store downloads which was not implemented 
yet.
    
    [FLINK-7068][blob] ensure consistency in PermanentBlobCache even in cases 
of invalid use
    
    During cleanup, no write lock was taken but the storage directory of an
    (unused!) job was deleted. Normally, there should be no process left 
accessing
    its data and no new process can jump in since the registration is locked. In
    case of invalid use cases, i.e. using a job's data outside a register() and
    release() block, this could lead to strange effects.
    By guarding the cleanup with the write lock as well, we circumvent that.
    
    [FLINK-7068][hotfix] remove an unused import

commit c7d4562db8af446e79967215cda597be37261eed
Author: Nico Kruber <[email protected]>
Date:   2017-07-25T11:00:33Z

    [FLINK-7261][blob] extend BlobStore#get/put with boolean return values
    
    This way, using code can distinguish non-HA cases, i.e. VoidBlobStore, from
    HA cases, i.e. FileSystemBlobStore, in a general way and have better error
    reporting.

commit d7ad3aed56ca72139760f04af541b061b3107cca
Author: Nico Kruber <[email protected]>
Date:   2017-08-07T16:12:28Z

    [FLINK-7412][network] optimise NettyMessage.TaskEventRequest#readFrom() to 
read from netty buffers directly
    
    This closes #4518.

commit f3ef92fca6f6b572a29e5b2b37caa2d669a06f20
Author: Nico Kruber <[email protected]>
Date:   2017-08-18T10:28:23Z

    [FLINK-7057][tests][hotfix] fix test instability of 
JobManagerCleanupITCase#testBlobServerCleanupCancelledJob
    
    This test expected two messages to arrice (job cancellation and job state 
change
    notification) but did not take different receive orders into account. The 
fix:
    - removes state change listening for this test case so that only one message
      arrives, and
    - adds message comparison by object, not just class (to improve debugging)

commit b2d2d584d9f1849bf81fbe09ee81ca44efacc421
Author: Nico Kruber <[email protected]>
Date:   2017-08-21T08:36:56Z

    [FLINK-7483][blob] prevent cleanup of re-registered jobs
    
    When a job is registered, it may have been released before and we thus need 
to
    reset the cleanup timeout again.

commit a1a1cf59c0c677b593934b17c239e6832440c56d
Author: Fabian Hueske <[email protected]>
Date:   2017-09-10T22:05:06Z

    [FLINK-7446] [table] Change DefinedRowtimeAttribute to work on existing 
field.
    
    This closes #4710.

commit 404779108319d1b4ead02bb4af1fb979ba1824dd
Author: Nico Kruber <[email protected]>
Date:   2017-09-20T10:05:25Z

    [FLINK-7068][blob] Introduce permanent and transient BLOB keys
    
    [FLINK-7068][blob] address PR review comments, part 1
    
    [FLINK-7068][blob] create a common base class for the BLOB caches
    
    [FLINK-7068][blob] update some comments
    
    [FLINK-7068][blob] integrate the BLOB type into the BlobKey
    
    [FLINK-7068][blob] rename a few methods for better consistency
    
    [FLINK-7068][blob] fix Blob*DeleteTest not working as documented in one test
    
    [FLINK-7068][blob] add checks for jobId being null in PermanentBlobCache
    
    [FLINK-7068][blob] implement get-and-delete logic for transient BLOBs
    
    Transient BLOB files are deleted on the BlobServer upon first access from a
    cache. Therefore, we do not need the DELETE operations anymore, aside from
    deleting the file from the local cache (for now).
    
    [FLINK-7068][blob] address PR comments, part 2
    
    [FLINK-7068][blob] separate permanent and transient BLOB keys
    
    * create PermanentBlobKey and TransientBlobKey (inheriting from BlobKey) and
      forbid using transient BLOBs with permanent caches and vice versa
    * make BlobKey package-private, similarly for the BlobType which is now
      reflected by the two BlobKey sub-classes
    -> this gives a cleaner interface for the user
    
    This closes #4358.

commit 1546277f7e2351f83efeb6e57c04e5b5c0323c6a
Author: Till Rohrmann <[email protected]>
Date:   2017-09-25T13:29:59Z

    [FLINK-7668] Add ExecutionGraphCache for ExecutionGraph based REST handlers
    
    The ExecutionGraphCache replaces the ExecutionGraphHolder. Unlike the 
latter, the former
    does not expect the AccessExecutionGraph to be the true ExecutionGraph. 
Instead it assumes
    it to be the ArchivedExecutionGraph. Therefore, it invalidates the cache 
entries after
    a given time to live period. This will trigger requesting the 
AccessExecutionGraph again
    and, thus, updating the ExecutionGraph information for the ExecutionGraph 
based REST
    handlers.
    
    In order to avoid memory leaks, the WebRuntimeMonitor starts now a periodic 
cleanup task
    which triggers ExecutionGraphCache.cleanup. This methods releases all cache 
entries which
    have exceeded their time to live. Currently it is set to 20 * 
refreshInterval of the
    web gui.
    
    This closes #4728.

commit 63f003be24ddb5190a1f2d4b5b710549c285e848
Author: Till Rohrmann <[email protected]>
Date:   2017-09-26T16:39:15Z

    [FLINK-7695] [flip6] Add JobConfigHandler for new RestServerEndpoint
    
    This closes #4737.

commit bd55021374380f811b5f338fa6f30588635f84f4
Author: Tzu-Li (Gordon) Tai <[email protected]>
Date:   2017-09-27T18:05:21Z

    [FLINK-7721] [DataStream] Only emit new min watermark iff it was aggregated 
from watermark-aligned inputs
    
    Prior to this commit, In the calculation of the new min watermark in
    StatusWatermarkValve#findAndOutputNewMinWatermarkAcrossAlignedChannels(),
    there is no verification that the calculated new min watermark
    really is aggregated from some aligned channel.
    
    In the corner case where all input channels are currently not aligned
    but actually some are active, we would then incorrectly determine that
    the final aggregation is Long.MAX_VALUE and emit that.
    
    This commit fixes this by only emitting the aggregated watermark iff it was
    really calculated from some aligned input channel (as well as the
    already existing constraint that it needs to be larger than the last
    emitted watermark). This change should also safely cover the case that a
    Long.MAX_VALUE was genuinely aggregated from the input channels.

commit 3b457321f39803b955ffc3d01c9d862ccb7a4769
Author: Tzu-Li (Gordon) Tai <[email protected]>
Date:   2017-09-28T12:11:22Z

    [FLINK-7728] [DataStream] Simplify BufferedValveOutputHandler used in 
StatusWatermarkValveTest
    
    The previous implementation was overly complicated. Having separate
    buffers for the StreamStatus and Watermarks is not required for our
    tests. Also, that design doesn't allow checking the order StreamStatus /
    Watermarks are emitted from a single input to the valve.
    
    This commit reworks it by buffering both StreamStatus and Watermarks in
    a shared queue.

commit 9538eae0312c74a1e84fa4088fc09154b9290c5e
Author: Tzu-Li (Gordon) Tai <[email protected]>
Date:   2017-09-28T14:49:22Z

    [FLINK-7728] [DataStream] Flush max watermark across all inputs once all 
become idle
    
    Prior to this commit, once all inputs of the StatusWatermarkValve
    becomes idle, we only emit the StreamStatus.IDLE marker, and check
    nothing else. This makes the watermark advancement behaviour
    inconsistent in the case that all inputs become idle depending on the
    order that they become idle.
    
    This commit fixes this by "flushing" the max watermark across all
    channels once all inputs become idle. At a high-level, what this means
    for downstream operators is that all inputs have become idle and will
    temporariliy cease to advance their watermarks, so they can safely
    advance their event time to whatever the largest watermark is.

commit f14dfd04c748e26e8f0c59146a8dd209871404fc
Author: Till Rohrmann <[email protected]>
Date:   2017-09-28T16:35:50Z

    [FLINK-7708] [flip6] Add CheckpointConfigHandler for new REST endpoint
    
    This commit implements the CheckpointConfigHandler which now returns a
    CheckpointConfigInfo object if checkpointing is enabled. In case that
    checkpointing is disabled for a job, it will return a 404 response.
    
    This closes #4744.

commit 85f5168f58e9a46e1009ce79f4285245199612cb
Author: Tzu-Li (Gordon) Tai <[email protected]>
Date:   2017-09-29T09:16:22Z

    [FLINK-7728] [DataStream] Make StatusWatermarkValve unit tests more 
fine-grained
    
    Previously, the unit tests in StatusWatermarkValveTest were too
    cluttered and testing too many behaviours in a single test. This makes
    it hard to have a good overview of what test cases are covered.
    
    This commit is a rework of the previous tests, making them more
    fine-grained so that the scope of each test is small enough. All
    previously tested behaviours are still covered.

commit 32d90137e3984ce067761e0d18707cc65992131d
Author: Till Rohrmann <[email protected]>
Date:   2017-09-29T13:09:06Z

    [FLINK-7710] [flip6] Add CheckpointStatisticsHandler for the new REST 
endpoint
    
    This commit also makes the CheckpointStatsHistory object serializable by 
removing the
    CheckpointStatsHistoryIterable and replacing it with a static ArrayList.
    
    This closes #4750.

commit 9f773c3f60e31569023087454a71e660c3d669cd
Author: Aljoscha Krettek <[email protected]>
Date:   2017-10-02T10:28:10Z

    [FLINK-7721] Verify StatusWatermarkValve only emits WM iff it has aligned 
inputs

commit ebc01c7d7e693666bad4a6d65b126f22dc264812
Author: Stephan Ewen <[email protected]>
Date:   2017-10-02T12:15:14Z

    [hotfix] [core] Prevent potential null pointer in MemorySize.equals(...)

commit 2aa618a2bab4260fc3d46b7a52f68450d0f34626
Author: Stephan Ewen <[email protected]>
Date:   2017-10-02T12:34:27Z

    [FLINK-7643] [core] Misc. cleanups in FileSystem
    
      - Simplify access to local file system
      - Use a fair lock for all FileSystem.get() operations
      - Robust falback to local fs for default scheme (avoids URI parsing error 
on Windows)
      - Deprecate 'getDefaultBlockSize()'
      - Deprecate create(...) with block sizes and replication factor, which is 
not applicable to many FS

commit d186c169591c36a6fca8dbb5d40ff2d5815ed810
Author: Stephan Ewen <[email protected]>
Date:   2017-10-02T14:25:18Z

    [FLINK-7643] [core] Rework FileSystem loading to use factories
    
    This makes sure that configurations are loaded once and file system 
instances are
    properly reused by scheme and authority.
    
    This also factors out a lot of the special treatment of Hadoop file systems 
and simply
    makes the Hadoop File System factory the default fallback factory.

commit 0f3fcb8967a95d31f383f8b4d895869b3d2b8b30
Author: Stephan Ewen <[email protected]>
Date:   2017-10-02T14:30:07Z

    [FLINK-7643] [core] Drop eager checks for file system support.
    
    Some places validate if the file URIs are resolvable on the client. This 
leads to
    problems when file systems are not accessible from the client, when the 
full libraries for
    the file systems are not present on the client (for example often the case 
in cloud setups),
    or when the configuration on the client is different from the 
nodes/containers that will
    execute the application.

commit 1479b72ba3f11a985f977ee568d72e25a97ab450
Author: Till Rohrmann <[email protected]>
Date:   2017-10-02T20:29:12Z

    [FLINK-7754] [rpc] Complete termination future after actor has been stopped
    
    This commit waits not only until the Actor has called postStop but also 
until the actor
    has been completely shut down by the ActorSystem before completing the 
termination
    future.
    
    This closes #4770.

commit 8fbade07443dd2ca3d408ed61485d3dcaf86415e
Author: zentol <[email protected]>
Date:   2017-10-02T20:58:21Z

    [hotfix] [dispatcher] Remove leftover javadoc from DispatcherGateway

commit 56aac28c614462a119edba13461b8de11ea9fbc9
Author: Stephan Ewen <[email protected]>
Date:   2017-10-02T21:06:07Z

    [hotfix] Fix testing log level in flink-runtime

commit f9b726df9bb86b41d0c903a36ed2e318ae94ce62
Author: Wright, Eron <[email protected]>
Date:   2017-10-04T03:36:09Z

    [FLINK-7752] [flip-6] RedirectHandler should execute on the IO thread
    
    This closes #4766.

commit 5d631dd79810e571c9baff2f3079084380e196bd
Author: Fabian Hueske <[email protected]>
Date:   2017-10-04T11:01:22Z

    [hotfix] [hbase] Set root log level to OFF for flink-hbase tests.
    
    Log level is changed due to a buggy Calcite check that causes a NPE.
    The check is only performed if log level DEBUG is enabled.
    
    This closes #4771

commit 1b17941b31831ecfcad9b1d5bc1505415ac66306
Author: Stephan Ewen <[email protected]>
Date:   2017-10-05T09:26:13Z

    [FLINK-7767] [file system sinks] Avoid loading Hadoop conf dynamically at 
runtime

commit c6cdd4172107985a4160869d6af6ddc0ade44fbf
Author: Stephan Ewen <[email protected]>
Date:   2017-10-05T09:27:48Z

    [FLINK-7766] [file system sink] Drop obsolete reflective hflush calls
    
    This was done reflectively before for Hadoop 1 compatibility.
    Since Hadoop 1 is no longer supported, this is obsolete now.

----


---

[GitHub] flink pull request #5023: [hotfix][docs] Review of concepts docs for grammar...

Reply via email to