[ 
https://issues.apache.org/jira/browse/BEAM-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961810#comment-15961810
 ] 

ASF GitHub Bot commented on BEAM-1907:
--------------------------------------

GitHub user dhalperi opened a pull request:

    https://github.com/apache/beam/pull/2471

    [BEAM-1907] PubsubIO: remove support for BoundedReader

    Google Cloud Pub/Sub is not currently that useful in bounded mode --
    it's a streaming source. Years ago, however, before the DirectRunner
    supported unbounded PCollections and sources, we were unable to run the
    streaming source in any SDK -- so we added a trivial bounded mode for
    testing.
    
    That trivial mode is no longer necessary. Additionally, it may confuse
    users into thinking it's reliable (it's not), performant (it's not),
    or has well-defined semantics (it doesn't) -- it's really intended just
    for testing.
    
    Now that the DirectRunner supports everything we need -- unbounded
    PCollections, non-blocking execution with cancelation, etc. -- we can
    delete the bounded mode.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhalperi/beam delete-pubsub-bounded

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2471.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2471
    
----
commit 8f85169b8a19538fb3f59ad992c224dbc9a1e13e
Author: Dan Halperin <[email protected]>
Date:   2017-04-07T21:50:42Z

    PubsubIO: remove support for BoundedReader
    
    Google Cloud Pub/Sub is not currently that useful in bounded mode --
    it's a streaming source. Years ago, however, before the DirectRunner
    supported unbounded PCollections and sources, we were unable to run the
    streaming source in any SDK -- so we added a trivial bounded mode for
    testing.
    
    That trivial mode is no longer necessary. Additionally, it may confuse
    users into thinking it's reliable (it's not), performant (it's not),
    or has well-defined semantics (it doesn't) -- it's really intended just
    for testing.
    
    Now that the DirectRunner supports everything we need -- unbounded
    PCollections, non-blocking execution with cancelation, etc. -- we can
    delete the bounded mode.

----


> Delete PubsubBoundedReader
> --------------------------
>
>                 Key: BEAM-1907
>                 URL: https://issues.apache.org/jira/browse/BEAM-1907
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-gcp
>            Reporter: Daniel Halperin
>            Assignee: Daniel Halperin
>             Fix For: First stable release
>
>
> PubsubIO in bounded mode doesn't really make sense outside of hacky testing 
> modes: it might lose data and has other sketchy behavior. We had it before 
> because the old {{DirectRunner}} did not support unbounded PCollections. Now 
> that it does, we should probably get rid of this buggy code.
> Specifically:
> * Delete the specialized PubsubBoundedReader implementation
> * Either delete the maxNumRecords and maxReadTime methods, or rename them to 
> something like maxNumRecordsForTesting with clear javadoc as to their 
> drawbacks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
