[ 
https://issues.apache.org/jira/browse/BEAM-2660?focusedWorklogId=132273&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-132273
 ]

ASF GitHub Bot logged work on BEAM-2660:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Aug/18 09:10
            Start Date: 08/Aug/18 09:10
    Worklog Time Spent: 10m 
      Work Description: aromanenko-dev commented on issue #3619: [BEAM-2660] 
Set PubsubIO batch size using builder
URL: https://github.com/apache/beam/pull/3619#issuecomment-411340289
 
 
   @cjmcgraw 
   > pubsub is google cloud specific. But this change is not runner specific
   Yes, that is why I was wondering how it's related to any specific runner, and 
@reuvenlax explained that the Dataflow runner happens to have its own 
implementation of Pubsub support.
   
   @reuvenlax 
   > however I want to make sure that you know it will not affect the Dataflow 
runner.
   As @dadrian mentioned above, this PR affects only the sink part of PubsubIO, not 
the source. To be honest, I don't know whether the Dataflow runner uses the PubsubIO 
sink (I'm not familiar with this part of the code), so I can't guarantee this. 
   Do you think that running the `Dataflow_ValidatesRunner` job is not enough? Do 
we need to run any other tests for that?
   
   In general, this LGTM except for some concerns raised about the Dataflow 
runner. If it's tested by the `Dataflow_ValidatesRunner` job (and that job 
passed), then I'd like to have this merged.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 132273)
    Time Spent: 2h 40m  (was: 2.5h)

> Set PubsubIO batch size using builder
> -------------------------------------
>
>                 Key: BEAM-2660
>                 URL: https://issues.apache.org/jira/browse/BEAM-2660
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Carl McGraw
>            Assignee: Chamikara Jayalath
>            Priority: Major
>              Labels: gcp, java, pubsub, sdk
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> PubsubIO doesn't allow users to set the publish batch size. Instead, the value 
> is hard-coded in both the BoundedPubsubWriter and the UnboundedPubsubSink. 
> Google's Pub/Sub is limited to a maximum request size of 10 MB. My company 
> has run into problems with events that are individually smaller than 1 MB but 
> that, when batched at the default batch sizes of 100 or 2000, cause Pub/Sub 
> to fail to send the events.
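
To make the arithmetic in the description concrete: 100 events of 500 KB each add up to roughly 50 MB, well over a 10 MB request limit, even though each event is under 1 MB. The sketch below is only an illustration of capping a batch by both message count and total bytes. `SizeAwareBatcher` and `split` are hypothetical names, not part of Beam or of this PR, and the byte cap of 10,000,000 is an assumed reading of "10 MB". A real request would also carry attribute and encoding overhead that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Beam: splits payloads into batches that
// respect both a message-count cap and a total-payload-byte cap.
class SizeAwareBatcher {
    static List<List<byte[]>> split(List<byte[]> payloads, int maxCount, long maxBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] p : payloads) {
            if (p.length > maxBytes) {
                throw new IllegalArgumentException("single payload exceeds maxBytes");
            }
            // Close the current batch if adding p would break either limit.
            if (!current.isEmpty()
                    && (current.size() >= maxCount || currentBytes + p.length > maxBytes)) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(p);
            currentBytes += p.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // 100 events of 500 KB each: about 50 MB if sent as a single batch of 100.
        List<byte[]> events = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            events.add(new byte[500_000]);
        }
        // Assumed limits: at most 100 messages and 10,000,000 payload bytes per batch.
        List<List<byte[]>> batches = split(events, 100, 10_000_000L);
        System.out.println(batches.size() + " batches, first has " + batches.get(0).size());
    }
}
```

With these assumed limits the byte cap, not the count cap, decides where each batch ends. That is the situation the issue describes: a count-only setting lets oversized requests through.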



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
