[
https://issues.apache.org/jira/browse/BEAM-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948013#comment-15948013
]
Solomon Duskis commented on BEAM-1269:
--------------------------------------
BigtableIO should not set data channel pool counts for reads. This is the
current line:
    // Set data channel count to one because there is only 1 scanner in this session
    BigtableOptions.Builder clonedBuilder = options.toBuilder()
        .setDataChannelCount(1);
    BigtableOptions optionsWithAgent =
        clonedBuilder.setUserAgent(getBeamSdkPartOfUserAgent()).build();
It should be more like:
    BigtableOptions optionsWithAgent = options
        .toBuilder()
        .setUserAgent(getBeamSdkPartOfUserAgent())
        .setUseCachedDataPool(true)
        .setDataHost(BigtableOptions.BIGTABLE_BATCH_DATA_HOST_DEFAULT)
        .build();
> BigtableIO should make more efficient use of connections
> --------------------------------------------------------
>
> Key: BEAM-1269
> URL: https://issues.apache.org/jira/browse/BEAM-1269
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-gcp
> Reporter: Daniel Halperin
> Labels: newbie, starter
>
> Right now, {{BigtableIO}} opens up a new Bigtable session for every DoFn, in
> the {{@Setup}} function. However, sessions can support multiple connections,
> so perhaps this code should be modified to open up a smaller session pool and
> then allocate connections in {{@StartBundle}}.
> This would likely make more efficient use of resources, especially for highly
> multithreaded workers.
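
The sharing idea in the description can be sketched as a reference-counted holder: the first caller opens the underlying resource, later callers reuse it, and the last release closes it. This is only an illustration under stated assumptions. SharedResource and its methods are hypothetical names, not part of Beam or the Bigtable client, and a plain factory/closer pair stands in for opening and closing a real Bigtable session.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical reference-counted holder for one shared resource.
 * Not a Beam or Bigtable API; the factory and closer are stand-ins
 * for opening and closing a real session.
 */
final class SharedResource<T> {
    private final Supplier<T> factory; // opens the resource on first use
    private final Consumer<T> closer;  // closes it after the last release
    private T instance;
    private int refCount;

    SharedResource(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    /** Returns the shared instance, creating it if no caller holds it yet. */
    synchronized T acquire() {
        if (refCount == 0) {
            instance = factory.get();
        }
        refCount++;
        return instance;
    }

    /** Drops one reference; closes the instance when none remain. */
    synchronized void release() {
        if (refCount == 0) {
            throw new IllegalStateException("release() called without acquire()");
        }
        refCount--;
        if (refCount == 0) {
            T toClose = instance;
            instance = null;
            closer.accept(toClose);
        }
    }
}
```

Under this sketch, a DoFn would call acquire() where it currently opens a session and release() during teardown, so many DoFn instances on one worker would share a single session. Whether that maps cleanly onto {{@Setup}} and {{@StartBundle}} depends on the real session lifecycle, which this example does not model.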