[ 
https://issues.apache.org/jira/browse/BEAM-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-13141:
-----------------------------------
    Description: 
+*Context*+

As of today, HBaseIO interacts with the HBase cluster while building the
execution graph, for example to validate that the table exists.

https://github.com/apache/beam/blob/master/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseIO.java#L237

In certain scenarios, Dataflow jobs are launched from systems that do not have
network access to the HBase cluster during the graph construction stage; the
cluster is reachable only at execution time on Google Cloud. However, because
the current HBaseIO implementation assumes local access, a job can be launched
only from systems that have network access to the HBase cluster.

*+Requirement+*

 Modify HBaseIO to accept a flag (say hasLocalAccess); if the flag is set to
false, defer validations, split calculation logic, etc. to job execution time
rather than job construction time.
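
The requested behaviour can be sketched roughly as follows. This is only an
illustration, not the actual HBaseIO code: the class DeferredValidationRead,
its methods, and the tableExists predicate (a stand-in for a real cluster
lookup) are hypothetical names. Only the flag name hasLocalAccess comes from
the requirement above.

```java
import java.util.function.Predicate;

/** Hypothetical stand-in for a read transform; not the real HBaseIO API. */
final class DeferredValidationRead {
    private final String tableId;
    private final boolean hasLocalAccess;         // flag name suggested in this issue
    private final Predicate<String> tableExists;  // assumed stand-in for a cluster lookup
    private boolean validated = false;

    DeferredValidationRead(String tableId, boolean hasLocalAccess,
                           Predicate<String> tableExists) {
        this.tableId = tableId;
        this.hasLocalAccess = hasLocalAccess;
        this.tableExists = tableExists;
    }

    /** Called while the execution graph is built on the launching machine. */
    void validateAtConstruction() {
        if (!hasLocalAccess) {
            return; // launcher cannot reach the cluster: defer the check
        }
        checkTable();
    }

    /** Called at execution time, before any rows are read. */
    void setup() {
        if (!validated) {
            checkTable(); // deferred validation runs here
        }
    }

    boolean isValidated() {
        return validated;
    }

    private void checkTable() {
        if (!tableExists.test(tableId)) {
            throw new IllegalArgumentException("Table " + tableId + " does not exist");
        }
        validated = true;
    }
}
```

With hasLocalAccess set to false, validateAtConstruction() never touches the
cluster and the check runs once in setup(). With the flag set to true, the
check runs at construction time, as it does today. Split calculation could be
deferred in the same way.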

 

  was:
+*Context*+

As of today, HBaseIO interacts with the HBase cluster while building the
execution graph, for example to validate that the table exists, calculate
splits, etc.

https://github.com/apache/beam/blob/master/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseIO.java#L237

In certain scenarios, Dataflow jobs are launched from systems that do not have
network access to the HBase cluster during the graph construction stage; the
cluster is reachable only at execution time on Google Cloud. However, because
the current HBaseIO implementation assumes local access, a job can be launched
only from systems that have network access to the HBase cluster.

*+Requirement+*

 Modify HBaseIO to accept a flag (say hasLocalAccess); if the flag is set to
false, defer validations, split calculation logic, etc. to job execution time
rather than job construction time.

 


> Support to submit Jobs using HBaseIO to DataflowRunner without local access 
> to HBase Cluster
> --------------------------------------------------------------------------------------------
>
>                 Key: BEAM-13141
>                 URL: https://issues.apache.org/jira/browse/BEAM-13141
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Prathap Kumar Parvathareddy
>            Priority: P2
>
> +*Context*+
> As of today, HBaseIO interacts with the HBase cluster while building the
> execution graph, for example to validate that the table exists.
> https://github.com/apache/beam/blob/master/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseIO.java#L237
> In certain scenarios, Dataflow jobs are launched from systems that do not
> have network access to the HBase cluster during the graph construction
> stage; the cluster is reachable only at execution time on Google Cloud.
> However, because the current HBaseIO implementation assumes local access, a
> job can be launched only from systems that have network access to the HBase
> cluster.
> *+Requirement+*
>  Modify HBaseIO to accept a flag (say hasLocalAccess); if the flag is set to
> false, defer validations, split calculation logic, etc. to job execution
> time rather than job construction time.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
