Hello everybody

I am facing a problem with a pipeline that runs perfectly on the DirectRunner,
but turns into a mess when it runs on Dataflow: there, the element and the
side input (access) end up changed.

The side input reads only a line with credentials.

Any thoughts on how it's done are more than welcome. How do you manage
sensitive information in templated pipelines?

It is something like this:

import apache_beam as beam


class GetPrediction(beam.DoFn):

    def __init__(self, input1, input2):
        self.input1 = input1
        self.input2 = input2

    def process(self, element, access):
        # access is the side input: a single line, "user<TAB>token"
        user, token = access.split('\t')

        # element is one line of the input file, "thing1<TAB>thing2"
        thing1, thing2 = element.split('\t')


credentials_pipe = (
    p
    | 'Get credentials' >> beam.io.ReadFromText(user_options.credentials)
)

main_pipe = (
    p
    | 'Get information' >> beam.io.ReadFromText(user_options.input_file)
    | 'Get prediction from severity' >> beam.ParDo(GetPrediction(
        user_options.input1,
        user_options.input2,
    ), beam.pvalue.AsSingleton(credentials_pipe))
)

p.run()
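
To narrow down which of the two values arrives changed, one option might be to replace the bare split('\t') calls in process() with a stricter helper that fails loudly when a line does not have exactly two tab-separated fields. The sketch below is plain Python with no Beam dependency; parse_tab_pair and its "what" label are hypothetical names made up for illustration, not part of Beam or of the pipeline above. It deliberately leaves the line content out of the error message so that credentials do not end up in the worker logs.

```python
from typing import Tuple


def parse_tab_pair(line: str, what: str = "line") -> Tuple[str, str]:
    """Split a 'left<TAB>right' line into two fields, or raise ValueError.

    Hypothetical helper for illustration only; not a Beam API.
    """
    # Strip only line endings, so fields that contain spaces are preserved.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 2:
        # Report the field count only, never the content (it may be a secret).
        raise ValueError(
            f"{what}: expected 2 tab-separated fields, got {len(fields)}"
        )
    return fields[0], fields[1]
```

Inside process() this could be called as parse_tab_pair(access, "side input") and parse_tab_pair(element, "element"), so the error raised on Dataflow would at least say which of the two inputs is malformed.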


-- 

   *ANDRÉ ROCHA SILVA*
  * DATA ENGINEER*
  (48) 3181-0611

