Hi Valentyn,
  Not sure if this helps at all, and not sure whether you are
Dataflow-agnostic or not, but I have implemented your demo app by creating
a template via a Cloud Build YAML file. It took me quite a while, as I
kept struggling with 'missing modules' errors until I discovered a few
settings I needed to configure in the YAML file - perhaps you already
know this.
I was wondering, if you don't have one already, whether I could contribute
a sample build.yaml and run.yaml to add to the demo? A rough sketch of
what I mean is below.
Or do you guys just deal with Beam and not with Dataflow?
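
For reference, a rough sketch of the kind of cloudbuild.yaml I ended up
with (bucket, project, image, and file names are placeholders; the
FLEX_TEMPLATE_PYTHON_* environment variables were the settings I kept
missing):

  steps:
    # Build the Flex Template image and template spec in one step.
    - name: gcr.io/google.com/cloudsdktool/cloud-sdk
      entrypoint: gcloud
      args:
        - dataflow
        - flex-template
        - build
        - gs://MY_BUCKET/templates/demo.json   # where the spec is written
        - --image-gcr-path=us-docker.pkg.dev/MY_PROJECT/MY_REPO/demo:latest
        - --sdk-language=PYTHON
        - --py-path=.
        # These env vars tell the template launcher where the pipeline
        # entry point and its dependency files live inside the image.
        - --env=FLEX_TEMPLATE_PYTHON_PY_FILE=main.py
        - --env=FLEX_TEMPLATE_PYTHON_SETUP_FILE=setup.py
        - --env=FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE=requirements.txt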

kind regards
Marco

On Fri, Oct 25, 2024 at 7:58 PM Valentyn Tymofieiev <valen...@google.com>
wrote:

> I would suggest trying out an established working example and gradually
> changing it to fit the project structure that you have, while making sure
> it continues to work.
>
> The short answer is that Dataflow will pick up only what is specified in
> the pipeline options.
>
> Whether or not your package uses a .toml is not essential. You can
> install it inside the custom container image, or supply a package
> distribution (such as an sdist or a multi-platform wheel) via
> --extra_package, or, if it has sources and a setup.py file, use the
> --setup_file pipeline option.
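>
> For illustration, a minimal sketch of wiring this up in the pipeline's
> options (project, bucket, and file names are placeholders):
>
>   from apache_beam.options.pipeline_options import PipelineOptions
>
>   options = PipelineOptions([
>       '--runner=DataflowRunner',
>       '--project=MY_PROJECT',
>       '--region=us-central1',
>       '--temp_location=gs://MY_BUCKET/tmp',
>       # Builds and ships the local package that has a setup.py:
>       '--setup_file=./setup.py',
>       # Or, for a prebuilt sdist/wheel, instead:
>       # '--extra_package=./dist/mypackage-1.0.tar.gz',
>   ])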
>
> On Thu, Oct 17, 2024 at 9:45 PM Sofia’s World <mmistr...@gmail.com> wrote:
>
>> Hello Valentyn,
>>   I have never used a .toml file (perhaps I am behind the times).
>> Could you explain how Dataflow will pick up the .toml?
>> I am currently using the same setup as the pipeline project, but I am
>> NOT using a .toml, and I am getting problems: my main class cannot see
>> my equivalent of 'mypackage///'.
>> Kind regards
>>  Marco
>>
>> On Thu, Oct 17, 2024 at 5:13 PM Valentyn Tymofieiev via user <
>> user@beam.apache.org> wrote:
>>
>>> See also:
>>> https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/dataflow/flex-templates/pipeline_with_dependencies/
>>>
>>> On Wed, Oct 16, 2024 at 4:50 PM XQ Hu via user <user@beam.apache.org>
>>> wrote:
>>>
>>>> It is fine to put that import inside the process method. I think
>>>> Dataflow complains about this because your template launcher image
>>>> does not have `psycopg2` installed.
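>>>>
>>>> A minimal sketch of the deferred import (assuming the worker
>>>> container image, unlike the launcher image, has psycopg2 installed):
>>>>
>>>>   import apache_beam as beam
>>>>
>>>>   class ReadDb(beam.DoFn):
>>>>       def process(self, element):
>>>>           # Imported here so it resolves on the worker at run time,
>>>>           # not on the launcher at graph-construction time.
>>>>           import psycopg2
>>>>           ...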
>>>>
>>>> On Wed, Oct 16, 2024 at 6:08 PM Henry Tremblay via user <
>>>> user@beam.apache.org> wrote:
>>>>
>>>>> Not exactly an Apache Beam question, but I notice that if I run
>>>>> Apache Beam on Dataflow using a flex template, I have import problems:
>>>>>
>>>>>
>>>>>
>>>>> For example, the following code will fail because it can’t find
>>>>> psycopg2:
>>>>>
>>>>>
>>>>>
>>>>>  1 import psycopg2
>>>>>
>>>>>    class ReadDb(beam.DoFn):
>>>>> 50
>>>>> 51     def __init__(self, user, password, host):
>>>>> 52         self.user = user
>>>>> 53         self.password = password
>>>>> 54         self.host = host
>>>>> 55
>>>>> 56     def process(self, element):
>>>>> 58         conn = psycopg2.connect(
>>>>> 59             host=self.host,
>>>>> 60             user=self.user,
>>>>> 61             password=self.password,
>>>>> 62             database='chassis_trusted_data',
>>>>> 63             port=5432)
>>>>> 64
>>>>> 65         yield 'a'
>>>>>
>>>>>
>>>>>
>>>>> I actually need to import psycopg2 inside the process method (at line 57).
>>>>>
>>>>>
>>>>>
>>>>> I know I can use
>>>>>
>>>>>
>>>>>
>>>>> pipeline_options.view_as(SetupOptions).save_main_session = save_main_session
>>>>>
>>>>>
>>>>>
>>>>> but this causes pickling problems and defeats the purpose of building
>>>>> a Docker image.
>>>>>
>>>>>
>>>>>
>>>>
