FileIO Azure Storage problems

2020-11-19 Thread Thomas Li Fredriksen
Good morning everyone. I am attempting to parse a very large CSV (65 million lines) with Beam (version 2.25) from an Azure Blob and have created a pipeline for this. I am running the pipeline on Dataflow and testing with a smaller version of the file (10,000 lines). I am using FileIO and

Re: snowflake io in python

2020-11-19 Thread Alan Krumholz
How can I pass that flag using the SDK? Tried this: pipeline = beam.Pipeline(options=PipelineOptions(experiments=['use_runner_v2'], ...) but still getting a similar error: ---KeyError

Participate in a User Experience Research for Apache Beam

2020-11-19 Thread Clara Azarel Balderas Gomez
Hello, We’d like to invite you to provide feedback on your experience discovering, learning, and using Apache Beam via a user experience research study. User experience research is a systematic investigation of user needs, behaviors, and pain points to gather insights to inform the design process.

Re: snowflake io in python

2020-11-19 Thread Alan Krumholz
Dataflow runner

On Thu, Nov 19, 2020 at 2:00 PM Brian Hulette wrote:
> Hm what runner are you using? It looks like we're trying to encode and
> decode the pipeline proto, which isn't possible for a multi-language
> pipeline. Are you using a portable runner?
>
> Brian
>
> On Thu, Nov 19, 2020 at

Re: snowflake io in python

2020-11-19 Thread Brian Hulette
Hm, what runner are you using? It looks like we're trying to encode and decode the pipeline proto, which isn't possible for a multi-language pipeline. Are you using a portable runner?

Brian

On Thu, Nov 19, 2020 at 10:42 AM Alan Krumholz wrote:
> got it, thanks!
> I was using:
>

Re: snowflake io in python

2020-11-19 Thread Alan Krumholz
Got it, thanks! I was using: 'xx.us-east-1'. Using this instead seems to fix that problem: 'xx.us-east-1.snowflakecomputing.com'. I'm now hitting a different error though (now in Python):

in bq_to_snowflake(bq_table, snow_table, git_branch)
    161 )
    162
--> 163 result = pipeline.run()

Re: snowflake io in python

2020-11-19 Thread Brian Hulette
Hi Alan, Sorry this error message is so verbose. What are you passing for the server_name argument [1]? It looks like that's what the Java stacktrace is complaining about: java.lang.IllegalArgumentException: serverName must be in format .snowflakecomputing.com [1]
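
The format check in that Java error can be applied before the value is ever handed to the connector. The following is a minimal sketch, not part of Beam or the Snowflake connector: `normalize_server_name` and `SNOWFLAKE_SUFFIX` are hypothetical names introduced here, and the only rule assumed is the one quoted in the error message, namely that the server name must end in `.snowflakecomputing.com`.

```python
# Hypothetical helper, not a Beam or Snowflake API. It only applies the rule
# quoted in the error message: the name must end in ".snowflakecomputing.com".
SNOWFLAKE_SUFFIX = ".snowflakecomputing.com"


def normalize_server_name(server_name: str) -> str:
    """Return server_name with the Snowflake domain suffix appended if missing."""
    name = server_name.strip()
    if not name:
        raise ValueError("server_name must not be empty")
    # Leave a fully qualified name untouched so the helper is idempotent.
    if name.lower().endswith(SNOWFLAKE_SUFFIX):
        return name
    return name + SNOWFLAKE_SUFFIX


# The short form reported in the thread becomes the fully qualified form.
print(normalize_server_name("xx.us-east-1"))
```

The result would then be passed as the `server_name` argument mentioned above; everything else about the write transform is left unchanged.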

snowflake io in python

2020-11-19 Thread Alan Krumholz
I'm trying to replace my custom/problematic Snowflake sink with the new one: https://beam.apache.org/documentation/io/built-in/snowflake/#writing-to-snowflake However, when I try to run my pipeline (using Python) I get this Java error: RuntimeError: java.lang.RuntimeException: Failed to build

Re: Unit Testing Custom Coder

2020-11-19 Thread Sofia’s World
Hey, I have been writing unit tests for Beam pipelines using TestPipeline (in Python, though), and I don't have such use cases. I'm wondering if this satisfies your use case? If not, please give me some pointers to try to reproduce your use case. To me it seems one of your transformations raises an

Re: Unit Testing Custom Coder

2020-11-19 Thread Pablo Estrada
Hi Dave! I don't have a lot of experience with coders, but I would include the Beam user@ list (added just now) to see if someone else has done this.

Best
-P.

On Wed, Nov 18, 2020 at 7:22 AM Dave Anderson wrote:
> Pablo,
>
> Also, for now I've created tests that exercise the encode() and
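
A test that exercises a coder's encode() directly is usually paired with a round-trip check through decode(). The sketch below shows one possible shape for that in Python, using only the standard library. `JsonRecordCoder` is a hypothetical stand-in written for this example; it is not Dave's coder and does not subclass any Beam class, so it only illustrates the test structure, not how Beam invokes coders.

```python
import json
import unittest


class JsonRecordCoder:
    """Hypothetical stand-in for a custom coder; not a Beam class."""

    def encode(self, value: dict) -> bytes:
        # sort_keys makes the byte output independent of key insertion order.
        return json.dumps(value, sort_keys=True).encode("utf-8")

    def decode(self, encoded: bytes) -> dict:
        return json.loads(encoded.decode("utf-8"))


class JsonRecordCoderTest(unittest.TestCase):
    def test_round_trip(self):
        coder = JsonRecordCoder()
        samples = ({}, {"id": 1, "name": "å"}, {"nested": {"k": [1, 2, 3]}})
        for value in samples:
            with self.subTest(value=value):
                # Decoding the encoded bytes should give back an equal value.
                self.assertEqual(coder.decode(coder.encode(value)), value)

    def test_encoding_ignores_key_order(self):
        coder = JsonRecordCoder()
        # Equal dicts built in different key order should encode identically.
        self.assertEqual(
            coder.encode({"a": 1, "b": 2}),
            coder.encode({"b": 2, "a": 1}),
        )
```

Saved as a module, the tests can be run with `python -m unittest`. Whether byte-level determinism matters depends on how the coder is used, so the second test is an assumption about the requirements rather than a general rule.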