Daniel Mateus Pires created BEAM-12715:
------------------------------------------
Summary: SnowflakeWrite fails in batch mode when the number of
shards is > 1000
Key: BEAM-12715
URL: https://issues.apache.org/jira/browse/BEAM-12715
Project: Beam
Issue Type: Bug
Components: beam-community
Reporter: Daniel Mateus Pires
When writing to Snowflake in batch mode, if the number of files to import is
more than 1000, the load will fail.
From the Snowflake docs:
{quote}Of the three options for identifying/specifying data files to load from
a stage, providing a discrete list of files is generally the fastest; however,
the FILES parameter supports a maximum of 1,000 files, meaning a COPY command
executed with the FILES parameter can only load up to 1,000 files.
{quote}
I noticed that the Snowflake write in batch mode ignores the number of shards
set by the user, and I think the first step should be to take the
user-specified number of shards into account when writing.
Longer term, should Beam issue multiple COPY statements, each with its own list
of files, when the number of files is more than 1000? Maybe inside the same
transaction (a BEGIN ... COMMIT block)?
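To make the longer-term idea concrete, here is a minimal sketch of splitting a
staged file list into batches of at most 1000 and emitting one COPY statement
per batch inside a single transaction. It is not Beam's SnowflakeIO code: the
function names, table name, stage name, and file names are all hypothetical,
and the sketch only builds the SQL strings rather than executing them.

```python
from typing import Iterator, List

# Limit on the FILES parameter, taken from the Snowflake docs quoted above.
MAX_FILES_PER_COPY = 1000


def chunk_files(files: List[str], size: int = MAX_FILES_PER_COPY) -> Iterator[List[str]]:
    """Yield consecutive slices of `files`, each holding at most `size` names."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(files), size):
        yield files[start:start + size]


def build_copy_statements(table: str, stage: str, files: List[str]) -> List[str]:
    """Build one COPY INTO statement per chunk, wrapped in BEGIN ... COMMIT.

    `table` and `stage` are hypothetical identifiers and are interpolated
    without validation; only the file names are quoted and escaped.
    """
    statements = ["BEGIN"]
    for chunk in chunk_files(files):
        # Quote each file name, doubling any embedded single quotes.
        file_list = ", ".join("'" + name.replace("'", "''") + "'" for name in chunk)
        statements.append(f"COPY INTO {table} FROM @{stage} FILES = ({file_list})")
    statements.append("COMMIT")
    return statements
```

With 2500 staged files this would produce three COPY statements (1000, 1000,
and 500 files) between BEGIN and COMMIT, so a failure in any batch could be
rolled back together with the others.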
Also, I wanted to set the Jira issue component to io-java-snowflake, but it
does not exist.