I've got 36 million rows of data, which end up as roughly 3,000 record batches ranging from 12k to 300k rows each. I'm assuming these batches are created by the multithreaded CSV file reader.
Is there any way to reorganize the data into something like 36 batches of 1 million rows each? What I'm seeing when we try to load this data using the ADBC Snowflake driver is that each individual batch is executed as a bind array insert in the Snowflake Go Driver, and 3,000 bind array inserts are taking 3 hours.
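For context, this is roughly the reorganization I'm after (just a sketch; I'm assuming a pyarrow read path here, and the file name and batch size are placeholders):

    # Sketch: merge many small chunks, then re-split into ~1M-row batches.
    import pyarrow.csv as pv

    table = pv.read_csv("data.csv")        # multithreaded read, ~3,000 small chunks

    # combine_chunks() concatenates the small chunks into contiguous columns;
    # to_batches() then splits them into batches of at most 1,000,000 rows.
    table = table.combine_chunks()
    batches = table.to_batches(max_chunksize=1_000_000)

    print(len(batches))                    # ideally ~36 batches

If something like this (or an equivalent on the driver side) is the recommended way to get larger bind arrays, that would be good to know.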