GitHub user dmainou edited a comment on the discussion: How to improve the 
Snowflake upsert speed

Perhaps an architectural change?

Using the Insert / Update transform requires reading the existing data first
in order to then upsert.

A couple of possible solutions:

1- Load the data into Snowflake, then run a SQL statement there that
accomplishes what you need.
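
As a rough sketch of that first option, assuming the data has already been bulk loaded into a staging table, the statement could be a Snowflake MERGE. The helper `build_merge_sql` and the table and column names (`customers`, `customers_stage`, `id`, `name`, `email`) are hypothetical, made up for illustration only:

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Build a Snowflake MERGE statement that upserts staging rows into target."""
    # Rows are matched between target (t) and staging (s) on the key columns.
    on = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    # Matched rows get their non-key columns overwritten from staging.
    sets = ", ".join(f"t.{c} = s.{c}" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} t USING {staging} s ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {sets} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


# Hypothetical table and column names, for illustration only.
sql = build_merge_sql("customers", "customers_stage", ["id"], ["name", "email"])
print(sql)
```

The resulting statement runs entirely inside Snowflake, so the pipeline only has to load the staging table and execute one SQL call.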

or

1. Read your source data and the Snowflake data as two input streams
2. Sort them so they are in the same order
3. Use the Merge rows (diff) transform to flag rows as new, deleted, changed or
identical
4. Toss the identical rows aside
5. Then bulk insert what's needed, update as required, etc.
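
To illustrate the idea behind steps 1 to 5, here is a minimal Python sketch of a sorted two-stream diff. It is not how Hop implements Merge rows (diff); the function `diff_sorted`, the sample rows and the `id` key are assumptions for the example, and it assumes keys are unique within each stream:

```python
def diff_sorted(reference, compare, key):
    """Yield (flag, row) pairs for two row lists already sorted by `key`.

    reference: rows currently in the target (e.g. read from Snowflake)
    compare:   rows coming from the source
    """
    i = j = 0
    while i < len(reference) and j < len(compare):
        ref, src = reference[i], compare[j]
        if ref[key] == src[key]:
            # Same key on both sides: either nothing changed or a field differs.
            yield ("identical" if ref == src else "changed", src)
            i += 1
            j += 1
        elif ref[key] < src[key]:
            # Key only exists in the target, so the source no longer has it.
            yield ("deleted", ref)
            i += 1
        else:
            # Key only exists in the source, so it is a new row.
            yield ("new", src)
            j += 1
    # Whatever is left on either side has no counterpart.
    for ref in reference[i:]:
        yield ("deleted", ref)
    for src in compare[j:]:
        yield ("new", src)


# Hypothetical sample data, both lists sorted by "id".
snowflake_rows = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 4, "name": "Dee"}]
source_rows = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bobby"}, {"id": 3, "name": "Cy"}]

flagged = list(diff_sorted(snowflake_rows, source_rows, "id"))
# Toss the identical rows aside, then route the rest by flag.
to_insert = [row for flag, row in flagged if flag == "new"]
to_update = [row for flag, row in flagged if flag == "changed"]
to_delete = [row for flag, row in flagged if flag == "deleted"]
```

Because both inputs are sorted, each stream is read once, and only the rows that actually differ ever reach the database.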


PS: I've previously cut the runtime of a job from 6+ hours to 15 minutes this
way.


GitHub link: 
https://github.com/apache/hop/discussions/5310#discussioncomment-13138748

----
This is an automatically sent email for users@hop.apache.org.
To unsubscribe, please send an email to: users-unsubscr...@hop.apache.org
