Did you enable the "Use batch update" option? On Wed, 31 Aug 2022 at 11:45, <pod...@gmx.com> wrote:
> HI, > > I see "Synchronize after merge" does nothing but executes separate query > for each row. > So it will be terrible slow with hundreds of thousands of records. > > > *Sent:* Tuesday, August 30, 2022 at 1:53 PM > *From:* "Hans Van Akelyen" <hans.van.akel...@gmail.com> > *To:* users@hop.apache.org > *Subject:* Re: Dimension lookup/update > Another solution to avoid all the lookups is to use a table input and load > the source table. > Then you can use "Merge Rows (diff)" to do a comparison between the old > set and the new set. > It will mark the rows as new,identical,changed,deleted after that you can > use "Synchronize after merge" and it will do the insert or update depending > on the flag field. > > This way you do not bash your database with all the lookup queries, you do > have to load and sort the entire source table to do the comparison against. > > Cheers, > Hans > > On Tue, 30 Aug 2022 at 13:47, <pod...@gmx.com> wrote: > >> >> Thank you both for quick reaction. >> >> I'm searching for some way to quickly update table with millions of >> records. >> >> I know 'Insert / update' but this is terrible slow - so may queries. >> >> I know I can do some workaround like instert into temp table and execute >> SQL query (update table taking data from another table). >> I was thinking that dimension table is something that could work here. >> >> >> >> *Sent:* Tuesday, August 30, 2022 at 1:09 PM >> *From:* "Bart Maertens" <bartm...@apache.org> >> *To:* users@hop.apache.org >> *Subject:* Re: Dimension lookup/update >> Hi Mike, >> >> The Dimension Lookup/update transform populates Slowly Changing >> Dimensions, typically type 2 [1] with some additional functionality. >> >> The "date range start", "date range end" and "version" fields are >> technical fields that are used to determine the validity period and version >> number for a dimension record. >> If you don't need any of the versioning information, you're probably not >> maintaining a slowly changing dimension. >> Hop comes with other transforms like "Table output", "Insert/update", >> "Database lookup" etc that are better suited to maintain regular tables or >> type 1 dimensions. >> >> [1] >> https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row >> >> Regards, >> Bart >> >> On Tue, Aug 30, 2022 at 1:01 PM <pod...@gmx.com> wrote: >> >>> >>> Hello! >>> >>> I try to use 'Dimension lookup/update' hop and I do not understand the >>> meaning of two fields in properties: >>> >>> "Date range start field / Table date range end" >>> >>> My data has no date column. I do not need it. So I cannot specify any >>> date. What this field is for and how to ignore it? >>> >>> >>> What is exactly 'Version field'? This is to keep several datasets in one >>> table? I.e same record can have multiple copies with different 'Version'? >>> What if I do no not need (most of scenarios I think) any 'versions', I >>> just need single copy of data? >>> If I have 'version number' I can leave 'key fields' empty? >>> >>> Regards, >>> >>> Mike >>> >>> >>> >>> >> >> >> >> > > >