Did you enable the "Use batch update" option?

On Wed, 31 Aug 2022 at 11:45, <pod...@gmx.com> wrote:

> HI,
>
> I see "Synchronize after merge" does nothing but executes separate query
> for each row.
> So it will be terrible slow with hundreds of thousands of records.
>
>
> *Sent:* Tuesday, August 30, 2022 at 1:53 PM
> *From:* "Hans Van Akelyen" <hans.van.akel...@gmail.com>
> *To:* users@hop.apache.org
> *Subject:* Re: Dimension lookup/update
> Another solution to avoid all the lookups is to use a table input and load
> the source table.
> Then you can use "Merge Rows (diff)" to do a comparison between the old
> set and the new set.
> It will mark the rows as new,identical,changed,deleted after that you can
> use "Synchronize after merge" and it will do the insert or update depending
> on the flag field.
>
> This way you do not bash your database with all the lookup queries, you do
> have to load and sort the entire source table to do the comparison against.
>
> Cheers,
> Hans
>
> On Tue, 30 Aug 2022 at 13:47, <pod...@gmx.com> wrote:
>
>>
>> Thank you both for quick reaction.
>>
>> I'm searching for some way to quickly update table with millions of
>> records.
>>
>> I know 'Insert / update' but this is terrible slow - so may queries.
>>
>> I know I can do some workaround like instert into temp table and execute
>> SQL query (update table taking data from another table).
>> I was thinking that dimension table is something that could work here.
>>
>>
>>
>> *Sent:* Tuesday, August 30, 2022 at 1:09 PM
>> *From:* "Bart Maertens" <bartm...@apache.org>
>> *To:* users@hop.apache.org
>> *Subject:* Re: Dimension lookup/update
>> Hi Mike,
>>
>> The Dimension Lookup/update transform populates Slowly Changing
>> Dimensions, typically type 2 [1] with some additional functionality.
>>
>> The "date range start", "date range end" and "version" fields are
>> technical fields that are used to determine the validity period and version
>> number for a dimension record.
>> If you don't need any of the versioning information, you're probably not
>> maintaining a slowly changing dimension.
>> Hop comes with other transforms like "Table output", "Insert/update",
>> "Database lookup" etc that are better suited to maintain regular tables or
>> type 1 dimensions.
>>
>> [1]
>> https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row
>>
>> Regards,
>> Bart
>>
>> On Tue, Aug 30, 2022 at 1:01 PM <pod...@gmx.com> wrote:
>>
>>>
>>> Hello!
>>>
>>> I try to use 'Dimension lookup/update' hop and I do not understand the
>>> meaning of two fields in properties:
>>>
>>> "Date range start field / Table date range end"
>>>
>>> My data has no date column. I do not need it. So I cannot specify any
>>> date. What this field is for and how to ignore it?
>>>
>>>
>>> What is exactly 'Version field'? This is to keep several datasets in one
>>> table? I.e same record can have multiple copies with different 'Version'?
>>> What if I do no not need (most of scenarios I think) any 'versions', I
>>> just need single copy of data?
>>> If I have 'version number' I can leave 'key fields' empty?
>>>
>>> Regards,
>>>
>>> Mike
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>

Reply via email to