Folks,

I have an issue I would like to run by you for suggestions.

Right now, every time you create an entry, I keep track of the tags
added/removed and call WeblogManager.updateTagCounts(). At first I was
doing most of the math in the code (e.g.
pojo.setCount(pojo.getCount()+1); pojo.setLastUsed(new Date())), but I
was afraid that this would result in data loss or incorrect data. Now
I'm doing most of the math in the db (e.g. update set total = total + 1).
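
To make the "math in the db" idea concrete, here is a minimal JDBC
sketch. The table name tag_counts, the String id type and the
class/method names are assumptions made up for illustration; only the
column names (id, total, lastused) come from the table described in
this mail. It is not the actual updateTagCounts() code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

final class TagCountSql {
    // Hypothetical table name; the columns follow the table described in this mail.
    static final String INCREMENT =
        "UPDATE tag_counts SET total = total + 1, lastused = ? WHERE id = ?";

    // Lets the database do the arithmetic in a single statement, instead of
    // reading the count into Java, adding one, and writing it back.
    static int increment(Connection con, String id, Timestamp now) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(INCREMENT)) {
            ps.setTimestamp(1, now);
            ps.setString(2, id);
            return ps.executeUpdate(); // number of rows updated (0 if the row does not exist)
        }
    }
}
```

Because the increment is one UPDATE, two nodes bumping the same row
should not overwrite each other's count. It does not help when the row
does not exist yet, which is the case discussed next.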

The table is (id, websiteid, name, total, lastused), and for each tag
name we have one row for the whole site (websiteid = NULL) and one for
each website that used it. So if two users on separate cluster nodes
use the same tag for the first time in the system at the same time, we
will end up with two rows for that tag with websiteid = NULL.

There are a few options that I've considered:

- If we were to hash the websiteid and tag name and use that as the id,
the transaction would abort for one of the users because we can't have
duplicated primary keys.

- We could add constraints to the table to make sure that there are no
duplicate pairs (websiteid, name) in the table. But I'm not sure how to
do that in Hibernate.

Note: both of the approaches above would throw an exception, and the
user would have to try again.

- We could do the extra work and do a sum() on getHotTags()/getTags().

- We could write a task that merges these rows once in a while.
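
As a rough sketch of the first option (hashing the websiteid and tag
name into the id), something like the following could work. The class
name, the choice of SHA-256 and the key layout are my assumptions for
illustration, not existing code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class TagRowIds {
    // Derives a stable id from (websiteid, name). A null websiteid stands
    // for the site-wide row, as in the table described above.
    static String idFor(String websiteId, String tagName) {
        // Length-prefix the website id so ("ab", "c") and ("a", "bc") produce
        // different keys; "-" marks the site-wide row (websiteid = NULL).
        String site = (websiteId == null) ? "-" : websiteId.length() + ":" + websiteId;
        String key = site + "|" + tagName;
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With ids built this way, two nodes inserting the same (websiteid, name)
pair would compute the same primary key, so the second insert would
fail instead of creating a duplicate row. The id column would need to
hold 64 characters for this particular hash.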

I'd like to point out that this case shouldn't happen very often, but I
wanted to make you aware of it and ask for suggestions. At the moment,
I'm taking the safe approach of grouping by name and computing the
sum(). Also, when looking for a tag row to update, I always pick the
latest one, in the hope of leaving the other duplicate with a low
count. In summary, I'm comfortable with the approach I've taken for
now, but I welcome suggestions.
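
To illustrate the current approach, here is an in-memory sketch of the
"group by name and sum()" read plus the "pick the latest row" rule for
updates. The Row class and method names are hypothetical stand-ins, and
the rows are assumed to be already filtered to one scope (for example
websiteid = NULL); the real work happens in queries, not in Java
collections.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TagTotals {
    // Minimal stand-in for one row of the tag table (hypothetical class).
    static final class Row {
        final String name;
        final int total;
        final Date lastUsed;

        Row(String name, int total, Date lastUsed) {
            this.name = name;
            this.total = total;
            this.lastUsed = lastUsed;
        }
    }

    // In-memory equivalent of grouping by name and summing total, so that
    // duplicate rows for the same tag are reported as a single count.
    static Map<String, Integer> sumByName(List<Row> rows) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Row r : rows) {
            totals.merge(r.name, r.total, Integer::sum);
        }
        return totals;
    }

    // Among duplicate rows for one tag name, pick the most recently used
    // one as the row to update.
    static Row latest(List<Row> rows, String name) {
        Row best = null;
        for (Row r : rows) {
            if (r.name.equals(name) && (best == null || r.lastUsed.after(best.lastUsed))) {
                best = r;
            }
        }
        return best; // null if there is no row for this name
    }
}
```

Since every update goes to the most recently used row, the older
duplicate stops growing, and the summed read stays correct whether or
not the rows are ever merged.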

Regards,

-Elias
