Re: Tagging support in Roller

Allen Gilliland Wed, 13 Sep 2006 11:25:34 -0700


Elias Torres wrote:

I replied to your good comments and updated the wiki. Could we discuss
making the 3.1 branch and moving 3.0 branch to trunk?

yeah, but let's talk about that in a different thread so we don't get
mixed up. i have a couple more comments below ...


-Elias

Allen Gilliland wrote:


*snip*

getAllTags() - how? this could return thousands of results

This is for tcloud (I forgot to mention that the return is not TagData
but TagCloudEntry (a pair of tagname and count)).

getAllTags(WebsiteData website) - again, how?  why?

Website cloud of entry tags.

doesn't that make it of even greater concern?  i would still expect a
decent sized site to have thousands of unique tags and then to get an
aggregate count of each of those tags to return in this method would be
a lot of data.

i don't have a problem with this as long as the results can be limited
somehow.


ok. I have been thinking of having a table

create table websitetagcloud (
  id,
  websiteid,
  name,
  count
);

so we can return this data quickly and we can do some limits here such
as only tags with count > 1 or something like that. I've updated the
Wiki page with this and other changes.
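
A minimal sketch of how such an aggregate table could be read for a cloud, written in Java against an in-memory list rather than a real database. The names `TagCountRow`, `TagCountQuery` and `tagsForCloud` are hypothetical and not part of Roller; `minCount` and `maxResults` stand in for the "count > 1" style of limit mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory stand-in for one row of the proposed table.
class TagCountRow {
    final String websiteId;
    final String name;
    final int count;

    TagCountRow(String websiteId, String name, int count) {
        this.websiteId = websiteId;
        this.name = name;
        this.count = count;
    }
}

class TagCountQuery {
    // Returns tag names for one website whose count is greater than minCount,
    // highest count first, capped at maxResults entries.
    static List<String> tagsForCloud(List<TagCountRow> rows, String websiteId,
                                     int minCount, int maxResults) {
        List<TagCountRow> matches = new ArrayList<>();
        for (TagCountRow row : rows) {
            if (row.websiteId.equals(websiteId) && row.count > minCount) {
                matches.add(row);
            }
        }
        matches.sort(Comparator.comparingInt((TagCountRow r) -> r.count).reversed());

        List<String> names = new ArrayList<>();
        for (TagCountRow row : matches) {
            if (names.size() >= maxResults) {
                break;
            }
            names.add(row.name);
        }
        return names;
    }
}
```

With a real table the same filter and cap would presumably be pushed into the query itself so that only the limited rows ever leave the database.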

i like that idea. i'm not sure it's a 'cloud', it's more of a tagcount
or tagaggregate which is likely to be used to create the cloud. i
actually see even more opportunity to extract relevant tag data into
this table.

would it also make sense to put the date in this table? that way the
tag count could be time sensitive, so you could restrict the set to tags
used in a certain timeframe, like tags used within the last hour. it
would also be cool to do counts given various timeframes, so a dayCnt,
weekCnt, monthCnt, totalCnt. that way you could track what tags are
popular for a given day or week. of course the downside to that is you
have to worry about resetting those cnts :/
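
One way to picture the per-timeframe counts is to derive them from stored usage timestamps instead of keeping counters that need resetting. This is only a sketch under assumptions: the class `TagWindowCounts` and its `fromUsage` method are hypothetical, timestamps are epoch milliseconds, and a "month" is treated as 30 days.

```java
import java.util.List;

// Hypothetical holder for the dayCnt/weekCnt/monthCnt/totalCnt idea.
class TagWindowCounts {
    static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    final int dayCnt;
    final int weekCnt;
    final int monthCnt;
    final int totalCnt;

    TagWindowCounts(int dayCnt, int weekCnt, int monthCnt, int totalCnt) {
        this.dayCnt = dayCnt;
        this.weekCnt = weekCnt;
        this.monthCnt = monthCnt;
        this.totalCnt = totalCnt;
    }

    // Counts how many uses of one tag fall inside each trailing window,
    // measured back from nowMillis. A "month" is assumed to be 30 days.
    static TagWindowCounts fromUsage(List<Long> usedAtMillis, long nowMillis) {
        int day = 0;
        int week = 0;
        int month = 0;
        for (long usedAt : usedAtMillis) {
            long age = nowMillis - usedAt;
            if (age <= DAY_MILLIS) day++;
            if (age <= 7 * DAY_MILLIS) week++;
            if (age <= 30 * DAY_MILLIS) month++;
        }
        return new TagWindowCounts(day, week, month, usedAtMillis.size());
    }
}
```

Deriving the counts this way trades the reset problem for a scan over the usage rows, so it would likely need caching or periodic precomputation on a busy site.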


is there any other aggregate data which could be useful in this table?

getTagsOrderByCount(WebsiteData website, int count) - ok, for cloud?

I guess we don't need a hottags for a specific site, and it could
probably be done with getAllTags(WebsiteData)

getTagsOrderByCount(int count) - ditch, just use method above

This is used for HotTags for the entire site.

all of the ones where i said 'ditch, just use method above' i was trying
to suggest that we only need the 1 method signature and if it accepts a
website then that param is optional.  so if the website is non-null then
the results are restricted to the website, otherwise they apply to the
site as a whole.


I understood your suggestion and I'm taking it, I was just clarifying
the difference between the two calls.

that just cuts down on the number of methods and in all likelihood the
implementation of getTagsOrderByCount(count) would have been just to
call getTagsOrderByCount(null, count), so why have the extra method
signature in the manager interface.

+1
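
The single-signature idea agreed above can be sketched as follows: a null website means "the site as a whole", and a non-null one restricts the results to that weblog. `SketchWebsite` and `SketchTagManager` are hypothetical stand-ins, not Roller's WebsiteData or its manager interface, and the in-memory map is assumed purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for WebsiteData; only the id matters here.
class SketchWebsite {
    final String id;

    SketchWebsite(String id) {
        this.id = id;
    }
}

class SketchTagManager {
    // website id -> (tag name -> count); an assumed in-memory store.
    private final Map<String, Map<String, Integer>> countsByWebsite = new HashMap<>();

    void addTagUse(SketchWebsite website, String tag) {
        countsByWebsite.computeIfAbsent(website.id, k -> new HashMap<>())
                       .merge(tag, 1, Integer::sum);
    }

    // website == null: counts are summed across every weblog on the site.
    // website != null: only that weblog's counts are considered.
    List<String> getTagsOrderByCount(SketchWebsite website, int count) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> site : countsByWebsite.entrySet()) {
            if (website != null && !site.getKey().equals(website.id)) {
                continue;
            }
            for (Map.Entry<String, Integer> e : site.getValue().entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> ordered = new ArrayList<>(totals.entrySet());
        // Highest count first; ties broken by name to keep the order stable.
        ordered.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> e : ordered) {
            if (names.size() >= count) {
                break;
            }
            names.add(e.getKey());
        }
        return names;
    }
}
```

A caller wanting site-wide hot tags would pass null for the website, which is exactly what a separate getTagsOrderByCount(count) overload would have done internally.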

removeTag(String id) - ok, also need removeTag(tag)

+1

findTags(WebsiteData website, String pattern, int maxResults) - ok
findTags(String pattern, int maxResults) - ditch, just use method above

+1

also, i think every method needs to have a 'limit' parameter to limit
the result set and the maxResults should be configurable at the site
wide level so that we can prevent methods provided to users from
returning overly large result sets.
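
As a rough illustration of a per-call limit capped by a site-wide maximum: the class `ResultLimiter` and its methods are hypothetical, the site-wide value is passed in as a plain int rather than read from any real Roller property, and treating a non-positive request as "use the site-wide maximum" is an assumption.

```java
import java.util.List;

class ResultLimiter {
    // Never returns more than siteWideMax, whatever the caller asks for.
    // Assumption: a non-positive request means "no preference".
    static int effectiveLimit(int requested, int siteWideMax) {
        if (requested <= 0) {
            return siteWideMax;
        }
        return Math.min(requested, siteWideMax);
    }

    // Truncates an already ordered result list to the effective limit.
    static <T> List<T> limit(List<T> results, int requested, int siteWideMax) {
        int n = Math.min(effectiveLimit(requested, siteWideMax), results.size());
        return results.subList(0, n);
    }
}
```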

Could we use pagers instead? Limits feel too artificial for me and we
could be cutting out important information all of the time.

Yes we can, although our concept of a pager isn't like an iterator where
you walk through the results one chunk at a time.  it only gives a
view of a portion of an overall collection and provides a standard way
to link to alternate views of the collection.  I'm not sure if that fits
with what you are expecting to do.
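
To make that description concrete, here is a small sketch of a pager as a view over one portion of a collection plus links to the neighbouring views. `TagPagerSketch`, its method names and the `?page=` URL parameter are all assumptions for illustration and are not taken from Roller's pager code.

```java
import java.util.List;

class TagPagerSketch<T> {
    private final List<T> all;
    private final String baseUrl;
    private final int page;      // zero-based page index
    private final int pageSize;

    TagPagerSketch(List<T> all, String baseUrl, int page, int pageSize) {
        this.all = all;
        this.baseUrl = baseUrl;
        this.page = page;
        this.pageSize = pageSize;
    }

    // The portion of the overall collection this view exposes.
    List<T> getItems() {
        int from = Math.min(page * pageSize, all.size());
        int to = Math.min(from + pageSize, all.size());
        return all.subList(from, to);
    }

    // Link to the next view, or null when this is the last page.
    String getNextLink() {
        boolean hasMore = (page + 1) * pageSize < all.size();
        return hasMore ? baseUrl + "?page=" + (page + 1) : null;
    }

    // Link to the previous view, or null on the first page.
    String getPrevLink() {
        return page > 0 ? baseUrl + "?page=" + (page - 1) : null;
    }
}
```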


I think for some things like getting hottest tags a pager would work,
since we can retrieve just the first page. Pagers will definitely be
useful when displaying entries for a specific tag, since that number is
unbounded. However, for the tag cloud, I'm not sure pagers would help
much; caching will be our friend.


yep.

-- Allen

none of the methods reference username, so that makes me think we don't
really need the username associated with a tag.

My thoughts on username were for the case you want type-ahead on *your*
tags and not just a specific weblog. I think a personal tagcloud would
be nice. Disclaimer: I can't believe I'm asking for all these clouds
when in reality I'm not a big fan of them, but oh well. I guess username
is important if more than one blog author exists, should we know who
entered which tag?

That makes sense and I would think we definitely would want to do that.
 However, maybe the reference should be to user id then, since that is
the primary key for a user.  The problem with username is that it's not
the primary key of the user table, and I believe that at some point we
expect that users should be allowed to change their username.


+1 fixed on the wiki.

the getAllTags() methods bother me a bit because i would think that on
any site that gets a reasonable amount of usage those methods would
return enormous result sets.  what do we need them for anyways?

clouds. Would paging resolve the concern?

yes, paging could help.  Dave and I discussed but never implemented any
restriction on pagers.  Some pagers have natural boundaries, like entries
in a day, but the pager of the weblog's recent entries does not and it
should.

this would be another example of where a site owner should be allowed to
restrict tag paging to a certain limit so that users can't abuse the
data they are given access to.


Definitely.

-- Allen

everything else sounds about right, although it would be nice to see a
bit more info about what methods we think are needed in the site and
page models.

I'll give that more thinking later today.

-- Allen


Elias Torres wrote:

I have updated the proposal on the wiki page for tagging. Please
comment/delete/change/add/etc to it. I'll be glad to discuss and
improve it.

http://rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_WeblogTags

-Elias

On 9/11/06, Elias Torres <[EMAIL PROTECTED]> wrote:

Hi Guys,

We initially implemented a tagging function into Roller 2.0 (at IBM) but
that really never made it into core because of my lack of effort in
completing a few things that Allen had suggested before it was
functional enough. I replied to his feedback answering some of the
concerns (which I didn't think were major) [1], but I never got a direct
reply to my email. We would like to move to 3.0+ but we can't until
tagging is in place.

There's the big decision of whether we support either categories or
tags or both. I'm fine with supporting both as long as we can disable
either one or none in the UI through roller.properties.

I'm willing to code it in any specific way and don't have a set way in
mind. I'm fine with using Lucene, as Ian Kellen had suggested a long
time ago, for performance and using the db just as persistent storage.
I'll be very thorough and make sure there's a tab for the tag cloud,
feeds and proper methods in beans, velocity models, etc.

Thanks,

Elias

[1]
http://www.nabble.com/Re%3A-Evalutating-tag-support-p3972587s12275.html
