Re: Tagging support in Roller

Allen Gilliland Wed, 13 Sep 2006 09:59:25 -0700


Elias Torres wrote:


Allen Gilliland wrote:

Ok.  I took another pass over the proposal and have a few thoughts/ideas
I think we can talk about here on the list to get this moving.


Excellent, thanks for taking a look at it.

1. We should ditch the current branch based on roller 2.1 and recreate
it using the 3.0 codebase, migrating whatever code is still relevant.

+1

2. I suggest we follow Anil's suggestion in the issues section and not
try to make any decisions about categories vs. tags.  That's definitely
still an important debate, but I suggest we get tags going first then we
can actually see how each is used and make a decision based on acquired
feedback rather than speculation.

+1

3. I suggest we hold off on the tagging for weblogs and just do tags for
weblog entries.  This will just narrow the scope of the proposal a bit.


+0 I hadn't noticed the suggestion until yesterday but it seems like a
fine idea. If we want to do it later, that's fine, but I don't mind
doing it together.

I mainly suggested this to keep things simple. We know that we want
tagging on entries, but tagging on weblogs could be debated. If we
leave it out for now then the proposal means less work and greater
likelihood that we can get it done in time. Plus, we don't even know if
people really want tagging at the weblog level, it may be something that
we would do that's not really all that useful.


So I still suggest we strip it out and just do tagging on weblog entries.

As far as the data model, classes, and methods are concerned ...

i think the weblogentrytag table looks pretty good, but i'm wondering if
we really need a reference to username, what is that for?  also, how is
the tagtime supposed to work?  does it only get set once when the tag is
added or does it get updated when the tags are updated?


tagtime is mostly for analysis of when a specific tag was inserted to
that entry, since authors can re-tag their entries. I'm fine with just
using entry modified time, but then the question is whether we duplicate
that time as well. For now, I'm fine with taking it out as well. What do
you think?

i'm not sure i understand all of the proposed manager method additions
either ...

getWebsiteTags(WebsiteData website) - not needed


If we don't implement Website tagging.

getWeblogEntriesByTag(WebsiteData website, String tag) - ok
getWeblogEntriesByTag(String tag) - ditch, just use method above

+1

getAllTags() - how? this could return thousands of results


This is for tcloud (I forgot to mention that the return is not TagData
but TagCloudEntry (a pair of tagname and count)).
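
A minimal sketch of what such a name/count pair could look like, for illustration only. The thread gives just the name TagCloudEntry and says it pairs a tag name with a count; the fields, the TagCloudSketch helper, and the in-memory counting below are assumptions, not the proposal's actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value class: a tag name paired with how often it occurs.
final class TagCloudEntry {
    final String name;
    final int count;

    TagCloudEntry(String name, int count) {
        this.name = name;
        this.count = count;
    }
}

// Hypothetical helper, not part of Roller.
final class TagCloudSketch {
    // Counts how often each tag name occurs, keeping first-seen order.
    static List<TagCloudEntry> toCloud(List<String> tagNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String tag : tagNames) {
            counts.merge(tag, 1, Integer::sum);
        }
        List<TagCloudEntry> cloud = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            cloud.add(new TagCloudEntry(e.getKey(), e.getValue()));
        }
        return cloud;
    }
}
```

In a real implementation the counting would more likely be done by the database than in memory.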

getAllTags(WebsiteData website) - again, how?  why?


Website cloud of entry tags.

doesn't that make it of even greater concern? i would still expect a
decent sized site to have thousands of unique tags and then to get an
aggregate count of each of those tags to return in this method would be
a lot of data.

i don't have a problem with this as long as the results can be limited
somehow.

getTagsOrderByCount(WebsiteData website, int count) - ok, for cloud?


I guess we don't need a hottags for a specific site; that could probably
be done with getAllTags(WebsiteData)

getTagsOrderByCount(int count) - ditch, just use method above


This is used for HotTags for the entire site.

all of the ones where i said 'ditch, just use method above' i was trying
to suggest that we only need the 1 method signature and if it accepts a
website then that param is optional. so if the website is non-null then
the results are restricted to the website, otherwise they apply to the
site as a whole.

that just cuts down on the number of methods and in all likelihood the
implementation of getTagsOrderByCount(count) would have been just to
call getTagsOrderByCount(null, count), so why have the extra method
signature in the manager interface.
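
A rough sketch of the single-signature idea, under stated assumptions: a plain String stands in for WebsiteData, the rows live in memory rather than in the weblogentrytag table, and the EntryTag and TagQuerySketch names are invented for this illustration. Only the method name getTagsOrderByCount and the "null means whole site" convention come from the thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row: one tag applied to one entry of the given website.
final class EntryTag {
    final String website;
    final String tag;

    EntryTag(String website, String tag) {
        this.website = website;
        this.tag = tag;
    }
}

// Hypothetical in-memory stand-in for the manager.
final class TagQuerySketch {
    private final List<EntryTag> rows;

    TagQuerySketch(List<EntryTag> rows) {
        this.rows = rows;
    }

    // website == null means "the site as a whole"; otherwise results are
    // restricted to that website. Returns at most 'count' tag names,
    // most used first, ties broken alphabetically.
    List<String> getTagsOrderByCount(String website, int count) {
        Map<String, Integer> counts = new HashMap<>();
        for (EntryTag row : rows) {
            if (website == null || website.equals(row.website)) {
                counts.merge(row.tag, 1, Integer::sum);
            }
        }
        List<String> tags = new ArrayList<>(counts.keySet());
        tags.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        int end = Math.min(Math.max(count, 0), tags.size());
        return new ArrayList<>(tags.subList(0, end));
    }
}
```

With this shape, a site-wide hot-tags call is getTagsOrderByCount(null, count) and no second signature is needed.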

removeTag(String id) - ok, also need removeTag(tag)

+1

findTags(WebsiteData website, String pattern, int maxResults) - ok
findTags(String pattern, int maxResults) - ditch, just use method above

+1

also, i think every method needs to have a 'limit' parameter to limit
the result set and the maxResults should be configurable at the site
wide level so that we can prevent methods provided to users from
returning overly large result sets.
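
One way the limit idea could be sketched, purely as an illustration: a caller-supplied limit is clamped to a site-wide maximum. The ResultLimitSketch class, its method names, and the choice to treat a non-positive request as "use the site maximum" are all assumptions; the thread does not say how the cap would be configured or applied.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Roller.
final class ResultLimitSketch {
    // Clamps the requested limit to the site-wide maximum.
    // Assumption: a request below 1 falls back to the site maximum.
    static int effectiveLimit(int requested, int siteMax) {
        if (siteMax < 1) {
            throw new IllegalArgumentException("siteMax must be at least 1");
        }
        if (requested < 1) {
            return siteMax;
        }
        return Math.min(requested, siteMax);
    }

    // Returns at most effectiveLimit(requested, siteMax) leading results.
    static <T> List<T> limit(List<T> results, int requested, int siteMax) {
        int end = Math.min(effectiveLimit(requested, siteMax), results.size());
        return new ArrayList<>(results.subList(0, end));
    }
}
```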


Could we use pagers instead? Limits feel too artificial for me and we
could be cutting out important information all of the time.

Yes we can, although our concept of a pager isn't like an iterator where
you want to walk through the results one chunk at a time. it only gives a
view of a portion of an overall collection and provides a standard way
to link to alternate views of the collection. I'm not sure if that fits
with what you are expecting to do.
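
To make that notion of a pager concrete, here is a hedged sketch: an object that exposes one window of a larger collection plus links to the neighbouring windows. PagerSketch, its methods, and the "?page=" URL scheme are invented for illustration and are not Roller's actual pager classes.

```java
import java.util.List;

// Hypothetical pager: a view of one page of a collection, with links
// to the previous and next views. Not Roller's real pager API.
final class PagerSketch<T> {
    private final List<T> all;
    private final int page;      // zero-based page number
    private final int pageSize;

    PagerSketch(List<T> all, int page, int pageSize) {
        if (page < 0 || pageSize < 1) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize >= 1");
        }
        this.all = all;
        this.page = page;
        this.pageSize = pageSize;
    }

    // The items visible in this view; empty if the page is past the end.
    List<T> getItems() {
        int from = (int) Math.min((long) page * pageSize, all.size());
        int to = (int) Math.min((long) from + pageSize, all.size());
        return all.subList(from, to);
    }

    // Link to the next view, or null when this is the last one.
    String getNextLink(String baseUrl) {
        boolean hasNext = ((long) page + 1) * pageSize < all.size();
        return hasNext ? baseUrl + "?page=" + (page + 1) : null;
    }

    // Link to the previous view, or null on the first page.
    String getPrevLink(String baseUrl) {
        return page > 0 ? baseUrl + "?page=" + (page - 1) : null;
    }
}
```

The caller never iterates through all pages; each request builds one view and renders the links.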

none of the methods reference username, so that makes me think we don't
really need the username associated with a tag.


My thoughts on username were for the case you want type-ahead on *your*
tags and not just a specific weblog. I think a personal tagcloud would
be nice. Disclaimer: I can't believe I'm asking for all these clouds
when in reality I'm not a big fan of them, but oh well. I guess username
is important if more than one blog author exists, should we know who
entered which tag?

That makes sense and I would think we definitely would want to do that.
However, maybe the reference should be to user id then, since that is
the primary key for a user. The problem with username is that it's not
the primary key of the user table, and I believe that at some point we
expect that users should be allowed to change their username.

the getAllTags() methods bother me a bit because i would think that on
any site that gets a reasonable amount of usage those methods would
return enormous result sets.  what do we need them for anyways?


clouds. Would paging resolve the concern?

yes, paging could help. Dave and I discussed but never implemented any
restriction on pagers. Some pagers have natural boundaries, like entries
in a day, but the pager of the weblog's recent entries does not and it
should.

this would be another example of where a site owner should be allowed to
restrict tag paging to a certain limit so that users can't abuse the
data they are given access to.


-- Allen

everything else sounds about right, although it would be nice to see a
bit more info about what methods we think are needed in the site and
page models.


I'll give that more thinking later today.

-- Allen


Elias Torres wrote:

I have updated the proposal on the wiki page for tagging. Please
comment/delete/change/add/etc to it. I'll be glad to discuss and
improve it.

http://rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_WeblogTags

-Elias

On 9/11/06, Elias Torres <[EMAIL PROTECTED]> wrote:

Hi Guys,

We initially implemented a tagging function into Roller 2.0 (at IBM) but
that really never made it into core because of my lack of effort in
completing a few things that Allen had suggested before it was
functional enough. I replied to his feedback answering some of the
concerns (which I didn't think were major) [1], but I never got a direct
reply to my email. We would like to move to 3.0+ but we can't until
tagging is in place.

There's the big decision of whether we support either categories or tags
or both. I'm fine with supporting both as long as we can disable either
one or none in the UI through roller.properties.

I'm willing to code it in any specific way and don't have a set way in
mind. I'm fine with using Lucene, as Ian Kellen had suggested a long time
ago, for performance and using the db just as persistent storage. I'll be
very thorough and make sure there's a tab for the tag cloud, feeds and
proper methods in beans, velocity models, etc.

Thanks,

Elias

[1]
http://www.nabble.com/Re%3A-Evalutating-tag-support-p3972587s12275.html
