On 9/14/05, Allen Gilliland <[EMAIL PROTECTED]> wrote:
> I have comments, but let me first explain the work I prototyped so that you 
> have an idea of how I approached the problem.
> 
> To add tag support I did 2 things:
> 1. added a form field for collecting tags on the weblog entry form.
> 2. created a TagServlet for finding entries with various combinations of tags.
> 
> I created custom tables for storing my tags ...
> 
> create table roller_weblogentrytags(
>   id varchar(48) not null,
>   tag varchar(32) not null,
>   weblogentryid varchar(48) not null,
>   primary key(id));
> 
> I purposely set up the table to be ultra simple and partially denormalized 
> (i.e. the relationship is really many-to-many, but I didn't set up the tables 
> that way).  So the table only maintains the tag value and what entry it 
> corresponds to.  At the time I wasn't concerned with maintaining any other 
> metadata regarding a single tag.
> 
> The TagServlet then allows users to enter a combination of keywords and see 
> what entries come up.  The urls for the TagServlet are like this ...
> 
> /entries/some+combo+of+tags
> /entries/atagvalue
> 
> using a "+" indicates an intersection, i.e. only show results that include 
> all the listed tags.
> 
> Obviously adding the tag collection form fields is trivial, but knowing how 
> to store the data properly so that it's efficient to lookup various tag 
> combinations is tough.  I would love to see what the data model for a site 
> like del.icio.us looks like because it would give us some great insights into 
> how they maintain efficiency.
> 
> Caching could be tough because we will likely get extremely varied queries 
> with different combinations of tags.  Then on top of that it would be nice to 
> have the ability to do a lot of the "popular tags" stuff that del.icio.us 
> does.
> 
> Anyways ... I have comments on your email inline below ...

The reason I'm not striving for lookup efficiency is that I wanted to
leave it up to Lucene (or, in the IBM case, the OmniFind search
engine) to deal with the queries. I believe Lucene has a way to add
query terms so you can ask for posts with tag:apple and tag:farm, etc.
Plus, of course, there is the added benefit of having tags in the
rendered templates for Technorati to consume. I don't think that
/tag/apple+farm is something that Roller users are in desperate need
of at this moment, but I could be wrong.
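
To make concrete what a /tag/apple+farm style lookup has to compute,
here is a minimal Java sketch. It assumes the (tag, weblogentryid)
rows have already been loaded into memory; the class and method names
(TagLookup, addTag, entriesWithAllTags) are hypothetical and are not
part of Roller, and this is not how Lucene or the TagServlet does it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory index: tag value -> ids of entries carrying that tag.
final class TagLookup {
    private final Map<String, Set<String>> entriesByTag = new HashMap<>();

    // Mirrors one row of a (tag, weblogentryid) table.
    void addTag(String entryId, String tag) {
        entriesByTag.computeIfAbsent(tag, k -> new TreeSet<>()).add(entryId);
    }

    // "apple+farm" means intersection: only entries that carry every listed tag.
    Set<String> entriesWithAllTags(String plusSeparatedTags) {
        Set<String> result = null;
        for (String tag : plusSeparatedTags.split("\\+")) {
            Set<String> ids = entriesByTag.getOrDefault(tag, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);   // copy so the index is not modified
            } else {
                result.retainAll(ids);         // keep only ids present for this tag too
            }
        }
        return result == null ? Collections.emptySet() : result;
    }

    public static void main(String[] args) {
        TagLookup lookup = new TagLookup();
        lookup.addTag("entry-1", "apple");
        lookup.addTag("entry-1", "farm");
        lookup.addTag("entry-2", "apple");
        System.out.println(lookup.entriesWithAllTags("apple+farm"));
    }
}
```

The intersection itself is cheap; the open question in this thread is
where the per-tag id sets come from (a table, a cache, or a search
index).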

Let's talk more to see what we should be concentrating on for this feature.

> 
> 
> On Wed, 2005-09-14 at 12:25, Elias Torres wrote:
> > Allen,
> >
> > I was thinking of using the entryattribute table, what do you think? I
> > don't think that we want another table for every little feature. At
> > first I was thinking of something simple, like a "text" field a la
> > del.icio.us as another + in the settings section of the post that can
> > be edited by the user anytime. Maybe then using the Tag render plugin
> > for just rendering the tags and also making sure that Lucene indexes
> > the tags as well. I don't think we need to worry about the big content
> > or technorati style dashboards yet, but at least start collecting the
> > data.
> 
> hmmm, it depends on how you want to use the entry attribute table.  were you 
> going to set a single attribute called "tags" which is a list of all the 
> entry tags?  or are you planning to do an attribute per tag?
> 
> i agree that getting tag data is important, but if you can't use that tag 
> data for something useful then what's the point?  if we are going to do tag 
> support then i'd at least like to see some way of finding tagged entries 
> included in the first release.
> 
> >
> > I do have a problem with the entryattribute table in general because
> > it's very limiting. For example, it's really cumbersome if I want to
> > store both the tag and the date it was inserted on. Even worse, if I
> > had another piece of metadata about that tagging to insert. It works
> > for MediaCast right now because you only have attributes about the
> > entry and not about the actual entry metadata. I had mentioned to
> > Dave on IRC that since my day job is on Semantic Web stuff, maybe
> > making that table a more RDF-friendly table would be really cool for
> > Roller.
> 
> this is murky water if you ask me.  i think i like the fact that the 
> entryattribute table is a simple hashtable of data attached to a weblog 
> entry, that keeps it simple.  if you need to relate complex data to an entry 
> then it's probably best that you create a new table for that data.
> 
> i am all for reuse of existing architecture as long as it works, but if it is 
> going to inhibit our ability to effectively use the tags then i say forget 
> the entryattribute table and go ahead and do whatever you need to do.
> 
> i don't really know what you mean by RDF-friendly, so you'd have to elaborate 
> more.
> 

Basically, the entryid column should be a normal column that can take a
URI/URL and not just an entryid, with maybe another column for the
entryid so we can quickly fetch all of the triples associated with
that entry.

I can then do this:

<entryid-1> <hasTagging> <tagging1>
<entryid-1> <hasTagging> <tagging2>
<tagging1> <dc:date>   "2005-09-13"
<tagging1> <tag>   "blogs"
<tagging2> <dc:date>   "2005-09-15"
<tagging2> <tag>   "farm"

plus things like:

<tagging1> <syn> "weblogs"
<tagging1> <syn> "blog"
...

Again, if we are going to bake Tags into the core, then the table you
mentioned would be best for the servlet to render entries. But for any
entryattribute/metadata, I think the RDF might be more flexible for
things like structuredblogging.
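
As a rough illustration of the triples above, here is a minimal Java
sketch of a subject/predicate/object store and of reading an entry's
tags back by following hasTagging. Everything in it (TripleStore,
Triple, objects, tagsForEntry) is a hypothetical name invented for
this example; it is not an existing Roller class or a real RDF
library, and a real version would sit on top of a table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for an RDF-friendly attribute table.
final class TripleStore {

    // One row: subject, predicate, object (all kept as plain strings here).
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    // All objects for a given subject and predicate, in insertion order.
    List<String> objects(String subject, String predicate) {
        List<String> found = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject) && t.predicate.equals(predicate)) {
                found.add(t.object);
            }
        }
        return found;
    }

    // Follow entry -> hasTagging -> tagging -> tag.
    List<String> tagsForEntry(String entryId) {
        List<String> tags = new ArrayList<>();
        for (String tagging : objects(entryId, "hasTagging")) {
            tags.addAll(objects(tagging, "tag"));
        }
        return tags;
    }

    public static void main(String[] args) {
        TripleStore store = new TripleStore();
        store.add("entryid-1", "hasTagging", "tagging1");
        store.add("entryid-1", "hasTagging", "tagging2");
        store.add("tagging1", "dc:date", "2005-09-13");
        store.add("tagging1", "tag", "blogs");
        store.add("tagging2", "dc:date", "2005-09-15");
        store.add("tagging2", "tag", "farm");
        System.out.println(store.tagsForEntry("entryid-1"));
    }
}
```

The extra hop through the tagging node is what lets a date or synonym
hang off each individual tagging, which a flat name/value attribute
cannot express.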

> -- Allen
> 
> 
> >
> > What does everyone think? Who else is using the entryattribute table
> > besides MediaCast?
> >
> > Elias
> >
> > On 9/14/05, Allen Gilliland <[EMAIL PROTECTED]> wrote:
> > > Elias,
> > >
> > > I had actually begun working on tag support and prototyped it back in
> > > July, but I didn't get much feedback/support on it so I focused on some
> > > other things instead.  I still have some code that works if that would 
> > > help.
> > >
> > > http://www.rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_WeblogTags
> > >
> > > It looks like I had also started a very simple design doc which you are
> > > welcome to elaborate on.  To be honest I don't think adding tag support
> > > takes much code, but it will require a significant amount of design
> > > because it will require a lot of dynamic content on conceivably large
> > > sets of data.
> > >
> > > I'm definitely looking forward to what you come up with, this would be a
> > > great addition to Roller.
> > >
> > > -- Allen
> > >
> > >
> > > Elias Torres wrote:
> > >
> > > >I have updated my patch to now work with DB2. Everything seems to be
> > > >working beautifully.
> > > >
> > > >http://torrez.us/2005/08/23/roller/patches/db2_derby.hibernate3.patch
> > > >
> > > >Regards,
> > > >
> > > >Elias
> > > >
> > > >PS> Now onto tagging.
> > > >
> > > >Heads up. I would like to add tagging to Roller, possibly using the
> > > >metadata table. I'll try to draft something up on the wiki.
> > > >
> > > >On 9/13/05, Elias Torres <[EMAIL PROTECTED]> wrote:
> > > >
> > > >
> > > >>Hi Everyone,
> > > >>
> > > >>After getting the nice upgrade to Hibernate 3 by Dave, I started
> > > >>working on testing Derby support first, then DB2. I only found a
> > > >>couple of issues with Derby so far, everything seems to run fine.
> > > >>
> > > >>Here's my patch:
> > > >>http://torrez.us/2005/08/23/roller/patches/derby_hibernate3.patch
> > > >>
> > > >>Basically,
> > > >>
> > > >>There was a getInt() that doesn't seem to work on strings for Derby,
> > > >>so I did this:
> > > >>-                dbversion = rs.getInt(1);
> > > >>+                dbversion = Integer.parseInt(rs.getString(1));
> > > >>
> > > >>The next one was a query in HibernateRefererManagerImpl.java which is
> > > >>not performed via Hibernate and there was a "limit" keyword which is
> > > >>not supported by Derby. I first tried the HSQL version, but Derby
> > > >>doesn't support TOP either. I added a check on the loop for max
> > > >>results, somebody please verify that this is ok. Thanks.
> > > >>
> > > >>Elias
> > > >>
> > > >>PS> Now onto DB2.
> > > >>
> > > >>
> > > >>
> > >
> 
>
