Elias,

here's the code I was talking about.  i took a little time this afternoon 
updating it to roller 2.0, so you should be able to just unpack the files in 
a roller 2.0 workspace and do the build.  you have to add the custom db tables 
manually ... they are in metadata/database/roller_tags.sql

http://blogs.sun.com/roller/resources/gconf/roller-tags.tar.gz

cheers.

-- Allen


On Wed, 2005-09-14 at 19:36, Allen Gilliland wrote:
> On Wed, 2005-09-14 at 16:12, Elias Torres wrote:
> > On 9/14/05, Allen Gilliland <[EMAIL PROTECTED]> wrote:
> > > On Wed, 2005-09-14 at 13:31, Elias Torres wrote:
> > > > The reason I'm not striving for lookup efficiency is that I
> > > > wanted to leave it up to Lucene, or in IBM's case the OmniFind search
> > > > engine, to deal with the queries. I believe Lucene has a way to add
> > > > query terms so you can say posts with tag:apple and tag:farm, etc.
> > > > Plus of course, the added benefit of having tags for technorati to
> > > > consume in the rendered templates. I don't think that /tag/apple+farm
> > > > is something that Roller users are in desperate need of at this
> > > > moment, but I could be wrong.
> > > 
> > > I agree that adding the tags to the pages to be consumed by sites like 
> > > Technorati makes sense, but should we really *rely* on that?  technorati 
> > > requires actual anchors (<a>) which some Roller users may not want on 
> > > their weblogs, so we'd have to support a way for the tag data to be html 
> > > embedded metadata that doesn't affect page display.
> > > 
> > 
> > Sorry, I didn't mean to rely on technorati for searching but that it
> > would be a side benefit of having tags like with Lance's plugin except
> > the tags are not stored in the text but in the db.
> > 
> > > A possible part of the problem with this approach is that to get all that 
> > > > tag metadata into users' pages would require everyone to update their page 
> > > templates.  How would you propose to insert the tag data into user pages?
> > > 
> > 
> > That's for technorati to consume only and for users who want to
> > display their data. However, the approach would be to enhance the
> > search engine by extracting this other piece of metadata at indexing
> > time and adding UI to search to reflect this new option.
> 
> gotcha.  i think i confused this part a bit.  i agree that making the search 
> engine tag smart would be very useful and would provide a pretty easy way for 
> us to make use of tagged entries.  but again, why stop there?  the 
> functionality this would provide is fairly limited.  what about rss feeds?  
> would the search engine be able to offer up an rss feed of the last 50 
> entries tagged with "apple" and "farm"?  this is something that i would love 
> to see our tagging support be capable of.
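
For illustration, the lookup described above (the newest entries carrying every requested tag) could be sketched roughly as below. This is only a sketch under stated assumptions: the table and column names (entry, entry_tag, pubtime) are hypothetical and are not taken from roller_tags.sql, and SQLite is used only so the example is self-contained. A feed would then render the returned entries.

```python
import sqlite3

def recent_entries_with_tags(conn, tags, limit=50):
    """Return ids of the newest entries that carry every tag in `tags`."""
    tags = sorted(set(tags))
    if not tags:
        return []
    placeholders = ",".join("?" for _ in tags)
    # An entry qualifies only if it matches as many distinct tags as requested.
    sql = f"""
        SELECT e.id
        FROM entry e
        JOIN entry_tag t ON t.entry_id = e.id
        WHERE t.name IN ({placeholders})
        GROUP BY e.id, e.pubtime
        HAVING COUNT(DISTINCT t.name) = ?
        ORDER BY e.pubtime DESC
        LIMIT ?
    """
    rows = conn.execute(sql, (*tags, len(tags), limit)).fetchall()
    return [row[0] for row in rows]

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id TEXT PRIMARY KEY, pubtime TEXT);
    CREATE TABLE entry_tag (entry_id TEXT REFERENCES entry(id), name TEXT);
""")
conn.executemany("INSERT INTO entry VALUES (?, ?)", [
    ("e1", "2005-09-12"), ("e2", "2005-09-13"), ("e3", "2005-09-14"),
])
conn.executemany("INSERT INTO entry_tag VALUES (?, ?)", [
    ("e1", "apple"), ("e1", "farm"),
    ("e2", "apple"),
    ("e3", "farm"), ("e3", "apple"),
])
```

Calling recent_entries_with_tags(conn, ["apple", "farm"]) on this sample data skips e2, which has only one of the two tags.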
> 
> > 
> > > Another point to note is that at BSC we have disabled the built-in Lucene 
> > > search because it has problems, so instead we use our custom Sun 
> > > Onesearch mechanism.  At least in our case this would make your 
> > > implementation harder to manage because we would then need to make 
> > > further customizations to our search engine just to support tag specific 
> > > searches.
> > > 
> > 
> > Right. 
> > 
> > > I would also disagree that the /tag/apple+farm functionality is something 
> > > that Roller users don't want.  At BSC we regularly get users asking why 
> > > we don't provide a kind of blog directory which makes it easier for users 
> > > to find weblogs.  All that Roller has is a search and the main page, and 
> > > > neither of those provides a navigable structure for browsing blogs.
> > > 
> > > Searching is definitely the best way to find what you're looking for, but 
> > > what if you aren't really looking for anything in specific?  I like the 
> > > idea that the TagServlet homepage could provide some fun ways for people 
> > > to find blogs, like listing things like ... Most Recent Tags and Most 
> > > Popular Tags kind of the way flickr does it 
> > > (http://www.flickr.com/photos/tags/).  Searching would have no way of 
> > > doing that.
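
As a rough sketch of the two listings mentioned above (Most Popular Tags and Most Recent Tags), assuming each tagging is available as an (entry id, tag, date) tuple. That tuple layout and the function names are assumptions made for this example, not anything from the Roller code.

```python
from collections import Counter

def most_popular_tags(taggings, n=10):
    """Return the n most used tags as (tag, count) pairs."""
    counts = Counter(tag for _entry_id, tag, _tagged_on in taggings)
    return counts.most_common(n)

def most_recent_tags(taggings, n=10):
    """Return the n tags that were applied most recently, newest first."""
    latest = {}
    for _entry_id, tag, tagged_on in taggings:
        # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
        if tag not in latest or tagged_on > latest[tag]:
            latest[tag] = tagged_on
    return sorted(latest, key=lambda tag: latest[tag], reverse=True)[:n]

# Hypothetical sample data, for illustration only.
taggings = [
    ("e1", "apple", "2005-09-12"),
    ("e1", "farm", "2005-09-12"),
    ("e2", "apple", "2005-09-13"),
    ("e2", "farm", "2005-09-13"),
    ("e3", "apple", "2005-09-14"),
    ("e4", "blogs", "2005-09-15"),
]
```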
> > 
> > I wholeheartedly agree with this part. You are correct that we need a
> > better dashboard. The first version of our internal deployment had a
> > lot more in this regard. We have a special frontpage once you've
> > logged in that includes things like recent comments to your entries,
> > recent entries for quick editing and a huge query that found the most
> > recent threads you've participated in and links to the posts and
> > comments. But this will be scrapped when we move to 2.0 so we have to
> > start fresh. I did not want to put that burden on Roller, so I was
> > thinking a separate application to do that for performance. We also
> > have a ton of stats on users per country, number of blogs and number
> > of entries, postings per month, etc for everyone in the company to
> > see.
> 
> ahhh ... i think this is exactly what Roller needs!  i think that a way to 
> find/sort/view entries via tags would be the most important and usable 
> component of a Roller dashboard.  i also think we could use a simple 
> directory which just lists all weblogs and provides some ways to break 'em 
> down ... i.e. Newest Weblogs or Japanese Weblogs, etc.
> 
> basically, i think Roller needs a nice dashboard that allows users to find 
> weblogs and entries in any way that seems useful.  tagging seems like a good 
> way to do this.
> 
> > > 
> > > 
> > > >
> > > > Let's talk more to see what we should be concentrating on for this 
> > > > feature.
> > > >
> > > > >
> > > > >
> > > > > On Wed, 2005-09-14 at 12:25, Elias Torres wrote:
> > > > > > Allen,
> > > > > >
> > > > > > I was thinking of using the entryattribute table, what do you 
> > > > > > think? I
> > > > > > don't think that we want another table for every little feature. At
> > > > > > first I was thinking of something simple, like a "text" field a la
> > > > > > del.icio.us as another + in the settings section of the post that 
> > > > > > can
> > > > > > be edited by the user anytime. Maybe then using the Tag render 
> > > > > > plugin
> > > > > > for just rendering the tags and also making sure that Lucene indexes
> > > > > > the tags as well. I don't think we need to worry about the big 
> > > > > > content
> > > > > > or technorati style dashboards yet, but at least start collecting 
> > > > > > the
> > > > > > data.
> > > > >
> > > > > hmmm, it depends on how you want to use the entry attribute table.  
> > > > > were you going to set a single attribute called "tags" which is a 
> > > > > list of all the entry tags?  or are you planning to do an attribute 
> > > > > per tag?
> > > > >
> > > > > i agree that getting tag data is important, but if you can't use that 
> > > > > tag data for something useful then what's the point?  if we are going 
> > > > > to do tag support then i'd at least like to see some way of finding 
> > > > > tagged entries included in the first release.
> > > > >
> > 
> > Agreed.
> > 
> > > > > >
> > > > > > I do have a problem with the entryattribute table in general because
> > > > > > it's very limiting. For example, it's really cumbersome if I want to
> > > > > > store both the tag and the date it was inserted on. Even worse, if I
> > > > > > had another piece of metadata about that tagging to insert. It works
> > > > > > for MediaCast right now because you only have attributes about the
> > > > > > > entry and not about the actual entry metadata. I had mentioned to
> > > > > > Dave on IRC that since my day job is on Semantic Web stuff, maybe
> > > > > > making that table a more RDF-friendly table would be really cool for
> > > > > > Roller.
> > > > >
> > > > > this is murky water if you ask me.  i think i like the fact that the 
> > > > > entryattribute table is a simple hashtable of data attached to a 
> > > > > weblog entry, that keeps it simple.  if you need to relate complex 
> > > > > data to an entry then it's probably best that you create a new table 
> > > > > for that data.
> > > > >
> > > > > i am all for reuse of existing architecture as long as it works, but 
> > > > > if it is going to inhibit our ability to effectively use the tags 
> > > > > then i say forget the entryattribute table and go ahead and do 
> > > > > whatever you need to do.
> > > > >
> > > > > i don't really know what you mean by RDF-friendly, so you'd have to 
> > > > > elaborate more.
> > > > >
> > > >
> > > > Basically the entryid column should be a normal column that can take a
> > > > URI/URL and not just an entryid and maybe another column for entryid
> > > > so we can fetch quickly all of the triples associated with that entry.
> > > >
> > > > I can then do this:
> > > >
> > > > <entryid-1> <hasTagging> <tagging1>
> > > > <entryid-1> <hasTagging> <tagging2>
> > > > <tagging1> <dc:date>   "2005-09-13"
> > > > <tagging1> <tag>   "blogs"
> > > > <tagging2> <dc:date>   "2005-09-15"
> > > > <tagging2> <tag>   "farm"
> > > >
> > > > plus things like:
> > > >
> > > > <tagging1> <syn> "weblogs"
> > > > <tagging1> <syn> "blog"
> > > > ...
> > > >
> > > > Again, if we are going to bake Tags into the core, then the table you
> > > > mentioned would be best for the servlet to render entries. But for any
> > > > entryattribute/metadata, I think the RDF might be more flexible for
> > > > things like structuredblogging.
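
To make the triple layout above concrete, here is a minimal in-memory sketch. It assumes each row is (subject, predicate, object) and that rows are additionally keyed by entry id so that everything about one entry can be fetched in a single lookup. The class and method names are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

class TripleStore:
    """Toy triple store with an extra entry-id key for fast per-entry fetches."""

    def __init__(self):
        self._by_entry = defaultdict(list)

    def add(self, entry_id, subject, predicate, obj):
        # The subject may be the entry itself or any other URI-like name.
        self._by_entry[entry_id].append((subject, predicate, obj))

    def triples_for_entry(self, entry_id):
        return list(self._by_entry.get(entry_id, []))

    def tags_for_entry(self, entry_id):
        triples = self._by_entry.get(entry_id, [])
        # First find the tagging nodes hanging off the entry, then their tags.
        taggings = {o for s, p, o in triples
                    if s == entry_id and p == "hasTagging"}
        return sorted(o for s, p, o in triples
                      if s in taggings and p == "tag")

# The example triples from the message above.
store = TripleStore()
store.add("entryid-1", "entryid-1", "hasTagging", "tagging1")
store.add("entryid-1", "entryid-1", "hasTagging", "tagging2")
store.add("entryid-1", "tagging1", "dc:date", "2005-09-13")
store.add("entryid-1", "tagging1", "tag", "blogs")
store.add("entryid-1", "tagging2", "dc:date", "2005-09-15")
store.add("entryid-1", "tagging2", "tag", "farm")
```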
> > > 
> > > I think you lost me on that first part.  The entryid column of the 
> > > entryattribute table is a foreign key relationship to the weblogentry 
> > > table, so i don't see how you could change that.
> > > 
> > > What is the purpose for tracking what date a specific tag was entered on? 
> > >  I would bet that 99% of entries would be tagged only once and all tags 
> > > would be entered at the same time.
> > 
> > This was just an example. Although it'd be nice to track tag usage and
> > the tag date would be more accurate than the post date. I've done some
> > studies on that.
> > 
> > http://torrez.us/archives/2005/07/13/364
> > 
> > > 
> > > I like the simplicity of your tagging approach via searching, but as I 
> > > said above, I wonder if it's so simple that it's limiting.  Your approach 
> > > basically leaves everything up to the search engine which I believe we 
> > > have less control over.
> > > 
> > > -- Allen
> > > 
> > 
> > Thanks, and you are right regarding my intent to leave it to the search
> > engine to solve the problem (I'm not the first one to think that.
> > *cough* Google). In our case we have a young product called OmniFind
> > with a lot of flexibility so we wanted to try that because we are not
> > committers to Roller and right now we are working on a patch-by-patch
> > basis. However, I wouldn't mind taking your code and putting it back
> > into Roller 2.0, using the table the way you have proposed, and working
> > out a way to "introduce" tags in the frontpage as a patch. At least, the
> > most popular tags for starters and maybe even subscribe via Atom/RSS.
> > We can then use the tag data and still feed it to our search engine
> > and experiment there some more. What do you think?
> 
> i think that sounds great.  i'll gather up the code i have and send it over 
> tomorrow morning.
> 
> -- Allen
> 
> 
> > 
> > --------------------------------------
> > Speaking of search engines, we need a way to register a plugin to hook
> > into the save events of posts and comments w/o having to patch our local
> > build, just by adding a new plugin in the configuration. Things like
> > settings (mediacast) are directly hard-coded into the jsp and rest of
> > the code. We'd rather not do that, since it would be hard to maintain.
> > 
> > Regards,
> > 
> > Elias
> 
