Elias,

here's the code I was talking about. i took a little time this afternoon updating it to Roller 2.0, so you should be able to just unpack the files in a Roller 2.0 workspace and do the build. you have to add the custom db tables manually ... they are in metadata/database/roller_tags.sql
http://blogs.sun.com/roller/resources/gconf/roller-tags.tar.gz

cheers.

-- Allen

On Wed, 2005-09-14 at 19:36, Allen Gilliland wrote:
> On Wed, 2005-09-14 at 16:12, Elias Torres wrote:
> > On 9/14/05, Allen Gilliland <[EMAIL PROTECTED]> wrote:
> > > On Wed, 2005-09-14 at 13:31, Elias Torres wrote:
> > > > The reason why I'm not striving for lookup efficiency is that I
> > > > wanted to leave it up to Lucene, or in the IBM case the OmniFind
> > > > search engine, to deal with the queries. I believe Lucene has a way
> > > > to add query terms so you can say posts with tag:apple and tag:farm,
> > > > etc. Plus of course, the added benefit of having tags for technorati
> > > > to consume in the rendered templates. I don't think that
> > > > /tag/apple+farm is something that Roller users are in desperate need
> > > > of at this moment, but I could be wrong.
> > > >
> > > I agree that adding the tags to the pages to be consumed by sites like
> > > Technorati makes sense, but should we really *rely* on that? technorati
> > > requires actual anchors (<a>) which some Roller users may not want on
> > > their weblogs, so we'd have to support a way for the tag data to be html
> > > embedded metadata that doesn't affect page display.
> > >
> >
> > Sorry, I didn't mean to rely on technorati for searching but that it
> > would be a side benefit of having tags like with Lance's plugin except
> > the tags are not stored in the text but in the db.
> >
> > > A possible part of the problem with this approach is that to get all that
> > > tag metadata into users' pages would require everyone to update their page
> > > templates. How would you propose to insert the tag data into user pages?
> > >
> >
> > That's for technorati to consume only and for users who want to
> > display their data. However, the approach would be to enhance the
> > search engine by extracting this other piece of metadata at indexing
> > time and adding UI to search to reflect this new option.
>
> gotcha.
> i think i confused this part a bit. i agree that making the search
> engine tag smart would be very useful and would provide a pretty easy way for
> us to make use of tagged entries. but again, why stop there? the
> functionality this would provide is fairly limited. what about rss feeds?
> would the search engine be able to offer up an rss feed of the last 50
> entries tagged with "apple" and "farm"? this is something that i would love
> to see our tagging support be capable of.
>
> >
> > > Another point to note is that at BSC we have disabled the built-in Lucene
> > > search because it has problems, so instead we use our custom Sun
> > > Onesearch mechanism. At least in our case this would make your
> > > implementation harder to manage because we would then need to make
> > > further customizations to our search engine just to support tag specific
> > > searches.
> > >
> >
> > Right.
> >
> > > I would also disagree that the /tag/apple+farm functionality is something
> > > that Roller users don't want. At BSC we regularly get users asking why
> > > we don't provide a kind of blog directory which makes it easier for users
> > > to find weblogs. All that Roller has is a search and the main page, and
> > > neither of those provides a navigable structure for browsing blogs.
> > >
> > > Searching is definitely the best way to find what you're looking for, but
> > > what if you aren't really looking for anything in specific? I like the
> > > idea that the TagServlet homepage could provide some fun ways for people
> > > to find blogs, like listing things like ... Most Recent Tags and Most
> > > Popular Tags kind of the way flickr does it
> > > (http://www.flickr.com/photos/tags/). Searching would have no way of
> > > doing that.
> >
> > I wholeheartedly agree with this part. You are correct that we need a
> > better dashboard. The first version of our internal deployment had a
> > lot more in this regard.
> > We have a special frontpage once you've
> > logged in that includes things like recent comments to your entries,
> > recent entries for quick editing and a huge query that found the most
> > recent threads you've participated in and links to the posts and
> > comments. But this will be scrapped when we move to 2.0 so we have to
> > start fresh. I did not want to put that burden on Roller, so I was
> > thinking a separate application to do that for performance. We also
> > have a ton of stats on users per country, number of blogs and number
> > of entries, postings per month, etc for everyone in the company to
> > see.
>
> ahhh ... i think this is exactly what Roller needs! i think that a way to
> find/sort/view entries via tags would be the most important and usable
> component of a Roller dashboard. i also think we could use a simple
> directory which just lists all weblogs and provides some ways to break 'em
> down ... i.e. Newest Weblogs or Japanese Weblogs, etc.
>
> basically, i think Roller needs a nice dashboard that allows users to find
> weblogs and entries in any way that seems useful. tagging seems like a good
> way to do this.
>
> >
> > > >
> > > > Let's talk more to see what we should be concentrating on for this
> > > > feature.
> > > >
> > > > >
> > > > > On Wed, 2005-09-14 at 12:25, Elias Torres wrote:
> > > > > > Allen,
> > > > > >
> > > > > > I was thinking of using the entryattribute table, what do you
> > > > > > think? I don't think that we want another table for every little
> > > > > > feature. At first I was thinking of something simple, like a
> > > > > > "text" field a la del.icio.us as another + in the settings
> > > > > > section of the post that can be edited by the user anytime. Maybe
> > > > > > then using the Tag render plugin for just rendering the tags and
> > > > > > also making sure that Lucene indexes the tags as well.
> > > > > > I don't think we need to worry about the big content
> > > > > > or technorati style dashboards yet, but at least start
> > > > > > collecting the data.
> > > > >
> > > > > hmmm, it depends on how you want to use the entry attribute table.
> > > > > were you going to set a single attribute called "tags" which is a
> > > > > list of all the entry tags? or are you planning to do an attribute
> > > > > per tag?
> > > > >
> > > > > i agree that getting tag data is important, but if you can't use that
> > > > > tag data for something useful then what's the point? if we are going
> > > > > to do tag support then i'd at least like to see some way of finding
> > > > > tagged entries included in the first release.
> > > > >
> > > > Agreed.
> > > >
> > > > > >
> > > > > > I do have a problem with the entryattribute table in general because
> > > > > > it's very limiting. For example, it's really cumbersome if I want to
> > > > > > store both the tag and the date it was inserted on. Even worse, if I
> > > > > > had another piece of metadata about that tagging to insert. It works
> > > > > > for MediaCast right now because you only have attributes about the
> > > > > > entry and not about the actual entry metadata. I had mentioned to
> > > > > > Dave on IRC that since my day job is on Semantic Web stuff, maybe
> > > > > > making that table a more RDF-friendly table would be really cool for
> > > > > > Roller.
> > > > >
> > > > > this is murky water if you ask me. i think i like the fact that the
> > > > > entryattribute table is a simple hashtable of data attached to a
> > > > > weblog entry, that keeps it simple. if you need to relate complex
> > > > > data to an entry then it's probably best that you create a new table
> > > > > for that data.
> > > > >
> > > > > i am all for reuse of existing architecture as long as it works, but
> > > > > if it is going to inhibit our ability to effectively use the tags
> > > > > then i say forget the entryattribute table and go ahead and do
> > > > > whatever you need to do.
> > > > >
> > > > > i don't really know what you mean by RDF-friendly, so you'd have to
> > > > > elaborate more.
> > > > >
> > > >
> > > > Basically the entryid column should be a normal column that can take a
> > > > URI/URL and not just an entryid and maybe another column for entryid
> > > > so we can fetch quickly all of the triples associated with that entry.
> > > >
> > > > I can then do this:
> > > >
> > > > <entryid-1> <hasTagging> <tagging1>
> > > > <entryid-1> <hasTagging> <tagging2>
> > > > <tagging1> <dc:date> "2005-09-13"
> > > > <tagging1> <tag> "blogs"
> > > > <tagging2> <dc:date> "2005-09-15"
> > > > <tagging2> <tag> "farm"
> > > >
> > > > plus things like:
> > > >
> > > > <tagging1> <syn> "weblogs"
> > > > <tagging1> <syn> "blog"
> > > > ...
> > > >
> > > > Again, if we are going to bake Tags into the core, then the table you
> > > > mentioned would be best for the servlet to render entries. But for any
> > > > entryattribute/metadata, I think the RDF might be more flexible for
> > > > things like structuredblogging.
> > >
> > > I think you lost me on that first part. The entryid column of the
> > > entryattribute table is a foreign key relationship to the weblogentry
> > > table, so i don't see how you could change that.
> > >
> > > What is the purpose for tracking what date a specific tag was entered on?
> > > I would bet that 99% of entries would be tagged only once and all tags
> > > would be entered at the same time.
> >
> > This was just an example. Although it'd be nice to track tag usage and
> > the tag date would be more accurate than the post date. I've done some
> > studies on that.
> >
> > http://torrez.us/archives/2005/07/13/364
> >
> > >
> > > I like the simplicity of your tagging approach via searching, but as I
> > > said above, I wonder if it's so simple that it's limiting. Your approach
> > > basically leaves everything up to the search engine which I believe we
> > > have less control over.
> > >
> > > -- Allen
> > >
> >
> > Thanks and you are right regarding my intent to leave it to the search
> > engine to solve the problem (I'm not the first one to think that.
> > *cough* Google). In our case we have a young product called OmniFind
> > with a lot of flexibility so we wanted to try that because we are not
> > committers to Roller and right now we are dealing on a patch per patch
> > basis. However, I wouldn't mind taking your code and putting it back
> > into Roller 2.0, use the table the way you have proposed and work out a
> > way to "introduce" tags in the frontpage as a patch. At least, the
> > most popular tags for starters and maybe even subscribe via Atom/RSS.
> > We can then use the tag data and still feed it to our search engine
> > and experiment there some more. What do you think?
>
> i think that sounds great. i'll gather up the code i have and send it over
> tomorrow morning.
>
> -- Allen
>
> >
> > --------------------------------------
> > Speaking of search engines, we need a way to register a plugin to hook
> > on the save events of posts and comments w/o having to patch our local
> > build except by adding a new plugin in the configuration. Things like
> > settings (mediacast) are directly hard-coded into the jsp and rest of
> > the code. We'd rather not do that, since it would be hard to maintain.
> >
> > Regards,
> >
> > Elias
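
For readers of the archive, the two lookups argued for in this thread (entries carrying both "apple" and "farm", and a Most Popular Tags list) can be sketched in plain Java. This is only an illustration under assumptions: it is not the code in roller-tags.tar.gz, it keeps one row per (entry, tag) pair in memory rather than in a database table, it relies on Java 8 collections, and every name in it (TagIndexSketch, Tagging, entriesTaggedWithAll, mostPopularTags) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a dedicated tag table; not Roller code.
class TagIndexSketch {

    // One row per (entry, tag) pair, with the date kept per tagging.
    static final class Tagging {
        final String entryId;
        final String tag;
        final long taggedAt; // epoch millis

        Tagging(String entryId, String tag, long taggedAt) {
            this.entryId = entryId;
            this.tag = tag;
            this.taggedAt = taggedAt;
        }
    }

    private final List<Tagging> rows = new ArrayList<>();

    void add(String entryId, String tag, long taggedAt) {
        rows.add(new Tagging(entryId, tag.toLowerCase(), taggedAt));
    }

    // Entries carrying ALL of the given tags, most recently tagged first.
    List<String> entriesTaggedWithAll(List<String> tags, int limit) {
        Set<String> wanted = new HashSet<>();
        for (String t : tags) {
            wanted.add(t.toLowerCase());
        }
        Map<String, Set<String>> matched = new HashMap<>();
        Map<String, Long> newest = new HashMap<>();
        for (Tagging r : rows) {
            if (!wanted.contains(r.tag)) {
                continue;
            }
            matched.computeIfAbsent(r.entryId, k -> new HashSet<>()).add(r.tag);
            newest.merge(r.entryId, r.taggedAt, Math::max);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : matched.entrySet()) {
            if (e.getValue().size() == wanted.size()) {
                result.add(e.getKey());
            }
        }
        // Newest tagging first; ties broken by entry id for a stable order.
        result.sort((a, b) -> {
            int byTime = Long.compare(newest.get(b), newest.get(a));
            return byTime != 0 ? byTime : a.compareTo(b);
        });
        return result.size() > limit ? new ArrayList<>(result.subList(0, limit)) : result;
    }

    // Tags ordered by the number of distinct entries using them, most used first.
    List<String> mostPopularTags(int limit) {
        Map<String, Set<String>> entriesByTag = new HashMap<>();
        for (Tagging r : rows) {
            entriesByTag.computeIfAbsent(r.tag, k -> new HashSet<>()).add(r.entryId);
        }
        List<String> tags = new ArrayList<>(entriesByTag.keySet());
        tags.sort((a, b) -> {
            int byCount = Integer.compare(entriesByTag.get(b).size(), entriesByTag.get(a).size());
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return tags.size() > limit ? new ArrayList<>(tags.subList(0, limit)) : tags;
    }

    public static void main(String[] args) {
        TagIndexSketch index = new TagIndexSketch();
        index.add("entry-1", "apple", 1000L);
        index.add("entry-1", "farm", 1000L);
        index.add("entry-2", "apple", 2000L);
        index.add("entry-3", "apple", 3000L);
        index.add("entry-3", "farm", 3000L);
        // The "last 50 entries tagged with apple and farm" case from the thread.
        System.out.println(index.entriesTaggedWithAll(Arrays.asList("apple", "farm"), 50));
        System.out.println(index.mostPopularTags(10));
    }
}
```

With one row per tagging, the same data could back a /tag/apple+farm page, a per-tag feed, or a popular-tags list without going through the search engine. A single "tags" attribute holding a delimited list would need to be parsed before any of these queries could run.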