On Wed, 2005-09-14 at 16:12, Elias Torres wrote:
> On 9/14/05, Allen Gilliland <[EMAIL PROTECTED]> wrote:
> > On Wed, 2005-09-14 at 13:31, Elias Torres wrote:
> > > The reason why I'm not striving for lookup efficiency is because I
> > > wanted to leave it up to Lucene, or in the IBM case to the OmniFind
> > > search engine, to deal with the queries. I believe Lucene has a way to
> > > add query terms so you can say posts with tag:apple and tag:farm, etc.
> > > Plus of course, the added benefit of having tags for technorati to
> > > consume in the rendered templates. I don't think that /tag/apple+farm
> > > is something that Roller users are in desperate need of at this
> > > moment, but I could be wrong.
> >
> > I agree that adding the tags to the pages to be consumed by sites like
> > Technorati makes sense, but should we really *rely* on that? technorati
> > requires actual anchors (<a>) which some Roller users may not want on their
> > weblogs, so we'd have to support a way for the tag data to be html embedded
> > metadata that doesn't affect page display.
> >
>
> Sorry, I didn't mean to rely on technorati for searching but that it
> would be a side benefit of having tags like with Lance's plugin except
> the tags are not stored in the text but in the db.
>
> > A possible part of the problem with this approach is that to get all that
> > tag metadata into users' pages would require everyone to update their page
> > templates. How would you propose to insert the tag data into user pages?
> >
>
> That's for technorati to consume only and for users who want to
> display their data. However, the approach would be to enhance the
> search engine by extracting this other piece of metadata at indexing
> time and adding UI to search to reflect this new option.

gotcha. i think i confused this part a bit. i agree that making the search engine tag smart would be very useful and would provide a pretty easy way for us to make use of tagged entries. but again, why stop there? the functionality this would provide is fairly limited.

what about rss feeds? would the search engine be able to offer up an rss feed of the last 50 entries tagged with "apple" and "farm"? this is something that i would love to see our tagging support be capable of.

>
> > Another point to note is that at BSC we have disabled the built-in Lucene
> > search because it has problems, so instead we use our custom Sun Onesearch
> > mechanism. At least in our case this would make your implementation harder
> > to manage because we would then need to make further customizations to our
> > search engine just to support tag specific searches.
> >
>
> Right.
>
> > I would also disagree that the /tag/apple+farm functionality is something
> > that Roller users don't want. At BSC we regularly get users asking why we
> > don't provide a kind of blog directory which makes it easier for users to
> > find weblogs. All that Roller has is a search and the main page, and
> > neither of those provides a navigable structure for browsing blogs.
> >
> > Searching is definitely the best way to find what you're looking for, but
> > what if you aren't really looking for anything specific? I like the
> > idea that the TagServlet homepage could provide some fun ways for people to
> > find blogs, like listing things like ... Most Recent Tags and Most Popular
> > Tags kind of the way flickr does it (http://www.flickr.com/photos/tags/).
> > Searching would have no way of doing that.
>
> I wholeheartedly agree with this part. You are correct that we need a
> better dashboard. The first version of our internal deployment had a
> lot more in this regard.
> We have a special frontpage once you've
> logged in that includes things like recent comments to your entries,
> recent entries for quick editing and a huge query that found the most
> recent threads you've participated in, with links to the posts and
> comments. But this will be scrapped when we move to 2.0 so we have to
> start fresh. I did not want to put that burden on Roller, so I was
> thinking a separate application to do that for performance. We also
> have a ton of stats on users per country, number of blogs and number
> of entries, postings per month, etc. for everyone in the company to
> see.

ahhh ... i think this is exactly what Roller needs! i think that a way to find/sort/view entries via tags would be the most important and usable component of a Roller dashboard.

i also think we could use a simple directory which just lists all weblogs and provides some ways to break 'em down ... i.e. Newest Weblogs or Japanese Weblogs, etc.

basically, i think Roller needs a nice dashboard that allows users to find weblogs and entries in any way that seems useful. tagging seems like a good way to do this.

>
> > >
> > > Let's talk more to see what we should be concentrating on for this
> > > feature.
> > >
> > > >
> > > > On Wed, 2005-09-14 at 12:25, Elias Torres wrote:
> > > > > Allen,
> > > > >
> > > > > I was thinking of using the entryattribute table, what do you think? I
> > > > > don't think that we want another table for every little feature. At
> > > > > first I was thinking of something simple, like a "text" field a la
> > > > > del.icio.us as another + in the settings section of the post that can
> > > > > be edited by the user anytime. Maybe then using the Tag render plugin
> > > > > for just rendering the tags and also making sure that Lucene indexes
> > > > > the tags as well. I don't think we need to worry about the big content
> > > > > or technorati style dashboards yet, but at least start collecting the
> > > > > data.
> > > >
> > > > hmmm, it depends on how you want to use the entry attribute table.
> > > > were you going to set a single attribute called "tags" which is a list
> > > > of all the entry tags? or are you planning to do an attribute per tag?
> > > >
> > > > i agree that getting tag data is important, but if you can't use that
> > > > tag data for something useful then what's the point? if we are going
> > > > to do tag support then i'd at least like to see some way of finding
> > > > tagged entries included in the first release.
> > >
> > > Agreed.
> > >
> > > >
> > > > > I do have a problem with the entryattribute table in general because
> > > > > it's very limiting. For example, it's really cumbersome if I want to
> > > > > store both the tag and the date it was inserted on. Even worse, if I
> > > > > had another piece of metadata about that tagging to insert. It works
> > > > > for MediaCast right now because you only have attributes about the
> > > > > entry and not about the actual entry metadata. I had mentioned to
> > > > > Dave on IRC that since my day job is on Semantic Web stuff, maybe
> > > > > making that table a more RDF-friendly table would be really cool for
> > > > > Roller.
> > > >
> > > > this is murky water if you ask me. i think i like the fact that the
> > > > entryattribute table is a simple hashtable of data attached to a weblog
> > > > entry, that keeps it simple. if you need to relate complex data to an
> > > > entry then it's probably best that you create a new table for that data.
> > > >
> > > > i am all for reuse of existing architecture as long as it works, but if
> > > > it is going to inhibit our ability to effectively use the tags then i
> > > > say forget the entryattribute table and go ahead and do whatever you
> > > > need to do.
> > > >
> > > > i don't really know what you mean by RDF-friendly, so you'd have to
> > > > elaborate more.
> > > >
> > >
> > > Basically the entryid column should be a normal column that can take a
> > > URI/URL and not just an entryid and maybe another column for entryid
> > > so we can fetch quickly all of the triples associated with that entry.
> > >
> > > I can then do this:
> > >
> > > <entryid-1> <hasTagging> <tagging1>
> > > <entryid-1> <hasTagging> <tagging2>
> > > <tagging1> <dc:date> "2005-09-13"
> > > <tagging1> <tag> "blogs"
> > > <tagging2> <dc:date> "2005-09-15"
> > > <tagging2> <tag> "farm"
> > >
> > > plus things like:
> > >
> > > <tagging1> <syn> "weblogs"
> > > <tagging1> <syn> "blog"
> > > ...
> > >
> > > Again, if we are going to bake Tags into the core, then the table you
> > > mentioned would be best for the servlet to render entries. But for any
> > > entryattribute/metadata, I think the RDF might be more flexible for
> > > things like structuredblogging.
> >
> > I think you lost me on that first part. The entryid column of the
> > entryattribute table is a foreign key relationship to the weblogentry
> > table, so i don't see how you could change that.
> >
> > What is the purpose for tracking what date a specific tag was entered on?
> > I would bet that 99% of entries would be tagged only once and all tags
> > would be entered at the same time.
>
> This was just an example. Although it'd be nice to track tag usage and
> the tag date would be more accurate than the post date. I've done some
> studies on that.
>
> http://torrez.us/archives/2005/07/13/364
>
> >
> > I like the simplicity of your tagging approach via searching, but as I said
> > above, I wonder if it's so simple that it's limiting. Your approach
> > basically leaves everything up to the search engine which I believe we have
> > less control over.
> >
> > -- Allen
> >
>
> Thanks and you are right regarding my intent to leave it to search
> engine to solve the problem (I'm not the first one to think that.
> *cough* Google).
> In our case we have a young product called OmniFind
> with a lot of flexibility so we wanted to try that because we are not
> committers to Roller and right now we are working on a patch-by-patch
> basis. However, I wouldn't mind taking your code and putting it back
> into Roller 2.0, using the table the way you have proposed, and working
> out a way to "introduce" tags in the frontpage as a patch. At least, the
> most popular tags for starters and maybe even subscribe via Atom/RSS.
> We can then use the tag data and still feed it to our search engine
> and experiment there some more. What do you think?

i think that sounds great. i'll gather up the code i have and send it over tomorrow morning.

-- Allen

>
> --------------------------------------
> Speaking of search engines, we need a way to register a plugin to hook
> on the save events of posts and comments w/o having to patch our local
> build except by adding a new plugin in the configuration. Things like
> settings (mediacast) are directly hard-coded into the jsp and the rest of
> the code. We'd rather not do that, since it would be hard to maintain.
>
> Regards,
>
> Elias
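
The dedicated tag table discussed in this thread, together with the /tag/apple+farm style lookup and the "most popular tags" listing, could be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not Roller code: the entrytag table, its columns, the simplified weblogentry table, and all three helper functions are assumptions invented for the example.

```python
import sqlite3

# Hypothetical schema: one row per (entry, tag) pair, with the date the
# tag was applied. Names are illustrative, not Roller's actual schema.
SCHEMA = """
CREATE TABLE weblogentry (id TEXT PRIMARY KEY, title TEXT, pubtime TEXT);
CREATE TABLE entrytag (
    entryid TEXT NOT NULL REFERENCES weblogentry(id),
    tag     TEXT NOT NULL,
    tagtime TEXT NOT NULL,
    PRIMARY KEY (entryid, tag)
);
"""


def entries_with_all_tags(conn, tags, limit=50):
    """Return ids of the newest entries that carry every tag in `tags`."""
    tags = sorted(set(tags))
    if not tags:
        return []
    placeholders = ",".join("?" for _ in tags)
    sql = f"""
        SELECT e.id FROM weblogentry e
        JOIN entrytag t ON t.entryid = e.id
        WHERE t.tag IN ({placeholders})
        GROUP BY e.id
        HAVING COUNT(DISTINCT t.tag) = ?
        ORDER BY e.pubtime DESC, e.id
        LIMIT ?
    """
    return [row[0] for row in conn.execute(sql, (*tags, len(tags), limit))]


def most_popular_tags(conn, limit=10):
    """Return (tag, usage count) pairs, most used first."""
    sql = """
        SELECT tag, COUNT(*) AS n FROM entrytag
        GROUP BY tag ORDER BY n DESC, tag LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()


def demo_connection():
    """Build a small in-memory database with made-up sample entries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO weblogentry VALUES (?, ?, ?)", [
        ("e1", "Orchard notes", "2005-09-13"),
        ("e2", "New laptop", "2005-09-14"),
        ("e3", "Cider season", "2005-09-15"),
    ])
    conn.executemany("INSERT INTO entrytag VALUES (?, ?, ?)", [
        ("e1", "apple", "2005-09-13"), ("e1", "farm", "2005-09-13"),
        ("e2", "apple", "2005-09-14"),
        ("e3", "apple", "2005-09-15"), ("e3", "farm", "2005-09-15"),
    ])
    return conn
```

Calling entries_with_all_tags(demo_connection(), ["apple", "farm"]) returns the entries carrying both tags, newest first, which is the list a tag page or a per-tag RSS feed would render. Because the lookup is a plain database query, it does not depend on which search engine (Lucene, OmniFind, or Onesearch) a given deployment uses.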