On Monday 20 October 2003 10:31, Erik Hatcher wrote:
> On Monday, October 20, 2003, at 11:06 AM, Tom Howe wrote:
>
> There is not a more "lucene" way to do this - it's really up to you to be creative with this. I'm sure there are folks that have implemented something along these lines on top of Lucene. In fact, I have a particular interest in doing so at some point myself. This is very similar to the object-relational issues surrounding relational databases - turning a pretty flat structure into an object graph. There are several ideas that could be explored by playing tricks with fields, such as giving them a hierarchical naming structure and querying at the level you like (think Field.Keyword and PrefixQuery, for example), and using a field to indicate type and narrowing queries to documents of the desired type.
>
> I'm interested to see what others have done in this area, or what ideas emerge about how to accomplish this.
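The two field tricks quoted above (hierarchical field values queried with a prefix, and a type field to narrow results) can be sketched in plain Python, without Lucene itself. The field names, values, and helper functions below are invented for illustration; `prefix_query` just mimics what Lucene's PrefixQuery does over keyword fields.

```python
# Each "document" is a dict of keyword fields, roughly what Field.Keyword
# would store untokenized in a Lucene index. All names/values are made up.
docs = [
    {"type": "article", "category": "electronics/audio/headphones", "title": "Review A"},
    {"type": "article", "category": "electronics/video", "title": "Review B"},
    {"type": "comment", "category": "electronics/audio", "title": "Comment C"},
]

def prefix_query(docs, field, prefix):
    """Mimic PrefixQuery: match docs whose keyword field starts with prefix."""
    return [d for d in docs if d.get(field, "").startswith(prefix)]

def type_filter(docs, doc_type):
    """Narrow results to documents of the desired type, via a type field."""
    return [d for d in docs if d.get("type") == doc_type]

# Query at whatever level of the hierarchy you like, then narrow by type:
hits = type_filter(prefix_query(docs, "category", "electronics/audio"), "article")
```

Here `hits` contains only "Review A": its category sits under the `electronics/audio` prefix and it has the desired type, while "Comment C" matches the prefix but is filtered out by type.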
I'm planning to do something similar. In my case the problem is a bit simpler: documents have associated products, and the products form a hierarchy. Searches should match not only direct associations (searching for a product finds articles associated with that product) but also indirect ones via membership (a product that is a member of a product group matches a search for that group). The product hierarchy also has variable depth.

To allow searches on non-leaf hierarchy items (groups), all product items/groups associated with a document are expanded to full ids at indexing time, i.e. they contain the path from the root up to and including the node, with each path component having its own unique id. Thus, when searching for an intermediate node (a product grouping), a match occurs because that node's id is part of the path to every product in the group (whether directly or as a member of a sub-group). Since no such path is stored (directly) in the database, this also lets me run queries that would be impossible to do against the database (though I could of course add similar path/full-id fields there for search purposes). The Lucene index is thus optimized for searching, and the database structure for editing and retrieval of data.

Another thing to keep in mind is that, at least for metadata, it may make sense to use a specialized analyzer, one that tokenizes specific ids as separate tokens, instead of a standard plain-text analyzer. This makes it possible to distinguish ids from ordinary words (by using prefixes, for example "@1253" or "#13945"), which allows accurate matching based on the identity of the associated metadata selections.

-+ Tatu +-
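The index-time path expansion described above can be sketched as follows, in plain Python rather than through the Lucene API. The hierarchy and all ids here are hypothetical; the point is that a document indexed under a leaf product carries every ancestor id as a token, so a query for an intermediate group id matches it.

```python
# Hypothetical product hierarchy, as child -> parent links (None = root).
parents = {
    "p_leaf1": "g_audio",
    "p_leaf2": "g_video",
    "g_audio": "g_electronics",
    "g_video": "g_electronics",
    "g_electronics": None,
}

def full_id_path(node):
    """Expand a node to the ids on the path from the root down to the node."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return list(reversed(path))

def index_tokens(product):
    # At indexing time, store every id on the path as a separate token, so a
    # search for any intermediate group id matches the document directly.
    return set(full_id_path(product))

doc_tokens = index_tokens("p_leaf1")

# A search for the intermediate group matches, since its id is on the path:
group_hit = "g_audio" in doc_tokens
```

No join or recursive traversal is needed at query time; the hierarchy is flattened once when the document is indexed, which is exactly why the index can answer membership queries the normalized database schema cannot answer directly.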
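The specialized-analyzer idea (keeping prefixed ids like "@1253" intact as their own tokens, separate from plain words) might look like this minimal tokenizer sketch. This is not a Lucene Analyzer, just the tokenization rule it would implement; the sample text and ids are invented.

```python
import re

def tokenize(text):
    """Emit id tokens (prefixed with '@' or '#') intact, and lowercase plain words.

    A standard plain-text analyzer would strip the '@'/'#' prefixes and treat
    the digits as ordinary terms; keeping the prefix makes id matches exact.
    """
    tokens = []
    for tok in re.findall(r"[@#]\d+|\w+", text):
        tokens.append(tok if tok[0] in "@#" else tok.lower())
    return tokens

tokenize("Wireless headphones @1253 in group #13945")
# -> ['wireless', 'headphones', '@1253', 'in', 'group', '#13945']
```

Because the id tokens can never collide with textual words, a query for "@1253" matches only documents tagged with that metadata selection, never a document that happens to contain the digits 1253 in running text.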