Hi,

Apologies for spamming. I was tired, so I stepped away from my normal work and was reading the Lucene white paper:
http://www.lucidimagination.com/files/file/whitepaper/LIWP_WhatsNew_Lucene3.0%2B2.9.pdf

On page 17, they actually say that in the new Lucene it is possible to use the payload for searching:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/search/payloads/PayloadNearQuery.html

which is contrary to the info below and got me wondering... oh yeah, panta rei!

cheers,

roman

On Wed, Oct 6, 2010 at 10:57 PM, Jay Luker <[email protected]> wrote:
> On Wed, Oct 6, 2010 at 4:08 PM, Roman Chyla <[email protected]> wrote:
>
>> 1)
>> -- is it possible to use payload for search? [i know it can influence
>> scoring and be useful for display, but as i understand it, it is a
>> metadata about the given position]
>>
>> example, if we assume situation when we index authors <-- and add
>> payload to them
>>
>> field:author | payload [affiliation,field_of_study,email]
>> ------------------------------
>> ellis | cern,umi hep-theory [email protected]
>> swank | umi hep-ex [email protected]
>>
>> is it possible to query this structure directly? ex.
>>
>> "author:swink~4 and author:affiliation:cern"
>>
>> (I want to find all names similar to swink, schwink, sink... and i
>> also know the person is working at cern -- but i am not interested in
>> a record which was written by swink@umi, and ellis@cern --> i want
>> only swink@cern and for that i need payload)
>
>
> The answer to the specific question is no, you can't query the payload
> directly.
> Suggested alternatives:
> * Index the author and the affiliation at the same position. There
> should then be a way to query for "swink~4" and "cern" and specify
> there must be zero distance between the terms.
> * index the author and the affiliation with a delimiter, like "swink_cern",
>
>>
>> 2)
>> What would be the best strategy to have several separate indexes? Ie.
>> to have a separate index for metadata, for recently-changed-metadata,
>> fulltext, citation-pairs?
>>
>> presumably, all those indexes contain only records (so the results
>> from them are mergeable on the recid match), but obviously the scoring
>> function makes sense only inside the index; but if one would like to
>> combine results (in a meaningful way) from the several indexes, what
>> would be the best strategy?
>>
>
> Grant says something called ParallelReader could be used in this case.
>
> I need more time to digest your first answer to the original question.
>
> --
> ******************************************************
> Jay Luker                 Astrophysics Data System (ADS)
> [email protected]       Center for Astrophysics
> 617-495-4588              60 Garden Street MS 67
> 617-495-7356 fax          Cambridge, MA 02138
> ******************************************************
>
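
P.S. To make the first suggested alternative in the quoted reply concrete (author and affiliation indexed at the same position, then matched with zero distance), here is a rough sketch in plain Python. It is not Lucene code: `ToyIndex`, the `aff:` term prefix, the record ids, and the one-edit fuzzy threshold are all made up for illustration, and the sample records only loosely follow the ellis/swank example above.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


class ToyIndex:
    """term -> {recid -> set of positions}. Hypothetical, not a Lucene API."""

    def __init__(self):
        self.postings = defaultdict(lambda: defaultdict(set))

    def add(self, recid, authors):
        # authors: list of (name, [affiliations]). Each author takes one
        # position; its affiliations are stacked on that same position.
        for pos, (name, affiliations) in enumerate(authors):
            self.postings[name][recid].add(pos)
            for aff in affiliations:
                self.postings["aff:" + aff][recid].add(pos)

    def fuzzy_positions(self, name, max_edits):
        # Collect positions of every author term within max_edits of name.
        hits = defaultdict(set)
        for term, docs in self.postings.items():
            if term.startswith("aff:"):
                continue
            if edit_distance(term, name) <= max_edits:
                for recid, positions in docs.items():
                    hits[recid] |= positions
        return hits

    def same_record(self, name, affiliation, max_edits=1):
        # Naive AND: both terms occur somewhere in the record.
        names = self.fuzzy_positions(name, max_edits)
        affs = self.postings.get("aff:" + affiliation, {})
        return sorted(r for r in names if r in affs)

    def same_position(self, name, affiliation, max_edits=1):
        # Zero-distance match: both terms share at least one position.
        names = self.fuzzy_positions(name, max_edits)
        affs = self.postings.get("aff:" + affiliation, {})
        return sorted(r for r in names if names[r] & affs.get(r, set()))


index = ToyIndex()
index.add(1, [("ellis", ["cern", "umi"]), ("swank", ["umi"])])  # swank@umi, ellis@cern
index.add(2, [("swank", ["cern"])])                             # swank@cern

print(index.same_record("swink", "cern"))    # record-level AND
print(index.same_position("swink", "cern"))  # same-position match
```

The record-level AND also returns the record where the fuzzy name and "cern" belong to different authors, while the same-position match keeps only the record where they belong to the same author. The delimiter variant ("swink_cern") would fold both into a single term instead, at the cost of making the fuzzy match on the name part harder.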
