Hi Tibor,

On Wed, Oct 6, 2010 at 10:39 PM, Tibor Simko <[email protected]> wrote:
> On Wed, 06 Oct 2010, Roman Chyla wrote:
>> doc: 10
>> cited: 3,6,80,90,89...
>> citing_author: witten, frank, lagra, ngeyen, chu, thuey...
>
> What citing_author holds?  An equivalent to data points for
> citedby:author:lagra?  Not practical to store stuff like that next to
> every record.

It was trading space for speed; why is that not practical? Especially
if the values are data points.

> Note that citedby/refersto operators can operate on any
> query, not only on authors.  See my `refersto:keyword:muon' example.
>
>> the lucene query with the same effect then is:
>>
>> ((author:ellis +citedby:witten -author:witten) +keyword:muon) -->
>> cluster_by(len(cited))
>
> Not quite.  `refersto:keyword:muon' gives you a set of papers that cite
> some paper from the set of papers that are tagged with keyword muon.
> Gives 93k hits on INSPIRE.  To be compared to the set of papers tagged
> with keyword muon, 22k hits on INSPIRE.

OK, I see. That is a very nice case and I understand a little bit more.
Could you point me to some info about how this 2nd order lookup is
done, or where in the code it lives, or discuss it here? I guess the
same will apply to a chain of 2nd order lookups.

>
>> notes:
>> - citedby:author:witten -- it doesnt make sense to me that it could
>> be sb else than other author
>
> Note sure what you mean.  Gives collection of all papers that are cited
> by any of the paper written by Witten.  It could be any author,
> including Witten itself (think self-cites).  Try it on INSPIRE.

I meant this case, the toy-index would allow that query...

>
>> - 2nd order links must be carefully prepared (but honestly, how many
>> of those 2nd order relations are really needed, and really used? this
>> number is probably low...)
>
> Cite summary, co-cited with, etc are all second-order operations.

I obviously don't know most of the 2nd order operations, but I am
musing about how they are implemented. If somebody could point out
reasons why they are not possible, or very difficult to implement, with
the search engine index, that would help me understand much more.

Cheers,

  roman

>
> Best regards
> --
> Tibor Simko
>
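
As a rough illustration of the second-order lookups discussed above, `refersto:` and `citedby:` can be read as set operations applied to the hit set of an arbitrary inner query. This is only a sketch under assumptions, not how Invenio or INSPIRE actually implement it: the record ids, the `cites` map, the toy indexes and the function names are all made up for the example.

```python
# Hypothetical toy data: record id -> set of record ids it cites.
cites = {
    1: {10, 11},
    2: {10},
    3: {12},
    10: {12},
    11: set(),
    12: set(),
}

# Hypothetical first-order indexes: term -> set of matching record ids.
keyword_index = {"muon": {10, 12}}
author_index = {"witten": {1, 10}, "ellis": {2, 3}}


def refersto(hits):
    """Records that cite at least one record in the hit set."""
    return {rec for rec, refs in cites.items() if refs & hits}


def citedby(hits):
    """Records that are cited by at least one record in the hit set."""
    result = set()
    for rec in hits:
        result |= cites.get(rec, set())
    return result


# refersto:keyword:muon -> papers citing some paper tagged "muon"
muon_citers = refersto(keyword_index["muon"])

# citedby:author:witten -> papers cited by any Witten paper;
# record 10 is itself by Witten, so self-cites are included.
cited_by_witten = citedby(author_index["witten"])

# A chain of second-order lookups is just function composition.
chained = refersto(refersto(keyword_index["muon"]))

print(sorted(muon_citers))
print(sorted(cited_by_witten))
print(sorted(chained))
```

Because the operators take a hit set rather than a field value, the inner query can be anything (a keyword, an author, or another second-order query). That is the part a per-record field such as `citing_author` cannot cover without precomputing one field per possible inner query.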
