Hi, I was talking about a purely binary DocValues field: not searchable, not stored, nothing else. It would be a completely separate field that holds the values in their original order in binary form (e.g. 47*4 bytes if they are ints or floats) and is used only for scoring. DocValues fields other than numeric are binary by default!
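
To make the binary layout concrete, here is a minimal sketch of packing a fixed number of ints into one byte array in their original order and reading them back. The class and method names (OrderedValuesCodec, encode, decode) are made up for illustration and are not part of Lucene, and the big-endian layout is just an assumption. On the Lucene side the packed bytes would be wrapped in a BytesRef and indexed with a BinaryDocValuesField; that part is left out so the sketch needs nothing beyond the JDK.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper (not a Lucene class): packs int values into a byte[]
// while keeping their original order, 4 bytes per value, big-endian.
class OrderedValuesCodec {

    // Encode the values in the order given; 47 ints become 47 * 4 = 188 bytes.
    static byte[] encode(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Decode a slice of a byte array. The (bytes, offset, length) shape mirrors
    // the fields of a Lucene BytesRef, which is what binary doc values hand back.
    static int[] decode(byte[] bytes, int offset, int length) {
        if (length % Integer.BYTES != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, length);
        int[] values = new int[length / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }

    public static void main(String[] args) {
        int[] original = {30, 10, 20}; // deliberately unsorted
        byte[] packed = encode(original);
        int[] restored = decode(packed, 0, packed.length);
        System.out.println(packed.length + " bytes, order kept: "
                + Arrays.equals(original, restored));
    }
}
```

The decode step has to run for every scored document, which is the decoding overhead of this approach.
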
But for _exactly_ 47 values I'd use 47 separate numeric docvalues-only fields named like "value01", "value02", "value03". The searchable field stays multivalued and is just called "value". That said, reading 47 numeric fields at scoring time is a bit much. Is there no way to combine all those values into fewer fields used solely for scoring (e.g. two values such as a linear factor and a quadratic factor, or whatever)? It's hard to imagine that you need all the values while scoring!

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Dominik Safaric [mailto:dominiksafa...@gmail.com]
> Sent: Thursday, October 12, 2017 8:53 AM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene 7.x custom Scorer on point values
>
> The number of values per document per field is equal to 47.
>
> Unfortunately using binary fields is not an option because a binary field
> is not searchable. However, using a keyword field where the array of long
> values would be equivalent to a hex encoded binary array and later
> retrieving them as binary data might do the trick. But before that, could
> you please explain how keyword fields are stored within Lucene? I'm asking
> because unfortunately I haven't found any information about it online.
>
> Thanks,
> Dominik
>
> 2017-10-11 13:59 GMT+02:00 Uwe Schindler <u...@thetaphi.de>:
>
> > Hi,
> >
> > if you have multiple docvalues for the same field in the same document,
> > the order is undefined. The original order is not preserved, sorry. How
> > many values per document do you have? If it’s a fixed number or low, I'd
> > go with single valued fields.
> >
> > If you really need multi-valued docvalues where the order is preserved,
> > you can go and use binary bytes instead and encode your values into it.
> > But this is much more expensive to use during scoring (decoding
> > overhead,...).
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > Achterdiek 19, D-28357 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> > > -----Original Message-----
> > > From: Dominik Safaric [mailto:dominiksafa...@gmail.com]
> > > Sent: Wednesday, October 11, 2017 1:39 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Re: Lucene 7.x custom Scorer on point values
> > >
> > > Thanks Uwe for the clarification.
> > >
> > > The values are already indexed as numeric docvalues, i.e. numeric
> > > point-docvalues. In both cases, either by implementing a custom scorer
> > > or function query, I would need to access the point values for the
> > > matched/hit documents. How can I derive these values given a
> > > DocIdSetIterator (subset of documents, i.e. hit document ids) and a
> > > LeafReaderContext? Using getSortedNumericDocValues("field") gives me
> > > the longs in question, however these values are sorted using
> > > Long.compare whereas in my case the order of the values for a
> > > particular field matters.
> > >
> > > Kind regards,
> > > Dominik
> > >
> > > 2017-10-11 11:43 GMT+02:00 Uwe Schindler <u...@thetaphi.de>:
> > >
> > > > Hi,
> > > >
> > > > You would need to index that as numeric docvalues. Just add another
> > > > field of type numeric docvalues with same or different name and use
> > > > the LeafReader's docvalues accessors to fetch values. But that's all
> > > > way too hard. You can create function queries without hassle using
> > > > the function queries package. Or much better: I'd use the Lucene
> > > > expressions module to do this. It allows you to express the scoring
> > > > formula as a JavaScript formula and use all docvalues fields in your
> > > > document to calculate the final score.
> > > >
> > > > In both cases there is no need to create a custom scorer and
> > > > everything works efficiently. Creating own scorers just for this is
> > > > way too complicated and not recommended. This leads to usage errors
> > > > like you have discovered: slow stored fields, misuse of docvalues
> > > > APIs (those are iterators, too) or other problems.
> > > >
> > > > Uwe
> > > >
> > > > -----
> > > > Uwe Schindler
> > > > Achterdiek 19, D-28357 Bremen
> > > > http://www.thetaphi.de
> > > > eMail: u...@thetaphi.de
> > > >
> > > > > -----Original Message-----
> > > > > From: Dominik Safaric [mailto:dominiksafa...@gmail.com]
> > > > > Sent: Wednesday, October 11, 2017 11:23 AM
> > > > > To: java-user@lucene.apache.org
> > > > > Subject: Lucene 7.x custom Scorer on point values
> > > > >
> > > > > Recently I've implemented a custom Query that in turn scores
> > > > > documents using a custom Scorer implementation using long
> > > > > primitive point values. The associated field is multi valued and
> > > > > has doc values enabled. For retrieving these multi valued longs
> > > > > I've used LeafReader.document() within the Scorer implementation.
> > > > > However, the invocation requires iterating through the space of
> > > > > matching documents which may induce performance degradations.
> > > > >
> > > > > Hence my question is, what would be the most efficient
> > > > > implementation of a custom Scorer that computes scores based on
> > > > > the value of a multi valued long points field?
> > > > >
> > > > > Thanks in advance,
> > > > > Dominik
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > For additional commands, e-mail: java-user-h...@lucene.apache.org