[ 
https://issues.apache.org/jira/browse/LUCENE-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741816#comment-13741816
 ] 

Robert Muir commented on LUCENE-5178:
-------------------------------------

I am not acknowledging there is a problem: I'm just telling you that if you have 
'sparse' values in a docvalues field, and you want to emulate what FieldCache 
does in allowing you to optionally pull a bitset telling you whether a value 
does or doesn't exist, you can do the same thing at index time yourself today.
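
To make the index-time idea concrete, here is a minimal sketch in plain Java. 
It is not Lucene code: the class SparseColumnSketch and its methods are 
hypothetical stand-ins for a numeric docvalues field plus a companion 
"has value" marker that the application writes itself. It also shows why the 
value alone cannot tell a real 0 apart from a missing entry.

```java
import java.util.BitSet;

/** Hypothetical stand-in for one numeric docvalues field plus a companion presence marker. */
class SparseColumnSketch {
    private final long[] values;   // dense column; slots never written stay 0
    private final BitSet hasValue; // companion marker, written at index time

    SparseColumnSketch(int maxDoc) {
        this.values = new long[maxDoc];
        this.hasValue = new BitSet(maxDoc);
    }

    /** Index-time step: store the value and record that this document has one. */
    void add(int doc, long value) {
        values[doc] = value;
        hasValue.set(doc);
    }

    /** Returns the stored value, or the hardcoded default 0 if none was added. */
    long get(int doc) {
        return values[doc];
    }

    /** Returns true only if a value was explicitly added for this document. */
    boolean exists(int doc) {
        return hasValue.get(doc);
    }

    public static void main(String[] args) {
        SparseColumnSketch col = new SparseColumnSketch(3);
        col.add(0, 0L);  // a real value that happens to be 0
        col.add(2, 42L);
        // doc 1 is never given a value
        for (int doc = 0; doc < 3; doc++) {
            System.out.println("doc=" + doc + " value=" + col.get(doc)
                + " exists=" + col.exists(doc));
        }
    }
}
```

In this sketch docs 0 and 1 both read back as 0, and only the companion marker 
separates them.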

I'm against changing the "default" of 0 because it's both unnecessary and 
unhelpful for differentiating whether a value exists in the field. It won't 
work: for numeric types the default could be a "real value". That's why 
FieldCache does this as a bitset, and that's why FieldCache has a "hardcoded 
default" of 0. I don't want to add unnecessary complexity that ultimately 
provides no benefit (because that solves nothing, sorry).

I'm not opposed to allowing the comparators to take a Bits from somewhere 
other than the FieldCache (which I think always returns MatchAllBits for dv 
fields). This way, if someone wants this, they can do it. I do have some 
reservations about it, because it doesn't give 1-1 consistency with the 
FieldCache API (so it wouldn't "automatically" work for function queries 
without giving them special ctors too). So this would make APIs harder to use, 
and I don't like that... but it's an option, and it's totally clear to the user 
what is happening.
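
As a rough illustration of a comparator that takes its presence bits from the 
caller, here is a plain-Java sketch of a sort-missing-last ordering. 
Everything in it (MissingLastSortSketch, missingLast, sortDocs) is 
hypothetical and uses java.util.BitSet in place of Lucene's Bits; it is not 
the signature of any existing Lucene comparator.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;

class MissingLastSortSketch {
    /** Orders doc ids by ascending value; docs without a value sort after all others. */
    static Comparator<Integer> missingLast(long[] values, BitSet docsWithValue) {
        return (a, b) -> {
            boolean hasA = docsWithValue.get(a);
            boolean hasB = docsWithValue.get(b);
            if (hasA != hasB) {
                return hasA ? -1 : 1;          // the doc that has a value wins
            }
            if (!hasA) {
                return Integer.compare(a, b);  // both missing: tie-break on doc id
            }
            int cmp = Long.compare(values[a], values[b]);
            return cmp != 0 ? cmp : Integer.compare(a, b);
        };
    }

    /** Returns all doc ids sorted with the caller-supplied presence bits. */
    static Integer[] sortDocs(long[] values, BitSet docsWithValue) {
        Integer[] docs = new Integer[values.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, missingLast(values, docsWithValue));
        return docs;
    }

    public static void main(String[] args) {
        long[] values = {5, 0, 0, -3};  // doc 1 is missing, doc 2 really is 0
        BitSet present = new BitSet(4);
        present.set(0);
        present.set(2);
        present.set(3);
        System.out.println(Arrays.toString(sortDocs(values, present))); // [3, 2, 0, 1]
    }
}
```

The caller has to supply the bits explicitly here, which is the extra API 
surface (special ctors) mentioned above.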

I'm significantly less opposed to supporting an equivalent to 
FieldCache.getDocsWithField for docvalues. The advantage is we could pass 
FieldCache.getDocsWithField through to it, and sort missing-first/last, 
function queries' exist(), and so on would automatically work. The downsides 
are that it adds some complexity under the hood to deal with (e.g. IndexWriter 
consumers and codec APIs need to change, codecs need to optimize for the case 
where none are missing, etc). And is this really complexity we should be adding 
for what is supposed to be a column-stride type (like norms)? I'm just not sure 
it's the right tradeoff. 
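
The "none are missing" optimization could look roughly like the following 
plain-Java sketch. The names DocsWithFieldSketch, DocBits, and docsWithField 
are hypothetical, and DocBits is only loosely modeled on a per-document bit 
lookup; this is not an actual codec API.

```java
import java.util.BitSet;

class DocsWithFieldSketch {
    /** Hypothetical minimal per-document lookup. */
    interface DocBits {
        boolean get(int doc);
        int length();
    }

    /** Builds the lookup; when every document has a value, no bitset is retained. */
    static DocBits docsWithField(BitSet present, int maxDoc) {
        // If the first clear bit lies at or beyond maxDoc, no document is missing.
        if (present.nextClearBit(0) >= maxDoc) {
            return new DocBits() {
                public boolean get(int doc) { return true; }
                public int length() { return maxDoc; }
            };
        }
        BitSet copy = (BitSet) present.clone(); // defensive copy of the sparse case
        return new DocBits() {
            public boolean get(int doc) { return copy.get(doc); }
            public int length() { return maxDoc; }
        };
    }

    public static void main(String[] args) {
        BitSet all = new BitSet(3);
        all.set(0, 3);                  // docs 0..2 all have values
        BitSet some = new BitSet(3);
        some.set(0);                    // only doc 0 has a value
        System.out.println(docsWithField(all, 3).get(1));
        System.out.println(docsWithField(some, 3).get(1));
    }
}
```

Callers see the same interface in both cases, so the dense case costs nothing 
extra at lookup time, but the producer side still has to track and store 
presence, which is the added complexity in question.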
 
                
> doc values should allow configurable defaults
> ---------------------------------------------
>
>                 Key: LUCENE-5178
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5178
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>
> DocValues should somehow allow a configurable default per-field.
> Possible implementations include setting it on the field in the document or 
> registration of an IndexWriter callback.
> If we don't make the default configurable, then another option is to have 
> DocValues fields keep track of whether a value was indexed for that document 
> or not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
