ErickErickson commented on pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733#issuecomment-696828850


   
   
   > On Sep 22, 2020, at 11:04 AM, Michael McCandless 
<[email protected]> wrote:
   > 
   > 
   > So I propose we get rid of the fullPathField altogether.
   > 
   > Wow, +1, this looks like it is (pre-existingly?) double-indexed? Maybe we 
should do this as a separate precursor PR to this one (switch to StoredField 
when indexing the fullPathField)?
   > 
   > For maintaining backwards compatibility, we can read facet labels from new 
BinaryDocValues field, falling back to old StoredField if BinaryDocValues field 
does not exist or has no value for the docId. The performance penalty of doing 
so should be acceptable.
   > 
   > Yeah, +1 to trying BinaryDocValues first, on a hit-by-hit basis, and then 
falling back to the StoredField. This is the cost of backwards compatibility ... 
though, for a fully new (all BinaryDocValues) index, the performance should be 
fine. Also, note that in Lucene 10.x we can remove that back-compat fallback.
   > 
   > Alternatively we can implement a special merge policy that takes care of 
moving data from old Stored field to BinaryDocValues field at the time of merge 
but that might be tricky to implement.
   > 
   > I think this would indeed be tricky.
   
   Andrzej and I spent quite a bit of time trying to get something similar to 
work for adding docValues on the fly using a custom merge policy. We realized 
that you could create a docValues field from an indexed field for primitive 
types, since all the information was already in the index. We never could get it 
working while active indexing was happening, so we resorted to a batch process 
that rewrote all segments, doing the transformation along the way, and that had 
to be run on a quiescent index. The client decided that was good enough and 
didn’t want to spend more time on it.
   
   Our best guess was that there was a race condition that we somehow couldn’t 
find in the time allowed. Mostly just FYI.
   
   FWIW,
   Erick
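
For reference, the per-hit fallback discussed in the quoted text (try the new BinaryDocValues field first, then fall back to the old StoredField) could be sketched roughly as below. This is only an illustration in plain Java, not Lucene code: the class `FacetLabelFallback`, its methods, and the two maps standing in for the per-document fields are all hypothetical names made up for this example, and nothing here reflects what the PR actually implements.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a per-hit facet label lookup; not Lucene's API. */
final class FacetLabelFallback {
    // Assumption: models the new BinaryDocValues field as docId -> label bytes.
    private final Map<Integer, byte[]> binaryDocValues = new HashMap<>();
    // Assumption: models the legacy StoredField as docId -> label string.
    private final Map<Integer, String> storedField = new HashMap<>();

    /** Records a label the way a newly written index would (doc values only). */
    void addNewStyle(int docId, String label) {
        binaryDocValues.put(docId, label.getBytes(StandardCharsets.UTF_8));
    }

    /** Records a label the way an old index would (stored field only). */
    void addOldStyle(int docId, String label) {
        storedField.put(docId, label);
    }

    /**
     * Looks up the label for one hit: doc values first, stored field second.
     * Returns an empty Optional when neither source has a value for docId.
     */
    Optional<String> label(int docId) {
        byte[] bytes = binaryDocValues.get(docId);
        if (bytes != null) {
            // Fast path: a fully new index always ends up here.
            return Optional.of(new String(bytes, StandardCharsets.UTF_8));
        }
        // Back-compat path: only reached for documents written by old code.
        return Optional.ofNullable(storedField.get(docId));
    }

    public static void main(String[] args) {
        FacetLabelFallback index = new FacetLabelFallback();
        index.addNewStyle(1, "Author/Lisa"); // written by new code
        index.addOldStyle(2, "Author/Bob");  // written by old code
        System.out.println(index.label(1));
        System.out.println(index.label(2));
        System.out.println(index.label(3)); // no value in either source
    }
}
```

The point of the sketch is only the ordering: a fully new index never reaches the second lookup, so the back-compat cost is paid only for documents written before the switch, and the fallback branch can be deleted once old indexes are no longer supported.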
   
   
   




