[ https://issues.apache.org/jira/browse/LUCENE-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572163#comment-15572163 ]

Michael McCandless commented on LUCENE-7489:
--------------------------------------------

+1, this patch looks wonderful!

It looks like it uses the same compression techniques for the values as the 6.x 
codec, but then for "which docIDs have a value" it has three different 
approaches, for the very sparse, mostly dense, and 100% dense cases.
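
To illustrate the three cases, here is a rough sketch. It is not code from the 
patch: the class name, the method names and the size-based cutoff are all 
hypothetical, chosen only to show how the choice of representation for "which 
docIDs have a value" can depend on density.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Illustrative only: names and the cutoff below are assumptions, not the patch's. */
public class DocsWithValueSketch {
    enum Encoding { SPARSE, DENSE, ALL }

    private final Encoding encoding;
    private final int[] sortedDocs; // only used for SPARSE
    private final BitSet bits;      // only used for DENSE
    private final int maxDoc;

    /** sortedDocIDs: strictly increasing doc IDs in [0, maxDoc) that have a value. */
    DocsWithValueSketch(int[] sortedDocIDs, int maxDoc) {
        this.maxDoc = maxDoc;
        if (sortedDocIDs.length == maxDoc) {
            // 100% dense: every document has a value, so nothing needs to be stored.
            encoding = Encoding.ALL;
            sortedDocs = null;
            bits = null;
        } else if ((long) sortedDocIDs.length * Integer.SIZE < maxDoc) {
            // Very sparse: a sorted int list (32 bits per doc with a value)
            // is smaller than a bitset (1 bit per doc). Assumed cutoff.
            encoding = Encoding.SPARSE;
            sortedDocs = sortedDocIDs.clone();
            bits = null;
        } else {
            // Mostly dense: one bit per document.
            encoding = Encoding.DENSE;
            sortedDocs = null;
            bits = new BitSet(maxDoc);
            for (int doc : sortedDocIDs) {
                bits.set(doc);
            }
        }
    }

    Encoding encoding() {
        return encoding;
    }

    /** docID is assumed to be in [0, maxDoc). */
    boolean hasValue(int docID) {
        switch (encoding) {
            case ALL:
                return docID >= 0 && docID < maxDoc;
            case SPARSE:
                return Arrays.binarySearch(sortedDocs, docID) >= 0;
            default:
                return bits.get(docID);
        }
    }
}
```

With this cutoff, two documents with values out of 1000 would be stored as a 
sorted list, four out of five as a bitset, and three out of three would store 
nothing at all.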

I hit this test failure, but it doesn't reproduce on trunk (though it could 
still be a pre-existing issue if, e.g., this patch shifted the random seeds):

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestBlockJoinSorting -Dtests.method=testNestedSorting -Dtests.seed=A0B8F022A1A8B661 -Dtests.locale=en-CA -Dtests.timezone=Etc/GMT+4 -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.20s | TestBlockJoinSorting.testNestedSorting <<<
   [junit4]    > Throwable #1: org.junit.ComparisonFailure: expected:<[e]> but was:<[f]>
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([A0B8F022A1A8B661:A8511D63E101BB0F]:0)
   [junit4]    >        at org.apache.lucene.search.join.TestBlockJoinSorting.testNestedSorting(TestBlockJoinSorting.java:233)
   [junit4]    >        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {field1=FST50, __type=Lucene50(blocksize=128), filter_1=Lucene50(blocksize=128), field2=Lucene50(blocksize=128)}, docValues:{field2=DocValuesFormat(name=Asserting)}, maxPointsInLeafNode=972, maxMBSortInHeap=5.645435808865713, sim=RandomSimilarity(queryNorm=false): {}, locale=en-CA, timezone=Etc/GMT+4
   [junit4]   2> NOTE: Linux 4.4.0-38-generic amd64/Oracle Corporation 1.8.0_101 (64-bit)/cpus=8,threads=1,free=420118024,total=514850816
   [junit4]   2> NOTE: All tests run in this JVM: [TestBlockJoinSorting]
   [junit4] Completed [1/1 (1!)] in 0.37s, 1 test, 1 failure <<< FAILURES!
{noformat}

> Improve sparsity support of Lucene70DocValuesFormat
> ---------------------------------------------------
>
>                 Key: LUCENE-7489
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7489
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7489.patch, LUCENE-7489.patch
>
>
> Like Lucene70NormsFormat, it should be able to only encode actual values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
