[ 
https://issues.apache.org/jira/browse/LUCENE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180435#comment-13180435
 ] 

Michael McCandless commented on LUCENE-3628:
--------------------------------------------

Patch looks great!  What an awesome step forward... once we later fix
sim to be able to do whatever it wants at indexing time (eg, use a
4-byte float), then apps are free to create arbitrary "norms" per doc!
Wonderful...
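
For illustration only, here is a minimal sketch of what storing a norm as a full 4-byte float could look like, as opposed to a lossy single-byte encoding. The class and method names (FloatNormSketch, encode, decode) are hypothetical and are not part of the patch or of any Lucene API; the sketch assumes big-endian byte order.

```java
import java.nio.ByteBuffer;

public class FloatNormSketch {
  // Hypothetical helper: store a per-document norm as 4 bytes (IEEE 754 float bits).
  public static byte[] encode(float norm) {
    return ByteBuffer.allocate(4).putInt(Float.floatToIntBits(norm)).array();
  }

  // Hypothetical helper: read the 4 bytes back into the original float.
  public static float decode(byte[] bytes) {
    return Float.intBitsToFloat(ByteBuffer.wrap(bytes).getInt());
  }

  public static void main(String[] args) {
    float norm = (float) (1.0 / Math.sqrt(7)); // e.g. a length norm for a 7-term field
    byte[] stored = encode(norm);
    System.out.println(stored.length + " bytes, round-trips exactly: " + (decode(stored) == norm));
  }
}
```

The point of the sketch is only that a 4-byte representation round-trips the value without quantization; how a similarity would actually hand such a value to the indexing chain is left open here.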

It's nice we now have a default merge base impl for DocValues, which
the 4.0 codec then overrides with its low-RAM impls.  Though, why do
we need to add the 3 per-type add methods...?  Can we somehow use the
existing add that takes a DocValue...?  (We can also fix this up
separately after committing...).
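
To make the question concrete, here is a rough sketch of a consumer with a single add method that dispatches on the value's type, instead of one add method per type. Every name in it (SingleAddSketch, Value, Type, Consumer, intValue) is hypothetical and chosen for illustration; it does not reproduce the patch or the real DocValues classes.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleAddSketch {
  public enum Type { INT, FLOAT, BYTES }

  /** Hypothetical per-document value holder; not a Lucene interface. */
  public interface Value {
    Type type();
    long getInt();
    double getFloat();
    byte[] getBytes();
  }

  /** Hypothetical consumer: one entry point that dispatches on the value's type. */
  public static class Consumer {
    public final List<String> written = new ArrayList<String>();

    public void add(int docID, Value value) {
      switch (value.type()) {
        case INT:
          written.add(docID + ":int:" + value.getInt());
          break;
        case FLOAT:
          written.add(docID + ":float:" + value.getFloat());
          break;
        case BYTES:
          written.add(docID + ":bytes:" + value.getBytes().length);
          break;
      }
    }
  }

  /** Convenience factory for an INT-typed value, used only by this sketch. */
  public static Value intValue(final long v) {
    return new Value() {
      public Type type() { return Type.INT; }
      public long getInt() { return v; }
      public double getFloat() { throw new UnsupportedOperationException(); }
      public byte[] getBytes() { throw new UnsupportedOperationException(); }
    };
  }

  public static void main(String[] args) {
    Consumer consumer = new Consumer();
    consumer.add(0, intValue(124));
    System.out.println(consumer.written);
  }
}
```

The trade-off is that the caller must wrap each value in a holder, while per-type methods avoid that wrapping at the cost of a wider API.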

The PerDocConsumer.pull seems a bit odd... we need this just because
we don't know if we are merging .normValues vs .docValues right?
Maybe rename pull to getDocValuesToMerge?
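
As a sketch of how such a hook might read under the suggested name: the shared merge loop lives in the base class, and each subclass only answers "which values do I merge?". The Reader, Consumer, DocValuesMerger and NormsMerger types below are hypothetical stand-ins, with plain long[] arrays in place of real per-document values; only the method name getDocValuesToMerge comes from the suggestion above.

```java
import java.util.HashMap;
import java.util.Map;

public class PullHookSketch {
  /** Hypothetical stand-in for a segment reader exposing two kinds of per-field values. */
  public static class Reader {
    public final Map<String, long[]> docValues = new HashMap<String, long[]>();
    public final Map<String, long[]> normValues = new HashMap<String, long[]>();
  }

  /** Shared merge logic; subclasses only choose the source of the values. */
  public abstract static class Consumer {
    protected abstract long[] getDocValuesToMerge(Reader reader, String field);

    public long[] merge(String field, Reader... readers) {
      // First pass: count values so the merged array can be sized once.
      int total = 0;
      for (Reader r : readers) {
        long[] values = getDocValuesToMerge(r, field);
        if (values != null) total += values.length;
      }
      // Second pass: append each reader's values in reader order.
      long[] merged = new long[total];
      int upto = 0;
      for (Reader r : readers) {
        long[] values = getDocValuesToMerge(r, field);
        if (values != null) {
          System.arraycopy(values, 0, merged, upto, values.length);
          upto += values.length;
        }
      }
      return merged;
    }
  }

  public static class DocValuesMerger extends Consumer {
    protected long[] getDocValuesToMerge(Reader reader, String field) {
      return reader.docValues.get(field);
    }
  }

  public static class NormsMerger extends Consumer {
    protected long[] getDocValuesToMerge(Reader reader, String field) {
      return reader.normValues.get(field);
    }
  }
}
```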

This change is curious:
{noformat}
+            try {
             assert dir.fileExists(IndexFileNames.segmentFileName(filename, "",
                 Writer.INDEX_EXTENSION));
+            } catch (IOException e) {
+            }
+            break;
{noformat}

... what motivated that?  Does MDW sometimes throw errant exceptions
in fileExists...?  Or some test was failing...?
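
If the try/catch exists only because fileExists declares IOException, one alternative (a sketch, not what the patch does) is a small helper that rethrows the exception unchecked, so the assert stays a one-liner and a real I/O failure is not silently swallowed. The Dir interface and existsOrFail helper are hypothetical; UncheckedIOException requires Java 8 or later, and on older JVMs a plain RuntimeException wrapper would serve the same purpose.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class AssertFileExistsSketch {
  /** Hypothetical stand-in for a directory whose fileExists may throw IOException. */
  public interface Dir {
    boolean fileExists(String name) throws IOException;
  }

  /** Returns whether the file exists; surfaces I/O failures instead of hiding them. */
  public static boolean existsOrFail(Dir dir, String name) {
    try {
      return dir.fileExists(name);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A call site would then read `assert existsOrFail(dir, fileName);`, with no empty catch block.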

Hmm I hit an exc in TestSearcherManager, but it doesn't always
reproduce:

{noformat}
NOTE: reproduce with: ant test -Dtestcase=TestSearcherManager -Dtestmethod=testSearcherManager -Dtests.seed=9ddad32eae88e7f:4561a85cf046520e:1f76c8c586f3eb2d -Dargs="-Dfile.encoding=UTF-8"
NOTE: test params are: codec=Lucene40: {extra29=PostingsFormat(name=MockSep), 
extra28=PostingsFormat(name=Lucene40WithOrds), 
body=MockVariableIntBlock(baseBlockSize=43), 
extra27=PostingsFormat(name=NestedPulsing), 
extra26=MockVariableIntBlock(baseBlockSize=43), 
extra25=MockFixedIntBlock(blockSize=1841), extra24=Lucene40(minBlockSize=42 
maxBlockSize=86), extra23=Pulsing40(freqCutoff=13 minBlockSize=3 
maxBlockSize=44), extra22=PostingsFormat(name=MockRandom), 
packID=MockVariableIntBlock(baseBlockSize=43), 
date=PostingsFormat(name=MockRandom), docid=PostingsFormat(name=MockSep), 
title=PostingsFormat(name=Lucene40WithOrds), 
extra20=PostingsFormat(name=Lucene40WithOrds), 
extra21=PostingsFormat(name=MockSep), 
extra38=PostingsFormat(name=NestedPulsing), 
extra8=PostingsFormat(name=MockRandom), extra12=Pulsing40(freqCutoff=13 
minBlockSize=3 maxBlockSize=44), 
extra37=MockVariableIntBlock(baseBlockSize=43), 
extra11=PostingsFormat(name=MockRandom), extra9=Pulsing40(freqCutoff=13 
minBlockSize=3 maxBlockSize=44), extra14=MockFixedIntBlock(blockSize=1841), 
extra13=Lucene40(minBlockSize=42 maxBlockSize=86), 
extra39=PostingsFormat(name=Lucene40WithOrds), extra34=Pulsing40(freqCutoff=13 
minBlockSize=3 maxBlockSize=44), extra16=PostingsFormat(name=NestedPulsing), 
extra15=MockVariableIntBlock(baseBlockSize=43), 
extra33=PostingsFormat(name=MockRandom), 
extra36=MockFixedIntBlock(blockSize=1841), 
extra18=PostingsFormat(name=MockSep), extra35=Lucene40(minBlockSize=42 
maxBlockSize=86), extra17=PostingsFormat(name=Lucene40WithOrds), 
extra0=PostingsFormat(name=MockRandom), 
thisCodeMakesAbsolutelyNoSenseCanWeDeleteIt=PostingsFormat(name=NestedPulsing), 
extra1=Pulsing40(freqCutoff=13 minBlockSize=3 maxBlockSize=44), 
extra2=Lucene40(minBlockSize=42 maxBlockSize=86), 
extra3=MockFixedIntBlock(blockSize=1841), 
extra5=PostingsFormat(name=NestedPulsing), 
extra6=PostingsFormat(name=Lucene40WithOrds), 
extra7=PostingsFormat(name=MockSep), titleTokenized=Pulsing40(freqCutoff=13 
minBlockSize=3 maxBlockSize=44), extra30=PostingsFormat(name=NestedPulsing), 
extra31=PostingsFormat(name=Lucene40WithOrds), 
extra32=PostingsFormat(name=MockSep), extra10=PostingsFormat(name=MockSep)}, 
sim=RandomSimilarityProvider(queryNorm=true,coord=true): 
{extra29=BM25(k1=1.2,b=0.75), extra28=DFR I(ne)L2, body=DFR I(F)2, extra27=DFR 
I(F)B3(800.0), extra26=DFR I(n)LZ(0.3), extra25=DFR I(n)3(800.0), extra24=DFR 
GB2, extra23=DFR I(n)Z(0.3), extra22=DFR I(F)L2, packID=IB SPL-D2, date=DFR 
I(F)1, docid=DFR GL2, title=DFR I(n)2, extra20=DFR GLZ(0.3), extra21=IB SPL-L2, 
extra38=DefaultSimilarity, extra8=DFR I(ne)2, extra12=DefaultSimilarity, 
extra11=IB SPL-D2, extra37=IB SPL-D2, extra9=IB LL-L2, extra14=IB SPL-D1, 
extra13=DFR I(ne)B3(800.0), extra39=DFR I(ne)B3(800.0), extra16=IB SPL-LZ(0.3), 
extra34=IB SPL-L1, extra15=DFR I(n)L2, extra33=IB LL-L1, extra18=DFR I(ne)B2, 
extra36=DFR I(ne)L1, extra35=DFR I(ne)1, extra17=DFR I(n)L3(800.0), extra0=IB 
SPL-D2, extra1=DefaultSimilarity, extra2=DFR I(ne)B3(800.0), extra3=IB SPL-D1, 
extra5=IB SPL-LZ(0.3), extra6=DFR I(n)L3(800.0), extra7=DFR I(ne)B2, 
titleTokenized=DFR I(ne)B1, extra30=DFR I(ne)Z(0.3), extra31=DFR I(n)1, 
extra32=IB LL-D2, extra10=DFR GB3(800.0)}, locale=pt, timezone=VST
NOTE: all tests run in this JVM:
[TestShardSearching, TestComplexExplanations, TestTieredMergePolicy, 
TestLongPostings, TestFieldCacheRewriteMethod, TestSlowCollationMethods, 
TestDocsAndPositions, TestSearcherManager]

java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_6_nrm.cfs=1}
        at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:545)
        at org.apache.lucene.index.ThreadedIndexingAndSearchingTestCase.runTest(ThreadedIndexingAndSearchingTestCase.java:629)
        at org.apache.lucene.search.TestSearcherManager.testSearcherManager(TestSearcherManager.java:52)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at org.junit.rules.TestWatchman$1.evaluate(TestWatchman.java:48)
        at org.apache.lucene.util.LuceneTestCase$3$1.evaluate(LuceneTestCase.java:528)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
        at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:165)
        at org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
        at org.junit.runners.Suite.runChild(Suite.java:128)
        at org.junit.runners.Suite.runChild(Suite.java:24)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:157)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:136)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
        at org.junit.runner.JUnitCore.runMain(JUnitCore.java:98)
        at org.junit.runner.JUnitCore.runMainAndExit(JUnitCore.java:53)
        at org.junit.runner.JUnitCore.main(JUnitCore.java:45)
Caused by: java.lang.RuntimeException: unclosed IndexOutput: _6_nrm.cfs
        at org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:469)
        at org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:441)
        at org.apache.lucene.store.CompoundFileWriter.getOutput(CompoundFileWriter.java:124)
        at org.apache.lucene.store.CompoundFileWriter.createOutput(CompoundFileWriter.java:260)
        at org.apache.lucene.store.CompoundFileDirectory.createOutput(CompoundFileDirectory.java:290)
        at org.apache.lucene.codecs.lucene40.values.Bytes$BytesWriterBase.getOrCreateDataOut(Bytes.java:257)
        at org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$Writer.merge(FixedStraightBytesImpl.java:138)
        at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:90)
        at org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:57)
        at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:391)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:127)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3630)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3258)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:382)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:451)
{noformat}
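
The trace only shows that an output created during mergeNorms was still open when the directory was closed; it does not show why. As a generic illustration of guarding against this class of leak (not a diagnosis and not a proposed fix), a writer can close its output in a finally block so the handle is released even when the write step throws. TrackedOutput, WriteStep and writeAndClose are hypothetical names used only in this sketch.

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseOnFailureSketch {
  /** Hypothetical output that records whether close() was called. */
  public static class TrackedOutput implements Closeable {
    public boolean closed;

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical unit of work that writes to an output and may fail. */
  public interface WriteStep {
    void run(TrackedOutput out) throws IOException;
  }

  /** Runs the step and closes the output on both the success and failure paths. */
  public static TrackedOutput writeAndClose(WriteStep step) throws IOException {
    TrackedOutput out = new TrackedOutput();
    try {
      step.run(out);
    } finally {
      out.close();
    }
    return out;
  }
}
```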


                
> Cut Norms over to DocValues
> ---------------------------
>
>                 Key: LUCENE-3628
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3628
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index, core/search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3628.patch, LUCENE-3628.patch, LUCENE-3628.patch, 
> LUCENE-3628.patch, LUCENE-3628.patch, LUCENE-3628.patch
>
>
> Since IR is now fully R/O and norms are inside codecs, we can cut over to 
> an IDV impl for writing norms. LUCENE-3606 has some 
> [ideas|https://issues.apache.org/jira/browse/LUCENE-3606?focusedCommentId=13160559&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13160559]
>  about how this could be implemented.



