D'oh! OK, I see where this happened. We were leveraging this: https://issues.apache.org/jira/browse/LUCENE-6212
So, changing the subject: can I get the LUCENE-6212 behaviour back in the latest Lucene somehow?

Or rather, what I am doing now is using the field's setTokenStream so that each doc has its own TokenStream. Does the TokenStream need to be private to each doc, or can one instance be reused?

Thanks,
L

On 17 February 2017 at 09:27, Ľuboš Koščo <[email protected]> wrote:

> One more question before I can work on tests:
>
> How does recent Lucene pick the appropriate analyzer for the doc?
> Have you made any changes in that area since 4.7.1?
> (That is, if we decide the indexing chain didn't influence this and it
> still uses the properly picked analyzer.)
> (I checked the changelogs and didn't find any suspicious change in that
> area.)
>
> Thanks,
> L
>
>
> On 11 February 2017 at 00:47, Michael McCandless <[email protected]> wrote:
>
>> Could you make a small standalone test case showing what used to work
>> and what no longer works?
>>
>> I don't think that issue was supposed to alter how IndexWriter
>> interacts with the analysis chain.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Fri, Feb 10, 2017 at 9:48 AM, Ľuboš Koščo <[email protected]> wrote:
>> > Or rather, how do I make the doubly inherited analyzer (at the bottom
>> > of the inheritance chain) be used again, instead of being hidden by
>> > its father, the direct descendant of Analyzer?
>> > (father:
>> > https://github.com/OpenGrok/OpenGrok/blob/master/src/org/opensolaris/opengrok/analysis/FileAnalyzer.java
>> > child:
>> > https://github.com/OpenGrok/OpenGrok/blob/master/src/org/opensolaris/opengrok/analysis/java/JavaAnalyzer.java
>> > - looking at the above, the inheritance is even deeper, so
>> > Analyzer -> FileAnalyzer -> ... -> JavaAnalyzer as the last child)
>> >
>> > (Funnily enough, the code on our side that creates docs didn't really
>> > change since 4.7.1, but new Lucene now picks FileAnalyzer over any
>> > other analyzer for createComponents anyway.)
>> >
>> > TIA,
>> > L
>> >
>> > On 10 February 2017 at 13:41, Ľuboš Koščo <[email protected]> wrote:
>> >>
>> >> Hi guys, Mike,
>> >>
>> >> Is there any chance I can somehow get the indexing chain in 6.4.1 to
>> >> behave as it did before LUCENE-5611?
>> >>
>> >> We used to have analyzers that inherited multiple times from Analyzer
>> >> (e.g. a second child with a relaxed and overridden createComponents),
>> >> and Lucene used to run them properly for the appropriate docs.
>> >> After LUCENE-5611, though, I can see the chain changed and only the
>> >> first child is ever taken into account, even though the document is
>> >> handled by the proper analyzer.
>> >> (Basically, between 4.7.1 and 6.4.1 something changed that made Lucene
>> >> ignore the second child of Analyzer and always use the first one
>> >> (and its father, the direct override of createComponents).)
>> >> Some code pointers on what used to work and now doesn't:
>> >> https://github.com/OpenGrok/OpenGrok/issues/1376
>> >> (I tried to dig through the changelogs, and the only thing I found is
>> >> really around 5611, hence this silly question.)
>> >>
>> >> Any clues how to get the old behaviour back?
>> >>
>> >> Thanks,
>> >> L
