Can you describe how you encode 'cost' as a CharsRef? The TopNSearcher (that code you found, looking for a NO_OUTPUT arc) very much relies on how PositiveIntOutputs behaves, i.e. that the min value seen so far on any prefix path will have a completion to a final node that has only NO_OUTPUT arcs.
If you turn on assertions you should see an assert failure? Mike McCandless http://blog.mikemccandless.com On Sat, Sep 26, 2015 at 12:51 AM, Aditya Tripathi <aditya.tripa...@gmail.com> wrote: > Hi, > > Question: > Looking at the code I got slightly confused if TopNSearcher would work for > non weighted CharSequence Output FST. I am trying to use a scale up model > and accommodate many tenants on one machine and hence was not planning to > use Pair<Long,CharRef> output. It would have been great if the path output > could be considered as cost and TopNSearcher could use cost of the whole > Path while decding NO_OUTPUT arc. However it goes in infinte loop for the > code snippet given below. > > Background: (X of XY problem) > I build an FST<CharRef> with input from one index field and output as > concatenation of two other stored fields. > > I wanted to get suggestions from this FST using TopNSearcher search method. > As an experimental code I just wanted to see if shortestPaths would work on > this FST with CharSequence outputs as the cost. > > It does not work. Goes in infinte loop. > > I think (Haven't digged much though and not at all sure) the problem lies > in the fact that while finding minimum weight arc (no weight here - weight > is output) - the old path is not considered and only the current arc is > compared for NO_OUTPUT. And then it keeps copying this NO_OUTPUT arc back > into the current arc later. This spins it into an infinte loop. > > In TopNSearcher search() method, the following line. Line 464 in > org/apache/lucene/util/fst/Util.java > if (comparator.compare(NO_OUTPUT, path.arc.output) == 0) > > And then copying the NO_OUTPUT arc back to the current arc spoils the fun > here in line:490 > path.arc.copyFrom > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-core/5.2.0/org/apache/lucene/util/fst/FST.java#FST.Arc.copyFrom%28org.apache.lucene.util.fst.FST.Arc%29> > (scratchArc > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-core/5.2.0/org/apache/lucene/util/fst/Util.java#Util.TopNSearcher.0scratchArc> > ); > > The sample code to reproduce this along with Sysout's for seeing how the > FST is formed is given below. > > (If I use PositiveIntOutputs it works fine. Commented lines.) > > > > public static void main(String[] args) throws IOException { > > String inputValues[] = {"aafish4","abcat","abcmonkey6" , "abcdog", > "abcdogs"}; > long outputValues[] = {14,5, 16, 7, 12}; > CharsRef[] outputValuesString = {new CharsRef("pqrfish4"),new > CharsRef("pqcat"),new CharsRef("pqsmonkey6") ,new CharsRef( "pqrsdog"), new > CharsRef("pqrdogs")}; > > PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton(); > CharSequenceOutputs outputsO = CharSequenceOutputs.getSingleton(); > //Builder<Long> builder = new Builder<Long>(INPUT_TYPE.BYTE1, > outputs); > Builder<CharsRef> builder = new Builder<CharsRef>(INPUT_TYPE.BYTE4, > outputsO); > BytesRef scratchBytes = new BytesRef(); > > IntsRef scratchInts = new IntsRef(); > > for (int i = 0; i < inputValues.length; i++) { > scratchBytes.copyChars(inputValues[i]); > //builder.add(Util.toIntsRef(scratchBytes, scratchInts), > outputValues[i]); > builder.add(Util.toIntsRef(scratchBytes, scratchInts), > outputValuesString[i]); > > } > //FST<Long> fst = builder.finish(); > FST<CharsRef> fst = builder.finish(); > Arc<CharsRef> arc; > //Arc<Long> arc; > > //Arc<Long> firstArc = fst.getFirstArc(new Arc<Long>()); > Arc<CharsRef> firstArc = fst.getFirstArc(new Arc<CharsRef>()); > arc = firstArc; > System.out.println("firstArc: " +arc + " isLastArch:"+ > arc.isLast()+" isFinal:"+arc.isFinal()+" label:"+arc.label+" > output:"+arc.output + " target:"+arc.target); > > BytesReader reader = fst.getBytesReader(); > > Arc<CharsRef> firstTargetArc = fst.readFirstTargetArc(firstArc, new > Arc<CharsRef>(), reader ); > //Arc<Long> firstTargetArc = fst.readFirstTargetArc(firstArc, new > Arc<Long>(), reader ); > arc = firstTargetArc; > System.out.println("firstTargetArc: " +arc + " isLastArch:"+ > arc.isLast()+" isFinal:"+arc.isFinal()+" label:"+arc.label+" > output:"+arc.output + " target:"+arc.target); > > //Arc<Long> lastTargetArc = fst.readLastTargetArc(firstArc, new > Arc<Long>(), reader ); > Arc<CharsRef> lastTargetArc = fst.readLastTargetArc(firstArc, new > Arc<CharsRef>(), reader ); > arc = lastTargetArc; > System.out.println("lastTargetArc: " +arc + " isLastArch:"+ > arc.isLast()+" isFinal:"+arc.isFinal()+" label:"+arc.label+" > output:"+arc.output + " target:"+arc.target); > > //Arc<Long> nextArc = fst.readNextArc(firstTargetArc, reader); > Arc<CharsRef> nextArc = fst.readNextArc(firstTargetArc, reader); > arc = nextArc; > System.out.println("nextArc: " +arc + " isLastArch:"+ > arc.isLast()+" isFinal:"+arc.isFinal()+" label:"+arc.label+" > output:"+arc.output + " target:"+arc.target); > > > System.out.println("****************************************************************************************************"); > int a=0; > arc = firstTargetArc; > while(true) { > try { > > System.out.println("nextArc: " +arc + " isLastArch:"+ > arc.isLast()+" isFinal:"+arc.isFinal()+" label:"+arc.label+" > output:"+arc.output + " target:"+arc.target); > arc = fst.readNextArc(arc, reader); > }catch (Exception e) { > break; > } > > } > > System.out.println("****************************************************************************************************"); > > Comparator<CharsRef> comparator = new Comparator<CharsRef>() { > public int compare(CharsRef left, CharsRef right) { > return left.compareTo(right); > } > }; > > /*Comparator<Long> comparator = new Comparator<Long>() { > public int compare(Long left, Long right) { > return left.compareTo(right); > } > }; */ > > //MinResult<Long> paths[] = Util.shortestPaths(fst, firstArc, 0L, > comparator, 4, false); > MinResult<CharsRef> paths[] = Util.shortestPaths(fst, firstArc, > new CharsRef(), comparator, 106, true); > > } > > > } --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org