Can you describe how you encode 'cost' as a CharsRef?

The TopNSearcher (that code you found, looking for a NO_OUTPUT arc)
very much relies on how PositiveIntOutputs behaves, i.e. that the min
value seen so far on any prefix path will have a completion to a final
node that has only NO_OUTPUT arcs.

If you turn on assertions you should see an assert failure?

Mike McCandless

http://blog.mikemccandless.com

On Sat, Sep 26, 2015 at 12:51 AM, Aditya Tripathi
<aditya.tripa...@gmail.com> wrote:
> Hi,
>
> Question:
> Looking at the code I got slightly confused if TopNSearcher would work for
> non weighted CharSequence Output FST. I am trying to use a scale up model
> and accommodate many tenants on one machine and hence was not planning to
> use Pair<Long,CharRef> output. It would have been great if the path output
> could be considered as cost and TopNSearcher could use cost of the whole
> Path while decding NO_OUTPUT arc. However it goes in infinte loop for the
> code snippet given below.
>
> Background: (X of XY problem)
> I build an FST<CharRef> with input from one index field and output as
> concatenation of two other stored fields.
>
> I wanted to get suggestions from this FST using TopNSearcher search method.
> As an experimental code I just wanted to see if shortestPaths would work on
> this FST with CharSequence outputs as the cost.
>
> It does not work. Goes in infinte loop.
>
> I think (Haven't digged much though and not at all sure) the problem lies
> in the fact that while finding minimum weight arc (no weight here - weight
> is output) - the old path is not considered and only the current arc is
> compared for NO_OUTPUT. And then it keeps copying this NO_OUTPUT arc back
> into the current arc later. This spins it into an infinte loop.
>
> In TopNSearcher search() method, the following line. Line 464 in
> org/apache/lucene/util/fst/Util.java
> if (comparator.compare(NO_OUTPUT, path.arc.output) == 0)
>
> And then copying the NO_OUTPUT arc back to the current arc spoils the fun
> here in line:490
> path.arc.copyFrom
> <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-core/5.2.0/org/apache/lucene/util/fst/FST.java#FST.Arc.copyFrom%28org.apache.lucene.util.fst.FST.Arc%29>
> (scratchArc
> <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.lucene/lucene-core/5.2.0/org/apache/lucene/util/fst/Util.java#Util.TopNSearcher.0scratchArc>
> );
>
> The sample code to reproduce this along with Sysout's for seeing how the
> FST is formed is given below.
>
>  (If I use PositiveIntOutputs it works fine. Commented lines.)
>
>
>
> public static void main(String[] args) throws IOException {
>
>         String inputValues[] = {"aafish4","abcat","abcmonkey6" , "abcdog",
> "abcdogs"};
>         long outputValues[] = {14,5, 16, 7, 12};
>         CharsRef[] outputValuesString = {new CharsRef("pqrfish4"),new
> CharsRef("pqcat"),new CharsRef("pqsmonkey6") ,new CharsRef( "pqrsdog"), new
> CharsRef("pqrdogs")};
>
>         PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
>         CharSequenceOutputs outputsO = CharSequenceOutputs.getSingleton();
>         //Builder<Long> builder = new Builder<Long>(INPUT_TYPE.BYTE1,
> outputs);
>         Builder<CharsRef> builder = new Builder<CharsRef>(INPUT_TYPE.BYTE4,
> outputsO);
>         BytesRef scratchBytes = new BytesRef();
>
>         IntsRef scratchInts = new IntsRef();
>
>         for (int i = 0; i < inputValues.length; i++) {
>           scratchBytes.copyChars(inputValues[i]);
>           //builder.add(Util.toIntsRef(scratchBytes, scratchInts),
> outputValues[i]);
>           builder.add(Util.toIntsRef(scratchBytes, scratchInts),
> outputValuesString[i]);
>
>         }
>         //FST<Long> fst = builder.finish();
>         FST<CharsRef> fst = builder.finish();
>         Arc<CharsRef> arc;
>         //Arc<Long> arc;
>
>         //Arc<Long> firstArc = fst.getFirstArc(new Arc<Long>());
>         Arc<CharsRef> firstArc = fst.getFirstArc(new Arc<CharsRef>());
>         arc = firstArc;
>         System.out.println("firstArc: " +arc + "  isLastArch:"+
> arc.isLast()+"   isFinal:"+arc.isFinal()+"  label:"+arc.label+"
>  output:"+arc.output + "  target:"+arc.target);
>
>         BytesReader reader = fst.getBytesReader();
>
>         Arc<CharsRef> firstTargetArc = fst.readFirstTargetArc(firstArc, new
> Arc<CharsRef>(), reader );
>         //Arc<Long> firstTargetArc = fst.readFirstTargetArc(firstArc, new
> Arc<Long>(), reader );
>         arc = firstTargetArc;
>         System.out.println("firstTargetArc: " +arc + "  isLastArch:"+
> arc.isLast()+"   isFinal:"+arc.isFinal()+"  label:"+arc.label+"
>  output:"+arc.output + "  target:"+arc.target);
>
>         //Arc<Long> lastTargetArc = fst.readLastTargetArc(firstArc, new
> Arc<Long>(), reader );
>         Arc<CharsRef> lastTargetArc = fst.readLastTargetArc(firstArc, new
> Arc<CharsRef>(), reader );
>         arc = lastTargetArc;
>         System.out.println("lastTargetArc: " +arc + "  isLastArch:"+
> arc.isLast()+"   isFinal:"+arc.isFinal()+"  label:"+arc.label+"
>  output:"+arc.output + "  target:"+arc.target);
>
>         //Arc<Long> nextArc = fst.readNextArc(firstTargetArc, reader);
>         Arc<CharsRef> nextArc = fst.readNextArc(firstTargetArc, reader);
>         arc = nextArc;
>         System.out.println("nextArc: " +arc + "  isLastArch:"+
> arc.isLast()+"   isFinal:"+arc.isFinal()+"  label:"+arc.label+"
>  output:"+arc.output + "  target:"+arc.target);
>
>
> System.out.println("****************************************************************************************************");
>         int a=0;
>         arc = firstTargetArc;
>         while(true) {
>             try {
>
>             System.out.println("nextArc: " +arc + "  isLastArch:"+
> arc.isLast()+"   isFinal:"+arc.isFinal()+"  label:"+arc.label+"
>  output:"+arc.output + "  target:"+arc.target);
>             arc = fst.readNextArc(arc, reader);
>             }catch (Exception e) {
>                 break;
>             }
>
>         }
>
> System.out.println("****************************************************************************************************");
>
>         Comparator<CharsRef> comparator = new Comparator<CharsRef>() {
>             public int compare(CharsRef left, CharsRef right) {
>               return left.compareTo(right);
>             }
>           };
>
>           /*Comparator<Long> comparator = new Comparator<Long>() {
>               public int compare(Long left, Long right) {
>                 return left.compareTo(right);
>               }
>             }; */
>
>           //MinResult<Long> paths[] = Util.shortestPaths(fst, firstArc, 0L,
> comparator, 4, false);
>          MinResult<CharsRef> paths[] = Util.shortestPaths(fst, firstArc,
> new CharsRef(), comparator, 106, true);
>
>       }
>
>
> }

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to