Dawid Weiss created LUCENE-7477:
-----------------------------------

             Summary: ExternalRefSorter should use OfflineSorter's actual writer for writing the input file
                 Key: LUCENE-7477
                 URL: https://issues.apache.org/jira/browse/LUCENE-7477
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Dawid Weiss
            Assignee: Dawid Weiss
            Priority: Minor
             Fix For: 6.x, master (7.0)


Consider this constructor in ExternalRefSorter:
{code}
  public ExternalRefSorter(OfflineSorter sorter) throws IOException {
    this.sorter = sorter;
    this.input = sorter.getDirectory().createTempOutput(
        sorter.getTempFileNamePrefix(), "RefSorterRaw", IOContext.DEFAULT);
    this.writer = new OfflineSorter.ByteSequencesWriter(this.input);
  }
{code}

The problem is that the initial input file is written with the default 
{{OfflineSorter.ByteSequencesWriter}}, but the {{OfflineSorter}} instance may 
be unable to read it if it overrides {{getReader}} to use something other than 
the default.
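
To illustrate the mismatch, here is a minimal, self-contained sketch that does not use Lucene at all. The class and method names are hypothetical stand-ins, and the two on-disk formats (a 2-byte length prefix versus fixed-width records) are assumptions chosen only to show what happens when the writer and the reader disagree:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT Lucene's classes.
class FormatMismatchSketch {

  /** Assumed "default" format: a 2-byte length followed by the record bytes. */
  static byte[] writeLengthPrefixed(List<String> records) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (String r : records) {
        byte[] b = r.getBytes(StandardCharsets.UTF_8);
        out.writeShort(b.length);
        out.write(b);
      }
    }
    return bytes.toByteArray();
  }

  /** Reader that matches the length-prefixed writer above. */
  static List<String> readLengthPrefixed(byte[] file) throws IOException {
    List<String> result = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      while (in.available() > 0) {
        byte[] b = new byte[in.readUnsignedShort()];
        in.readFully(b);
        result.add(new String(b, StandardCharsets.UTF_8));
      }
    }
    return result;
  }

  /** Reader of a "customized" sorter: fixed-width records, no length prefix. */
  static List<String> readFixedWidth(byte[] file, int width) throws IOException {
    List<String> result = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      while (in.available() > 0) {
        byte[] b = new byte[width];
        in.readFully(b); // throws EOFException if the length is not a multiple of width
        result.add(new String(b, StandardCharsets.UTF_8));
      }
    }
    return result;
  }
}
```

Reading the file with the matching reader returns the original records, while the fixed-width reader interprets the length prefixes as data and returns something else (or fails outright when the sizes do not line up).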

While this works now, it should be cleaned up (I think). It would probably be 
ideal to allow {{OfflineSorter}} to create its own temporary file and just 
return the {{ByteSequencesWriter}} it chooses to use, so the above snippet 
would read:

{code}
  public ExternalRefSorter(OfflineSorter sorter) throws IOException {
    this.sorter = sorter;
    this.writer = sorter.newUnsortedPartition();
  }
{code}

This could also be extended so that {{OfflineSorter}} is in charge of managing 
its own (sorted and unsorted) partitions. Then {{sort(String file)}} would 
simply become {{ByteSequenceIterator sort()}} (or even {{Stream<BytesRef> 
sort()}}, as Stream is conveniently {{AutoCloseable}}). If we made 
{{OfflineSorter}} implement {{Closeable}}, it could also take care of cleaning 
up any resources it opens in the directory we pass to it.
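
A rough sketch of what such a self-managing sorter could look like, again without Lucene. Everything here is hypothetical: the class name, the record format, and the use of {{java.nio.file}} temp files are assumptions, and the sort itself is done in memory purely to keep the example short (a real offline sorter would merge partitions on disk):

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the proposed shape; this is NOT Lucene's OfflineSorter.
class SelfManagingSorterSketch implements Closeable {
  private final Path dir;
  private final List<Path> partitions = new ArrayList<>();

  SelfManagingSorterSketch(Path dir) {
    this.dir = dir;
  }

  /** The sorter creates the temp file itself and decides which writer to hand out. */
  DataOutputStream newUnsortedPartition() throws IOException {
    Path p = Files.createTempFile(dir, "sorter", ".unsorted");
    partitions.add(p);
    return new DataOutputStream(new BufferedOutputStream(Files.newOutputStream(p)));
  }

  /** Writes one record in the format this sorter's own reader expects. */
  static void write(DataOutputStream out, String record) throws IOException {
    byte[] b = record.getBytes(StandardCharsets.UTF_8);
    out.writeShort(b.length);
    out.write(b);
  }

  /** Reads all partitions back and sorts them (in memory, for brevity only). */
  Iterator<String> sort() throws IOException {
    List<String> all = new ArrayList<>();
    for (Path p : partitions) {
      try (DataInputStream in =
          new DataInputStream(new BufferedInputStream(Files.newInputStream(p)))) {
        while (true) {
          int len;
          try {
            len = in.readUnsignedShort();
          } catch (EOFException e) {
            break; // end of this partition
          }
          byte[] b = new byte[len];
          in.readFully(b);
          all.add(new String(b, StandardCharsets.UTF_8));
        }
      }
    }
    Collections.sort(all);
    return all.iterator();
  }

  /** Deletes every temporary file this sorter created. */
  @Override
  public void close() throws IOException {
    for (Path p : partitions) {
      Files.deleteIfExists(p);
    }
    partitions.clear();
  }
}
```

The point of the design is that the caller never sees file names or picks a writer: the same object that chooses the format also reads it back and removes the temporary files in {{close()}}.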



