Split the filename into "basefilename" and "version", and index each as a
keyword field.

Sort your query results by version descending, and keep only the first hit
you encounter for each "basefilename".
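
A minimal sketch of that idea, written in plain Python and independent of any indexing library. The function names (split_name, latest_only) are hypothetical, and it assumes that a file with no numeric suffix (default.xml) counts as version 0 and that versions are only compared within the same folder:

```python
import re
from pathlib import PurePosixPath

# Assumption: versioned names look like "<base>_<number>.<ext>";
# a name without the "_<number>" suffix is treated as version 0.
_VERSIONED = re.compile(r"^(?P<base>.+?)(?:_(?P<version>\d+))?$")


def split_name(path):
    """Return (basefilename, version) for a path such as 'a/default_3.xml'."""
    stem = PurePosixPath(path).stem          # 'default_3'
    match = _VERSIONED.match(stem)
    base = match.group("base")               # 'default'
    version = int(match.group("version") or 0)
    return base, version


def latest_only(paths):
    """Keep only the highest-versioned file per (folder, basefilename)."""
    keyed = []
    for path in paths:
        base, version = split_name(path)
        folder = str(PurePosixPath(path).parent)
        keyed.append((folder, base, version, path))

    # Sort by version descending, then take the first hit for each base file.
    keyed.sort(key=lambda item: item[2], reverse=True)

    seen = set()
    result = []
    for folder, base, version, path in keyed:
        if (folder, base) not in seen:
            seen.add((folder, base))
            result.append(path)
    return result


hits = [
    "notes/a/default.xml",
    "notes/a/default_1.xml",
    "notes/a/default_2.xml",
    "notes/b/default.xml",
]
print(latest_only(hits))
```

In an index, the two values returned by split_name would be stored as the two keyword fields, and the filtering loop would run over the sorted query hits instead of a list of paths.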

On Wed, 17 Nov 2004 15:05:19 -0500, Luke Shannon
<[EMAIL PROTECTED]> wrote:
> Hey all;
> 
> I have run into an interesting case.
> 
> Our system has notes. These need to be indexed. They are xml files called 
> default.xml and are easily parsed and indexed. No problem, have been doing it 
> all week.
> 
> The problem is if someone edits the note, the system doesn't update the
> default.xml. It creates a new file, default_1.xml (every edit creates a new
> file with an incremented number; the system only displays the content from the
> highest number).
> 
> My problem is I index all the documents and end up with terms that were taken
> out of a note several versions ago still showing up in the query. From my point
> of view this makes sense because the files are still in the content. But to a
> user it is confusing because they have no idea every change they make to a
> note spawns a new file, and now they are seeing a term they removed from their
> note 2 weeks ago showing up in a query.
> 
> I have started modifying my incremental update to look for multiple
> versions of the default.xml but it is more work than I thought and is going
> to make things complex.
> 
> Maybe there is an easier way? If I just let it run and create the index, can
> somebody suggest a way I could easily scan the index folder ensuring only the
> default.xml with the highest number in its filename remains (only for folders
> where there is more than one default.xml file)? Or is this wishful thinking?
> 
> Thanks,
> 
> Luke
>
