[ 
https://issues.apache.org/jira/browse/LUCENE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406510#comment-13406510
 ] 

Shai Erera commented on LUCENE-4190:
------------------------------------

I also raised an eyebrow when I read this comment. Many of the lucene+facet 
deployments that I know of store the taxonomy index as a sub-directory of the 
search index. Also, we've been storing other files in the index directory too 
... this new feature will affect such existing deployments.

I think it makes sense to change IW behavior to only delete files that start 
with _. It's a reasonable requirement IMO.
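
To make the idea concrete, here is a minimal sketch of such a name check. It is
not IndexWriter code: the class name, the method name and the exact regular
expression are assumptions made for illustration, loosely following the
_<base36>(_X).Y shape suggested in the issue description. It also deliberately
ignores files such as segments_N and write.lock, which do not start with _ and
would need separate handling.

```java
import java.util.regex.Pattern;

public class LuceneFileNameFilter {
    // Hypothetical approximation of _<base36>(_X).Y:
    //   "_" + base-36 segment name, an optional "_suffix", then ".extension".
    private static final Pattern SEGMENT_FILE =
        Pattern.compile("_[0-9a-z]+(_[0-9A-Za-z_]+)?\\.[0-9A-Za-z]+");

    // True only if the name fits the pattern; anything else is left alone.
    static boolean looksLikeLuceneFile(String name) {
        return SEGMENT_FILE.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] listing = { "_0.fdt", "_a3_Lucene40_0.frq", "taxonomy", "notes.txt" };
        for (String name : listing) {
            String verdict = looksLikeLuceneFile(name) ? "candidate for deletion" : "never touched";
            System.out.println(name + " -> " + verdict);
        }
    }
}
```

A check like this only reduces the risk: a foreign file that happens to be
named like a segment file would still match.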

While I don't know the nature of this change, I assume it's related to IW 
not knowing which files to delete when a segment is no longer needed, because 
Codecs can pick their own file names. If we had an instance that kept track of 
all files that were created, e.g. every Codec would register its files there 
(if it wants to protect them from deletion), would that make it easier to 
decide which files to delete?
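
One possible shape for such a tracking instance is sketched below. Everything
in it is hypothetical (the class CreatedFileRegistry and its methods do not
exist in Lucene), and it keeps its state in memory only, so a real version
would also have to survive a restart of the writer. The point it illustrates
is that a file is eligible for deletion only if it was registered and is no
longer referenced by a live segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CreatedFileRegistry {
    // Names of files that a Codec (or the writer) reported creating.
    private final Set<String> created = new HashSet<>();

    // Hypothetical hook a Codec would call whenever it creates a file.
    public synchronized void register(String fileName) {
        created.add(fileName);
    }

    // Returns the files from the directory listing that were registered
    // and are not referenced anymore; unregistered files are never returned.
    public synchronized List<String> deletable(Collection<String> listing,
                                               Set<String> stillReferenced) {
        List<String> result = new ArrayList<>();
        for (String name : listing) {
            if (created.contains(name) && !stillReferenced.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CreatedFileRegistry registry = new CreatedFileRegistry();
        registry.register("_0.fdt");
        registry.register("_1.fdt");

        List<String> listing = Arrays.asList("_0.fdt", "_1.fdt", "taxonomy", "notes.txt");
        Set<String> live = new HashSet<>(Arrays.asList("_1.fdt"));

        System.out.println(registry.deletable(listing, live));
    }
}
```

With this approach a taxonomy sub-directory or any other foreign file in the
index directory is never considered, whatever its name.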
                
> IndexWriter deletes non-Lucene files
> ------------------------------------
>
>                 Key: LUCENE-4190
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4190
>             Project: Lucene - Java
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.0
>
>
> Carl Austin raised a good issue in a comment on my Lucene 4.0.0 alpha blog 
> post: 
> http://blog.mikemccandless.com/2012/07/lucene-400-alpha-at-long-last.html
> IndexWriter will now (as of 4.0) delete all foreign files from the index 
> directory.  We made this change because Codecs are free to write to any files 
> now, so the space of filenames is hard to "bound".
> But if the user accidentally uses the wrong directory (eg c:/) then we will 
> in fact delete important stuff.
> I think we can at least use some simple criteria (must start with _, maybe 
> must fit certain pattern eg _<base36>(_X).Y), so we are much less likely to 
> delete a non-Lucene file....


        
