This isn't a support forum; solr-users@ might be more appropriate.
Someone on that list might have a better idea about how the replication
handler gets its list of files. This list would be a good one to try if
you want to propose a fix for the problem you're having. But since
you're here: it looks to me as if IndexWriter does sync all "new"
files in the current segments being committed; see
IndexWriter.startCommit and SegmentInfos.files. Caveats: (1) I'm
looking at this code for the first time, and (2) things may have been
different in 7.7.2. I don't know for sure, but are you certain that
your backup process is not attempting to copy one of the new files?
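
For reference, the line you quoted from IOUtils.fsync boils down to
something like the sketch below. This is a minimal standalone
illustration, not Lucene's actual code: the class name FsyncSketch is
made up, and the best-effort handling of directories is my assumption.
It shows why a commit needs write access to a regular file just to sync
it: the channel is opened with WRITE before force(true) is called, so if
another process holds the file open in a mode that denies writers, the
open itself can fail.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the fsync helper discussed in this thread.
final class FsyncSketch {
    private FsyncSketch() {}

    // Flushes a file (or directory) to stable storage.
    // Regular files are opened with WRITE, directories with READ,
    // mirroring the line quoted from IOUtils.java.
    static void fsync(Path fileToSync, boolean isDir) throws IOException {
        try (FileChannel channel = FileChannel.open(
                fileToSync,
                isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE)) {
            // force(true) flushes both content and metadata.
            channel.force(true);
        } catch (IOException e) {
            if (!isDir) {
                // For regular files the failure is real, e.g. the file is
                // missing or another process denies write access to it.
                throw e;
            }
            // Assumption: syncing a directory is treated as best-effort,
            // since some platforms do not allow opening a directory.
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".tmp");
        try {
            Files.write(tmp, new byte[] {1, 2, 3});
            fsync(tmp, false); // sync an existing file without modifying it
            System.out.println("synced " + tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Note that the sync does not change the file's contents; it only needs
the write-mode handle. Whether the open succeeds while your backup
process holds the file depends on the sharing mode that process uses.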

On Thu, Mar 11, 2021 at 1:35 PM Rahul Goswami <rahul196...@gmail.com> wrote:
>
> Hello,
> Just wanted to follow up one more time to see if this is the right forum for 
> my question? Or is this suitable for some other mailing list?
>
> Best,
> Rahul
>
> On Sat, Mar 6, 2021 at 3:57 PM Rahul Goswami <rahul196...@gmail.com> wrote:
>>
>> Hello everyone,
>> Following up on my question in case anyone has any idea. This is important 
>> to know because I am thinking of allowing the backup process to not hold 
>> any lock on the index files, which should allow the fsync during parallel 
>> commits to succeed. BUT, in case doing an fsync on existing segment files 
>> in a saved commit point DOES have an effect, it might leave the backed-up 
>> index in a corrupt state.
>>
>> Thanks,
>> Rahul
>>
>> On Fri, Mar 5, 2021 at 3:04 PM Rahul Goswami <rahul196...@gmail.com> wrote:
>>>
>>> Hello,
>>> We have a process which backs up the index (Solr 7.7.2) on a schedule. We 
>>> first save a commit point on the index and then use Solr's /replication 
>>> handler to get the list of files in that generation. After the backup 
>>> completes, we release the commit point. (Please note that this is a 
>>> separate backup process outside of Solr, not the backup command of the 
>>> /replication handler.)
>>> The assumption is that while the commit point is saved, no changes happen 
>>> to the segment files in the saved generation.
>>>
>>> Now the issue... The backup process opens the index files in a shared READ 
>>> mode, preventing writes. This causes any parallel commits to fail, 
>>> apparently because the index files are locked by another process (the 
>>> backup process). Upon debugging, I see that fsync is being called during 
>>> commit on already existing segment files, which is not expected. So my 
>>> question is: is there any reason for Lucene to call fsync on already 
>>> existing segment files?
>>>
>>> The line of code I am referring to is as below:
>>> try (final FileChannel file = FileChannel.open(fileToSync, isDir ? 
>>> StandardOpenOption.READ : StandardOpenOption.WRITE))
>>>
>>> in method fsync(Path fileToSync, boolean isDir) of the class file
>>>
>>> lucene\core\src\java\org\apache\lucene\util\IOUtils.java
>>>
>>> Thanks,
>>> Rahul
