[ https://issues.apache.org/jira/browse/SOLR-830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647331#action_12647331 ]
Bill Au commented on SOLR-830:
------------------------------

Steve, thanks for the perl code. I had to remove the "\" before the "$" to get it to work for me:

perl -e 'chdir q/${master_data_dir}/; print ((sort grep {/^snapshot[.][1-9][0-9]{13}$/} <*>)[-1])'

I have tested this on Linux and FreeBSD. I will test on Mac OS X tonight. It would be good if someone could do a quick test on Solaris. You really don't need a full-blown Solr installation to test it. Just create some dummy directories with various names like:

snapshot.00080527124131
snapshot.20080527124131
snapshot.20080527124131-wip
snapshot.20080527140518
snapshot.20080527140610
snapshot.20081113113700
snapshot.2080527124131
temp-snapshot.20080527124131

and then run the perl command to make sure the right one is returned. With the data set above, you should get:

snapshot.20081113113700

> snappuller picks bad snapshot name
> ----------------------------------
>
>                 Key: SOLR-830
>                 URL: https://issues.apache.org/jira/browse/SOLR-830
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (scripts)
>    Affects Versions: 1.2, 1.3
>            Reporter: Hoss Man
>            Assignee: Bill Au
>             Fix For: 1.3.1
>
>
> as mentioned on the mailing list...
> http://www.nabble.com/FileNotFoundException-on-slave-after-replication---script-bug--to20111313.html#a20111313
> {noformat}
> We're seeing strange behavior on one of our slave nodes after replication.
> When the new searcher is created we see FileNotFoundExceptions in the log
> and the index is strangely invalid/corrupted.
> We may have identified the root cause but wanted to run it by the community.
> We figure there is a bug in the snappuller shell script, line 181:
> snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} "ls ${master_data_dir}|grep 'snapshot\.'|grep -v wip|sort -r|head -1"`
> This line determines the directory name of the latest snapshot to download
> to the slave from the master. The problem with this line is that it grabs the
> temporary work directory of a snapshot in progress. Those temporary
> directories are prefixed with "temp" and, as far as I can tell, should never
> get pulled from the master, so it's easy to disambiguate. It seems that this
> temp directory, if it exists, will be the newest one, so if present it will be
> the one replicated: FAIL.
> {noformat}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.