[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488491 ]

Bill Au commented on SOLR-207:
------------------------------

I confirmed that find -maxdepth does not work on Solaris. So it is back to ls. We should be OK as long as we don't use any wildcard that causes expansion.

> snappuller inefficient finding latest snapshot
> ----------------------------------------------
>
>                 Key: SOLR-207
>                 URL: https://issues.apache.org/jira/browse/SOLR-207
>             Project: Solr
>          Issue Type: Bug
>          Components: replication
>            Reporter: Yonik Seeley
>         Attachments: find_maxdepth.patch
>
>
> snapinstaller (and snappuller) do the following to find the latest snapshot:
> name=`find ${data_dir} -name snapshot.* -print|grep -v wip|sort -r|head -1`
> This recurses into all of the snapshot directories, doing much more disk-io than is necessary.
> I think it is the cause of bloated kernel memory usage we have seen on some of our Linux boxes, caused
> by kernel dentry and inode caches. Those caches compete with buffer cache (caching the actual data of the index)
> and can thus decrease performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488485 ]

Yonik Seeley commented on SOLR-207:
-----------------------------------

> I think find -maxdepth is not supported on Solaris

Sigh... back to ls then.
[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488469 ]

Yonik Seeley commented on SOLR-207:
-----------------------------------

I tried both versions out, and the "find" version was quicker (on Linux at least).
System time was about the same, but "ls" had much higher user time.

$ time find . -maxdepth 1 -name 'snapshot.*' | grep -v wip | head -1
./snapshot.20070411235957

real    0m0.009s
user    0m0.002s
sys     0m0.008s

$ time ls -r . | grep snapshot\\. | grep -v wip | head -1
snapshot.20070412114504

real    0m0.050s
user    0m0.043s
sys     0m0.009s
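Editor's note: the two timed commands in this message print different snapshot names. find lists entries in directory order rather than sorted order, so `head -1` on its own does not necessarily select the newest snapshot. A minimal sketch that keeps the `sort -r` step from the original script is shown here. The function name `latest_snapshot_find` is hypothetical, and the sketch assumes a find that supports -maxdepth (GNU and BSD find do; it is not part of POSIX, which matches the Solaris problem reported elsewhere in this thread).

```shell
# Hypothetical helper: print the newest non-wip snapshot directly under $1.
# Assumes find supports -maxdepth (a GNU/BSD extension, not POSIX).
latest_snapshot_find() {
    # -maxdepth 1 stops find from descending into the snapshot directories.
    # The pattern is quoted so the shell does not expand it before find runs.
    find "$1" -maxdepth 1 -name 'snapshot.*' -print |
        grep -v wip |
        sort -r |
        head -1
}
```

The result includes the directory prefix that was passed in, for example `./snapshot.<timestamp>` when called with `.`.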
[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488468 ]

Bertrand Delacretaz commented on SOLR-207:
------------------------------------------

I think find -maxdepth is not supported on Solaris.

And the -t option in my previous example was obviously wrong.

I'm not sure if ls -r sorts by filename everywhere (but I have no evidence that it does not). The most portable version might be

  ls ${data_dir} | grep snapshot\\. | grep -v wip | sort -r | head -1
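Editor's note: the portable pipeline proposed in this message can be wrapped as a small function, sketched here under a few assumptions. The name `latest_snapshot_ls` is hypothetical, the directory argument is quoted, and the grep pattern is anchored with `^`, which the original command does not do. Sorting relies on the timestamp suffix ordering lexically, as the thread assumes.

```shell
# Hypothetical helper: print the name of the newest non-wip snapshot in $1.
# The directory is passed to ls as a single argument, so the shell never
# expands a wildcard into a long argument list.
latest_snapshot_ls() {
    ls "$1" |
        grep '^snapshot\.' |
        grep -v wip |
        sort -r |
        head -1
}
```

Unlike the find-based command, this prints the bare entry name without the directory prefix, so a caller would need to prepend the data directory itself.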
[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488420 ]

Yonik Seeley commented on SOLR-207:
-----------------------------------

Although, another alternative that doesn't have the shell expansion problem would be

  ls -r ${data_dir} | grep snapshot\\. | grep -v wip | head -1
[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488418 ]

Yonik Seeley commented on SOLR-207:
-----------------------------------

That's close to the way it was done in the past, but some people ran into problems because of shell restrictions w.r.t. number or size of the arguments passed to the process (because the shell expands the list).
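Editor's note: the restriction mentioned in this message arises because the shell expands a pattern such as `${data_dir}/snapshot.*` into one argument per match before the command starts, and the system caps the total size of the argument list. The sketch below only illustrates that expansion by counting the resulting words. The function name `count_snapshot_args` is hypothetical and is not part of the Solr scripts.

```shell
# Hypothetical helper: count how many arguments the glob under $1 expands to.
# The shell performs the expansion itself, before any command is executed.
count_snapshot_args() {
    # Replace the function's positional parameters with the glob matches.
    set -- "$1"/snapshot.*
    # With no match the pattern is left as a literal word; report 0 then.
    # ($1 now refers to the first expanded word, not the directory.)
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}
```

On most systems `getconf ARG_MAX` reports the limit, in bytes, on the combined arguments and environment passed to a new process. A pipeline that hands ls only the directory name stays far below it however many snapshots exist.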
[jira] Commented: (SOLR-207) snappuller inefficient finding latest snapshot
[ https://issues.apache.org/jira/browse/SOLR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488413 ]

Bertrand Delacretaz commented on SOLR-207:
------------------------------------------

IIUC the snapshot directories are named like snapshot.MMDDHHMMSS and they are all under the same parent directory.

If that's the case, then doing

  ls -rt ${data_dir}/snapshot.* | head -1

will return the name of the most recent directory, efficiently.