Re: Commits and new document visibility
Shawn,

On 3/14/19 10:46, Shawn Heisey wrote:
> On 3/14/2019 8:23 AM, Christopher Schultz wrote:
>> I believe that the only thing I want to do is to set the
>> autoSoftCommit value to something "reasonable". I'll probably
>> start with maybe 15000 (15sec) to match the hard-commit setting
>> and see if we get any complaints about delays between "save" and
>> "seeing the user".
>
> In my opinion, 15 seconds is far too frequent for opening a new
> searcher. If the index reaches any real size, you may be in a
> situation where the full soft commit takes longer than 15 seconds
> to complete - mostly due to warming or autowarming. Commits that
> open a searcher can be very resource-intensive ... if they happen
> too frequently, then heavy indexing will cause your Solr instance
> to never "calm down" ... it will always be hitting the CPU and disk
> hard. I'd personally start with one minute and adjust from there
> based on how long the commits take.

Okay. Current core size is ~1M documents. I think users can live with
a 1-minute delay, but I'll have to ask :)

Is the log file the best resource for information on (soft)
commit-duration?

>> In our case, we don't have a huge number of documents being
>> created in a minute. Probably once per minute, if that.
>>
>> Does that seem reasonable?
>>
>> As for actually SETTING the setting, I'd prefer not to edit the
>> solrconfig.xml document. Instead, can I set this in my
>> solr.in.sh script? I see an example like this right in the file:
>>
>> SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"
>
> 3 seconds is even more problematic than 15.

Sorry, that was just a copy/paste directly from the default solr.in.sh
script that ships with Solr. I wouldn't do a 3-second soft-commit.

> I believe that when you use "bin/solr create" to create an index
> with the default config, that it does set the autoSoftCommit to 3
> seconds. Which as I stated, I believe to be far too frequent.

Nope, it sets it to "never soft commit", unless the defaults have
changed since I built this service with, I think, 7.3.0.

Is there any way to change this value at runtime, or does it require a
service-restart?

-chris
Re: Commits and new document visibility
On 3/14/2019 8:23 AM, Christopher Schultz wrote:
> I believe that the only thing I want to do is to set the
> autoSoftCommit value to something "reasonable". I'll probably
> start with maybe 15000 (15sec) to match the hard-commit setting
> and see if we get any complaints about delays between "save" and
> "seeing the user".

In my opinion, 15 seconds is far too frequent for opening a new
searcher. If the index reaches any real size, you may be in a
situation where the full soft commit takes longer than 15 seconds to
complete - mostly due to warming or autowarming. Commits that open a
searcher can be very resource-intensive ... if they happen too
frequently, then heavy indexing will cause your Solr instance to never
"calm down" ... it will always be hitting the CPU and disk hard. I'd
personally start with one minute and adjust from there based on how
long the commits take.

> In our case, we don't have a huge number of documents being
> created in a minute. Probably once per minute, if that.
>
> Does that seem reasonable?
>
> As for actually SETTING the setting, I'd prefer not to edit the
> solrconfig.xml document. Instead, can I set this in my
> solr.in.sh script? I see an example like this right in the file:
>
> SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"

3 seconds is even more problematic than 15.

I believe that when you use "bin/solr create" to create an index with
the default config, that it does set the autoSoftCommit to 3 seconds.
Which as I stated, I believe to be far too frequent.

Thanks,
Shawn
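
For reference, a minimal sketch of the one-minute starting point written in the same form as the solr.in.sh example quoted above. It assumes the core's solrconfig.xml takes its autoSoftCommit maxTime from the solr.autoSoftCommit.maxTime system property; if the config hard-codes a value instead, this line would have no effect. The 60000 figure is only the suggested starting point, to be adjusted based on how long the commits actually take.

```shell
# Sketch for solr.in.sh: soft commit (and open a new searcher) at most
# once per minute. The value is in milliseconds.
# Assumption: solrconfig.xml reads the solr.autoSoftCommit.maxTime property.
# ${SOLR_OPTS:-} keeps any options already set and tolerates an unset variable.
SOLR_OPTS="${SOLR_OPTS:-} -Dsolr.autoSoftCommit.maxTime=60000"
```

Because solr.in.sh is only read when Solr starts, a change made this way would apply to every core on that instance after the next restart.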
Commits and new document visibility
All,

I recently had a situation where a document wasn't findable in a
fairly small Solr core/collection and I didn't see any errors in
either the application using Solr or within Solr itself. A Solr
service restart caused the document to become visible.

So I started reading.

I believe the "problem" is that the document was indexed but not
visible due to the default commit settings in Solr 7.5 -- which is the
version I happen to be running right now.

I never bothered to change anything from the defaults because, well, I
didn't know what I was doing. Now that I (a) have a problem to solve
and (b) know a little more about what is happening, I just wanted a
quick sanity-check on what I'd like to do.

[Quick background: my core/collection stores user data so that other
users can quickly find anyone in the system via text-search. This
replaced our previous RDBMS-based "SELECT ... WHERE name LIKE
'%whatever%'" implementation which of course wasn't scaling well.
Generally, users will expect that when a new user is created, they
will be findable "fairly soon" (probably immediately) afterwards.]

We are using SolrJ as a client from our application, btw.

Initially, we were doing:

  SolrInputDocument document = ...;
  SolrClient solr = ...;
  solr.add(document);
  solr.commit();

Someone told me that committing after every document-add was wasteful
and it seemed like good advice -- allow Solr's autoCommit mechanism to
handle the commits and we'll get better performance. The problem was
that no new documents are visible unless we take additional action.

So, here are the default settings:

  autoCommit = max 15sec
  openSearcher = false
  autoSoftCommit = never[*]

This means that every 15 seconds (plus OS/disk sync time), I'll get a
safe snapshot of the data. I'm okay with losing 15 seconds' worth of
data if there is some catastrophe.

It also means that my documents are pretty much never made visible.

I believe that the only thing I want to do is to set the
autoSoftCommit value to something "reasonable". I'll probably start
with maybe 15000 (15sec) to match the hard-commit setting and see if
we get any complaints about delays between "save" and "seeing the
user".

In our case, we don't have a huge number of documents being created in
a minute. Probably once per minute, if that.

Does that seem reasonable?

As for actually SETTING the setting, I'd prefer not to edit the
solrconfig.xml document. Instead, can I set this in my solr.in.sh
script? I see an example like this right in the file:

  SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"

Is that a fairly standard way to set the autoSoftCommit value for all
cores?

Thanks,
-chris

[*] This setting is documented only in a single place: in the
"near-real-time" documentation. It would be nice if that special value
was called out in other places so it wasn't so hard to find.