All,

I recently had a situation where a document wasn't findable in a
fairly small Solr core/collection and I didn't see any errors in
either the application using Solr or within Solr itself. A Solr
service restart caused the document to become visible.

So I started reading.

I believe the "problem" is that the document was indexed but not
visible due to the default commit settings in Solr 7.5 -- which is the
version I happen to be running right now.

I never bothered to change anything from the defaults because, well, I
didn't know what I was doing. Now that I (a) have a problem to solve
and (b) know a little more about what is happening, I just wanted a
quick sanity-check on what I'd like to do.

[Quick background: my core/collection stores user data so that other
users can quickly find anyone in the system via text-search. This
replaced our previous RDBMS-based "SELECT ... WHERE name LIKE
'%whatever%'" implementation which of course wasn't scaling well.
Generally, users will expect that when a new user is created, they
will be findable "fairly soon" (probably immediately) afterwards.]

We are using SolrJ as a client from our application, btw.

Initially, we were doing:

SolrInputDocument document = ...;
SolrClient solr = ...;
solr.add(document);
solr.commit();

Someone told me that committing after every document-add was wasteful
and it seemed like good advice -- allow Solr's autoCommit mechanism to
handle the commits and we'll get better performance. The problem was
that no new documents are visible unless we take additional action.

So, here are the default settings:

autoCommit   = max 15sec
openSearcher = false

autoSoftCommit = never[*]

This means that every 15 seconds (plus OS/disk sync time), I'll get a
safe snapshot of the data. I'm okay with losing 15 seconds' worth of
data if there is some catastrophe.

It also means that my documents are pretty much never made visible.
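
To check that I'm reading these settings correctly, here is a toy
model of my understanding. It is not Solr code: the class and method
names are made up for illustration, it ignores searcher warm-up time,
and it assumes a document only becomes searchable when a commit opens
a new searcher (a soft commit, or a hard commit with
openSearcher=true). The second call adds a 15-second autoSoftCommit
for comparison.

```java
import java.util.OptionalLong;

public class CommitVisibility {

    /**
     * Toy model (an assumption, not Solr's implementation): worst-case
     * delay in milliseconds before an added document becomes searchable
     * through automatic commits alone. A time <= 0 means "disabled".
     * Returns empty when no automatic commit ever opens a new searcher.
     */
    static OptionalLong worstCaseVisibilityMs(long autoCommitMs,
                                              boolean openSearcher,
                                              long autoSoftCommitMs) {
        long best = Long.MAX_VALUE;
        // A hard commit only exposes documents if it opens a new searcher.
        if (autoCommitMs > 0 && openSearcher) {
            best = Math.min(best, autoCommitMs);
        }
        // A soft commit always opens a new searcher.
        if (autoSoftCommitMs > 0) {
            best = Math.min(best, autoSoftCommitMs);
        }
        return best == Long.MAX_VALUE ? OptionalLong.empty() : OptionalLong.of(best);
    }

    public static void main(String[] args) {
        // Defaults described above: hard commit 15s, openSearcher=false, no soft commit.
        System.out.println(worstCaseVisibilityMs(15000, false, -1));
        // Same hard-commit settings with a 15-second soft commit added.
        System.out.println(worstCaseVisibilityMs(15000, false, 15000));
    }
}
```

Under this model the defaults give no bound at all on visibility,
which matches what I saw until the restart.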

I believe that the only thing I want to do is to set the
autoSoftCommit value to something "reasonable". I'll probably start
with maybe 15000 (15sec) to match the hard-commit setting and see if
we get any complaints about delays between "save" and "seeing the user".

In our case, we don't have a huge number of documents being created in
a minute. Probably one per minute, if that.

Does that seem reasonable?

As for actually SETTING the setting, I'd prefer not to edit the
solrconfig.xml document. Instead, can I set this in my solr.in.sh
script? I see an example like this right in the file:

SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"

Is that a fairly standard way to set the autoSoftCommit value for all
cores?
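
My understanding of why this might work is an assumption on my part,
not something I've checked against the Solr source: the -D flag sets a
JVM system property, and the stock solrconfig.xml reads it through a
${solr.autoSoftCommit.maxTime:-1} placeholder, falling back to -1 when
the property is unset. A rough sketch of that lookup-with-default
behavior in plain Java, with a made-up class name:

```java
public class SoftCommitProperty {

    // Property name taken from the solr.in.sh example above.
    static final String KEY = "solr.autoSoftCommit.maxTime";

    /**
     * Mimics a "${name:default}" style lookup: use the JVM system
     * property if it was set with -D, otherwise use the fallback.
     */
    static long resolveMaxTimeMs(String fallback) {
        return Long.parseLong(System.getProperty(KEY, fallback));
    }

    public static void main(String[] args) {
        // Run with -Dsolr.autoSoftCommit.maxTime=15000 to override the fallback.
        System.out.println(KEY + " = " + resolveMaxTimeMs("-1"));
    }
}
```

If that is right, a core whose solrconfig.xml hard-codes the value
instead of using the placeholder would not pick up the setting.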

Thanks,
-chris

[*] This setting is documented only in a single place: in the
"near-real-time" documentation. It would be nice if that special value
were called out in other places so it wasn't so hard to find.