Re: push to the limit without going over
Arturas: "it is becoming incredibly difficult to find working code"

Yeah, I sympathize totally. What I usually do is go into the test code of
whatever version of Solr I'm using and find examples there. _That_ code
_must_ be kept up to date ;).

About batching docs: what you gain is basically more efficient I/O; you
don't have to wait around for the client to connect/disconnect for every
doc. Here are some numbers, with all the caveats that YMMV:
https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/

Best,
Erick

On Thu, Jul 5, 2018 at 7:48 AM, Shawn Heisey wrote:
> On 7/4/2018 3:32 AM, Arturas Mazeika wrote:
>> Details:
>>
>> I am benchmarking a SolrCloud setup on a single machine (Intel i7 with 8
>> "CPU cores", an SSD as well as an HDD) using the German Wikipedia
>> collection. I created a 4-node, 4-shard, replication factor 2 cluster on
>> the same machine (and managed to push the CPU or SSD to the hardware
>> limits, i.e., ~200MB/s, ~100% CPU). Now I wanted to see what happens if
>> I push the HDD to the limits. Indexing the files from the SSD (I am able
>> to scan the collection at an actual rate of 400-500MB/s) with 16
>> threads, I tried to send those to the Solr cluster with all indexes on
>> the HDD.
>>
>> - 4 cores running 2gb ram
>
> If this is saying that the machine running Solr has 2GB of installed
> memory, that's going to be a serious problem.
>
> The default heap size that Solr starts with is 512MB. With 4 Solr nodes
> running on the machine, each with a 512MB heap, all of your 2GB of memory
> is going to be required by the heaps. Java requires memory beyond the
> heap to run. Your operating system and its other processes will also
> require some memory.
>
> This means that not only are you going to have no memory left for the OS
> disk cache, you're actually going to be allocating MORE than the 2GB of
> installed memory, which means the OS is going to start swapping to
> accommodate memory allocations.
>
> When you don't have enough memory for good disk caching, Solr performance
> is absolutely terrible. When Solr has to wait for data to be read off of
> disk, even if the disk is an SSD, its performance will not be good.
>
> When the OS starts swapping, the performance of ANY software on the
> system drops SIGNIFICANTLY.
>
> You need a lot more memory than 2GB on your server.
>
> Thanks,
> Shawn
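The batching Erick describes can be sketched in plain Java: accumulate documents into fixed-size batches and send each batch with a single client.add(...) call instead of one HTTP round trip per document. This is a minimal sketch, not SolrJ code; the batch size of 500 and the generic helper are illustrative choices, and the commented SolrJ usage assumes the standard SolrClient.add(Collection) overload.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchingSketch {

    // Split a list of documents into consecutive batches of at most `size`
    // elements; the last batch may be smaller.
    public static <T> List<List<T>> partition(List<T> docs, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += size) {
            batches.add(new ArrayList<>(
                    docs.subList(i, Math.min(i + size, docs.size()))));
        }
        return batches;
    }

    // With SolrJ this would be used roughly as:
    //   for (List<SolrInputDocument> batch : partition(allDocs, 500)) {
    //       solrClient.add(batch);   // one request per 500 docs
    //   }
}
```

The win is fewer request/response cycles, which is exactly the connect/disconnect overhead Erick mentions.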
Re: push to the limit without going over
On 7/4/2018 3:32 AM, Arturas Mazeika wrote:
> Details:
>
> I am benchmarking a SolrCloud setup on a single machine (Intel i7 with 8
> "CPU cores", an SSD as well as an HDD) using the German Wikipedia
> collection. I created a 4-node, 4-shard, replication factor 2 cluster on
> the same machine (and managed to push the CPU or SSD to the hardware
> limits, i.e., ~200MB/s, ~100% CPU). Now I wanted to see what happens if I
> push the HDD to the limits. Indexing the files from the SSD (I am able to
> scan the collection at an actual rate of 400-500MB/s) with 16 threads, I
> tried to send those to the Solr cluster with all indexes on the HDD.
>
> - 4 cores running 2gb ram

If this is saying that the machine running Solr has 2GB of installed
memory, that's going to be a serious problem.

The default heap size that Solr starts with is 512MB. With 4 Solr nodes
running on the machine, each with a 512MB heap, all of your 2GB of memory
is going to be required by the heaps. Java requires memory beyond the heap
to run. Your operating system and its other processes will also require
some memory.

This means that not only are you going to have no memory left for the OS
disk cache, you're actually going to be allocating MORE than the 2GB of
installed memory, which means the OS is going to start swapping to
accommodate memory allocations.

When you don't have enough memory for good disk caching, Solr performance
is absolutely terrible. When Solr has to wait for data to be read off of
disk, even if the disk is an SSD, its performance will not be good.

When the OS starts swapping, the performance of ANY software on the system
drops SIGNIFICANTLY.

You need a lot more memory than 2GB on your server.

Thanks,
Shawn
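Shawn's arithmetic can be made explicit with a trivial sketch: four nodes at the 512MB default heap consume the entire 2GB of installed RAM before the JVM's off-heap overhead, the OS, and the disk cache get anything. The helper names below are illustrative, not from any Solr API.

```java
// Back-of-the-envelope memory budget for N Solr nodes on one box.
public class MemoryBudget {

    // Total RAM claimed by the Java heaps alone.
    public static int totalHeapMb(int nodes, int heapMbPerNode) {
        return nodes * heapMbPerNode;
    }

    // RAM left for the OS, JVM overhead, and the disk cache;
    // zero or negative means swapping is effectively guaranteed.
    public static int leftoverMb(int installedMb, int nodes, int heapMbPerNode) {
        return installedMb - totalHeapMb(nodes, heapMbPerNode);
    }
}
```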
Re: push to the limit without going over
First, I usually prefer to construct your CloudSolrClient using the
Zookeeper ensemble string rather than URLs, although that's probably not a
cure for your problem.

Here's what I _think_ is happening: if you're slamming Solr with a lot of
updates, you're doing a lot of merging. At some point, when there are a lot
of merges going on, incoming updates block until one or more merge threads
is done. At that point, I suspect your client is timing out. And (perhaps)
if you used the Zookeeper ensemble instead of HTTP, the cluster state fetch
error would go away. I suspect that another issue would come up, but it's
also possible this would all go away if you increased your timeouts
significantly. That's still a "set it and hope" approach rather than a
totally robust solution, though.

Let's assume that the above works and you start getting timeouts. You can
back off the indexing rate at that point, or just go to sleep for a while.
This isn't what you'd like for a permanent solution, but it may let you get
by.

There's work afoot to separate out update thread pools from query thread
pools so _querying_ doesn't suffer when indexing is heavy, but that hasn't
been implemented yet. This could also address your cluster state fetch
error.

You will get significantly better throughput if you batch your docs and
use client.add(list_of_documents), BTW.

Another possibility is to use the new metrics (since Solr 6.4). They
provide over 200 metrics you can query, and it's quite possible that
they'd help your clients know when to self-throttle, but AFAIK there's
nothing built in to help you there.

Best,
Erick

On Wed, Jul 4, 2018 at 2:32 AM, Arturas Mazeika wrote:
> Hi Solr Folk,
>
> I am trying to push Solr to the limit and sometimes I succeed. The
> question is how to not go over it, e.g., avoid:
>
> java.lang.RuntimeException: Tried fetching cluster state using the node
> names we knew of, i.e. [192.168.56.1:9998_solr, 192.168.56.1:9997_solr,
> 192.168.56.1:_solr, 192.168.56.1:9996_solr]. However, succeeded in
> obtaining the cluster state from none of them. If you think your Solr
> cluster is up and is accessible, you could try re-creating a new
> CloudSolrClient using working solrUrl(s) or zkHost(s).
>     at org.apache.solr.client.solrj.impl.HttpClusterStateProvider.getState(HttpClusterStateProvider.java:109)
>     at org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1113)
>     at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:845)
>     at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:818)
>     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
>     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:173)
>     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
>     at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:152)
>     at com.asc.InsertDEWikiSimple$SimpleThread.run(InsertDEWikiSimple.java:132)
>
> Details:
>
> I am benchmarking a SolrCloud setup on a single machine (Intel i7 with 8
> "CPU cores", an SSD as well as an HDD) using the German Wikipedia
> collection. I created a 4-node, 4-shard, replication factor 2 cluster on
> the same machine (and managed to push the CPU or SSD to the hardware
> limits, i.e., ~200MB/s, ~100% CPU). Now I wanted to see what happens if I
> push the HDD to the limits. Indexing the files from the SSD (I am able to
> scan the collection at an actual rate of 400-500MB/s) with 16 threads, I
> tried to send those to the Solr cluster with all indexes on the HDD.
>
> Clearly Solr needs to deal with a very slow hard drive (10-20MB/s actual
> rate). If the cluster is not touched, solrj may start losing connections
> after a few hours. If one checks the status of the cluster, it may happen
> sooner. After the connection is lost, the cluster calms down with writing
> after half a dozen minutes.
>
> What would be a reasonable way to push to the limit without going over?
>
> The exact parameters are:
>
> - 4 cores running 2gb ram
> - Schema: [field definitions not preserved in the archive]
>
> I SolrJ-connect once:
>
>     ArrayList<String> urls = new ArrayList<>();
>     urls.add("http://localhost:/solr");
>     urls.add("http://localhost:9998/solr");
>     urls.add("http://localhost:9997/solr");
>     urls.add("http://localhost:9996/solr");
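Erick's "back off the indexing rate ... or just go to sleep for a while" can be sketched as a retry loop with exponentially growing pauses around each add. This is a minimal, hedged sketch in plain Java rather than SolrJ: the attempt count and base delay are illustrative, and with SolrJ the Callable body would be something like () -> solrClient.add(batch).

```java
import java.util.concurrent.Callable;

public class BackoffSketch {

    // Run `op`, retrying on failure with delays that double each attempt
    // (base, 2*base, 4*base, ...). Rethrows the last failure if all
    // attempts are exhausted.
    public static <T> T retryWithBackoff(Callable<T> op, int maxAttempts,
                                         long baseDelayMs) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {                   // e.g. a SolrJ timeout
                last = e;
                Thread.sleep(baseDelayMs << attempt); // back off before retrying
            }
        }
        throw last;                                   // give up
    }
}
```

As Erick notes, this is "get by" machinery, not a robust permanent solution; a self-throttling client driven by the metrics API would be the cleaner design.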
push to the limit without going over
Hi Solr Folk,

I am trying to push Solr to the limit and sometimes I succeed. The
question is how to not go over it, e.g., avoid:

java.lang.RuntimeException: Tried fetching cluster state using the node
names we knew of, i.e. [192.168.56.1:9998_solr, 192.168.56.1:9997_solr,
192.168.56.1:_solr, 192.168.56.1:9996_solr]. However, succeeded in
obtaining the cluster state from none of them. If you think your Solr
cluster is up and is accessible, you could try re-creating a new
CloudSolrClient using working solrUrl(s) or zkHost(s).
    at org.apache.solr.client.solrj.impl.HttpClusterStateProvider.getState(HttpClusterStateProvider.java:109)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1113)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:845)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:818)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:173)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
    at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:152)
    at com.asc.InsertDEWikiSimple$SimpleThread.run(InsertDEWikiSimple.java:132)

Details:

I am benchmarking a SolrCloud setup on a single machine (Intel i7 with 8
"CPU cores", an SSD as well as an HDD) using the German Wikipedia
collection. I created a 4-node, 4-shard, replication factor 2 cluster on
the same machine (and managed to push the CPU or SSD to the hardware
limits, i.e., ~200MB/s, ~100% CPU). Now I wanted to see what happens if I
push the HDD to the limits. Indexing the files from the SSD (I am able to
scan the collection at an actual rate of 400-500MB/s) with 16 threads, I
tried to send those to the Solr cluster with all indexes on the HDD.

Clearly Solr needs to deal with a very slow hard drive (10-20MB/s actual
rate). If the cluster is not touched, solrj may start losing connections
after a few hours. If one checks the status of the cluster, it may happen
sooner. After the connection is lost, the cluster calms down with writing
after half a dozen minutes.

What would be a reasonable way to push to the limit without going over?

The exact parameters are:

- 4 cores running 2gb ram
- Schema: [field definitions not preserved in the archive]

I SolrJ-connect once:

    ArrayList<String> urls = new ArrayList<>();
    urls.add("http://localhost:/solr");
    urls.add("http://localhost:9998/solr");
    urls.add("http://localhost:9997/solr");
    urls.add("http://localhost:9996/solr");

    solrClient = new CloudSolrClient.Builder(urls)
            .withConnectionTimeout(1)
            .withSocketTimeout(6)
            .build();
    solrClient.setDefaultCollection("de_wiki_man");

and then execute in 16 threads, as long as there's anything to execute:

    Path p = getJobPath();
    String content = new String(Files.readAllBytes(p));
    UUID id = UUID.randomUUID();
    SolrInputDocument doc = new SolrInputDocument();

    BasicFileAttributes attr = Files.readAttributes(p,
            BasicFileAttributes.class);

    doc.addField("id", id.toString());
    doc.addField("content", content);
    doc.addField("time", attr.creationTime().toString());
    doc.addField("size", content.length());
    doc.addField("url", p.getFileName().toAbsolutePath().toString());
    solrClient.add(doc);

to go through all the wiki html files.

Cheers,
Arturas
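One way to restructure the 16-thread loop above along the lines suggested in the replies is to have each worker drain a shared queue into batches before sending, instead of one add() per file. The sketch below uses Strings in place of SolrInputDocuments so it runs without SolrJ; the poison-pill shutdown, queue type, and batch size are illustrative assumptions, and the commented-out solrClient.add(batch) marks where the real send would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

public class QueueBatcher {
    // Sentinel value a producer enqueues to tell a worker to stop.
    static final String POISON = "__done__";

    // Drain the queue into batches of at most `batchSize` documents;
    // returns the batches a single worker would have sent.
    public static List<List<String>> drain(BlockingQueue<String> queue,
                                           int batchSize) throws InterruptedException {
        List<List<String>> sent = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        while (true) {
            String doc = queue.take();             // blocks until work arrives
            if (doc.equals(POISON)) break;
            batch.add(doc);
            if (batch.size() >= batchSize) {       // here: solrClient.add(batch)
                sent.add(batch);
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) sent.add(batch);     // flush the final partial batch
        return sent;
    }
}
```

Compared to the per-document add() in the original loop, each worker now pays the request overhead once per batch, which is the throughput win Erick's batching advice is about.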