By “get one” I meant a local SSD, by the way; for example: https://www.amazon.com/SK-hynix-Gold-NAND-Internal/dp/B07SNHB4RC/ref=mp_s_a_1_3?dchild=1&keywords=server+ssd&qid=1627050778&sr=8-3
And you’re all set, unless I’m missing a reason why you would ever want to use a network drive aside from having snapshots, in which case you already have a problem if you can’t rebuild or replicate the index in short order.

> On Jul 23, 2021, at 10:27 AM, Dave <[email protected]> wrote:
>
> For me personally, any system with swap disabled makes for a much better
> situation, as does setting the JVM Xms and Xmx to the exact same value,
> 31 GB, and NOT higher than that, since a larger heap actually makes GC
> slower. Also, yeah, Solr on a network disk is going to be slow unless it’s
> on an SSD-based SAN; otherwise you can just get one on Amazon for $100,
> drop it in the machine, and you will be way better off for next to no
> money. The nice part about them is that they are cheap, and as long as you
> already have redundancy in place you don’t need to worry about
> reliability. Just buy a couple per Solr server, keep them on hand, and
> replicate the index back in and you’re done. Another fun trick is to make
> sure the Solr server itself, or at least its log files, is on an SSD.
>
>> On Jul 23, 2021, at 10:14 AM, Shawn Heisey <[email protected]> wrote:
>>
>> On 7/22/2021 11:53 AM, Jon Morisi wrote:
>>
>>> RE Shawn and Michael,
>>> I am just looking for a way to speed it up. Mike Drob had mentioned
>>> docValues, which is why I was researching that route.
>>>
>>> I am running my search tests from the Solr admin UI, with no facets and
>>> no sorting. I am using -Dsolr.directoryFactory=HdfsDirectoryFactory
>>
>> Getting good caching with HDFS is something I am not sure how to do. I
>> would bet that you have to assign a whole bunch of memory to the Solr
>> heap and then allocate a lot of that to the HDFS client for caching
>> purposes.
>>
>> You can take a look at this wiki page I wrote, but keep in mind that it
>> is tailored for local disks, not HDFS:
>>
>> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
>>
>> Is there any way you can switch to local disks instead of HDFS? Solr
>> tends to perform badly with indexes on the network instead of on local
>> disks. What are you trying to achieve with your usage of HDFS?
>>
>>> URL:
>>> . /select?q=ptokens:8974561 AND ptokens:9844554 AND ptokens:8564484 AND
>>> ptokens:9846541&echoParams=all
>>>
>>> Response once it ran (timeout on first attempt, waited 5 min for retry):
>>> responseHeader
>>>   zkConnected  true
>>>   status       0
>>>   QTime        2411
>>> params
>>>   q          "ptokens:243796009 AND ptokens:410512000 AND
>>>               ptokens:410604004 AND ptokens:408729009"
>>>   df         "data"
>>>   rows       "10"
>>>   echoParams "all"
>>
>> What is the field definition for ptokens, and what is the fieldType
>> definition for the type referenced in that field definition? If this
>> field is set up as a numeric Point type, you're running into a known
>> limitation -- single-value lookups on Point fields are slow, and if the
>> field cardinality is high, that makes them VERY slow. The workaround
>> would be to switch to either a String type or a Trie type and completely
>> reindex, though Trie types are deprecated and will eventually be removed
>> from Solr. Or you could turn each lookup into a range query, and it
>> would work much better -- Point types are EXCELLENT for range queries.
>>
>>> Dashboard info:
>>>   System                 0.16    0.13       0.14
>>>   Physical Memory        97.7%   377.39 GB  368.77 GB
>>>   Swap Space             4.7%    4.00 GB    193.25 MB
>>>   File Descriptor Count  0.2%    128000     226
>>>   JVM-Memory             22.7%   15.33 GB   15.33 GB
>>
>> If disabling swap as Michael is suggesting DOES make performance better,
>> I think that would be an indication of some very strange system-level
>> problems. I don't expect it to change anything.
>>
>> Thanks,
>> Shawn
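[Editor's note] A quick sketch of the range-query workaround Shawn describes: each single-value lookup on a Point field can be rewritten as a one-element range, which Point fields answer efficiently. The field name and values are taken from the thread; the helper function itself is hypothetical, not part of Solr.

```shell
# Rewrite every "ptokens:<value>" term into "ptokens:[<value> TO <value>]".
# Point fields index for range queries, so this degenerate range avoids the
# slow single-value lookup path Shawn mentions.
rewrite_point_query() {
  printf '%s\n' "$1" | sed -E 's/ptokens:([0-9]+)/ptokens:[\1 TO \1]/g'
}

rewrite_point_query 'ptokens:243796009 AND ptokens:410512000'
# prints: ptokens:[243796009 TO 243796009] AND ptokens:[410512000 TO 410512000]
```

The rewritten string still needs URL-encoding before it is sent to /select, e.g. with `curl --data-urlencode`.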

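[Editor's note] And a sketch of the heap and swap settings Dave recommends, as they would typically be applied on a Linux Solr host. The file name and variable are from a stock Solr install, and the swap commands are standard Linux; none of these exact lines appear in the thread.

```shell
# In solr.in.sh: identical min/max heap, kept just under 32 GB so the JVM
# can still use compressed ordinary object pointers (a larger heap disables
# them, which is one reason more heap can make GC slower, as Dave notes).
SOLR_JAVA_MEM="-Xms31g -Xmx31g"

# On the host, disable swap entirely (as root; lasts until reboot) ...
swapoff -a
# ... or merely discourage swapping without removing it:
sysctl -w vm.swappiness=1
```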