By “get one” I meant a local SSD, by the way. For example:

https://www.amazon.com/SK-hynix-Gold-NAND-Internal/dp/B07SNHB4RC/ref=mp_s_a_1_3?dchild=1&keywords=server+ssd&qid=1627050778&sr=8-3

Drop that in and you’re all set, unless I’m missing a reason you would ever 
want a network drive aside from having snapshots. And in that case you already 
have a problem if you can’t rebuild or replicate the index in short order.

> On Jul 23, 2021, at 10:27 AM, Dave <[email protected]> wrote:
> 
> For me personally, any system with swap disabled is in a much better 
> situation, as is setting the JVM Xms and Xmx to the exact same value: 
> 31 GB, and NOT higher than that, since heaps just past 32 GB lose 
> compressed object pointers and GC actually gets slower. Also, yeah, Solr 
> on a network disk is going to be slow unless it’s an SSD-based SAN; 
> otherwise you can just get an SSD on Amazon for $100, drop it in the 
> machine, and be way better off for next to no money.  The nice part 
> about them is they are cheap, and as long as you already have redundancy 
> in place you don’t need to worry about reliability. Just buy a couple 
> per Solr server, keep them on hand, replicate the index back in, and 
> you’re done.  Another fun trick is to make sure the Solr server itself, 
> or at least its log files, is on an SSD.
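A sketch of the two JVM settings Dave describes, assuming a standard install 
where the options live in solr.in.sh (the 31g figure is his; the swapoff line 
is the usual OS-level way to disable swap):

```shell
# solr.in.sh -- pin min and max heap to the same value, kept under 32 GB
# so the JVM retains compressed ordinary object pointers:
SOLR_JAVA_MEM="-Xms31g -Xmx31g"

# Disable swap at the OS level (run as root; edit /etc/fstab to persist):
#   swapoff -a
```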
> 
>> On Jul 23, 2021, at 10:14 AM, Shawn Heisey <[email protected]> wrote:
>> 
>>> On 7/22/2021 11:53 AM, Jon Morisi wrote:
>>> 
>>> RE Shawn and Michael,
>>> I am just looking for a way to speed it up.  Mike Drob had mentioned 
>>> docvalues, which is why I was researching that route.
>>> 
>>> I am running my search tests from the Solr admin UI, with no facets and 
>>> no sorting.  I am using -Dsolr.directoryFactory=HdfsDirectoryFactory
>> 
>> Getting good caching with HDFS is something I am not sure how to do.  I 
>> would bet that you have to assign a whole bunch of memory to the Solr heap 
>> and then allocate a lot of that to the HDFS client for caching purposes.
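Since the directory factory is already being set with a system property, the 
HDFS block cache can be configured the same way. A sketch with illustrative 
values only (the namenode host, path, and sizes are placeholders; the cache is 
off-heap, so the JVM's direct-memory limit has to be large enough to hold the 
slabs, roughly 128 MB each):

```shell
bin/solr start \
  -Dsolr.directoryFactory=HdfsDirectoryFactory \
  -Dsolr.hdfs.home=hdfs://namenode:8020/solr \
  -Dsolr.hdfs.blockcache.enabled=true \
  -Dsolr.hdfs.blockcache.slab.count=8 \
  -Dsolr.hdfs.blockcache.direct.memory.allocation=true \
  -a "-XX:MaxDirectMemorySize=2g"
```

The "Running Solr on HDFS" page of the Reference Guide lists the full set of 
solr.hdfs.* properties and their defaults.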
>> 
>> You can take a look at this wiki page I wrote, but keep in mind that it is 
>> tailored for local disks, not HDFS:
>> 
>> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
>> 
>> Is there any way you can switch to local disks instead of HDFS?  Solr 
>> tends to perform badly with indexes on network storage instead of local 
>> disks.  What are you trying to achieve with your usage of HDFS?
>> 
>>> URL:
>>> /select?q=ptokens:8974561 AND ptokens:9844554 AND ptokens:8564484 AND 
>>> ptokens:9846541&echoParams=all
>>> 
>>> Response once it ran (timed out on the first attempt; retried after 
>>> waiting 5 min):
>>> responseHeader:
>>>   zkConnected: true
>>>   status: 0
>>>   QTime: 2411
>>>   params:
>>>     q: "ptokens:243796009 AND ptokens:410512000 AND ptokens:410604004 
>>>        AND ptokens:408729009"
>>>     df: "data"
>>>     rows: "10"
>>>     echoParams: "all"
>> 
>> What is the field definition for ptokens, and what is the fieldType 
>> definition for the type referenced in the field definition?  If this 
>> field is set up as a numeric Point type, you're running into a known 
>> limitation -- single-value lookups on Point fields are slow, and if the 
>> field cardinality is high, that makes them VERY slow.  The workaround 
>> would be to switch to either a String type or a Trie type and completely 
>> reindex, though Trie types are deprecated and will eventually be removed 
>> from Solr.  Or you could turn the query into a range query, and it would 
>> work much better -- Point types are EXCELLENT for range queries.
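The String-type workaround Shawn describes would look roughly like this in the 
schema. A sketch only: the attribute values are assumptions (ptokens looks 
multi-valued judging by the queries), and as he says, it requires a complete 
reindex:

```xml
<!-- fieldType and field entries in managed-schema / schema.xml;
     attributes are illustrative, not taken from the actual schema -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="ptokens" type="string" indexed="true" stored="true"
       multiValued="true"/>
```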
>> 
>>> dashboard info:
>>> System (load averages): 0.16 0.13 0.14
>>> Physical Memory: 97.7% used (368.77 GB of 377.39 GB)
>>> Swap Space: 4.7% used (193.25 MB of 4.00 GB)
>>> File Descriptor Count: 0.2% used (226 of 128000)
>>> JVM-Memory: 22.7% used (15.33 GB / 15.33 GB)
>> 
>> If disabling swap as Michael is suggesting DOES make performance better, 
>> I think that would be an indication of some very strange system-level 
>> problems.  I don't expect it to change anything.
>> 
>> Thanks,
>> Shawn
>> 
