Whew! I haven't been lying to people for _years_......
On Thu, Sep 7, 2017 at 5:58 AM, Yonik Seeley <ysee...@gmail.com> wrote:
> On Thu, Sep 7, 2017 at 12:47 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>> bq: and deleted documents are irrelevant to term statistics...
>>
>> Did you mean "relevant"? Or do I have to adjust my thinking _again_?
>
> One can make it work either way ;-)
> Whether a document is marked as deleted or not has no effect on term
> statistics (i.e. irrelevant)
> OR documents marked for deletion still count in term statistics (i.e.
> relevant)
>
> I guess I used the former because we don't go out of our way to still
> include deleted documents... it's just a side effect of the index
> structure that we don't (and can't easily) update statistics when a
> document is marked as deleted.
>
> -Yonik
>
>
>> Erick
>>
>> On Wed, Sep 6, 2017 at 7:48 PM, Yonik Seeley <ysee...@gmail.com> wrote:
>>> Different replicas of the same shard can have different numbers of
>>> deleted documents (really just marked as deleted), and deleted
>>> documents are irrelevant to term statistics (like the number of
>>> documents a term appears in). Documents marked for deletion stop
>>> contributing to corpus statistics when they are actually removed (via
>>> expunge deletes, merges, optimizes).
>>> -Yonik
>>>
>>>
>>> On Wed, Sep 6, 2017 at 5:51 PM, Webster Homer <webster.ho...@sial.com>
>>> wrote:
>>>> I am using Solr 6.2.0 configured as a solr cloud with 2 shards and 4
>>>> replicas (total of 4 nodes).
>>>>
>>>> If I run the query multiple times I see the three different top scoring
>>>> results.
>>>> No data load is running, all data has been commited
>>>>
>>>> I get these three different hits with their scores:
>>>> copperiinitratehemipentahydrate2325919004194 430.61722
>>>> copperiinitrateoncelite1234598765 432.44238
>>>> copperiinitratehydrate18756anhydrousbasis13778319 428.24185
>>>>
>>>> How is it that the same search against the same data can give different
>>>> responses?
>>>> I looked at the specific cores they look OK the numdocs for the replicas in
>>>> a shard match
>>>>
>>>> This is the query:
>>>> http://ae1c-ecomdev-msc01.sial.com:8983/solr/sial-catalog-product/select?defType=edismax&fl=searchmv_en_keywords,%20searchmv_keywords,searchmv_pno,%20searchmv_en_s_pri_name,%20search_en_p_pri_name,%20search_pno%20[explain%20style=nl]&group.field=id_s&group.limit=30&group=true&group.sort=sort_ds%20asc&indent=on&mm=2%3C-25%25&q.op=OR&q=copper%20nitrate&qf=search_pid
>>>> ^500%20search_concat_pno^400%20searchmv_concat_sku^400%20searchmv_pno^300%20search_concat_pno_genr^100%20searchmv_pno_genr%20searchmv_p_skus_genr%20searchmv_user_term^200%20search_lform^190%20searchmv_en_acronym^180%20search_en_root_name^170%20searchmv_en_s_pri_name^160%20search_en_p_pri_name^150%20searchmv_en_synonyms^145%20searchmv_en_keywords^140%20search_en_sortkey^120%20searchmv_p_skus^100%20searchmv_chem_comp^90%20searchmv_en_name_suf%20searchmv_cas_number^80%20searchmv_component_cas^70%20search_beilstein^50%20search_color_idx^40%20search_ecnumber^30%20search_egecnumber^30%20search_femanumber^20%20searchmv_isbn^10%20search_mdl_number%20searchmv_en_page_title%20searchmv_en_descriptions%20searchmv_en_attributes%20searchmv_rtecs%20searchmv_lookahead_terms%20searchmv_xref_comparable_pno%20searchmv_xref_comparable_sku%20searchmv_xref_equivalent_pno%20searchmv_xref_exact_pno%20searchmv_xref_exact_sku%20searchmv_component_molform&rows=30&sort=score%20desc,sort_en_name%20asc,sort_ds%20asc,search_pid%20asc&wt=json
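
For anyone finding this thread later, here is a rough sketch of the effect Yonik describes. It is a toy model, not Solr or Lucene code: the documents, the function names, and the use of a BM25-style IDF are all assumptions made for illustration (the similarity actually configured on the collection may differ). It shows how two replicas with the same live documents (same numDocs) can still produce different IDF values when one of them is still carrying documents that are only marked as deleted.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """BM25-style inverse document frequency (assumed similarity)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def replica_stats(live_docs, deleted_docs, term):
    """Term statistics as a replica would see them in this toy model.

    Documents that are only *marked* as deleted still count toward
    doc_freq and doc_count until a merge or expunge removes them.
    """
    all_docs = live_docs + deleted_docs
    doc_count = len(all_docs)
    doc_freq = sum(1 for doc in all_docs if term in doc.split())
    return doc_freq, doc_count


# Hypothetical data: both replicas hold the same four live documents.
live = [
    "copper nitrate hydrate",
    "copper sulfate",
    "silver nitrate",
    "sodium chloride",
]

# Replica A has already merged its deletes away; replica B has not.
replica_a_deleted = []
replica_b_deleted = ["copper nitrate old", "copper oxide old"]

idf_a = bm25_idf(*replica_stats(live, replica_a_deleted, "copper"))
idf_b = bm25_idf(*replica_stats(live, replica_b_deleted, "copper"))

print(f"replica A idf(copper) = {idf_a:.4f}")
print(f"replica B idf(copper) = {idf_b:.4f}")
```

Because a query may be served by either replica, the same document can come back with a different score from one request to the next. In this model the difference disappears once replica B's deleted documents are actually removed, which corresponds to the expunge deletes, merges, or optimizes mentioned above.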