Kevin Bachmann created SOLR-13167:
-------------------------------------
Summary: Duplicate Child Documents and undeterministic search
Key: SOLR-13167
URL: https://issues.apache.org/jira/browse/SOLR-13167
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: search, SolrCloud
Affects Versions: 7.5
Environment: Solr 7.5 running on AWS EC2 instances with an AMI OS,
split into two shards running on two different EC2 instances, using the
built-in ZooKeeper of Solr
Reporter: Kevin Bachmann
Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png,
screenshot-4.png
I have a product search hosted on a SolrCloud cluster with 2 shards and two
instances hosted on EC2, with the following setup:
A product has an unlimited number of children, which are small objects with
shop information. These child documents define the shops where the product is
available. My requirement is to update / sync the whole documents (parent and
children) at least once a day. The availability information is stored in the
child documents in a quantity field.
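
To illustrate the setup, one product with its shop children could be built for indexing roughly as follows. This is only a sketch: `nodeType`, `shopId` and `quantity` come from the description above, while the parent value `product`, the child id scheme and the helper name `build_product_block` are assumptions made for the example. `_childDocuments_` is the key Solr's JSON update format uses for nested documents.

```python
import json


def build_product_block(product_id, shops):
    """Return one parent document with its shop children nested inside."""
    return {
        "id": product_id,
        # Assumed parent marker; only nodeType:shop appears in the report.
        "nodeType": "product",
        "_childDocuments_": [
            {
                # Hypothetical child id scheme: parent id plus shop id.
                "id": f"{product_id}_{shop['shopId']}",
                "nodeType": "shop",
                "shopId": shop["shopId"],
                "quantity": shop["quantity"],
            }
            for shop in shops
        ],
    }


block = build_product_block(
    "p1",
    [{"shopId": 10, "quantity": 3}, {"shopId": 11, "quantity": 0}],
)
# A daily sync would send a list of such blocks to the collection's update handler.
payload = json.dumps([block], indent=2)
print(payload)
```

Each block has exactly one level of depth (parent > child), which is the structure the sync is expected to preserve.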
Problem:
# After every sync the number of child documents (shops) increases and the
nesting gets deeper, as the quantity changes and the child documents are
apparently not updated by id but newly created with the same id (duplicates,
comparable to SOLR-5211, SOLR-6096, SOLR-12638).
# Whenever I sync the products and their children with one level of depth
(parent > child), I get parent > child > child > child > ... depending on how
many children there are (see screenshot-4.png). These children also can't be
displayed with nodeType:shop.
# Whenever I try to request the products (parents) by a child attribute
(shopId), the search is nondeterministic and does not return the correct
products. Many products contain children that have never been assigned to
them. Some products are flooded with a huge number of children (>1000)
although only about 10 are assigned to them. As you can see in screenshot-1 to
screenshot-3, three queries that are exactly the same return different
products. screenshot-1, with 26241 results, shows the correct count and the
correct data, but the other two are completely wrong.
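
The kind of request described in the third item can be sketched with Solr's block join parent query parser and the `[child]` document transformer. The exact queries are only visible in the screenshots, so this is an illustration under assumptions: the parent filter `nodeType:product`, the shop id `42` and the helper name `parent_query_params` are hypothetical.

```python
from urllib.parse import urlencode


def parent_query_params(shop_id, parent_filter="nodeType:product", rows=10):
    """Build select parameters that return parents having a matching shop child."""
    # {!parent which=...} maps matching child documents to their parent documents.
    q = '{!parent which="%s"}shopId:%s' % (parent_filter, shop_id)
    # The [child] transformer attaches the shop children to each returned parent.
    fl = "id,[child parentFilter=%s childFilter=nodeType:shop]" % parent_filter
    return {"q": q, "fl": fl, "rows": rows}


params = parent_query_params(42)
# Query string that would be appended to the collection's /select endpoint.
query_string = urlencode(params)
print(query_string)
```

Sending the same parameters repeatedly should return the same parents and the same total count each time, which is not what the screenshots show.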
I would really appreciate any workaround or help with these issues. This is a
huge problem and my business depends on it (!) :(