David,
I don't quite understand how the IDs are assigned, but beware that repeating
a uniqueKey value deletes the former occurrence. In a block join index this
corrupts the block structure: a parent must not be deleted and leave its
children as orphans (.. so touching, I'm sorry). Just make sure that the
number of deleted docs is 0 first.
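
One way to rule out repeated uniqueKey values before importing is to scan the
import data itself. The following is only a rough sketch, assuming the import
file is a JSON array of parent documents nested via _childDocuments_ and that
"uuid" is the uniqueKey, as described in this thread. The helper names
(iter_docs, duplicate_keys) and the sample data are made up for illustration;
the second product in the sample deliberately reuses an Item uuid.

```python
import json
from collections import Counter


def iter_docs(doc):
    """Yield a document and all of its nested _childDocuments_, depth first."""
    yield doc
    for child in doc.get("_childDocuments_", []):
        yield from iter_docs(child)


def duplicate_keys(docs, key="uuid"):
    """Return the key values that occur more than once across all blocks."""
    counts = Counter(
        d[key] for top in docs for d in iter_docs(top) if key in d
    )
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical sample modelled on the data in this thread.
sample = json.loads("""
[
  {"id": 739063, "docType": "Product", "uuid": "P739063",
   "_childDocuments_": [
     {"id": 1537378, "docType": "Item", "uuid": "I1537378",
      "_childDocuments_": [
        {"id": 12799578, "docType": "Sku", "uuid": "S12799578"}]}]},
  {"id": 739064, "docType": "Product", "uuid": "P739064",
   "_childDocuments_": [
     {"id": 1537378, "docType": "Item", "uuid": "I1537378"}]}
]
""")

print(duplicate_keys(sample))  # prints ['I1537378']
```

An empty list means no uuid repeats anywhere in the file, parents and children
included. It says nothing about documents that are already in the index.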

On Thu, Feb 2, 2017 at 6:20 PM, David Kramer <david.kra...@shoebuy.com>
wrote:

> Thanks for responding, Mikhail.  There are no deleted documents.  Since
> I’m fairly new to Solr, one of the things I’ve been paranoid about is that I
> have no way of validating my schema.xml, or of knowing whether Solr is even
> using it (I have evidence it’s not, more below). So for each test, I’ve wiped
> out the index, recreated it, and reimported.
>
> Back to whether my schema.xml is being used, I mentioned that I had to
> come up with a compound UUID field of the first character of the docType
> plus the ID, and we put “<uniqueKey>uuid</uniqueKey>” (was id) in our
> schema.xml.  Then I deleted and recreated the index and restarted Solr.  In
> order to verify it was working, I created an import file that had unique
> IDs but UUIDs which were duplicates of existing records, and it imported
> the new records even though the UUIDs existed in the database already.  I’m
> not sure if Solr should have produced an error or not. I’ll research that,
> but I mention that here in case it’s relevant.
>
> Thanks.
>
> On 2/2/17, 6:10 AM, "Mikhail Khludnev" <m...@apache.org> wrote:
>
>     David,
>
>     Can you make sure your index doesn't have deleted docs? This can be seen
>     in Solr Admin.
>     And can you merge the index to avoid having them in it?
>
>     On Thu, Feb 2, 2017 at 12:29 AM, David Kramer <david.kra...@shoebuy.com>
>     wrote:
>
>     >
>     >
>     > Some background:
>     > · The data involved is catalog data, with three nested objects:
>     >   Products, Items, and Skus, in that order. We have a docType field on
>     >   each record as a differentiator.
>     > · The "id" field in our data is unique within datatype, but not across
>     >   datatypes. We added a "uuid" field in our program that generates the
>     >   Solr import file that is the id prefixed by the first letter of the
>     >   docType, like P12345. That makes the uuid field unique, and we have
>     >   that as the uniqueKey in our schema.xml.
>     > · We are trying to retrieve the parent Product, and all children
>     >   documents. As such, we are using the ChildDocTransformerFactory
>     >   ([child...]) to retrieve the children along with the parent. We have
>     >   not yet solved the problem of getting items within SKUs as nested
>     >   documents in the results, and we will have to figure that out at some
>     >   point, but for now we get them flattened.
>     > · We are building out the proof of concept for this. This is all new
>     >   work, so we are free to change a lot.
>     > · This is Solr 6.0.0, and we are importing in JSON format, if that
>     >   matters.
>     > · I submitted this question to StackOverflow
>     >   <http://stackoverflow.com/questions/41969353/solr-querying-nested-documents-with-childdoctransformerfactory-get-parent-quer>
>     >   but haven’t gotten any answers yet.
>     >
>     >
>     > Our data looks like this (I've removed some fields for simplicity):
>     >
>     > {
>     >   "id": 739063,
>     >   "docType": "Product",
>     >   "uuid": "P739063",
>     >   "_childDocuments_": [
>     >     {
>     >       "id": 1537378,
>     >       "price": 25.45,
>     >       "color": "Blush",
>     >       "docType": "Item",
>     >       "productId": 739063,
>     >       "uuid": "I1537378",
>     >       "_childDocuments_": [
>     >         {
>     >           "id": 12799578,
>     >           "size": "10",
>     >           "width": "W",
>     >           "docType": "Sku",
>     >           "itemId": 1537378,
>     >           "uuid": "S12799578"
>     >         }
>     >       ]
>     >     }
>     >   ]
>     > }
>     >
>     >
>     >
>     > The query to fetch all Products and their children nested inside them is
>     > q=docType:Product&fl=title,id,docType,[child parentFilter=docType:Product].
>     > When I run that query, all is well, and it returns the first 10 rows.
>     > However, if I fetch more rows by adding, say, &rows=500, we get the error
>     > "Parent query yields document which is not matched by parents filter,
>     > docID=XXX".
>     >
>     > When we first saw that error, we discovered our id field was not unique
>     > across document types, so we added the uuid field as mentioned above,
>     > which is. We also added it to our schema.xml file, wiped the core,
>     > recreated it, and restarted Solr just to make sure it was in effect. We
>     > have double-checked and are sure that the uuid fields are unique.
>     >
>     >
>     >
>     > In all the search results for that error that I've found, the OP did not
>     > have a field that could differentiate the different document types, but
>     > as you see we do. Since both the query and the parentFilter are searching
>     > for docType:Product I don't see how either could possibly return anything
>     > but parents. We've also tried adding childFilter=docType:Item and
>     > childFilter=docType:Sku but that did not help.  I also tried using
>     > title:* for the filter since only products have titles.
>     >
>     >
>     >
>     > Is there anything else we can try?
>     >
>     > Any explanation of this?
>     >
>     > Is it possible that it's not using uuid as the unique identifier even
>     > though it's specified in the schema.xml, and would that even cause this?
>     >
>     > Thanks.
>     >
>     >
>     >
>
>
>     --
>     Sincerely yours
>     Mikhail Khludnev
>
>
>


-- 
Sincerely yours
Mikhail Khludnev
