q=+{!parent which=ntype:p v='+msg:Hello +person:Arturas'} +{!parent which=ntype:p v='+msg:ciao +person:Vai'}
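
In case it helps, here is a minimal sketch of assembling that query string from Python. The helper names, the core name "chats" and the host are assumptions made for illustration only; the one thing taken from the thread is that `which` must be a filter matching all parent documents and only parent documents (here ntype:p).

```python
from urllib.parse import urlencode

# Filter that identifies ALL parent documents and nothing else.
PARENT_FILTER = "ntype:p"


def parent_clause(child_query: str, which: str = PARENT_FILTER) -> str:
    """Build one required {!parent} clause: parents having a child matching child_query."""
    return "+{!parent which=%s v='%s'}" % (which, child_query)


def build_query(*child_queries: str) -> str:
    """AND together one {!parent} clause per child condition."""
    return " ".join(parent_clause(cq) for cq in child_queries)


# Each condition may be satisfied by a different child of the same parent.
q = build_query("+msg:Hello +person:Arturas", "+msg:ciao +person:Vai")

# Hypothetical host and core name; urlencode takes care of escaping.
url = "http://localhost:8983/solr/chats/select?" + urlencode({"q": q, "wt": "json"})

print(q)
print(url)
```

Each child condition gets its own {!parent} clause so that the two conditions do not have to hold on the same child. Against the sample collection quoted below, and assuming msg and person are indexed so that these terms match, I would expect only the parent with id 1 to come back.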
On Wed, Apr 4, 2018 at 12:19 PM, Arturas Mazeika <maze...@gmail.com> wrote:

> Hi Mikhail et al,
>
> It seems to me that the nested documents must include nodes that encode the
> level of nodes (within the document). Therefore, the minimal example must
> include the node type. Is the following structure sufficient?
>
> {
>   "id":1,
>   "ntype":"p",
>   "_childDocuments_":
>   [
>     {"id":"1_1", "ntype":"c", "person":"Vai", "time":"3:14", "msg":"Hello"},
>     {"id":"1_2", "ntype":"c", "person":"Arturas", "time":"3:14", "msg":"Hello"},
>     {"id":"1_3", "ntype":"c", "person":"Vai", "time":"3:15", "msg":"Coz Mathias is working on another system- different screen."},
>     {"id":"1_4", "ntype":"c", "person":"Vai", "time":"3:15", "msg":"It can get annoying"},
>     {"id":"1_5", "ntype":"c", "person":"Arturas", "time":"3:15", "msg":"Thank you. this is very nice of you"},
>     {"id":"1_6", "ntype":"c", "person":"Vai", "time":"3:16", "msg":"ciao"},
>     {"id":"1_7", "ntype":"c", "person":"Arturas", "time":"3:16", "msg":"ciao"}
>   ]
> },
> {
>   "id":2,
>   "ntype":"p",
>   "_childDocuments_":
>   [
>     {"id":"2_1", "ntype":"c", "person":"Vai", "time":"4:14", "msg":"Hi"},
>     {"id":"2_2", "ntype":"c", "person":"Arturas", "time":"4:14", "msg":"IBM Watson"},
>     {"id":"2_3", "ntype":"c", "person":"Vai", "time":"4:15", "msg":"need to retain content"},
>     {"id":"2_4", "ntype":"c", "person":"Vai", "time":"4:15", "msg":"It can get annoying"},
>     {"id":"2_5", "ntype":"c", "person":"Arturas", "time":"4:15", "msg":"You can make all your meetings more access"},
>     {"id":"2_6", "ntype":"c", "person":"Vai", "time":"4:16", "msg":"Make every meeting a Skype meeting"},
>     {"id":"2_7", "ntype":"c", "person":"Arturas", "time":"4:16", "msg":"ciao"}
>   ]
> }
>
> What would a query look like that finds a Hello from person Arturas and a
> ciao from person Vai?
>
> Cheers,
> Arturas
>
>
> On Tue, Apr 3, 2018 at 5:21 PM, Arturas Mazeika <maze...@gmail.com> wrote:
>
> > Hi Mikhail,
> >
> > Thanks a lot for the reply.
> >
> > You mentioned that
> >
> > q=+{!parent which.. v='+text:hello +person:A'} +{!parent which.. v='+text:ciao +person:B'}
> >
> > is the way to go. What would it look like precisely for the following
> > collection?
> >
> > {
> >   "id":1,
> >   "_childDocuments_":
> >   [
> >     {"id":"1_1", "person":"Vai", "time":"3:14", "msg":"Hello"},
> >     {"id":"1_2", "person":"Arturas", "time":"3:14", "msg":"Hello"},
> >     {"id":"1_3", "person":"Vai", "time":"3:15", "msg":"Coz Mathias is working on another system- different screen."},
> >     {"id":"1_4", "person":"Vai", "time":"3:15", "msg":"It can get annoying"},
> >     {"id":"1_5", "person":"Arturas", "time":"3:15", "msg":"Thank you. this is very nice of you"},
> >     {"id":"1_6", "person":"Vai", "time":"3:16", "msg":"ciao"},
> >     {"id":"1_7", "person":"Arturas", "time":"3:16", "msg":"ciao"}
> >   ]
> > },
> > {
> >   "id":2,
> >   "_childDocuments_":
> >   [
> >     {"id":"2_1", "person":"Vai", "time":"4:14", "msg":"Hello"},
> >     {"id":"2_2", "person":"Arturas", "time":"4:14", "msg":"IBM Watson"},
> >     {"id":"2_3", "person":"Vai", "time":"4:15", "msg":"need to retain content"},
> >     {"id":"2_4", "person":"Vai", "time":"4:15", "msg":"It can get annoying"},
> >     {"id":"2_5", "person":"Arturas", "time":"4:15", "msg":"You can make all your meetings more access"},
> >     {"id":"2_6", "person":"Vai", "time":"4:16", "msg":"Make every meeting a Skype meeting"},
> >     {"id":"2_7", "person":"Arturas", "time":"4:16", "msg":"ciao"}
> >   ]
> > }
> >
> > Cheers,
> > Arturas
> >
> >
> > On Tue, Apr 3, 2018 at 4:33 PM, Mikhail Khludnev <m...@apache.org> wrote:
> >
> >> Hello, Arturas.
> >>
> >> TLDR; Please find inline below.
> >>
> >> On Tue, Apr 3, 2018 at 5:14 PM, Arturas Mazeika <maze...@gmail.com> wrote:
> >>
> >> > Hi Solr Fans,
> >> >
> >> > I am trying to make sense of information retrieval using expressions like
> >> > "some parent", "*only parent*", "*all parent*".
> >> > I am also trying to understand the syntax "!parent which" and
> >> > "!child of". On the technical level, I am reading the following
> >> > documents:
> >> >
> >> > [1] https://lucene.apache.org/solr/guide/7_2/other-parsers.html#block-join-query-parsers
> >> > [2] https://lucene.apache.org/solr/guide/7_2/uploading-data-with-index-handlers.html#nested-child-documents
> >> > [3] http://yonik.com/solr-nested-objects/
> >> >
> >> > and I am confused to read:
> >> >
> >> > This parser takes a query that matches some parent documents and returns
> >> > their children. The syntax for this parser is:
> >> > q={!child of=<allParents>}<someParents>. The parameter allParents is a
> >> > filter that matches *only parent documents*; here you would define the
> >> > field and value that you used to identify *all parent documents*. The
> >> > parameter someParents identifies a query that will match some of the
> >> > parent documents. The output is the children.
> >> >
> >> > The first sentence talks about "matching" but does not define what that
> >> > means (and why is it that only some parents match?). The second sentence
> >> > introduces the syntax of the parser, but blurs the understanding, as
> >> > "some" and "all" parents are combined into one sentence. My understanding
> >> > is that all documents that satisfy a query are retrieved. The query must
> >> > express some constraints on the parent node and some on the child node. I
> >> > have a feeling that "only parent documents" reads as "criteria are
> >> > formulated over the parent part of the {parent document}->{child
> >> > document} entity".
> >> >
> >> > My simplified conceptual world of Solr looks as follows:
> >> >
> >> > 1. Every document has an ID.
> >> > 2. Every document may have additional attributes.
> >> > 3. Text attributes are what's at stake in Solr. Sure, we can search for
> >> > products that cost at most X, but this is added functionality.
> >> > For simplicity I am neglecting those here.
> >> > 4. The user has an information need. She expresses it with (key)words and
> >> > hopes to find matching documents. For simplicity, I am skipping all
> >> > issues related to the presentation of the documents.
> >> > 5. The analysis chain (and the inverted index) are the key technologies
> >> > Solr is based upon. Once the chain processing is applied, mathematical
> >> > logic kicks in, retrieving the documents (each a set of processed,
> >> > normalized, enriched tokens) matching the query (processed, normalized
> >> > and enriched tokens). Clearly, the logic function can be a fancy one (at
> >> > least one of the query tokens is in the document's set of tokens, etc.),
> >> > and ranking is used to sort the results.
> >> > 6. A nested document concept is introduced in Solr. Nested documents need
> >> > to be uploaded into the index structure using specific handlers [2]. A
> >> > nested document is a tree. A root may contain child documents, which may
> >> > be parents of grandchild documents.
> >> > 7. Querying nested documents is supported in the following manner:
> >> > 7.1 Child documents are returned that satisfy {parent
> >> > document}->{document}
> >> > 7.2 Parent documents are returned that satisfy {document}->{child
> >> > document}
> >> >
> >> > Would I be very wrong to have this conceptual picture?
> >> >
> >> > From this point, the situation is a bit blurry in my head. At the core, I
> >> > do not really understand what "a document" is anymore (the complete JSON
> >> > or XML is a document, and so is a sub-JSON or sub-XML; every document
> >> > must have an ID, so does that mean that the subdocuments must have an ID
> >> > too, or are sub-IDs also fine?), how to formulate mathematical
> >> > expressions over documents, and what it means that the document satisfies
> >> > my (key)word query?
> >> > Can we define a document to be the largest entity of information that
> >> > does not contain any other nested documents [4]? If this is already
> >> > defined and communicated like this, where can I find it? A clarification
> >> > would be useful, as the concept of a document means different things in
> >> > different contexts (e.g., you can update only the "complete document" in
> >> > the index vs. parent document, etc.).
> >> >
> >> > Is it possible to formulate what's going on using mathematical logic? Can
> >> > one express something like
> >> >
> >> > { give documents d : d is a document, d is parent of document c, d
> >> > satisfies logical criteria C1,....,CN, c satisfies logical criteria
> >> > C1',...,CM'}
> >> > { give documents c : c is a document, d is parent of document c, d
> >> > satisfies logical criteria C1,....,CN, c satisfies logical criteria
> >> > C1',...,CM'}
> >> >
> >> > where the meaning of document is as in definition [4] above.
> >> >
> >> > 1. Is it possible to retrieve all parent documents that have two children
> >> > c1 and c2? Consider a document that is a Skype chat, and the children are
> >> > individual lines of communication in the chat. I would be looking for the
> >> > (parent) documents that have "hello" said by person A and "ciao" said by
> >> > person B (as two different sub-documents).
> >> >
> >>
> >> q=+{!parent which.. v='+text:hello +person:A'} +{!parent which.. v='+text:ciao +person:B'}
> >> The query syntax is really tricky and cumbersome.
> >>
> >>
> >> >
> >> > 2. Is it possible to search for documents such that they have a
> >> > grandchild and the grandchild has the word "hello"?
> >> >
> >>
> >> http://blog-archive.griddynamics.com/2013/12/grandchildren-and-siblings-with-block.html
> >>
> >>
> >> >
> >> > 3. Is it possible to search for documents that do not have children?
> >> >
> >> q=-{!parent which..}type:child
> >> Beware that mixing parents and childfree products is not supported and
> >> causes pain. As a workaround you need to put in an empty child placeholder
> >> doc. Sic. Sorry.
> >>
> >>
> >> > Is this the right venue to discuss documentation of Solr?
> >> >
> >> > Thanks!
> >> > Arturas
> >> >
> >>
> >>
> >>
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >>
> >
>

--
Sincerely yours
Mikhail Khludnev