[jira] [Commented] (SOLR-12298) Index Full nested document Hierarchy For Queries (umbrella issue)

mosh (JIRA) Mon, 07 May 2018 22:44:23 -0700

    [ https://issues.apache.org/jira/browse/SOLR-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466905#comment-16466905 ]


mosh commented on SOLR-12298:
-----------------------------

David, you have some really strong points.

First, while it is true that __childDocuments__ is not required, sometimes the 
nested JSON you get is not an array but a single child document, e.g.
{code:java}
{ "id": "X998_Y998",
  "from": { "name": "Peyton Manning", "id": "X18" },
  "message": "Where's my contract?",
  "actions": [
    { "name": "Comment", "link": "http://www.facebook.com/X998/posts/Y998" },
    { "name": "Like", "link": "http://www.facebook.com/X998/posts/Y998" } ],
  "type": "status",
  "created_time": "2010-08-02T21:27:44+0000",
  "updated_time": "2010-08-02T21:27:44+0000" }
{code}
This is a sample Facebook API response. The array syntax indexes the "actions" 
array as child documents, but it does not index the single child document under 
the key "from":
{code:java}
{ "from": { "name": "Peyton Manning", "id": "X18" } }
{code}
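To make the difference concrete, here is a rough sketch in Python. It is an 
illustration only, not Solr's loader code, and both helper names are made up. It 
shows which keys of a trimmed-down version of the sample above would be treated 
as child documents under an "arrays only" rule versus an "any nested object" rule:

```python
import json

# Hypothetical helpers for illustration only; this is not how Solr's
# JSON loader is implemented.

def is_object_list(value):
    """True for a non-empty list whose elements are all JSON objects."""
    return isinstance(value, list) and bool(value) and all(
        isinstance(elem, dict) for elem in value)

def children_array_only(doc):
    """Keys picked up when only arrays of objects count as children."""
    return [key for key, value in doc.items() if is_object_list(value)]

def children_any_object(doc):
    """Keys picked up when a single nested object also counts as a child."""
    return [key for key, value in doc.items()
            if isinstance(value, dict) or is_object_list(value)]

post = json.loads("""
{ "id": "X998_Y998",
  "from": { "name": "Peyton Manning", "id": "X18" },
  "actions": [ { "name": "Comment" }, { "name": "Like" } ],
  "type": "status" }
""")

print(children_array_only(post))   # ['actions']
print(children_any_object(post))   # ['from', 'actions']
```
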
It would be nice if you could just index JSON as is, as you can in 
Elasticsearch, moving the responsibility from the user to Solr itself.
This feature could also be added to the XML loader if needed, for feature 
parity. Once this change is introduced to the data loaders, the rest can be 
done using a URP, as long as the loaders add the metadata the URP needs in 
order to add the required fields.
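
As a rough idea of the kind of metadata I mean, here is a small Python sketch 
(not Solr code). It flattens a nested document and tags every resulting document 
with its path and nesting level. The field names "_path_" and "_level_", the 
"key#index" path format, and the function name are all assumptions made up for 
this example:

```python
def flatten(doc, path="", level=0):
    """Return a list of flat docs, each tagged with its path and level.

    '_path_' and '_level_' are hypothetical metadata field names.
    """
    flat = {"_path_": path, "_level_": level}
    out = [flat]
    for key, value in doc.items():
        child_path = f"{path}/{key}" if path else key
        if isinstance(value, dict):
            # A single nested object becomes one child document.
            out.extend(flatten(value, child_path, level + 1))
        elif (isinstance(value, list) and value
              and all(isinstance(elem, dict) for elem in value)):
            # An array of objects becomes one child document per element.
            for index, elem in enumerate(value):
                out.extend(flatten(elem, f"{child_path}#{index}", level + 1))
        else:
            # Scalars (and arrays of scalars) stay on the current document.
            flat[key] = value
    return out

post = {
    "id": "X998_Y998",
    "from": {"name": "Peyton Manning", "id": "X18"},
    "actions": [{"name": "Comment"}, {"name": "Like"}],
}

for flat_doc in flatten(post):
    print(flat_doc)
```

With metadata like this attached by the loader, a URP would have enough 
information to add whatever fields are needed for querying.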

Afterwards, a new transformer could be introduced that rebuilds the whole JSON 
structure, including the full original hierarchy.
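
A minimal Python sketch of what such a transformer would have to do, assuming 
each flat document carries a hypothetical "_path_" field in which "/" separates 
levels and "key#index" marks an element of an array of children (none of this is 
an existing Solr API, and field names containing "/" or "#" are not handled):

```python
def rebuild(flat_docs):
    """Reassemble a nested dict from flat docs carrying a '_path_' field."""
    root = {}
    for doc in flat_docs:
        node = root
        # Walk (or create) the chain of parents named by the path.
        for segment in filter(None, doc["_path_"].split("/")):
            key, _, index = segment.partition("#")
            if index:
                # Element of an array of child documents.
                items = node.setdefault(key, [])
                while len(items) <= int(index):
                    items.append({})
                node = items[int(index)]
            else:
                # Single child document stored under a key.
                node = node.setdefault(key, {})
        # Copy the document's own fields, dropping the metadata.
        node.update((k, v) for k, v in doc.items() if k != "_path_")
    return root

flat = [
    {"_path_": "", "id": "X998_Y998", "type": "status"},
    {"_path_": "from", "name": "Peyton Manning", "id": "X18"},
    {"_path_": "actions#0", "name": "Comment"},
    {"_path_": "actions#1", "name": "Like"},
]

print(rebuild(flat))
```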

On the other hand, supporting SolrInputDocument as a field value could be the 
better way to go, making most of this "hack" logic redundant. 
Perhaps you are right, and this is the better choice in the long run.

> Index Full nested document Hierarchy For Queries (umbrella issue)
> -----------------------------------------------------------------
>
>                 Key: SOLR-12298
>                 URL: https://issues.apache.org/jira/browse/SOLR-12298
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: mosh
>            Priority: Major
>
> Solr ought to have the ability to index deeply nested objects, while storing 
> the original document hierarchy.
>  Currently the client has to index the child document's full path and level 
> to manually reconstruct the original document structure, since the children 
> are flattened and returned in the reserved "__childDocuments__" key.
> Ideally you could index a nested document, having Solr transparently add the 
> required fields while providing a document transformer to rebuild the 
> original document's hierarchy.
>  
> This issue is an umbrella issue for the particular tasks that will make it 
> all happen – either subtasks or issue linking.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

