[ 
https://issues.apache.org/jira/browse/SOLR-6096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510205#comment-16510205
 ] 

Lorenzo Speranzoni commented on SOLR-6096:
------------------------------------------

I'm using HBASE-INDEXER, and our use case requires us to generate a set of
parent + children documents per HBASE row. When a record is deleted in HBASE,
the only option is to delete by ID, which deletes the parent document only.

I was wondering if there's a way to trigger the deletion of the orphaned
children, either via configuration or with a specific delete-by-query
("delete all orphaned children", which I can't figure out) that could be
scheduled in a cron script?

Or do you see any better strategy to keep the index clean?

Thank you very much in advance,

Lorenzo

> Support Update and Delete on nested documents
> ---------------------------------------------
>
>                 Key: SOLR-6096
>                 URL: https://issues.apache.org/jira/browse/SOLR-6096
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.7.2
>            Reporter: Thomas Scheffler
>            Priority: Major
>              Labels: blockjoin, nested
>
> When using nested or child documents, update and delete operations on the
> root document should also affect the nested documents, as no child can exist
> without its parent :-)
> Example:
> {code:xml|title=First Import}
> <doc>
>   <field name="id">1</field>
>   <field name="title">Article with author</field>
>   <doc>
>     <field name="name">Smith, John</field>
>     <field name="role">author</field>
>   </doc>
> </doc>
> {code}
> If I change my mind and the author was not named *John* but *_Jane_*:
> {code:xml|title=Changed name of author of '1'}
> <doc>
>   <field name="id">1</field>
>   <field name="title">Article with author</field>
>   <doc>
>     <field name="name">Smith, Jane</field>
>     <field name="role">author</field>
>   </doc>
> </doc>
> {code}
> I would expect that John is not in the index anymore. Currently he is. It
> might also be the case that a subdocument is removed by an update:
> {code:xml|title=Remove author}
> <doc>
>   <field name="id">1</field>
>   <field name="title">Article without author</field>
> </doc>
> {code}
> This should also delete all nested documents. In the same way, all nested
> documents should be deleted if I delete the root document:
> {code:xml|title=Deletion of '1'}
> <delete>
>   <id>1</id>
>   <!-- implying also
>     <query>_root_:1</query>
>    -->
> </delete>
> {code}
> Currently, all of this is possible on the client side by issuing an
> additional request to delete the documents before every update. It would be
> more efficient if this could be handled on the SOLR side. One would also
> benefit on atomic updates. The biggest plus shows when using
> "delete-by-query".
> {code:xml|title=Deletion of '1' by query}
> <delete>
>   <query>title:*</query>
>   <!-- implying also
>     <query>_root_:1</query>
>    -->
> </delete>
> {code}
> In that case one would not have to first query all documents and then issue
> deletes for those IDs and for every nested document.
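
The client-side workaround mentioned in the description (an extra delete
request before every update or delete) could be sketched roughly as follows.
This is only an illustration and not part of the issue: it assumes the schema
defines the _root_ field and that children were indexed as a block, so that
every document in the block carries the root document's id in _root_. The
helper names escape_term and build_block_delete are hypothetical, and sending
the resulting message to the update handler is left out.

```python
import xml.etree.ElementTree as ET


def escape_term(value: str) -> str:
    """Wrap a value in double quotes so it can be used as a phrase in a query.

    Backslashes and double quotes inside the value are backslash-escaped.
    """
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_block_delete(root_ids):
    """Build an XML <delete> message with one _root_ query per root id.

    Assumption: each document of a block stores the root document's id in
    the _root_ field, so one query per id matches the root and its children.
    """
    delete = ET.Element("delete")
    for root_id in root_ids:
        query = ET.SubElement(delete, "query")
        query.text = "_root_:" + escape_term(str(root_id))
    # ElementTree escapes &, < and > in the element text on serialization.
    return ET.tostring(delete, encoding="unicode")


if __name__ == "__main__":
    print(build_block_delete(["1"]))
```

For the example document with id 1, this yields a single delete message
containing the query _root_:"1", which is the query the description lists as
implied by the delete of '1'.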



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
