Does caching work with other entity processors, such as SolrEntityProcessor?

On Fri 12 Apr, 2019, 3:10 PM Srinivas Kashyap, <srini...@bamboorose.com>
wrote:

> Hi Shawn/Mikhail Khludnev,
>
> I was going through the Jira issue
> https://issues.apache.org/jira/browse/SOLR-4799 and see that I can do what
> I intend by specifying join="zipper".
>
> I tried it, but I'm getting the error below:
>
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.IllegalArgumentException: expect increasing foreign keys for Relation CHILD_KEY=PARENT.PARENT_KEY got: QA-HQ008880,HQ011782
>     at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:62)
>     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:246)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
>     ... 5 more
> Caused by: java.lang.IllegalArgumentException: expect increasing foreign keys for Relation CHILD_KEY=PARENT.PARENT_KEY got: QA-HQ008880,HQ011782
>     at org.apache.solr.handler.dataimport.Zipper.supplyNextChild(Zipper.java:70)
>     at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:126)
>     at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
>     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
>
>
> Below is my DIH config:
>
>
> <entity name="PARENT" pk="PQRS"
>         query="SELECT PQRS,PARENT_KEY,L,M,N,O FROM DEF order by PARENT_KEY DESC">
>
>   <field name="L" column="L" />
>   <field name="M" column="M" />
>   <field name="N" column="N" />
>
>   <entity name="childentity1" pk="PQRS"
>           query="SELECT A,B,C,D,E,F,CHILD_KEY,MODIFY_TS FROM ABC ORDER BY CHILD_KEY DESC"
>           processor="SqlEntityProcessor" join="zipper"
>           where="CHILD_KEY=PARENT.PARENT_KEY">
>
>     <field name="A" column="A" />
>     <field name="B" column="B" />
>   </entity>
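
The exception says the zipper join expects increasing foreign keys, while both queries above sort on the key in descending order. A minimal sketch of the same two entities with ascending order follows. It assumes the database sorts these key values the same way Solr compares them, which I have not verified for values like QA-HQ008880 and HQ011782. The closing tags are added only to make the fragment well-formed.

```xml
<!-- Sketch only: same entities as in the quoted config, both sorted ascending on the join key -->
<entity name="PARENT" pk="PQRS"
        query="SELECT PQRS,PARENT_KEY,L,M,N,O FROM DEF ORDER BY PARENT_KEY ASC">

  <field name="L" column="L" />
  <field name="M" column="M" />
  <field name="N" column="N" />

  <!-- Child rows must arrive in the same (increasing) key order as the parent rows -->
  <entity name="childentity1" pk="PQRS"
          query="SELECT A,B,C,D,E,F,CHILD_KEY,MODIFY_TS FROM ABC ORDER BY CHILD_KEY ASC"
          processor="SqlEntityProcessor" join="zipper"
          where="CHILD_KEY=PARENT.PARENT_KEY">

    <field name="A" column="A" />
    <field name="B" column="B" />
  </entity>
</entity>
```

If the error persists with ascending order, the database collation may be ordering the keys differently from a plain string comparison, so that would be the next thing to check.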
>
>
> Thanks and Regards,
> Srinivas Kashyap
>
> -----Original Message-----
> From: Shawn Heisey <apa...@elyograg.org>
> Sent: 09 April 2019 01:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Sql entity processor sortedmapbackedcache out of memory issue
>
> On 4/8/2019 11:47 PM, Srinivas Kashyap wrote:
> > I'm using DIH to index the data, and the structure of the DIH config for
> > the Solr core is like below:
> >
> > <entity>
> > 16 child entities
> > </entity>
> >
> > During indexing, the number of requests made to the database was high (17
> > queries to process one document), and they used up most of the database
> > connections, thereby blocking our web application.
>
> If you have 17 entities, then one document will indeed take 17 queries.
> That's the nature of multiple DIH entities.
>
> > To tackle it, we implemented SortedMapBackedCache with the cacheImpl
> > parameter to reduce the number of requests to the database.
>
> When you use SortedMapBackedCache on an entity, you are asking Solr to
> store the results of the entire query in memory, even if you don't need all
> of the results.  If the database has a lot of rows, that's going to take a
> lot of memory.
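
For reference, a cached child entity of the kind described here is typically declared roughly as below. This is a sketch that reuses the table and column names from this thread, not the poster's actual config. With it, the child query runs once and every row it returns is kept in memory for lookups by key.

```xml
<!-- Sketch only: the whole result of the child query is loaded into an in-memory map -->
<entity name="childentity1"
        query="SELECT A,B,CHILD_KEY FROM ABC"
        cacheImpl="SortedMapBackedCache"
        cacheKey="CHILD_KEY"
        cacheLookup="PARENT.PARENT_KEY">

  <field name="A" column="A" />
  <field name="B" column="B" />
</entity>
```

The memory cost grows with the size of the ABC result set, so selecting only the columns that are actually indexed should reduce it somewhat.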
>
> In your excerpt from the config, your inner entity doesn't have a WHERE
> clause.  Which means that it's going to retrieve all of the rows of the ABC
> table for *EVERY* single entry in the DEF table.  That's going to be
> exceptionally slow.  Normally the SQL query on inner entities will have
> some kind of WHERE clause that limits the results to rows that match the
> entry from the outer entity.
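
A WHERE clause of that kind might look like the sketch below. The table and column names are taken from this thread, and the quotes around the variable assume the key is a string column.

```xml
<!-- Sketch only: the child query is restricted to rows matching the current parent row -->
<entity name="childentity1"
        query="SELECT A,B FROM ABC WHERE CHILD_KEY='${PARENT.PARENT_KEY}'">

  <field name="A" column="A" />
  <field name="B" column="B" />
</entity>
```

Note that this form issues one child query per parent row, which is the database load pattern the original poster was trying to avoid.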
>
> You may need to write a custom indexing program that runs separately from
> Solr, possibly on an entirely different server.  That might be a lot more
> efficient than DIH.
>
> Thanks,
> Shawn
