Does caching work with other entity processors, such as SolrEntityProcessor?

On Fri 12 Apr, 2019, 3:10 PM Srinivas Kashyap, <srini...@bamboorose.com> wrote:
> Hi Shawn/Mikhail Khludnev,
>
> I was going through Jira https://issues.apache.org/jira/browse/SOLR-4799
> and see, I can do my intended activity by specifying zipper.
>
> I tried doing it, however I'm getting error as below:
>
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.IllegalArgumentException: expect increasing foreign keys for Relation CHILD_KEY=PARENT.PARENT_KEY got: QA-HQ008880,HQ011782
>     at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:62)
>     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:246)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:514)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
>     ... 5 more
> Caused by: java.lang.IllegalArgumentException: expect increasing foreign keys for Relation CHILD_KEY=PARENT.PARENT_KEY got: QA-HQ008880,HQ011782
>     at org.apache.solr.handler.dataimport.Zipper.supplyNextChild(Zipper.java:70)
>     at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:126)
>     at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
>     at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
>
> Below is my dih config:
>
> <entity name="PARENT" pk="PQRS"
>         query="SELECT PQRS,PARENT_KEY,L,M,N,O FROM DEF order by PARENT_KEY DESC">
>
>   <field name="L" column="L" />
>   <field name="M" column="M" />
>   <field name="N" column="N" />
>
>   <entity name="childentity1" pk="PQRS"
>           query="SELECT A,B,C,D,E,F,CHILD_KEY,MODIFY_TS FROM ABC ORDER BY CHILD_KEY DESC"
>           processor="SqlEntityProcessor" join="zipper" where="CHILD_KEY=PARENT.PARENT_KEY">
>
>     <field name="A" column="A" />
>     <field name="B" column="B" />
>   </entity>
>
> Thanks and Regards,
> Srinivas Kashyap
>
> -----Original Message-----
> From: Shawn Heisey <apa...@elyograg.org>
> Sent: 09 April 2019 01:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Sql entity processor sortedmapbackedcache out of memory issue
>
> On 4/8/2019 11:47 PM, Srinivas Kashyap wrote:
> > I'm using DIH to index the data and the structure of the DIH is like
> > below for solr core:
> >
> > <entity>
> > 16 child entities
> > </entity>
> >
> > During indexing, since the number of requests being made to database was
> > high(to process one document 17 queries) and was utilizing most of
> > connections of database thereby blocking our web application.
>
> If you have 17 entities, then one document will indeed take 17 queries.
> That's the nature of multiple DIH entities.
>
> > To tackle it, we implemented SORTEDMAPBACKEDCACHE with cacheImpl
> > parameter to reduce the number of requests to database.
>
> When you use SortedMapBackedCache on an entity, you are asking Solr to
> store the results of the entire query in memory, even if you don't need all
> of the results. If the database has a lot of rows, that's going to take a
> lot of memory.
>
> In your excerpt from the config, your inner entity doesn't have a WHERE
> clause. Which means that it's going to retrieve all of the rows of the ABC
> table for *EVERY* single entry in the DEF table. That's going to be
> exceptionally slow. Normally the SQL query on inner entities will have
> some kind of WHERE clause that limits the results to rows that match the
> entry from the outer entity.
>
> You may need to write a custom indexing program that runs separately from
> Solr, possibly on an entirely different server. That might be a lot more
> efficient than DIH.
>
> Thanks,
> Shawn
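
A note on the exception quoted above: the message "expect increasing foreign keys" suggests that the zipper join wants the child rows (and, presumably, the parent rows) to arrive in ascending key order, whereas both queries in the quoted config sort with DESC. The failing pair, QA-HQ008880 followed by HQ011782, is consistent with that. Below is a minimal sketch of the same excerpt with ascending order, assuming nothing else needs to change. The closing tag for the outer entity and the comment are additions for illustration only, and the other child entities are left out.

```xml
<!-- Sketch only: same tables and columns as the quoted excerpt, but both
     queries are sorted ascending on the join key. Remaining child entities
     are omitted. -->
<entity name="PARENT" pk="PQRS"
        query="SELECT PQRS,PARENT_KEY,L,M,N,O FROM DEF ORDER BY PARENT_KEY ASC">

  <field name="L" column="L" />
  <field name="M" column="M" />
  <field name="N" column="N" />

  <entity name="childentity1" pk="PQRS"
          query="SELECT A,B,C,D,E,F,CHILD_KEY,MODIFY_TS FROM ABC ORDER BY CHILD_KEY ASC"
          processor="SqlEntityProcessor" join="zipper"
          where="CHILD_KEY=PARENT.PARENT_KEY">

    <field name="A" column="A" />
    <field name="B" column="B" />
  </entity>
</entity>
```

One thing worth verifying if the error persists with ASC: the database's sort order for these string keys may not match the comparison Solr applies on its side (collation and case rules can differ), so keys with mixed prefixes such as QA-HQ... and HQ... could still be reported as out of order.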