Hey there, I ran into this problem and fixed it with the patch. But if I had 5000000 rows to modify, would the OutOfMemory problem appear again?
Would it be a good solution to run the query with a limit of 100000, and keep doing that until there are no more docs left to update? Every time a query is run and its data is persisted, I would set the maps to null. Would this be a good way to make DataImportHandler more scalable?

Thanks in advance


JIRA [email protected] wrote:
>
>
>     [
> https://issues.apache.org/jira/browse/SOLR-846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655574#action_12655574
> ]
>
> Shalin Shekhar Mangar commented on SOLR-846:
> --------------------------------------------
>
> Committed revision 725627.
>
> I've committed Noble's patch, however as he noted, it is only a partial
> solution. I'm in favor of streaming it however that will be an invasive
> change. Let's keep this issue open until we can implement a better
> solution.
>
>> Out Of memory doing delta import with fetch size set to -1
>> ----------------------------------------------------------
>>
>>                 Key: SOLR-846
>>                 URL: https://issues.apache.org/jira/browse/SOLR-846
>>             Project: Solr
>>          Issue Type: Bug
>>          Components: contrib - DataImportHandler
>>    Affects Versions: 1.3
>>         Environment: Linux 2.6.18-92.1.13.el5xen, mysql 5.0
>>            Reporter: Ricky Leung
>>         Attachments: SOLR-846.patch
>>
>>
>> Database has about 3 million records. Doing full-import there is no
>> problem. However, when a large number of changes occurred 2558057,
>> delta-import throws OutOfMemory error after 1288338 documents processed.
>> The stack trace is below
>> Exception in thread "Thread-3" java.lang.OutOfMemoryError: Java heap space
>>     at org.tartarus.snowball.ext.EnglishStemmer.<init>(EnglishStemmer.java:49)
>>     at org.apache.solr.analysis.EnglishPorterFilter.<init>(EnglishPorterFilterFactory.java:83)
>>     at org.apache.solr.analysis.EnglishPorterFilterFactory.create(EnglishPorterFilterFactory.java:66)
>>     at org.apache.solr.analysis.EnglishPorterFilterFactory.create(EnglishPorterFilterFactory.java:35)
>>     at org.apache.solr.analysis.TokenizerChain.tokenStream(TokenizerChain.java:48)
>>     at org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer.tokenStream(IndexSchema.java:348)
>>     at org.apache.lucene.analysis.Analyzer.reusableTokenStream(Analyzer.java:44)
>>     at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:117)
>>     at org.apache.lucene.index.DocFieldConsumersPerField.processFields(DocFieldConsumersPerField.java:36)
>>     at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:234)
>>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:765)
>>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:748)
>>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2118)
>>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2095)
>>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:232)
>>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
>>     at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:69)
>>     at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:288)
>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:319)
>>     at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>>     at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>>     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>> dataSource in data-config.xml has been with the batchSize of "-1".
>>    <dataSource driver="com.mysql.jdbc.Driver"
>> url="jdbc:mysql://host/dbname"
>> user="*" password="*" batchSize="-1"/>
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>

--
View this message in context: http://www.nabble.com/-jira--Created%3A-%28SOLR-846%29-Out-Of-memory-doing-delta-import-with-fetch-size-set-to--1-tp20441742p21145545.html
Sent from the Solr - Dev mailing list archive at Nabble.com.
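
To make the limit-based batching idea from the top of this message a bit more concrete, here is a rough sketch of the loop I have in mind. It is not DataImportHandler code: `BatchedDeltaSketch`, `ChangedIdSource`, and `run` are hypothetical names, and the table and column names in the comment are made up. One assumption worth flagging: re-running the same LIMIT query would return the same rows unless something advances between runs, so the sketch pages by the last id seen and assumes the changed rows have an increasing numeric id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: none of these names come from DataImportHandler. */
final class BatchedDeltaSketch {

    /** Returns at most `limit` changed ids greater than `afterId`, in ascending order. */
    interface ChangedIdSource {
        List<Long> fetch(long afterId, int limit);
    }

    /**
     * Walks the changed rows one bounded batch at a time, so that only
     * `batchSize` ids are held in memory at once.
     * Returns the total number of ids handed to `indexer`.
     */
    static long run(ChangedIdSource source, int batchSize, Consumer<List<Long>> indexer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long lastId = Long.MIN_VALUE; // cursor: ids at or below this are already handled
        long total = 0;
        while (true) {
            List<Long> batch = source.fetch(lastId, batchSize);
            if (batch.isEmpty()) {
                break; // nothing left to update
            }
            indexer.accept(batch); // index and persist this batch
            total += batch.size();
            lastId = batch.get(batch.size() - 1); // advance the cursor
            // `batch` goes out of scope on the next iteration and can be collected
        }
        return total;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a query along the lines of (made-up schema):
        //   SELECT id FROM item WHERE last_modified > ? AND id > ? ORDER BY id LIMIT ?
        List<Long> changed = new ArrayList<>();
        for (long i = 1; i <= 25; i++) {
            changed.add(i);
        }
        ChangedIdSource source = (afterId, limit) -> {
            List<Long> out = new ArrayList<>();
            for (Long id : changed) {
                if (id > afterId && out.size() < limit) {
                    out.add(id);
                }
            }
            return out;
        };
        long total = run(source, 10, batch -> System.out.println("indexing " + batch.size() + " ids"));
        System.out.println("total = " + total);
    }
}
```

With 25 changed ids and a batch size of 10, the loop runs three batches (10, 10, 5) and then stops on the empty fetch. Whether this is better than streaming the result set is a separate question; it only bounds how many ids are held at once.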
