Did you get a chance to test with the patch? did it work?

On Wed, Oct 1, 2008 at 10:13 AM, Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]> wrote:
> this patch is created from 1.3 (may apply on trunk also)
> --Noble
>
> On Wed, Oct 1, 2008 at 9:56 AM, Noble Paul നോബിള് नोब्ळ्
> <[EMAIL PROTECTED]> wrote:
>> I guess it is a threading problem. I can give you a patch. you can raise a
>> bug
>> --Noble
>>
>> On Wed, Oct 1, 2008 at 2:11 AM, KyleMorrison <[EMAIL PROTECTED]> wrote:
>>>
>>> As a follow up: I continued tweaking the data-config.xml, and have been able
>>> to make the commit fail with as little as 3 fields in the sdc.xml, with only
>>> one multivalued field. Even more strange, some fields work and some do not.
>>> For instance, in my dc.xml:
>>>
>>> <field column="Taxon"
>>> xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Taxonomy/Lineage/Taxon"
>>> />
>>> .
>>> .
>>> .
>>> <field column="GenPept"
>>> xpath="/iProClassDatabase/iProClassEntry/GENERAL_INFORMATION/Protein_Name_and_ID/GenPept"
>>> />
>>>
>>> and in the schema.xml:
>>> <field name="GenPept" type="text" indexed="true" stored="false"
>>> multiValued="true" />
>>> .
>>> .
>>> .
>>> <field name="Taxon" type="text" indexed="true" stored="false"
>>> multiValued="true" />
>>> but taxon works and genpept does not. What could possibly account for this
>>> discrepancy? Again, the error logs from the server are exactly that seen in
>>> the first post.
>>>
>>> What is going on?
>>>
>>>
>>> KyleMorrison wrote:
>>>>
>>>> Yes, this is the most recent version of Solr, stream="true" and stopwords,
>>>> lowercase and removeDuplicate being applied to all multivalued fields?
>>>> Would the filters possibly be causing this? I will not use them and see
>>>> what happens.
>>>>
>>>> Kyle
>>>>
>>>>
>>>> Shalin Shekhar Mangar wrote:
>>>>>
>>>>> Hmm, strange.
>>>>>
>>>>> This is Solr 1.3.0, right? Do you have any transformers applied to these
>>>>> multi-valued fields? Do you have stream="true" in the entity?
>>>>>
>>>>> On Tue, Sep 30, 2008 at 11:01 PM, KyleMorrison <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> I apologize for spamming this mailing list with my problems, but I'm at
>>>>>> my wits end. I'll get right to the point.
>>>>>>
>>>>>> I have an xml file which is ~1GB which I wish to index. If that is
>>>>>> successful, I will move to a larger file of closer to 20GB. However,
>>>>>> when I run my data-config(let's call it dc.xml) over it, the import only
>>>>>> manages to get about 27 rows, out of roughly 200K. The exact same
>>>>>> data-config(dc.xml) works perfectly on smaller data files of the same
>>>>>> type.
>>>>>>
>>>>>> This data-config is quite large, maybe 250 fields. When I run a smaller
>>>>>> data-config (let's call it sdc.xml) over the 1GB file, the sdc.xml works
>>>>>> perfectly. The only conclusion I can draw from this is that the
>>>>>> data-config method just doesn't scale well.
>>>>>>
>>>>>> When the dc.xml fails, the server logs spit out:
>>>>>>
>>>>>> Sep 30, 2008 11:40:18 AM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=95
>>>>>> Sep 30, 2008 11:40:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>>>>>> INFO: Starting Full Import
>>>>>> Sep 30, 2008 11:40:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>>>>>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>>>>>> Sep 30, 2008 11:40:20 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>>>>>> SEVERE: Full Import failed
>>>>>> java.util.ConcurrentModificationException
>>>>>>   at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>>>>>   at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>>>>>   at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>>>>>   at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>>>>>   at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>>>>> Sep 30, 2008 11:41:18 AM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=77
>>>>>> Sep 30, 2008 11:41:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>>>>>> INFO: Starting Full Import
>>>>>> Sep 30, 2008 11:41:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>>>>>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>>>>>> Sep 30, 2008 11:41:19 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>>>>>> SEVERE: Full Import failed
>>>>>> java.util.ConcurrentModificationException
>>>>>>   at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>>>>>   at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>>>>>   at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>>>>>   at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>>>>>   at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>>>>>   at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>>>>>
>>>>>> This mass of exceptions DOES NOT occur when I perform the same
>>>>>> full-import with sdc.xml. As far as I can tell, the only difference
>>>>>> between the two files is the amount of fields they contain.
>>>>>>
>>>>>> Any guidance or information would be greatly appreciated.
>>>>>> Kyle
>>>>>>
>>>>>>
>>>>>> PS The schema.xml in use specifies almost all fields as multivalued, and
>>>>>> has a copyfield for almost every field. I can fix this if it is causing
>>>>>> my problem, but I would prefer not to.
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19746831.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Shalin Shekhar Mangar.
>>>>>
>>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19749991.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>
>> --
>> --Noble Paul
>>
>
> --
> --Noble Paul
>

--
--Noble Paul
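
For anyone reading the trace in the quoted message: java.util.ConcurrentModificationException comes from the fail-fast iterators of the java.util collections. It is raised when a list is structurally modified while it is being iterated, whether by another thread or by the iterating thread itself. The sketch below only illustrates that general mechanism and one common way around it (iterating over a snapshot). It is not the patch mentioned above and it is not DocBuilder's code; the class CmeSketch and both method names are hypothetical, made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical illustration only; not taken from Solr or from the patch.
class CmeSketch {

    // Appends to the list while a for-each loop is iterating over it.
    // Returns true if the fail-fast iterator detected the modification.
    static boolean failsWhenModifiedDuringIteration() {
        List<String> values = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String v : values) {
                values.add(v + "-copy"); // structural change during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Same intent, but iterates over a snapshot so the live list can be
    // appended to without disturbing the iterator.
    static List<String> appendCopiesSafely(List<String> values) {
        List<String> snapshot = new ArrayList<>(values);
        for (String v : snapshot) {
            values.add(v + "-copy");
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println("CME raised: " + failsWhenModifiedDuringIteration());
        List<String> live = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println("Safe result: " + appendCopiesSafely(live));
    }
}
```

A snapshot copy only addresses the single-threaded case shown here. If two threads really do share the list, as the "threading problem" guess in the thread suggests, the access would also need synchronization or a concurrent collection.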