Oh sure! As best as I can, anyway.

I have not set the Java heap size, or really configured Java at all.

The server running both SQL Server and Solr has:
* 2 Intel Xeon X5660 (each one is 2.8 GHz, 6 cores, 12 logical processors)
* 64 GB RAM
* One Solr instance (no shards)

I'm not using faceting.
My schema has these fields:
  <field name="Id" type="string" indexed="true" stored="true" />
  <field name="RecordId" type="int" indexed="true" stored="true" />
  <field name="RecordType" type="string" indexed="true" stored="true" />
  <field name="Name" type="LikeText" indexed="true" stored="true" termVectors="true" />
  <field name="NameFuzzy" type="FuzzyText" indexed="true" stored="true" termVectors="true" />
  <copyField source="Name" dest="NameFuzzy" />
  <field name="NameType" type="string" indexed="true" stored="true" />

Custom types:

* LikeText
        PatternReplaceCharFilterFactory ("\W+" => "")
        KeywordTokenizerFactory
        StopFilterFactory (~40 words in stoplist)
        ASCIIFoldingFilterFactory
        LowerCaseFilterFactory
        EdgeNGramFilterFactory
        LengthFilterFactory (min:3, max:512)

* FuzzyText
        PatternReplaceCharFilterFactory ("\W+" => "")
        KeywordTokenizerFactory
        StopFilterFactory (~40 words in stoplist)
        ASCIIFoldingFilterFactory
        LowerCaseFilterFactory
        NGramFilterFactory
        LengthFilterFactory (min:3, max:512)
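
To give a feel for how many terms each chain emits for a single name, here is a rough Python sketch. It is not Solr code. The gram sizes are not listed above, so MIN_GRAM and MAX_GRAM are assumptions; stopword removal and ASCII folding are left out, and the sample name is a made-up plain-ASCII value.

```python
import re

# Assumed gram sizes: the message does not give minGramSize/maxGramSize.
MIN_GRAM, MAX_GRAM = 3, 15
# LengthFilterFactory bounds as listed in the message.
MIN_LEN, MAX_LEN = 3, 512


def normalize(name):
    """Approximate the shared front of both chains for one field value."""
    # PatternReplaceCharFilterFactory: "\W+" => ""
    text = re.sub(r"\W+", "", name)
    # KeywordTokenizerFactory keeps the whole value as one token;
    # LowerCaseFilterFactory then lowercases it.
    return text.lower()


def edge_ngrams(token):
    """Prefixes of the token, as an edge n-gram filter would emit."""
    longest = min(MAX_GRAM, len(token))
    return [token[:n] for n in range(MIN_GRAM, longest + 1)]


def ngrams(token):
    """Every substring of each allowed length, as an n-gram filter would emit."""
    longest = min(MAX_GRAM, len(token))
    return [
        token[i:i + n]
        for n in range(MIN_GRAM, longest + 1)
        for i in range(len(token) - n + 1)
    ]


def length_filter(tokens):
    """Keep only tokens within the LengthFilterFactory bounds."""
    return [t for t in tokens if MIN_LEN <= len(t) <= MAX_LEN]


def like_text(name):
    return length_filter(edge_ngrams(normalize(name)))


def fuzzy_text(name):
    return length_filter(ngrams(normalize(name)))


sample = "Acme Corp."  # hypothetical name, normalizes to "acmecorp"
print(len(like_text(sample)), like_text(sample))
print(len(fuzzy_text(sample)))
```

Under these assumptions the 8-character token yields 6 LikeText terms and 21 FuzzyText terms. Because of the copyField, every Name value goes through both chains, and the n-gram count grows roughly quadratically with name length whenever the maximum gram size is at least as long as the name.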

Devon Baumgarten


-----Original Message-----
From: Glen Newton [mailto:glen.new...@gmail.com] 
Sent: Wednesday, February 22, 2012 9:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Unusually long data import time?

Import times will depend on:
- hardware (speed of disks, CPU speed, number of CPUs, amount of memory, etc)
- Java configuration (heap size, etc)
- Lucene/Solr configuration (many ...)
- Index configuration - how many fields, indexed how; faceting, etc
- OS configuration (usually to a lesser degree; _usually_)
- Network issues if non-local
- DB configuration (driver, etc)

If you can give more information about the above, people on this list
should be able to better indicate whether 18 hours sounds right for
your situation.

-Glen Newton

On Wed, Feb 22, 2012 at 10:14 AM, Devon Baumgarten
<dbaumgar...@nationalcorp.com> wrote:
> Hello,
>
> Would it be unusual for an import of 160 million documents to take 18 hours?  
> Each document is less than 1 KB and I have the DataImportHandler using the
> JDBC driver to connect to SQL Server 2008. The full-import query calls a
> stored procedure that contains only a select from my target table.
>
> Is there any way I can speed this up? I recently saw someone on this list
> suggest that a new user could get all their Solr data imported in under an hour.
> I sure hope that's true!
>
>
> Devon Baumgarten
>
>
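
For scale, the figures in the quoted question (160 million documents, 18 hours, each under 1 KB) can be turned into a throughput number. A quick Python check of that arithmetic, treating 1 KB as 1024 bytes and using it only as an upper bound on document size:

```python
DOCS = 160_000_000      # documents in the import, from the question
HOURS = 18              # observed import time, from the question
MAX_DOC_BYTES = 1024    # "less than 1 KB" per document, used as an upper bound

# Observed rate over the 18-hour import.
docs_per_sec = DOCS / (HOURS * 3600)

# Upper bound on raw data rate, since every document is under 1 KB.
max_mib_per_sec = docs_per_sec * MAX_DOC_BYTES / 2**20

# Rate that would be needed to finish the same import in one hour.
one_hour_rate = DOCS / 3600

print(f"observed: {docs_per_sec:.0f} docs/s, under {max_mib_per_sec:.2f} MiB/s")
print(f"needed for a one-hour import: {one_hour_rate:.0f} docs/s")
```

That is about 2,469 documents per second observed (under 2.41 MiB/s of raw data), against roughly 44,444 documents per second for a one-hour import.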



-- 
-
http://zzzoot.blogspot.com/
-
