Hi Sebastian,

What database are you using? How much RAM is available on your machine? It
looks like you're selecting from a view... Have you tried paging through
the view outside of Solr? Does that slow down as well? Do you notice any
increased load on the Solr box or the database server?
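
For comparing notes, here is a rough sketch of where the two settings
mentioned in the quoted message (batchSize and autoCommit) usually live.
It assumes the default JDBC data source. The connection values stay masked,
and the 15000 ms maxTime is only an illustrative assumption, so treat this
as a guess at the setup rather than the actual config:

```xml
<!-- dataimport-handler.xml (sketch): batchSize is handed to the JDBC driver
     as a fetch-size hint. Connection values are left masked, as in the
     quoted message. -->
<dataSource name="local"
            driver="*************"
            url="*************"
            user="*************"
            password="*************"
            batchSize="10000" />

<!-- solrconfig.xml (sketch): autoCommit disabled by wrapping it in a comment.
     The maxTime value below is an assumed example, not taken from the thread.
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
-->
```

If the database turns out to be MySQL, batchSize="-1" is the value usually
suggested, so that the driver streams rows instead of buffering the whole
result set. Other drivers generally treat the value as a plain fetch size.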



Michael Della Bitta

Applications Developer

o: +1 646 532 3062  | c: +1 917 477 7906

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions <https://twitter.com/Appinions> | g+:
plus.google.com/appinions
w: appinions.com <http://www.appinions.com/>


On Thu, Jun 6, 2013 at 6:13 AM, Sebastian Steinfeld <
sebastian.steinf...@mgm-tp.com> wrote:

> Hi,
>
> I am new to Solr, and we want to use it to speed up our product search.
> It is working really nicely, but I think I have a problem with
> indexing: it slows down after a few minutes.
>
> I am using the DataImportHandler to import the products from the database.
> And I start the import by executing the following HTTP request:
> /dataimport?command=full-import&clean=true&commit=true
>
> I guess these are the important parts of my configuration:
>
> schema.xml:
> ----------------------------------------------
> <fields>
>    <field name="pk"               type="long"        indexed="true"
>  stored="true" required="true"  />
>    <field name="code"             type="string"      indexed="true"
>  stored="true" required="true"  />
>    <field name="ean"              type="string"      indexed="true"
>  stored="false"  />
>    <field name="name"             type="lowercase"   indexed="true"
>  stored="false"  />
>    <field name="text" type="text_general" indexed="true" stored="false"
> multiValued="true"/>
>    <field name="_version_" type="long" indexed="true" stored="true"/>
> </fields>
> ....
>     <fieldType name="lowercase" class="solr.TextField"
> positionIncrementGap="100">
>       <analyzer>
>         <tokenizer class="solr.KeywordTokenizerFactory"/>
>         <filter class="solr.LowerCaseFilterFactory" />
>       </analyzer>
>     </fieldType>
> ----------------------------------------------
>
> solrconfig.xml:
> ----------------------------------------------
>   <requestHandler name="/dataimport"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
>     <lst name="defaults">
>         <str name="config">dataimport-handler.xml</str>
>     </lst>
>   </requestHandler>
> ----------------------------------------------
>
> dataimport-handler.xml:
> ----------------------------------------------
> <dataConfig>
>     <dataSource name="local" driver="*************"
>                 url="*************"
>                 user="*************"
>                 password="*************"
>                 />
>    <document>
>             <entity name="product" pk="PRODUCTS_PK" dataSource="local"
>                         query="SELECT   PRODUCTS_PK, PRODUCTS_CODE,
> PRODUCTS_EAN, PRODUCTSLP_NAME FROM V_SOLR_IMPORT4PRODUCT_SEARCH">
>             <field column="PRODUCTS_PK"       name="pk" />
>             <field column="PRODUCTS_CODE"     name="code" />
>             <field column="PRODUCTS_EAN"      name="ean" />
>             <field column="PRODUCTSLP_NAME"   name="name" />
>         </entity>
>     </document>
> </dataConfig>
> ----------------------------------------------
>
> The number of documents I want to index is 8 million. The first 1.6 million
> are indexed in 2 minutes, but the complete import takes nearly 2 hours.
> The size of the index on the hard drive is 610 MB.
> I started the Solr server with 2 GB of memory.
>
>
> I read that the duration of indexing might be connected to the batch size,
> so I increased the batchSize in the dataSource to 10,000, but this didn't
> make any difference.
> I also tried to disable the autocommit, which is configured in
> solrconfig.xml. I disabled it by commenting it out, but this also didn't
> make any difference.
>
> It would be really nice if one of you could help me with this problem.
>
> Thank you very much,
> Sebastian
>
>
