oo gotcha. cool, will make sure to check it out and bounce any related questions through here.
thanks! best,

--
*John Blythe*
Product Manager & Lead Developer
251.605.3071 | j...@curvolabs.com
www.curvolabs.com
58 Adams Ave
Evansville, IN 47713

On Thu, May 26, 2016 at 1:45 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> Solr commits aren't the issue I'd guess. All the time is
> probably being spent getting the data from MySQL.
>
> I've had some luck writing to Solr from a DB through a
> SolrJ program; here's a place to get started:
> searchhub.org/2012/02/14/indexing-with-solrj/
> You can peel out the Tika bits pretty easily, I should think.
>
> One technique I've used is to cache some of the DB tables
> in Java's memory to keep from having to do the secondary
> lookup(s). This only really works if the "secondary table" is
> small enough to fit in Java's memory, of course. You can do
> some creative things with caching partial tables if you can
> sort appropriately.
>
> Best,
> Erick
>
> On Thu, May 26, 2016 at 9:01 AM, John Blythe <j...@curvolabs.com> wrote:
> > hi all,
> >
> > i've got layered entities in my solr import. it's calling on some
> > transactional data from a MySQL instance. there are two fields that are
> > used to then look up other information from other tables via their related
> > UIDs, one of which has its own child entity with yet another select
> > statement to grab up more data.
> >
> > it fetches at about 120/s but processes at ~50-60/s. we currently only
> > have close to 500k records, but it's growing quickly and thus is becoming
> > increasingly painful to make modifications due to the reimport that needs
> > to then occur.
> >
> > i feel like i'd seen some threads regarding commits of new data,
> > master/slave, or solrcloud/sharding that could help in some ways related
> > to this but as of yet can't scrounge them up with my searches (ironic :p).
> >
> > can someone help by pointing me to some good material related to this
> > sort of thing?
> >
> > thanks-
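Erick's caching suggestion could be sketched roughly like this: load the small secondary table into a `HashMap` once, up front, so each transaction row resolves its related UID with an in-memory get instead of a per-row SELECT. This is only an illustrative sketch, not code from the thread; the class, table, and field names (`VendorCache`, `uid`, `name`) are made up, and in a real import the rows would come from a JDBC `ResultSet` rather than a prebuilt list.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of caching a small "secondary table" in memory
 * so the import loop avoids issuing a lookup query per row.
 */
public class VendorCache {
    private final Map<String, String> byUid = new HashMap<>();

    /**
     * Build the cache once before the import loop. In practice each
     * row would come from iterating a JDBC ResultSet over something
     * like "SELECT uid, name FROM vendors" (table name is made up).
     */
    public VendorCache(List<String[]> rows) {
        for (String[] row : rows) {
            byUid.put(row[0], row[1]); // uid -> name
        }
    }

    /** O(1) in-memory lookup, replacing the secondary SQL query. */
    public String nameFor(String uid) {
        return byUid.getOrDefault(uid, "unknown");
    }
}
```

As Erick notes, this only pays off when the secondary table fits comfortably in the JVM heap; for larger tables you'd cache a sorted partial slice and refill it as the import advances.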