Re: Full import alternatives

2019-03-04 Thread sami
Dear Furkan, I did. What I am not able to understand at the moment is how to run Solr indexing in parallel. So far, I have figured out that we can run indexing with SolrJ using an XML file: http://lucene.472066.n3.nabble.com/Index-database-with-SolrJ-using-xml-file-directly-throws-an-error-td4426491.html

Re: Full import alternatives

2019-03-04 Thread Furkan KAMACI
Hi Sami, did you check the delta-import documentation? https://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command Kind Regards, Furkan KAMACI On Thu, Feb 28, 2019 at 7:24 PM sami wrote: > Hi Shawn, can you please suggest a small program or at least a backbone of a program which

Re: Full import alternatives

2019-02-28 Thread sami
Hi Shawn, can you please suggest a small program, or at least the backbone of a program, that hints at how exactly to achieve the following? I quote: "I send a full-import DIH command to all of the shards, and each one makes an SQL query to MySQL, all of them running in parallel." -- Sent from:
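A minimal sketch of the quoted approach, not Shawn's actual program: it sends a DIH `full-import` command to several cores at once from a small thread pool. The shard URLs, the `/dataimport` handler path (it depends on how the handler is registered in solrconfig.xml), and the helper names are assumptions for illustration. The HTTP call is injectable so the logic can be tried without a running Solr.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical core URLs; replace with your own hosts and core names.
SHARD_URLS = [
    "http://localhost:8983/solr/shard1",
    "http://localhost:8983/solr/shard2",
]


def full_import_url(core_url, clean=True, commit=True):
    """Build a DIH full-import URL for one core.

    Assumes the handler is registered at /dataimport.
    """
    params = urlencode({
        "command": "full-import",
        "clean": str(clean).lower(),
        "commit": str(commit).lower(),
    })
    return f"{core_url.rstrip('/')}/dataimport?{params}"


def http_get(url):
    """Issue a plain GET and return the HTTP status code."""
    with urlopen(url, timeout=30) as resp:
        return resp.status


def start_full_imports(core_urls, fetch=http_get):
    """Send the full-import command to every core concurrently.

    Returns a dict mapping each core URL to whatever `fetch` returned.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(core_urls))) as pool:
        results = pool.map(lambda u: fetch(full_import_url(u)), core_urls)
        return dict(zip(core_urls, results))
```

Calling `start_full_imports(SHARD_URLS)` only starts the imports; as far as I know the DIH request returns before the import finishes, so progress would still have to be polled with `command=status` on each core.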

Re: Full import alternatives

2018-04-13 Thread Shawn Heisey
On 4/13/2018 11:34 AM, Jesus Olivan wrote: > first of all, thanks for your answer. > > How do you import these 6 shards simultaneously? I'm not running in SolrCloud mode, so Solr doesn't know that each shard is part of a larger index. What I'm doing would probably not work in SolrCloud mode without

Re: Full import alternatives

2018-04-13 Thread Jesus Olivan
Hi Shawn, first of all, thanks for your answer. How do you import these 6 shards simultaneously? 2018-04-13 19:30 GMT+02:00 Shawn Heisey : > On 4/13/2018 11:03 AM, Jesus Olivan wrote: > > thanks for your answer. It happens that when we launch the full import, the process didn't

Re: Full import alternatives

2018-04-13 Thread Shawn Heisey
On 4/13/2018 11:03 AM, Jesus Olivan wrote: > thanks for your answer. It happens that when we launch the full import, the process didn't finish (we waited more than 60 hours last time and then cancelled it, because that is not an acceptable time for us). There weren't any errors in the Solr logfile

Re: Full import alternatives

2018-04-13 Thread Jesus Olivan
Hi Shawn, thanks for your answer. It happens that when we launch the full import, the process didn't finish (we waited more than 60 hours last time and then cancelled it, because that is not an acceptable time for us). There weren't any errors in the Solr logfile, simply because it was working fine. The

Re: Full import alternatives

2018-04-13 Thread Mikhail Khludnev
Jesus, usually a zipper join (aka external merge in the old ETL world) and explicit partitioning are able to speed up the import. https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html#entity-processors On Fri, Apr 13, 2018 at 7:11 PM, Jesus Olivan
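To illustrate the zipper (merge) join idea in general terms, here is a small sketch. It is not DIH's implementation. It assumes both inputs arrive already sorted ascending by the join key, as two `ORDER BY` queries would deliver them, and that parent keys are unique. The function and field names are hypothetical.

```python
def zipper_join(parents, children, parent_key, child_key):
    """Merge-join two iterables sorted ascending by the join key.

    Yields (parent, [matching children]) while reading each input once,
    instead of issuing one child query per parent row.
    """
    child_iter = iter(children)
    child = next(child_iter, None)
    for parent in parents:
        pk = parent_key(parent)
        # Skip children whose key sorts before this parent (orphan rows).
        while child is not None and child_key(child) < pk:
            child = next(child_iter, None)
        matched = []
        # Collect every child that shares the parent's key.
        while child is not None and child_key(child) == pk:
            matched.append(child)
            child = next(child_iter, None)
        yield parent, matched


# Usage with in-memory rows standing in for two sorted result sets.
docs = [{"id": 1}, {"id": 2}, {"id": 3}]
tags = [{"doc_id": 1, "tag": "a"}, {"doc_id": 1, "tag": "b"}, {"doc_id": 3, "tag": "c"}]
joined = list(zipper_join(docs, tags, lambda d: d["id"], lambda t: t["doc_id"]))
```

The gain comes from replacing N child lookups with one sequential pass over each sorted stream, which is why the sort order on both sides matters.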

Re: Full import alternatives

2018-04-13 Thread Erick Erickson
_how_ are you importing? DIH? SolrJ? Here's an article about using SolrJ https://lucidworks.com/2012/02/14/indexing-with-solrj/ But without more details it's really impossible to say much. Things I've done in the past: 1> use SolrJ and partition the job up amongst a bunch of clients each of
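Erick's message is cut off here, so how he actually split the work is not shown. One plausible way to partition a job among several indexing clients, assuming the source table has a numeric primary key, is to hand each client a contiguous ID range. The helper below is a hypothetical sketch of that split only; each client would then run its own range query and send documents to Solr.

```python
def partition_id_range(min_id, max_id, num_clients):
    """Split the inclusive range [min_id, max_id] into contiguous chunks.

    Returns a list of (start, end) tuples, at most one per client, with
    sizes differing by at most one row ID.
    """
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    total = max_id - min_id + 1
    if total <= 0:
        return []
    base, extra = divmod(total, num_clients)
    ranges = []
    start = min_id
    for i in range(num_clients):
        # The first `extra` clients take one additional ID each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # More clients than IDs; the rest get nothing.
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Usage: three clients sharing IDs 1..10.
chunks = partition_id_range(1, 10, 3)
```

Contiguous ranges keep each client's query a simple `BETWEEN` on the key, but they only balance well if IDs are spread fairly evenly; with large gaps, some clients would finish much earlier than others.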

Re: Full import alternatives

2018-04-13 Thread Shawn Heisey
On 4/13/2018 10:11 AM, Jesus Olivan wrote: > we're trying to launch a full import of approx. 375 million docs from a MySQL database to our SolrCloud cluster. Until now, this full import process has taken around 24 to 27 hours to finish due to a huge import query (several group bys, left joins,

Full import alternatives

2018-04-13 Thread Jesus Olivan
Hi! We're trying to launch a full import of approx. 375 million docs from a MySQL database to our SolrCloud cluster. Until now, this full import process has taken around 24 to 27 hours to finish due to a huge import query (several group bys, left joins, etc), but after another import query