Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-11 Thread Shawn Heisey
On 7/11/2019 9:04 AM, Joseph_Tucker wrote: Looks like I've managed to get some semblance of this working. The indexes are much faster, but the RAM usage by SolrJ is quite high. Is it normal to see around 6GB of RAM usage? (My test is indexing 250,000 records with the 50 child entities)

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-11 Thread Joseph_Tucker
Thanks for the help. Looks like I've managed to get some semblance of this working. The indexes are much faster, but the RAM usage by SolrJ is quite high. Is it normal to see around 6GB of RAM usage? (My test is indexing 250,000 records with the 50 child entities) In short, I'm running through

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-08 Thread Jörn Franke
Ideally you use scripts that can run on the JVM - in this way you can always use the latest SolrJ client library but also other libraries that are relevant (e.g. Tika for unstructured content). This does not have to be Java directly but can also be based on Scala or JVM scripting languages, such as

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-08 Thread Alexandre Rafalovitch
You may also want to look at the existing systems, such as https://nifi.apache.org/ Regards, Alex. On Mon, 8 Jul 2019 at 08:23, Joseph_Tucker wrote: > > Thanks again. > > I guess I'll have to start researching how to create such custom indexing > scripts and determine which language would be

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-08 Thread Joseph_Tucker
Thanks again. I guess I'll have to start researching how to create such custom indexing scripts and determine which language would be best based on the environment I'm using (Azure in this case). Appreciate the help greatly. Charlie Hull-3 wrote > On 05/07/2019 14:33, Joseph_Tucker wrote:

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-05 Thread Charlie Hull
On 05/07/2019 14:33, Joseph_Tucker wrote: Thanks for your help / suggestion. I'm not sure I completely follow in this case. SolrJ looks like a method to allow Java applications to talk to Solr, or any other third party application would simply be a communication method between Solr and the

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-05 Thread Joseph_Tucker
Thanks for your help / suggestion. I'm not sure I completely follow in this case. SolrJ looks like a method to allow Java applications to talk to Solr, or any other third party application would simply be a communication method between Solr and the language of your choosing. I guess what I'm

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-05 Thread Alexandre Rafalovitch
I don't think you should be designing this around DIH. It was never planned for complex scenarios, nor to be particularly fault tolerant, which you may need. Either use SolrJ or third-party tools that integrate with Solr. Regards, Alex On Fri, Jul 5, 2019, 7:43 AM Joseph_Tucker, wrote: >
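
As a rough illustration of what a custom indexer outside DIH might look like, here is a minimal sketch. It is not from the thread, and it does not use SolrJ: it is written in Python and targets Solr's JSON update handler over plain HTTP. The table names (`parent`, `child`), the column names, the `source` field and the update URL are all hypothetical, and `sqlite3` stands in for whatever database drivers the real sources would need. The point is only the shape: stream parent rows, attach child documents from each source, and send fixed-size batches so that memory use stays bounded.

```python
import json
import sqlite3
import urllib.request
from itertools import islice


def iter_docs(parent_conn, child_conns):
    """Yield one Solr document per parent row, with child rows nested.

    child_conns maps a (hypothetical) source name to a DB connection.
    Table and column names here are assumptions for illustration.
    """
    parents = parent_conn.execute("SELECT id, name FROM parent ORDER BY id")
    for row_id, name in parents:
        doc = {"id": str(row_id), "name": name, "_childDocuments_": []}
        for source, conn in child_conns.items():
            children = conn.execute(
                "SELECT id, value FROM child WHERE parent_id = ? ORDER BY id",
                (row_id,),
            )
            for child_id, value in children:
                doc["_childDocuments_"].append(
                    {
                        "id": f"{row_id}-{source}-{child_id}",
                        "source": source,
                        "value": value,
                    }
                )
        yield doc


def batched(docs, size):
    """Group an iterable of documents into lists of at most `size` items."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def post_batch(update_url, batch):
    """POST one batch as a JSON array to a Solr update handler.

    update_url is an assumption, e.g.
    "http://localhost:8983/solr/mycore/update".
    """
    request = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A driver loop would be along the lines of `for batch in batched(iter_docs(parent, sources), 1000): post_batch(url, batch)`, followed by a single commit at the end. Note that this sketch issues one child query per parent per source, which is simple but slow at scale; with around 50 sources it would probably be better to stream each child source ordered by `parent_id` and merge the streams.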

Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-05 Thread Joseph_Tucker
What is the best way - performance-wise - to index data from multiple databases? I'm potentially going to have around 50 different data sources grabbing unique data. Here's what I've roughly designed: ... I've excluded fields but each entity