On 7/11/2019 9:04 AM, Joseph_Tucker wrote:
Looks like I've managed to get some semblance of this working.
The indexes are much faster, but the RAM usage by SolrJ is quite high. Is it
normal to see around 6GB of RAM usage?
(My test is indexing 250,000 records with the 50 child entities)
Thanks for the help.
In short, I'm running through
Ideally, use scripts that run on the JVM. That way you can always use the
latest SolrJ client library as well as other relevant libraries (e.g. Tika
for unstructured content).
This does not have to be Java itself; it can also be Scala or a JVM
scripting language, such as
You may also want to look at the existing systems, such as
https://nifi.apache.org/
Regards,
Alex.
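
A minimal sketch of the custom-indexer idea above, which also bears on the memory question raised earlier in the thread. One common way to keep client-side RAM bounded is to stream records and send them in fixed-size batches rather than building every document first. This is not taken from anyone's actual code in this thread: BatchIndexer, indexInBatches and the batch size are hypothetical names and values. The sink is left generic so the sketch runs without Solr; with SolrJ it would typically wrap SolrClient.add(Collection<SolrInputDocument>), with children attached via SolrInputDocument.addChildDocument.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper: streams records to a sink in fixed-size batches so
 * that only one batch needs to be held in memory at a time.
 */
final class BatchIndexer {

    private BatchIndexer() {}

    /**
     * Pulls records from the source and hands them to the sink in batches.
     * Assumption: with SolrJ, the sink would wrap SolrClient.add(Collection).
     *
     * @return the number of batches handed to the sink
     */
    static <T> int indexInBatches(Iterator<T> source, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (source.hasNext()) {
            batch.add(source.next());
            if (batch.size() == batchSize) {
                sink.accept(batch);
                // Start a fresh list so the batch just sent can be garbage collected.
                batch = new ArrayList<>(batchSize);
                batches++;
            }
        }
        // Send whatever is left over after the last full batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
            batches++;
        }
        return batches;
    }
}
```

Whether this helps with the 6GB figure depends on where the memory is actually going; it only addresses the case where all parent and child documents are accumulated before being sent.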
On Mon, 8 Jul 2019 at 08:23, Joseph_Tucker wrote:
Thanks again.
I guess I'll have to start researching how to create such custom indexing
scripts and determine which language would be best based on the environment
I'm using (Azure in this case).
Appreciate the help greatly
Charlie Hull-3 wrote
> On 05/07/2019 14:33, Joseph_Tucker wrote:
Thanks for your help / suggestion.
I'm not sure I completely follow in this case.
SolrJ looks like a way for Java applications to talk to Solr, and any other
third-party library would similarly just be a communication layer between
Solr and the language of your choosing.
I guess what I'm
I don't think you should be designing this around DIH. It was never planned
for complex scenarios, nor to be particularly fault tolerant, which you may need.
Either use SolrJ or a third-party tool that integrates with Solr.
Regards,
Alex
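
To illustrate the fault-tolerance point above: a custom indexer can retry a failed send instead of aborting the whole run. The following is a rough sketch under stated assumptions, not part of SolrJ or DIH. Retry and withRetries are hypothetical names, and the action is a plain Supplier, so checked exceptions from a real client call (for example IOException) would need to be wrapped in a RuntimeException before reaching it.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries an action with exponential backoff. */
final class Retry {

    private Retry() {}

    /**
     * Runs the action, retrying on RuntimeException up to maxAttempts times.
     * The wait between attempts starts at initialDelayMillis and doubles each time.
     */
    static <T> T withRetries(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delay);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while waiting to retry", ie);
                    }
                    delay *= 2;
                }
            }
        }
        // All attempts failed: rethrow the most recent failure.
        throw last;
    }
}
```

In an indexer, the action would be the call that sends one batch, so a transient failure costs one retried batch rather than a full re-index.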
On Fri, Jul 5, 2019, 7:43 AM Joseph_Tucker wrote:
What is the best way - performance wise - to index data from multiple
databases?
I'm potentially going to have around 50 different data sources grabbing
unique data
Here's what I've roughly designed:
...
I've excluded fields but each entity