People here are in different time zones and have their normal jobs, for which
they are actually paid, in addition to answering questions like the one below.
There is also a wide range of resources out on the Internet.
It also cannot hurt to read more about the formats that you are processing, and
there are known performance issues in computing very large clusters
give it a try with the following rules:
"FOO_CUSTOMER": [
  {
    "replica": "0",
    "sysprop.HELM_CHART": "!FOO_CUSTOMER",
    "strict": "true"
  },
  {
    "replica": "<2",
    "node": "#ANY"
  }
]
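Rules like the fragment above are normally applied through the 7.x autoscaling API rather than edited in place; a minimal sketch, where the wrapping in "set-policy" and the endpoint are the standard API shape, not something from this thread:

```json
{
  "set-policy": {
    "FOO_CUSTOMER": [
      { "replica": "0", "sysprop.HELM_CHART": "!FOO_CUSTOMER", "strict": "true" },
      { "replica": "<2", "node": "#ANY" }
    ]
  }
}
```

POSTed with Content-Type: application/json to the cluster autoscaling endpoint (/api/cluster/autoscaling); collections then reference the named policy when they are created.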
Hook up a profiler to the Overseer and see what it's doing; file a JIRA and
note the hotspots, or which methods the threads appear to be hanging out in.
On Tue, Sep 3, 2019 at 1:15 PM Andrew Kettmann
wrote:
>
> > You’re going to want to start by having more than 3gb for memory in my
> opinion but the rest of
Guys, could I get any help here? Or is it useless posting queries on this list?
On Sep 3, 2019 4:00 PM, "Khare, Kushal (MIND)"
wrote:
Hello, mates!
I am extracting content from my documents using Apache Tika.
I need to exclude the headers & footers of the documents. I have already done
this for Word
On 9/3/2019 4:46 PM, Russell Bahr wrote:
Hi Shawn,
Here is a screenshot of one of the master nodes
solr4
Screen Shot 2019-09-03 at 3.37.08 PM.png
solr8
Screen Shot 2019-09-03 at 3.45.46 PM.png
Email attachments do not make it to the list. I cannot see those
pictures. You will need to use a
This really sounds like an XY problem. What do you need the SolrClient _for_? I
suspect there’s an easier way to do this…..
Best,
Erick
> On Sep 3, 2019, at 6:17 PM, Arnold Bronley wrote:
>
> Hi,
>
> Is there a way to create SolrClient from inside processAdd function for
> custom update
Hi Shawn,
Here is a screenshot of one of the master nodes
solr4
[image: Screen Shot 2019-09-03 at 3.37.08 PM.png]
solr8
[image: Screen Shot 2019-09-03 at 3.45.46 PM.png]
*Manzama*a MODERN GOVERNANCE company
Russell Bahr
Lead Infrastructure Engineer
USA & CAN Office: +1 (541) 306 3271
USA & CAN
Hi,
Is there a way to create SolrClient from inside processAdd function for
custom update processor for the same Solr on which it is executing?
On 9/3/2019 1:22 PM, Russell Bahr wrote:
Yes, some of our queries are quite complex due to a lot of very specific
positive as well as negative boosts, however, the query that I ran as the
base test after we found our queries were taking so long is just "
What about combining:
1) KeywordRepeatFilterFactory
2) An existing folding filter (need to check that it ignores Keyword-marked tokens)
3) RemoveDuplicatesTokenFilterFactory
That may give you what you are after without custom coding.
Regards,
Alex.
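The combination Alex describes can be sketched as a field type in the schema. The field type name, tokenizer, and the choice of ASCIIFoldingFilterFactory as the folding filter are assumptions for illustration; his caveat about Keyword-marked tokens still has to be verified for whichever folding filter you pick:

```xml
<fieldType name="text_fold_keep_original" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emits each token twice: the first copy flagged with KeywordAttribute -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <!-- caveat from the thread: check that this filter skips Keyword-marked tokens -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <!-- collapses the two copies whenever folding changed nothing -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```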
On Tue, 3 Sep 2019 at 16:14, Audrey Lorberfeld -
Toke,
Thank you! That makes a lot of sense.
In other news -- we just had a meeting where we decided to try out a hybrid
strategy. I'd love to know what you & everyone else thinks...
- Since we are concerned with the overhead created by "double-fielding" all
tokens per language (because I'm
Hi Toke,
Also, if it helps, the content on each server is between around 6.2GB and
7.8GB.
Thanks,
Russ
*Manzama*a MODERN GOVERNANCE company
Russell Bahr
Lead Infrastructure Engineer
USA & CAN Office: +1 (541) 306 3271
USA & CAN Support: +1 (541) 706 9393
UK Office & Support: +44 (0)203 282
Hi Toke,
Yes, some of our queries are quite complex due to a lot of very specific
positive as well as negative boosts, however, the query that I ran as the
base test after we found our queries were taking so long is just "
http://solr.obscured.com:8990/solr/content/select?q=*%3A*&wt=json&indent=true
"
Our
On 9/3/2019 11:47 AM, dev beautiful wrote:
I want to subscribe to the Solr mailing list.
When I sent a request, I got the following message.
Can you add this email address to the mailing list please?
Thank you.
Louis Choi
---
This is the mail system at host n3.nabble.com.
Nabble is a website
Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote:
> Do you find that searching over both the original title field and the
> normalized title
> field increases the time it takes for your search engine to retrieve results?
It is not something we have measured as that index is fast enough
Hello,
I want to subscribe to the Solr mailing list.
When I sent a request, I got the following message.
Can you add this email address to the mailing list please?
Thank you.
Louis Choi
---
This is the mail system at host n3.nabble.com.
I'm sorry to have to inform you that your message could not
ResourceLoader worked brilliantly - my brain, on the other hand, not so much
--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> You’re going to want to start by having more than 3gb for memory in my
> opinion but the rest of your set up is more complex than I’ve dealt with.
right now the overseer is set to a max heap of 3GB, but is only using ~260MB of
heap, so memory doesn't seem to be the issue unless there is a
Russell Bahr wrote:
> approximately 18 million documents
> *:* query across 10 times returning
> [13234, 18714, 13384, 12966, 12192, 18420, 16592, 15691, 13373, 12458]
>vs
> [93359, 94263, 86949, 90747, 91171, 91588, 87921, 88632, 88035, 89137]
Even the 12-18 seconds for Solr 4 is a long time,
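For reference, the quoted timings (presumably milliseconds) average out as follows; a quick sanity check of the roughly six-fold slowdown:

```python
# Response times from the message, ten *:* queries each.
solr4_ms = [13234, 18714, 13384, 12966, 12192, 18420, 16592, 15691, 13373, 12458]
solr8_ms = [93359, 94263, 86949, 90747, 91171, 91588, 87921, 88632, 88035, 89137]

mean4 = sum(solr4_ms) / len(solr4_ms)  # average Solr 4 response time in ms
mean8 = sum(solr8_ms) / len(solr8_ms)  # average Solr 8 response time in ms
slowdown = mean8 / mean4               # how much slower the Solr 8 cluster is

print(mean4, mean8, round(slowdown, 1))
```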
Hi,
I am trying to replace our solr4 cluster with a solr 8.1.1 cluster and am
running into a problem where searches are taking way too long to respond.
The clusters are set up with the same number of servers, same number of
shards, and same number of replicas. They are indexing the same documents,
Hello Solr Community!
*Problem*: I wish to know whether a result document matched all the terms in
the query. The ranking used in Solr works most of the time, but in some cases
where one of the terms is rare and occurs in a couple of fields, such a
document trumps a document that matches all the terms.
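One common way to require every term, assuming the edismax parser is acceptable for this index, is the mm (minimum-should-match) parameter; the query text and field names below are hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical query and fields; mm=100% tells edismax that a document
# must match every query term to be returned at all.
params = {
    "q": "rare common1 common2",
    "defType": "edismax",
    "qf": "title body",
    "mm": "100%",
}
query_string = urlencode(params)
print(query_string)
```

This changes recall rather than just ranking: documents matching only some terms are dropped entirely, so it only fits if "matched all terms" is a hard requirement.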
Thanks Erick,
The ulimit settings on all three nodes are more than 65k, including the max
process limit. If you look at the timestamps, the core-down error happened
before the unable-to-create-thread error, and the core-down error took place
on node1 while the unable-to-create-thread error took place on node3.
BTW we are
You’re going to want to start by having more than 3gb for memory in my opinion
but the rest of your set up is more complex than I’ve dealt with.
On Sep 3, 2019, at 1:10 PM, Andrew Kettmann
wrote:
>> How many zookeepers do you have? How many collections? What is their size?
>> How much CPU /
> How many zookeepers do you have? How many collections? What is their size?
> How much CPU / memory do you give per container? How much heap in comparison
> to the total memory of the container?
3 Zookeepers.
733 containers/nodes
735 total cores. Each core ranges from ~4-10GB of index.
I'm working on a custom tokenizer (Solr 7.3.0) whose Factory needs to read a
configuration file.
I have been able to run it successfully locally, reading from a local
directory.
I would like to be able to have the configuration read from zookeeper
(similarly to how SynonymGraphFilterFactory
How many zookeepers do you have? How many collections? What is their size?
How much CPU / memory do you give per container? How much heap in comparison to
the total memory of the container?
> Am 03.09.2019 um 17:49 schrieb Andrew Kettmann :
>
> Currently our 7.7.2 cluster has ~600 hosts and each
Currently our 7.7.2 cluster has ~600 hosts and each collection is using an
autoscaling policy based on system property. Our goal is a single core per host
(container, running on K8S). However as we have rolled more
containers/collections into the cluster any creation/move actions are taking a
If you have a properly secured cluster, e.g. with Kerberos, then you should not
update files in ZK directly. Use the corresponding Solr REST interfaces; then
you are also less likely to mess something up.
If you want HA you should have at least 3 Solr nodes and replicate the
collection to all
Shankar:
Two things:
1> please do not hijack threads
2> Follow the instructions here:
http://lucene.apache.org/solr/community.html#mailing-lists-irc. You must use
the _exact_ same e-mail as you used to subscribe.
If the initial try doesn't work and following the suggestions at the "problems"
The “unable to create new thread” is where I’d focus first. It means you’re
running out of some system resources and it’s quite possible that your other
problems are arising from that root cause.
What are your “ulimit” settings? The number of file handles and processes should
be set to 65k at
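To check the limits Erick mentions for the user running Solr (the exact values vary per system, and on some platforms one of these prints "unlimited"):

```shell
# Open file descriptors available to the current shell/user.
ulimit -n
# Maximum number of user processes (threads count against this on Linux).
ulimit -u
```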
Having custom core.properties files is “fraught”. First of all, that file can
be re-written. Second, the collections ADDREPLICA command will create a new
core.properties file. Third, any mistakes you make when hand-editing the file
can have grave consequences.
What change exactly do you want
On 9/3/2019 7:22 AM, Porritt, Ian wrote:
We have a schema which I have managed to upload to Zookeeper along with
the Solrconfig, how do I get the system to recognise both a lib/.jar
extension and a custom core.properties file? I bypassed the issue of the
core.properties by amending the
Hi,
We are using a 3-node Solr (7.0.1) cloud setup with a 1-node ZooKeeper ensemble.
Each system has 16 CPUs, 90GB RAM (14GB heap), and 130 cores (3 NRT replicas)
with index sizes ranging from 700MB to 20GB.
autoCommit - once every 10 minutes
softCommit - once every 30 seconds
We are facing the following problems in recent
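The commit cadence described above corresponds to a solrconfig.xml fragment along these lines; the times come from the message, while openSearcher=false on the hard commit is an assumption (it is the common choice when soft commits handle visibility):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit every 10 minutes, without opening a new searcher -->
  <autoCommit>
    <maxTime>600000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit every 30 seconds for near-real-time visibility -->
  <autoSoftCommit>
    <maxTime>30000</maxTime>
  </autoSoftCommit>
</updateHandler>
```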
Yeah, it beats me. If you've made sure that the security.json in
ZooKeeper is exactly the same as the one I posted but you're still
getting different results, then I'm stumped. Maybe someone else here
has an idea.
Out of curiosity, are you setting your security.json via the
Toke,
Do you find that searching over both the original title field and the
normalized title field increases the time it takes for your search engine to
retrieve results?
--
Audrey Lorberfeld
Data Scientist, w3 Search
Digital Workplace Engineering
CIO, Finance and Operations
IBM
Languages are the best. Thank you all so much!
--
Audrey Lorberfeld
Data Scientist, w3 Search
Digital Workplace Engineering
CIO, Finance and Operations
IBM
audrey.lorberf...@ibm.com
On 8/30/19, 4:09 PM, "Walter Underwood" wrote:
The right transliteration for accents is
Thank you, Erick!
--
Audrey Lorberfeld
Data Scientist, w3 Search
Digital Workplace Engineering
CIO, Finance and Operations
IBM
audrey.lorberf...@ibm.com
On 8/30/19, 3:49 PM, "Erick Erickson" wrote:
It Depends (tm). In this case on how sophisticated/precise your users are.
If your
Hi,
I am relatively new to Solr especially Solr Cloud and have been using it for
a few days now. I think I have setup Solr Cloud correctly however would like
some guidance to ensure I am doing it correctly. I ideally want to be able
to process 40 million documents on production via Solr Cloud.
Tracked as https://issues.apache.org/jira/browse/SOLR-13735; patches are
welcome.
On Mon, Sep 2, 2019 at 12:39 PM Vadim Ivanov <
vadim.iva...@spb.ntk-intourist.ru> wrote:
> A timeout causes DIH to finish with an error message. So, if I check the DIH
> response to be sure
> that the DIH session has finished
Hi all,
If you're in town for Activate next week, we're running another free
Lucene Hackday on Tuesday:
https://www.meetup.com/Apache-Lucene-Solr-London-User-Group/events/263993681/
- do come along if you can! It's only a block and a half from the
Activate venue.
Cheers
Charlie
--
Hi Jason,
Apologies for the late reply. My laptop was broken and I got it today from
service centre.
I am still having issues with the solr user being able to view the collections
list, as follows.
Testing permissions for user [solr]
Request [/admin/collections?action=LIST] returned status [200]
Hi Jörn,
I am not supplying the name in the update chain. I am not sure pysolr
supports it yet:
def __init__(
    self,
    url,
    decoder=None,
    timeout=60,
    results_cls=Results,
    search_handler="select",
    use_qt_param=False,
    always_commit=False,
    auth=None,
    verify=True,
):
How can I define it as default?
How do you send the request? You need to specify the update.chain parameter
with the name of the update chain, or define that chain as the default.
> Am 03.09.2019 um 12:14 schrieb Arturas Mazeika :
>
> Hi Solr Fans,
>
> I am trying to figure out how to use the parse-date processor for pdates.
>
> I am
Hello, mates!
I am extracting content from my documents using Apache Tika.
I need to exclude the headers & footers of the documents. I have already done
this for the Word & Excel formats using OfficeParserConfig, but need to
implement the same for PPT & PDF.
How can I achieve that?
Hi Solr Fans,
I am trying to figure out how to use the parse-date processor for pdates.
I am able to insert data into a Solr collection/core with this Python code:
solr = pysolr.Solr('http://localhost:/solr/core1', timeout=10)
solr.add([
    {
        "t": '2017-08-19T21:00:42.043Z',
    }
])
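For the parse-date question, a chain along these lines in solrconfig.xml is the usual approach; default="true" also answers the "define it as default" question, so pysolr does not need to pass update.chain at all. The chain name and format list below are assumptions (ISO-8601 strings like the one in the snippet are usually parsed by pdate fields natively; the format list matters for nonstandard inputs):

```xml
<updateRequestProcessorChain name="parse-date" default="true">
  <processor class="solr.ParseDateFieldUpdateProcessorFactory">
    <arr name="format">
      <str>yyyy-MM-dd'T'HH:mm:ss.SSSX</str>
    </arr>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```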
Please remove my email id from this list.
On Tue, 3 Sep, 2019, 11:06 AM Akreeti Agarwal, wrote:
> Hello,
>
> Please help me with the solution for below error.
>
> Memory details of slave server:
>            total    used    free    shared    buffers    cached
> Mem:       15947   15460