When will 1.15 be released? Maybe you have some beta version I could
test it :)
SAX sounds interesting, and from the info I found on Google it could solve
my issues.
On Tue, Apr 11, 2017 at 10:48 PM, Allison, Timothy B.
wrote:
> It depends. We've been trying to make
Hi,
I'm getting an error when indexing with SolrJ after setting up Basic
Authentication with the following code.
Credentials defaultcreds = new UsernamePasswordCredentials("id", "password");
appendAuthentication(defaultcreds, "BASIC", solr);
private static void
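(The helper above is cut off; for reference, here is a minimal self-contained sketch of what a BASIC helper like appendAuthentication() has to produce in the end - the Authorization header value is just base64 of user:password. The credentials are the placeholders from your snippet; this is an assumption about the helper's intent, not its actual code.)

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: the value a BASIC-auth helper must attach as the
// "Authorization" request header - "Basic " + base64(user:password).
public class BasicAuthHeader {
    public static String headerValue(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Same placeholder credentials as the snippet above.
        System.out.println(headerValue("id", "password"));
    }
}
```

If the header your client actually sends differs from this, the 401 is coming from the client setup rather than from Solr.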
JVM version? We’re running v8 update 121 with the G1 collector and it is
working really well. We also have an 8GB heap.
Graph your heap usage. You’ll see a sawtooth shape, where it grows, then there
is a major GC. The maximum of the base of the sawtooth is the working set of
heap that your
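Walter's sawtooth reading can be sketched numerically. A toy illustration with made-up heap samples, nothing Solr-specific: the troughs are the post-GC heap levels, and the largest trough approximates the working set.

```java
// Toy illustration (numbers made up): the troughs in a heap-usage graph
// are the levels right after each major GC; the largest trough
// approximates the live working set.
public class SawtoothWorkingSet {
    // Max of the local minima (troughs) in a series of heap samples, in MB.
    public static long workingSetEstimate(long[] mb) {
        long best = Long.MIN_VALUE;
        for (int i = 1; i < mb.length - 1; i++) {
            // A trough: lower than the sample before, not higher than the one after.
            if (mb[i] < mb[i - 1] && mb[i] <= mb[i + 1]) {
                best = Math.max(best, mb[i]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        long[] heap = {4000, 6500, 3100, 7200, 3400, 6900, 3300};
        System.out.println(workingSetEstimate(heap) + " MB"); // 3400 MB
    }
}
```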
On 4/11/2017 2:56 PM, Chetas Joshi wrote:
> I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection
> with number of shards = 80 and replication factor = 2
>
> Solr JVM heap size = 20 GB
> solr.hdfs.blockcache.enabled = true
> solr.hdfs.blockcache.direct.memory.allocation = true
>
On 4/11/2017 2:19 PM, Scruggs, Matt wrote:
> I’m updating our schema.xml file with 1 change: deleting a field.
>
> Do I need to re-index all of my documents in Solr, or can I simply reload my
> collection config by calling:
>
>
Hi Jordi,
Thanks for the advice.
Regards,
Edwin
On 11 April 2017 at 18:27, Jordi Domingo Borràs
wrote:
> Browsers retain basic auth information. You have to close it or clean
> browsing history. You can also change the user password at server side.
>
> Best
>
> On
When I have done this, it is in multiple steps.
1. Change the indexing so that no data is going to that field.
2. Reindex, so the field is empty.
3. Remove the field from the schema.
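If the collection uses a managed schema, step 3 can go through the Schema API instead of editing schema.xml by hand. A sketch with placeholder names (the field "myfield" and the endpoint are illustrative):

```java
// Sketch: build the Schema API body for step 3 (delete the field).
// POST this JSON to http://<host>:8983/solr/<collection>/schema.
// Assumes a managed schema; "myfield" is a placeholder.
public class DeleteFieldRequest {
    public static String deleteFieldBody(String fieldName) {
        return "{\"delete-field\":{\"name\":\"" + fieldName + "\"}}";
    }

    public static void main(String[] args) {
        System.out.println(deleteFieldBody("myfield"));
    }
}
```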
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Apr 11, 2017, at
Hi - We did this on one occasion and Solr started complaining in the logs about
a field that is present but not defined. We thought the problem would go away
within 30 days - the time within which every document is reindexed or deleted -
but it did not, for some reason. Forcing a merge did not solve
I’m updating our schema.xml file with 1 change: deleting a field.
Do I need to re-index all of my documents in Solr, or can I simply reload my
collection config by calling:
http://mysolrhost:8000/solr/admin/collections?action=RELOAD&name=mycollection
Thanks,
Matt
Hi - I cannot think of any real drawback right away. But you probably can
expect a slightly differently ordered MLT response. It should not be a problem
if you select enough terms for the MLT lookup.
Regards,
Markus
-Original message-
> From:David Hastings
On 4/8/2017 6:42 PM, Mike Thomsen wrote:
> I'm running two nodes of SolrCloud in Docker on Windows using Docker
> Toolbox. The problem I am having is that Docker Toolbox runs inside of a
> VM and so it has an internal network inside the VM that is not accessible
> to the Docker Toolbox VM's host
Here is a small snippet that I copy-pasted from Shawn Heisey (who is a core
contributor I think; he's good):
> One thing to note: SolrCloud begins to have performance issues when the
> number of collections in the cloud reaches the low hundreds. It's not
> going to scale very well with a
Hello,
I am using Solr (5.5.0) on HDFS. SolrCloud of 80 nodes. Solr collection
with number of shards = 80 and replication factor = 2
Solr JVM heap size = 20 GB
solr.hdfs.blockcache.enabled = true
solr.hdfs.blockcache.direct.memory.allocation = true
MaxDirectMemorySize = 25 GB
I am querying a solr
John,
Here I mean a query which matches a doc that is expected to be matched
by the problem query.
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-TheexplainOtherParameter
On Tue, Apr 11, 2017 at 11:32 PM, John Blythe wrote:
first off, i don't think i have a full handle on the import of what is
outputted by the debugger.
that said, if "...PhraseQuery(manufacturer_split_syn:\"vendor vendor\")" is
matching against `vendor_coolmed | coolmed | vendor`, then 'vendor' should
match. the query analyzer is keywordtokenizer,
Hi, was wondering if there are any known drawbacks to using the CommonGrams
factory, with regard to features such as "more like this"
John,
How do you expect to match any of "parsed_filter_queries":["
MultiPhraseQuery(manufacturer_syn_both:\"(vendor_vendor_us vendor)
vendor\")", "PhraseQuery(manufacturer_split_syn:\"vendor vendor\")"
against
vendor_coolmed | coolmed | vendor ?
I just can't see any chance to match them.
One
It depends. We've been trying to make parsers more, erm, flexible, but there
are some problems from which we cannot recover.
Tl;dr there isn't a short answer. :(
My sense is that DIH/ExtractingDocumentHandler is intended to get people up and
running with Solr easily but it is not really a
>
> And this overhead depends on what? I mean, if I create an empty collection
> will it take up much heap size just for "being there" ?
Yes. You can search the Elasticsearch/Solr/Lucene mailing lists and see
that it's true. But nobody has `empty` collections, so yours will have a
schema and
hi, erick.
appreciate the feedback.
1> i'm sending the terms to solr enquoted
2> i'd thought that at one point and reran the indexing. i _had_ had two of
the fields not indexed, but this represented one pass (same analyzer) from
two diff source fields while 2 or 3 of the other 4 fields _were_
Thanks for your responses.
Is there any possibility to ignore parsing errors and continue indexing?
Because right now Solr/Tika stops parsing the whole document if it finds any
exception
On Apr 11, 2017 19:51, "Allison, Timothy B." wrote:
> You might want to drop a note to the dev
The way the data is spread across the cluster is not really uniform. Most of
the shards have way lower than 50GB; I would say about 15% of the total shards
have more than 50GB.
Dorian Hoxha wrote
> Each shard is a lucene index which has a lot of overhead.
And this overhead depends on what? I mean,
Skimming, I don't think this is inconsistent. First, I assume that
you're OK with the second example; it's this one that seems odd to you:
sort=score asc
group.sort=score desc
You're telling Solr to return the highest scoring doc in each group.
However, you're asking to order the _groups_ in ascending
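A toy model of that grouping logic, with made-up scores - group.sort picks the head inside each group, sort then orders the groups by their heads:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (made-up scores): group.sort=score desc picks the
// highest-scoring doc as each group's head; sort=score asc then orders
// the groups themselves by that head score, ascending.
public class GroupSortDemo {
    public static List<String> groupHeadsAscending(Map<String, double[]> groups) {
        List<Map.Entry<String, Double>> heads = new ArrayList<>();
        for (Map.Entry<String, double[]> e : groups.entrySet()) {
            double head = Double.NEGATIVE_INFINITY;
            for (double s : e.getValue()) head = Math.max(head, s); // group.sort=score desc
            heads.add(new AbstractMap.SimpleEntry<>(e.getKey(), head));
        }
        heads.sort(Map.Entry.comparingByValue()); // sort=score asc, on the group heads
        List<String> order = new ArrayList<>();
        for (Map.Entry<String, Double> h : heads) order.add(h.getKey());
        return order;
    }

    public static void main(String[] args) {
        Map<String, double[]> groups = new LinkedHashMap<>();
        groups.put("a", new double[]{1.0, 9.0}); // head 9.0
        groups.put("b", new double[]{5.0, 2.0}); // head 5.0
        System.out.println(groupHeadsAscending(groups)); // [b, a]
    }
}
```

So "highest doc per group" and "groups in ascending order" are two independent knobs, which is why the combination looks odd but is well-defined.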
debug=query is your friend. There are several issues that often trip people up:
1> The analysis tab pre-supposes that what you put in the boxes gets
all the way to the field in question. Trivial example:
I put (without quotes) "erick erickson" in the "name" field in the
analysis page and see that it
You might want to drop a note to the dev or user's list on Apache POI.
I'm not extremely familiar with the vsd(x) portion of our code base.
The first item ("PolylineTo") may be caused by a mismatch btwn your doc and the
ooxml spec.
The second item appears to be an unsupported feature.
The
Ok :)
But if you have time, have a look at my project
https://github.com/freedev/solrcloud-zookeeper-docker
The project builds a couple of docker instances (solr - zookeeper) or a
cluster with 6 nodes.
Then you just have to put the IP addresses of your VM in your hosts file
and you can play
I am looking for best practices when a search component in one handler,
needs to invoke another handler, say /basic. So far, I got this working
prototype:
public void process(ResponseBuilder rb) throws IOException {
SolrQueryResponse response = new SolrQueryResponse();
What I'm suggesting is that you should aim for max(50GB) per shard of
data. How much is it currently ?
Each shard is a lucene index which has a lot of overhead. If you can, try
to have 20x-50x-100x fewer shards than you currently do and you'll see lower
heap requirements. I don't know about
Dorian Hoxha wrote
> Isn't 18K lucene-indexes (1 for each shard, not counting the replicas) a
> little too much for 3TB of data ?
> Something like 0.167GB for each shard ?
> Isn't that too much overhead (i've mostly worked with es but still lucene
> underneath) ?
I don't have only 3TB , I have
Thanks. I think I'll take a look at that. I decided to just build a big
vagrant-managed desktop VM to let me run Ubuntu on my company machine, so I
expect that this pain point may be largely gone soon.
On Mon, Apr 10, 2017 at 12:31 PM, Vincenzo D'Amore
wrote:
> Hi Mike
>
>
hi everyone.
i recently wrote in ('analysis matching, query not') but never heard back
so wanted to follow up. i'm at my wit's end currently. i have several
fields that are showing matches in the analysis tab. when i dumb down the
string sent over to query it still gives me issues in some field
I modified and cleaned up the previous query. As you can see, the first
query's sorting is a bit odd.
Using parameters
sort=score asc
group.sort=score desc
http://localhost:8983/solr/mcontent.ph_post/select?==*,score=partnerId=1=false=true=score
desc=true=on=text:cars=5000=score
#field() is defined in _macros.vm as this monstrosity:
# TODO: make this parameterized fully, no context sensitivity
#macro(field $f)
#if($response.response.highlighting.get($docId).get($f).get(0))
#set($pad = "")
#foreach($v in $response.response.highlighting.get($docId).get($f))
the group.sort spec is specified twice in the URL
group.sort=score desc&
group.sort=score desc
Is there a chance that during testing you only changed _one_ of them so you had
group.sort=score desc&
group.sort=score asc
? I think the last one should win. Shot in the dark.
Best,
Erick
On Tue,
Can't the filter be used when you're paginating in a sharded scenario?
So if you do limit=10, offset=10, each shard will return 20 docs ?
While if you do limit=10, _score<=last_page.min_score, then each shard will
return 10 docs ? (they will still score all docs, but merging will be
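The arithmetic behind that trade-off, as I understand it (a sketch, not Solr internals):

```java
// Toy arithmetic: with plain offset paging every shard must return
// offset+limit candidates (it can't know which docs survive the merge),
// while a score cut-off ("_score <= last page's min score") only needs
// limit docs per shard.
public class ShardPagingCost {
    public static int docsPerShardOffset(int limit, int offset) {
        return offset + limit;
    }

    public static int docsPerShardCursor(int limit) {
        return limit; // earlier pages are already excluded by the cut-off
    }

    public static void main(String[] args) {
        System.out.println(docsPerShardOffset(10, 10)); // 20
        System.out.println(docsPerShardCursor(10));     // 10
    }
}
```

As noted, every shard still has to *score* all matching docs either way; the saving is only in how many it returns to the merger.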
Hey guys,
I have a problem:
In Velocity:
*Beschreibung:* #field('LONG_TEXT')
In Solr the field "LONG_TEXT" doesn't show everything, only the first ~90-110
characters.
But if I use "$doc.getFieldValue('LONG_TEXT')" in the Velocity file, then it
shows me everything that's inside the field
Hi,
history:
1. we're using single core Solr 6.4 instance on windows server (windows
server 2012 R2 standard),
2. Java v8, (build 1.8.0_121-b13).
3. as a workaround for earlier issues with visio files, we have in
solr-6.4.0\contrib\extraction\lib:
3.1. ooxml-schemas-1.3.jar instead of
Can I ask what the final requirement here is?
What are you trying to do?
- just display fewer results?
You can easily do that at search-client time, cutting after a certain amount.
- make search faster by returning fewer results?
This is not going to work, as you need to score all of them, as Erick
On Mon, 2017-04-10 at 13:27 +0530, Himanshu Sachdeva wrote:
> Thanks for your time and quick response. As you said, I changed our
> logging level from SEVERE to INFO and indeed found the performance
> warning *Overlapping onDeckSearchers=2* in the logs.
If you only see it occasionally, it is
Browsers retain basic auth information. You have to close it or clean
browsing history. You can also change the user password at server side.
Best
On Tue, Apr 11, 2017 at 7:18 AM, Zheng Lin Edwin Yeo
wrote:
> Does anyone have any idea if the authentication will expire
To be fair the second result seems consistent with the Solr grouping logic :
*First Query results (Suspicious)*
1) group.sort=score desc -> select the group head, as you have 1 doc per
group (the head will be the top-scoring doc per group)
2) sort=score asc -> sort the groups by the score of the
Also you should change the heap 32GB->30GB so you're guaranteed to get
pointer compression. I think you should have no need to increase it more
than this, since most things have moved to out-of-heap stuff, like
docValues etc.
On Tue, Apr 11, 2017 at 12:07 PM, Dorian Hoxha
Isn't 18K lucene-indexes (1 for each shard, not counting the replicas) a
little too much for 3TB of data ?
Something like 0.167GB for each shard ?
Isn't that too much overhead (i've mostly worked with es but still lucene
underneath) ?
Can't you use 1/100 the current number of collections ?
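For what it's worth, the per-shard figure above checks out (numbers from the thread):

```java
// Checking the overhead arithmetic: data per shard for ~3 TB spread
// over 18,000 shards.
public class ShardMath {
    public static double gbPerShard(double totalGb, int shards) {
        return totalGb / shards;
    }

    public static void main(String[] args) {
        System.out.printf("%.3f GB per shard%n", gbPerShard(3000.0, 18000)); // ~0.167
    }
}
```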
On
Hi All,
I am trying to integrate UIMA with Solr. I was able to do that. But some
of the UIMA fields are not getting indexed into Solr whereas other *fields
like pos, ChukType are getting indexed*.
I am using openNLP-UIMA together for text analysis.
When I tried to index the UIMA field for