Hi guys,
We're observing a strange bug in Solr 4.10.2, whereby a sloppy query matches
words it should not:
<lst name="debug">
  <str name="rawquerystring">the e commerce</str>
  <str name="querystring">the e commerce</str>
  <str name="parsedquery">SpanNearQuery(spanNear([Contents:the,
    spanNear([Contents:e,
Hi,
I've encountered a peculiar case with Solr 4.10.2 where the parsed query
doesn't seem logical.
PHRASE23("reduce workforce") ==
SpanNearQuery(spanNear([spanNear([Contents:reduce,
Contents:workforce], 1, true)], 23, true))
The question is why the Phrase(quoted string) gets converted
Re,
Thanks for your reply.
I mock my parser like this:
@Override
public Query parse() {
    SpanQuery[] clauses = new SpanQuery[2];
    clauses[0] = new SpanTermQuery(new Term("details", "london"));
    clauses[1] = new SpanTermQuery(new Term("details", "city"));
    return new
Hi,
I am very new to the Nutch and Solr platforms. I have been trying
hard to integrate Solr 5.2.0 with Nutch 1.10, but have not been able to do so. I
have followed all the steps mentioned on the Nutch 1.x tutorial page, but when I
execute the following command,
bin/nutch solrindex
As a general rule, there are only two ways that Solr scales to large
numbers: large number of documents and moderate number of nodes (shards and
replicas). All other parameters should be kept relatively small, like
dozens or low hundreds. Even shards and replicas should probably be kept down
to that
We're running some tests on Solr and would like to have a deeper
understanding of its limitations.
Specifically, we have tens of millions of documents (say 50M) and are
comparing several #collections X #docs_per_collection configurations.
For example, we could have a single collection with 50M
When in 2012? I'd give it a go with Solr 3.6 if you don't want to modify
the library.
Upayavira
On Sun, Jun 14, 2015, at 04:14 AM, Zheng Lin Edwin Yeo wrote:
I'm still trying to find out which version it is compatible for, but the
document which I've followed is written in 2012.
My answer remains the same - a large number of collections (cores) in a
single Solr instance is not one of the ways in which Solr is designed to
scale. To repeat, there are only two ways to scale Solr, number of
documents and number of nodes.
Jack, I understand that, but I still feel you're
No clue, you'd probably have better luck on the Nutch user's list
unless there are _Solr_ errors. Does your Solr log show any errors?
Best,
Erick
On Sun, Jun 14, 2015 at 6:49 AM, kunal chakma kchax4...@gmail.com wrote:
Hi,
I am very new to the Nutch and Solr platforms. I have been trying
To my knowledge there's nothing built in to Solr to limit the number
of collections. There's nothing explicitly in place to handle
many hundreds of collections either so you're really in uncharted,
certainly untested waters. Anecdotally we've heard of the problem
you're describing.
You say you
I'm having trouble getting Solr to pay attention to the defaultField value
when I send a document to Solr Cell or Tika. Here is my post I'm sending
using Solrj
POST
/solr/collection1/update/extract?extractOnly=true&defaultField=text&wt=javabin&version=2
HTTP/1.1
When I get the response back the
re: hybrid approach.
Hmmm, _assuming_ that no single user has a really huge number of
documents you might be able to use a single collection (or much
smaller group of collections), by using custom routing. That allows
you to send all the docs for a particular user to a particular shard.
There are
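To illustrate the custom-routing idea mentioned above, here is a minimal sketch of Solr's compositeId convention, where everything before the "!" separator is hashed to pick the shard (the class, method, and id values here are made up for illustration, not from the thread):

```java
// Sketch of compositeId routing: giving all of one user's documents
// the same prefix before '!' co-locates them on a single shard.
public class CompositeIdRouting {
    // userId and docId are illustrative names, not a Solr API
    static String routedId(String userId, String docId) {
        return userId + "!" + docId;
    }

    public static void main(String[] args) {
        // Both documents share the "user42" prefix, so Solr would
        // hash that prefix and route them to the same shard.
        System.out.println(routedId("user42", "doc-7"));
        System.out.println(routedId("user42", "doc-8"));
    }
}
```

You then query with a matching `_route_=user42!` parameter so only that user's shard is searched.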
Why don't you take a step back and tell us what you are really trying to do.
Try using a normal Solr query parser first, to verify that the data is
analyzed as expected.
Did you try using the surround query parser? It supports span queries.
Your span query appears to require that the two terms
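As a rough sketch, a surround query equivalent to the span query above might look like this (the distance of 3 and the use of the default field are assumptions, not from the thread):

```
q={!surround}3W(london, city)
```

Here `W` is the ordered near operator and `N` the unordered one, with the leading number giving the maximum distance.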
Thanks Jack for your response. But I think Arnon's question was different.
If you need to index 10,000 different collections of documents in Solr (say
a collection denotes someone's Dropbox files), then you have two options:
index all collections in one Solr collection, and add a field like
Looks like this has been solved recently in the current dev branch:
SimplePostTool (and thus bin/post) cannot index files with unknown
extensions
https://issues.apache.org/jira/browse/SOLR-7546
My guess is that you have WordDelimiterFilterFactory in your
analysis chain with parameters that break up "E-Tail" into both "e" and "tail" _and_
put them in the same position. This assumes that the result fragment
you pasted is incomplete and "commerce" is in it:
From <em>E</em>-Tail <em>commerce</em>
or some
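For reference, an analysis-chain fragment that could produce that behavior might look like this (the field type name and the exact parameter values are assumptions; only WordDelimiterFilterFactory itself is confirmed by the discussion):

```
<fieldType name="text_general" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- generateWordParts="1" splits "E-Tail" into "e" and "tail";
         preserveOriginal="1" keeps the original token at the same position -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Running "E-Tail commerce" through the Analysis screen in the admin UI would confirm which tokens end up sharing a position.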
My answer remains the same - a large number of collections (cores) in a
single Solr instance is not one of the ways in which Solr is designed to
scale. To repeat, there are only two ways to scale Solr, number of
documents and number of nodes.
-- Jack Krupansky
On Sun, Jun 14, 2015 at 11:00 AM,
Yes, there are some known problems when scaling to a large number of
collections, say 1000 or above. See
https://issues.apache.org/jira/browse/SOLR-7191
On Sun, Jun 14, 2015 at 8:30 PM, Shai Erera ser...@gmail.com wrote:
Thanks Jack for your response. But I think Arnon's question was different.
Hi,
I face the same problem when trying to index DITA XML files. These are XML
files but have the file extension .dita which Solr ignores.
According to java -jar post.jar -h, only the following file extensions are
supported:
-Dfiletypes=type[,type,...]
I think I just about have this working with the analytics component. It seems to
fill in all the gaps that the stats component and the JSON facets don't
support.
It solved the following problems for me:
- I am able to perform math on stats to form other stats. Then I can sort
on those as needed.
-
And anyone who, you know, really likes working with UI code, please
help make it better!
As of Solr 5.2, there is a new version of the Admin UI available, and
several improvements are already in 5.2.1 (release imminent). The old
admin UI is still the default, the new one is available at
Why it isn't in core Solr... Because it doesn't (and probably can't)
support distributed mode.
The Streaming aggregation stuff, and the (in trunk Real Soon Now)
Parallel SQL support
are where the effort is going to support this kind of stuff.
https://issues.apache.org/jira/browse/SOLR-7560
But I think Solr 3.6 is too far back to fall back to as I'm already using
Solr 5.1.
Regards,
Edwin
On 14 June 2015 at 14:49, Upayavira u...@odoko.co.uk wrote:
When in 2012? I'd give it a go with Solr 3.6 if you don't want to modify
the library.
Upayavira
On Sun, Jun 14, 2015, at 04:14 AM,
Hi all,
Every time I optimize my index with maxSegments=2, after some time the
replication fails to get the file list for a given generation. It looks like the
index version and generation count get messed up.
(If maxSegments=1 this never happens. I am able to reliably reproduce
this by
Hi chillra,
I have changed the index and query field configuration to
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
but that still did not resolve my problem.
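For context, those two lines would normally sit inside a fieldType definition along these lines (the type name here is illustrative, not from the message):

```
<fieldType name="string_lowercase" class="solr.TextField">
  <analyzer>
    <!-- KeywordTokenizer emits the entire input as one token,
         so only lowercasing is applied, no splitting -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that after a schema change like this the field must be reindexed, or the old tokens remain in the index.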
This issue has also already been discussed in the Tika issue queue:
Add method get file extension from MimeTypes
https://issues.apache.org/jira/browse/TIKA-538
And
http://svn.apache.org/repos/asf/tika/trunk/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
does support DITA