That is correct, but twice the disk space is theoretically not enough.
The worst case is actually three times the storage; I guess this worst case can
happen if you also submit new documents to the index while optimizing.
I have experienced 2.5 times the disk space during an optimize for a large
Hi,
Is it possible to point two Solr instances to a common index
directory? Will this work with changing the lock type?
Thanks,
Prasi
Hello
We have the default Dutch stopwords implemented in our Solr instance, so
words like 'de', 'het', 'ben' are filtered at index time.
Is there a way to trick Solr into ignoring those stopwords at query time,
when a user puts the search terms between quotes?
Best
Geert
Hi,
In fact, you can use the Analysis page to check the result of the query or
index process!
--
Gabriel Zhang
On Thursday, June 26, 2014 5:33 PM, Geert Van Huychem ge...@iframeworx.be
wrote:
Hello
We have the default Dutch stopwords implemented in our Solr instance, so words
like ‘de’,
Hi,
Not really, as the words don’t exist in the corpus field. The way we have got
around it in the past is to have another non-stopped field that is also
searched on (in addition to the stopped field) with a boost to the score
for matches.
As a slight alternative you could do the above
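A minimal sketch of that workaround in schema.xml; the type and field names here (text_nl_nostop, content, content_nostop) are illustrative, not from the original thread:

```xml
<!-- A sibling field type with no StopFilterFactory, so 'de', 'het', 'ben' are kept -->
<fieldType name="text_nl_nostop" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="content_nostop" type="text_nl_nostop" indexed="true" stored="false"/>
<copyField source="content" dest="content_nostop"/>
```

With edismax you could then search both fields, e.g. qf=content content_nostop^2, so matches that include the stopwords score higher.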
Hi,
with the lock type 'simple' I have three instances (different JREs, GC problem)
running on the same files.
You should use this option only for a read-only system. Otherwise it's easy to
corrupt the index.
Maybe you should have a look at replication or SolrCloud.
Uwe
On 26.06.2014 11:25,
I am new to Solr and I have to write a filter to lemmatize text when indexing
documents and also to lemmatize queries.
I created a custom Tokenizer Factory that lemmatizes text before passing it to
the Standard Tokenizer.
Tests in the Solr analysis section work fairly well (on index ok,
but on
Can you please tell me which Solr version you have tried with? I tried
setting <lockType>${solr.lock.type:none}</lockType> in 2 Solr instances and now it
is working. I am not getting the write lock exception when starting the
second instance.
But my scenario is that both Solr instances would write to
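For reference, a sketch of where that setting lives in solrconfig.xml; 'none' disables index locking entirely, which is why the second writer can start, and also why nothing stops the two writers from corrupting the index:

```xml
<!-- solrconfig.xml -->
<indexConfig>
  <lockType>${solr.lock.type:none}</lockType>
</indexConfig>
```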
I found the root of the problem. This is very strange, but I guess
someone can explain to me why this happens.
Take a look at the static block in my factory:
http://folk.uio.no/erlendfg/solr/NorwegianLemmatizerFilterFactory.java
static {
...
}
If I remove this block and return a stemmed
Hello,
I have 5000 French, Arabic, and English documents. My schema.xml contains
300 fields for French documents.
Example:
<field name="ContenuDocument" type="text_fr" multiValued="false"
indexed="true" required="false" stored="true"/>
So what I need to do is detect the language of the document before indexing, then
I
Thank you, Alex, Kuro and Simon. I've had a chance to look into this a bit
more.
I was under the (wrong) belief that the ICUTokenizer splits on individual
Chinese characters like the StandardAnalyzer after (mis)reading these two
sources
Can you share your configuration with us? Have you modified the Solr
source code in any way?
On Thu, Jun 26, 2014 at 1:06 AM, Aman Tandon amantandon...@gmail.com
wrote:
Hi,
We are getting the results for the query but the spellchecker component is
returning 500. Please help us out.
Hi,
I guess this link https://wiki.apache.org/solr/LanguageDetection may help
you
On Jun 26, 2014 6:12 PM, benjelloun anass@gmail.com wrote:
Hello,
I have 5000 French, Arabic, and English documents. My schema.xml contains
300 fields for French documents.
Example:
<field name="ContenuDocument"
Hi,
I am grouping in my results and also applying the group limit. Is there
any way to log ngroups as well along with hits?
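One option (a sketch; the field name below is made up): ask Solr to compute the group count with group.ngroups=true, so it appears in the response where it can be picked up and logged:

```
/select?q=*:*&group=true&group.field=ownerId&group.limit=10&group.ngroups=true
```

Note that ngroups can be inaccurate in a distributed setup unless all documents of a group are co-located on the same shard.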
The 3x worst case is:
1. All documents are in one segment.
2. Without merging, all documents are deleted, then re-added and committed.
3. A merge is done.
At the end of step 2, there are two equal-sized segments, 2X the space needed.
During step 3, a third segment of that size is created.
This
Thank you all for the reply and shedding more light on this topic.
A follow-up question: during optimization, if I run out of disk space, what
happens other than the optimize failing? Am I now left with an even larger
index than I started with, or am I back to the original non-optimized index
On 6/26/2014 7:27 AM, Allison, Timothy B. wrote:
So, I'm left with this as a candidate for the text_all field (I'll probably
add a stop filter, too):
<fieldType name="text_all" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.ICUTokenizerFactory"/>
Try this:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-User-Guide/Cloudera-Search-User-Guide.html
Wolfgang.
On Jun 24, 2014, at 11:14 PM, atp annamalai...@hcl.com wrote:
Thanks Ahmet, Wolfgang, I have installed hbase-indexer on one of the servers
bq: But my scenario is that both solr instances would write to the common
directory
Do NOT do this. Don't even try. I guarantee Bad Things Will Happen.
Why do you want to do this? To save disk space? Accomplish NRT
searching on multiple machines?
Please define the problem you're trying to solve
Using Solr 4.8.1.
I am creating an index containing Solr documents both with and without
nested documents. When indexing documents from a single SolrJ client on a
single thread, if I do not call commit() after each document add(), I see
some erroneous documents returned from my child of or parent
Hello Elliot,
Parent doc is mandatory, you can't omit it.
Thus instead of:
add() - single001
you have to
add() - fakeparent000 : [single001]
there were no plans to support any sort of flexibility there...
On Thu, Jun 26, 2014 at 9:52 PM, Elliot Ickovic elliot.icko...@gmail.com
wrote:
Using
: We are getting the results for the query but the spellchecker component is
: returning 500. Please help us out.
:
: *query*: http://localhostt:8111/solr/srch/select?q=malerkotlaqt=search
what version of solr?
what does your solrconfig.xml show for /select the spellcheck config?
what does
Hi Mikhail, thank you for the quick response!
If instead of:
add() - fakeparent000 : [single001]
I do:
add() - single000 : [fakeChild001]
will this prevent the index from appearing corrupted? This way I can
retain my logical top-level docs.
What is the reason I need to add a fake doc? If
Erick, I agree, but... wouldn't it be SO COOL if it did work! Avoid all the
ridiculous complexity of cloud.
Have a temporary lock to permit and exclude updates.
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent: Thursday, June 26, 2014 12:37 PM
To:
You can use the cursor-based paging API added in 4.7, which is much more
resilient to index updates.
See the section titled "How Cursors are Affected by Index Updates" at
https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
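A sketch of the cursor flow (collection and sort field are illustrative): the sort must include the uniqueKey field, the first request passes cursorMark=*, and each follow-up request passes the nextCursorMark returned by the previous response:

```
/solr/collection1/select?q=*:*&rows=100&sort=id+asc&cursorMark=*
```

Paging is done when the nextCursorMark returned equals the cursorMark you sent.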
On Mon, Jun 23, 2014 at 2:08 PM, Bram Van Dam
Cool? More like generally useless. --wunder
On Jun 26, 2014, at 12:44 PM, Jack Krupansky j...@basetechnology.com wrote:
Erick, I agree, but... wouldn't it be SO COOL if it did work! Avoid all the
ridiculous complexity of cloud.
Have a temporary lock to permit and exclude updates.
--
There's also a new searcher lease feature which might land in Solr in the
future.
https://issues.apache.org/jira/browse/SOLR-2809
On Fri, Jun 27, 2014 at 1:18 AM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
You can use the Cursor based paging API added in 4.7 which is much more
I am building an enterprise search engine with Solr 4.8.1 (and the AJAX Solr
interface, not relevant to this question though). In doing so, I am
attempting to display the file names of each indexed document in my GUI
search results.
In my GUI, I can successfully display any field that is in
Tried the following:
add() - fakeparent000 : [single001] //with new 'doc-type:fakeparent'
add() - parent001 : [child001_1, child001_2]
commit()
Then query:
{!child of='doc-type:parent'}doc-type:parent
response now contains *fakeparent000*, *single001*, child001_1, child001_2
should only
: *ab:(system entity) OR ab:authorization* : Number of results returned 2
: which is not expected.
: It seems this query makes the previous terms as OR if the next term is
: introduced by an OR.
in general, that's the way the boolean operators like AND/OR work in
all of the various parser
I think you are correct -- definitely looks like a bug to me...
https://issues.apache.org/jira/browse/LUCENE-5790
: Date: Fri, 13 Jun 2014 10:45:12 +
: From: 海老澤 志信 shinobu_ebis...@waku-2.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
bq: Avoid all the ridiculous complexity of cloud
And then re-introduce a single point of failure. Bad disk ==
unfortunate consequences
But frankly I don't see why you would ever _need_ to write from two
Solr instances. Wouldn't simply having one writer (which you could
change when you
Let's back up here. I'm guessing (since you didn't say) that you're using
ExtractingRequestHandler here? How are you sending docs to Solr?
You can always use literal.filename=whatever.
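For example (a sketch, assuming docs go through the ExtractingRequestHandler at /update/extract; the core, id, and file name below are made up):

```
/solr/core1/update/extract?literal.id=doc1&literal.filename=report.pdf&commit=true
```

Any literal.* parameter becomes a field value on the indexed document, so filename can be returned like any other field, assuming it is defined and stored in the schema.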
Best,
Erick
On Thu, Jun 26, 2014 at 2:02 PM, jrusnak jrus...@live.unc.edu wrote:
I am building an enterprise
Hi,
I use Solr 4.4, 2 shards and 2 replicas, and I found a problem with SolrCloud
search.
If I perform a query with start=0 and rows=10 and say fq=ownerId:123, I get
numFound=225.
If I simply change the start param to start=6, I get numFound=223,
and if I change the start param to start=10, I get
Thanks Damien for your response.
We have modified our Solr schema a little bit to add router.field.
Regards,
Modassar
On Thu, Jun 26, 2014 at 1:23 AM, Damien Dykman damien.dyk...@gmail.com
wrote:
Hi Modassar,
I ran into the same issue (Solr