On Wed, Apr 3, 2013 at 7:55 PM, Shawn Heisey s...@elyograg.org wrote:
In situations where I don't want to change the default value, I prefer
to leave config elements out of the solrconfig. It makes the config
smaller, and it also makes it so that I will automatically see benefits
from the
Hi Otis, then what is the difference between add and update? And how do we
update or add documents in Solr (I see that there is just one update
handler)?
2013/4/4 Otis Gospodnetic otis.gospodne...@gmail.com
I don't recall what Nutch does, so it's hard to tell.
In Solr (Lucene, really), you
The problem was connected with filter order. WordDelimiterFilter should be
put before others. Thanks for your help.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Query-parser-cuts-last-letter-from-search-term-tp4053432p4053736.html
Sent from the Solr - User mailing list
A custom QueryResponseWriter...this makes sense, thanks Jack
On Wed, Apr 3, 2013 at 11:21 PM, Jack Krupansky j...@basetechnology.com wrote:
The search components can see the response as a namedlist, but it is
only when SolrDispatchFilter calls the QueryResponseWriter that XML or JSON
or
If it's in your SolrCloud-based collection's config, it won't be on disk;
it will only be in ZooKeeper.
What I did was use the XInclude feature to include a file with my
dataimport handler properties, so I'm assuming you're doing the same.
Use a relative path to the config dir in Zookeeper, ie: no
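A sketch of that XInclude approach (the filename here is hypothetical) in solrconfig.xml might look like:

```xml
<!-- Sketch only: pull the dataimport handler properties in from a
     separate file. "dataimport-properties.xml" is a made-up name;
     the relative href resolves against the config dir, whether that
     lives on disk or in ZooKeeper. -->
<xi:include href="dataimport-properties.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude"/>
```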
On 04/03/2013 07:22 AM, amit wrote:
Below is my query
http://localhost:8983/solr/select/?q=subject:session management in
php&fq=category:[*%20TO%20*]&fl=category,score,subject
You specify that you want session to appear in field subject, but
the other tokens only match to the default search
Hi Jan,
Thanks for your reply. I have defined string_ci like below:
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true"
omitNorms="true" compressThreshold="10">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter
Hi,
I configured index-based spell check component and unexpected problem
occurs.
*CASE 1: *
I added two documents with following content:
1. handbuch
2. hanbuch
The suggestions are returned for both terms: e.g. handbuch -> hanbuch and
hanbuch -> handbuch.
Comment: Works as expected.
*CASE 2: *
Hi,
I think you need to use the alternativeTermCount parameter (
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount)
to return suggestions for terms which occur less often than the
user-entered term.
More discussion here:
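As a rough sketch of wiring that parameter in (handler name and values are illustrative, not from the original thread), it can go in the request-handler defaults in solrconfig.xml:

```xml
<!-- Illustrative sketch: enable spellcheck and allow suggestions even
     for terms that exist in the index, as long as fewer than 5 docs
     contain the user-entered term more often than the alternative. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.alternativeTermCount">5</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```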
I tried to add spellcheck.alternativeTermCount=5 but still no suggestion has
been found.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Spell-check-component-does-not-return-any-suggestions-tp4053757p4053772.html
Sent from the Solr - User mailing list archive at
Hi,
we've successfully implemented suggestion of search terms using facet
prefixing with Solr 4.0. However, with lots of unique index terms we've
encountered performance problems (long running queries) and even
exceptions: Too many values for UnInvertedField faceting on field
textbody.
We
The simple way to write the query:
q=subject:session subject:management subject:in subject:php
Would be:
q=subject:(session management in php)
Of course, edismax is usually a better way to go in general.
-- Jack Krupansky
-Original Message-
From: Andre Bois-Crettez
Sent: Thursday,
Technically, update and add are identical from a user perspective - you
don't need to worry about whether the document already exists.
But, there is another, newer form of update, selective or atomic which
is updating a subset of the fields in an existing document without needing
to re-send
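A minimal sketch of such an atomic update in Solr's XML update message format (the document id and field name here are made up): only the named field changes, and the rest of the existing document is preserved.

```xml
<!-- Sketch: replace just the "price" field of document "doc1".
     Atomic updates require the other fields to be stored so Solr
     can reconstruct the full document internally. -->
<add>
  <doc>
    <field name="id">doc1</field>
    <field name="price" update="set">42.0</field>
  </doc>
</add>
```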
I crawl webpages with Nutch and send them to Solr for indexing. There are two
parameters for sending data to Solr: one of them is -index and the other one
is -reindex. I just want to learn what they do.
2013/4/4 Jack Krupansky j...@basetechnology.com
Technically, update and add are identical from
That's a question for the Nutch email list.
In Solr, reindexing simply means that you manually delete your full Solr
index (or at least delete all documents using a query) and fully ingest all
documents, from scratch. There is no option, it's just something that you,
the user/developer, do
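The "delete all documents using a query" step can be sketched with Solr's XML update message (send it to the update handler, then commit and re-ingest everything):

```xml
<!-- Deletes every document in the index; after a commit, the index
     is empty and ready to be rebuilt from scratch. -->
<delete>
  <query>*:*</query>
</delete>
```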
I assume you're using Nutch 2.x? Nutch 1.x does not have such an option, and I
find it strange to hear 2.x does. It really makes no sense to have a -reindex
option and it should be removed. I'd recommend sticking to plain indexing.
-Original message-
From:Jack Krupansky
On Thu, Apr 4, 2013 at 9:03 AM, Furkan KAMACI furkankam...@gmail.com wrote:
I crawl webpages with Nutch and send them to Solr for indexing. There are two
parameters for sending data to Solr: one of them is -index and the other one
is -reindex. I just want to learn what they do.
Are you sure this
On 4 April 2013 18:33, Furkan KAMACI furkankam...@gmail.com wrote:
I crawl webpages with Nutch and send them to Solr for indexing. There are two
parameters for sending data to Solr: one of them is -index and the other one
is -reindex. I just want to learn what they do.
[...]
Which version of Nutch
I use Nutch 2.1 and using that:
bin/nutch solrindex http://localhost:8983/solr -index
bin/nutch solrindex http://localhost:8983/solr -reindex
2013/4/4 Gora Mohanty g...@mimirtech.com
On 4 April 2013 18:33, Furkan KAMACI furkankam...@gmail.com wrote:
I craw webages with Nutch and send them
Thank you for all your hard work!
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game
On Wed, Apr 3, 2013 at 6:08 PM, Mark Miller markrmil...@gmail.com wrote:
On
Thanks Jack and Andre
I am trying to use edismax, but I'm stuck with the NoClassDefFoundError:
org/apache/solr/response/QueryResponseWriter
I am using solr 3.6
I have followed the steps here
http://wiki.apache.org/solr/VelocityResponseWriter#Using_the_VelocityResponseWriter_in_Solr_Core
Just the
Good morning,
I'm currently running Solr 4.0 final with Tika v1.2 and ManifoldCF v1.2 dev,
and I'm battling Tika XML parse errors again.
Solr reports this error: org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: XML parse error, which is too vague.
I had to
OK, one possible fix is to add the XML equivalent of &nbsp;, which is:
<?xml version="1.0"?>
<!DOCTYPE some_name [
<!ENTITY nbsp "&#160;">
]>
But how do I add this into the Tika configuration?
--
View this message in context:
If you are using dismax/edismax with mm=0 (or some other low number), you
should override this in the spellchecker. Specify
spellcheck.collateParam.mm=100%, or something high like that. Likewise if
you're using the default lucene/solr query parser with q.op=OR, then you can
specify
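A sketch of that mm override in the request-handler defaults (values are illustrative; assumes collation is already enabled):

```xml
<!-- Illustrative: the main query may run with a low mm, but candidate
     collations are test-queried with mm=100% so only collations where
     every corrected word matches are returned. -->
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateParam.mm">100%</str>
```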
On 4 April 2013 19:29, Furkan KAMACI furkankam...@gmail.com wrote:
I use Nutch 2.1 and using that:
bin/nutch solrindex http://localhost:8983/solr -index
bin/nutch solrindex http://localhost:8983/solr -reindex
[...]
Sorry, but are you sure that you are using 2.1. Here is
what I get with:
On 4 April 2013 20:16, Gora Mohanty g...@mimirtech.com wrote:
On 4 April 2013 19:29, Furkan KAMACI furkankam...@gmail.com wrote:
I use Nutch 2.1 and using that:
bin/nutch solrindex http://localhost:8983/solr -index
bin/nutch solrindex http://localhost:8983/solr -reindex
[...]
Sorry, but
Make sure you also set spellcheck.onlyMorePopular=false (or leave it out as
false is the default) when using spellcheck.alternativeTermCount. You may
also need to set spellcheck.maxResultsForSuggest=0. See
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.maxResultsForSuggest
to
It may be a deprecated usage (maybe not), but you certainly can run -index and
-reindex on Nutch 2.1.
2013/4/4 Gora Mohanty g...@mimirtech.com
On 4 April 2013 20:16, Gora Mohanty g...@mimirtech.com wrote:
On 4 April 2013 19:29, Furkan KAMACI furkankam...@gmail.com wrote:
I use Nutch 2.1 and
Could you guys please take this discussion offline or over to a Nutch
mailing list - where it belongs? This has nothing to do with Solr.
-- Jack Krupansky
-Original Message-
From: Gora Mohanty
Sent: Thursday, April 04, 2013 10:46 AM
To: solr-user@lucene.apache.org
Subject: Re:
hello,
I'm using a single server setup with Nutch (1.6) and Solr (4.2)
I plan to trigger the Nutch crawling process every 30 minutes or so and add
about 300+ websites a month with (~5-10 pages each). At this point I'm not
sure about the query requests/sec.
Can I run this on a single server (how
Another problem that I see in Solr analysis is the query term that matches
the tokenized field does not match on the case insensitive field.
So, if I'm searching for 'coast to coast', I see that the tokenized series
title (pg_series_title) is matched but not the ci field which is
I am trying to understand how to plug data into the solr query option from
the UI.
The query below works on our old solr version (1.3) but does not return
results on 4.2. I pulled it from the catalina log file. I am trying to
plug in the values one by one into the query UI to see which one it
On 4 April 2013 22:11, scallawa dami...@altrec.com wrote:
I am trying to understand how to plug data into the solr query option from
the UI.
The query below works on our old solr version (1.3) but does not return
results on 4.2. I pulled it from the catalina log file. I am trying to
plug
I'm trying to understand the context is here... are you trying to crawl web
pages that have bad HTML? Or, ... what?
-- Jack Krupansky
-Original Message-
From: eShard
Sent: Thursday, April 04, 2013 10:23 AM
To: solr-user@lucene.apache.org
Subject: detailed Error reporting in Solr
Hi James,
Thanks for the response.
Nope, I'm not using dismax or edismax. Just the standard solr query parser.
Also, by using the parameter spellcheck.collateParam.q.op=AND I see this
working. This also means that all the words need to be correct, and
maxEdits can only be 2, else it won't suggest
Yes, that's it exactly.
I crawled a link with these entities (&nbsp; &rsaquo;) in each list item;
Solr couldn't handle it, threw the XML parse error, and the crawler terminated
the job.
Is this fixable? Or do I have to submit a bug to the tika folks?
Thanks,
--
View this message in context:
Use IndexBasedSpellChecker instead of DirectSolrSpellChecker if you need more
than 2 edits. You may need to set the accuracy parameter lower than the
default of 0.5.
Keep in mind that while this might get the correct responses for your test
cases, in the wild your users might find their
Hi all,
I am using the Solr DirectSolrSpellChecker for spell suggestions, using raw
analysis for indexing, but I have some fields which have single characters
like l and L, so they are indexed in the dictionary, and when I use this for
suggestions for a query like delll it suggests de and l l l as the
I assume if your user queries delll and it breaks it into pieces like de l l
l, then you're probably using WordBreakSolrSpellChecker in addition to
DirectSolrSpellChecker, right? If so, then you can specify minBreakLength in
solrconfig.xml like this:
searchComponent name=spellcheck
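The configuration fragment above is cut off in this archive; a sketch of what such a wordbreak checker might look like (the dictionary field name and values are assumptions, not from the original message):

```xml
<!-- Sketch: a WordBreakSolrSpellChecker registered alongside the
     direct checker. "spell" is a hypothetical dictionary field;
     minBreakLength=3 prevents single-character fragments like "l"
     from being offered as break suggestions. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">spell</str>
    <str name="minBreakLength">3</str>
  </lst>
</searchComponent>
```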
We are still in the testing phase for 4.2. A new server was built and the
latest tomcat, java and solr were installed. The schema file was copied
over from the old and then customized as follows.
Schema Changes
We changed all float field types to tfloat. The solrqueryparser default
operator is
On 4/4/2013 12:34 AM, Dotan Cohen wrote:
In the case of maxWarmingSearchers, I would hope that you have your
system set up so that you would never need more than 1 warming searcher
at a time. If you do a commit while a previous commit is still warming,
Solr will try to create a second warming
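For reference, that limit lives in solrconfig.xml; a minimal sketch (the value shown is the one shipped in the example config of that era):

```xml
<!-- Keeping this low surfaces overlapping-commit problems early
     instead of letting warming searchers pile up and eat memory. -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```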
I've been away from Tika for awhile, so I'm not sure. This might also be an
issue of Tika using a strict XML parser for HTML rather than a looser and
more error-tolerant HTML-specific parser, like most browsers use, that
allows these kinds of technical errors that in reality, in most cases, can
I had read somewhere that text fields by default were compressed in 4.2.1,
is this the case? If not how do I enable compression of stored text fields?
On Thu, Apr 4, 2013 at 7:41 PM, Jamie Johnson jej2...@gmail.com wrote:
I had read somewhere that text fields by default were compressed in 4.2.1,
is this the case? If not how do I enable compression of stored text fields?
Compressed stored fields are the default since 4.1
-Yonik
I found the problem. The values that we have for cat-path include the
special character /. This was not a special character in pre 4.0
releases. That explains why it worked in my previous version but not in
4.2.
Pre 4.0
Lucene supports escaping special characters that are part of the query
Hi, I saw some similar problems in other threads, but I think this one is a
little different, and I couldn't find any solution. I get the exception:
org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token
eightysix exceeds length of provided text sized 80. This happens for example
when I
- Is dataimport.properties ever written to the filesystem? (Trying to
determine if I have a permissions error because I don't see it anywhere
on disk).
- How do you manually edit dataimport.properties? My system is
periodically pulling in new data. If that process has issues, I want to
be able
We need to also track the size of the response (the size in bytes of the
whole XML response that is streamed, with stored fields and all). I was a
bit worried because I am wondering if a search component will actually have
access to the response bytes...
== Can't you get this from your container