Hi,
I understand that setting omitTermFreqAndPositions=true for a field in
schema.xml stores less information in the index with some restrictions e.g.
phrase search.
But does setting this property to true for a field which is of type
string, int or is analyzed by KeywordAnalyzer make any
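
For reference, a minimal sketch of how the attribute is declared on a field in schema.xml. The field name `part_id` is hypothetical and does not come from the original message.

```xml
<!-- schema.xml: hypothetical string field that omits term frequencies and positions -->
<field name="part_id" type="string" indexed="true" stored="true"
       omitTermFreqAndPositions="true"/>
```

As the message notes, the trade-off is a smaller index in exchange for restrictions such as phrase search on that field.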
Hello everyone,
I have a question on how to update index using xml messages when there are
some complex custom field types in my index...like:
<fieldtype name="offer" class="com.xxx.OfferField"/>
And field offer has some attributes in it...
I've read the page http://wiki.apache.org/solr/UpdateXmlMessages
how about using regular expressions:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternReplaceCharFilterFactory
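
A rough sketch of what the regular-expression suggestion could look like in schema.xml. The field type name and both part numbers are placeholders, not values from the thread.

```xml
<!-- schema.xml: hypothetical field type; OLD-1234 and NEW-5678 are placeholder part numbers -->
<fieldType name="part_number" class="solr.TextField">
  <analyzer>
    <!-- rewrite the old part number before tokenization -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="OLD-1234" replacement="NEW-5678"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Note that a char filter only changes the analyzed (indexed and queried) text. The stored value returned in results stays as it was sent to Solr.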
On Tue, Jan 10, 2012 at 1:14 AM, geeky2 gee...@hotmail.com wrote:
Hello all,
i have been reading the solr book as well as searching the archives of this
On Tue, Jan 10, 2012 at 4:44 AM, geeky2 gee...@hotmail.com wrote:
[...]
i have a database with approximately 7Million rows that i am bringing in to
solr.
for a very small sub-set of these 7Million rows (about 130 rows), i need to
substitute an old part number for a new part number. i know
Hi, I'm having issues using the new way of faceting dates with the Query
Facets.
The issue is that it is returning wrong counts. I tested it using a Date Facet
instead and the Date one did return correct counts. I'm using the Sunspot RSolr
client and I'm also using the new folding/group feature.
I think I solved it... It seems to be because of the - that's just before the
query facet name
--
Mauro Asprea
E-Mail: mauroasp...@gmail.com
Mobile: +34 654297582
Skype: mauro.asprea
On Tuesday, January 10, 2012 at 11:33 AM, Mauro Asprea wrote:
Hi, I'm having issues using the new way
In my case the cores are populated with different records that adhere to the
same schema. The question about randomly distributing requests is because each
core has the `shards` parameter populated so that it can hit the other core's
indexes.
My question is more about the advantages (if any)
Hello,
I sent some data into the solr/lucene index but when I query
the data I see weird results.
There are documents with identical id fields but they have different
hash values.
Apart from the hash values the results are the same.
I thought it was impossible to have documents with same
Thank you for your patience and assistance. XML is not my forte, but layoffs
and attrition have reduced IT staff well below minimum functional levels here.
Thanks to your help, the exact title matches have made it to the first page of
results.
Robert McCarroll
Systems Administration
NYS
On a very quick glance, it looks like the source is at:
./lucene/contrib/analyzers/common/src/java/org/tartarus/snowball
and from there just compile Lucene and/or Solr as you normally would.
See: http://wiki.apache.org/solr/HowToContribute
Best
Erick
On Mon, Jan 9, 2012 at 2:13 PM,
Hello again,
Well, after further review the IDs are different. The difference was
just so small I missed it after staring at it for a few hours.
BR,
Lauri
On 01/10/2012 02:20 PM, Hyttinen Lauri wrote:
Hello,
I sent some data into the solr/lucene index but when I query
the data I see weird
I am not able to do highlighting with surround query parser
on the returned
results.
I have tried the highlighting component but it does not
return highlighted
results.
Highlighter does not recognize Surround Query. It must be re-written to enable
highlighting in
thank you both for the information.
Gora, when you mentioned:
- For keeping both values, use synonyms.
what did you mean exactly?
mark
--
View this message in context:
http://lucene.472066.n3.nabble.com/best-way-to-force-substitutions-in-data-tp3646195p3647920.html
Sent from the Solr -
Hi Erick,
I changed all my url fields into text (they were string fields before), and
added a WordDelimiterFilterFactory, so that url fields can be tokenized
into several words. But I still got around 15 seconds response time
measured using debugQuery=on, and most of the time is still spent on
On Tue, Jan 10, 2012 at 9:04 PM, geeky2 gee...@hotmail.com wrote:
thank you both for the information.
Gora, when you mentioned:
- For keeping both values, use synonyms.
what did you mean exactly?
[...]
Please take a look at
I have no idea what you mean by different hash, and you
haven't provided much information to go on here.
What is your evidence that the document is in the index
twice? If you're inspecting the index at a low level
that's expected, since documents are just marked
as deleted not immediately removed
On 1/9/2012 5:15 PM, Hector Castro wrote:
Hi,
Has anyone had success with multicore single node Solr configurations that have
one core acting solely as a dispatcher for the other cores? For example, say
you had 4 populated Solr cores – configure a 5th to be the definitive endpoint
with
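
One way the dispatcher idea is sometimes set up is a dedicated request handler whose defaults carry the `shards` list. This is only a sketch: the handler name, host, port and core names are assumptions, not details from the original message.

```xml
<!-- solrconfig.xml of the dispatcher core: hypothetical handler and core names -->
<requestHandler name="/dispatch" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- comma-separated list of the populated cores to fan out to -->
    <str name="shards">localhost:8983/solr/core1,localhost:8983/solr/core2</str>
  </lst>
</requestHandler>
```

Keeping the `shards` default on a separate handler, rather than on the handler the shards themselves answer on, avoids sub-requests fanning out again.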
I see a missing required title field for every document when I'm using DIH.
Yes, these documents have titles in the database. Is there a way to see what
exact queries are sent to MySQL or received by MySQL?
Here is a relevant chunk of the dataConfig:
<entity name="book" query="select * from
just a guess but this might need to change from
${biblio.id}
to
${book.id}
Since the entity name is book instead of biblio
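
To illustrate the point, a minimal sketch of a nested DIH config where the child entity refers to the parent row by the parent's `name`. The table and column names here are made up for the example.

```xml
<!-- data-config.xml: hypothetical tables and columns -->
<document>
  <entity name="book" query="select id, title from book">
    <!-- ${book.id} resolves against the parent entity named "book" -->
    <entity name="author"
            query="select name from author where book_id='${book.id}'"/>
  </entity>
</document>
```

If the parent were declared with name="biblio" instead, the variable would have to be ${biblio.id}.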
On 1/10/12 10:37 AM, Walter Underwood wrote:
I see a missing required title field for every document when I'm using DIH.
Yes, these documents have titles in the
Thanks! That looks like it fixed the problem. This list continues to be awesome.
Is the function of the name attribute actually described in the docs? I could
not figure out what it was for.
wunder
On Jan 10, 2012, at 10:41 AM, dan whelan wrote:
just a guess but this might need to change
On Wed, Jan 11, 2012 at 12:37 AM, Walter Underwood
wun...@wunderwood.org wrote:
Thanks! That looks like it fixed the problem. This list continues to be
awesome.
Is the function of the name attribute actually described in the docs? I could
not figure out what it was for.
Yes, it is, though
Right, but that says exactly nothing about how that identifier is used. --wunder
On Jan 10, 2012, at 11:23 AM, Gora Mohanty wrote:
On Wed, Jan 11, 2012 at 12:37 AM, Walter Underwood
wun...@wunderwood.org wrote:
Thanks! That looks like it fixed the problem. This list continues to be
awesome.
I am trying to get the IndexBasedSpellChecker to work. I believe I have
everything setup properly and the spellcheck component seems to be running
but the suggestions list is empty.
I am using SOLR 3.5 with Jetty.
My solrconfig.xml and schema.xml are as follows:
solrconfig.xml:
Three things to check:
1. Use a higher spellcheck.count than 1. Try 10. IndexBasedSpellChecker
pre-filters the possibilities in a first pass of a 2-pass process. If
spellcheck.count is too low, all the good suggestions might get filtered on the
first pass and then it won't find anything on
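
For the first point, a sketch of raising spellcheck.count through handler defaults in solrconfig.xml. The handler name is an assumption, and the same parameter can simply be passed on the request URL instead.

```xml
<!-- solrconfig.xml: hypothetical handler name -->
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- ask for more candidates than the single one requested before -->
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```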
my copyField was defined as copyfield --- notice the lowercase f
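
For anyone hitting the same thing, the element name is case-sensitive. A sketch with hypothetical field names:

```xml
<!-- schema.xml: "title" and "spell" are placeholder field names -->
<!-- correct: capital F in copyField -->
<copyField source="title" dest="spell"/>
```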
On Tue, Jan 10, 2012 at 2:50 PM, Dyer, James james.d...@ingrambook.com wrote:
Three things to check:
1. Use a higher spellcheck.count than 1. Try 10. IndexBasedSpellChecker
pre-filters the possibilities in a first pass
We've had some issues with people searching for a document with the
search term '200 movies'. The document is actually titled 'two hundred
movies'.
Do we need to add every number to our synonyms dictionary to
accomplish this? Is it best done at index or search time?
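
For context, the synonym approach being asked about would look roughly like this. It is a sketch only: the field type name is hypothetical, and the synonyms.txt entry shown in the comment is just the one example from the question.

```xml
<!-- schema.xml: hypothetical field type applying synonyms at index time -->
<!-- synonyms.txt would contain a line such as:  200, two hundred -->
<fieldType name="text_numbers" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The synonym filter is placed only in the index analyzer here because multi-word synonyms such as "two hundred" tend to behave poorly when expanded at query time.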
Thanks for your reply.
I added the argument in the solrconfig.xml and it worked like a charm.
Thanks again
Minh
-----Original Message-----
From: Koji Sekiguchi [mailto:k...@r.email.ne.jp]
Sent: Tuesday, January 10, 2012 01:25
To: solr-user@lucene.apache.org
Subject: Re: ignoreTikaException value
On Tue, Jan 10, 2012 at 5:32 PM, Tanner Postert tanner.post...@gmail.com wrote:
We've had some issues with people searching for a document with the
search term '200 movies'. The document is actually titled 'two hundred
movies'.
Do we need to add every number to our synonyms dictionary to
You mention that is one way to do it; is there another I'm not seeing?
On Jan 10, 2012, at 4:34 PM, Ted Dunning ted.dunn...@gmail.com wrote:
On Tue, Jan 10, 2012 at 5:32 PM, Tanner Postert
tanner.post...@gmail.com wrote:
We've had some issues with people searching for a document with the
I was afraid you would say that.
See http://fora.tv/2009/10/14/ACM_Data_Mining_SIG_Ted_Dunning#fullprogram,
click on the Recommendations section to skip to the good part.
The point is that cross recommendation can let you learn what sorts of
rewrites of this kind are needed. The idea is that
Hi Tanner,
Here is another simple way: AutoComplete.
You know what your users are searching for, you can identify top queries and
you can identify common queries that are not finding matches. This all allows
you to figure out what to feed in AutoComplete. And hopefully your
AutoComplete
It's a bit of a privacy through obscurity measure, unfortunately. The
problem is that American courts do a lousy job of removing social
security numbers from cases that I put on my site. I do anonymization
before sending the cases to Solr, but if you're clever (and the
stopwords weren't in
Straying a bit from the subject,
don't you think it would be useful to have the shards parameter used also in
the index, in order to maintain document uniqueness?
I mean as an out of the box feature of Solr.
Because the situation today is that a Solr client working with a sharded
Solr is
Hi, I'm having some issues trying to sort my grouped results by more than one
field. If I use just one, regardless of which one, it works fine (I
mean it sorts).
I have a case where the first sorting key is equal for all the head docs of each
group, so I expect to return the groups
Hi,
I didn't hear any responses here, so I went ahead and made a bunch of
changes to the highlighting parameters wiki:
- Highlighter is now known as Original Highlighter so it's more clear
that Highlighter doesn't just refer to the highlighting utilities generally.
- I need help with