Hi all,
I need to tokenize my field on whitespace, HTML, punctuation, and apostrophes,
but if I use HTMLStripStandardTokenizerFactory it strips only HTML and not
apostrophes.
If I use PatternTokenizerFactory I don't know if I can create a pattern to
tokenize on all of these characters... (html,
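As a rough schema.xml sketch (the field type name and regex below are only
illustrative assumptions, not a tested recommendation), PatternTokenizerFactory
splits on whatever the pattern matches; HTML would still need to be stripped
separately:

  <fieldType name="text_split" class="solr.TextField">
    <analyzer>
      <!-- split on runs of whitespace, punctuation, and apostrophes -->
      <tokenizer class="solr.PatternTokenizerFactory" pattern="[\s\p{Punct}']+"/>
    </analyzer>
  </fieldType>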
Hi,
I would like to know the difference between q=text:+toto and q=toto.
/select?fl=*&qt=dismax&q=text:+toto : 4 docs found.
<lst name="params">
  <str name="fl">*</str>
  <str name="q">text: toto</str>
  <str name="qt">dismax</str>
/select?fl=*&qt=dismax&q=toto : 5682 docs found.
<lst name="params">
  <str name="fl">*</str>
  <str
We have a requirement for a keyword search in one of our projects and we are
using Solr/Lucene for the same.
We have the data: link_id, title, url, and a collection of keywords
associated with a link_id. Right now we have indexed link_id, title, url and
keywords (multivalued field) in a single
Hi,
I have a request handler in my solrconfig.xml : /spellCheckCompRH
It utilizes the search component spellcheck.
When I specify the following query in the browser, I get correct spelling
suggestions from the file dictionary.
http://localhost:8080/solr/spellCheckCompRH/?q=SolrDocs&spellcheck.q=rel
dismax doesn't support field selection in its query syntax, only via
the qf parameter.
Add debugQuery=true to see how the queries are being parsed; that'll
reveal what is going on.
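For example (using the field and term from this thread), select the field via
qf instead of putting it in q:

  /select?qt=dismax&q=toto&qf=text&debugQuery=true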
Erik
On Dec 10, 2008, at 5:07 AM, sunnyfr wrote:
Hi,
I would like to know the difference
On Wed, Dec 10, 2008 at 4:23 PM, Marc Sturlese [EMAIL PROTECTED] wrote:
Is there any way to start Solr with the index folder empty without getting an
error? What I would like to do is start with the empty folder, do a full
import (which would create the index from 0) and from there keep
On Wed, Dec 10, 2008 at 5:54 PM, ayyanar
[EMAIL PROTECTED] wrote:
Also, in our requirement each keyword value has a weight associated with it,
and this weight is calculated based on certain factors (e.g. if the keyword
exists in the title then it takes a specific weight, etc.). This weight should
The first output is from the query component. You might just need to
make the collapse component first and remove the query component
completely.
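A rough solrconfig.xml sketch of that ordering (the component name "collapse"
is an assumption here; use whatever name the SOLR-236 patch actually registers):

  <requestHandler name="/collapse" class="solr.SearchHandler">
    <arr name="components">
      <str>collapse</str>   <!-- in place of the default query component -->
      <str>facet</str>
      <str>highlight</str>
      <str>debug</str>
    </arr>
  </requestHandler>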
We perform geographic searching with localsolr first (if we need to),
and then try to collapse those results (if collapse=true). If we
don't
On Wed, Dec 10, 2008 at 5:12 PM, sunnyfr [EMAIL PROTECTED] wrote:
When I look for this expression it stops the search at the &, taking
that for a parameter I guess.
You will need to URL encode the query parameter before you make the request.
URLEncoder.encode("tom & jerry", "UTF-8");
If you
Thanks, it did work.
Shalin Shekhar Mangar wrote:
On Wed, Dec 10, 2008 at 4:23 PM, Marc Sturlese
[EMAIL PROTECTED] wrote:
Is there any way to start Solr with the index folder empty without getting an
error? What I would like to do is start with the empty folder, do a full
import
Hi All,
Issue: Need to fetch the data available in different core folders.
Scenario:
We are storing the information in different core folders specific to website
ids (such as CoreUSA, CoreUK, CoreIndia ...). Thus information specific to any
region gets stored in a specific core folder, e.g. for
On Wed, Dec 10, 2008 at 5:19 PM, payalsharma [EMAIL PROTECTED] wrote:
We are storing the information in different core folders specific to
website
ids (such as CoreUSA, CoreUK, CoreIndia ...). Thus information specific to any
region gets stored in a specific core folder, e.g. for India-specific
Thanks Erik,
Have a good day,
Erik Hatcher wrote:
dismax doesn't support field selection in its query syntax, only via
the qf parameter.
Add debugQuery=true to see how the queries are being parsed; that'll
reveal what is going on.
Erik
On Dec 10, 2008, at 5:07 AM,
Hi all,
I want to index rich text documents like .doc, .xls and .ppt files. I applied
the patch for updating rich documents by following the instructions
at the URL below. http://wiki.apache.org/solr/UpdateRichDocuments
When I index a doc file, I'm getting the following error in the
payalsharma wrote:
Hi All,
Issue: Need to fetch the data available in different core folders.
Scenario:
We are storing the information in different core folders specific to website
ids (such as CoreUSA, CoreUK, CoreIndia ...). Thus information specific to any
region gets stored in a specific core
Hi,
Will you please explain what exactly you mean by "distributed search over
the cores"? Please provide some context around this.
Thanks
markrmiller wrote:
payalsharma wrote:
Hi All,
Issue: Need to fetch the data available in different core folders.
Scenario:
We are storing the
I noticed that you are using the same rsyncd port for both cores. Do you
have a scripts.conf for each core?
Bill
On Tue, Dec 9, 2008 at 11:40 PM, Kashyap, Raghu [EMAIL PROTECTED] wrote:
Hi,
We are seeing a strange behavior with snappuller.
We have 2 cores, Hotel and Location.
Here are
Bill,
Yes, I do have scripts.conf for each core. However, all the options
needed for snappuller are specified on the command line itself (-D, -S,
etc...)
-Raghu
-Original Message-
From: Bill Au [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 10, 2008 9:17 AM
To:
-Original Message-
From: payalsharma [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 10, 2008 9:11 AM
To: solr-user@lucene.apache.org
Subject: Re: Can we extract contents from two Core folders
Hi,
Will you please explain what exactly you mean by :
Distributed search over the cores.
On Dec 10, 2008, at 9:58 AM, sunnyfr wrote:
Second question: if I want to weight status_official:true^2, should I do it
this way for weighting the true one? Thanks.
/select?fl=*&qt=dismax&q=+tom+jerry+cartoontv&qf=status_official^2.5+owner_login^10+title^3&debugQuery=true
Use bq
Yes, but when I check the debug output, there is no weight applied for it.
/select?fl=*&qt=dismax&q=+tom+jerry+cartoontv&bq=status_official:true^12&qf=owner_login^10+title^3&debugQuery=true
And it's as if it doesn't weight my word cartoontv either? OK, maybe the doc
which contains these three words is not
Try using the -d option with snappuller so you can specify the
path to the directory holding the index data on the local machine.
Doug
On Dec 10, 2008, at 10:20 AM, Kashyap, Raghu wrote:
Bill,
Yes I do have scripts.conf for each core. However, all the options
needed for snappuller is
Inline below...
Also, though, you should note that the /spellCheckCompRH that is
packaged with the example is not necessarily the best way to actually
use the SpellCheckComponent. It is intended to be used as a
component in whatever your MAIN Request Handler is; it merely shows
the how
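As an illustrative sketch (the handler name is just an example, and the
dictionary name follows the "file dictionary" mentioned earlier in this
thread), the component is wired into your main handler via last-components:

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="spellcheck.dictionary">file</str>
    </lst>
    <!-- spellcheck runs after the normal query/facet components -->
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>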
OK, I think the problem is what Bill mentioned earlier. The rsync port
was the same for both cores, due to which it was copying the same
snapshot for both cores.
Thanks for all the help.
-Raghu
-Original Message-
From: Kashyap, Raghu [mailto:[EMAIL PROTECTED]
Sent: Wednesday,
Hi,
There is a ClassNotFound exception in there. Make sure you rebuild the war,
completely remove the old one, and properly deploy the new one. Peek into the
war and look for the class that the error below says is missing, to make sure
the class is really there. Get the latest code for
Hi Raghav,
Recently, integration with Tika was completed for SOLR-284 and it is
now committed on the trunk (but does not use the old
RichDocumentHandler approach). See http://wiki.apache.org/solr/ExtractingRequestHandler
for how to use and configure.
Otherwise, it looks to me like the
Hi All,
I'm curious about what people have done with dates.
We Require:
1. multiple granularities to query and facet on: by year, by
year/month, by year/month/day
2. sortability: sort/order by date
3. time typically isn't important to us
4. some of these items don't have a day or
Hi -
I am a new user of the Solr tool and came across the introductory
tutorial here: http://lucene.apache.org/solr/tutorial.html
I am planning to use Solr in one of my projects. I see that the
tutorial mentions a REST API/interface to add documents and to
query them.
I would like
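For context, and as the tutorial describes (the field values below are just
placeholders), documents are added by POSTing an XML update message to the
/update handler, and queries go over plain HTTP:

  <add>
    <doc>
      <field name="id">EXAMPLE-1</field>
      <field name="name">example document</field>
    </doc>
  </add>

  http://localhost:8983/solr/select?q=name:example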
Tricia,
I think you might have missed the key nugget at the bottom of
http://wiki.apache.org/jakarta-lucene/DateRangeQueries
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Tricia Williams [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Hi,
I have not worked with a 50-node Solr cluster, but I've worked with pure Lucene
clusters of that size with very high query and data volumes. I don't imagine a
distributed search involving 50 nodes will be a problem for Solr. As for handling
query slave failures, I'm sure you'll want to involve a LB
For a similar idea, check:
https://issues.apache.org/jira/browse/SOLR-906
This opens a single stream and writes all documents to that. It could
easily be extended to have multiple threads draining the same Queue
On Dec 9, 2008, at 4:02 AM, Noble Paul നോബിള്
नोब्ळ् wrote:
I guess this
Hi Otis,
Absolutely, I missed that nugget. I didn't think of using prefix
filters/queries. This works really well with how we had already stored
dates as YYYYMMDD strings. Thanks for pointing me in the right direction.
Tricia
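For illustration (the field name is hypothetical, assuming dates stored as
YYYYMMDD strings), prefix queries then cover the coarser granularities:

  date_str:2008*      (all of 2008)
  date_str:200812*    (December 2008)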
Otis Gospodnetic wrote:
Tricia,
I think you might have
I'm trying to see if anyone has any recommendations on the maximum number of
cores that should be used within Solr. Is there significant overhead to each
core? Should it be 10 or fewer, or are 100 or 1,000 cores acceptable?
Thanks,
Ryan
it depends!
yes there is overhead to each core -- how much it matters will depend
entirely on your setup and typical usage pattern.
sorry this is not a particularly useful answer.
I think the choice of how many cores will come down to your domain
logic needs more than hardware. If you
: I want to index a field with an array of arrays, is that possible in Solr?
Not out of the box ... you can implement custom FieldTypes that store any
data you want using a byte[], but you'd still need to do some tricks
with your FieldType to get the ResponseWriter to write it out in a
We are considering a migration to SOLR from a home-grown Lucene solution.
Currently we have 27,000 separate Lucene indexes that are separated based on
business logic. Collectively the indexes are about 1.5 terabytes in size. We
have some very small indexes and some that are quite large (up to
: This is really cool. U... How does it integrate with the Data Import
: Handler?
my DIH knowledge is extremely limited, but i'm guessing approach #1 is
trivial (there is an easy way to concat DB values to build up solr field
values right?); approach #2 would probably be possible using
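As a sketch of the "concat DB values" part (the entity, column, and field
names below are made up), DIH's TemplateTransformer can build one Solr field
from several database columns:

  <entity name="person" transformer="TemplateTransformer"
          query="select id, first_name, last_name from people">
    <!-- concatenate two DB columns into one Solr field -->
    <field column="full_name" template="${person.first_name} ${person.last_name}"/>
  </entity>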
Hi,
I am a new Solr user.
I have an application in which I would like to show results, but one
result may be part of a larger set of results. So for example,
result #1 might also have 10 other results that are part of the same
data set.
Hopefully this makes sense.
What I would like to
I am curious as to whether there is a solution to be able to replicate
solrconfig.xml with the 1.4 replication. The obvious problem is that the
master would replicate the solrconfig turning all slaves into masters with
its config. I have also tried on a whim to configure the master and slave
on
Hi John,
What is your process for determining that #1 is part of the other
result set? My gut says this is a faceting problem, i.e. #1 has a
field containing its category that is also shared by the 10 other
results, and that all you need to do is facet on the category field.
The other
Grant,
Basically I have created a text field that has the grouping value.
All of the records would have the same value in this text field. This
is accomplished with some pre-processing when I capture the data, but
before it is submitted to the index.
-John
On Dec 10, 2008, at 8:46
Grant,
For the "more like this" that would show the grouped results (once you
have clicked on the item, so basically making another query), would it
show a count of the "more like this" results?
Something like cxxc and a collection of 10 other items.
-John
On Dec 10, 2008, at 8:46 PM, Grant
Hi All,
I am trying to use the ord() function query on created_date. I am
concerned about the warning on ord() behaviour, as it uses the actual entry
creation order in the index instead of the created_date value.
Will all entries created initially with different created_date values have the
same or nearly the same ordinal
1) Our limit is: how big a file do we want to copy around?
We switched to multiple indexes because of the logistics of
replicating/backing up giant Lucene index files.
2) Searching takes a little memory, sorting takes a lot of memory, and
faceting eats like a black hole.
There is an
Hi John,
This sounds a lot like field collapsing functionality that a few people are
working on in SOLR-236:
https://issues.apache.org/jira/browse/SOLR-236
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: John Martyniak [EMAIL PROTECTED]
Jeff,
Are you using Solr 1.3 replication scripts? If so, I think it would be pretty
simple to:
1) put all additional files to replicate to slaves in a specific location (or
use a special naming scheme) on the master
2) write another script that uses scp or rsync to look for those additional
Hey folks,
I'm looking at implementing ExtractingRequestHandler in the Apache_Solr_PHP
library, and I'm wondering what we can do about adding meta-data.
I saw the docs, which suggest you use different POST headers to pass field
values along with ext.literal. Is there any way to use the
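For reference, and assuming the handler is registered at /update/extract as in
the wiki example, the literal field values are passed as plain request
parameters (the field names below are hypothetical):

  /update/extract?ext.literal.id=doc123&ext.literal.category=news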
Otis,
Thanks for the information. It looks like the field collapsing is
similar to what I am looking for. But is that in the current release? Is
it stable?
Is there any way to do it in Solr 1.3?
-John
On Dec 10, 2008, at 9:59 PM, Otis Gospodnetic wrote:
Hi John,
This sounds a lot like
Hi Grant,
Thanks for the help.
So now I can have multiple components configured as last-components
of the standard request handler.
Best Regards,
Mukta
-Original Message-
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 10, 2008 9:25 PM
To:
On Thu, Dec 11, 2008 at 4:41 AM, Chris Hostetter
[EMAIL PROTECTED] wrote:
: This is really cool. U... How does it integrate with the Data Import
: Handler?
my DIH knowledge is extremely limited, but i'm guessing approach #1 is
trivial (there is an easy way to concat DB values to build up
This is a known issue and I was planning to take it up soon.
https://issues.apache.org/jira/browse/SOLR-821
On Thu, Dec 11, 2008 at 5:30 AM, Jeff Newburn [EMAIL PROTECTED] wrote:
I am curious as to whether there is a solution to be able to replicate
solrconfig.xml with the 1.4 replication.
On Wed, Dec 10, 2008 at 11:00 PM, Rakesh Sinha [EMAIL PROTECTED] wrote:
Hi -
I am a new user of Solr tool and came across the introductory
tutorial here - http://lucene.apache.org/solr/tutorial.html .
I am planning to use Solr in one of my projects . I see that the
tutorial mentions about
I am trying to configure JBoss with Solr.
As stated in the wiki docs I copied the solr.war, but there is no web-apps
folder currently present in JBoss.
So should I create web-apps manually and paste the war file there?
I tried configuring Solr with Tomcat as well. I pasted the war file in
Hi John,
It's not in the current release, but the chances are it will make it into 1.4.
You can try one of the recent patches and apply it to your Solr 1.3 sources.
Check the list archives for more discussion; this field collapsing was just
discussed again today/yesterday. markmail.org is a
On Thu, Dec 11, 2008 at 11:21 AM, Neha Bhardwaj
[EMAIL PROTECTED] wrote:
I am trying to configure JBoss with Solr.
As stated in the wiki docs I copied the solr.war, but there is no web-apps
folder currently present in JBoss.
So should I create web-apps manually and paste the war file there?
Hi,
Does anyone know how to make sure minimum match (mm) in dismax is working? I
change the values and try doing solrCtl restart indexname, but I don't see it
taking effect. Does anybody have an idea on this?
thank you
vinay
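For what it's worth, a minimal sketch of setting mm in the dismax handler's
defaults in solrconfig.xml (the handler name and value are just examples; note
that < must be escaped as &lt; in the XML, and the core needs a restart or
reload to pick up the change):

  <requestHandler name="dismax" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <!-- require all terms for 1-2 term queries, 75% of terms above that -->
      <str name="mm">2&lt;75%</str>
    </lst>
  </requestHandler>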
I read many articles on boosting but I am still not so clear on boosting. Can
anyone explain the following questions with examples?
1) Can you give an example of field-level boosting and document-level
boosting and the difference between the two?
2) If we set the boost at field level (index time),
On Thu, Dec 11, 2008 at 6:49 AM, ayyanar
[EMAIL PROTECTED] wrote:
1) Can you give an example of field-level boosting and document-level
boosting and the difference between the two?
Field-level boosting is used when one field is considered more or less
important than another. For example, you may
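For a concrete sketch (the values are placeholders), both kinds of boost can
appear in the XML update message:

  <add>
    <!-- boost on <doc> = document-level boost: the whole document scores higher -->
    <doc boost="2.0">
      <!-- boost on <field> = field-level (index-time) boost for matches on this field -->
      <field name="title" boost="3.0">example title</field>
      <field name="body">example body text</field>
    </doc>
  </add>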