Hi,
nutch search results provide a link for getting the cached document copy.
It fetches the raw content from segments based on document id. {cached.jsp}
Is it possible to have similar functionality in Solr? What can be done to
achieve this? Any pointers?
Thanks,
Ram
Am 07.01.2010 um 00:07 schrieb Turner, Robbin J:
I've been doing a bunch of googling and haven't seen whether there is a parameter
to set within Tomcat other than solr/home, which is set up in the solr.xml
under $CATALINA_HOME/conf/Catalina/localhost/.
Hi.
We set this in solr.xml
Hi guys, I am getting started with Solr.
When I search a collection of data, I care about both the document score
(relevance to the user's query words) and the document's publishTime (which is
another field in each document).
If I simply sort the matching document set by the publishTime field, then the score
is
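A common way to combine relevance with recency in Solr 1.4, assuming publishTime is indexed as a date field (the query term and parameter values below are only illustrative):

```text
# Relevance first, recency as the tie-breaker:
q=foo&sort=score desc,publishTime desc

# Or fold recency into the score itself with a dismax function boost:
q=foo&defType=dismax&bf=recip(ms(NOW,publishTime),3.16e-11,1,1)
```

With recip(ms(NOW,publishTime),3.16e-11,1,1), newer documents get a boost that decays smoothly with age (3.16e-11 is roughly one over the number of milliseconds in a year).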
It wouldn't be q.alt though, just q, in the config file.
q.alt is typically *:*; it's the fallback query when no q is provided.
Though, thinking about it, q.alt would work here, but I'd use q
personally.
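To make that concrete, a default q can go in the handler's defaults section of solrconfig.xml (shown here on the standard handler; adapt to your own):

```xml
<requestHandler name="standard" class="solr.StandardRequestHandler">
  <lst name="defaults">
    <!-- used only when the request supplies no q at all -->
    <str name="q">*:*</str>
  </lst>
</requestHandler>
```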
On Jan 6, 2010, at 9:45 PM, Andy wrote:
Let me make sure I understand you.
I'd
nutch search results provide a link for getting the cached document copy.
It fetches the raw content from segments based on document id. {cached.jsp}
Is it possible to have similar functionality in solr, what can be done to
achieve this? Any pointers.
I could retrieve the content using the text
2010/1/7 Wangsheng Mei hairr...@gmail.com
Hi, guys, I am getting started with solr.
when I search a collection of data, I care both the document
score(relevance
towards user query word) and document publishTime(which is another field in
each of the document).
If I simply sort matching
actually it does not.
BTW, FYI: backup just takes periodic backups; it is not necessary for
the ReplicationHandler to work.
On Thu, Jan 7, 2010 at 2:37 AM, Giovanni Fernandez-Kincade
gfernandez-kinc...@capitaliq.com wrote:
How can you tell when the backup is done?
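With the Solr 1.4 ReplicationHandler, a backup is triggered over HTTP; as far as I know there is no explicit completion callback, but the finished snapshot shows up as a timestamped directory under the data dir (hostname/port below are placeholders):

```text
# trigger a backup on the master
http://master:8983/solr/replication?command=backup

# when it completes, a directory like data/snapshot.20100107123456 appears
```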
Hi All,
I have a document indexed in Solr, which is as follows:

<doc>
  <str name="id">P-E-HE-Philips-32PFL5409-98-Black-32</str>
  <arr name="keywords">
    <str>Philips</str>
    <str>LCD TVs</str>
  </arr>
  <str name="title">Philips 32PFL5409-98 32 LCD TV with Pixel Plus HD (Black, 32)</str>
</doc>

Now when I search for lcd tvs, I
I am trying to use Solr's CSV updater to index the data. I am trying to
specify the .Dat format, consisting of a field separator, a text qualifier, and a
line separator.
For example:
field 1 <field separator> field 2 <field separator>
<text qualifier>value for field 1<text qualifier><field separator><text
This is exactly what I need, I really appreciate it.
2010/1/7 Shalin Shekhar Mangar shalinman...@gmail.com
2010/1/7 Wangsheng Mei hairr...@gmail.com
Hi, guys, I am getting started with solr.
when I search a collection of data, I care both the document
score(relevance
towards user query
Hi,
I'm trying to highlight short text values. The field they come from has
a type shared with other fields. I have highlighting working on other
fields but not on this one.
Why?
How are these fields defined in your schema.xml? Note
that String types are indexed without tokenization, so
if str is defined as a String field type, that may be
part of your problem (try text type if so).
If this is irrelevant, please show us the relevant parts
of your schema and the query
It's really hard to provide any response with so little information,
could you show us the difference between a field that works
and one that doesn't? Especially the relevant schema.xml entries
and the query that fails to highlight
Erick
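For illustration, the kind of schema difference Erick is describing might look like this (field names invented):

```xml
<!-- "string" is indexed as a single untokenized term, so highlighting
     individual query words against it usually fails -->
<field name="label_exact" type="string" indexed="true" stored="true"/>

<!-- "text" is tokenized, so hl.fl=label can highlight the matched words -->
<field name="label" type="text" indexed="true" stored="true"/>
```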
On Thu, Jan 7, 2010 at 7:47 AM, Xavier Schepler
Erick Erickson a écrit :
It's really hard to provide any response with so little information,
could you show us the difference between a field that works
and one that doesn't? Especially the relevant schema.xml entries
and the query that fails to highlight
Erick
On Thu, Jan 7, 2010 at 7:47
Hey,
I'm doing a query which involves using an frange in the filter query, and I
was wondering if there is a way of combining the frange with other parameters.
Something like ({!frange l=x u=y}*do stuff*) AND *field:param*, but
obviously this doesn't work. Is there a way of doing this?
—Oliver
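One workaround (field and function names below are placeholders): since every fq parameter independently restricts the result set, the frange and the normal clause can be sent as two separate filters, which is effectively an AND:

```text
q=*:*&fq={!frange l=0 u=10}log(popularity)&fq=category:books
```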
Erick - thanks very much, all of this makes sense. But the one thing I still
find puzzling is the fact that re-adding the file a second, third, fourth
etc time causes numDocs to increase, and ALWAYS by the same amount
(141,645). Any ideas as to what could cause that?
Dan
Erick Erickson
I've tried having two servers set up to replicate each other, and it is not a
pretty thing. It seems that SOLR doesn't really do any checking of the version
# to see if the version # on the master is the version # on the slave before
deciding to replicate. It only looks to see if it's
Eric,
you mean, everything is okay, but I do not see it?
Internally for searching the analysis takes place and writes to the
index in an inverted fashion, but the stored stuff is left alone.
if I use an analyzer, Solr stores its output two ways?
One public output, which is similar to the
Right, but if you want to take periodic backups and ship them to tape or some
DR site, you need to be able to tell when the backup is actually complete.
It seems very strange to me that you can actually track the replication
progress on a slave, but you can't track the backup progress on a
Hello,
I'm trying to use an ontology (homegrown :) ) to support the search.
I.e., I'd like my search engine to report search results for barack
obama even if I look for president. I see there's some support in the
Nutch API (org.apache.nutch.ontology), so (if it does what I'm looking
for) I'm guessing
On Jan 7, 2010, at 10:50 AM, MitchK wrote:
Eric,
you mean, everything is okay, but I do not see it?
Internally for searching the analysis takes place and writes to the
index in an inverted fashion, but the stored stuff is left alone.
if I use an analyzer, Solr stores it's output two
It puzzles me too. I don't know the internals of that code
well enough to speculate, but once you're into undefined
behavior, I have great faith in *many* inexplicable things
happening.
Erick
On Thu, Jan 7, 2010 at 9:45 AM, danben dan...@gmail.com wrote:
Erick - thanks very much, all of
Thank you, Ryan. I will have a look at Lucene's material and Luke.
I think I got it. :)
Sometimes there will be a need to return, on the one hand, the stored value
and, on the other hand, the indexed version of the value.
How can I fulfill such a need? By doing a copyField to indexed-only fields?
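That copyField approach might look like this in schema.xml (names invented): the stored field keeps the raw value, and an indexed-only copy carries the analyzed terms:

```xml
<field name="title" type="string" indexed="false" stored="true"/>
<field name="title_analyzed" type="text" indexed="true" stored="false"/>
<copyField source="title" dest="title_analyzed"/>
```

Searches go against title_analyzed; the response returns the untouched stored title.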
Hi there,
I'm trying to understand how the query syntax specified on the Solr Wiki (
http://wiki.apache.org/solr/SolrQuerySyntax ) fits in with the usage of the
SolrJ class SolrQuery. There are not too many examples of usage to be found.
For example. Say I wanted to replicate the following query
--- On Thu, 1/7/10, Jon Poulton jon.poul...@vyre.com wrote:
From: Jon Poulton jon.poul...@vyre.com
Subject: SolJ and query parameters
To: 'solr-user@lucene.apache.org' solr-user@lucene.apache.org
Date: Thursday, January 7, 2010, 7:25 PM
Hi there,
I'm trying to understand how the query
All,
I have two indices - one has 23M documents and the other has less than 1000.
The small index is for real time update.
Does updating small index (with commit) hurt the overall performance?
(We can not update realtime for 23M big index because of heavy traffic and
size).
Thanks,
Jae Joo
Thanks for the reply.
Using SolrQuery.setQuery("{!lucene q.op=AND df=text}myfield:foo +bar -baz");
would make more sense if it were not for the other methods available on
SolrQuery.
For example, there is a setFields(String...) method. So what happens if I call
setFields("title", "description")
I've also just noticed that QueryParsing is not in the SolrJ API. It's in one
of the other Solr jar dependencies.
I'm beginning to think that maybe the best approach is to write a query string
generator which can generate strings of the form:
q={!lucene q.op=AND df=text}myfield:foo +bar -baz
Using SolrQuery.setQuery({!lucene q.op=AND
df=text}myfield:foo +bar -baz}); would make more sense if
it were not for the other methods available on SolrQuery.
For example, there is a setFields(String..) method. So
what happens if I call setFields(title, description)
after having set the
On Jan 7, 2010, at 12:11 PM, MitchK wrote:
Thank you, Ryan. I will have a look on lucene's material and luke.
I think I got it. :)
Sometimes there will be the need, to response on the one hand the
value and
on the other hand the indexed version of the value.
How can I fullfill such
What is your use case for responding sometimes with the indexed value?
Other than reconstructing a field that hasn't been stored, I can't think of
one.
I still think you're missing the point. Indexing and storing are
orthogonal operations that have (almost) nothing to do with each
other, for all
On Jan 7, 2010, at 1:05 PM, Jon Poulton wrote:
I've also just noticed that QueryParsing is not in the SolrJ API.
It's in one of the other Solr jar dependencies.
I'm beginning to think that maybe the best approach it to write a
query string generator which can generate strings of the form:
The difference between stored and indexed is clear now.
You are right, if you are responding only to normal users.
Use case:
You got a stored field "The Good, the Bad and the Ugly".
And you got a really fantastic analyzer, which is doing some magic to this
movie title.
Let's say the analyzer
On Wed, Jan 6, 2010 at 4:30 PM, Erick Erickson erickerick...@gmail.comwrote:
Hmmm, I'll have to defer to the highlighter experts here
I've looked at the source code for the highlighter, and I think I know
what's going on. I haven't had time to play with this yet, so I could be
wrong, but
Hi all,
Our application uses solrj to communicate with our solr servers. We started a
fresh index yesterday after upping the maxFieldLength setting in solrconfig.
Our task indexes content in batches and all appeared to be well until noonish
today, when after 40k docs, I started seeing errors.
what version of solr are you running?
On Jan 7, 2010, at 3:08 PM, Jake Brownell wrote:
Hi all,
Our application uses solrj to communicate with our solr servers. We
started a fresh index yesterday after upping the maxFieldLength
setting in solrconfig. Our task indexes content in batches
Yes, that would be helpful to include, sorry, the official 1.4.
-Original Message-
From: Ryan McKinley [mailto:ryan...@gmail.com]
Sent: Thursday, January 07, 2010 2:15 PM
To: solr-user@lucene.apache.org
Subject: Re: Corrupted Index
what version of solr are you running?
On Jan 7, 2010,
That's just setting the solr/home environment, not the user.dir variable. I
have that already set. But when I go to the solr/admin page, at the top it
shows the Solr Admin (schemaname), hostname, and cwd=/root, SolrHome=/opt/solr.
How do I get cwd not to be /root but to be set to
If you need to fix the index and maybe lose some data (in bad segments),
check Lucene's CheckIndex (cmd-line app)
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Jake Brownell ja...@benetech.org
To: solr-user@lucene.apache.org
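For reference, CheckIndex is run from the command line against the index directory (the jar name and path depend on your install; Solr 1.4 bundles Lucene 2.9):

```text
java -cp lucene-core-2.9.1.jar org.apache.lucene.index.CheckIndex /path/to/index

# add -fix to drop unreadable segments (their documents are lost); back up first
java -cp lucene-core-2.9.1.jar org.apache.lucene.index.CheckIndex /path/to/index -fix
```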
Won't hurt the performance - that *is* why people use the BIG+small core trick. :)
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Jae Joo jae...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thu, January 7, 2010 12:40:16 PM
Subject:
Claudio,
Check out Solr synonym support:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Claudio Martella claudio.marte...@tis.bz.it
To:
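A minimal sketch of what that looks like (the synonym pair is taken from the question; exact placement depends on your analyzer chain):

```xml
<!-- in the field type's analyzer in schema.xml -->
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
```

with a line like `president, barack obama` in synonyms.txt; expand="true" makes the mapping work in both directions.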
Your setup with the master behind a LB VIP looks right.
I don't think replication in Solr was meant to be bidirectional.
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Matthew Inger mattin...@yahoo.com
To: solr-user@lucene.apache.org;
Well, I'd approach either of these use cases
by simply performing my computations on
the input and storing the result in another
(non-indexed unless I wanted to search it)
field. This wouldn't happen in the Analyzer,
but in the code that populated the document
fields.
Which is a much cleaner
Regular expressions won't work well for sentence boundary detection.
If you want something free, you could plug in OpenNLP or GATE. Or LingPipe,
but that's not free.
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Caleb Land
Hi,
Either I don't understand this or this doesn't make much sense.
Are you saying you want to show only facet values whose counts == # of hits?
If so, what would be the value of showing facets -- they wouldn't be narrowing
down the result set.
Otis
--
Sematext -- http://sematext.com/ -- Solr
Not sure if this was answered.
Yes, you can set the default params/values for a request handler in the
solrconfig.xml .
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Andy angelf...@yahoo.com
To: solr-user@lucene.apache.org
Sent: Mon,
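For example (the parameter choices here are arbitrary), a handler's defaults go in solrconfig.xml like this:

```xml
<requestHandler name="standard" class="solr.StandardRequestHandler">
  <lst name="defaults">
    <str name="rows">20</str>
    <str name="fl">id,title,score</str>
  </lst>
</requestHandler>
```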
Peter - Aaron just commented on a recent Solr issue (reading large result sets)
and mentioned his patch.
So far he has 2 x +1 from Grant and me to stick his patch in JIRA.
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Peter Wolanin
Hi all,
I've got an index split across 28 cores -- 4 cores on each of 7 boxes
(multiple cores per box in order to use more of its CPUs.)
When I configure a toplevel core to fan out to all 28 index cores,
it works, but is slower than I'd have expected:
Toplevel core == all 28 index cores
In
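The fan-out itself is just the shards parameter on the top-level core; a sketch with invented hostnames:

```text
http://top:8983/solr/select?q=foo&shards=box1:8983/solr/core1,box1:8983/solr/core2,box2:8983/solr/core3
```

Every shard listed is queried, and the top-level core merges the results.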
On Thu, Jan 7, 2010 at 4:17 PM, Michael solrco...@gmail.com wrote:
I wanted to try 2 layers of sharding.
Distrib search was written with multi-level in mind, but it's not supported yet.
-Yonik
http://www.lucidimagination.com
Thanks, Yonik.
Does "not supported" mean "we can't guarantee whether it will work or
not", or "you may be able to figure it out on your own"? Apparently I
am able to get *some* queries through, just not those that pass
through the field type that I really need (a complex analyzer). When I
search for
Christopher,
It's not Lucene or Solr, but have a look at
http://www.sematext.com/products/key-phrase-extractor/index.html
There is an unofficial demo for it (uses Reuters news feeds with 2 1-week long
windows for SIPs):
http://www.sematext.com/demo/kpe/i.html
(it looks like the
Right. But my understanding is that the handler default setting in solrconfig
doesn't take the parameter {!boost}; it only takes the parameter bf, which adds
the function query instead of multiplying it.
Seems like the only way to have a default for the {!boost} parameter is to use
edismax, which
On Thu, Jan 7, 2010 at 4:33 PM, Michael solrco...@gmail.com wrote:
Does not supported mean we can't guarantee whether it will work or
not, or you may be able to figure it out on your own?
Not implemented, and not expected to work.
For example, some info such as sortFieldValues would need to be
Matt:
http://sharehound.sourceforge.net/
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Matt Wilkie matt.wil...@gov.yk.ca
To: solr-user@lucene.apache.org
Sent: Thu, December 10, 2009 3:06:38 PM
Subject: Indexing content on Windows file
Shalin,
- Original Message
From: Shalin Shekhar Mangar shalinman...@gmail.com
To: solr-user@lucene.apache.org
Sent: Wed, December 23, 2009 2:45:21 AM
Subject: Re: Adaptive search?
On Wed, Dec 23, 2009 at 4:09 AM, Lance Norskog wrote:
Nice!
Siddhant: Another problem to
For me, the Java replication is nice because it's much easier to set up and has
fewer moving pieces (vs. an rsync server, scripts config file, event hooks, and
external shell scripts).
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Jason
Strange. Ever figured out the source of performance difference?
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Raghuveer Kancherla raghuveer.kanche...@aplopio.com
To: solr-user@lucene.apache.org
Sent: Sat, December 5, 2009 12:05:49 PM
Hi there
If i have two documents with a field indexing a taxonomy path for example
doc1: bags/handbags/clutch
doc2: bags/handbags/beach
and that field tokenizes on the forward slash, the facets produced will be :
bags(2), handbags(2),beach(1),clutch(1)
if i select clutch, the facets returned
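A common workaround for this (a sketch, not the only option) is to index depth-prefixed path tokens and drill down with facet.prefix, so each level only returns its own children:

```text
indexed tokens:  0/bags   1/bags/handbags   2/bags/handbags/clutch
after choosing handbags:  facet.field=category&facet.prefix=2/bags/handbags
```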
I am using LucidWorks Solr v1.4 and I would like to compile in a search
component, however it does not seem like a very straightforward process. The
ant script in the solr directory is that of the stock solr installation
which does not compile out of the box.
Has anyone been able to successfully
Hi,
I think "how can I perform both exact and non-exact (no stemming involved)
searches?" is a pretty frequent question, but it looks like we don't have an
answer for it on the Wiki. The advice is typically to copy a field and apply
different analysis to it (one stemmed, the other not stemmed), and then
Analysis is called when creating the indexed data for content, but not
when storing the content.
CopyField copies one field's raw values to another field for storage.
The source and target fields can be of any type.
copyField does not analyse the source data and then feed it to
another field's
Hi,
Did you re-start tomcat and re-index your collection?
Yes
Do you want to search inside alphanumeric strings? Or are you interested
only in prefix queries? Can you give us more examples, like target documents
and queries.
Searching inside would be required, yes. If the above example
If you index the raw document, that is what is returned by the search.
The analyzers create separate data that is stored in various files,
but is only used in searching. Searching, facets, and sorting use this
analyzed output, but search returns pull the original.
On Thu, Jan 7, 2010 at 2:28 AM,
Great - this issue? https://issues.apache.org/jira/browse/LUCENE-2127
Sounds like it would be a real win for lucene.
-Peter
On Thu, Jan 7, 2010 at 4:12 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
Peter - Aaron just commented on a recent Solr issue (reading large result
sets) and
I recently noticed the same sort of thing.
The attached screenshot shows the transition on a search server
when we updated from a Solr 1.4 dev build (revision 779609 from
2009-05-28) to the Solr 1.4.0 released code. Every 3 hours we have a
cron task to log some of the data from the stats.jsp
Thanks.
Can I use the standard request handler for this purpose? So something like:
<requestHandler name="standard" class="solr.StandardRequestHandler">
  <lst name="defaults">
    <str name="q">{!boost b=$popularityboost v=$qq}popularityboost=log(popularity)</str>
  </lst>
</requestHandler>

Or do I
I'd love to see the screenshot, but it didn't come through - got stripped by ML
manager. Maybe upload it somewhere?
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Peter Wolanin peter.wola...@acquia.com
To: solr-user@lucene.apache.org
Si si, that issue.
Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
- Original Message
From: Peter Wolanin peter.wola...@acquia.com
To: solr-user@lucene.apache.org
Sent: Thu, January 7, 2010 9:27:04 PM
Subject: Re: SOLR Performance Tuning: Pagination
Great -
On Jan 7, 2010, at 9:51 PM, Andy wrote:
Thanks.
Can I use the standard request handler for this purpose? So
something like:
Yes, but...
<requestHandler name="standard" class="solr.StandardRequestHandler">
  <lst name="defaults">
    <str name="q">{!boost b=$popularityboost v=
Oh I see.
Is popularityboost the name of the parameter?
<requestHandler name="standard" class="solr.StandardRequestHandler">
  <lst name="defaults">
    <str name="q">{!boost b=$popularityboost v=$qq}</str>
    <str name="popularityboost">log(popularity)</str>
  </lst>
</requestHandler>
--- On Thu,
http://www.lucidimagination.com/search/s:wiki?q=update+csv
You can set the field names on the URL or as the first line.
On Thu, Jan 7, 2010 at 3:48 AM, Mark N nipen.m...@gmail.com wrote:
I am trying to use solr's csv updater to index the data , i am tryin to
specify the .Dat format consisting
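Concretely, the CSV handler's separator and qualifier are request parameters (values URL-encoded; as far as I know the record separator is always a newline and cannot be changed):

```text
http://localhost:8983/solr/update/csv?commit=true&separator=%09&encapsulator=%22&fieldnames=id,name,price&header=false
```

Here %09 is a tab field separator and %22 is a double-quote text qualifier; fieldnames is needed when the file has no header line.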
Hi,
I would like to split the existing index into 2 indexes (the inverse of
the index merge function).
My index directory size around 20G and 10 Million documents.
-Kalidoss.m,
I am using Solr 1.3.
I have an index with a field called name. It is of type text
(unmodified, stock text field from solr).
My query

  field:foo-bar

is parsed as the phrase query

  field:"foo bar"

I was rather expecting it to be parsed as

  field:(foo bar)

or

  field:foo field:bar

Is there an expectation
I have defined a field type in schema.xml :
<fieldType name="lowercase" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter