In the spellchecker search component declaration:
http://wiki.apache.org/solr/SpellCheckComponent#Configuration
What role does the "name" play, which is "default" in this
sample? Can this be any arbitrary name? Should this name
match with something else in the configuration files?
I came to thi
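For context on the question above, a SpellCheckComponent declaration on that wiki page nests one or more named spellcheckers; a sketch (field and path values illustrative):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <!-- "default" is the dictionary name; a request selects it with
         spellcheck.dictionary=default, and it is also the fallback
         used when that parameter is omitted -->
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
</searchComponent>
```

So the name is not matched against the schema; it only has to match the spellcheck.dictionary request parameter, with "default" being special solely as the fallback.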
Jan,
So you are implying that the fields made by copyField are not processed by
UpdateProcessors, right?
Erik,
Logically this makes sense but then copyField operations must move to
solrconfig.xml?
Editing solrconfig.xml is more challenging than schema.xml, I feel.
Kuro
On 2/23/11 2:09 AM, "Erik
Markus,
I searched but I couldn't find a definite answer, so I posted this
question.
The article you quoted talks about implementing a copyField-like operation
using UpdateProcessor. It doesn't talk about the relationship between
the copyField operation proper and UpdateProcessors.
Kuro
On 2/22/11
Greg,
You could use copyField to copy the column in question to 6 fields, one
for each of your 6 languages,
and hope that each of the analyzers does something reasonable without
crashing.
Or apply the white-space tokenizer and hope for the best?
If the column has long enough text, you could try a l
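The fan-out described above might look like this in schema.xml (field names are illustrative; each destination field would use a per-language analyzer defined elsewhere in the schema):

```xml
<!-- one source column copied to a field per language -->
<copyField source="body" dest="body_en"/>
<copyField source="body" dest="body_fr"/>
<copyField source="body" dest="body_de"/>
<copyField source="body" dest="body_ja"/>
<copyField source="body" dest="body_zh"/>
<copyField source="body" dest="body_ko"/>
```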
Can fields created by copyField instructions be processed by
UpdateProcessors?
Or can only raw input fields be processed?
So far my experiment is suggesting the latter.
T. "Kuro" Kurosaka
I've downloaded
http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x
and ran ant there. I've followed the tutorial, but
highlighting on the analyzer debug screen isn't working.
This link found in the tutorial doesn't show any highlighting.
http://localhost:8983/solr/admin/analysis.jsp?name=nam
> On Sun, Sep 26, 2010 at 11:49 PM, Teruhiko Kurosaka wrote:
>
>>
>> As I understand it, payloads go to the Lucene index.
>> In most cases, the part-of-speech tags are not used if
>> retrieved by the search applications. So they shouldn't
>> go to the
Erick,
On Sep 26, 2010, at 8:04 AM, Erick Erickson wrote:
> The reason I ask is that you had to put the payloads into the
> input in the first place, and they don't affect searching unless
> you want them to. So why do you want to remove them
> with a token filter?
Our Tokenizer puts a part-of-sp
Is there an existing TokenFilter that simply removes
payloads from the token stream?
Teruhiko "Kuro" Kurosaka
RLP + Lucene & Solr = powerful search for global contents
In
http://wiki.apache.org/solr/FAQ#Why_don.27t_International_Characters_Work.3F
These three links are broken.
http://www.nabble.com/International-Charsets-in-embedded-XML-tf1780147.html#a4897795
(International Charsets in embedded XML for Jetty 5.1)
http://www.nabble.com/Problem-with-surrogate-
I'd like to reopen a bug SOLR-1960
https://issues.apache.org/jira/browse/SOLR-1960
"http://wiki.apache.org/solr/ : non-English users get generic MoinMoin page
instead of the desired information"
as I submitted a patch. But jira won't let me do it.
Do I have to clone it?
Teruhiko "Kuro" Kuro
I suppose you mean Extract_ing_RequestHandler.
Out of curiosity, I sent in a Japanese HTML file of EUC-JP encoding,
and it converted to Unicode properly and the index has correct
Japanese words.
Do your HTML files have a META tag for Content-Type with a value
containing charset= ? For example, this
responding language attribute?
>
>
> Dennis Gearon
>
> Signature Warning
>
> EARTH has a Right To Life,
> otherwise we all die.
>
> Read 'Hot, Flat, and Crowded'
> Laugh at http://www.yert.com/film.php
>
>
> --- On Wed, 3/2
(Sorry for very late response on this topic.)
On Feb 28, 2010, at 5:47 AM, Adrien Specq wrote:
> - language attribute for each field
I was thinking about it and it was one of my wishes.
Currently, Solr practically requires that we have
a field for each natural language that an application
support
h using the analysis page of the
> solr default web page. I assume you are using the same analyzers and
> tokenizers in indexing and searching for this field in your schema.
>
> Regards,
>
>
> Marco Martínez Bautista
>
>
>
> 2010/3/17 Teruhiko Kurosaka
>
>
It seems that Solr's query parser doesn't pass a single term query
to the Analyzer for the field. For example, if I give it
2001年 (year 2001 in Japanese), the searcher returns 0 hits
but if I quote them with double-quotes, it returns hits.
In this experiment, I configured schema.xml so that
the f
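As an illustration of the kind of schema.xml setup being described (the class name assumes the stock Lucene CJK analyzer; substitute whatever analyzer is actually configured):

```xml
<fieldType name="text_cjk" class="solr.TextField">
  <!-- one analyzer for both indexing and querying, so a bare term
       and a quoted phrase should tokenize the same way -->
  <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer"/>
</fieldType>
<field name="text" type="text_cjk" indexed="true" stored="true"/>
```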
at jetty level, perhaps.
On Mar 15, 2010, at 1:27 PM, Teruhiko Kurosaka wrote:
> I have been seeing strange phenomena.
>
> I've written a HTML form that calls Solr like this:
> http://localhost:8983/solr/select/?q=Basis&df=text&wt=xslt&tr=btdemo.xsl
>
> It wor
I have been seeing strange phenomena.
I've written a HTML form that calls Solr like this:
http://localhost:8983/solr/select/?q=Basis&df=text&wt=xslt&tr=btdemo.xsl
It works. But when I change the contents of
solr/conf/xslt/btdemo.xsl
and restart solr, it still shows the behavior of
the older versi
Why don't we change the links to have "FrontPage" explicitly?
Wouldn't it be the easiest fix unless there are numerous
other pages that reference the default page w/o "FrontPage"?
-kuro
> -Original Message-
> From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
> Sent: Tuesday, Jan
One more comment on this.
I can see this page
http://wiki.apache.org/solr/SolrTomcat
w/o a problem, for example.
Or I can see this:
http://wiki.apache.org/solr/FrontPage
I think it's only the main page
without actual page name
http://wiki.apache.org/solr/
that is having the problem.
So the quick
s though.
>
> Erik
>
> On Jan 26, 2010, at 1:23 AM, Teruhiko Kurosaka wrote:
>
>> In
>> http://lucene.apache.org/solr/
>> the wiki tab and "Docs (wiki)" hyper text in the side bar text after
>> expansion are the link to
>> http://wiki.apache.
I'm sorry. Please ignore this duplicate posting.
From: Teruhiko Kurosaka
Sent: Tuesday, January 26, 2010 8:32 AM
To: solr-user@lucene.apache.org
Subject: Solr wiki link broken
In
http://lucene.apache.org/solr/
the wiki tab and "Docs (wiki)"
In
http://lucene.apache.org/solr/
the wiki tab and "Docs (wiki)" hyper text in the side bar text after expansion
are the link to
http://wiki.apache.org/solr
But the wiki site seems to be broken. The above link took me to a generic help
page of the Wiki system.
What's going on? Did I just hit t
I have schema.xml that uses a Tokenizer that I wrote.
I understand the standard way of deploying Solr is
to place solr.war in webapps directory, have a separate
directory that has conf files under its conf subdirectory,
and specify that directory as Solr home dir via either
JVM property or JNDI.
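For reference, the JNDI variant of the setup described above is a Tomcat context fragment like the following (paths illustrative); the JVM-property variant is just -Dsolr.solr.home=/path passed to the container:

```xml
<!-- e.g. $CATALINA_HOME/conf/Catalina/localhost/solr.xml -->
<Context docBase="/opt/solr/solr.war" debug="0" crossContext="true">
  <!-- points Solr at the directory whose conf/ holds schema.xml etc. -->
  <Environment name="solr/home" type="java.lang.String"
               value="/opt/solr/home" override="true"/>
</Context>
```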
Israel,
> If you downloaded the 1.3.0 release, you should find a "docs"
> folder inside the zip file.
>
> This contains the javadoc for that particular release.
>
> You may also re download a 1.3.0 release to get the docs for Solr 1.3.
This doesn't solve my problem. I can't write my javadoc c
Lucene keeps multiple versions of its API doc online at
http://lucene.apache.org/java/X_Y_Z/api/all/index.html
for version X.Y.Z. I am finding this very useful when
comparing different versions. This is also good because
the javadoc comments that I write for my software can
reference the API com
> Aha!
> Sounds like a job for a simple, custom
> UpdateRequestProcessor. Actually, I think URP doesn't get
> access to the actual XML, but what it has access may be
> enough for you: http://wiki.apache.org/solr/UpdateRequestProcessor
I added this to solrconfig.xml but I don't see any extra o
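For anyone following along, an UpdateRequestProcessor chain is registered in solrconfig.xml roughly like this (the custom factory class name is a placeholder); a likely cause of seeing no effect is that a named chain does nothing unless a request actually selects it:

```xml
<updateRequestProcessorChain name="mychain">
  <!-- placeholder: your custom factory runs first -->
  <processor class="com.example.MyUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- without RunUpdateProcessorFactory nothing is ever indexed -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```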
://sematext.com/ -- Solr - Lucene - Nutch
>
>
>
> ----- Original Message
> > From: Teruhiko Kurosaka
> > To: "solr-user@lucene.apache.org"
> > Sent: Fri, December 4, 2009 2:23:17 PM
> > Subject: Dumping solr requests for indexing
> >
> > Is there
Is there any way to dump all incoming requests to Solr
into a file?
My customer is seeing a strange problem of disappearing
docs from index and I'd like to ask them to capture all
incoming requests.
Thanks.
-kuro
> From: Grant Ingersoll [mailto:gsing...@apache.org]
> Sent: Tuesday, October 27, 2009 1:15 PM
> To: solr-user@lucene.apache.org
> Subject: Re: long startup time
>
> How big is your index? Can you share your solrconfig? Have
> you looked at it in a profiler during this time? What is it doing?
I've been testing Solr 1.4.0 (RC).
After some time, Solr started to pause
for a long time (a minute or two) after
printing:
INFO: jetty-6.1.3
Sometimes it starts immediately, but more often
than not, it pauses. Is there any known cause
of this kind of long pause?
-kuro
cene.apache.org
> Subject: Re: Solr 1.4 (RC) performance on multi-CPU system
>
> 2009/10/26 Teruhiko Kurosaka :
> > Is Solr 1.4 (Release Candidate) supposed to take advantage
> of muti-CPU
> > (core) system? I.e. if more than one update or search
> requests come in
> >
Is Solr 1.4 (Release Candidate) supposed to take advantage
of a multi-CPU (core) system? I.e., if more than one update or
search request comes in at about the same time, can they be
automatically assigned to different CPUs if available
(and the OS does its job right)?
BTW, the term "multicore" in Solr dis
I'm trying to stress-test solr (nightly build of 2009-10-12) using JMeter.
I set up JMeter to post pod_other.xml, then hd.xml, then commit.xml that only
has a line "", 100 times.
Solr instance runs on a multi-core system.
Solr didn't complain when the number of test threads was 1, 2, 3, or 4.
But
Actually, I meant to say I have my Tokenizer jars in solr/lib.
I have the jars that my Tokenizer jars depend on in lib/ext,
as I wanted them to be loaded only once per container
due to their internal design. Bad idea?
-kuro
> From: Teruhiko Kurosaka
> Sent: Wednesday, October 14, 200
I have my custom Tokenizer and TokenizerFactory in a jar,
and I've been putting it in example/lib/ext. and it's been
working fine with Solr 1.3.
This jar uses SLF4J as a logging API, and I had the SLF4J jars
in the same place, example/lib/ext.
Because Solr 1.4 uses SLF4J too and has it built in,
I've downloaded solr-2009-10-12.zip and tried to
compile my TokenizerFactory implementation against this
version of Solr. Compilation failed. One of the causes
is that the compiler couldn't find
org.apache.solr.common.ResourceLoader.
I discovered this class in apache-solr-solrj-nightly.jar.
I
The release candidate is up again.
> -Original Message-
> From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 08, 2008 10:34 AM
> To: solr-user@lucene.apache.org
> Subject: Re: 1.3.0 candidate
>
> This is temporarily removed, as I need to create another.
>
> On S
Grant,
Is this coming back soon? Rough estimate?
-kuro
> -Original Message-
> From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
> Sent: Monday, September 08, 2008 10:34 AM
> To: solr-user@lucene.apache.org
> Subject: Re: 1.3.0 candidate
>
> This is temporarily removed, as I need to crea
I've noticed that schema.xml in the dev version of Solr spells
what used to be fieldtype as fieldType, with a capital T.
Are there any other compatibility issues between the would-be
Solr 1.3 and Solr 1.2?
How soon will Solr 1.3 be available, by the way?
Basis Technology Corporation, San
Thank you for mentioning our product, Walter :-)
> I've worked with the Basis products. Solid, good support.
> Last time I talked to them, they were working on hooking them
> into Lucene.
So, Basis Technology's Rosette Language Platform has what we call
"Base Linguistics" (basically a morpholo
Hello David,
> And if I do a search in Luke and the solr analysis page
> for 美聯, I get a hit. But on the actual search, I don't.
I think you need to tell us what you mean by "actual search"
and your code that interfaces with Solr.
-kuro
Thank you Ezra and Chris for explaining this,
and I like your idea, Erik. This will make the intro to Solr
easier for newcomers, and make Solr more popular.
-Kuro
> That example is definitely in the cool category. I couldn't resist
> creating a SolrTerminology wiki page linking to your post a
Could someone tell me what a facet is?
I have a vague idea, but I am not too clear.
A pointer to a sample web site that uses Solr faceting
would be very helpful.
Thanks.
-Kuro
Can anyone tell me how to use the Java client?
I downloaded the complete source from SVN solr trunk and
took a look at the files under client/java, but no .java file has
main(). Nor do I see a README.
-kuro
TECTED]
> Sent: Friday, August 03, 2007 12:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SolJava --- which attachments are valid?
>
> Teruhiko Kurosaka wrote:
> >> or you can get it from the nightly builds in:
> >> http://people.apache.org/builds/lucene/solr/nig
Christian,
This is interesting. I have always thought that Solr shouldn't
be in the business of parsing; it's the responsibility of the Solr client.
But what Peter suggested, adding a parsing capability to Solr
as a request handler, does make sense.
One thing that I noticed this approach ca
> or you can get it from the nightly builds in:
> http://people.apache.org/builds/lucene/solr/nightly/
For those of you who are interested...
As far as I can tell by inspecting the source code in Trunk,
solrj.jar from the nightly doesn't seem to work with Solr 1.2.
For one thing, there is a new
> Some form of some files from SOLR-20 should work, but I would suggest
> using the client in trunk now:
>
> http://svn.apache.org/repos/asf/lucene/solr/trunk/client/java/solrj/
>
Thanks. I updated
http://wiki.apache.org/solr/SolJava
to reflect the new state of this component.
> or you can
I think it's best to control the log level with an external file; you
don't want to reprogram when you need logging.
Define the system property java.util.logging.config.file to point to
your
log properties file. I would copy
$JAVA_HOME/jre/lib/logging.properties and then add a line:
org.apache.solr.level = W
http://issues.apache.org/jira/browse/SOLR-20
has many attachments:
1. Java Source File Licensed for inclusion in ASF works
DocumentManagerClient.java (12 kb)
2. Java Source File Licensed for inclusion in ASF works
DocumentManagerClient.java (12 kb) [dimmed]
3. Zip Archive Licensed for inclusion in
I'm using Solr 1.1.
I ran:
post.sh vidcard.xml
(with URL modified in post.sh) then got an error:
Posting file vidcard.xml to http://localhost:28080/solr/update
ERROR: multiple values encountered for non-multiValued field text:
first='ASUS Extreme N7800GTX/2DHTV (256 MB)' second='ASUS Computer
In
Peter,
I was playing with Nutch for quite some time before Solr, so
I know Nutch better than Solr. Nutch has a plugin mechanism
so that you can add a parser for a document type. It comes with
parser plugins for most popular doc types (with varying degrees of
international text support).
My que
Thank you, Otis and Peter, for your replies.
> From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
> doc of some type -> parse content into various fields -> post to Solr
I understand this part, but the question is who should do this.
I was under the assumption that it's the Solr client's job to crawl the
Solr looks very good for indexing and searching structured data.
But I noticed there is no tool in the Solr distribution with which documents
of other doc types can be indexed. Are there other side projects that
develop Solr clients for indexing documents of other doc types?
Or is the generic f
Hi Daniel,
As you know, Chinese and Japanese do not use
spaces or any other delimiters to break words.
To overcome this problem, CJKTokenizer uses a method
called bi-gramming, in which a run of ideographic (i.e., Chinese)
characters is made into tokens of two neighboring
characters. So a run of five chara
Hi Yonik,
> On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
> > For bi-lingual
> > or tri-lingual search, we can have parallel fields (title_en,
> > title_fr, title_de, for example) but this wouldn't scale well.
>
> Due to search across multiple
Daniel,
I was reading your email and responses to it with great
interest.
I was aware that Solr has an implicit assumption that
a field is mono-lingual per system. But your mail and
its correspondence made me wonder if this limitation
is practical for multi-lingual search applications. For bi-
I noticed there is no example/ext
directory, nor the jars that were found there
in 1.1 (commons-el.jar, commons-logging.jar,
jasper-*.jar, mx4j-*.jar).
I have a jar that my Solr plugin depends on.
This jar contains a class that needs to be
loaded only once per container because
it is a JNI library. Fo
I see Solr uses the JDK java.util.logging.Logger.
I should also be using this Logger when I write
a plugin, correct?
I am asking only because I see commons-logging.jar
in apache-solr-1.1.0-incubating/example/ext
What is this for?
-kuro
Never mind. My mistake. I still had a copy of the jar in ext dir.
After cleaning it up, it's now loading my plugin.
THANK YOU VERY MUCH!
> -Original Message-
> From: Teruhiko Kurosaka [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, June 06, 2007 5:58 PM
> To: solr-user@
Ryan,
Thank you.
But creating lib under example/solr and placing
my plugin jar there yielded the same error of
not being able to locate
org/apache/solr/analysis/BaseTokenizerFactory.
How can this be
-kuro
This is about Solr 1.1.0 running on Win XP w/JDK 1.5.
Thank you.
> -Original Message-
> From: Teruhiko Kurosaka [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, June 06, 2007 5:32 PM
> To: solr-user@lucene.apache.org
> Subject: Where to put my plugins?
>
> I mad
I made a plugin that has a Tokenizer, its Factory, a
Filter and its Factory. I modified example/solr/conf/schema.xml
to use these Factories.
Following
http://wiki.apache.org/solr/SolrPlugins
I placed the plugin jar in the top level lib and ran
the start.jar. I got:
org.mortbay.util.MultiExcepti
Ryan,
Thank you for your reply, but I can't find this
class SolrException.ErrorCode in Solr 1.1.
The Solr source seems to be giving a random number,
400, 500, etc. for the first arg to SolrException
constructor. (Is there any unwritten convention?)
Is SolrException.ErrorCode new to the latest versi
When the parameter to a token filter is out of
range, or a mandatory parameter is not given, what
is the proper way to fail in the init() and
create() methods?
Should I throw a RuntimeException? Or should I
simply call SolrCore.log.severe(message)?
Is it OK for create() to return null when the
unde
Ryan,
Thank you. The JavaScript code you mentioned works well.
But I am now hitting the similar problem with XSLT. The
following XSLT code can't retrieve the value of "hl.fl"
parameter even though the similar code for other parameter
works.
I am using the XSLT Writer and whatever XSLT pro
I have a form that sets the hl.fl form hidden variable.
I wanted to change the highlighted field depending on the
query string that is typed, using JavaScript.
This is normally done by the JavaScript code like this:
document.myform.varname.value = "whatever"
But this doesn't work for hl.fl b
I ran into this a few weeks ago.
You are probably starting Tomcat from somewhere other than the Solr
home.
See "Simple Example Install" section of
http://wiki.apache.org/solr/SolrTomcat
There, tomcat is started from the Solr home by:
./apache-tomcat-5.5.20/bin/startup.sh
If you do
cd apache-tomca
I am trying to understand the highlighting output example, the last
one in this page:
http://wiki.apache.org/solr/StandardRequestHandler
The example shows that the top-level element of a set of highlighted
results for a document is .
What does this, SOLR1000, mean? Or rather, how does Solr
c
If my memory is correct, UTF-8 has been the default encoding per
XML specification from a very early stage. If the XML parser is not
defaulting
to UTF-8 in absence of the encoding attribute, that means the XML
parser has a bug, and the code should be corrected.
(I don't have an objection to add
Yes, that is it!
Thank you, Brian.
I've filed SOLR-233.
https://issues.apache.org/jira/browse/SOLR-233
-kuro
> -Original Message-
> From: Brian Whitman
> Sent: Thursday, May 10, 2007 1:19 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Does Solr XSL writer work with Arabic text?
>
>
I'm trying to search an index of docs which have text fields in Arabic,
using XSL writer (wt=xslt&tr=example.xsl). But the Arabic text gets
all garbled. Is XSL writer known to work for Arabic text? Is anybody
using it?
-kuro
BTW,
The Simple Example Install section in
http://wiki.apache.org/solr/SolrTomcat
leaves the unzipped directory apache-solr-nightly-incubating
intact, but this is not needed after copying the
solr.war and the example solr directory, is it?
Can I edit the instruction to insert:
rm -r apache-solr-ni
> did you try searching for that error message? the first
> result google gave
> me points to this mailing list thread...
>
> http://mail-archives.apache.org/mod_mbox/tomcat-dev/200512.mbo
> x/[EMAIL PROTECTED]
>
Yes, I found this email archive thread in another mail archive site.
I tried nuki
Thank you, Hoss, for replying to my question.
> : An important factor in the instruction is that Tomcat must
> : be started from the directory under which the solr directory
> : (copied from the example) exists
> that's not true. if you use JNDI or system properties to
> configure the
> "solr h
I struggled to run Solr in Tomcat 5.5 (or 6.0 for that matter).
Then I found a step-by-step instruction at
http://wiki.apache.org/solr/SolrTomcat
and followed it as much as possible (wget URL didn't work, so
I had to download it with a browser). Then Solr worked.
An important factor in the instructio
Hello,
I am new to solr, and trying to understand how things work.
If I want to use my own tokenizer, there seem to be three choices:
1. Write a TokenizerFactory whose create() returns my Tokenizer, and specify the factory
in schema.xml.
2. Write an Analyzer that uses my Tokenizer, and specify that Analyzer in
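Choice 1 above, wiring a custom TokenizerFactory into schema.xml, looks roughly like this (the class and type names are made up for illustration):

```xml
<fieldType name="text_custom" class="solr.TextField">
  <analyzer>
    <!-- factory whose create() returns the custom Tokenizer -->
    <tokenizer class="com.example.MyTokenizerFactory"/>
  </analyzer>
</fieldType>
```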