Role of the name in spellchecker declaration. Can there be multiple instances of it?

2011-03-28 Thread Teruhiko Kurosaka
In the spellchecker search component declaration: http://wiki.apache.org/solr/SpellCheckComponent#Configuration What role does the name, which is default in this sample, play? Can this be any arbitrary name? Should this name match something else in the configuration files? I came to this

Re: UpdateProcessor and copyField

2011-02-23 Thread Teruhiko Kurosaka
Jan, So you are implying that the fields made by copyField are not processed by UpdateProcessors, right? Erik, Logically this makes sense but then copyField operations must move to solrconfig.xml? Editing solrconfig.xml is more challenging than schema.xml, I feel. Kuro On 2/23/11 2:09 AM, Erik

UpdateProcessor and copyField

2011-02-22 Thread Teruhiko Kurosaka
Can fields created by copyField instructions be processed by UpdateProcessors? Or can only raw input fields be processed? So far my experiment suggests the latter. T. Kuro Kurosaka

Re: Indexing languages, dataimporthandler

2011-02-22 Thread Teruhiko Kurosaka
Greg, You could use copyField to copy the column in question to 6 fields, one for each of your 6 languages, and hope that the analyzers do something reasonable without crashing. Or apply the white-space tokenizer and hope for the best? If the column has long enough text, you could try a

Re: UpdateProcessor and copyField

2011-02-22 Thread Teruhiko Kurosaka
Markus, I searched but I couldn't find a definite answer, so I posted this question. The article you quoted talks about implementing a copyField-like operation using UpdateProcessor. It doesn't talk about the relationship between the copyField operation proper and UpdateProcessors. Kuro On 2/22/11

highlighting not working with Solr 3.0 trunk?

2011-01-07 Thread Teruhiko Kurosaka
I've downloaded http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x and ran ant there. I've followed the tutorial but highlighting on analyzer debug screen isn't working. This link found in the tutorial doesn't show any highlight.

Re: TokenFilter that removes payload ?

2010-09-27 Thread Teruhiko Kurosaka
, 2010 at 11:49 PM, Teruhiko Kurosaka k...@basistech.com wrote: As I understand it, payloads go to the Lucene index. In most cases, the part-of-speech tags are not used if retrieved by the search applications. So they shouldn't go to the index. So I'd like to know if there is an existing

Re: TokenFilter that removes payload ?

2010-09-26 Thread Teruhiko Kurosaka
Erik, On Sep 26, 2010, at 8:04 AM, Erick Erickson wrote: The reason I ask is that you had to put the payloads into the input in the first place, and they don't affect searching unless you want them to. So why do you want to remove them with a token filter? Our Tokenizer puts a

TokenFilter that removes payload ?

2010-09-23 Thread Teruhiko Kurosaka
Is there an existing TokenFilter that simply removes payloads from the token stream? Teruhiko Kuro Kurosaka RLP + Lucene Solr = powerful search for global contents

Broken links in Solr FAQ's Why don't International Characters Work?

2010-08-26 Thread Teruhiko Kurosaka
In http://wiki.apache.org/solr/FAQ#Why_don.27t_International_Characters_Work.3F These three links are broken. http://www.nabble.com/International-Charsets-in-embedded-XML-tf1780147.html#a4897795 (International Charsets in embedded XML for Jetty 5.1)

What is the proper procedure to reopen closed bugs?

2010-06-28 Thread Teruhiko Kurosaka
I'd like to reopen a bug SOLR-1960 https://issues.apache.org/jira/browse/SOLR-1960 http://wiki.apache.org/solr/ : non-English users get generic MoinMoin page instead of the desired information as I submitted a patch. But jira won't let me do it. Do I have to clone it? Teruhiko Kuro

Re: If you could have one feature in Solr...

2010-03-24 Thread Teruhiko Kurosaka
(Sorry for very late response on this topic.) On Feb 28, 2010, at 5:47 AM, Adrien Specq wrote: - langage attribute for each field I was thinking about it and it was one of my wishes. Currently, Solr practically requires that we have a field for each natural language that an application

Re: If you could have one feature in Solr...

2010-03-24 Thread Teruhiko Kurosaka
? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Wed, 3/24/10, Teruhiko Kurosaka k...@basistech.com wrote: From: Teruhiko Kurosaka k...@basistech.com

Re: Encoding problem with ExtractRequestHandler for HTML indexing

2010-03-24 Thread Teruhiko Kurosaka
I suppose you mean Extract_ing_RequestHandler. Out of curiosity, I sent in a Japanese HTML file in EUC-JP encoding, and it was converted to Unicode properly and the index has the correct Japanese words. Do your HTML files have a META tag for Content-Type whose value includes charset= ? For example,
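
One way to check whether an HTML file carries such a declaration is sketched below in Python. This is only an illustration, not part of the original message: the helper name `declared_charset` and the sample markup are hypothetical, and the regular expression covers only the common `http-equiv` form of the tag.

```python
import re

# Matches a META tag whose content attribute includes "charset=<name>".
META_CHARSET = re.compile(
    rb"""<meta[^>]+content\s*=\s*["'][^"']*charset=([\w-]+)""",
    re.IGNORECASE,
)

def declared_charset(html_bytes):
    """Return the charset named in a Content-Type META tag, or None."""
    match = META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii") if match else None

# Hypothetical sample document declaring EUC-JP.
sample = (b'<html><head><meta http-equiv="Content-Type" '
          b'content="text/html; charset=EUC-JP"></head></html>')
print(declared_charset(sample))
```

Running the check on the raw bytes, before any decoding, avoids depending on the very encoding that is being detected.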

Solr query parser doesn't invoke analyzer for simple term query?

2010-03-16 Thread Teruhiko Kurosaka
It seems that Solr's query parser doesn't pass a single term query to the Analyzer for the field. For example, if I give it 2001年 (year 2001 in Japanese), the searcher returns 0 hits but if I quote them with double-quotes, it returns hits. In this experiment, I configured schema.xml so that the

Solr doesn't pick up the updated .xsl file. Where does it keep the cache?

2010-03-15 Thread Teruhiko Kurosaka
I have been seeing strange phenomena. I've written an HTML form that calls Solr like this: http://localhost:8983/solr/select/?q=Basis&df=text&wt=xslt&tr=btdemo.xsl It works. But when I change the contents of solr/conf/xslt/btdemo.xsl and restart solr, it still shows the behavior of the older version
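
The request in this post appears to carry four parameters (q, df, wt and tr). A minimal Python sketch of assembling such a URL with explicit separators follows; the helper name `build_select_url` is hypothetical, and the host, path and parameter values are simply those quoted in the message.

```python
from urllib.parse import urlencode

def build_select_url(base, params):
    """Join a base URL and query parameters with proper '&' separators."""
    return base + "?" + urlencode(params)

url = build_select_url(
    "http://localhost:8983/solr/select/",
    {"q": "Basis", "df": "text", "wt": "xslt", "tr": "btdemo.xsl"},
)
print(url)
```

Building the query string from a mapping also takes care of percent-encoding, which matters once the query contains non-ASCII text.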

Re: Solr doesn't pick up the updated .xsl file. Where does it keep the cache?

2010-03-15 Thread Teruhiko Kurosaka
on at jetty level, perhaps. On Mar 15, 2010, at 1:27 PM, Teruhiko Kurosaka wrote: I have been seeing strange phenomena. I've written a HTML form that calls Solr like this: http://localhost:8983/solr/select/?q=Basis&df=text&wt=xslt&tr=btdemo.xsl It works. But when I change the contents of solr

RE: Solr wiki link broken

2010-01-27 Thread Teruhiko Kurosaka
Why don't we change the links to have FrontPage explicitly? Wouldn't it be the easiest fix unless there are numerous other pages that reference the default page w/o FrontPage? -kuro -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tuesday, January

Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text in the side bar text after expansion are the link to http://wiki.apache.org/solr But the wiki site seems to be broken. The above link took me to a generic help page of the Wiki system. What's going on? Did I just hit

RE: Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
I'm sorry. Please ignore this duplicate posting. From: Teruhiko Kurosaka Sent: Tuesday, January 26, 2010 8:32 AM To: solr-user@lucene.apache.org Subject: Solr wiki link broken In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text

RE: Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
moments though. Erik On Jan 26, 2010, at 1:23 AM, Teruhiko Kurosaka wrote: In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text in the side bar text after expansion are the link to http://wiki.apache.org/solr But the wiki site seems to be broken. The above

RE: Solr wiki link broken

2010-01-26 Thread Teruhiko Kurosaka
One more comment on this. I can see this page http://wiki.apache.org/solr/SolrTomcat w/o a problem, for example. Or I can see this: http://wiki.apache.org/solr/FrontPage I think it's only the main page without actual page name http://wiki.apache.org/solr/ that is having the problem. So the

Solr wiki link broken

2010-01-25 Thread Teruhiko Kurosaka
In http://lucene.apache.org/solr/ the wiki tab and Docs (wiki) hyper text in the side bar text after expansion are the link to http://wiki.apache.org/solr But the wiki site seems to be broken. The above link took me to a generic help page of the Wiki system. What's going on? Did I just hit

What is the proper way to deploy Solr with a custom schema.xml that requires extra JARs?

2010-01-12 Thread Teruhiko Kurosaka
I have schema.xml that uses a Tokenizer that I wrote. I understand the standard way of deploying Solr is to place solr.war in webapps directory, have a separate directory that has conf files under its conf subdirectory, and specify that directory as Solr home dir via either JVM property or JNDI.

Can solr web site have multiple versions of online API doc?

2009-12-15 Thread Teruhiko Kurosaka
Lucene keeps multiple versions of its API doc online at http://lucene.apache.org/java/X_Y_Z/api/all/index.html for version X.Y.Z. I am finding this very useful when comparing different versions. This is also good because the javadoc comments that I write for my software can reference the API

Dumping solr requests for indexing

2009-12-04 Thread Teruhiko Kurosaka
Is there any way to dump all incoming requests to Solr into a file? My customer is seeing a strange problem of disappearing docs from index and I'd like to ask them to capture all incoming requests. Thanks. -kuro

RE: Dumping solr requests for indexing

2009-12-04 Thread Teruhiko Kurosaka
Message From: Teruhiko Kurosaka k...@basistech.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Sent: Fri, December 4, 2009 2:23:17 PM Subject: Dumping solr requests for indexing Is there any way to dump all incoming requests to Solr into a file? My customer

RE: long startup time

2009-10-27 Thread Teruhiko Kurosaka
From: Grant Ingersoll [mailto:gsing...@apache.org] Sent: Tuesday, October 27, 2009 1:15 PM To: solr-user@lucene.apache.org Subject: Re: long startup time How big is your index? Can you share your solrconfig? Have you looked at it in a profiler during this time? What is it doing? The

Solr 1.4 (RC) performance on multi-CPU system

2009-10-26 Thread Teruhiko Kurosaka
Is Solr 1.4 (Release Candidate) supposed to take advantage of a multi-CPU (core) system? I.e. if more than one update or search request comes in at about the same time, can they be automatically assigned to different CPUs if available (and the OS does its job right)? BTW, the term multicore in Solr

RE: Solr 1.4 (RC) performance on multi-CPU system

2009-10-26 Thread Teruhiko Kurosaka
: Solr 1.4 (RC) performance on multi-CPU system 2009/10/26 Teruhiko Kurosaka k...@basistech.com: Is Solr 1.4 (Release Candidate) supposed to take advantage of a multi-CPU (core) system? I.e. if more than one update or search request comes in at about the same time, can they be automatically

long startup time

2009-10-26 Thread Teruhiko Kurosaka
I've been testing Solr 1.4.0 (RC). After some time, Solr started to pause for a long time (a minute or two) after printing: INFO: jetty-6.1.3 Sometimes it starts immediately, but more often than not, it pauses. Is there any known cause of this kind of long pause? -kuro

(Solr 1.4 dev) Why solr.common.* packages are in solrj-*.jar ?

2009-10-14 Thread Teruhiko Kurosaka
I've downloaded solr-2009-10-12.zip and tried to compile my TokenizerFactory implementation against this version of Solr. Compilation failed. One of the causes is that the compiler couldn't find org.apache.solr.common.ResourceLoader. I discovered this class in apache-solr-solrj-nightly.jar. I

RE: Right place to put my Tokenizer jars

2009-10-14 Thread Teruhiko Kurosaka
Actually, I meant to say I have my Tokenizer jars in solr/lib. I have the jars that my Tokenizer jars depend on in lib/ext, as I wanted them to be loaded only once per container due to their internal description. Bad idea? -kuro From: Teruhiko Kurosaka Sent: Wednesday, October 14, 2009 4:28 PM

RE: 1.3.0 candidate

2008-09-15 Thread Teruhiko Kurosaka
The release candidate is up again. -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: Monday, September 08, 2008 10:34 AM To: solr-user@lucene.apache.org Subject: Re: 1.3.0 candidate This is temporarily removed, as I need to create another. On Sep 7,

RE: 1.3.0 candidate

2008-09-08 Thread Teruhiko Kurosaka
Grant, Is this coming back soon? Rough estimate? -kuro -Original Message- From: Grant Ingersoll [mailto:[EMAIL PROTECTED] Sent: Monday, September 08, 2008 10:34 AM To: solr-user@lucene.apache.org Subject: Re: 1.3.0 candidate This is temporarily removed, as I need to create

schema.xml compatibility

2008-07-09 Thread Teruhiko Kurosaka
I've noticed that schema.xml in the dev version of Solr spells what used to be fieldtype as fieldType with capital T. Are there any other compatibility issues between the would-be Solr 1.3 and Solr 1.2? How soon will Solr 1.3 be available, by the way? Basis Technology Corporation, San

RE: question about bi-gram analysis on query

2007-10-04 Thread Teruhiko Kurosaka
Hello David, And if I do a search in Luke and the solr analysis page for 美聯, I get a hit. But on the actual search, I don't. I think you need to tell us what you mean by actual search and your code that interfaces with Solr. -kuro

RE: What is facet?

2007-09-27 Thread Teruhiko Kurosaka
Thank you Ezra and Chris for explaining this, and I like your idea, Erik. This will make the intro to Solr easier for newcomers, and make Solr more popular. -Kuro That example is definitely in the cool category. I couldn't resist creating a SolrTerminology wiki page linking to your post

What is facet?

2007-09-26 Thread Teruhiko Kurosaka
Could someone tell me what facet is? I have a vague idea but I am not too clear. A pointer to a sample web site that uses Solr facet would be very good. Thanks. -Kuro

How to use solrj ?

2007-08-29 Thread Teruhiko Kurosaka
Can anyone tell me how to use the Java client? I downloaded the complete source from SVN solr trunk and took a look at files under client/java but no .java file has main(). Nor do I see a README. -kuro

RE: SolJava --- which attachments are valid?

2007-08-22 Thread Teruhiko Kurosaka
: Friday, August 03, 2007 12:50 PM To: solr-user@lucene.apache.org Subject: Re: SolJava --- which attachments are valid? Teruhiko Kurosaka wrote: or you can get it from the nightly builds in: http://people.apache.org/builds/lucene/solr/nightly/ For those of you who are interested

RE: Logging in Solr Embedded

2007-08-03 Thread Teruhiko Kurosaka
I think it's best to control the log level with an external file; you don't want to reprogram when you need logging. Define the system property java.util.logging.config.file to point to your log properties file. I would copy $JAVA_HOME/jre/lib/logging.properties and then add a line: org.apache.solr.level =

RE: SolJava --- which attachments are valid?

2007-08-03 Thread Teruhiko Kurosaka
Some form of some files from SOLR-20 should work, but I would suggest using the client in trunk now: http://svn.apache.org/repos/asf/lucene/solr/trunk/client/java/solrj/ Thanks. I updated http://wiki.apache.org/solr/SolJava to reflect the new state of this component. or you can get

RE: SolJava --- which attachments are valid?

2007-08-03 Thread Teruhiko Kurosaka
or you can get it from the nightly builds in: http://people.apache.org/builds/lucene/solr/nightly/ For those of you who are interested... As far as I can tell by inspecting the source code in Trunk, solrj.jar from the nightly doesn't seem to work with Solr 1.2. For one thing, there is a new

bogus multiple values encountered for non multiValued field text error on post?

2007-08-01 Thread Teruhiko Kurosaka
I'm using Solr 1.1. I ran: post.sh vidcard.xml (with URL modified in post.sh) then got an error: Posting file vidcard.xml to http://localhost:28080/solr/update <result status="400">ERROR: multiple values encountered for non multiValued field text: first='ASUS Extreme N7800GTX/2DHTV (256 MB)'
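
When chasing this kind of error, a first step can be to check whether the posted file itself repeats a field inside one document. The Python sketch below does that; the helper name `repeated_fields` and the sample XML are hypothetical, and the check only sees repeats in the input file, not values routed into one field by schema rules such as copyField.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def repeated_fields(add_xml):
    """Return field names that occur more than once within any <doc>."""
    repeated = set()
    for doc in ET.fromstring(add_xml).iter("doc"):
        counts = Counter(field.get("name") for field in doc.iter("field"))
        repeated.update(name for name, n in counts.items() if n > 1)
    return sorted(repeated)

# Hypothetical update message with one field given twice.
sample = """<add><doc>
  <field name="id">EXAMPLE-1</field>
  <field name="features">first value</field>
  <field name="features">second value</field>
</doc></add>"""
print(repeated_fields(sample))
```

If the list comes back empty, the duplicate values are more likely produced on the server side than present in the file.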

RE: Indexing HTML and other doc types

2007-07-05 Thread Teruhiko Kurosaka
Thank you, Otis and Peter, for your replies. From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] doc of some type - parse content into various fields - post to Solr I understand this part, but the question is who should do this. I was under the assumption that it's the Solr client's job to crawl the

Indexing HTML and other doc types

2007-07-03 Thread Teruhiko Kurosaka
Solr looks very good for indexing and searching structured data. But I noticed there is no tool in the Solr distribution with which documents of other doc types can be indexed. Are there other side projects that develop Solr clients for indexing documents of other doc types? Or is the generic

RE: Multi-language Tokenizers / Filters recommended?

2007-06-22 Thread Teruhiko Kurosaka
Hi Daniel, As you know, Chinese and Japanese do not use spaces or any other delimiters to break words. To overcome this problem, CJKTokenizer uses a method called bi-gram where the run of ideographic (=Chinese) characters are made into tokens of two neighboring characters. So a run of five
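
The bi-gram idea can be sketched in a few lines of Python. This is an illustration of the technique only, not CJKTokenizer's actual code; the function name `bigrams` is hypothetical, and returning a lone character as a single token is an assumption.

```python
def bigrams(run):
    """Split a run of characters into overlapping two-character tokens."""
    if len(run) < 2:
        # Assumption: a run too short to pair is kept as one token.
        return [run] if run else []
    return [run[i:i + 2] for i in range(len(run) - 1)]

print(bigrams("一二三四五"))
```

Each token shares one character with its neighbor, so a run of n characters yields n - 1 tokens.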

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka
Daniel, I was reading your email and responses to it with great interest. I was aware that Solr has an implicit assumption that a field is mono-lingual per system. But your mail and its correspondence made me wonder if this limitation is practical for multi-lingual search applications. For

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka
Hi Yonik, On 6/12/07, Teruhiko Kurosaka [EMAIL PROTECTED] wrote: For bi-lingual or tri-lingual search, we can have parallel fields (title_en, title_fr, title_de, for example) but this wouldn't scale well. Due to search across multiple fields, or due to increased index size? Due
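
To make the parallel-field approach concrete, here is a small Python sketch that expands one search term across language-specific copies of a field. The function name `parallel_field_query` is hypothetical, and the query syntax is the plain field:term form joined with OR.

```python
def parallel_field_query(term, base_field, languages):
    """Build one OR clause per language-specific copy of a field."""
    clauses = [f"{base_field}_{lang}:{term}" for lang in languages]
    return " OR ".join(clauses)

print(parallel_field_query("solr", "title", ["en", "fr", "de"]))
```

The query gains one clause per language, which hints at why the approach becomes unwieldy as languages are added.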

RE: Solr 1.2 released

2007-06-08 Thread Teruhiko Kurosaka
I noticed there is no example/ext directory or the jars that were found there in 1.1 (commons-el.jar, commons-logging.jar, jasper-*.jar, mx4j-*.jar) I have a jar that my Solr plugin depends on. This jar contains a class that needs to be loaded only once per container because it is a JNI library.

Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
I made a plugin that has a Tokenizer, its Factory, a Filter and its Factory. I modified example/solr/conf/schema.xml to use these Factories. Following http://wiki.apache.org/solr/SolrPlugins I placed the plugin jar in the top level lib and ran the start.jar. I got:

RE: Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
This is about Solr 1.1.0 running on Win XP w/JDK 1.5. Thank you. -Original Message- From: Teruhiko Kurosaka [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 06, 2007 5:32 PM To: solr-user@lucene.apache.org Subject: Where to put my plugins? I made a plugin that has a Tokenizer, its

RE: Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
Ryan, Thank you. But creating lib under example/solr and placing my plugin jar there yielded the same error of not being able to locate org/apache/solr/analysis/BaseTokenizerFactory How can this be -kuro

RE: Where to put my plugins?

2007-06-06 Thread Teruhiko Kurosaka
Never mind. My mistake. I still had a copy of the jar in ext dir. After cleaning it up, it's now loading my plugin. THANK YOU VERY MUCH! -Original Message- From: Teruhiko Kurosaka [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 06, 2007 5:58 PM To: solr-user@lucene.apache.org

RE: Proper ways to handle errors in BaseTokenFilterFactory subclasses

2007-05-30 Thread Teruhiko Kurosaka
Ryan, Thank you for your reply, but I can't find this class SolrException.ErrorCode in Solr 1.1. The Solr source seems to be giving a random number, 400, 500, etc. for the first arg to SolrException constructor. (Is there any unwritten convention?) Is SolrException.ErrorCode new to the latest

Proper ways to handle errors in BaseTokenFilterFactory subclasses

2007-05-29 Thread Teruhiko Kurosaka
When the parameter to a token filter is out of range, or a mandatory parameter is not given, what is the proper way to fail in the init() and create() methods? Should I throw a RuntimeException? Or should I simply call SolrCore.log.severe(message)? Is it OK for create() to return null when the

RE: How to handle hl.fl form variable (any variable with a dot in its name) from javascript?

2007-05-22 Thread Teruhiko Kurosaka
Ryan, Thank you. The JavaScript code you mentioned works well. But I am now hitting a similar problem with XSLT. The following XSLT code can't retrieve the value of the hl.fl parameter even though similar code for other parameters works. xsl:variable name=hlfl

How to handle hl.fl form variable (any variable with a dot in its name) from javascript?

2007-05-21 Thread Teruhiko Kurosaka
I have a form that sets the hl.fl form hidden variable. I wanted to change the highlighted field depending on the query string that is typed, using JavaScript. This is normally done by JavaScript code like this: document.myform.varname.value = whatever But this doesn't work for hl.fl

What does the name attribute of lst element in highlighting result mean?

2007-05-15 Thread Teruhiko Kurosaka
I am trying to understand the highlighting output example, the last one in this page: http://wiki.apache.org/solr/StandardRequestHandler The example shows that the top-level element of a set of highlighted results for a document is <lst name="SOLR1000">. What does this, SOLR1000, mean? Or

RE: problem installing solr

2007-05-15 Thread Teruhiko Kurosaka
I've had this a few weeks ago. You are probably starting Tomcat from somewhere other than the Solr home. See Simple Example Install section of http://wiki.apache.org/solr/SolrTomcat There, tomcat is started from the Solr home by: ./apache-tomcat-5.5.20/bin/startup.sh If you do cd

RE: cwd requirement to run Solr with Tomcat

2007-05-10 Thread Teruhiko Kurosaka
BTW, The Simple Example Install section in http://wiki.apache.org/solr/SolrTomcat leaves the unzipped directory apache-solr-nightly-incubating intact, but this is not needed after copying the solr.war and the example solr directory, is it? Can I edit the instruction to insert: rm -r

Does Solr XSL writer work with Arabic text?

2007-05-10 Thread Teruhiko Kurosaka
I'm trying to search an index of docs which have text fields in Arabic, using XSL writer (wt=xslt&tr=example.xsl). But the Arabic text gets all garbled. Is XSL writer known to work for Arabic text? Is anybody using it? -kuro

RE: Facet only support english?

2007-05-10 Thread Teruhiko Kurosaka
If my memory is correct, UTF-8 has been the default encoding per XML specification from a very early stage. If the XML parser is not defaulting to UTF-8 in absence of the encoding attribute, that means the XML parser has a bug, and the code should be corrected. (I don't have an objection to add
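
The default can be observed with a conforming parser. The Python sketch below feeds the standard library's ElementTree a UTF-8 byte string that has no XML declaration and no byte-order mark; the element and attribute names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# UTF-8 bytes with no XML declaration and no byte-order mark.
payload = "<field name=\"title\">日本語</field>".encode("utf-8")

# The parser has to fall back on the default encoding here.
root = ET.fromstring(payload)
print(root.text == "日本語")
```

A parser that decoded these bytes as Latin-1 or a platform default would return mojibake instead of the original three characters.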

cwd requirement to run Solr with Tomcat

2007-05-08 Thread Teruhiko Kurosaka
I struggled to run Solr in Tomcat 5.5 (or 6.0 for that matter). Then I found a step-by-step instruction at http://wiki.apache.org/solr/SolrTomcat and followed it as much as possible (wget URL didn't work, so I had to download using browser). Then Solr worked. An important factor in the

RE: cwd requirement to run Solr with Tomcat

2007-05-08 Thread Teruhiko Kurosaka
Thank you, Hoss, for replying to my question. : An important factor in the instruction is that Tomcat must : be started from the directory under which the solr directory : (copied from the example) exists that's not true. if you use JNDI or system properties to configure the solr home,