Re: Solr 1.30 Collection Distribution Search

2011-04-12 Thread Li
On Apr 12, 2011, at 11:47 AM, Erick Erickson erickerick...@gmail.com wrote: Yes. You need to put, say, a load balancer in front of your slaves and distribute the requests to the slaves. Best Erick On Tue, Apr 12, 2011 at 2:20 PM, Li Tan litan1...@gmail.com wrote: I have 1 master, and 2

Re: Regarding filterquery

2011-04-13 Thread Li
You should just ask me. Sent from my iPhone On Apr 13, 2011, at 11:27 AM, soumya rao soumrao...@gmail.com wrote: Thanks for the reply Josh. And where should I make changes in ruby to add filters? Soumya On Wed, Apr 13, 2011 at 11:20 AM, Joshua Bouchair

Curl bulk XML

2011-04-13 Thread Li
Hey guys, how do you use curl to update all the XML files inside a folder from A-D? Example: curl http://localhost:8080/solr update Sent from my iPhone

Re: TikaEntityProcessor

2011-04-19 Thread Li
Looks like dependencies. Did you or he include the dependencies in the solrconfig? Sent from my iPhone On Apr 19, 2011, at 8:35 AM, Oleg Tikhonov o...@apache.org wrote: Hello everybody, Recently, I got a message from a guy who was asking about TikaEntityProcessor. He uses Solr 1.4 and

Re: Indexing 20M documents from MySQL with DIH

2011-04-21 Thread Li
Can you post the data-config.xml? You probably didn't set batchSize. Sent from my iPhone On Apr 21, 2011, at 5:09 PM, Scott Bigelow eph...@gmail.com wrote: Thanks for the e-mail. I probably should have provided more details, but I was more interested in making sure I was approaching the

how to test solr's performance?

2010-06-09 Thread Li Li
Are there any built-in tools for performance testing? Thanks

how to patch?

2010-06-12 Thread Li Li
I want to use the fast highlighter in Solr 1.4 and found an issue at https://issues.apache.org/jira/browse/SOLR-1268 (attachment: SOLR-1268.patch, 2010-02-05 10:32 PM, attached by Koji)

collapse exception

2010-06-21 Thread Li Li
It says Either filter or filterList may be set in the QueryCommand, but not both. I am a newbie to Solr and have no idea what this exception means. What's wrong? Thank you. java.lang.IllegalArgumentException: Either filter or filterList may be set in the QueryCommand, but not both. at

Re: collapse exception

2010-06-21 Thread Li Li
I don't know, because it was patched by someone else and I can't get his help. When will this component become a contrib? Using patches is so annoying. 2010/6/22 Martijn v Groningen martijn.is.h...@gmail.com: What version of Solr and which patch are you using? On 21 June 2010 11:46, Li Li fancye

about function query

2010-06-22 Thread Li Li
I want to integrate a document's timestamp into the search scoring, and I found an example of a function query in the book Solr 1.4 Enterprise Search Server. I want to boost newer documents, so it may be a function such as 1/(timestamp+1). But the function query is added to the final
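
A recency boost like this is usually expressed as a boost function on the query rather than a raw 1/(timestamp+1). A minimal SolrJ sketch, assuming a dismax handler, a trie date field named "timestamp", and Solr 1.4+ where the ms() function is available (field names and URL are illustrative, not from the thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class RecencyBoostExample {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("enterprise search");
            q.set("defType", "dismax");
            q.set("qf", "title text");
            // recip(x,m,a,b) = a / (m*x + b); smaller age -> larger boost,
            // 3.16e-11 roughly scales milliseconds to years
            q.set("bf", "recip(ms(NOW,timestamp),3.16e-11,1,1)");
            System.out.println(solr.query(q).getResults().getNumFound());
        }
    }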

is there a delete all command in updateHandler?

2010-06-27 Thread Li Li
I want to delete the whole index and rebuild it frequently. I can't delete the index files directly because I want to use replication
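
A delete-by-query of *:* is the usual way to clear an index without touching files on disk. A minimal SolrJ sketch (URL is illustrative; CommonsHttpSolrServer is the SolrJ client class of the Solr 1.4 era):

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class DeleteAllExample {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            solr.deleteByQuery("*:*"); // matches every document in the index
            solr.commit();             // make the deletion visible; replication picks up the new generation
        }
    }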

index format error because disk full

2010-07-06 Thread Li Li
The index is ill-formatted because the disk filled up while feeding. Can I roll back to the last version? Is there any way to avoid unexpected errors when indexing? Attached are my segments_N files

How to manage resource out of index?

2010-07-07 Thread Li Li
I used to store the full text in the Lucene index. But I found merging is very slow, because merging 2 segments copies the fdt files into a new one. So I want to only index the full text. But when searching I need the full text for applications such as highlighting and viewing the full text. I can

Re: index format error because disk full

2010-07-07 Thread Li Li
I used SegmentInfos to read the segments_N file and found that the error is that it tries to load deletedDocs but the .del file's size is 0 (because of the disk error). So I used SegmentInfos to set delGen=-1 to ignore deleted docs. But I think there is a bug. The write logic may be -- it first writes the

Distributed Indexing

2010-07-08 Thread Li Li
Are there any tools for distributed indexing? http://wiki.apache.org/solr/DistributedSearch refers to KattaIntegration and ZooKeeperIntegration, but it seems they are concerned more with error handling and replication. I need a dispatcher that dispatches different docs by uniqueKey (such

how to save a snapshot of an index?

2010-07-12 Thread Li Li
When I add some docs with post.jar (org.apache.solr.util.SimplePostTool), it commits after all docs are added, which calls IndexWriter.commit(). A new segment is added and sometimes it triggers a segment merge. New index files are generated (frm, tii, tis, ...). Old segments will be

Cache full text into memory

2010-07-14 Thread Li Li
I want to cache the full text in memory to improve performance. The full text is only used for highlighting in my application (but it's very time consuming; my average query time is about 250ms, and I guess it would cost about 50ms if I just fetch the top 10 full texts. Things get worse when fetching more full text because

Re: Cache full text into memory

2010-07-14 Thread Li Li
at 12:08 PM, Li Li fancye...@gmail.com wrote:     I want to cache full text into memory to improve performance. Full text is only used to highlight in my application(But it's very time consuming, My avg query time is about 250ms, I guess it will cost about 50ms if I just get top 10 full text

Re: Cache full text into memory

2010-07-14 Thread Li Li
at 12:39 PM, Li Li fancye...@gmail.com wrote: I have already stored it in the Lucene index. But it is on disk, and when a query comes, it must seek the disk to get it. I am not familiar with Lucene's caches. I just want to make full use of my memory: load 10GB of it into memory with an LRU strategy when the cache is full

about warm up

2010-07-14 Thread Li Li
I want to load the full text into an external cache, so I added some code in newSearcher, where I found the warm-up takes place. I add my code before the Solr warm-up, which is configured in solrconfig.xml like this: <listener event="firstSearcher" class="solr.QuerySenderListener"> <arr name="queries"

Re: Solr Statistics, num docs

2010-07-16 Thread Li Li
numDocs is the total number of indexed docs. Maybe your docs have duplicate keys; when a key is duplicated, the older doc is deleted. uniqueKey is defined in schema.xml. 2010/7/16 Karthik K karthikkato...@gmail.com: Hi, Is numDocs in solr statistics equal to the total number of documents that are

Re: Ranking based on term position

2010-07-19 Thread Li Li
I have considered this problem and tried to solve it using 2 methods. With these methods, we can also boost a doc by the relative positions of the query terms. 1: add term position when indexing; modify TermScorer.score: public float score() { assert doc != -1; int f = freqs[pointer];

a bug of solr distributed search

2010-07-21 Thread Li Li
In QueryComponent.mergeIds, it removes documents whose uniqueKey duplicates another's. In the current implementation, it uses the first one encountered. String prevShard = uniqueDoc.put(id, srsp.getShard()); if (prevShard != null) { // duplicate detected

Re: a bug of solr distributed search

2010-07-21 Thread Li Li
But users will think there is something wrong with it when they run the same query and get different results. 2010/7/21 MitchK mitc...@web.de: Li Li, this is the intended behaviour, not a bug. Otherwise you could get back the same record in a response several times, which may

Re: a bug of solr distributed search

2010-07-21 Thread Li Li
Yes. This will make users think our search engine has a bug. From the comments in the code, more needs to be done: if (prevShard != null) { // For now, just always use the first encountered since we can't currently // remove the previous one added to the

Re: a bug of solr distributed search

2010-07-21 Thread Li Li
I think what Siva means is that when there are docs with the same url, we should keep the doc whose score is larger. This is the right solution. But it shows a problem of distributed search without a common idf: a doc will get a different score in different shards. 2010/7/22 MitchK mitc...@web.de: It already was

Re: Which is a good XPath generator?

2010-07-25 Thread Li Li
It's not a topic related to Solr; maybe you should read some papers about wrapper generation or automatic web data extraction. If you want to generate XPath, you could read Liu Bing's papers such as Structured Data Extraction from the Web based on Partial Tree Alignment. Besides DOM

Re: a bug of solr distributed search

2010-07-25 Thread Li Li
Where is the link to this patch? 2010/7/24 Yonik Seeley yo...@lucidimagination.com: On Fri, Jul 23, 2010 at 2:23 PM, MitchK mitc...@web.de wrote: why don't we send the output of TermsComponent of every node in the cluster to a Hadoop instance? Since TermsComponent does the map-part of the

Re: a bug of solr distributed search

2010-07-25 Thread Li Li
The Solr version I use is 1.4. 2010/7/26 Li Li fancye...@gmail.com: where is the link of this patch? 2010/7/24 Yonik Seeley yo...@lucidimagination.com: On Fri, Jul 23, 2010 at 2:23 PM, MitchK mitc...@web.de wrote: why do we do not send the output of TermsComponent of every node

Re: Problem with parsing date

2010-07-26 Thread Li Li
I use a format like yyyy-MM-ddThh:mm:ssZ; it works. 2010/7/26 Rafal Bluszcz Zawadzki ra...@headnet.dk: Hi, I am using the Data Import Handler from Solr 1.4. Parts of my data-config.xml are: <entity name="page" processor="XPathEntityProcessor" stream="false"

Is there a cache for a query?

2010-07-26 Thread Li Li
I want a cache that caches the whole result of a query (all steps, including collapse, highlight and facet). I read http://wiki.apache.org/solr/SolrCaching, but can't find a global cache. Maybe I can use an external cache to store key-value pairs. Is there one in Solr?

Re: Speed up Solr Index merging

2010-07-29 Thread Li Li
I faced this problem but couldn't find any good solution. If you have a large stored field, such as the document's full text, and you don't store it in Lucene, merging will be quicker, because merging 2 indexes forces copying all fdt files into a new fdt. If you store it externally, the problem you have to face is

Re: Solr searching performance issues, using large documents

2010-07-30 Thread Li Li
Highlighting time is mainly spent on getting the field you want to highlight and tokenizing that field (if you don't store term vectors). You can check what's wrong. 2010/7/30 Peter Spam ps...@mac.com: If I don't do highlighting, it's really fast.  Optimize has no effect. -Peter On Jul

how to ignore position in indexing?

2010-07-31 Thread Li Li
hi all, in Lucene, can we store only the tf in a term's inverted list? In my application, I only provide dismax queries with boolean queries and don't support queries that need position info, such as phrase queries. So I don't want to store position info in the prx file. How do I turn it off? And if I turn it off

a small problem of distributed search

2010-08-16 Thread Li Li
The current implementation of distributed search uses the unique key in the STAGE_EXECUTE_QUERY stage. public int distributedProcess(ResponseBuilder rb) throws IOException { ... if (rb.stage == ResponseBuilder.STAGE_EXECUTE_QUERY) { createMainQuery(rb); return

Re: Color search for images

2010-09-16 Thread Li Li
Do you mean content-based image retrieval or just searching images by tag? If the former, you can try LIRE. 2010/9/15 Shawn Heisey s...@elyograg.org:  My index consists of metadata for a collection of 45 million objects, most of which are digital images.  The executives have fallen in love with

Re: Can Solr do approximate matching?

2010-09-22 Thread Li Li
It seems there is a MoreLikeThis in Lucene; I don't know whether there is a counterpart in Solr. It just uses the found document as a query to find similar documents. Or you can just use a boolean OR query, and similar questions will get higher scores. Of course, you can analyse the question using some NLP

is multi-threads searcher feasible idea to speed up?

2010-09-28 Thread Li Li
hi all I want to speed up search time for my application. In a query, the time is largely spent reading posting lists (I/O with frq files) and calculating scores and collecting results (CPU, with a priority queue). The I/O is hard to optimize, or is already partly optimized by NIO. So I want to use multiple threads to

Re: is multi-threads searcher feasible idea to speed up?

2010-09-28 Thread Li Li
Yes, there is a MultiSearcher in Lucene, but its idf across 2 indexes is not global. Maybe I can modify it and also the index, like: term1 df=5 doc1 doc3 doc5 term1 df=5 doc2 doc4 2010/9/28 Li Li fancye...@gmail.com: hi all    I want to speed up search time for my application. In a query

more sql-like commands for solr

2012-02-07 Thread Li Li
hi all, we have used Solr to provide search services in many products. I found that for each product we have to write some configuration and query expressions. Our users are not used to this; they are familiar with SQL and may describe it like this: I want a query that can search books whose

Re: Chinese Phonetic search

2012-02-07 Thread Li Li
You can convert Chinese words to pinyin and use n-grams to search for phonetically similar words. On Wed, Feb 8, 2012 at 11:10 AM, Floyd Wu floyd...@gmail.com wrote: Hi there, Has anyone here ever implemented phonetic search, especially with Chinese (traditional/simplified), using SOLR or Lucene?

Re: New segment file created too often

2012-02-13 Thread Li Li
Commit is called after adding each document; you should add enough documents and then call a commit. Commit is a costly operation. If you want to see the latest fed documents, you could use NRT. On Tue, Feb 14, 2012 at 12:47 AM, Huy Le hu...@springpartners.com wrote: Hi, I am using solr
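
A minimal SolrJ sketch of the batch-then-commit pattern described above (field names, batch size and URL are illustrative):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchCommitExample {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
            for (int i = 0; i < 10000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                doc.addField("title", "document " + i);
                batch.add(doc);
                if (batch.size() == 1000) {   // send in chunks instead of one doc at a time
                    solr.add(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) solr.add(batch);
            solr.commit();                    // one commit at the end, not one per document
        }
    }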

Re: New segment file created too often

2012-02-13 Thread Li Li
version of solr. Huy On Mon, Feb 13, 2012 at 11:55 AM, Li Li fancye...@gmail.com wrote: Commit is called after adding each document you should add enough documents and then calling a commit. commit is a cost operation. if you want to get latest feeded documents, you could use NRT

Re: New segment file created too often

2012-02-13 Thread Li Li
. Are the commit calls triggering new segment files being created? I don't see this behavior in another environment of the same version of solr. Huy On Mon, Feb 13, 2012 at 11:55 AM, Li Li fancye...@gmail.com wrote: Commit is called after adding each document you should add enough

Re: Can I rebuild an index and remove some fields?

2012-02-13 Thread Li Li
Method 1: dump the data. For stored fields, you can traverse the whole index and save it somewhere else. For indexed-but-not-stored fields it may be more difficult. If the indexed, not-stored field is not analyzed (fields such as id), it's easy to get from FieldCache.StringIndex. But

Re: Can I rebuild an index and remove some fields?

2012-02-13 Thread Li Li
For method 2, delete is wrong; we can't delete terms. You would also have to hack the tii and tis files. On Tue, Feb 14, 2012 at 2:46 PM, Li Li fancye...@gmail.com wrote: method1, dumping data for stored fields, you can traverse the whole index and save it to somewhere else. for indexed

Re: Can I rebuild an index and remove some fields?

2012-02-14 Thread Li Li
wrapper=new FilterIndexReader(reader); SegmentMerger merger=new SegmentMerger(writer); merger.add(wrapper); merger.Merge(); On Feb 14, 2012, at 1:49 AM, Li Li wrote: for method 2, delete is wrong. we can't delete terms. you also should hack with the tii and tis file. On Tue, Feb

Re: Can I rebuild an index and remove some fields?

2012-02-15 Thread Li Li
. Implementation uses separate thread for each segment, so it re-writes them in parallel. Took about 15 minutes to do 770,000 doc index on my macbook. On Tue, Feb 14, 2012 at 10:12 PM, Li Li fancye...@gmail.com wrote: I have roughly read the codes of 4.0 trunk. maybe it's feasible

Re: Sort by the number of matching terms (coord value)

2012-02-16 Thread Li Li
You can fool the Lucene scoring function: override each method such as idf, queryNorm and lengthNorm and let them simply return 1.0f. I don't know whether Lucene 4 will expose more details, but for 2.x/3.x, Lucene can only score by the vector space model and the formula can't be replaced by users. On Fri, Feb 17,
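
A minimal sketch of such a "flattened" Similarity for the Lucene 2.x/3.x API (the exact overridable method set varies slightly between 3.x releases; coord is left alone so the number of matching terms still dominates the score):

    import org.apache.lucene.search.DefaultSimilarity;

    public class CoordOnlySimilarity extends DefaultSimilarity {
        @Override public float tf(float freq)                          { return 1.0f; }
        @Override public float idf(int docFreq, int numDocs)           { return 1.0f; }
        @Override public float queryNorm(float sumOfSquaredWeights)    { return 1.0f; }
        @Override public float lengthNorm(String fieldName, int terms) { return 1.0f; }
        // coord(overlap, maxOverlap) is inherited, so docs matching more query terms rank higher
    }

Note that lengthNorm is baked into the norms at index time, so for the length part to really be flat the field either needs omitNorms or the index must be built with this Similarity as well.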

Re: Fw:how to make fdx file

2012-03-04 Thread Li Li
Lucene never modifies old segment files; it just flushes into a new segment or merges old segments into a new one. After merging, the old segments are deleted. Once a file (such as fdt and fdx) is generated, it is never re-generated. The only possibility is that in the generating stage, there is

Re: How to limit the number of open searchers?

2012-03-06 Thread Li Li
What do you mean by programmatically? Modify the code of Solr? Because Solr is not like Lucene: it only provides HTTP interfaces for its users rather than a Java API. If you want to modify Solr, you can find the code in SolrCore: private final LinkedList<RefCounted<SolrIndexSearcher>> _searchers = new

Re: index size with replication

2012-03-13 Thread Li Li
Optimize will generate new segments and delete old ones. If your master also provides a searching service during indexing, the old files may still be held open by an old SolrIndexSearcher; they will be deleted later. So during indexing the index size may double, but a moment later the old index files will be deleted.

Re: Sorting on non-stored field

2012-03-14 Thread Li Li
It should be indexed but not analyzed; it doesn't need to be stored. Reading field values from stored fields is extremely slow, so Lucene uses a StringIndex to read field values for sorting. So if you want to sort by some field, you should index that field and not analyze it. On Wed, Mar 14, 2012 at 6:43 PM,
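
A small SolrJ illustration of sorting on such an indexed, non-analyzed field (the field name "price" and the URL are made up for the example; addSortField is the 1.4/3.x-era method):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class SortExample {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("*:*");
            // "price" must be indexed and not analyzed (e.g. a string or trie field); stored is not required
            q.addSortField("price", SolrQuery.ORDER.desc);
            System.out.println(solr.query(q).getResults());
        }
    }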

Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
There is a class org.apache.solr.common.util.XML in Solr; you can use this wrapper:

    // Escapes XML character data (e.g. &, <, >) before posting it to Solr
    public static String escapeXml(String s) throws IOException {
        StringWriter sw = new StringWriter();   // java.io.StringWriter
        XML.escapeCharData(s, sw);
        return sw.getBuffer().toString();
    }

On Wed, Mar 14, 2012

Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
No, it has nothing to do with schema.xml. post.jar just posts a file; it doesn't parse it. Solr uses an XML parser to parse the file, and if you don't escape special characters it is not a valid XML file, so Solr throws exceptions. On Thu, Mar 15, 2012 at 12:33 AM, neosky neosk...@yahoo.com

Re: Solr out of memory exception

2012-03-14 Thread Li Li
How much memory is allocated to the JVM? On Thu, Mar 15, 2012 at 1:27 PM, Husain, Yavar yhus...@firstam.com wrote: Solr is giving an out of memory exception. Full indexing was completed fine. Later while searching, maybe when it tries to load the results in memory, it starts giving this exception.

Re: Solr out of memory exception

2012-03-15 Thread Li Li
and solr configuration memory it is working fine? -Original Message- From: Li Li [mailto:fancye...@gmail.com] Sent: Thursday, March 15, 2012 11:11 AM To: solr-user@lucene.apache.org Subject: Re: Solr out of memory exception how many memory are allocated to JVM? On Thu, Mar 15, 2012

Re: Solr out of memory exception

2012-03-15 Thread Li Li
wrote the JRockit book you refer to no doubt had other scenarios in mind... On Thu, Mar 15, 2012 at 3:02 PM, C.Yunqin 345804...@qq.com wrote: why should we enable pointer compression? -- Original -- From: Li Li fancye...@gmail.com; Date: Thu, Mar 15, 2012 02:41 PM

Re: How to avoid the unexpected character error?

2012-03-16 Thread Li Li
That's not the right place. When you use java -Durl=http://... -jar post.jar data.xml, the data.xml file must be a valid XML file; you should escape special chars in this file. I don't know how you generate this file. If you use a Java program (or other scripts) to generate it, you should use an XML

Re: Trouble Setting Up Development Environment

2012-03-23 Thread Li Li
Here is my method. 1. Check out the latest source code from trunk or download a tarball: svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk 2. Create a dynamic web project in Eclipse and close it. For example, I create a project named lucene-solr-trunk in my workspace.

Re: Trouble Setting Up Development Environment

2012-03-24 Thread Li Li
at 3:25 AM, Li Li fancye...@gmail.com wrote: here is my method. 1. check out latest source codes from trunk or download tar ball svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk 2. create a dynamic web project in eclipse and close it. for example, I create

Re: using solr to do a 'match'

2012-04-11 Thread Li Li
It's not possible now because Lucene doesn't support this. When doing a disjunction query, it only records how many terms match the document. I think this is a common requirement for many users. I suggest Lucene should split the scorer into a matcher and a scorer: the matcher just returns which doc is

Re: using solr to do a 'match'

2012-04-11 Thread Li Li
values. Wdyt? On Wed, Apr 11, 2012 at 10:08 AM, Li Li fancye...@gmail.com wrote: it's not possible now because lucene don't support this. when doing disjunction query, it only record how many terms match this document. I think this is a common requirement for many users. I suggest

Re: Solr Scoring

2012-04-13 Thread Li Li
Another way is to use payloads: http://wiki.apache.org/solr/Payloads. The advantage of payloads is that you only need one field and can keep the frq file smaller than using two fields. But the disadvantage is that payloads are stored in the prx file, so I am not sure which one is faster; maybe you can try them both. On

Re: How to read SOLR cache statistics?

2012-04-13 Thread Li Li
http://wiki.apache.org/solr/SolrCaching On Fri, Apr 13, 2012 at 2:30 PM, Kashif Khan uplink2...@gmail.com wrote: Does anyone explain what does the following parameters mean in SOLR cache statistics? *name*: queryResultCache *class*: org.apache.solr.search.LRUCache *version*: 1.0

question about NRT(soft commit) and Transaction Log in trunk

2012-04-28 Thread Li Li
hi I checked out the trunk and played with its new soft commit feature. It's cool. But I've got a few questions about it. From reading some introductory articles and the wiki, plus a hasty code reading, my understanding of its implementation is: for a normal commit (hard commit), we should flush all

Re: get latest 50 documents the fastest way

2012-05-01 Thread Li Li
You should reverse your sort. Maybe you can override the tf method of Similarity and return -1.0f * tf() (I don't know whether the default collector allows scores smaller than zero). Or you can hack this by adding a large number, or write your own collector; in its collect(int doc) method, you can

Re: Sorting result first which come first in sentance

2012-05-03 Thread Li Li
As for versions below 4.0, it's not possible because of Lucene's scoring model. Position information is stored, but only used to support phrase queries; it just tells us whether a document matched, but we can't boost a document with it. A similar problem is how to implement proximity boost. for 2 search

Re: Sorting result first which come first in sentance

2012-05-03 Thread Li Li
For this version, you may consider using payloads for position boosting; you can save boost values in payloads. I have used it via the Lucene API where anchor text should weigh more than normal text, but I haven't used it in Solr. Some URLs I found: http://wiki.apache.org/solr/Payloads

Re: SOLRJ: Is there a way to obtain a quick count of total results for a query

2012-05-04 Thread Li Li
Not scoring by relevance and sorting by document id may speed it up a little? I haven't done any test of this; maybe you can give it a try, because scoring consumes some CPU time and you just want to match and get the total count. On Wed, May 2, 2012 at 11:58 PM, vybe3142 vybe3...@gmail.com wrote: I
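
A minimal SolrJ sketch of the rows=0 trick for getting just the total count (the query string and URL are illustrative):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class CountOnlyExample {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("title:lucene");
            q.setRows(0);   // return no documents; numFound still reflects the total match count
            long total = solr.query(q).getResults().getNumFound();
            System.out.println("total hits: " + total);
        }
    }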

Re: Solr query with mandatory values

2012-05-09 Thread Li Li
A + before a term is correct; in Lucene a term includes field and value. Query ::= ( Clause )* Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" ) #_TERM_CHAR: ( _TERM_START_CHAR | _ESCAPED_CHAR | "-" | "+" ) #_ESCAPED_CHAR: "\\" ~[] In Lucene query syntax, you can't express a term value that includes a space.

Re: How can i search site name

2012-05-22 Thread Li Li
You should define your search first. If the site is www.google.com, how do you match it: full string matching or partial matching? E.g., should google match? If it does, you should write your own analyzer for this field. On Tue, May 22, 2012 at 2:03 PM, Shameema Umer shem...@gmail.com wrote:

Re: Installing Solr on Tomcat using Shell - Code wrong?

2012-05-22 Thread Li Li
You should find some clues in the Tomcat log. On 2012-5-22 at 7:49 PM, Spadez james_will...@hotmail.com wrote: Hi, This is the install process I used in my shell script to try and get Tomcat running with Solr (Debian server): I swear this used to work, but currently only Tomcat works. The Solr page

how to enable MMapDirectory in solr 1.4?

2011-08-08 Thread Li Li
hi all, I read the Apache Solr 3.1 release notes today and found that MMapDirectory is now the default implementation on 64-bit systems. I am currently using Solr 1.4 with a 64-bit JVM on Linux. How can I use MMapDirectory? Will it improve performance?
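
Solr 1.4 normally lets Lucene's FSDirectory.open() pick the directory implementation, so switching to MMapDirectory is not a simple config flag there; in plain Lucene code, though, the directory can be chosen explicitly. A minimal Lucene sketch (the path is illustrative; MMapDirectory exists in the Lucene 2.9 line that Solr 1.4 ships with):

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.MMapDirectory;

    public class MMapExample {
        public static void main(String[] args) throws Exception {
            // Memory-maps the index files instead of reading them through regular file I/O
            MMapDirectory dir = new MMapDirectory(new File("/path/to/index"));
            IndexReader reader = IndexReader.open(dir, true); // read-only reader
            System.out.println("maxDoc=" + reader.maxDoc());
            reader.close();
            dir.close();
        }
    }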

Re: how to enable MMapDirectory in solr 1.4?

2011-08-08 Thread Li Li
NIOFSDir.  I'm pretty sure in Trunk/4.0 it's the default for Windows and maybe Solaris.  In Windows, there is a definite advantage for using MMapDirectory on a 64-bit system. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Li Li

can't use distributed spell check

2011-08-19 Thread Li Li
hi all, I tested it following the instructions in http://wiki.apache.org/solr/SpellCheckComponent, but something seems wrong. The sample url in the wiki is

solr distributed search don't work

2011-08-19 Thread Li Li
hi all, I followed the wiki http://wiki.apache.org/solr/SpellCheckComponent but there is something wrong. The url given by the wiki is

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
This may need something like language models to make suggestions. I found an issue, https://issues.apache.org/jira/browse/SOLR-2585; what's going on with it? On Thu, Aug 18, 2011 at 11:31 PM, Valentin igorlacro...@gmail.com wrote: I'm trying to configure a spellchecker to autocomplete full sentences

Re: solr distributed search don't work

2011-08-19 Thread Li Li
directly, not in url, but should work the same. Maybe an issue in your spell request handler. 2011/8/19 Li Li fancye...@gmail.com hi all,     I follow the wiki http://wiki.apache.org/solr/SpellCheckComponent but there is something wrong.     the url given my the wiki is http://solr:8983/solr

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
I haven't used suggest yet. But in spellcheck, if you don't provide spellcheck.q, it will analyze the q parameter with a converter that tokenizes your query; otherwise it will use the field's analyzer to process the q parameter. If you don't want the query tokenized, you should pass spellcheck.q
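
A small SolrJ illustration of passing the whole sentence through spellcheck.q so it is processed by the spellcheck field's analyzer rather than the query converter (the /spell handler name and the field setup are assumptions, not from the thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class SpellcheckQExample {
        public static void main(String[] args) throws Exception {
            CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("hwo to configure a spellchecker");
            q.set("qt", "/spell");                                    // assumed spellcheck-enabled handler
            q.set("spellcheck", "true");
            q.set("spellcheck.q", "hwo to configure a spellchecker"); // analyzed by the field's own analyzer
            System.out.println(solr.query(q).getSpellCheckResponse());
        }
    }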

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
NullPointerException? Do you have the full exception stack trace? On Fri, Aug 19, 2011 at 6:49 PM, Valentin igorlacro...@gmail.com wrote: Li Li wrote: If you don't want the query tokenized, you should pass spellcheck.q and provide your own analyzer such as the keyword analyzer. That's already

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
Line 476 of SpellCheckComponent.getTokens in my copy is 'assert analyzer != null;'. It seems our code versions don't match. Could you decompile your SpellCheckComponent.class? On Fri, Aug 19, 2011 at 7:23 PM, Valentin igorlacro...@gmail.com wrote: My beautiful NullPointerException: SEVERE:

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
Or is your analyzer null? Any other exception or warning in your log file? On Fri, Aug 19, 2011 at 7:37 PM, Li Li fancye...@gmail.com wrote: Line 476 of SpellCheckComponent.getTokens of mine is assert analyzer != null; it seems our codes' versions don't match. could you decompile your

what's the status of droids project(http://incubator.apache.org/droids/)?

2011-08-23 Thread Li Li
hi all I am interested in vertical crawlers. But it seems this project is not very active; its last update was on 11/16/2009

What will happen when one thread is closing a searcher while another is searching?

2011-09-05 Thread Li Li
hi all, I am using spellcheck in Solr 1.4. I found that spellcheck is not implemented the way SolrCore is: SolrCore uses reference counting to track the current searcher, so oldSearcher and newSearcher will both exist if oldSearcher is still servicing some query. But in FileBasedSpellChecker public void
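
For reference, the SolrCore reference-counting mentioned above is what custom code uses when it borrows the current searcher. A minimal sketch of the usual get/decref idiom (intended as an illustration inside a Solr plugin that already has a SolrCore, not as standalone code):

    import org.apache.solr.core.SolrCore;
    import org.apache.solr.search.SolrIndexSearcher;
    import org.apache.solr.util.RefCounted;

    public class SearcherRefExample {
        /** Borrow the current searcher safely: the ref count keeps it open while we use it. */
        public static int maxDoc(SolrCore core) {
            RefCounted<SolrIndexSearcher> ref = core.getSearcher();
            try {
                return ref.get().maxDoc();
            } finally {
                ref.decref(); // lets SolrCore close the searcher once all borrowers are done
            }
        }
    }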

Re: Multi CPU Cores

2011-10-16 Thread Li Li
For indexing, you can make use of multiple CPU cores easily by calling IndexWriter.addDocument from multiple threads. As far as I know, for searching, if there is only one request, you can't make good use of the CPUs. On Sat, Oct 15, 2011 at 9:37 PM, Rob Brown r...@intelcompute.com wrote: Hi, I'm running Solr
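
A minimal plain-Lucene sketch of multi-threaded indexing: IndexWriter is thread-safe, so several threads can call addDocument on the same writer. Paths, field names and thread counts are illustrative, and the IndexWriterConfig API shown is the Lucene 3.1+ form:

    import java.io.File;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class ParallelIndexing {
        public static void main(String[] args) throws Exception {
            FSDirectory dir = FSDirectory.open(new File("/tmp/idx"));
            IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_33,
                    new StandardAnalyzer(Version.LUCENE_33));
            final IndexWriter writer = new IndexWriter(dir, cfg);
            ExecutorService pool = Executors.newFixedThreadPool(4); // roughly one task per CPU core
            for (int t = 0; t < 4; t++) {
                final int shard = t;
                pool.submit(new Runnable() {
                    public void run() {
                        try {
                            for (int i = 0; i < 10000; i++) {
                                Document doc = new Document();
                                doc.add(new Field("id", shard + "-" + i,
                                        Field.Store.YES, Field.Index.NOT_ANALYZED));
                                writer.addDocument(doc); // IndexWriter handles its own locking
                            }
                        } catch (Exception e) { throw new RuntimeException(e); }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            writer.close();
        }
    }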

Re: Want to support did you mean xxx but is Chinese

2011-10-21 Thread Li Li
We have implemented one supporting 'did you mean' and prefix suggestions for Chinese. But we based our work on Solr 1.4 and made many modifications, so it will take time to integrate it into current Solr/Lucene. Here is our solution; glad to see any advice. 1. offline words and

Re: Can't find resource 'solrconfig.xml'

2011-10-31 Thread Li Li
Modify catalina.sh (or .bat), adding the Java startup parameter -Dsolr.solr.home=/your/path. On Mon, Oct 31, 2011 at 8:30 PM, 刘浪 liu.l...@eisoo.com wrote: Hi, After I start Tomcat, I open http://localhost:8080/solr/admin. It displays, but in the Tomcat log I find an exception like Can't find

Re: RE: Can't find resource 'solrconfig.xml'

2011-10-31 Thread Li Li
set JAVA_OPTS=%JAVA_OPTS% -Dsolr.solr.home=c:\xxx On Mon, Oct 31, 2011 at 9:14 PM, 刘浪 liu.l...@eisoo.com wrote: Hi Li Li, I don't know where I should add it in catalina.bat. I know how to do it on Linux, but my OS is Windows. Thank you very much. Sincerely, Amos

question about SolrCore

2010-10-11 Thread Li Li
hi all, I want to know the details of IndexReader handling in SolrCore. I read a little of SolrCore's code. Here is my understanding; is it correct? Each SolrCore has many SolrIndexSearchers and keeps them in _searchers, and _searcher keeps track of the latest version of the index. Each

Re: How to manage different indexes for different users

2010-10-11 Thread Li Li
Will one user search another user's index? If not, you can use multiple cores. 2010/10/11 Tharindu Mathew mcclou...@gmail.com: Hi everyone, I'm using solr to integrate search into my web app. I have a bunch of users who would have to be given their own individual indexes. I'm wondering whether

Re: question about SolrCore

2010-10-28 Thread Li Li
Is there anyone who could help me? 2010/10/11 Li Li fancye...@gmail.com: hi all,    I want to know the detail of IndexReader in SolrCore. I read a little codes of SolrCore. Here is my understanding, are they correct?    Each SolrCore has many SolrIndexSearcher and keeps them in _searchers

Re: Does Solr support Natural Language Search

2010-11-04 Thread Li Li
I don't think current Lucene offers what you want. There are 2 main tasks in a search process. One is understanding users' intentions. Because natural language understanding is difficult, current information retrieval systems force users to input some terms to express their needs.

strange problem

2010-11-15 Thread Li Li
hi all I ran into a strange problem when feeding data to Solr. I started feeding and then pressed Ctrl+C to kill the feed program (post.jar). Because the XML stream was terminated abnormally, DirectUpdateHandler2 throws an exception. I went to the index directory and sorted it by date; the newest files are

Re: shutdown.sh does not kill the tomcat process running solr./?

2010-11-30 Thread Li Li
1. Make sure the port in <Server port="8005" shutdown="SHUTDOWN"> is not already in use. 2. After ./bin/shutdown.sh, tail -f logs/xxx to see what the server is doing. If you just fed data or modified the index and didn't flush/commit, it will do some work while shutting down. 2010/12/1 Robert Petersen rober...@buy.com:

Re: Best practice for Delta every 2 Minutes.

2010-11-30 Thread Li Li
You may implement your own MergePolicy to keep one large index and merge all the other small ones, or simply set the merge factor to 2 and keep the largest index from being merged by setting maxMergeDocs to less than the number of docs in the largest one. So there is one large index and a small one. When adding a few docs, they

Re: Best practice for Delta every 2 Minutes.

2010-12-16 Thread Li Li
I think it will not, because the default configuration can only have 2 newSearcher threads, but the delay will get longer and longer. The newer newSearcher will wait for the 2 earlier ones to finish. 2010/12/1 Jonathan Rochkind rochk...@jhu.edu: If your index warmings take longer than two minutes, but

Re: Best practice for Delta every 2 Minutes.

2010-12-16 Thread Li Li
Write the documents into a log file, and after flushing, delete the corresponding lines in the log file. If the program crashes, we redo the log and add them into a RAMDirectory. Has anyone done similar work? 2010/12/1 Li Li fancye...@gmail.com: you may implement your own MergePolicy to keep

Re: Optimizing to only 1 segment

2010-12-26 Thread Li Li
See maxMergeDocs (maxMergeSize) in solrconfig.xml. If the segment's document count is larger than this value, it will not be merged. 2010/12/27 Rok Rejc rokrej...@gmail.com: Hi all, I have created an index, commited the data and after that I had run the optimize with default parameters:

Re: Optimizing to only 1 segment

2010-12-26 Thread Li Li
Maybe you can consult the log files; they may show you something. BTW, how do you post your command? Do you use curl 'http://localhost:8983/solr/update?optimize=true', or post an XML file? 2010/12/27 Rok Rejc rokrej...@gmail.com: On Mon, Dec 27, 2010 at 3:26 AM, Li Li fancye...@gmail.com wrote
