Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-15 Thread Li Li
On 2012-7-2 at 6:37 PM, Nicholas Ball nicholas.b...@nodelay.com wrote: That could work, but then how do you ensure commit is called on the two cores at the exact same time? That may need something like two-phase commit in a relational DB. Lucene has prepareCommit, but to implement 2PC, many things need
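As a rough illustration of the two-phase commit idea mentioned above, here is a minimal plain-Java sketch. The Participant interface, RecordingParticipant, and commitAll are hypothetical names invented for this example; only the method names prepare/commit/rollback are chosen to echo IndexWriter's prepareCommit, commit, and rollback. It assumes a single coordinator that never fails, which is exactly the part a real deployment cannot assume.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical participant; in a Solr/Lucene setting each one would wrap one core's writer. */
interface Participant {
    void prepare() throws Exception; // phase 1: make changes durable but not yet visible
    void commit();                   // phase 2: make the prepared changes visible
    void rollback();                 // discard prepared or pending changes
}

/** Demo participant that records every call so the protocol order can be inspected. */
final class RecordingParticipant implements Participant {
    private final String name;
    private final boolean failOnPrepare;
    private final List<String> log;

    RecordingParticipant(String name, boolean failOnPrepare, List<String> log) {
        this.name = name;
        this.failOnPrepare = failOnPrepare;
        this.log = log;
    }

    @Override public void prepare() throws Exception {
        if (failOnPrepare) {
            log.add("prepare " + name + " failed");
            throw new Exception("prepare failed on " + name);
        }
        log.add("prepare " + name);
    }

    @Override public void commit() { log.add("commit " + name); }

    @Override public void rollback() { log.add("rollback " + name); }
}

final class TwoPhaseCommitSketch {
    /** Returns true if every participant committed, false if all were rolled back. */
    static boolean commitAll(List<? extends Participant> participants) {
        try {
            // Phase 1: everyone must prepare successfully before anyone commits.
            for (Participant p : participants) {
                p.prepare();
            }
        } catch (Exception e) {
            // Any failure in phase 1 aborts the whole transaction.
            for (Participant p : participants) {
                p.rollback();
            }
            return false;
        }
        // Phase 2: a failure here is NOT handled by this sketch.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = commitAll(Arrays.asList(
                new RecordingParticipant("core1", false, log),
                new RecordingParticipant("core2", false, log)));
        System.out.println(ok + " " + log);
    }
}
```

The sketch leaves out the hard parts raised in this thread: a coordinator crash between the two phases, or a participant failing during phase 2, needs persistent state and some form of recovery or leader election.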

Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-15 Thread Li Li
Do you really need this? Distributed transactions are a difficult problem. In 2PC, every node could fail, including the coordinator. Something like leader election is needed to make sure it works. You might try ZooKeeper. But if the transaction is not as critical as, say, transferring money in a bank, you

Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-15 Thread Li Li
http://zookeeper.apache.org/doc/r3.3.6/recipes.html#sc_recipes_twoPhasedCommit On Thu, Aug 16, 2012 at 7:41 AM, Nicholas Ball nicholas.b...@nodelay.com wrote: Haven't managed to find a good way to do this yet. Does anyone have any ideas on how I could implement this feature? Really need to

Re: how to boost exact match

2012-08-10 Thread Li Li
Create a field for exact match and add it as an optional boolean clause. On 2012-8-11 at 1:42 PM, abhayd ajdabhol...@hotmail.com wrote: hi I have documents like iphone 4 - white iphone 4s - black ipone4 - black when user searches for iphone 4 i would like to show iphone 4 docs first and iphone 4s after

Re: Solr seems to hang

2012-06-28 Thread Li Li
Could you please use jstack to dump the call stacks? On Thu, Jun 28, 2012 at 2:53 PM, Arkadi Colson ark...@smartbit.be wrote: It has now been hanging for 15 hours and nothing changes in the index directory. Tips for further debugging? On 06/27/2012 03:50 PM, Arkadi Colson wrote: I'm sending files

Re: what is precisionStep and positionIncrementGap

2012-06-28 Thread Li Li
, 2012 at 3:51 PM, ZHANG Liang F liang.f.zh...@alcatel-sbell.com.cn wrote: Thanks a lot, but the precisionStep is still very vague to me! Could you give me an example? -Original Message- From: Li Li [mailto:fancye...@gmail.com] Sent: June 28, 2012 11:25 To: solr-user@lucene.apache.org

Re: Solr seems to hang

2012-06-27 Thread Li Li
It seems that the IndexWriter wants to flush but needs to wait for the other threads to become idle. But I see that the n-gram filter is working. Is your field's value too long? You should also tell us the average load of the system, the free memory, and the memory used by the JVM. On 2012-6-27 at 7:51 PM, Arkadi Colson ark...@smartbit.be wrote:

Re: what is precisionStep and positionIncrementGap

2012-06-27 Thread Li Li
1. precisionStep is used for range queries on numeric fields. See http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/search/NumericRangeQuery.html 2. positionIncrementGap is used for phrase queries on multi-valued fields, e.g. doc1 has two titles. title1: ab
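To make the positionIncrementGap point concrete, here is a small plain-Java simulation (no Lucene classes involved). The class and method names are made up for illustration, and the two title values are hypothetical, not the ones from the truncated message. It assigns token positions the way a multi-valued field is indexed, adding the gap at each value boundary, and then checks whether a two-word phrase would find its words at adjacent positions.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionGapSketch {
    /**
     * Returns true if `first` is immediately followed by `second` when the values
     * of a multi-valued field are indexed with the given position increment gap.
     */
    static boolean phraseMatches(String[] values, int gap, String first, String second) {
        List<String> terms = new ArrayList<>();
        List<Integer> positions = new ArrayList<>();
        int pos = -1;
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                pos += gap; // extra distance inserted between two values of the same field
            }
            for (String token : values[i].trim().split("\\s+")) {
                pos++; // every token advances the position by one
                terms.add(token);
                positions.add(pos);
            }
        }
        for (int i = 0; i < terms.size(); i++) {
            if (!terms.get(i).equals(first)) {
                continue;
            }
            for (int j = 0; j < terms.size(); j++) {
                if (terms.get(j).equals(second) && positions.get(j) == positions.get(i) + 1) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] titles = {"a b", "c d"}; // hypothetical values of one multi-valued field
        System.out.println(phraseMatches(titles, 0, "b", "c"));
        System.out.println(phraseMatches(titles, 100, "b", "c"));
    }
}
```

With a gap of 0, the last word of the first title and the first word of the second title end up adjacent, so a phrase query can match across the boundary; a large gap such as 100 keeps the values apart.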

Re: Query Logic Question

2012-06-27 Thread Li Li
I think they are logically the same. but 1 may be a little bit faster than 2 On Thu, Jun 28, 2012 at 5:59 AM, Rublex ruble...@hotmail.com wrote: Hi, Can someone explain to me please why these two queries return different results: 1. -PaymentType:Finance AND -PaymentType:Lease AND

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
RAMDirectory. This sounds wrong, but it is true. With RAMDirectory, Java has to work harder doing garbage collection. On Fri, Jun 8, 2012 at 1:30 AM, Li Li fancye...@gmail.com wrote: hi all, I want to use Lucene 3.6 to provide a search service. My data is not very large; the raw data is less than 1GB and I

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
. Every night I will merge them. Newly added documents will be flushed into a new segment, and I will merge the newly generated segment and the small one. Our update operations are not very frequent. On Mon, Jun 11, 2012 at 4:59 PM, Paul Libbrecht p...@hoplahup.net wrote: Li Li, have you considered

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
-Kuli On 11.06.2012 at 10:38, Li Li wrote: I have roughly read the code of RAMDirectory. It uses a list of 1024-byte arrays and has a lot of overhead. But as far as I know, using MMapDirectory, I can't prevent page faults. The OS will swap less frequently used pages out. Even if I allocate enough memory

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
to live with disk IO anyway. Greetings, Kuli On 11.06.2012 at 11:20, Li Li wrote: I am sorry, I made a mistake. Even with RAMDirectory, I cannot guarantee that the pages are not swapped out. On Mon, Jun 11, 2012 at 4:45 PM, Michael Kuhlmann k...@solarier.de wrote: Set the swappiness to 0 to avoid

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
. http://en.wikipedia.org/wiki/Swappiness -Kuli On 11.06.2012 at 10:38, Li Li wrote: I have roughly read the code of RAMDirectory. It uses a list of 1024-byte arrays and has a lot of overhead. But as far as I know, using MMapDirectory, I can't prevent page faults. The OS will swap less frequent

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
potentially useful approach http://lucene.472066.n3.nabble.com/High-response-time-after-being-idle-tp3616599p3617604.html. On Mon, Jun 11, 2012 at 3:02 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote: On Mon, 2012-06-11 at 11:38 +0200, Li Li wrote: yes, I need average query time less than 10 ms

Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Li Li
Yes, I am also interested in good performance with 2 billion docs. How many search nodes do you use? What are the average response time and QPS? Another question: where can I find a paper or other resources that explain your algorithm in detail? why it's better than google

Re: How can i search site name

2012-05-22 Thread Li Li
You should define your search first. If the site is www.google.com, how do you want to match it: full string matching or partial matching? E.g. should google match? If it should, you should write your own analyzer for this field. On Tue, May 22, 2012 at 2:03 PM, Shameema Umer shem...@gmail.com wrote:

Re: Installing Solr on Tomcat using Shell - Code wrong?

2012-05-22 Thread Li Li
You should find some clues in the Tomcat log. On 2012-5-22 at 7:49 PM, Spadez james_will...@hotmail.com wrote: Hi, This is the install process I used in my shell script to try and get Tomcat running with Solr (debian server): I swear this used to work, but currently only Tomcat works. The Solr page

Re: Solr query with mandatory values

2012-05-09 Thread Li Li
Putting + before the term is correct. In Lucene, a term includes the field and the value. Query ::= ( Clause )* Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" ) <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> | "-" | "+" ) > <#_ESCAPED_CHAR: "\\" ~[] > In Lucene query syntax, you can't express a term value containing a space.

Re: SOLRJ: Is there a way to obtain a quick count of total results for a query

2012-05-04 Thread Li Li
Not scoring by relevance and sorting by document id instead may speed it up a little. I haven't tested this, but maybe you can give it a try, because scoring consumes some CPU time and you just want to match and get the total count. On Wed, May 2, 2012 at 11:58 PM, vybe3142 vybe3...@gmail.com wrote: I

Re: Sorting result first which come first in sentance

2012-05-03 Thread Li Li
For versions below 4.0, it's not possible because of Lucene's scoring model. Position information is stored, but it is only used to support phrase queries. It just tells us whether a document matches, but we can boost a document. A similar problem is how to implement a proximity boost. for 2 search

Re: Sorting result first which come first in sentance

2012-05-03 Thread Li Li
For this version, you may consider using payloads for position boost. You can save boost values in the payload. I have used this with the Lucene API in a case where anchor text should weigh more than normal text, but I haven't used it in Solr. Some URLs I found: http://wiki.apache.org/solr/Payloads

Re: get latest 50 documents the fastest way

2012-05-01 Thread Li Li
You should reverse your sort order. Maybe you can override the tf method of Similarity and return -1.0f * tf(). (I don't know whether the default collector allows scores smaller than zero.) Or you can hack this by adding a large number, or write your own collector; in its collect(int doc) method, you can
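The custom-collector idea above could look roughly like the following plain-Java sketch. It is not a Lucene Collector subclass; LatestDocsCollectorSketch and newestFirst are hypothetical names, and only the collect(int doc) method name is taken from the message. The sketch also assumes that a higher document id means a more recently added document, which is an assumption made here for illustration and does not hold in every index layout.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Keeps only the `limit` largest document ids seen so far, using a min-heap. */
final class LatestDocsCollectorSketch {
    private final int limit;
    private final PriorityQueue<Integer> heap = new PriorityQueue<>(); // smallest id on top

    LatestDocsCollectorSketch(int limit) {
        this.limit = limit;
    }

    /** Called once per matching document; no score is computed at all. */
    void collect(int doc) {
        if (limit <= 0) {
            return;
        }
        if (heap.size() < limit) {
            heap.add(doc);
        } else if (doc > heap.peek()) {
            heap.poll(); // drop the oldest of the kept documents
            heap.add(doc);
        }
    }

    /** Returns the kept document ids, highest (assumed newest) first. */
    List<Integer> newestFirst() {
        List<Integer> out = new ArrayList<>(heap);
        out.sort(Collections.reverseOrder());
        return out;
    }

    public static void main(String[] args) {
        LatestDocsCollectorSketch collector = new LatestDocsCollectorSketch(3);
        for (int doc = 0; doc < 10; doc++) {
            collector.collect(doc);
        }
        System.out.println(collector.newestFirst());
    }
}
```

Because nothing is scored, the per-document cost is a heap comparison, which is the kind of saving the message hints at.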

question about NRT(soft commit) and Transaction Log in trunk

2012-04-28 Thread Li Li
hi, I checked out the trunk and played with its new soft commit feature. It's cool. But I've got a few questions about it. From reading some introductory articles and the wiki, and some hasty code reading, my understanding of its implementation is: For normal commit (hard commit), we should flush all

Re: Solr Scoring

2012-04-13 Thread Li Li
Another way is to use payloads: http://wiki.apache.org/solr/Payloads The advantage of payloads is that you only need one field and can make the frq file smaller than with two fields. The disadvantage is that payloads are stored in the prx file, so I am not sure which one is faster. Maybe you can try them both. On

Re: How to read SOLR cache statistics?

2012-04-13 Thread Li Li
http://wiki.apache.org/solr/SolrCaching On Fri, Apr 13, 2012 at 2:30 PM, Kashif Khan uplink2...@gmail.com wrote: Can anyone explain what the following parameters mean in SOLR cache statistics? *name*: queryResultCache *class*: org.apache.solr.search.LRUCache *version*: 1.0

Re: using solr to do a 'match'

2012-04-11 Thread Li Li
It's not possible now because Lucene doesn't support this. When doing a disjunction query, it only records how many terms match the document. I think this is a common requirement for many users. I suggest Lucene should split the scorer into a matcher and a scorer. The matcher just returns which doc is

Re: using solr to do a 'match'

2012-04-11 Thread Li Li
values. Wdyt? On Wed, Apr 11, 2012 at 10:08 AM, Li Li fancye...@gmail.com wrote: It's not possible now because Lucene doesn't support this. When doing a disjunction query, it only records how many terms match the document. I think this is a common requirement for many users. I suggest

Re: Trouble Setting Up Development Environment

2012-03-24 Thread Li Li
at 3:25 AM, Li Li fancye...@gmail.com wrote: Here is my method. 1. Check out the latest source code from trunk or download a tarball: svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk 2. Create a dynamic web project in Eclipse and close it. For example, I create

Re: Trouble Setting Up Development Environment

2012-03-23 Thread Li Li
Here is my method. 1. Check out the latest source code from trunk or download a tarball: svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk 2. Create a dynamic web project in Eclipse and close it. For example, I create a project named lucene-solr-trunk in my workspace.

Re: How to avoid the unexpected character error?

2012-03-16 Thread Li Li
It's not the right place. When you use java -Durl=http://... -jar post.jar data.xml, the data.xml file must be a valid XML file. You should escape special chars in this file. I don't know how you generate this file. If you use a Java program (or other scripts) to generate this file, you should use xml

Re: Solr out of memory exception

2012-03-15 Thread Li Li
and solr configuration memory it is working fine? -Original Message- From: Li Li [mailto:fancye...@gmail.com] Sent: Thursday, March 15, 2012 11:11 AM To: solr-user@lucene.apache.org Subject: Re: Solr out of memory exception How much memory is allocated to the JVM? On Thu, Mar 15, 2012

Re: Solr out of memory exception

2012-03-15 Thread Li Li
perfectly fine. I was thinking of increasing the system RAM tomcat heap space allocated but then how come on a different server with exactly same system and solr configuration memory it is working fine? -Original Message- From: Li Li [mailto:fancye...@gmail.com] Sent

Re: Sorting on non-stored field

2012-03-14 Thread Li Li
It should be indexed but not analyzed. It doesn't need to be stored. Reading field values from stored fields is extremely slow, so Lucene uses StringIndex to read fields for sorting. So if you want to sort by some field, you should index this field and not analyze it. On Wed, Mar 14, 2012 at 6:43 PM,

Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
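There is a class org.apache.solr.common.util.XML in Solr. You can use this wrapper:

    // needs java.io.IOException, java.io.StringWriter and org.apache.solr.common.util.XML
    public static String escapeXml(String s) throws IOException {
        StringWriter sw = new StringWriter();
        XML.escapeCharData(s, sw); // writes s to sw with XML special characters escaped
        return sw.getBuffer().toString();
    }

On Wed, Mar 14, 2012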

Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
No, it has nothing to do with schema.xml. post.jar just posts the file; it doesn't parse it. Solr will use an XML parser to parse this file. If you don't escape special characters, it's not a valid XML file and Solr will throw exceptions. On Thu, Mar 15, 2012 at 12:33 AM, neosky neosk...@yahoo.com
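The effect described here can be reproduced with the JDK alone. The sketch below is an illustration, not Solr's own code: XmlEscapeSketch, escapeCharData, and isWellFormed are names made up for this example, and the escaping covers only the three characters &, < and > (it does not deal with control characters that are illegal in XML). It shows that the same field content is rejected by an XML parser when unescaped and accepted once escaped.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

final class XmlEscapeSketch {
    /** Escapes the characters that break XML character data. */
    static String escapeCharData(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns true if the JDK's XML parser accepts the document. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false; // parse errors (and any other failure) count as "not well-formed"
        }
    }

    public static void main(String[] args) {
        String value = "a & b < c"; // hypothetical field content
        System.out.println(isWellFormed("<field>" + value + "</field>"));
        System.out.println(isWellFormed("<field>" + escapeCharData(value) + "</field>"));
    }
}
```

The JDK parser may print a "[Fatal Error]" line to stderr for the unescaped document; that is the parser's default error handler, not a failure of the sketch.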

Re: Solr out of memory exception

2012-03-14 Thread Li Li
How much memory is allocated to the JVM? On Thu, Mar 15, 2012 at 1:27 PM, Husain, Yavar yhus...@firstam.com wrote: Solr is giving out of memory exception. Full Indexing was completed fine. Later while searching maybe when it tries to load the results in memory it starts giving this exception.

Re: index size with replication

2012-03-13 Thread Li Li
optimize will generate new segments and delete old ones. if your master also provides searching service during indexing, the old files may be opened by old SolrIndexSearcher. they will be deleted later. So when indexing, the index size may double. But a moment later, old indexes will be deleted.

Re: How to limit the number of open searchers?

2012-03-06 Thread Li Li
What do you mean by programmatically? Modifying Solr's code? Because Solr is not like Lucene: it only provides HTTP interfaces to its users rather than a Java API. If you want to modify Solr, you can find the code in SolrCore: private final LinkedList<RefCounted<SolrIndexSearcher>> _searchers = new

Re: Fw:how to make fdx file

2012-03-04 Thread Li Li
Lucene never modifies old segment files; it just flushes into a new segment or merges old segments into a new one. After merging, the old segments are deleted. Once a file (such as fdt and fdx) is generated, it will never be regenerated. The only possibility is that in the generating stage, there is

Re: Sort by the number of matching terms (coord value)

2012-02-16 Thread Li Li
You can fool the Lucene scoring function. Override each function such as idf, queryNorm and lengthNorm and let them simply return 1.0f. I don't know whether Lucene 4 will expose more details, but for 2.x/3.x, Lucene can only score by the vector space model and the formula can't be replaced by users. On Fri, Feb 17,

Re: Can I rebuild an index and remove some fields?

2012-02-15 Thread Li Li
. Implementation uses separate thread for each segment, so it re-writes them in parallel. Took about 15 minutes to do 770,000 doc index on my macbook. On Tue, Feb 14, 2012 at 10:12 PM, Li Li fancye...@gmail.com wrote: I have roughly read the codes of 4.0 trunk. maybe it's feasible

Re: Can I rebuild an index and remove some fields?

2012-02-14 Thread Li Li
wrapper=new FilterIndexReader(reader); SegmentMerger merger=new SegmentMerger(writer); merger.add(wrapper); merger.Merge(); On Feb 14, 2012, at 1:49 AM, Li Li wrote: for method 2, delete is wrong. we can't delete terms. you also should hack with the tii and tis file. On Tue, Feb

Re: New segment file created too often

2012-02-13 Thread Li Li
Commit is called after adding each document. You should add enough documents and then call commit. Commit is a costly operation. If you want to see the most recently fed documents, you could use NRT. On Tue, Feb 14, 2012 at 12:47 AM, Huy Le hu...@springpartners.com wrote: Hi, I am using solr

Re: New segment file created too often

2012-02-13 Thread Li Li
version of solr. Huy On Mon, Feb 13, 2012 at 11:55 AM, Li Li fancye...@gmail.com wrote: Commit is called after adding each document. You should add enough documents and then call commit. Commit is a costly operation. If you want to see the most recently fed documents, you could use NRT

Re: New segment file created too often

2012-02-13 Thread Li Li
. Are the commit calls triggering new segment files being created? I don't see this behavior in another environment of the same version of solr. Huy On Mon, Feb 13, 2012 at 11:55 AM, Li Li fancye...@gmail.com wrote: Commit is called after adding each document. You should add enough

Re: Can I rebuild an index and remove some fields?

2012-02-13 Thread Li Li
method1, dumping data for stored fields, you can traverse the whole index and save it to somewhere else. for indexed but not stored fields, it may be more difficult. if the indexed and not stored field is not analyzed(fields such as id), it's easy to get from FieldCache.StringIndex. But

Re: Can I rebuild an index and remove some fields?

2012-02-13 Thread Li Li
for method 2, delete is wrong. we can't delete terms. you also should hack with the tii and tis file. On Tue, Feb 14, 2012 at 2:46 PM, Li Li fancye...@gmail.com wrote: method1, dumping data for stored fields, you can traverse the whole index and save it to somewhere else. for indexed

more sql-like commands for solr

2012-02-07 Thread Li Li
hi all, we have used Solr to provide search services in many products. I found that for each product, we have to write some configuration and query expressions. Our users are not used to this. They are familiar with SQL and they may describe it like this: I want a query that can search books whose

Re: Chinese Phonetic search

2012-02-07 Thread Li Li
You can convert Chinese words to pinyin and use n-grams to search for phonetically similar words. On Wed, Feb 8, 2012 at 11:10 AM, Floyd Wu floyd...@gmail.com wrote: Hi there, Has anyone here ever implemented phonetic search, especially with Chinese (traditional/simplified), using SOLR or Lucene?

Re: Can't find resource 'solrconfig.xml'

2011-10-31 Thread Li Li
modify catalina.sh(bat) adding java startup params: -Dsolr.solr.home=/your/path On Mon, Oct 31, 2011 at 8:30 PM, 刘浪 liu.l...@eisoo.com wrote: Hi, After I start tomcat, I input http://localhost:8080/solr/admin. It can display. But in the tomcat, I find an exception like Can't find

Re: RE: Can't find resource 'solrconfig.xml'

2011-10-31 Thread Li Li
set JAVA_OPTS=%JAVA_OPTS% -Dsolr.solr.home=c:\xxx On Mon, Oct 31, 2011 at 9:14 PM, 刘浪 liu.l...@eisoo.com wrote: Hi Li Li, I don't know where I should add it in catalina.bat. I know how to do it on Linux, but my OS is Windows. Thank you very much. Sincerely, Amos

Re: Want to support did you mean xxx but is Chinese

2011-10-21 Thread Li Li
We have implemented one supporting "did you mean" and prefix suggestion for Chinese. But we based our work on Solr 1.4 and made many modifications, so it will take time to integrate it into the current Solr/Lucene. Here is our solution. Any advice is welcome. 1. offline words and

Re: Multi CPU Cores

2011-10-16 Thread Li Li
For indexing, you can easily make use of multiple cores by calling IndexWriter.addDocument from multiple threads. As far as I know, for searching, if there is only one request, you can't make good use of the CPUs. On Sat, Oct 15, 2011 at 9:37 PM, Rob Brown r...@intelcompute.com wrote: Hi, I'm running Solr

What will happen when one thread is closing a searcher while another is searching?

2011-09-05 Thread Li Li
hi all, I am using spellcheck in solr 1.4. I found that spell check is not implemented as SolrCore. in SolrCore, it uses reference count to track current searcher. oldSearcher and newSearcher will both exist if oldSearcher is servicing some query. But in FileBasedSpellChecker public void

what's the status of droids project(http://incubator.apache.org/droids/)?

2011-08-23 Thread Li Li
hi all, I am interested in vertical crawlers. But it seems this project is not very active. Its last update was on 11/16/2009

can't use distributed spell check

2011-08-19 Thread Li Li
hi all, I tested it following the instructions in http://wiki.apache.org/solr/SpellCheckComponent, but something seems to be wrong. The sample URL in the wiki is

solr distributed search don't work

2011-08-19 Thread Li Li
hi all, I followed the wiki http://wiki.apache.org/solr/SpellCheckComponent but there is something wrong. The URL given by the wiki is

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
this may need something like language models to suggest. I found an issue https://issues.apache.org/jira/browse/SOLR-2585 what's going on with it? On Thu, Aug 18, 2011 at 11:31 PM, Valentin igorlacro...@gmail.com wrote: I'm trying to configure a spellchecker to autocomplete full sentences

Re: solr distributed search don't work

2011-08-19 Thread Li Li
directly, not in url, but should work the same. Maybe an issue in your spell request handler. 2011/8/19 Li Li fancye...@gmail.com hi all, I followed the wiki http://wiki.apache.org/solr/SpellCheckComponent but there is something wrong. The URL given by the wiki is http://solr:8983/solr

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
I haven't used suggest yet. But in spell check, if you don't provide spellcheck.q, it will process the q parameter with a query converter, which tokenizes your query. Otherwise it will use the analyzer of the field to process spellcheck.q. If you don't want to tokenize the query, you should pass spellcheck.q

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
NullPointerException? Do you have the full stack trace of the exception? On Fri, Aug 19, 2011 at 6:49 PM, Valentin igorlacro...@gmail.com wrote: Li Li wrote: If you don't want to tokenize query, you should pass spellcheck.q and provide your own analyzer such as keyword analyzer. That's already

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
Line 476 of SpellCheckComponent.getTokens of mine is assert analyzer != null; it seems our codes' versions don't match. could you decompile your SpellCheckComponent.class ? On Fri, Aug 19, 2011 at 7:23 PM, Valentin igorlacro...@gmail.com wrote: My beautiful NullPointer Exception : SEVERE:

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
or your analyzer is null? any other exception or warning in your log file? On Fri, Aug 19, 2011 at 7:37 PM, Li Li fancye...@gmail.com wrote: Line 476 of  SpellCheckComponent.getTokens of mine  is  assert analyzer != null; it seems our codes' versions don't match. could you decompile your

how to enable MMapDirectory in solr 1.4?

2011-08-08 Thread Li Li
hi all, I read the Apache Solr 3.1 release notes today and found that MMapDirectory is now the default implementation on 64-bit systems. I am now using Solr 1.4 with a 64-bit JVM on Linux. How can I use MMapDirectory? Will it improve performance?

Re: how to enable MMapDirectory in solr 1.4?

2011-08-08 Thread Li Li
NIOFSDir.  I'm pretty sure in Trunk/4.0 it's the default for Windows and maybe Solaris.  In Windows, there is a definite advantage for using MMapDirectory on a 64-bit system. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Li Li

full text searching in cloud for minor enterprises

2011-07-04 Thread Li Li
hi all, I want to provide full-text search for some small websites. It seems cloud computing is popular now, and it would save costs because there is no need to employ an engineer to maintain the machines. For now, there are many services such as Amazon S3, Google App Engine, MS Azure etc. I am

Re: Unsupported encoding GB18030

2011-04-01 Thread Li Li
post.jar only supports UTF-8. You must convert the file first. 2011/4/1 Jan Høydahl jan@cominvent.com: Hi, Testing the new Solr 3.1 release under Windows XP and Java 1.6.0_23 When trying to post example\exampledocs\gb18030-example.xml using post.jar I get this error: % java -jar post.jar

Re: RamBufferSize and AutoCommit

2011-03-28 Thread Li Li
There are 3 conditions that will trigger an automatic flush in Lucene: 1. the size of the index in RAM is larger than the RAM buffer size; 2. the number of documents in memory is larger than the number set by setMaxBufferedDocs; 3. the number of deleted terms is larger than the value set by setMaxBufferedDeleteTerms. auto flushing by

Re: Snappull failed

2011-03-19 Thread Li Li
Has the master updated the index during replication? This could occur when the slave failed to download a file because of a network problem. 209715200!=583644834 means the size of the file the slave should fetch is 583644834 bytes but it only downloaded 209715200 bytes. Maybe the connection timed out. 2011/2/16 Markus Jelsma

Re: What request handlers to use for query strings in Chinese or Japanese?

2011-03-17 Thread Li Li
That's a job for your analyzer. 2011/3/17 Andy angelf...@yahoo.com: Hi, For my Solr server, some of the query strings will be in Asian languages such as Chinese or Japanese. For such query strings, would the Standard or Dismax request handler work? My understanding is that

Re: Helpful new JVM parameters

2011-03-17 Thread Li Li
Will UseCompressedOops be useful? For applications using less than 4GB of memory, it will be better than 64-bit references. But for applications using more memory, it will not be cache friendly. "JRockit: The Definitive Guide" says: Naturally, 64 GB isn't a theoretical limit but just an example. It was

I send a email to lucene-dev solr-dev lucene-user but always failed

2011-03-11 Thread Li Li
hi it seems my mail is judged as spam. Technical details of permanent failure: Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other email provider for further information about the cause of this error. The error that the other

Re: I send a email to lucene-dev solr-dev lucene-user but always failed

2011-03-11 Thread Li Li
to use some synchronization mechanism to allow only 1 or 2 ReplicationHandler threads are doing CMD_GET_FILE command. Is that solution feasible? 2011/3/11 Li Li fancye...@gmail.com hi it seems my mail is judged as spam. Technical details of permanent failure: Google tried to deliver

Re: Turn off caching

2011-02-11 Thread Li Li
- the fieldcache - that can't be commented out. This cache will always jump into the picture If I need to do such things, I restart the whole tomcat6 server to flush ALL caches. 2011/2/11 Li Li fancye...@gmail.com do you mean queryResultCache? you can comment related paragraph

Re: Turn off caching

2011-02-10 Thread Li Li
Do you mean queryResultCache? You can comment out the related section in solrconfig.xml, see http://wiki.apache.org/solr/SolrCaching 2011/2/8 Isan Fulia isan.fu...@germinait.com: Hi, My solrConfig file looks like <config> <updateHandler class="solr.DirectUpdateHandler2" /> <requestDispatcher

Re: Optimizing to only 1 segment

2010-12-27 Thread Li Li
or http://localhost:8080/myindex/update?stream.body=<optimize/> Thanks. On Mon, Dec 27, 2010 at 7:12 AM, Li Li fancye...@gmail.com wrote: maybe you can consult log files and it may show you something btw how do you post your command? do you use curl 'http://localhost:8983/solr/update?optimize

Re: Optimizing to only 1 segment

2010-12-26 Thread Li Li
See maxMergeDocs (maxMergeSize) in solrconfig.xml. If a segment's document count (or size) is larger than this value, it will not be merged. 2010/12/27 Rok Rejc rokrej...@gmail.com: Hi all, I have created an index, commited the data and after that I had run the optimize with default parameters:

Re: Optimizing to only 1 segment

2010-12-26 Thread Li Li
maybe you can consult log files and it may show you something btw how do you post your command? do you use curl 'http://localhost:8983/solr/update?optimize=true' ? or posting a xml file? 2010/12/27 Rok Rejc rokrej...@gmail.com: On Mon, Dec 27, 2010 at 3:26 AM, Li Li fancye...@gmail.com wrote

Re: Best practice for Delta every 2 Minutes.

2010-12-16 Thread Li Li
I think it will not, because the default configuration allows only 2 newSearcher threads, but the delay will grow longer and longer. A newer newSearcher will wait for the 2 earlier ones to finish. 2010/12/1 Jonathan Rochkind rochk...@jhu.edu: If your index warmings take longer than two minutes, but

Re: Best practice for Delta every 2 Minutes.

2010-12-16 Thread Li Li
write the document into a log file, and after flushing, we delete the corresponding lines in the log file. If the program crashes, we will replay the log and add the documents into the RAMDirectory. Has anyone done similar work? 2010/12/1 Li Li fancye...@gmail.com: you may implement your own MergePolicy to keep

Re: shutdown.sh does not kill the tomcat process running solr./?

2010-11-30 Thread Li Li
1. Make sure the port in <Server port="8005" shutdown="SHUTDOWN"> is not in use. 2. Run ./bin/shutdown.sh and then tail -f logs/xxx to see what the server is doing. If you have just fed data or modified the index and haven't flushed/committed, it will do some work when shutting down. 2010/12/1 Robert Petersen rober...@buy.com:

Re: Best practice for Delta every 2 Minutes.

2010-11-30 Thread Li Li
You may implement your own MergePolicy to keep one large index and merge all the other small ones, or simply set the merge factor to 2 and keep the largest index from being merged by setting maxMergeDocs to less than the number of docs in the largest one. So there is one large index and a small one. When adding a few docs, they

strange problem

2010-11-15 Thread Li Li
hi all, I ran into a strange problem when feeding data to Solr. I started feeding and then pressed Ctrl+C to kill the feed program (post.jar). Because the XML stream was terminated abnormally, DirectUpdateHandler2 threw an exception. Then I went to the index directory and sorted it by date. newest files are

Re: Does Solr support Natural Language Search

2010-11-04 Thread Li Li
I don't think the current Lucene offers what you want. There are 2 main tasks in a search process. One is understanding the user's intent. Because natural language understanding is difficult, current information retrieval systems force users to input some terms to express their needs.

Re: question about SolrCore

2010-10-28 Thread Li Li
Is there anyone who could help me? 2010/10/11 Li Li fancye...@gmail.com: hi all, I want to know the details of IndexReader in SolrCore. I read a little of the SolrCore code. Here is my understanding; is it correct? Each SolrCore has many SolrIndexSearchers and keeps them in _searchers

question about SolrCore

2010-10-11 Thread Li Li
hi all, I want to know the details of IndexReader in SolrCore. I read a little of the SolrCore code. Here is my understanding; is it correct? Each SolrCore has many SolrIndexSearchers and keeps them in _searchers, and _searcher keeps track of the latest version of the index. Each

Re: How to manage different indexes for different users

2010-10-11 Thread Li Li
will one user search other user's index? if not, you can use multi cores. 2010/10/11 Tharindu Mathew mcclou...@gmail.com: Hi everyone, I'm using solr to integrate search into my web app. I have a bunch of users who would have to be given their own individual indexes. I'm wondering whether

is multi-threads searcher feasible idea to speed up?

2010-09-28 Thread Li Li
hi all, I want to speed up search time for my application. In a query, the time is largely spent reading posting lists (IO with frq files), calculating scores and collecting results (CPU, with a priority queue). IO can hardly be optimized, or is already partly optimized by NIO. So I want to use multithreads to

Re: is multi-threads searcher feasible idea to speed up?

2010-09-28 Thread Li Li
Yes, there is a MultiSearcher in Lucene, but its idf values across the 2 indexes are not global. Maybe I can modify it and also the index, like: term1 df=5 doc1 doc3 doc5 term1 df=5 doc2 doc4 2010/9/28 Li Li fancye...@gmail.com: hi all, I want to speed up search time for my application. In a query
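A small plain-Java sketch of why the document frequency would need to be global. The class and method names are hypothetical, and the collection sizes are made-up numbers; the idf formula is the one I recall from Lucene 3.x's DefaultSimilarity, idf = 1 + ln(numDocs / (docFreq + 1)), so treat it as an assumption. The layout follows the example in the message: term1 occurs in doc1, doc3, doc5 of one index and in doc2, doc4 of the other, so the local frequencies are 3 and 2 but the global one is 5.

```java
final class GlobalIdfSketch {
    /** Classic idf as recalled from Lucene 3.x DefaultSimilarity (an assumption of this sketch). */
    static double idf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** Sums per-index statistics (document frequencies or document counts). */
    static long sum(long... perIndex) {
        long total = 0;
        for (long v : perIndex) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] localDf = {3, 2};        // term1: doc1, doc3, doc5 | doc2, doc4
        long[] localNumDocs = {30, 30}; // hypothetical index sizes

        // Scoring each index with its own statistics gives two different idf values.
        System.out.println(idf(localDf[0], localNumDocs[0]));
        System.out.println(idf(localDf[1], localNumDocs[1]));

        // Storing the global df=5 (and global numDocs) in both makes the scores comparable.
        System.out.println(idf(sum(localDf), sum(localNumDocs)));
    }
}
```

With local statistics, the same term gets a different weight depending on which index a document lives in, so merged result lists are not ranked consistently; using the summed statistics everywhere removes that difference.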

Re: Can Solr do approximate matching?

2010-09-22 Thread Li Li
It seems there is a MoreLikeThis in Lucene. I don't know whether there is a counterpart in Solr. It just uses the found document as a query to find similar documents. Or you can just use a boolean OR query, with similar questions getting a higher score. Of course, you can analyse the question using some NLP

Re: Color search for images

2010-09-16 Thread Li Li
do you mean content based image retrieval or just search images by tag? if the former, you can try LIRE 2010/9/15 Shawn Heisey s...@elyograg.org:  My index consists of metadata for a collection of 45 million objects, most of which are digital images.  The executives have fallen in love with

a small problem of distributed search

2010-08-16 Thread Li Li
The current implementation of distributed search uses the unique key in the STAGE_EXECUTE_QUERY stage. public int distributedProcess(ResponseBuilder rb) throws IOException { ... if (rb.stage == ResponseBuilder.STAGE_EXECUTE_QUERY) { createMainQuery(rb); return

how to ignore position in indexing?

2010-07-31 Thread Li Li
hi all, in Lucene, we can store only the tf in a term's inverted list. In my application, I only provide dismax queries with boolean queries and don't support queries that need position info, such as phrase queries. So I don't want to store position info in the prx file. How do I turn it off? And if I turn off

Re: Solr searching performance issues, using large documents

2010-07-30 Thread Li Li
Highlighting time is mainly spent on getting the field you want to highlight and tokenizing it (if you don't store term vectors). You can check what's wrong, 2010/7/30 Peter Spam ps...@mac.com: If I don't do highlighting, it's really fast. Optimize has no effect. -Peter On Jul

Re: Speed up Solr Index merging

2010-07-29 Thread Li Li
I faced this problem but couldn't find any good solution. But if you have a large stored field, such as the full text of a document, it will be quicker if you don't store it in Lucene, because merging 2 indexes forces copying all fdt files into a new fdt. If you store it externally, the problem you have to face is

Re: Problem with parsing date

2010-07-26 Thread Li Li
I use a format like -MM-ddThh:mm:ssZ. It works. 2010/7/26 Rafal Bluszcz Zawadzki ra...@headnet.dk: Hi, I am using Data Import Handler from Solr 1.4. Parts of my data-config.xml are:        entity name=page                processor=XPathEntityProcessor                stream=false      

Is there a cache for a query?

2010-07-26 Thread Li Li
I want a cache to cache all result of a query(all steps including collapse, highlight and facet). I read http://wiki.apache.org/solr/SolrCaching, but can't find a global cache. Maybe I can use external cache to store key-value. Is there any one in solr?

Re: Which is a good XPath generator?

2010-07-25 Thread Li Li
This topic is not related to Solr. Maybe you should read some papers about wrapper generation or automatic web data extraction. If you want to generate XPath, you could read Bing Liu's papers, such as Structured Data Extraction from the Web based on Partial Tree Alignment. Besides dom

Re: a bug of solr distributed search

2010-07-25 Thread Li Li
where is the link of this patch? 2010/7/24 Yonik Seeley yo...@lucidimagination.com: On Fri, Jul 23, 2010 at 2:23 PM, MitchK mitc...@web.de wrote: why do we do not send the output of TermsComponent of every node in the cluster to a Hadoop instance? Since TermsComponent does the map-part of the
