Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-15 Thread Li Li
http://zookeeper.apache.org/doc/r3.3.6/recipes.html#sc_recipes_twoPhasedCommit On Thu, Aug 16, 2012 at 7:41 AM, Nicholas Ball wrote: > > Haven't managed to find a good way to do this yet. Does anyone have any > ideas on how I could implement this feature? > Really need to move docs across from on

Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-15 Thread Li Li
do you really need this? distributed transaction is a difficult problem. in 2pc, every node could fail, including coordinator. something like leader election needed to make sure it works. you maybe try zookeeper. but if the transaction is not very very important like transfer money in bank, you can

Re: Atomic Multicore Operations - E.G. Move Docs

2012-08-15 Thread Li Li
On 2012-7-2 6:37 PM, "Nicholas Ball" wrote: > > > That could work, but then how do you ensure commit is called on the two > cores at the exact same time? that may need something like two-phase commit, as in a relational DB. lucene has prepareCommit, but to implement 2pc, many things need to be done. > Also, any w
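
A minimal sketch of how prepareCommit/commit pair up on two IndexWriters (Lucene 3.x API; the writers stand for the two hypothetical cores). Note this is not a full two-phase commit: a crash between the two commit() calls still leaves the cores inconsistent, which is exactly the gap discussed in this thread.

    import java.io.IOException;
    import org.apache.lucene.index.IndexWriter;

    public class TwoCoreCommit {
        // Commit the pending changes on both writers "almost" atomically.
        public static void commitBoth(IndexWriter source, IndexWriter target) throws IOException {
            try {
                source.prepareCommit();   // phase 1: flush + fsync, change not yet visible
                target.prepareCommit();
                source.commit();          // phase 2: make both prepared commits visible
                target.commit();
            } catch (IOException e) {
                source.rollback();        // discards the prepared commit (and closes the writer in 3.x)
                target.rollback();
                throw e;
            }
        }
    }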

Re: how to boost exact match

2012-08-10 Thread Li Li
Create a field for exact match; it is an optional boolean clause (see the sketch below). On 2012-8-11 1:42 PM, "abhayd" wrote: > hi > > I have documents like > iphone 4 - white > iphone 4s - black > ipone4 - black > > when user searches for iphone 4 i would like to show iphone 4 docs first > and > iphone 4s after that. > Simil
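
A sketch of that suggestion in Lucene terms (field names are made up): the analyzed field carries the required clauses, while an un-analyzed copy such as "title_exact" is added as an optional, boosted clause so exact matches float to the top. In Solr/dismax the same idea is usually expressed with a boost query (bq) on the exact field.

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause.Occur;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class ExactMatchBoost {
        public static Query build() {
            BooleanQuery q = new BooleanQuery();
            // normal analyzed matching: both terms required
            q.add(new TermQuery(new Term("title", "iphone")), Occur.MUST);
            q.add(new TermQuery(new Term("title", "4")), Occur.MUST);
            // optional exact-match clause on an un-analyzed copy of the field
            TermQuery exact = new TermQuery(new Term("title_exact", "iphone 4"));
            exact.setBoost(10.0f);
            q.add(exact, Occur.SHOULD);
            return q;
        }
    }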

Re: what is precisionStep and positionIncrementGap

2012-06-28 Thread Li Li
On Thu, Jun 28, 2012 at 3:51 PM, ZHANG Liang F wrote: > Thanks a lot, but the precisionStep is still very vague to me! Could you give > me an example? > > -Original Message- > From: Li Li [mailto:fancye...@gmail.com] > Sent: 2012-06-28 11:25 > To: solr-user@lucene.ap

Re: Solr seems to hang

2012-06-27 Thread Li Li
could you please use jstack to dump the call stacks? On Thu, Jun 28, 2012 at 2:53 PM, Arkadi Colson wrote: > It now hanging for 15 hour and nothing changes in the index directory. > > Tips for further debugging? > > > On 06/27/2012 03:50 PM, Arkadi Colson wrote: >> >> I'm sending files to solr wi

Re: Query Logic Question

2012-06-27 Thread Li Li
I think they are logically the same. but 1 may be a little bit faster than 2 On Thu, Jun 28, 2012 at 5:59 AM, Rublex wrote: > Hi, > > Can someone explain to me please why these two queries return different > results: > > 1. -PaymentType:Finance AND -PaymentType:Lease AND -PaymentType:Cash *(700 >

Re: what is precisionStep and positionIncrementGap

2012-06-27 Thread Li Li
1. precisionStep is used for range queries on numeric fields. see http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/search/NumericRangeQuery.html 2. positionIncrementGap is used for phrase queries on multi-valued fields, e.g. doc1 has two titles. title1: ab cd
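
A rough Lucene 3.x illustration of the precisionStep side (field name and values are made up); the step used at index time must also be used in NumericRangeQuery. positionIncrementGap has no per-field API here: it is whatever Analyzer.getPositionIncrementGap() returns, and in Solr it is the positionIncrementGap attribute on the fieldType in schema.xml.

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.NumericField;
    import org.apache.lucene.search.NumericRangeQuery;

    public class PrecisionStepExample {
        public static void main(String[] args) {
            // index "price" with precisionStep 4: extra lower-precision terms are written
            // so that range queries have to visit far fewer terms
            Document doc = new Document();
            doc.add(new NumericField("price", 4, Field.Store.YES, true).setIntValue(1999));

            // query time must use the same precisionStep
            NumericRangeQuery<Integer> range =
                    NumericRangeQuery.newIntRange("price", 4, 1000, 3000, true, true);
            System.out.println(range);
        }
    }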

Re: Solr seems to hang

2012-06-27 Thread Li Li
It seems that the IndexWriter wants to flush but needs to wait for other threads to become idle. But I see the n-gram filter is working. Is your field's value too long? You should also tell us the average load of the system, the free memory and the memory used by the JVM. On 2012-6-27 7:51 PM, "Arkadi Colson" wrote: > Anybody an idea

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
ul approach > http://lucene.472066.n3.nabble.com/High-response-time-after-being-idle-tp3616599p3617604.html. > > On Mon, Jun 11, 2012 at 3:02 PM, Toke Eskildsen > wrote: > >> On Mon, 2012-06-11 at 11:38 +0200, Li Li wrote: >> > yes, I need average query time less than

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
t; > http://en.wikipedia.org/wiki/Swappiness > > -Kuli > > Am 11.06.2012 10:38, schrieb Li Li: > >> I have roughly read the codes of RAMDirectory. it use a list of 1024 >> byte arrays and many overheads. >> But as far as I know, using MMapDirectory, I can't prev

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
persist your index, > you'll need to live with disk IO anyway. > > Greetings, > Kuli > > Am 11.06.2012 11:20, schrieb Li Li: > >> I am sorry. I make a mistake. even use RAMDirectory, I can not >> guarantee they are not swapped out. >> >> On Mon,

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
ss > > -Kuli > > Am 11.06.2012 10:38, schrieb Li Li: > >> I have roughly read the codes of RAMDirectory. it use a list of 1024 >> byte arrays and many overheads. >> But as far as I know, using MMapDirectory, I can't prevent the page >> faults. OS will swap less

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
d a "small" segment. Every night I will merge them. new added documents will flush into a new segment and I will merge the new generated segment and the small one. Our update operations are not very frequent. On Mon, Jun 11, 2012 at 4:59 PM, Paul Libbrecht wrote: > Li Li, > > have yo

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
at 4:45 PM, Michael Kuhlmann wrote: > Set the swapiness to 0 to avoid memory pages being swapped to disk too > early. > > http://en.wikipedia.org/wiki/Swappiness > > -Kuli > > Am 11.06.2012 10:38, schrieb Li Li: > >> I have roughly read the codes of RAMDirectory. it

Re: what's better for in memory searching?

2012-06-11 Thread Li Li
This sounds wrong, but it is true. With > RAMDirectory, Java has to work harder doing garbage collection. > > On Fri, Jun 8, 2012 at 1:30 AM, Li Li wrote: >> hi all >>   I want to use lucene 3.6 providing searching service. my data is >> not very large, raw data is le
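
What this thread converges on (keep the index on disk, let the OS page cache hold it, tune swappiness down) looks roughly like this with the Lucene 3.6 API; the path is a placeholder.

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.store.MMapDirectory;

    public class MMapSearch {
        public static void main(String[] args) throws Exception {
            // the index stays on disk; the OS page cache keeps the hot parts in memory,
            // so nothing large lives on the Java heap (unlike RAMDirectory)
            MMapDirectory dir = new MMapDirectory(new File("/path/to/index"));
            IndexSearcher searcher = new IndexSearcher(IndexReader.open(dir));
            System.out.println(searcher.maxDoc());
        }
    }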

Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Li Li
yes, I am also interested in good performance with 2 billion docs. how many search nodes do you use? what's the average response time and qps ? another question: where can I find related paper or resources of your algorithm which explains the algorithm in detail? why it's better than google site(b

Re: Installing Solr on Tomcat using Shell - Code wrong?

2012-05-22 Thread Li Li
You should find some clues in the Tomcat log. On 2012-5-22 7:49 PM, "Spadez" wrote: > Hi, > > This is the install process I used in my shell script to try and get Tomcat > running with Solr (debian server): > > > > I swear this used to work, but currently only Tomcat works. The Solr page > just comes up wi

Re: How can i search site name

2012-05-21 Thread Li Li
You should define your matching first. If the site is www.google.com, how do you match it: full-string matching or partial matching? E.g., should "google" match? If it should, you need to write your own analyzer for this field (see the sketch below). On Tue, May 22, 2012 at 2:03 PM, Shameema Umer wrote: > Sorry, > Please
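
A minimal sketch of such an analyzer (Lucene 3.x; the class name is made up): splitting on non-letter characters makes "google" match "www.google.com", whereas a KeywordTokenizer would give full-string matching only.

    import java.io.Reader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.LetterTokenizer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.util.Version;

    public class HostnameAnalyzer extends Analyzer {
        @Override
        public TokenStream tokenStream(String fieldName, Reader reader) {
            // "www.google.com" -> [www, google, com], lowercased
            TokenStream ts = new LetterTokenizer(Version.LUCENE_36, reader);
            return new LowerCaseFilter(Version.LUCENE_36, ts);
        }
    }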

Re: Solr query with mandatory values

2012-05-09 Thread Li Li
query=parser.parse(q); System.out.println(query); On Thu, May 10, 2012 at 8:20 AM, Li Li wrote: > + before term is correct. in lucene term includes field and value. > > Query ::= ( Clause )* > > Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "

Re: Solr query with mandatory values

2012-05-09 Thread Li Li
"+" before a term is correct; in Lucene a term includes both field and value. Query ::= ( Clause )* Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" ) <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> | "-" | "+" ) > <#_ESCAPED_CHAR: "\\" ~[] > in Lucene query syntax, you can't express a term value i
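
A small QueryParser example of the '+'/'-' prefixes on whole clauses (Lucene 3.x; field names are arbitrary):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.util.Version;

    public class MandatoryClauseDemo {
        public static void main(String[] args) throws Exception {
            QueryParser parser = new QueryParser(Version.LUCENE_36, "text",
                    new StandardAnalyzer(Version.LUCENE_36));
            // '+' makes the whole clause mandatory, '-' prohibits it
            Query q = parser.parse("+category:book title:lucene -title:draft");
            System.out.println(q);   // +category:book title:lucene -title:draft
        }
    }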

Re: SOLRJ: Is there a way to obtain a quick count of total results for a query

2012-05-04 Thread Li Li
Not scoring by relevance and sorting by document id may speed it up a little. I haven't done any test of this; maybe you can give it a try. Scoring will consume some CPU time, and you just want to match and get the total count. On Wed, May 2, 2012 at 11:58 PM, vybe3142 wrote: > I can achieve this by
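
In SolrJ terms, the cheap version of "just give me the count" is rows=0, optionally with the doc-id sort the mail mentions so no relevance sorting is done; the URL and query below are placeholders (SolrJ 1.4/3.x API).

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class CountOnly {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("field:value");
            q.setRows(0);                                       // return no documents, only numFound
            q.setSortField("_docid_", SolrQuery.ORDER.asc);     // skip relevance scoring for the sort
            QueryResponse rsp = server.query(q);
            System.out.println(rsp.getResults().getNumFound()); // total matching documents
        }
    }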

Re: Sorting result first which come first in sentance

2012-05-03 Thread Li Li
for this version, you may consider using payload for position boost. you can save boost values in payload. I have used it in lucene api where anchor text should weigh more than normal text. but I haven't used it in solr. some searched urls: http://wiki.apache.org/solr/Payloads http://digitalpebble.

Re: Sorting result first which come first in sentance

2012-05-03 Thread Li Li
As for versions below 4.0, it's not possible because of Lucene's scoring model. Position information is stored, but only used to support phrase queries; it just tells us whether a document matched, but we can't use it to boost a document. A similar problem is: how to implement proximity boost. for 2 search terms,

Re: get latest 50 documents the fastest way

2012-05-01 Thread Li Li
You should reverse your sort order. Maybe you can override the tf method of Similarity and return -1.0f * tf() (I don't know whether the default collector allows scores smaller than zero). Or you can hack this by adding a large number, or write your own collector; in its collect(int doc) method, you can
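
An alternative to hacking the Similarity or the collector is simply to sort on the internal doc id in reverse, which avoids scoring entirely and assumes documents were added in chronological order; a Lucene 3.x sketch with a placeholder path:

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MatchAllDocsQuery;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;

    public class Latest50 {
        public static void main(String[] args) throws Exception {
            IndexSearcher searcher = new IndexSearcher(
                    IndexReader.open(FSDirectory.open(new File("/path/to/index"))));
            // reverse doc-id sort: newest documents first, no scores computed
            Sort newestFirst = new Sort(new SortField(null, SortField.DOC, true));
            TopDocs top = searcher.search(new MatchAllDocsQuery(), null, 50, newestFirst);
            System.out.println(top.totalHits);
        }
    }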

question about NRT(soft commit) and Transaction Log in trunk

2012-04-28 Thread Li Li
hi I checked out the trunk and played with its new soft commit feature. it's cool. But I've got a few questions about it. From reading some introductory articles and the wiki, and a hasty reading of the code, my understanding of its implementation is: For a normal commit (hard commit), we should flush all in

Re: How to read SOLR cache statistics?

2012-04-13 Thread Li Li
http://wiki.apache.org/solr/SolrCaching On Fri, Apr 13, 2012 at 2:30 PM, Kashif Khan wrote: > Does anyone explain what does the following parameters mean in SOLR cache > statistics? > > *name*: queryResultCache > *class*: org.apache.solr.search.LRUCache > *version*: 1.0 > *description*: LRU

Re: Solr Scoring

2012-04-13 Thread Li Li
another way is to use payload http://wiki.apache.org/solr/Payloads the advantage of payload is that you only need one field and can make frq file smaller than use two fields. but the disadvantage is payload is stored in prx file, so I am not sure which one is fast. maybe you can try them both. On

Re: using solr to do a 'match'

2012-04-11 Thread Li Li
houldMatch parameter'. Also > norms can be used as a source for dynamics mm values. > > Wdyt? > > On Wed, Apr 11, 2012 at 10:08 AM, Li Li wrote: > > > it's not possible now because lucene don't support this. > > when doing disjunction query, it onl

Re: using solr to do a 'match'

2012-04-10 Thread Li Li
It's not possible now because Lucene doesn't support this. When doing a disjunction query, it only records how many terms match the document. I think this is a common requirement for many users. I suggest Lucene should split the scorer into a matcher and a scorer; the matcher just returns which doc is matche

Re: Trouble Setting Up Development Environment

2012-03-24 Thread Li Li
gt;> Classpath entry /solr3_5/ssrc/solr/lib/easymock-2.2.jar will not be >> exported or published. Runtime ClassNotFoundExceptions may result. >> solr3_5P/solr3_5Classpath Dependency Validator Message >> Classpath entry >> /solr3_5/ssrc/solr/lib/geronimo-stax

Re: Trouble Setting Up Development Environment

2012-03-23 Thread Li Li
here is my method. 1. check out the latest source code from trunk or download the tar ball: svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk lucene_trunk 2. create a dynamic web project in eclipse and close it. for example, I create a project named lucene-solr-trunk in my workspace.

Re: How to avoid the unexpected character error?

2012-03-15 Thread Li Li
it's not the right place. when you use java -Durl=http://... -jar post.jar data.xml, the data.xml file must be a valid xml file. you should escape special chars in this file. I don't know how you generate this file. if you use a java program (or other scripts) to generate this file, you should use xml t

Re: Solr out of memory exception

2012-03-15 Thread Li Li
ag solved a real problem we were having. Whoever wrote the JRocket book you refer to no doubt had other scenarios in mind... On Thu, Mar 15, 2012 at 3:02 PM, C.Yunqin <345804...@qq.com> wrote: > why should enable pointer compression? > > > > > -- Original -

Re: Solr out of memory exception

2012-03-14 Thread Li Li
ver with exactly same system and solr configuration & > memory it is working fine? > > > -Original Message- > From: Li Li [mailto:fancye...@gmail.com] > Sent: Thursday, March 15, 2012 11:11 AM > To: solr-user@lucene.apache.org > Subject: Re: Solr out of memory excep

Re: Solr out of memory exception

2012-03-14 Thread Li Li
How much memory is allocated to the JVM? On Thu, Mar 15, 2012 at 1:27 PM, Husain, Yavar wrote: > Solr is giving out of memory exception. Full Indexing was completed fine. > Later while searching maybe when it tries to load the results in memory it > starts giving this exception. Though with the sam

Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
no, it has nothing to do with schema.xml. post.jar just posts a file, it doesn't parse it. solr will use an xml parser to parse this file. if you don't escape special characters, it's not a valid xml file and solr will throw exceptions. On Thu, Mar 15, 2012 at 12:33 AM, neosky wrote: > Thanks! > D

Re: How to avoid the unexpected character error?

2012-03-14 Thread Li Li
There is a class org.apache.solr.common.util.XML in solr; you can use this wrapper:

    import java.io.IOException;
    import java.io.StringWriter;
    import org.apache.solr.common.util.XML;

    public static String escapeXml(String s) throws IOException {
        StringWriter sw = new StringWriter();
        XML.escapeCharData(s, sw);
        return sw.getBuffer().toString();
    }

On Wed, Mar 14, 2012 at

Re: Sorting on non-stored field

2012-03-14 Thread Li Li
it should be indexed but not analyzed. it doesn't need to be stored. reading field values from stored fields is extremely slow, so lucene uses a FieldCache StringIndex to read field values for sorting. so if you want to sort by some field, you should index the field and not analyze it (see the sketch below). On Wed, Mar 14, 2012 at 6:43 PM, Fi
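
In Lucene 3.x terms (hypothetical field name), "indexed, not analyzed, not stored" plus a string sort looks like this; the sort values come from the FieldCache built over the indexed terms, not from stored fields.

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.search.Sort;
    import org.apache.lucene.search.SortField;

    public class SortFieldSetup {
        public static Document makeDoc(String category) {
            Document doc = new Document();
            // indexed as a single un-analyzed term, never stored
            doc.add(new Field("category", category,
                    Field.Store.NO, Field.Index.NOT_ANALYZED_NO_NORMS));
            return doc;
        }

        public static Sort sortByCategory() {
            return new Sort(new SortField("category", SortField.STRING));
        }
    }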

Re: index size with replication

2012-03-13 Thread Li Li
optimize will generate new segments and delete old ones. if your master also provides searching service during indexing, the old files may be opened by old SolrIndexSearcher. they will be deleted later. So when indexing, the index size may double. But a moment later, old indexes will be deleted.

Re: How to limit the number of open searchers?

2012-03-06 Thread Li Li
what do you mean by "programmatically"? modify the code of solr? because solr is not like lucene, it only provides http interfaces for its users rather than a java api. if you want to modify solr, you can find the code in SolrCore: private final LinkedList<RefCounted<SolrIndexSearcher>> _searchers = new LinkedList<RefCounted<SolrIndexSearcher>>(); and _searcher is current

Re: Fw:how to make fdx file

2012-03-04 Thread Li Li
lucene will never modify old segment files, it just flushes into a new segment or merges old segments into new one. after merging, old segments will be deleted. once a file(such as fdt and fdx) is generated. it will never be re-generated. the only possible is that in the generating stage, there is

Re: Sort by the number of matching terms (coord value)

2012-02-16 Thread Li Li
you can fool the lucene scoring function: override each function such as idf, queryNorm and lengthNorm and let them simply return 1.0f (see the sketch below). lucene 4 may expose more details, but for 2.x/3.x lucene can only score by the vector space model and the formula can't be replaced by users. On Fri, Feb 17, 2012
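
A hedged sketch of that trick for Lucene 3.x: neutralize tf/idf/queryNorm so that, with coord left alone, the score mostly reflects how many query terms matched. Length norms can additionally be disabled per field at index time with setOmitNorms(true).

    import org.apache.lucene.search.DefaultSimilarity;
    import org.apache.lucene.search.IndexSearcher;

    public class MatchCountSimilarity extends DefaultSimilarity {
        @Override public float tf(float freq) { return 1.0f; }
        @Override public float idf(int docFreq, int numDocs) { return 1.0f; }
        @Override public float queryNorm(float sumOfSquaredWeights) { return 1.0f; }
        // coord(overlap, maxOverlap) is inherited: it still grows with the number of matching terms

        public static void install(IndexSearcher searcher) {
            searcher.setSimilarity(new MatchCountSimilarity());
        }
    }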

Re: Can I rebuild an index and remove some fields?

2012-02-15 Thread Li Li
w have a shrunk index with specified terms removed. > > Implementation uses separate thread for each segment, so it re-writes > them in parallel. Took about 15 minutes to do 770,000 doc index on my > macbook. > > > On Tue, Feb 14, 2012 at 10:12 PM, Li Li wrote: > > I have rough

Re: Can I rebuild an index and remove some fields?

2012-02-14 Thread Li Li
nd Terms(...) it might work. > > Something like: > > HashSet ignoredTerms=...; > > FilteringIndexReader wrapper=new FilterIndexReader(reader); > > SegmentMerger merger=new SegmentMerger(writer); > > merger.add(wrapper); > > merger.Merge(); > > > > >

Re: Can I rebuild an index and remove some fields?

2012-02-13 Thread Li Li
for method 2, delete is wrong; we can't delete terms. you would also have to hack the tii and tis files. On Tue, Feb 14, 2012 at 2:46 PM, Li Li wrote: > method1, dumping data > for stored fields, you can traverse the whole index and save it to > somewhere else. > for index

Re: Can I rebuild an index and remove some fields?

2012-02-13 Thread Li Li
method1, dumping data for stored fields, you can traverse the whole index and save it to somewhere else. for indexed but not stored fields, it may be more difficult. if the indexed and not stored field is not analyzed(fields such as id), it's easy to get from FieldCache.StringIndex. But for

Re: New segment file created too often

2012-02-13 Thread Li Li
available after adding to the index. > > What I don't understand is why new segment files are created so often. > Are the commit calls triggering new segment files being created? I don't > see this behavior in another environment of the same version of solr. > >

Re: New segment file created too often

2012-02-13 Thread Li Li
ts be available after adding to the index. > > What I don't understand is why new segment files are created so often. > Are the commit calls triggering new segment files being created? I don't > see this behavior in another environment of the same version of solr. > >

Re: New segment file created too often

2012-02-13 Thread Li Li
Commit is called after adding each document. You should add enough documents and then call a single commit; commit is a costly operation (see the sketch below). If you want to get the latest fed documents, you could use NRT. On Tue, Feb 14, 2012 at 12:47 AM, Huy Le wrote: > Hi, > > I am using solr 3.5. I seeing solr keep
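
A SolrJ sketch of "batch the adds, commit once"; the URL, field names and batch size are placeholders.

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchedFeed {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
            for (int i = 0; i < 10000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", String.valueOf(i));
                doc.addField("title", "document " + i);
                batch.add(doc);
                if (batch.size() == 1000) {   // ship in chunks, but do NOT commit per document
                    server.add(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) server.add(batch);
            server.commit();   // one commit -> one new searcher for the whole run
        }
    }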

Re: Chinese Phonetic search

2012-02-07 Thread Li Li
you can convert Chinese words to pinyin and use n-gram to search phonetic similar words On Wed, Feb 8, 2012 at 11:10 AM, Floyd Wu wrote: > Hi there, > > Does anyone here ever implemented phonetic search especially with > Chinese(traditional/simplified) using SOLR or Lucene? > > Please share some

more sql-like commands for solr

2012-02-07 Thread Li Li
hi all, we have used solr to provide searching service in many products. I found for each product, we have to do some configurations and query expressions. our users are not used to this. they are familiar with sql and they may describe like this: I want a query that can search books whose

Re: RE: Can't find resource 'solrconfig.xml'

2011-10-31 Thread Li Li
set JAVA_OPTS=%JAVA_OPTS% -Dsolr.solr.home=c:\xxx On Mon, Oct 31, 2011 at 9:14 PM, 刘浪 wrote: > Hi Li Li, >I don't know where I should add in catalina.bat. I have know Linux > how to do it, but my OS is windows. >Thank you very much. > > Sincerely, > A

Re: Can't find resource 'solrconfig.xml'

2011-10-31 Thread Li Li
modify catalina.sh(bat) adding java startup params: -Dsolr.solr.home=/your/path On Mon, Oct 31, 2011 at 8:30 PM, 刘浪 wrote: > Hi, > After I start tomcat, I input http://localhost:8080/solr/admin. It > can display. But in the tomcat, I find an exception like "Can't find > resource 'solrconfig

Re: Want to support "did you mean xxx" but is Chinese

2011-10-21 Thread Li Li
we have implemented one supporting "did you mean" and prefix suggestion for Chinese. But we based our work on solr 1.4 and we made many modifications, so it will take time to integrate it into current solr/lucene. Here is our solution; glad to hear any advice. 1. offline words and p

Re: Multi CPU Cores

2011-10-16 Thread Li Li
for indexing, you can make use of multiple cores easily by calling IndexWriter.addDocument from multiple threads (see the sketch below). as far as I know, for searching, if there is only one request, you can't make good use of the cpus. On Sat, Oct 15, 2011 at 9:37 PM, Rob Brown wrote: > Hi, > > I'm running Solr on a machine with
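
A rough sketch of multi-threaded feeding of a single IndexWriter, which is thread-safe for addDocument (Lucene 3.x; the path and field are placeholders).

    import java.io.File;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class ParallelIndexer {
        public static void main(String[] args) throws Exception {
            final IndexWriter writer = new IndexWriter(
                    FSDirectory.open(new File("/path/to/index")),
                    new IndexWriterConfig(Version.LUCENE_36,
                            new StandardAnalyzer(Version.LUCENE_36)));
            ExecutorService pool =
                    Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
            for (int i = 0; i < 1000; i++) {
                final int id = i;
                pool.submit(new Runnable() {
                    public void run() {
                        try {
                            Document doc = new Document();
                            doc.add(new Field("id", "doc-" + id,
                                    Field.Store.YES, Field.Index.NOT_ANALYZED));
                            writer.addDocument(doc);   // safe to call from many threads
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            writer.close();
        }
    }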

What will happen when one thread is closing a searcher while another is searching?

2011-09-05 Thread Li Li
hi all, I am using spellcheck in solr 1.4. I found that spell check is not implemented as SolrCore. in SolrCore, it uses reference count to track current searcher. oldSearcher and newSearcher will both exist if oldSearcher is servicing some query. But in FileBasedSpellChecker public void bu

what's the status of droids project(http://incubator.apache.org/droids/)?

2011-08-23 Thread Li Li
hi all I am interested in vertical crawlers. But it seems this project is not very active. Its last update was 11/16/2009

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
or your analyzer is null? any other exception or warning in your log file? On Fri, Aug 19, 2011 at 7:37 PM, Li Li wrote: > Line 476 of  SpellCheckComponent.getTokens of mine  is  assert analyzer != > null; > it seems our codes' versions don't match. could

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
Line 476 of SpellCheckComponent.getTokens of mine is assert analyzer != null; it seems our codes' versions don't match. could you decompile your SpellCheckComponent.class ? On Fri, Aug 19, 2011 at 7:23 PM, Valentin wrote: > My beautiful NullPointer Exception : > > > SEVERE: java.lang.NullPoin

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
NullPointerException? do you have the full exception print stack? On Fri, Aug 19, 2011 at 6:49 PM, Valentin wrote: > > Li Li wrote: >> If you don't want to tokenize  query, you should pass spellcheck.q >> and provide your own analyzer such as keyword analyzer. > >

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
I haven't used suggest yet. But in spell check if you don't provide spellcheck.q, it will analyze the q parameter by a converter which "tokenize" your query. else it will use the analyzer of the field to process parameter q. If you don't want to tokenize query, you should pass spellcheck.q

Re: solr distributed search don't work

2011-08-19 Thread Li Li
g directly, not in url, but should > work the same. > Maybe an issue in your spell request handler. > > > 2011/8/19 Li Li > >> hi all, >>     I follow the wiki http://wiki.apache.org/solr/SpellCheckComponent >> but there is something wrong. >>     t

Re: Full sentence spellcheck

2011-08-19 Thread Li Li
this may need something like language models to suggest. I found an issue https://issues.apache.org/jira/browse/SOLR-2585 what's going on with it? On Thu, Aug 18, 2011 at 11:31 PM, Valentin wrote: > I'm trying to configure a spellchecker to autocomplete full sentences from my > query. > > I've

solr distributed search don't work

2011-08-19 Thread Li Li
hi all, I follow the wiki http://wiki.apache.org/solr/SpellCheckComponent but there is something wrong. the url given my the wiki is http://solr:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.q=toyata&qt=spell&shards.qt=spell&shards=solr-shard1:8983/solr,solr-shar

can't use distributed spell check

2011-08-19 Thread Li Li
hi all, I tested it following the instructions in http://wiki.apache.org/solr/SpellCheckComponent. but it seems something wrong. the sample url in the wiki is http://solr:8983/solr/select?q=*:*&spellcheck=true&spellcheck.build=true&spellcheck.q=toyata&qt=spell&shards.qt=spell&shards=solr-

Re: how to enable MMapDirectory in solr 1.4?

2011-08-08 Thread Li Li
if MMapDirectory will perform better for you with Linux over >> NIOFSDir.  I'm pretty sure in Trunk/4.0 it's the default for Windows and >> maybe Solaris.  In Windows, there is a definite advantage for using >> MMapDirectory on a 64-bit system. >> >> James Dye

how to enable MMapDirectory in solr 1.4?

2011-08-08 Thread Li Li
hi all, I read the Apache Solr 3.1 release notes today and found that MMapDirectory is now the default implementation on 64-bit systems. I am now using solr 1.4 with a 64-bit jvm on Linux. how can I use MMapDirectory? will it improve performance?

full text searching in cloud for minor enterprises

2011-07-04 Thread Li Li
hi all, I want to provide full text searching for some "small" websites. It seems cloud computing is popular now, and it will save costs because we don't need to employ an engineer to maintain the machines. For now, there are many services such as amazon s3, google app engine, ms azure etc. I am

Re: Unsupported encoding GB18030

2011-04-01 Thread Li Li
post.jar only supports utf8. you must do the conversion yourself (see the sketch below). 2011/4/1 Jan Høydahl : > Hi, > > Testing the new Solr 3.1 release under Windows XP and Java 1.6.0_23 > > When trying to post example\exampledocs\gb18030-example.xml using post.jar I > get this error: > % java -jar post.jar gb18030-exampl
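
A minimal transcoding sketch (JDK only): read the file as GB18030 and write it back out as UTF-8 before feeding it to post.jar. Note the XML declaration inside the file still says encoding="GB18030" and would have to be changed to UTF-8 as well.

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.io.Writer;

    public class Gb18030ToUtf8 {
        public static void main(String[] args) throws Exception {
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    new FileInputStream("gb18030-example.xml"), "GB18030"));
            Writer out = new OutputStreamWriter(
                    new FileOutputStream("utf8-example.xml"), "UTF-8");
            char[] buf = new char[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);   // copy characters, re-encoded as UTF-8
            }
            in.close();
            out.close();
        }
    }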

Re: RamBufferSize and AutoCommit

2011-03-28 Thread Li Li
there are 3 conditions that will trigger an auto flush in lucene: 1. the size of the index in ram is larger than the ram buffer size; 2. the number of documents in memory is larger than the number set by setMaxBufferedDocs; 3. the number of buffered delete terms is larger than the number set by setMaxBufferedDeleteTerms (see the sketch below). auto flushing by
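
The three triggers map onto IndexWriterConfig knobs in Lucene 3.1+; by default only the RAM-buffer trigger is active and the other two are left at DISABLE_AUTO_FLUSH. The values below are placeholders, and note that a flush is not a commit.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.util.Version;

    public class FlushTriggers {
        public static IndexWriterConfig config() {
            return new IndexWriterConfig(Version.LUCENE_36,
                        new StandardAnalyzer(Version.LUCENE_36))
                    .setRAMBufferSizeMB(64.0)           // 1. flush when buffered docs exceed 64 MB of RAM
                    .setMaxBufferedDocs(100000)         // 2. ...or when this many docs are buffered
                    .setMaxBufferedDeleteTerms(10000);  // 3. ...or when this many delete terms are buffered
        }
    }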

Re: Snappull failed

2011-03-19 Thread Li Li
has the master updated the index during replication? this can occur when the slave fails to download a file because of a network problem. 209715200!=583644834 means the expected size of the file was 583644834 but the slave only downloaded 209715200 bytes; maybe the connection timed out. 2011/2/16 Markus Jelsma :

Re: Helpful new JVM parameters

2011-03-17 Thread Li Li
will UseCompressedOops be useful? for applications using less than 4GB of memory, it will be better than 64-bit references. But for applications using more memory, it will not be cache friendly. "JRockit: The Definitive Guide" says: "Naturally, 64 GB isn't a theoretical limit but just an example. It was me

Re: What request handlers to use for query strings in Chinese or Japanese?

2011-03-17 Thread Li Li
That's the job your analyzer should concern 2011/3/17 Andy : > Hi, > > For my Solr server, some of the query strings will be in Asian languages such > as Chinese or Japanese. > > For such query strings, would the Standard or Dismax request handler work? My > understanding is that both the Stand

Re: I send a email to lucene-dev solr-dev lucene-user but always failed

2011-03-11 Thread Li Li
bility of master. we want to use some synchronization mechanism to allow only 1 or 2 ReplicationHandler threads are doing CMD_GET_FILE command. Is that solution feasible? 2011/3/11 Li Li > hi > it seems my mail is judged as spam. > Technical details of permanent failure: >

I send a email to lucene-dev solr-dev lucene-user but always failed

2011-03-11 Thread Li Li
hi it seems my mail is judged as spam. Technical details of permanent failure: Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other email provider for further information about the cause of this error. The error that the other

Re: Turn off caching

2011-02-11 Thread Li Li
fieldcache - that can't be > commented out. This cache will always jump into the picture > > If I need to do such things, I restart the whole tomcat6 server to flush ALL > caches. > > 2011/2/11 Li Li > >> do you mean queryResultCache? you can comment r

Re: Turn off caching

2011-02-10 Thread Li Li
do you mean queryResultCache? you can comment out the related section in solrconfig.xml. see http://wiki.apache.org/solr/SolrCaching 2011/2/8 Isan Fulia : > Hi, > My solrConfig file looks like > > >   > >   >     multipartUploadLimitInKB="2048" /> >   > >   default="true" /> >   >   class="org.apache.so

Re: Optimizing to only 1 segment

2010-12-27 Thread Li Li
ser by entering url >> >> http://localhost:8080/myindex/update?optimize=true >> or >> http://localhost:8080/myindex/update?stream.body= >> >> Thanks. >> >> >> On Mon, Dec 27, 2010 at 7:12 AM, Li Li wrote: >> >>> maybe you can consul

Re: Optimizing to only 1 segment

2010-12-26 Thread Li Li
maybe you can consult log files and it may show you something btw how do you post your command? do you use curl 'http://localhost:8983/solr/update?optimize=true' ? or posting a xml file? 2010/12/27 Rok Rejc : > On Mon, Dec 27, 2010 at 3:26 AM, Li Li wrote: > >> see maxMerg

Re: Optimizing to only 1 segment

2010-12-26 Thread Li Li
see maxMergeDocs(maxMergeSize) in solrconfig.xml. if the segment's documents size is larger than this value, it will not be merged. 2010/12/27 Rok Rejc : > Hi all, > > I have created an index, commited the data and after that I had run the > optimize with default parameters: > > http://localhost:8

Re: Best practice for Delta every 2 Minutes.

2010-12-16 Thread Li Li
write the documents into a log file, and after flushing, delete the corresponding lines from the log file. if the program crashes, we replay the log and add the documents into the RAMDirectory again. Has anyone done similar work? 2010/12/1 Li Li : > you may implement your own MergePolicy to keep on large index

Re: Best practice for Delta every 2 Minutes.

2010-12-16 Thread Li Li
I think it will not, because the default configuration can only have 2 newSearcher threads, but the delay will get longer and longer. The newer newSearcher will wait for the 2 earlier ones to finish. 2010/12/1 Jonathan Rochkind : > If your index warmings take longer than two minutes, but you're doing a > co

Re: Best practice for Delta every 2 Minutes.

2010-11-30 Thread Li Li
you may implement your own MergePolicy to keep one large index and merge all the other small ones, or simply set the merge factor to 2 and keep the largest index from being merged by setting maxMergeDocs to less than the number of docs in the largest one. So there is one large index and a small one. when adding a few docs, they wi

Re: shutdown.sh does not kill the tomcat process running solr./?

2010-11-30 Thread Li Li
1. make sure the port is not used. 2. ./bin/shutdown.sh && tail -f logs/xxx to see what the server is doing. if you have just fed data or modified the index and haven't flushed/committed, it will do some work while shutting down. 2010/12/1 Robert Petersen : > Greetings, we're wondering why we can issue th

strange problem

2010-11-15 Thread Li Li
hi all I ran into a strange problem when feeding data to solr. I started feeding and then hit Ctrl+C to kill the feed program (post.jar). Then, because the XML stream was terminated abnormally, DirectUpdateHandler2 threw an exception. And I went to the index directory and sorted it by date. the newest files are f

Re: Does Solr support Natural Language Search

2010-11-04 Thread Li Li
I don't think current lucene offers what you want. There are 2 main tasks in a search process. One is "understanding" the user's intention. Because natural language understanding is difficult, current Information Retrieval systems "force" users to input some terms to express their needs

Re: question about SolrCore

2010-10-28 Thread Li Li
is there anyone could help me? 2010/10/11 Li Li : > hi all, >    I want to know the detail of IndexReader in SolrCore. I read a > little codes of SolrCore. Here is my understanding, are they correct? >    Each SolrCore has many SolrIndexSearcher and keeps them in > _searchers. and

Re: How to manage different indexes for different users

2010-10-11 Thread Li Li
will one user search other user's index? if not, you can use multi cores. 2010/10/11 Tharindu Mathew : > Hi everyone, > > I'm using solr to integrate search into my web app. > > I have a bunch of users who would have to be given their own individual > indexes. > > I'm wondering whether I'd have to

question about SolrCore

2010-10-11 Thread Li Li
hi all, I want to know the detail of IndexReader in SolrCore. I read a little codes of SolrCore. Here is my understanding, are they correct? Each SolrCore has many SolrIndexSearcher and keeps them in _searchers. and _searcher keep trace of the latest version of index. Each SolrIndexSearcher

Re: is multi-threads searcher feasible idea to speed up?

2010-09-28 Thread Li Li
yes, there is a MultiSearcher in lucene, but the idf values across the 2 indexes are not global. maybe I can modify it and also the index, like: term1 df=5 doc1 doc3 doc5 term1 df=5 doc2 doc4 2010/9/28 Li Li : > hi all >    I want to speed up search time for my application. In a query, th

is multi-threads searcher feasible idea to speed up?

2010-09-28 Thread Li Li
hi all I want to speed up search time for my application. In a query, the time is largely spent reading posting lists (IO with frq files) and calculating scores and collecting results (CPU, with a priority queue). IO is hard to optimize, or is already partly optimized by nio. So I want to use multiple threads to utili

Re: Can Solr do approximate matching?

2010-09-22 Thread Li Li
It seems there is a MoreLikeThis in lucene; I don't know whether there is a counterpart in solr (see the sketch below). It just uses the found document as a query to find similar documents. Or you can just use a boolean OR query, and similar questions will get a higher score. Of course, you can analyse the question using some NLP t
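
A Lucene 3.x MoreLikeThis sketch (from the contrib queries module; the field name, doc id and path are placeholders). Solr exposes the same functionality through its MoreLikeThis component/handler.

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.search.similar.MoreLikeThis;
    import org.apache.lucene.store.FSDirectory;

    public class SimilarQuestions {
        public static void main(String[] args) throws Exception {
            IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/index")));
            IndexSearcher searcher = new IndexSearcher(reader);
            MoreLikeThis mlt = new MoreLikeThis(reader);
            mlt.setFieldNames(new String[] { "question" });
            mlt.setMinTermFreq(1);
            mlt.setMinDocFreq(1);
            Query like = mlt.like(42);                 // build a query from document 42's terms
            TopDocs similar = searcher.search(like, 10);
            System.out.println(similar.totalHits);
        }
    }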

Re: Color search for images

2010-09-16 Thread Li Li
do you mean content based image retrieval or just search images by tag? if the former, you can try LIRE 2010/9/15 Shawn Heisey : >  My index consists of metadata for a collection of 45 million objects, most > of which are digital images.  The executives have fallen in love with > Google's color im

a small problem of distributed search

2010-08-16 Thread Li Li
the current implementation of distributed search uses the unique key in the STAGE_EXECUTE_QUERY stage: public int distributedProcess(ResponseBuilder rb) throws IOException { ... if (rb.stage == ResponseBuilder.STAGE_EXECUTE_QUERY) { createMainQuery(rb); return ResponseBuilder.STAGE_

how to ignore position in indexing?

2010-07-31 Thread Li Li
hi all in lucene, I want to store only the tf of a term's inverted list. in my application, I only provide dismax queries with boolean queries and don't support queries which need position info, such as phrase queries. So I don't want to store position info in the prx file. How do I turn it off? And if I turn off i
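
In the 3.x line there is only an all-or-nothing switch: setOmitTermFreqAndPositions(true) drops positions but also drops term frequencies, so it would affect dismax scoring; keeping freqs while omitting positions only became possible later (the 4.x IndexOptions.DOCS_AND_FREQS). A sketch of the 3.x switch, with made-up field names:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;

    public class NoPositionsField {
        public static Document makeDoc(String text) {
            Field body = new Field("body", text, Field.Store.NO, Field.Index.ANALYZED);
            // omits BOTH term frequencies and positions for this field (no prx entries written)
            body.setOmitTermFreqAndPositions(true);
            Document doc = new Document();
            doc.add(body);
            return doc;
        }
    }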

Re: Solr searching performance issues, using large documents

2010-07-30 Thread Li Li
highlighting time is mainly spent on getting the field which you want to highlight and tokenizing this field (if you don't store term vectors). you can check what's wrong. 2010/7/30 Peter Spam : > If I don't do highlighting, it's really fast.  Optimize has no effect. > > -Peter > > On Jul 29, 2010, a

Re: Speed up Solr Index merging

2010-07-29 Thread Li Li
I faced this problem but can't find any good solution. But if you have large stored field such as full text of document. If you don't store it in lucene, it will be quicker because 2 merge indexes will force copy all fdts into a new fdt. If you store it externally. The problem you have to face is h

Is there a cache for a query?

2010-07-26 Thread Li Li
I want a cache that caches all results of a query (all steps, including collapse, highlight and facet). I read http://wiki.apache.org/solr/SolrCaching, but can't find a global cache. Maybe I can use an external cache to store key-value pairs. Is there one in solr?

Re: Problem with parsing date

2010-07-26 Thread Li Li
I use a format like yyyy-MM-ddTHH:mm:ssZ. it works 2010/7/26 Rafal Bluszcz Zawadzki : > Hi, > > I am using Data Import Handler from Solr 1.4. > > Parts of my data-config.xml are: > > >                        processor="XPathEntityProcessor" >                stream="false" >                forEach="
