How to set default query operator in surround query parser?

2011-12-08 Thread Jason, Kim
Hi, all

I'm using the surround query parser.
The request "A B" returns a ParseException, but "A OR B" returns correct results.
I think this is a problem with the default query operator.
Does anyone know how to set it?

Thanks,
Jason
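
As far as I can tell, the surround grammar has no implicit (default) operator, so adjacent bare terms like "A B" simply fail to parse; every relation has to be written out with OR, AND, NOT, W or N. A minimal SolrJ sketch of the explicit form, assuming SurroundQParserPlugin is registered and selected via defType=surround:

==
import org.apache.solr.client.solrj.SolrQuery;

public class SurroundExplicitOr {
    public static SolrQuery build(String a, String b) {
        SolrQuery q = new SolrQuery(a + " OR " + b);  // explicit operator instead of "A B"
        q.set("defType", "surround");                 // assumes the surround parser is registered
        return q;
    }
}
==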

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-set-default-query-operator-in-surround-query-parser-tp3570034p3570034.html


Is there a way to use a 1.4 index on 4.0 trunk?

2011-11-30 Thread Jason, Kim
Hello,
I'm using Solr 1.4.
I want to use some plugins from the trunk version,
but I get an IndexFormatTooOldException when trunk runs against the old index.
Is there a way to use a 1.4 index on 4.0 trunk?

Thanks,
Jason

--
View this message in context: 
http://lucene.472066.n3.nabble.com/is-there-a-way-using-1-4-index-at-4-0-trunk-tp3550430p3550430.html


applying SurroundQParserPlugin

2011-11-27 Thread Jason, Kim
Hi all
Is it possible to use SurroundQParserPlugin in Solr 1.4.0?
If so, how should I do it?

Thanks in advance
Jason


--
View this message in context: 
http://lucene.472066.n3.nabble.com/appling-SurroundQParserPlugin-tp3540283p3540283.html


server down caused by complex query

2011-11-24 Thread Jason, Kim
Hi all

Our Solr server frequently goes down these days because users send very long
and complex queries with asterisks and NEAR operators.
Sometimes the NEAR distance exceeds 1,000, and almost every keyword includes
an asterisk.
When such a query is sent to the server, the JVM heap fills up (we allocate
110 GB to the JVM), and after that the server is effectively down.

We also have an older K2 engine, and K2 does not go down on the same query;
K2 uses more I/O than memory.

Can we control Solr's memory usage, or is there any other solution?
(We are using Solr 1.4.)

Thanks in advance.
Jason

--
View this message in context: 
http://lucene.472066.n3.nabble.com/server-down-caused-by-complex-query-tp3535506p3535506.html


abort processing query

2011-11-09 Thread Jason, Kim
Hi all
We have very complex queries that include wildcards, which cause memory overhead.
Sometimes memory fills up and the server stops responding.
What I wonder is: when query processing time on the server exceeds a time
limit, can I abort the query?
If so, how should I do it?

Thanks in advance
Jason
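
One hedged option (not confirmed in this thread): Solr's timeAllowed request parameter, in milliseconds, asks the collector to stop and return partial results once the limit is reached. How much of the wildcard/rewrite work it covers in 1.4 is an assumption worth testing. A SolrJ sketch:

==
import org.apache.solr.client.solrj.SolrQuery;

public class TimeLimitedQuery {
    public static SolrQuery build(String userQuery) {
        SolrQuery q = new SolrQuery(userQuery);
        q.set("timeAllowed", 5000);  // stop collecting after ~5 seconds, return partial results
        return q;
    }
}
==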

--
View this message in context: 
http://lucene.472066.n3.nabble.com/abort-processing-query-tp3495876p3495876.html


Re: Question about near query order

2011-10-20 Thread Jason, Kim
If there is a performance difference, which performs better: setting
inOrder=false in solrconfig.xml, or querying with "A B"~1 AND "B A"~1?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Question-about-near-query-order-tp3427312p3437701.html


Re: Question about near query order

2011-10-18 Thread Jason, Kim
Thank you for your kind reply.

Is your second suggestion possible only with defType=lucene?
I'm using ComplexPhraseQueryParser,
so my defType is complexphrase.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Question-about-near-query-order-tp3427312p3431465.html


Re: Question about near query order

2011-10-18 Thread Jason, Kim
Thanks a ton iorixxx.

Jason.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Question-about-near-query-order-tp3427312p3432922.html


Re: Question about near query order

2011-10-17 Thread Jason, Kim
"analyze term"~2
"term analyze"~2

In my case, the two queries return different result sets.
Isn't that the case for you?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Question-about-near-query-order-tp3427312p3429916.html


Question about near query order

2011-10-16 Thread Jason, Kim
Hi, all

I have a near query like "analyze term"~2, which only matches in that order.
But I want to search regardless of order.
So far, I have just queried "analyze term"~2 OR "term analyze"~2.
Is there a better way than what I did?

Thanks in advance.
Jason.
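
A hedged alternative to the OR workaround, assuming the surround parser from the other threads here is available: its W() operator is an ordered near and N() an unordered one, so a single N query matches either order. A sketch (the distance 2 mirrors the ~2 above; the default search field is assumed):

==
import org.apache.solr.client.solrj.SolrQuery;

public class UnorderedNear {
    public static SolrQuery build() {
        SolrQuery q = new SolrQuery("2N(analyze, term)");  // within 2 positions, either order
        q.set("defType", "surround");                      // assumes SurroundQParserPlugin is registered
        return q;
    }
}
==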

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Question-about-near-query-order-tp3427312p3427312.html


Phrase search error

2011-10-15 Thread Jason, Kim
Hi, all

When I ran the phrase search "test mp3", I got the error below.
I think the problem is caused by WordDelimiterFilter.
In WordDelimiterFilter, 'mp3' is split into pos1:mp, pos2:(3, mp3).
In such a case, the positions of the subword and the catenated word are
incremented.
If this is not a phrase search, or the WordDelimiterFilterFactory options only
set catenateAll=1, there is no problem.
But if the WordDelimiterFilterFactory options are set as in 'My Schema.xml'
below, the error occurs.

How can I solve this problem?
Give me any idea.

Thanks in advance.
Jason


[Error Message]
==
Unknown query type org.apache.lucene.search.MultiPhraseQuery found in
phrase query string test mp3

java.lang.IllegalArgumentException: Unknown query type
org.apache.lucene.search.MultiPhraseQuery found in phrase query string
test mp3
    at org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:300)
    at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:307)
    at org.apache.lucene.search.Query.weight(Query.java:98)
    at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
    at org.apache.lucene.search.Searcher.search(Searcher.java:171)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
==


[My Schema.xml]
==
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="0" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterWithUnstemFilterFactory"
            language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="0" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>

Re: Phrase search error

2011-10-15 Thread Jason, Kim
Hi, Ludovic

That's just what I'm looking for.
You've been a big help.
Thank you so much.

Jason.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Phrase-search-error-tp3423799p3423916.html


how to improve query result time.

2011-07-03 Thread Jason, Kim
Hi All
I have complex phrase queries that include wildcards
(e.g. q="conn* pho*"~2 OR "inter* pho*"~2 OR ...),
and they take a long time to return results.
I tried reindexing after changing termIndexInterval to 8 to reduce query time
by loading more of the term index into memory.
I thought that would make queries faster, but it didn't.
I suspect most of the time is spent scanning the .frq/.prx files...
Any ideas for improving query response time?

I'm using Solr 1.4 and schema.xml is below.

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterWithUnstemFilterFactory"
            language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>

Thanks in advance

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-improve-query-result-time-tp3136554p3136554.html


SolrDocumentList in Distributed search

2011-06-26 Thread Jason, Kim
Hi All
I have 5 shards (sh01 ~ sh05) and was debugging with SolrJ.
When I query each shard individually, the results are right.
But when I query all shards together, the elementData of the SolrDocumentList
is null, even though its numFound is correct.
How can I get the SolrDocumentList from a distributed (sharded) query?

Thanks in Advance
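
For reference, a minimal SolrJ sketch of reading sharded results through the public API (QueryResponse.getResults() and the list's iterator) rather than the debugger's view of internal fields; host, core and field names are hypothetical:

==
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class ShardQueryExample {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr/sh01");
        SolrQuery q = new SolrQuery("*:*");
        q.set("shards", "localhost:8983/solr/sh01,localhost:8983/solr/sh02");  // ... sh05
        QueryResponse rsp = server.query(q);
        SolrDocumentList docs = rsp.getResults();
        System.out.println("numFound=" + docs.getNumFound() + " returned=" + docs.size());
        for (SolrDocument d : docs) {
            System.out.println(d.getFieldValue("id"));  // "id" field name is an assumption
        }
    }
}
==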

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrDocumentList-in-Distributed-search-tp3112580p3112580.html


Re: why too many open files?

2011-06-20 Thread Jason, Kim
Hi, Mark

I think the FileNotFoundException can be worked around by raising the ulimit.
I just want to know why more segments are created than mergeFactor.
While googling, I found a discussion of mergeFactor:
http://web.archiveorange.com/archive/v/bH0vUQzfYcdtZoocG2C9
Yonik wrote:
"mergeFactor 10 means a maximum of 10 segments at each level.
if maxBufferedDocs=10 with a log doc merge policy (equivalent to
Lucene in the old days), then you could have up to ~ 10*log10(nDocs)
segments in the index (i.e. up to 60 segments for a 1M doc index)."

But I don't understand this.
Can someone explain it to me in more detail?

Thanks
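
A hedged back-of-the-envelope reading of that quote, not an exact Lucene formula: a log merge policy keeps up to roughly mergeFactor segments per "level", and the number of levels grows with log_mergeFactor(totalDocs / docsPerFlush), so the live segment count can far exceed mergeFactor itself, and each non-compound segment adds ~8-9 files per open reader. The flush sizes below are assumptions:

==
public class SegmentEstimate {
    static long estimate(int mergeFactor, long totalDocs, long docsPerFlush) {
        long levels = Math.round(Math.log((double) totalDocs / docsPerFlush)
                / Math.log(mergeFactor)) + 1;
        return mergeFactor * levels;  // rough upper bound on live segments
    }

    public static void main(String[] args) {
        // Yonik's example: mergeFactor=10, flushes of 10 docs, 1M docs -> ~60 segments
        System.out.println(estimate(10, 1000000L, 10));
        // Assumed numbers for a 12-shard setup: ~25 segments/shard x ~9 files each
        // x 12 shards = ~2,700 descriptors, well past a default ulimit of 1024
        System.out.println(estimate(5, 10000000L, 10000) * 9 * 12);
    }
}
==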


--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-too-many-open-files-tp3084407p3085172.html


why too many open files?

2011-06-19 Thread Jason, Kim
Hi, All

I have 12 shards with ramBufferSizeMB=512 and mergeFactor=5.
But Solr raises java.io.FileNotFoundException (Too many open files).
mergeFactor is just 5. How can this happen?
Below are the segment files of one shard; there are far more segments than
mergeFactor.
What is wrong, and how should I set mergeFactor?

==
[root@solr solr]# ls indexData/multicore-us/usn02/data/index/
_0.fdt   _gs.fdt  _h5.tii  _hl.nrm  _i1.nrm  _kn.nrm  _l1.nrm  _lq.tii
_0.fdx   _gs.fdx  _h5.tis  _hl.prx  _i1.prx  _kn.prx  _l1.prx  _lq.tis
_3i.fdt  _gs.fnm  _h7.fnm  _hl.tii  _i1.tii  _kn.tii  _l1.tii 
lucene-2de7b31b5eabdff0b6ec7fd32eecf8c7-write.lock
_3i.fdx  _gs.frq  _h7.frq  _hl.tis  _i1.tis  _kn.tis  _l1.tis  _lu.fnm
_3s.fnm  _gs.nrm  _h7.nrm  _hn.fnm  _j7.fdt  _kp.fnm  _l2.fnm  _lu.frq
_3s.frq  _gs.prx  _h7.prx  _hn.frq  _j7.fdx  _kp.frq  _l2.frq  _lu.nrm
_3s.nrm  _gs.tii  _h7.tii  _hn.nrm  _kb.fnm  _kp.nrm  _l2.nrm  _lu.prx
_3s.prx  _gs.tis  _h7.tis  _hn.prx  _kb.frq  _kp.prx  _l2.prx  _lu.tii
_3s.tii  _gu.fnm  _h9.fnm  _hn.tii  _kb.nrm  _kp.tii  _l2.tii  _lu.tis
_3s.tis  _gu.frq  _h9.frq  _hn.tis  _kb.prx  _kp.tis  _l2.tis  _ly.fnm
_48.fdt  _gu.nrm  _h9.nrm  _hp.fnm  _kb.tii  _kq.fnm  _l6.fnm  _ly.frq
_48.fdx  _gu.prx  _h9.prx  _hp.frq  _kb.tis  _kq.frq  _l6.frq  _ly.nrm
_4d.fnm  _gu.tii  _h9.tii  _hp.nrm  _kc.fnm  _kq.nrm  _l6.nrm  _ly.prx
_4d.frq  _gu.tis  _h9.tis  _hp.prx  _kc.frq  _kq.prx  _l6.prx  _ly.tii
_4d.nrm  _gw.fnm  _hb.fnm  _hp.tii  _kc.nrm  _kq.tii  _l6.tii  _ly.tis
_4d.prx  _gw.frq  _hb.frq  _hp.tis  _kc.prx  _kq.tis  _l6.tis  _m3.fnm
_4d.tii  _gw.nrm  _hb.nrm  _hr.fnm  _kc.tii  _kr.fnm  _la.fnm  _m3.frq
_4d.tis  _gw.prx  _hb.prx  _hr.frq  _kc.tis  _kr.frq  _la.frq  _m3.nrm
_5b.fdt  _gw.tii  _hb.tii  _hr.nrm  _kf.fdt  _kr.nrm  _la.nrm  _m3.prx
_5b.fdx  _gw.tis  _hb.tis  _hr.prx  _kf.fdx  _kr.prx  _la.prx  _m3.tii
_5b.fnm  _gy.fnm  _he.fdt  _hr.tii  _kf.fnm  _kr.tii  _la.tii  _m3.tis
_5b.frq  _gy.frq  _he.fdx  _hr.tis  _kf.frq  _kr.tis  _la.tis  _m8.fnm
_5b.nrm  _gy.nrm  _he.fnm  _ht.fnm  _kf.nrm  _kt.fnm  _le.fnm  _m8.frq
_5b.prx  _gy.prx  _he.frq  _ht.frq  _kf.prx  _kt.frq  _le.frq  _m8.nrm
_5b.tii  _gy.tii  _he.nrm  _ht.nrm  _kf.tii  _kt.nrm  _le.nrm  _m8.prx
_5b.tis  _gy.tis  _he.prx  _ht.prx  _kf.tis  _kt.prx  _le.prx  _m8.tii
_5m.fnm  _h0.fnm  _he.tii  _ht.tii  _kg.fnm  _kt.tii  _le.tii  _m8.tis
_5m.frq  _h0.frq  _he.tis  _ht.tis  _kg.frq  _kt.tis  _le.tis  _md.fnm
_5m.nrm  _h0.nrm  _hh.fnm  _hv.fnm  _kg.nrm  _kw.fnm  _li.fnm  _md.frq
_5m.prx  _h0.prx  _hh.frq  _hv.frq  _kg.prx  _kw.frq  _li.frq  _md.nrm
_5m.tii  _h0.tii  _hh.nrm  _hv.nrm  _kg.tii  _kw.nrm  _li.nrm  _md.prx
_5m.tis  _h0.tis  _hh.prx  _hv.prx  _kg.tis  _kw.prx  _li.prx  _md.tii
_5n.fnm  _h2.fnm  _hh.tii  _hv.tii  _kj.fdt  _kw.tii  _li.tii  _md.tis
_5n.frq  _h2.frq  _hh.tis  _hv.tis  _kj.fdx  _kw.tis  _li.tis  _mi.fnm
_5n.nrm  _h2.nrm  _hk.fnm  _hz.fdt  _kj.fnm  _ky.fnm  _lm.fnm  _mi.frq
_5n.prx  _h2.prx  _hk.frq  _hz.fdx  _kj.frq  _ky.frq  _lm.frq  _mi.nrm
_5n.tii  _h2.tii  _hk.nrm  _hz.fnm  _kj.nrm  _ky.nrm  _lm.nrm  _mi.prx
_5n.tis  _h2.tis  _hk.prx  _hz.frq  _kj.prx  _ky.prx  _lm.prx  _mi.tii
_5x.fnm  _h5.fdt  _hk.tii  _hz.nrm  _kj.tii  _ky.tii  _lm.tii  _mi.tis
_5x.frq  _h5.fdx  _hk.tis  _hz.prx  _kj.tis  _ky.tis  _lm.tis  segments_1
_5x.nrm  _h5.fnm  _hl.fdt  _hz.tii  _kn.fdt  _l1.fdt  _lq.fnm  segments.gen
_5x.prx  _h5.frq  _hl.fdx  _hz.tis  _kn.fdx  _l1.fdx  _lq.frq
_5x.tii  _h5.nrm  _hl.fnm  _i1.fnm  _kn.fnm  _l1.fnm  _lq.nrm
_5x.tis  _h5.prx  _hl.frq  _i1.frq  _kn.frq  _l1.frq  _lq.prx
==

Thanks in advance.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-too-many-open-files-tp3084407p3084407.html


disable sort by score

2011-06-13 Thread Jason, Kim
Hi, All
I want to get search results that are not sorted by anything.
Sorting by score takes extra time, so I want to disable it.
How can I do this?

Thanks, Jason.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/disable-sort-by-score-tp3057767p3057767.html


Re: disable sort by score

2011-06-13 Thread Jason, Kim
Thanks for the reply, Erick!

Actually, I do need sorting by score.
I was just curious whether returning results without any sorting is possible.
Then I found
http://lucene.472066.n3.nabble.com/MaxRows-and-disabling-sort-td2260650.html
where Chris Hostetter-3 wrote:
++
http://wiki.apache.org/solr/CommonQueryParameters#sort
You can sort by index id using sort=_docid_ asc or sort=_docid_ desc

if you specify _docid_ asc then solr should return as soon as it finds the
first N matching results w/o scoring all docs (because no score will be
computed)
++

I tried to check performance using _docid_ asc,
but _docid_ didn't work in distributed search.
So I am asking whether there is another method.

Best
Jason
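
For reference, a minimal SolrJ sketch of the quoted suggestion; as noted above, the _docid_ sort may not behave in distributed search, and the fl value is an assumption:

==
import org.apache.solr.client.solrj.SolrQuery;

public class DocIdSortExample {
    public static SolrQuery build(String userQuery) {
        SolrQuery q = new SolrQuery(userQuery);
        q.set("sort", "_docid_ asc");  // index order, no score computation
        q.set("fl", "id");             // return only the id field (assumed field name)
        return q;
    }
}
==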

--
View this message in context: 
http://lucene.472066.n3.nabble.com/disable-sort-by-score-tp3057767p3061753.html


how the caches work, and improving performance of wildcard phrase queries

2011-05-18 Thread Jason, Kim
Hi, all

I have two questions.
First,
I'm wondering how filterCache, queryResultCache, and documentCache are applied.
After searching "query1 OR query2 OR query3 ...", I searched
"query0 OR query2 OR query3 ...".
Only query1 and query0 differ, but the second query was not any faster.
When do the caches apply?

Second,
I have 5 or more wildcard phrase queries per request, such as
"query1* query2*"~2 OR "query3* query4*"~2 ...
In the worst case there are more than 30 wildcard phrase queries in one
request, and QTime exceeds 60 seconds.

Please give me any ideas to improve performance.

I have a 2.5 million document full-text index,
running as 10 shards on one Tomcat.

Thanks,
Jason
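
On the first question, a hedged sketch: queryResultCache keys on the whole main query, so changing a single clause (query1 -> query0) misses it. Clauses that repeat across searches can be moved into fq, which is cached separately in filterCache; the clause strings below are hypothetical:

==
import org.apache.solr.client.solrj.SolrQuery;

public class FilterCacheExample {
    public static SolrQuery build(String changingClause) {
        SolrQuery q = new SolrQuery(changingClause);  // the part that changes (query1 / query0)
        q.addFilterQuery("query2 OR query3");         // stable part, served from filterCache on repeat
        return q;
    }
}
==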

--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-work-cache-and-improve-performance-phrase-query-included-wildcard-tp2956671p2956671.html


search problem after using EdgeNGramFilter

2010-12-09 Thread Jason, Kim

I am using EdgeNGramFilter for wildcard search.
But the search result is the same whether or not the term is followed by an
asterisk.
When I search without an asterisk, I only want to match the original terms
(not the n-gram terms).

[example]
- doc1 : enterprise search server
- doc2 : enter key

When I query 'enter*', both doc1 and doc2 are retrieved.
That's OK.
But when I query 'enter', both doc1 and doc2 are also retrieved,
and I only want doc2.

How can I do this?
Please help!
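
One hedged approach, assuming two fields exist (e.g. via copyField): "body" analyzed without n-grams and "body_ngram" with EdgeNGram at index time. The client picks the field, so a bare term only hits the original tokens; the field names are assumptions:

==
import org.apache.solr.client.solrj.SolrQuery;

public class PrefixOrExact {
    public static SolrQuery build(String userInput) {
        SolrQuery q = new SolrQuery();
        if (userInput.endsWith("*")) {
            String prefix = userInput.substring(0, userInput.length() - 1);
            q.setQuery("body_ngram:" + prefix);  // n-gram field replaces the trailing wildcard
        } else {
            q.setQuery("body:" + userInput);     // plain field, original terms only
        }
        return q;
    }
}
==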

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/search-problem-after-using-EdgeNGramFilter-tp2060966p2060966.html


Re: search problem after using EdgeNGramFilter

2010-12-09 Thread Jason, Kim

Hi, iorixxx
I thought I had to use NGramFilter for wildcard search,
but that was the wrong idea.
Thanks, iorixxx
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/search-problem-after-using-EdgeNGramFilter-tp2060966p2061961.html


Using Ngram and Phrase search

2010-11-29 Thread Jason, Kim

Hi, all
I want to use both EdgeNGram analysis and phrase search,
but there is a problem.

On a field that does not use EdgeNGram analysis, phrase search works well.
But with EdgeNGram, phrase search is incorrect.

I'm using Solr 1.4.0.
The result of EdgeNGram analysis for "pci express" is below:
http://lucene.472066.n3.nabble.com/file/n1986848/before.jpg 

I thought the cause was the term positions,
so I modified the EdgeNGramTokenFilter of lucene-analyzers-2.9.1.
After the modification, the result is below:
http://lucene.472066.n3.nabble.com/file/n1986848/after.jpg 

Now a phrase search for "pci express" against the n-gram index works well,
but another problem appeared.

For example, when I search the phrase query "pc express", documents containing
'pci express' are matched too.
In this case I don't want to match 'pci express';
I just want an exact match of "pc express".

Please give me your ideas.
Thanks,
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-Ngram-and-Phrase-search-tp1986848p1986848.html


Re: how can i use solrj binary format for indexing?

2010-10-21 Thread Jason, Kim

Hi Gora, I really appreciate it.
Your reply was a great help to me. :)
I hope everything is fine with you.

Regards,
Jason




Gora Mohanty-3 wrote:
 
 On Mon, Oct 18, 2010 at 8:22 PM, Jason, Kim hialo...@gmail.com wrote:
 
 Sorry for the delay in replying. Was caught up in various things this
 week.
 
 Thank you for reply, Gora

 But I still have several questions.
 Did you use separate index?
 If so, you indexed 0.7 million Xml files per instance
 and merged it. Is it Right?
 
 Yes, that is correct. We sharded the data by user ID, so that each of the
 25
 cores held approximately 0.7 million out of the 3.5 million records. We
 could
 have used the sharded indices directly for search, but at least for now
 have
 decided to go with a single, merged index.
 
 Please let me know how to work multiple instances and cores in your case.
 [...]
 
 * Multi-core Solr setup is quite easy, via configuration in solr.xml:
   http://wiki.apache.org/solr/CoreAdmin . The configuration, i.e.,
   schema, solrconfig.xml, etc. need to be replicated across the
   cores.
 * Decide which XML files you will post to which core, and do the
   POST with curl, as usual. You might need to write a little script
   to do this.
 * After indexing on the cores is done, make sure to do a commit
   on each.
 * Merge the sharded indexes (if desired) as described here:
   http://wiki.apache.org/solr/MergingSolrIndexes . One thing to
   watch out for here is disk space. When merging with Lucene
   IndexMergeTool, we found that a rough rule of thumb was that
   intermediate steps in the merge would require about twice as
   much space as the total size of the indexes to be merged. I.e.,
   if one is merging 40GB of data in sharded indexes, one should
   have at least 120GB free.
 
 Regards,
 Gora
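
A hedged sketch of the merge step described above, using Lucene's IndexMergeTool from the misc contrib jar (see the MergingSolrIndexes wiki page quoted above); the paths are hypothetical, the first argument is the target index and the rest are the source shards:

==
import org.apache.lucene.misc.IndexMergeTool;

public class MergeShards {
    public static void main(String[] args) throws Exception {
        IndexMergeTool.main(new String[] {
            "/data/merged/index",   // target (note the ~2x free disk rule of thumb above)
            "/data/core01/index",
            "/data/core02/index"
        });
    }
}
==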
 
 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1750669.html


Re: how can i use solrj binary format for indexing?

2010-10-18 Thread Jason, Kim

Hi, Gora
I haven't yet tried indexing a huge amount of XML files through curl or pure
Java (like post.jar).
Is indexing through XML really fast?
How many files did you index, and how did you do it (using curl or pure Java)?

Thanks, Gora
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1724645.html


Re: how can i use solrj binary format for indexing?

2010-10-18 Thread Jason, Kim

Thank you for the reply, Gora.

But I still have several questions.
Did you use separate indexes?
If so, you indexed 0.7 million XML files per instance
and then merged them. Is that right?
Please let me know how multiple instances and cores work in your case.

Regards,
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1725679.html


how can i use solrj binary format for indexing?

2010-10-17 Thread Jason, Kim

Hi all
I have a huge amount of XML files to index.
I want to index using the SolrJ binary format to gain performance,
because I heard that indexing with XML files is quite slow.
But I don't know how to index through the SolrJ binary format and can't find
any examples.
Please give me some help.
Thanks,
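
A hedged SolrJ 1.4-era sketch of what this might look like: switch the client's request writer from XML to the binary (javabin) format and send SolrInputDocuments. The URL and field names are hypothetical:

==
import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BinaryIndexing {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        server.setRequestWriter(new BinaryRequestWriter());  // javabin updates instead of XML

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("text", "hello javabin");
        server.add(doc);
        server.commit();
    }
}
==
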
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1722612.html


About setting solrconfig.xml

2010-10-12 Thread Jason, Kim

Hi, all.
I have a question about solrconfig.xml.
I index documents with 10 fields
(suppose the field names are f1, f2, ..., f10).
One user will want to search in fields f1 and f5;
another user will want to search in fields f2, f3 and f7.

I am going to use the dismax handler for this.
How should I write a dismax handler to satisfy these various needs?
Please give me any idea or an example.

(I know dismax's qf parameter limits which fields are searched.
Should I write a dismax handler for every case?
That seems wrong. What should I do?)

Thanks in advance.
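
A hedged sketch of one way to avoid a handler per case: qf is an ordinary request parameter, so a single dismax handler can serve every field combination by overriding qf per request. The field lists come from the post; the rest is an assumption:

==
import org.apache.solr.client.solrj.SolrQuery;

public class PerUserDismax {
    public static SolrQuery forUser(String userQuery, String qf) {
        SolrQuery q = new SolrQuery(userQuery);
        q.set("defType", "dismax");
        q.set("qf", qf);  // e.g. "f1 f5" for one user, "f2 f3 f7" for another
        return q;
    }
}
==
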
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/About-setting-solrconfig-xml-tp1691836p1691836.html


Re: Solr, c/s type ?

2010-09-08 Thread Jason, Kim

I'd just like to use Solr in-house, in an application that is not a web
application.
But I don't know how I should do this.
Thanks,
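
A hedged sketch of one way to run Solr inside a plain (non-web) Java process with SolrJ's EmbeddedSolrServer; the bootstrap API differs between versions, and this assumes the CoreContainer.Initializer style used around Solr 1.4, with a hypothetical solr home path and core name:

==
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedExample {
    public static void main(String[] args) throws Exception {
        System.setProperty("solr.solr.home", "/path/to/solr/home");  // hypothetical path
        CoreContainer container = new CoreContainer.Initializer().initialize();
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "core0");  // assumed core name
        System.out.println(server.query(new SolrQuery("*:*")).getResults().getNumFound());
        container.shutdown();
    }
}
==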

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-c-s-type-tp1392952p1444175.html