API support upload file for External File Field

2015-06-17 Thread Floyd Wu
Is there any API that supports uploading a file for ExternalFileField to the /data/ directory, or any good practice for this? My application and the Solr server are physically separated, on two different machines. The application calculates a score and generates a file for ExternalFileField. Thanks for any input.
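A minimal sketch of the application-side half of this, assuming (as I understand the ExternalFileField convention) that Solr reads a plain-text file named external_<fieldname> from the index data directory with one key=value pair per line. The function names, the field name rankingField and the sample keys are hypothetical; how the file gets to the Solr machine (scp, rsync, a shared mount) is left to the deployment and is not shown.

```python
def format_external_file(scores):
    """Render {doc_key: score} as 'key=value' lines.

    Lines are sorted by key; sorting is believed to help lookup
    speed but is an assumption here, not a requirement stated in the thread.
    """
    lines = [f"{key}={float(value)}" for key, value in sorted(scores.items())]
    return "\n".join(lines) + "\n"


def external_file_name(field_name):
    """Assumed naming convention: external_<fieldname>."""
    return f"external_{field_name}"


# Hypothetical sample data produced by the application.
sample_scores = {"doc2": 0.5, "doc1": 1.25}
content = format_external_file(sample_scores)
file_name = external_file_name("rankingField")
```

The application would write `content` to a file called `file_name` and then copy it into the data directory of the Solr core.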

Push ExternalFileField to Solr

2015-05-19 Thread Floyd Wu
Hi, I have two physical servers that run my application and Solr. I use an external file field for search result ranking. According to the wiki page, external file field data needs to reside in the {solr}\data directory. Because the EFF data is generated by my application, how can I push this file to

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-20 Thread Floyd Wu
Will these awesome features be implemented in Solr soon? On 2014/6/20 at 10:43 PM, Yonik Seeley yo...@heliosearch.com wrote: On Fri, Jun 20, 2014 at 10:15 AM, Yago Riveiro yago.rive...@gmail.com wrote: Yonik, This native code uses in any way the docValues? Nope... not yet. It is something I

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-20 Thread Floyd Wu
Hi Yonik, I don't understand the relationship between Solr and Heliosearch, since you were a committer of Solr. I'm just curious. On 2014/6/21 at 12:07 AM, Yonik Seeley yo...@heliosearch.com wrote: On Fri, Jun 20, 2014 at 11:16 AM, Floyd Wu floyd...@gmail.com wrote: Will these awesome features being

Re: What is the best approach to send lots of XML Messages to Solr to build index?

2014-06-16 Thread Floyd Wu
Hi Mikhail, Thanks for your suggestions. Floyd 2014-06-16 17:28 GMT+08:00 Mikhail Khludnev mkhlud...@griddynamics.com: On Mon, Jun 16, 2014 at 6:57 AM, Floyd Wu floyd...@gmail.com wrote: Hi Mikhail, What are the pros of disabling tlog? It consumes the heap much providing the benefits (real

What is the best approach to send lots of XML Messages to Solr to build index?

2014-06-15 Thread Floyd Wu
Hi, I have many XML message files formatted like this https://wiki.apache.org/solr/UpdateXmlMessages These files are generated by my index builder daily. Currently I am sending these files through HTTP POST to Solr, but sometimes I hit an OOM exception or too many pending tlogs. Do you have better way

Re: What is the best approach to send lots of XML Messages to Solr to build index?

2014-06-15 Thread Floyd Wu
://www.solr-start.com/ - Accelerating your Solr proficiency On Sun, Jun 15, 2014 at 3:44 PM, Floyd Wu floyd...@gmail.com wrote: Hi, I have many XML message files formatted like this https://wiki.apache.org/solr/UpdateXmlMessages These files are generated by my index builder daily. Currently I

Re: What is the best approach to send lots of XML Messages to Solr to build index?

2014-06-15 Thread Floyd Wu
roughly ten huge files in parallel is a way to perform well. Once again, nuke tlog. On Sun, Jun 15, 2014 at 12:44 PM, Floyd Wu floyd...@gmail.com wrote: Hi, I have many XML message files formatted like this https://wiki.apache.org/solr/UpdateXmlMessages These files are generated by my

Re: What is the best approach to send lots of XML Messages to Solr to build index?

2014-06-15 Thread Floyd Wu
, 2014 at 12:44 PM, Floyd Wu floyd...@gmail.com wrote: Hi, I have many XML message files formatted like this https://wiki.apache.org/solr/UpdateXmlMessages These files are generated by my index builder daily. Currently I am sending these files through HTTP POST to Solr, but sometimes I

Re: What is the best approach to send lots of XML Messages to Solr to build index?

2014-06-15 Thread Floyd Wu
Hi Shawn, I've tried setting a 4GB heap for Solr; the OOM exceptions are really reduced and performance has also improved. Floyd 2014-06-16 0:00 GMT+08:00 Shawn Heisey s...@elyograg.org: On 6/15/2014 2:54 AM, Floyd Wu wrote: Thank you Alex. I'm doing a commit every 100 files. Maybe

Re: ranking retrieval measure

2014-04-01 Thread Floyd Wu
Usually an IR system is measured using precision and recall, but it depends on what kind of system you are developing and for what scenario. Take a look at http://en.wikipedia.org/wiki/Precision_and_recall 2014-04-01 10:23 GMT+08:00 azhar2007 azhar2...@outlook.com: Hi people. Ive developed a search
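For readers unfamiliar with the two measures named in this reply, a small sketch of how they are usually computed for a single query. The document ids are made up for illustration, and treating an empty result set as precision 0.0 is a choice of this sketch, not something stated in the thread.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 documents truly relevant.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
```

Here two of the four returned documents are relevant, and two of the three relevant documents were found.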

Re: how to index 20 MB plain-text xml

2014-03-31 Thread Floyd Wu
of pushing 5. Increase amount of memory to Solr (-X command line flags) Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Mon, Mar 31, 2014 at 12:00 PM, Floyd Wu floyd...@gmail.com wrote: I

Re: how to index 20 MB plain-text xml

2014-03-31 Thread Floyd Wu
, I'd say letting a user hit Solr directly is a bad thing - especially a user who doesn't know the details of how Solr works. Upayavira On Mon, Mar 31, 2014, at 07:17 AM, Floyd Wu wrote: Hi Alex, Thanks for your response. Personally I don't want to feed these big XML files to Solr. But users

how to index 20 MB plain-text xml

2014-03-30 Thread Floyd Wu
I have many plain-text XML files that I convert to the Solr XML format. But every time I send them to Solr, I hit an OOM exception. How do I configure Solr to handle these big XML files? Please point me in the right direction. Thanks floyd

DocValues usage and scenarios?

2013-11-20 Thread Floyd Wu
Hi there, I don't fully understand what kind of use cases DocValues are for. When I set docValues=true on a field, do I need to change anything in the XML that I send to Solr for indexing? Please point me in the right direction. Thanks Floyd PS: I've googled and read lots of DocValues discussions but am still confused.

Re: DocValues usage and scenarios?

2013-11-20 Thread Floyd Wu
=solr.SchemaCodecFactory/ in the solrconfig.xml and the docValuesFormat=true on the fieldType definition. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, November 20, 2013 at 9:38 AM, Floyd Wu wrote: Hi there, I don't fully understand what kind

Re: DocValues usage and scenarios?

2013-11-20 Thread Floyd Wu
/02/fun-with-docvalues-in-solr-4-2/ -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, November 20, 2013 at 10:15 AM, Floyd Wu wrote: Hi Yago, Thanks for your reply. I once thought that the DocValues feature was a way for me to store some extra values. May

Re: Lots of tlog files remained, why?

2013-11-05 Thread Floyd Wu
? How big are your tlog files? Details matter. Best, Erick On Sun, Nov 3, 2013 at 10:03 AM, Floyd Wu floyd...@gmail.com wrote: After re-indexing 2 XML files and committing and optimizing many times, I still have many tlog files in the data/tlog directory. Why? How to remove those

Lots of tlog files remained, why?

2013-11-03 Thread Floyd Wu
After re-indexing 2 XML files and committing and optimizing many times, I still have many tlog files in the data/tlog directory. Why? How do I remove those files (delete them directly or just ignore them)? What is the difference if tlog files exist or not? Please kindly guide me. Thanks Floyd

Re: How to avoid underscore sign indexing problem?

2013-08-22 Thread Floyd Wu
indexing problem? On 8/21/2013 7:54 PM, Floyd Wu wrote: When using StandardAnalyzer to tokenize the string Pacific_Rim, I get (ST) text=pacific_rim, raw_bytes=[70 61 63 69 66 69 63 5f 72 69 6d], start=0, end=11, type=ALPHANUM, position=1. How can this string be tokenized into these two tokens: Pacific

Re: How to avoid underscore sign indexing problem?

2013-08-22 Thread Floyd Wu
. Although this decreases search quality a little, users need a higher recall rate more than precision. Thank you all. Floyd 2013/8/22 Floyd Wu floyd...@gmail.com After trying some search cases and different parameter combinations of WordDelimiter, I wonder what is the best strategy to index string

How to avoid underscore sign indexing problem?

2013-08-21 Thread Floyd Wu
When using StandardAnalyzer to tokenize the string Pacific_Rim, I get (ST) text=pacific_rim, raw_bytes=[70 61 63 69 66 69 63 5f 72 69 6d], start=0, end=11, type=ALPHANUM, position=1. How can this string be tokenized into the two tokens Pacific and Rim? Set _ as a stopword? Please kindly help on this. Many thanks.
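To make the desired outcome concrete, a small sketch of the splitting behaviour being asked for. This only illustrates the target tokens in plain Python; it is not Solr's analyzer. In Solr the equivalent would presumably live in the field's analysis chain (for example replacing underscores before tokenizing, or a word-delimiter style filter as the follow-up messages discuss), and the helper name below is hypothetical.

```python
import re


def split_on_underscore(text):
    """Split a string on runs of underscores, dropping empty pieces."""
    return [piece for piece in re.split(r"_+", text) if piece]


tokens = split_on_underscore("Pacific_Rim")
lowercased = [token.lower() for token in tokens]
```

Lowercasing afterwards mirrors the lowercase token shown in the analysis output above.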

Re: How to avoid underscore sign indexing problem?

2013-08-21 Thread Floyd Wu
? On 8/21/2013 7:54 PM, Floyd Wu wrote: When using StandardAnalyzer to tokenize the string Pacific_Rim, I get (ST) text=pacific_rim, raw_bytes=[70 61 63 69 66 69 63 5f 72 69 6d], start=0, end=11, type=ALPHANUM, position=1. How can this string be tokenized into the two tokens Pacific and Rim? Set

Switch to new leader transparently?

2013-07-10 Thread Floyd Wu
Hi there, I've built a SolrCloud cluster from the example, but I have some questions. When I send a query to one leader (say http://xxx.xxx.xxx.xxx:8983/solr/collection1) there is no problem and everything is fine. When I shut down that leader, the other replica(

Re: Switch to new leader transparently?

2013-07-10 Thread Floyd Wu
gone away. Also, ZK aware SolrJ Java client that load-balances across all nodes in cluster. On Wed, Jul 10, 2013 at 2:52 PM, Floyd Wu floyd...@gmail.com wrote: Hi there, I've built a SolrCloud cluster from the example, but I have some questions. When I send a query to one leader (say http

Re: Switch to new leader transparently?

2013-07-10 Thread Floyd Wu
away. Also, ZK aware SolrJ Java client that load-balances across all nodes in cluster. On Wed, Jul 10, 2013 at 2:52 PM, Floyd Wu floyd...@gmail.com wrote: Hi there, I've built a SolrCloud cluster from the example, but I have some questions. When I send a query to one leader (say

Re: Switch to new leader transparently?

2013-07-10 Thread Floyd Wu
KAMACI furkankam...@gmail.com wrote: By the way, this is not related to your question, but this may help you connect to Solr via C#: http://solrsharp.codeplex.com/ 2013/7/10 Floyd Wu floyd...@gmail.com Hi Furkan, I'm using C#, so SolrJ won't help here, but its implementation is a good reference

Re: PostingsSolrHighlighter not working on Multivalue field

2013-06-23 Thread Floyd Wu
Does anyone have an idea that can help with this? 2013/6/22 Erick Erickson erickerick...@gmail.com Unfortunately, from here I need to leave it to people who know the highlighting code Erick On Wed, Jun 19, 2013 at 8:40 PM, Floyd Wu floyd...@gmail.com wrote: Hi Erick, multivalue is my typo, thanks

Re: PostingsSolrHighlighter not working on Multivalue field

2013-06-19 Thread Floyd Wu
. Anything in the logs? What is the field definition? Did you re-index after changing to multiValued? Best Erick On Tue, Jun 18, 2013 at 11:01 PM, Floyd Wu floyd...@gmail.com wrote: In my test case, it seems this new highlighter is not working. When a field is set to multivalue=true, the stored

PostingsSolrHighlighter not working on Multivalue field

2013-06-18 Thread Floyd Wu
In my test case, it seems this new highlighter is not working. When a field is set to multivalue=true, the stored text in this field cannot be highlighted. Am I missing something, or is this a current limitation? I had no luck finding any documentation that mentions this. Floyd

Re: Slow Highlighter Performance Even Using FastVectorHighlighter

2013-06-17 Thread Floyd Wu
Hi Michael, How do I configure the PostingsHighlighter with my Solr 4.2 box? Please kindly point me in the right direction. Many thanks. On 2013/6/15 at 10:48 PM, Michael McCandless luc...@mikemccandless.com wrote: You could also try the new[ish] PostingsHighlighter:

Re: Very slow query when boosting involve with ExternalFileField

2013-03-21 Thread Floyd Wu
Can anybody point me in a direction? Many thanks. 2013/3/20 Floyd Wu floyd...@gmail.com Hi everyone, I have a problem and have had no luck figuring it out. When I issue a query to Query 1 http://localhost:8983/solr/select?q={!boost+b=recip(ms(NOW/HOUR,last_modified_datetime),3.16e-11,1,1

Re: difference these two queries

2012-12-10 Thread Floyd Wu
the query cache. Otis -- SOLR Performance Monitoring - http://sematext.com/spm/index.html On Mon, Dec 10, 2012 at 10:11 PM, Floyd Wu floyd...@gmail.com wrote: Hi There, Sorry for spamming if this question has already been asked. What's the main difference between q=fieldA:value

Difference between 'bf' and 'boost' when using eDismax

2012-12-03 Thread Floyd Wu
Hi there, I'm not sure if I understand this clearly. Does 'bf' mean that the value returned by bf is added to the score? For example: score + bf = final score. Does 'boost' mean that the score is multiplied by the value returned by boost? For example: score * boost = final score. When using both( 'bf' and
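The two readings in the question can be written out as a tiny sketch. It only illustrates the additive versus multiplicative arithmetic described above, under the simplifying assumption that no query normalization or other weighting is applied; real Solr scores need not match these numbers exactly. The function names and sample values are hypothetical.

```python
def apply_bf(score, bf_value):
    """Additive reading: the function value is added to the relevance score."""
    return score + bf_value


def apply_boost(score, boost_value):
    """Multiplicative reading: the relevance score is multiplied by the function value."""
    return score * boost_value


# Hypothetical values: a base relevance score and a function result.
base_score = 2.0
function_value = 0.5

final_with_bf = apply_bf(base_score, function_value)
final_with_boost = apply_boost(base_score, function_value)
```

With these sample values the additive form raises the score while the multiplicative form halves it, which is why the two parameters can rank documents differently.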

Re: Difference between 'bf' and 'boost' when using eDismax

2012-12-03 Thread Floyd Wu
- From: Floyd Wu Sent: Monday, December 03, 2012 11:00 PM To: solr-user@lucene.apache.org Subject: Difference between 'bf' and 'boost' when using eDismax Hi there, I'm not sure if I understand this clearly. Does 'bf' mean that the value returned by bf is added to the score? For example: score

Re: Dynamic ranking based on search term

2012-11-28 Thread Floyd Wu
, Floyd Wu wrote: Hi there, If I have a list of key-value pairs in a text file or database table, how do I achieve dynamic ranking based on the search term? That is, when a user searches for the term java, doc1, doc2 and doc5 should get a higher ranking. for example( key is search term, value is related

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-20 Thread Floyd Wu
Hi Chris, Thanks! Before your great suggestions, I had given up using a function query to calculate the product of score and rankingField, and was using exactly the same approach as your boost query solution. Of course it works fine. The next step will be to design a suitable function to output a ranking value that also

Re: Custom ranking solutions?

2012-11-20 Thread Floyd Wu
,score desc&fl=score,_score_:product(query($q),2),[explain] Cheers, Dan On Tue, Nov 20, 2012 at 2:29 AM, Floyd Wu floyd...@gmail.com wrote: Hi there, Before ExternalFileField was introduced, I changed the document boost value to achieve custom ranking. My client app will update each boost value

Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Floyd Wu
Hi there, I have a field (an externalFileField called rankingField) whose value (type=float) is calculated by the client app. In Solr's original scoring model, changing the boost value results in a different ranking. So I think product(score,rankingField) may be equivalent to the Solr scoring model.

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Floyd Wu
when there is a tie in ranking (two docs have the same rank value) 1. the reverse of 2. Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Mon, Nov 19, 2012 at 9:40 PM, Floyd Wu floyd...@gmail.com wrote

Re: Custom ranking solutions?

2012-11-19 Thread Floyd Wu
://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Mon, Nov 19, 2012 at 9:29 PM, Floyd Wu floyd...@gmail.com wrote: Hi there, Before ExternalFileField was introduced, I changed the document boost value to achieve custom ranking. My client app

Re: Ranking by sorting score and rankingField better or by product(score, rankingField)?

2012-11-19 Thread Floyd Wu
an error) Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Mon, Nov 19, 2012 at 10:16 PM, Floyd Wu floyd...@gmail.com wrote: Thanks Otis, But the sort=product(score, rankingField) is not working

Re: Custom ranking solutions?

2012-11-19 Thread Floyd Wu
://sematext.com/search-analytics/index.html On Mon, Nov 19, 2012 at 9:29 PM, Floyd Wu floyd...@gmail.com wrote: Hi there, Before ExternalFileField was introduced, I changed the document boost value to achieve custom ranking. My client app will update each boost value for documents daily and seem

Re: BM25 model for solr 4?

2012-11-15 Thread Floyd Wu
is 10 million books where we index the entire book. These are extremely long documents compared to most IR research. I'd love to hear about actual (non-research) production implementations that have tested the new ranking models available in Solr. Tom On Wed, Nov 14, 2012 at 9:16 PM, Floyd

BM25 model for solr 4?

2012-11-14 Thread Floyd Wu
Hi there, Can anybody kindly tell me how to set up Solr to use BM25? By the way, are there any experiments or research comparing BM25 and the classical VSM model in terms of recall/precision? Thanks in advance.
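As background for the question, a sketch of the per-term BM25 score in its commonly cited form, so the k1 and b parameters have something concrete to refer to. This is the textbook formula with a Lucene-style smoothed idf, written from memory as an illustration; it is not taken from this thread and may differ in detail from what a given Solr version computes. In Solr 4 the model is, as far as I know, selected through a similarity element in schema.xml (solr.BM25SimilarityFactory), which is a configuration matter not shown here. All names and default values below are assumptions.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, num_docs, doc_freq, k1=1.2, b=0.75):
    """Score contribution of one query term in one document.

    tf          : term frequency in the document
    doc_len     : length of the document (in terms)
    avg_doc_len : average document length in the collection
    num_docs    : number of documents in the collection
    doc_freq    : number of documents containing the term
    """
    # Smoothed inverse document frequency (always positive).
    idf = math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: longer-than-average documents are penalized via b.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)


# Hypothetical numbers: the term occurs once in an average-length document.
example = bm25_term_score(tf=1, doc_len=10, avg_doc_len=10, num_docs=10, doc_freq=1)
```

For a document of exactly average length with tf = 1, the frequency part of the formula equals 1, so the score reduces to the idf alone.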

Re: Limit the SolR acces from the web for one user-agent?

2012-11-08 Thread Floyd Wu
Hi Alex, I'd like to know how to use client and server certificates to protect the connection, and how to embed those certificates into clients. Please kindly share your experience. Floyd 2012/11/8 Alexandre Rafalovitch arafa...@gmail.com It is very easy to do this on Apache, but you need to

Remove underscore char when indexing and query problem

2012-03-02 Thread Floyd Wu
Hi there, I have a document and its title is 20111213_solr_apache conference report. When I use the analysis web interface to see exactly what tokens Solr produces, the following is the result: term text: 20111213_solr, apache, conference, report; term type: NUM, ALPHANUM, ALPHANUM, ALPHANUM. Why 20111213_solr

Re: Separate ACL and document index

2011-11-23 Thread Floyd Wu
ACL will need to re-build index with document content. It makes no sense to rebuild when I only change the ACL. Any ideas? Or am I just misunderstanding these patches? Floyd 2011/11/23 Floyd Wu floyd...@gmail.com: Hi there, Is it possible to separate ACL index and document index and achieve

Re: Separate ACL and document index

2011-11-23 Thread Floyd Wu
or groups.  I've seen it work ok with up to 1000 or so ACLs per user query.  So you build that filter query from the client using some external database to lookup user ACLs before sending request to SOLR. Bob On Tue, Nov 22, 2011 at 10:48 PM, Floyd Wu floyd...@gmail.com wrote: Hi
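The approach Bob describes (look up the user's ACLs in an external store, then send them as a filter query) could be sketched roughly as below. The field name acl, the sample ACL values and the helper name are hypothetical, and the external database lookup itself is left out.

```python
def acl_filter_query(acls, field="acl"):
    """Build a Solr fq string that matches documents carrying any of the given ACLs.

    Each value is quoted, with backslashes and double quotes escaped,
    so that ACL names containing special characters stay a single term.
    """
    quoted = []
    for acl in acls:
        escaped = acl.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append(f'"{escaped}"')
    return f"{field}:(" + " OR ".join(quoted) + ")"


# Hypothetical ACLs looked up for the current user before querying Solr.
fq = acl_filter_query(["group_a", "user_42"])
```

The resulting string would be sent as the fq parameter alongside the user's query, so the document index does not need to be rebuilt just because a user's group memberships change.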

Separate ACL and document index

2011-11-22 Thread Floyd Wu
Hi there, Is it possible to separate the ACL index from the document index and still search by user role in SOLR? Currently my implementation indexes the ACL with the document, but the document itself changes frequently. I have to rebuild the index every time the ACL changes. It's heavy for whole

Re: Replicating Large Indexes

2011-10-31 Thread Floyd Wu
Hi Jason, I'm very curious about how you build (rebuild) such a big index efficiently. Sorry for hijacking this topic. Floyd 2011/11/1 Jason Biggin jbig...@hipdigital.com: Wondering if anyone has experience with replicating large indexes. We have a Solr deployment with 1 master, 1

Re: Want to support did you mean xxx but is Chinese

2011-10-24 Thread Floyd Wu
/10/21 Floyd Wu floyd...@gmail.com Does anybody know how to implement this idea in SOLR? Please kindly point me in a direction. For example, when a user enters a keyword in Chinese 貝多芬 (this is Beethoven in Chinese) but keys in a wrong combination of characters 背多分 (this is pronounced the same

Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Floyd Wu
Hi there, There are many English soundex implementations that can be referenced, but I wonder how to do a Chinese soundex (sounds like) filter (maybe). Any ideas? Floyd

Re: Does anybody has experience in Chinese soundex(sounds like) of SOLR?

2011-10-20 Thread Floyd Wu
. -- Ken From: Floyd Wu floyd...@gmail.com To: solr-user@lucene.apache.org Sent: Thursday, October 20, 2011 5:43 AM Subject: Does anybody has experience in Chinese soundex(sounds like) of SOLR? Hi there, There are many English soundex implementations that can be referenced, but I wonder how to do

Want to support did you mean xxx but is Chinese

2011-10-20 Thread Floyd Wu
Does anybody know how to implement this idea in SOLR? Please kindly point me in a direction. For example, when a user enters a keyword in Chinese 貝多芬 (this is Beethoven in Chinese) but keys in a wrong combination of characters 背多分 (this is pronounced the same as the previous keyword 貝多芬). There in

Re: How to make a valid date facet query?

2011-07-26 Thread Floyd Wu
, 2011 at 1:23 AM, Floyd Wu floyd...@gmail.com wrote: Hi all, I need to make a date faceted query and I tried to use facet.range but can't get the result I need. I want to make 4 facets like the following: 1 month, 3 months, 6 months, more than 1 year. The onlinedate field in schema.xml like

How to make a valid date facet query?

2011-07-25 Thread Floyd Wu
Hi all, I need to make a date faceted query and I tried to use facet.range but can't get the result I need. I want to make 4 facets like the following: 1 month, 3 months, 6 months, more than 1 year. The onlinedate field in schema.xml looks like this: <field name="onlinedate" type="tdate" indexed="true" stored="true"/> I
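One way such uneven buckets are often expressed is with one facet.query per bucket using Solr date math, rather than facet.range. The sketch below only builds the request parameters; it assumes the first three buckets mean "within the last N months" (so they overlap) and the last one means "older than one year". That reading of the four facets, the q=*:* query and the helper name are assumptions, not details from the post.

```python
from urllib.parse import urlencode


def date_facet_params(field="onlinedate"):
    """Return request parameters with one facet.query per date bucket."""
    buckets = [
        f"{field}:[NOW-1MONTH TO NOW]",   # within the last month
        f"{field}:[NOW-3MONTHS TO NOW]",  # within the last 3 months
        f"{field}:[NOW-6MONTHS TO NOW]",  # within the last 6 months
        f"{field}:[* TO NOW-1YEAR]",      # more than 1 year old
    ]
    params = [("q", "*:*"), ("facet", "true")]
    params += [("facet.query", bucket) for bucket in buckets]
    return params


params = date_facet_params()
query_string = urlencode(params)  # URL-encoded form, ready to append to /select?
```

Each facet.query comes back with its own count, so the client can label the four buckets however it likes.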

Re: Fuzzy Query Param

2011-06-30 Thread Floyd Wu
If this is an edit distance implementation, what is the result when applied to a CJK query? For example, 您好~3 Floyd 2011/6/30 entdeveloper cameron.develo...@gmail.com I'm using Solr trunk. If it's Levenshtein/edit distance, that's great, that's what I want. It just didn't seem to be officially

how to get lots fields this way?

2011-04-13 Thread Floyd Wu
Hi, As far as I know, using fl=*,score means we get all fields and the score in the returned search results. And if a field is stored, all of its text will be returned as part of the result. Now I have 2x fields; some of the field names have no prefix or fixed naming rule, so the names cannot be predicted. I

Re: how to get lots fields this way?

2011-04-13 Thread Floyd Wu
://search-lucene.com/ - Original Message From: Floyd Wu floyd...@gmail.com To: solr-user@lucene.apache.org Sent: Wed, April 13, 2011 2:34:49 PM Subject: how to get lots fields this way? Hi, As far as I know, using fl=*,score means we get all fields and the score

Re: Solr with example Jetty and score problem

2010-10-20 Thread Floyd Wu
I tried this work-around, but it does not seem to work for me. I still get an array of scores in the response. I have two physical servers, A and B: localhost -- A, test -- B. I issue a query to A like this

Re: Solr with example Jetty and score problem

2010-10-20 Thread Floyd Wu
is not coincidence. 2010/10/20 Floyd Wu floyd...@gmail.com I tried this work-around, but it does not seem to work for me. I still get an array of scores in the response. I have two physical servers, A and B: localhost -- A, test -- B. I issue a query to A like this http://localhost:8983/solr/core0/select

Tuning Solr

2010-10-05 Thread Floyd Wu
Hi there, If I don't need MoreLikeThis, spellcheck or highlighting, can I remove their configuration sections in solrconfig.xml? In other words, does Solr load and use these SearchComponents on startup and during runtime? Will removing this configuration speed up queries or not? Thanks

Re: Solr with example Jetty and score problem

2010-10-04 Thread Floyd Wu
Hi Chris, Thanks. But do you have any suggestion or work-around to deal with it? Floyd 2010/10/2 Chris Hostetter hossman_luc...@fucit.org : But when I issue the query with shard(two instances), the response XML will : be like following. : as you can see, that score has been transferred to a

Different between Lucid dist. Apache dist. ?

2010-10-04 Thread Floyd Wu
Hi there, What is the difference between Lucid distribution of Solr and Apache distribution? And can I use Lucid distribution for free in my commercial project?

Re: Solr with example Jetty and score problem

2010-09-29 Thread Floyd Wu
Can anybody help with this? Many thanks 2010/9/29 Floyd Wu floyd...@gmail.com Hi there, I have a problem. The situation is that when I issue a query to a single instance, Solr responds with XML like the following; as you can see, the score is normal(float name=score value

Solr with example Jetty and score problem

2010-09-28 Thread Floyd Wu
Hi there, I have a problem. The situation is that when I issue a query to a single instance, Solr responds with XML like the following; as you can see, the score is normal(float name=score value=...) === <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">23</int> <lst name="params">