Re: Src code download url needed for SOLR 3.5

2012-01-19 Thread lboutros
Hello, you can get the source code from the svn repository too : http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_5/ Ludovic. - Jouve France. -- View this message in context:

Re: Responding to Requests with Chunks/Streaming

2012-03-15 Thread lboutros
Hi, I was looking for something similar. I tried this patch : https://issues.apache.org/jira/browse/SOLR-2112 it's working quite well (I've back-ported the code in Solr 3.5.0...). Is it really different from what you are trying to achieve ? Ludovic. - Jouve France. -- View this message

Re: correct XPATH syntax

2012-05-01 Thread lboutros
Hi David, I think you should add this option : flatten=true and then could you try to use this XPath : /MedlineCitationSet/MedlineCitation/AuthorList/Author see here for the description : http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1 I don't think the that

Re: correct XPATH syntax

2012-05-03 Thread lboutros
Hi David, what do you want to do with the 'commonField' option ? Is it possible to have the part of the schema for the author field please ? Is the author field stored ? Ludovic. - Jouve France. -- View this message in context:

Re: correct XPATH syntax

2012-05-03 Thread lboutros
ok, not that easy :) I did not test it myself but it seems that you could use an XSL preprocessing with the 'xsl' option in your XPathEntityProcessor : http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1 You could transform the author part as you wish and then

Re: System requirements in my case?

2012-05-22 Thread lboutros
Hi Bruno, will you use facets and result sorting ? What is the update frequency/volume ? This could impact the amount of memory/server count. Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/System-requirements-in-my-case-tp3985309p3985327.html

Re: Highlighting data stored outside of Solr

2012-12-14 Thread lboutros
Hi Michael, it was late yesterday when I wrote my last message. And it did not help that much. Feel free to contact me directly. I can not share the code I wrote for legal obligations. But I can help you :) Ludovic. - Jouve France. -- View this message in context:

Mark document as hidden

2013-03-08 Thread lboutros
Dear all, I would like to mark documents as hidden. I could add a field hidden and set its value to true, but then the whole document would have to be reindexed. And External file fields are not searchable. I could store the document keys in an external database and filter the result with these ids. But if

Re: Mark document as hidden

2013-03-08 Thread lboutros
Excellent Erik ! It works perfectly. Normal filter queries are cached. Is it the same for frange filter queries like this one ? : fq={!frange l=0 u=10}removed_revision Thanks to both for your answers. Ludovic. - Jouve France. -- View this message in context:

Re: Mark document as hidden

2013-03-08 Thread lboutros
One more question, is there already a way to update the external file (add values) in Solr ? Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Mark-document-as-hidden-tp4045756p4045823.html Sent from the Solr - User mailing list archive at

Re: Mark document as hidden

2013-03-08 Thread lboutros
Ok, thanks Erik. Do you see any problem in modifying the Update handler in order to append some values to this file ? Ludovic - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Mark-document-as-hidden-tp4045756p4045839.html Sent from the Solr - User

Re: Mark document as hidden

2013-03-08 Thread lboutros
I could create an UpdateRequestProcessorFactory that updates this file; would that be better ? - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Mark-document-as-hidden-tp4045756p4045842.html Sent from the Solr - User mailing list archive at

Re: How to Integrate Solr With Hbase

2013-03-12 Thread lboutros
Hi Kamaci, why don't you use the Nutch indexing functionality ? The Nutch Crawling script already contains the Solr indexing step. http://wiki.apache.org/nutch/bin/nutch%20solrindex Ludovic. - Jouve France. -- View this message in context:

Re: Problem with making Solr query

2011-08-05 Thread lboutros
Hi, if you are using the schema from the Solr example, the fields with the type string are not analyzed. You should find a text field type or you can create one as shown in this example: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/conf/schema.xml?view=markup take a look at

Re: How to deal with java.net.SocketTimeoutException: Read timed out on commit?

2011-08-16 Thread lboutros
We had this type of error too. Now we are using the StreamingUpdateSolrServer with a quite big queue and 2-4 threads depending on data type: http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html And we do not do any intermediate commit. We send only

Re: core creation and instanceDir parameter

2011-09-01 Thread lboutros
instanceDir=. does that fit your needs ? Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/core-creation-and-instanceDir-parameter-tp3287124p3302496.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: how to do sorting on no. of highlighting in solr

2011-09-08 Thread lboutros
Hi, it is possible to create a new similarity class which returns the term occurrences. You have to disable Idf (just return 1), normalization and co. Then you have to declare it in your schema: http://wiki.apache.org/solr/SchemaXml#Similarity http://wiki.apache.org/solr/SolrPlugins#Similarity

Re: Solr wildcard searching

2011-09-24 Thread lboutros
And to complete Erick's answer: in this search, customer_name:"Joh*", the * is not considered as a wildcard, it is an exact search. Another thing (it is not your problem...): words with wildcards are not analyzed, so, if your analyzer contains a lower case filter, in the index, these words

Re: FieldCollapsing don't return every groups

2011-09-28 Thread lboutros
Hi Remy, could you paste the analyzer part of the field merchant_name_t please ? And when you say it should return more than that, could you explain why with examples ? If I'm not wrong, the field collapsing function is based on indexed values, so if your analyzer is complex (not string),

Re: FieldCollapsing don't return every groups

2011-09-28 Thread lboutros
Ok, thanks for the schema. The merchant "Cult Beauty Ltd" should be indexed like this: cult beauty ltd. I think some other merchants contain at least one of these words. You should try to group with a special field used for field collapsing: <dynamicField name="*_t_group" type="string"

Re: FieldCollapsing don't return every groups

2011-09-28 Thread lboutros
I just checked, you can disable the storing parameter and use this field: <dynamicField name="*_t_group" type="string" indexed="true" stored="false"/> Ludovic. - Jouve France. -- View this message in context:

Re: FieldCollapsing don't return every groups

2011-09-28 Thread lboutros
excellent ! and yes, the weather is very nice in France :) - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/FieldCollapsing-don-t-return-every-groups-tp3376036p3376362.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Phrase search error

2011-10-15 Thread lboutros
Hi Jason, you could add this filter to the end of your analyzer : http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory That should solve your problem. Ludovic. - Jouve France. -- View this message in context:

Re: getting solr to expand Acronym

2011-11-11 Thread lboutros
Hi, I'm not sure I see what you mean, but perhaps synonyms could solve your problem ? http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory Ludovic. - Jouve France. -- View this message in context:

Re: Splitting Words but retaining offsets

2011-11-30 Thread lboutros
I think this is what you are looking for : http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Splitting-Words-but-retaining-offsets-tp3546104p3547977.html Sent

Re: Terms Component with documents marked for deletion

2011-11-30 Thread lboutros
Hi, you have to use the 'expungeDeletes' additional parameter: http://wiki.apache.org/solr/UpdateXmlMessages and depending on the version of Solr you are using, you perhaps have to use a merge policy like the LogByteSizeMergePolicy. See : https://issues.apache.org/jira/browse/SOLR-2725

Re: Full text hit term highlighting

2011-03-18 Thread lboutros
Hi, It seems that we have the same problem, how did you solve it ? Did you write some pieces of code ? thx, Ludovic - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Full-text-hit-term-highlighting-tp2020402p2698440.html Sent from the Solr - User mailing

Re: Dismax and worddelimiterfilter

2011-03-25 Thread lboutros
You could develop your own tokenizer to extract the different forms of your ids. It is possible to extend the pattern tokenizer. Ludovic. On 25 March 2011 at 21:13, David Yang [via Lucene] ml-node+2732007-1439913827-383...@n3.nabble.com wrote : Hi, I am having some really strange issues

Re: Default operator

2011-03-26 Thread lboutros
The other way could be to extend the SolrQueryParser to read a per field default operator in the solr config file. Then it should be possible to override these functions : setDefaultOperator, getDefaultOperator, and these two, which use the default operator : getFieldQuery, addClause. Then you

Re: copyField at search time / multi-language support

2011-03-29 Thread lboutros
Tom, to solve this kind of problem, if I understand it well, you could extend the query parser to support something like meta-fields. I'm currently developing a QueryParser Plugin to support a specific syntax. The support of meta-fields to search on different fields (multiple languages) is one of

Re: Matching the beginning of a word within a term

2011-03-30 Thread lboutros
Do you want to tokenize subwords based on dictionaries ? A bit like disagglutination of German words ? If so, something like this could help : DictionaryCompoundWordTokenFilter http://search.lucidimagination.com/search/document/CDRG_ch05_5.8.8 Ludovic

Re: Matching the beginning of a word within a term

2011-03-31 Thread lboutros
So if I understand well, in these examples : http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND%20df=common_names}companion mank~10 http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND%20df=common_names}companion manki~10

Re: wildcard search inconsistencies

2011-04-01 Thread lboutros
'conditional' seems to be stemmed into the word 'condit' in the index. So your results are normal. As you said, mixing wildcard searching and stemmed fields is not recommended. Ludovic. 2011/4/1 Melanie Drake [via Lucene] ml-node+2763787-65059921-383...@n3.nabble.com I noticed an

Re: wildcard search inconsistencies

2011-04-01 Thread lboutros
And to be more helpful, you can activate the debug mode (debugQuery=on in the query) to see the transformed query : for instance 'field:conditional' : field:conditional field:conditional field:condit field:condit for 'field:conditional*' : field:conditional* field:conditional*
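The behaviour discussed in this thread can be sketched with a toy model. This is only an illustration under stated assumptions, not Solr code: the one-entry `STEMS` map is a hypothetical stand-in for a real stemmer (the `conditional` to `condit` mapping is taken from the message above), and the class and method names are invented for the example.

```java
import java.util.Map;
import java.util.Set;

// Toy model (not Solr code): plain terms are analyzed (stemmed) at query time,
// while the prefix of a wildcard query is compared to the index verbatim.
class StemmedWildcardDemo {

    // Hypothetical one-entry "stemmer"; the mapping comes from the thread above.
    static final Map<String, String> STEMS = Map.of("conditional", "condit");

    static String stem(String word) {
        return STEMS.getOrDefault(word, word);
    }

    // Plain term query: the query word goes through the same stemming as the index.
    static boolean termQueryMatches(Set<String> indexedTerms, String word) {
        return indexedTerms.contains(stem(word));
    }

    // Wildcard query "prefix*": the prefix is NOT stemmed before matching.
    static boolean prefixQueryMatches(Set<String> indexedTerms, String prefix) {
        return indexedTerms.stream().anyMatch(term -> term.startsWith(prefix));
    }

    public static void main(String[] args) {
        Set<String> index = Set.of(stem("conditional")); // the index only holds "condit"
        System.out.println(termQueryMatches(index, "conditional"));   // analyzed: hit
        System.out.println(prefixQueryMatches(index, "conditional")); // not analyzed: miss
        System.out.println(prefixQueryMatches(index, "condit"));      // stem as prefix: hit
    }
}
```

In this model the only way to make the wildcard query hit is to type a prefix of the stem itself, which is why a separate unstemmed field is usually the more predictable choice for wildcard searches.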

Re: Multiple terms in query

2011-04-02 Thread lboutros
You could turn on the debug mode, there is a part which explains the scoring of the query. It is a bit tricky but that could help. Could you paste your query (full url), and the field definition in your schema please ? Ludovic. - Jouve France. -- View this message in context:

Re: Multiple Words in String

2011-04-03 Thread lboutros
I managed to find both documents with your two input queries . Add this filter in your analyzer query part : = The main problem is that your query microsoft is

Re: question on solr.ASCIIFoldingFilterFactory

2011-04-05 Thread lboutros
Is there any stemming configured for this field in your schema configuration file ? Ludovic. 2011/4/5 Nemani, Raj [via Lucene] ml-node+2780463-48954297-383...@n3.nabble.com All, I am using solr.ASCIIFoldingFilterFactory to perform accent insensitive search. One of the words that got

RE: question on solr.ASCIIFoldingFilterFactory

2011-04-05 Thread lboutros
Your analyzer contains these two filters : before : So two things : The words you are testing are not English words (no ?), so the stemming will behave strangely. If you really want to remove accents, try to put the ASCIIFoldingFilterFactory before the two others. Ludovic. -

RE: question on solr.ASCIIFoldingFilterFactory

2011-04-05 Thread lboutros
this analyzer seems to work : I used Spanish stemming, put the ASCIIFoldingFilterFactory before the stemming filter and added it in the

Re: Shared conf

2011-04-07 Thread lboutros
You could use the replication to replicate the configuration files : http://wiki.apache.org/solr/SolrReplication What do you want to do with your different cores ? Ludovic. - Jouve France. -- View this message in context:

Re: Using MLT feature

2011-04-08 Thread lboutros
It seems that tokens are sorted by frequencies : ... Collections.sort(profile, new TokenComparator()); ... and private static class TokenComparator implements Comparator<Token> { public int compare(Token t1, Token t2) { return t2.cnt - t1.cnt; } } and cnt is the token count.
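The ordering quoted above can be tried in isolation. The following is a minimal self-contained sketch, not the MoreLikeThis source: the `Token` class here is a hypothetical two-field stand-in, and `Integer.compare` replaces the subtraction only to rule out integer overflow. The resulting order (highest count first) is the same.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Standalone illustration of sorting tokens by descending count.
class TokenSortDemo {

    // Hypothetical stand-in for the token type used in the quoted code.
    static class Token {
        final String text;
        final int cnt; // number of occurrences of the token

        Token(String text, int cnt) {
            this.text = text;
            this.cnt = cnt;
        }
    }

    // Same ordering as "t2.cnt - t1.cnt": tokens with a higher count come first.
    static class TokenComparator implements Comparator<Token> {
        @Override
        public int compare(Token t1, Token t2) {
            return Integer.compare(t2.cnt, t1.cnt);
        }
    }

    // Returns the token texts ordered by descending count; the input is not modified.
    static List<String> sortedTexts(List<Token> profile) {
        List<Token> copy = new ArrayList<>(profile);
        copy.sort(new TokenComparator());
        List<String> texts = new ArrayList<>();
        for (Token t : copy) {
            texts.add(t.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        List<Token> profile = List.of(
                new Token("solr", 2), new Token("lucene", 5), new Token("index", 3));
        System.out.println(sortedTexts(profile));
    }
}
```

The sorted order depends only on the counts, not on where the tokens appeared in the original text.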

Re: Using MLT feature

2011-04-08 Thread lboutros
'tokens' and not from the order the tokens appear in the original text. Frederico -Original Message- From: lboutros [mailto:[hidden email]] Sent: Friday, 8 April 2011 09:49 To: [hidden email]

Re: Performance with search terms starting and ending with wildcards

2011-04-10 Thread lboutros
Which version of Solr are you using ? NGrams could be an option but could you give us the field definition in your schema ? And the word count in this field's index ? Ludovic. 2011/4/10 Ueland [via Lucene] ml-node+2802561-121096623-383...@n3.nabble.com Hi! I have been doing some testing with

Re: Spellchecker with synonyms

2011-04-11 Thread lboutros
Did you configure synonyms for your field at query time ? Ludovic. 2011/4/11 royr [via Lucene] ml-node+2806028-1349039134-383...@n3.nabble.com Hello, I have some synonyms for city names. Sometimes there are multiple names for one city, example:. newyork, newyork city, big apple I

Re: Can I set up a config-based distributed search

2011-04-11 Thread lboutros
You can add to your search handler the shards parameter : <requestHandler name="dist-search" class="solr.SearchHandler"> <lst name="defaults"> <str name="shards">host1/solr,host2/solr</str> </lst> </requestHandler> Is this what you are looking for ? Ludovic. 2011/4/11 Ran Peled [via Lucene]

Re: Allowing looser matches

2011-04-13 Thread lboutros
If you are using the Dismax query parser, perhaps you could take a look at the minimum should match parameter 'mm' : http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29 Ludovic. 2011/4/13 Mark Mandel [via Lucene] ml-node+2815186-149863473-383...@n3.nabble.com

Re: all searches return 0 hits - what have I done wrong?

2011-04-18 Thread lboutros
did you try with the complete xpath ? <field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" /> <field column="text" xpath="/ARTIKEL/AKROP/TXT" /> Ludovic. - Jouve France. -- View this message in context:

Re: all searches return 0 hits - what have I done wrong?

2011-04-18 Thread lboutros
If a document contains multiple 'txt' fields, it should be marked as 'multiValued'. <field name="txt" type="text" indexed="true" stored="true" multiValued="true"/> But if I understand correctly, you also tried this ? : <field column="text" xpath="/ARTIKEL/AKROP" /> And for your search (MomsManual),

Re: Custom Sorting

2011-04-19 Thread lboutros
You could create a new Similarity class plugin that takes into account every parameter you need : http://wiki.apache.org/solr/SolrPlugins?highlight=%28similarity%29#Similarity but, as Jan said, be careful with the cost of the similarity function. Ludovic. 2011/4/19 Jan Høydahl / Cominvent

Re: How could each core share configuration files

2011-04-20 Thread lboutros
Perhaps this could help : http://lucene.472066.n3.nabble.com/Shared-conf-td2787771.html#a2789447 Ludovic. 2011/4/20 kun xiong [via Lucene] ml-node+2841801-1701787156-383...@n3.nabble.com Hi all, Currently in my project , most of the core configurations are same(solrconfig.xml,

Re: Solr - upgrade from 1.4.1 to 3.1 - finding AbstractSolrTestCase binaries - help please?

2011-04-21 Thread lboutros
There is a jar for the tests in Solr. I added this dependency in my pom.xml : <dependency> <groupId>org.apache.solr</groupId> <artifactId>solr-core</artifactId> <version>3.1-SNAPSHOT</version> <classifier>tests</classifier> <scope>test</scope>

RE: The issue of import data from database using Solr DIH

2011-04-21 Thread lboutros
What you want to do is something like a left outer join, isn't it ? something like : select table2.OS06Y, f1,f2,f3,f4,f5 from table2 left outer join table1 on table2.OS06Y = table1.OS06Y where ... could you prepare a view in your RDBMS ? That could be another solution ? Ludovic. - Jouve

Re: Facing problem with white space in synonyms

2011-04-27 Thread lboutros
could you try to escape white spaces like this: Hind\ claw Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Facing-problem-with-white-space-in-synonyms-tp2870193p2870552.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Autocomplete(terms) middle of words

2011-04-29 Thread lboutros
you could use EdgeNGramFilterFactory : http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory And you should mix front and back ngram process in your analyzer : <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/> <filter
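To see what the two `side` settings produce, here is a small sketch of edge n-gram generation. It is a plain-Java approximation written for illustration, not the Lucene filter itself: the method name and the boolean `front` flag are assumptions, while `minGram` and `maxGram` mirror the `minGramSize` and `maxGramSize` attributes above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: edge n-grams of a single term, anchored at the start
// (side="front") or at the end (side="back") of the term.
class EdgeNGramDemo {

    static List<String> edgeNGrams(String term, int minGram, int maxGram, boolean front) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, term.length());
        for (int n = minGram; n <= longest; n++) {
            // front: first n characters; back: last n characters
            grams.add(front ? term.substring(0, n) : term.substring(term.length() - n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", 2, 15, true));  // prefixes of the term
        System.out.println(edgeNGrams("lucene", 2, 15, false)); // suffixes of the term
    }
}
```

Front grams let a query match the beginning of an indexed word and back grams its ending; terms shorter than `minGram` produce no grams at all in this sketch.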

Re: DataImportHandler on 2 tables

2011-05-02 Thread lboutros
Do you want to search on the data from the tables together or separately ? Is there a join between the two tables ? Ludovic. 2011/5/2 Greg Georges [via Lucene] ml-node+2891256-222073995-383...@n3.nabble.com Hello all, I have a system where I have a dataimporthandler defined for one table

Re: DataImportHandler on 2 tables

2011-05-02 Thread lboutros
ok, so It seems you should create a new index and core as you said. see here for the management : http://wiki.apache.org/solr/CoreAdmin But it seems that is a problem for you. Is it ? Ludovic. 2011/5/2 Greg Georges [via Lucene] ml-node+2891277-472183207-383...@n3.nabble.com No, the data

Re: stemming for English

2011-05-03 Thread lboutros
Hi, I think you have to use stemming on both sides (index and query) if you really want to use stemming. Ludovic 2011/5/3 Dmitry Kan [via Lucene] ml-node+2893599-894006307-383...@n3.nabble.com Dear list, In SOLR schema on the index side we use no stemming to support favor wildcard search.

Re: stemming for English

2011-05-03 Thread lboutros
Dmitry, I don't know any way to keep both stemming and consistent wildcard support in the same field. To me, you have to create 2 different fields. Ludovic. 2011/5/3 Dmitry Kan [via Lucene] ml-node+2893628-993677979-383...@n3.nabble.com Hi Ludovic, That's an option we had before we decided

Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread lboutros
In the ant script there is a target to generate maven's artifacts. After that, you will be able to open the project as a standard maven project. Ludovic. 2011/5/4 Gabriele Kahlout [via Lucene] ml-node+2898068-621882422-383...@n3.nabble.com Hello, I'm trying to modify Solr and I think

Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread lboutros
oups, sorry, this was not the target I used (this one should work too, but...), the one I used is get-maven-poms. That will just create pom files and copy them to their right target locations. I'm using netbeans and I'm using the plugin Automatic Projects to do everything inside the IDE. Which

Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread lboutros
ok, this is part of my build.xml (from the svn repository) : <property name="version" value="3.1-SNAPSHOT"/> <target name="get-maven-poms" description="Copy Maven POMs from dev-tools/maven/ to their target locations"> <copy todir="." overwrite="true"> <fileset

Re: Deprication warnings in Solr log

2011-05-04 Thread lboutros
did you update this part in your solrconfig.xml ? <luceneMatchVersion>LUCENE_31</luceneMatchVersion> Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Deprication-warnings-in-Solr-log-tp2898163p2898749.html Sent from the Solr - User mailing list

Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread lboutros
I do not build this part, I don't need it. The lib was present in the branch_3x branch, but is not there anymore. You can download it here : http://search.lucidimagination.com/search/out?u=http%3A%2F%2Fdownloads.osafoundation.org%2Fdb%2Fdb-4.7.25.jar You have to install it locally. Ludovic.

Re: Is it possible to build Solr as a maven project?

2011-05-04 Thread lboutros
I opened and built my needed projects in Netbeans, i.e.: Solr Core, Solr Search Server, Solrj, Lucene Core etc But with the given library you should go to the next step. Ludovic. - Jouve France. -- View this message in context:

Re: Is it possible to build Solr as a maven project?

2011-05-05 Thread lboutros
Thanks Steve, this will be much simpler next time :) Is it documented somewhere ? If not, perhaps we could add something to this page for example ? http://wiki.apache.org/solr/FrontPage#Solr_Development or here : http://wiki.apache.org/solr/NightlyBuilds Ludovic. 2011/5/5 steve_rowe [via

RE: Boosting score of a document without deleting and adding another document

2011-05-10 Thread lboutros
Perhaps the query elevation component is what you are looking for : http://wiki.apache.org/solr/QueryElevationComponent Ludovic. - Jouve France. -- View this message in context:

Re: Is it possible to build Solr as a maven project?

2011-05-10 Thread lboutros
Very nice Steve ! Thanks again. (I'm building from svn so that's perfect for me) Is this file referenced somewhere in the wiki ? Ludovic. - Jouve France. -- View this message in context:

RE: Is it possible to build Solr as a maven project?

2011-05-10 Thread lboutros
Steve, I'm not used to updating wikis, but I've added a small part after the IntelliJ part here : http://wiki.apache.org/solr/HowToContribute Ludovic. - Jouve France. -- View this message in context:

Re: Results with and without whitspace(soccer club and soccerclub)

2011-05-14 Thread lboutros
Hi, synonyms could be an option, but could you describe a bit more your problem please (current analyzer, documents, solr version) ? Ludovic. 2011/5/13 roySolr [via Lucene] ml-node+2934742-186141045-383...@n3.nabble.com Hello, My index looks like this: Soccer club Football club etc.

Re: Order of words in proximity search

2011-05-15 Thread lboutros
Hi, see here for an explanation : http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29 Ludovic. - Jouve France. -- View this message in context:

Re: Order of words in proximity search

2011-05-16 Thread lboutros
the key phrase was this one :) : "A sloppy phrase query specifies a maximum "slop", or the number of positions tokens need to be moved to get a match." So you could search for "foo bar"~101 in your example. Ludovic. - Jouve France. -- View this message in context:
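The slop definition quoted above can be made concrete for the two-term case. The sketch below is a simplified model and not Lucene's phrase scorer: it assumes a query of exactly two terms, `"first second"`, and the method names are invented for the example. Each document position is shifted back by the term's offset in the query, and the required slop is the distance between the shifted positions.

```java
// Simplified two-term model of sloppy phrase matching (illustration only).
class SlopDemo {

    // firstPos / secondPos: token positions of the two query terms in the document.
    static int requiredSlop(int firstPos, int secondPos) {
        // The second query term sits at offset 1 in the phrase, so shift it back by 1,
        // then measure how far the two shifted positions are from each other.
        return Math.abs((secondPos - 1) - firstPos);
    }

    static boolean matches(int firstPos, int secondPos, int slop) {
        return requiredSlop(firstPos, secondPos) <= slop;
    }

    public static void main(String[] args) {
        System.out.println(requiredSlop(0, 1)); // adjacent and in order
        System.out.println(requiredSlop(0, 3)); // in order, two tokens in between
        System.out.println(requiredSlop(1, 0)); // adjacent but in reverse order
    }
}
```

In this model, terms found in reverse order cost two extra positions compared with the same terms in query order, so a larger slop accepts both orders as long as the words are close enough.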

Re: Order of words in proximity search

2011-05-16 Thread lboutros
I would prefer to put a higher slop number instead of a boolean clause : 200 perhaps in your specific case. Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Order-of-words-in-proximity-search-tp2938427p2946645.html Sent from the Solr - User

Re: Order of words in proximity search

2011-05-16 Thread lboutros
The analyzer of the field you are using could impact the Phrase Query Slop. Could you copy/paste the part of the schema ? Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Order-of-words-in-proximity-search-tp2938427p2946764.html Sent from the

Re: return unaltered complete multivalued fields with Highlighted results

2011-05-30 Thread lboutros
Hi Alexei, We have the same issue/behavior. The highlighting component fragments the fields to highlight and chooses the best ones to be returned and highlighted. You can return all fragments with the maximum size for each one, but it will never return fragments with scores equal to 0, I mean without

Re: Obtaining query AST?

2011-05-31 Thread lboutros
Hi Darren, I think that if I had to get the parsing result, I would create my own QueryComponent which would create the parser in the 'prepare' function (you can take a look at the actual QueryComponent class) and instead of resolving the query in the 'process' function, I would just parse the

Re: Obtaining query AST?

2011-05-31 Thread lboutros
Darren, you can even take a look at the DebugComponent which returns the parsed query in a string form. It uses the QueryParsing class to parse the query, you could perhaps do the same. Ludovic. - Jouve France. -- View this message in context:

Re: Return stemmed word

2011-06-03 Thread lboutros
Hi Kurt, I think this is a bit more tricky than that. For example, if a user searches for oranges, the stemmer may return orang which is not an existing word. So getting stemmed words might/will not work for your highlighting purpose. Ludovic. - Jouve France. -- View this message in

Getting payloads in Highlighter

2011-06-03 Thread lboutros
Hi all, I need to highlight searched words in the original text (xml) of a document. So I'm trying to develop a new Highlighter which uses the defaultHighlighter to highlight some fields and then retrieve the original text file/document (external or internal storage) and put the highlighted

Re: Getting payloads in Highlighter

2011-06-03 Thread lboutros
To clarify a bit more, I took a look at this function : termPositions public TermPositions termPositions() throws IOException Description copied from class: IndexReader Returns an unpositioned TermPositions enumerator. But it returns an unpositioned

Re: Getting payloads in Highlighter

2011-06-03 Thread lboutros
The original document is not indexed. Currently it is just stored and could be stored in a filesystem or a database in the future. The different parts of a document are indexed in multiple different fields with some different analyzers (stemming, multiple languages, regex,...). So, I don't

Re: wildcard search

2011-06-08 Thread lboutros
Hi Thomas, I don't use it myself (but I will soon), so I may be wrong, but did you try to use the ComplexPhraseQueryParser : ComplexPhraseQueryParser QueryParser which permits complex phrase query syntax eg "(john jon jonathan~) peters*". It seems that you could do this type of query

Re: Displaying highlights in formatted HTML document

2011-06-09 Thread lboutros
Hi Bryan, how do you index your html files ? I mean do you create fields for different parts of your document (for different stop words lists, stemming, etc) ? with DIH or solrj or something else ? iorixxx, could you please explain a bit more your solution, because I don't see how your solution

RE: Displaying highlights in formatted HTML document

2011-06-09 Thread lboutros
I am not (yet) a Tika user; perhaps iorixxx's solution is good for you. We will share the highlighter module and 2 other developments soon. (I have to see how to do that) Ludovic. - Jouve France. -- View this message in context:

Re: Reject URL requests unless from localhost for dataimport

2011-06-25 Thread lboutros
If you are using Tomcat, I think you could use a Valve to protect a given context of your application : <Context path="/solr/dataimport" docBase="${catalina.home}/server/solr/dataimport" privileged="true"> <Valve className="org.apache.catalina.valves.RemoteAddrValve"

Re: Analyzer creates PhraseQuery

2011-06-28 Thread lboutros
You could add this filter after the NGram filter to prevent the phrase query creation : http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory Ludovic. - Jouve France. -- View this message in context:

Re: Can I invert the inverted index?

2011-07-05 Thread lboutros
Hi Gabriele, I'm not sure I understand your problem, but the TermVectorComponent may fit your needs ? http://wiki.apache.org/solr/TermVectorComponent http://wiki.apache.org/solr/TermVectorComponentExampleEnabled Ludovic. - Jouve France. -- View this message in context:

Re: Multicore Issue - Server Restart

2012-05-29 Thread lboutros
Hi Suajtha, does each webapp have its own Solr home ? Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Multicore-Issue-Server-Restart-tp3986516p3986602.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: TermComponent and Optimize

2012-06-06 Thread lboutros
It is possible to use the expungeDeletes option in the commit, that could solve your problem. http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22commit.22 Sadly, there is currently a bug with the TieredMergePolicy : https://issues.apache.org/jira/browse/SOLR-2725 SOLR-2725

Re: Disable cache ?

2012-07-17 Thread lboutros
Hi Bruno, don't forget the OS disk cache. On Linux you can clear it with this tiny script (it must be run as root) :
#!/bin/bash
sync
echo 3 > /proc/sys/vm/drop_caches
Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Disable-cache-tp3995575p3995589.html Sent from

Re: Solr 3.6.1: query performance is slow when asterisk is in the query

2012-08-23 Thread lboutros
You could add a default value in your field via the schema : <field ... default="mynuvalue"/> and then your query could be : -body:mynuvalue but I prefer Chris's solution which is what I usually do. Ludovic. - Jouve France. -- View this message in context:

Re: Mark document as hidden

2013-03-16 Thread lboutros
Ok, I have created a processor which manages to update the external file. Basically, until a commit request, the hidden document IDs are stored in a Set and when a commit is requested, a new file is created by copying the last one, then the additional IDs are appended to the external file. Now
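The buffering scheme described above (collect IDs in a Set, then on commit copy the previous file and append the new IDs) can be sketched with plain `java.nio`. This is not the author's processor, whose code was not shared: the class name, the file names and the `id=1` line format are assumptions made for the illustration, and the hooks into Solr's update chain are left out entirely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of "buffer hidden IDs until commit, then write a new external file".
class HiddenIdBuffer {

    // IDs marked as hidden since the last commit (insertion order, no duplicates).
    private final Set<String> pending = new LinkedHashSet<>();

    void hide(String docId) {
        pending.add(docId);
    }

    // On commit: copy the last file to a new one, then append the pending IDs to it.
    void commit(Path lastFile, Path newFile) throws IOException {
        Files.copy(lastFile, newFile, StandardCopyOption.REPLACE_EXISTING);
        List<String> lines = new ArrayList<>();
        for (String id : pending) {
            lines.add(id + "=1"); // assumed "key=value" line format
        }
        Files.write(newFile, lines, StandardCharsets.UTF_8, StandardOpenOption.APPEND);
        pending.clear();
    }

    // Small end-to-end demo in a temporary directory; returns the lines of the new file.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("hidden-ids");
            Path v1 = dir.resolve("external_hidden.1"); // hypothetical file names
            Path v2 = dir.resolve("external_hidden.2");
            Files.write(v1, List.of("doc1=1"), StandardCharsets.UTF_8);

            HiddenIdBuffer buffer = new HiddenIdBuffer();
            buffer.hide("doc2");
            buffer.hide("doc3");
            buffer.hide("doc2"); // duplicate, ignored by the Set
            buffer.commit(v1, v2);
            return Files.readAllLines(v2, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Writing a fresh file on each commit, rather than appending in place, keeps the previous file intact for readers that still have it open.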

Re: Mark document as hidden

2013-03-16 Thread lboutros
Hi Jack, the external files involved in External File Fields are not stored in the configuration directory and cannot be replicated this way, furthermore in Solr Cloud, additional files are not replicated anymore. There is something like that in the code: if (confFileNameAlias.size() < 1 ||

Re: Mark document as hidden

2013-03-17 Thread lboutros
Thanks Jack for your answers. All files in the index directory are replicated ? I thought that only the lucene index files were replicated. If you are right, that's great, because I could create an ExternalFileField type which could get its input file from the index directory and not from the

Re: Mark document as hidden

2013-03-17 Thread lboutros
Oh, I see :) I did not quite catch what you said. Well, my index could contain 80 million elements and a large number of them could be hidden. As you already said, I don't think that ZooKeeper is the right place to store these files, they are too big. Thank you again, that gave me some ideas I

Re: Mark document as hidden

2013-03-18 Thread lboutros
Thanks Jack. I finally managed to replicate the external files with my own replication handler. But now, there's an issue with Solr in the Update Log replay process. The default processor chain is not used, this means that my processor which manages the external files is not used... I have

Re: Fuzzy 2 search results wrong

2014-01-27 Thread lboutros
Hi Lou, The Solr query Parser creates fuzzy queries with a maximum of 50 term expansions. This is the default value and this is hard coded in the FuzzyQuery class. I would say this is your problem. I think you could create a new Query Parser which could create the fuzzy query with a bigger

Re: Fuzzy 2 search results wrong

2014-01-28 Thread lboutros
You have to create your own parser which extends the current query parser. You have to override the newFuzzyQuery protected function to call the FuzzyQuery constructor with a configured maximum expansion value or something like that. Ludovic. - Jouve France. -- View this message in
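Why a cap on term expansions can hide results is easier to see with a toy model. The sketch below is not Lucene's `FuzzyQuery`: it uses plain Levenshtein distance over an in-memory term list, and the class, method names and sample terms are assumptions made for the illustration. It only shows the principle that candidate terms beyond `maxExpansions` are dropped, so documents containing only those terms cannot match.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Toy model of fuzzy term expansion with a cap on the number of expanded terms.
class FuzzyExpansionDemo {

    // Classic Levenshtein distance (insertions, deletions, substitutions).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Index terms within maxEdits of the query, closest first, cut off at maxExpansions.
    static List<String> expand(String query, List<String> indexTerms, int maxEdits, int maxExpansions) {
        return indexTerms.stream()
                .filter(term -> editDistance(query, term) <= maxEdits)
                .sorted(Comparator.comparingInt((String term) -> editDistance(query, term))
                        .thenComparing(Comparator.<String>naturalOrder()))
                .limit(maxExpansions)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> terms = List.of("solr", "sole", "sold", "soar", "salt", "bolt");
        System.out.println(expand("solr", terms, 2, 50)); // cap not reached: all candidates kept
        System.out.println(expand("solr", terms, 2, 3));  // cap reached: farther terms dropped
    }
}
```

With a large index, many more terms than the cap can fall within two edits of the query, which is the situation a configurable maximum expansion value in a custom parser is meant to address.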

SolrCloud Zookeeper disconnection/reconnection

2014-02-13 Thread lboutros
Dear all, we are currently using Solr 4.3.1 in production (with SolrCloud). We are encountering much the same problem as described in this other old post: http://lucene.472066.n3.nabble.com/SolrCloud-CloudSolrServer-Zookeeper-disconnects-and-re-connects-with-heavy-memory-usage-consumption-td4026421.html

Re: SolrCloud Zookeeper disconnection/reconnection

2014-02-16 Thread lboutros
Thanks a lot for your answer. Is there a web page, on the wiki for instance, where we could find some JVM settings or recommendations that we should use for Solr with some index configurations? Ludovic. - Jouve France. -- View this message in context:

Group on multiple fields in a sharded environment

2014-02-20 Thread lboutros
Dear all, I would like to group my query results on two different fields (not at the same time). I also would like to get the exact group count. And I'm working with a sharded index. I know that to get the exact group count, all documents from a group must be indexed in a unique shard. Now, is
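The co-location constraint mentioned above can be illustrated with a small model. This is a sketch under assumptions, not SolrCloud's router: documents are two-element arrays, `shardFor` is a hypothetical hash-based routing function, and the total group count is modelled as the sum of the distinct group values seen on each shard.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: route documents by one field, count groups per shard, sum the counts.
class GroupShardingDemo {

    // Hypothetical routing: documents with the same key always land on the same shard.
    static int shardFor(String routingKey, int numShards) {
        return Math.floorMod(routingKey.hashCode(), numShards);
    }

    // docs: each entry is an array of field values; routeField and groupField are indexes into it.
    static int summedShardGroupCounts(List<String[]> docs, int routeField, int groupField, int numShards) {
        List<Set<String>> groupsPerShard = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            groupsPerShard.add(new HashSet<>());
        }
        for (String[] doc : docs) {
            groupsPerShard.get(shardFor(doc[routeField], numShards)).add(doc[groupField]);
        }
        int total = 0;
        for (Set<String> groups : groupsPerShard) {
            total += groups.size(); // each shard only knows its own distinct groups
        }
        return total;
    }

    public static void main(String[] args) {
        // Fields: [0] = brand, [1] = category (made-up sample data)
        List<String[]> docs = List.of(
                new String[] {"a", "x"}, new String[] {"a", "y"},
                new String[] {"b", "x"}, new String[] {"b", "y"});
        System.out.println(summedShardGroupCounts(docs, 0, 0, 2)); // grouped on the routing field
        System.out.println(summedShardGroupCounts(docs, 0, 1, 2)); // grouped on the other field
    }
}
```

In this model the summed count is exact only for the field used for routing; for the second field the same group can appear on several shards and gets counted more than once, which is the difficulty with grouping on two different fields.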
