RE: Debugging/scoring question

2018-05-23 Thread LOPEZ-CORTES Mariano-ext
Yes, this makes sense. I guess you are talking about this doc: https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html How can I decrease the effect of the IDF component in my query? Thanks!! -----Original Message----- From: Alessandro Benedetti

Debugging/scoring question

2018-05-23 Thread LOPEZ-CORTES Mariano-ext
Hi all, I have a 20-document collection. In the debug output, we have: "100051":" 20.794415 = max of: 20.794415 = weight(nomUsageE:jean in 1) [SchemaSimilarity], result of: 20.794415 = score(doc=1,freq=1.0 = termFreq=1.0 ), product of: 15.0 = boost 1.3862944 = idf, computed as
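The figures visible in the explain output above can be sanity-checked by hand. The following is a minimal sketch, assuming the field is scored with BM25, whose idf Lucene's BM25Similarity documents as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). The docFreq and docCount values used below are hypothetical: the real ones are cut off in the snippet, and these were picked only because they reproduce the reported idf.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """idf as documented for Lucene's BM25Similarity (assumed similarity)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Values copied from the explain output in the message.
boost = 15.0
idf = 1.3862944
reported_score = 20.794415

# Whatever factor follows idf in the truncated output must be close to 1.0,
# because boost * idf already accounts for the whole score.
remaining_factor = reported_score / (boost * idf)

# Hypothetical statistics (NOT taken from the message) that yield the same idf.
example_idf = bm25_idf(doc_freq=1, doc_count=5)

print(f"boost * idf      = {boost * idf:.6f}")
print(f"remaining factor = {remaining_factor:.6f}")
print(f"example idf      = {example_idf:.7f}")
```

Because the score is a plain product, lowering the boost scales it linearly, while the idf part depends only on how rare the term is in the index.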

Solr Dates TimeZone

2018-05-22 Thread LOPEZ-CORTES Mariano-ext
Hi, is it possible to configure Solr with a timezone other than GMT? Is it possible to configure the Solr Admin UI to display dates in a timezone other than GMT? What is the best way to store a birth date in Solr? We use the TrieDate type. Thanks!
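For context, Solr date fields hold UTC instants written as `YYYY-MM-DDThh:mm:ssZ`, so any timezone handling usually happens on the client before indexing and after querying. A minimal sketch of that conversion follows; the helper names and the +02:00 offset are illustrative assumptions, and storing a birth date as midnight UTC is one possible convention, not necessarily the best one.

```python
from datetime import date, datetime, timedelta, timezone


def to_solr_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC string for a Solr date field."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def birth_date_to_solr(day: date) -> str:
    """Store a calendar day as midnight UTC so no offset can shift the day."""
    return f"{day.isoformat()}T00:00:00Z"


# Hypothetical local time with a fixed +02:00 offset.
local = datetime(2018, 5, 22, 1, 30, tzinfo=timezone(timedelta(hours=2)))
print(to_solr_date(local))                     # converted to UTC
print(birth_date_to_solr(date(1980, 7, 14)))   # day kept as-is
```

Converting back for display is the reverse step: parse the UTC string and call `astimezone` with the user's timezone.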

Commit too slow?

2018-05-14 Thread LOPEZ-CORTES Mariano-ext
Hi, after injecting 200 documents into our Solr server, the commit operation at the end of the process (using ConcurrentUpdateSolrClient) takes 10 minutes. Is that too slow? Our auto-commit policy is the following: 15000

Solr doesn't import the whole data

2018-04-27 Thread LOPEZ-CORTES Mariano-ext
Hi, we've finished importing 40 million rows into a 3-node Solr cluster. After injecting all the data via a Java program, we noticed that the number of documents was lower than expected (by 10 rows). No exception, no error. Some config details:

Filter query question

2018-04-12 Thread LOPEZ-CORTES Mariano-ext
Hi, in our search application we have one facet filter (Status). Each status value corresponds to multiple values in the Solr database. Example: Status: Initialized --> status in Solr = 11I, 12I, 13I, 14I, ... When a status value is clicked, the search is re-fired with an fq filter: fq: status:(11I OR 12I OR
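The mapping described above, from one displayed status to several stored codes, can be sketched as a small filter-query builder. This is an illustration only: the dictionary and function names are hypothetical, and the mapping contains just the four codes visible in the message, whose list is elided.

```python
# Hypothetical mapping from a displayed status to the codes stored in Solr.
# Only the codes visible in the message are listed; the real list is longer.
STATUS_CODES = {
    "Initialized": ["11I", "12I", "13I", "14I"],
}


def build_status_fq(codes):
    """Build an fq value such as status:(11I OR 12I) from a list of codes."""
    if not codes:
        raise ValueError("at least one status code is required")
    return "status:(" + " OR ".join(codes) + ")"


fq = build_status_fq(STATUS_CODES["Initialized"])
print(fq)
```

Sorting the codes before joining them keeps the fq string identical across requests, which matters if the filter is meant to be reused from Solr's filter cache.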

RE: Question liste solr

2018-03-20 Thread LOPEZ-CORTES Mariano-ext
ser@lucene.apache.org Subject: Re: Question liste solr Mariano, On 3/19/18 11:50 AM, LOPEZ-CORTES Mariano-ext wrote: > Hello > > We have a Solr index with 3 nodes, 1 shard and 2 replicas. > > Our goal is to index 42 million rows. Indexing

RE: Question liste solr

2018-03-19 Thread LOPEZ-CORTES Mariano-ext
Sorry. Thanks in advance!! From: LOPEZ-CORTES Mariano-ext Sent: Monday, March 19, 2018 16:50 To: 'solr-user@lucene.apache.org' Subject: RE: Question liste solr Hello, we have a Solr index with 3 nodes, 1 shard and 2 replicas. Our goal is to index 42 million rows. Indexing time is important

RE: Question liste solr

2018-03-19 Thread LOPEZ-CORTES Mariano-ext
Hello, we have a Solr index with 3 nodes, 1 shard and 2 replicas. Our goal is to index 42 million rows. Indexing time is important. The data source is an Oracle database. Our indexing strategy is: * Reading from Oracle to a big CSV file. * Reading from 4 files (big file

RE: Response time under 1 second?

2018-02-22 Thread LOPEZ-CORTES Mariano-ext
Heisey [mailto:elyog...@elyograg.org] Sent: Thursday, February 22, 2018 17:06 To: solr-user@lucene.apache.org Subject: Re: Response time under 1 second? On 2/22/2018 8:53 AM, LOPEZ-CORTES Mariano-ext wrote: > With a 3-node cluster, each with 12GB, and a corpus of 5GB (CSV format). > > Is

Response time under 1 second?

2018-02-22 Thread LOPEZ-CORTES Mariano-ext
Hello, we have a 3-node cluster with 12GB each and a corpus of 5GB (CSV format). Is it better to disable the Solr caches completely? There is enough RAM for the entire index. Is there a way to bring random queries under 1 second? Thanks!

RE: Facet performance problem

2018-02-20 Thread LOPEZ-CORTES Mariano-ext
Our query looks like this: ...facet=true=motifPresence We return a facet list of the values in the "motifPresence" field (person status). Status: [ ] status1 [x] status2 [x] status3 The user then selects one or more statuses (it's this step that we called "facet

RE: Reading data from Oracle

2018-02-15 Thread LOPEZ-CORTES Mariano-ext
@lucene.apache.org Subject: Re: Reading data from Oracle And where is the bottleneck? Is it reading from Oracle or injecting to Solr? Regards Bernd On 15.02.2018 at 08:34, LOPEZ-CORTES Mariano-ext wrote: > Hello > > We have to delete our Solr collection and feed it periodically from an Oracle &

Reading data from Oracle

2018-02-14 Thread LOPEZ-CORTES Mariano-ext
Hello, we have to delete our Solr collection and feed it periodically from an Oracle database (up to 40M rows). We've done the following test: from a Java program, we read chunks of data from Oracle and inject them into Solr (via SolrJ). The problem: it is really, really slow (1.5 nights). Is there
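The chunked read-and-inject loop described above can be sketched independently of Oracle and SolrJ. The example below is only an illustration of the batching step under stated assumptions: `chunks` is a hypothetical helper, `range(5)` stands in for the database cursor, and appending to a list stands in for the call that would send a batch to Solr.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunks(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` rows from any iterable."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for "send this batch to Solr"; no real Solr or Oracle call is made.
sent_batches = []
for batch in chunks(range(5), 2):
    sent_batches.append(batch)

print(sent_batches)
```

Timing the read side and the send side of such a loop separately is a simple way to tell whether the database or the indexing step is the bottleneck.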

RE: Facets OutOfMemoryException

2018-02-08 Thread LOPEZ-CORTES Mariano-ext
We have just 1 field, "status", in facets, with a cardinality of 93. We realize that increasing memory will work, but do you think it's necessary? Thanks in advance. -----Original Message----- From: Zisis T. [mailto:zist...@runbox.com] Sent: Thursday, February 8, 2018 13:14 To:

Facets OutOfMemoryException

2018-02-08 Thread LOPEZ-CORTES Mariano-ext
We are experiencing memory problems with facet filters (OutOfMemory, Java heap). If we disable facets, it works fine. Our infrastructure: 3 Solr nodes with 2048 MB RAM, 3 ZooKeeper nodes with 1024 MB RAM. Size: 27 million documents. Any ideas? Thanks in advance!

Highlighting over date fields

2018-02-07 Thread LOPEZ-CORTES Mariano-ext
Is it possible to use highlighting on date fields? We've tried, but we got no highlighting response for the field.

Custom Solr function

2018-01-30 Thread LOPEZ-CORTES Mariano-ext
Can we create a custom function in Java? Example: sort = func([USER-ENTERED TEXT]) desc, where func returns a numeric value. Thanks in advance

Phonetic matching relevance

2018-01-29 Thread LOPEZ-CORTES Mariano-ext
Hello. We are working on a search application whose main goal is to find persons by name (first name and last name). The query text comes from a user-entered text field. The ordering of the text is not defined (last name then first name, or first name then last name), but some orderings are more important than others. The ranking is