Re: problem with facets - out of memory exception

2013-12-19 Thread Marc Sturlese
Have you tried reindexing with DocValues? Fields used for faceting are then stored on disk instead of being loaded into RAM through the FieldCache. If you have enough memory they will be held in the operating system's file cache rather than on the Java heap. This is also good for GC when committing. http://wiki.apache.org/solr/DocValues

Re: Tweaking boosts for more search results variety

2013-09-10 Thread Marc Sturlese
This is totally deprecated, but it may be helpful if you want to re-sort some documents: https://issues.apache.org/jira/browse/SOLR-1311 -- View this message in context: http://lucene.472066.n3.nabble.com/Tweaking-boosts-for-more-search-results-variety-tp4088302p4089044.html Sent from the

Listeners, cores and Similarity

2013-08-16 Thread Marc Sturlese
Hey there, I'm testing a custom similarity which loads data from an external file located in solr_home/core_name/conf/. I load the data from the file into a Map in the init method of the SimilarityFactory. I would like to reload that Map every time a commit happens or every X hours. To do that I've

Re: Solr 3.6 optimize and field cache question

2013-07-10 Thread Marc Sturlese
Not a solution for the short term but sounds like a good use case to migrate to Solr 4.X and use DocValues instead of FieldCache for faceting.

about NRTCachingDirectory

2012-12-10 Thread Marc Sturlese
I have a question about how NRTCachingDirectory works. As far as I've seen, it receives a delegate Directory and caches newly created segments. So, if MMapDirectory used to be the default: 1.- Does NRTCachingDirectory act as a sort of wrapper around MMapDirectory, caching the new segments? 2.- If I have

offsets issues with multiword synonyms since LUCENE_33

2012-08-14 Thread Marc Sturlese
Has anyone noticed this problem and solved it somehow (without using LUCENE_33 in solrconfig.xml)? https://issues.apache.org/jira/browse/LUCENE-3668 Thanks in advance

Re: offsets issues with multiword synonyms since LUCENE_33

2012-08-14 Thread Marc Sturlese
Well, an example would be: synonyms.txt: huge,big size Then I have the docs: 1- The huge fox attacks first 2- The big size fox attacks first Then if I query for huge, the highlights for each document are: 1- The <strong>huge</strong> <strong>fox</strong> attacks first 2- The <strong>big size</strong> fox

Re: Faceting on a date field multiple times

2012-05-04 Thread Marc Sturlese
http://lucene.472066.n3.nabble.com/Multiple-Facet-Dates-td495480.html

Re: how to ignore indexing of duplicated documents?

2012-03-12 Thread Marc Sturlese
http://wiki.apache.org/solr/Deduplication

Re: Shard timeouts on large (1B docs) Solr cluster

2012-02-03 Thread Marc Sturlese
timeAllowed can be used outside distributed search. It is used by the TimeLimitingCollector. When the search time reaches timeAllowed, it stops searching and returns the results it has found so far. This can be a problem when using incremental indexing. Lucene starts searching

Re: changing omitNorms on an already built index

2011-10-27 Thread Marc Sturlese
As far as I know there's no issue with this. You have to reindex and that's it. In which kind of field are you changing the norms? (You will only see changes in text fields.) Using debugQuery=true you can see how norms affect the score (in case you have not omitted them).

Re: Collection Distribution vs Replication in Solr

2011-10-27 Thread Marc Sturlese
Replication is easier to manage and a bit faster. See the performance numbers: http://wiki.apache.org/solr/SolrReplication

Adding a DocSet as a filter from a custom search component

2011-10-25 Thread Marc Sturlese
Hey there, I'm wondering if there's a cleaner way to do this: I've written a SearchComponent that runs as last-component. In the prepare method I build a DocSet (SortedIntDocSet) based on whether the FieldCache values of a given field satisfy some rules (if the rules are satisfied,

Re: Solr 3.3. Grouping vs DeDuplication and Deduplication Use Case

2011-08-30 Thread Marc Sturlese
Deduplication uses Lucene's indexWriter.updateDocument with the signature term. I don't think it's possible as a default feature to choose which document to index; the original should always be the last to be indexed. IndexWriter.updateDocument: Updates a document by first deleting the document(s)

Re: Boost documents based on the number of their fields

2011-08-19 Thread Marc Sturlese
You have different options here. You can give more boost at indexing time to the documents that have the fields you want set. For this to take effect you will have to reindex and set omitNorms=false on the fields you are going to search. This same concept can be applied to boost single fields

RE: embeded solrj doesn't refresh index

2011-07-22 Thread Marc Sturlese
Are you indexing with full import? If so, and the resulting index has a similar number of docs to the one you had before, try setting reopenReaders to false in solrconfig.xml. * You have to send the commit, of course.

problem with the new IndexSearcher when snpainstaller (and commit script) happen

2011-06-15 Thread Marc Sturlese
Hey there, I've noticed a very odd behaviour with the snapinstaller and commit (using collectionDistribution scripts). The first time I install a new index everything works fine. But when installing a new one, I can't see the new documents. Checking the status page of the core tells me that the

Re: problem with the new IndexSearcher when snpainstaller (and commit script) happen

2011-06-15 Thread Marc Sturlese
Tests are done on Solr 1.4. The simplest way to reproduce my problem is having 2 indexes and a Solr box with just one core. Both indexes must have been created with the same schema. 1- Remove the index dir of the core and start the server (core is up with an empty index) 2- check status page of the

Re: problem with the new IndexSearcher when snpainstaller (and commit script) happen

2011-06-15 Thread Marc Sturlese
I don't know if this could have something to do with the problem, but some of the files of the indexes have the same size and name (in all the indexes but not in the empty one). I have also realized that when moving back to the empty index and committing, numDocs and maxDocs change. Once I'm with the

Re: problem with the new IndexSearcher when snpainstaller (and commit script) happen

2011-06-15 Thread Marc Sturlese
I have some more info! I've built another index bigger than the others so the names of the files are not the same. This way, if I move from any of the other indexes to the bigger one or vice versa it works (I can see the changes in the version, numDocs and maxDocs)! So, I think it is related to the name

Re: problem with the new IndexSearcher when snpainstaller (and commit script) happen [SOLVED]

2011-06-15 Thread Marc Sturlese
I've found the problem in case someone is interested. It's because of the indexReader.reopen(). If it is enabled, when opening a new searcher due to the commit, this code is executed (in SolrCore.getSearcher(boolean forceNew, boolean returnSearcher, final Future[] waitSearcher)): ...

Re: Strange performance behaviour when concurrent requests are done

2011-04-29 Thread Marc Sturlese
Any suggestion about this issue?

Re: Strange performance behaviour when concurrent requests are done

2011-04-29 Thread Marc Sturlese
That's true, but the degradation is very big. If you launch concurrent requests against a web app that doesn't use Solr, the time per request won't degrade that much. To me it looks more like a synchronized block somewhere in Solr or Lucene is causing this.

Re: Need to create dyanamic indexies base on different document workspaces

2011-04-22 Thread Marc Sturlese
In case you need to create lots of indexes and register/unregister them fast, there is work under way: http://wiki.apache.org/solr/LotsOfCores

latest patches and big picture of search grouping

2011-01-17 Thread Marc Sturlese
I need to dive into search grouping / field collapsing again. I've seen there are lots of issues about it now. Can someone point me to the minimum patches I need to run this feature in trunk? I want to see the code of the most optimised version and what's being done in distributed search. I

Re: Adding new field after data is already indexed

2010-11-08 Thread Marc Sturlese
and i index data on the basis of these fields. Now, in case i need to add a new field, is there a way i can add the field without corrupting the previous data? Is there any feature which adds a new field with a default value to the existing records? You just have to add the new field in the

Core status uptime and startTime

2010-11-03 Thread Marc Sturlese
As far as I know, in the core admin page you can find the last time an index was modified and committed by checking lastModified. But what do startTime and uptime mean? Thanks in advance

Re: Dynamically create new core

2010-11-02 Thread Marc Sturlese
To create the core, the folder with the confs must already exist and be placed in the proper location (inside the Solr home). Once you run the create core action, the core will be added to solr.xml and dynamically loaded.

Re: How do you programatically create new cores?

2010-10-17 Thread Marc Sturlese
You have to create the core's folder with its conf inside the Solr home. Once done you can call the create action of the admin handler: http://wiki.apache.org/solr/CoreAdmin#CREATE If you need to dynamically create, start and stop lots of cores there's this patch, but don't know about its
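A minimal sketch of building the CoreAdmin CREATE request mentioned above, using only the JDK. The base URL and core name are placeholder assumptions, the helper class is hypothetical, and only the action, name and instanceDir parameters are set; the core's folder and conf are assumed to exist already inside the Solr home.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: it only assembles the request URL.
final class CoreAdminUrl {

    /** Builds a CoreAdmin CREATE URL; baseUrl is e.g. http://localhost:8983/solr (assumed). */
    static String createCoreUrl(String baseUrl, String coreName, String instanceDir) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", coreName);
        params.put("instanceDir", instanceDir);

        StringBuilder url = new StringBuilder(baseUrl).append("/admin/cores?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&'); // parameters are separated by '&'
            }
            url.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(createCoreUrl("http://localhost:8983/solr", "core1", "core1"));
    }
}
```

The resulting URL can then be requested with any HTTP client.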

Re: what differents between SolrCloud and Solr+Hadoop

2010-09-13 Thread Marc Sturlese
Well, these are pretty different things. SolrCloud is meant to handle distributed search in an easier way than raw Solr distributed search. You have to build the shards in your own way. Solr+Hadoop is a way to build these shards/indexes in parallel.

Re: Null pointer exception when mixing highlighter shards q.alt

2010-09-07 Thread Marc Sturlese
I noticed that long ago. I fixed it by doing this in HighlightComponent finishStage: @Override public void finishStage(ResponseBuilder rb) { boolean hasHighlighting = true ; if (rb.doHighlights && rb.stage == ResponseBuilder.STAGE_GET_FIELDS) { Map.Entry<String, Object>[] arr = new

Re: JVM GC is very frequent.

2010-08-26 Thread Marc Sturlese
http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/

FieldCache.DEFAULT.getInts vs FieldCache.DEFAULT.getStringIndex. Memory usage

2010-08-26 Thread Marc Sturlese
I need to load a FieldCache for a field which is a Solr integer type and has at most 3 digits. Let's say my index has 10M docs. I am wondering which is more optimal and less memory consuming: loading a FieldCache.DEFAULT.getInts or a FieldCache.DEFAULT.getStringIndex. The second one will have a
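A rough back-of-the-envelope sketch for the comparison asked about above. It assumes getInts yields one 4-byte int per document, and getStringIndex yields one 4-byte ordinal per document plus a lookup array of the distinct values. The 40 bytes per cached String is a guessed round figure, not a measurement, and the class and method names are hypothetical.

```java
// Hypothetical estimator; the numbers are assumptions, not measurements.
final class FieldCacheEstimate {

    static final long BYTES_PER_INT = 4;

    /** Estimated size of an int[maxDoc]: one 4-byte slot per document. */
    static long intsBytes(long maxDoc) {
        return maxDoc * BYTES_PER_INT;
    }

    /** Estimated size of an int[maxDoc] of ordinals plus a lookup of uniqueTerms strings. */
    static long stringIndexBytes(long maxDoc, long uniqueTerms, long assumedBytesPerTerm) {
        return maxDoc * BYTES_PER_INT + uniqueTerms * assumedBytesPerTerm;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L;  // 10M docs, as in the message
        long uniqueTerms = 1_000L;  // at most 3 digits, assuming non-negative values 0..999
        System.out.println("ints:        " + intsBytes(maxDoc) + " bytes");
        System.out.println("stringIndex: " + stringIndexBytes(maxDoc, uniqueTerms, 40) + " bytes");
    }
}
```

Under these assumptions both come to roughly 40 MB, and the string index only adds the small lookup table on top of the per-document array.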

Re: maxMergeDocs and performance tuning

2010-08-16 Thread Marc Sturlese
As far as I know, the higher you set the value, the faster the indexing process will be (because more things are kept in memory). But depending on your needs, it may not be the best option. If you set a high mergeFactor and you want to optimize the index once the process is done, this

ending an app taht uses EmbeddedSolrServer

2010-07-13 Thread Marc Sturlese
Hey there, I've done some tests with a custom Java app using EmbeddedSolrServer to create an index. It works ok and I am able to build the index, but I've noticed that after the commit and optimize are done, the app never terminates. How should I end it? Is there any way to tell the EmbeddedSolrServer

Re: ending a java app that uses EmbeddedSolrServer

2010-07-13 Thread Marc Sturlese
It seems that coreContainer.shutdown() solves the problem. Is anyone doing it in a different way?

Re: Recommended MySQL JDBC driver

2010-06-26 Thread Marc Sturlese
I suppose you use batchSize=-1 to index that amount of data. From connector version 5.1.7 onwards there's this param: netTimeoutForStreamingResults The default value is 600. Increasing it may help (2400 for example?)

Re: performance sorting multivalued field

2010-06-25 Thread Marc Sturlese
*There are lots of docs with the same value, I mention that because I suppose that the same value has nothing to do with the number of un-inverted term instances. It does have to do with it: I've been able to reproduce the error by setting different values for each field: HTTP Status 500 - there are more terms

Re: anyone use hadoop+solr?

2010-06-24 Thread Marc Sturlese
Hi Otis, just out of curiosity, which strategy do you use? Indexing on the map or the reduce side? Do you use it to build shards or a single monolithic index? Thanks

Re: performance sorting multivalued field

2010-06-24 Thread Marc Sturlese
Thanks, that's very useful info. However, I can't reproduce the error. I've created an index where all documents have a multivalued date field and each document has a minimum of one value in that field (most of the docs have 2 or 3). So, the number of un-inverted term instances is greater than

Re: performance sorting multivalued field

2010-06-22 Thread Marc Sturlese
Well, sorting requires that all the unique values in the target field get loaded into memory That's what I thought, thanks. But a larger question is whether what you're doing is worthwhile even as just a measurement. You say This is good for me, I don't care for my tests. I claim that you do care I

Re: anyone use hadoop+solr?

2010-06-22 Thread Marc Sturlese
I think there are people using this patch in production: https://issues.apache.org/jira/browse/SOLR-1301 I have tested it myself indexing data from CSV and from HBase and it works properly

Re: solr with hadoop

2010-06-22 Thread Marc Sturlese
I think a good solution could be to use hadoop with SOLR-1301 to build solr shards and then use solr distributed search against these shards (you will have to copy to local from HDFS to search against them)

Re: anyone use hadoop+solr?

2010-06-22 Thread Marc Sturlese
Well, the patch consumes the data from a CSV. You have to modify the input to use TableInputFormat (I don't remember if it's called exactly like that) and it will work. Once you've done that, you have to specify as many reducers as shards you want. I know 2 ways to index using hadoop method 1

Re: Can query boosting be used with a custom request handlers?

2010-06-21 Thread Marc Sturlese
Maybe this helps: http://wiki.apache.org/solr/SolrPlugins#QParserPlugin

Re: performance sorting multivalued field

2010-06-19 Thread Marc Sturlese
Hey Erik, I am currently sorting by a multiValued field. A side effect appears to be that you may not know which of the values of the multiValued field puts the document in that position. This is good for me, I don't care for my tests. What I need to know is if there is any performance issue in all of this.

performance sorting multivalued field

2010-06-18 Thread Marc Sturlese
Hey there! Can someone explain to me the impact of having multivalued fields when sorting? I have read in other threads how it affects faceting but couldn't find any info on the impact when sorting. Thanks in advance

Re: performance sorting multivalued field

2010-06-18 Thread Marc Sturlese
I mean sorting the query results, not facets. I am asking because I have added a multivalued field that has at most 10 values. But 70% of the docs have just 1 or 2 values in this multiValued field. I am not doing faceting. Since I have added the multiValued field, java old gen seems to get full

Re: how to test solr's performance?

2010-06-10 Thread Marc Sturlese
I normally use jmeter, jconsole and iostat. Recently http://www.newrelic.com/solr.html has been released

Question about specifying the query analysis at query time

2010-05-31 Thread Marc Sturlese
Hey there, I am facing a problem related to query analysis and stopwords. I have some ideas on how to sort it out but would like to do it in the cleanest way possible. I am using dismax and I query 3 fields. These fields are defined as text this way: <fieldType name="text" class="solr.TextField"

Re: How well does Solr scale over large number of facet values?

2010-05-25 Thread Marc Sturlese
With the uninverted algorithm it will be very fast whatever the number of unique terms. But be careful with memory because it uses quite a lot. Using the older facet algorithm, if you have a lot of different terms it will be slow.

Re: How well does Solr scale over large number of facet values?

2010-05-25 Thread Marc Sturlese
Since Solr 1.4 I think the uninverted method is on by default. Anyway, you can choose which to use with the method param: facet.method=fc/enum (where fc is the uninverted one) http://wiki.apache.org/solr/SimpleFacetParameters
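A small sketch of choosing the facet algorithm per request with the facet.method parameter described above. The field name is a placeholder assumption, the helper is hypothetical, and it only builds the query-string fragment.

```java
// Hypothetical helper: builds the faceting part of a Solr query string.
final class FacetMethodParams {

    /** method must be "fc" (uninverted field) or "enum" (term enumeration). */
    static String facetParams(String field, String method) {
        if (!"fc".equals(method) && !"enum".equals(method)) {
            throw new IllegalArgumentException("facet.method must be fc or enum: " + method);
        }
        // The field name is assumed to need no URL encoding.
        return "facet=true&facet.field=" + field + "&facet.method=" + method;
    }

    public static void main(String[] args) {
        // "category" is an assumed field name used only for illustration.
        System.out.println(facetParams("category", "fc"));
        System.out.println(facetParams("category", "enum"));
    }
}
```

Appending the fragment to the same query with each method in turn is an easy way to compare response times and memory use on a given index.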

Re: Skipping duplicates in DataImportHandler based on uniqueKey

2010-05-03 Thread Marc Sturlese
You can use deduplication to do that. Create the signature based on the unique field or any field you want.

bug using distributed search, highlighting and q.alt

2010-04-15 Thread Marc Sturlese
I have noticed that when using q.alt, highlights are not returned even if hl=true. When using distributed search, q.alt and hl, HighlightComponent.java finishStage expects the highlighting NamedList of each shard (if hl=true) but it will never be returned. It will end up with a NullPointerException. I

Re: Omitting norms question

2010-03-19 Thread Marc Sturlese
Should I include not omit-norms on any fields that I would like to boost via a boost-query/function query? You don't have to set norms to use boost queries or functions. You only have to set them when you want to boost docs or fields at indexing time. What about sortable fields? Facetable fields?

excluder filters and multivalued fields

2010-03-18 Thread Marc Sturlese
I don't think there's a way to do what has come to my mind but I want to be sure. Let's say I have a doc with 2 fields, one of them multiValued: doc1: name-john year-2009;year-2010;year-2011 And I query for: q=john&fq=-year:2010 Doc1 won't be in the matching results. Is there a way to make it appear

What does means ~2, ~3, ~4 in DisjunctionMaxQuery?

2010-03-11 Thread Marc Sturlese
I am debugging a 2-word query built using dismax. So it's built from DisjunctionMaxQueries, with minShouldMatch 100% and tie breaker multiplier = 0.3 +((DisjunctionMaxQuery((content:john | title:john)~0.3) DisjunctionMaxQuery((content:malone | title:malone)~0.3))~2) And a 3 words one (with
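For reference, a small sketch of how a DisjunctionMaxQuery combines its sub-scores with the 0.3 tie breaker shown after each disjunction: the best sub-score plus the tie breaker times the sum of the remaining ones. The sub-score values below are made up for illustration and the class is hypothetical. In this debug output the trailing ~2 on the outer BooleanQuery is a different number: the minimum count of optional clauses that must match, which fits a 2-word query with minShouldMatch at 100%.

```java
// Hypothetical illustration of the dismax score combination; not Lucene code.
final class DisMaxScore {

    /** score = max(subScores) + tieBreaker * (sum(subScores) - max); sub-scores assumed non-negative. */
    static double score(double tieBreaker, double... subScores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : subScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Made-up scores for content:john and title:john.
        System.out.println(score(0.3, 2.0, 1.0));
        // With tie breaker 0 only the best field counts.
        System.out.println(score(0.0, 2.0, 1.0));
    }
}
```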

Best performance for facet dates in trunk using solr.TrieDateField

2010-03-03 Thread Marc Sturlese
Hey there, I am testing date facets in trunk with a huge index. Apparently, as the default schema.xml shows, the fastest way to run date facet queries is to index the field with this data type: <!-- A Trie based date field for faster date range queries and date faceting. --> <fieldType

Re: Formatting Results

2010-03-03 Thread Marc Sturlese
I'll give you an example of how to configure your default SearchHandler to do highlighting, but I strongly recommend that you check the wiki properly. Everything is really well explained there: http://wiki.apache.org/solr/HighlightingParameters <str name="hl">true</str> <str

Re: Error on startup

2010-03-03 Thread Marc Sturlese
If you shut down the server properly it's weird that you get an error when starting up again. How did you delete the index? I was experiencing something similar a long time ago because I was removing the content of the index folder but not the folder itself. The correct way to do it was to

Re: Can I used .XML files instead of .OSM files

2010-03-03 Thread Marc Sturlese
Are you sure you don't have a folder called exampledocs with xml files inside? These are the files to index as a first example: apache-solr-1.5-dev/example/exampledocs Check the /home/marc/Desktop/data/apache-solr-1.5-dev/example/solr/conf/schema.xml and solrconfig.xml and you will see how to

Re: Need suggestion regarding custom transformer

2010-03-03 Thread Marc Sturlese
I think you can handle that by writing a custom transformer. There's a good explanation in the wiki: http://wiki.apache.org/solr/DIHCustomTransformer KshamaPai wrote: Hi, Am new to solr. I am trying location aware search with spatial lucene in solr1.5 nightly build. My table in mysql has

Re: new/first searcher

2010-02-26 Thread Marc Sturlese
There's no problem with having the same warming in both cases. First queries are used to warm the index once you start the Solr instance. New queries warm the index once a commit is executed, for example. In first queries warming there is no previous IndexSearcher open. In new queries there

Re: Highest frequency

2010-02-26 Thread Marc Sturlese
As far as I know it's not supported by default. I think you should implement your custom Lucene Similarity class and plug it into Solr via solrconfig.xml pcmanprogrammeur wrote: Hello all (sorry if my english is bad, i'm french) ! I have a Solr Index with ads which contain a title and a

readOnly and concurrency performance problems

2010-02-20 Thread Marc Sturlese
Hey there, I am experiencing concurrent performance problems in trunk. Does it open readers in readOnly mode? Thanks in advance

Re: readOnly and concurrency performance problems

2010-02-20 Thread Marc Sturlese
or something? I am quite lost and surprised about the behaviour I have noticed... markrmiller wrote: Yeah it does - I take it you're not on windows? - Mark http://www.lucidimagination.com (mobile) On Feb 20, 2010, at 4:39 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I am

Why synchronized access to FieldValueCache in getUninvertedField.java

2010-02-20 Thread Marc Sturlese
I have noticed that in the class UninvertedField.java there is a synchronized access to the FieldValueCache. I would like to know why this access is synchronized. Could this end up in a loss of performance when there are concurrent search requests? I am doing as much research as I can as I have

Strange performance behaviour when concurrent requests are done

2010-02-19 Thread Marc Sturlese
Hey there, I have been doing some stress testing on a server with 2 physical CPUs (4 cores each). After some reading about GC performance tuning I have configured it this way: /usr/lib/jvm/java-6-sun/bin/java -server -Xms7000m -Xmx7000m -XX:ReservedCodeCacheSize=10m -XX:NewSize=1000m

scores are the same for many diferent documents

2010-02-17 Thread Marc Sturlese
Hey there, I see that when Solr gives me back the scores in the response, they are the same for many different documents. I have built a simple index for testing purposes containing just documents with one field indexed with the standard analyzer and containing pieces of text. I have done the same with a self

Re: weird behabiour when setting negative boost with bq using dismax

2010-02-04 Thread Marc Sturlese
? On Mon, Feb 1, 2010 at 8:04 AM, Marc Sturlese marc.sturl...@gmail.com wrote: I already asked about this long ago but the answer doesn't seem to work... I am trying to set a negative query boost to send the results that match field_a: 54 to a lower position. I have tried it in 2 different

Re: weird behabiour when setting negative boost with bq using dismax

2010-02-04 Thread Marc Sturlese
: bq=(*:* -field_a:54^1) I think what you want there is bq=(*:* -field_a:54)^1 ...you are boosting things that don't match field_a:54 Thanks Hoss. I've updated the Wiki, the content of the bq param was wrong:

Re: how to stress test solr

2010-02-03 Thread Marc Sturlese
I like to use JMeter with a large queries file. This way you can measure response times with lots of requests at the same time. Having JConsole opened at the same time you can check the memory status James liu-2 wrote: before stressing test, Should i close SolrCache? which tool u use?

weird behabiour when setting negative boost with bq using dismax

2010-02-01 Thread Marc Sturlese
I already asked about this long ago but the answer doesn't seem to work... I am trying to set a negative query boost to send the results that match field_a: 54 to a lower position. I have tried it in 2 different ways: bq=(*:* -field_a:54^1) bq=-field_a:54^1 None of them seem to work.

loading an updateProcessorChain with multicore in trunk

2010-01-29 Thread Marc Sturlese
I am testing trunk and have seen a different behaviour when loading updateProcessors, which I don't know if it's normal (at least with multicore). Before, I used to use an updateProcessorChain this way: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"> <lst name="defaults"> <str

Re: solr - katta integration

2010-01-28 Thread Marc Sturlese
have a look: http://issues.apache.org/jira/browse/SOLR-1395 V SudershanReddy wrote: Hi, Can we Integrate solr with katta? In order to overcome the limitations of Solr in distributed search, I need to integrate katta with solr, without loosing any features of Solr.

Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Marc Sturlese
In case you are going to use a core per user, take a look at this patch: http://wiki.apache.org/solr/LotsOfCores Trey-13 wrote: Hi Matt, In most cases you are going to be better off going with the userid method unless you have a very small number of users and a very large number of

Re: big index vs. lots of small ones

2010-01-20 Thread Marc Sturlese
Check out this patch, which solves the distributed IDF problem: https://issues.apache.org/jira/browse/SOLR-1632 I think it fixes what you are explaining. The price you pay is that there are 2 requests per shard. If I am not wrong the first is to get term frequencies and needed info and the second

Re: suggestions for DIH batchSize

2009-12-23 Thread Marc Sturlese
If you want to retrieve a huge volume of rows you will end up with an OutOfMemoryError due to the JDBC driver. Setting batchSize to -1 in your data-config.xml (which internally sets it to Integer.MIN_VALUE) makes the query execute in streaming mode, avoiding the memory error.
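A minimal sketch of the mapping described above, assuming the MySQL Connector/J convention that a fetch size of Integer.MIN_VALUE on a forward-only, read-only statement makes the driver stream rows one at a time. The class and method names are hypothetical; this is not DIH's own code.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper illustrating the batchSize=-1 convention.
final class StreamingFetchSize {

    /** Maps the configured batchSize to the JDBC fetch size: -1 becomes Integer.MIN_VALUE. */
    static int effectiveFetchSize(int batchSize) {
        return batchSize == -1 ? Integer.MIN_VALUE : batchSize;
    }

    /**
     * Creates a statement set up for streaming. Integer.MIN_VALUE is a MySQL-specific
     * signal; other drivers may reject a negative fetch size with an SQLException.
     */
    static Statement streamingStatement(Connection conn, int batchSize) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(effectiveFetchSize(batchSize));
        return stmt;
    }

    public static void main(String[] args) {
        System.out.println(effectiveFetchSize(-1));
        System.out.println(effectiveFetchSize(500));
    }
}
```

With streaming enabled the driver no longer buffers the whole result set in memory, which is what avoids the heap exhaustion on large tables.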

tire fields and sortMissingLast

2009-12-21 Thread Marc Sturlese
Should the sortMissingLast param work on trie fields?

UpdateRequestProcessor to avoid documents of being indexed

2009-12-10 Thread Marc Sturlese
Hey there, once a document has been created I need to be able to decide whether I want it to be indexed or not. I have thought of implementing an UpdateRequestProcessor to do that but don't know how to tell Solr in the processAdd method to skip the document. If I delete all the fields would it be skipped

Re: UpdateRequestProcessor to avoid documents of being indexed

2009-12-10 Thread Marc Sturlese
{ LOG.debug("Doc skipped!"); } } Thanks in advance Chris Male wrote: Hi, If your UpdateRequestProcessor does not forward the AddUpdateCommand onto the RunUpdateProcessor, I believe the document will not be indexed. Cheers On Thu, Dec 10, 2009 at 12:09 PM, Marc

Re: UpdateRequestProcessor to avoid documents of being indexed

2009-12-10 Thread Marc Sturlese
Yes, it did Cheers Chris Male wrote: Hi, Yeah that's what I was suggesting. Did that work? On Thu, Dec 10, 2009 at 12:24 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Do you mean something like?: @Override public void processAdd(AddUpdateCommand cmd) throws IOException
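The idea confirmed in this thread (a processor that does not forward the add command to the next processor, so the document never reaches the indexing step) can be sketched with a JDK-only model. The interface and classes below are hypothetical stand-ins, not Solr's UpdateRequestProcessor API, and documents are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a processor chain; not Solr's actual classes.
final class SkipChainSketch {

    interface Processor {
        void processAdd(String doc);
    }

    /** Stand-in for the last processor in the chain, which actually indexes. */
    static final class Indexer implements Processor {
        final List<String> indexed = new ArrayList<>();

        @Override
        public void processAdd(String doc) {
            indexed.add(doc);
        }
    }

    /** Forwards a document to the next processor only if the predicate accepts it. */
    static final class SkippingProcessor implements Processor {
        private final Processor next;
        private final Predicate<String> shouldIndex;

        SkippingProcessor(Processor next, Predicate<String> shouldIndex) {
            this.next = next;
            this.shouldIndex = shouldIndex;
        }

        @Override
        public void processAdd(String doc) {
            if (shouldIndex.test(doc)) {
                next.processAdd(doc); // forwarded: the document gets indexed
            }
            // Not forwarded: the document is silently dropped.
        }
    }

    /** Runs all docs through the chain and returns the ones that were indexed. */
    static List<String> run(List<String> docs, Predicate<String> shouldIndex) {
        Indexer indexer = new Indexer();
        Processor chain = new SkippingProcessor(indexer, shouldIndex);
        for (String doc : docs) {
            chain.processAdd(doc);
        }
        return indexer.indexed;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("doc-a", "skip-doc-b", "doc-c");
        System.out.println(run(docs, d -> !d.startsWith("skip-")));
    }
}
```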

About fsv (sort field falues)

2009-12-08 Thread Marc Sturlese
I am tracing QueryComponent.java and would like to know the purpose of the doFSV function. I don't understand what fsv are for. I have tried some queries with fsv=true and some extra info appears in the response: <lst name="sort_values"/> But I don't know what it is for and can't find much info out there. I

Re: Sanity check on numeric types and which of them to use

2009-12-05 Thread Marc Sturlese
And what about: <fieldtype name="sint" class="solr.SortableIntField" sortMissingLast="true"/> vs. <fieldtype name="bcdint" class="solr.BCDIntField" sortMissingLast="true"/> What is the difference between them? Is bcdint always better? Thanks in advance Yonik Seeley-2 wrote: On Fri, Dec 4, 2009 at

Re: solr+jetty logging to syslog?

2009-11-26 Thread Marc Sturlese
With 1.4 -Add log4j jars to Solr -Configure the SyslogAppender with something like: log4j.appender.solrLog=org.apache.log4j.net.SyslogAppender log4j.appender.solrLog.Facility=LOCAL0 log4j.appender.solrLog.SyslogHost=127.0.0.1 log4j.appender.solrLog.layout=org.apache.log4j.PatternLayout

error with multicore CREATE action

2009-11-23 Thread Marc Sturlese
Hey there, I am using Solr 1.4 out of the box and am trying to create a core at runtime using the CREATE action. I am getting this error when executing: http://localhost:8983/solr/admin/cores?action=CREATE&name=x&instanceDir=x&persist=true&config=solrconfig.xml&schema=schema.xml&dataDir=data

distributed facet dates

2009-11-10 Thread Marc Sturlese
Hey there, I am thinking of developing facet dates for distributed search but I don't know exactly where to start. I am familiar with the facet dates source code and I think that if I could understand how distributed facet queries work it shouldn't be that difficult. I have read

Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread Marc Sturlese
Are you using one single solr instance with multicore or multiple solr instances with one index each? Erik_l wrote: Hi, Currently we're running 10 Solr indexes inside a single Tomcat6 instance. In the near future we would like to add another 30-40 indexes to every Tomcat instance we

Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread Marc Sturlese
to hold you will suffer from slow response times. Erik_l wrote: We're not using multicore. Today, one Tomcat instance hosts a number of indexes in the form of 10 Solr indexes (10 individual war files). Marc Sturlese wrote: Are you using one single solr instance with multicore or multiple solr

SOLR-1395 integration with katta. Question about Katta's ranking among shards and IDF's

2009-10-09 Thread Marc Sturlese
Hey there, I am trying to set up the Katta integration plugin. I would like to know if Katta's ranking algorithm is used when searching among shards. If so, would that mean it solves the problem with IDFs in distributed Solr?

Re: Solr Trunk Heap Space Issues

2009-10-05 Thread Marc Sturlese
I think it doesn't make sense to enable warming if your Solr instance is just for indexing purposes (it is different if you use it for search as well). You could also comment out the caches in solrconfig.xml. Setting queryResultWindowSize and queryResultMaxDocsCached to zero could maybe help... (but if

DIH applying various transformers to a field

2009-09-08 Thread Marc Sturlese
Hey there, I am using DIH to import a db table and have written a custom transformer following the example: package foo; public class CustomTransformer1 { public Object transformRow(Map<String, Object> row) { String artist = (String) row.get("artist"); if

Best way to do a lucene matchAllDocs not using q.alt=*:*

2009-09-03 Thread Marc Sturlese
Hey there, I need a query to get the total number of documents in my index. I can get it using the DismaxRequestHandler with: q.alt=*:*&facet=false&hl=false&rows=0 I have noticed this query is very memory consuming. Is there a more optimized way in trunk to get the total number of documents of my

Optimizing a query to sort results alphabetically for a determinated field

2009-08-24 Thread Marc Sturlese
Hey there, I need to sort my query results alphabetically by a particular field called town. This field is analyzed with a KeywordAnalyzer and isn't multiValued. Note that some docs don't have this field. Doing just:

Re: Optimizing a query to sort results alphabetically for a determinated field

2009-08-24 Thread Marc Sturlese
. On Mon, Aug 24, 2009 at 11:58 AM, Marc Sturlese marc.sturl...@gmail.com wrote: Hey there, I need to sort my query results alphabetically by a particular field called town. This field is analyzed with a KeywordAnalyzer and isn't multiValued. Note that some docs don't have

Re: Optimizing a query to sort results alphabetically for a determinated field

2009-08-24 Thread Marc Sturlese
happens ;) On Mon, Aug 24, 2009 at 12:24 PM, Marc Sturlese marc.sturl...@gmail.com wrote: Yes, but I thought it was just for sortable fields: sint, sfloat, sdouble, slong. Can I apply sortMissingLast to text fields analyzed with KeywordAnalyzer? Constantijn Visinescu wrote: There's

Re: Remove data from index

2009-08-20 Thread Marc Sturlese
As far as I know you cannot do that with DIH. What size is your index? Probably the best you can do is index from scratch again with full-import. clico wrote: I hope it could be a solution. But I think I understood that you can use deletedPkQuery like this: select document_id from

Re: Is negative boost possible?

2009-08-19 Thread Marc Sturlese
: the only way to negative boost is to positively boost the inverse... : : (*:* -field1:value_to_penalize)^10 This will do the job as well, since bq supports pure negative queries (at least in trunk): bq=-field1:value_to_penalize^10
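The two query forms mentioned in this message can be produced with a couple of string helpers. This is just a sketch: the NegativeBoost class and its method names are hypothetical, and no escaping of query-syntax special characters is attempted, so it assumes plain field names and values such as the ones in the message.

```java
/** Hypothetical helpers that format the two boost expressions from the message. */
final class NegativeBoost {

    /** Positively boosts everything that does NOT match field:value. */
    static String inverseBoost(String field, String value, int boost) {
        return "(*:* -" + field + ":" + value + ")^" + boost;
    }

    /** The pure negative form of the same clause, as given for bq in the message. */
    static String pureNegative(String field, String value, int boost) {
        return "-" + field + ":" + value + "^" + boost;
    }

    public static void main(String[] args) {
        System.out.println(inverseBoost("field1", "value_to_penalize", 10));
        System.out.println("bq=" + pureNegative("field1", "value_to_penalize", 10));
    }
}
```

Either string would still need URL-encoding before being sent as a request parameter.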
