Re: Exception on the use of dataimport.jar in Full Import Example

2008-05-21 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Wed, May 21, 2008 at 6:27 AM, Julio Castillo [EMAIL PROTECTED] wrote: I wanted to learn how to index data that I have on my dB. I followed the instructions on the wiki page for the Data Import Handler (Full Import Example -example-solr-home.jar). I got an exception running it as is (see

Fetching the first 10 results and the last result

2008-05-21 Thread Tim Mahy
Hi all, is there a way to have Solr return not only the total number of found articles, but also the data of the last document when, for example, only the first 10 documents are requested? We could do this with a separate query by either letting the second query fetch 1 row from position =

Re: What are stopwords and protwords ???

2008-05-21 Thread Grant Ingersoll
Stopwords are commonly occurring words that don't add _much_ value to search, such as the, an, a and are usually removed during analysis. Protwords (protected words) are words that would be stemmed by the English porter stemmer that you do not want to be stemmed. In the end, removing
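The distinction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Solr's analysis chain: `toy_stem` is a made-up suffix stripper standing in for the Porter stemmer, and the word lists are hypothetical.

```python
def toy_stem(word):
    """Crude suffix stripper; a stand-in for a real stemmer, NOT the Porter algorithm."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text, stopwords, protwords):
    """Lowercase and split the text, drop stopwords, stem everything not protected."""
    tokens = []
    for token in text.lower().split():
        if token in stopwords:
            continue  # stopwords are removed entirely
        # protected words bypass the stemmer and are kept as typed
        tokens.append(token if token in protwords else toy_stem(token))
    return tokens


STOPWORDS = {"the", "an", "a"}   # hypothetical stopword list
PROTWORDS = {"news"}             # hypothetical protected-word list

print(analyze("The news about a running index", STOPWORDS, PROTWORDS))
```

Without the protected-word entry, this toy stemmer would reduce "news" to "new", which is the kind of unwanted stemming a protwords list is meant to prevent.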

SOLR OOM (out of memory) problem

2008-05-21 Thread gurudev
Hi, we currently host an index of approximately 12GB on 5 SOLR slave machines, which are load balanced in a cluster. At some point, after 8-10 hours, one of the SOLR slaves gives an out-of-memory error, after which it just stops responding, which then requires restart and after restart

Re: What are stopwords and protwords ???

2008-05-21 Thread Akeel
Thank you very much for such a detailed reply. Can you please tell me how I can interact with Solr from within my Java/JSP application? I mean how to query the Solr instance running at localhost and get results back in the application. Do I have to change something in solrconfig.xml? Please

Re: SOLR OOM (out of memory) problem

2008-05-21 Thread gurudev
Just to add more: The JVM heap allocated is 6GB with initial heap size as 2GB. We use quadro(which is 8 cpus) on linux servers for SOLR slaves. We use facet searches, sorting. document cache is set to 7 million (which is total documents in index) filtercache 1 gurudev wrote: Hi We

Re: Release date of SOLR 1.3

2008-05-21 Thread Dan Thomas
On Mon, May 19, 2008 at 2:49 PM, Chris Hostetter [EMAIL PROTECTED] wrote: : solr release in some time, would it be worth looking at what outstanding : issues are critical for 1.3 and perhaps pushing some over to 1.4, and : trying to do a release soon? That's what is typically done when the

Re: Release date of SOLR 1.3

2008-05-21 Thread Alexander Ramos Jardim
It is difficult to say such a thing when we consider that Solr is developed by volunteers who dedicate their free time, or time from a work project, to Solr. I think that Solr development is giving us outstanding results. 2008/5/21 Dan Thomas [EMAIL PROTECTED]: On Mon, May 19,

Re: Release date of SOLR 1.3

2008-05-21 Thread Andrew Savory
Hi, 2008/5/21 Dan Thomas [EMAIL PROTECTED]: One year between releases is a very long time for such a useful and dynamic system. Are project leaders willing to (re)consider the development process to prioritize improvements/features scope into chunks that can be accomplished in shorter time

Solr Text Vs String

2008-05-21 Thread Yerraguntla
Hi, I have incoming field stored both as Text and String field in solr indexed data. When I search the following cases, string field returns documents(from Solr client) and not text fields. NAME:T - no results Name_Str:T - returns documents Similarly for the following cases - CPN*, DPS*, S,

Re[2]: the time factor

2008-05-21 Thread JLIST
Hello Chris, it sounds like you only attempted tweaking the boost value, and not tweaking the function params ... you can change the curve so that really new things get a large score increase, but older things get less of an increase. recip(rord(creationDate),1,a,b)^w I was tweaking the
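To see how the function parameters change the curve, the expression can be evaluated outside Solr. The sketch below assumes Solr's `recip(x,m,a,b)` is defined as `a / (m*x + b)` and that `rord` yields 1 for the newest document and larger values for older ones; the boost `w` is treated here as a plain multiplier, which is a simplification of how Lucene applies it.

```python
def recip(x, m, a, b):
    """Reciprocal function: a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(rord, a, b, w=1.0):
    """Value of recip(rord(creationDate),1,a,b)^w with w treated as a simple multiplier."""
    return w * recip(rord, 1, a, b)


# Compare two hypothetical parameter choices across document ages (rord values).
for rord in (1, 10, 100, 1000, 10000):
    flat = recency_boost(rord, a=1000, b=1000)  # declines slowly
    steep = recency_boost(rord, a=10, b=10)     # favours only the newest documents
    print(f"rord={rord:>5}  a=b=1000: {flat:.4f}  a=b=10: {steep:.4f}")
```

Smaller values of `a` and `b` make the score fall off quickly after the newest documents, while larger values flatten the curve.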

Re: Release date of SOLR 1.3

2008-05-21 Thread Umar Shah
On Wed, May 21, 2008 at 7:40 PM, Andrew Savory [EMAIL PROTECTED] wrote: Hi, 2008/5/21 Dan Thomas [EMAIL PROTECTED]: One year between releases is a very long time for such a useful and dynamic system. Are project leaders willing to (re)consider the development process to prioritize

Re: expression in an fq parameter fails

2008-05-21 Thread Daniel Papasian
Ezra Epstein wrote: <str name="fq">storeAvailableDate:[* TO NOW]</str> <str name="fq">storeExpirationDate:[NOW TO *]</str> ... This works perfectly. Only trouble is that the two data fields may actually be empty, in which case this filters out such records and we want to include them. I

RE: expression in an fq parameter fails

2008-05-21 Thread Ezra Epstein
As a work-around that'd work. It means either changing the contents of the data sets or changing the schema and how data are fed to SOLR/Lucene. I'm hoping to be able to put an expression in the fq param instead, if that's supported. -Original Message- From: Daniel Papasian

RE: Exception on the use of dataimport.jar in Full Import Example

2008-05-21 Thread Julio Castillo
Noble Paul, I took a look at the jar files included in the nightly builds and they do not include the dataimport.jar content. So, I assume then that my best approach is to download the corresponding dataimport sources used and build my own dataimport.jar? Thanks ** julio -Original

RE: Exception on the use of dataimport.jar in Full Import Example

2008-05-21 Thread Julio Castillo
OK, I just downloaded the source tree and discovered that the sources for the dataimport handler are not there. I guess I have to download the SOLR-469-contrib.patch I suppose that later the source tree will have a contrib directory formally and not as a patch? Thanks ** julio -Original

RE: Exception on the use of dataimport.jar in Full Import Example

2008-05-21 Thread Julio Castillo
You have to excuse me here, but I can't find the contrib sources. I have nothing to apply the patch to. I used the following URL to get the SVN sources (per the website): http://svn.apache.org/repos/asf/lucene/solr/. Sorry, I'm a newbie with Solr, but intend to use it to index my data on the

Re: Exception on the use of dataimport.jar in Full Import Example

2008-05-21 Thread Shalin Shekhar Mangar
Hi Julio, Please download the SOLR-469.patch (not the contrib patch) from the SOLR-469 jira issue and apply it to the latest trunk code. I apologize for not keeping the example in the wiki in sync with the latest code. Please let us know here if you face a problem. On Wed, May 21, 2008 at 10:46

Re: What are stopwords and protwords ???

2008-05-21 Thread Shalin Shekhar Mangar
Hi Akeel, Take a look at SolrJ which is a Java client library for Solr. It is packaged with the Solr nightly binary downloads. This can be used by your Java/JSP application to add documents or query Solr. No changes to any config files is needed. On Wed, May 21, 2008 at 5:15 PM, Akeel [EMAIL

Re: What are stopwords and protwords ???

2008-05-21 Thread Shalin Shekhar Mangar
Here's the link to wiki documentation on SolrJ http://wiki.apache.org/solr/Solrj On Wed, May 21, 2008 at 11:09 PM, Shalin Shekhar Mangar [EMAIL PROTECTED] wrote: Hi Akeel, Take a look at SolrJ which is a Java client library for Solr. It is packaged with the Solr nightly binary downloads.

RE: expression in an fq parameter fails

2008-05-21 Thread Chris Hostetter
: I'm hoping to be able to put an expression in the fq param instead, if : that's supported. you have to invert your logic. docs that have not yet expired, or will never expire match the negated query for docs expired in the past... fq = -storeExpirationDate:[* TO NOW] -Hoss
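As a rough sketch, the negated filter can be sent as an ordinary `fq` request parameter. The host, port, and `/select` path below are assumptions about a default local install, and `build_select_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance; adjust to your deployment.
BASE_URL = "http://localhost:8983/solr/select"


def build_select_url(q, filters):
    """Build a select URL with one fq parameter per filter query."""
    params = [("q", q)] + [("fq", f) for f in filters]
    return BASE_URL + "?" + urlencode(params)


# Matches documents with no expiration date as well as those expiring in the future,
# because it excludes only documents whose expiration date lies in the past.
url = build_select_url("*:*", ["-storeExpirationDate:[* TO NOW]"])
print(url)
```

`urlencode` takes care of escaping the brackets, colon, and spaces, so the filter can be written exactly as it appears in the thread.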

Re: SOLR OOM (out of memory) problem

2008-05-21 Thread solr
But that means that it can't fit all documents in the cache, doesn't it? The index is 12GB and your allocated heap is 6GB... 12GB > 6GB... /Jimi Quoting gurudev [EMAIL PROTECTED]: Just to add more: The JVM heap allocated is 6GB with initial heap size as 2GB. We use quadro(which is 8 cpus)

Re: Fetching the first 10 results and the last result

2008-05-21 Thread Mike Klaas
On 21-May-08, at 2:35 AM, Tim Mahy wrote: Hi all, is there a way to have Solr return not only the total number of found articles, but also the data of the last document when, for example, only the first 10 documents are requested? We could do this with a separate query by either letting the

Re: Release date of SOLR 1.3

2008-05-21 Thread Chris Hostetter
: One year between releases is a very long time for such a useful and : dynamic system. Are project leaders willing to (re)consider the : development process to prioritize improvements/features scope into : chunks that can be accomplished in shorter time frames - say 90 days? : In my experience,

RE: SOLR OOM (out of memory) problem

2008-05-21 Thread Yongjun Rong
I had the same problem some weeks ago. You can try these: 1. Check the cache hit ratio via solr/admin/stats.jsp. If the hit ratio is very low, just disable those caches; it will save you some memory. 2. Setting -Xms and -Xmx to the same size will help improve GC performance. 3. Check

Re: Problem getting spelling suggestions to work

2008-05-21 Thread Chris Hostetter
: That's true, but that's not the problem. The problem is that you can't call : qt=spellchecker if you redefine /select in solrconfig.xml. I was wondering : how I could add qt functionality back. If you override /select to bind it to a specific handler, then you lose the ability to pick a handler

Re: How to limit number of pages per domain

2008-05-21 Thread Chris Hostetter
: : I'm indexing pages from multiple domains. In any given : result set, I don't want to return more than two links : from the same domain, so that the first few pages won't : be all from the same domain. I suppose I could get more : (say, 100) pages from solr, then sort in memory in the :
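The in-memory approach the original poster describes (fetch more results than needed, then trim) might look like the sketch below. It assumes each result is identified by its URL and that results arrive already ranked; `cap_per_domain` is a hypothetical helper, and this is client-side post-processing rather than anything Solr does itself.

```python
from urllib.parse import urlparse


def cap_per_domain(urls, max_per_domain=2):
    """Keep results in ranked order, allowing at most max_per_domain per host."""
    counts = {}
    kept = []
    for url in urls:
        domain = urlparse(url).netloc
        if counts.get(domain, 0) < max_per_domain:
            kept.append(url)
            counts[domain] = counts.get(domain, 0) + 1
    return kept


ranked = [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/1",
]
print(cap_per_domain(ranked))
```

One drawback of trimming on the client is that the number of surviving results per page is unpredictable, so paging needs extra care.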

Re: How to limit number of pages per domain

2008-05-21 Thread Jonathan Ariel
Sorry, but how does field collapsing work? Is there documentation about this anywhere? Thanks! On Wed, May 21, 2008 at 7:02 PM, Chris Hostetter [EMAIL PROTECTED] wrote: : : I'm indexing pages from multiple domains. In any given : result set, I don't want to return more than two links : from the

Re: How to limit number of pages per domain

2008-05-21 Thread Koji Sekiguchi
There is documentation: http://wiki.apache.org/solr/FieldCollapsing Koji Jonathan Ariel wrote: Sorry, but how does field collapsing work? Is there documentation about this anywhere? Thanks!

Re: Delete by multiple query doesn't seem to work

2008-05-21 Thread Shalin Shekhar Mangar
Not sure, but try using: <delete><query>document_id:A-395 OR document_id:A-1949</query></delete> On Thu, May 22, 2008 at 7:46 AM, Tracy Flynn [EMAIL PROTECTED] wrote: I'm trying to exploit 'Delete by Query' with multiple IDs in the query. I'm using vanilla SOLR 1.2 My schema specifies.
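A delete message of this shape can be generated rather than typed by hand. The following is a minimal sketch using Python's standard library; `delete_by_ids` is a hypothetical helper, and it assumes the IDs contain no characters that need escaping in the query syntax.

```python
import xml.etree.ElementTree as ET


def delete_by_ids(field, ids):
    """Build a single delete-by-query message that ORs together several IDs."""
    delete = ET.Element("delete")
    query = ET.SubElement(delete, "query")
    query.text = " OR ".join(f"{field}:{doc_id}" for doc_id in ids)
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(delete, encoding="unicode")


message = delete_by_ids("document_id", ["A-395", "A-1949"])
print(message)
```

The resulting string would be posted to the update handler and followed by a commit; building it with an XML library keeps the markup well-formed if an ID ever contains `&` or `<`.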

Re: What are stopwords and protwords ???

2008-05-21 Thread Akeel
thanks everyone On Thu, May 22, 2008 at 7:18 AM, Grant Ingersoll [EMAIL PROTECTED] wrote: See http://lucene.apache.org/solr/tutorial.html. You can also see the wiki for a whole bunch of docs, including links to tutorials, etc. Also, just for future reference, please separate out questions

Use of entities in the DataImportHandler config file

2008-05-21 Thread Julio Castillo
I'm trying to configure a document config file using the example data-config.xml mentioned in the wiki. One question I have is when to nest the entity tags/nodes in the xml file? The proposed example has them nested as <document> <entity> <entity> </entity> <entity>

Re: How to limit number of pages per domain

2008-05-21 Thread Otis Gospodnetic
Actually, the best documentation is really the comments in the JIRA issue itself. Is there anyone actually using Solr with this patch? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Koji Sekiguchi [EMAIL PROTECTED] To:

Re: Use of entities in the DataImportHandler config file

2008-05-21 Thread Shalin Shekhar Mangar
Hi Julio, Entities are nested when they have parent-child relationships as in a SQL Join. For example, if your product has categories, you will create an entity for products and a child entity for categories. However, if your entities are totally independent of each other, then you can keep them

Re: Use of entities in the DataImportHandler config file

2008-05-21 Thread Noble Paul നോബിള്‍ नोब्ळ्
Julio, This is to convert the 1:n and m:n relationships in a DB to multivalued fields in solr. A single sql query ends up giving a 2D matrix where each cell holds one value. It would be harder to denormalize and extract the multivalued fields from a single result set. Check the architecture to
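The effect of nesting a child entity can be illustrated without DataImportHandler at all. The sketch below uses plain Python tuples in place of SQL result rows; the table contents, field names, and `to_documents` helper are all hypothetical, and this is not how DataImportHandler is implemented internally.

```python
def to_documents(products, product_categories):
    """Fold 1:n child rows into a multivalued field on each parent document."""
    categories_by_product = {}
    for product_id, category in product_categories:
        categories_by_product.setdefault(product_id, []).append(category)

    return [
        {
            "id": product_id,
            "name": name,
            # multivalued field: one entry per matching child row
            "category": categories_by_product.get(product_id, []),
        }
        for product_id, name in products
    ]


# Parent rows (one per document) and child rows (many per parent).
products = [(1, "Laptop"), (2, "Desk")]
product_categories = [(1, "electronics"), (1, "computers"), (2, "furniture")]

for doc in to_documents(products, product_categories):
    print(doc)
```

A single joined result set would instead repeat the product row once per category, which is why extracting the multivalued field from a flat join is harder than running a child query per parent.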