Julio,
This is for converting the 1:n and m:n relationships in a DB into
multivalued fields in Solr. A single SQL query ends up giving a 2D
matrix where each cell holds one value. It would be harder to
denormalize and extract the multivalued fields from a single result
set. Check the architecture to see
Hi Julio,
Entities are nested when they have parent-child relationships as in a
SQL Join. For example, if your product has categories, you will create
an entity for products and a child entity for categories. However, if
your entities are totally independent of each other, then you can keep
them a
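As a sketch of that nesting, assuming hypothetical products and product_categories tables (table, column, and field names here are made up for illustration), a data-config.xml could look like:

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="user" password="pass"/>
  <document>
    <!-- parent entity: one Solr document per product row -->
    <entity name="product" query="select id, name from products">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <!-- child entity: runs once per parent row; each returned row
           adds one value to the multivalued 'category' field -->
      <entity name="category"
              query="select category from product_categories where product_id='${product.id}'">
        <field column="category" name="category"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The child entity's query references the parent row via `${product.id}`, which is what turns the 1:n join into a multivalued field.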
Actually, the best documentation is really the comments in the JIRA issue
itself.
Is there anyone actually using Solr with this patch?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Koji Sekiguchi <[EMAIL PROTECTED]>
> To: solr-user@lucen
I'm trying to configure a document config file using the example
data-config.xml mentioned in the wiki.
One question I have is: when should the entity tags/nodes be nested in the XML file?
The proposed example has them nested as
Why didn't the example have a
thanks everyone
On Thu, May 22, 2008 at 7:18 AM, Grant Ingersoll <[EMAIL PROTECTED]>
wrote:
> See http://lucene.apache.org/solr/tutorial.html. You can also see the
> wiki for a whole bunch of docs, including links to tutorials, etc.
>
> Also, just for future reference, please separate out questi
Not sure, but try using:
document_id:"A-395" OR document_id:"A-1949"
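Assuming that query syntax works for your setup, the combined delete could be posted to the update handler roughly like this (host, port, and path are hypothetical — adjust to your install):

```shell
# Build a single delete-by-query payload covering several ids at once
QUERY='document_id:"A-395" OR document_id:"A-1949"'
PAYLOAD="<delete><query>${QUERY}</query></delete>"
echo "$PAYLOAD"

# Post it to the update handler, then commit (commented out: needs a running Solr):
# curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' --data-binary "$PAYLOAD"
# curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' --data-binary '<commit/>'
```

The deletes only become visible after the commit is sent.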
On Thu, May 22, 2008 at 7:46 AM, Tracy Flynn
<[EMAIL PROTECTED]> wrote:
>
> I'm trying to exploit 'Delete by Query' with multiple IDs in the query.
>
> I'm using vanilla SOLR 1.2
>
> My schema specifies:
>
> document_id
>
> My u
See http://lucene.apache.org/solr/tutorial.html. You can also see the
wiki for a whole bunch of docs, including links to tutorials, etc.
Also, just for future reference, please separate out questions so that
they can be addressed separately, and more easily found by others in
the future.
I'm trying to exploit 'Delete by Query' with multiple IDs in the query.
I'm using vanilla SOLR 1.2
My schema specifies:
document_id
My unique document ids are of the form 'A-xxx', 'T-xxx' and so on.
The following individual delete works:
curl http://work:8983/solr/update -H "Content-Type:
There is documentation:
http://wiki.apache.org/solr/FieldCollapsing
Koji
Jonathan Ariel wrote:
Sorry, but how does field collapsing work? Is there documentation about this
anywhere? Thanks!
Sorry, but how does field collapsing work? Is there documentation about this
anywhere? Thanks!
On Wed, May 21, 2008 at 7:02 PM, Chris Hostetter <[EMAIL PROTECTED]>
wrote:
> :
> : I'm indexing pages from multiple domains. In any given
> : result set, I don't want to return more than two links
> : from
:
: I'm indexing pages from multiple domains. In any given
: result set, I don't want to return more than two links
: from the same domain, so that the first few pages won't
: be all from the same domain. I suppose I could get more
: (say, 100) pages from solr, then sort in memory in the
: front-e
: That's true, but that's not the problem. The problem is that you can't call
: qt=spellchecker if you redefine /select in solrconfig.xml. I was wondering
: how I could add qt functionality back.
If you override "/select" to bind it to a specific handler, then you lose
the ability to pick a handle
Facet searches cache a filter per unique term for multivalued fields.
There are many ways to reduce memory consumption in these scenarios,
but it usually requires a case-by-case solution.
-Mike
On 21-May-08, at 12:08 PM, Lance Norskog wrote:
We have had major OOM problems doing facet searc
We have had major OOM problems doing facet searches. Having 20 searches at
once used up maybe 5G, and one faceting request would blow up at 12G. More
importantly, when a facet request throws an OOM it seems like the memory is
not released. When a normal search throws an OOM, the memory is released and
I had the same problem some weeks ago. You can try these:
1. Check the hit ratio for the caches via solr/admin/stats.jsp. If
the hit ratio is very low, just disable those caches. It will save you
some memory.
2. Setting -Xms and -Xmx to the same size will help improve GC performance.
3. Check wha
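Point 2 amounts to launching the JVM with matching heap flags; for the example Jetty start that would be something like (the 2g figure is illustrative, not a recommendation):

```shell
java -Xms2g -Xmx2g -jar start.jar
```

With -Xms equal to -Xmx the heap never has to grow, which avoids resize pauses.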
Hi,
Let's say I have an index with two fields: f1 and f2, and queries to both
are analyzed using WhiteSpaceTokenizerFactory and
WordDelimiterFilterFactory. I use dismax handler for queries and observed
the following anomaly.
Suppose I have a document with f1="american" and f2="idol". Then a s
On 21-May-08, at 4:46 AM, gurudev wrote:
Just to add more:
The JVM heap allocated is 6GB with initial heap size as 2GB. We use
quad-core (8 CPUs) Linux servers for the SOLR slaves.
We use facet searches, sorting.
document cache is set to 7 million (which is total documents in index)
filte
: One year between releases is a very long time for such a useful and
: dynamic system. Are project leaders willing to (re)consider the
: development process to prioritize improvements/features scope into
: chunks that can be accomplished in shorter time frames - say 90 days?
: In my experience,
On 21-May-08, at 2:35 AM, Tim Mahy wrote:
Hi all,
is there a way to let Solr not only return the total number of found
articles, but also the data of the last document when for example
only requesting the first 10 documents ?
we could do this with a separate query by either letting the se
But that means that it can't fit all documents in the cache, doesn't
it? The index is 12GB and your allocated heap is 6GB... 12GB > 6GB...
/Jimi
Quoting gurudev <[EMAIL PROTECTED]>:
Just to add more:
The JVM heap allocated is 6GB with initial heap size as 2GB. We use
quadro(which is 8 cpus
: I'm hoping to be able to put an expression in the fq param instead, if
: that's supported.
you have to invert your logic: docs that "have not yet expired, or will
never expire" match the negated query for "docs expired in the past"...
fq = -storeExpirationDate:[* TO NOW]
-Hoss
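In practice that inverted filter is simply passed as the fq parameter; a sketch (host and handler path are hypothetical):

```shell
# The negated range filter: matches docs whose storeExpirationDate is NOT
# in the past — which also keeps docs with no storeExpirationDate at all
FQ='-storeExpirationDate:[* TO NOW]'
echo "fq=${FQ}"

# e.g. with curl, letting -G/--data-urlencode handle the escaping
# (commented out: needs a running Solr):
# curl -G 'http://localhost:8983/solr/select' \
#   --data-urlencode 'q=*:*' --data-urlencode "fq=${FQ}"
```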
Here's the link to the wiki documentation on SolrJ:
http://wiki.apache.org/solr/Solrj
On Wed, May 21, 2008 at 11:09 PM, Shalin Shekhar Mangar
<[EMAIL PROTECTED]> wrote:
> Hi Akeel,
>
> Take a look at SolrJ which is a Java client library for Solr. It is
> packaged with the Solr nightly binary downloads
Hi Akeel,
Take a look at SolrJ which is a Java client library for Solr. It is
packaged with the Solr nightly binary downloads. This can be used by
your Java/JSP application to add documents or query Solr. No changes
to any config files are needed.
On Wed, May 21, 2008 at 5:15 PM, Akeel <[EMAIL PRO
Hi Julio,
Please download the SOLR-469.patch (not the contrib patch) from the
SOLR-469 jira issue and apply it to the latest trunk code. I apologize
for not keeping the example in the wiki in sync with the latest code.
Please let us know here if you face a problem.
On Wed, May 21, 2008 at 10:46 P
You have to excuse me here, but I can't find the contrib sources. I have
nothing to apply the patch to.
I used the following URL to get the SVN sources (per the website):
http://svn.apache.org/repos/asf/lucene/solr/.
Sorry, I'm a newbie with Solr, but intend to use it to index my data in the
DB
OK, I just downloaded the source tree and discovered that the sources for
the dataimport handler are not there.
I guess I have to download the SOLR-469-contrib.patch
I suppose that later the source tree will have a contrib directory formally
and not as a patch?
Thanks
** julio
-Original Me
Noble Paul,
I took a look at the jar files included in the nightly builds and they do
not include the dataimport.jar content. So, I assume then that my best
approach is to download the corresponding dataimport sources used and build
my own dataimport.jar?
Thanks
** julio
-Original Message---
Hi,
Does this happen while a new searcher is warming up by any chance?
Have you tried decreasing your document cache size? Try that...
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: gurudev <[EMAIL PROTECTED]>
> To: solr-user@lucene.apach
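For reference, the document cache is sized in solrconfig.xml; a smaller configuration might look like this (the sizes shown are illustrative only — tune them for your index):

```xml
<!-- solrconfig.xml: cap the number of cached stored documents -->
<documentCache
    class="solr.LRUCache"
    size="512"
    initialSize="512"
    autowarmCount="0"/>
```

A documentCache sized near the full document count of a large index is a common source of heap pressure.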
As a work-around that'd work. It means either changing the contents of
the data sets or changing the schema and how data are fed to
SOLR/Lucene.
I'm hoping to be able to put an expression in the fq param instead, if
that's supported.
-Original Message-
From: Daniel Papasian [mailto:[EMAI
Ezra Epstein wrote:
storeAvailableDate:[* TO NOW]
storeExpirationDate:[NOW TO *]
...
This works perfectly. The only trouble is that the two date fields may
actually be empty, in which case this filters out such records, and we
want to include them.
I think the easiest thing to do w
On Wed, May 21, 2008 at 7:40 PM, Andrew Savory <[EMAIL PROTECTED]> wrote:
> Hi,
>
> 2008/5/21 Dan Thomas <[EMAIL PROTECTED]>:
>
> > One year between releases is a very long time for such a useful and
> > dynamic system. Are project leaders willing to (re)consider the
> > development process to pr
Hello Chris,
> it sounds like you only attempted tweaking the boost value, and not
> tweaking the function params ... you can change the curve so that really
> new things get a large score increase, but older things get less of an
> increase.
recip(rord(creationDate),1,a,b)^w
I was tweaking the
Hi,
I have an incoming field stored both as a Text and a String field in the Solr
indexed data. In the following cases, the string field returns documents (from
the Solr client) and the text field does not.
NAME:T - no results
Name_Str:T - returns documents
Similarly for the following cases - CPN*, DPS*, S, I
Hi,
2008/5/21 Dan Thomas <[EMAIL PROTECTED]>:
> One year between releases is a very long time for such a useful and
> dynamic system. Are project leaders willing to (re)consider the
> development process to prioritize improvements/features scope into
> chunks that can be accomplished in shorter
It is difficult to say such a thing when we consider that Solr is developed
by volunteers who dedicate their free time, or time from a work project,
to Solr.
I think that Solr development is giving us outstanding results.
2008/5/21 Dan Thomas <[EMAIL PROTECTED]>:
> On Mon, May 19
On Mon, May 19, 2008 at 2:49 PM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
>
> : solr release in some time, would it be worth looking at what outstanding
> : issues are critical for 1.3 and perhaps pushing some over to 1.4, and
> : trying to do a release soon?
>
> That's what is typically done whe
Hi Akeel
-Stopwords are general words of a language which, as such, do not carry any
meaning in searches: a, an, the, where, who, am, etc. The analyzer in
Lucene ignores such words and does not index them. You can also specify your
own stopwords in stopwords.txt in SOLR
-Protwords are the words
Just to add more:
The JVM heap allocated is 6GB with initial heap size as 2GB. We use
quad-core (8 CPUs) Linux servers for the SOLR slaves.
We use facet searches, sorting.
document cache is set to 7 million (which is total documents in index)
filtercache 1
gurudev wrote:
>
> Hi
>
>
Thank you very much for such a detailed reply. Can you please tell me how
I can interact with Solr from within my Java/JSP application? I mean, how do I
query the Solr instance running at localhost and get results back in the
application? Do I have to change something in solrconfig.xml? Please
help
Hi
We currently host an index of approx 12GB on 5 SOLR slave machines, which
are load balanced in a cluster. At some point, after 8-10 hours, a SOLR
slave will give an Out of Memory error, after which it just stops
responding, which then requires a restart, and after restart it
Stopwords are commonly occurring words that don't add _much_ value to
search, such as the, an, a and are usually removed during analysis.
Protwords (protected words) are words that would be stemmed by the
English porter stemmer that you do not want to be stemmed.
In the end, removing stop
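Both lists are wired into the analyzer chain in schema.xml; a sketch (the factory names match the Solr 1.x example schema — check them against your version):

```xml
<!-- schema.xml fieldType sketch: stopwords removed at analysis time,
     protwords shielded from the English Porter stemmer -->
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  </analyzer>
</fieldType>
```

Both stopwords.txt and protwords.txt live in the Solr conf directory, one term per line.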
Hi all,
is there a way to let Solr not only return the total number of found articles,
but also the data of the last document when for example only requesting the
first 10 documents ?
we could do this with a separate query by either letting the second query fetch
1 row from position = previous
On Wed, May 21, 2008 at 6:27 AM, Julio Castillo <[EMAIL PROTECTED]> wrote:
> I wanted to learn how to index data that I have on my dB.
> I followed the instructions on the wiki page for the Data Import Handler
> (Full Import Example -example-solr-home.jar). I got an exception running it
> as is (se