IndexSchema object

2010-02-16 Thread Gargate, Siddharth
How can we get an instance of the IndexSchema object in a Tokenizer subclass?

Query performance

2009-09-22 Thread Gargate, Siddharth
Hi all, Does the following query have any performance impact compared to the second query? +title:lucene +(title:lucene -name:sid) +(title:lucene -name:sid)

Does semi-colon still work as a special character for sorting?

2009-07-13 Thread Gargate, Siddharth
I read somewhere that it is deprecated

Optimize

2009-05-20 Thread Gargate, Siddharth
Hi all, I am not sure how to call optimize on the existing index. I tried the following URL: http://localhost:9090/solr/update?optimize=true With this request, the response took a long time, and the index folder size doubled. Then I requested the same URL again and index size

RE: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread Gargate, Siddharth
Hi all, I am also facing the same issue where autocommit blocks all other requests. I have around 100,000 documents with an average size of 100K each. It took more than 20 hours to index. I have currently set autocommit maxTime to 7 seconds and mergeFactor to 25. Do I need more configuration

RE: OutofMemory on Highlightling

2009-04-28 Thread Gargate, Siddharth
take 50K * 500 * 2 = 50 MB for 500 results). I would really appreciate some feedback on this issue... Thanks, Siddharth -Original Message- From: Gargate, Siddharth [mailto:sgarg...@ptc.com] Sent: Friday, April 24, 2009 10:46 AM To: solr-user@lucene.apache.org Subject: RE: OutofMemory
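
The arithmetic in the snippet above can be checked with a short sketch. The meaning of each factor is an assumption, since the message is truncated: 50K is read as 50,000 characters per result, 500 as the number of results, and 2 as bytes per character.

```python
# Rough check of the "50K * 500 * 2 = 50 MB" estimate quoted in the thread.
# Assumptions (not confirmed by the truncated message):
#   - 50K means 50,000 characters of text per result
#   - 2 is the number of bytes used per character in memory
CHARS_PER_RESULT = 50_000
RESULTS = 500
BYTES_PER_CHAR = 2


def estimate_bytes(chars_per_result, results, bytes_per_char):
    """Return the estimated number of bytes held for one page of results."""
    return chars_per_result * results * bytes_per_char


total = estimate_bytes(CHARS_PER_RESULT, RESULTS, BYTES_PER_CHAR)
print(f"{total} bytes = {total / 1_000_000:.0f} MB")
```

Under these assumptions the product matches the 50 MB figure in the message.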

RE: OutofMemory on Highlightling

2009-04-23 Thread Gargate, Siddharth
SetNonLazyFieldSelector(fields)); } Are we setting the fields as NonLazy even if lazy loading is enabled? Thanks, Siddharth -Original Message- From: Gargate, Siddharth [mailto:sgarg...@ptc.com] Sent: Wednesday, April 22, 2009 11:12 AM To: solr-user@lucene.apache.org Subject: RE: OutofMemory

RE: OutofMemory on Highlightling

2009-04-21 Thread Gargate, Siddharth
@lucene.apache.org Subject: Re: OutofMemory on Highlightling Gargate, Siddharth wrote: Anybody facing the same issue? Following is my configuration ... <field name="content" type="text" indexed="true" stored="false" multiValued="true"/> <field name="teaser" type="text" indexed="false" stored="true"/> <copyField source

RE: OutofMemory on Highlightling

2009-04-21 Thread Gargate, Siddharth
) -Original Message- From: Gargate, Siddharth [mailto:sgarg...@ptc.com] Sent: Wednesday, April 22, 2009 9:29 AM To: solr-user@lucene.apache.org Subject: RE: OutofMemory on Highlightling I tried disabling the documentCache but still the same issue. documentCache class=solr.LRUCache

RE: OutofMemory on Highlightling

2009-04-20 Thread Gargate, Siddharth
to just 20 I get OOME. -Original Message- From: Gargate, Siddharth [mailto:sgarg...@ptc.com] Sent: Friday, April 17, 2009 11:32 AM To: solr-user@lucene.apache.org Subject: RE: OutofMemory on Highlightling I tried hl.maxAnalyzedChars=500 but still the same issue. I get OOM for row size 20

RE: OutofMemory on Highlightling

2009-04-17 Thread Gargate, Siddharth
, Have you tried: http://wiki.apache.org/solr/HighlightingParameters#head-2ca22f63cb8d1b2ba3ff0cfc05e85b94898c59cf Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Gargate, Siddharth sgarg...@ptc.com To: solr-user@lucene.apache.org Sent

OutofMemory on Highlightling

2009-04-16 Thread Gargate, Siddharth
Hi, I am analyzing the memory usage for my Solr setup. I am testing with 500 text documents of 2 MB each. I have defined a field for displaying the teasers and am storing 1 MB of text in it. I am testing with just 128 MB maxHeap (I know I should be increasing it but just testing the

Memory usage

2009-04-14 Thread Gargate, Siddharth
Hi all, I am testing indexing with 2000 text documents of size 2 MB each. These documents contain words created with random characters. I observed that the Tomcat memory usage keeps increasing slowly. I tried removing all the cache configuration, but memory usage still increases. Once

maxBufferedDocs

2009-04-06 Thread Gargate, Siddharth
I see two entries of the maxBufferedDocs property in solrconfig.xml: one in the indexDefaults tag and the other in the mainIndex tag, commented as Deprecated. So is this property required, and does it get used? What if I remove the indexDefaults tag altogether? Thanks, Siddharth

Index time boost

2009-03-24 Thread Gargate, Siddharth
Hi all, Can we specify the index-time boost value for a particular field in schema.xml? Thanks, Siddharth

RE: Special character indexing

2009-03-20 Thread Gargate, Siddharth
(BoundedThreadPool.java:442) Thanks in advance for help. Siddharth -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Friday, March 20, 2009 10:35 AM To: solr-user@lucene.apache.org Subject: Re: Special character indexing On Fri, Mar 20, 2009 at 10:17 AM, Gargate

FW: Special character indexing

2009-03-20 Thread Gargate, Siddharth
: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Friday, March 20, 2009 3:58 PM To: solr-user@lucene.apache.org Subject: Re: Special character indexing On Fri, Mar 20, 2009 at 3:19 PM, Gargate, Siddharth sgarg...@ptc.com wrote: Hi Shalin, Thanks for the suggestion. I tried

RE: Special character indexing

2009-03-19 Thread Gargate, Siddharth
Subject: Re: Special character indexing Gargate, Siddharth wrote: Hi all, I am trying to index words containing special characters like 'Räikkönen'. Using EmbeddedSolrServer indexing is working fine, but if I use CommonsHttpSolrServer then it is indexing garbage values. I am using Solr 1.4 and set

Special character indexing

2009-03-18 Thread Gargate, Siddharth
Hi all, I am trying to index words containing special characters like 'Räikkönen'. Using EmbeddedSolrServer indexing is working fine, but if I use CommonsHttpSolrServer then it is indexing garbage values. I am using Solr 1.4 and set URIEncoding as UTF-8 in Tomcat. Is this a known issue or am I
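
One common way such garbage values arise is a charset mismatch between client and server. The sketch below only illustrates that effect; it assumes the text is sent as UTF-8 and decoded as ISO-8859-1 somewhere along the way, which the truncated message does not confirm.

```python
# Illustration of a UTF-8 / ISO-8859-1 mismatch (an assumed cause, not confirmed by the thread).
word = "R\u00e4ikk\u00f6nen"  # 'Räikkönen'

# The client encodes the text as UTF-8 ...
utf8_bytes = word.encode("utf-8")

# ... and the receiving side wrongly decodes the bytes as ISO-8859-1.
garbled = utf8_bytes.decode("iso-8859-1")

# Reversing the wrong decoding step recovers the original text.
restored = garbled.encode("iso-8859-1").decode("utf-8")

print(garbled)
print(restored)
```

Each two-byte UTF-8 sequence becomes two separate characters after the wrong decoding, which is why the indexed value no longer matches the original word.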

Phrase slop / Proximity search

2009-03-16 Thread Gargate, Siddharth
Can I set the phrase slop value for the standard request handler? I want it to be configurable in the solrconfig.xml file. Thanks, Siddharth

RE: Distributed search

2009-03-09 Thread Gargate, Siddharth
Hi, I am trying distributed search and multicore but am not able to fire a query. I tried http://localhost:8080/solr/select/?shards=localhost:8080/solr/core0,localhost:8080/solr/core1&q=solr I am getting the following error: Missing solr core name in path. Should I use particular core to fire
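
The query string in the message above can be assembled programmatically so that the shards and q parameters are always separated correctly. This is a minimal sketch; the helper name build_distributed_query is hypothetical, and addressing the request to one core (core0 here) is an assumption about how the "Missing solr core name in path" error might be avoided, not something the thread confirms.

```python
from urllib.parse import urlencode


def build_distributed_query(host, core, shards, query):
    """Build a select URL that names a core in the path and lists the shards to search.

    Hypothetical helper: the core-in-path layout is an assumption, not taken from the thread.
    """
    # Keep ':' '/' and ',' unescaped so the shard list stays readable.
    params = urlencode({"shards": ",".join(shards), "q": query}, safe=":/,")
    return f"http://{host}/solr/{core}/select?{params}"


url = build_distributed_query(
    "localhost:8080",
    "core0",
    ["localhost:8080/solr/core0", "localhost:8080/solr/core1"],
    "solr",
)
print(url)
```

Building the parameters with urlencode avoids hand-written separators, so a missing '&' between shards and q cannot occur.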

multicore file path

2009-03-09 Thread Gargate, Siddharth
I am trying out a multicore environment with a single schema and solrconfig file. Below is the folder structure: Solr/ conf/ schema.xml solrconfig.xml core0/ data/ core1/ data/ tomcat/ The solrhome property is set in Tomcat as -Dsolr.solr.home=../.. And the

Ignoring Whitespace

2009-03-06 Thread Gargate, Siddharth
After parsing HTML documents, Tika adds whitespace (newlines and tabs) and this content gets stored as is in Solr. If I fetch the teasers, they contain this additional whitespace. How do I remove it? In Tika, in Solr, or explicitly in my own code? Thanks, Siddharth
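
For the third option mentioned above (removing the whitespace in one's own code before the text is sent for indexing), a minimal sketch could look like this. The function name collapse_whitespace is hypothetical, and the sketch says nothing about how Tika or Solr could be configured to do the same.

```python
import re


def collapse_whitespace(text):
    """Replace every run of whitespace (spaces, tabs, newlines) with one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


extracted = "Title\n\n\t  First paragraph\n\t\tsecond line \n"
print(collapse_whitespace(extracted))
```

Applying this to the extracted text before indexing keeps the stored teaser free of the extra newlines and tabs.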

RE: Outofmemory error for large files

2009-02-16 Thread Gargate, Siddharth
is technically correct. Have you tried doing this and have you then tried your searches? Everything should still work, even if you index one document at a time. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch From: Gargate, Siddharth sgarg

Store limited text

2009-01-27 Thread Gargate, Siddharth
Hi All, Is it possible to store only limited text in a field, say, max 1 MB? The maxFieldLength setting limits only the number of tokens to be indexed, but the complete content is still stored. Thanks, Siddharth
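
One client-side workaround, sketched here as an assumption rather than a Solr feature, is to truncate the text before it is sent for the stored field. The helper name truncate_utf8 and the 1 MB default are illustrative only.

```python
def truncate_utf8(text, max_bytes=1024 * 1024):
    """Return text cut so that its UTF-8 encoding is at most max_bytes long.

    Hypothetical client-side helper; it does not rely on any Solr setting.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character that was split by the cut.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


stored_value = truncate_utf8("x" * (2 * 1024 * 1024))
print(len(stored_value.encode("utf-8")))
```

The full text can still be sent to the indexed field, while only the truncated value goes to the stored one.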

Maximum size of document indexed

2009-01-23 Thread Gargate, Siddharth
Hi, I am trying to index a 25 MB Word document. I am not able to search all the keywords. It looks like only a certain number of initial words are getting indexed. Is there any limit to the size of a document getting indexed? Or is there any word count limit per field? Thanks, Siddharth

CommonsHttpSolrServer in multithreaded env

2009-01-13 Thread Gargate, Siddharth
Hi all, Is it safe to use a single instance of the CommonsHttpSolrServer object in a multithreaded environment? I have multiple threads that access a single static CommonsHttpSolrServer object, but sometimes the application gets blocked. Following is the stacktrace printed for all threads

Ensuring documents indexed by autocommit

2009-01-09 Thread Gargate, Siddharth
Hi all, I am using CommonsHttpSolrServer to add documents to Solr. Instead of explicitly calling commit for every document I have configured autocommit in solrconfig.xml. But how do we ensure that the document added is successfully indexed/committed on the Solr side? Is there any callback

RE: Ensuring documents indexed by autocommit

2009-01-09 Thread Gargate, Siddharth
, Siddharth -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Friday, January 09, 2009 3:16 PM To: solr-user@lucene.apache.org Subject: Re: Ensuring documents indexed by autocommit On Fri, Jan 9, 2009 at 3:03 PM, Gargate, Siddharth sgarg...@ptc.com wrote

RE: Ensuring documents indexed by autocommit

2009-01-09 Thread Gargate, Siddharth
by autocommit On Fri, Jan 9, 2009 at 4:20 PM, Gargate, Siddharth sgarg...@ptc.com wrote: Thanks Shalin for the reply. I am working with the remote Solr server. I am using autocommit instead of commit method call because I observed significant performance improvement with autocommit. Just wanted

RE: Ensuring documents indexed by autocommit

2009-01-09 Thread Gargate, Siddharth
Sorry for the previous question. What I meant was whether we can set the configuration from the code. But what you were suggesting is that I should call commit only after some time or after a certain number of documents, right? -Original Message- From: Gargate, Siddharth [mailto:sgarg

RE: Ensuring documents indexed by autocommit

2009-01-09 Thread Gargate, Siddharth
Thanks again for your inputs. But I am still stuck on the question of how we ensure that a document is successfully indexed. One option I see is to search for every document sent to Solr. Or do we assume that autocommit always indexes all the documents successfully? Thanks, Siddharth