Re: Greater-than and less-than in data import SQL queries

2009-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Mon, Nov 2, 2009 at 11:34 AM, Amit Nithian wrote: > A thought I had on this from a DIH design perspective. Would it be better to > have the SQL queries stored in an element rather than an attribute so that > you can wrap it in a CDATA block without having to mess up the look of query > with <,

Problems downloading lucene 2.9.1

2009-11-02 Thread Licinio Fernández Maurelo
Hi folks, as we are using a snapshot dependency on solr1.4, today we are getting problems when maven tries to download lucene 2.9.1 (there isn't any 2.9.1 there). Which repository can I use to download it? Thx -- Lici

RE: CPU utilization and query time high on Solr slave when snapshot install

2009-11-02 Thread biku...@sapient.com
Hi Solr Gurus, We have solr in 1 master, 2 slave configuration. Snapshot is created post commit, post optimization. We have autocommit after 50 documents or 5 minutes. Snapshot puller runs as a cron every 10 minutes. What we have observed is that whenever snapshot is installed on the slave, we

Re: Indexing multiple entities

2009-11-02 Thread Chantal Ackermann
I'm using a code generator for my entities, and I cannot modify the generation. I need to work out another option :( shouldn't code generators help development and not make it more complex and difficult? oO (sry off topic) chantal

Re: StreamingUpdateSolrServer - indexing process stops in a couple of hours

2009-11-02 Thread Shalin Shekhar Mangar
I'm able to reproduce this issue consistently using JDK 1.6.0_16 After an optimize is called, only one thread keeps adding documents and the rest wait on StreamingUpdateSolrServer line 196. On Sun, Oct 25, 2009 at 8:03 AM, Dadasheva, Olga wrote: > I am using java 1.6.0_05 > > To illustrate what

Lock problems: Lock obtain timed out

2009-11-02 Thread Jérôme Etévé
Hi, I've got a few machines that post documents concurrently to a solr instance. They do not issue the commit themselves; instead, I've got autocommit set up on the solr server side: 5 6 This usually works fine, but sometimes the server goes into a deadlock state. Here's

Re: Spell check suggestion and correct way of implementation and some Questions

2009-11-02 Thread Shalin Shekhar Mangar
On Wed, Oct 28, 2009 at 8:57 PM, darniz wrote: > > Question. Should I build the dictionary only once, and after that, as new > words are indexed, will the dictionary be updated? Or do I have to do that manually > at certain intervals? > > No. The dictionary is built only when spellcheck.build=true is spec
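
A minimal SolrJ sketch of the pattern Shalin describes: trigger one build with spellcheck.build=true, then query with spellcheck enabled (the "/spell" handler name is an assumption; buildOnCommit in solrconfig.xml is the usual way to keep the dictionary current automatically):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class SpellcheckBuildExample {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // One-off build: spellcheck.build=true (re)creates the dictionary.
            SolrQuery build = new SolrQuery("waranty");
            build.setQueryType("/spell");          // hypothetical spellcheck handler name
            build.set("spellcheck", true);
            build.set("spellcheck.build", true);
            server.query(build);

            // Later queries only need spellcheck=true; rebuild on whatever
            // schedule suits the rate at which new words are indexed.
            SolrQuery q = new SolrQuery("waranty");
            q.setQueryType("/spell");
            q.set("spellcheck", true);
            System.out.println(server.query(q));
        }
    }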

tracking solr response time

2009-11-02 Thread bharath venkatesh
Hi, We are using solr for many of our products and it is doing quite well. But since the number of hits is becoming high we are experiencing latency in certain requests; about 15% of our requests are suffering a latency. We are trying to identify the problem. It may be due to a network issue or sol

Re: Problems downloading lucene 2.9.1

2009-11-02 Thread Grant Ingersoll
On Nov 2, 2009, at 12:12 AM, Licinio Fernández Maurelo wrote: Hi folks, as we are using an snapshot dependecy to solr1.4, today we are getting problems when maven try to download lucene 2.9.1 (there isn't a any 2.9.1 there). Which repository can i use to download it? They won't be there

NullPointerException with TermVectorComponent

2009-11-02 Thread Andrew Clegg
Hi, I've recently added the TermVectorComponent as a separate handler, following the example in the supplied config file, i.e.: true tvComponent It works, but with one quirk. When you use tf.all=true, you

Re: tracking solr response time

2009-11-02 Thread Yonik Seeley
On Mon, Nov 2, 2009 at 8:13 AM, bharath venkatesh wrote: >    We are using solr for many of ur products  it is doing quite well > .  But since no of hits are becoming high we are experiencing latency > in certain requests ,about 15% of our requests are suffering a latency How much of a latency co

Re: tracking solr response time

2009-11-02 Thread Israel Ekpo
On Mon, Nov 2, 2009 at 8:41 AM, Yonik Seeley wrote: > On Mon, Nov 2, 2009 at 8:13 AM, bharath venkatesh > wrote: > >We are using solr for many of ur products it is doing quite well > > . But since no of hits are becoming high we are experiencing latency > > in certain requests ,about 15% of

Re: tracking solr response time

2009-11-02 Thread Grant Ingersoll
On Nov 2, 2009, at 5:41 AM, Yonik Seeley wrote: QTime is the time spent in generating the in-memory representation for the response before the response writer starts streaming it back in whatever format was requested. The stored fields of returned documents are also loaded at this point (to en
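
Since QTime stops before the response is written and streamed, comparing it against client-side wall-clock time is a quick way to see where latency is going. A small SolrJ sketch (URL and query are placeholders):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class QTimeVsWallClock {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(10);

            long start = System.currentTimeMillis();
            QueryResponse rsp = server.query(q);
            long wall = System.currentTimeMillis() - start;

            // QTime: server-side work before the response writer starts streaming.
            // wall - QTime: rough bound on response writing, network transfer and
            // client-side parsing for this request.
            System.out.println("QTime=" + rsp.getQTime() + "ms, wall=" + wall + "ms");
        }
    }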

Re: NullPointerException with TermVectorComponent

2009-11-02 Thread david.stu...@progressivealliance.co.uk
I think it might be to do with the library itself. I downloaded semanticvectors-1.22 and compiled it from source. Then I created a demo corpus using java org.apache.lucene.demo.IndexFiles against the lucene src directory. I then ran java pitt.search.semanticvectors.BuildIndex against the index and got

Re: Problems downloading lucene 2.9.1

2009-11-02 Thread Ryan McKinley
On Nov 2, 2009, at 8:29 AM, Grant Ingersoll wrote: On Nov 2, 2009, at 12:12 AM, Licinio Fernández Maurelo wrote: Hi folks, as we are using an snapshot dependecy to solr1.4, today we are getting problems when maven try to download lucene 2.9.1 (there isn't a any 2.9.1 there). Which rep

Re: adding and updating a lot of document to Solr, metadata extraction etc

2009-11-02 Thread Alexey Serba
Hi Eugene, > - ability to iterate over all documents, returned in search, as Lucene does >  provide within a HitCollector instance. We would need to extract and >  aggregate various fields, stored in index, to group results and aggregate > them >  in some way. > > Also I did not find any way

RE: Solr YUI autocomplete

2009-11-02 Thread Ankit Bhatnagar
Hey Amit, My index (i.e. Solr) was on a different domain, so I can't use XHR (XHR does not work for cross-domain, proxyless data fetches). I tried using YUI's DS_ScriptNode but it didn't work. I completed my task by using jQuery and it worked well with solr. -Ankit -Original Message- From:

question about collapse.type = adjacent

2009-11-02 Thread michael8
Hi, I would like to confirm if 'adjacent' in collapse.type means the documents (with the same collapse field value) are considered adjacent *after* the 'sort' param from the query has been applied, or *before*? I would think it would be *after* since collapse feature primarily is meant for prese

Re: tracking solr response time

2009-11-02 Thread bharath venkatesh
Thanks for the quick response @yonik >How much of a latency compared to normal, and what version of Solr are you using? Latency is usually around 2-4 secs (sometimes more than that), which happens for only 15-20% of the requests; the other 80-85% of requests are very fast, in milli s

Re: Solr YUI autocomplete

2009-11-02 Thread Eric Pugh
It does, have you looked at http://wiki.apache.org/solr/SolJSON?highlight=%28json%29#Using_Solr.27s_JSON_output_for_AJAX. Also, in my book on Solr, there is an example, but using the jquery autocomplete, which I think was answered earlier on the thread! Hope that helps. ANKITBHATNAGAR wrote:

Re: Solr Cell on web-based files?

2009-11-02 Thread Alexey Serba
> e.g (doesn't work) > curl http://localhost:8983/solr/update/extract?extractOnly=true > --data-binary @http://myweb.com/mylocalfile.htm -H "Content-type:text/html" > You might try remote streaming with Solr (see > http://wiki.apache.org/solr/SolrConfigXml). Yes, curl example curl 'http://local

RE: Solr YUI autocomplete

2009-11-02 Thread Ankit Bhatnagar
Hey Eric, That's correct; however, it didn't work with the YUI widget. I changed my approach to use jQuery for now. -Ankit -Original Message- From: Eric Pugh [mailto:ep...@opensourceconnections.com] Sent: Monday, November 02, 2009 10:20 AM To: solr-user@lucene.apache.org Subject: Re: Solr

storing other files in index directory

2009-11-02 Thread Paul Rosen
Are there any pitfalls to storing an arbitrary text file in the same directory as the solr index? We're slinging different versions of the index around while we're testing and it's hard to keep them straight. I'd like to put a readme.txt file in the directory that contains some history about

Re: tracking solr response time

2009-11-02 Thread Erick Erickson
Also, how about a sample of a fast and slow query? And is a slow query only slow the first time it's executed or every time? Best Erick On Mon, Nov 2, 2009 at 9:52 AM, bharath venkatesh < bharathv6.proj...@gmail.com> wrote: > Thanks for the quick response > @yonik > > >How much of a latency comp

tokenize after filters

2009-11-02 Thread Joe Calderon
is it possible to tokenize a field on whitespace after some filters have been applied? ex: "A + W Root Beer" the field uses a keyword tokenizer to keep the string together, then it will get converted to "aw root beer" by a custom filter I've made; I now want to split that up into 3 tokens (aw, roo

Re: Annotations and reference types

2009-11-02 Thread Shalin Shekhar Mangar
On Thu, Oct 29, 2009 at 7:57 PM, M. Tinnemeyer wrote: > Dear listusers, > > Is there a way to store an instance of class A (including the fields from > "myB") via solr using annotations ? > The index should look like : id; name; b_id; b_name > > -- > Class A { > > @Field > private String

Re: Question about DIH execution order

2009-11-02 Thread Bertie Shen
Hi Noble, I tried to understand your suggestions and tried different variations according to your reply. But none of them work. Can you explain it in more detail? Thanks a lot! BTW, do you mean your solution is as follows? But 1) T

Re: tracking solr response time

2009-11-02 Thread Israel Ekpo
On Mon, Nov 2, 2009 at 9:52 AM, bharath venkatesh < bharathv6.proj...@gmail.com> wrote: > Thanks for the quick response > @yonik > > >How much of a latency compared to normal, and what version of Solr are > you using? > > latency is usually around 2-4 secs (some times it goes more than that > ) w

Re: CPU utilization and query time high on Solr slave when snapshot install

2009-11-02 Thread Walter Underwood
If you are going to pull a new index every 10 minutes, try turning off cache autowarming. Your caches are never more than 10 minutes old, so spending a minute warming each new cache is a waste of CPU. Autowarm submits queries to the new Searcher before putting it in service. This will creat

Re: tracking solr response time

2009-11-02 Thread bharath venkatesh
@Israel: yes I got that point which yonik mentioned .. but is qtime the total time taken by the solr server for that request, or is it part of the time taken by solr for that request (is there anything that a solr server does for that particular request which is not included in that qtime bracket)

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Any thoughts regarding the subject? I hope FieldCache doesn't use more than 6 bytes per document-field instance... I am too lazy to research Lucene source code, I hope someone can provide exact answer... Thanks > Subject: Lucene FieldCache memory requirements > > Hi, > > > Can anyone confirm L

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Michael McCandless
Which FieldCache API are you using? getStrings? or getStringIndex (which is used, under the hood, if you sort by this field). Mike On Mon, Nov 2, 2009 at 2:27 PM, Fuad Efendi wrote: > Any thoughts regarding the subject? I hope FieldCache doesn't use more than > 6 bytes per document-field insta

LocalSolr, Maven, build files and release candidates (Just for info) and spatial radius (A question)

2009-11-02 Thread Ian Ibbotson
Hallo All. I've been trying to prepare a project using localsolr for the impending (I hope) arrival of solr 1.4 and Lucene 2.9.1.. Here are some notes in case anyone else is suffering similarly. Obviously everything here may change by next week. First problem has been the lack of any stable maven

Re: question about collapse.type = adjacent

2009-11-02 Thread Martijn v Groningen
Hi Michael, Field collapsing is basically done in two steps. The first step is to get the uncollapsed sorted (whether it is by score or a field value) documents and the second step is to apply the collapse algorithm on the uncollapsed documents. So yes, when specifying collapse.type=adjacent the docume

apply a patch on solr

2009-11-02 Thread michael8
Hi, First I'd like to ask pardon for my novice question on patching solr (1.4). What I'd like to know is, given a patch, like the one for the collapse field, how would one go about knowing what solr source that patch is meant for, since this is a source-level patch? Wouldn't the exact versions of a set of java

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
I am not using Lucene API directly; I am using SOLR which uses Lucene FieldCache for faceting on non-tokenized fields... I think this cache will be lazily loaded, until user executes sorted (by this field) SOLR query for all documents *:* - in this case it will be fully populated... > Subject: Re

Dismax and Standard Queries together

2009-11-02 Thread ram_sj
Hi, I have three fields, business_name, category_name, sub_category_name in my solrconfig file. my query = "pet clinic" example sub_category_names: Veterinarians, Kennels, Veterinary Clinics Hospitals, Pet Grooming, Pet Stores, Clinics my ideal requirement is dismax searching on a. dismax

RE: tokenize after filters

2009-11-02 Thread Steven A Rowe
I think you want Koji Sekiguchi's Char Filters: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?highlight=char+filters#Char_Filters Steve > -Original Message- > From: Joe Calderon [mailto:calderon@gmail.com] > Sent: Monday, November 02, 2009 11:25 AM > To: solr-user@lucen

field queries seem slow

2009-11-02 Thread mike anderson
I took a look through my Solr logs this weekend and noticed that the longest queries were on particular fields, like "author:albert einstein". Is this a result consistent with other setups out there? If not, Is there a trick to make these go faster? I've read up on filter queries and use those when

manually creating indices to speed up indexing with app-knowledge

2009-11-02 Thread Britske
This may seem like a strange question, but here it goes anyway. I'm considering the possibility of low-level constructing indices for about 20,000 indexed fields (type sInt), if at all possible. (With indices in this context I mean the inverted indices from term to DocumentId, just to be 100% comp

Re: apply a patch on solr

2009-11-02 Thread mike anderson
You can see what revision the patch was written for at the top of the patch, it will look like this: Index: org/apache/solr/handler/MoreLikeThisHandler.java === --- org/apache/solr/handler/MoreLikeThisHandler.java (revision 772437) ++

highlighting error using 1.4rc

2009-11-02 Thread Jake Brownell
Hi, I've tried installing the latest (3rd) RC for Solr 1.4 and Lucene 2.9.1. One of our integration tests, which runs against an embedded server, appears to be failing on highlighting. I've included the stack trace and the configuration from solrconf. I'd appreciate any insights. Please let me

Question regarding snapinstaller

2009-11-02 Thread Prasanna Ranganathan
It looks like the snapinstaller script does an atomic remove and replace of the entire solr_home/data_dir/index folder with the contents of the new snapshot before issuing a commit command. I am trying to understand the implication of the same. What happens to queries that come during the time

Re: tracking solr response time

2009-11-02 Thread Erick Erickson
So I need someone with better knowledge to chime in here with an opinion on whether autowarming would help since the whole faceting thing is something I'm not very comfortable with... Erick On Mon, Nov 2, 2009 at 2:21 PM, bharath venkatesh < bharathv6.proj...@gmail.com> wrote: > @Israel: yes I

Re: field queries seem slow

2009-11-02 Thread Erick Erickson
Hmmm, are you sorting? And have your readers been reopened? Is the second query of that sort also slow? If the answer to this last question is "no", have you tried some autowarming queries? Best Erick On Mon, Nov 2, 2009 at 4:34 PM, mike anderson wrote: > I took a look through my Solr logs this

Re: Question about DIH execution order

2009-11-02 Thread Fergus McMenemie
Bertie, Not sure what you are trying to do, we need a clearer description of what "select *" returns and what you want to end up in the index. But to answer your question The transformations happen after DIH has performed the SQL statement. In fact the rows output from the SQL command are assigne

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Michael McCandless
OK I think someone who knows how Solr uses the fieldCache for this type of field will have to pipe up. For Lucene directly, simple strings would consume a pointer (4 or 8 bytes depending on whether your JRE is 64bit) per doc, and the string index would consume an int (4 bytes) per doc. (Each als
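
A back-of-the-envelope estimate along the lines Mike describes (all numbers here are illustrative assumptions: 100M documents, an 8-byte object reference on a plain 64-bit JRE, a 10-value field, roughly 60 bytes per cached String):

    public class FieldCacheEstimate {
        public static void main(String[] args) {
            long maxDoc = 100000000L; // hypothetical: 100M documents
            long ptr = 8;             // object reference on a plain 64-bit JRE

            // getStrings: one String reference per document (the String objects
            // themselves are shared across documents with the same value).
            long getStrings = maxDoc * ptr;

            // getStringIndex: one int ord per document, plus one String per unique value.
            long uniqueTerms = 10;    // e.g. a country field with ~10 distinct values
            long perTerm = 60;        // rough per-String overhead; varies by JRE and length
            long getStringIndex = maxDoc * 4 + uniqueTerms * perTerm;

            System.out.println("getStrings     ~ " + getStrings / (1024 * 1024) + " MB");
            System.out.println("getStringIndex ~ " + getStringIndex / (1024 * 1024) + " MB");
        }
    }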

Re: highlighting error using 1.4rc

2009-11-02 Thread Mark Miller
Umm - crap. This looks like a bug in a fix that just went in. My fault on the review. I'll fix it tonight when I get home - unfortunately, both lucene and solr are about to be released... - Mark http://www.lucidimagination.com (mobile) On Nov 2, 2009, at 5:17 PM, Jake Brownell wrote:

Re: Spell check suggestion and correct way of implementation and some Questions

2009-11-02 Thread darniz
Hello everybody, I am able to use the spell checker but I have some questions, if someone can answer them. If I search the free-text word waranty then I get back the suggestion warranty, which is fine. But if I do a search on a field, for example description:waranty, the output collation element is description:warranty

Re: CPU utilization and query time high on Solr slave when snapshot install

2009-11-02 Thread Mark Miller
Hmm...I think you have to setup warming queries yourself and that autowarm just copies entries from the old cache to the new cache, rather than issuing queries - the value is how many entries it will copy. Though that's still going to take CPU and time. - Mark http://www.lucidimagination.c

Re: solr search

2009-11-02 Thread Lance Norskog
The problem is in db-dataconfig.xml. You should start with the example DataImportHandler configuration fles. The structure is wrong. First there is a datasource, then there are 'entities' which fetch a document's fields from the datasource. On Fri, Oct 30, 2009 at 9:03 PM, manishkbawne wrote: >

Re: solr web ui

2009-11-02 Thread Lance Norskog
This is what I meant to mention - Uri's GWT browser, not the Velocity toolkit. On Fri, Oct 30, 2009 at 1:20 PM, Grant Ingersoll wrote: > There is also a GWT contribution in JIRA that is pretty handy and will > likely be added in 1.5.  See http://issues.apache.org/jira/browse/SOLR-1163 > > -Grant

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Thank you very much Mike, I found it: org.apache.solr.request.SimpleFacets ... // TODO: future logic could use filters instead of the fieldcache if // the number of terms in the field is small enough. counts = getFieldCacheCounts(searcher, base, field, offset,limit, mincou

Re: CPU utilization and query time high on Solr slave when snapshot install

2009-11-02 Thread Jay Hill
So assuming you set up a few sample sort queries to run in the firstSearcher config, and had very low query volume during that ten minutes so that there were no evictions before a new Searcher was loaded, would those queries run by the firstSearcher be passed along to the cache for the next Searche

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Mark Miller
It also briefly requires more memory than just that - it allocates an array the size of maxdoc+1 to hold the unique terms - and then sizes down. Possibly we can use the getUniqueTermCount method in the flexible indexing branch to get rid of that - which is why I was thinking it might be a good ide

Why does BinaryRequestWriter force the path to be base URL + "/update/javabin"

2009-11-02 Thread Stuart Tettemer
Hi folks, First of all, thanks for Solr. It is a great piece of work. I have a question about BinaryRequestWriter in the solrj project. Why does it force the path of UpdateRequests to be "/update/javabin" (see BinaryRequestWriter.getPath(String) starting on line 109)? I am extending Binary

Re: highlighting error using 1.4rc

2009-11-02 Thread Mark Miller
Sorry - it was a bug in the backport from trunk to 2.9.1 - didn't realize that code didn't get hit because we didn't pass a null field - else the tests would have caught it. Fix has been committed but I don't know whether it will make 2.9.1 or 1.4 because both have gotten the votes and time needed

Re: Programmatically configuring SLF4J for Solr 1.4?

2009-11-02 Thread Don Werve
2009/11/1 Ryan McKinley > I'm sure it is possible to configure JDK logging (java.util.logging) > programmatically... but I have never had much luck with it. > > It is very easy to configure log4j programmatically, and this works great > with solr. > Don't suppose I could trouble you for an example?
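
A sketch of the log4j 1.2 programmatic setup Ryan is referring to (assuming the slf4j-log4j12 binding plus log4j itself are on the classpath in place of the default JDK-logging binding, so Solr's SLF4J calls actually reach log4j):

    import org.apache.log4j.ConsoleAppender;
    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;
    import org.apache.log4j.PatternLayout;

    public class SolrLoggingSetup {
        public static void configure() {
            Logger root = Logger.getRootLogger();
            root.removeAllAppenders(); // drop anything picked up from log4j.properties
            root.addAppender(new ConsoleAppender(
                    new PatternLayout("%d{ISO8601} %-5p [%c{1}] %m%n")));
            root.setLevel(Level.INFO);

            // Quiet down an individual category, for example:
            Logger.getLogger("org.apache.solr.core").setLevel(Level.WARN);
        }
    }

Call configure() before the first Solr/SolrJ call so the root logger is set up by the time logging starts.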

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no difference between maxdoc and maxdoc + 1 for such estimate... difference is between 0.4Gb and 1.2Gb... So, let's vote ;) A. [maxdoc] x [8 bytes ~ pointer to String object] B. [maxdoc] x [8 bytes ~ pointer to Document ob

Getting update/extract RequestHandler to work under Tomcat

2009-11-02 Thread Glock, Thomas
Hoping someone might help with getting /update/extract RequestHandler to work under Tomcat. Error 500 happens when trying to access http://localhost:8080/apache-solr-1.4-dev/update/extract/ (see below) Note /update/extract DOES work correctly under the Jetty provided example. I think I must ha

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Mark Miller
Fuad Efendi wrote: > Simple field (10 different values: Canada, USA, UK, ...), 64-bit JVM... no > difference between maxdoc and maxdoc + 1 for such estimate... difference is > between 0.4Gb and 1.2Gb... > > I'm not sure I understand - but I didn't mean to imply the +1 on maxdoc meant anything. T

SolrJ looping until I get all the results

2009-11-02 Thread Paul Tomblin
If I want to do a query and only return X number of rows at a time, but I want to keep querying until I get all the rows, how do I do that? Can I just keep advancing query.setStart(...) and then checking if server.query(query) returns any rows? Or is there a better way? Here's what I'm thinking

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
I just did some tests in a completely new index (Slave), sort by low-distributed non-tokenized Field (such as Country) takes milliseconds, but sort (ascending) on tokenized field with heavy distribution took 30 seconds (initially). Second sort (descending) took milliseconds. Generic query *.*; Fiel

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Mark, I don't understand this: > so with a ton of docs and a few uniques, you get a temp boost in the RAM > reqs until it sizes it down. Sizes down??? Why is it called Cache indeed? And how does SOLR use it if it is not a cache? And this: > A pointer for each doc. Why can't we use (int) DocumentID?

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Ok, my "naive" thinking about FieldCache: for each Term we can quickly retrieve DocSet. What are memory requirements? Theoretically, [maxdoc]x[4-bytes DocumentID], plus some (small) array to store terms pointing to (large) arrays of DocumentIDs. Mike suggested http://issues.apache.org/jira/browse/

Re: SolrJ looping until I get all the results

2009-11-02 Thread Avlesh Singh
> > final static int MAX_ROWS = 100; > int start = 0; > query.setRows(MAX_ROWS); > while (true) > { > QueryResponse resp = solrChunkServer.query(query); > SolrDocumentList docs = resp.getResults(); > if (docs.size() == 0) > break; > > start += MAX_ROWS; > query.setStart(start); >

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
To be correct, I analyzed FieldCache awhile ago and I believed it never "sizes down"... /** * Expert: The default cache implementation, storing all values in memory. * A WeakHashMap is used for storage. * * Created: May 19, 2004 4:40:36 PM * * @since lucene 1.4 */ Will it size down? Onl

Re: SolrJ looping until I get all the results

2009-11-02 Thread Paul Tomblin
On Mon, Nov 2, 2009 at 8:40 PM, Avlesh Singh wrote: >> >> final static int MAX_ROWS = 100; >> int start = 0; >> query.setRows(MAX_ROWS); >> while (true) >> { >>   QueryResponse resp = solrChunkServer.query(query); >>   SolrDocumentList docs = resp.getResults(); >>   if (docs.size() == 0) >>     br

Re: adding and updating a lot of document to Solr, metadata extraction etc

2009-11-02 Thread Lance Norskog
About large XML files and http overhead: you can tell solr to load the file directly from a file system. This will stream thousands of documents in one XML file without loading everything in memory at once. This is a new book on Solr. It will help you through this early learning phase. http://www

Re: SolrJ looping until I get all the results

2009-11-02 Thread Avlesh Singh
> > I was doing it that way, but what I'm doing with the documents is do > some manipulation and put the new classes into a different list. > Because I basically have two times the number of documents in lists, > I'm running out of memory. So I figured if I do it 1000 documents at > a time, the So

Re: Lucene FieldCache memory requirements

2009-11-02 Thread Mark Miller
static final class StringIndexCache extends Cache { StringIndexCache(FieldCache wrapper) { super(wrapper); } @Override protected Object createValue(IndexReader reader, Entry entryKey) throws IOException { String field = StringHelper.intern(entryKey.field);

Re: SolrJ looping until I get all the results

2009-11-02 Thread Paul Tomblin
On Mon, Nov 2, 2009 at 8:47 PM, Avlesh Singh wrote: >> >> I was doing it that way, but what I'm doing with the documents is do >> some manipulation and put the new classes into a different list. >> Because I basically have two times the number of documents in lists, >> I'm running out of memory.  

Re: SolrJ looping until I get all the results

2009-11-02 Thread Avlesh Singh
> > This isn't a search, this is a search and destroy. Basically I need the > file names of all the documents that I've indexed in Solr so that I can > delete them. > Okay. I am sure you are aware of the "fl" parameter which restricts the number of fields returned back with a response. If you need
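
A hedged SolrJ sketch of the page-through-and-collect pattern with fl restricting the response to one field ("filename" is a hypothetical stored field; the collected values could then drive the deletes via deleteById on the unique key or a deleteByQuery):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.SolrDocumentList;

    public class CollectFilenames {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            final int ROWS = 1000;

            SolrQuery q = new SolrQuery("*:*");
            q.setFields("filename"); // fl=filename keeps each page small
            q.setRows(ROWS);

            List<String> filenames = new ArrayList<String>();
            int start = 0;
            while (true) {
                q.setStart(start);
                SolrDocumentList page = server.query(q).getResults();
                for (SolrDocument doc : page) {
                    filenames.add((String) doc.getFieldValue("filename"));
                }
                start += ROWS;
                if (start >= page.getNumFound()) break; // numFound says when to stop
            }
            System.out.println("collected " + filenames.size() + " filenames");
        }
    }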

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
I believe this is correct estimate: > C. [maxdoc] x [4 bytes ~ (int) Lucene Document ID] > > same as > [String1_Document_Count + ... + String10_Document_Count + ...] > x [4 bytes per DocumentID] So, for 100 millions docs we need 400Mb for each(!) non-tokenized field. Although FieldCacheImpl

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Hi Mark, Yes, I understand it now; however, how will StringIndexCache size down in a production system faceting by Country on a homepage? This is SOLR specific... Lucene specific: Lucene doesn't read from disk if it can retrieve field value for a specific document ID from cache. How will it size

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
Even in simplistic scenario, when it is Garbage Collected, we still _need_to_be_able_ to allocate enough RAM to FieldCache on demand... linear dependency on document count... > > Hi Mark, > > Yes, I understand it now; however, how will StringIndexCache size down in a > production system facetin

Re: Why does BinaryRequestWriter force the path to be base URL + "/update/javabin"

2009-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
yup, that can be relaxed. It was just a convention. On Tue, Nov 3, 2009 at 5:24 AM, Stuart Tettemer wrote: > Hi folks, > First of all, thanks for Solr.  It is a great piece of work. > > I have a question about BinaryRequestWriter in the solrj project.  Why does > it force the path of UpdateReques

Re: Question regarding snapinstaller

2009-11-02 Thread Lance Norskog
In Posix-compliant systems (basically Unix system calls) a file exists independent of file names, and there can be multiple names for a file. If a program has a file open, that file can be deleted but it will still exist until the program closes (or the program exits). In the snapinstaller cycle,
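
A tiny Java illustration of the behaviour Lance describes (a sketch: on POSIX systems the open stream keeps reading the old bytes after the name is unlinked, while on Windows the delete would normally fail):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    public class UnlinkDemo {
        public static void main(String[] args) throws IOException {
            File f = File.createTempFile("snapshot", ".txt");
            FileOutputStream out = new FileOutputStream(f);
            out.write("old index data".getBytes("UTF-8"));
            out.close();

            FileInputStream in = new FileInputStream(f);  // open the file first...
            System.out.println("deleted: " + f.delete()); // ...then unlink the name

            // The underlying file lives until the last open descriptor is closed,
            // so this still prints the original contents.
            byte[] buf = new byte[32];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n, "UTF-8"));
            in.close();
        }
    }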

Re: Annotations and reference types

2009-11-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
I guess this is not a very good idea. The document itself is a flat data structure. It is hard to see it as a nested data structure. If allowed, how deep would we wish to make it? The simple solution would be to write setters for "b_id" and "b_name" in class A and the setters can inject values in
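
A hedged sketch of that setter approach, using the field names from the thread (this covers binding query results back into A via getBeans; for indexing, matching getters or a hand-built SolrInputDocument may still be needed):

    import org.apache.solr.client.solrj.beans.Field;

    public class A {
        @Field
        private String id;

        @Field
        private String name;

        private B myB = new B();

        // Flatten the nested object: SolrJ calls these setters when it binds
        // the "b_id" / "b_name" fields of a result document to this bean.
        @Field("b_id")
        public void setBId(String bId) {
            myB.setId(bId);
        }

        @Field("b_name")
        public void setBName(String bName) {
            myB.setName(bName);
        }

        public B getMyB() { return myB; }
    }

    class B {
        private String id;
        private String name;
        public void setId(String id) { this.id = id; }
        public void setName(String name) { this.name = name; }
    }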

Re: field queries seem slow

2009-11-02 Thread Lance Norskog
This searches author:albert and (default text field): einstein. This may not be what you expect? On Mon, Nov 2, 2009 at 2:30 PM, Erick Erickson wrote: > H, are you sorting? And has your readers been reopened? Is the > second query of that sort also slow? If the answer to this last question is
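 
If both words are meant to stay on the author field, one way (a sketch, not necessarily what this particular setup needs) is to quote or group them:

    import org.apache.solr.client.solrj.SolrQuery;

    public class AuthorQueryExamples {
        public static void main(String[] args) {
            // Phrase: both words in the author field, adjacent and in order.
            SolrQuery phrase = new SolrQuery("author:\"albert einstein\"");

            // Grouped: both words in the author field, any order or position.
            SolrQuery grouped = new SolrQuery("author:(albert AND einstein)");

            System.out.println(phrase.getQuery());
            System.out.println(grouped.getQuery());
        }
    }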

Re: tracking solr response time

2009-11-02 Thread Yonik Seeley
On Mon, Nov 2, 2009 at 2:21 PM, bharath venkatesh wrote: > we observed many times there is huge mismatch between qtime and > time measured at the client for the response Long times to stream back the result to the client could be due to - client not reading fast enough - network congestion - r

RE: Lucene FieldCache memory requirements

2009-11-02 Thread Fuad Efendi
FieldCache uses internally WeakHashMap... nothing wrong, but... no Garbage Collection tuning will help if the allocated RAM is not enough for replacing Weak** with Strong**, especially for SOLR faceting... 10%-15% CPU taken by GC was reported... -Fuad

solrj query size limit?

2009-11-02 Thread Gregg Horan
I'm constructing a query using solrj that has a fairly large number of 'OR' clauses. I'm just adding it as a big string to setQuery(), in the format "accountId:(this OR that OR yada)". This works all day long with 300 values. When I push it up to 350-400 values, I get a "Bad Request" SolrServerE

Proper way to set up Multi Core / Core admin

2009-11-02 Thread Jonathan Hendler
Getting started with multi core setup following http://wiki.apache.org/solr/CoreAdmin and the book. Generally everything makes sense, but I have one question. Here's how easy it was: place the solr.war into the server; create your core directories in the newly created solr/ directory; set up s

Re: Proper way to set up Multi Core / Core admin

2009-11-02 Thread Jonathan Hendler
Sorry for the confusion - step four is to be avoided, obviously. On Nov 2, 2009, at 11:46 PM, Jonathan Hendler wrote: Getting started with multi core setup following http://wiki.apache.org/solr/CoreAdmin and the book. Generally everything makes sense, but I have one question. Here's how e

Re: Match all terms in doc

2009-11-02 Thread Shalin Shekhar Mangar
On Sun, Nov 1, 2009 at 3:33 AM, Magnus Eklund wrote: > Hi > > How do I restrict hits to documents containing all words (regardless of > order) of a query in particular field? > > Suppose I have two documents with a field called name in my index: > > doc1 => name: Pink > doc2 => name: Pink Floyd >

Re: SpellCheckComponent suggestions and case

2009-11-02 Thread Shalin Shekhar Mangar
On Sat, Oct 31, 2009 at 2:51 AM, Acadaca wrote: > > I am having great difficulty getting SpellCheckComponent to ignore case. > > Given a search of Glod, the suggestion is wood > Given a search of glod, the suggestion is gold > > I am using LowerCaseTokenizerFactory for both query and index, so as

Re: json.wrf parameter

2009-11-02 Thread Shalin Shekhar Mangar
On Sun, Nov 1, 2009 at 5:55 AM, Ankit Bhatnagar wrote: > Hi Yonik, > > I have a question regarding json.wrf parameter that you introduced in Solr > query. > > I am using YUi Datasource widget and it accepts JSONP format. > > Could you tell me if I specify json.wrf in the query will solr return the

Re: solrj query size limit?

2009-11-02 Thread Avlesh Singh
Did you hit the limit for maximum number of characters in a GET request? Cheers Avlesh On Tue, Nov 3, 2009 at 9:36 AM, Gregg Horan wrote: > I'm constructing a query using solrj that has a fairly large number of 'OR' > clauses. I'm just adding it as a big string to setQuery(), in the format > "
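
If the GET limit is the culprit, one workaround (a sketch; the account ids shown are placeholders) is to send the query as a POST so it travels in the request body instead of the URL; the alternative is to raise the container's limit (maxHttpHeaderSize on Tomcat, headerBufferSize on Jetty):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class LargeQueryViaPost {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

            // Build a long boolean query, e.g. accountId:(acct0 OR acct1 OR ...).
            StringBuilder q = new StringBuilder("accountId:(");
            for (int i = 0; i < 400; i++) {
                if (i > 0) q.append(" OR ");
                q.append("acct").append(i); // hypothetical account ids
            }
            q.append(")");

            SolrQuery query = new SolrQuery(q.toString());
            // POST keeps the long query out of the request URL.
            QueryResponse rsp = new QueryRequest(query, SolrRequest.METHOD.POST).process(server);
            System.out.println(rsp.getResults().getNumFound());
        }
    }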

Re: Problems downloading lucene 2.9.1

2009-11-02 Thread Licinio Fernández Maurelo
Thanks guys !!! 2009/11/2 Ryan McKinley > > On Nov 2, 2009, at 8:29 AM, Grant Ingersoll wrote: > > >> On Nov 2, 2009, at 12:12 AM, Licinio Fernández Maurelo wrote: >> >> Hi folks, >>> >>> as we are using an snapshot dependecy to solr1.4, today we are getting >>> problems when maven try to downl

Re: Problems downloading lucene 2.9.1

2009-11-02 Thread Licinio Fernández Maurelo
Well, I've solved this problem by executing mvn install:install-file -DgroupId=org.apache.lucene -DartifactId=lucene-analyzers -Dversion=2.9.1 -Dpackaging=jar -Dfile= for each lucene-* artifact. I think there must be an easier way to do this, am I wrong? Hope it helps Thx El 3 de noviembre de 200