Unique id

2008-11-19 Thread Raghunandan Rao
Hi, Is the uniqueKey in schema.xml really required? Reason is, I am indexing two tables and I have id as unique key in schema.xml but id field is not there in one of the tables and indexing fails. Do I really require this unique field for Solr to index it better or can I do away with this?

RE: Unique id

2008-11-19 Thread Raghunandan Rao
Ok got it. I am indexing two tables differently. I am using Solrj to index with the @Field annotation. I make two queries initially, fetch the data from the two tables, and index them separately. But what if the ids in the two tables are the same? That means documents with the same id will be deleted when doing

Re: Error in indexing timestamp format.

2008-11-19 Thread con
Hi Noble, Thank you very much. That removed the error during server startup. But I don't think the data is getting indexed upon running the dataimport. I am unable to display the date field values on searching. This is my complete configs: entity name=employees

Re: Unique id

2008-11-19 Thread Aleksander M. Stensby
Ok, but how do you map your table structure to the index? As far as I can understand, the two tables have different structures, so why/how do you map two different data structures onto a single index? Are the two tables connected in some way? If so, you could make your index structure reflect

Re: Use SOLR like the MySQL LIKE

2008-11-19 Thread Norberto Meijome
On Tue, 18 Nov 2008 14:26:02 +0100 Aleksander M. Stensby [EMAIL PROTECTED] wrote: Well, then I suggest you index the field in two different ways if you want both possible ways of searching. One, where you treat the entire name as one token (in lowercase) (then you can search for avera* and

Re: Unique id

2008-11-19 Thread Aleksander M. Stensby
Yes it is. You need a unique id because the add method works as an add-or-update method. When adding a document whose ID is already found in the index, the old document will be deleted and the new one will be added. Are you indexing two tables into the same index? Or does one entry in the index
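
A toy model of the add-or-update behaviour described above may help. It is only a sketch in plain Python, not Solr or SolrJ code: the `ToyIndex` class, its `add` method, and the `table:id` prefix scheme are hypothetical names chosen for illustration. It shows why two rows with the same id from different tables replace each other, and one possible way to avoid that by namespacing the key.

```python
class ToyIndex:
    """Hypothetical in-memory stand-in for an index with a uniqueKey field."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = {}  # maps unique key value -> document

    def add(self, doc):
        # Adding a document whose key already exists replaces the old one,
        # mirroring the delete-then-add behaviour described in the thread.
        self.docs[doc[self.unique_key]] = doc


def namespaced(table, row):
    """Return a copy of row whose id is prefixed with the table name (assumed scheme)."""
    doc = dict(row)
    doc["id"] = f"{table}:{row['id']}"
    return doc


index = ToyIndex()
index.add({"id": "1", "name": "row from table A"})
index.add({"id": "1", "name": "row from table B"})  # same id: replaces table A's row
collided_count = len(index.docs)

safe_index = ToyIndex()
safe_index.add(namespaced("a", {"id": "1", "name": "row from table A"}))
safe_index.add(namespaced("b", {"id": "1", "name": "row from table B"}))
safe_count = len(safe_index.docs)
```

With plain ids only one document survives; with the prefixed ids both rows are kept.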

Upgrade from 1.2 to 1.3 gives 3x slowdown

2008-11-19 Thread Fergus McMenemie
Hello, I have a CSV file with 6M records which took 22min to index with solr 1.2. I then stopped tomcat, replaced the solr stuff inside webapps with version 1.3, wiped my index and restarted tomcat. Indexing the exact same content now takes 69min. My machine has 2GB of RAM and tomcat is running

Re: Unique id

2008-11-19 Thread Erik Hatcher
Technically, no, a uniqueKey field is NOT required. I've yet to run into a situation where it made sense not to use one though. As for indexing database tables - if one of your tables doesn't have a primary key, does it have an aggregate unique key of some sort? Do you plan on updating

DataImportHandler: Javascript transformer for splitting field-values

2008-11-19 Thread Steffen
Hi everyone, I'm currently working with the nightly build of Solr (solr-2008-11-17) and trying to figure out how to transform a row-object with Javascript to include multiple values (in a single multivalued field). When I try something like this as a transformer: function splitTerms(row) {

Question about autocommit

2008-11-19 Thread Nickolai Toupikov
Hello, I would like some details on the autocommit mechanism. I tried to search the wiki, but found only the standard maxDoc/time settings. i have set the autocommit parameters in solrconfig.xml to 8000 docs and 30milis. Indexing at around 200 docs per second (from multiple processes,

Re: Question about autocommit

2008-11-19 Thread Mark Miller
Interesting...could go along with the earlier guy's post about slow indexing... Nickolai Toupikov wrote: Hello, I would like some details on the autocommit mechanism. I tried to search the wiki, but found only the standard maxDoc/time settings. i have set the autocommit parameters in

Re: Question about autocommit

2008-11-19 Thread Mark Miller
Could also go with the thread safety issues with pending and the deadlock that was reported the other day. All could pretty easily be related. Do we have a JIRA issue on it yet? Suppose I'll look... Mark Miller wrote: Interesting...could go along with the earlier guy's post about slow

RE: Question about autocommit

2008-11-19 Thread Nguyen, Joe
Could ramBufferSizeMB trigger the commit in this case? -Original Message- From: Nickolai Toupikov [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 8:36 Joe To: solr-user@lucene.apache.org Subject: Question about autocommit Hello, I would like some details on the autocommit

Re: Question about autocommit

2008-11-19 Thread Mark Miller
They are separate commits. ramBufferSizeMB controls when the underlying Lucene IndexWriter flushes RAM to disk (this isn't like the IndexWriter committing or closing). The solr autocommit controls when solr asks IndexWriter to commit what it's done so far. Nguyen, Joe wrote: Could

RE: Question about autocommit

2008-11-19 Thread Nguyen, Joe
As far as I know, commit could be triggered manually by (1) invoking the commit() method, or automatically by (2) maxDoc or (3) maxTime. Since the document size is arbitrary and some documents could be huge, could commit also be triggered by memory buffer size? -Original Message- From: Mark Miller
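
The two automatic triggers listed above can be pictured with a toy check. This is a plain-Python sketch, not Solr's implementation: the function name `autocommit_due` and its parameters are hypothetical, the thresholds are example values only, and measuring time from the oldest uncommitted add is an assumption of this model.

```python
def autocommit_due(pending_docs, elapsed_ms, max_docs, max_time_ms):
    """Toy check: True once either the document-count or the time threshold is reached.

    elapsed_ms is assumed to be the time since the oldest uncommitted add.
    """
    if pending_docs == 0:
        return False  # nothing to commit
    return pending_docs >= max_docs or elapsed_ms >= max_time_ms


# Example thresholds (assumed values, not taken from any real configuration).
by_count = autocommit_due(8000, 1_000, max_docs=8000, max_time_ms=30_000)
by_time = autocommit_due(200, 30_000, max_docs=8000, max_time_ms=30_000)
not_yet = autocommit_due(200, 1_000, max_docs=8000, max_time_ms=30_000)
```

The RAM buffer size does not appear in this model at all, in line with the earlier reply that a buffer flush is separate from a commit.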

Re: Question about autocommit

2008-11-19 Thread Nickolai Toupikov
The documents have an average size of about a kilobyte i would say. bigger ones can pop up, but not nearly often enough to trigger memory-commits every couple of seconds. I don't have the exact figures, but i would expect the memory buffer limit to be far beyond the 8000 document one in most of

Re: Question about autocommit

2008-11-19 Thread Nickolai Toupikov
I don't know. After reading my last email, i realized i did not say explicitly that by 'restarting' i merely meant 'restarting resin'. I did not restart indexing from scratch. And - if I understand correctly - if the merge factor was the culprit, restarting the servlet container would have had

Multi word Synonym

2008-11-19 Thread Jeff Newburn
I am trying to figure out how the synonym filter processes multi word inputs. I have checked the analyzer in the GUI with some confusing results. The indexed field has "The North Face" as a value. The synonym file has morthface, morth face, noethface, noeth face, norhtface, norht face, nortface,

No search result behavior (a la Amazon)

2008-11-19 Thread Caligula
It appears to me that Amazon is using a 100% minimum match policy. If there are no matches, they break down the original search terms and give suggestion search results. example:

RE: No search result behavior (a la Amazon)

2008-11-19 Thread Nguyen, Joe
Have a look at DisMaxRequestHandler and play with mm (minimum terms should match) http://wiki.apache.org/solr/DisMaxRequestHandler?highlight=%28CategorySolrRequestHandler%29%7C%28%28CategorySolrRequestHandler%29%29#head-6c5fe41d68f3910ed544311435393f5727408e61 -Original Message- From:

Solr schema Lucene's StandardAnalyser equivalent?

2008-11-19 Thread Glen Newton
Hello, I am looking for the Solr schema equivalent to Lucene's StandardAnalyser. Is it the Solr schema type: fieldType name=text class=solr.TextField Is there some way of directly invoking Lucene's StandardAnalyser? Thanks, Glen -- -

RE: No search result behavior (a la Amazon)

2008-11-19 Thread Caligula
I understand how to do the 100% mm part. It's the behavior when there are no matches that i'm asking about :) Nguyen, Joe-2 wrote: Have a look at DisMaxRequestHandler and play with mm (minimum terms should match) http://wiki.apache.org/solr/DisMaxRequestHandler?highlight=%28CategorySo

filtering on blank OR specific range

2008-11-19 Thread Geoffrey Young
hi all :) I'm having difficulty filtering my documents when a field is either blank or set to a specific value. I would have thought this would work: fq=-Type:[* TO *] OR Type:blue which I would expect to find all documents where either Type is undefined or Type is blue. my actual result set

RE: filtering on blank OR specific range

2008-11-19 Thread Lance Norskog
Try: Type:blue OR -Type:[* TO *] You can't have a negative clause at the beginning. Yes, Lucene should barf about this. -Original Message- From: Geoffrey Young [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 12:17 PM To: solr-user@lucene.apache.org Subject: filtering on

Logging in Solr.

2008-11-19 Thread Erik Holstad
I kind of remember hearing that Solr was using SLF4J for the logging, but I haven't been able to find any information about it. And in that case where do you set it to redirect to your log4j server for example? Regards Erik

Re: filtering on blank OR specific range

2008-11-19 Thread Geoffrey Young
Lance Norskog wrote: Try: Type:blue OR -Type:[* TO *] You can't have a negative clause at the beginning. Yes, Lucene should barf about this. I did try that, before and again now, and still no luck. anything else? --Geoff
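
One workaround often suggested for this kind of query is to pair the negative clause with a match-all clause inside parentheses, so the negation has a document set to subtract from. The sketch below only builds the request parameters in Python. The field name `Type` and the value `blue` come from the question; the `*:* -Type:[* TO *]` form is an assumption about what would help here and has not been checked against the poster's index.

```python
from urllib.parse import urlencode, parse_qs

# Documents where Type is blue, or where Type has no value at all.
# The negative range clause is anchored to *:* (all documents).
filter_query = "Type:blue OR (*:* -Type:[* TO *])"

params = {"q": "*:*", "fq": filter_query}
query_string = urlencode(params)  # percent-encodes the special characters

# Round-trip the encoded string to confirm nothing was lost.
decoded = parse_qs(query_string)
```

The encoded `query_string` can be appended to a select URL; decoding it returns the original filter unchanged.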

Re: Logging in Solr.

2008-11-19 Thread Ryan McKinley
The trunk (solr-1.4-dev) is now using SLF4J. If you are using the packaged .war, the behavior should be identical to 1.3 -- that is, it uses the java.util.logging implementation. However, if you are using solr.jar, you select what logging framework you actually want to use by including that

RE: No search result behavior (a la Amazon)

2008-11-19 Thread Nguyen, Joe
It seems its first search requires matching all terms. If that finds nothing, then, as you mentioned, it breaks the query down into multiple smaller term sets, runs a search to get the total hits for each smaller term set, sorts the results by total hits, and displays a summary page. Searching for A B C would be 1. q=
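
The fallback described above could be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not how Amazon or Solr does it: `fallback_suggestions` and `count_hits` are hypothetical names, and the tiny in-memory corpus stands in for a real search backend.

```python
from itertools import combinations


def fallback_suggestions(terms, count_hits):
    """Try all terms first; if nothing matches, rank smaller term subsets by hit count.

    count_hits is a hypothetical callable taking a tuple of terms and
    returning how many documents contain all of them.
    """
    total = count_hits(tuple(terms))
    if total > 0:
        return [(tuple(terms), total)]

    suggestions = []
    # Walk subsets from largest to smallest, skipping the full set already tried.
    for size in range(len(terms) - 1, 0, -1):
        for subset in combinations(terms, size):
            hits = count_hits(subset)
            if hits > 0:
                suggestions.append((subset, hits))
    # Most hits first; list.sort is stable, so larger subsets win ties.
    suggestions.sort(key=lambda item: item[1], reverse=True)
    return suggestions


# Tiny in-memory corpus standing in for a real search backend.
DOCS = [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}]


def count_in_docs(subset):
    """Count documents that contain every term in subset."""
    return sum(1 for doc in DOCS if set(subset) <= doc)


result = fallback_suggestions(["a", "b", "c"], count_in_docs)
```

Each subset costs one extra query, so for long queries a real implementation would probably cap the number of subsets it tries.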

Re: Solr schema Lucene's StandardAnalyser equivalent?

2008-11-19 Thread Otis Gospodnetic
Glen: $ ff \*Standard\*java | grep analysis ./src/java/org/apache/solr/analysis/HTMLStripStandardTokenizerFactory.java ./src/java/org/apache/solr/analysis/StandardFilterFactory.java ./src/java/org/apache/solr/analysis/StandardTokenizerFactory.java Does that do it? Otis -- Sematext --

Searchable/indexable newsgroups

2008-11-19 Thread John Martyniak
Does anybody know of a good way to index newsgroups using SOLR? Basically would like to build a searchable list of newsgroup content. Any help would be greatly appreciated. -John

Re: Solr schema Lucene's StandardAnalyser equivalent?

2008-11-19 Thread Glen Newton
Thanks. I've decided to use: <fieldType name="textN" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StandardFilterFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter

RE: Searchable/indexable newsgroups

2008-11-19 Thread Feak, Todd
Can Nutch crawl newsgroups? Anyone? -Todd Feak -Original Message- From: John Martyniak [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 19, 2008 3:06 PM To: solr-user@lucene.apache.org Subject: Searchable/indexable newsgroups Does anybody know of a good way to index newsgroups using

Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Jon Baer
Hi, I wanted to try the TermVectorComponent w/ current schema setup and I did a build off trunk but it's giving me something like ... org.apache.solr.common.SolrException: ERROR:unknown field 'DOCTYPE' Even though it is declared in schema.xml (lowercase), before I grep replace the entire

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Ryan McKinley
schema fields should be case sensitive... so DOCTYPE != doctype. Is the behavior different for you in 1.3 with the same file/schema? On Nov 19, 2008, at 6:26 PM, Jon Baer wrote: Hi, I wanted to try the TermVectorComponent w/ current schema setup and I did a build off trunk but it's giving

Re: Solr schema Lucene's StandardAnalyser equivalent?

2008-11-19 Thread Erik Hatcher
Note that you can use a standard Lucene Analyzer subclass too. The example schema shows how with this commented out: <fieldType name="text_greek" class="solr.TextField"> <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/> </fieldType> Erik On Nov 19, 2008, at 6:24 PM, Glen

Re: Newbe! Trying to run solr-1.3.0 under tomcat. Please help

2008-11-19 Thread James liu
Check procedure: 1: rm -r $tomcat/webapps/* 2: rm -r $solr/data (your index data directory) 3: check the XML (any XML you modified) 4: start tomcat. I had the same error, but I forgot how I fixed it, so you can use my check procedure; I think it will help you. I use tomcat+solr in win2003, freebsd, mac osx

Re: posting error in solr

2008-11-19 Thread James liu
First, make sure the XML is UTF-8 and the field values are UTF-8. Second, you should post the XML as UTF-8. My advice: use UTF-8 for all encoding. It makes my solr work well; I use Chinese. -- regards j.L

Tomcat undeploy/shutdown exception

2008-11-19 Thread Erik Hatcher
In analyzing a client's Solr logs, from Tomcat, I came across the exception below. Anyone encountered issues with Tomcat shutdowns or undeploys of Solr contexts? I'm not sure if this is an anomaly due to some wonky Tomcat handling, or if this is some kind of bug in Solr. I haven't

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Jon Baer
Sorry I should have mentioned this is from using the DataImportHandler ... it seems case insensitive ... ie my columns are UPPERCASE and schema field names are lowercase and it works fine in 1.3 but not in 1.4 ... it seems strict. Going to resolve all the field names to uppercase to see

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Jon, this is probably not the expected behavior. Only 'explicit' fields must be case-sensitive. Could you tell me the use case, or can you paste the data-config? --Noble On Thu, Nov 20, 2008 at 8:55 AM, Jon Baer [EMAIL PROTECTED] wrote: Sorry I should have mentioned this is from using

Re: DataImportHandler: Javascript transformer for splitting field-values

2008-11-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
Unfortunately native JS objects are not handled by the ScriptTransformer yet. But what you can do in the script is create a new java.util.ArrayList() and add each item into that. Something like: var jsarr = ['term','term','term'] var arr = new java.util.ArrayList(); for each in jsarr...

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Jon Baer
Schema: <field name="docid" type="string" indexed="true" stored="true"/> DIH: <field column="DOCID" name="docid" template="PLAYER-${players.PLAYERID}"/> The column is uppercase ... isn't there some automagic happening now where DIH will introspect the fields @ load time? - Jon On Nov 19, 2008, at 11:11

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
So originally you had the field declaration as follows, right? <field column="DOCID" template="PLAYER-${players.PLAYERID}"/> We did some refactoring to minimize the object creation for case-insensitive comparisons. I guess it should be rectified soon. Thanks for bringing it to our notice. --Noble

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Jon Baer
Correct ... it is the unfortunate side effect of having some legacy tables in uppercase :-\ I thought the explicit declaration of field name attribute was ok. - Jon On Nov 19, 2008, at 11:53 PM, Noble Paul നോബിള്‍ नोब्ळ् wrote: So originally you had the field declaration as follows .

Field collapsing (SOLR-236) and Solr 1.3.0 release version

2008-11-19 Thread Stephen Weiss
Hi, A requirement has come up in a project where we're going to need to group by a field in the result set. I looked into the SOLR-236 patch and it seems there are a couple versions out now that are supposed to work against the Solr 1.3.0 release. This is a production site, it really

Re: Error in indexing timestamp format.

2008-11-19 Thread con
Hi Noble, Thanks for your update. Sorry, that's a typo that I put the same name for both source and dest. Actually I failed to remove it at some stage of trial and error. I removed the copyfield as it is not fully necessary at this stage. My scenario is like: I have various date fields in my

RE: Unique id

2008-11-19 Thread Raghunandan Rao
Basically, I am working on two views. The first one has an ID column. The second view has no unique ID column. What to do in such situations? There are 3 other columns from which I can make a composite key. I have to index these two views now. -Original Message- From: Erik Hatcher
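
One way to picture the composite key mentioned above is to join the three column values into a single string and use that as the document id. The sketch below is plain Python with made-up column names (`region`, `year`, `code`) and a hypothetical helper `composite_key`; it assumes the three columns together identify a row and that the separator never appears inside a value.

```python
def composite_key(row, columns, sep="|"):
    """Join the given column values into one string usable as a unique id.

    Assumes the chosen columns together identify a row and that sep
    never occurs inside a value; both are assumptions of this sketch.
    """
    return sep.join(str(row[col]) for col in columns)


# Hypothetical row from the view that has no single id column.
row = {"region": "EU", "year": 2008, "code": "X17", "amount": 42}
doc = dict(row)
doc["id"] = composite_key(row, ["region", "year", "code"])
```

The other fields are copied unchanged; only the derived `id` is added before the document is sent for indexing.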

Re: Tomcat undeploy/shutdown exception

2008-11-19 Thread Shalin Shekhar Mangar
Erik, which Solr version is that stack trace from? On Thu, Nov 20, 2008 at 7:57 AM, Erik Hatcher [EMAIL PROTECTED] wrote: In analyzing a client's Solr logs, from Tomcat, I came across the exception below. Anyone encountered issues with Tomcat shutdowns or undeploys of Solr contexts? I'm not

Re: Solr schema 1.3 - 1.4-dev (changes?)

2008-11-19 Thread Shalin Shekhar Mangar
Jon, I just committed a fix for this issue at https://issues.apache.org/jira/browse/SOLR-873 Can you please use trunk and see if it solved your problem? On Thu, Nov 20, 2008 at 10:32 AM, Jon Baer [EMAIL PROTECTED] wrote: Correct ... it is the unfortunate side effect of having some legacy