Is it possible to dynamically/programmatically add new fields

2007-10-23 Thread Karen Loughran

Hi there,

Is there a way in Solr to programmatically add new fields (named and dynamic)
so that they don't have to be defined statically within schema.xml?

thanks
Karen


Re: Search results problem

2007-10-23 Thread Yonik Seeley
On 10/23/07, Maximilian Hütter [EMAIL PROTECTED] wrote:

  ???  maxFieldLength only applies to the number of tokens indexed.  You
  will always get the complete field back if it's stored, regardless of
  what maxFieldLength is.

 What I meant was that having a field with all the tokens is different
 from using copyField to copy all the content to a field. copyField
 doesn't just copy the contents to the field but seems to somehow link
 them there.

copyField simply creates an additional value for the target...
it would end up the same as if you sent it in yourself.
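
As a rough illustration of that point, here is a minimal sketch in plain Python (not Solr code) that models copyField as a step run before indexing. The function `apply_copy_fields`, the rule format, and the sample field names and values are all hypothetical; only the "fulltext" target name is borrowed from this thread. It shows that the copied document matches one where the client sent the extra values itself.

```python
def apply_copy_fields(doc, copy_rules):
    """Return a new document with each source field's values appended to its target.

    doc maps field name -> list of values; copy_rules is a list of
    (source, target) pairs. Both shapes are assumptions for this sketch.
    """
    out = {name: list(values) for name, values in doc.items()}
    for source, target in copy_rules:
        # Copy from the original document only, so copies are never chained.
        if source in doc:
            out.setdefault(target, []).extend(doc[source])
    return out


rules = [("title", "fulltext"), ("artist", "fulltext")]

via_copy = apply_copy_fields(
    {"title": ["Blue Train"], "artist": ["John Coltrane"]}, rules
)

# The same document, with the client filling in the target field itself.
sent_by_client = {
    "title": ["Blue Train"],
    "artist": ["John Coltrane"],
    "fulltext": ["Blue Train", "John Coltrane"],
}

print(via_copy == sent_by_client)
```

In this model the target field ends up as an ordinary multi-valued field, with no link back to its sources.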

 So if my maxFieldLength is, for example, set to 100 and I use copyField
 for 101 other fields, will the 101st get truncated?

copyField and maxFieldLength have nothing to do with each other.

maxFieldLength limits the number of *tokens* across all values of a given
field name in a given document.

So if you had

field1: this is a test
and a maxFieldLength of 3, then the test token would be dropped.

if you had
field1: this is
field1: a test
and a maxFieldLength of 3, then the test token would still be dropped.
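
The two cases above can be modelled with a short sketch. This is a simplified model in plain Python, not Lucene's implementation: it assumes whitespace tokenization, and the function name `index_tokens` is hypothetical. The token budget is counted across all values of the field, so splitting the text over several values does not change which token is dropped.

```python
def index_tokens(values, max_field_length):
    """Return the tokens kept for one field name in one document.

    values is the list of values sent for that field; the limit applies
    to the running total across all of them, not to each value separately.
    """
    kept = []
    for value in values:
        for token in value.split():  # assumption: whitespace tokenizer
            if len(kept) >= max_field_length:
                return kept
            kept.append(token)
    return kept


single_value = index_tokens(["this is a test"], 3)
two_values = index_tokens(["this is", "a test"], 3)

print(single_value)
print(two_values)
```

In both calls the fourth token, "test", falls outside the limit of 3.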


  Is there a performance penalty for using copyFields when indexing?
 
  copyFields are done as a discrete step before indexing... almost no
  cost to do that.
  Indexing itself will have a performance impact if there are more
  fields to index + store as a result of the copyField commands.

 The documents in my application have something like 400+ fields (many
 multivalued). For easy searching the application copies all the contents
 of the 400+ fields to one field (fulltext field) which is used as
 defaultfield. This field is quite large for many documents (it gets as
 long as 55 tokens). I was thinking about using copyField for copying
 the fields onto that field instead of having the application do it
 before sending it to Solr.

The indexing cost will be identical in either case.  Since copyField
is a little more elegant (why force the user to send the data more
than once?), I'd use that.

If you don't need to search on all 400+ fields individually, don't
index them (just index your defaultfield).
And I wouldn't store your defaultfield since it's redundant info.

-Yonik


RE: Timeout Settings

2007-10-23 Thread Charlie Jackson
The CommonsHttpSolrServer has a setConnectionTimeout method. For my
import, which was on a similar scale as yours, I had to set it up to
1000 (1 second). I think messing with this setting may take care of your
timeout problem.
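
For context, the trace quoted below ends in "Read timed out", which is raised while waiting for response bytes on a connection that is already open. The following sketch uses plain Python sockets, not SolrJ or the client in the trace, to show how a connect timeout and a read timeout differ in general. The helper names `start_silent_server` and `probe` are hypothetical, and the "silent" server only stands in for a server that is busy with a long operation. Which timeout settings the client in the trace exposes is not shown in this thread.

```python
import socket
import threading


def start_silent_server():
    """Listen on an ephemeral local port, accept one client, and never reply."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    release = threading.Event()

    def serve():
        conn, _ = srv.accept()
        release.wait(5)  # hold the connection open without sending anything
        conn.close()
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return srv.getsockname()[1], release


def probe(port, connect_timeout, read_timeout):
    """Return which phase failed: 'connect', 'read', or 'ok'."""
    try:
        sock = socket.create_connection(("127.0.0.1", port), timeout=connect_timeout)
    except OSError:
        return "connect"
    try:
        sock.settimeout(read_timeout)  # applies to recv(), not to connecting
        sock.recv(1)
        return "ok"
    except socket.timeout:
        return "read"
    finally:
        sock.close()


port, release = start_silent_server()
result = probe(port, connect_timeout=1.0, read_timeout=0.2)
release.set()  # let the server thread finish
print(result)
```

Here the connection succeeds immediately and only the read phase times out, so raising the connect timeout alone would not change the outcome in this model.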



-Original Message-
From: Daniel Clark [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 22, 2007 6:59 PM
To: solr-user@lucene.apache.org
Subject: Timeout Settings

I'm indexing about 10,000,000 docs and I'm getting the following error at
the optimize stage.  I'm using Tomcat 6.  I believe it's timing out due to
the size of the index.  How can I increase the timeout setting while it's
optimizing?  Any help would be greatly appreciated.

 

java.lang.Exception:
    at org.apache.solr.client.SolrClient.update(SolrClient.java:660)
    at org.apache.solr.client.SolrClient.update(SolrClient.java:620)
    at org.apache.solr.client.SolrClient.addDocuments(SolrClient.java:580)
    at org.apache.solr.client.SolrClient.addDocuments(SolrClient.java:595)
    at com.aol.music.search.indexer2.MusicIndexer$SolrUpdateTask.call(MusicIndexer.java:244)
    at com.aol.music.search.indexer2.MusicIndexer$SolrUpdateTask.call(MusicIndexer.java:214)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
    at java.util.concurrent.FutureTask.run(FutureTask.java:123)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:595)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:235)
    at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:77)
    at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:105)
    at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1115)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1373)
    at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1832)
    at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1590)
    at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:995)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
    at org.apache.solr.client.SolrClient.update(SolrClient.java:637)
    ... 10 more

 

~

Daniel Clark, President

DAC Systems, Inc.

(703) 403-0340

~

 



OSSummit Asia / ApacheCon Atlanta

2007-10-23 Thread Erik Hatcher
A bit of self-promotion, sorry, but I also want to make Solr and
Lucene users in general aware of upcoming training sessions at
OSSummit Asia (and ApacheCon Atlanta).  It's a struggle for the
conference organizers to put on training sessions because of the
upfront expense and the risk of them not being filled, so getting the
word out is necessary to ensure these training sessions keep going.
All training providers have been asked to help spread the word to the
right spots, which is here :)


I'm leading two training sessions at OSSummit Asia, as well as  
presenting a regular conference session, all on Lucene and Solr:  
http://www.ossummit.com/2007/program/speaker/20


Grant is leading a Lucene training at ApacheCon Atlanta and Hoss,  
Michael Busch, and Ken Krugler are presenting Lucene/Solr related  
sessions there too.


There are more pointers to details from the Lucene and Solr home
pages: http://lucene.apache.org/java/ and http://lucene.apache.org/solr


Looking forward to seeing some of you soon!

Now back to your regularly scheduled program

Erik



Forced Top Document

2007-10-23 Thread mark angelillo

Hi all,

Is there a way to get a specific document to appear on top of search  
results even if a sorting parameter would push it further down?


Thanks in advance,
Mark

mark angelillo
snooth inc.
o: 646.723.4328
c: 484.437.9915
[EMAIL PROTECTED]
snooth -- 1.8 million ratings and counting...