Re: Custom Cache cleared after a commit?
I guess I'll have to use something other than SolrCache to get what I want, then. Or I could use SolrCache and just change the code (I've already done so much of that anyway...). Anyway, thanks for the reply.
Re: mergeFactor and maxMergeDocs affecting num of segments created
Shawn, when I reindex data using full-import I get:

_0.fdt  3310
_0.fdx  23
_0.frq  857
_0.nrm  31
_0.prx  1748
_0.tis  350
_1.fdt  3310
_1.fdx  23
_1.fnm  1
_1.frq  857
_1.nrm  31
_1.prx  1748
_1.tii  5
_1.tis  350
segments.gen  1
segments_3  1

where all the _1 files are marked as archived (A). And when I run full-import again (for testing) I get _1 and _2 files, where all the _2 files are marked as archived. What does that mean? And the part I don't understand: full-import deletes the old index and creates a new one, so why am I seeing the old files again? - Thanks Regards Romi
Payload doesn't apply to WordDelimiterFilterFactory-generated tokens
Hi, I have a problem with the WordDelimiterFilterFactory and the DelimitedPayloadTokenFilterFactory. It seems that the payloads are applied only to the original word that I index; the WordDelimiterFilter doesn't apply them to the tokens it generates. For example, if I index the string JavaProject|1.7, at the end of my analyzer pipeline it is transformed like this:

JavaProject|1.7 -> javaproject|1.7 java project

Instead, what I would like is a result like this:

JavaProject|1.7 -> javaproject|1.7 java|1.7 project|1.7

This way the payload would be applied to the document even in case of partial matches on the original word. Here I have used the pipe notation, but imagine those payloads already stored internally in Solr. How can I do this? If it is needed, my analyzer looks like this:

<fieldType name="text_C" class="solr.TextField" positionIncrementGap="100" stored="false" indexed="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[a-z]{2,5}[0-9]{1,4}?([.]|[a-z])?(.*)" replacement="" replace="all"/>
    <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" generateNumberParts="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="30"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  ...
</fieldType>

Thank you.
Re: Problem in including both clustering component and spellchecker for solr search results at the same time
Markus, I did it like this:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <lst name="appends">
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
</requestHandler>

I hope I have done things correctly. But when I run the Solr server I get the exception:

org.apache.solr.common.SolrException: Unknown Search Component: clusteringComponent

- Thanks Regards Romi
Re: how to improve query result time.
How long does an average query take? I have noticed that queries with contents like you specified can take a while to return hits. How big is your index?

On Mon, Jul 4, 2011 at 8:48 AM, Jason, Kim hialo...@gmail.com wrote: Hi All. I have complex phrase queries including wildcards (e.g. q=conn* pho*~2 OR inter* pho*~2 OR ...). They take a long time to return results. I tried reindexing after changing termIndexInterval to 8, to reduce query time by loading more term index info. I thought that would make queries faster, but it didn't. I suspect searching the .frq/.prx files is what takes the time... Any ideas for improving query time? I'm using Solr 1.4 and the relevant part of schema.xml is below.

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterWithUnstemFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Thanks in advance.

-- Regards, Dmitry Kan
@field for child object
hi, I'm wondering whether SolrJ's @Field annotation supports embedded child objects? e.g.

class A {
  @Field
  String someField;

  @Embedded   // is something like this possible for a child object?
  B b;
}

regards, kiwi
Spellchecker in zero-hit search result
Hi! I want my spellchecker component to return search query suggestions regardless of the number of items in the search results. (Actually I'd find it most useful in zero-hit cases...) Currently I only get suggestions if the search returns one or more hits.

Example: q=place

<response>
  <result name="response" numFound="20" start="0" maxScore="2.2373123"/>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="place">
        <int name="numFound">4</int>
        <int name="startOffset">0</int>
        <int name="endOffset">5</int>
        <arr name="suggestion">
          <str>place</str>
          <str>places</str>
          <str>placed</str>
        </arr>
      </lst>
      <str name="collation">place</str>
    </lst>
  </lst>
</response>

Example: q=placw

<response>
  <result name="response" numFound="0" start="0" maxScore="0.0"/>
  <lst name="spellcheck">
    <lst name="suggestions"/>
  </lst>
</response>

This is my spellchecker configuration (where I have already fiddled around more than is probably useful):

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">autocomplete</str>
    <float name="threshold">0.005</float>
    <str name="accuracy">0.1</str>
    <str name="buildOnCommit">true</str>
    <float name="thresholdTokenFrequency">.001</float>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">4</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Did I misunderstand anything? Thanks!
configure dismax requestHandler to boost a field
I want to apply boosts during searching: if a query term occurs in both description and name, docs having the query term in the description field should come higher in the search results. For this I configured the dismax request handler as:

<requestHandler name="dismax" class="solr.DisMaxRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">text^0.5 name^1.0 description^1.5</str>
    <str name="fl">UID_PK,name,price,description</str>
    <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.text.hl.fragmenter">regex</str>
  </lst>
</requestHandler>

But I am not seeing any effect on my search results. Do I need to do some more configuration to see the effect? - Thanks Regards Romi
Re: How many fields can SOLR handle?
Nobody? I'm still confused about this.
Problems using Solr with UIMA
Hi All. I tried integrating UIMA into Solr, following the instructions here: https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/README.txt However, I get a solrconfig error when I try to run Solr as a webapp in Eclipse: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory' But the class does exist in the JAR snapshot built from solr/contrib/uima. Any suggestions? I did search the past archives, but did not find anything addressing this particular error... S. -- Sowmya V.B. Losing optimism is blasphemy! http://vbsowmya.wordpress.com
what is the optimum size of Solr indexes
Hi, what is the maximum size a single Solr index can reach while still giving optimum search time? And if I have to index all the documents in my repository (which is TBs in size), what would be the ideal architecture to follow - distributed Solr? Regards, JAME VAALET Software Developer EXT :8108 Capital IQ
Re: what is the optimum size of Solr indexes
There are solutions for indexing huge data volumes, e.g. SolrCloud, ZooKeeper integration, multi-core, multi-shard. Depending on your requirements you can choose one or the other. On 4 July 2011 17:21, Jame Vaalet jvaa...@capitaliq.com wrote: Hi, what is the maximum size a single Solr index can reach while still giving optimum search time? And if I have to index all the documents in my repository (which is TBs in size), what would be the ideal architecture to follow - distributed Solr? Regards, JAME VAALET Software Developer EXT :8108 Capital IQ -- Thanks and Regards Mohammad Shariq
Re: Problems using Solr with UIMA
Hello Sowmya, is the problem a ClassNotFoundException? If so, check that there is a <lib> element referencing the solr-uima jar. Otherwise it may be some configuration error. By the way, which version of Solr are you using? I ask since you're looking at the README for trunk, but you may be using Solr jars from a different version. Cheers, Tommaso 2011/7/4 Sowmya V.B. vbsow...@gmail.com Hi All. I tried integrating UIMA into Solr, following the instructions here: https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/README.txt However, I get a solrconfig error when I try to run Solr as a webapp in Eclipse: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory' But the class does exist in the JAR snapshot built from solr/contrib/uima. Any suggestions? ...
Re: How many fields can SOLR handle?
Hi! I can't help you with the question about the limit on the number of fields, but so far I haven't read anywhere that there is one, so I'd assume there is none. For your second question: Another question: Is it possible to add the FACET fields automatically to my query? facet.field=*_FACET? Now I first do a request to a DB to get the FACET titles and add them to the request: facet.field=cpu_FACET,gpu_FACET. I'm afraid that *_FACET is an overkill solution. You can add parameters automatically (as defaults) to your requests. Look in the solrconfig.xml file for the requestHandler that handles your requests (in the example it's the one starting <requestHandler name="search" class="solr.SearchHandler" default="true">). There is a <lst name="defaults"> where you can add as many request parameters as you like. Is that what you're talking about? On Mon, Jul 4, 2011 at 13:44, roySolr royrutten1...@gmail.com wrote: Nobody? I'm still confused about this.
Re: what is the optimum size of Solr indexes
On Mon, 2011-07-04 at 13:51 +0200, Jame Vaalet wrote: What would be the maximum size of a single SOLR index file for resulting in optimum search time? There is no clear answer. It depends on the number of (unique) terms, the number of documents, bytes on storage, storage speed, query complexity, faceting, the number of concurrent users and a lot of other factors. In case I have got to index all the documents in my repository (which is in TB size) what would be the ideal architecture to follow, distributed SOLR? A TB of source documents might very well end up as a simple, single-machine index of 100GB or less. It depends on the amount of search-relevant information in the documents rather than their size in bytes. If your sources are Word documents or a similar format with a relatively large amount of stuffing, and your searches are mostly simple (the user enters 2-5 terms and hits enter), my guess is that you don't need to worry about distribution yet. Make a pilot. Most of the work you'll have to do for a single-machine test can be reused for a distributed production setup.
Question regarding solr workflow
Hi, what is the workflow of Solr, starting from submitting an XML document to be indexed? Is there any default analyzer that is called before the analyzer specified in my Solr schema for the text field? I have a situation where the words of the text field being analyzed somehow get split. For example, if I index the text field "ABC DEF", I can get it back as "AB C D EF". Thanks engy
Re: configure dismax requestHandler to boost a field
On Mon, Jul 4, 2011 at 13:11, Romi romijain3...@gmail.com wrote: I want to apply boosts during searching: if a query term occurs in both description and name, docs having the query term in the description field should come higher in the search results. For this I configured the dismax request handler as: ... But I am not seeing any effect on my search results. Do I need to do some more configuration to see the effect?

Did you return the score for the queries? Did you compare scores between trials with description^1.5 and, for example, description^10.0? Did you restart Solr after changes to solrconfig.xml? Marian
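One way to check Marian's first two points is to ask Solr for the score pseudo-field and the debug explanations from SolrJ. A minimal sketch, assuming a SolrJ 3.x client, a local Solr on port 8983, and a made-up query term ("diamond"); CommonsHttpSolrServer was the stock HTTP client of that era:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class DismaxScoreCheck {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("diamond");            // hypothetical query term
            q.set("defType", "dismax");                        // use the dismax parser
            q.set("qf", "text^0.5 name^1.0 description^1.5");  // boosts under test
            q.set("fl", "UID_PK,name,description,score");      // request the score explicitly
            q.set("debugQuery", "true");                       // per-document score breakdown
            QueryResponse rsp = server.query(q);
            for (SolrDocument doc : rsp.getResults()) {
                System.out.println(doc.getFieldValue("UID_PK") + " score=" + doc.getFieldValue("score"));
            }
        }
    }

Comparing the printed scores (and the explain section of the debug output) between a description^1.5 run and a description^10.0 run shows quickly whether the qf boosts are being picked up at all.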
Re: Problems using Solr with UIMA
Hi Tommaso, I am using Solr 3.3, which was released last week. The README in the Solr version I have has the same info as the README at that link. There is a lib element in my solrconfig.xml:

<lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />

Here is my trace; from this, it looks like a ClassNotFoundException:

The server encountered an internal error (Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in solr.xml

org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:389)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:423)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:445)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1569)
    at org.apache.solr.update.processor.UpdateRequestProcessorChain.init(UpdateRequestProcessorChain.java:57)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:447)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1553)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1547)
    at org.apache.solr.core.SolrCore.loadUpdateProcessorChains(SolrCore.java:620)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:561)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:133)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
    at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4562)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5240)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5235)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
    ... 25 more
) that prevented it from fulfilling this request.

Thanks. Sowmya.

On Mon, Jul 4, 2011 at 2:15 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hello Sowmya, is the problem a ClassNotFoundException? If so, check that there is a <lib> element referencing the solr-uima jar. ...

-- Sowmya V.B. Losing optimism is blasphemy! http://vbsowmya.wordpress.com
Re: Problems using Solr with UIMA
Hello Sowmya, I've just made a fresh checkout from http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/ and then done the following:

1. cd solr
2. ant example
3. cd solr/contrib/uima
4. ant dist
5. cd ../../example
6. edit solr/conf/solrconfig.xml
7. copy-paste the lib directives:
   <lib dir="../../contrib/uima/lib" />
   <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />
8. copy-paste the <updateRequestProcessorChain name="uima"> element from point 3 of the README [1] into solrconfig.xml
9. create the request handler as in point 4 of the README
10. run java -jar start.jar from the command line

It worked for me. Since you said you were running the webapp from inside Eclipse, I wonder if it's a classpath problem related to Eclipse. Hope this helps, Tommaso

[1] : https://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/solr/contrib/uima/README.txt

2011/7/4 Sowmya V.B. vbsow...@gmail.com Hi Tommaso, I am using Solr 3.3, which was released last week. ...
Solr vs Hibernate Search (Huge number of DB DMLs)
Hi all, there are several places where I could find a discussion on this, but I failed to find one suited to my case, so I'd like to be clear on my requirements so that you may suggest the better solution. A project deals with tons of database tables (with millions of records), out of which some are to be indexed and must of course be searchable. It uses Hibernate for MySQL transactions. As far as I know, there are two candidate solutions for keeping the index and the database in sync effectively. There will be a huge number of transactions (DMLs) on the DB, so I'm wondering which of the following can handle that effectively.

1) Configure a Solr server; query it for searches and send it events for updates. This might be better than handling Lucene directly, since Solr provides index read/write and load balancing. The problem here could be keeping the index and DB in sync with no lag, as the updates (DMLs on the DB) are very frequent - too many events to be sent!

2) Use Hibernate Search. I'm just wondering about its performance considering the high volume of transactions on the DB every minute.

Please suggest. Thanks in advance.
Re: Problems using Solr with UIMA
Hello Tommaso, it was indeed a relative-path issue inside Eclipse. I keyed in the full path instead of ../../ and it ran without throwing an error. However, when I gave an old Lucene index directory's path as the index path and modified schema.xml accordingly, it still says numDocs = 0 on the stats.jsp page. How can I tell Solr to use an already existing Lucene index (which also used UIMA)? This is just to check that the integration works and to make sure I am on the right track. S.

On Mon, Jul 4, 2011 at 2:55 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hello Sowmya, I've just made a fresh checkout from http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/ and then done the following: 1. cd solr 2. ant example ... 2011/7/4 Sowmya V.B. vbsow...@gmail.com Hi Tommaso, I am using Solr 3.3, which was released last week. ...
Re: Custom Cache cleared after a commit?
On Mon, Jul 4, 2011 at 2:07 AM, arian487 akarb...@tagged.com wrote: I guess I'll have to use something other than SolrCache to get what I want, then. Or I could use SolrCache and just change the code (I've already done so much of that anyway...). Anyway, thanks for the reply.

You can specify a regenerator for your cache that examines items in the old cache and pre-populates the new cache when a commit happens. -Yonik http://www.lucidimagination.com
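For reference, the hook Yonik describes is Solr's CacheRegenerator interface: you name an implementation in the regenerator attribute of your cache's entry in solrconfig.xml, and Solr calls it once per old entry when a new searcher is opened. A minimal sketch, assuming the cached values can simply be carried over (a real implementation might recompute them against the new searcher):

    import java.io.IOException;
    import org.apache.solr.search.CacheRegenerator;
    import org.apache.solr.search.SolrCache;
    import org.apache.solr.search.SolrIndexSearcher;

    public class CarryOverRegenerator implements CacheRegenerator {
        // Invoked for each entry of the old cache during warm-up after a commit.
        public boolean regenerateItem(SolrIndexSearcher newSearcher,
                                      SolrCache newCache, SolrCache oldCache,
                                      Object oldKey, Object oldVal) throws IOException {
            // Placeholder policy: keep the old value; recompute it here instead
            // if the value depends on the index contents.
            newCache.put(oldKey, oldVal);
            return true; // returning false aborts further regeneration
        }
    }

The cache's autowarmCount setting controls how many of the old entries are offered to the regenerator.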
Re: Problems using Solr with UIMA
Hello Tommaso, I noticed that though I can see the Solr admin interface, clicking the schema and conf links does not take me to the pages under the webapp's solr/conf/ folder - again, I guess, because of Eclipse paths. This is the trace on the console:

INFO: Solr home set to 'solr/./'
Jul 4, 2011 4:57:58 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/./conf/', cwd=/Users/svajjala/Documents/eclipse/Eclipse.app/Contents/MacOS
    at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:268)
    at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:234)
    at org.apache.solr.core.Config.init(Config.java:141)
    at org.apache.solr.core.SolrConfig.init(SolrConfig.java:131)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:435)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:133)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
    at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4562)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5240)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5235)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: user.dir=/Users/svajjala/Documents/eclipse/Eclipse.app/Contents/MacOS
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init() done
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init()
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: No /solr/home in JNDI
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init() done
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: No /solr/home in JNDI
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrUpdateServlet init
INFO: SolrUpdateServlet.init() done
Jul 4, 2011 4:57:58 PM org.apache.coyote.AbstractProtocolHandler start
INFO: Starting ProtocolHandler [http-bio-8080]
Jul 4, 2011 4:57:58 PM org.apache.coyote.AbstractProtocolHandler start
INFO: Starting ProtocolHandler [ajp-bio-8009]
Jul 4, 2011 4:57:58 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 3661 ms
Jul 4, 2011 4:58:02 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/apache-solr-3.3.0 path=/admin/file/ params={file=schema.xml&contentType=text/xml;charset%3Dutf-8} status=0 QTime=1

I used Solr before from the command line and never had such errors. I am new to IDE usage, not to Solr, so I don't understand the path errors :( S

On Mon, Jul 4, 2011 at 3:41 PM, Sowmya V.B. vbsow...@gmail.com wrote: Hello Tommaso, it was indeed a relative-path issue inside Eclipse. ...
Re: Spellchecker in zero-hit search result
Hi Marian, I guess your problem isn't related to the number of results, but to the component's configuration. The configuration you show sets up an autocomplete component that suggests terms from an incomplete user input (similar to what Google does while you're typing in the search box); see http://wiki.apache.org/solr/Suggester. That's why your suggestions for place are places and placed, all sharing the place prefix. But when you search for placw, the component doesn't return any suggestion, because no term in your index begins with placw. You can learn how to correctly configure a spellchecker here: http://wiki.apache.org/solr/SpellCheckComponent. Also, I'd recommend taking a look at the example's solrconfig, because it provides an example spellchecker configuration. Regards, Juan

On Mon, Jul 4, 2011 at 7:30 AM, Marian Steinbach marian.steinb...@gmail.com wrote: Hi! I want my spellchecker component to return search query suggestions regardless of the number of items in the search results. ... Did I misunderstand anything? Thanks!
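Once a real spellcheck dictionary is in place, the zero-hit case can be verified from SolrJ, which exposes the spellcheck section through a typed response object. A sketch, assuming the /suggest handler name from the config above, a SolrJ 3.x client, and that the handler is reachable via the qt parameter:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.client.solrj.response.SpellCheckResponse;

    public class SpellcheckProbe {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("placw");   // deliberately misspelled term
            q.set("qt", "/suggest");                // handler name assumed from the config
            q.set("spellcheck", "true");
            QueryResponse rsp = server.query(q);
            SpellCheckResponse sc = rsp.getSpellCheckResponse();
            if (sc != null) {
                for (SpellCheckResponse.Suggestion s : sc.getSuggestions()) {
                    System.out.println(s.getToken() + " -> " + s.getAlternatives());
                }
            }
        }
    }

With a dictionary-based spellchecker (rather than the prefix-based Suggester) this should print alternatives even when numFound is 0.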
Is solrj 3.3.0 ready for field collapsing?
Hi, I've tried to add the params group=true and group.field=myfield using SolrQuery, but the result is null. Do I have to configure something? I couldn't find anything in the wiki section on field collapsing. Thanks Per
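If I recall correctly, SolrJ 3.3 can send the grouping parameters but has no typed accessor for the grouped section yet, which would explain the null result; the data is still present in the raw NamedList. A sketch of reading it that way (myfield is the poster's placeholder field name):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.util.NamedList;

    public class GroupingProbe {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("*:*");
            q.set("group", "true");
            q.set("group.field", "myfield");   // assumed field name
            QueryResponse rsp = server.query(q);
            // No typed grouping API in SolrJ 3.3: walk the raw response instead.
            NamedList<?> grouped = (NamedList<?>) rsp.getResponse().get("grouped");
            NamedList<?> byField = (NamedList<?>) grouped.get("myfield");
            System.out.println("matches: " + byField.get("matches"));
        }
    }

The grouped section then contains a groups list per field, each group carrying its own doclist of results.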
A beginner problem
I use Nutch as a search engine. Until now Nutch did both the crawl and the search functions; the newest version, however, delegates search to Solr. I know almost nothing about programming, but I'm able to follow a recipe, so I went to the Solr site, downloaded Solr and tried to follow the tutorial. In the example folder of Solr, using java -jar start.jar, I got:

2011-07-04 13:22:38.439:INFO::Logging to STDERR via org.mortbay.log.StdErrLog
2011-07-04 13:22:38.893:INFO::jetty-6.1-SNAPSHOT
2011-07-04 13:22:38.946:INFO::Started SocketConnector@0.0.0.0:8983

When I tried to go to http://localhost:8983/solr/admin/ I got:

HTTP ERROR: 404 Problem accessing /solr/admin/. Reason: NOT_FOUND

Can someone help me with this? Thanks
Re: Solr vs Hibernate Search (Huge number of DB DMLs)
From my exploration so far, I understood that Solr is the obvious choice when index changes are kept to a minimum. However, my case is exactly the opposite. I'm still unclear about the right solution for the scenario I described. Please share your thoughts.

On Mon, Jul 4, 2011 at 6:28 PM, fire fox fyr3...@gmail.com wrote: Hi all, there are several places where I could find a discussion on this, but I failed to find one suited to my case. ...
Re: Question regarding solr workflow
On Mon, Jul 4, 2011 at 5:47 PM, Engy Morsy engy.mo...@bibalex.org wrote: What is the workflow of Solr, starting from submitting an XML document to be indexed? Is there any default analyzer that is called before the analyzer specified in my Solr schema for the text field? I have a situation where the words of the text field being analyzed somehow get split.

Only the analyzer specified in the Solr schema is applied. You can try the Analysis link on the Solr dashboard to see how the analysis is done for a particular field. -- Regards, Shalin Shekhar Mangar.
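The same inspection the Analysis page does can also be run programmatically: SolrJ ships a FieldAnalysisRequest (backed by the /analysis/field handler in the example solrconfig, available since Solr 1.4) that returns the tokens after each analysis stage. A sketch, with the field name and sample text as assumptions:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.FieldAnalysisRequest;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse;

    public class AnalysisProbe {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            FieldAnalysisRequest req = new FieldAnalysisRequest();
            req.addFieldName("text");          // assumed field name
            req.setFieldValue("ABC DEF");      // the text that comes back split
            FieldAnalysisResponse rsp = req.process(server);
            // Each stage of the index-time chain and its emitted tokens:
            System.out.println(rsp.getFieldNameAnalysis("text"));
        }
    }

Seeing exactly which tokenizer or filter produces "AB C D EF" usually pinpoints the misbehaving stage in the chain.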
Re: Index Version and Epoch Time?
: The index version shown on the dashboard is the time at which the most : recent index segment was created. I'm not sure why it has a value older than : a month if a commit has happened after that time.

I'm fairly certain that's false. Last time I checked, newly created indexes are assigned a version based on index time, but after that each commit simply increments the version -- so index versions are only suitable for comparing whether one instance of an index is newer or older than another instance of the same index; they don't tell you anything about relative age. -Hoss
Re: Index Version and Epoch Time?
On Tue, Jul 5, 2011 at 12:03 AM, Chris Hostetter hossman_luc...@fucit.org wrote: I'm fairly certain that's false. Last time I checked, newly created indexes are assigned a version based on index time, but after that each commit simply increments the version -- so index versions are only suitable for comparing whether one instance of an index is newer or older than another instance of the same index; they don't tell you anything about relative age.

Thanks for clearing that up, Hoss. I only looked at one place where IndexCommit was being created, and it used System.currentTimeMillis, hence the confusion. Anyway, what the version represents is not guaranteed beyond uniquely identifying a commit point, so users should not make any assumptions. -- Regards, Shalin Shekhar Mangar.
Re: upgraded from 2.9 to 3.x, problems. help?
: i recently upgraded all systems for indexing and searching to lucene/solr 3.1,
: and unfortunately it seems there's a lot more changed under the hood than
: there used to be.

It sounds like you are saying you had a system that was working fine for you, but when you tried to upgrade it stopped working.

: i have a java based indexer and a solr based searcher, on the java end for
...
: Analyzer an = new StandardAnalyzer(Version.LUCENE_31, nostopwords);

Right off the bat, that line of code couldn't possibly have been in your existing 2.9 code (Version.LUCENE_31 didn't exist in 2.9), and it instructs StandardAnalyzer to do some very basic things very differently than they were done in 2.9...

http://lucene.apache.org/java/3_1_0/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html

I would start by setting that to Version.LUCENE_29 to tell StandardAnalyzer that you want the same behavior as before. Having said all of that -- the LUCENE_31 behavior is considered better than the LUCENE_29 behavior, so you should consider changing it to get the benefits -- but you need to understand your full analysis stack to do that.

: and for the solr end i have:

...you should also check whether you added a <luceneMatchVersion/> of LUCENE_31 to your solrconfig.xml -- if not, do so, so that it's consistent with your external java code.

Generally speaking, just having your indexer use an off-the-shelf analyzer while your solr instance instead uses something like WordDelimiterFilter isn't going to work well; you need to think about index-time analysis and query-time analysis in conjunction with each other. Hang on, scratch that -- you may think you are using WordDelimiterFilterFactory, but you are not...

: <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
:   <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
:   <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer" ignoreCase="true" />
: </fieldType>

...you can't just plop a <filter/> tag into a <fieldType/> like that and have it mean something. <filter/> can be used when you are declaring a custom analyzer chain in the schema.xml; if you use <analyzer class="..."/> you get a concrete analyzer with hardcoded behavior. So if you aren't getting matches, it's a straight-up discrepancy between the LUCENE_31 setting and whatever setting you have in solrconfig.xml (which, if you didn't add it to your existing config, is going to be a legacy default ... 2.4 or 2.9 ... i can't remember). -Hoss
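A sketch of the indexer-side change Hoss suggests as a first step; nostopwords is the stop-word set from the original post, and LUCENE_29 is only a temporary compatibility setting, not the end state:

    import java.util.Set;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.util.Version;

    public class AnalyzerSetup {
        public static Analyzer buildIndexAnalyzer(Set<?> nostopwords) {
            // Version.LUCENE_29 asks the 3.1 jar to emulate the 2.9 behavior;
            // switch to LUCENE_31 only after the index-time and query-time
            // analysis chains have been reviewed together (and reindex then).
            return new StandardAnalyzer(Version.LUCENE_29, nostopwords);
        }
    }

The same Version constant should then be mirrored by luceneMatchVersion in solrconfig.xml so both sides analyze text identically.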
Re: A beginner problem
It's hard to tell what is happening without more details about your setup. I would start by asking:

- Do you have a firewall installed?
- What operating system do you run Solr on?
- Can you ping the hostname localhost?

Filype

On Tue, Jul 5, 2011 at 4:49 AM, carmme...@qualidade.info wrote: I use Nutch as a search engine. Until now Nutch did both the crawl and the search functions; the newest version, however, delegates search to Solr. ...
Re: How do I compute and store a field?
You can create a custom update processor. The passed AddUpdateCommand object has an accessor for the SolrInputDocument you're about to add; in the processAdd method you can add a new field with whatever you want. The wiki has a good example: http://wiki.apache.org/solr/UpdateRequestProcessor

Hello, I'm trying to add a field that counts the number of terms in a document to my schema. So far I've been computing this value at query time. Is there a way I could compute this once only and store it in a field?

final SolrIndexSearcher searcher = request.getSearcher();
final SolrIndexReader reader = searcher.getReader();
final String content = "content";
final byte[] norms = reader.norms(content);
final int[] docLengths;
if (norms == null) {
    docLengths = null;
} else {
    docLengths = new int[norms.length];
    int i = 0;
    for (byte b : norms) {
        float docNorm = searcher.getSimilarity().decodeNormValue(b);
        int docLength = 0;
        if (docNorm != 0) {
            docLength = (int) (1 / docNorm); // reciprocal
        }
        docLengths[i++] = docLength;
    }
}
...
final NumericField docLenNormField = new NumericField(TestQueryResponseWriter.DOC_LENGHT);
docLenNormField.setIntValue(docLengths[id]);
doc.add(docLenNormField);
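A rough sketch of the processor Markus describes, counting whitespace-separated terms into a new field at index time. The field names are placeholders, and the naive split is a stand-in; a real implementation would run the field's schema analyzer instead:

    import java.io.IOException;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.response.SolrQueryResponse;
    import org.apache.solr.update.AddUpdateCommand;
    import org.apache.solr.update.processor.UpdateRequestProcessor;
    import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

    public class DocLengthUpdateProcessorFactory extends UpdateRequestProcessorFactory {
        @Override
        public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                SolrQueryResponse rsp, UpdateRequestProcessor next) {
            return new UpdateRequestProcessor(next) {
                @Override
                public void processAdd(AddUpdateCommand cmd) throws IOException {
                    SolrInputDocument doc = cmd.getSolrInputDocument();
                    Object content = doc.getFieldValue("content");   // placeholder field
                    if (content != null) {
                        // Crude term count; swap in the schema analyzer for real use.
                        doc.addField("doc_length", content.toString().split("\\s+").length);
                    }
                    super.processAdd(cmd);  // pass the doc down the chain
                }
            };
        }
    }

The factory is then registered in an updateRequestProcessorChain in solrconfig.xml, ahead of the standard run/log processors, so the new field is present before the document is written.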
Re: After the query component has the results, can I do more filtering on them?
: Sorry for the double post but in this case, is it possible for me to access
: the queryResultCache in my component and play with it? Ideally what I want
: is this:

It could be possible to do what you're describing, but it would probably be fairly brittle. I know you said earlier that you can't use any existing components, but I strongly urge you to post the details on *what* you want to do (ie: where are these scores coming from, how are they determined, how often do they change, do all of them change or just some of them, etc..) instead of *how* you want to do it (ie: modify the scores after the search).

Even if an existing tool (like ExternalFileField) can't be used directly in your case, providing the full information about your use case may help people suggest a completely different approach than the one you're considering...

http://people.apache.org/~hossman/#xyproblem XY Problem

Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without giving more details about the X so that we can understand the full issue. Perhaps the best solution doesn't involve Y at all? See Also: http://www.perlmonks.org/index.pl?node_id=542341 -Hoss
Re: How do I compute and store a field?
Gee, I was about to post. I figured my issue is that of computing the unique terms per document. One approach is to run the analyzer on the document before calling addDocument and count the number of tokens; then I can invoke addDocument with the value of the field already computed. The only issue is that I'm assuming that if I use the same Analyzer that addDocument uses, the count will always equal the number of terms indexed for that document. Is that a correct assumption? Is there an alternative where I don't need to make this assumption?

On Tue, Jul 5, 2011 at 1:29 AM, Markus Jelsma markus.jel...@openindex.io wrote: You can create a custom update processor. ...

-- Regards, K. Gabriele --- unchanged since 20/9/10 --- P.S. If the subject contains [LON] or the addressee acknowledges the receipt within 48 hours then I don't resend the email. subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x) < Now + 48h) ⇒ ¬resend(I, this). If an email is sent by a sender that is not a trusted contact or the email does not contain a valid code then the email is not received. A valid code starts with a hyphen and ends with X. ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈ L(-[a-z]+[0-9]X)).
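A sketch of that pre-count, driving the same Analyzer instance by hand through the Lucene 3.x TokenStream API; the field name and analyzer are whatever the indexer already uses. Note this counts all tokens as the post describes, so counting unique terms would additionally need a Set of the term texts:

    import java.io.IOException;
    import java.io.StringReader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;

    public class TermCounter {
        /** Counts the tokens the analyzer would emit for this field/text pair. */
        public static int countTokens(Analyzer analyzer, String field, String text)
                throws IOException {
            TokenStream ts = analyzer.tokenStream(field, new StringReader(text));
            int count = 0;
            ts.reset();
            while (ts.incrementToken()) {
                count++;
            }
            ts.end();
            ts.close();
            return count;
        }
    }

As long as the exact same Analyzer instance and field name are used for the count and for addDocument, the emitted tokens are the same, which is precisely the assumption the post asks about.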
Re: Feed index with analyzer output
: I will be more clear on the steps that I would like to take:
: 1) Call the analyzer of Solr that returns me an XML response in the
: following format (just a snippet as example)
...
: 2) now I would like to be able to extract the info that I need from there
: and tell Solr directly which things to index, telling it directly also
: which are the tokens with their respective payload, without performing more
: analysis.

Can you explain a bit more about what your goal is here? What info are you planning on extracting? What do you intend to change between the info you get back in the first request and the info you want to send in the second request? Smells a little like an XY Problem... http://people.apache.org/~hossman/#xyproblem

...if you *really* wanted to do this you could, but you'd need different field names for the preanalysis fields you'd use in request #1 and the actual content that would be indexed/stored in request #2. Your analyzers and whatnot for request #1 would be exactly what you're used to, but for request #2 you'd need to specify an analyzer that would let you specify, in the field value, the details about the term and position, and offsets, and payloads and whatnot ... the DelimitedPayloadTokenFilterFactory / DelimitedPayloadTokenFilter can help with some of that, but not all -- you'd either need your own custom analyzer or custom FieldType or something, depending on the specific changes you want to make.

Frankly though, I really believe you are going about this backwards -- if you want to manipulate the TokenStream after analysis but before indexing, then why not implement the custom logic that you want in a TokenFilter and use it as the last TokenFilterFactory of your analyzer? -Hoss
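If the TokenFilter route Hoss suggests fits, a filter placed late in the chain can stamp or rewrite payloads on every token it sees. A minimal Lucene 3.x sketch; the single constant payload value is a placeholder for whatever per-token logic is actually needed:

    import java.io.IOException;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.payloads.PayloadHelper;
    import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
    import org.apache.lucene.index.Payload;

    public final class ConstantPayloadFilter extends TokenFilter {
        private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);
        private final byte[] payload;

        public ConstantPayloadFilter(TokenStream input, float value) {
            super(input);
            this.payload = PayloadHelper.encodeFloat(value); // same float encoding as the delimited filter
        }

        @Override
        public boolean incrementToken() throws IOException {
            if (!input.incrementToken()) {
                return false;
            }
            payloadAtt.setPayload(new Payload(payload)); // stamp every emitted token
            return true;
        }
    }

Wrapped in a small TokenFilterFactory and placed after the WordDelimiterFilterFactory, such a filter would also cover the earlier question in this digest about payloads not propagating to word-delimiter-generated tokens.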
full text searching in the cloud for small enterprises
hi all, I want to provide full-text searching for some small websites. Cloud computing seems popular now, and it would save costs because it doesn't require an engineer to maintain the machines. There are many services, such as Amazon S3, Google App Engine, MS Azure etc. I am not familiar with cloud computing. Can anyone give me a direction or some advice? thanks
Re: A beginner problem
Yes, I agree with Filype Pereira. Please describe your problem in detail and check everything he mentioned. Please also check port 8080. - Regards Nilay Tiwari
Re: configure dismax requesthandlar for boost a field
I am not returning the score for the queries; I assumed the boost would be reflected in the search results themselves, i.e. a doc having the query string in the description field would come higher than a doc having the query string in the name field. And yes, I restarted Solr after making the changes in the configuration. - Thanks Regards Romi