Re: Synonyms Phrase not working

2012-10-02 Thread Bernd Fehling
Hi,

because your search for /?q=produto_nome:lubrificante intimo is
a phrase search and will be handled differently.

Your other search gets the synonyms, but the last synonym is a
multi-word synonym and not a phrase:
... produto_nome:lubrificante intimo) ))

See also:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

Regards
Bernd
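A toy model of this behaviour (plain Python, not Solr code; token positions and field handling are simplified) showing why a multi-word synonym comes out as separate terms rather than a phrase:

```python
# Simplified model of query-time synonym expansion: SynonymFilterFactory
# emits one token per word of each synonym, so a multi-word synonym can
# never become a phrase query by itself.

SYNONYMS = {"sexo": ["preservativo", "vaselina", "viagra", "lubrificante intimo"]}

def expand(term):
    """Flat token stream the query analyzer would emit (already lowercased)."""
    tokens = []
    for synonym in SYNONYMS.get(term, [term]):
        tokens.extend(synonym.split())  # multi-word synonyms split into words
    return tokens

print(expand("sexo"))
# ['preservativo', 'vaselina', 'viagra', 'lubrificante', 'intimo']
```

Each resulting token becomes its own term query, which matches the parsedquery Gustav posted below.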


On 01.10.2012 19:02, Gustav wrote:
 Hello Everyone,
 I'm having a problem using the SynonymFilterFactory in a query analyzer.
 
 That's my synonyms.txt file:
 
 sexo => Preservativo, vaselina, viagra, lubrificante intimo
 
 And that is the fieldtype in which it is implemented:
 
 <fieldType class="solr.TextField" name="produto_nome_synonyms"
            positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory"
             enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
     <filter class="solr.ISOLatin1AccentFilterFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.NGramFilterFactory" maxGramSize="25" minGramSize="1"/>
   </analyzer>

   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ISOLatin1AccentFilterFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SynonymFilterFactory" expand="true"
             ignoreCase="true" synonyms="synonyms.txt"
             tokenizerFactory="KeywordTokenizerFactory"/>
     <filter class="solr.StopFilterFactory"
             enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
   </analyzer>
 </fieldType>
 
 The problem here is:
 when I search for /?q=produto_nome:lubrificante intimo Solr returns 8
 documents, which match because of the n-gram filter factory, but when I
 search for /?q=produto_nome:sexo Solr brings back no results.
 I was expecting the same result as for /?q=lubrificante intimo, as configured
 in the synonyms.

 I turned on debugQuery=true and got the following parsedquery:

 <str name="parsedquery">
 +DisjunctionMaxQuery(((produto_nome:preservativo produto_nome:vaselina
 produto_nome:viagra produto_nome:lubrificante intimo) ))
 </str>

 I don't understand why it brings no results.
 Any ideas?
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Synonyms-Phrase-not-working-tp4011237.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 

-- 
*
Bernd Fehling              Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)         LibTec - Bibliothekstechnologie
Universitätsstr. 25        und Wissensmanagement
33615 Bielefeld
Tel. +49 521 106-4060   bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*


Re: Understanding fieldCache SUBREADER insanity

2012-10-02 Thread Aaron Daubman
Hi Yonik,

I've been attempting to fix the SUBREADER insanity in our custom
component, and have made perhaps some progress (or is this worse?) -
I've gone from SUBREADER to VALUEMISMATCH insanity:
---snip---
entries_count : 12
entry#0 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',class
org.apache.lucene.search.FieldCacheImpl$DocsWithFieldCache,null=org.apache.lucene.util.FixedBitSet#1387502754
entry#1 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_track_count',class
org.apache.lucene.search.FieldCacheImpl$DocsWithFieldCache,null=org.apache.lucene.util.Bits$MatchAllBits#233863705
entry#2 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class
org.apache.lucene.search.FieldCache$StringIndex,null=org.apache.lucene.search.FieldCache$StringIndex#652215925
entry#3 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class
java.lang.String,null=[Ljava.lang.String;#1036517187
entry#4 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='thingID',class
java.lang.String,null=[Ljava.lang.String;#357017445
entry#5 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',float,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_FLOAT_PARSER=[F#322888397
entry#6 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',float,org.apache.lucene.search.FieldCache.DEFAULT_FLOAT_PARSER=org.apache.lucene.search.FieldCache$CreationPlaceholder#1229311421
entry#7 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='f_normalizedTotalHotttnesss',float,null=[F#322888397
entry#8 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_collapse',int,org.apache.lucene.search.FieldCache.DEFAULT_INT_PARSER=org.apache.lucene.search.FieldCache$CreationPlaceholder#92920526
entry#9 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_collapse',int,null=[I#494669113
entry#10 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_collapse',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=[I#494669113
entry#11 : 
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='i_track_count',int,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_INT_PARSER=[I#994584654
insanity_count : 1
insanity#0 : VALUEMISMATCH: Multiple distinct value objects for
MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)+s_artistID
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class
org.apache.lucene.search.FieldCache$StringIndex,null=org.apache.lucene.search.FieldCache$StringIndex#652215925
'MMapIndexInput(path=/io01/p/solr/playlist/c/playlist/index/_c2.frq)'='s_artistID',class
java.lang.String,null=[Ljava.lang.String;#1036517187
---snip---

Any suggestions on what the cause of this VALUEMISMATCH is, whether it
is the normal case, or how to fix it?

For anybody else with SUBREADER insanity issues, this is the change I
made to get this far (get the first leafReader, since we are using a
merged/optimized index):
---snip---
SolrIndexReader reader = searcher.getReader().getLeafReaders()[0];
collapseIDs = FieldCache.DEFAULT.getInts(reader, COLLAPSE_KEY_NAME);
hotnessValues = FieldCache.DEFAULT.getFloats(reader, HOTNESS_KEY_NAME);
artistIDs = FieldCache.DEFAULT.getStrings(reader, ARTIST_KEY_NAME);
---snip---

Thanks,
 Aaron

On Wed, Sep 19, 2012 at 4:54 PM, Yonik Seeley yo...@lucidworks.com wrote:
 already-optimized, single-segment index

 That part is interesting... if true, then the type of insanity you
 saw should be impossible, and either the insanity detection or
 something else is broken.

 -Yonik
 http://lucidworks.com


Re: multivalued field question (FieldCache error)

2012-10-02 Thread giovanni.bricc...@banzai.it



I'm also using that field for a facet:

<requestHandler name="mytype" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">1</float>
    <str name="qf">
      ... many fields, but not store_slug
    </str>
    <str name="pf">
      ... many fields, but not store_slug
    </str>
    <str name="fl">
      ..., store_slug
    </str>

    <str name="mm"><![CDATA[1<100% 5<80%]]></str>

    <int name="qs">2</int>
    <int name="ps">2</int>
    <str name="q.alt">*:*</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">10</str>
    <str name="spellcheck.collate">true</str>
    <!-- Facet -->
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="facet.pivot.mincount">0</str>
    <str name="facet.sort">count</str>
    ...
    <str name="facet.field">store_slug</str>
    ...
    <str name="hl">false</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>


On 01/10/12 18:34, Erik Hatcher wrote:

How is your request handler defined?  Using store_slug for anything but fl?

Erik

On Oct 1, 2012, at 10:51, giovanni.bricc...@banzai.it
giovanni.bricc...@banzai.it wrote:


Hello,

I would like to put a multivalued field into a qt definition as an output
field. To do this I edited the current solrconfig.xml definition and added
the field to the fl specification.

Unexpectedly, when I run the query q=*:*&qt=mytype I get the error

<str name="msg">
can not use FieldCache on multivalued field: store_slug
</str>

But if I instead run the query

http://src-eprice-dev:8080/solr/0/select/?q=*:*&qt=mytype&fl=otherfield,mymultivaluedfiled

I don't get the error

Have you got any suggestions?

I'm using solr 4 beta

solr-spec 4.0.0.2012.08.06.22.50.47
lucene-impl 4.0.0-BETA 1370099


Giovanni



--


 Giovanni Bricconi

Banzai Consulting
cell. 348 7283865
ufficio 02 00643839
via Gian Battista Vico 42
20132 Milano (MI)





Re: HttpSolrServer and external load balancer

2012-10-02 Thread Lee Carroll
Cheers, saved the day

Lee C

On 28 September 2012 23:27, Chris Hostetter hossman_luc...@fucit.org wrote:


 : The issue we face is the f5 balancer is returning a cookie which the
 : client is hanging onto, resulting in the same slave being hit for all
 : requests.
 ...
 : My question is: can I configure the solr server to ignore client state?
 : We are on solr 3.4

 I'm not an expert on HTTP session affinity as implemented by various load
 balancers, but I can say with a high degree of confidence:

 1) SolrJ doesn't care about cookies

 2) if any part of the codepath you are using cares about cookies sent back
 from your load-balancer, it would be the HttpClient objects used by
 CommonsHttpSolrServer.

 3) you have total control over the HttpClient objects used by
 CommonsHttpSolrServer via an optional constructor arg.

 4)
 https://hc.apache.org/httpcomponents-client-ga/tutorial/html/statemgmt.html#d5e799

 -Hoss



RE: Problem with spellchecker

2012-10-02 Thread Markus Jelsma
The problem is your stray double quote:
<str name="queryAnalyzerFieldType">text_general_fr"</str>

I'd think this would throw an exception somewhere.
 
 
-Original message-
 From:Jose Aguilar jagui...@searchtechnologies.com
 Sent: Tue 02-Oct-2012 01:40
 To: solr-user@lucene.apache.org
 Subject: Problem with spellchecker
 
 We have configured 2 spellcheckers, English and French, in Solr 4 BETA.  Each 
 spellchecker works with a specific search handler. The English spellchecker 
 is working as expected with any word regardless of the case.  On the other 
 hand, the French spellchecker works with lowercase words. If the first letter 
 is uppercase, then the spellchecker is not returning any suggestion unless we 
 add the spellcheck.q parameter with that term. To further clarify, this 
 doesn't return any corrections:
 
 http://localhost:8984/solr/collection1/handler?wt=xml&q=Système
 
 But this one works as expected:
 
 http://localhost:8984/solr/collection1/handler?wt=xml&q=Système&spellcheck.q=Système
 
 According to this page
 (http://wiki.apache.org/solr/SpellCheckComponent#q_OR_spellcheck.q), the
 spellcheck.q parameter shouldn't be required:

 "If spellcheck.q is defined, then it is used, otherwise the original input
 query is used"
 
 Are we missing something?  We double checked the configuration settings for 
 English which is working fine and it seems well configured.
 
 Here is an extract of the spellcheck component configuration for French 
 language
 
   <searchComponent name="spellcheckfr" class="solr.SpellCheckComponent">
     <str name="queryAnalyzerFieldType">text_general_fr"</str>
     <lst name="spellchecker">
       <str name="name">default</str>
       <str name="field">SpellingFr</str>
       <str name="classname">solr.DirectSolrSpellChecker</str>
       <str name="distanceMeasure">internal</str>
       <float name="accuracy">0.5</float>
       <int name="maxEdits">2</int>
       <int name="minPrefix">1</int>
       <int name="maxInspections">5</int>
       <int name="minQueryLength">4</int>
       <float name="maxQueryFrequency">0.01</float>
       <str name="buildOnCommit">true</str>
     </lst>
   </searchComponent>
 
 Thanks for any help
 


Re: Deploying and securing Solr war in JBoss AS

2012-10-02 Thread Lee Carroll
Hi Billy
see
http://wiki.apache.org/solr/SolrSecurity

One approach is to keep the master internal, with read-only slaves that have
just select handlers defined in the solr config for public-facing requests.

See your app container security docs for other approaches

On 1 October 2012 16:32, Billy Newman newman...@gmail.com wrote:

 I am struggling with how to protect the Solr URLs (esp. the admin
 page(s)) when I deploy Solr to JBoss.  I know that I can extract the
 web.xml from the war and mess with that, but was wondering if there
 was a way to deploy the war as-is and modify some JBoss config file to
 protect that war's URL(s).

 Has anyone deployed this to JBoss successfully, and does anyone have
 experience with how to secure it?

 I looked here for JBoss config with no luck:
 http://wiki.apache.org/solr/SolrJBoss

 Thanks in advance,
 Billy



Re: Synonyms Phrase not working

2012-10-02 Thread Mikhail Khludnev
Gustav,

AFAIK, multi-word synonyms are one of the weak points of Lucene/Solr. I'm
going to propose a solution approach at the forthcoming Eurocon
http://www.apachecon.eu/schedule/presentation/18/ . You are welcome!



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Can I rely on correct handling of interrupted status of threads?

2012-10-02 Thread Robert Krüger
Hi,

I'm using Solr 3.6.1 in an application embedded directly, i.e. via
EmbeddedSolrServer, not over an HTTP connection, which works
perfectly. Our application uses Thread.interrupt() for canceling
long-running tasks (e.g. through Future.cancel). A while (and a few
Solr versions) back a colleague of mine implemented a workaround
because he said that Solr didn't handle the thread's interrupted
status correctly, i.e. not setting the interrupted status after having
caught an InterruptedException or rethrowing it, thus killing the
information that an interrupt has been requested, which breaks
libraries relying on that. However, I did not find anything up-to-date
in mailing list or forum archives on the web. Is that still or was it
ever the case? What does one have to watch out for when interrupting a
thread that is doing anything within Solr/Lucene?

Any advice would be appreciated.

Regards,

Robert


Re: move solr.war to Glassfish and got error running http://host:port/ProjectName/browse

2012-10-02 Thread Iwan Hanjoyo
Hello list,



On Sun, Sep 30, 2012 at 6:43 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:

 Hello all,

 I used the older Solr 3.6.1 version.
 I created a new web project (called SolrRedo) in Netbeans 7.1.1 running on
 the Glassfish web server,
 then moved the sources from the solr.war sample code (which resides inside
 apache-solr-3.6.1.zip)
 into the SolrRedo Netbeans 7.1.1 project.

 I also did some setup (e.g. put the solr.home folder in a proper
 place), then deployed and ran the project.
 I can successfully run it in the browser (including
 http://localhost:8080/SolrRedo/admin/).
 However, I get an HTTP Status 500 error when trying to browse
 http://localhost:8080/SolrRedo/browse/
 How should I fix this problem?

 Kind regards
 Hanjoyo


 Here is the details:
 HTTP Status 500 - lazy loading error org.apache.solr.common.SolrException:
 lazy loading error at
 org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1763)
 at
 org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getContentType(SolrCore.java:1778)
 at
 org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:338)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:273)
 at
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:256)
 at
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:217)
 at
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:279)
 at
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
 at
 org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655)
 at
 org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595)
 at
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:161)
 at
 org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:331)
 at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:231)
 at
 com.sun.enterprise.v3.services.impl.ContainerMapper$AdapterCallable.call(ContainerMapper.java:317)
 at
 com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:195)
 at
 com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849)
 at com.sun.grizzly.http.ProcessorTask.doProcess
 (ProcessorTask.java:746) at
 com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045) at
 com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228)
 at
 com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
 at
 com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
 at
 com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
 at
 com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
 at
 com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
 at
 com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
 at com.sun.grizzly.ContextTask.run
 (ContextTask.java:71) at
 com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
 at
 com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Error loading class
 'solr.VelocityResponseWriter' at
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394)
 at
 org.apache.solr.core.SolrCore.createInstance(SolrCore.java:419) at
 org.apache.solr.core.SolrCore.createQueryResponseWriter(SolrCore.java:487)
 at org.apache.solr.core.SolrCore.access$300
 (SolrCore.java:72) at
 org.apache.solr.core.SolrCore$LazyQueryResponseWriterWrapper.getWrappedWriter(SolrCore.java:1758)

 ... 28 more Caused by: java.lang.ClassNotFoundException:
 solr.VelocityResponseWriter at java.net.URLClassLoader$1.run
 (URLClassLoader.java:366) at
 java.net.URLClassLoader$1.run(URLClassLoader.java:355) at
 java.security.AccessController.doPrivileged(Native Method) at
 java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at
 java.net.FactoryURLClassLoader.loadClass
 (URLClassLoader.java:789) at
 java.lang.ClassLoader.loadClass(ClassLoader.java:356) at
 java.lang.Class.forName0(Native
 Method) at java.lang.Class.forName(Class.java:264) at
 org.apache.solr.core.SolrResourceLoader.findClass
 (SolrResourceLoader.java:378) ... 32 more


Re: move solr.war to Glassfish and got error running http://host:port/ProjectName/browse

2012-10-02 Thread Iwan Hanjoyo
Hello list,

I finally solved the problem: I had missed the configuration of the Solr jar
file in the solrconfig.xml file.
Thank you.

Kind regards,


Hanjoyo

On Tue, Oct 2, 2012 at 5:57 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:






Re: successfully move to glassfish but got error accessing Velocity sample code

2012-10-02 Thread Iwan Hanjoyo
Hello list,

I finally solved the problem: I had missed the configuration of the Solr jar
files in the solrconfig.xml file.
Thank you.

Kind regards,


Hanjoyo

On Mon, Oct 1, 2012 at 8:58 PM, Iwan Hanjoyo ihanj...@gmail.com wrote:

 Hello all,

 First, after extracting the apache-solr-3.6.1.zip file, I can run and access
 http://localhost:8080/browse (the solritas velocity example) from jetty.
 I also successfully moved the solr.war to Glassfish and got it running.

 However, I get an error when accessing http://localhost:8080/browse
 (the solritas velocity example) from Glassfish.
 What configuration is missing here?

 I have copied the solr.home folder as-is from apache-solr-3.6.1.zip.
 Thanks in advance.

 Kind regards,


 Hanjoyo



Tuning DirectUpdateHandler2.addDoc

2012-10-02 Thread Trym R. Møller

Hi

I have been profiling SolrCloud when indexing into a sharded non-replica
collection, because indexing slows down when the index files (*.fdt)
grow to a couple of GB (the largest is about 3.5 GB).


When profiling for a couple of minutes I see that most time is spent in
the DirectUpdateHandler2.addDoc method (called about 8,000 times).
Its time is spent in UpdateLog.lookupVersion, VersionInfo.getVersionFromIndex
and SolrIndexSearcher.lookupId (called about 6,000 times), which in turn
spends its time in AtomicReader.termDocsEnums, which is called about
530,000 times, taking about 770,000 ms.

Is it true that the reason AtomicReader.termDocsEnums is called
530,000/6,000 ≈ 90 times per SolrIndexSearcher.lookupId call is that I
have on average 90 term files?

Can I do anything to lower this number of term files?

I'm running more cores on my SolrCloud instance. Is there any way I can
lower the time spent in each AtomicReader.termDocsEnums method call
(this seems to be much faster when I don't have so many documents in my
collection/shard)?
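The arithmetic above, as a quick sanity check (plain Python; the call counts are the ones quoted from the profiler):

```python
# Profiler numbers quoted in the mail above.
term_docs_enums_calls = 530_000
lookup_id_calls = 6_000

# Roughly one termDocsEnum probe per index segment per id lookup.
probes_per_lookup = term_docs_enums_calls / lookup_id_calls
print(round(probes_per_lookup))  # ~88, i.e. the "≈ 90" above
```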


Thanks as always.

Best regards Trym


mapping values in fields

2012-10-02 Thread tech.vronk

Hi,

I try to map values from one field into other values in another field.
For example:
original_field: orig_value1
mapped_field: mapped_value1

with the help of an explicitly defined (N:1) mapping:
orig_value1 => mapped_value1
orig_value2 => mapped_value1
orig_value3 => mapped_value2

I have tried to use SynonymFilterFactory
for the mapped_field:

<fieldtype name="mapped_field" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory"
            synonyms="region-map.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldtype>

combined with:
<copyField src="original_field" dest="mapped_field"/>


Now, a search for
 mapped_field:mapped_value1
yields results;
however, mapped_value1 does not appear in the result at all,
but instead orig_value1 also appears in the mapped_field.

How can I achieve that the mapped_value appears in the result as well?


thank you,

matej


Re: mapping values in fields

2012-10-02 Thread Erick Erickson
What's the query you send? I'm guessing a bit here since you
haven't included it, but try ensuring two things:

1> your mapped_field has stored="true"
2> you specify (either in your request handler or on the URL) fl=mapped_value

Best
Erick

On Tue, Oct 2, 2012 at 9:04 AM, tech.vronk t...@vronk.net wrote:
 Hi,

 I try to map values from one field into other values in another field.
 For example:
 original_field: orig_value1
 mapped_field: mapped_value1

 with the help of an explicitly defined (N:1) mapping:
 orig_value1 => mapped_value1
 orig_value2 => mapped_value1
 orig_value3 => mapped_value2

 I have tried to use SynonymFilterFactory
 for the mapped_field:

 <fieldtype name="mapped_field" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.SynonymFilterFactory"
             synonyms="region-map.txt" ignoreCase="true" expand="true"/>
   </analyzer>
 </fieldtype>

 combined with:
 <copyField src="original_field" dest="mapped_field"/>


 Now, a search for
  mapped_field:mapped_value1
 yields results;
 however, mapped_value1 does not appear in the result at all,
 but instead orig_value1 also appears in the mapped_field.

 How can I achieve that the mapped_value appears in the result as well?


 thank you,

 matej


Re: mapping values in fields

2012-10-02 Thread tech.vronk


the query is:
  mapped_field:mapped_value1

and seems to correctly return the documents.

the mapped_field has the attribute stored="true"
and also appears in the result (even without requesting it explicitly
with fl),

just with orig_value1 instead of mapped_value1.

matej
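A toy model of what is going on here (plain Python, not Solr internals; it only illustrates Lucene's stored-vs-indexed distinction): index-time analysis, SynonymFilterFactory included, rewrites only the tokens that go into the inverted index, while the stored value returned in search results stays the verbatim original.

```python
# Hypothetical mapping taken from the region-map.txt example above.
SYNONYM_MAP = {"orig_value1": "mapped_value1",
               "orig_value2": "mapped_value1",
               "orig_value3": "mapped_value2"}

def index_doc(value):
    """Model of adding one field value to the index."""
    return {
        "stored": value,  # kept verbatim; this is what search results show
        "indexed": [SYNONYM_MAP.get(t, t) for t in value.split()],  # searchable
    }

doc = index_doc("orig_value1")
print(doc["indexed"])  # ['mapped_value1'] -> mapped_field:mapped_value1 matches
print(doc["stored"])   # 'orig_value1'    -> what appears in the response
```

So the search matches on the mapped token, but the response shows the stored original; to make mapped_value1 appear in results you would have to rewrite the value before it is stored (e.g. in your indexing code or an update processor).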

On 02.10.2012 15:46, Erick Erickson wrote:

What's the query you send? I'm guessing a bit here since you
haven't included it, but try ensuring two things:

1> your mapped_field has stored="true"
2> you specify (either in your request handler or on the URL) fl=mapped_value

Best
Erick

On Tue, Oct 2, 2012 at 9:04 AM, tech.vronk t...@vronk.net wrote:







Re: At a high level how does faceting in SolrCloud work?

2012-10-02 Thread Jamie Johnson
Thanks for this guys, really excellent explanation!

On Thu, Sep 27, 2012 at 12:15 AM, Yonik Seeley yo...@lucidworks.com wrote:
 On Wed, Sep 26, 2012 at 6:21 PM, Chris Hostetter
 hossman_luc...@fucit.org wrote:
 2) the coordinator node sums up the counts for any constraint returned by
 multiple nodes, and then picks the top (facet.limit) constraints based on
 the counts it knows about.

 It's actually more sophisticated than that - we don't limit to the top
 facet.limit constraints at the first phase.
 For *all* constraints we see from the first phase, we calculate if it
 could possibly be in the top facet.limit constraints (based on shards
 we haven't heard from).  If so, we request exact counts from those
 shards we haven't heard from.

 (but I believe this second query
 is optimized to only ask a shard about a constraint if it didn't already
 get the count in the first request)

 Correct.

 So imagine you have 3 shards, and querying them individually with
 facet.field=cat&facet.limit=3 you get...

 shardA: cars(8), books(7), computers(6)
 shardB: toys(8), books(7), garden(5)
 shardC: garden(4), books(3), computers(3)

 If you made a solr cloud query (or an explicit distributed query of those
 three shards), the first request the coordinator would send to each shard
 would specify a higher facet.limit, and might get back something like...

 shardA: cars(8), books(7), computers(6), cleaning(4), ...
 shardB: toys(8), books(7), garden(5), cleaning(4), ...
 shardC: garden(4), books(3), computers(3), plants(3), ...

 ...in which case cleaning pops up as a contender for being in the top
 constraints.  The coordinator sums up the counts for the constraints it
 knows about, and might decide that these are the top 3...

 books(17), computers(9), cleaning(8)

 To extend your example, Solr notices that plants has a count of 3 on
 one shard, and was missing from the other two shards.
 The maximum possible count it *could* have is 11 (3+4+4), which could
 possibly put it in the top 3, hence it will also ask shardA and shardB
 about plants.

 -Yonik
 http://lucidworks.com
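Yonik's bound can be sketched in a few lines (plain Python; it assumes, as in the example, that each shard returned its full top-N list, so an unseen constraint on a shard can count at most as much as the smallest count that shard did return):

```python
# Counts each shard returned in the first phase (from the example above).
SHARD_COUNTS = {
    "shardA": {"cars": 8, "books": 7, "computers": 6, "cleaning": 4},
    "shardB": {"toys": 8, "books": 7, "garden": 5, "cleaning": 4},
    "shardC": {"garden": 4, "books": 3, "computers": 3, "plants": 3},
}

def max_possible(term):
    """Upper bound on a constraint's total count across all shards."""
    total = 0
    for counts in SHARD_COUNTS.values():
        # Exact count if the shard reported the term; otherwise its count
        # there is at most the smallest count the shard returned.
        total += counts.get(term, min(counts.values()))
    return total

print(sum(c.get("books", 0) for c in SHARD_COUNTS.values()))  # 17, exact
print(max_possible("plants"))  # 3 + 4 + 4 = 11 -> could be top-3, refine it
```

If the bound can reach the current top facet.limit counts, the coordinator asks the missing shards for exact counts; otherwise the constraint is safely dropped.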


Re: At a high level how does faceting in SolrCloud work?

2012-10-02 Thread Jamie Johnson
So does mincount get considered in this as well?

On Tue, Oct 2, 2012 at 10:19 AM, Jamie Johnson jej2...@gmail.com wrote:
 Thanks for this guys, really excellent explanation!

 On Thu, Sep 27, 2012 at 12:15 AM, Yonik Seeley yo...@lucidworks.com wrote:
 On Wed, Sep 26, 2012 at 6:21 PM, Chris Hostetter
 hossman_luc...@fucit.org wrote:
 2) the coordinator node sums up the counts for any constraint returned by
 multiple nodes, and then picks the top (facet.limit) constraints based on
 the counts it knows about.

 It's actually more sophisticated than that - we don't limit to the top
 facet.limit constraints at the first phase.
  For *all* constraints we see from the first phase, we calculate whether
  each could possibly be in the top facet.limit constraints (based on shards
 we haven't heard from).  If so, we request exact counts from those
 shards we haven't heard from.

  (but i believe this second query
  is optimized to only ask a shard about a constraint if it didn't already
  get the count in the first request)

 Correct.

 So imagine you have 3 shards, and querying them individually with
  facet.field=cat&facet.limit=3 you get...

 shardA: cars(8), books(7), computers(6)
 shardB: toys(8), books(7), garden(5)
 shardC: garden(4), books(3), computers(3)

 If you made a solr cloud query (or an explicit distributed query of those
 three shards), the first request the coordinator would send to each shard
 would specify a higher facet.limit, and might get back something like...

 shardA: cars(8), books(7), computers(6), cleaning(4), ...
 shardB: toys(8), books(7), garden(5), cleaning(4), ...
 shardC: garden(4), books(3), computers(3), plants(3), ...

 ...in which case cleaning pops up as a contender for being in the top
 constraints.  The coordinator sums up the counts for the constraints it
 knows about, and might decide that these are the top 3...

 books(17), computers(9), cleaning(8)

 To extend your example, Solr notices that plants has a count of 3 on
 one shard, and was missing from the other two shards.
 The maximum possible count it *could* have is 11 (3+4+4), which could
 possibly put it in the top 3, hence it will also ask shardA and shardB
 about plants.

 -Yonik
 http://lucidworks.com


Re: mapping values in fields

2012-10-02 Thread Erick Erickson
Ah, I get it (finally). OK, there's no good way to do what
you want that I know of. The problem is that the
stored=true takes effect long before any transformations
are applied, and is always the raw input. You effectively
want to chain the fields together, i.e. apply the analysis
chain _then_ have the copyField take effect, which is not
supported.

I don't know how to accomplish this OOB off the top of my
head; I'd guess your client would have to manage the
substitutions and then just index separate fields...

Best
Erick
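Erick's suggestion — have the client manage the substitutions — can be sketched like this (a hypothetical helper, not Solr code; the field and value names come from the thread): apply the N:1 map yourself and index the mapped value as its own field instead of relying on copyField.

```java
import java.util.Map;

public class ValueMapper {

    // The explicit N:1 mapping from the thread (the region-map.txt equivalent).
    static final Map<String, String> MAPPING = Map.of(
            "orig_value1", "mapped_value1",
            "orig_value2", "mapped_value1",
            "orig_value3", "mapped_value2");

    // Resolve the stored value for mapped_field; fall back to the original.
    static String mapValue(String original) {
        return MAPPING.getOrDefault(original, original);
    }

    public static void main(String[] args) {
        // The client would then set both fields on the input document, e.g.:
        //   doc.addField("original_field", "orig_value2");
        //   doc.addField("mapped_field", mapValue("orig_value2"));
        System.out.println(mapValue("orig_value2")); // mapped_value1
    }
}
```

Because the substitution happens before the document reaches Solr, the mapped value is what gets stored and returned.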

On Tue, Oct 2, 2012 at 9:54 AM, tech.vronk t...@vronk.net wrote:

 the query is:
   mapped_field:mapped_value1

 and seems to correctly return the documents.

 the mapped_field has attribute stored=true
 and also appears in the result (even without requesting it explicitly with
 fl),
 just with the orig_value1 instead of mapped_value1

 matej

 Am 02.10.2012 15:46, schrieb Erick Erickson:

  What's the query you send? I'm guessing a bit here since you
  haven't included it, but try ensuring two things:
 
  1  your mapped_field has 'stored=true'
  2  you specify (either in your request handler or on the URL)
  fl=mapped_value

 Best
 Erick

 On Tue, Oct 2, 2012 at 9:04 AM, tech.vronk t...@vronk.net wrote:

 Hi,

 I try to map values from one field into other values in another field.
 For example:
 original_field: orig_value1
 mapped_field: mapped_value1

 with the help of an explicitly defined (N:1) mapping:
 orig_value1 = mapped_value1
 orig_value2 = mapped_value1
 orig_value3 = mapped_value2

 I have tried to use SynonymFilterFactory
 for the mapped_field:

 fieldtype name=mapped_field class=solr.TextField
analyzer type=index
tokenizer class=solr.WhitespaceTokenizerFactory/
filter class=solr.SynonymFilterFactory
 synonyms=region-map.txt ignoreCase=true expand=true/
/analyzer

 combined with:
 copyField src=original_field dest=mapped_field /


 Now, a search for
   mapped_field:mapped_value1
 yields results,
 however in the result the mapped_value1 does not appear at all,
 but instead the orig_value1 appears also in the mapped_field.

 How can I achieve, that the mapped_value appears in the result as well?


 thank you,

 matej





Question about MoreLikeThis query with solrj

2012-10-02 Thread G.Long

Hi :)

I'm using Solr 3.6.1 and i'm trying to use the similarity features of 
lucene/solr to compare texts.


The content of my documents is in french so I defined a field like :

field name=content_mlt type=text_fr termVectors=true 
indexed=true stored=true/


(it uses the default text_fr fieldType provided with the default 
schema.xml file)


i'm using the following method to query my index :

SolrQuery sQuery = new SolrQuery();
sQuery.setQueryType("/" + MoreLikeThisParams.MLT);
sQuery.set(MoreLikeThisParams.MATCH_INCLUDE, false);
sQuery.set(MoreLikeThisParams.MIN_DOC_FREQ, 1);
sQuery.set(MoreLikeThisParams.MIN_TERM_FREQ, 1);
sQuery.set(MoreLikeThisParams.MAX_QUERY_TERMS, 50);
sQuery.set(MoreLikeThisParams.SIMILARITY_FIELDS, field);
sQuery.set("fl", "*,id,score");
sQuery.setRows(5);
sQuery.setQuery("content_mlt:" + content); // content = the text to find

QueryResponse rsp = server.query(sQuery);
return rsp.getResults();

The problem is that the returned results and the associated scores look 
strange to me.


I indexed the three following texts :

sample 1 :
Le 1° de l'article 81 du CGI exige que les allocations pour frais 
soient utilisées conformément à leur objet
pour être affranchies de l'impôt. Lorsque la réalité du versement des 
allocations est établie,
le bénéficiaire doit cependant être en mesure de justifier de leur 
utilisation;


sample 2:
Le premier alinéa du 1° de l'article 81 du CGI prévoit que les 
rémunérations des journalistes,
rédacteurs, photographes, directeurs de journaux et critiques 
dramatiques et musicaux
perçues ès qualités constituent des allocations pour frais d'emploi 
affranchies d'impôt

à concurrence de 7 650 EUR.;

sample 3:
Par ailleurs, lorsque leur montant est fixé par voie législative, les 
allocations
pour frais prévues au 1° de l'article 81 du CGI sont toujours réputées 
utilisées
conformément à leur objet et ne peuvent donner lieu à aucune 
vérification de la part de l'administration.
Il s'agit d'une présomption irréfragable, qui ne peut donc pas être 
renversée par la preuve contraire qui
serait apportée par l'administration d'une utilisation non conforme à 
son objet de l'allocation concernée.
Pour que le deuxième alinéa du 1° de l'article 81 du CGI s'applique, 
deux conditions doivent être réunies
simultanément : - la nature d'allocation spéciale inhérente à la 
fonction ou à l'emploi résulte directement de la loi ;

- son montant est fixé par la loi;

I tried to query the index by passing the first sample as the content to 
query and the result is the following :

MLT result: id: dc3 - score: 0.114195324 (correspond to the sample 3)
MLT result: id: dc2 - score: 0.035233106 (correspond to the sample 2)

The results don't even contain the first sample, although it is exactly 
the same text as the one put into the query :/


Any idea of why I get these results?
Maybe the query parameters are incorrect or there is something to change 
in the solr config?


Thanks :)

Gary






Hierarchical Data

2012-10-02 Thread Maurizio Cucchiara
Hi all,
I'm trying to import some hierarchical data (stored in MySQL) on Solr,
using DataImportHandler.
Unfortunately, as most of you already know, MySQL has no support for
recursive queries, so there is no way to fetch hierarchical data stored
as an adjacency list.
So I considered writing a custom DIH transformer which, given a
specified SQL query (like select * from categories) and a value (e.g.
category_id):
* fetches all data
* builds a hierarchical representation of the fetched data
* optionally caches the hierarchical data structure
* then returns 2 multi-valued lists which contain the 2 full paths (as
String and as Number)

Is there something out of the box?
Alternatively, does the above approach sound good?

TIA


Twitter :http://www.twitter.com/m_cucchiara
G+  :https://plus.google.com/107903711540963855921
Linkedin:http://www.linkedin.com/in/mauriziocucchiara
VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

Maurizio Cucchiara


RE: SolrJ - IOException

2012-10-02 Thread balaji.gandhi
Hi Toke,

We encountered this issue again. This time the SOLR servers were stalled. We 
are at 30 TPS.

Please let us know any updates in the HTTP issue.

Thanks,
Balaji

Balaji Gandhi, Senior Software Developer, Horizontal Platform Services
Product Engineering  │  Apollo Group, Inc.
1225 W. Washington St.  |  AZ23  |  Tempe, AZ  85281
Phone: 602.713.2417  |  Email: 
balaji.gan...@apollogrp.edu

Go Green. Don't Print. Moreover soft copies can be indexed by algorithms.

From: Balaji Gandhi
Sent: Thursday, September 27, 2012 10:52 AM
To: 'Toke Eskildsen [via Lucene]'
Subject: RE: SolrJ - IOException

Here is the stack trace:-

org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server:
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414)
 at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
 at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:122) at 
org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:107) at 
org.apache.solr.handler.dataimport.thread.task.SolrUploadTask.upload(SolrUploadTask.java:31)
 at 
org.apache.solr.handler.dataimport.thread.SolrUploader.run(SolrUploader.java:31)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at 
java.lang.Thread.run(Unknown Source) Caused by: 
org.apache.http.NoHttpResponseException: The target server failed to respond at 
org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101)
 at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
 at 
org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
 at 
org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
 at 
org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
 at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
 at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
 at 
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
 at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
 at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
 at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
 at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
 at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353)
 ... 9 more


From: Toke Eskildsen [via Lucene] 
[mailto:ml-node+s472066n4010082...@n3.nabble.com]
Sent: Tuesday, September 25, 2012 12:19 AM
To: Balaji Gandhi
Subject: Re: SolrJ - IOException

On Tue, 2012-09-25 at 01:50 +0200, balaji.gandhi wrote:
 I am encountering this error randomly (under load) when posting to Solr
 using SolrJ.

 Has anyone encountered a similar error?

 org.apache.solr.client.solrj.SolrServerException: IOException occured when
 talking to server at: http://localhost:8080/solr/profile at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414)
[...]

This looks suspiciously like a potential bug in the HTTP keep-alive flow
that we encountered some weeks ago. I am guessing that you are issuing
more than 100 separate updates/second. Could you please provide the full
stack trace?





Re: Hierarchical Data

2012-10-02 Thread Davide Lorenzo Marino
Hi Maurizio,
if you can manipulate your MySql db a simpler solution can be the following:

1 - Add a new field for your hierarchical data inside your table
MY_HIERARCHICAL_FIELD
2 - Populate directly in MySql this new field with a simple procedure*
3 - Import the data in your Solr index

*The MySql procedure could be similar to the following:

1 - iterate your table and find the topmost records not yet processed (i.e.
with MY_HIERARCHICAL_FIELD empty and either no father, or a father whose
MY_HIERARCHICAL_FIELD is NOT empty)
2 - find the ancestor of the records found
3 - update your MY_HIERARCHICAL_FIELD field with the value of
the father's MY_HIERARCHICAL_FIELD || '/' || the current record's MY_ID

At the end of the procedure you will have your MY_HIERARCHICAL_FIELD
populated with values like '12/35/45/154'
The value '12/35/45/154' means that the current record has id 154 and it is
a child of record 45 that is a child of 35 that is a child of 12 that has
no parents.
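The same path enumeration can also be computed client-side from a plain adjacency list (a hedged sketch, not DIH transformer or MySQL code; the ids match the '12/35/45/154' example): a memoized walk from each record up to its root.

```java
import java.util.HashMap;
import java.util.Map;

public class PathEnumerator {

    // parentOf: child id -> parent id (null = root record).
    // Returns id -> full path, e.g. 154 -> "12/35/45/154".
    static Map<Integer, String> buildPaths(Map<Integer, Integer> parentOf) {
        Map<Integer, String> paths = new HashMap<>();
        for (Integer id : parentOf.keySet()) {
            pathOf(id, parentOf, paths);
        }
        return paths;
    }

    private static String pathOf(Integer id, Map<Integer, Integer> parentOf,
                                 Map<Integer, String> memo) {
        String cached = memo.get(id);
        if (cached != null) return cached;
        Integer parent = parentOf.get(id);
        String path = (parent == null)
                ? id.toString()
                : pathOf(parent, parentOf, memo) + "/" + id;
        memo.put(id, path);
        return path;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> parentOf = new HashMap<>();
        parentOf.put(12, null);   // root
        parentOf.put(35, 12);
        parentOf.put(45, 35);
        parentOf.put(154, 45);
        System.out.println(buildPaths(parentOf).get(154)); // 12/35/45/154
    }
}
```

The memoization makes each record's path cost O(1) after its ancestors have been visited, so one pass over the table is enough.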




2012/10/2 Maurizio Cucchiara mcucchi...@apache.org

 Hi all,
 I'm trying to import some hierarchical data (stored in MySQL) on Solr,
 using DataImportHandler.
 Unfortunately, as most of you already know, MySQL has no support for
 recursive queries, so there is no way to fetch hierarchical data stored
 as an adjacency list.
 So I considered writing a custom DIH transformer which, given a
 specified SQL query (like select * from categories) and a value (e.g.
 category_id):
 * fetches all data
 * builds a hierarchical representation of the fetched data
 * optionally caches the hierarchical data structure
 * then returns 2 multi-valued lists which contain the 2 full paths (as
 String and as Number)

 Is there something out of the box?
 Alternatively, does the above approach sound good?

 TIA


 Twitter :http://www.twitter.com/m_cucchiara
 G+  :https://plus.google.com/107903711540963855921
 Linkedin:http://www.linkedin.com/in/mauriziocucchiara
 VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

 Maurizio Cucchiara



Re: multivalued filed question (FieldCache error)

2012-10-02 Thread Chris Hostetter

: I'm also using that field for a facet:

Hmmm... that still doesn't make sense.  Faceting can use FieldCache, but 
it will check if the field is multivalued to decide if/when/how to do this.

There's nothing else in your requestHandler config that would suggest why 
you might get this error.

can you please provide more details about the error you are getting -- in 
particular: the complete stack trace from the server logs.  that should 
help us identify the code path leading to the problem.


: 
: requestHandler  name=mytype  class=solr.SearchHandler  
: lst  name=defaults
:  str  name=defTypedismax/str
:  str  name=echoParamsexplicit/str
:  float  name=tie1/float
:  str  name=qf
:many field but not store_slug
:  /str
:  str  name=pf
:many field but not store_slug
:  /str   str  name=fl
: ..., store_slug
:  /str
:   str  name=mm![CDATA[1<100% 5<80%]]/str
:  int  name=qs2/int
:  int  name=ps2/int
:  str  name=q.alt*:*/str
: str  name=spellcheck.dictionarydefault/str
:   str  name=spellchecktrue/str
:   str  name=spellcheck.extendedResultstrue/str
:   str  name=spellcheck.count10/str
:   str  name=spellcheck.collatetrue/str  !-- Facet --
: str  name=facettrue/str
: str  name=facet.mincount1/str
: str  name=facet.pivot.mincount0/str
: str  name=facet.sortcount/str
: ...
: str  name=facet.fieldstore_slug/str
: ...
: str  name=hlfalse/str
: /lst
: arr  name=last-components
:   strspellcheck/str
: /arr
: 
:   /requestHandler
: 
: 
: Il 01/10/12 18:34, Erik Hatcher ha scritto:
:  How is your request handler defined?  Using store_slug for anything but fl?
:  
:  Erik
:  
:  On Oct 1, 2012, at 10:51,giovanni.bricc...@banzai.it
:  giovanni.bricc...@banzai.it  wrote:
:  
:   Hello,
:   
:   I would like to put a multivalued field into a qt definition as output
:   field. to do this I edit the current solrconfig.xml definition and add the
:   field in the fl specification.
:   
:   Unexpectedly when I do the query q=*:*&qt=mytype I get the error
:   
:   str name=msg
:   can not use FieldCache on multivalued field: store_slug
:   /str
:   
:   But if I instead run the query
:   
:   
http://src-eprice-dev:8080/solr/0/select/?q=*:*&qt=mytype&fl=otherfield,mymultivaluedfiled
:   
:   I don't get the error
:   
:   Have you got any suggestions?
:   
:   I'm using solr 4 beta
:   
:   solr-spec 4.0.0.2012.08.06.22.50.47
:   lucene-impl 4.0.0-BETA 1370099
:   
:   
:   Giovanni
: 
: 
: -- 
: 
: 
:  Giovanni Bricconi
: 
: Banzai Consulting
: cell. 348 7283865
: ufficio 02 00643839
: via Gian Battista Vico 42
: 20132 Milano (MI)
: 
: 
: 
: 

-Hoss


Re: Hierarchical Data

2012-10-02 Thread Maurizio Cucchiara
Ciao Davide,

Unfortunately changing the structure of the dbs is not an option for
me (there are many legacy dbs involved), otherwise I would have chosen
a /closure table/ instead of the /path enumeration/ you mentioned
before.

Furthermore, I'd need 2 PE fields: 1 for the values (ids) and 1 for
the label (names).

Also, I'm looking for a definite general solution.

Twitter :http://www.twitter.com/m_cucchiara
G+  :https://plus.google.com/107903711540963855921
Linkedin:http://www.linkedin.com/in/mauriziocucchiara
VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

Maurizio Cucchiara


Re: Can I rely on correct handling of interrupted status of threads?

2012-10-02 Thread Mikhail Khludnev
I remember a bug in EmbeddedSolrServer in 1.4.1 where an exception bypassed
request closing, which led to a searcher leak and OOM. It was fixed about two
years ago.

On Tue, Oct 2, 2012 at 1:48 PM, Robert Krüger krue...@lesspain.de wrote:

 Hi,

 I'm using Solr 3.6.1 in an application embedded directly, i.e. via
 EmbeddedSolrServer, not over an HTTP connection, which works
 perfectly. Our application uses Thread.interrupt() for canceling
 long-running tasks (e.g. through Future.cancel). A while (and a few
 Solr versions) back a colleague of mine implemented a workaround
 because he said that Solr didn't handle the thread's interrupted
 status correctly, i.e. not setting the interrupted status after having
 caught an InterruptedException or rethrowing it, thus killing the
 information that an interrupt has been requested, which breaks
 libraries relying on that. However, I did not find anything up-to-date
 in mailing list or forum archives on the web. Is that still or was it
 ever the case? What does one have to watch out for when interrupting a
 thread that is doing anything within Solr/Lucene?

 Any advice would be appreciated.

 Regards,

 Robert




-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com
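The convention Robert is relying on looks like this in generic Java (not Solr code): a library that catches InterruptedException should either rethrow it or restore the flag with Thread.currentThread().interrupt(), so callers such as the Future.cancel machinery still see the cancellation request.

```java
public class InterruptDemo {

    // Sleep without throwing, but preserve the caller-visible interrupt
    // status -- the behavior a well-behaved library should have.
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();   // simulate a cancel request
        boolean completed = sleepQuietly(10); // sleep throws immediately
        // The flag survived the catch block, so cancellation stays visible.
        System.out.println(completed + " " + Thread.currentThread().isInterrupted());
        // prints: false true
    }
}
```

If the catch block swallowed the exception without the interrupt() call, the second value would be false and the cancellation information would be lost.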


WordBreak spell correction makes split terms optional?

2012-10-02 Thread Carrie Coy
The user query "design your ownbinoculars" is corrected by the 
'wordbreak' dictionary to:


str name=querystringdesign your (own binoculars)/str

Where are the parentheses coming from?  Can I strip them with a 
post-processing filter?   The parentheses make the terms optional, so, 
while the first match is excellent, the rest are irrelevant.


Thx,
Carrie Coy





Re: PHP client for a web application

2012-10-02 Thread Damien Camilleri
Hi Esteban. I'm currently using both in my application. Both are fine.
Solarium is great because it models the concepts of Solr and can build
queries using OOP. The other one is lower level, so you have to write
queries manually, which can be good in some situations. Both are fast
enough. Solarium has a bigger learning curve. Solarium has built-in batch
updating and other things like parallel queries. So I would go with
Solarium. It's a very nice library.
On Oct 3, 2012 5:38 AM, Esteban Cacavelos estebancacave...@gmail.com
wrote:

 Hi, I'm starting a web application using solr as a search engine. The web
 site will be developed in PHP (maybe I'll use a framework also).

 I would like to know some thoughts and opinions about the clients (
 http://wiki.apache.org/solr/SolPHP). I didn't like very much the PHP
 extension option because I think this is a limitation. So, I would like to
 read opinions about SOLARIUM and SOLR-PHP-CLIENT.


 Thanks in advance!


 --
 Esteban L. Cacavelos de Amoriza
 Cel: 0981 220 429



RE: WordBreak spell correction makes split terms optional?

2012-10-02 Thread Dyer, James
The parentheses are being added by the spellchecker.  I tried to envision a 
number of different scenarios when designing how this would work and at the 
time it seemed best to add parentheses around terms that originally were 
together but now are split up.  From your example, I see this is a mistake.

I see no reason why you can't just strip these, or use the 
collateExtendedResults to construct your own query, for now.  Would you mind 
giving a little more detail on your query parameters in a JIRA bug report, so 
we can track this problem and fix it?

Overall, it's a bit tricky when splitting one term into multiple to make sure that 
if the original term was required|optional|prohibited, all the resulting 
terms also have the same status.  See WordBreakSolrSpellCheckerTest#testCollate 
for details.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
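Until it's fixed, stripping the parentheses from the collation is a trivial post-processing step (a hedged sketch; it assumes bare parentheses never appear legitimately in your collations):

```java
public class CollationCleaner {

    // Remove the grouping parentheses the spellchecker adds around
    // word-break splits, so the split terms are no longer optional.
    static String stripParens(String collation) {
        return collation.replace("(", "").replace(")", "");
    }

    public static void main(String[] args) {
        System.out.println(stripParens("design your (own binoculars)"));
        // prints: design your own binoculars
    }
}
```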


-Original Message-
From: Carrie Coy [mailto:c...@ssww.com] 
Sent: Tuesday, October 02, 2012 3:08 PM
To: solr-user@lucene.apache.org
Subject: WordBreak spell correction makes split terms optional?

The user query "design your ownbinoculars" is corrected by the 
'wordbreak' dictionary to:

str name=querystringdesign your (own binoculars)/str

Where are the parentheses coming from?  Can I strip them with a 
post-processing filter?   The parentheses make the terms optional, so, 
while the first match is excellent, the rest are irrelevant.

Thx,
Carrie Coy







NoHttpResponseException using Solrj to index

2012-10-02 Thread Rui Vaz
Hey, I am trying to make a simple application using SolrJ to index documents.
I used start.jar to start Solr. When I try to index a document to
Solr
I get the following exception:

Exception in thread main java.lang.NoClassDefFoundError:
org/apache/http/NoHttpResponseException

The exception occurs when I instantiate SolrServer:


public static void indexFilesSolrCell(File srcFile, String solrId)
throws IOException, SolrServerException {

String urlString = "http://localhost:8983/solr";

SolrServer solr = new HttpSolrServer(urlString);

ContentStreamUpdateRequest up
  = new ContentStreamUpdateRequest(/update/extract);

I already import the Apache solr-solrj library

Thank you,
-- 
Rui Vaz


Re: Can SOLR Index UTF-16 Text

2012-10-02 Thread Lance Norskog
If it is a simple text file, does that text file start with the UTF-16 BOM 
marker?
http://unicode.org/faq/utf_bom.html

Also, do UTF-8 files work? If not, then your setup has a basic encoding problem.
And, when you post such a text file (for example, with curl), use the UTF-16 
charset mime-type: I think it is text/plain; charset=utf-16.
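A quick way to check for the BOM Lance mentions (a sketch over raw bytes; in practice you would read the first few bytes of the file in question):

```java
public class BomCheck {

    // Identify a Unicode byte-order mark from the first bytes of a file.
    static String detectBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF, b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
            if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        }
        return "no BOM";
    }

    public static void main(String[] args) {
        byte[] head = {(byte) 0xFF, (byte) 0xFE, 0x4C, 0x00}; // "L" in UTF-16LE
        System.out.println(detectBom(head)); // prints: UTF-16LE
    }
}
```

Note that UTF-8 files are not required to carry a BOM, so "no BOM" does not by itself mean the file is broken.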


- Original Message -
| From: Chris Hostetter hossman_luc...@fucit.org
| To: solr-user@lucene.apache.org
| Sent: Friday, September 28, 2012 5:17:15 PM
| Subject: Re: Can SOLR Index UTF-16 Text
| 
| 
| : Our SOLR setup  (4.0.BETA on Tomcat 6) works as expected when
| indexing UTF-8
| : files. Recently, however, we noticed that it has issues with
| indexing
| : certain text files eg. UTF-16 files.  See attachment for an example
| : (tarred+zipped)
| :
| : tesla-utf16.txt
| : http://lucene.472066.n3.nabble.com/file/n4010834/tesla-utf16.txt
| 
| No attachment came through to the list, and the URL nabble seems to
| have
| provided when you posted your message leads to a 404.
| 
| In general, the question of is indexing a UTF-16 file supported
| largely
| depends on *how* you are indexing this file -- if it's plain text,
| are you
| parsing it yourself using some client code, and then sending it to
| solr,
| are you using DIH to read it from disk? are you using
| ExtractingRequestHandler?
| 
| those are all very different ways to index data in Solr -- and
| depending
| on what you are doing determines how/where the encoding of that file
| is
| processed.
| 
| 
| -Hoss
| 


ContentStreamUpdateRequest example in 4.0 Beta

2012-10-02 Thread Rui Vaz
Hi,

Is there any complete implementation for Solr 4.0 Beta of a class which
uses
ContentStreamUpdateRequest to send data to the ExtractingRequestHandler
(http://wiki.apache.org/solr/ExtractingRequestHandler),
similar to this one for the 3.1 version?

http://wiki.apache.org/solr/ContentStreamUpdateRequestExample

Thank you,
-- 
Rui Vaz


Re: Problem with spellchecker

2012-10-02 Thread Jose Aguilar
Thank you for your help, the whole team overlooked this simple error. It
was driving us crazy! :)

Thanks!! 

Jose.

On 10/2/12 1:23 AM, Markus Jelsma markus.jel...@openindex.io wrote:

The problem is your stray double quote:
str name=queryAnalyzerFieldTypetext_general_fr/str

I'd think this would throw an exception somewhere.
 
 
-Original message-
 From:Jose Aguilar jagui...@searchtechnologies.com
 Sent: Tue 02-Oct-2012 01:40
 To: solr-user@lucene.apache.org
 Subject: Problem with spellchecker
 
 We have configured 2 spellcheckers English and French in solr 4 BETA.
Each spellchecker works with a specific search handler. The English
spellchecker is working as expected with any word regardless of the
case.  On the other hand, the French spellchecker works with lowercase
words. If the first letter is uppercase, then the spellchecker is not
returning any suggestion unless we add the spellcheck.q parameter with
that term. To further clarify, this doesn't return any corrections:
 
 http://localhost:8984/solr/collection1/handler?wt=xml&q=Système
 
 But this one works as expected:
 
 
http://localhost:8984/solr/collection1/handler?wt=xml&q=Système&spellcheck
.q=Système
 
 According to this page
(http://wiki.apache.org/solr/SpellCheckComponent#q_OR_spellcheck.q) ,
the spellcheck.q paramater shouldn't be required:
 
 If spellcheck.q is defined, then it is used, otherwise the original
input query is used
 
 Are we missing something?  We double checked the configuration settings
for English which is working fine and it seems well configured.
 
 Here is an extract of the spellcheck component configuration for French
language
 
   searchComponent name=spellcheckfr class=solr.SpellCheckComponent
   str name=queryAnalyzerFieldTypetext_general_fr/str
   lst name=spellchecker
   str name=namedefault/str
   str name=fieldSpellingFr/str
   str name=classnamesolr.DirectSolrSpellChecker/str
   str name=distanceMeasureinternal/str
   float name=accuracy0.5/float
  int name=maxEdits2/int
  int name=minPrefix1/int
   int name=maxInspections5/int
   int name=minQueryLength4/int
   float name=maxQueryFrequency0.01/float
   str name=buildOnCommittrue/str
 /lst
   /searchComponent
 
 Thanks for any help
 



RE: Can SOLR Index UTF-16 Text

2012-10-02 Thread Fuad Efendi
Solr can index bytearrays too: unigram, bigram, trigram... even bitsets, 
tritsets, qatrisets ;- ) 
LOL, I've caught a bad cold... 
BTW, don't forget to configure UTF-8 as your default (Java) container 
encoding...
-Fuad






Re: SolrJ - IOException

2012-10-02 Thread Rozdev29
Was it stalled due to gc pause?

Sent from my iPhone

On Oct 2, 2012, at 10:02 AM, balaji.gandhi balaji.gan...@apollogrp.edu 
wrote:

 Hi Toke,
 
 We encountered this issue again. This time the SOLR servers were stalled. We 
 are at 30 TPS.
 
 Please let us know any updates in the HTTP issue.
 
 Thanks,
 Balaji
 
 
 From: Balaji Gandhi
 Sent: Thursday, September 27, 2012 10:52 AM
 To: 'Toke Eskildsen [via Lucene]'
 Subject: RE: SolrJ - IOException
 
 Here is the stack trace:-
 
 org.apache.solr.client.solrj.SolrServerException: IOException occured when 
 talking to server:
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414)
  at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
  at 
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
  at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:122) at 
 org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:107) at 
 org.apache.solr.handler.dataimport.thread.task.SolrUploadTask.upload(SolrUploadTask.java:31)
  at 
 org.apache.solr.handler.dataimport.thread.SolrUploader.run(SolrUploader.java:31)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at 
 java.lang.Thread.run(Unknown Source) Caused by: 
 org.apache.http.NoHttpResponseException: The target server failed to respond 
 at 
 org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101)
  at 
 org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
  at 
 org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
  at 
 org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
  at 
 org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
  at 
 org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
  at 
 org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
  at 
 org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
  at 
 org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
  at 
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
  at 
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
  at 
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
  at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353)
  ... 9 more
 
 
 From: Toke Eskildsen [via Lucene] 
 [mailto:ml-node+s472066n4010082...@n3.nabble.com]
 Sent: Tuesday, September 25, 2012 12:19 AM
 To: Balaji Gandhi
 Subject: Re: SolrJ - IOException
 
 On Tue, 2012-09-25 at 01:50 +0200, balaji.gandhi wrote:
 I am encountering this error randomly (under load) when posting to Solr
 using SolrJ.
 
 Has anyone encountered a similar error?
 
 org.apache.solr.client.solrj.SolrServerException: IOException occured when
 talking to server at: http://localhost:8080/solr/profile at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414)
 [...]
 
 This looks suspiciously like a potential bug in the HTTP keep-alive flow
 that we encountered some weeks ago. I am guessing that you are issuing
 more than 100 separate updates/second. Could you please provide the full
 stack trace?
 
 
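Until the keep-alive issue Toke mentions is tracked down, a common client-side mitigation is to retry the request once when the pooled connection turns out to be stale; a generic sketch (the helper is illustrative, not part of the SolrJ API):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryOnce {
    // Run the task, retrying up to maxRetries extra times on IOException.
    // NoHttpResponseException is a subclass of IOException, so a request that
    // died on a stale keep-alive connection gets another chance on a fresh one.
    static <T> T withRetries(Callable<T> task, int maxRetries) throws Exception {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.call();
            } catch (IOException e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated flaky update: the first attempt fails, the second succeeds.
        String result = withRetries(new Callable<String>() {
            public String call() throws IOException {
                if (calls[0]++ == 0) throw new IOException("The target server failed to respond");
                return "indexed";
            }
        }, 2);
        System.out.println(result);  // prints: indexed
    }
}
```

Wrapping each `server.request(...)` call this way papers over the symptom; it does not replace finding the root cause under load.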
 

Re: anyone has solrcloud performance numbers ?

2012-10-02 Thread Otis Gospodnetic
I don't have the URL handy, but the guys at LinkedIn have a benchmark tool for
Solr, ElasticSearch, and Sensei. Check the list archives for the URL, and see my
signature below for a tool that can show metrics for any of those systems,
which you'll probably want to observe during testing.

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com wrote:

 Hi,
  Does anyone have some preliminary SolrCloud performance numbers? Or does
 anyone have a performance comparison (throughput and latency) between Solr
 3.6 and SolrCloud (a huge monolithic index vs. a sharded one)?

 Thanks
 Varun



Re: Query among multiple cores

2012-10-02 Thread Otis Gospodnetic
Are the cores join-able? If so, you can use Solr's join feature to execute
just one query.

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 2, 2012 5:50 PM, Nicholas Ding nicholas...@gmail.com wrote:

 Hello,

 I'm working on a search project that involves searching against more than
 one core.

 For example, I have 3 cores. Core A, Core B, and Core C.

- First Step, search Core A, get some Ids.
- Second Step, search Core B, get some keywords.
- Finally, I use Ids from Core A and keywords from Core B, searching
against Core C.

 I know I can write a PHP frontend to call Solr several times, but is it
 possible to do this inside Solr? Core A and Core B are pretty small, so
 compared to the search time the HTTP overhead is significant. This project
 is going to have high-volume traffic, so I want to reduce the HTTP overhead
 if possible.

 Thanks
 Nicholas



Re: NoHttpResponseException using Solrj to index

2012-10-02 Thread Otis Gospodnetic
You need to add the jar with that missing class to the startup command line.

Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 2, 2012 5:42 PM, Rui Vaz rui@gmail.com wrote:

 Hey, I am trying to make a simple application using SolrJ to index
 documents. I used start.jar to start Solr. When I try to index a document
 to Solr
 I get the following exception:

 Exception in thread main java.lang.NoClassDefFoundError:
 org/apache/http/NoHttpResponseException

 The exception occurs when I instantiate the SolrServer:


 public static void indexFilesSolrCell(File srcFile, String solrId)
 throws IOException, SolrServerException {

 String urlString = "http://localhost:8983/solr";

 SolrServer solr = new HttpSolrServer(urlString);

 ContentStreamUpdateRequest up
   = new ContentStreamUpdateRequest("/update/extract");

 I already import the Apache solr-solrj jar.

 Thank you,
 --
 Rui Vaz
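`org.apache.http.NoHttpResponseException` lives in the Apache HttpComponents httpcore jar, which SolrJ depends on. A sketch of the launch command (jar names and versions here are assumptions; use whatever actually ships in your Solr distribution's dist/ and dist/solrj-lib directories):

```shell
java -cp .:solr-solrj-4.0.0.jar:httpclient-4.1.3.jar:httpcore-4.1.4.jar:commons-io-2.1.jar:jcl-over-slf4j-1.6.4.jar:slf4j-api-1.6.4.jar MyIndexer
```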



Re: Query among multiple cores

2012-10-02 Thread Nicholas Ding
Join is cool, but does it work across multiple cores? On Solr's wiki, I saw
it only applies to a single core.

On Tue, Oct 2, 2012 at 11:06 PM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Are the cores join-able? If so, you can use Solr's join feature to execute
 just one query.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Oct 2, 2012 5:50 PM, Nicholas Ding nicholas...@gmail.com wrote:

  Hello,
 
  I'm working on a search project, that involves searching against more
 than
  one cores.
 
  For example, I have 3 cores. Core A, Core B, and Core C.
 
- First Step, search Core A, get some Ids.
 - Second Step, search Core B, get some keywords.
 - Finally, I use Ids from Core A and keywords from Core B, searching
 against Core C.
 
  I know I can write some php frontend to call Solr several times, but is
  that possible to do it inside Solr? Core A and Core B are pretty small,
 by
  comparing the searching time, the HTTP overhead is great. This project is
  gonna have high volume traffic, so I wanna reduce the overhead of HTTP if
  that's possible.
 
  Thanks
  Nicholas
 



Re: Follow links in xml doc

2012-10-02 Thread Otis Gospodnetic
Hi Billy,

There is nothing in Solr that will do XML parsing and link extraction,
so you'll need to do that part.  Once you do that, have a look at Solr
join for parent-child querying.

http://search-lucene.com/?q=solr+join
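Link extraction has to happen on your side before indexing; a minimal JDK-only sketch (the `<link href=...>` element and attribute names are assumptions about your feed):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class LinkExtractor {
    // Collect the href attribute from every <link> element in the XML doc.
    static List<String> extractLinks(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList links = doc.getElementsByTagName("link");
        List<String> out = new ArrayList<String>();
        for (int i = 0; i < links.getLength(); i++) {
            out.add(((Element) links.item(i)).getAttribute("href"));
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc id=\"12345\"><link href=\"http://example.com/a.xml\"/>"
                   + "<link href=\"http://example.com/b.xml\"/></doc>";
        // prints: [http://example.com/a.xml, http://example.com/b.xml]
        System.out.println(extractLinks(xml));
    }
}
```

Each extracted URL can then be fetched and indexed as a child document of the parent, so a join query can surface the parent when a term matches the linked content.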

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Tue, Oct 2, 2012 at 9:51 PM, Billy Newman newman...@gmail.com wrote:
 Hello again all.

 I have a URLDataSource to index XML data.  Is there any way to follow
 links within the XML doc and index the items behind them under the same
 document?  I.e., if I search for a word or term and that term lives in
 a document linked from the doc with ID 12345, I would like that doc
 returned when searched.

 Thanks,
 Billy


Re: Query among multiple cores

2012-10-02 Thread Otis Gospodnetic
Solr join does work across multiple cores, as long as they are in the same JVM.

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html
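For example (core and field names here are made up), documents in coreC can be filtered by a query against coreA, provided both cores live in the same JVM:

```
http://localhost:8983/solr/coreC/select?q=*:*&fq={!join fromIndex=coreA from=id to=a_id}name:foo
```

The `{!join}` runs `name:foo` against coreA, collects that core's `id` values, and keeps only the coreC documents whose `a_id` matches one of them.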


On Tue, Oct 2, 2012 at 11:09 PM, Nicholas Ding nicholas...@gmail.com wrote:
 Join is cool, but does it work among multiple cores? On Solr's wiki, I saw
 it's only applied to single core.

 On Tue, Oct 2, 2012 at 11:06 PM, Otis Gospodnetic 
 otis.gospodne...@gmail.com wrote:

 Are the cores join-able? If so, you can use Solr's join feature to execute
 just one query.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Oct 2, 2012 5:50 PM, Nicholas Ding nicholas...@gmail.com wrote:

  Hello,
 
  I'm working on a search project, that involves searching against more
 than
  one cores.
 
  For example, I have 3 cores. Core A, Core B, and Core C.
 
- First Step, search Core A, get some Ids.
 - Second Step, search Core B, get some keywords.
 - Finally, I use Ids from Core A and keywords from Core B, searching
 against Core C.
 
  I know I can write some php frontend to call Solr several times, but is
  that possible to do it inside Solr? Core A and Core B are pretty small,
 by
  comparing the searching time, the HTTP overhead is great. This project is
  gonna have high volume traffic, so I wanna reduce the overhead of HTTP if
  that's possible.
 
  Thanks
  Nicholas
 



Re: PHP client for a web application

2012-10-02 Thread Esteban Cacavelos
Thanks for your response, Damien.

As you said, with Solarium you can do some basic things more quickly than
with solr-php-client. I think it is a good choice for basic applications,
and if more specific things are needed, solr-php-client is there as well.




2012/10/2 Damien Camilleri i...@webdistribution.com.au

 Hi Esteban. I'm currently using both in my application. Both are fine.
 Solarium is great because it models the concepts of Solr and can build
 queries using OOP. The other one is lower level, so you have to write
 queries manually, which can be good in some situations. Both are fast
 enough. Solarium has a bigger learning curve. Solarium has built-in batch
 updating and other things like parallel queries. So I would go with
 Solarium. It's a very nice library.
 On Oct 3, 2012 5:38 AM, Esteban Cacavelos estebancacave...@gmail.com
 wrote:

  Hi, I'm starting a web application using solr as a search engine. The web
  site will be developed in PHP (maybe I'll use a framework also).
 
  I would like to know some thoughts and opinions about the clients (
  http://wiki.apache.org/solr/SolPHP). I didn't like the PHP extension
  option very much because I think it is limiting. So, I would like to
  read opinions about SOLARIUM and SOLR-PHP-CLIENT.
 
 
  Thanks in advance!
 
 
  --
  Esteban L. Cacavelos de Amoriza
  Cel: 0981 220 429
 




-- 
Esteban L. Cacavelos de Amoriza
Cel: 0981 220 429


Re: anyone has solrcloud performance numbers ?

2012-10-02 Thread varun srivastava
Thanks Otis

On Tue, Oct 2, 2012 at 8:06 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

 I don't have the URL handy, but guys at LinkedIn have a benchmark tool for
 Solr, ElasticSearch, and Sensei. Check the list archives for URL and my
 signature below for a tool that can show metrics for any of those systems,
 which you'll probably want to observe during testing.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com wrote:

  Hi,
   Does anyone has some solr cloud preliminary performance numbers ? Or if
  someone has performance comparison ( throughput and latency) between
  solr
  3.6 and solrcloud ( having a huge monolithic index vs sharded) ?
 
  Thanks
  Varun
 



Re: anyone has solrcloud performance numbers ?

2012-10-02 Thread varun srivastava
Otis, I am looking for performance benchmark numbers rather than performance
monitoring tools. SPM looks like a monitoring tool. Moreover, it compares
Solr with ElasticSearch etc., while I want a comparison between Solr 3.6 and
SolrCloud.

Thanks
Varun

On Tue, Oct 2, 2012 at 9:15 PM, varun srivastava varunmail...@gmail.comwrote:

 Thanks Otis


 On Tue, Oct 2, 2012 at 8:06 PM, Otis Gospodnetic 
 otis.gospodne...@gmail.com wrote:

 I don't have the URL handy, but guys at LinkedIn have a benchmark tool for
 Solr, ElasticSearch, and Sensei. Check the list archives for URL and my
 signature below for a tool that can show metrics for any of those systems,
 which you'll probably want to observe during testing.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Hi,
   Does anyone has some solr cloud preliminary performance numbers ? Or if
  someone has performance comparison ( throughput and latency) between
  solr
  3.6 and solrcloud ( having a huge monolithic index vs sharded) ?
 
  Thanks
  Varun
 





Re: anyone has solrcloud performance numbers ?

2012-10-02 Thread Otis Gospodnetic
Hi,

I was trying to say that you will need to run the benchmark yourself, because
each context is different. The LinkedIn tool I referred you to will help you
do that - you don't have to bench the non-Solr engines.
I was also trying to suggest that while you are benchmarking you really want
to be looking at various metrics, possibly with the help of SPM.

HTH
Otis
--
Performance Monitoring - http://sematext.com/spm
On Oct 3, 2012 12:25 AM, varun srivastava varunmail...@gmail.com wrote:

 Otis, I am looking for performance benchmark number rather than performance
 monitoring tools. SPM looks like monitoring tool. Moreover its comparing
 Solr with Elastic Search etc, I want comparison between Solr 3.6 and
 solrcloud.

 Thanks
 Varun

 On Tue, Oct 2, 2012 at 9:15 PM, varun srivastava varunmail...@gmail.com
 wrote:

  Thanks Otis
 
 
  On Tue, Oct 2, 2012 at 8:06 PM, Otis Gospodnetic 
  otis.gospodne...@gmail.com wrote:
 
  I don't have the URL handy, but guys at LinkedIn have a benchmark tool
 for
  Solr, ElasticSearch, and Sensei. Check the list archives for URL and my
  signature below for a tool that can show metrics for any of those
 systems,
  which you'll probably want to observe during testing.
 
  Otis
  --
  Performance Monitoring - http://sematext.com/spm
  On Oct 2, 2012 6:47 PM, varun srivastava varunmail...@gmail.com
  wrote:
 
   Hi,
Does anyone has some solr cloud preliminary performance numbers ? Or
 if
   someone has performance comparison ( throughput and latency) between
   solr
   3.6 and solrcloud ( having a huge monolithic index vs sharded) ?
  
   Thanks
   Varun