Re: mysolr python client

2011-12-01 Thread Marco Martinez
Done!

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/12/1 Marc SCHNEIDER marc.schneide...@gmail.com

 Hi Marco,

 Great! Maybe you can add it on the Solr wiki? (
 http://wiki.apache.org/solr/IntegratingSolr).

 Regards,
 Marc.

 On Thu, Dec 1, 2011 at 10:42 AM, Jens Grivolla j+...@grivolla.net wrote:

  On 11/30/2011 05:40 PM, Marco Martinez wrote:
 
  For anyone interested, recently I've been using a new Solr client for
  Python. It's easy and pretty well documented. If you're interested its
  site
  is: http://mysolr.redtuna.org/
 
 
  Do you know what advantages it has over pysolr or solrpy? On the page it
  only says mysolr was born to be a fast and easy-to-use client for Apache
  Solr’s API and because existing Python clients didn’t fulfill these
  conditions.
 
  Thanks,
  Jens
 
 



mysolr python client

2011-11-30 Thread Marco Martinez
Hi all,

For anyone interested, I've recently been using a new Solr client for
Python. It's easy to use and pretty well documented. If you're interested,
its site is: http://mysolr.redtuna.org/

bye!

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: Error Instantiating QParserPlugin

2011-10-20 Thread Marco Martinez
It seems that the problem is the QParserPlugin2 class.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/20 karan.jindal1...@rediffmail.com

 hi,
 while trying to create a customized query parser plugin for Solr 3.2, I got
 the instantiation error. As mentioned at various places, I created two
 classes:
 1) MyQParserPlugin extends QParserPlugin
 2) MyQParser extends QParser
 org.apache.solr.common.SolrException: Error Instantiating QParserPlugin,
 MyQParserPlugin is not a org.apache.solr.search.QParserPlugin
 at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:428)
 at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:448)
 at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1548)
 at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1542)
 at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1575)
 at org.apache.solr.core.SolrCore.initQParsers(SolrCore.java:1492)
 at org.apache.solr.core.SolrCore.<init>(SolrCore.java:558)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
 at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
 at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
 at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
 at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
 at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
 at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
 at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
 at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
 at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
 at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
 at org.mortbay.jetty.Server.doStart(Server.java:224)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
 at java.lang.reflect.Method.invoke(Unknown Source)
 at org.mortbay.start.Main.invokeMain(Main.java:194)
 at org.mortbay.start.Main.start(Main.java:534)
 at org.mortbay.start.Main.start(Main.java:441)
 at org.mortbay.start.Main.main(Main.java:119)
 Any idea about what's going on?
 Thanks, Karan


Re: Solr scraping: Nutch and other alternatives.

2011-10-18 Thread Marco Martinez
Hi Luis,

Have you tried the copyField function with custom analyzers and tokenizers?

bye,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/18 Luis Cappa Banda luisca...@gmail.com

 Hello everyone.

 I've been thinking about a way to retrieve information from a domain (for
 example, http://www.ign.com) to process and index. My idea is to use Solr
 as the search engine. I'm familiar with Apache Nutch and I know that the
 latest version has a gateway to Solr to retrieve and index information with
 it. I tried it and it worked fine, but it's a little bit complex to develop
 plugins that process the information and index it in a new field. Perhaps
 one of you has tried another (and better) alternative for mining data from
 the web. What is your recommendation? Can you give me any scraping
 suggestions?

 Thank you very much.

 Luis Cappa.



Re: Controlling the order of partial matches based on the position

2011-10-18 Thread Marco Martinez
Hi,

I would use a custom function query that uses termPositions to calculate the
order of the values in the field to accomplish your requirements.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/10/18 aronitin aro_ni...@yahoo.com

 Guys,

 It's been almost a week but there are no replies to the question that I
 posted.

 If it's a small problem and already answered somewhere, please point me to
 that post. Otherwise, please suggest any pointer for handling the requirement
 mentioned in the question.

 Nitin

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Controlling-the-order-of-partial-matches-based-on-the-position-tp3413867p3429823.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: PositionIncrement gap and multi-valued fields.

2011-08-09 Thread Marco Martinez
Hi Luis,

As far as I know, the position increment gap only affects some queries,
such as phrase queries that use slop. The position increment gap does
not affect the similarity scoring formula of Lucene:

\[
\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \in q} \Big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^2 \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t,d) \Big)
\]

Lucene Practical Scoring Function, see
http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/Similarity.html

The first two factors are related to normalizing the query. Inside the
summation, the first two factors are related to the frequency of the term
in the document and in the index, the third one is the boost of the term in
the query, and the last one encapsulates a few index-time boost and length
factors. The length factor is calculated from the number of terms, and the
position increment gap does not create more tokens, so this factor does not
affect the score either.

But if you use, for example, a multivalued field with a position increment
gap of 100, a query with a slop of less than 100 cannot match across two
separate values of this field, e.g.:

q=test:"A B"~99

doc1
field test, position increment gap=100
<str>A</str>
<str>B</str>

You don't get a match for this doc, but if you run the query q=test:"A B"~101
you will get doc1 as a match.
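
The effect of the gap on phrase slop can be sketched with a small Python model. This is only a simplified illustration, not Lucene's actual sloppy-phrase matcher: it assumes each token advances the position by one, that the gap is added between consecutive values, and that a two-term phrase matches when the second term follows the first with at most `slop` extra positions in between. The helper names `token_positions` and `sloppy_match` are made up for this example.

```python
def token_positions(values, gap):
    """Assign a position to every whitespace token of a multi-valued field.

    Simplified model: each token advances the position by 1, and `gap`
    extra positions are skipped between consecutive values.
    """
    positions = {}
    pos = -1
    for value in values:
        for token in value.split():
            pos += 1
            positions.setdefault(token, []).append(pos)
        pos += gap  # the position increment gap between two values
    return positions


def sloppy_match(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` extra positions."""
    return any(
        0 <= p2 - p1 - 1 <= slop
        for p1 in positions.get(first, [])
        for p2 in positions.get(second, [])
    )


doc1 = token_positions(["A", "B"], gap=100)
print(doc1)                               # {'A': [0], 'B': [101]}
print(sloppy_match(doc1, "A", "B", 99))   # False
print(sloppy_match(doc1, "A", "B", 101))  # True
```

In this model the two values end up 100 positions apart, so any slop below 100 keeps a phrase from matching across the value boundary.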


Bye!


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/8/8 Luis Cappa Banda luisca...@gmail.com

 Hello!

 I have a question about the behaviour of searching over field types that have
 positionIncrementGap defined. For example, suppose that:


   1. We have a field called test defined as multi-valued and white space
   tokenized.
   2. The index has an single document with a test value:

 <str>TEST1</str>
 <str>AAA BBB</str>
 <str>CCC DDD</str>
 <str>EEE FFF</str>
 <str>TEST2</str>


 I read that positionIncrementGap defines the virtual space between the last
 token of one field instance and the first token of the next instance
 (source:

 http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html
 ).
 When it says "last token of one field instance", does that mean the last
 token of the first entry of the multi-valued content? In our example above
 it would be TEST1.

 Anyway, I've been doing some tests modifying the positionIncrementGap value
 with high and low values. Can anybody explain in detail what implications
 a higher or a lower value has for the Solr scoring algorithm? I would like
 to understand how this value affects matching results in fields and also
 the calculation of the final score (maybe a larger gap implies more spaces
 and a worse score when the value matches, etc.).

 Thank you for reading so far!



term positions performance

2011-07-20 Thread Marco Martinez
Hi,

I am developing a new term-proximity query and I am using term positions
to get the positions of each term. I would like to know if there are any
tips for improving the performance of term positions, at index time or at
query time. All the fields to which I apply term positions are indexed.

Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: term positions performance

2011-07-20 Thread Marco Martinez
Also, I developed this query as a function query. I wonder whether
implementing it as a normal query would improve the performance.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42




Re: embeded solrj doesn't refresh index

2011-07-20 Thread Marco Martinez
You should send a commit to your embedded Solr instance.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/7/20 Jianbin Dai j...@huawei.com

 Hi,



 I am using embedded solrj. After I add new doc to the index, I can see the
 changes through solr web, but not from embedded solrj. But after I restart
 the embedded solrj, I do see the changes. It works as if there was a cache.
 Anyone knows the problem? Thanks.



 Jianbin




function queries scope

2011-06-07 Thread Marco Martinez
Hi,

I need to use function query operations with the score of a given
query, but only on the docset that I get from that query, and I don't know
if this is possible.

Example:

q=shops in madrid    returns 1 docs with a specific score for each doc

but now I need to do something like

q=sum(product(2,query(shops in madrid),productValueField) but this will
return all the docs in my index.


I know that I can do it via filter queries, e.g. q=sum(product(2,query(shops
in madrid),productValueField)&fq=shops in madrid, but this will run the query
twice and I don't want that because performance is important to our
application.
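
For reference, the filter-query variant can be assembled as request parameters like this. It is a minimal sketch under assumptions: the `{!func}` prefix and the `query($qq)` parameter dereference are my reading of the Solr function query syntax, `qq` is a parameter name chosen for the example, and the parenthesization of the function is one possible reading of the expression above.

```python
from urllib.parse import urlencode, parse_qs

user_query = "shops in madrid"

params = {
    # Assumed function syntax: query($qq) dereferences the qq parameter.
    "q": "{!func}sum(product(2,query($qq)),productValueField)",
    "qq": user_query,
    # The filter restricts the result set to docs matching the user query.
    "fq": user_query,
}

query_string = urlencode(params)
print(query_string)
```

The user query appears twice (in `qq` and in `fq`), which is exactly the double evaluation the message is trying to avoid.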


Is there another approach to accomplish that?


Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: function queries scope

2011-06-07 Thread Marco Martinez
Thanks, but it's not what I'm looking for, because the BoostQParserPlugin
multiplies the score of the query by the function queries defined in its b
param, and I can't use edismax because we have our own qparser. It seems
that I have to code another qparser.


Thanks Yonik anyway,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/6/7 Yonik Seeley yo...@lucidimagination.com

 One way is to use the boost qparser:

 http://search-lucene.com/jd/solr/org/apache/solr/search/BoostQParserPlugin.html
 q={!boost b=productValueField}shops in madrid

 Or you can use the edismax parser which has a boost parameter that
 does the same thing:
 defType=edismax&q=shops in madrid&boost=productValueField


 -Yonik
 http://www.lucidimagination.com





Re: function query apply only in the subset of the query

2011-04-13 Thread Marco Martinez
No, this query returns a few more documents than if I run it through the
Lucene query parser. I'm going to write another query parser that sends a
simple term query and see what the output is. When I have it, I will report
back on the list.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/12 Yonik Seeley yo...@lucidimagination.com

 On Tue, Apr 12, 2011 at 10:25 AM, Marco Martinez
 mmarti...@paradigmatecnologico.com wrote:
  Thanks but I tried this and I saw that this work in a standard scenario,
 but
  in my query i use a my own query parser and it seems that they dont doing
  the AND and returns all the docs in the index:
 
  My query:
  _query_:{!bm25}car AND _val_:marketValue - 67000 docs returned

 This would seem to point to your generated query {!bm25}car
 matching all docs for some reason?

 -Yonik
 http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
 25-26, San Francisco



Re: function query apply only in the subset of the query

2011-04-13 Thread Marco Martinez
It seems that it is a problem with my own query. Now I need to investigate
whether there is something different between a normal query and my
implementation of the query, because if you use it alone, it works properly.

Thanks,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42





function query apply only in the subset of the query

2011-04-12 Thread Marco Martinez
Hi everyone,

My situation is the following: I need to add the value of a field to the
score of the docs returned by the query, but not to all the docs. Example:

q=car returns 3 docs

1-
name=car ford
marketValue=1
score=1.3

2-
name=car citroen
marketValue=2
score=1.3

3-
name=car mercedes
marketValue=0.5
score=1.3

but if I want to add the marketValue to the score, the returned list is the
following:

q=car+_val_:marketValue

1-
name=bus
marketValue=5
score=5

2-
name=car citroen
marketValue=2
score=3.3

3-
name=car ford
marketValue=1
score=2.3

4-
name=car mercedes
marketValue=0.5
score=1.8


Is it possible to apply the function query only to the documents returned by
the first query?
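
The behaviour in the example can be reproduced with a toy model in Python. It is only a sketch: the fixed text score of 1.3 is taken from the listing above, and `require_term` is a made-up flag that stands in for the difference between an optional clause (the function query matches every document, so "bus" appears) and a required one, as in the `car AND _val_:marketValue` form suggested in the replies.

```python
docs = [
    {"name": "car ford", "marketValue": 1.0},
    {"name": "car citroen", "marketValue": 2.0},
    {"name": "car mercedes", "marketValue": 0.5},
    {"name": "bus", "marketValue": 5.0},
]

TEXT_SCORE = 1.3  # score of a text match on "car", as in the example


def search(docs, term, require_term):
    """Rank docs by text score plus marketValue.

    require_term=False models `car _val_:marketValue` with OR semantics:
    the function clause matches every document. require_term=True models
    `car AND _val_:marketValue`.
    """
    results = []
    for doc in docs:
        matches = term in doc["name"].split()
        if require_term and not matches:
            continue
        score = (TEXT_SCORE if matches else 0.0) + doc["marketValue"]
        results.append((doc["name"], score))
    return sorted(results, key=lambda r: r[1], reverse=True)


for name, score in search(docs, "car", require_term=False):
    print(name, round(score, 2))
```

With `require_term=True` the "bus" document drops out and the remaining three keep their boosted order.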


Thanks in advance,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: function query apply only in the subset of the query

2011-04-12 Thread Marco Martinez
Thanks, but I tried this and saw that it works in a standard scenario. In my
query, however, I use my own query parser, and it seems that it doesn't apply
the AND and returns all the docs in the index:

My query:
_query_:{!bm25}car AND _val_:marketValue - 67000 docs returned


Solr query parser
car AND _val_:marketValue - 300 docs returned


Thanks,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2011/4/12 Erik Hatcher erik.hatc...@gmail.com

 Try using AND (or set q.op):

   q=car+AND+_val_:marketValue





Re: Different Results..

2010-12-22 Thread Marco Martinez
We need more information about the analyzers and tokenizers of the
default field of your search.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/12/22 satya swaroop satya.yada...@gmail.com

 Hi All,
 i am getting different results when i used with some escape keys..
 for example:::
 1) when i use this request
http://localhost:8080/solr/select?q=erlang!ericson
   the result obtained is
   result name=response numFound=1934 start=0

 2) when the request is
 http://localhost:8080/solr/select?q=erlang/ericson
the result is
  result name=response numFound=1 start=0


 My question here is: does Solr treat the two queries differently, and how
 does it handle !, / and all the other characters that need escaping?


 Regards,
 satya



Re: White space in facet values

2010-12-22 Thread Marco Martinez
Try copying the values (with copyField) to a string field.
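
Why a string field helps can be illustrated with a small Python sketch. It is a simplified model, not Solr code: a tokenized text field is approximated by splitting on whitespace, a string field by keeping the value verbatim, and the "Acoustic Guitar" value is made up for the example.

```python
from collections import Counter

products = ["Electric Guitar", "Electric Guitar", "Acoustic Guitar"]

# Tokenized text field: facet values are the individual indexed terms.
text_facets = Counter(token for value in products for token in value.split())

# String field: each value is indexed verbatim as a single term.
string_facets = Counter(products)

print(text_facets)
print(string_facets)
```

Faceting on the tokenized field counts "Electric" and "Guitar" separately, while the string copy keeps "Electric Guitar" as one facet value.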

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/12/22 Peter Karich peat...@yahoo.de



 you should try fq=Product:"Electric Guitar"


  How do I handle facet values that contain whitespace? Say I have a field
 Product that I want to facet on. A value for Product could be Electric
 Guitar. How should I handle the white space in Electric Guitar during
 indexing? What about when I apply the constraint fq=Product:Electric Guitar?

 --
 http://jetwick.com open twitter search




Re: Solr search speed very low

2010-08-25 Thread Marco Martinez
You should use the solr.WhitespaceTokenizerFactory tokenizer in your field
type so that the individual terms get indexed. Once you have indexed the data
that way, you don't need the * in your queries, which is a heavy query for
Solr.
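
The difference can be pictured with a toy inverted index in Python. This is a conceptual sketch only, using the sample sentences from the question: `term_lookup` stands for a plain term query on a whitespace-tokenized field, and `wildcard_scan` is a rough analogue of `*word*`, which has to test every indexed term. Both function names are invented for the example.

```python
from collections import defaultdict

sentences = {
    1: "Hello all",
    2: "Hello World",
    3: "I say Hello to all",
}

# Index time: split on whitespace and map each term to its documents.
index = defaultdict(set)
for doc_id, text in sentences.items():
    for term in text.split():
        index[term].add(doc_id)


def term_lookup(word):
    """Query time with tokenized terms: a single dictionary lookup."""
    return sorted(index.get(word, set()))


def wildcard_scan(word):
    """Rough analogue of *word*: every indexed term has to be tested."""
    hits = set()
    for term, doc_ids in index.items():
        if word in term:
            hits |= doc_ids
    return sorted(hits)


print(term_lookup("Hello"))    # [1, 2, 3]
print(wildcard_scan("Hello"))  # [1, 2, 3]
```

Both calls return the same documents here, but the scan grows with the number of distinct terms in the index while the lookup does not.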

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/25 Andrey Sapegin andrey.sape...@unister-gmbh.de

 Dear ladies and gentlemen.

 I'm newbie with Solr, I didn't find an aswer in wiki, so I'm writing here.

 I'm analysing Solr performance and have 1 problem. *Search time is about
 7-10 seconds per query.*

 I have a *.csv 5Gb-database with about 15 fields and 1 key field (record
 number). I uploaded it to Solr without any problem using curl. This database
 contains information about books and I'm intrested in keyword search using
 one of the fields (not a key field). I mean that if I search, for example,
 for word Hello, I expect response with sentences containing Hello:
 Hello all
 Hello World
 I say Hello to all
 etc.

 I tested it from console using time command and curl:

 /usr/bin/time -o test_results/time_solr -a curl 
 http://localhost:8983/solr/select/?q=itemname:*$query*version=2.2start=0rows=10indent=on;
 -6 21  test_results/response_solr

 So, my query is *itemname:*$query**. 'Itemname' - is the name of field.
 $query - is a bash variable containing only 1 word. All works fine.
 *But unfortunately, search time is about 7-10 seconds per query.* For
 example, Sphinx spent only about 0.3 second per query.
 If I use only $query, without stars (*), I receive answer pretty fast, but
 only exact matches.
 And I want to see any sentence containing my $query in the response. Thats
 why I'm using stars.

 NOW THE QUESTION.
 Is my query syntax correct (*field:*word**) for keyword search)? Why
 response time is so big? Can I reduce search time?

 Thank You in advance,
 Kind Regards,

 Andrey Sapegin,
 Software Developer,

 Unister GmbH
 Barfußgässchen 11 | 04109 Leipzig

 andrey.sape...@unister-gmbh.de mailto:%20andreas.b...@unister-gmbh.de
 www.unister.de http://www.unister.de




Re: Search Results optimization

2010-08-13 Thread Marco Martinez
You can use a higher boost for stapler to accomplish your requirement.
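
As a minimal sketch, the boosted query could be built like this in Python, using the standard `term^boost` query syntax. The factor 100 is an arbitrary value picked for illustration; a boost raises the score of stapler matches but does not by itself guarantee a strict two-block ordering.

```python
from urllib.parse import urlencode

# The boost factor 100 is an arbitrary illustrative value.
params = {"q": "stapler^100 hammer"}
query_string = urlencode(params)
print(query_string)
```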

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/13 Hasnain hasn...@hotmail.com


 Hi All,

 My question is related to search results, I want to customize my query so
 that for query stapler hammer, I should get results for all items
 containing word stapler first and then results containing hammer, right
 now results are mixing up, I want them sorted, i.e. all results of stapler
 on top and hammer on bottom not mixed, I havent changed any configuration
 files...
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Search-Results-optimization-tp1129374p1129374.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: index pdf files

2010-08-12 Thread Marco Martinez
To help you, we need the description of your fields in your schema.xml and
the query that you run when you search for a single word.

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/8/12 Ma, Xiaohui (NIH/NLM/LHC) [C] xiao...@mail.nlm.nih.gov

 I wrote a simple java program to import a pdf file. I can get a result when
 I do search *:* from admin page. I get nothing if I search a word. I wonder
 if I did something wrong or miss set something.

 Here is part of result I get when do *:* search:
 *
 - doc
 - arr name=attr_Author
  strHristovski D/str
  /arr
 - arr name=attr_Content-Type
  strapplication/pdf/str
  /arr
 - arr name=attr_Keywords
  strmicroarray analysis, literature-based discovery, semantic
 predications, natural language processing/str
  /arr
 - arr name=attr_Last-Modified
  strThu Aug 12 10:58:37 EDT 2010/str
  /arr
 - arr name=attr_content
  strCombining Semantic Relations and DNA Microarray Data for Novel
 Hypotheses Generation Combining Semantic Relations and DNA Microarray Data
 for Novel Hypotheses Generation Dimitar Hristovski, PhD,1 Andrej
 Kastrin,2...
 *
 Please help me out if anyone has experience with pdf files. I really
 appreciate it!

 Thanks so much,




custom scoring phrase queries

2010-06-18 Thread Marco Martinez
Hi,

I want to know if it's possible to get a higher score in a phrase query when
the match is on the left side of the field. For example:


doc1=name:stores peter john
doc2=name:peter john stores
doc3=name:peter john something

If you do a search with name=peter john, the result set I want to get is:

doc2
doc3
doc1

because in doc2 and doc3 the terms peter john are on the left side of the
field, so they should get a higher score.

Thanks in advance,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


Re: custom scoring phrase queries

2010-06-18 Thread Marco Martinez
Hi Otis,

In the end I built my own function query that gives a higher score if the
value is at the start of the field. But is it possible to tell Solr to use
SpanFirstQuery without coding? I think I have read that it's not possible.
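
The idea behind such a function can be sketched in plain Python. This is not the actual function query, only a hypothetical scoring rule (1 / (1 + start position of the phrase)) applied to the three example documents from the original question.

```python
docs = {
    "doc1": "stores peter john",
    "doc2": "peter john stores",
    "doc3": "peter john something",
}


def phrase_start(text, phrase):
    """Return the token position where `phrase` starts, or None."""
    tokens, wanted = text.split(), phrase.split()
    for i in range(len(tokens) - len(wanted) + 1):
        if tokens[i:i + len(wanted)] == wanted:
            return i
    return None


def position_score(text, phrase):
    """Hypothetical scoring: 1 / (1 + start position), 0 if no match."""
    start = phrase_start(text, phrase)
    return 0.0 if start is None else 1.0 / (1 + start)


ranked = sorted(docs, key=lambda d: position_score(docs[d], "peter john"),
                reverse=True)
print(ranked)  # ['doc2', 'doc3', 'doc1']
```

doc2 and doc3 tie because the phrase starts at position 0 in both, and doc1 ranks last because the phrase starts one token later.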

Thanks,


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/18 Otis Gospodnetic otis_gospodne...@yahoo.com

 Marco,

 I don't think there is anything in Solr to do that (is there?), but you
 could do it with some coding if you combined the regular query with
 SpanFirstQuery with bigger boost:


 http://search-lucene.com/jd/lucene/org/apache/lucene/search/spans/SpanFirstQuery.html

 Oh, here are some examples and at the bottom you will see exactly what I
 suggested above:


 http://search-lucene.com/c/Lucene:/src/java/org/apache/lucene/search/spans/package.html||SpanFirstQuery

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/






Re: Distributed Search doesn't response the result set

2010-06-07 Thread Marco Martinez
Hi Scott,

We need more information about your request. Can you post the query that you
are sending to the servers?

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/7 Scott Zhang macromars...@gmail.com

 Hi. All.
   I am trying to use solr to search over 2 lucene indexes.  I am following
 the solr tutorial and test the distributed search example. It works.
   Then I am using my own lucene indexes. Search in each solr instance works
 and return the expected result. But when I do distributed search using
 shards. It only return the numFound=14. But the result contain nothing.
Don't know why. Can Any one help? Thanks.



Re: Distributed Search doesn't response the result set

2010-06-07 Thread Marco Martinez
Try adding the rows parameter to your request. I guess that in your
solrconfig you have set the default rows to 0 in your default request
handler.
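
A minimal sketch of such a request, built in Python from the hosts and query quoted in this thread. The value rows=10 is just an example page size.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {
    "q": "marship",
    "shards": "localhost:8983/solr,localhost:7574/solr",
    "rows": 10,       # explicit page size instead of the handler default
    "indent": "true",
}
url = "http://localhost:8983/solr/select?" + urlencode(params, safe=":/,")
print(url)
```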

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/6/7 Scott Zhang macromars...@gmail.com

 Thanks for replying.

 Here is the part of my schema.xml:
 I only have 4 fields in my document.

 fields

   field name=id type=string indexed=true stored=true
 required=true /
   field name=type type=string indexed=true stored=true
 required=true/
   field name=keyword_level1 type=text indexed=true stored=false/
   field name=keyword_level2 type=text indexed=true stored=false/




   dynamicField name=*_i  type=intindexed=true  stored=true/
   dynamicField name=*_s  type=string  indexed=true  stored=true/
   dynamicField name=*_l  type=long   indexed=true  stored=true/
   dynamicField name=*_t  type=textindexed=true  stored=true/
   dynamicField name=*_b  type=boolean indexed=true  stored=true/
   dynamicField name=*_f  type=float  indexed=true  stored=true/
   dynamicField name=*_d  type=double indexed=true  stored=true/
   dynamicField name=*_dt type=dateindexed=true  stored=true/

   !-- some trie-coded dynamic fields for faster range queries --
   dynamicField name=*_ti type=tintindexed=true  stored=true/
   dynamicField name=*_tl type=tlong   indexed=true  stored=true/
   dynamicField name=*_tf type=tfloat  indexed=true  stored=true/
   dynamicField name=*_td type=tdouble indexed=true  stored=true/
   dynamicField name=*_tdt type=tdate  indexed=true  stored=true/

   dynamicField name=*_pi  type=pintindexed=true  stored=true/

   dynamicField name=ignored_* type=ignored multiValued=true/
   dynamicField name=attr_* type=textgen indexed=true stored=true
 multiValued=true/

   dynamicField name=random_* type=random /



  /fields

  uniqueKeyid/uniqueKey


 I am running 2 instances as tutorial shows: one on 8983. Another one is on
 7574.
 When I search on 8983:
 URL:

 http://localhost:8983/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
 I got:

 <result name="response" numFound="17" start="0">
 <doc>
 <str name="id">89</str>
 <str name="type">product</str>
 </doc>
 <doc>
 <str name="id">90</str>
 <str name="type">product</str>
 </doc>
 ..


 when I search on 7574:
 URL:

 http://localhost:7574/solr/select/?q=marship&version=2.2&start=0&rows=10&indent=on
 I got:
 <result name="response" numFound="17" start="0">
 <doc>
 <str name="id">89</str>
 <str name="type">product</str>
 </doc>
 <doc>
 <str name="id">90</str>
 <str name="type">product</str>
 </doc>
 <doc>
 <str name="id">91</str>
 <str name="type">product</str>
 </doc>
 

 As they are using 2 copies of the same Lucene index, the results are the same.
 Then I use
 URL:

 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=marship
 I got:
 <response>
 <lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">31</int>
 <lst name="params">
 <str name="indent">true</str>
 <str name="q">marship</str>
 <str name="shards">localhost:8983/solr,localhost:7574/solr</str>
 </lst>
 </lst>
 <result name="response" numFound="14" start="0"/>
 </response>

 Note the numFound is 14.
 When I try URL:

 http://localhost:8983/solr/select?shards=localhost:8983/solr/&indent=true&q=marship
 The numFound=7 but still nothing returned.

 URL:

 http://localhost:8983/solr/select?shards=localhost:7574/solr/&indent=true&q=marship
 return numFound=7 too. And the result has nothing.

 Please help.

 Thanks.
 Regards.
 Scott


 On Mon, Jun 7, 2010 at 3:47 PM, Marco Martinez 
 mmarti...@paradigmatecnologico.com wrote:

  Hi Scott,
 
  We need more information about your request. Can you post the queries that
  you are sending to the servers?
 
  Marco Martínez Bautista
  http://www.paradigmatecnologico.com
  Avenida de Europa, 26. Ática 5. 3ª Planta
  28224 Pozuelo de Alarcón
  Tel.: 91 352 59 42
 
 
  2010/6/7 Scott Zhang macromars...@gmail.com
 
   Hi all,
   I am trying to use Solr to search over 2 Lucene indexes. I followed the
   Solr tutorial and tested the distributed search example, and it works.
   Then I used my own Lucene indexes. Searching in each Solr instance works
   and returns the expected results. But when I do a distributed search using
   shards, it only returns numFound=14 and the result contains nothing.
   I don't know why. Can anyone help? Thanks.
  
 



Re: solr.solr.home

2010-05-27 Thread Marco Martinez
Hi,

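When you start Tomcat, you can pass the setting to the JVM as a system
property, something like -Dsolr.solr.home=path/to/your/solr/home. As far as I
know, startup.sh does not hand -D arguments on to the JVM, so on Linux set it
through JAVA_OPTS before starting:
export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=path/to/your/solr/home"
./startup.sh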





2010/5/27 Antonello Mangone antonello.mang...@gmail.com

 But where I have to write this command ???

 System.setProperty("solr.solr.home",
  "whateverpathyou'dliketosetonyourfilesystem");
 
  Claudio
 



Re: Any realtime indexing plugin available for SOLR

2010-05-26 Thread Marco Martinez
Maybe this will help you

http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Solr+Plugin



2010/5/26 bbarani bbar...@gmail.com


 Hi,

 Sorry if I am asking this question again in this forum..

 Is there any plugin which I can use to do a realtime indexing?

 I have a requirement where we have an application which sits on top of SQL
 server DB and updates happen on day to day basis. Users would like to see
 the changes made to the DB immediately in the search results. I am thinking
 of using JMS queue for achieving this, but before that I just want to check
 if anyone has implemented similar kind of requirement before?

 Any help / suggestions would be greatly appreciated.

 Thanks,
 bb
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Any-realtime-indexing-plugin-available-for-SOLR-tp845026p845026.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: disable caches in real time

2010-05-19 Thread Marco Martinez
Hi Chris,

Thank you for your answer.

I've always understood that when you do a commit (replication does one), a new
searcher is opened, and you lose performance (queries per second) while the
caches are regenerated. I don't think I explained my situation correctly
before: with my setup I want to avoid this loss of performance in an
environment with frequent updates.



2010/5/18 Chris Hostetter hossman_luc...@fucit.org

 : I want to know if there is any approach to disable caches in a specific
 core
 : from a multicore server.

 only via the config.

 : I have a multicore server where the core0 will be listen to the queries
 and
 : other core (core1) that will be replicated from a master server. Once the
 : replication has been done, i will swap the cores. My point is that i want
 to
 : disable the caches in the core that is in charge of the replication to
 save
 : memory in the machine.

 that seems bizarrely complicated -- replication can work against a live
 core, no need to do the swap yourself, the replicationHandler takes care
 of this for you transparently (ie: you have one core, replicating from a
 master -- the old index will be searched by users, and have caches, and
 when the new version of the index is ready, the replication handler will
 swap the *index* in that core (but the core itself never changes) ... it
 can even autowarm the caches on the new index for you before the swap if
 you configure it that way.

 -Hoss




Re: Storing RandomSortField

2010-05-19 Thread Marco Martinez
Hi Alexandre,

I am not totally sure about this, but the random sort field is only used to
sort your search results randomly, and you have to pass different values to
get different orderings. It only applies at search time, so no value is
indexed. You will find more information here:
http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
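
To illustrate the point about passing different values, here is a minimal
sketch in Python. It assumes a Solr instance at localhost:8983 and a dynamic
field pattern random_* backed by a RandomSortField type; the helper name
random_sort_url is made up for this example. The part of the field name after
the prefix acts as the seed, so changing it changes the ordering.

```python
from urllib.parse import urlencode


def random_sort_url(seed, base="http://localhost:8983/solr/select"):
    """Build a select URL that sorts on a random field seeded by `seed`."""
    # Assumption: the schema declares a dynamic field "random_*" whose type
    # is a RandomSortField; the suffix of the field name is the seed.
    params = {"q": "*:*", "sort": "random_%d desc" % seed}
    return base + "?" + urlencode(params)


# Two different seeds give two different sort fields, hence two orderings.
print(random_sort_url(1234))
print(random_sort_url(5678))
```

Reusing the same seed against an unchanged index should give the same order
again, which is handy when paging through a shuffled result set.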



2010/5/18 Alexandre Rocco alel...@gmail.com

 Hi guys,

 Is there any way to make a RandomSortField be stored?
 I'm trying to do it for debugging purposes,
 My intention is to take a look at the values that are stored there to
 determine the sorting that is being applied to the results.

 I tried to make it a stored field as:
 <field name="randomorder" type="random" stored="true"/>

 And also tried to create another text field, copying the result from the
 random field like this:
 <field name="randomorderdebug" type="text" indexed="true" stored="true"/>
 <copyField source="randomorder" dest="randomorderdebug"/>

 Neither of the approaches worked.
 Is there any restriction on this kind of field that prevents it from being
 displayed in the results?

 Thanks,
 Alexandre



Re: Multifaceting on multivalued field

2010-05-18 Thread Marco Martinez
Hi,

This exception is thrown when the field does not exist in your index, but in
this case the cause is an error in your query syntax: !{ex=cars}cars
should be {!ex=cars}cars, with the exclamation mark inside the braces.
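
A small sketch in Python of how the parameters for such a request could be
put together, so that the local-params prefix ends up in the right place. The
value bmw and the helper name are assumptions made for the example; only the
{!tag=...} and {!ex=...} syntax comes from Solr itself.

```python
from urllib.parse import urlencode, parse_qs


def multiselect_facet_params(field, value, tag):
    """Return request parameters for a tagged filter and an excluding facet."""
    return [
        ("q", "*:*"),
        ("facet", "true"),
        # Tag the filter so the facet below can ignore it when counting.
        ("fq", "{!tag=%s}%s:%s" % (tag, field, value)),
        # The "!" goes inside the braces, directly after the opening "{".
        ("facet.field", "{!ex=%s}%s" % (tag, field)),
    ]


# "bmw" is a hypothetical value of the cars field.
query = urlencode(multiselect_facet_params("cars", "bmw", "cars"))
print(query)
```

Building the string in one place and letting urlencode do the escaping avoids
hand-typed mistakes like the misplaced exclamation mark.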





2010/5/18 Peter Karich peat...@yahoo.de

 Hi all,

 I read about multifaceting [1] and tried it for myself. With
 multifaceting I would like to conserve the number of documents for the
 'un-facetted case'. This works nice with normal fields, but I get an
 exception [2] if I apply this on a multivalued field.
 Is this a bug or logical :-) ? If the latter one is the case, would
 anybody help me to understand this?

 Regards,
 Peter.

 [1]

 http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html

 [2]
 org.apache.solr.common.SolrException: undefined field !{ex=cars}cars
     at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1077)
     at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:226)
     at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
     at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
     at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:336)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:239)




Re: disable caches in real time

2010-05-17 Thread Marco Martinez
Any suggestions?

I have thought about having two configurations per server and reloading each
core with the appropriate config file, but I would prefer another solution if
possible.

Thanks,



2010/5/14 Marco Martinez mmarti...@paradigmatecnologico.com

 Hi,

 I want to know if there is any approach to disable caches in a specific
 core from a multicore server.

 My situation is the next:

 I have a multicore server where the core0 will be listen to the queries and
 other core (core1) that will be replicated from a master server. Once the
 replication has been done, i will swap the cores. My point is that i want to
 disable the caches in the core that is in charge of the replication to save
 memory in the machine.

 Any suggestions will be appreciated.

 Thanks in advance,


 Marco Martínez Bautista
 http://www.paradigmatecnologico.com
 Avenida de Europa, 26. Ática 5. 3ª Planta
 28224 Pozuelo de Alarcón
 Tel.: 91 352 59 42



Re: Targeting two fields with the same query or one field gathering contents from both ?

2010-05-17 Thread Marco Martinez
No, the equivalent of the query on C would be:

- A: (the lazy fox) OR B: (the lazy fox)
- C: (the lazy fox)


Imagine that B does not contain 'the lazy fox': with the AND you get 0
results, even though 'the lazy fox' is in A and in C.
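
The difference can be checked with a toy model in Python. This is only a
sketch: it uses plain substring matching instead of Solr's analysis and
scoring, and the two documents are invented for the example. C is built by
concatenating A and B, which is roughly what copyField from both would give.

```python
PHRASE = "the lazy fox"

# Invented documents; document 1 has the phrase in A only.
docs = [
    {"id": 1, "A": "the lazy fox sleeps", "B": "nothing relevant"},
    {"id": 2, "A": "the lazy fox", "B": "the lazy fox again"},
]
for d in docs:
    # Stand-in for a field C filled by copyField from A and B.
    d["C"] = d["A"] + " " + d["B"]


def ids(pred):
    """Return the ids of the documents for which pred(doc) is true."""
    return [d["id"] for d in docs if pred(d)]


a_and_b = ids(lambda d: PHRASE in d["A"] and PHRASE in d["B"])
a_or_b = ids(lambda d: PHRASE in d["A"] or PHRASE in d["B"])
only_c = ids(lambda d: PHRASE in d["C"])

print("A AND B:", a_and_b)
print("A OR B: ", a_or_b)
print("C:      ", only_c)
```

In this model the AND query misses document 1, while the OR query and the
query on C return the same documents.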



2010/5/17 Xavier Schepler xavier.schep...@sciences-po.fr

 Hey,

 let's say  I have :

 - a field named A with specific contents

 - a field named B with specific contents

 - a field named C which contains only the contents of A and B, added with
 copyField.

 Are those queries equivalent in terms of performance:

 - A: (the lazy fox) AND B: (the lazy fox)
 - C: (the lazy fox)

 ??

 Thanks,

 Xavier






disable caches in real time

2010-05-14 Thread Marco Martinez
Hi,

I want to know if there is any way to disable the caches in a specific core
of a multicore server.

My situation is as follows:

I have a multicore server where core0 serves the queries and another core
(core1) is replicated from a master server. Once the replication has
finished, I swap the cores. My point is that I want to disable the caches in
the core that is in charge of the replication, to save memory on the machine.

Any suggestions will be appreciated.

Thanks in advance,




Re: Question on pf (Phrase Fields)

2010-05-13 Thread Marco Martinez
I don't know if this meets your requirements, but you could send the query in
fq when you search only for foo (filter queries do not affect scoring) and in
q when you search for more terms.



2010/5/13 Blargy zman...@hotmail.com


 Is there any way to configure this so it only takes after if you match more
 than one word?

 For example if I search for: foo it should have no effect on scoring, but
 if I search for foo bar then it should.

 Is this possible? Thanks
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Question-on-pf-Phrase-Fields-tp815095p815095.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: JTeam Spatial Plugin

2010-05-12 Thread Marco Martinez
Hi,


You can use LocalSolr (http://www.gissearch.com/localsolr), which supports
sharding, if you need this feature.





2010/5/11 Jean-Sebastien Vachon js.vac...@videotron.ca

 Hi,

 Thanks for your suggestion but I received more information about this issue
 from one of the JTeam's developer and he told me that
 my problem was caused by the plugin not supporting sharding at this time.

 In my case, I noticed that individual shards were computing the distance
 through the geo_distance field.
 However, the master Solr instance controlling the shards was kind of
 losing this information because of the lack of support for shards.

 For now there is no quick work around that I know of.

 Later,

 On 2010-05-11, at 2:54 PM, Michael wrote:

  Try using geo_distance in the return fields.
 
  On Thu, Apr 29, 2010 at 9:26 AM, Jean-Sebastien Vachon
  js.vac...@videotron.ca wrote:
  Hi All,
 
  I am using JTeam's Spatial Plugin RC3 to perform spatial searches on my
 index and it works great. However, I can't seem to get it to return the
 computed distances.
 
  My query component is run before the geoDistanceComponent and the
 distanceField is set to distance
  Fields for lat/long are defined as well and the different tiers field
 are in the results. Increasing the radius cause the number of matches to
 increase so I guess that my setup is working...
 
  Here is sample query and its output (I removed some of the fields to
 keep it short):
 
 
 /select?passkey=sample&q={!spatial%20lat=40.27%20long=-76.29%20radius=22%20calc=arc}title:engineer&wt=json&indent=on&fl=*,distance
 
  
 
  {
   "responseHeader":{
    "status":0,
    "QTime":69,
    "params":{
      "fl":"*,distance",
      "indent":"on",
      "q":"{!spatial lat=40.27 long=-76.29 radius=22 calc=arc}title:engineer",
      "wt":"json"}},
   "response":{"numFound":223,"start":0,"docs":[
      {

       "title":"Electrical Engineer",
       "long":-76.3054962158203,
       "lat":40.037899017334,
       "_tier_9":-3.004,
       "_tier_10":-6.0008,
       "_tier_11":-12.0016,
       "_tier_12":-24.0031,
       "_tier_13":-47.0061,
       "_tier_14":-93.00122,
       "_tier_15":-186.00243,
       "_tier_16":-372.00485},
  }}
 
  This output suggests to me that everything is in place. Anyone knows how
 to fetch the computed distance? I tried adding the field 'distance' to my
 list of fields but it didn't work
 
  Thanks
 




Re: multivalue fields logic required

2010-05-12 Thread Marco Martinez
Hi,

Second solution:

Don't use multivalued fields; use two single-valued fields instead. In your
example that would be:

doc1:
dept: student1
city: city1
principalFlag:T
doc2:
dept: student2
city: city2
principalFlag:F

So, if you search without specifying any city or dept, you should add
principalFlag:T so that you don't get duplicates in your response. If you do
specify a city or a dept, there is no need to specify the principalFlag,
because you will only get the results that match your fields (no
duplicates).

Third solution:

Post-process the response to eliminate the values that you don't need, i.e.
keep only the city and the dept that match the query.
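
The third solution could look roughly like the following Python sketch. It
assumes that dept and city are parallel lists in the returned document (same
length, same order), which relies on multivalued fields keeping their
indexing order; the function name and the dictionary layout are invented for
the example.

```python
def keep_matching_pair(doc, city):
    """Return a copy of doc reduced to the dept/city pair matching `city`."""
    # Assumption: dept[i] belongs to city[i] for every position i.
    pos = doc["city"].index(city)  # raises ValueError if the city is absent
    trimmed = dict(doc)
    trimmed["city"] = [doc["city"][pos]]
    trimmed["dept"] = [doc["dept"][pos]]
    return trimmed


# The document from the question, as it might come back from a search.
doc = {
    "id": "1",
    "name": ["name of emp"],
    "dept": ["student1", "student2", "student3"],
    "city": ["city1", "city2", "city3"],
}
print(keep_matching_pair(doc, "city2"))
```

This is simplest when city is a string or integer field, because the stored
value can then be compared directly with the search term.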

Hope this will help





2010/5/12 Jonty Rhods jonty.rh...@gmail.com

 Hi Marco,

 I am trying to patch for collapse component support (till now no luck)..
 In mean time I would like to know the 2nd and 3rd option you mentioned
 (logic in solrj)..

 with regards

 On Thu, May 6, 2010 at 2:36 PM, Marco Martinez 
 mmarti...@paradigmatecnologico.com wrote:

  Hi Jonty,
 
  I think you have three possible solutions:
 
 
1. Use the collapse component with your name field for not have any
duplicates documents.
2. Create a simple logic in your index with flags, like one flag to
determine the first element of the same document (in your example you
  will
have three differents documents and the fist one wiill have this
  flag=true).
If the search only have name, you will have to set this flag to true,
 if
not, the dept or the student will be defined and you will have one
  document
returned.
3. Do a post-processing of your data.
 
  Maybe you will have more solutions but these are what i have thought
 right
  now.
 
  Regards,
 
 
  Marco Martínez Bautista
  http://www.paradigmatecnologico.com
  Avenida de Europa, 26. Ática 5. 3ª Planta
  28224 Pozuelo de Alarcón
  Tel.: 91 352 59 42
 
 
  2010/5/6 Jonty Rhods jonty.rh...@gmail.com
 
   thanks
  
   :General solution is to index 3 different SolrDocument in your example.
  id
   and name fields will repeat themselves. All fields will be
 single-valued.
  
   if I am indexing 3 different field then if user is searching by name +
  dept
   then it will return duplicate value.. is there any other best possible
   way..?
  
   thanks
   On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan iori...@yahoo.com
 wrote:
  
   
 recently I start to work on solr, So I am still very new to
 use solr. Sorry
 if I am logically wrong.
 I have two table, parent and referenced (child).

 for that I set multivalue field following is my schema
 details
  field name=id type=string indexed=true
 stored=true required=true
 /


field name=name type=text
 indexed=true stored=true/

field name=dept type=text
 indexed=true stored=true
 multiValued=true/
field name=city type=text
 indexed=true stored=true
 multiValued=true/

 indexed data details:

 doc

   arr name=dept
 strstudent1/str
 strstudent2/str
 strstudent3/str
   /arr

   arr name=city
 strcity1/str
 strcity2/str
 strcity3/str
   /arr
  str name=id1/str

  arr name=name
strname of emp/str
   /arr

 /doc

 now my question is :
 When user is searching by city2 then I want to return
 employee2 and their id
 (for multi value field).
 something like:

 doc

   arr name=dept

 strstudent2/str

   /arr

   arr name=city

 strcity2/str

   /arr
  str name=id1/str

  arr name=name
strname of emp/str
   /arr

 /doc

   
I had a similar need before. AFAIK you cannot do it with multivalued
fields. The indexing order is preserved in multivalued field. May be
  you
   can
post-process returned fields and capture correct position of matched
  city
field, and use this index to display correct dept value. But this is
  easy
   if
you are using string or integer type for city and dept.
   
General solution is to index 3 different SolrDocument in your
 example.
  id
and name fields will repeat themselves. All fields will be
  single-valued.
   
   
   
   
   
  
 



Re: multivalue fields logic required

2010-05-12 Thread Marco Martinez
You should preprocess the data before indexing it with that logic: split your
document into as many documents as you have values in your multivalued field,
with principalFlag:T on the first one.
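
A minimal sketch of that preprocessing step in Python, assuming the source
record has dept and city lists of the same length. The function name and the
rowId field are hypothetical: if id is the uniqueKey of the schema, documents
that share an id would overwrite each other, so some distinct key per row is
needed.

```python
def explode(record):
    """Turn one multivalued record into one single-valued document per row."""
    docs = []
    pairs = zip(record["dept"], record["city"])
    for i, (dept, city) in enumerate(pairs):
        docs.append({
            # Hypothetical unique key, one per generated document.
            "rowId": "%s-%d" % (record["id"], i),
            "id": record["id"],
            "name": record["name"],
            "dept": dept,
            "city": city,
            # Only the first generated document carries the flag T.
            "principalFlag": "T" if i == 0 else "F",
        })
    return docs


record = {
    "id": "1",
    "name": "name of emp",
    "dept": ["student1", "student2", "student3"],
    "city": ["city1", "city2", "city3"],
}
docs = explode(record)
for d in docs:
    print(d)
```

Because the flag is derived from the position in the list, the number of rows
per record does not need to be known in advance.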



2010/5/12 Jonty Rhods jonty.rh...@gmail.com

 hi Marco,

 Thanks for quick reply..
 I have another doubt about the 2nd solution: how do I set the flag for
 duplicate values? I am not sure about the number of duplicate rows (it could
 be any number), so how can I set the flag?
 Thanks

 On Wed, May 12, 2010 at 12:59 PM, Marco Martinez 
 mmarti...@paradigmatecnologico.com wrote:

  Hi,
 
  2º solution:
 
  Not use multiValue fields, instead use two single fields, in your example
  will be:
 
  doc1:
  dept: student1
  city: city1
  principalFlag:T
  doc2:
  dept: student2
  city: city2
  principalFlag:F
 
  So, if you search without specify any city or dept, you should put
  princiaplFlag:T for no get duplicate on your response. And if you specify
 a
  city or a dept, there is no need to specify the principalFlag because you
  will only get the result that match with your fields (you dont get
  duplicates).
 
  3º solution:
 
  Do a postprocessing to eleminate the fields in your response that you
 dont
  need, i mean, get only the city and the dept that should be in the query
  response.
 
  Hope this will help
 
 
 
  Marco Martínez Bautista
  http://www.paradigmatecnologico.com
  Avenida de Europa, 26. Ática 5. 3ª Planta
  28224 Pozuelo de Alarcón
  Tel.: 91 352 59 42
 
 
  2010/5/12 Jonty Rhods jonty.rh...@gmail.com
 
   Hi Marco,
  
   I am trying to patch for collapse component support (till now no
 luck)..
   In mean time I would like to know the 2nd and 3rd option you mentioned
   (logic in solrj)..
  
   with regards
  
   On Thu, May 6, 2010 at 2:36 PM, Marco Martinez 
   mmarti...@paradigmatecnologico.com wrote:
  
Hi Jonty,
   
I think you have three possible solutions:
   
   
  1. Use the collapse component with your name field for not have any
  duplicates documents.
  2. Create a simple logic in your index with flags, like one flag to
  determine the first element of the same document (in your example
 you
will
  have three differents documents and the fist one wiill have this
flag=true).
  If the search only have name, you will have to set this flag to
 true,
   if
  not, the dept or the student will be defined and you will have one
document
  returned.
  3. Do a post-processing of your data.
   
Maybe you will have more solutions but these are what i have thought
   right
now.
   
Regards,
   
   
Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42
   
   
2010/5/6 Jonty Rhods jonty.rh...@gmail.com
   
 thanks

 :General solution is to index 3 different SolrDocument in your
  example.
id
 and name fields will repeat themselves. All fields will be
   single-valued.

 if I am indexing 3 different field then if user is searching by
 name
  +
dept
 then it will return duplicate value.. is there any other best
  possible
 way..?

 thanks
 On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan iori...@yahoo.com
   wrote:

 
   recently I start to work on solr, So I am still very new to
   use solr. Sorry
   if I am logically wrong.
   I have two table, parent and referenced (child).
  
   for that I set multivalue field following is my schema
   details
field name=id type=string indexed=true
   stored=true required=true
   /
  
  
  field name=name type=text
   indexed=true stored=true/
  
  field name=dept type=text
   indexed=true stored=true
   multiValued=true/
  field name=city type=text
   indexed=true stored=true
   multiValued=true/
  
   indexed data details:
  
   doc
  
 arr name=dept
   strstudent1/str
   strstudent2/str
   strstudent3/str
 /arr
  
 arr name=city
   strcity1/str
   strcity2/str
   strcity3/str
 /arr
str name=id1/str
  
arr name=name
  strname of emp/str
 /arr
  
   /doc
  
   now my question is :
   When user is searching by city2 then I want to return
   employee2 and their id
   (for multi value field).
   something like:
  
   doc
  
 arr name=dept
  
   strstudent2/str
  
 /arr
  
 arr name=city
  
   strcity2/str
  
 /arr
str name=id1/str
  
arr name=name
  strname of emp/str
 /arr
  
   /doc

Re: multivalue fields logic required

2010-05-06 Thread Marco Martinez
Hi Jonty,

I think you have three possible solutions:


   1. Use the collapse component on your name field so that you don't get
   duplicate documents.
   2. Add some simple logic to your index using flags, e.g. a flag that marks
   the first element of the same document (in your example you will have
   three different documents and the first one will have this flag=true).
   If the search is by name only, you will have to filter on this flag being
   true; if not, the dept or the city will be specified and only one
   document will be returned.
   3. Post-process your data.

These are the solutions I can think of right now; there may be others.

Regards,




2010/5/6 Jonty Rhods jonty.rh...@gmail.com

 thanks

 :General solution is to index 3 different SolrDocument in your example. id
 and name fields will repeat themselves. All fields will be single-valued.

 if I am indexing 3 different field then if user is searching by name + dept
 then it will return duplicate value.. is there any other best possible
 way..?

 thanks
 On Thu, May 6, 2010 at 1:34 PM, Ahmet Arslan iori...@yahoo.com wrote:

 
   recently I start to work on solr, So I am still very new to
   use solr. Sorry
   if I am logically wrong.
   I have two table, parent and referenced (child).
  
   for that I set multivalue field following is my schema
   details
field name=id type=string indexed=true
   stored=true required=true
   /
  
  
  field name=name type=text
   indexed=true stored=true/
  
  field name=dept type=text
   indexed=true stored=true
   multiValued=true/
  field name=city type=text
   indexed=true stored=true
   multiValued=true/
  
   indexed data details:
  
   doc
  
 arr name=dept
   strstudent1/str
   strstudent2/str
   strstudent3/str
 /arr
  
 arr name=city
   strcity1/str
   strcity2/str
   strcity3/str
 /arr
str name=id1/str
  
arr name=name
  strname of emp/str
 /arr
  
   /doc
  
   now my question is :
   When user is searching by city2 then I want to return
   employee2 and their id
   (for multi value field).
   something like:
  
   doc
  
 arr name=dept
  
   strstudent2/str
  
 /arr
  
 arr name=city
  
   strcity2/str
  
 /arr
str name=id1/str
  
arr name=name
  strname of emp/str
 /arr
  
   /doc
  
 
  I had a similar need before. AFAIK you cannot do it with multivalued
  fields. The indexing order is preserved in multivalued field. May be you
 can
  post-process returned fields and capture correct position of matched city
  field, and use this index to display correct dept value. But this is easy
 if
  you are using string or integer type for city and dept.
 
  General solution is to index 3 different SolrDocument in your example. id
  and name fields will repeat themselves. All fields will be single-valued.
 
 
 
 
 



Re: hi to everyone

2010-05-06 Thread Marco Martinez
You should specify the core in your request, like
http://localhost:8080/solr/core0/update?... where /solr/ is your
webapp and 'core0' is the name of the core.
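
As a rough sketch in Python, the request for a given core could be prepared
like this. The host, port and sample document are assumptions, and
update_request is a name invented for the example; nothing is sent until the
request is passed to urlopen.

```python
from urllib.request import Request


def update_request(core, xml_body, base="http://localhost:8080/solr"):
    """Prepare a POST of an XML update message to the given core."""
    # The core name is a path segment between the webapp and the handler.
    url = "%s/%s/update" % (base, core)
    return Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


# Sample add message with a single made-up document.
body = '<add><doc><field name="id">1</field></doc></add>'
req = update_request("core0", body)
print(req.get_method(), req.full_url)
# To send it to a running Solr: urllib.request.urlopen(req)
```

A <commit/> message posted to the same URL afterwards makes the added
documents visible to searches.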



2010/5/6 Antonello Mangone antonello.mang...@gmail.com

 Hi to everyone, my name is Antonello Mangone and I'm a new user of Solr
 (this is the 4th day :D).
 I'm just a novice and i would like to make a question ...

 I'm using solr in multicore way but i don't understad how to add xml
 documents to a particular core ...
 Can someone help me ???

 Antonello



Re: hi to everyone

2010-05-06 Thread Marco Martinez
See this page
http://wiki.apache.org/solr/UpdateXmlMessages#Updating_a_Data_Record_via_curl
and the Solr tutorial
http://lucene.apache.org/solr/tutorial.html (maybe you can use the
post.jar).



2010/5/6 Antonello Mangone antonello.mang...@gmail.com

 Ok, you're right :D

 I'll explain my situation ...

 I have solr locally on my machine

 */home/antonello/solrtest*

 inside the folder solrtest I have:

 |_ build
 |_ build.xml
 |_ CHANGES.txt
 |_ client
 |_ common-build.xml
 |_ contrib
 |_ dist
 |_ docs
 |_ etc
 |_ lib
 |_ LICENSE.txt
 |_ logs
 |_ multicore
    |_ bandb
       |_ conf
          |_ schema.xml
          |_ solrconfig.xml
       |_ data
          |_ index
             |_ segments_1
             |_ segments.gen
    |_ solr.xml
 |_ NOTICE.txt
 |_ README.txt
 |_ src
 |_ start.jar
 |_ start_multicore.sh
 |_ webapps


 I also have XML files in another place and I would like to add these XML
 files to the bandb core.
 Is there a command to add an XML file to a particular core, given that we
 can have an indefinite number of cores?





 2010/5/6 Marco Martinez mmarti...@paradigmatecnologico.com

  You should specify the core in your request, like
  http://localhost:8080/solr/*core0*/update?...  where /solr/ is your
  webapp and 'core0' is the name of the core.
 
  Marco Martínez Bautista
  http://www.paradigmatecnologico.com
  Avenida de Europa, 26. Ática 5. 3ª Planta
  28224 Pozuelo de Alarcón
  Tel.: 91 352 59 42
 
 
  2010/5/6 Antonello Mangone antonello.mang...@gmail.com
 
   Hi to everyone, my name is Antonello Mangone and I'm a new user of Solr
   (this is the 4th day :D).
   I'm just a novice and i would like to make a question ...
  
   I'm using solr in multicore way but i don't understad how to add xml
   documents to a particular core ...
   Can someone help me ???
  
   Antonello
  
 



Re: synonym filter problem for string or phrase

2010-05-03 Thread Marco Martinez
Hi Ranveer,

I don't see any stemming filter in the configuration of your 'text_sync'
field type. Also, you have <filter class="solr.TrimFilterFactory"/> at
query time but not at index time; maybe that is your problem.


Regards,




2010/4/30 Jonty Rhods jonty.rh...@gmail.com

 On 4/29/10 8:50 PM, Marco Martinez wrote:

 Hi Ranveer,

 If you don't specify a field type in the q parameter, the search will be
 done searching in your default search field defined in the solrconfig.xml,
 its your default field a text_sync field?

 Regards,

 Marco Martínez Bautista
 http://www.paradigmatecnologico.com
 Avenida de Europa, 26. Ática 5. 3ª Planta
 28224 Pozuelo de Alarcón
 Tel.: 91 352 59 42


 2010/4/29 Ranveerranveer.s...@gmail.com ranveer.s...@gmail.com



 Hi,

 I am trying to configure synonym filter.
 my requirement is:
 when user searching by phrase like what is solr user? then it should be
 replace with solr user.
 something like : what is solr user? =  solr user

 My schema for particular field is:

 <fieldType name="text_sync" class="solr.TextField" positionIncrementGap="100">
 <analyzer type="index">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
 <analyzer type="query">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter class="solr.TrimFilterFactory"/>
 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
 ignoreCase="true" expand="true"
 tokenizerFactory="KeywordTokenizerFactory"/>
 </analyzer>
 </fieldType>

 it seems working fine while trying by analysis.jsp but not by url
 http://localhost:8080/solr/core0/select?q=what is solr user?
 or
 http://localhost:8080/solr/core0/select?q=what is solr user?

 Please guide me for achieve desire result.






 Hi Marco,
 thanks.
 yes my default search field is text_sync.
 I am getting result now but not as I expect.
 following is my synonym.txt

 what is bone cancer => bone cancer
 what is bone cancer? => bone cancer
 what is of bone cancer => bone cancer
 what is symptom of bone cancer => bone cancer
 what is symptoms of bone cancer => bone cancer

 in above I am getting result of all synonym but not the last one what is
 symptoms of bone cancer=bone cancer.
 I think due to stemming I am not getting expected result. However when I am
 checking result from the analysis.jsp,
 its giving expected result. I am confused..
 Also I want to know best approach to configure synonym for my requirement.

 thanks
 with regards

 Hi,

 I am also facing the same type of problem.
 I am a newbie, please help.

 thanks
 Jonty



Re: synonym filter problem for string or phrase

2010-04-29 Thread Marco Martinez
Hi Ranveer,

If you don't specify a field in the q parameter, the search will be
run against your default search field defined in the solrconfig.xml.
Is your default field a text_sync field?
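
To make the point about the q parameter concrete, here is a minimal Python sketch that builds a select URL naming the field explicitly and URL-encoding the phrase (the URLs quoted below contain raw spaces and a "?"). It assumes there is a field that is itself named text_sync; the helper name build_select_url is made up for illustration and is not part of Solr or of any client library.

```python
from urllib.parse import urlencode


def build_select_url(base_url, field, phrase):
    """Build a Solr select URL that searches `phrase` as a quoted phrase
    in an explicitly named field instead of the default search field."""
    # Escape backslashes and double quotes so the phrase stays one quoted unit.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    query = f'{field}:"{escaped}"'
    # urlencode percent-encodes ':', '"' and '?' and turns spaces into '+'.
    return f"{base_url}/select?{urlencode({'q': query})}"


# Assumption: the core URL from the thread and a field named text_sync.
url = build_select_url(
    "http://localhost:8080/solr/core0", "text_sync", "what is solr user?"
)
print(url)
```

Quoting the phrase keeps the query parser from splitting it on whitespace before the field's analyzer sees it, which may matter for a KeywordTokenizerFactory-based type.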

Regards,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/29 Ranveer ranveer.s...@gmail.com

 Hi,

 I am trying to configure synonym filter.
 my requirement is:
 when a user searches for a phrase like "what is solr user?", it should be
 replaced with "solr user".
 Something like: what is solr user? => solr user

 My schema for particular field is:

 <fieldType name="text_sync" class="solr.TextField"
 positionIncrementGap="100">
 <analyzer type="index">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>

 </analyzer>
 <analyzer type="query">
 <tokenizer class="solr.KeywordTokenizerFactory"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter class="solr.TrimFilterFactory" />
 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
 ignoreCase="true" expand="true" tokenizerFactory="KeywordTokenizerFactory"/>
 </analyzer>
 </fieldType>

 It seems to work fine when I try it in analysis.jsp, but not via the URL
 http://localhost:8080/solr/core0/select?q=what is solr user?
 or
 http://localhost:8080/solr/core0/select?q=what is solr user?

 Please guide me to achieve the desired result.




Re: Facet count problem

2010-04-19 Thread Marco Martinez
Hi Ranveer,

The error in the facet counts is caused by the tokenized field that
you are using. If you want to facet on the whole string, use a
fieldType that doesn't split the field into tokens, like the string field.
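
A rough way to see the difference is the Python simulation below. It is only a sketch: it splits on whitespace and lowercases, which is much simpler than the analyzer chain in the quoted schema, and the sample values are invented for illustration.

```python
from collections import Counter


def facet_counts(values, tokenized):
    """Count how many documents contain each facet term."""
    counts = Counter()
    for value in values:
        if tokenized:
            # A tokenized field indexes every token as its own term.
            terms = set(value.lower().split())
        else:
            # A string-like field indexes the whole value as a single term.
            terms = {value}
        counts.update(terms)
    return counts


# Hypothetical values of the "type" field, one per document.
docs = ["news", "blog", "press release", "press release"]

tokenized = facet_counts(docs, tokenized=True)
whole = facet_counts(docs, tokenized=False)
print(tokenized)
print(whole)
```

With the tokenized variant the facet terms are "press" and "release" rather than "press release", so the counts no longer line up with the original values.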

Regards,

Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/19 Ranveer Kumar ranveer.s...@gmail.com

 Hi Erick,

 My schema configuration is as follows.


  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <!--<tokenizer class="solr.WhitespaceTokenizerFactory"/>-->
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory"
     synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!-- Case insensitive stop word removal.
      add enablePositionIncrements=true in both the index and query
      analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.WordDelimiterFilterFactory"
     generateWordParts="1" generateNumberParts="1" catenateWords="1"
     catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
     protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/><!--
     escapedTags="&lt;,&gt;"/>  -->
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <!--<tokenizer class="solr.WhitespaceTokenizerFactory"/>-->
    <!--<tokenizer class="solr.HTMLStripStandardTokenizerFactory"/>-->

    <!--  <filter class="solr.LengthFilterFactory" min="2" max="50" />
     -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
     ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.WordDelimiterFilterFactory"
     generateWordParts="1" generateNumberParts="1" catenateWords="0"
     catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
     protected="protwords.txt"/>
  </analyzer>
  </fieldType>


  <field name="type" type="text" indexed="true" stored="true"/>

  <!-- copy field for default search-->
  <copyField source="type" dest="text"/>





 On Mon, Apr 19, 2010 at 6:22 AM, Erick Erickson erickerick...@gmail.com
 wrote:

  Can we see the actual field definitions from your schema file?
  Ahmet's question is vital and is best answered if you'll
  copy/paste the relevant configuration entries. But based
  on what you *have* posted, I'd guess you're trying to
  facet on tokenized fields, which is not recommended.
 
  You might take a look at:
  http://wiki.apache.org/solr/UsingMailingLists, it'll help you
  frame your questions in a manner that gets you your
  answers as fast as possible.
 
  Best
  Erick
 
  On Sun, Apr 18, 2010 at 12:59 PM, Ranveer Kumar ranveer.s...@gmail.com
  wrote:
 
   I am using text for type, which is static. For example: type is a field and
   I am using type for categorization. For the news type I am using news and for
   blog I am using blog. type is a text field.
  
   On Apr 17, 2010 8:38 PM, Ahmet Arslan iori...@yahoo.com wrote:
  
I am facing problem to get facet result count. I must be  wrong
   somewhere.  I am getting proper ...
   Are you faceting on a tokenized field? What is the fieldType of your
  field?
  
 



Re: Replication process on Master/Slave slowing down slave read/search performance

2010-04-09 Thread Marco Martinez
Hi Marcin,

This is because when replication happens, all the caches are rebuilt
because the index has changed, so search performance decreases. You can
change your architecture to a multicore one to reduce the impact of
replication: use two cores, one to receive the replication and the other to
serve searches, and when the replication is done, swap the cores so the caches
stay up to date all the time.
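
As a sketch of the swap step, the Python snippet below only builds the CoreAdmin request URL for a SWAP. It assumes the cores are reachable under the usual /admin/cores admin path (this depends on the adminPath set in solr.xml) and uses two made-up core names, "search" and "replica"; the helper build_swap_url is illustrative and not part of any Solr client.

```python
from urllib.parse import urlencode


def build_swap_url(solr_base, core, other):
    """Build the CoreAdmin URL that swaps the names of two cores."""
    # action=SWAP exchanges the two cores, so requests sent to `core`
    # are served by what used to be `other` afterwards.
    params = urlencode({"action": "SWAP", "core": core, "other": other})
    return f"{solr_base}/admin/cores?{params}"


# Assumption: a local Solr instance with cores named "search" and "replica".
swap_url = build_swap_url("http://localhost:8080/solr", "search", "replica")
print(swap_url)
```

The request would be sent once replication into the second core has finished and its searcher has warmed up; sending it is left out here on purpose.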

Regards


Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón
Tel.: 91 352 59 42


2010/4/9 Marcin mar...@feedsmanagement.com

 Hi guys,

 I have noticed that the Master/Slave replication process slows down slave
 read/search performance while replication is running.


 please help
 cheers



Re: Solr query parser doesn't invoke analyzer for simple term query?

2010-03-17 Thread Marco Martinez
Hello,

You can see what happens with this search (which analyzers are used for this
field and what their output is) using the analysis page of the
default Solr web admin. I assume you are using the same analyzers and
tokenizers for indexing and searching on this field in your schema.

Regards,


Marco Martínez Bautista



2010/3/17 Teruhiko Kurosaka k...@basistech.com

 It seems that Solr's query parser doesn't pass a single term query
 to the Analyzer for the field. For example, if I give it
 2001年 (year 2001 in Japanese), the searcher returns 0 hits
 but if I quote them with double-quotes, it returns hits.
 In this experiment, I configured schema.xml so that
 the field in question will use the morphological Analyzer
 my company makes that is capable of splitting 2001年
 into two tokens 2001 and 年.  I am guessing that this
 Analyzer is called ONLY IF the term is a phrase.
 Is my observation correct?

 If so, is there any configuration parameter that I can tweak
 to force any query for the text fields be processed by
 the Analyzer?

 One might ask why users won't put space between 2001 and 年.
 Well if they are clearly two separate words, people do that.
 But 年 works more like a suffix in this case, and in many
 Japanese speakers' minds, 2001年 seems like one token, so
 many people won't.  (Remember Japanese don't use spaces
 in normal writing.)  Forcing to use Analyzer would also
 be useful for compound word handling often desirable
 for languages like German.

 
 Teruhiko Kuro Kurosaka
 RLP + Lucene & Solr = powerful search for global contents