Re: documentation on the pragmatics behind the example schema.xml

2012-06-30 Thread Giovanni Gherdovich
Hello Erick,

2012/7/1 Erick Erickson :
> Your very best way of figuring this out is to use the admin/analysis
> page. [...]

thank you for this advice. I'll make myself comfortable
with the admin/analysis page.

cheers,
GGhh


Re: index writer in searchComponent

2012-06-30 Thread Peyman Faratin
Hi Erick

The workflow I'd like to implement is 

1- search the index using the incoming query
2- the query is of the type "does entity X exist"
3- if X does not exist in the index then I'd like to add X to the index

Currently I am using a custom search component to achieve this by creating a
SolrServer instance within the init (or inform) method of the search component
and using that instance to update (and commit) the index. I am not sure this is
the best approach either, and thought that using the IndexReader of the search
component itself may be better.

Is there a better approach in your opinion?

thank you Erick

Peyman

On Jun 30, 2012, at 8:13 PM, Erick Erickson wrote:

> Lots of the index modification (all of it?) has been removed in 4.0
> from IndexReaders...
> 
> It seems like you could always get the directory and open a
> SolrIndexWriter wherever you wanted, but I'm not sure it's a good idea.
> Are there other processes that will be writing to the index at the
> same time?
> 
> What's the purpose here anyway? There might be a better approach
> 
> Best
> Erick
> 
> On Thu, Jun 28, 2012 at 4:02 PM, Peyman Faratin  
> wrote:
>> Hi
>> 
>> Is it possible to add a new document to the index in a custom
>> SearchComponent (that also implements SolrCoreAware)? I can get a
>> reference to the IndexReader via the ResponseBuilder parameter of the
>> process() method using
>>
>> rb.req.getSearcher().getReader()
>>
>> But is it possible to actually add a new document to the index _after_
>> searching the index, i.e. by accessing the IndexWriter?
>> 
>> thank you
>> 
>> Peyman



Re: documentation on the pragmatics behind the example schema.xml

2012-06-30 Thread Erick Erickson
Your very best way of figuring this out is to use the admin/analysis
page. It will show you the exact effects of each element of the analysis
chains for the field type you specify. From there it's just a matter of
getting your head around the fact that the various filters and tokenizers
can be combined in many different ways to suit your particular purpose.
Be sure to check the "verbose" checkbox!

But other than the comments in the schema file, there's no source of
documentation for the purposes of the field types that I know of...

Best
Erick

On Sat, Jun 30, 2012 at 9:51 AM, Giovanni Gherdovich
 wrote:
> Hi all,
>
> in the example schema.xml I can find a wide variety
> of fieldType and field, already there to be used.
>
> I believe each of them has been designed for a specific
> usage case, with some pragmatics in mind.
>
> Where can I find documentation on what those field / fieldTypes
> were designed for? Is the best place to get those info
> the schema.xml file and its comments?
>
> cheers,
> GGhh
>
> here I cut and paste fields and fieldTypes I have:
>
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
> required="true" />
> omitNorms="true"/>
>
> stored="false"/>
> omitNorms="true"/>
> multiValued="true" omitNorms="true" />
> multiValued="true"/>
> termVectors="true" termPositions="true" termOffsets="true" />
>
>
>
>
> multiValued="true"/>
>
>
>
>
>
>
> stored="true" multiValued="true"/>
>
> multiValued="true"/>
> multiValued="true"/>
> stored="false" multiValued="true"/>
>
>
> default="NOW" multiValued="false"/>
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
>
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
>  omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
>  precisionStep="0" positionIncrementGap="0"/>
>  omitNorms="true" precisionStep="6" positionIncrementGap="0"/>
> 
> 
> 
> 
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
> 
> 
>  positionIncrementGap="100">
> 
>  positionIncrementGap="100" >
>  positionIncrementGap="100">
>  positionIncrementGap="100">
>  positionIncrementGap="100" >
>  sortMissingLast="true" omitNorms="true">
>  positionIncrementGap="100">
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8


Re: Wildcard searches with leading and ending wildcard

2012-06-30 Thread Erick Erickson
To expand on Jack's point: for searching sub-strings, ngrams are
generally preferred.

The whole purpose behind reversed wildcards is that without them, searching for
*abcd requires that _every_ term in your field be enumerated, which can be very
expensive. Adding in reversed wildcards causes this to turn into a trailing
wildcard, and enumerating dcba* is much easier/less costly.
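
As a rough illustration of the two ideas above (not Solr's actual
implementation), the Python sketch below shows how a leading-wildcard pattern
can be rewritten as a prefix pattern over reversed terms, and how n-grams turn
a "contains" search into an exact term lookup. All function names are made up
for this example, and it ignores details of the real filter such as any marker
it may add to reversed tokens.

```python
def reversed_wildcard_terms(term, with_original=True):
    """Index-time terms: the reversed token, plus the original if requested."""
    terms = [term[::-1]]
    if with_original:
        terms.append(term)
    return terms


def rewrite_leading_wildcard(pattern):
    """Turn a pattern with only a leading '*' into a prefix pattern on the reversed term."""
    if not pattern.startswith("*") or pattern.endswith("*"):
        raise ValueError("expected a pattern with only a leading wildcard")
    return pattern[1:][::-1] + "*"


def ngrams(term, n):
    """All contiguous substrings of length n, roughly what an n-gram filter emits."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]


def contains(indexed_terms, needle):
    """A substring search becomes an exact lookup against the n-gram terms."""
    return needle in indexed_terms


phone = "5551234"  # hypothetical telephone number
print(reversed_wildcard_terms(phone))
print(rewrite_leading_wildcard("*abcd"))
grams = set(ngrams(phone, 4))
print(contains(grams, "1234"), contains(grams, "4321"))
```

With 4-grams, 1234 is found inside the number while 4321 is not, which is the
"contains" behaviour asked about in the question quoted below.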

Best
Erick

On Fri, Jun 29, 2012 at 9:21 AM, maurizio1976
 wrote:
> Hi all,
> I've been searching for an answer to this everywhere but I can never find an
> answer that is perfect for my case, so I'll ask this myself.
>
> I'm on Solr 3.6.
> I use the *ReversedWildcardFilterFactory* in a field containing a
> telephone number.
> So only one word is indexed, no phrases and no strange tokens.
> To be more exact: <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
> maxPosAsterisk="3" maxPosQuestion="2"
> maxFractionAsterisk="0.33"/>
>
> I can check with Luke that two words are being indexed, one the reverse of
> the other. Perfect.
>
> I can run a query like this: Num:1234* that will match docs starting
> with 1234
> and I can run a query like this: Num:*1234 that will match docs ending
> with 1234
>
> but this is the question that everybody seems to be asking.
> Can I run in any way a query that will match records that "contains" the
> value 1234?
>
> If I write this: Num:*1234* this will match docs containing 1234 but also
> docs containing 4321, which is wrong. This means that the query Num:*4321* and
> the query Num:*1234* return exactly the same result.
>
> Is this the wrong approach? Has anybody tried the N-gram solution to this
> problem?
>
> thanks very much
> Maurizio
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Wildcard-searches-with-leading-and-ending-wildcard-tp3992086.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Filtering a query by range returning unexpected results

2012-06-30 Thread Erick Erickson
This works fine for me with 3.6, float fields and even on a currency type.

I'm assuming a typo for 15.00.00 BTW.

I admit I'm not all that familiar with the "currency" type, which I infer you're
using given the "USD" bits. But I ran a quick test with currency types and
it worked at least the way I ran it... But another quick look shows that
some interesting things are being done with the "currency" type, so who knows?

So, let's see your relevant schema bits, and the results of your query
when you attach &debugQuery=on to it.
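
A minimal sketch of building such a debug request in Python. The host, port
and handler path (http://localhost:8983/solr/select) are assumptions, as is
passing the range as an fq filter; the field name "prices" and the range come
from the question quoted below.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, filter_query, debug=True):
    """Build a select URL with a filter query and, optionally, debugQuery=on."""
    params = [("q", query), ("fq", filter_query)]
    if debug:
        params.append(("debugQuery", "on"))
    return base_url + "?" + urlencode(params)


url = build_select_url(
    "http://localhost:8983/solr/select",  # assumed location of the Solr instance
    "*:*",
    "prices:[15.00 TO 21.00]",
)
print(url)
```

The printed URL can be pasted into a browser; the debug section of the
response shows how the range filter was parsed.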


Best
Erick

On Fri, Jun 29, 2012 at 2:43 PM, Andrew Meredith  wrote:
> First off, I have to say that I am working on my first project that has
> required me to work with Solr, so my question may be very elementary - I
> just could not find an answer elsewhere.
>
> I am trying to add a ranged query filter that returns all items in a given
> "prices" range. In my situation, each item can have multiple prices, so it
> is a multivalued field. When I search a range, say, prices:[15.00.00 TO
> 21.00], I want Solr to return all items that have *any* price in that
> range, rather than returning results where *all* prices are in the range.
> For example, if I have an item with the following prices, it will not be
> returned:
>   
>   19.99,USD
>   22.50,USD
>   
>
> Is there any way to change the behaviour of Solr so that it will match
> documents in which any value of a multivalued field matches a ranged query
> filter?
>
> Thanks!
>
> --
> 
> S.D.G.


Re: index writer in searchComponent

2012-06-30 Thread Erick Erickson
Lots of the index modification (all of it?) has been removed in 4.0
from IndexReaders...

It seems like you could always get the directory and open a
SolrIndexWriter wherever you wanted, but I'm not sure it's a good idea.
Are there other processes that will be writing to the index at the
same time?

What's the purpose here anyway? There might be a better approach.

Best
Erick

On Thu, Jun 28, 2012 at 4:02 PM, Peyman Faratin  wrote:
> Hi
>
> Is it possible to add a new document to the index in a custom SearchComponent
> (that also implements SolrCoreAware)? I can get a reference to the
> IndexReader via the ResponseBuilder parameter of the process() method using
>
> rb.req.getSearcher().getReader()
>
> But is it possible to actually add a new document to the index _after_
> searching the index, i.e. by accessing the IndexWriter?
>
> thank you
>
> Peyman


Re: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'

2012-06-30 Thread Erick Erickson
What is the exception you're encountering? You might review:

http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

On Thu, Jun 28, 2012 at 2:48 PM, derohit  wrote:
> Hi All,
>
> I am facing an exception while trying to use DataImportHandler for indexing.
> My solrconfig.xml is:
>
> ${solr.abortOnConfigurationError:true}
> LUCENE_36  dir="../../dist/" regex="apache-solr-dataimporthandler-d.*.jar" />
>  class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
>   handleSelect="true" > 
>   class="solr.StandardRequestHandler" default="true" />  name="/update"   class="solr.JsonUpdateRequestHandler"
> startup="lazy" />  class="solr.admin.AdminHandlers" />  class="solr.PingRequestHandler">name="qt">search   solrpingquery 
>all 
> class="org.apache.solr.handler.dataimport.DataImportHandler">  name="defaults">   data-config.xml 
>   solr
>
>
>
> and the jar's name is apache-solr-dataimporthandler-3.6.0.jar
>
> Please reply if someone has the solution to it.
>
> Regards
> Rohit
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Error-loading-class-org-apache-solr-handler-dataimport-DataImportHandler-tp3991940.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Can't find solr.xml

2012-06-30 Thread Lance Norskog
Try starting with the example/multicore directory. It shows how
solr.xml describes different available cores.

On Sat, Jun 30, 2012 at 11:28 AM, Nabeel Sulieman
 wrote:
> Hi,
>
>
>
> I really hate bothering this group with something that should be trivial,
> but I've been googling and experimenting to get this to work for the last
> week now. I had no trouble getting my simple configuration working on 3.5,
> but when I moved over to 3.6, I seem to have hit something strange.
>
>
>
> As I said I'm on the latest version of solr (3.6.0), and I'm using exactly
> the standard war file, with the "solr/home" section uncommented and set to
> my Solr directory.
>
>
>
> However, even though the path is correct, Solr/Tomcat don't seem to be able
> to find the solr.xml file, nor the solrconfig.xml file.
>
>
>
> Java version is 1.6.0_29-b11, tomcat 5.5.35, CentOS.
>
>
>
> What am I missing here?
>
>
>
> Thanks. Below is the error log.
>
>
>
> [...]

Re: Atomic Multicore Operations - E.G. Move Docs

2012-06-30 Thread Lance Norskog
Index all documents to both cores, but do not call commit until both
report that indexing worked. If one of the cores throws an exception,
call rollback on both cores.
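
A rough sketch of that control flow in Python. The Core class here is a
hypothetical in-memory stand-in, not a Solr client; only the order of
add, commit and rollback calls is the point.

```python
class Core:
    """Hypothetical in-memory stand-in for one Solr core."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.committed = []
        self.pending = []
        self.fail_on = fail_on  # doc id that simulates an indexing error

    def add(self, doc):
        if doc["id"] == self.fail_on:
            raise RuntimeError(f"{self.name}: cannot index {doc['id']}")
        self.pending.append(doc)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []


def index_to_both(core_a, core_b, docs):
    """Commit on both cores only if every add succeeded on both."""
    try:
        for doc in docs:
            core_a.add(doc)
            core_b.add(doc)
    except Exception:
        # One core failed: discard the uncommitted documents everywhere.
        core_a.rollback()
        core_b.rollback()
        return False
    core_a.commit()
    core_b.commit()
    return True


docs = [{"id": "1"}, {"id": "2"}]
a, b = Core("a"), Core("b", fail_on="2")
print(index_to_both(a, b, docs), a.committed, b.committed)
```

Note that the two commit calls are still separate steps, so a failure between
them is not covered by this sketch.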

On Sat, Jun 30, 2012 at 6:50 AM, Nicholas Ball
 wrote:
>
> Hey all,
>
> Trying to figure out the best way to perform an atomic operation across
> multiple cores on the same Solr instance, i.e. a multi-core environment.
>
> An example would be to move a set of docs from one core onto another core
> and ensure that a soft commit is done at the exact same time. If one were to
> fail, so would the other.
> Obviously this would probably require some customization but wanted to
> know what the best way to tackle this would be and where should I be
> looking in the source.
>
> Many thanks for the help in advance,
> Nicholas a.k.a. incunix



-- 
Lance Norskog
goks...@gmail.com


Re: more than one text corpus with solr?

2012-06-30 Thread Giovanni Gherdovich
Hi Gora,

yes I was actually looking for a multi-core setup.

thanks!

GGhh

2012/6/30 Gora Mohanty
>
> Not quite sure what you mean by "more than one
> corpus", and by "several independent indices" in
> this context, but maybe multi-core Solr will meet
> your needs: http://wiki.apache.org/solr/CoreAdmin
>
> Regards,
> Gora


Can't find solr.xml

2012-06-30 Thread Nabeel Sulieman
Hi,

I really hate bothering this group with something that should be trivial,
but I've been googling and experimenting to get this to work for the last
week now. I had no trouble getting my simple configuration working on 3.5,
but when I moved over to 3.6, I seem to have hit something strange.

As I said I'm on the latest version of solr (3.6.0), and I'm using exactly
the standard war file, with the "solr/home" section uncommented and set to
my Solr directory.

However, even though the path is correct, Solr/Tomcat don't seem to be able
to find the solr.xml file, nor the solrconfig.xml file.

Java version is 1.6.0_29-b11, tomcat 5.5.35, CentOS.

What am I missing here?

Thanks. Below is the error log.

Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer$Initializer initialize
INFO: looking for solr.xml: /home/dev/solr/solr.xml
Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer$Initializer initialize
INFO: no solr.xml file found - using default
Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer load
INFO: Loading CoreContainer using Solr Home: '/home/dev/solr/'
Jun 30, 2012 12:47:58 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: new SolrResourceLoader for directory: '/home/dev/solr/'
Jun 30, 2012 12:47:58 PM org.apache.solr.core.CoreContainer create
INFO: Creating SolrCore '' using instanceDir: /home/dev/solr/.
Jun 30, 2012 12:47:58 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: new SolrResourceLoader for directory: '/home/dev/solr/./'
Jun 30, 2012 12:47:58 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or '/home/dev/solr/./conf/', cwd=/usr/local/jakarta/apache-tomcat-5.5.35/bin
        at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:273)
        at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:239)
        at org.apache.solr.core.Config.<init>(Config.java:141)
        at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:138)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:455)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:335)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:165)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:96)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
        at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3666)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4258)
        at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3056)
        at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:904)
        at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:496)
        at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:99)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:627)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:563)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:879)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
        at org.apache.tomcat.util.thr

Re: difference between stored="false" and stored="true" ?

2012-06-30 Thread Giovanni Gherdovich
Thank you François and Jack for those explanations.

Cheers,
GGhh

2012/6/30 François Schiettecatte:
> Giovanni
>
> stored="true" means the data is stored in the index and [...]


2012/6/30 Jack Krupansky:
> "indexed" and "stored" are independent [...]


Re: more than one text corpus with solr?

2012-06-30 Thread Gora Mohanty
On 30 June 2012 15:28, Giovanni Gherdovich  wrote:
> Hi all,
>
> i am experimenting with solr, and I feel the need to
> index more than just one corpus and search them
> with solr independently.
>
> is it possible to have this setup?
> Several independent indices all managed by the same solr instance?

Not quite sure what you mean by "more than one
corpus", and by "several independent indices" in
this context, but maybe multi-core Solr will meet
your needs: http://wiki.apache.org/solr/CoreAdmin

Regards,
Gora


Re: difference between stored="false" and stored="true" ?

2012-06-30 Thread François Schiettecatte
Giovanni

stored="true" means the data is stored in the index and can be returned with
the search results (see the 'fl' parameter). This is independent of
indexed="true".

Which means that you can store but not index a field:

<field [...] indexed="false" stored="true"/>

Best regards

François

On Jun 30, 2012, at 9:57 AM, Giovanni Gherdovich wrote:

> Hi all,
> 
> when declaring a field in the schema.xml file you can
> set the attributes 'indexed' and 'stored' to "true" or "false".
> 
> What is the difference between a field with stored="false"
> and a field with stored="true"?
> 
> I guess understanding this would require me to have
> a closer look at Lucene's index data structures;
> what's the pointer to some doc I can read?
> 
> Cheers,
> GGhh



Re: difference between stored="false" and stored="true" ?

2012-06-30 Thread Jack Krupansky
"indexed" and "stored" are independent, orthogonal attributes - you can use 
any of the four combinations of true and false. "indexed" is used for search 
or query, the "lookup" portion of processing a query request. Once the 
search/query/lookup is complete and a set of documents is selected, "stored" 
is the set of fields whose values are available for display or return with 
the Solr response.


Part of the reason for the separation is that Solr/Lucene "analyzes" or 
transforms the input data into a more efficient form for faster and more 
relevant search/lookup. Unfortunately, that analyzed/transformed data is 
frequently no longer suitable for display and human consumption. In other 
words the analysis/transformation is not bidirectional/reversible. Setting 
"stored=true" guarantees that the original data can be retrieved in its 
original form.
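
To make the distinction concrete, here is a toy model in Python. It is only a
sketch under simplifying assumptions: ToyIndex and analyze are invented for
this illustration and have nothing to do with Lucene's real data structures.
Indexed fields feed a lossy token lookup, while stored fields keep the
original value for return in results.

```python
def analyze(text):
    """Toy analysis chain: lowercase and split on non-alphanumerics (lossy)."""
    tokens, current = [], []
    for ch in text.lower():
        if ch.isalnum():
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


class ToyIndex:
    def __init__(self, schema):
        self.schema = schema  # field -> {"indexed": bool, "stored": bool}
        self.postings = {}    # (field, token) -> set of doc ids
        self.stored = {}      # doc id -> {field: original value}

    def add(self, doc_id, doc):
        for field, value in doc.items():
            opts = self.schema[field]
            if opts["indexed"]:
                for token in analyze(value):
                    self.postings.setdefault((field, token), set()).add(doc_id)
            if opts["stored"]:
                self.stored.setdefault(doc_id, {})[field] = value

    def search(self, field, word):
        """Look up via the indexed tokens, return only the stored fields."""
        ids = sorted(self.postings.get((field, word.lower()), set()))
        return [(doc_id, self.stored.get(doc_id, {})) for doc_id in ids]


schema = {
    "id": {"indexed": True, "stored": True},
    "text": {"indexed": True, "stored": False},  # searchable, not returned
}
index = ToyIndex(schema)
index.add("123", {"id": "123", "text": "hello solar!"})
hits = index.search("text", "solar")
print(hits)
```

The document is found through its "text" tokens, but only "id" comes back,
because "text" was indexed without being stored.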


-- Jack Krupansky

-Original Message- 
From: Giovanni Gherdovich

Sent: Saturday, June 30, 2012 8:57 AM
To: solr-user@lucene.apache.org
Subject: difference between stored="false" and stored="true" ?

Hi all,

when declaring a field in the schema.xml file you can
set the attributes 'indexed' and 'stored' to "true" or "false".

What is the difference between a field with stored="false"
and a field with stored="true"?

I guess understanding this would require me to have
a closer look at Lucene's index data structures;
what's the pointer to some doc I can read?

Cheers,
GGhh 



Re: how do I trash a whole index and start over?

2012-06-30 Thread Giovanni Gherdovich
2012/6/30 Dmitry Kan:
> Hello,
>
> The easiest way is to remove what's inside data/index directory; in case
> you have a spell-checker index, remove it as well. This requires solr
> instance restart.

thanks dmitry, I'll go for this solution.

cheers,
GGhh


difference between stored="false" and stored="true" ?

2012-06-30 Thread Giovanni Gherdovich
Hi all,

when declaring a field in the schema.xml file you can
set the attributes 'indexed' and 'stored' to "true" or "false".

What is the difference between a field with stored="false"
and a field with stored="true"?

I guess understanding this would require me to have
a closer look at Lucene's index data structures;
what's the pointer to some doc I can read?

Cheers,
GGhh


Re: how do I trash a whole index and start over?

2012-06-30 Thread Dmitry Kan
Hello,

The easiest way is to remove what's inside data/index directory; in case
you have a spell-checker index, remove it as well. This requires solr
instance restart.

Another way, without restarting the server, is to issue deleteByQuery over
http.
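
A minimal sketch of such a request in Python. The update URL
(http://localhost:8983/solr/update) is an assumption about the setup; the
request is only built and printed here, not sent.

```python
from urllib.request import Request


def build_delete_all_request(update_url):
    """Build a POST that deletes every document matching *:* and commits."""
    body = "<delete><query>*:*</query></delete>".encode("utf-8")
    return Request(
        update_url + "?commit=true",
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Assumed location of the update handler; adjust to the actual instance.
req = build_delete_all_request("http://localhost:8983/solr/update")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually send it, so try
it against a test index first.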

When you are done, you need to reindex your data so that the changes
in the new schema.xml take effect.

// Dmitry

On Sat, Jun 30, 2012 at 4:44 PM, Giovanni Gherdovich  wrote:

> Hi all,
>
> how do I trash a whole index and start over
> with a new fresh index of my corpus?
>
> I need that since I modified my schema.xml
> since my last indexing, and I'd like the changes
> to be taken into account.
>
> Cheers,
> Giovanni
>



-- 
Regards,

Dmitry Kan


documentation on the pragmatics behind the example schema.xml

2012-06-30 Thread Giovanni Gherdovich
Hi all,

in the example schema.xml I can find a wide variety
of fieldType and field, already there to be used.

I believe each of them has been designed for a specific
usage case, with some pragmatics in mind.

Where can I find documentation on what those field / fieldTypes
were designed for? Is the best place to get those info
the schema.xml file and its comments?

cheers,
GGhh

here I cut and paste fields and fieldTypes I have:

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
[...]
-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
[...]
-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8


Atomic Multicore Operations - E.G. Move Docs

2012-06-30 Thread Nicholas Ball

Hey all,

Trying to figure out the best way to perform an atomic operation across
multiple cores on the same Solr instance, i.e. a multi-core environment.

An example would be to move a set of docs from one core onto another core
and ensure that a soft commit is done at the exact same time. If one were to
fail, so would the other.
Obviously this would probably require some customization but wanted to
know what the best way to tackle this would be and where should I be
looking in the source.

Many thanks for the help in advance,
Nicholas a.k.a. incunix


how do I trash a whole index and start over?

2012-06-30 Thread Giovanni Gherdovich
Hi all,

how do I trash a whole index and start over
with a new fresh index of my corpus?

I need that since I modified my schema.xml
since my last indexing, and I'd like the changes
to be taken into account.

Cheers,
Giovanni


Re: how to retrieve a doc from its docID ?

2012-06-30 Thread Giovanni Gherdovich
Sascha:
> You should also make sure that the field definition (in schema.xml) for 'text'
> says stored="true", otherwise the field will not be returned.

I guess you're hitting my problem.
The field I want to search on is declared with stored="false" in
the schema.xml:

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

I guess I have to re-index all my corpus again,
after modifying that declaration in my schema.xml

(or choosing a different field in the example schema -- which one?)

Sascha:
> did you include the fl parameter in the Solr query URL?
> If that's the case make sure that the field name 'text' is mentioned there.

No, I am not using the "field list" (fl) param. Should I?

Jack:
> Don't try doing this with the "text" field of the Solr example schema,
> copied to the catchall field. [...] The catchall field is designed for
> indexing,
> not result display.

Uhm... I am using the field 'text', which is of fieldType 'text',
from the example schema. I chose it with no clue -- I guessed
what the fields for my docs could be, and picked that one.
Should I have made a different choice?

Are you saying that all fields then are "copied" into some "behind the scenes"
'text' field, and that is the real purpose of the following field
in the example schema?

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

Then, what is a better suited field from the pool of fields
that are available off-the-shelf in the example schema,
given that my goal is to make text searches into that field?

Jack:
> Rather, add the original source field(s)
> to "fl" that was/were copied to the catchall field.

If anything has been copied from a field to another,
this has happened beyond my intentions :-)
I picked up the 'text' field since I thought it was
good for text search. You're saying it isn't,
if I understand you correctly.

> But do make sure that "stored=true" for any field
> you want returned in search results.

ok noted.

cheers,
GGhh


Re: how do I search the archives for solr-user

2012-06-30 Thread Jack Krupansky
Just use a simple Google search for any Solr question using specific 
technical terms. Google will find the Solr archives as well as quite a few 
discussions on StackOverflow.


-- Jack Krupansky

-Original Message- 
From: Giovanni Gherdovich

Sent: Saturday, June 30, 2012 5:39 AM
To: solr-user@lucene.apache.org
Subject: how do I search the archives for solr-user

Hi all,

I am sure pretty much all of my questions have already
been answered.

Apart from using google with
"site:http://mail-archives.apache.org/mod_mbox/lucene-solr-user/",
how do I search the archives?

thanks,
Giovanni 



Re: querying thru solritas gives me zero results

2012-06-30 Thread Giovanni Gherdovich
2012/6/30 Erik Hatcher:
> Debugging this you can add &debugQuery=true&wt=xml to get
> the full classic Solr XML output that drives it all.

Thank you Erik, I'll see what I get from it.

cheers,
GGhh


Re: how to retrieve a doc from its docID ?

2012-06-30 Thread Jack Krupansky
Don't try doing this with the "text" field of the Solr example schema, which 
is a catchall field that is populated via CopyFields. Rather, add the 
original source field(s) to "fl" that was/were copied to the catchall field. 
The catchall field is designed for indexing, not result display.  But do 
make sure that "stored=true" for any field you want returned in search 
results.


-- Jack Krupansky

-Original Message- 
From: Sascha Szott

Sent: Saturday, June 30, 2012 6:39 AM
To: solr-user@lucene.apache.org
Subject: Re: how to retrieve a doc from its docID ?

Hi,

did you include the fl parameter in the Solr query URL? If that's the case 
make sure that the field name 'text' is mentioned there. You should also 
make sure that the field definition (in schema.xml) for 'text' says 
stored="true", otherwise the field will not be returned.


-Sascha



Giovanni Gherdovich wrote:

Hi all,

when querying my Solr instance, the answers I get
are the document IDs of my docs. Here is what one of my docs
looks like:

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --


hello solar!
123


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

here is the response if I query for "solar" :

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --



1.0
123


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

That is, Solr gives me the doc ID. How do I retrieve the doc's "text" field
given its ID?

cheers,
Giovanni



Re: more than one text corpus with solr?

2012-06-30 Thread Giovanni Gherdovich
2012/6/30 Afroz Ahmad:
> You can set up multiple cores, each core managing a different index.
> See http://wiki.apache.org/solr/CoreAdmin
>

thank you very much Ahmad for this hint.
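
With a multi-core setup, each core is usually addressed by putting its name in the request path. A minimal sketch of what the per-core query URLs might look like, assuming two hypothetical cores named corpus_a and corpus_b (these names, and the helper function, are illustrative and not from the thread):

```python
from urllib.parse import quote, urlencode


def core_select_url(base, core, query):
    """Build a /select URL for one named core (core names are hypothetical)."""
    return f"{base.rstrip('/')}/{quote(core)}/select?{urlencode({'q': query})}"


base = "http://my.server.com:8080/solr"
url_a = core_select_url(base, "corpus_a", "solar")
url_b = core_select_url(base, "corpus_b", "solar")
print(url_a)  # http://my.server.com:8080/solr/corpus_a/select?q=solar
print(url_b)  # http://my.server.com:8080/solr/corpus_b/select?q=solar
```

Each URL searches only the index managed by that core, so the two corpora stay independent.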

cheers,
Giovanni


Re: querying thru solritas gives me zero results

2012-06-30 Thread Giovanni Gherdovich
Hello Sascha,

Sascha:
> Solritas uses the dismax query parser.
> The dismax config parameter 'qf' specifies
> the index fields to be searched in.
> Make sure that 'name' is your default search field.

I am not sure I understand this; I have no field named 'name'.
My documents are like

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --
<add>
   <doc>
   <field name="text">hello solar!</field>
   <field name="id">123</field>
   </doc>
</add>
-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

so my understanding here is that they have two fields,
whose 'name' attributes are 'text' and 'id'. My intent
is to make searches over the 'text' field, so maybe this
is the default value for the 'qf' parameter I need to set up?

I am asking since actually my current default for 'qf'
in solritas __is__ 'name', from my solrconfig.xml:

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
 
   
 [...]
 name
   
 
-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

and now that you make me look closely at that,
such a default makes little sense to me
(there is no field named 'name', so how could that work
as a default for the "query fields" (qf)?)

Here are the details of those fieldTypes from my schema.xml:

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8


-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8


To summarize:

1) I don't understand what it means to default 'qf' to 'name' in
disMax. I have no field named 'name'.
2) From what I understand, my 'qf' value for disMax should default to
'text', the name of the field I care about.

correct?
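
One way to test point 2 without editing solrconfig.xml might be to pass 'qf' on the request itself. This is a minimal sketch, assuming 'qf' is listed among the handler's defaults (so a request parameter takes precedence); the helper name is hypothetical:

```python
from urllib.parse import urlencode


def itas_url(base, query, qf):
    """Build an /itas URL that overrides the handler's default 'qf'."""
    params = [("q", query), ("qf", qf)]
    return base.rstrip("/") + "/itas?" + urlencode(params)


url = itas_url("http://my.server.com:8080/solr", "foobar", "text")
print(url)  # http://my.server.com:8080/solr/itas?q=foobar&qf=text
```

If this URL returns results while the plain /itas?q=foobar does not, the 'name' default for 'qf' is the likely culprit.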

cheers,
Giovanni


Re: Using custom user-defined caches to store user app data while indexing

2012-06-30 Thread Dmitry Kan
Hello!

If you implement the SolrCoreAware interface in your custom
UpdateRequestProcessorFactory, you could then access your cache via the
SolrCore in the inform method, I think. I haven't tried it myself, but it
looks like a logical place to start.
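
The load-once idea behind this can be sketched independently of Solr. The snippet below is an illustration in Python, not Solr plugin code: PatternCache and load_from_database are hypothetical names, and the loader merely stands in for the MySQL query. In a real UpdateRequestProcessorFactory the equivalent loading step would presumably run once (for example from the inform method) rather than once per document.

```python
import re


class PatternCache:
    """Load regex patterns once and reuse the compiled objects afterwards."""

    def __init__(self, loader):
        self._loader = loader      # callable returning an iterable of pattern strings
        self._compiled = None      # filled on first use

    def patterns(self):
        # Compile on first access only; later calls reuse the same list.
        if self._compiled is None:
            self._compiled = [re.compile(p) for p in self._loader()]
        return self._compiled

    def matches(self, value):
        """Return the source strings of all patterns found in value."""
        return [p.pattern for p in self.patterns() if p.search(value)]


load_calls = []


def load_from_database():
    # Stand-in for the real database query; records how often it is invoked.
    load_calls.append(1)
    return [r"^\d{4}-\d{2}-\d{2}$", r"solr"]


cache = PatternCache(load_from_database)
for field_value in ["2012-06-30", "hello solr", "nothing here"]:
    print(field_value, cache.matches(field_value))
print("database loads:", len(load_calls))
```

This sketch is not thread-safe; concurrent indexing threads would need the loading step guarded or done eagerly at startup.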

// Dmitry

On Fri, Jun 29, 2012 at 4:44 PM, Iana Atanassova
wrote:

> Hi,
>
> I'm trying to implement a custom UpdateRequestProcessorFactory class that
> works with the XSLT Request handler for indexing.
> My UpdateRequestProcessorFactory has to examine some of the document fields
> and compare them against some regular expressions that are stored in an
> external MySQL database.
> Currently, my UpdateRequestProcessorFactory works by establishing a
> connection to the database and then retrieving the regular expressions for
> every new document that needs to be indexed.
>
> However, I would like to speed up this processing and store the regular
> expressions in memory. I tried to define a new user cache in solrconfig.xml
> (http://wiki.apache.org/solr/SolrCaching#User.2BAC8-Generic_Caches). As far
> as I understand, these caches can be used to store any user application
> data. But when I implement the UpdateRequestProcessorFactory, I cannot
> manage to access this cache.
>
> What would be the method to read/write into a user-defined Solr cache while
> indexing? How can I access the current SolrIndexSearcher from my code? Are
> there any other solutions that I should look at?
>
> Thanks!
>
> Iana
>



-- 
Regards,

Dmitry Kan


Re: querying thru solritas gives me zero results

2012-06-30 Thread Erik Hatcher
To debug this, you can add &debugQuery=true&wt=xml to get the full classic Solr 
XML output that drives it all. 
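
A minimal sketch of building such a debug URL, assuming the /itas handler and host from the original message (the helper name is hypothetical, and Python is used only for illustration):

```python
from urllib.parse import urlencode


def debug_url(handler_url, query):
    """Append debugQuery=true and wt=xml to a query against the given handler."""
    params = [("q", query), ("debugQuery", "true"), ("wt", "xml")]
    return handler_url + "?" + urlencode(params)


url = debug_url("http://my.server.com:8080/solr/itas", "foobar")
print(url)  # http://my.server.com:8080/solr/itas?q=foobar&debugQuery=true&wt=xml
```

The debug section of the XML response should show how the query was parsed, including which fields it was run against.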

Erik

On Jun 30, 2012, at 7:36, Giovanni Gherdovich  wrote:

> Hi all,
> 
> this morning I was very proud of myself since I managed
> to set up solritas ( http://wiki.apache.org/solr/VelocityResponseWriter )
> for the solr instance on my server (ubuntu natty).
> 
> This joy lasted only half a minute, since the only query
> that gets more than zero results with solritas is the catchall "*:*"
> 
> for example:
> http://my.server.com:8080/solr/select/?q=foobar has thousands of results,
> http://my.server.com:8080/solr/itas?q=foobar has none
> 
> Here are the standard and "velocity" request handlers from my solrconfig.xml:
> 
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
>  
> 
>   explicit
> 
>  
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
> 
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
>   class="org.apache.solr.request.VelocityResponseWriter"/>
>  
>
>  velocity
>  browse
>  Solr cookbook example
>  dismax
>  *:*
>  10
>  *,score
>  name
>
>  
> -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
> 
> any hint on how I can debug that?
> 
> cheers,
> Giovanni


Re: querying thru solritas gives me zero results

2012-06-30 Thread Sascha Szott
Hi,

Solritas uses the dismax query parser. The dismax config parameter 'qf' 
specifies the index fields to be searched in. Make sure that 'name' is your 
default search field.

-Sascha




Giovanni Gherdovich wrote:

Hi all,

this morning I was very proud of myself since I managed
to set up solritas ( http://wiki.apache.org/solr/VelocityResponseWriter )
for the solr instance on my server (ubuntu natty).

This joy lasted only half a minute, since the only query
that gets more than zero results with solritas is the catchall "*:*"

for example:
http://my.server.com:8080/solr/select/?q=foobar has thousands of results,
http://my.server.com:8080/solr/itas?q=foobar has none

Here are the standard and "velocity" request handlers from my solrconfig.xml:

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8


explicit


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8



velocity
browse
Solr cookbook example
dismax
*:*
10
*,score
name


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8

any hint on how I can debug that?

cheers,
Giovanni



Re: how to retrieve a doc from its docID ?

2012-06-30 Thread Sascha Szott
Hi,

did you include the fl parameter in the Solr query URL? If that's the case make 
sure that the field name 'text' is mentioned there. You should also make sure 
that the field definition (in schema.xml) for 'text' says stored="true", 
otherwise the field will not be returned.
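
A minimal sketch of a query URL that asks for the 'text' field explicitly, assuming the field is stored and using the host from the other threads (the helper name is hypothetical, and Python is used only for illustration):

```python
from urllib.parse import urlencode


def select_url(base, query, fields):
    """Build a /select URL whose 'fl' parameter lists the fields to return."""
    params = [("q", query), ("fl", ",".join(fields))]
    return base.rstrip("/") + "/select?" + urlencode(params)


url = select_url("http://my.server.com:8080/solr", "solar", ["id", "text", "score"])
print(url)  # http://my.server.com:8080/solr/select?q=solar&fl=id%2Ctext%2Cscore
```

The %2C sequences are URL-encoded commas, so the server receives fl=id,text,score.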

-Sascha



Giovanni Gherdovich wrote:

Hi all,

when querying my Solr instance, the answers I get
are the document IDs of my docs. Here is what one of my docs
looks like:

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --


hello solar!
123


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

here is the response if I query for "solar" :

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --



1.0
123


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

That is, Solr gives me the doc ID. How do I retrieve the doc's "text" field
given its ID?

cheers,
Giovanni



querying thru solritas gives me zero results

2012-06-30 Thread Giovanni Gherdovich
Hi all,

this morning I was very proud of myself since I managed
to set up solritas ( http://wiki.apache.org/solr/VelocityResponseWriter )
for the solr instance on my server (ubuntu natty).

This joy lasted only half a minute, since the only query
that gets more than zero results with solritas is the catchall "*:*"

for example:
http://my.server.com:8080/solr/select/?q=foobar has thousands of results,
http://my.server.com:8080/solr/itas?q=foobar has none

Here are the standard and "velocity" request handlers from my solrconfig.xml:

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
  
 
   explicit
 
  
-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8
  
  

  velocity
  browse
  Solr cookbook example
  dismax
  *:*
  10
  *,score
  name

  
-- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8  -- -- >8

any hint on how I can debug that?

cheers,
Giovanni


how to retrieve a doc from its docID ?

2012-06-30 Thread Giovanni Gherdovich
Hi all,

when querying my Solr instance, the answers I get
are the document IDs of my docs. Here is what one of my docs
looks like:

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --


hello solar!
123


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

here is the response if I query for "solar" :

-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --



1.0
123


-- -- >8 -- -- >8 -- -- >8 -- -- >8 -- -- >8 -- --

That is, Solr gives me the doc ID. How do I retrieve the doc's "text" field
given its ID?

cheers,
Giovanni


more than one text corpus with solr?

2012-06-30 Thread Giovanni Gherdovich
Hi all,

I am experimenting with Solr, and I need to
index more than one corpus and search each of them
independently.

is it possible to have this setup?
Several independent indices all managed by the same solr instance?

cheers,
Giovanni