Text formatting lost

2009-12-15 Thread Mike Aymard

Hi,

I'm a newbie and have a question about the text that is stored and then 
returned from a query. The field in question is of type "text", and is indexed 
and stored. The original text included various blank lines (line feeds), but 
when the text field is returned as the result of a query, all of the blank 
lines and extra spaces have been removed. Since I am storing the content for 
the purpose of displaying it, I need the original format to be preserved. Is 
this possible? I tried changing it to indexed="false" and using a copyField to 
copy it to the general text field for indexing, but this didn't help.

Thanks!
Mike
  

Re: using q= , adding fq=

2009-12-15 Thread Chris Hostetter

: > 1) adding something like: q=cat_id:xxx&fq=geo_id:yyy would boost
: > performance?
: 
: 
: For the n > 1 query, yes, adding filters should improve performance 
: assuming it is selective enough.  The tradeoff is memory.

You might even find that something like this is faster...

   q=*:*&fq=cat_id:xxx&fq=geo_id:yyy

...but it can vary based on circumstances (depends a lot on how many 
unique cat_id and geo_id values you have, and how big each of those sets is, 
and how big you make your filterCache)

: > 2) we do find problems when we ask for a page=large offset!  ie: 
: > q=cat_id:xxx and geo_id:yyy&start=544545
: > (note that we limit docs to 50 max per resultset).
: > When start is 500 or more, Qtime is >=5 seconds while the avg qtime is
: > <100 ms

FWIW: limiting the number of rows per request to 50, but not limiting the 
start, doesn't make much sense -- the same amount of work is needed to 
handle start=0&rows=5050 and start=5000&rows=50.

There are very few use cases for allowing people to iterate through all 
the rows that also require sorting.
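
One practical way to act on that advice (a client-side sketch, not something from this thread; the cap value is an arbitrary assumption you would tune) is to clamp the offset before the request ever reaches Solr:

```java
public class PagingGuard {
    // Arbitrary cap; tune to what your UI actually needs.
    static final int MAX_START = 1000;

    // Clamp the requested offset so a single request cannot force Solr
    // to collect an arbitrarily large sorted result window.
    public static int clampStart(int requestedStart) {
        return Math.min(Math.max(requestedStart, 0), MAX_START);
    }
}
```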


-Hoss



RE: SolrPlugin Guidance

2009-12-15 Thread Chris Hostetter

: Our QParser plugin will perform queries against directory documents and
: return any file document that has the matching directory id(s).  So the
: plugin transforms the query to something like 
: 
: q:+(directory_id:4 directory:10) +directory_id:(4)
...
: Currently the parser plugin is doing the lookup queries via the standard
: request handler.  The problem with this approach is that the look up
: queries are going to be analyzed twice.  This only seems to be a problem

...you lost me there.  If you are taking part of the query, and using it 
to get directory ids, and then using those directory ids to build a new 
query, why are you ever passing the output from one query parser to 
another query parser?

You take the input string, you let the LuceneQParser parse it and use it 
to search against "Directory" documents, and then you iterate over the 
results and get an ID from them.  You should be using those IDs directly 
to build your new query.

Honestly: even if you were using those ids to build a query string, and 
then passing that string to the analyzer, I don't see why stemming would 
cause any problems for you if the ids are numbers (like in your example)
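
To make that concrete, here is a rough sketch (the class and field names are assumed from the example above, not from the original plugin) of building the clause directly from the collected IDs, so nothing gets re-analyzed:

```java
import java.util.List;

public class DirectoryQueryBuilder {

    // Build a "+directory_id:(4 10)" style clause straight from the IDs
    // returned by the Directory search; no second query parser involved.
    public static String buildClause(String field, List<String> ids) {
        StringBuilder sb = new StringBuilder("+").append(field).append(":(");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(ids.get(i));
        }
        return sb.append(')').toString();
    }
}
```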

-Hoss



Re: Using facets to narrow results with multiword field

2009-12-15 Thread Chris Hostetter
: 
: I'm using facet.field=lbrand and do get good results, e.g. Geomax, GeoMax,
: GEOMAX all fall into "geomax". But when I'm filtering I do get
: strange results:
: 
: brand:geomax  gives numFound="0"
: lbrand:geomax  gives numFound="57" (GEOMAX, GeoMag, Geomag)
: 
: How should I redefine brand to let narrow work correctly?

I'm not sure I understand what it is that isn't working for you ... if you 
are faceting on "lbrand" then you should filter on "lbrand" as well ... 
your query for "brand:geomax" is probably failing because you don't 
actually have "geomax" as a value for any doc -- which is what you should 
expect, since you didn't use a LowercaseFilter.

correct?
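
For reference, a schema.xml sketch of a lowercased, untokenized field type that makes faceting and filtering behave the same way (the type name here is an assumption; the original schema wasn't shown):

```xml
<fieldType name="string_lc" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="lbrand" type="string_lc" indexed="true" stored="true"/>
```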




-Hoss



RE: Can solr web site have multiple versions of online API doc?

2009-12-15 Thread Teruhiko Kurosaka
Israel,

> If you downloaded the 1.3.0 release, you should find a "docs" 
> folder inside the zip file.
> 
> This contains the javadoc for that particular release.
> 
> You may also re-download a 1.3.0 release to get the docs for Solr 1.3.

This doesn't solve my problem.  I can't write my javadoc comments
referencing a Solr API doc located on my local hard drive.
The Solr API doc needs to be available on the Internet.
Various versions of the J2SE (JDK) API doc and the Lucene API doc
are available online at well-defined URLs.  I'd like to have the
Solr API docs available in a similar manner.

Kuro


Re: Filter exclusion on query facets?

2009-12-15 Thread Uri Boness
Yes, you can tag filters using the new local params format and then 
explicitly exclude them when providing the facet fields. See:
http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
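
For example (field and tag names invented for illustration), a request that filters on a tagged fq but excludes that filter when computing the facet counts for that same field looks like:

```
...?q=*:*&fq={!tag=brandTag}brand:acme&facet=true&facet.field={!ex=brandTag}brand
```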


Cheers,
Uri

Mat Brown wrote:

Hi all,

Just wondering if it's possible to do filter exclusion (i.e.,
multiselect faceting) on query facets in Solr 1.4?

Thanks!
Mat

  


store content only of documents

2009-12-15 Thread javaxmlsoapdev

I store documents in a "content" field, defined as follows in schema.xml


and following in solrconfig.xml


  content
  content

  

I want to store only the content in this field, but it stores other metadata
of the document too, e.g. "Author", "timestamp", "document type", etc. How can
I ask Solr to store only the body of the document in this field and not the
other metadata?

Thanks,

-- 
View this message in context: 
http://old.nabble.com/store-content-only-of-documents-tp26803101p26803101.html
Sent from the Solr - User mailing list archive at Nabble.com.



Solr client query vs Solr search query

2009-12-15 Thread insaneyogi3008

Hello,

I have a question about building a Solr query. On the Solr
server running on a Linux box, the query that returns results is as follows:

http://ncbu-cam35-2:17003/apache-solr-1.4.0/profile/select/?q=Bangalore&version=2.2&start=0&rows=10&indent=on


However, when I try to access the same Solr server from a webapp on Tomcat,
if I print out the query it comes out as:

http://ncbu-cam35-2:17003/apache-solr-1.4.0/profile?q=bangalore&qt=/profile&rows=100&wt=javabin&version=1

Note that the second query is missing the "select" clause, among other things.
This one does not return any results.

Am I building my query wrong in my client? Could somebody
show me the way?

With Regards
Sri
-- 
View this message in context: 
http://old.nabble.com/Solr-client-query-vs-Solr-search-query-tp26802634p26802634.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: wildcard oddity

2009-12-15 Thread Erick Erickson
Do you get the same behavior if you search for "gang" instead of "gets"?
I'm wondering if there's something going on with stemEnglishPossessive.

According to the docs you *should* be OK since you set
stemEnglishPossessive=0,
but this would help point in the right direction.

Also, am I correct in assuming that that is the analyzer both for indexing
AND searching?

Best
Erick

On Tue, Dec 15, 2009 at 3:30 PM, Joe Calderon wrote:

> I'm trying to do a wild card search:
>
> "q":"item_title:(gets*)" returns no results
> "q":"item_title:(gets)" returns results
> "q":"item_title:(get*)" returns results
>
>
> It seems like * at the end of a token is requiring a character; instead
> of being 0 or more, it's acting like 1 or more.
>
> The text I'm trying to match is "The Gang Gets Extreme: Home Makeover
> Edition"
>
> the field uses the following analyzers
>
> positionIncrementGap="100" omitNorms="false">
>  
>
>
>
>
> generateWordParts="1" generateNumberParts="0" catenateAll="1"
> splitOnNumerics="0" splitOnCaseChange="0" stemEnglishPossessive="0" />
>  
>
>
>
> is anybody else having similar problems?
>
>
> best,
> --joe
>


Re: Log of zero result searches

2009-12-15 Thread stuart yeates

Chris Hostetter wrote:

See Also:  http://en.wikipedia.org/wiki/Thread_hijacking


You may want to update that link, since that wikipedia page has been 
deleted for some time.


cheers
stuart
--
Stuart Yeates
http://www.nzetc.org/   New Zealand Electronic Text Centre
http://researcharchive.vuw.ac.nz/ Institutional Repository


Re: facet.field problem in SolrParams to NamedList

2009-12-15 Thread Chris Hostetter

: E.g.: "q=something field:value" becomes "q=something value&fq=field:value"
: 
: To do this, in the createParser method, I apply a regular expression
: to the qstr param to obtain the fq part, and then I do the following:
: 
: NamedList paramsList = params.toNamedList();
: paramsList.add(CommonParams.FQ, generatedFilterQuery);
: params = SolrParams.toSolrParams(paramsList);
: req.setParams(params);
...
: SolrParams.toNamedList() was saving the array correctly, but the method
: SolrParams.toSolrParams(NamedList) was doing:
: "params.getVal(i).toString()". So, it always loses the array.

I'm having trouble thinking through exactly where the problem is being 
introduced here ... ultimately what it comes down to is that the NamedList 
shouldn't be containing a String[] ... it should be containing multiple 
string values with the same name ("fq")

It would be good to make sure all of these methods play nicely with one 
another so some round trip conversions worked as expected -- so if you 
could open a bug for this with a simple example test case that would be 
great, ...but...

for your purposes, I would skip the NamedList conversion altogether, 
and just use AppendedSolrParams...

  Map<String,String> map = new HashMap<String,String>();
  map.put("fq", generatedFilterQuery);
  map.put("q", generatedQueryString);
  MapSolrParams myNewParams = new MapSolrParams(map);
  req.setParams(new AppendedSolrParams(myNewParams, originalParams));

-Hoss



RE: Request Assistance with DIH

2009-12-15 Thread Turner, Robbin J
Thanks for the reply, just what I was looking for in an answer.  I am running 
under Tomcat 6 on Solaris 10; the person that replied before you looks like 
they're running under Jetty.  I have configured the JNDI context.  I stop and 
start Tomcat using the Solaris SMF, equivalent to services in Linux.  But my 
cwd is pointing to root; I have solr home specified in 
Catalina/localhost/solr.xml.  Is there anything else that I can do to force 
cwd to point to solr/home?

Thanks again
Robbin

-Original Message-
From: Ken Lane (kenlane) [mailto:kenl...@cisco.com] 
Sent: Monday, December 14, 2009 11:04 AM
To: solr-user@lucene.apache.org
Subject: RE: Request Assistance with DIH

Hi Robbin,

I just went through this myself (I am a newbie).

The key things to look at are: 

1. Your data_config.xml. I created a table called 'foo' and an 
ora_data_config.xml file with a simple example to get it working that looks 
like this:


  
  


  


Some gotchas: 
If your Oracle DB is configured with a Service_name rather than a SID (i.e.
you may be running failover, RAC, etc.), the url parameter of the JDBC
connection can read like this:

  url="jdbc:oracle:thin:@(DESCRIPTION = (LOAD_BALANCE = on) 
(FAILOVER = on) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 
.cisco.com)(PORT = 1528))) (CONNECT_DATA = (SERVICE_NAME 
= ..COM)))"

2. In your solrconfig.xml file, add something like this to reference the
above listed file:

  
  
ora-data-config.xml
  
  

3. I have Solr1.4 running under Tomcat 6. It looks like you are trying the 
jetty example, but pay mind to getting the "cwd" pointing to your solr home by 
setting your JNDI path as described in the dataimporthandler wiki.

4. When it blows up, as it did numerous times for me until I got it right, 
check the logs. As I am running under Tomcat, I was able to check 
\logs\catalina.2009-12-14.log to view DIH errors both upon restart 
of Tomcat and after running the DIH.


5. There are some tools to check your JDBC connection you might try before 
pulling too much of your hair out. Try here:
http://otn.oracle.com/sample_code/tech/java/sqlj_jdbc/content.html

Good Luck!
Ken

-Original Message-
From: Turner, Robbin J [mailto:robbin.j.tur...@boeing.com]
Sent: Monday, December 14, 2009 10:27 AM
To: solr-user@lucene.apache.org
Subject: RE: Request Assistance with DIH

How does this help answer my question?  I am trying to use the 
DataImportHandler Development Console.  The URL you suggest assumes I had it 
working already.  

Looking at my logs and the response to the Development console, it does not 
appear that the connection to Oracle is being made.

So if someone could offer some configuration/connection setup directions I 
would very much appreciate it.

Thanks
Robbin 

-Original Message-
From: Joel Nylund [mailto:jnyl...@yahoo.com]
Sent: Friday, December 11, 2009 8:26 PM
To: solr-user@lucene.apache.org
Subject: Re: Request Assistance with DIH

add ?command=full-import to your url

http://localhost:8983/solr/dataimport?command=full-import

thanks
Joel

On Dec 11, 2009, at 7:45 PM, Robbin wrote:

> I've been trying to use the DIH with oracle and would love it if 
> someone could give me some pointers.  I put the ojdbc14.jar in both 
> the Tomcat lib and /lib.  I created a dataimport.xml and 
> enabled it in the solrconfig.xml.  I go to the http:/// 
> solr/admin/dataimport.jsp.  This all seems to be fine, but I get the 
> default page response and doesn't look like the connection to the 
> oracle server is even attempted.
>
> I'm using the Solr 1.4 release on Nov 10.
> Do I need an oracle client on the server?  I thought having the ojdbc 
> jar should be sufficient.  Any help or configuration examples for 
> setting this up would be much appreciated.
>
> Thanks
> Robbin



Re: Log of zero result searches

2009-12-15 Thread Chris Hostetter

: Subject: Log of zero result searches
: References: <26747482.p...@talk.nabble.com> <26748588.p...@talk.nabble.com>
:  <359a9283091203m73b4dc9ya51aa97e460b3...@mail.gmail.com>
:  <26756663.p...@talk.nabble.com> <26776651.p...@talk.nabble.com>
:  <359a92830912141657r79881e4bg3a4370d81ea7e...@mail.gmail.com>
: In-Reply-To: <359a92830912141657r79881e4bg3a4370d81ea7e...@mail.gmail.com>


http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/Thread_hijacking




-Hoss



Re: Can solr web site have multiple versions of online API doc?

2009-12-15 Thread Israel Ekpo
2009/12/15 Teruhiko Kurosaka 

> Lucene keeps multiple versions of its API doc online at
> http://lucene.apache.org/java/X_Y_Z/api/all/index.html
> for version X.Y.Z.  I am finding this very useful when
> comparing different versions.  This is also good because
> the javadoc comments that I write for my software can
> reference the API comments of the exact version of
> Lucene that I am using.
>
> At Solr site, I can only find the API doc of the trunk
> build.  I cannot find 1.3.0 API doc, for example.
>
> Can Solr site also maintain the API docs for the past
> stable versions ?
>
> -kuro


Hi Teruhiko

If you downloaded the 1.3.0 release, you should find a "docs" folder inside
the zip file.

This contains the javadoc for that particular release.

You may also re-download a 1.3.0 release to get the docs for Solr 1.3.

I hope this helps.

-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


Re: Reverse sort facet query

2009-12-15 Thread Chris Hostetter

: Does anyone know of a good way to perform a reverse-sorted facet query (i.e. 
rarest first)?

I'm fairly confident that code doesn't exist at the moment.  

If I remember correctly, it would be fairly simple to implement if you'd 
like to submit a patch:  when sorting by count a simple bounded priority 
queue is used, so we'd just have to change the comparator.  If you're 
interested in working on a patch it should be in SimpleFacets.java.  I 
think the queue is called "BoundedTreeSet"


(that's a pretty novel request actually ... I don't remember anyone else 
ever asking for anything like this before ... can you describe your use 
case a bit -- I'm curious as to how/when you would use this data)



-Hoss



Re: Converting java date to solr date and querying dates

2009-12-15 Thread Chris Hostetter

:   I want to store dates into a date field called publish date in Solr.
: How do we do it using SolrJ?

I'm pretty sure that when indexing docs, you can add Date objects directly 
to the SolrInputDocument as field values -- but I'm not 100% certain (I 
don't use SolrJ much)

:   Likewise, how do we query from Solr using a Java date? Do we always have
: to convert it into UTC format and then query it?

All of the query APIs are based on query strings -- so yes, you need to 
construct the query string on your client side, and yes, that includes 
formatting in UTC.
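
As an illustration of that client-side formatting (a plain-JDK sketch; Solr's DateField expects ISO-8601 in UTC, e.g. 1995-12-31T23:59:59Z):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDates {

    // Format a java.util.Date the way Solr's DateField expects it:
    // ISO-8601, second precision, always in UTC, with a trailing 'Z'.
    public static String toSolrDate(Date d) {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(d);
    }
}
```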

:   How do i query solr for documents published on monday or for documents
: published on March etc.

If you mean "march of any year" or "any monday ever" then there isn't any 
built-in support for anything like that ... your best bet would either be 
to add "month_of_year" and "day_of_week" fields and populate them in your 
client code, or write an UpdateProcessor to run in Solr (that could be 
pretty generic if you want to contribute it back; other people could find 
it useful)

If you mean "published in the most recent march" or "published on the 
most recent monday", where you don't have to change anything to have the 
query "do what I mean" as time moves on, then you'd either need to do that 
when building up your query, or write it as a QParser plugin.

:   or in that case even apply range queries on it??

basic range queries are easy...

http://wiki.apache.org/solr/SolrQuerySyntax
http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html
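
For instance (assuming a date field named publish_date, which is not from the original schema), a range query can use either explicit UTC timestamps or Solr's date math:

```
publish_date:[2009-01-01T00:00:00Z TO 2009-12-31T23:59:59Z]
publish_date:[NOW-1MONTH TO NOW]
```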



-Hoss



Filter exclusion on query facets?

2009-12-15 Thread Mat Brown
Hi all,

Just wondering if it's possible to do filter exclusion (i.e.,
multiselect faceting) on query facets in Solr 1.4?

Thanks!
Mat


Re: Spellchecking - Is there a way to do this?

2009-12-15 Thread Chris Hostetter

: My first problem appears because I need suggestions inclusive when the
: expression has returned results. It's seems that only appear
: suggestions when there are no results. Is there a way to do so?

can you give us an example of what your queries look like?  with the 
example configs, i can get matches, as well as suggestions...


http://localhost:8983/solr/spell?q=ide&spellcheck=true

: The second question is: For the purposes that I've mentioned, is the
: best way to use spellchecker or mlt component? Or some other (as a
: fuzzy query)?

there's no clear-cut answer to that -- I don't remember anyone else ever 
asking about anything particularly similar to what you're doing, so I 
don't know that there is any precedent for a "best" way to go about it.



-Hoss



Re: Concurrent Merge Scheduler & MaxThread Count

2009-12-15 Thread Chris Hostetter

: I'm having trouble getting Solr to use more than one thread during index 
: optimizations.  I have the following in my solrconfig.xml:
:  
: 6
: 

How many segments do you have?

I'm not an expert on segment merging, but I'm pretty sure the number of 
threads it will use is limited based on the number of segments -- so even 
though you say "use up to 8", it only uses one if that's all that it can 
use.



-Hoss



wildcard oddity

2009-12-15 Thread Joe Calderon
I'm trying to do a wild card search:

"q":"item_title:(gets*)" returns no results
"q":"item_title:(gets)" returns results
"q":"item_title:(get*)" returns results


It seems like * at the end of a token is requiring a character; instead
of being 0 or more, it's acting like 1 or more.

The text I'm trying to match is "The Gang Gets Extreme: Home Makeover Edition"

the field uses the following analyzers


  





  



is anybody else having similar problems?


best,
--joe


synonyms

2009-12-15 Thread Peter A. Kirk
Hi



It appears that Solr reads a synonym list at startup from a text file.

Is it possible to alter this behaviour so that Solr obtains the synonym list 
from a database instead?



Thanks,

Peter



Re: Using lucenes custom filters in solr

2009-12-15 Thread AHMET ARSLAN

> Hi All,
> 
>       I have a custom filter for Lucene. Can
> anyone help me with how to use this
> in Solr?

http://wiki.apache.org/solr/SolrPlugins#Tokenizer_and_TokenFilter
http://wiki.apache.org/solr/SolrPlugins#Analyzer
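
A minimal sketch of the wiring (the factory class and package names below are made up for illustration): write a TokenFilterFactory that wraps your Lucene filter, drop the jar in Solr's lib directory, and reference the factory from an analyzer chain in schema.xml:

```xml
<fieldType name="text_custom" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- com.example.MyFilterFactory is a hypothetical factory that
         returns an instance of your custom Lucene TokenFilter -->
    <filter class="com.example.MyFilterFactory"/>
  </analyzer>
</fieldType>
```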





Can solr web site have multiple versions of online API doc?

2009-12-15 Thread Teruhiko Kurosaka
Lucene keeps multiple versions of its API doc online at
http://lucene.apache.org/java/X_Y_Z/api/all/index.html
for version X.Y.Z.  I am finding this very useful when 
comparing different versions.  This is also good because
the javadoc comments that I write for my software can
reference the API comments of the exact version of
Lucene that I am using.

At Solr site, I can only find the API doc of the trunk
build.  I cannot find 1.3.0 API doc, for example.

Can Solr site also maintain the API docs for the past
stable versions ?

-kuro 

Re: Using lucenes custom filters in solr

2009-12-15 Thread pavan kumar donepudi
Hi All,

  I have a custom filter for Lucene. Can anyone help me with how to use this
in Solr?

Thanks in advance,
Pavan


facet.field problem in SolrParams to NamedList

2009-12-15 Thread Nestor Oviedo
Hi!
I wrote a subclass of DisMaxQParserPlugin to add a little filter for
processing the "q" param and generating an "fq" param.
E.g.: "q=something field:value" becomes "q=something value&fq=field:value"

To do this, in the createParser method, I apply a regular expression
to the qstr param to obtain the fq part, and then I do the following:

NamedList paramsList = params.toNamedList();
paramsList.add(CommonParams.FQ, generatedFilterQuery);
params = SolrParams.toSolrParams(paramsList);
req.setParams(params);

The problem is when I include two "facet.field" in the request. In the
results (facets section) it prints "[Ljava.lang.String;@c77a748",
which is the result of a toString() over an String[] .

So, getting a little deeper into the code, I saw that the method
SolrParams.toNamedList() was saving the array correctly, but the method
SolrParams.toSolrParams(NamedList) was doing
"params.getVal(i).toString()". So, it always loses the array.

Something similar occurs with the methods SolrParams.toMap() and
SolrParams.toMultiMap().

Is this a bug ?

thanks.
Nestor


Re: Exception from Spellchecker

2009-12-15 Thread Sascha Szott

Hi Rafael,

Rafael Pappert wrote:

I'm trying to enable the spellchecker in my Solr 1.4.0 (running with Tomcat 6
on Debian).
But I always get the following exception when I try to open
http://localhost:8080/spell?:


The spellcheck=true pair is missing in your request. Try

http://localhost:8080/spell?q=&spellcheck=true

-Sascha



Exception from Spellchecker

2009-12-15 Thread Rafael Pappert
Hello List,

I'm trying to enable the spellchecker in my Solr 1.4.0 (running with Tomcat 6
on Debian).
But I always get the following exception when I try to open
http://localhost:8080/spell?:


HTTP Status 500 - null java.lang.NullPointerException at 
java.io.StringReader.<init>(StringReader.java:33) at 
org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:197) at 
org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:78) at 
org.apache.solr.search.QParser.getQuery(QParser.java:131) at 
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:89)
 at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
 at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) 
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) 
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) 
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849) 
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454) at 
java.lang.Thread.run(Thread.java:619)

My Configuration looks like this:

solrconfig.xml



  textSpell

a_spell
a_spell
true
false
./spellchecker
  





  
  false
  
  false
  
  1


  spellcheck



schema.xml


   
 
 
 
 
 
 
  
  
 
  
 
 
 
 
  

 
   ..



I don't know what's wrong with the given configuration, and the exception is
not really clear ;)
Can somebody give me a hint? Thank you in anticipation.

Best regards,
Rafael.



Re: Document model suggestion

2009-12-15 Thread caman

Erick,
I know what you mean. 
I wonder if it is actually cleaner to keep the authorization model out of
the Solr index and filter the data on the client side based on the user's
access rights.
Thanks all for the help.



Erick Erickson wrote:
> 
> Yes, that should work. One hard part is what happens if your
> authorization model has groups, especially when membership
> in those groups changes. Then you have to go in and update
> all the affected docs.
> 
> FWIW
> Erick
> 
> On Tue, Dec 15, 2009 at 12:24 PM, caman
> wrote:
> 
>>
>> Shalin,
>>
>> Thanks. much appreciated.
>> Question about:
>>  "That is usually what people do. The hard part is when some documents
>> are
>> shared across multiple users. "
>>
>> What do you recommend when documents has to be shared across multiple
>> users?
>> Can't I just multivalue a field with all the users who has access to the
>> document?
>>
>>
>> thanks
>>
>> Shalin Shekhar Mangar wrote:
>> >
>> > On Tue, Dec 15, 2009 at 7:26 AM, caman
>> > wrote:
>> >
>> >>
>> >> Appreciate any guidance here please. Have a master-child table between
>> >> two
>> >> tables 'TA' and 'TB' where form is the master table. Any row in TA can
>> >> have
>> >> multiple row in TB.
>> >> e.g. row in TA
>> >>
>> >> id---name
>> >> 1---tweets
>> >>
>> >> TB:
>> >> id|ta_id|field0|field1|field2.|field20|created_by
>> >> 1|1|value1|value2|value2.|value20|User1
>> >>
>> >> 
>> >
>> >>
>> >> This works fine and index the data.But all the data for a row in TA
>> gets
>> >> combined in one document(not desirable).
>> >> I am not clear on how to
>> >>
>> >> 1) separate a particular row from the search results.
>> >> e.g. If I search for 'Android' and there are 5 rows for android in TB
>> for
>> >> a
>> >> particular instance in TA, would like to show them separately to user
>> and
>> >> if
>> >> the user click on any of the row,point them to an attached URL in the
>> >> application. Should a separate index be maintained for each row in
>> TB?TB
>> >> can
>> >> have millions of rows.
>> >>
>> >
>> > The easy answer is that whatever you want to show as results should be
>> the
>> > thing that you index as documents. So if you want to show tweets as
>> > results,
>> > one document should represent one tweet.
>> >
>> > Solr is different from relational databases and you should not think
>> about
>> > both the same way. De-normalization is the way to go in Solr.
>> >
>> >
>> >> 2) How to protect one user's data from another user. I guess I can
>> keep
>> a
>> >> column for a user_id in the schema and append that filter
>> automatically
>> >> when
>> >> I search through SOLR. Any better alternatives?
>> >>
>> >>
>> > That is usually what people do. The hard part is when some documents
>> are
>> > shared across multiple users.
>> >
>> >
>> >> Bear with me if these are newbie questions please, this is my first
>> day
>> >> with
>> >> SOLR.
>> >>
>> >>
>> > No problem. Welcome to Solr!
>> >
>> > --
>> > Regards,
>> > Shalin Shekhar Mangar.
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 




Re: Document model suggestion

2009-12-15 Thread Erick Erickson
Yes, that should work. One hard part is what happens if your
authorization model has groups, especially when membership
in those groups changes. Then you have to go in and update
all the affected docs.

FWIW
Erick

On Tue, Dec 15, 2009 at 12:24 PM, caman wrote:

>
> Shalin,
>
> Thanks. much appreciated.
> Question about:
>  "That is usually what people do. The hard part is when some documents are
> shared across multiple users. "
>
> What do you recommend when documents have to be shared across multiple
> users?
> Can't I just multivalue a field with all the users who have access to the
> document?
>
>
> thanks
>
> Shalin Shekhar Mangar wrote:
> >
> > On Tue, Dec 15, 2009 at 7:26 AM, caman
> > wrote:
> >
> >>
> >> Appreciate any guidance here please. Have a master-child table between
> >> two
> >> tables 'TA' and 'TB' where form is the master table. Any row in TA can
> >> have
> >> multiple row in TB.
> >> e.g. row in TA
> >>
> >> id---name
> >> 1---tweets
> >>
> >> TB:
> >> id|ta_id|field0|field1|field2.|field20|created_by
> >> 1|1|value1|value2|value2.|value20|User1
> >>
> >> 
> >
> >>
> >> This works fine and index the data.But all the data for a row in TA gets
> >> combined in one document(not desirable).
> >> I am not clear on how to
> >>
> >> 1) separate a particular row from the search results.
> >> e.g. If I search for 'Android' and there are 5 rows for android in TB
> for
> >> a
> >> particular instance in TA, would like to show them separately to user
> and
> >> if
> >> the user click on any of the row,point them to an attached URL in the
> >> application. Should a separate index be maintained for each row in TB?TB
> >> can
> >> have millions of rows.
> >>
> >
> > The easy answer is that whatever you want to show as results should be
> the
> > thing that you index as documents. So if you want to show tweets as
> > results,
> > one document should represent one tweet.
> >
> > Solr is different from relational databases and you should not think
> about
> > both the same way. De-normalization is the way to go in Solr.
> >
> >
> >> 2) How to protect one user's data from another user. I guess I can keep
> a
> >> column for a user_id in the schema and append that filter automatically
> >> when
> >> I search through SOLR. Any better alternatives?
> >>
> >>
> > That is usually what people do. The hard part is when some documents are
> > shared across multiple users.
> >
> >
> >> Bear with me if these are newbie questions please, this is my first day
> >> with
> >> SOLR.
> >>
> >>
> > No problem. Welcome to Solr!
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: Document model suggestion

2009-12-15 Thread caman

Shalin,

Thanks. much appreciated.
Question about: 
 "That is usually what people do. The hard part is when some documents are
shared across multiple users. "

What do you recommend when documents have to be shared across multiple users?
Can't I just multivalue a field with all the users who have access to the
document?


thanks

Shalin Shekhar Mangar wrote:
> 
> On Tue, Dec 15, 2009 at 7:26 AM, caman
> wrote:
> 
>>
>> Appreciate any guidance here please. I have a master-child relationship
>> between two tables 'TA' and 'TB', where 'TA' is the master table. Any row
>> in TA can have multiple rows in TB.
>> e.g. row in TA
>>
>> id---name
>> 1---tweets
>>
>> TB:
>> id|ta_id|field0|field1|field2.|field20|created_by
>> 1|1|value1|value2|value2.|value20|User1
>>
>> 
> 
>>
>> This works fine and indexes the data. But all the data for a row in TA
>> gets combined into one document (not desirable).
>> I am not clear on how to
>>
>> 1) separate a particular row from the search results.
>> e.g. If I search for 'Android' and there are 5 rows for android in TB for
>> a particular instance in TA, I would like to show them separately to the
>> user and, if the user clicks on any row, point them to an attached URL in
>> the application. Should a separate index be maintained for each row in
>> TB? TB can have millions of rows.
>>
> 
> The easy answer is that whatever you want to show as results should be the
> thing that you index as documents. So if you want to show tweets as
> results,
> one document should represent one tweet.
> 
> Solr is different from relational databases and you should not think about
> both the same way. De-normalization is the way to go in Solr.
> 
> 
>> 2) How to protect one user's data from another user. I guess I can keep a
>> column for a user_id in the schema and append that filter automatically
>> when
>> I search through SOLR. Any better alternatives?
>>
>>
> That is usually what people do. The hard part is when some documents are
> shared across multiple users.
> 
> 
>> Bear with me if these are newbie questions please, this is my first day
>> with
>> SOLR.
>>
>>
> No problem. Welcome to Solr!
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Payloads with Phrase queries

2009-12-15 Thread Bill Au
Lucene 2.9.1 comes with a PayloadTermQuery:
http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/search/payloads/PayloadTermQuery.html

I have been using that to use the payload as part of the score without any
problem.

Bill


On Tue, Dec 15, 2009 at 6:31 AM, Raghuveer Kancherla <
raghuveer.kanche...@aplopio.com> wrote:

> The interesting thing I am noticing is that the scoring works fine for a
> phrase query like "solr rocks".
> This led me to look at what query I am using in the case of a single term.
> Turns out that I am using PayloadTermQuery taking a cue from solr-1485
> patch.
>
> I changed this to BoostingTermQuery (i read somewhere that this is
> deprecated .. but i was just experimenting) and the scoring seems to work
> as
> expected now for a single term.
>
> Now, the important question is what is the Payload version of a TermQuery?
>
> Regards
> Raghu
>
>
> On Tue, Dec 15, 2009 at 12:45 PM, Raghuveer Kancherla <
> raghuveer.kanche...@aplopio.com> wrote:
>
> > Hi,
> > Thanks everyone for the responses, I am now able to get both phrase
> queries
> > and term queries to use payloads.
> >
> > However, the score value for each document (and consequently the
> > ordering of documents) is coming out wrong.
> >
> > In the solr output appended below, document 4 has a score higher than the
> > document 2 (look at the debug part). The results section shows a wrong
> score
> > (which is the payload value I am returning from my custom similarity
> class)
> > and the ordering is also wrong because of this. Can someone explain this
> ?
> >
> > My custom query parser is pasted here http://pastebin.com/m9f21565
> >
> > In the similarity class, I return 10.0 if payload is 1 and 20.0 if
> payload
> > is 2. For everything else I return 1.0.
> >
> > {
> >  'responseHeader':{
> >   'status':0,
> >   'QTime':2,
> >   'params':{
> >   'fl':'*,score',
> >   'debugQuery':'on',
> >   'indent':'on',
> >
> >
> >   'start':'0',
> >   'q':'solr',
> >   'qt':'aplopio',
> >   'wt':'python',
> >   'fq':'',
> >   'rows':'10'}},
> >  'response':{'numFound':5,'start':0,'maxScore':20.0,'docs':[
> >
> >
> >   {
> >'payloadTest':'solr|2 rocks|1',
> >'id':'2',
> >'score':20.0},
> >   {
> >'payloadTest':'solr|2',
> >'id':'4',
> >'score':20.0},
> >
> >
> >   {
> >'payloadTest':'solr|1 rocks|2',
> >'id':'1',
> >'score':10.0},
> >   {
> >'payloadTest':'solr|1 rocks|1',
> >'id':'3',
> >'score':10.0},
> >
> >
> >   {
> >'payloadTest':'solr',
> >'id':'5',
> >'score':1.0}]
> >  },
> >  'debug':{
> >   'rawquerystring':'solr',
> >   'querystring':'solr',
> >
> >
> >   'parsedquery':'PayloadTermQuery(payloadTest:solr)',
> >   'parsedquery_toString':'payloadTest:solr',
> >   'explain':{
> >   '2':'\n7.227325 = (MATCH) fieldWeight(payloadTest:solr in 1),
> product of:\n  14.142136 = (MATCH) btq, product of:\n0.70710677 =
> tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n  0.81767845 =
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=1)\n',
> >
> >
> >   '4':'\n11.56372 = (MATCH) fieldWeight(payloadTest:solr in 3),
> product of:\n  14.142136 = (MATCH) btq, product of:\n0.70710677 =
> tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n  0.81767845 =
> idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=3)\n',
> >
> >
> >   '1':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 0),
> product of:\n  7.071068 = (MATCH) btq, product of:\n0.70710677 =
> tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n  0.81767845 =
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=0)\n',
> >
> >
> >   '3':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 2),
> product of:\n  7.071068 = (MATCH) btq, product of:\n0.70710677 =
> tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n  0.81767845 =
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=2)\n',
> >
> >
> >   '5':'\n0.578186 = (MATCH) fieldWeight(payloadTest:solr in 4),
> product of:\n  0.70710677 = (MATCH) btq, product of:\n0.70710677 =
> tf(phraseFreq=0.5)\n1.0 = scorePayload(...)\n  0.81767845 =
> idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=4)\n'},
> >
> >
> >   'QParser':'BoostingTermQParser',
> >   'filter_queries':[''],
> >   'parsed_filter_queries':[],
> >   'timing':{
> >   'time':2.0,
> >   'prepare':{
> >'time':1.0,
> >
> >
> >'org.apache.solr.handler.component.QueryComponent':{
> > 'time':1.0},
> >'org.apache.solr.handler.component.FacetComponent':{
> > 'time':0.0},
> >'org.apache.solr.handler.component.MoreLikeThisComponent':{
> >
> >
> > 'time':0.0},
> >'org.apache.solr.handler.component.HighlightComponent':{
> > 'time':0.0},
> >'org.apache.solr.handler.component.StatsComponen

Re: solr php client vs file_get_contents?

2009-12-15 Thread Donovan Jimenez
In the end, the PHP client does a file_get_contents for doing a  
search the same way you'd do it "manually".  It's all PHP, so you can  
do anything it does yourself. It provides what any library of PHP  
classes should - convenience. I use the JSON response writer because  
it gets the most attention from the Solr community of all the non-XML  
writers, yet is still very quick to parse (you might want to do your  
own tests comparing the speed of unserializing a Solr phps response  
versus json_decode'ing the json version).


Happy Solr'ing,
- Donovan

On Dec 15, 2009, at 8:49 AM, Faire Mii wrote:


i am using php to access solr and i wonder one thing.

why should i use solr php client when i can use

$serializedResult = file_get_contents('http://localhost:8983/solr/ 
select?q=niklas&wt=phps');


to get the result in arrays and then print them out?

i dont really get the difference. are there any richer features with
the php client?



regards

fayer




Re: solr php client vs file_get_contents?

2009-12-15 Thread Israel Ekpo
On Tue, Dec 15, 2009 at 8:49 AM, Faire Mii  wrote:

> i am using php to access solr and i wonder one thing.
>
> why should i use solr php client when i can use
>
> $serializedResult = file_get_contents('http://localhost:8983/solr/
> select?q=niklas&wt=phps');
>
> to get the result in arrays and then print them out?
>
> i dont really get the difference. are there any richer features with the php
> client?
>
>
> regards
>
> fayer



Hi Faire,

Have you actually used this library before? I think the library is pretty
well thought out.

>From a simple glance at the source code you can see that one can use it for
the following purposes:

1. Adding documents to the index (which you cannot just do with
file_get_contents alone). So that's one diff

2. Updating existing documents

3. Deleting existing documents.

4. Balancing requests across multiple backend servers

There are other operations with the Solr server that the library can also
perform.

Some example of what I am referring to is illustrated here

http://code.google.com/p/solr-php-client/wiki/FAQ

http://code.google.com/p/solr-php-client/wiki/ExampleUsage

IBM also has an interesting article illustrating how to add documents to the
Solr index and issue commit and optimize calls using this library.

http://www.ibm.com/developerworks/opensource/library/os-php-apachesolr/

The author of the library can probably give you more details on what the
library has to offer.

I think you should download the source code and spend some time looking at
all the features it has to offer.

In my opinion, it is not fair to compare a well thought out library like
that with a simple php function.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/


solr php client vs file_get_contents?

2009-12-15 Thread Faire Mii

i am using php to access solr and i wonder one thing.

why should i use solr php client when i can use

$serializedResult = file_get_contents('http://localhost:8983/solr/ 
select?q=niklas&wt=phps');


to get the result in arrays and then print them out?

i dont really get the difference. are there any richer features with
the php client?



regards

fayer

Re: search in all fields for multiple values?

2009-12-15 Thread Shalin Shekhar Mangar
On Tue, Dec 15, 2009 at 5:35 PM, Faire Mii  wrote:

> i have two fields:
>
> title
> body
>
> and i want to search for two words
>
> dog
> OR
> cat
>
> in each of them.
>
> i have tried q=*:dog OR cat
>
> but it doesn't work.
>
> how should i type it?
>
> PS. could i enter default search field = ALL fields in schema.xml in
> someway?
>

See
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_.22superman.22_in_both_the_title_and_subject_fields

You can also create a copyField to which you can copy both title and body
and specify that as the default search field.
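For reference, a minimal sketch of that copyField setup in schema.xml (the field and type names here are assumptions; adjust to your own schema):

```xml
<!-- catch-all field that both title and body are copied into -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>

<copyField source="title" dest="text"/>
<copyField source="body" dest="text"/>

<!-- field searched when the query does not name one, e.g. q=dog OR cat -->
<defaultSearchField>text</defaultSearchField>
```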

-- 
Regards,
Shalin Shekhar Mangar.


search in all fields for multiple values?

2009-12-15 Thread Faire Mii

i have two fields:

title
body

and i want to search for two words

dog
OR
cat

in each of them.

i have tried q=*:dog OR cat

but it doesn't work.

how should i type it?

PS. could i enter default search field = ALL fields in schema.xml in  
someway?


Re: Payloads with Phrase queries

2009-12-15 Thread Raghuveer Kancherla
The interesting thing I am noticing is that the scoring works fine for a
phrase query like "solr rocks".
This led me to look at what query I am using in the case of a single term.
Turns out that I am using PayloadTermQuery taking a cue from solr-1485
patch.

I changed this to BoostingTermQuery (i read somewhere that this is
deprecated .. but i was just experimenting) and the scoring seems to work as
expected now for a single term.

Now, the important question is what is the Payload version of a TermQuery?

Regards
Raghu


On Tue, Dec 15, 2009 at 12:45 PM, Raghuveer Kancherla <
raghuveer.kanche...@aplopio.com> wrote:

> Hi,
> Thanks everyone for the responses, I am now able to get both phrase queries
> and term queries to use payloads.
>
> However, the score value for each document (and consequently the
> ordering of documents) is coming out wrong.
>
> In the solr output appended below, document 4 has a score higher than the
> document 2 (look at the debug part). The results section shows a wrong score
> (which is the payload value I am returning from my custom similarity class)
> and the ordering is also wrong because of this. Can someone explain this ?
>
> My custom query parser is pasted here http://pastebin.com/m9f21565
>
> In the similarity class, I return 10.0 if payload is 1 and 20.0 if payload
> is 2. For everything else I return 1.0.
>
> {
>  'responseHeader':{
>   'status':0,
>   'QTime':2,
>   'params':{
>   'fl':'*,score',
>   'debugQuery':'on',
>   'indent':'on',
>
>
>   'start':'0',
>   'q':'solr',
>   'qt':'aplopio',
>   'wt':'python',
>   'fq':'',
>   'rows':'10'}},
>  'response':{'numFound':5,'start':0,'maxScore':20.0,'docs':[
>
>
>   {
>'payloadTest':'solr|2 rocks|1',
>'id':'2',
>'score':20.0},
>   {
>'payloadTest':'solr|2',
>'id':'4',
>'score':20.0},
>
>
>   {
>'payloadTest':'solr|1 rocks|2',
>'id':'1',
>'score':10.0},
>   {
>'payloadTest':'solr|1 rocks|1',
>'id':'3',
>'score':10.0},
>
>
>   {
>'payloadTest':'solr',
>'id':'5',
>'score':1.0}]
>  },
>  'debug':{
>   'rawquerystring':'solr',
>   'querystring':'solr',
>
>
>   'parsedquery':'PayloadTermQuery(payloadTest:solr)',
>   'parsedquery_toString':'payloadTest:solr',
>   'explain':{
>   '2':'\n7.227325 = (MATCH) fieldWeight(payloadTest:solr in 1), product 
> of:\n  14.142136 = (MATCH) btq, product of:\n0.70710677 = 
> tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=1)\n',
>
>
>   '4':'\n11.56372 = (MATCH) fieldWeight(payloadTest:solr in 3), product 
> of:\n  14.142136 = (MATCH) btq, product of:\n0.70710677 = 
> tf(phraseFreq=0.5)\n20.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=3)\n',
>
>
>   '1':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 0), product 
> of:\n  7.071068 = (MATCH) btq, product of:\n0.70710677 = 
> tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=0)\n',
>
>
>   '3':'\n3.6136625 = (MATCH) fieldWeight(payloadTest:solr in 2), product 
> of:\n  7.071068 = (MATCH) btq, product of:\n0.70710677 = 
> tf(phraseFreq=0.5)\n10.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  0.625 = fieldNorm(field=payloadTest, doc=2)\n',
>
>
>   '5':'\n0.578186 = (MATCH) fieldWeight(payloadTest:solr in 4), product 
> of:\n  0.70710677 = (MATCH) btq, product of:\n0.70710677 = 
> tf(phraseFreq=0.5)\n1.0 = scorePayload(...)\n  0.81767845 = 
> idf(payloadTest:  solr=5)\n  1.0 = fieldNorm(field=payloadTest, doc=4)\n'},
>
>
>   'QParser':'BoostingTermQParser',
>   'filter_queries':[''],
>   'parsed_filter_queries':[],
>   'timing':{
>   'time':2.0,
>   'prepare':{
>'time':1.0,
>
>
>'org.apache.solr.handler.component.QueryComponent':{
> 'time':1.0},
>'org.apache.solr.handler.component.FacetComponent':{
> 'time':0.0},
>'org.apache.solr.handler.component.MoreLikeThisComponent':{
>
>
> 'time':0.0},
>'org.apache.solr.handler.component.HighlightComponent':{
> 'time':0.0},
>'org.apache.solr.handler.component.StatsComponent':{
> 'time':0.0},
>'org.apache.solr.handler.component.DebugComponent':{
>
>
> 'time':0.0}},
>   'process':{
>'time':1.0,
>'org.apache.solr.handler.component.QueryComponent':{
> 'time':0.0},
>'org.apache.solr.handler.component.FacetComponent':{
>
>
> 'time':0.0},
>'org.apache.solr.handler.component.MoreLikeThisComponent':{
> 'time':0.0},
>'org.apache.solr.handler.component.HighlightComponent':{
> 'time':0.0},
>
>
>'org.apache.solr.handler.component.StatsCom
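Reading the explain entries above, each fieldWeight is the product tf * scorePayload * idf * fieldNorm, so by those numbers doc 4 outranks doc 2 because of its fieldNorm (1.0 vs 0.625), not because of the payload; the 20.0 shown in the results section is the raw scorePayload value. A quick sketch checking that arithmetic:

```python
# Reproduce Lucene's explain arithmetic from the debug output above.
# fieldWeight = btq * idf * fieldNorm, where btq = tf * scorePayload.

def field_weight(tf, score_payload, idf, field_norm):
    return tf * score_payload * idf * field_norm

TF = 0.70710677   # tf(phraseFreq=0.5)
IDF = 0.81767845  # idf(payloadTest: solr=5)

doc2 = field_weight(TF, 20.0, IDF, 0.625)  # payload 2, fieldNorm 0.625
doc4 = field_weight(TF, 20.0, IDF, 1.0)    # payload 2, fieldNorm 1.0

print(round(doc2, 6))  # 7.227325, matches the explain for doc id 2
print(round(doc4, 5))  # 11.56372, matches the explain for doc id 4

# Same payload score, but doc 4 has fieldNorm 1.0 (shorter field),
# so per the explain output it legitimately scores higher than doc 2.
```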

Re: question regarding dynamic fields

2009-12-15 Thread Shalin Shekhar Mangar
On Mon, Dec 14, 2009 at 1:00 PM, Phanindra Reva wrote:

> Hello,
> I have observed that text indexed using the dynamicField concept is
> searchable only when we mention the field name while querying. Am I wrong
> with my observation, or is it the default and cannot be changed? I am just
> wondering if there is any way to search text indexed using dynamicFields
> without having to mention the field name in the query.
> Thanks.
>

If you are asking if you can give *_s to search on all dynamic fields ending
with "_s" then the answer is no. You must specify the field name.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Log of zero result searches

2009-12-15 Thread Shalin Shekhar Mangar
On Tue, Dec 15, 2009 at 2:36 PM, Roland Villemoes 
wrote:

> Yes, correct.
>
> But to use that - the search client must collect this information whenever
> we have "0" results.
> I do not want that to be part of the client application (quite hard when
> that is SolrJS) - this should be collected server site - on Solr.
> Do you know how to do that?
>
>
The number of hits is logged along with each query at INFO level. You can
analyze the logs to figure out this stat.
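A server-side sketch of that analysis, assuming the stock request logging where each query line contains something like `params={q=...} hits=0 status=0 QTime=1` (the sample lines below are made up; check your actual log format):

```python
import re

# Matches the params={...} blob and the hits= count in a Solr request log line.
LINE_RE = re.compile(r"params=\{([^}]*)\}.*?\bhits=(\d+)")
Q_RE = re.compile(r"(?:^|&)q=([^&]*)")

def zero_hit_queries(lines):
    """Yield the q= value of every logged request that returned 0 hits."""
    for line in lines:
        m = LINE_RE.search(line)
        if m and int(m.group(2)) == 0:
            q = Q_RE.search(m.group(1))
            if q:
                yield q.group(1)

sample = [
    "INFO: [] webapp=/solr path=/select params={q=dog&wt=xml} hits=12 status=0 QTime=3",
    "INFO: [] webapp=/solr path=/select params={q=flying+cat&wt=xml} hits=0 status=0 QTime=1",
]
print(list(zero_hit_queries(sample)))  # ['flying+cat']
```

Run it over the rotated log files periodically and you get the zero-result report without touching the client application.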

-- 
Regards,
Shalin Shekhar Mangar.


Re: I cant get it to work

2009-12-15 Thread regany


I've only just started with Solr too.

As a newbie, first I'd say forget about trying to "compare" it to your mysql
database.

It's completely different and performs its own job in its own way. You
feed a document in, and you store that information in the most efficient
manner you can to perform the search and return the results you want.

So ask, what do I want to search against?

field1
field2
field3

That's what you "feed" into Solr.

Then ask, what information do I want to "return" after a search? This
determines how you "store" the information you've just "fed" into Solr. Say
you want to return:

field2

Then you might accept field1, field2, and field3 and merge them together
into 1 searchable field called "searchtext". This is what users will search
against. Then you'd also have "field2" as another field.

field2 (not indexed, stored)
searchtext (combination of field1, field2, field3 - indexed, not stored)

So then you could search against "searchtext" and return "field2" as the
result.

Hope that provides some explanation (I know it's basic). From my very
limited experience with it, Solr is great. My biggest hurdle was getting my
head around the fact that it's NOT a relational database (i.e. mysql) but a
separate tool that you configure in the best way for your "search" and only
that.
-- 
View this message in context: 
http://old.nabble.com/I-cant-get-it-to-work-tp26791099p26792373.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: maximum no of values in multi valued string field

2009-12-15 Thread Shalin Shekhar Mangar
On Tue, Dec 15, 2009 at 3:13 PM, bharath venkatesh <
bharath.venkat...@ibibogroup.com> wrote:

> Hi ,
>  Is there any limit in no of values stored in a single  multi valued
> string field ?


There is no theoretical limit. There are practical limits because your
documents become heavier. The document cache stores Lucene documents in memory.


> if a single multi valued string field contains 1000-2000 string values what
> will be effect on query performance (we will be only indexing this field not
> storing it )  ?


Yes, the more the number of tokens, the longer it may take to search across
them. Faceting performance can drop drastically for such large number of
values.


> is it better to store all the strings  in a single  text field instead of
> multi valued string field.
>
>
It wouldn't make a lot of difference. The XML response may be a bit shorter.
In a single field, highlighting can cause adjacent terms to be highlighted,
which you may not want.

-- 
Regards,
Shalin Shekhar Mangar.


maximum no of values in multi valued string field

2009-12-15 Thread bharath venkatesh

Hi ,
  Is there any limit in no of values stored in a single  multi 
valued string field ? if a single multi valued string field contains 
1000-2000 string values what will be effect on query performance (we 
will be only indexing this field not storing it )  ? is it better to 
store all the strings  in a single  text field instead of multi valued 
string field.


Thanks in Advance,
Bharath




Re: Auto update with deltaimport

2009-12-15 Thread Olala

Hi, thanks! I've done it by writing a script that calls
http://localhost:8080/solr/dataimport?command=delta-import automatically :-)
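For anyone searching the archives later, a minimal sketch of such a script in Python (host, port, and interval are assumptions; adjust to your install, and a cron entry calling wget/curl as suggested below works just as well):

```python
import time
import urllib.request

def build_url(base="http://localhost:8080/solr", command="delta-import"):
    # Same DataImportHandler endpoint as the URL above; base is an assumption.
    return "%s/dataimport?command=%s" % (base, command)

def trigger_delta_import():
    # DIH kicks off the import and returns immediately; hit /dataimport
    # with no command later to check the import status.
    with urllib.request.urlopen(build_url()) as resp:
        return resp.getcode()

def run_forever(interval_seconds=300):
    # Poor man's scheduler: trigger a delta-import every 5 minutes.
    while True:
        trigger_delta_import()
        time.sleep(interval_seconds)
```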


Joel Nylund wrote:
> 
> windows or unix?
> 
> unix - make a shell script and call it from cron
> 
> windows - make a .bat or .cmd file and call it from scheduler
> 
> within the shell scripts/bat files use wget or curl to call the right  
> import:
> 
> wget -q -O /dev/null
> http://localhost:8983/solr/dataimport?command=delta-import
> 
> 
> Joel
> 
> On Dec 12, 2009, at 1:38 AM, Olala wrote:
> 
>>
>> Hi All!
>>
>> I am developing a search engine using Solr; I have tested the full-import
>> and delta-import commands successfully. But now I want to run delta-import
>> automatically on my schedule. So, can anyone help me?
>>
>> Thanks & Regards,
>> -- 
>> View this message in context:
>> http://old.nabble.com/Auto-update-with-deltaimport-tp26755386p26755386.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Auto-update-with-deltaimport-tp26755386p26792041.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query on Cache size.

2009-12-15 Thread Shalin Shekhar Mangar
On Mon, Dec 14, 2009 at 7:17 PM, kalidoss <
kalidoss.muthuramalin...@sifycorp.com> wrote:

> Hi,
>
>   We have enabled the query result cache, its 512 entries,
>
>   we have calculated the size used for cache :
>   page size about 1000bytes, (1000*512)/1024/1024  = .48MB
>
>
The query result cache is a map of (q, sort, n) to an ordered list of Lucene
docids. Assuming queryResultWindowSize is 20 and an average user does not go
beyond 20 results, the memory usage of the values in this map is
approx 20*sizeof(int)*512. Add some more for keys, map, references etc.
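That estimate works out as follows (a back-of-the-envelope sketch that, as noted, ignores the keys and map overhead):

```python
ENTRIES = 512    # queryResultCache size
WINDOW = 20      # queryResultWindowSize: docids cached per entry
SIZEOF_INT = 4   # bytes per Lucene docid

value_bytes = ENTRIES * WINDOW * SIZEOF_INT
print(value_bytes)           # 40960 bytes
print(value_bytes / 1024.0)  # 40.0 KB for the cached docid lists alone
```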

-- 
Regards,
Shalin Shekhar Mangar.


Re: Not able to display search results on Tomcat/Solrj

2009-12-15 Thread Shalin Shekhar Mangar
On Tue, Dec 15, 2009 at 1:07 AM, insaneyogi3008 wrote:

>
> Hello,
>
> I am running a simple program
> http://old.nabble.com/file/p26779970/SolrjTest.java SolrjTest.java to get
> search results from a remote Solr server. I seem to correctly get back the
> number of documents that match my query, but I am not able to display the
> search results themselves.
>
> My question is , is this a known issue? I have attached the test & below is
> the sample of the result :
>
>
What is "displayname" and "displayphone"? Are they even in your schema?
Print out the SolrDocument object directly and you should see the results.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Document model suggestion

2009-12-15 Thread Shalin Shekhar Mangar
On Tue, Dec 15, 2009 at 7:26 AM, caman wrote:

>
> Appreciate any guidance here please. I have a master-child relationship
> between two tables 'TA' and 'TB', where 'TA' is the master table. Any row
> in TA can have multiple rows in TB.
> e.g. row in TA
>
> id---name
> 1---tweets
>
> TB:
> id|ta_id|field0|field1|field2.|field20|created_by
> 1|1|value1|value2|value2.|value20|User1
>
> 

>
> This works fine and indexes the data. But all the data for a row in TA
> gets combined into one document (not desirable).
> I am not clear on how to
>
> 1) separate a particular row from the search results.
> e.g. If I search for 'Android' and there are 5 rows for android in TB for a
> particular instance in TA, I would like to show them separately to the user
> and, if the user clicks on any row, point them to an attached URL in the
> application. Should a separate index be maintained for each row in TB? TB
> can have millions of rows.
>

The easy answer is that whatever you want to show as results should be the
thing that you index as documents. So if you want to show tweets as results,
one document should represent one tweet.

Solr is different from relational databases and you should not think about
both the same way. De-normalization is the way to go in Solr.


> 2) How to protect one user's data from another user. I guess I can keep a
> column for a user_id in the schema and append that filter automatically
> when
> I search through SOLR. Any better alternatives?
>
>
That is usually what people do. The hard part is when some documents are
shared across multiple users.
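One common sketch for the shared case is to index a multivalued field of allowed users (and/or groups) on each document and append it as a filter query at search time. The field names below are hypothetical, purely for illustration:

```python
from urllib.parse import urlencode

def search_params(q, user, groups=()):
    # Restrict results to docs owned by the user, shared with the user,
    # or shared with one of the user's groups. Field names are made up.
    clauses = ["owner:%s" % user, "allowed_users:%s" % user]
    clauses += ["allowed_groups:%s" % g for g in groups]
    return urlencode({"q": q, "fq": " OR ".join(clauses)})

print(search_params("android", "user1", ["staff"]))
# q=android&fq=owner%3Auser1+OR+allowed_users%3Auser1+OR+allowed_groups%3Astaff
```

Keeping the restriction in fq (rather than q) means it is cached independently and cannot be overridden by user input.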


> Bear with me if these are newbie questions please, this is my first day
> with
> SOLR.
>
>
No problem. Welcome to Solr!

-- 
Regards,
Shalin Shekhar Mangar.


SV: Log of zero result searches

2009-12-15 Thread Roland Villemoes
Yes, correct. 

But to use that - the search client must collect this information whenever we 
have "0" results. 
I do not want that to be part of the client application (quite hard when that
is SolrJS) - this should be collected server side - on Solr.
Do you know how to do that?

Roland

-Oprindelig meddelelse-
Fra: David Stuart [mailto:david.stu...@progressivealliance.co.uk] 
Sendt: 15. december 2009 09:33
Til: solr-user@lucene.apache.org
Emne: Re: Log of zero result searches

The returning XML result tag has a numFound attribute that will report  
0 if nothing matches your search criteria

David

On 15 Dec 2009, at 08:16, Roland Villemoes   
wrote:

> Hi
>
> Question: How do you log zero result searches?
>
> It is quite important from a business perspective to know which searches
> return zero/empty results.
> Does anybody know a way to get this information?
>
> Roland Villemoes


Re: Log of zero result searches

2009-12-15 Thread David Stuart
The returning XML result tag has a numFound attribute that will report  
0 if nothing matches your search criteria


David

On 15 Dec 2009, at 08:16, Roland Villemoes   
wrote:



Hi

Question: How do you log zero result searches?

It is quite important from a business perspective to know which searches
return zero/empty results.

Does anybody know a way to get this information?

Roland Villemoes


Re: I cant get it to work

2009-12-15 Thread David Stuart

Hi,

The answer is "it depends" ;)

If your 10 tables together represent one entity (e.g. a person, their
address, etc.), then one document per entity works.


But if your 10 tables each represent a series of entities that you want
to surface in your search results separately, then make a document for
each (i.e. it depends on your data).


What is your use case? Do you want a search index that is able to
search on every field in your 10 tables, or just a few?
Think of it this way: if you were creating SQL to pull the data out of
the db using joins etc., what fields would you grab? Do you get multiple
rows back because some of your tables have a one-to-many relationship?
Once you have formed that query, that is your document, minus the
duplicate information caused by the rows.


Cheers

David

On 15 Dec 2009, at 08:05, Faire Mii  wrote:


I just cant get it.

If I've got 10 tables in mysql and they are all related to each other
with foreign keys, should I have 10 documents in solr?


or just one document with rows from all tables in it?

i have tried in vain for 2 days now...plz help

regards

fayer


Log of zero result searches

2009-12-15 Thread Roland Villemoes
Hi 

Question: How do you log zero result searches?

It is quite important from a business perspective to know which searches
return zero/empty results.
Does anybody know a way to get this information? 

Roland Villemoes


I cant get it to work

2009-12-15 Thread Faire Mii

I just can't get it.

If I've got 10 tables in mysql and they are all related to each other with
foreign keys, should I have 10 documents in solr?


Or just one document with rows from all tables in it?

I have tried in vain for 2 days now... please help

regards

fayer