RE: does solr handle hierarchical facets?

2007-12-10 Thread SDIS M. Beauchamp
I handle this through the interface.

I've got dynamic fields (path_0, path_1, ...) to make it easier.
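
A minimal sketch of how such per-level fields might be filled at index time. The field names path_0, path_1, ... come from the message above; storing the full prefix at each depth, and the helper name `path_fields`, are assumptions made for illustration only.

```python
def path_fields(category_path, sep="/"):
    """Map a category path to per-depth field values (path_0, path_1, ...)."""
    parts = [p for p in category_path.split(sep) if p]
    # Assumption: each level stores the full prefix, so a value at depth N
    # stays unambiguous even when two branches share a leaf name.
    return {
        "path_%d" % depth: sep.join(parts[: depth + 1])
        for depth in range(len(parts))
    }


doc_fields = path_fields("category/subcategory/subsubcategory")
for name, value in sorted(doc_fields.items()):
    print(name, "=", value)
```

With fields like these, the interface could filter on path_0 and facet on path_1 to show the next level down, then repeat one level deeper as the user drills in.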

Florent


-Original Message-
From: Sean Laval [mailto:[EMAIL PROTECTED] 
Sent: Monday, 10 December 2007 14:54
To: solr-user@lucene.apache.org
Subject: does solr handle hierarchical facets?


E.g. category/subcategory/subsubcategory?
 
Such that if you search for a category, you get all the documents that have 
been tagged with that category AND any of its subcategories. If this is possible, I 
think I'll investigate using Solr in place of some existing code we have that 
deals with indexing and searching such data.
 
Regards,
 
Sean



RE: How do I search in all fields without index by solr

2007-12-07 Thread SDIS M. Beauchamp
You should read the example solrconfig.xml bundled with a fresh install of Solr.

You'll find everything you need about the dismax request handler there.


-Original Message-
From: Laxmilal Menaria [mailto:[EMAIL PROTECTED] 
Sent: Friday, 7 December 2007 09:12
To: solr-user@lucene.apache.org
Subject: Re: How do I search in all fields without index by solr

I have tried that:

?q=laxmilal&qt=dismax&fl=FriendID,Title,Address,PhoneNo,Comments
?q=video&qt=dismax&qf=FriendID,Title,Address,PhoneNo,Comments

But neither returns any search results. Is there any configuration needed for 
that? My configuration is:

<fieldType name="string_ch" class="solr.StrField">
  <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
</fieldType>

LM

On 12/7/07, SDIS M. Beauchamp [EMAIL PROTECTED] wrote:

 You can also use the DisMaxRequestHandler to search across multiple 
 fields.







--
Thanks,
Laxmilal menaria

http://www.chambal.com/
http://www.minalyzer.com/
http://www.bucketexplorer.com/



RE: How do I search in all fields without index by solr

2007-12-06 Thread SDIS M. Beauchamp
You can also use the DisMaxRequestHandler to search across multiple fields.
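
As a rough illustration, here is how a client might build such a request. The field names are the five from the original question; the base URL, the handler name `dismax` (as registered in the example solrconfig.xml) and the helper name `dismax_query` are assumptions. Note that `qf` takes a whitespace-separated list of fields.

```python
from urllib.parse import urlencode


def dismax_query(keywords, fields, base_url="http://localhost:8983/solr/select"):
    """Build a select URL that searches `keywords` across several fields."""
    params = {
        "q": keywords,
        "qt": "dismax",           # assumed handler name from the example config
        "qf": " ".join(fields),   # qf is whitespace-separated
    }
    return base_url + "?" + urlencode(params)


url = dismax_query("video", ["FriendID", "Title", "Address", "PhoneNo", "Comments"])
print(url)
```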



-Original Message-
From: Laxmilal Menaria [mailto:[EMAIL PROTECTED] 
Sent: Friday, 7 December 2007 08:25
To: solr-user@lucene.apache.org
Subject: Re: How do I search in all fields without index by solr

OK, thanks. I have tried it and it is working.

But if I use it and the XXX or YYY value is very long, it may cause a problem, 
because I think many servers don't support long URLs. Is there any 
configuration in the config file to handle that in future?
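
One common way around URL length limits is to send the parameters in a form-encoded POST body instead of the query string. The sketch below only builds the request. The base URL and the helper name `build_post_search` are assumptions, and whether a given servlet container and Solr version accept POSTed search parameters should be checked before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_search(query, base_url="http://localhost:8983/solr/select"):
    """Build (but do not send) a POST request carrying the query in its body."""
    body = urlencode({"q": query}).encode("utf-8")
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_post_search("field1:XXX field2:YYY")
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```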

LM

On 12/7/07, Ryan McKinley [EMAIL PROTECTED] wrote:

 You should be able to search any field:
 ?q=field1:XXX field2:YYY

 You can register a fieldType directly to an analyzer using:
  <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
  </fieldType>

 ryan


 Laxmilal Menaria wrote:
  Thanks for the fast reply. I have dumped my index in the Solr data folder 
  and am able to search in a single field only, but I want to search in all 
  fields. Also, how can I configure StandardAnalyzer in the Solr config XML?
 
  LM
 
  On 12/7/07, Ryan McKinley [EMAIL PROTECTED] wrote:
  solr should be able to read any lucene index -- even if it did not 
  create it.  The hitch is that you need to make sure the analyzers 
  and fieldTypes match what is in your index otherwise it is unlikely 
  for the result to be what you expect.
 
  To get solr to use your manually created index files, just dump 
  them in the data/index directory
 
  ryan
 
 
  Laxmilal Menaria wrote:
  I don't want to use Solr for indexing a database. I want to use Solr 
  for searching an existing index that I created with my sample 
  application.
  LM
 
  On 12/7/07, Venkatraman S [EMAIL PROTECTED] wrote:
  On Dec 7, 2007 10:17 AM, Laxmilal Menaria [EMAIL PROTECTED] wrote:
 
  Hello everyone,
 
  I have created a simple Java application which indexes database tables. 
  Now I want to configure Solr on the index I created. My index has 5 
  fields: FriendID, Title, Address, PhoneNo and Comments.
 
  Why do you want to use Solr for indexing databases???!!!
  rtfm!
 
  --
  Venkat
  Blog @ http://blizzardzblogs.blogspot.com
 
 
 
 
 
 




--
Thanks,
Laxmilal menaria

http://www.chambal.com/
http://www.minalyzer.com/
http://www.bucketexplorer.com/



RE: I18N with SOLR?

2007-11-19 Thread SDIS M. Beauchamp
You can have only one default search field.

But you can use the dismax request handler to search across several fields:
http://wiki.apache.org/solr/DisMaxRequestHandler

Then you can use query field boosting to make some fields more significant than others:

Exact_text^3 text_fr^2 text_en^2 stemmed_text^1.5
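
A small sketch of how a client could assemble that boosted field list for the dismax `qf` parameter. The field names and boosts are the ones above; the helper name `boosted_qf` is hypothetical.

```python
def boosted_qf(boosts):
    """Render {field: boost} as a dismax qf string such as 'a^3 b^1.5'."""
    return " ".join("%s^%s" % (field, boost) for field, boost in boosts.items())


qf = boosted_qf({"Exact_text": 3, "text_fr": 2, "text_en": 2, "stemmed_text": 1.5})
print(qf)
```

Keeping the boosts in one mapping makes it easy to tune a single language field without retyping the whole string.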

-Original Message-
From: Dilip.TS [mailto:[EMAIL PROTECTED] 
Sent: Monday, 19 November 2007 07:09
To: solr-user@lucene.apache.org
Subject: RE: I18N with SOLR?

   Hello,

  Also, can we have something like this, i.e. multiple defaultSearchField 
entries in schema.xml, when searching for a keyword that combines more than 
one language?

  <defaultSearchField>text</defaultSearchField>
  <defaultSearchField>text_french</defaultSearchField>...
  -Original Message-
  From: Dilip.TS [mailto:[EMAIL PROTECTED]
  Sent: Monday, November 19, 2007 11:29 AM
  To: solr-user@lucene.apache.org
  Subject: RE: I18N with SOLR?


Hello,
Does Solr support searching for a keyword that combines more than one 
language within the same search page?



-Original Message-
From: Guglielmo Celata [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 15, 2007 7:39 PM
To: solr-user@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: I18N with SOLR?


Hi Dilip,
I don't know if this helps, but I have set up a textIt field type in the 
config/schema.xml file in order to index Italian text.
It works pretty well with non-ASCII characters (we do have some accented 
vowels, even if not as many as the French).
It also works with stopwords (and I assume with protwords as well, though 
I didn't try). I created an italian-stopwords.txt file in the config/ path.
I think SnowballPorterFilterFactory is a class usable by default in Solr, 
although I remember having read it's a bit slower than other libraries.
But I am no expert.


<fieldtype name="textIt" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            words="italian-stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Italian"/>
  </analyzer>
</fieldtype>



On 15/11/2007, Dilip.TS [EMAIL PROTECTED] wrote:
  Hi Ed,
  Thanks for the help, but I have some questions.
  I understand that we need stopwords_french.txt and protwords_french.txt 
  files, say for French, in the solr/conf directory.
  Do we need to write classes like FrenchStopFilterFactory and 
  FrenchPorterFilterFactory for each language, or are these classes built 
  into Solr? I didn't find them in the Solr/Lucene APIs.
  I found some classes like org.apache.lucene.analysis.fr.FrenchAnalyzer 
  etc. in lucene-analyzers.jar.
  Any idea what this class is used for?

  Thanks in advance,

  Regards
  Dilip

  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Ed Summers
  Sent: Monday, November 12, 2007 7:00 PM
  To: solr-user@lucene.apache.org ; [EMAIL PROTECTED]
  Subject: Re: I18N with SOLR?


  I'd say yes. Solr supports Unicode and ships with language specific
  analyzers, and allows you to provide your own custom analyzers if you
  need them. This allows you to create different fieldType definitions
  for the languages you want to support. For example here is an example
  field type for French text which uses a French stopword list and
  French stemming.

  <fieldType name="text_french" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.FrenchStopFilterFactory"
              ignoreCase="true"
              words="stopwords_french.txt"/>
      <filter class="solr.FrenchPorterFilterFactory"
              protected="protwords_french.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

  Then you can create a dynamicField definitions that allow you to
  index and query your documents using the correct field type:

  <dynamicField name="*_french" type="text_french" indexed="true" stored="true"/>
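
  As a rough client-side illustration of targeting a `*_french` dynamic field, the sketch below appends a language suffix to each field name before a document is sent. The helper name `localized_fields` and the sample title are hypothetical, and the suffix has to match a dynamicField pattern declared in schema.xml.

```python
def localized_fields(fields, language):
    """Rename {'title': ...} to {'title_french': ...} for a given language."""
    return {"%s_%s" % (name, language): value for name, value in fields.items()}


doc = localized_fields({"title": "Le Petit Prince"}, "french")
print(doc)
```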

  This means that when you index you need to know what language your
  data is in so that you know what field names to use in your document
  (e.g. title_french). And at search time you need to know 

RE: Solr PHP client

2007-11-19 Thread SDIS M. Beauchamp
I use the php and php serialized response writers to query Solr from PHP.

It's very easy to use.

But it's not so easy to update Solr from PHP (that's why my crawlers are not 
written in PHP).

Florent BEAUCHAMP

-Original Message-
From: Jonathan Ariel [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 20 November 2007 02:49
To: solr-user@lucene.apache.org
Subject: Solr PHP client

Hi!
I'm wondering if someone is using a PHP client for solr. Actually I'm not sure 
if there is one out there.
Would you be interested in having a SolrJ port for PHP?

Thanks,

Jonathan Leibiusky



RE: solr - other document formats

2007-11-14 Thread SDIS M. Beauchamp

A commit can't be "false": it is either done or not done. If it is not done, 
your users won't be able to search through the uncommitted documents. If it is 
done, users can search through all documents successfully sent to Solr.

You can use the autocommit feature (in solrconfig.xml) to avoid issuing commits 
explicitly: you just have to send documents to Solr.
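
For reference, a minimal sketch of the two XML update messages involved when committing explicitly: an add followed by a commit, both posted to the update handler. The field names `id` and `title` and the helper names are hypothetical, and the actual HTTP POST is left out.

```python
import xml.etree.ElementTree as ET


def add_message(doc):
    """Build an <add><doc>...</doc></add> update message from a dict."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def commit_message():
    """Build the commit message that makes pending documents searchable."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


add_xml = add_message({"id": "doc-1", "title": "Quarterly report"})
commit_xml = commit_message()
print(add_xml)
print(commit_xml)
```

With autocommit configured, a client would send only the add messages and skip the commit step.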

Florent BEAUCHAMP

-Original Message-
From: Dwarak R [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, 14 November 2007 13:38
To: solr-user@lucene.apache.org
Subject: Re: solr - other document formats

Many thanks Florent

Hey All

My docs are parsed and the indexes are updated (using the UpdateRichDocuments 
patch). But tell me one thing: what will happen if I don't commit? If commit is 
false, where are the docs stored?

Regards

Dwarak R
- Original Message -
From: SDIS M. Beauchamp [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, November 14, 2007 1:13 PM
Subject: RE: solr - other document formats


You should take a look at 
http://wiki.apache.org/solr/UpdateRichDocuments?highlight=%28richdocument%29

It gives you a starting point to make the extractor you need

Regards

Florent

-Original Message-
From: Dwarak R [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 14 November 2007 05:17
To: solr-user@lucene.apache.org
Subject: solr - other document formats

Hey All

I read an article on http://www.xml.com/lpt/a/1668

It states that:

As we've seen, the XML format used by Solr for indexing is quite simple. 
Extracting the relevant metadata to create these XML documents from the many 
formats floating around, however, is another story. Fortunately, Lucene 
users have the same problem and have been working on it for quite a while; 
the Lucene FAQ lists a number of references to parsers and filters which can 
be used to extract content and metadata from many common document formats.
Solr won't index spreadsheets or other formats out of the box, but that is 
not its role: you should see Solr as the search engine component of a 
broader search system, where extraction of content and metadata is handled 
by other components. This will help to keep your search system maintainable 
and testable, and it helps the Solr team focus on doing one thing well.

So will parsing documents like PDF, MS Word documents and Excel files to XML be 
done by another component?

Could somebody advise?

Regards

Dwarak R

This message is for the designated recipient only and may contain 
privileged, proprietary, or otherwise private information. If you have 
received it in error, please notify the sender[EMAIL PROTECTED] 
immediately and delete the original. Any other use of the email by you is 
prohibited.






no segments* file found

2007-11-12 Thread SDIS M. Beauchamp
I'm using Solr to index our file servers (480K files).
 
If I don't optimize, I get a "too many open files" error at about 450K files
and a 3 GB index.
 
If I optimize, I get this stack trace during the commit of all the
following updates:
 
<result status="1">java.io.FileNotFoundException: no segments* file found in
org.apache.lucene.store.FSDirectory@/root/trunk/example/solr/data/index:
files: _7xr.tis _7xt.fdt _7o1.tii _7xq.tis _7xn.nrm _7ws.fdt _7xt.prx
_7xp.nrm _7ws.nrm _7xo.nrm _7ws.tis _7xs.fdt _7vc.fnm _7u6.tis _7vx.fnm
_7vx.frq _7xs.nrm _7xn.tis _7xq.frq _7xs.tis _7xq.prx _7vx.fdx _7ur.tii
_7ur.frq _7xq.fnm _7xr.nrm _7vc.fdt _7xt.frq _7xp.fdx _7ws.prx _7xs.frq
_7xo.prx _7xq.nrm _7vx.tii _7vx.prx _7xq.tii _7xs.fnm _7xs.tii _7ws.tii
_7xt.fdx _7vc.nrm _7vc.prx _7vc.tis _7xq.fdt _7ur.prx _7xn.fdx _7xp.frq
_7vx.nrm _7ur.fdt _7xr.fnm _7ws.fdx _7u6.tii _7xr.tii _7vc.frq _7vx.tis
_7xp.fdt _7xr.frq _7ur.tis _7xp.prx _7xr.fdx _7xt.fnm _7xn.tii _7vc.fdx
_7xo.fdt _7u6.fnm _7xn.frq _7xp.tis _7o1.frq _7xn.prx _7ur.fdx _7ur.fnm
_7o1.fdx _7xs.fdx _7xn.fdt _7xt.tis _7xp.fnm _7xo.fnm _7xn.fnm _7u6.prx
_7xq.fdx _7xo.tii _7ws.fnm _7vc.tii _7o1.prx _7xr.fdt _7o1.fdt _7ur.nrm
_7ws.frq _7u6.nrm _7o1.nrm _7vx.fdt _7xt.tii _7u6.fdx _7xo.frq _7u6.frq
_7xs.prx _7xr.prx _7o1.tis _7xt.nrm _7xp.tii _7xo.tis _7u6.fdt _7xo.fdx
_7o1.fnm segments.gen
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:516)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:243)
    at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:616)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:410)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:97)
    at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:121)
    at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:189)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:267)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
    at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:196)
    at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:386)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:57)
</result>
 
If I restart Solr, I get a NullPointerException in DispatchFilter.
 
Tested with Solr 1.2 and 1.3; the behaviour is the same.
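
For anyone comparing notes, a small sketch that checks a file listing like the one in the exception for a commit-point file. It assumes that Lucene versions of this era name that file segments_N (N being a generation number, here matched as base-36 characters) and keep it next to segments.gen. The helper name `has_commit_point` is hypothetical, and the sample list is a shortened excerpt of the listing above.

```python
import re

# Assumption: commit points are files named segments_<generation>.
SEGMENTS_N = re.compile(r"^segments_[0-9a-z]+$")


def has_commit_point(filenames):
    """Return True if any file in the listing looks like a segments_N file."""
    return any(SEGMENTS_N.match(name) for name in filenames)


listed = ["_7xr.tis", "_7xt.fdt", "_7o1.fnm", "segments.gen"]
print(has_commit_point(listed))
```

In the listing above only segments.gen is present, which matches what the exception message reports.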
 
Regards
 
Florent BEAUCHAMP


RE: no segments* file found

2007-11-12 Thread SDIS M. Beauchamp
No, I'm using a custom indexer written in C#, which submits content using 
POST requests.

I let Lucene manage the index on its own.

Florent BEAUCHAMP

-Original Message-
From: Venkatraman S [mailto:[EMAIL PROTECTED] 
Sent: Monday, 12 November 2007 10:19
To: solr-user@lucene.apache.org
Subject: Re: no segments* file found

Are you using embedded Solr?

I had stumbled on a similar error:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg06085.html

-V
