Hello

Georges Racinet <[EMAIL PROTECTED]> wrote on 20.04.2007 14:25:35:

> 
> On Apr 20, 2007, at 1:49 PM, [EMAIL PROTECTED] wrote:
> 
> >
> > Hello
> >
> >
> > Georges Racinet <[EMAIL PROTECTED]> wrote on 20.04.2007 12:13:45:
> >
> > >
> > > On Apr 20, 2007, at 9:05 AM, [EMAIL PROTECTED] wrote:
> > >
> > > >
> > > > Hello,
> > > >
> > > >
> > > > > you should be aware that the current indexing and searching
> > > > > facilities are right now undergoing drastic changes. We are working
> > > > > on a generic search service that can in turn use different backends,
> > > > > the first one being Compass (http://www.opensymphony.org/compass).
> > > >
> > > > > The current default UI is rather rough.
> > > >
> > > > So that means that the full syntax of Lucene is supported? Like it
> > > > is described here <http://lucene.apache.org/java/docs/queryparsersyntax.html>?
> > >
> > > Yes, with a single exception: since searching on a given field is
> > > handled differently (through a WHERE statement in NXQL), the colon
> > > character has no special meaning.
> > >
> > > More specifically: the "simple" search box that one can see on every
> > > page corresponds to a QueryParser query on the special "fulltext"
> > > field that aggregates  all fields that qualify as "content" (title,
> > > description, attached files,...)
> >
> > Fulltext means the whole text of the file _AND_ all metadata-fields?
> 
> Not exactly, because some fields typically hold data in a form 
> different from the one being presented to the user, so that matching 
> on them would provide false hits. It could for instance be that the 
> metadata field holds just a key, possibly mapping to an 
> external data source, and that the displayed value depends on the 
> user or is subject to change.
> 
> But you will be able to configure which fields qualify for 
> fulltext, or even define new such aggregators besides the default one.

Okay, so in general that's possible. For us it is important to search 
within the author and creation-year metadata fields, and I think these 
are "easy" fields.
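
To make sure I understand the aggregation idea, here is a minimal sketch in Python of how I picture it. It is only an illustration: the field names, the FULLTEXT_FIELDS setting and the build_fulltext function are all made up by me and are not Nuxeo's or Compass's actual schema or API.

```python
# Hypothetical configuration: which fields are aggregated into "fulltext".
# None of these names come from Nuxeo; they are assumptions for illustration.
FULLTEXT_FIELDS = ("title", "description", "author", "created_year", "file_text")


def build_fulltext(document, fields=FULLTEXT_FIELDS):
    """Join the configured fields of one document into a single string."""
    parts = []
    for name in fields:
        value = document.get(name)
        if value is None:
            continue  # a missing field simply contributes nothing
        parts.append(str(value))
    return " ".join(parts)


doc = {
    "title": "Annual report",
    "description": "Figures for the year",
    "author": "B. Koeppel",
    "created_year": 2007,
    "file_text": "Revenue grew.",
    # Opaque key into an external vocabulary: not configured, so left out.
    "subject_key": "K-0042",
}
fulltext = build_fulltext(doc)
```

With this, a search for the author or the year hits the aggregated text, while the key-only field cannot produce false hits.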

> 
> >
> > > On the other hand, the advanced search page does the work of building
> > > a complex query for the user. Therefore it doesn't apply QueryParser
> > > syntax. This is the same logic that, e.g., Google provides its users.
> > > Of course, this is only what the end user interface does; Nuxeo's
> > > modularity is precisely designed to let the integrator change that
> > > easily.
> > >
> > > >
> > > > > Note that other common use-cases such as handling of synonyms and
> > > > > stemming will be a simple matter of configuration with the Compass
> > > > > backend.
> > > >
> > > > Okay, I can define synonyms for "car", e.g. "vehicule", and when I
> > > > search for "car", Compass automatically also searches for "vehicule"?
> > >
> > > That's the idea, yes. It seems that Compass does it both at indexing
> > > and querying times. See the Compass reference manual, 5.3.3 at
> > > http://www.opensymphony.com/compass
> >
> > Okay thanks. I will define an XML-file with the word and its 
> > synonyms and point Compass to this file, right? Where can I 
> > find the structure of such a synonym-file?
> > I'll need to write a script to transcribe the current synonym-file 
> > into the new format...
> 
> I haven't investigated this deeply yet, but it seems that you'd 
> register a Java class in Compass's configuration file, and that this 
> class would act as a synonym provider.
Okay, that's fine. I can write a class that reads our existing 
synonym dictionary, so I don't have to convert the dictionary itself.
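
Roughly what I have in mind, sketched in Python to keep it short (the real provider would be a Java class registered with Compass). The class name, its methods and the one-group-per-line dictionary format are my own assumptions, not Compass's actual interface.

```python
class DictionarySynonymProvider:
    """Looks up synonyms directly in an existing dictionary.

    Hypothetical class: it does not implement any real Compass interface.
    """

    def __init__(self, lines):
        # Assumed legacy format: one synonym group per line, comma-separated.
        self._groups = {}
        for line in lines:
            terms = [t.strip().lower() for t in line.split(",") if t.strip()]
            for term in terms:
                self._groups.setdefault(term, set()).update(terms)

    def synonyms(self, term):
        """Return the synonyms of term (without the term itself), sorted."""
        key = term.strip().lower()
        return sorted(self._groups.get(key, set()) - {key})

    def expand(self, term):
        """Return the term followed by its synonyms, for building a query."""
        return [term.strip().lower()] + self.synonyms(term)


provider = DictionarySynonymProvider(["car, vehicule, automobile", "boat, ship"])
car_terms = provider.expand("car")
```

The point of the design is that the dictionary stays in its current format and only the lookup code knows how to read it.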


Another thing:
Georges Racinet <[EMAIL PROTECTED]> wrote on 19.04.2007 15:31:33:
> 
> On Apr 19, 2007, at 3:10 PM, [EMAIL PROTECTED] wrote:
> 
> >
> > hello
> >
> > - does Nuxeo correct typing errors when it builds the search index? 
> > we have documents from OCR with spelling errors which should be 
> > corrected automatically.
> > - can Nuxeo check the search query for typing errors? so when I 
> > enter "vehiciule", Nuxeo for example says "Did you mean 'vehicule'?"
> 
> As far as I know, it has not been planned, but these are indeed 
> interesting use cases. Thanks for mentioning them!
Lucene has such a feature, but I think it works only at search time, 
not while Lucene indexes a new file.
<http://today.java.net/pub/a/today/2005/08/09/didyoumean.html?page=1>
<http://wiki.apache.org/jakarta-lucene/SpellChecker>
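
To illustrate the "Did you mean" idea at query time, here is a small Python sketch. Note that it uses the similarity ratio from Python's standard difflib module, which is not the technique the Lucene spell checker uses; the did_you_mean function, the term list and the 0.8 cutoff are my own assumptions.

```python
import difflib


def did_you_mean(query_term, index_terms, cutoff=0.8):
    """Suggest the closest indexed term, or None if no suggestion applies."""
    if query_term in index_terms:
        return None  # the term exists in the index, nothing to correct
    matches = difflib.get_close_matches(query_term, index_terms, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Stand-in for the list of terms already present in the search index.
index_terms = ["vehicule", "vehicle", "car", "foam"]
suggestion = did_you_mean("vehiciule", index_terms)
```

Because the suggestions come from the indexed terms, this only helps with typos in the query, not with OCR errors that are already in the index.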

And in addition to that, there is the fuzzy search operator. E.g. 'roam~' 
also finds 'foam' or 'roams'.
<http://lucene.apache.org/java/docs/queryparsersyntax.html#Fuzzy%20Searches>
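
According to the page above, fuzzy searches are based on the Levenshtein (edit) distance. A minimal Python sketch of that idea follows; the fixed max_distance threshold and the fuzzy_matches helper are simplifying assumptions of mine and not how Lucene scores or selects fuzzy matches internally.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(term, index_terms, max_distance=1):
    """Return the indexed terms within max_distance edits of term."""
    return [t for t in index_terms if edit_distance(term, t) <= max_distance]


hits = fuzzy_matches("roam", ["roam", "foam", "roams", "room", "rome", "boat"])
```

Both 'foam' and 'roams' are one edit away from 'roam', which is why a fuzzy query for 'roam~' can find them.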

Are these two features also implemented in Nuxeo, when it uses Compass?


Regards
Benedikt Köppel
_______________________________________________
ECM mailing list
[email protected]
http://lists.nuxeo.com/mailman/listinfo/ecm
