RE: java.util.zip (was Questions about DeleteFile method)

2005-05-04 Thread Joaquin Delgado
Interesting to know that "<>, it is no surprise that the java.util.zip namespace is available to developers through the J# runtime." I wonder if some of the latest developments in Lucene-Java can be ported to C# (.NET) via the J# runtime. J.D. -Original Message- From: Chuck Williams [ma

RE: Global Analysis possible in Lucene?

2005-09-06 Thread Joaquin Delgado
I think Shane is referring to Global Context Analysis (vs. Local Context Analysis), which refers to a statistical analysis of the context (surrounding windows) around noun phrases from the index for automatic thesaurus construction and query expansion. For more info, read the SIGIR'96 paper from Xu

Re: Reordering search results

2005-10-03 Thread Joaquin Delgado
Chris, you may consider using a modified version of the Nutch analysis (http://lucene.apache.org/nutch/apidocs/org/apache/nutch/analysis/package-summary.html), which has a very slick treatment of stopwords. Please refer to chapter 4, page 145 of Lucene in Action, written by Erik and Otis, for s

Re: Similarity between Terms

2005-10-18 Thread Joaquin Delgado
Sebastian, There is no simple way of calculating similarity between terms in Lucene. Normally documents are represented in the Vector Space Model (VSM), where some weight is associated with each unique term in the document (e.g. term frequency or number of times a term occurs with

Re: "Advanced" query language

2005-12-15 Thread JOAQUIN . DELGADO
Mark, This is very cool. When I was at TripleHop we did something very similar where both query and results conformed to an XML Schema and we used XML over HTTP as our main vehicle to do remote/federated searches with quick rendering with stylesheets. That however is the first piece of the puz

Re: "Advanced" query language

2005-12-17 Thread JOAQUIN . DELGADO
Paul and Wolfgang, Thank you very much for your input. I think there are two distinct problems that have emerged from this thread: 1) The ability to create efficient structures to index and query XML documents (elements, attributes and corresponding values) with a full-text query language and pe

Re: "Advanced" query language

2005-12-19 Thread Joaquin Delgado
Comments in-line Wolfgang Hoschek wrote: Yes, there are interesting impls out there. I've myself implemented XQuery fulltext search via extension functions built on Lucene. See http://dsd.lbl.gov/nux/index.html#Google-like%20realtime%20fulltext%20search%20via%20Apache%20Lucene%20engine H

Re: "Advanced" query language

2005-12-20 Thread JOAQUIN . DELGADO
Allows all of the Lucene query functionality to be exposed c) Is a real requirement for enough Lucene users I'm just not sure that any/all of these conditions are true. Maybe there needs to be a separate "interoperability" language development? Cheers Mark --- Joaquin Del

Re: Lucene 1.2 - scoring formula needed

2006-09-10 Thread Joaquin Delgado
oiting Bayes' theorem. Both the Vector Space Model and the Probabilistic Model are well studied in the Information Retrieval literature. See http://www2.sims.berkeley.edu/courses/is202/f00/lectures/Lecture8_202.ppt for an overview of Ranking and Feedback. -- Joaquin Delgado Karl Koch wrote: Hi,

Re: Attached proposed modifications to Lucene 2.0 to support Field.Store.Encrypted

2006-12-01 Thread JOAQUIN DELGADO
Security should be the responsibility of the application. However, let's make it clear that field-level encryption is more a "means of" implementing security and therefore an infrastructure functionality that, in my opinion, Lucene should optionally provide. In the same way relational databases provide