Re: Semi-structured queries

2012-12-07 Thread lukai
Wrap your own parser, e.g. org/apache/lucene/queryparser/classic/QueryParser.jj. On Fri, Dec 7, 2012 at 1:47 PM, Wu, Stephen T., Ph.D. wrote: > I’ve been trying to do semi-structured queries & query parsing. In other > words, you could have XML snippets mixed in with plain terms, e.g. a query

Semi-structured queries

2012-12-07 Thread Wu, Stephen T., Ph.D.
I’ve been trying to do semi-structured queries & query parsing. In other words, you could have XML snippets mixed in with plain terms, e.g. a query like: christmas tree where you’re looking for a document with the terms “christmas” “tree” but also some structured data about where (pract

Re: Lucene 4.0.0 - find term position.

2012-12-07 Thread Adrien Grand
Hi Vitaly, On Fri, Dec 7, 2012 at 3:24 PM, wrote: > I try to use or Terms tfvector = reader.getTermVector(docId, "contents"); > or Fields fields = reader.getTermVectors(docId); > but I get null from these calls. > What is wrong? These methods will always return null unless you turn term vect

RE: Lucene 4.0.0 - find term position.

2012-12-07 Thread Vitaly_Artemov
I try to use or Terms tfvector = reader.getTermVector(docId, "contents"); or Fields fields = reader.getTermVectors(docId); but I get null from these calls. What is wrong? -Original Message- From: lukai [mailto:lukai1...@gmail.com] Sent: Friday, December 07, 2012 2:50 AM To: java-user@l

RE: Alternative for WildcardQuery with leading *

2012-12-07 Thread Oliver Christ
If I remember correctly it was Baeza-Yates or someone in his group at U Santiago who came up with the rotated term indexing. Indexing "abc", you explicitly mark end of string and index all rotations using a data structure which supports prefix search (such as a trie): abc$ bc$a c$ab $abc This
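The rotation scheme described above can be sketched in plain Java. This is only an illustration, not Lucene code: the class `RotatedTermIndex` and its methods are hypothetical names, a `TreeSet` stands in for the trie, and `$` is assumed never to occur inside a term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper (not a Lucene class): rotated-term index on a sorted set. */
class RotatedTermIndex {
    // End-of-string marker; assumed not to appear in any indexed term.
    private static final char END = '$';

    // All rotations of all indexed terms, kept sorted so prefix scans are cheap.
    private final TreeSet<String> rotations = new TreeSet<>();

    /** Returns every rotation of term + END, e.g. abc -> abc$, bc$a, c$ab, $abc. */
    static List<String> rotationsOf(String term) {
        String marked = term + END;
        List<String> out = new ArrayList<>();
        for (int i = 0; i < marked.length(); i++) {
            out.add(marked.substring(i) + marked.substring(0, i));
        }
        return out;
    }

    void add(String term) {
        rotations.addAll(rotationsOf(term));
    }

    /**
     * Finds terms containing the given infix by scanning rotations that start
     * with it. Pass "plan$" instead of "plan" to match only terms ending in "plan".
     */
    SortedSet<String> termsContaining(String infix) {
        SortedSet<String> hits = new TreeSet<>();
        for (String rot : rotations.tailSet(infix)) {
            if (!rot.startsWith(infix)) {
                break; // sorted order: no later rotation can share the prefix
            }
            int end = rot.indexOf(END);
            // Undo the rotation: the text after the marker is the term's head.
            hits.add(rot.substring(end + 1) + rot.substring(0, end));
        }
        return hits;
    }
}
```

With "vacancyplan" indexed, the rotation `plan$vacancy` begins with "plan", so a leading-wildcard query such as `*plan` becomes an ordinary prefix lookup. The cost is index size: a term of length n contributes n + 1 entries.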

Re: Using Lucene 2.3 indices with Lucene 4.0

2012-12-07 Thread Ramprakash Ramamoorthy
On Thu, Nov 29, 2012 at 4:05 AM, kiwi clive wrote: > Be aware that StandardAnalyzer changed slightly. This is particularly > important if you use it to analyze email addresses and certain text-numeral > combinations. My understanding is that the newer version of > StandardAnalyzer is more consist

RE: Alternative for WildcardQuery with leading *

2012-12-07 Thread Uwe Schindler
In general, you seem to need decomposing...: vacancyplan -> tokenized to -> vacancyplan, vacancy, plan. Wildcards are in general not really a replacement for correct text analysis on the indexing side. Unfortunately, decomposing is a hard task, but there are dictionary-based algorithms for e.g.
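As a rough illustration of dictionary-based decomposing at index time, the sketch below emits the original token plus every dictionary word found inside it. It is a deliberately naive stand-in, not the algorithm any particular Lucene filter uses; `SimpleDecompounder`, its dictionary, and the minimum subword length of 3 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, naive dictionary-based decompounder (not a Lucene class). */
class SimpleDecompounder {
    // Assumed lower bound on subword length, to avoid matching tiny fragments.
    private static final int MIN_SUBWORD_LENGTH = 3;

    private final Set<String> dictionary;

    SimpleDecompounder(Set<String> dictionary) {
        this.dictionary = new HashSet<>(dictionary);
    }

    /** Returns the token itself, followed by each dictionary word it contains. */
    List<String> decompose(String token) {
        List<String> out = new ArrayList<>();
        out.add(token);
        for (int start = 0; start < token.length(); start++) {
            for (int end = start + MIN_SUBWORD_LENGTH; end <= token.length(); end++) {
                String part = token.substring(start, end);
                // Skip the whole token (already emitted) and avoid duplicates.
                if (!part.equals(token) && dictionary.contains(part) && !out.contains(part)) {
                    out.add(part);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SimpleDecompounder d =
            new SimpleDecompounder(new HashSet<>(Arrays.asList("vacancy", "plan")));
        System.out.println(d.decompose("vacancyplan"));
    }
}
```

Indexing the parts alongside the compound lets a plain term query for "plan" find "vacancyplan" without any wildcard. The quality of the result depends entirely on the dictionary.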

AW: Alternative for WildcardQuery with leading *

2012-12-07 Thread Clemens Wyss DEV
> Really off the top of my head, if that's an expected query, > you can try to index the words backwards (in that field) and > then convert the query *plan to nalp* :). "interesting" approach ... I might give it a try :-) ... no kidding ;) -Original Message- From: Shai Erera [mailto:

Re: Alternative for WildcardQuery with leading *

2012-12-07 Thread Shai Erera
Really off the top of my head, if that's an expected query, you can try to index the words backwards (in that field) and then convert the query *plan to nalp* :). You can also index the suffixes of words, e.g. vacancyplan, acancyplan, cancyplan and so forth, and then convert the query *plan to pla
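Both ideas can be sketched without Lucene. The helper below is hypothetical (`LeadingWildcardRewrite` is not a Lucene class) and shows only the string handling: reversing terms and rewriting `*plan` to `nalp*`, and enumerating the suffixes of a term. How the suffix variant rewrites the query is cut off in the message above, so the sketch does not guess at it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Lucene class) for the two indexing tricks above. */
class LeadingWildcardRewrite {

    /** Reverses a term for the "index the words backwards" field. */
    static String reverse(String term) {
        return new StringBuilder(term).reverse().toString();
    }

    /** Rewrites a leading-wildcard query such as "*plan" to "nalp*". */
    static String toReversedPrefixQuery(String query) {
        // Only the simple case is handled: exactly one '*', at the front.
        if (!query.startsWith("*") || query.indexOf('*', 1) >= 0) {
            throw new IllegalArgumentException("expected a single leading '*': " + query);
        }
        return reverse(query.substring(1)) + "*";
    }

    /** Lists all suffixes: vacancyplan, acancyplan, cancyplan, and so on. */
    static List<String> suffixes(String term) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < term.length(); i++) {
            out.add(term.substring(i));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reverse("vacancyplan"));
        System.out.println(toReversedPrefixQuery("*plan"));
        System.out.println(suffixes("vacancyplan"));
    }
}
```

The reversed field costs one extra token per term, while suffix indexing costs one token per character, so the first is usually the cheaper choice when only leading wildcards matter.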

Alternative for WildcardQuery with leading *

2012-12-07 Thread Clemens Wyss DEV
In order to provide suggestions our query also includes a "WildcardQuery with a leading *", which, of course, has a HUGE performance impact :-( E.g. Say we have indexed "vacancyplan", then if a user typed "plan" he should also be offered "vacancyplan" ... How can this feature be implemented wit