RE: Performance impact of searching across multiple fields

2015-07-28 Thread Uwe Schindler
It depends on the number of fields. If you search on 3 fields it is not likely to be a problem (the general use case 3 fields: plain, stemmed, folded). But if you have like 50 fields, the slow down is likely very large! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http

RE: Lucene segment selection strategy

2015-07-18 Thread Uwe Schindler
tch?v=mARACndILQc Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mailto:luc...@mikemccandless.com] > Sent: Saturday, July 18, 2015 12:43 AM > To: Lucene Users; coone

RE: Classpath issue

2015-07-12 Thread Uwe Schindler
the recommendation is to not use such tools unless you know how to correctly use them (so they preserve and merge the META-INF/services folder inside the JAR file). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Origi

RE: Upgrading Lucene 4 index to 5 doesn't update it - for just some indices

2015-07-06 Thread Uwe Schindler
Hi, > On Mon, Jul 6, 2015 at 4:32 PM, Uwe Schindler wrote: > > Hi, > > > > It could be the reason for this is your classpath: > > > > If you load all Lucene Versions into the same classloader (but with > > different > package names - I assume you us

RE: Upgrading Lucene 4 index to 5 doesn't update it - for just some indices

2015-07-05 Thread Uwe Schindler
Lucene 5 ships with "modified" versions of the old Lucene 4 codecs - but they are not identical. You can only work around this by loading the Lucene JARs into completely different classloaders (don't forget to also set the context classloader!). In that case you would not even need to ch

Re: Deleting document form index by id

2015-06-27 Thread Uwe Schindler
l commands, e-mail: java-user-h...@lucene.apache.org >> > >- >To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >For additional commands, e-mail: java-user-h...@lucene.apache.org -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de

RE: IndexFormatTooOldException while upgrading Lucene 4.10 index to 5.2

2015-06-16 Thread Uwe Schindler
that has default codec). After that you can read the 4.10 index with 5.0 using lucene-backward-codecs.jar in classpath. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [ma

RE: IndexFormatTooOldException while upgrading Lucene 4.10 index to 5.2

2015-06-16 Thread Uwe Schindler
Hi, you need to add the JAR file lucene-backward-codecs.jar to the classpath (or add it via Maven). It contains the codecs to read pre-5.0 versions. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- >

RE: Migrating from Lucene 4.10.3 to Lucene 5.10

2015-05-21 Thread Uwe Schindler
er used in Lucene. It was there to forcefully remove a lock, which is a bad idea. The only available method is, as said before, Directory#makeLock that should return a Lock instance. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@t

RE: BytesRef violates the principle of least astonishment

2015-05-20 Thread Uwe Schindler
Hi, > > BytesRef#clone()'s javadoc comment says that the result will be a > > shallow clone, sharing the backing array with the original instance, > > and points to another utility method for deep cloning: > BytesRef#deepCopyOf(BytesRef). > > There is no hard contract for clone > https://docs.or

RE: Java8 and lucene version

2015-05-07 Thread Uwe Schindler
va 8, but you should in any case test it. One example > that uses Lucene 3.3.0 (a bit newer, based on Java 5) is "Atlassian JIRA". > This > one works perfectly fine with Java 8! > > Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen >

RE: Java8 and lucene version

2015-05-07 Thread Uwe Schindler
it newer, based on Java 5) is "Atlassian JIRA". This one works perfectly fine with Java 8! Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Erick Erickson [mailto:erickerick...@gmail

RE: getting exception in lucene 4.0

2015-04-30 Thread Uwe Schindler
) was only searching in context class loader for metadata. Lucene 4.3 fixes this issue by also searching for metadata in the classloader that originally loaded the lucene jars: https://issues.apache.org/jira/browse/LUCENE-4713 Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen

RE: StandardTokenizerImpl generation question

2015-04-25 Thread Uwe Schindler
is the one you want to have. It is easy to do if you use the corresponding ANT task for regenerating ("ant jflex" or "ant regenerate"). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message

RE: StandardTokenizerImpl generation question

2015-04-25 Thread Uwe Schindler
e correct version of JFlex and regenerate the scripts. Currently this is version 1.6.0. The warning was primarily there because of older versions of JFlex that used the Unicode version of the JVM, which is no longer the case. Could you maybe open an issue to update the documentation in the fi

RE: CachingTokenFilter tests fail when using MockTokenizer

2015-03-23 Thread Uwe Schindler
TokenFilter does not implement reset() correctly, so the whole thing cannot be reused in analyzers. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Spyros Kapnissis [mailto:ska...@yahoo.com.I

ApacheCon NA 2015 in Austin, Texas

2015-03-19 Thread Uwe Schindler
u have any questions, comments, or just want to hang out with us before and during the event, follow us on Twitter - @apachecon - or drop by #apachecon on the Freenode IRC network. Hope to see you in Austin! - Uwe Schindler uschind...@apache.org Apache Lucene PMC Member / Committer

RE: Filtering question

2015-03-16 Thread Uwe Schindler
Hi, http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/search/FieldCacheDocIdSet.html You just have to implement the "protected boolean matchDoc(int docId)" method. You should return this DocIdSet from your filter instead of the manual code you created. Uwe - Uwe Schi

RE: Filtering question

2015-03-12 Thread Uwe Schindler
Hi Chris, > Hi Uwe, thanks for your suggestions. I have tried a couple of things with no > luck yet: > > > Sorry, > > I just noticed, you are using TermFilter not TermsFilter: This one > > does not support random access (using bits()). Because of this the > > filtered docs cannot be passed down

RE: Filtering question

2015-03-11 Thread Uwe Schindler
the same field name for both. > I must add that a full reindex all in one go is currently not an option, so > the > solution must support this mixed mode. Uwe > Any thoughts on how this could be best achieved ..? > > Thanks > > Chris > > Sent from my iPhone >

RE: Lucene MMapDirectory: Mapping failure

2015-03-11 Thread Uwe Schindler
a while until also the address space gets exhausted. Just a few additional questions: How many indexes are you opening at the same time? What is the approx. size? What Lucene version are you using? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u

RE: Filtering question

2015-03-11 Thread Uwe Schindler
efore (use ConstantScoreQuery). Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Wednesday, March 11, 2015 8:07 PM > To: java-user@lucene.apa

RE: Filtering question

2015-03-11 Thread Uwe Schindler
the TermsFilter on the top level IndexSearcher (which internally rewrites to FilteredQuery(query, filter)), the documents matching the TermsFilter will be applied as acceptDocs by your BooleanQuery, which will pass it also down to the MyNDVFilter. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D

RE: Looking for FieldCacheRangeFilter pendant in 5.0.0

2015-02-26 Thread Uwe Schindler
misc module. It "emulates" DocValues for fields that don't have them by uninverting like FieldCache did before (http://lucene.apache.org/core/5_0_0/misc/org/apache/lucene/uninverting/UninvertingReader.html). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.theta

RE: Searching for DateRangeField in Lucene 5.0.0

2015-02-25 Thread Uwe Schindler
NumericField, it is a bit more complicated. You have to understand how the spatial module works. We are sorry that the release notes were not precise enough. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- >

RE: Lucene 4.x -> 5 : IllegalStateException while sorting

2015-02-23 Thread Uwe Schindler
Hi, Solr uses DocValues and falls back to wrapping with UninvertingReader if the user has not indexed them (with negative startup performance and memory effects). But in general, you should really enable DocValues for fields you want to sort on. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D

RE: write.lock is not removed

2015-02-23 Thread Uwe Schindler
, but the file stays alive. This was different in Lucene 3, where the lock was also removed. But this caused problems because of non-atomic file operations. Lock factory only checks the "lock on the file", not "the existence of the file as a lock". Uwe - Uwe Schindler H.-

RE: Lucene 5 : createComponents without reader

2015-02-23 Thread Uwe Schindler
e the org.apache.lucene.analysis package documentation for more details." Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Clemens Wyss DEV [mailto:clemens...@mysign.ch] > Sent: Monday, February 23

RE: [ANNOUNCE] Apache Lucene 5.0.0 released

2015-02-20 Thread Uwe Schindler
re is a section following). Maybe fix this on the web page, for the mail it's too late. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Anshum Gupta [mailto:ans...@anshumgupta.net] >

RE: A codec moment or pickle

2015-02-12 Thread Uwe Schindler
would not help you, too - you need the default Codec be available at the time your custom codec is loaded... Same issue, no idea how to solve this. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > F

RE: A codec moment or pickle

2015-02-12 Thread Uwe Schindler
Maybe try it out, was just an idea :-) Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Benson Margulies [mailto:bimargul...@gmail.com] > Sent: Thursday, February 12, 2015 2:11

RE: re-mapping lucene index

2015-02-10 Thread Uwe Schindler
ing a new index. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Vijay B [mailto:vijay.nip...@gmail.com] > Sent: Tuesday, February 10, 2015 8:38 PM > To: java-user@lucene.apache.org >

RE: Lucene search in attachments

2015-02-10 Thread Uwe Schindler
done at all, in that case the whole field is indexed as a single term - but that’s not useful for searching in full text anyways. So use a suitable analyzer! > Is that right? Yes! Uwe > Best Regards, > Sreedevi S > > On Tue, Feb 10, 2015 at 2:45 PM, Uwe Schindler wrote: > >

RE: Indexing and searching a DateTime range

2015-02-10 Thread Uwe Schindler
Hi, > OK. I found the Alfresco code on GitHub. So it's open source it seems. > > And I found the DateTimeAnalyser, so I will just take that code as a starting > point: > https://github.com/lsbueno/alfresco/tree/master/root/projects/repository/ > source/java/org/alfresco/repo/search/impl/lucene/an

RE: Lucene search in attachments

2015-02-10 Thread Uwe Schindler
://goo.gl/SRf45A If you have a limitation to 10,000 characters somewhere, it might be your TIKA text extraction. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: sreedevi s [mailto:sreedevi

RE: Indexing and searching a DateTime range

2015-02-09 Thread Uwe Schindler
Hi, > I am in the beginning of implementing a Lucene application which would > supposedly search through some log files. > > One of the requirements is to return results between a time range. Let's say > these are two lines in a series of log files: > 2015-02-08 00:02:06.852Z INFO... > ... > 2015

RE: Lucene query behavior using NOT

2015-02-08 Thread Uwe Schindler
SHOULD be in results (at least one occurrence in total), or MUST_NOT. The Operators are applied to the clauses (terms) in Lucene. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Ian Koe

RE: MMapDirectory or FSDirectory

2015-02-05 Thread Uwe Schindler
ead. So please: Use MMapDirectory where possible - this is completely unrelated to how much RAM you have available! Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: sreedevi s [mailto:sreed

RE: Lucene Boolean Query Minimization

2015-02-02 Thread Uwe Schindler
rks.com/blog/why-not-and-or-and-not/ Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Ralf Heyde [mailto:ralf.he...@gmx.de] > Sent: Monday, February 02, 2015 4:28 PM > To: java-u

RE: Lucene Boolean Query Minimization

2015-02-02 Thread Uwe Schindler
h boosts multiplied). But if you have many Boolean clauses, how could those be optimized? If you have multiple term queries as clauses with identical terms, one could optimize that by merging those clauses and adding their boosts, but this is not done automatically. Uwe - Uwe Schindler H.-H

RE: issue with IndexUpgrader

2015-01-29 Thread Uwe Schindler
nd replace the index, as you suggested. You might open a bug report for Lucene so we can fix this interesting corner case in IndexUpgrader, and we can fix it in later versions, but we cannot do this in 3.x anymore, because the development has ended there. Uwe - Uwe Schindler H.-H.-Meier-A

RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread Uwe Schindler
ds of Elasticsearch and Lucene." (http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_don_8217_t_touch_these_settings.html) In fact, the problems with G1GC can sometimes lead to index corruption, and are hard to reproduce. So better don't use... Uwe - Uwe Schindler H.

RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)

2015-01-27 Thread Uwe Schindler
Java 8 update 20 or later is also fine. At the current time, always use the latest update release and you will be fine with Java 7 and Java 8. Don't use older releases and don't use the G1 Garbage Collector. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.d

RE: Absolute term position in scoring

2015-01-26 Thread Uwe Schindler
/elasticsearch/guide/current/multi-field-search.html Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Alexey Morozov [mailto:moro...@gmail.com] > Sent: Monday, January 26, 2015 11:49 AM > T

RE: Upgrade Lucene to latest version (4.0) from 2.4.0

2015-01-22 Thread Uwe Schindler
looks like you prevented IndexWriter from merging at all. This of course would make queries slow, because you have way too many index segments (CFS files). If you upgrade from very old Lucene versions, it is wise to remove any customizations and start with plain default settings. Uwe - Uwe

RE: forceMerge(1) grows index and does not shrink back

2015-01-19 Thread Uwe Schindler
Hi, > we use 4.8.1. We know that the javadoc advises against it. Like I wrote, the > deletion of old documents (that appear during an update) would be done > while closing the writer. This is not true. The merge policy continuously merges segments that contain deletions. The problem you might ha

RE: Custom tokenizer

2015-01-12 Thread Uwe Schindler
ommon cases, but whenever you have custom requirements, you have to define your Analyzer *completely* yourself. This is also what Solr and Elasticsearch users do in their config files. Uwe > Thank you. > > On Mon, Jan 12, 2015 at 1:36 PM, Uwe Schindler wrote: > > > Hi, > > >

RE: Custom tokenizer

2015-01-12 Thread Uwe Schindler
, because those are instantiated before the TokenFilters which depend on them, so changing the Tokenizer afterwards is impossible. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Vihari P

RE: Upgrading Lucene from 3.5 to 4.10 - how to handle Java API changes

2015-01-11 Thread Uwe Schindler
for a specific field is done above – this one is more to execute Querys and so on): <http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/index/IndexReader.html#getTermVectors(int)> Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen <http://www.thetaphi.de/> http

RE: cloning a NumericTermAttributeImpl

2015-01-10 Thread Uwe Schindler
Hi, I checked it out a second time. We *can* implement deep clone. Actually this is a bug from the time when we changed to BytesRefBuilder. I opened https://issues.apache.org/jira/browse/LUCENE-6173 about this. Thanks. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http

RE: cloning a NumericTermAttributeImpl

2015-01-10 Thread Uwe Schindler
"standard" Analyzers you would not see those TokenStreams. Lucene also never adds TokenFilters on top. Let me check if this TokenStream is really marked as @lucene.internal in Javadocs. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@t

RE: Looking for docs that have certain fields empty (an/or not set)

2015-01-07 Thread Uwe Schindler
ument/abb73b45a48cb89e Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Jack Krupansky [mailto:jack.krupan...@gmail.com] > Sent: Wednesday, January 07, 2015 10:17 PM > To: java-user@lucene.apache.

RE: IndexSearcher.setSimilarity thread-safety

2015-01-05 Thread Uwe Schindler
y create a new IndexSearcher instance for every search request (...and I always do this). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Barry Coughlan [mailto:b.coughl...@gmail.com] > Sent:

RE: manually merging Directories

2014-12-30 Thread Uwe Schindler
In addition, use NoMergePolicy to prevent automatic merging once the segments were added. :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent

RE: manually merging Directories

2014-12-30 Thread Uwe Schindler
ll files, that were not copied unmodified, keep alive in the source directory, but all those that are copied as-is will move and disappear from source directory. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original M

RE: manually merging Directories

2014-12-29 Thread Uwe Schindler
sible, because the files reference each other by segment name (segments_n refers to them, also the segment ids are used all over). So you would need to change some index files already for merge to make the SegmentInfos structures use the correct names, so you can do a real merge anyways. Uwe ----

RE: BTRFS ?

2014-12-23 Thread Uwe Schindler
Hi Dawid, Unfortunately, for that to work, Solr needs to solely use NIO.2, too. Only Lucene disallows java.io.File and related classes, Solr is excluded from this forbidden-check. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

RE: BTRFS ?

2014-12-22 Thread Uwe Schindler
rick here is to clone the file and its inode, but keep the blocks the same (only when one writes to the file, it clones the block). This could speed up tests, especially Solr where some dirs are copied over and over for every test case. :-) Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28

RE: BTRFS ?

2014-12-22 Thread Uwe Schindler
Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Dawid Weiss [mailto:dawid.we...@gmail.com] > Sent: Monday, December 22, 2014 8:48 AM > To: java-user@lucene.apache.org > Cc: Uwe Schindler

RE: Index corruption with lucene 3.0.3

2014-12-17 Thread Uwe Schindler
file it will also not work. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de <http://www.thetaphi.de/> eMail: u...@thetaphi.de From: Shlomit Rosen [mailto:shlom...@il.ibm.com] Sent: Wednesday, December 17, 2014 8:04 PM To: Subject: Index corr

RE: unsafe memory access operation

2014-12-12 Thread Uwe Schindler
ion. > -- > on a side note,not related to this, could you please reply to my comments on > your blog for MMap directory post. What was your question? Uwe > On Fri, Dec 12, 2014 at 12:59 PM, Uwe Schindler wrote: > > > > Hi, > > > > I noticed, you

RE: unsafe memory access operation

2014-12-12 Thread Uwe Schindler
ere NFS is a problem with MMap, because whenever the connection to the NFS server drops, it may crash the whole JVM or produce strange errors. This could also be the reason for the error you see here. I would recommend updating to 1.7.0_72 or 1.8.0_25. Uwe - Uwe Schindler H.-H.-Meier-A

RE: unsafe memory access operation

2014-12-12 Thread Uwe Schindler
Hi, Which Java 7 build number / update level are you using? Those errors occur easily if you use an outdated JDK version. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Vijay B [mailto:vij

RE: how to "load" mmap directory into memory?

2014-12-03 Thread Uwe Schindler
with SearcherManager in Lucene to warm searchers. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Li Li [mailto:fancye...@gmail.com] > Sent: Wednesday, December 03, 2014 3:46 AM > T

RE: Reminder: FOSDEM 2015 - Open Source Search Dev Room

2014-12-03 Thread Uwe Schindler
Hello everyone, We have extended the deadline for submissions to the FOSDEM 2015 Open Source Search Dev Room to Monday, 9 December at 23:59 CET. We are looking forward to your talk proposal! Cheers, Uwe - Uwe Schindler uschind...@apache.org Apache Lucene PMC Member / Committer Bremen

RE: lucene query with additional clause field not null

2014-12-01 Thread Uwe Schindler
Hi, Use FieldValueFilter for that: http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/search/FieldValueFilter.html If you need a query instead of a Filter, wrap it with ConstantScoreQuery. This is also much faster than a RangeQuery like suggested by Ahmed. Uwe - Uwe Schindler H

Reminder: FOSDEM 2015 - Open Source Search Dev Room

2014-11-24 Thread Uwe Schindler
rs, LH on behalf of the Open Source Search Dev Room Program Committee* * Boaz Leskes, Isabel Drost-Fromm, Leslie Hawthorn, Ted Dunning, Torsten Curdt, Uwe Schindler - Uwe Schindler uschind...@apache.org Apache Lucene PMC Member / Committer Bremen, Germany http:

RE: Indexing Error

2014-11-22 Thread Uwe Schindler
The old constants are still working in later versions. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Robert Nikander [mailto:rob.nikan...@gmail.com] > Sent: Saturday, November 22, 2014 8:3

RE: Indexing Error

2014-11-22 Thread Uwe Schindler
Hi, This generally happens, if you have an older version of Lucene somewhere in your classpath. E.g., if older Lucene was placed outside of the webapp somewhere in the classpath of Websphere itself. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u

RE: Iterating TermsEnum for Long field produces zero values at the end

2014-11-17 Thread Uwe Schindler
Hi, > It is expected: those are the "prefix" terms, which come after all the full- > precision numeric terms. > > But I'm not sure why you see 0s ... the bytes should be unique for every term > you get back from the TermsEnum. That's easy to explain: The lower precision terms at the end have mo

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Uwe Schindler
ter case insensitive (there is a boolean to do this): StopFilter(boolean enablePositionIncrements, TokenStream input, Set stopWords, boolean ignoreCase) Uwe > Martin O'Shea. > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: 10 Nov 2014 1

RE: How to disable LowerCaseFilter when using SnowballAnalyzer in Lucene 3.0.2

2014-11-10 Thread Uwe Schindler
and add your "configuration" there. If you use Apache Solr or Elasticsearch you can create your analyzers by XML or JSON configuration. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From:

RE: Caused by: java.lang.OutOfMemoryError: Map failed

2014-11-07 Thread Uwe Schindler
Hi, > That error can also be thrown when the number of open files exceeds the > given limit. "OutOfMemory" should really have been named > "OutOfResources". This was changed already. Lucene no longer prints OOM (it removes the OOM from stack trace). It also adds useful information. So I think th

Re: Lucene spatial for grid clusters

2014-11-05 Thread Uwe Schindler
use case? Any suggestions on the best/most >efficient way to achieve? > >Thanks! -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de

RE: Effectiveness MMapDirectory on NFS Mounted indexes

2014-11-05 Thread Uwe Schindler
ation fault (crash) of your Java VM. Otherwise there is no real difference between the directory implementations. Writing and locking the index is the main problem (see above). Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -

RE: Dangerous reflection access to sun.misc.Cleaner by class org.apache.lucene.store.MMapDirectory$MMapIndexInput$1 detected!

2014-11-03 Thread Uwe Schindler
Hi Jean-Claude, > We have a NetBeans RCP application using Lucene 4.7 for indexing and > searching. > > When indexing, the following message is displayed: > > Dangerous reflection access to sun.misc.Cleaner by class > org.apache.lucene.store.MMapDirectory$MMapIndexInput$1 detected! This message

RE: FOSDEM 2015 - Open Source Search Dev Room

2014-11-03 Thread Uwe Schindler
Hi, forgot to mention: FOSDEM 2015 takes place in Brussels on January 31st and February 1st, 2015. See also: https://fosdem.org/2015/ I hope to see you there! Uwe > -Original Message- > From: Uwe Schindler [mailto:uschind...@apache.org] > Sent: Monday, November 03, 2014 1:29 P

CFP: FOSDEM 2015 - Open Source Search Dev Room

2014-11-03 Thread Uwe Schindler
zers: opensourcesearch-devr...@lists.fosdem.org Cheers, LH on behalf of the Open Source Search Dev Room Program Committee* * Boaz Leskes, Isabel Drost-Fromm, Leslie Hawthorn, Ted Dunning, Torsten Curdt, Uwe Schindler ----- Uwe Schindler uschind...@apache.org Apache Lucene PMC Member / Committer Br

RE: MyAnalyzer and Lucene version <= 4.9.1

2014-10-28 Thread Uwe Schindler
Hi, You have to implement createComponents(). The old way of Lucene 3 no longer works because Analyzers have to provide reusable TokenStreams. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- >

RE: Search "_all" field with a term

2014-10-11 Thread Uwe Schindler
urself. An alternative would be to > use another query parser (like the SimpleQueryParser, see > http://goo.gl/4blGsp) that allows to expand the query to search on multiple > fields with different weight factors. > > > > Uwe > > > > - > > Uwe Schindle

RE: Search "_all" field with a term

2014-10-11 Thread Uwe Schindler
An alternative would be to use another query parser (like the SimpleQueryParser, see http://goo.gl/4blGsp) that allows to expand the query to search on multiple fields with different weight factors. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMa

RE: Search with term intersection

2014-10-10 Thread Uwe Schindler
2.9, but this is no longer the case - every segment is handled on its own. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: aurelien.mazo...@francelabs.com > [mailto:aurelien.mazo...@franc

RE: query.extractTerms(..) on rewritten queries

2014-10-06 Thread Uwe Schindler
rewriting and set it on the MultiTermQuery variant and rewrite it to collect the terms. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Christian Reuschling [mailto:reuschl...@dfki.uni-kl.de] > Sen

RE: Optimum Lucene’s MMapDirectory size on 64bit OS

2014-09-27 Thread Uwe Schindler
s space, but why do you think a "higher" address space is better for a larger index? Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Gaurav gupta [mailto:gupta.gaurav0...@gmail.com] >

RE: Optimum Lucene’s MMapDirectory size on 64bit OS

2014-09-26 Thread Uwe Schindler
Hi, 1 GiB is the maximum possible. The chunk size is only applicable for 32 bit JDKs because of limited address space. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Gaurav

RE: Issues with lucene 4.10.0 on android

2014-09-26 Thread Uwe Schindler
Hi > Background: My company has built an Android application that utilizes > Lucene 4.7.0 to index and search upon a fairly static set of about 100,000 > documents. We have used numerous versions of Lucene over the years, and > they have all worked well to accomplish this purpose. > > Issue: Upo

RE: getting exception while deploying on axis 2

2014-09-25 Thread Uwe Schindler
acked together as an .aar file (which is a special JAR, too). Maybe this step fails to add the META-INF/services folder. Sorry, I cannot help more here. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Messag

RE: getting exception while deploying on axis 2

2014-09-25 Thread Uwe Schindler
you have to use them as-is. If you for example unpack them and use the .class files in them directly or if you repackage them into a so called "uber-jar" (one JAR containing everything), the additional metadata often gets lost. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 B

RE: getting exception while deploying on axis 2

2014-09-24 Thread Uwe Schindler
n and its resource transformers. Uwe ----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Rajendra Rao [mailto:rajendra@launchship.com] > Sent: Wednesday, September 24, 2014 10:03 AM >

RE: How to use 'PhraseQuery' with Fuzzy?!

2014-09-23 Thread Uwe Schindler
re, that SpanQueries are expensive and may take a lot of time! Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: teko [mailto:tec...@gmail.com] > Sent: Monday, September 22, 2014 8:45 PM >

Re: How to configure lucene 4.x to read 3.x index files

2014-09-23 Thread Uwe Schindler
if someone could point out the right direction. > >Regards, >Patrick -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de

RE: Can lucene index tokenized files?

2014-09-14 Thread Uwe Schindler
(name, yourTokenStream)". Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Sachin Kulkarni [mailto:kulk...@hawk.iit.edu] > Sent: Sunday, September 14, 2014 10:06 PM > To: java-

RE: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Uwe Schindler
/IndexWriter to think it was created in 3.x. TestBackwards compatibility did not find that bug, because the backwards index in the tests directory was created with the Alpha version :( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

RE: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Uwe Schindler
If you want to upgrade the index, you may try to run IndexUpgrader on Lucene 4.9, to have it up to date. But Index upgrading may fail because of the BETA-Status of the original creator. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

RE: 4.10.0: java.lang.IllegalStateException: cannot write 3x SegmentInfo unless codec is Lucene3x (got: Lucene40)

2014-09-10 Thread Uwe Schindler
In Lucene 4.10 we changed version handling and parsing version numbers a bit, so this may be the cause for the error. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Ian Lea [mailto:ian@g

RE: KeywordAnalyzer still getting tokenized on spaces

2014-09-09 Thread Uwe Schindler
es the approach you should use, if you don't need "syntax". Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: atawfik [mailto:contact.txl...@gmail.com] > Sent: Tuesday, S

RE: IOExceptions during search

2014-09-08 Thread Uwe Schindler
whole issue are those stale file handles, which makes commits in Lucene unreliable. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Shlomit Rosen [mailto:shlom...@il.ibm.com] > Sent: Monday,

RE: indexing json

2014-09-04 Thread Uwe Schindler
Elasticsearch works perfectly fine with one node, also embedded :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Larry White [mailto:ljw1...@gmail.com] > Sent: Thursday, September 04, 201

RE: BlockTreeTermsReader consumes crazy amount of memory

2014-08-28 Thread Uwe Schindler
completely outside DirectoryReader on the older commit point opens all segments on its own. Maybe a solution would be to extend IndexWriter.open() to also take a commit point with IndexWriter. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u

RE: Speed up searching on index created using JdbcDirectory

2014-08-23 Thread Uwe Schindler
Hi, there is no need to have an index in a relational database. Lucene indexes are commonly stored as files on local disks. Use FSDirectory subclasses to do this! To help with the performance problem, you should give us more details. Uwe - Uwe Schindler H.-H.-Meier-Allee 63
