date:20150111

Custom tokenizer

2015-01-11 Thread Vihari Piratla

Hi, I am trying to implement a custom tokenizer for my application and I have a few questions about it. 1. Is there a way to give an existing analyzer (say, EnglishAnalyzer) a custom tokenizer so that it uses this tokenizer instead of, say, StandardTokenizer? 2. Why are analyzers such as S

Re: SegmentCommitInfos and live/deleted files

2015-01-11 Thread Varun Thacker

Thanks Robert for pointing out the difference. On Sun, Jan 11, 2015 at 10:29 PM, Robert Muir wrote: > files are either per-segment or per-commit. > > the first only returns per-segment files. this means it won't include > any per-commit files: > * segments_N itself > * generational .liv for dele

Re: Upgrading Lucene from 3.5 to 4.10 - how to handle Java API changes

2015-01-11 Thread Martin Wunderlich

Hi Uwe, Thanks a lot for the detailed reply. I'll see how far I get with it, but being quite new to Lucene, it seems I am lacking a bit of background information to fully understand the response below. In particular, I need to do some background reading on how token streams and readers work, I

Re: SegmentCommitInfos and live/deleted files

2015-01-11 Thread Robert Muir

files are either per-segment or per-commit. the first only returns per-segment files. this means it won't include any per-commit files: * segments_N itself * generational .liv for deletes * generational .fnm/.dvd/etc for docvalues updates. the second includes per-commit files, too. it doesn't incl

SegmentCommitInfos and live/deleted files

2015-01-11 Thread Varun Thacker

I wanted to know what the difference is between the two ways in which I am getting a list of all files belonging to a segment. method1 never returns .liv files. https://gist.github.com/vthacker/98065232c3d2da579700 -- Regards, Varun Thacker http://www.vthacker.in/

Highlighter - SimpleSpanFragmenter bug

2015-01-11 Thread zsolt.szloboda

The highlighter's SimpleSpanFragmenter has a bug, documented in https://issues.apache.org/jira/browse/LUCENE-2229, that makes it practically unusable with PhraseQuery. I can confirm that the bug still exists in version 4.10 (the JIRA issue was created back in 2010). The symptom is that if there

Re: Details on setting block parameters for Lucene41PostingsFormat

2015-01-11 Thread Michael McCandless

On Sat, Jan 10, 2015 at 7:58 PM, Tom Burton-West wrote: > Thanks Mike, > > We run our Solr 3.x indexing with 10GB/shard. I've been testing Solr 4 > with 4, 6, and 8GB for heap. As of Friday night when the indexes were about > half done (about 400GB on disk) only the 4GB had issues. I'll find out

RE: Upgrading Lucene from 3.5 to 4.10 - how to handle Java API changes

2015-01-11 Thread Uwe Schindler

Hi, First, there is also a migration guide next to the changes log: http://lucene.apache.org/core/4_10_3/MIGRATE.html 1. If you implement an Analyzer, you have to override createComponents(), which returns a TokenStreamComponents object. See other Analyzers' source code to understand how to use

Upgrading Lucene from 3.5 to 4.10 - how to handle Java API changes

2015-01-11 Thread Martin Wunderlich

Hi all, I am currently in the process of upgrading a search engine application from Lucene 3.5.0 to version 4.10.3. There have been some substantial API changes in version 4 that break backward compatibility. I have managed to fix most of them, but a few issues remain that I could use some h

9 matches
