RE: Tokenize on paragraphs and sentences

2013-04-18 Thread Alex Cougarman
Subject: Tokenize on paragraphs and sentences
Hi. Is it possible to search within paragraphs or sentences in Solr? The PatternTokenizerFactory uses regular expressions, but how can this be done with plain ASCII docs that don't have p tags (HTML), yet they're broken into paragraphs? Thanks.
Warm regards

Re: Tokenize on paragraphs and sentences

2013-04-15 Thread Jack Krupansky
by white space for sentence (with some more heuristics for abbreviations). Or you could have an update processor do the marking.
-- Jack Krupansky
-----Original Message-----
From: Alex Cougarman
Sent: Monday, April 15, 2013 9:48 AM
To: solr-user@lucene.apache.org
Subject: Tokenize on paragraphs
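
For illustration only, here is a rough sketch of the kind of marking discussed in this thread, done as a pre-processing step before the text reaches Solr. It assumes paragraphs in plain ASCII docs are separated by blank lines and that sentences end at white space following ".", "!" or "?". The abbreviation list and the function names (split_paragraphs, split_sentences, mark_boundaries) are hypothetical and not part of Solr; a real update processor would be written against Solr's own Java API.

```python
import re

# Hypothetical abbreviation list used as the heuristic; extend it for your corpus.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "e.g.", "i.e.", "etc.", "vs."}


def split_paragraphs(text):
    """Split plain text into paragraphs at one or more blank lines."""
    parts = re.split(r"\n\s*\n", text.strip())
    # Collapse internal line breaks and runs of spaces within each paragraph.
    return [" ".join(p.split()) for p in parts if p.strip()]


def split_sentences(paragraph):
    """Split at white space after '.', '!' or '?', skipping known abbreviations."""
    sentences = []
    current = []
    for word in paragraph.split():
        current.append(word)
        if word[-1] in ".!?" and word.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Trailing words with no closing punctuation still form a sentence.
        sentences.append(" ".join(current))
    return sentences


def mark_boundaries(text):
    """Return (paragraph_index, sentence_index, sentence) tuples for a document."""
    marked = []
    for p_idx, paragraph in enumerate(split_paragraphs(text)):
        for s_idx, sentence in enumerate(split_sentences(paragraph)):
            marked.append((p_idx, s_idx, sentence))
    return marked
```

Each tuple could then be indexed as its own document or field value, so that a match is confined to one sentence or one paragraph. The heuristic is deliberately crude: it will still split on abbreviations that are not in the list, and it will fail to split when a sentence genuinely ends with a listed abbreviation.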