If search matches index in the middle of filter chain, will result return?

2011-11-22 Thread Ellery Leung
Hi all

 

I am using Solr 3.4 with Win7 and Jetty.

 

When I do a search on a field, according to the Analysis from Solr, the
search string matches the index in the middle of the chain.  Here is the
schema:

 

<fieldType name="substring_search" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

 

I am searching for an email address: off...@officeofficeoffice.com.  If I
search for any part of it under 20 characters, a result is returned.  But
when I search for the whole string, off...@officeofficeoffice.com, no result
is returned.

As you can see in the index part of the schema, when I search for the whole
string it matches the index chain at every stage before NGramFilterFactory.
After NGram, no match is found.

 

Here are my questions:

-  Is this behavior normal?

-  In order to get off...@officeofficeoffice.com, does it mean
that I have to make the maxGramSize larger (like 70)?

 

Thank you in advance for all your support.  This is a great community.



RE: If search matches index in the middle of filter chain, will result return?

2011-11-22 Thread Ellery Leung
Thanks Shawn.  So to recap:

- Every match must be found after entire chain, not in the middle of the
chain.
- Suggested: index and query chain should be the same.

In my situation, if I make both of them the same, the result may be
misleading because it will also match other records that have the same
partial string.
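
The concern above can be illustrated with a rough sketch.  It assumes that a query run through the NGram filter has its grams combined with OR, and it models only the NGram step (the sample values "office" and "coffee shop" are made up for illustration):

```python
def ngrams(token, min_gram=1, max_gram=20):
    """All substrings of token with length between min_gram and max_gram."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def matches_query_ungrammed(query, value):
    """Index side is n-grammed; the query is kept as one whole token."""
    return query.lower() in ngrams(value.lower())


def matches_query_grammed(query, value):
    """Both sides n-grammed; any shared gram counts as a match (OR assumption)."""
    return bool(ngrams(query.lower()) & ngrams(value.lower()))


print(matches_query_ungrammed("office", "coffee shop"))  # False: "office" is not a substring
print(matches_query_grammed("office", "coffee shop"))    # True: grams such as "off" are shared
```

With the NGram filter only on the index side, the query behaves like a substring test, which is why identical chains can return records that merely share a fragment.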

But your suggestion is wonderful.  Thank you very much.

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: 2011-11-23 12:04 PM
To: solr-user@lucene.apache.org
Subject: Re: If search matches index in the middle of filter chain, will
result return?

On 11/22/2011 7:54 PM, Ellery Leung wrote:
> I am searching for an email called: off...@officeofficeoffice.com.  If I
> search any text under 20 characters, result will be returned.  But when I
> search the whole string: off...@officeofficeoffice.com, no result return.
>
> As you all see in the schema in index part, when I search the whole
> string, it will match the index chain before NGramFilterFactory.  But
> after NGram, no result found.
>
> Here are my questions:
> -  Is this behavior normal?

I'm pretty sure that your query must match after the entire analyzer 
chain is done.  I would expect that behavior to be normal.

> -  In order to get off...@officeofficeoffice.com, does it mean
> that I have to make the maxGramSize larger (like 70)?

If you were to increase the maxGramSize to 70, you would get a match in 
this case, but your index might get a lot larger, depending on what's in 
your source data.  That's probably not the right approach, though.
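
To make the effect of maxGramSize concrete, here is a minimal sketch of what an n-gram filter emits for a single keyword token.  It is plain Python rather than Solr code, and the address used is a hypothetical stand-in that is longer than 20 characters:

```python
def ngrams(token, min_gram=1, max_gram=20):
    """All substrings of token with length between min_gram and max_gram."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


# Hypothetical value, longer than 20 characters.
address = "first.last@example-company.com"

indexed = ngrams(address)                  # minGramSize=1, maxGramSize=20
print(address in indexed)                  # False: no gram is as long as the whole value
print(address[:15] in indexed)             # True: a 15-character piece is indexed
print(address in ngrams(address, 1, 70))   # True: a larger maxGramSize covers the whole value
print(len(indexed), len(ngrams(address, 1, 70)))  # distinct terms produced for this one value
```

The last line gives a feel for the index growth: every extra gram length adds more terms per value.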

In general, you want to have your index and query analyzer chains 
exactly the same.  There are some exceptions, but I don't think the 
NGram filter is one of them.  The synonym filter and WordDelimiterFilter 
are examples where it is expected that your index and query analyzer 
chains will be different.

Add the NGram and CommonGram filters to the query chain, and everything 
should start working.  If you were to go with a single analyzer for both 
like the following, I think it would start working.  You wouldn't even 
need to reindex, since you wouldn't be changing the index analyzer.

<fieldType name="substring_search" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Regarding your NGram filter,  I would actually increase the minGramSize 
to at least 2 and decrease the maxGramSize to something like 10 or 15, 
then reindex.

An additional note: CommonGrams may not be all that useful unless you 
are indexing large numbers of huge documents, like entire books.  This 
particular fieldType is not suitable for full text anyway, since it uses 
KeywordTokenizer.  Consider removing CommonGrams from this fieldType and 
reindexing.  Unless you are dealing with large amounts of text, consider 
removing it from the entire schema.  If you do remove it, it's usually 
not a good idea to replace it with a StopFilter.  The index size 
reduction found in stopword removal is not usually worth the potential 
loss of recall.

Be prepared to test all reasonable analyzer combinations, rather than 
taking my word for it.

After reading the Hathi Trust blog, I tried CommonGrams on my own 
index.  It actually made things slower, not faster.  My typical document 
is only a few thousand bytes of metadata.  The Hathi Trust is indexing 
millions of full-length books.

Thanks,
Shawn




RE: Weird: Solr Search result and Analysis Result not match?

2011-11-08 Thread Ellery Leung
Thanks Erick, here are my responses:

1. Yes.  What I want to achieve is that, with EdgeNGram applied to the index
but not to the query, I can search on a partial string.
2. Good suggestion, will test it.
3. OK.
4. Thank you.
5/6. Will remove the synonym filter and WordDelimiterFilterFactory from the query chain.
7. Will look at that using Luke.  By the way, this is the first time I have
seen that there is a tool for that.  Thank you.
8. Yes.

Will check that again, thank you.

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 2011-11-08 9:52 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: Weird: Solr Search result and Analysis Result not match?

Several things:

1. You don't have EdgeNGramFilterFactory in your query analysis chain;
   is this intentional?
2. You have a LOT of stuff going on here. You might try making your
   analysis chain simpler and adding stuff back in until you see the
   error. Don't forget to re-index!
3. Analysis doesn't take into account query *parsing*, so it's possible
   to get a false sense of assurance when the analysis page matches your
   expectations.
4. Even though nothing jumps out at me except the Edge factory, nice job
   of including information.
5. It's unusual to expand synonyms both at query and index time; usually
   it's one or the other, with index time preferred.
6. Same with WordDelimiterFilterFactory. If you put all the variants in
   the index, you don't need to put all the variants in the query, and
   vice versa.
7. Take a look at your actual contents, perhaps using Luke, to ensure
   that what you expect to be in your index actually is.
8. You did re-index after your latest changes to your schema, right <G>?

All of this is a way of saying that I don't quite see what the problem
is, but at least there are some avenues to explore.
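
As background for point 1, here is a rough sketch of what a front-anchored edge n-gram filter produces, and why a query that is not n-grammed can still match on a prefix.  It ignores every other filter in the chain (word delimiter, phonetic, stemming), so it is an illustration of the idea, not of this exact fieldType:

```python
def edge_ngrams(token, min_gram=1, max_gram=50):
    """Prefixes of token, from min_gram up to max_gram characters (side=front)."""
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


index_terms = set(edge_ngrams("1911e2"))
print(edge_ngrams("1911e2"))   # ['1', '19', '191', '1911', '1911e', '1911e2']
print("1911" in index_terms)   # True: a prefix typed by the user matches an indexed gram
print("911e" in index_terms)   # False: only prefixes are indexed, not inner substrings
```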

Best
Erick


Weird: Solr Search result and Analysis Result not match?

2011-11-07 Thread Ellery Leung
Hi all.

 

I am using Solr 3.4 under Win 7.

 

In schema there is a multivalue field indexed in this way:

==

Schema:

==

<field name="myEvent" type="myCustomText" multiValued="true" indexed="true"
       stored="true" omitNorms="true"/>

<fieldType name="myCustomText" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="../../filters/filter-synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" side="front"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="../../filters/filter-synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
            generateWordParts="0" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

==

Actual index: 

==

<arr name="myEvent">
  <str>2284e2</str>
  <str>2284e4</str>
  <str>2284e5</str>
  <str>1911e2</str>
</arr>

 

==

Question:

==

Now when I do a search like this:

 

myEvent:1911e2

 

This should match the 4th item.  Now on Full Interface, it does not return
any result.  But on analysis, matches are highlighted.

 

By using Debug: the parsedquery is:

 

MultiPhraseQuery(myEvent:(1911e2 1911) (A e) 2)

 

Parsedquery_toString:

 

myEvent:(1911e2 1911) (A e) 2

 

Can anyone please help me on this?



How to return exact set of multivalue field

2011-10-20 Thread Ellery Leung
Hi all

 

I am using Solr 3.4 on Windows 7.

 

Here is the example of a multivalue field:

 

<doc>
  <arr name="field_name">
    <str>387</str>
    <str>386</str>
  </arr>
</doc>

<doc>
  <arr name="field_name">
    <str>387</str>
    <str>386</str>
  </arr>
</doc>

<doc>
  <arr name="field_name">
    <str>387</str>
    <str>386</str>
    <str>385</str>
    <str>382</str>
    <str>312</str>
    <str>311</str>
  </arr>
</doc>

 

I am searching on field_name and want to return ONLY the records whose
values are exactly 387 and 386 (the first and second records).

Here is the query:

field_name: (387 AND 386)

But this query returns all 3 records, which is not what I want.

I have tried using the filter: field_name: (387 AND 386) but it still doesn't
work.

Therefore I would like to ask: is there any way to change this query so
that it ONLY returns the first and second records?

 

Thank you in advance for any help.



RE: How to return exact set of multivalue field

2011-10-20 Thread Ellery Leung
Thank you very much for your help!

Follow-up question: what if the values are strings instead of numbers?  While
you can use [387 TO *] to find all numbers from 387 upwards, how do you
match a specific set of strings?

Thank you again for any help here.

-Original Message-
From: dan sutton [mailto:danbsut...@gmail.com] 
Sent: 2011-10-20 6:09 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: How to return exact set of multivalue field

+field_name:386 +field_name:387 -field_name:[* TO 385] -field_name:[388 TO *]
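
The idea behind this kind of query is to require every wanted value and exclude every value outside the wanted range.  A small sketch of that logic in plain Python, assuming the values are integers (the function name is made up for illustration):

```python
def only_386_and_387(values):
    """True when both 386 and 387 are present and no value falls outside 386..387."""
    has_required = 386 in values and 387 in values
    has_excluded = any(v <= 385 or v >= 388 for v in values)
    return has_required and not has_excluded


docs = [
    [387, 386],
    [387, 386],
    [387, 386, 385, 382, 312, 311],
]
print([only_386_and_387(values) for values in docs])  # [True, True, False]
```

This relies on the wanted values being adjacent numbers, so the rest can be excluded with two ranges.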






RE: solr Invalid Date in Date Math String/Invalid Date String

2011-06-02 Thread Ellery Leung
:
[2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z]

Best
Erick







Match in the process of filter, not end, does it mean not matching?

2011-05-29 Thread Ellery Leung
This is the schema:

 

<fieldType name="textContains" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="30"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="30"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

And there is a multiValued field:

<field name="textContains_Something" type="textContains" multiValued="true"
       indexed="true" stored="true"/>

 

Now I want to search this string: Merry Christmas and Happy New Year

 

In Admin Analysis in solr admin, it highlights (in light blue) the matching
word in LowerCaseFilterFactory, CommonGramsFilterFactory and
ShingleFilterFactory.  However, nothing is highlighted in
NGramFilterFactory.

Now, I did a search in full-interface mode in solr admin:

textContains_Something:Merry Christmas and Happy New Year

It returns NO RESULT.

Does this mean that a match only counts after the tokenizer and all filters
have run?
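
One way to picture the question: only the tokens that come out of the end of each chain are compared, so a match at an intermediate stage does not count.  The following is a simplified sketch, assuming a keyword tokenizer followed by just the lowercase and n-gram steps (the other filters in the schema are left out):

```python
def lowercase(tokens):
    return [token.lower() for token in tokens]


def ngram_filter(min_gram, max_gram):
    """Build a filter that replaces each token by its n-grams."""
    def apply(tokens):
        grams = []
        for token in tokens:
            for size in range(min_gram, max_gram + 1):
                for start in range(len(token) - size + 1):
                    grams.append(token[start:start + size])
        return grams
    return apply


def analyze(text, filters):
    tokens = [text]  # keyword tokenizer: the whole input is one token
    for token_filter in filters:
        tokens = token_filter(tokens)
    return set(tokens)


INDEX_CHAIN = [lowercase, ngram_filter(2, 30)]
QUERY_CHAIN = [lowercase]


def matches(query, value):
    """A hit needs a term shared by the FINAL output of both chains."""
    return bool(analyze(query, QUERY_CHAIN) & analyze(value, INDEX_CHAIN))


phrase = "Merry Christmas and Happy New Year"
print(len(phrase))                         # 34, longer than maxGramSize=30
print(matches(phrase, phrase))             # False: the 34-character token is never indexed
print(matches("Merry Christmas", phrase))  # True: a 15-character gram exists in the index
```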

 

Thank you in advance for any help.



HTMLStripTransformer will remove the content in XML??

2011-05-27 Thread Ellery Leung
I have an XML string like this:

 

<?xml version="1.0" encoding="UTF-8"?>
<language><intl><![CDATA[hello]]></intl><loc><![CDATA[solr]]></loc></language>

 

By using HTMLStripTransformer, I expect to get 'hello,solr'.

 

But actually this transformer removes ALL THE TEXT INSIDE!

 

Did I do something silly, or is it a bug? 

 

Thank you



RE: HTMLStripTransformer will remove the content in XML??

2011-05-27 Thread Ellery Leung
Got it.  Actually I use solr.MappingCharFilterFactory to replace <![CDATA[
and ]]> with empty strings first, and then use HTMLStripCharFilterFactory to
get hello and solr.

For future reference, here is part of schema.xml:

<fieldType name="textMaxWord" class="solr.TextField">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    ...

In mappings.txt (2 lines):

"<![CDATA[" => ""
"]]>" => ""

Restart Solr

It works.
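
The order of the two steps is what matters.  Here is a rough illustration in Python, where a crude regular expression stands in for the HTML stripper (an assumption for the sketch, not how HTMLStripCharFilterFactory is actually implemented):

```python
import re

# Same two replacements as the mapping file: drop the CDATA wrappers.
CDATA_MAPPINGS = [("<![CDATA[", ""), ("]]>", "")]


def apply_mappings(text, mappings):
    for source, target in mappings:
        text = text.replace(source, target)
    return text


def strip_tags(text):
    """Replace anything that looks like <...> with a space."""
    return re.sub(r"<[^>]*>", " ", text)


xml = ('<?xml version="1.0" encoding="UTF-8"?><language>'
       '<intl><![CDATA[hello]]></intl><loc><![CDATA[solr]]></loc></language>')

print(strip_tags(xml).split())                                  # [] : CDATA sections are treated as tags
print(strip_tags(apply_mappings(xml, CDATA_MAPPINGS)).split())  # ['hello', 'solr']
```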

Thank you

-Original Message-
From: bryan rasmussen [mailto:rasmussen.br...@gmail.com] 
Sent: 2011-05-27 4:20 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: HTMLStripTransformer will remove the content in XML??

I would expect that it doesn't understand CDATA and thinks of
everything between < and > as a 'tag'.

Best Regards,
Bryan Rasmussen






RE: solr Invalid Date in Date Math String/Invalid Date String

2011-05-27 Thread Ellery Leung
Thank you Mike.

So I understand that now.  But what about the other items that have values
on both sides?  They don't work at all.


-Original Message-
From: Mike Sokolov [mailto:soko...@ifactory.com] 
Sent: 2011-05-27 10:23 PM
To: solr-user@lucene.apache.org
Cc: alucard001
Subject: Re: solr Invalid Date in Date Math String/Invalid Date String

The * endpoint for range terms wasn't implemented yet in 1.4.1.  As a
workaround, we use very large and very small values.
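
A possible shape for that workaround, sketched in Python.  The field name "created" and the two sentinel dates are assumptions chosen for illustration; any values safely outside the real data would do:

```python
from datetime import datetime, timezone

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

# Hypothetical sentinels used in place of an open-ended "*".
EARLIEST = datetime(1970, 1, 1, tzinfo=timezone.utc)
LATEST = datetime(2100, 1, 1, tzinfo=timezone.utc)


def solr_date(value):
    """Format a timezone-aware datetime as a UTC Solr date string."""
    return value.astimezone(timezone.utc).strftime(SOLR_DATE_FORMAT)


def date_range(field, start=None, end=None):
    """Build an inclusive range clause with explicit bounds on both sides."""
    low = solr_date(start if start is not None else EARLIEST)
    high = solr_date(end if end is not None else LATEST)
    return f"{field}:[{low} TO {high}]"


end_of_day = datetime(2006, 12, 22, 23, 59, 59, tzinfo=timezone.utc)
print(date_range("created", end=end_of_day))
# created:[1970-01-01T00:00:00Z TO 2006-12-22T23:59:59Z]
```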

-Mike

On 05/27/2011 12:55 AM, alucard001 wrote:
 Hi all

 I am using SOLR 1.4.1 (according to solr info), but no matter what date
 field I use (date or tdate) defined in default schema.xml, I cannot do a
 search in solr-admin analysis.jsp:

 fieldtype: date(or tdate)
 fieldvalue(index): 2006-12-22T13:52:13Z (I type it in manually, no
trailing
 space)
 fieldvalue(query):

 The only success case:
 2006-12-22T13:52:13Z

 All search below are failed:
 * TO NOW
 [* TO NOW]

 2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z
 2006\-12\-22T00\:00\:00Z TO 2006\-12\-22T23\:59\:59Z
 [2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z]
 [2006\-12\-22T00\:00\:00Z TO 2006\-12\-22T23\:59\:59Z]

 2006-12-22T00:00:00.000Z TO 2006-12-22T23:59:59.999Z
 2006\-12\-22T00\:00\:00\.000Z TO 2006\-12\-22T23\:59\:59\.999Z
 [2006-12-22T00:00:00.000Z TO 2006-12-22T23:59:59.999Z]
 [2006\-12\-22T00\:00\:00\.000Z TO 2006\-12\-22T23\:59\:59\.999Z]

 2006-12-22T00:00:00Z TO *
 2006\-12\-22T00\:00\:00Z TO *
 [2006-12-22T00:00:00Z TO *]
 [2006\-12\-22T00\:00\:00Z TO *]

 2006-12-22T00:00:00.000Z TO *
 2006\-12\-22T00\:00\:00\.000Z TO *
 [2006-12-22T00:00:00.000Z TO *]
 [2006\-12\-22T00\:00\:00\.000Z TO *]
 (vice versa)

 I get either:
 Invalid Date in Date Math String or
 Invalid Date String
 error

 What's wrong with it?  Can anyone please help me on that?

 Thank you.

 --
 View this message in context:
http://lucene.472066.n3.nabble.com/solr-Invalid-Date-in-Date-Math-String-Inv
alid-Date-String-tp2991763p2991763.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: What is this error means?

2010-01-13 Thread Ellery Leung

Hi Israel

Thank you for your response.

However, I used both ini_set and set _defaultTimeout to 6000, but the
error still occurs with the same error message.

Now, when I start building the index, the error appears much sooner than it
did before the change.

So do you have any idea?

Thank you in advance for your help.




Israel Ekpo wrote:
 
 Ellery,
 
 A preliminary look at the source code indicates that the error is
 happening
 because the solr server is taking longer than expected to respond to the
 client
 
 http://code.google.com/p/solr-php-client/source/browse/trunk/Apache/Solr/Service.php
 
 The default time out handed down to Apache_Solr_Service:_sendRawPost() is
 60
 seconds since you were calling the addDocument() method
 
 So if it took longer than that (1 minute), then it will exit with that
 error
 message.
 
 You will have to increase the default value to something very high like 10
 minutes or so on line 252 in the source code since there is no way to
 specify that in the constructor or the addDocument method.
 
 Another alternative will be to update the default_socket_timeout in the
 php.ini file or in the code using ini_set
 
 I hope that helps
 
 
 
 On Tue, Jan 12, 2010 at 9:33 PM, Ellery Leung elleryle...@be-o.com
 wrote:
 

 Hi, here is the stack trace:

 Fatal error:  Uncaught exception 'Exception' with message '"0" Status: Communication Error' in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php:385
 Stack trace:
 #0 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(652): Apache_Solr_Service->_sendRawPost('http://127.0.0', '<add allowDups=...')
 #1 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(676): Apache_Solr_Service->add('<add allowDups=...')
 #2 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(221): Apache_Solr_Service->addDocument(Object(Apache_Solr_Document))
 #3 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(262): SolrSearchEngine->buildIndex(Array, 'key')
 #4 C:\nginx\html\apps\milio\lib\System\classes\Indexer\Indexer.class.php(51): SolrSearchEngine->createFullIndex('contacts', Array, 'key', 'www')
 #5 C:\nginx\html\apps\milio\lib\System\functions\createIndex.php(64): Indexer->create('www')
 #6 {main}
   thrown in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php on line 385

 C:\nginx\html\apps\milio\htdocs\Contacts>pause
 Press any key to continue . . .

 Thanks for helping me.


 Grant Ingersoll-6 wrote:
 
  Do you have a stack trace?
 
  On Jan 12, 2010, at 2:54 AM, Ellery Leung wrote:
 
  When I am building the index for around 2 ~ 25000 records,
 sometimes
  I
  came across with this error:
 
 
 
  Uncaught exception Exception with message '0' Status: Communication
  Error
 
 
 
  I search Google & Yahoo but no answer.
 
 
 
  I am now committing document to solr on every 10 records fetched from
 a
  SQLite Database with PHP 5.3.
 
 
 
  Platform: Windows 7 Home
 
  Web server: Nginx
 
  Solr Specification Version: 1.4.0
 
  Solr Implementation Version: 1.4.0 833479 - grantingersoll -
 2009-11-06
  12:33:40
 
  Lucene Specification Version: 2.9.1
 
  Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
 
  Solr hosted in jetty 6.1.3
 
 
 
  All the above are in one single test machine.
 
 
 
  The situation is that sometimes when I build the index, it can be
 created
  successfully.  But sometimes it will just stop with the above error.
 
 
 
  Any clue?  Please help.
 
 
 
  Thank you in advance.
 
 
 
 

 --
 View this message in context:
 http://old.nabble.com/What-is-this-error-means--tp27123815p27138658.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Good Enough is not good enough.
 To give anything less than your best is to sacrifice the gift.
 Quality First. Measure Twice. Cut Once.
 http://www.israelekpo.com/
 
 

-- 
View this message in context: 
http://old.nabble.com/What-is-this-error-means--tp27123815p27155487.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: What is this error means?

2010-01-13 Thread Ellery Leung

Here is a workaround for this issue:

On line 382 of SolrPhpClient/Apache/Solr/Service.php, I changed the code to:

// Re-send the request until Solr returns a non-empty response.
while (true) {
    $str = file_get_contents($url, false, $this->_postContext);
    if (empty($str) == false) {
        break;
    }
}

$response = new Apache_Solr_Response($str, $http_response_header,
    $this->_createDocuments, $this->_collapseSingleValueArrays);

I found that, for some strange reason on Windows, when you post data to add
to the index, Solr may not receive it.  Therefore I added an infinite loop:
if no response is received ($str is empty), we post it again.
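
A loop like the one above never gives up if Solr stays unreachable.  A bounded variant of the same idea, sketched in Python with the sender passed in as a function (all names here are hypothetical, not part of SolrPhpClient):

```python
import time


def post_with_retry(send, payload, max_attempts=5, delay_seconds=0.5):
    """Call send(payload) until it returns a non-empty response or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        response = send(payload)
        if response:
            return response
        if attempt < max_attempts:
            # Wait a little longer after each failed attempt before re-posting.
            time.sleep(delay_seconds * attempt)
    raise RuntimeError(f"no response after {max_attempts} attempts")


# Usage with a fake sender that fails twice and then succeeds.
replies = iter(["", "", "<response/>"])
print(post_with_retry(lambda payload: next(replies), "<add/>", delay_seconds=0))  # <response/>
```

Raising an error after a fixed number of attempts keeps the indexing script from hanging forever, at the cost of having to handle that error in the caller.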

Side effect: when I open the window console to see it, sometimes it will
prompt:

Failed to open stream: HTTP request failed!

I haven't researched it yet, but the index is built successfully.

Hope it helps someone.






-- 
View this message in context: 
http://old.nabble.com/What-is-this-error-means--tp27123815p27156058.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: What is this error means?

2010-01-12 Thread Ellery Leung

Hi, here is the stack trace:

Fatal error:  Uncaught exception 'Exception' with message '"0" Status: Communication Error' in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php:385
Stack trace:
#0 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(652): Apache_Solr_Service->_sendRawPost('http://127.0.0', '<add allowDups=...')
#1 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(676): Apache_Solr_Service->add('<add allowDups=...')
#2 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(221): Apache_Solr_Service->addDocument(Object(Apache_Solr_Document))
#3 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(262): SolrSearchEngine->buildIndex(Array, 'key')
#4 C:\nginx\html\apps\milio\lib\System\classes\Indexer\Indexer.class.php(51): SolrSearchEngine->createFullIndex('contacts', Array, 'key', 'www')
#5 C:\nginx\html\apps\milio\lib\System\functions\createIndex.php(64): Indexer->create('www')
#6 {main}
  thrown in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php on line 385

C:\nginx\html\apps\milio\htdocs\Contacts>pause
Press any key to continue . . .

Thanks for helping me.


Grant Ingersoll-6 wrote:
 
 Do you have a stack trace?  
 
 On Jan 12, 2010, at 2:54 AM, Ellery Leung wrote:
 
 When I build the index for around 2 ~ 25000 records, I sometimes come
 across this error:
 
 Uncaught exception Exception with message '0' Status: Communication Error
 
 I searched Google & Yahoo but found no answer.
 
 I am currently committing documents to Solr after every 10 records fetched
 from a SQLite database with PHP 5.3.
 
 Platform: Windows 7 Home
 Web server: Nginx
 Solr Specification Version: 1.4.0
 Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40
 Lucene Specification Version: 2.9.1
 Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
 Solr hosted in Jetty 6.1.3
 
 All of the above run on a single test machine.
 
 Sometimes the index builds successfully, but sometimes the build just
 stops with the above error.
 
 Any clue?  Please help.
 
 Thank you in advance.
 
 
 
 

-- 
View this message in context: 
http://old.nabble.com/What-is-this-error-means--tp27123815p27138658.html
Sent from the Solr - User mailing list archive at Nabble.com.



What is this error means?

2010-01-11 Thread Ellery Leung
When I build the index for around 2 ~ 25000 records, I sometimes come
across this error:

Uncaught exception Exception with message '0' Status: Communication Error

I searched Google & Yahoo but found no answer.

I am currently committing documents to Solr after every 10 records fetched
from a SQLite database with PHP 5.3.

Platform: Windows 7 Home
Web server: Nginx
Solr Specification Version: 1.4.0
Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40
Lucene Specification Version: 2.9.1
Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
Solr hosted in Jetty 6.1.3

All of the above run on a single test machine.

Sometimes the index builds successfully, but sometimes the build just
stops with the above error.

Any clue?  Please help.

 

Thank you in advance.
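
The indexing loop described in this message (fetch records from SQLite, commit after every 10) might look roughly like the sketch below. It is written in Python rather than PHP purely for illustration. `RecordingClient`, `index_rows`, and the layout of the `contacts` table are hypothetical stand-ins, not part of SolrPhpClient or the poster's code.

```python
import sqlite3


class RecordingClient:
    """Hypothetical stand-in for a Solr client; it only records calls."""

    def __init__(self):
        self.added = []
        self.commits = 0

    def add(self, doc):
        self.added.append(doc)

    def commit(self):
        self.commits += 1


def index_rows(rows, client, commit_every=10):
    """Add each row as a document and commit after every `commit_every` rows."""
    pending = 0
    for row in rows:
        client.add(dict(row))
        pending += 1
        if pending == commit_every:
            client.commit()
            pending = 0
    if pending:
        # Commit the final partial batch so no document is left uncommitted.
        client.commit()


def demo():
    """Index 25 rows from an in-memory SQLite table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO contacts (name) VALUES (?)",
        [("contact %d" % i,) for i in range(25)],
    )
    client = RecordingClient()
    index_rows(conn.execute("SELECT id, name FROM contacts ORDER BY id"), client)
    conn.close()
    return client
```

Raising `commit_every` reduces the number of commit requests sent during a build. Whether that has any effect on the Communication Error is not established in this thread.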



What does it mean about this error message???

2009-12-16 Thread Ellery Leung
there_are_more_terms_than_documents_in_field_someField_but_its_impossible_to_sort_on_tokenized_fields

The index has probably been built and is running.  I am using Solr 1.4.

The error message is quite vague; it seems to be talking about something
different.

Can somebody please explain what it means?

 

Thank you in advance
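
For context: as far as I understand it, this message is raised by Lucene's string sort cache when the sort field holds more distinct terms than there are documents, which typically happens when sorting on a tokenized (analyzed) field. The Python sketch below is a simplified model of that condition, not Lucene's actual code. `distinct_terms`, `can_sort`, and the sample titles are hypothetical.

```python
def distinct_terms(values, tokenized):
    """Collect the distinct index terms for a field, given one value per document."""
    terms = set()
    for value in values:
        if tokenized:
            # A tokenized field contributes one term per word.
            terms.update(value.lower().split())
        else:
            # An untokenized field contributes the whole value as a single term.
            terms.add(value)
    return terms


def can_sort(values, tokenized):
    """A single-valued sort key allows at most one term per document."""
    return len(distinct_terms(values, tokenized)) <= len(values)


titles = ["Apache Solr search", "Lucene in action", "Jetty web server"]
```

With these three sample documents, the tokenized form yields nine terms and fails the check, while the untokenized form yields three terms and passes. Under that reading, sorting on an untokenized copy of the field avoids the condition.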