Re: Need a bit of help, Solr 1.4: type text.

2010-02-11 Thread Sven Maurmann

Hi,

the parameter for WordDelimiterFilterFactory is catenateAll;
you should set it to 1.

Cheers,
Sven

--On Wednesday, 10 February 2010 16:37 -0800 Yu-Shan Fung 
ambivale...@gmail.com wrote:



Check out the configuration of WordDelimiterFilterFactory in your
schema.xml.

Depending on your settings, it's probably tokenizing "13th" into "13" and
"th". You can also have them concatenated back into a single token, but I
can't remember the exact parameter. I think it could be catenateAll.



On Wed, Feb 10, 2010 at 4:32 PM, Dickey, Dan dan.dic...@savvis.net
wrote:


I'm using the standard text type for a field, and part of the data
being indexed is "13th", as in "Friday the 13th".

I can't seem to get it to match when I'm querying for Friday the 13th,
either quoted or not.

One thing that does match is "13 th", if I send the search query with a
space between...

Any suggestions?

I know this is short on detail, but it's been a long day... time to get
outta here.

Thanks for any and all help.

   -Dan




This message contains information which may be confidential and/or
privileged. Unless you are the intended recipient (or authorized to
receive for the intended recipient), you may not read, use, copy or
disclose to anyone the message or any information contained in the
message. If you have received the message in error, please advise the
sender by reply e-mail and delete the message and any attachment(s)
thereto without retaining any copies.





--
When nothing seems to help, I go look at a stonecutter hammering away
at his rock perhaps a hundred times without as much as a crack showing in
it. Yet at the hundred and first blow it will split in two, and I know it
was not that blow that did it, but all that had gone before. — Jacob
Riis


RE: Need a bit of help, Solr 1.4: type text.

2010-02-11 Thread Dickey, Dan
Sven & Yu-Shan - thank you for your advice.

For some reason it doesn't seem to work for me, however;
this is what I was trying to get working last night before sending
my message out.

I'll try to explain in more detail what my setup is like.

I use a multiValued text field as a sort of holder for everything else.
Let's call this field euts (an acronym for "everything under the sun").
Nothing is stored directly into this field;
I use about 20 or so copyFields to put everything else into it.
One of the source fields is Description, a text field.
In this field I'm storing "Friday the 13th", along with other potential
text.
I have a copyField like:
    <copyField source="Description" dest="euts"/>
The field for euts is:
    <field name="euts" type="text" indexed="true" stored="true"
           multiValued="true" />
The field for Description is:
    <field name="Description" type="text" indexed="true"
           stored="true" />

The definition for the text type is straight out of the Solr tarball,
from the example/solr/conf directory.  I tried setting catenateAll="1"
and reindexing, but that didn't work.

Btw - my search query effectively looks like euts:(Friday the 13th)
(no quotes, of course).  I'm just running this through the Solr admin
page using the Full Interface.  This does not match a document that has
the string "Friday the 13th" in its Description.  I've tried it with
catenateAll set to 1, and with the original value of 0.  This is on the
index analyzer.

I've also tried it both ways with the query analyzer (at least I think I
have).  I'm less sure of how the options for the query analyzer should
be set.

Also, in the wiki I found another option for the
WordDelimiterFilterFactory - preserveOriginal.
I tried setting this to 1 with similar results - no match.

And yes, I'm aware that "the" is a stop word and gets thrown away.
That's fine.
After each of these schema.xml changes, I've re-indexed my documents.
It doesn't take long, as I'm just working with a small set of about 180
docs right now.

Again, any and all help would be greatly appreciated!  Thanks.
-Dan Dickey
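
For reference, a minimal sketch of where the two options tried above would sit in the index analyzer of the text type. The attribute values other than catenateAll and preserveOriginal are assumed to match the stock Solr 1.4 example schema and may differ in your copy. This only shows placement; it is not a confirmed fix for the problem described.

```xml
<!-- Sketch only: index-time analyzer of the "text" fieldType in schema.xml.
     Values other than catenateAll and preserveOriginal are assumed from
     the Solr 1.4 example schema. -->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- ... stop word and other filters as in the example schema ... -->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1"
          catenateWords="1" catenateNumbers="1"
          catenateAll="1"
          preserveOriginal="1"
          splitOnCaseChange="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

With catenateAll="1" the filter should emit the joined token 13th alongside the split parts 13 and th, and preserveOriginal="1" additionally keeps the unmodified input token. Documents need to be re-indexed after changing index-time settings.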



RE: Need a bit of help, Solr 1.4: type text.

2010-02-11 Thread Dickey, Dan
Hmm... I think I'm onto something.
It may be the stop word removal of "the".
When I changed my query analyzer for text to set
enablePositionIncrements="false" instead of "true",
the query seems to find what I'm expecting.  I'll keep
looking into this.

Is there any information available on what position increments are?
-Dan
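
A minimal sketch of the change described above, assuming the query analyzer's stop filter otherwise matches the stock Solr 1.4 example schema (the ignoreCase and words values are assumptions taken from that example):

```xml
<!-- Sketch only: stop filter inside the query analyzer of the "text"
     fieldType. With enablePositionIncrements="false", a removed stop word
     leaves no gap in the token positions of the query. -->
<filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="stopwords.txt"
        enablePositionIncrements="false"/>
```

Roughly, a position increment is the distance between a token's position and that of the token before it. With the setting at "true", dropping "the" from "Friday the 13th" leaves Friday and 13th two positions apart rather than adjacent; with "false" the gap is closed. If the index and query analyzers disagree on this, position-sensitive matching such as phrase queries may behave differently, which would be consistent with what was observed here, though the thread does not confirm the root cause.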


Need a bit of help, Solr 1.4: type text.

2010-02-10 Thread Dickey, Dan
I'm using the standard text type for a field, and part of the data
being indexed is "13th", as in "Friday the 13th".

I can't seem to get it to match when I'm querying for Friday the 13th,
either quoted or not.

One thing that does match is "13 th", if I send the search query with a
space between...

Any suggestions?

I know this is short on detail, but it's been a long day... time to get
outta here.

Thanks for any and all help.

-Dan

 



Re: Need a bit of help, Solr 1.4: type text.

2010-02-10 Thread Yu-Shan Fung
Check out the configuration of WordDelimiterFilterFactory in your
schema.xml.

Depending on your settings, it's probably tokenizing "13th" into "13" and
"th". You can also have them concatenated back into a single token, but I
can't remember the exact parameter. I think it could be catenateAll.







-- 
“When nothing seems to help, I go look at a stonecutter hammering away at
his rock perhaps a hundred times without as much as a crack showing in it.
Yet at the hundred and first blow it will split in two, and I know it was
not that blow that did it, but all that had gone before.” — Jacob Riis