One possibility could be to borrow an idea from the NLP world and
pre-process parentheses into -LRB- and -RRB- tokens (and square and curly
brackets into their corresponding forms). This bypasses the escaping issues,
but it requires reindexing and applying the same preprocessing to the query.
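
As a rough sketch of what I mean (not tested against Solr): the same
substitution would have to run on the text before it is indexed and on the
user's query string before it is sent. The helper name and the mapping table
below are hypothetical; the token names follow the Penn Treebank convention.
How your analyzer chain then tokenizes the placeholders is something you
would need to check on your side.

```python
# Hypothetical helper, not part of Solr: maps bracket characters to
# Penn Treebank style placeholder tokens.
BRACKET_TOKENS = {
    "(": "-LRB-",
    ")": "-RRB-",
    "[": "-LSB-",
    "]": "-RSB-",
    "{": "-LCB-",
    "}": "-RCB-",
}


def replace_brackets(text: str) -> str:
    """Replace each bracket with its placeholder, padded by spaces.

    Note: runs of whitespace are collapsed to single spaces as a side effect.
    """
    pieces = []
    for ch in text:
        token = BRACKET_TOKENS.get(ch)
        # Pad with spaces so the placeholder stands apart from adjacent words.
        pieces.append(f" {token} " if token else ch)
    return " ".join("".join(pieces).split())


if __name__ == "__main__":
    # Apply the same function to document text and to the query text.
    print(replace_brackets("(Figure 5)"))
```

Since the brackets are gone from the query text after this step, there is
nothing left in it that needs backslash escaping for them.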

-sujit


On Tue, Oct 19, 2021 at 8:27 AM Casteel, Kayla Lynne
<kayla.cast...@jacobs.com.invalid> wrote:

> Unfortunately I can't change the type of the allText field to string. We
> need the features that come with it being a text field.
>
> (We did try changing it to string just to see what would happen -- it made
> the problem worse, and solr still didn't handle the escaped parentheses
> properly)
>
>
> We're using solr 8.0.0, if it matters.
>
>
> Some more details about the allText field: Right now it's a text_general
> type, which we define as:
>
> <fieldType name="text_general" class="solr.TextField"
>            positionIncrementGap="100" multiValued="true">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>             ignoreCase="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>             ignoreCase="true"/>
>     <filter class="solr.SynonymGraphFilterFactory" expand="true"
>             ignoreCase="true" synonyms="synonyms.txt"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> And the allText field itself:
> <field name="allText" type="text_general" docValues="false"
>        multiValued="true" indexed="true" stored="true"/>
>
>
> I don't know if that helps at all. Solr automagically escaping the escape
> characters I use in the query is still bugging me.
>
>
> Thank you,
>
> Kayla Casteel
>
> ________________________________
> From: Deepak Goel <deic...@gmail.com>
> Sent: Tuesday, October 19, 2021 2:49:54 AM
> To: users@solr.apache.org
> Subject: [EXTERNAL] Re: Having issues searching literal parentheses
>
> Hey
>
> It might be possible that *allText* does not treat the parentheses *()* as
> text. You might have to try something else (possibly String)
>
> Deepak
>
> On Tue, Oct 19, 2021 at 2:54 AM Casteel, Kayla Lynne
> <kayla.cast...@jacobs.com.invalid> wrote:
>
> > Hello all,
> >
> > I have been going mad trying to get SOLR to search for parentheses as
> > literals. For example, "(Figure 5)". I've tried entering it in the fq
> > field as:
> > allText:\(Figure 5\)
> >
> > (where allText is a facet). SOLR interprets this in the response as
> >
> > "fq":"allText:\\(Figure 5\\)"
> >
> > and it ends up finding text like "In Figure 5" with no parentheses. I
> > assume this is because it is escaping the escape characters.
> >
> >
> > I've tried escaping, I've tried URL encoding them, I've tried banging my
> > head on the desk. I can't get solr to understand that this should be an
> > exact match and that the parentheses are both literal and mandatory.
> >
> > In the wiki it even gives an example of escaping parentheses as part of
> > the "Escaping Special Characters" section but it doesn't seem to work in
> > my case.
> >
> > Has anyone else experienced this issue? Is there something I'm doing
> > wrong?
> >
> >
> > Thank you,
> >
> > Kayla Casteel
> >
> > ________________________________
> >
> > NOTICE - This communication may contain confidential and privileged
> > information that is for the sole use of the intended recipient. Any
> > viewing, copying or distribution of, or reliance on this message by
> > unintended recipients is strictly prohibited. If you have received this
> > message in error, please notify us immediately by replying to the message
> > and deleting it from your computer.
> >
>
