Hi,
Faceting on a text field requires the field cache, which can consume a large 
amount of heap and make Solr unstable. It is recommended to enable doc values 
on any field that you plan to facet on, but doc values cannot be enabled on a 
text field. The recommended approach is to preprocess the text and store the 
results in a multivalued field (or, in some cases, in boolean fields like you 
mentioned) and enable doc values on that field. You can do this inside Solr 
with a custom update request processor: 
http://www.od-bits.com/2018/02/solr-docvalues-on-analysed-field.html 
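
If you would rather preprocess on the client side before indexing, here is a 
minimal sketch in Python. It is only an illustration, not the approach from 
the linked post: the field names `description` and `description_terms` and 
the term list are assumptions, and `description_terms` is assumed to be 
defined in your schema as a multivalued string field with docValues="true".

```python
import re

# Hypothetical vocabulary of terms worth faceting on
# (borrowed from the examples in the question; adjust to your data).
FACET_TERMS = {"city", "formation", "year", "school", "skill"}


def extract_facet_terms(text, vocabulary=FACET_TERMS):
    """Return the vocabulary terms that occur in the text, sorted and de-duplicated."""
    # Lowercase and split into simple alphanumeric tokens.
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(tokens & vocabulary)


def build_solr_doc(doc_id, description):
    """Build a document dict ready to be sent to Solr's update handler.

    Field names here are assumptions: "description" is the analysed text
    field, "description_terms" the multivalued docValues field used for faceting.
    """
    return {
        "id": doc_id,
        "description": description,
        "description_terms": extract_facet_terms(description),
    }
```

With documents indexed this way you can facet with 
facet.field=description_terms and drill down with a filter query on the same 
field, while the original text field stays available for free text search.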

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 6 Mar 2018, at 18:16, Moncif Aidi <aidi.mon...@gmail.com> wrote:
> 
> Hello,
> 
> I am using Solr to power faceting features for our application.
> 
> I know that SOLR can do free text search but what is the best practice for
> faceting on common terms inside SOLR text fields?
> 
> For example, we have a large blob of text (a description of a property)
> which contains useful text to facet on like 'city', 'formation', 'year',
> 'school', 'skill', ... dozens more like these.
> 
> I would like to create a view which lets users see the number of properties
> with each of these terms and allow the users to drill down to the relevant
> properties.
> 
> One obvious solution is to pre-process the data, parse the text, and create
> the facets for each of these key phrases with a boolean yes/no value.
> 
> I'd ideally like to automate this, so I imagine the SOLR free text search
> engine might allow this? e.g. Can I use the free text search engine to
> remove stop words and collect counts of common phrases which we can then
> present to the user?
> 
> If pre-processing is the only way, is there a common/best practice approach
> to this or any open source libraries which perform this function?
> 
> What is the best practice for counting and grouping common phrases from a
> text field in SOLR?
> 
> 
> Regards
> 
> *Moncif AIDI*. Team Lead Engineer at TeslaTeam-Maroc
> <http://www.teslateam.ma/>
> M:+212 658 541 045 | T:+212 537 70 81 21
> Linkedin
> <https://www.linkedin.com/profile/view?id=131220035&trk=nav_responsive_tab_profile>
> | Facebook <https://www.facebook.com/M0ziNsof> | Twitter
> <http://twitter.com/teslateam> | *Skype :* moncif44
