Re: Query fields with data of certain length

2018-02-04 Thread Zheng Lin Edwin Yeo
Hi, Thanks for the reply. Does that mean we have to write this custom QParser ourselves? Regards, Edwin On 3 February 2018 at 03:28, Chris Hostetter wrote: > > : Have you managed to get the regex for this string in Chinese: > 预支款管理及账务处理办法 ? > ... > : > An example

Re: Query fields with data of certain length

2018-02-02 Thread Chris Hostetter
: Have you managed to get the regex for this string in Chinese: 预支款管理及账务处理办法 ? ... : > An example of the string in Chinese is 预支款管理及账务处理办法 : > : > The number of characters is 12, but the expected length should be 36. ... : >> > So this would likely be different from what the

Re: Query fields with data of certain length

2018-02-01 Thread Emir Arnautović
Hi Edwin, Unfortunately, I was not able to find a regex that would work in your case. Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 1 Feb 2018, at 05:42, Zheng Lin Edwin Yeo

Re: Query fields with data of certain length

2018-01-31 Thread Zheng Lin Edwin Yeo
Hi, Have you managed to get the regex for this string in Chinese: 预支款管理及账务处理办法 ? Regards, Edwin On 4 January 2018 at 18:04, Zheng Lin Edwin Yeo wrote: > Hi Emir, > > An example of the string in Chinese is 预支款管理及账务处理办法 > > The number of characters is 12, but the expected

Re: Query fields with data of certain length

2018-01-04 Thread Zheng Lin Edwin Yeo
Hi Emir, An example of the string in Chinese is 预支款管理及账务处理办法 The number of characters is 12, but the expected length should be 36. Regards, Edwin On 4 January 2018 at 16:21, Emir Arnautović wrote: > Hi Edwin, > I don’t have enough knowledge in eastern languages
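The gap between 12 and 36 is consistent with counting UTF-8 bytes instead of characters. The following is a minimal Python sketch, assuming the text is encoded as UTF-8 (the thread does not confirm the encoding); the variable names are illustrative only.

```python
# The example string from the message above.
subject = "预支款管理及账务处理办法"

# len() on a Python str counts Unicode code points (characters).
char_count = len(subject)

# Encoding first counts bytes; UTF-8 is an assumption made here.
byte_count = len(subject.encode("utf-8"))

print(char_count, byte_count)  # prints: 12 36
```

Each of these characters takes 3 bytes in UTF-8, which gives 12 x 3 = 36.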

Re: Query fields with data of certain length

2018-01-04 Thread Emir Arnautović
Hi Edwin, I don’t have enough knowledge of eastern languages to know what the expected number is when you ask for string length. Maybe you can try some of the regex Unicode settings and see if you’ll get what you need: try setting the Unicode flag with (?U), or try using regex groups and ranges. If you

Re: Query fields with data of certain length

2018-01-03 Thread Zheng Lin Edwin Yeo
Hi Emir, So this would likely be different from what the operating system counts, as the operating system may consider each Chinese character as 3 to 4 bytes. Which is probably why I could not find any record with subject:/.{255,}.*/ Are there other tools that we can use to query the length for
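The mismatch described above can be reproduced outside Solr. This is only a sketch: it uses Python's `re` module as a stand-in for the regex engine Solr uses (the two differ in syntax and features), and it assumes UTF-8 as the encoding. The helper name `utf8_length` is hypothetical.

```python
import re


def utf8_length(text: str) -> int:
    """Return the number of bytes the text occupies in UTF-8 (assumed encoding)."""
    return len(text.encode("utf-8"))


# 12 characters repeated 10 times: 120 characters, 360 bytes in UTF-8.
subject = "预支款管理及账务处理办法" * 10

# A character-counting pattern such as .{255,} needs at least 255 characters.
matches_char_regex = re.fullmatch(r".{255,}", subject, re.DOTALL) is not None

# A byte-based check sees the same value as longer than 255.
over_byte_limit = utf8_length(subject) > 255

print(matches_char_regex, over_byte_limit)  # prints: False True
```

A record like this is over 255 bytes but is never matched by a pattern that counts characters.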

Re: Query fields with data of certain length

2018-01-03 Thread Emir Arnautović
Hi Edwin, I do not know, but my guess would be that each character is counted as 1 in the regex, regardless of how many bytes it takes in the encoding used. Regards, Emir

Re: Query fields with data of certain length

2018-01-03 Thread Zheng Lin Edwin Yeo
Thanks for the reply. I am doing the search on existing data that has already been indexed, and it is likely to be a one-time thing. This subject:/.{255,}.*/ works for English characters. However, there are Chinese characters in some of the records. The length seems to be more than 255, but it

Re: Query fields with data of certain length

2018-01-03 Thread Alexandre Rafalovitch
Do that during indexing as Emir suggested. Specifically, use an UpdateRequestProcessor chain, probably with the Clone and FieldLength processors: http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html Regards, Alex. On 31

Re: Query fields with data of certain length

2018-01-03 Thread Emir Arnautović
Hi Edwin, If it is a one-time thing, you can use a regex to filter out results that are not long enough. Something like: subject:/.{255,}.*/. Of course, this means subject is not tokenized. It would probably be best to index the subject length as a separate field and include it in the query as
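One way to picture the separate length field is to compute it on the client before documents are sent for indexing. The sketch below is an assumption-laden illustration, not the approach confirmed in the thread: the field name `subject_bytes`, the helper names, and the choice of UTF-8 are all hypothetical, and the field would need to be defined as a numeric field in the schema.

```python
from typing import Any, Dict


def with_subject_bytes(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of doc with a hypothetical subject_bytes field added."""
    enriched = dict(doc)
    subject = enriched.get("subject") or ""
    # UTF-8 is assumed; use whichever encoding defines "length" for your data.
    enriched["subject_bytes"] = len(subject.encode("utf-8"))
    return enriched


def longer_than(byte_limit: int) -> str:
    """Build a range query on the hypothetical field for values above byte_limit."""
    return f"subject_bytes:[{byte_limit + 1} TO *]"


doc = with_subject_bytes({"id": "1", "subject": "预支款管理及账务处理办法"})
print(doc["subject_bytes"])  # prints: 36
print(longer_than(255))      # prints: subject_bytes:[256 TO *]
```

With the length stored as its own numeric field, the "more than 255 bytes" condition becomes a plain range query instead of a regex over the text.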

Query fields with data of certain length

2017-12-31 Thread Zheng Lin Edwin Yeo
Hi, I would like to check whether it is possible to query for a field whose data is longer than a certain length. For example, I want to find records where the field subject has more than 255 bytes. Is it possible? I am currently using Solr 6.5.1. Regards, Edwin