Hi Emir,

So this would likely be different from what the operating system counts, as
the operating system may count each Chinese character as 3 to 4 bytes, which
is probably why I could not find any record with subject:/.{255,}.*/
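
To illustrate the difference, here is a minimal Python sketch (not Solr code)
comparing the number of characters in a string with the number of bytes in its
UTF-8 encoding. The helper name char_and_byte_length and the sample strings are
made up for illustration, and UTF-8 is an assumption about the encoding being
measured:

```python
def char_and_byte_length(text):
    """Return (number of characters, number of UTF-8 bytes) for text."""
    # len() on a str counts Unicode code points, not bytes.
    # Encoding first gives the size of the UTF-8 representation.
    return len(text), len(text.encode("utf-8"))


english = "a" * 255   # ASCII letters take 1 byte each in UTF-8
chinese = "中" * 100   # U+4E2D takes 3 bytes in UTF-8

print(char_and_byte_length(english))  # (255, 255)
print(char_and_byte_length(chinese))  # (100, 300)
```

If the regex counts characters, a 100-character Chinese subject like the one
above would not match .{255,} even though it is 300 bytes long in UTF-8.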

Are there other tools that we can use to query the length of data that is
already indexed and is not in English (e.g. Chinese, Japanese, etc.)?

Regards,
Edwin

On 3 January 2018 at 23:51, Emir Arnautović <emir.arnauto...@sematext.com>
wrote:

> Hi Edwin,
> I do not know, but my guess would be that each character is counted as 1
> in the regex, regardless of how many bytes it takes in the encoding used.
>
> Regards,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 3 Jan 2018, at 16:43, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> >
> > Thanks for the reply.
> >
> > I am doing the search on existing data that has already been indexed, and
> > it is likely to be a one time thing.
> >
> > This  subject:/.{255,}.*/  works for English characters. However, there
> > are Chinese characters in some of the records. Their length seems to be
> > more than 255, but they do not show up in the results.
> >
> > Do you know how the length of text in Chinese and other languages is
> > determined?
> >
> > Regards,
> > Edwin
> >
> >
> > On 3 January 2018 at 23:01, Alexandre Rafalovitch <arafa...@gmail.com>
> > wrote:
> >
> >> Do that during indexing as Emir suggested. Specifically, use an
> >> UpdateRequestProcessor chain, probably with the Clone and FieldLength
> >> processors:
> >> http://www.solr-start.com/javadoc/solr-lucene/org/apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html
> >>
> >> Regards,
> >>   Alex.
> >>
> >> On 31 December 2017 at 22:00, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> >> wrote:
> >>> Hi,
> >>>
> >>> I would like to check whether it is possible to query for a field whose
> >>> data is longer than a certain length.
> >>>
> >>> For example, I want to find records whose subject field has more than
> >>> 255 bytes. Is that possible?
> >>>
> >>> I am currently using Solr 6.5.1.
> >>>
> >>> Regards,
> >>> Edwin
> >>
>
>
