Thanks Hugues, this is helpful! However, I was trying to write a native Java script that sorts values using ICU collation. Please find the JAR attached. The script runs without errors, but it returns incorrectly sorted results.
The requests I used are as follows:
PUT /custom1_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "LastName": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
PUT /custom1_index/my_type/1
{
  "LastName": "AAP"
}
PUT /custom1_index/my_type/2
{
  "LastName": "zara"
}
PUT /custom1_index/my_type/3
{
  "LastName": "beta"
}
GET /custom1_index/_search
{
  "script_fields": {
    "sort": {
      "script": "ICUSortingScriptFilter",
      "lang": "native",
      "params": {
        "field": "LastName"
      }
    },
    "type": "string"
  }
}
Please let me know if any correction is required in this script.
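For reference, the ordering I expect can be reproduced outside Elasticsearch. This is only a minimal sketch: it assumes the JDK's java.text.Collator as a stand-in for the ICU collator, it is not the code in the attached JAR, and the class name CollationOrderSketch and the extra value "Zed" are hypothetical and used for illustration only.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of the attached JAR.
public class CollationOrderSketch {

    // Sorts a copy of the values with a locale-aware collator.
    // Assumption: java.text.Collator stands in for the ICU collator here.
    static List<String> sortWithCollation(List<String> values) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // SECONDARY strength treats upper and lower case as equal.
        collator.setStrength(Collator.SECONDARY);
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted, collator);
        return sorted;
    }

    // Sorts a copy of the values by plain String comparison (UTF-16 code units),
    // which places all upper-case ASCII letters before lower-case ones.
    static List<String> sortByCodeUnit(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // "AAP", "zara" and "beta" are the indexed values; "Zed" is an added sample.
        List<String> lastNames = Arrays.asList("AAP", "zara", "beta", "Zed");
        System.out.println("collation: " + sortWithCollation(lastNames));
        System.out.println("code unit: " + sortByCodeUnit(lastNames));
    }
}
```

With collation, "Zed" sorts after "zara"; with plain comparison it sorts right after "AAP". The first ordering is what I would like the native script to return.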
Regards,
Angie
On Friday, 6 February 2015 11:40:53 UTC+5:30, Hugues Malphettes wrote:
>
> Hi Angie,
>
> On Friday, 6 February 2015 12:17:47 UTC+8, Geetanjali Paygude wrote:
>>
>> Hi Hugues,
>>
>> So you have extended "String" type to add custom analyzer.
>>
>> I am referring to this thread
>>
>> http://elasticsearch-users.115913.n3.nabble.com/Support-for-case-insensitive-sorts-with-doc-values-tt4064487.html
>>
>> Is there any way to use script/transform the source and then apply sort
>> on it? If yes will you please share the same.
>>
>
>>
>> As mentioned by Adrien, is there a work-around on the client side before the
>> data gets into elasticsearch, using some native / groovy script?
>>
> I believe Adrien suggested this procedure:
> - create a second field specifically to store the value as a
> docvalue/not_analyzed string
> - on the client-side analyze the string yourself
> - add the new value as a separate field in the document you index
> - "profit": use that new field for sorting and other queries
>
> A variation of this consists of delegating the generation of the second
> field's value to a _source transform.
> - create the same second field: docvalues-not_analyzed
> - define a source transform for the affected type of document
> - in the script of the source transform apply the transformation you need
> - "profit"
> You save some bandwidth: the _source of your document will never
> show the second value, and the impact on your client code is limited to the
> queries.
> ES will do more work, and the transformations available in the script might be
> limited.
>
>
>>
>> All I want is the following
>>
>>
>> 1. We have the ICU plugin, which helps us achieve custom sorting to some
>> extent.
>> 3. However, the problem now is that we are trying to use the doc_values =
>> true option in the mapping, but this cannot be used for string fields that
>> have an analyzer.
>> 4. So if we need to use the ICU plugin, then we cannot use the doc_values option.
>> 5. The other way is to use the ICU plugin as a library, i.e. we call some API
>> in that plugin which converts our field into the required format for sorting.
>>
>>
>> So is there a way to call some API or transform input using script ?
>>
> I suspect it might be difficult to invoke the ICU transformation via a
> groovy script.
> You could make it work with a native script written in java.
>
>>
>>
>> OR If I use your analyzer in a native script, how to invoke the same from
>> mappings. Please provide usage example
>>
> My code snippet is in fact a new mapping type; not an analyzer.
> It is more or less a fork of the original string mapping as defined inside
> Elasticsearch.
>
> I have packaged this new mapping type in a plugin here:
> https://github.com/hmalphettes/elasticsearch-docvalues-string
>
> It is a work in progress. Help is welcome if it is useful for you.
>
> I hope this helps.
> Let us know,
> Hugues
>
>
>>
>>
>>
>> Thanks,
>> Angie
>>
>> On Friday, 14 November 2014 06:55:36 UTC+5:30, Hugues Malphettes wrote:
>>>
>>> Hi Adrian and everyone,
>>>
>>> I took a shot at extending the 'string' type to add another analyzer:
>>> https://gist.github.com/hmalphettes/b402d72230e9009f960c
>>>
>>> The parameter "index_docvalues_analyzer" when present on the mapping
>>> definition will generate a Token Stream and the first token is stored as a
>>> SortedSetDocValuesField.
>>>
>>> It works for me. Would it be interesting to make this part of the
>>> standard StringFieldMapper?
>>>
>>> Cheers!
>>> Hugues
>>>
>>> On Tuesday, 7 October 2014 18:11:18 UTC+8, Hugues Malphettes wrote:
>>>>
>>>> Thanks Adrian,
>>>>
>>>> I'll give a shot at the source transform then.
>>>>
>>>> If you consider that it makes sense to support this, would it be
>>>> helpful to file an enhancement request on github?
>>>> Give us a hint if you think it can be done by an occasional contributor
>>>> ;-)
>>>>
>>>> Cheers,
>>>> Hugues
>>>>
>>>> On Tuesday, 7 October 2014 17:59:29 UTC+8, Adrien Grand wrote:
>>>>>
>>>>> Hi Hugues,
>>>>>
>>>>> For now the work-around would indeed be to do the work on client-side
>>>>> before the data gets into elasticsearch (or potentially using the _source
>>>>> transform[1] feature).
>>>>>
>>>>> [1]
>>>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html
>>>>>
>>>>> On Tue, Oct 7, 2014 at 9:19 AM, Hugues Malphettes <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> Case insensitive sort is elegantly supported by using a custom
>>>>>> analyzer [1].
>>>>>> `doc values` are documented as a great fit for sorting [2] to save
>>>>>> heap memory.
>>>>>>
>>>>>> However, doc values are not supported for analyzed strings at the moment.
>>>>>>
>>>>>> Are we planning to support doc values for analyzers that emit a
>>>>>> single token per string?
>>>>>> Is it worth it to have the ES client do the lower-casing and
>>>>>> collation itself?
>>>>>>
>>>>>> Thanks!
>>>>>> Hugues
>>>>>>
>>>>>> [1]
>>>>>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/sorting-collations.html#case-insensitive-sorting
>>>>>> [2]
>>>>>> http://www.elasticsearch.org/blog/elasticsearch-1-4-0-beta-released/
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "elasticsearch" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/elasticsearch/4095e5b7-1fb4-477a-b27f-3e4519ab9000%40googlegroups.com
>>>>>>
>>>>>> <https://groups.google.com/d/msgid/elasticsearch/4095e5b7-1fb4-477a-b27f-3e4519ab9000%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Adrien Grand
>>>>>
>>>>
elasticssearchcustom.jar
Description: application/java-archive
