zhangshenghang commented on issue #10231:
URL: https://github.com/apache/seatunnel/issues/10231#issuecomment-3738829287

> > > > Using the ElasticSearch connector directly will automatically recognize OpenSearch
> > >
> > > Thanks a lot, it does work fine. [@zhangshenghang](https://github.com/zhangshenghang)
> > >
> > > However, I ran into another problem when using SeaTunnel to migrate data from Elasticsearch to OpenSearch/Elasticsearch.
> > >
> > > ES/OS integer fields support int, array, array<array>, ... when defined like this:
> > >
> > > "position_int": { "type": "integer", "doc_values": true, "store": true },
> > >
> > > But SeaTunnel only supports array. I hit the following error, which I could not fix with any approach I could think of.
> > >
> > > Caused by: java.lang.NumberFormatException: For input string: "[[14,13,13,13,13]]"
> > >     at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> > >     at java.base/java.lang.Integer.parseInt(Integer.java:652)
> > >     at java.base/java.lang.Integer.parseInt(Integer.java:770)
> > >     at org.apache.seatunnel.connectors.seatunnel.elasticsearch.serialize.source.DefaultSeaTunnelRowDeserializer.convertValue(DefaultSeaTunnelRowDeserializer.java:163)
> > >
> > > Could you help me fix this? Or is this a bug? Or is this a rule that can't be modified?
> > >
> > > You can refer to the array_column parameter in https://seatunnel.apache.org/docs/connector-v2/source/Elasticsearch/ to see if it can solve your problem.
> >
> > [@zhangshenghang](https://github.com/zhangshenghang) Hi, from my testing I can confirm that this is a bug or a mechanism mismatch in SeaTunnel. It does not happen when using Logstash to transfer data from ES to ES/OpenSearch, which accommodates this behavior.
> >
> > In ES, int / array / array<array> values are all flattened in Lucene's storage (a mechanism of dynamic type leniency). The fundamental problem is that ES accepts any element that is an integer, no matter which container the element is in, whereas SeaTunnel accepts only the container type that is declared, i.e. int / array / array<array>. As the following picture shows, this case works well in ES.
> >
> > [Image]
> >
> > However, whether I configure this field as array_column or int in SeaTunnel's config file, it does not work at all.
> >
> > Does SeaTunnel's Elasticsearch team plan to solve this problem? Or would you be open to a PR? I think I can add an adapter that handles Elasticsearch's special mechanism of dynamic type leniency. (I am a software engineer focusing on ES/OpenSearch.)
> >
> > Looking forward to your reply, thanks

Thanks @pyyuhao, thank you for your careful consideration. I look forward to your plan.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
