[ 
https://issues.apache.org/jira/browse/FLINK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679756#comment-16679756
 ] 

ASF GitHub Bot commented on FLINK-10801:
----------------------------------------

pnowojski opened a new pull request #7060: [FLINK-10801][e2e] Retry 
verify_result_hash in elastichsearch-common
URL: https://github.com/apache/flink/pull/7060
 
 
   Instead of looping the verification until the expected number of results 
is reached, loop until we get the correct output. This tries to solve the 
problem of some records (aggregated? updated?) arriving later.
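
The retry-until-correct idea described above can be sketched roughly as follows. This is a minimal Python illustration, not the shell code from the pull request: the names `md5_hex` and `wait_for_expected_hash`, the attempt limit, and the delay are hypothetical, and the use of MD5 is an assumption based on the 32-hex-digit hashes in the failure report.

```python
import hashlib
import time
from typing import Callable


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of text (MD5 is an assumption here)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def wait_for_expected_hash(
    fetch_output: Callable[[], str],
    expected_hash: str,
    max_attempts: int = 30,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_output until its hash equals expected_hash.

    Returns True as soon as the hashes match, or False once
    max_attempts fetches have all produced a different hash.
    """
    for attempt in range(1, max_attempts + 1):
        if md5_hex(fetch_output()) == expected_hash:
            return True
        if attempt < max_attempts:
            # Late records may still be on their way; wait and retry.
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    # Simulated query results: the count for "Bob" is updated late.
    responses = iter(['{"user_count": 1}', '{"user_count": 1}', '{"user_count": 2}'])
    expected = md5_hex('{"user_count": 2}')
    ok = wait_for_expected_hash(lambda: next(responses), expected,
                                max_attempts=5, delay_seconds=0.0)
    print("PASS" if ok else "FAIL")
```

Compared with waiting only for the expected number of hits, this keeps polling while the record count is already right but a field value (such as `user_count`) is still stale, and it still fails after a bounded number of attempts.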
   
   ## Verifying this change
   
   This is a change in end-to-end tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix sql client integrate elasticsearch connector test failure
> -------------------------------------------------------------
>
>                 Key: FLINK-10801
>                 URL: https://issues.apache.org/jira/browse/FLINK-10801
>             Project: Flink
>          Issue Type: Bug
>          Components: E2E Tests
>            Reporter: vinoyang
>            Assignee: Piotr Nowojski
>            Priority: Major
>              Labels: pull-request-available
>
> The test usually fails with the following report:
> {code:java}
> FAIL SQL Client Elasticsearch Upsert: Output hash mismatch. Got 
> 6187222e109ee9222e6b2f117742070c, expected 982cb32908def9801e781381c1b8a8db.
> head hexdump of actual:
> 0000000 { \n " h i t s " : { \n 
> 0000010 " t o t a l " : 3 , \n
> 0000020 " m a x _ s c o r e " 
> 0000030 : 1 . 0 , \n " h i t s
> 0000040 " : [ \n { \n 
> 0000050 " _ i n d e x " :
> 0000060 " m y _ u s e r s " , \n 
> 0000070 " _ t y p e " : "
> 0000080 u s e r " , \n "
> 0000090 _ i d " : " 1 _ B o b "
> 00000a0 , \n " _ s c o r
> 00000b0 e " : 1 . 0 , \n 
> 00000ba
> {code}
> The actual hash corresponds to this output:
> {code:java}
> {
>   "hits" : {
>     "total" : 3,
>     "max_score" : 1.0,
>     "hits" : [
>       {
>         "_index" : "my_users",
>         "_type" : "user",
>         "_id" : "1_Bob  ",
>         "_score" : 1.0,
>         "_source" : {
>           "user_id" : 1,
>           "user_name" : "Bob  ",
>           "user_count" : 1
>         }
>       },
>       {
>         "_index" : "my_users",
>         "_type" : "user",
>         "_id" : "22_Alice",
>         "_score" : 1.0,
>         "_source" : {
>           "user_id" : 22,
>           "user_name" : "Alice",
>           "user_count" : 1
>         }
>       },
>       {
>         "_index" : "my_users",
>         "_type" : "user",
>         "_id" : "42_Greg ",
>         "_score" : 1.0,
>         "_source" : {
>           "user_id" : 42,
>           "user_name" : "Greg ",
>           "user_count" : 3
>         }
>       }
>     ]
>   }
> }
> {code}
> The expected hash corresponds to this output:
> {code:java}
> {
>   "hits" : {
>     "total" : 3,
>     "max_score" : 1.0,
>     "hits" : [
>       {
>         "_index" : "my_users",
>         "_type" : "user",
>         "_id" : "1_Bob  ",
>         "_score" : 1.0,
>         "_source" : {
>           "user_id" : 1,
>           "user_name" : "Bob  ",
>           "user_count" : 2
>         }
>       },
>       {
>         "_index" : "my_users",
>         "_type" : "user",
>         "_id" : "22_Alice",
>         "_score" : 1.0,
>         "_source" : {
>           "user_id" : 22,
>           "user_name" : "Alice",
>           "user_count" : 1
>         }
>       },
>       {
>         "_index" : "my_users",
>         "_type" : "user",
>         "_id" : "42_Greg ",
>         "_score" : 1.0,
>         "_source" : {
>           "user_id" : 42,
>           "user_name" : "Greg ",
>           "user_count" : 3
>         }
>       }
>     ]
>   }
> }
> {code}
> It seems that the user count for "Bob" is off by 1 (actual 1, expected 2).
> The suspected cause is that the aggregated results are read from 
> Elasticsearch prematurely.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
