SbloodyS commented on issue #8790:
URL: https://github.com/apache/dolphinscheduler/issues/8790#issuecomment-1064089509


   > and this is the full error info: `Duplicate key TaskDefinition{id=0, 
code=4720070430336, name='execute_tidb', version=1, description='', 
projectCode=0, userId=0, taskType=PYTHON, 
taskParams='{"resourceList":[{"id":7},{"id":9},{"id":19},{"id":13},{"id":21},{"id":12},{"id":22},{"id":60},{"id":72},{"id":71},{"id":77},{"id":81}],"localParams":[],"rawScript":"from
 utils import postgres_functions, mysql_functions, aws_functions, 
avro_functions, slack_functions\nimport os\n\nhostname = 
os.environ[\"HOSTNAME\"]\nserver = hostname.split(\"-\")[2]\n\nif server == 
\"staging\":\n secret_name = 
\"dtsta/eks/dolphinscheduler/app-credentials\"\n\nelif server == \"dev\":\n 
secret_name = \"dtdev/eks/dolphinscheduler/app-credentials\"\n\nelif server == 
\"production\":\n secret_name = 
\"dtpro/eks/dolphinscheduler/app-credentials\"\n\nregion_name = 
\"ap-southeast-1\"\naws_secrets = aws_functions.get_secrets(secret_name, 
region_name)\n\n# tidb connection\ntidb_host = 
aws_secrets[\"TIDB_BEHAVIOR_HOST\"]\ntidb_port = int(aws_secrets[\"TIDB_BEHAVIOR_PORT\"])\ntidb_user = 
aws_secrets[\"TIDB_BEHAVIOR_USERNAME\"]\ntidb_password = 
aws_secrets[\"TIDB_BEHAVIOR_PASSWORD\"]\ntidb_database = 
aws_secrets[\"TIDB_REPORTING_DATABASE\"]\ntidb_conn = 
mysql_functions.create_mysql_connection(tidb_host, tidb_port, tidb_user, 
tidb_password, tidb_database)\n\nsql = \"\"\"\nCREATE TABLE IF NOT EXISTS 
datawarehouse_reporting.dwd_crypto_fiat_payment_views_bcb_multiple_disbursement_cur\n(\n
 vendor_transaction_id varchar(255) not null\n\t\tprimary key,\n status 
varchar(255) ,\n created_at timestamp(6) NOT NULL,\n currency varchar(255) ,\n 
amount numeric(28,6),\n reference text NOT NULL,\n account_id text NOT NULL,\n 
beneficiary_id text NOT NULL,\n multiple_disbursement_found boolean,\n 
slack_posted_indicator integer,\n remarks text\n)\n;\n\"\"\"\n\nresults = 
mysql_functions.execute_mysql_sql(tidb_conn, 
sql)\nprint(results)","dependence":{},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}', 
taskParamList=null, taskParamMap=null, flag=YES, 
taskPriority=MEDIUM, userName='null', projectName='null', 
workerGroup='default', failRetryTimes=1, environmentCode='-1', 
failRetryInterval=1, timeoutFlag=CLOSE, timeoutNotifyStrategy=WARN, timeout=0, 
delayTime=0, resourceIds='null', createTime=null, updateTime=null}` and what I 
have looked up in my database from table t_ds_task_definition_log: <img 
alt="Screen Shot 2022-03-10 at 8 22 00 PM" width="1229" 
src="https://user-images.githubusercontent.com/89381700/157660821-e4c76ef4-e372-44b0-8a6b-a511aee961e9.png">
   > 
   > it seems to have reset the version of my task to 1
   
   From the screenshot you provided, it seems there is no duplicate data in the 
`t_ds_task_definition_log` table. Can you check whether there is any 
duplicate data in `t_ds_process_task_relation_log` using the following SQL? 
Any row with `cnt` greater than 1 is a duplicated relation.
   
   ```
   SELECT
       a.process_definition_code
       ,MAX(a.id) as max_id
       ,a.pre_task_code
       ,a.pre_task_version
       ,a.post_task_code
       ,a.post_task_version
       ,a.process_definition_version
       ,COUNT(*) cnt
   FROM
   t_ds_process_task_relation_log a
   JOIN (
       SELECT
           code
       FROM
           t_ds_process_definition
       GROUP BY code
   )b ON b.code = a.process_definition_code
   WHERE 1=1
   GROUP BY a.pre_task_code
       ,a.post_task_code
       ,a.pre_task_version
       ,a.post_task_version
       ,a.process_definition_code
       ,a.process_definition_version
   ```
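
To show what a duplicate would look like in that result, here is a minimal sketch that runs the same grouping against an in-memory SQLite table. It makes several assumptions: the table is a hypothetical stand-in holding only the columns named in the query above, the sample rows are made up, the join to `t_ds_process_definition` is left out for brevity, and a `HAVING COUNT(*) > 1` clause is added so that only duplicated relations are returned. It is not the DolphinScheduler schema or code.

```python
import sqlite3

# Hypothetical stand-in for t_ds_process_task_relation_log: an in-memory
# SQLite table with only the columns used by the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE t_ds_process_task_relation_log (
        id INTEGER PRIMARY KEY,
        process_definition_code INTEGER,
        process_definition_version INTEGER,
        pre_task_code INTEGER,
        pre_task_version INTEGER,
        post_task_code INTEGER,
        post_task_version INTEGER
    )
    """
)

# Made-up sample rows; rows 2 and 3 describe the same relation twice.
rows = [
    (1, 100, 1, 0, 0, 200, 1),
    (2, 100, 1, 200, 1, 300, 1),
    (3, 100, 1, 200, 1, 300, 1),
]
conn.executemany(
    "INSERT INTO t_ds_process_task_relation_log VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)


def find_duplicate_relations(connection):
    """Return (process_definition_code, max_id, cnt) for repeated relations."""
    return connection.execute(
        """
        SELECT process_definition_code, MAX(id) AS max_id, COUNT(*) AS cnt
        FROM t_ds_process_task_relation_log
        GROUP BY pre_task_code, post_task_code, pre_task_version,
                 post_task_version, process_definition_code,
                 process_definition_version
        HAVING COUNT(*) > 1
        """
    ).fetchall()


duplicates = find_duplicate_relations(conn)
print(duplicates)
```

An empty list means every relation is unique. With the sample rows, the repeated relation comes back with a count of 2.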

