nikfio commented on issue #3102: URL: https://github.com/apache/arrow-adbc/issues/3102#issuecomment-3288877196
Hi all, sorry for the late reply.

I managed to get the result I wanted by explicitly deleting every row that shares the exact same timestamp with another row, keeping only the first one encountered. Here is the code snippet:

```python
conn = self.connect()

# Delete duplicates: keep the lowest ROWID for each timestamp value
query_clean = f'''DELETE FROM {target_table}
    WHERE ROWID NOT IN (
        SELECT MIN(ROWID)
        FROM {target_table}
        GROUP BY {BASE_DATA_COLUMN_NAME.TIMESTAMP}
    );'''

cur = conn.cursor()
cur.execute(query_clean)

# Close the cursor, commit the delete, then close the connection
cur.close()
conn.commit()
conn.close()
```

Let me know what you think of the solution. Of course, it assumes that a timestamp column is already in place, which may be a hard constraint. The timestamp column could be replaced by any column whose value identifies a row, so that two rows share a value only when they are duplicates of each other.

Thanks,
Nick
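For anyone who wants to try the statement in isolation, here is a minimal sketch that runs the same `DELETE ... WHERE ROWID NOT IN (SELECT MIN(ROWID) ...)` query against an in-memory SQLite database through Python's standard `sqlite3` module. The helper `delete_duplicates`, the table name `prices`, and its columns are hypothetical names chosen for illustration. The sketch assumes a backend that exposes `ROWID` the way SQLite does, and it does not go through ADBC.

```python
import sqlite3


def delete_duplicates(conn, table, key_column):
    """Keep the lowest-ROWID row for each key_column value and delete the rest.

    Hypothetical helper for illustration. Identifiers cannot be bound as
    query parameters, so they are interpolated: pass trusted names only.
    """
    query = f"""
        DELETE FROM {table}
        WHERE ROWID NOT IN (
            SELECT MIN(ROWID) FROM {table} GROUP BY {key_column}
        );
    """
    cur = conn.cursor()
    try:
        cur.execute(query)
        deleted = cur.rowcount  # number of rows removed by the DELETE
    finally:
        cur.close()
    conn.commit()
    return deleted


# Hypothetical table with one duplicated timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (timestamp TEXT, value REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [
        ("2024-01-01T00:00:00", 1.0),
        ("2024-01-01T00:00:00", 1.0),  # duplicate of the first row
        ("2024-01-01T00:01:00", 2.0),
    ],
)

deleted = delete_duplicates(conn, "prices", "timestamp")
remaining = conn.execute(
    "SELECT timestamp, value FROM prices ORDER BY ROWID"
).fetchall()
print(deleted, remaining)
conn.close()
```

Because the grouping is on the key column alone, two rows with the same timestamp but different values in other columns would also be collapsed into one. Grouping on every column that defines a duplicate avoids that.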