kevinlo opened a new issue, #2910: URL: https://github.com/apache/drill/issues/2910
I am new to Apache Drill. I posted this question on Stack Overflow and did not get an answer, so I am trying here to see if someone can help. I am sorry if this is not the right place.

I have a large (more than 8.5 GB) CSV file that is updated on the first day of each month. From the 2nd to the last day of each month, new or updated data can arrive in JSON format. These JSON updates are merged into the CSV, which becomes the new CSV on the first day of the next month.

I converted the CSV to Parquet and queried it in Apache Drill, and that works fine. But how can I query the big file together with the update files?

For example, the Apr 1st CSV file has:

| ID | Name | Value | LastUpdatedTime |
|----|------|-------|-----------------|
| 100 | John | 98 | 2024-01-05 |

The Apr 15 JSON file has:

| ID | Name | Value | LastUpdatedTime |
|----|------|-------|-----------------|
| 100 | John | 100 | 2024-04-15 |

When I query all these files for ID = 100, the result should be Value = 100, because that record has the newer LastUpdatedTime.

I found this [post](https://stackoverflow.com/questions/48660704/update-insert-when-modifying-rdbmss-using-drill) saying people use Drill on data that is no longer changing. Is that true?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
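
To make the expected result concrete, the "newest LastUpdatedTime wins" rule from the question can be sketched in plain Python. This is only an illustration of the desired semantics, not Drill code: the function name `latest_per_id` and the in-memory row dictionaries are hypothetical, and the two sample rows are the ones given in the question.

```python
from datetime import date


def latest_per_id(*sources):
    """Merge several row lists, keeping for each ID the row with the newest LastUpdatedTime."""
    latest = {}
    for rows in sources:
        for row in rows:
            current = latest.get(row["ID"])
            # Replace the stored row only if this one is strictly newer.
            if current is None or (
                date.fromisoformat(row["LastUpdatedTime"])
                > date.fromisoformat(current["LastUpdatedTime"])
            ):
                latest[row["ID"]] = row
    return latest


# Row from the Apr 1st base file (CSV converted to Parquet in the question).
base_rows = [{"ID": 100, "Name": "John", "Value": 98, "LastUpdatedTime": "2024-01-05"}]
# Row from the Apr 15 JSON update file.
update_rows = [{"ID": 100, "Name": "John", "Value": 100, "LastUpdatedTime": "2024-04-15"}]

merged = latest_per_id(base_rows, update_rows)
print(merged[100]["Value"])  # prints 100
```

In SQL the same rule is commonly written as a `UNION ALL` over both sources followed by `ROW_NUMBER() OVER (PARTITION BY ID ORDER BY LastUpdatedTime DESC)` and a filter on the first row. Whether that performs acceptably in Drill on a file of this size is not something this sketch addresses.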