kevinlo opened a new issue, #2910:
URL: https://github.com/apache/drill/issues/2910

   I am new to Apache Drill. I posted this question on Stack Overflow but did 
not get an answer, so I am trying here to see if someone can help. I 
am sorry if this is not the right place.
   
   I have a large (more than 8.5 GB) CSV file that is updated on the first day 
of each month. From the 2nd to the last day of the month, new or updated 
data can arrive in JSON format. This JSON data is merged into the CSV, which 
becomes the new CSV on the first day of the next month.
   
   I converted the CSV to Parquet and queried it in Apache Drill, and it works 
fine. But how can I query the big file together with the update file?
   
   E.g. the Apr 1st CSV file has
   
   ID          Name           Value    LastUpdatedTime
   100         John           98       2024-01-05
   
   and the Apr 15 JSON file has
   
   ID          Name           Value    LastUpdatedTime
   100         John           100      2024-04-15
   
   When I query all these files for ID = 100, the result should be Value=100, 
because that row has the newer LastUpdatedTime.
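   
   The expected result can be made concrete with a small Python sketch. It only illustrates the "newest LastUpdatedTime wins" rule on the two sample rows above; it is not Drill code, and the helper name `latest_per_id` is hypothetical.

```python
from datetime import date


def latest_per_id(*sources):
    """Return one row per ID, keeping the row with the newest LastUpdatedTime.

    Hypothetical helper: each source is an iterable of dict rows that
    share the columns ID, Name, Value and LastUpdatedTime.
    """
    latest = {}
    for rows in sources:
        for row in rows:
            current = latest.get(row["ID"])
            # Keep the row if the ID is new or this row is strictly newer.
            if current is None or row["LastUpdatedTime"] > current["LastUpdatedTime"]:
                latest[row["ID"]] = row
    return latest


# Sample rows from the question: the monthly CSV and the mid-month JSON update.
csv_rows = [
    {"ID": 100, "Name": "John", "Value": 98, "LastUpdatedTime": date(2024, 1, 5)},
]
json_rows = [
    {"ID": 100, "Name": "John", "Value": 100, "LastUpdatedTime": date(2024, 4, 15)},
]

merged = latest_per_id(csv_rows, json_rows)
print(merged[100]["Value"])  # 100
```

   The order in which the sources are passed does not matter, because only the timestamp decides which row is kept.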
   
   I found this 
[post](https://stackoverflow.com/questions/48660704/update-insert-when-modifying-rdbmss-using-drill), 
which says that people use Drill on data that is no longer changing.
   
   Is that true?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
