[ 
https://issues.apache.org/jira/browse/DRILL-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797083#comment-15797083
 ] 

Padma Penumarthy edited comment on DRILL-5127 at 1/4/17 4:07 AM:
-----------------------------------------------------------------

Drill does not support concurrent updates to metadata cache files, so cache 
files can get corrupted if multiple writers try to update them at the same 
time.  The fix for DRILL-4831 addresses this corruption by writing to a 
temporary file first and doing an atomic rename at the end.  However, the 
problem is that rename does not update the modification time of the file that 
is renamed. Because the parent directory is changed, its modification time 
is updated. All file systems (HDFS, MapR FS, Linux, Mac OS) behave this way. 
During query processing, we check the file's modification time against the 
parent directory's modification time to figure out whether the metadata cache 
is stale and needs to be rebuilt. Every time we rebuild, we end up changing 
the parent directory's modification time, and that causes the cache to be 
rebuilt again the next time. One solution that has been tested and fixes this 
issue is to set the modification time of the file to match the parent 
directory's using the FileSystem setTimes API.  However, with that change we 
seem to be hitting other known issues more frequently. 
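
The mechanism described above can be sketched roughly as follows. This is only 
an illustration on a local file system using java.nio.file, not Drill's actual 
code and not the Hadoop FileSystem API; the class and method names are 
hypothetical, and alignWithParent merely stands in for what a setTimes call 
would do. Times are compared in milliseconds to avoid precision differences 
between file systems.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;

// Hypothetical sketch; names are illustrative and do not come from Drill.
public class MetadataCacheSketch {

    // Write to a temporary sibling file, then atomically rename it into place
    // so readers never see a partially written cache file.
    // Assumes cacheFile has a parent directory (e.g. it is an absolute path).
    static void writeCacheAtomically(Path cacheFile, String contents) throws IOException {
        Path tmp = Files.createTempFile(cacheFile.getParent(), ".cache", ".tmp");
        Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
        // The rename changes the parent directory (bumping its modification
        // time) but leaves the file's own modification time untouched.
        Files.move(tmp, cacheFile, StandardCopyOption.ATOMIC_MOVE);
    }

    // Staleness check as described: the cache is considered stale when the
    // parent directory was modified after the cache file.
    static boolean isStale(Path cacheFile) throws IOException {
        long fileMillis = Files.getLastModifiedTime(cacheFile).toMillis();
        long dirMillis = Files.getLastModifiedTime(cacheFile.getParent()).toMillis();
        return dirMillis > fileMillis;
    }

    // Proposed workaround: set the file's modification time to the parent
    // directory's. Changing a file's timestamp does not modify the directory,
    // so the two stay equal until the directory changes again.
    static void alignWithParent(Path cacheFile) throws IOException {
        long dirMillis = Files.getLastModifiedTime(cacheFile.getParent()).toMillis();
        Files.setLastModifiedTime(cacheFile, FileTime.fromMillis(dirMillis));
    }
}
```

Calling writeCacheAtomically followed by alignWithParent on each rebuild would 
keep isStale from reporting the freshly written cache as stale, which is the 
loop the comment describes.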



was (Author: ppenumarthy):
Drill does not support concurrent updates to metadata cache files, so cache 
files can get corrupted if multiple writers try to update them at the same 
time.  The fix for DRILL-4831 addresses this corruption by writing to a 
temporary file first and doing an atomic rename at the end.  However, the 
problem is that rename does not update the modification time of the file that 
is renamed. Because the parent directory is changed, its modification time 
is updated. All file systems (HDFS, MapR FS, Linux, Mac OS) behave this way. 
During query processing, we check the file's modification time against the 
parent directory's modification time to figure out whether the metadata cache 
is stale and needs to be rebuilt. One solution that has been tested and fixes 
this issue is to set the modification time of the file to match the parent 
directory's using the FileSystem setTimes API.  However, we seem to be hitting 
other known issues more frequently. 


> Revert the fix for DRILL-4831
> -----------------------------
>
>                 Key: DRILL-5127
>                 URL: https://issues.apache.org/jira/browse/DRILL-5127
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.10
>            Reporter: Rahul Challapalli
>
> Git Commit # : 3f3811818ecc3bbf6f307a408c30f0406fadc703
> DRILL-4831 introduced a major regression, DRILL-5082. I tested the supposed 
> fix for DRILL-5082, and it increased the frequency of other known issues 
> (DRILL-5115 & DRILL-4832). Since there is no fix in sight before the next 
> release (DRILL-1.11), I suggest we back out the original fix made for 
> DRILL-4831.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)