Re: Review Request 62360: HIVE-16898: Validation of source file after distcp in repl load

Daniel Dai Mon, 18 Sep 2017 15:55:27 -0700


> On Sept. 18, 2017, 4:49 a.m., anishek wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
> > Lines 73 (patched)
> > <https://reviews.apache.org/r/62360/diff/1/?file=1828081#file1828081line73>
> >
> >     Whether to do a regularCopy or distCp can be evaluated in the 
> > innermost function call; this avoids passing in another variable from the 
> > top that can be evaluated later.


I need to cache useRegularCopy and pass it to multiple doCopyRetry calls; 
that's why I put it in the outer function.
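
A minimal sketch of that arrangement, for illustration only. Apart from 
useRegularCopy and doCopyRetry, every name here is hypothetical, and the 
size-threshold rule is an assumption rather than the rule used in the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the CopyUtils code from the patch.
final class CopyModeSketch {
    // Records which mode each file was copied with, so the behaviour is observable.
    static final List<String> log = new ArrayList<>();

    // Assumed rule: small batches use a regular copy, large ones use distcp.
    static boolean decideRegularCopy(long totalBytes, long maxBytesForRegularCopy) {
        return totalBytes <= maxBytesForRegularCopy;
    }

    // Inner call: receives the cached decision instead of re-evaluating it.
    static void doCopyRetry(String file, boolean useRegularCopy) {
        log.add((useRegularCopy ? "regular:" : "distcp:") + file);
    }

    // Outer call: evaluates the decision once and reuses it for every file.
    static void copyAll(List<String> files, long totalBytes, long maxBytes) {
        boolean useRegularCopy = decideRegularCopy(totalBytes, maxBytes);
        for (String file : files) {
            doCopyRetry(file, useRegularCopy);
        }
    }
}
```

Evaluating the flag once in the outer call keeps every doCopyRetry call in 
the batch on the same copy mode.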


> On Sept. 18, 2017, 4:49 a.m., anishek wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
> > Lines 92 (patched)
> > <https://reviews.apache.org/r/62360/diff/1/?file=1828081#file1828081line92>
> >
> >     I think eventually we have to move to a model of comparing the checksum 
> > on sourceFS vs destinationFS as you have done here. However, certain FS 
> > configurations change the value of the checksum, and unless we can 
> > guarantee that we calculate the checksum by reading the data, this might 
> > lead to more failures.
> >     
> >     I thought the idea for now was that:
> >     
> >     1>> we get the checksum of the file on sourceFS before the copy
> >     2>> we do the copy
> >     3>> we get the checksum of the file on sourceFS again
> >     4>> we compare the checksums from 1 and 3; if it has not changed, then 
> > the file did not change during our copy either.
> >     
> >     Until we can figure out the actual solution to this, the fallback of 
> > doing the check on sourceFS might be the way to go.

Yes, that's right. The checksum of the file is in _files.
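
The four steps above can be sketched roughly as follows. This is an 
illustration on the local file system with java.nio and SHA-256, not the 
Hadoop FileSystem checksum API; the class and method names are hypothetical, 
and the "recorded" argument stands in for the checksum stored in _files:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch only; not the CopyUtils code from the patch.
final class SourceChecksumSketch {
    // Checksum computed by reading the file's bytes (assumed algorithm: SHA-256).
    static byte[] checksum(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        return digest.digest(Files.readAllBytes(file));
    }

    // Copies src to dst, then re-reads the checksum of src and compares it with
    // the checksum recorded before the copy. Returns true if src was unchanged.
    static boolean copyThenRecheckSource(Path src, Path dst, byte[] recorded)
            throws IOException, NoSuchAlgorithmException {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        return MessageDigest.isEqual(recorded, checksum(src));
    }
}
```

A false result means the source changed at some point after the checksum was 
recorded, so the copied data cannot be trusted.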


> On Sept. 18, 2017, 4:49 a.m., anishek wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
> > Lines 116 (patched)
> > <https://reviews.apache.org/r/62360/diff/1/?file=1828081#file1828081line116>
> >
> >     If the copy fails with a FileNotFoundException for a file at its 
> > actual location on HDFS, then we should retry with the corresponding 
> > CMRoot path for this file, since it was moved while we were in the 
> > process of doing the copy.
> >     
> >     Also, if this happens for a CM root file, then there is an issue in our 
> > configuration such that the CM root FS is cleaned before the copy is done, 
> > and we should log this as an error, as the cleaner thread for CMRoot is not 
> > configured for the right time. I'd rather fail repl load instead of just 
> > logging the error; otherwise we might not know how many such instances 
> > happen before we realize that replication is broken.

Retry with CM path is part of copyAndVerify. doCopyRetry is shared between 
regular import and repl load, so it does not deal with CM logic.
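
A rough sketch of that layering, again on the local file system with 
java.nio. All names are hypothetical; java.nio reports a missing source as 
NoSuchFileException, which plays the role of FileNotFoundException here:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch only; not the CopyUtils code from the patch.
final class CmFallbackSketch {
    // Generic copy shared by all callers; it knows nothing about CM paths.
    static void plainCopy(Path src, Path dst) throws IOException {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    }

    // Replication-specific wrapper: try the original location first and fall
    // back to the CM path if the source has disappeared. Returns the path that
    // was actually copied.
    static Path copyWithCmFallback(Path src, Path cmPath, Path dst) throws IOException {
        try {
            plainCopy(src, dst);
            return src;
        } catch (NoSuchFileException e) {
            // If the CM copy is missing too, the exception propagates to the caller.
            plainCopy(cmPath, dst);
            return cmPath;
        }
    }
}
```

In this sketch a missing CM file is not swallowed: the exception reaches the 
caller, which corresponds to the suggestion above to fail the load rather 
than only log the error.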


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62360/#review185534
-----------------------------------------------------------


On Sept. 15, 2017, 6:10 p.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62360/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2017, 6:10 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See HIVE-16898
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java 
> 88d6a7a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 54746d3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java 28e7bcb 
> 
> 
> Diff: https://reviews.apache.org/r/62360/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested with a debugger: set a breakpoint right before the copy, and 
> drop the table in another session.
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>
