[ https://issues.apache.org/jira/browse/IMPALA-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger resolved IMPALA-6108.
-------------------------------------
    Resolution: Fixed

{code}
commit 76111ce168c25a7882a13705963dc3c7118121a3
Author: Philip Zeyliger <[email protected]>
Date:   Wed Oct 25 16:38:22 2017 -0700

    IMPALA-6108, IMPALA-6070: Parallel data load (re-instated).

    This is a revert of a revert, re-enabling parallel data load.  It avoids
    the race condition by explicitly configuring the temporary directory in
    question in load-data.py.

    When the parallel data load change went in, we discovered
    a race with a signature of:

      java.io.FileNotFoundException: File
      /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist

    The number in this path is milliseconds since the epoch, and the race
    occurs when two queries submitted to HiveServer2, running with the local
    runner, hit the same millisecond time stamp.  The upstream bug is
    https://issues.apache.org/jira/browse/MAPREDUCE-6441, and I described the
    symptoms in https://issues.apache.org/jira/browse/MAPREDUCE-6992 (which
    is now marked as a dupe).

    I've tested this by running data load 5 times on the same machines
    where it failed before. I also ran data load manually and inspected
    the system to make sure that the temporary directories are getting
    created as expected in /tmp/impala-data-load-*.

    Change-Id: I60d65794da08de4bb3eb439a2414c095f5be0c10
    Reviewed-on: http://gerrit.cloudera.org:8080/8405
    Reviewed-by: Tim Armstrong <[email protected]>
    Tested-by: Impala Public Jenkins
{code}
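
To illustrate the race described in the commit message, here is a minimal sketch, not the actual load-data.py change. It assumes only what the message states: the failing path is named after milliseconds since the epoch, and the fix gives each load its own directory under /tmp/impala-data-load-*. The helper names `timestamp_dir_name` and `make_private_tmp_dir` are hypothetical, and how the directory is then handed to Hive is not shown here.

```python
import os
import shutil
import tempfile


def timestamp_dir_name(now_ms):
    """Mimic the naming seen in the failing path: <epoch millis>_tmp."""
    return "%d_tmp" % now_ms


def make_private_tmp_dir(parent=None):
    """Create a fresh directory; mkdtemp never returns an existing path.

    The "impala-data-load-" prefix matches the directories mentioned in
    the commit message. Everything else here is an assumption.
    """
    return tempfile.mkdtemp(prefix="impala-data-load-", dir=parent)


if __name__ == "__main__":
    # Two jobs that start in the same millisecond get the same name,
    # so one job can rename or delete the directory the other expects.
    same_ms = 1508958341829
    print(timestamp_dir_name(same_ms) == timestamp_dir_name(same_ms))

    # With mkdtemp, each job gets a distinct directory of its own.
    first = make_private_tmp_dir()
    second = make_private_tmp_dir()
    print(first != second)

    # Clean up the directories created for this demonstration.
    shutil.rmtree(first)
    shutil.rmtree(second)
```

The point of the design is that uniqueness no longer depends on clock resolution: the directory is created atomically under a name that is guaranteed not to exist yet, so concurrent loads cannot share it.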

> TPC-DS data load failed with FileNotFoundException
> --------------------------------------------------
>
>                 Key: IMPALA-6108
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6108
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Tim Armstrong
>            Assignee: Philip Zeyliger
>            Priority: Critical
>              Labels: flaky
>
> {noformat}
> 11:44:13 Loading TPC-DS data (logging to /data/jenkins/workspace/impala-asf-master-core-data-load/repos/Impala/logs/data_loading/load-tpcds.log)...
> 12:05:42     FAILED (Took: 21 min 29 sec)
> 12:05:42     'load-data tpch core' failed. Tail of log:
> 12:05:42      at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
> 12:05:42      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 12:05:42      at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 12:05:42      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 12:05:42      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 12:05:42      at java.lang.Thread.run(Thread.java:745)
> 12:05:42 Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist
> 12:05:42      at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 12:05:42      at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> 12:05:42      at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:139)
> 12:05:42      ... 37 more
> 12:05:42 Caused by: java.io.FileNotFoundException: File /tmp/hadoop-jenkins/mapred/local/1508958341829_tmp does not exist
> 12:05:42      at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
> 12:05:42      at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
> 12:05:42      at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
> 12:05:42      at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileLinkStatusInternal(RawLocalFileSystem.java:827)
> 12:05:42      at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:813)
> 12:05:42      at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatus(RawLocalFileSystem.java:784)
> 12:05:42      at org.apache.hadoop.fs.DelegateToFileSystem.getFileLinkStatus(DelegateToFileSystem.java:132)
> 12:05:42      at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:701)
> 12:05:42      at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:236)
> 12:05:42      at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:674)
> 12:05:42      at org.apache.hadoop.fs.FileContext.rename(FileContext.java:932)
> 12:05:42      at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
> 12:05:42      at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
> 12:05:42      ... 4 more
> {noformat}
> This happens on commit e4f585240ac8f478e25402806f4ea38531b4bf84



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
